Warning

This document is for an in-development version of Galaxy.

galaxy.model.dataset_collections package

Subpackages

Submodules

galaxy.model.dataset_collections.builder module

galaxy.model.dataset_collections.builder.build_collection(type: BaseDatasetCollectionType, dataset_instances: DatasetInstanceMapping, collection: DatasetCollection | None = None, associated_identifiers: set[str] | None = None, fields: str | list[FieldDict] | None = None, column_definitions=None, rows: dict[str, SampleSheetRow | None] | None = None)[source]

Build a DatasetCollection with populated DatasetCollectionElement objects corresponding to the supplied dataset instances, or raise an exception if they do not form a valid collection of the specified type.

galaxy.model.dataset_collections.builder.set_collection_elements(dataset_collection: DatasetCollection, type: BaseDatasetCollectionType, dataset_instances: DatasetInstanceMapping, associated_identifiers: set[str], fields: str | list[FieldDict] | None = None, rows: dict[str, SampleSheetRow | None] | None = None) → DatasetCollection[source]
galaxy.model.dataset_collections.builder.guess_fields(dataset_instances: DatasetInstanceMapping) → list[FieldDict][source]
class galaxy.model.dataset_collections.builder.CollectionBuilder(collection_type_description)[source]

Bases: object

Purely functional builder pattern for building a dataset collection.

__init__(collection_type_description)[source]
replace_elements_in_collection(template_collection: CollectionAdapter | DatasetCollection, replacement_dict: dict[DatasetInstance, DatasetInstance]) → None[source]
get_level(identifier: str, row: SampleSheetRow | None = None) → CollectionBuilder[source]
add_dataset(identifier: str, dataset_instance: DatasetInstance, row: SampleSheetRow | None = None) → None[source]
build_elements() → DatasetInstanceMapping[source]
build_elements_and_rows() → tuple[DatasetInstanceMapping, dict[str, SampleSheetRow | None] | None][source]
build() → DatasetCollection[source]
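The get_level / add_dataset / build_elements flow can be pictured with a small stand-alone sketch. ToyCollectionBuilder below is a hypothetical stand-in written for illustration; it does not use Galaxy's model classes, and the rule that the child type is everything after the first ":" is an assumption of the sketch.

```python
from __future__ import annotations


class ToyCollectionBuilder:
    """Hypothetical stand-in for illustration; not Galaxy's CollectionBuilder."""

    def __init__(self, collection_type: str) -> None:
        self.collection_type = collection_type
        # Maps identifier -> dataset (leaf) or nested ToyCollectionBuilder.
        self._elements: dict[str, object] = {}

    def get_level(self, identifier: str) -> ToyCollectionBuilder:
        """Return the nested builder for identifier, creating it on first use."""
        if identifier not in self._elements:
            # Assumption: the child type is everything after the first ':'.
            _, _, child_type = self.collection_type.partition(":")
            self._elements[identifier] = ToyCollectionBuilder(child_type)
        level = self._elements[identifier]
        if not isinstance(level, ToyCollectionBuilder):
            raise ValueError(f"{identifier!r} already holds a dataset")
        return level

    def add_dataset(self, identifier: str, dataset: object) -> None:
        """Register a leaf dataset under identifier at this level."""
        self._elements[identifier] = dataset

    def build_elements(self) -> dict[str, object]:
        """Recursively convert nested builders into plain nested dicts."""
        return {
            key: value.build_elements() if isinstance(value, ToyCollectionBuilder) else value
            for key, value in self._elements.items()
        }


# Build the elements of a "list:paired" collection with one sample.
builder = ToyCollectionBuilder("list:paired")
sample = builder.get_level("sample1")
sample.add_dataset("forward", "reads_1.fastq")
sample.add_dataset("reverse", "reads_2.fastq")
elements = builder.build_elements()
```

In this sketch the nested dict plays the role of a DatasetInstanceMapping, and the strings stand in for dataset instances.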
class galaxy.model.dataset_collections.builder.BoundCollectionBuilder(dataset_collection)[source]

Bases: CollectionBuilder

More stateful builder that is bound to a particular model object.

__init__(dataset_collection)[source]
populate_partial()[source]
populate()[source]

galaxy.model.dataset_collections.matching module

class galaxy.model.dataset_collections.matching.CollectionsToMatch[source]

Bases: object

Structure representing a set of collections that need to be matched up when running tools (possibly workflows in the future as well).

__init__()[source]
add(input_name, hdca, subcollection_type=None, linked=True)[source]
has_collections()[source]
items()[source]
class galaxy.model.dataset_collections.matching.MatchingCollections[source]

Bases: object

Structure holding the result of matching a list of collections together. Keeping this class separate from CollectionsToMatch and creating it in the DatasetCollectionManager layer may seem like overkill, but I suspect that plugins will become subtypable in the future, in which case matching collections will need to make heavy use of the dataset collection type registry managed by the dataset collections service. Hence the complexity now.

__init__()[source]
slice_collections()[source]
subcollection_mapping_type(input_name)[source]
property structure

Yield the cross product of all unlinked collection structures with the linked collection structure.
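As a rough picture of what a cross product of structures means, the sketch below models a structure as a nested dict whose leaves are None. The toy_multiply helper is hypothetical, and the rule it applies (every leaf of one structure is replaced by a copy of the other) is an assumption made for illustration rather than a description of the actual multiply methods.

```python
import copy

LEAF = None  # a leaf carries no further structure in this toy model


def toy_multiply(structure, other):
    """Replace every leaf of structure with an independent copy of other."""
    if structure is LEAF:
        return copy.deepcopy(other)
    return {key: toy_multiply(child, other) for key, child in structure.items()}


unlinked = {"a": LEAF, "b": LEAF}
linked = {"x": LEAF, "y": LEAF}

# Start from a single leaf, fold in the unlinked structure, then the linked one.
combined = toy_multiply(toy_multiply(LEAF, unlinked), linked)
```

Under these assumptions, combined has one branch per pairing of an unlinked identifier with a linked identifier.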

map_over_action_tuples(input_name)[source]
is_mapped_over(input_name)[source]
static for_collections(collections_to_match, collection_type_descriptions) → MatchingCollections | None[source]

galaxy.model.dataset_collections.registry module

class galaxy.model.dataset_collections.registry.DatasetCollectionTypesRegistry[source]

Bases: object

__init__()[source]
get(plugin_type)[source]
prototype(plugin_type, fields=None)[source]

galaxy.model.dataset_collections.structure module

Module for reasoning about structure of and matching hierarchical collections of data.

class galaxy.model.dataset_collections.structure.Leaf[source]

Bases: object

children_known = True
property is_leaf
clone()[source]
multiply(other_structure)[source]
sliced_collection_type(collection)[source]
class galaxy.model.dataset_collections.structure.BaseTree(collection_type_description)[source]

Bases: object

__init__(collection_type_description)[source]
class galaxy.model.dataset_collections.structure.UninitializedTree(collection_type_description)[source]

Bases: BaseTree

children_known = False
clone()[source]
property is_leaf
multiply(other_structure)[source]
class galaxy.model.dataset_collections.structure.Tree(children, collection_type_description, when_values=None, columns_metadata=None, column_definitions=None)[source]

Bases: BaseTree

children_known = True
__init__(children, collection_type_description, when_values=None, columns_metadata=None, column_definitions=None)[source]
static for_dataset_collection(dataset_collection, collection_type_description)[source]
walk_collections(collection_dict)[source]
property is_leaf
compatible_shape(other_structure)[source]

Symmetric sibling-matching check.

Both sides have already passed connection-time edge validation; here we only compare shape. Uses compatible (not accepts) so order of arrival does not change the answer.

multiply(other_structure)[source]
clone()[source]
galaxy.model.dataset_collections.structure.tool_output_to_structure(get_sliced_input_collection_structure, tool_output, collections_manager)[source]
galaxy.model.dataset_collections.structure.dict_map(func, input_dict)[source]
galaxy.model.dataset_collections.structure.get_collection(dataset_collection_instance: CollectionLike) → DatasetCollection[source]

Return the DatasetCollection contained by a collection instance.

A DatasetCollectionElement has two collection references:
  • collection: the parent collection this element belongs to
  • child_collection: the nested collection this element contains

An HDCA has one:
  • collection: the collection it wraps

This helper returns the contained collection in both cases (child_collection for DCE, collection for HDCA/adapters) and is intended for callers that still hold a wrapper object and need a DatasetCollection to pass to get_structure or walk_collections.
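The dispatch described above can be sketched with plain dataclasses. ToyCollection, ToyElement, ToyHDCA and toy_get_collection are hypothetical names used only for illustration; they mirror the attribute names in the text but are not Galaxy's model classes.

```python
from dataclasses import dataclass


@dataclass
class ToyCollection:
    name: str


@dataclass
class ToyElement:
    """Stands in for a DatasetCollectionElement (DCE)."""

    collection: ToyCollection        # parent collection the element belongs to
    child_collection: ToyCollection  # nested collection the element contains


@dataclass
class ToyHDCA:
    """Stands in for an HDCA, which only wraps one collection."""

    collection: ToyCollection


def toy_get_collection(instance):
    """Return the contained collection for either kind of wrapper."""
    if isinstance(instance, ToyElement):
        # For an element, `collection` is the parent, so use child_collection.
        return instance.child_collection
    return instance.collection


outer = ToyCollection("outer")
inner = ToyCollection("inner")
element = ToyElement(collection=outer, child_collection=inner)
hdca = ToyHDCA(collection=outer)
```

The point of the helper is that callers do not need to remember which attribute holds the contained collection for each wrapper type.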

galaxy.model.dataset_collections.structure.get_structure(collection: DatasetCollection, collection_type_description: CollectionTypeDescription, leaf_subcollection_type: str | None = None)[source]

Build a Tree (or UninitializedTree) describing a collection’s shape.

collection_type_description controls the depth of the tree: elements below leaf_subcollection_type are treated as leaves.

galaxy.model.dataset_collections.subcollections module

galaxy.model.dataset_collections.subcollections.split_dataset_collection_instance(dataset_collection_instance: HistoryDatasetCollectionAssociation, collection_type: str) → list[DatasetCollectionElement | PromoteCollectionElementToCollectionAdapter][source]

Split a collection instance into its subcollections of the given collection_type.

galaxy.model.dataset_collections.type_description module

Collection type descriptions and the compatibility algebra.

Three operations on collection types, each answering a distinct question:

  • accepts(other): asymmetric direct-edge check. True iff an output collection of type other can be connected to an input slot whose declared type is self. Used at workflow-editor edge validation. Convention: input_type.accepts(output_type).

  • compatible(other): symmetric sibling-matching check. True iff two collection types match such that they could drive a common map-over over sibling inputs of one tool. Used where neither side is the input and order of arrival must not change the answer.

  • can_map_over(other): asymmetric nesting check. True iff self has proper subcollections of type other — i.e. self can be mapped over to feed a slot expecting other. Convention: output_type.can_map_over(input_type).

The TypeScript equivalents live in client/src/components/Workflow/Editor/modules/collectionTypeDescription.ts and must stay in sync (accepts / compatible / canMapOver). See types/collection_semantics.yml “Type Compatibility Algebra” for the lattice diagram and worked examples.

class galaxy.model.dataset_collections.type_description.CollectionTypeDescriptionFactory(type_registry=<galaxy.model.dataset_collections.registry.DatasetCollectionTypesRegistry object>)[source]

Bases: object

__init__(type_registry=<galaxy.model.dataset_collections.registry.DatasetCollectionTypesRegistry object>)[source]
for_collection_type(collection_type, fields: str | list[FieldDict] | None = None)[source]
class galaxy.model.dataset_collections.type_description.CollectionTypeDescription(collection_type: str | CollectionTypeDescription, collection_type_description_factory: CollectionTypeDescriptionFactory, fields: str | list[FieldDict] | None = None)[source]

Bases: object

Abstraction over dataset collection type that ties together string representation in database/model with type registry.

__init__(collection_type: str | CollectionTypeDescription, collection_type_description_factory: CollectionTypeDescriptionFactory, fields: str | list[FieldDict] | None = None)[source]
collection_type: str
child_collection_type()[source]
child_collection_type_description()[source]
effective_collection_type_description(subcollection_type)[source]
effective_collection_type(subcollection_type)[source]
can_map_over(other_collection_type) → bool[source]

Asymmetric nesting check: can this collection be mapped over to feed an input requiring other_collection_type?

Convention: output.can_map_over(input). True iff self has proper subcollections matching other — a type is not considered to map over itself (that’s a direct edge, handled by accepts).

Mirrors TypeScript CollectionTypeDescription.canMapOver. Naming kept parallel across languages because both encode the same operational question at workflow-editor connection time.
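A simplified sketch of the nesting check, operating on colon-separated type strings such as "list:paired". toy_can_map_over is a hypothetical helper, and the suffix rule it uses is an assumption for illustration that ignores any special-case collection types the real method may handle.

```python
def toy_can_map_over(output_type: str, input_type: str) -> bool:
    """True iff input_type is a proper trailing subcollection of output_type."""
    if output_type == input_type:
        # Identical types form a direct edge, not a map-over.
        return False
    # "list:paired" has subcollections of type "paired" but not of type "list".
    return output_type.endswith(":" + input_type)


checks = {
    ("list:paired", "paired"): toy_can_map_over("list:paired", "paired"),
    ("list:list:paired", "list:paired"): toy_can_map_over("list:list:paired", "list:paired"),
    ("list", "list"): toy_can_map_over("list", "list"),
    ("list:paired", "list"): toy_can_map_over("list:paired", "list"),
}
```

The argument order follows the stated convention: the output type comes first and the input slot's type second.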

accepts(other_collection_type) → bool[source]

Asymmetric direct-edge check: does an input slot of type self accept an output of type other_collection_type?

Convention: input_type.accepts(output_type). Used at workflow-editor edge validation. For sibling-matching (where neither side is the input slot), use compatible instead.

See types/collection_semantics.yml “Type Compatibility Algebra”.

compatible(other_collection_type) → bool[source]

Symmetric sibling-matching check: do self and other match such that they could drive a common map-over over sibling inputs of a single tool?

Implemented as self.accepts(other) or other.accepts(self). Used at sibling-matching sites (Python Tree.compatible_shape at runtime; TS mappingConstraints at connection time) where neither side is the input slot and order of arrival should not change the answer.

See types/collection_semantics.yml “Type Compatibility Algebra”.
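The derivation of a symmetric check from an asymmetric one can be sketched as follows. toy_accepts, toy_compatible and the TOY_EXTRA_ACCEPTS rule table are hypothetical; the single asymmetric rule in the table is an illustrative assumption, not a statement of Galaxy's actual acceptance rules.

```python
# Hypothetical rule table: input type -> extra output types it accepts
# beyond an exact match. Assumed for illustration only.
TOY_EXTRA_ACCEPTS = {"paired_or_unpaired": {"paired"}}


def toy_accepts(input_type: str, output_type: str) -> bool:
    """Asymmetric: does an input slot of input_type accept output_type?"""
    if input_type == output_type:
        return True
    return output_type in TOY_EXTRA_ACCEPTS.get(input_type, set())


def toy_compatible(a: str, b: str) -> bool:
    """Symmetric: true if either side would accept the other."""
    return toy_accepts(a, b) or toy_accepts(b, a)


forward = toy_accepts("paired_or_unpaired", "paired")
backward = toy_accepts("paired", "paired_or_unpaired")
either_order = (
    toy_compatible("paired", "paired_or_unpaired"),
    toy_compatible("paired_or_unpaired", "paired"),
)
```

Because the symmetric check tries both directions, swapping its arguments cannot change the result, which is the property needed when neither side is the input slot.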

subcollection_type_description()[source]
has_subcollections()[source]
rank_collection_type()[source]

Return the top-level collection type corresponding to this collection type. For instance the “rank” type of a list of paired data (“list:paired”) is “list”.
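On the string form of a collection type, the example above amounts to taking the part before the first colon. toy_rank_collection_type is a hypothetical helper operating on plain strings, shown only to make the example concrete.

```python
def toy_rank_collection_type(collection_type: str) -> str:
    """Return the outermost (top-level) part of a colon-separated type."""
    return collection_type.split(":")[0]


rank_nested = toy_rank_collection_type("list:paired")  # "list"
rank_flat = toy_rank_collection_type("paired")         # "paired"
```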

rank_type_plugin()[source]
property dimension
multiply(other_collection_type)[source]
validate()[source]

Validate that this collection type is a valid Galaxy collection type.

galaxy.model.dataset_collections.type_description.map_over_collection_type(mapped_over_collection_type, target_collection_type)[source]