Warning

This document is for an in-development version of Galaxy.

galaxy.objectstore package

The objectstore package is an abstraction for storing blobs of data for use in Galaxy.

All providers ensure that data can be accessed on the filesystem for running tools.

class galaxy.objectstore.UserObjectStoreResolver(*args, **kwargs)[source]

Bases: Protocol

resolve_object_store_uri_config(uri: str) AwsS3ObjectStoreConfiguration | Boto3ObjectStoreConfiguration | DiskObjectStoreConfiguration | AzureObjectStoreConfiguration | GenericS3ObjectStoreConfiguration | OnedataObjectStoreConfiguration[source]
resolve_object_store_uri(uri: str) ConcreteObjectStore[source]
__init__(*args, **kwargs)
class galaxy.objectstore.BaseUserObjectStoreResolver(*args, **kwargs)[source]

Bases: UserObjectStoreResolver

abstract resolve_object_store_uri_config(uri: str) AwsS3ObjectStoreConfiguration | Boto3ObjectStoreConfiguration | DiskObjectStoreConfiguration | AzureObjectStoreConfiguration | GenericS3ObjectStoreConfiguration | OnedataObjectStoreConfiguration[source]

Resolve the supplied object store URI into a concrete object store configuration.

resolve_object_store_uri(uri: str) ConcreteObjectStore[source]
class galaxy.objectstore.ObjectStore[source]

Bases: object

ObjectStore interface.

FIELD DESCRIPTIONS (these apply to all the methods in this class):

Parameters:
  • obj (StorableObject) – A Galaxy object with an assigned database ID accessible via the .id attribute.

  • base_dir (string) – A key in self.extra_dirs corresponding to the base directory in which this object should be created, or None to specify the default directory.

  • dir_only (boolean) – If True, check only the path where the file identified by obj should be located, not the dataset itself. This option applies to the extra_dir argument as well.

  • extra_dir (string) – Append extra_dir to the directory structure where the dataset identified by obj should be located. (e.g., 000/extra_dir/obj.id). Valid values include ‘job_work’ (defaulting to config.jobs_directory = ‘$GALAXY_ROOT/database/jobs_directory’); ‘temp’ (defaulting to config.new_file_path = ‘$GALAXY_ROOT/database/tmp’).

  • extra_dir_at_root (boolean) – Applicable only if extra_dir is set. If True, the extra_dir argument is placed at root of the created directory structure rather than at the end (e.g., extra_dir/000/obj.id vs. 000/extra_dir/obj.id)

  • alt_name (string) – Use this name as the alternative name for the created dataset rather than the default.

  • obj_dir (boolean) – Append a subdirectory named with the object’s ID (e.g. 000/obj.id)
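The way these options combine can be illustrated with a minimal sketch. It is not Galaxy's implementation: sketch_relative_path and its hash_dir argument are hypothetical names, the hashed directory is fixed at "000" as in the examples above, and the default file name is assumed from the dataset_1.dat path shown in the DiskObjectStore doctest.

```python
import posixpath


def sketch_relative_path(obj_id, extra_dir=None, extra_dir_at_root=False,
                         alt_name=None, obj_dir=False, dir_only=False,
                         hash_dir="000"):
    """Compose a store-relative path from the documented options.

    hash_dir stands in for the hashed directory component; how a real
    store derives it from obj.id is not modelled here (assumption).
    """
    rel_path = hash_dir
    if extra_dir is not None:
        if extra_dir_at_root:
            rel_path = posixpath.join(extra_dir, rel_path)  # extra_dir/000
        else:
            rel_path = posixpath.join(rel_path, extra_dir)  # 000/extra_dir
    if obj_dir:
        rel_path = posixpath.join(rel_path, str(obj_id))    # .../obj.id
    if dir_only:
        return rel_path
    # Assumed default file name; alt_name replaces it when given.
    file_name = alt_name if alt_name else f"dataset_{obj_id}.dat"
    return posixpath.join(rel_path, file_name)
```

With obj_id=1, extra_dir="job_work", obj_dir=True and dir_only=True, the sketch yields 000/job_work/1; adding extra_dir_at_root=True yields job_work/000/1.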

abstract exists(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) bool[source]

Return True if the object identified by obj exists, False otherwise.

abstract construct_path(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False, in_cache: bool = False) str[source]
abstract create(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False)[source]

Mark the object (obj) as existing in the store, but with no content.

This method will create a proper directory structure for the file if the directory does not already exist.

The method returns the concrete objectstore the supplied object is stored in.

abstract empty(obj, base_dir=None, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) bool[source]

Test if the object identified by obj has content.

Raises ObjectNotFound if the object does not exist.

abstract size(obj, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) int[source]

Return size of the object identified by obj.

If the object does not exist, return 0.

abstract delete(obj, entire_dir: bool = False, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) bool[source]

Delete the object identified by obj.

Parameters:

entire_dir (boolean) – If True, delete the entire directory pointed to by extra_dir. For safety reasons, this option applies only in conjunction with the extra_dir or obj_dir options.

abstract get_data(obj, start=0, count=-1, base_dir=None, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False)[source]

Fetch count bytes of data offset by start bytes using obj.id.

Raises ObjectNotFound if the object does not exist.

Parameters:
  • start (int) – Set the position to start reading the dataset file

  • count (int) – Read at most count bytes from the dataset
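The start and count semantics can be sketched with plain file I/O. This is an illustration only: sketch_get_data is a hypothetical helper operating on a local path, it reads in binary mode, and the return type of a real store's get_data may differ.

```python
import os
import tempfile


def sketch_get_data(path, start=0, count=-1):
    """Return up to count bytes starting at byte offset start.

    A count of -1 reads to the end of the file.
    """
    with open(path, "rb") as handle:
        handle.seek(start)
        return handle.read(count)


# Usage on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
example_path = tmp.name

first_chunk = sketch_get_data(example_path, start=2, count=3)  # b"234"
rest = sketch_get_data(example_path, start=7)                  # b"789"
os.remove(example_path)
```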

abstract get_filename(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False, sync_cache: bool = True) str[source]

Get the expected filename with absolute path for object with id obj.id.

This can be used to access the contents of the object.

abstract update_from_file(obj, base_dir=None, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False, file_name=None, create: bool = False, preserve_symlinks: bool = False) None[source]

Inform the store that the file associated with obj.id has been updated.

If file_name is provided, update from that file instead of the default. Raises ObjectNotFound if the object does not exist.

Parameters:
  • file_name (string) – Use file pointed to by file_name as the source for updating the dataset identified by obj

  • create (boolean) – If True and the default dataset does not exist, create it first.
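The interplay of file_name and create can be sketched for a purely local store. Everything here is hypothetical: sketch_update_from_file takes a resolved path instead of obj, and SketchObjectNotFound is a local stand-in for the ObjectNotFound exception mentioned above.

```python
import os
import shutil
import tempfile


class SketchObjectNotFound(Exception):
    """Local stand-in for ObjectNotFound (assumption)."""


def sketch_update_from_file(store_path, file_name=None, create=False):
    """Update the stored file, optionally creating it first."""
    if not os.path.exists(store_path):
        if not create:
            raise SketchObjectNotFound(store_path)
        # Create the directory structure and an empty object.
        os.makedirs(os.path.dirname(store_path), exist_ok=True)
        open(store_path, "wb").close()
    if file_name is not None and os.path.abspath(file_name) != os.path.abspath(store_path):
        shutil.copyfile(file_name, store_path)


workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "upload.txt")
with open(source, "wb") as handle:
    handle.write(b"new content")
target = os.path.join(workdir, "000", "dataset_1.dat")

try:
    sketch_update_from_file(target, file_name=source)  # object missing, create=False
    missing_raised = False
except SketchObjectNotFound:
    missing_raised = True

sketch_update_from_file(target, file_name=source, create=True)
with open(target, "rb") as handle:
    stored = handle.read()
shutil.rmtree(workdir)
```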

abstract get_object_url(obj, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False)[source]

Return the URL for direct access if supported, otherwise return None.

Note: take care not to bypass dataset security with this.

abstract get_concrete_store_name(obj)[source]

Return a display name or title of the objectstore corresponding to obj.

To accommodate nested objectstores, obj is passed in so this metadata can be returned for the ConcreteObjectStore corresponding to the object.

If the dataset is in a new or discarded state and an object_store_id has not yet been set, this may return None.

abstract get_concrete_store_description_markdown(obj)[source]

Return a longer description of how data ‘obj’ is stored.

To accommodate nested objectstores, obj is passed in so this metadata can be returned for the ConcreteObjectStore corresponding to the object.

If the dataset is in a new or discarded state and an object_store_id has not yet been set, this may return None.

abstract get_concrete_store_badges(obj) List[BadgeDict][source]

Return a list of dictified badges summarizing the object store configuration.

abstract is_private(obj) bool[source]

Return True if and only if the supplied object is stored in a private ConcreteObjectStore.

object_store_ids(private=None)[source]

Return IDs of all concrete object stores - either private ones or non-private ones.

This should just return an empty list for non-DistributedObjectStore object stores, i.e. concrete objectstores and the HierarchicalObjectStore since these do not use the object_store_id column for objects (Galaxy Datasets).

object_store_allows_id_selection() bool[source]

Return True if this object store respects object_store_id and allows selection of it.

validate_selected_object_store_id(user, object_store_id: str | None) str | None[source]
object_store_ids_allowing_selection() List[str][source]

Return a non-empty list of allowed selectable object store IDs during creation.

get_concrete_store_by_object_store_id(object_store_id: str) ConcreteObjectStore | None[source]

If this is a distributed object store, get ConcreteObjectStore by object_store_id.

abstract get_store_usage_percent()[source]

Return the percentage indicating how full the store is.
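For a disk-backed store, such a percentage could be derived from filesystem statistics. The following is a sketch under that assumption; sketch_usage_percent is a hypothetical helper and a real store may compute the figure differently (for example from available rather than free blocks).

```python
import shutil
import tempfile


def sketch_usage_percent(path):
    """Percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


percent = sketch_usage_percent(tempfile.gettempdir())
```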

abstract get_store_by(obj)[source]

Return how object is stored (by ‘uuid’, ‘id’, or None if not yet saved).

Certain Galaxy remote data features aren’t available if objects are stored by ‘id’.

abstract cache_targets() List[CacheTarget][source]

Return a list of CacheTargets used by this object store.

abstract to_dict() Dict[str, Any][source]
abstract get_quota_source_map()[source]

Return QuotaSourceMap describing mapping of object store IDs to quota sources.

abstract get_device_source_map() DeviceSourceMap[source]

Return DeviceSourceMap describing mapping of object store IDs to device sources.

class galaxy.objectstore.BaseObjectStore(config, config_dict=None, **kwargs)[source]

Bases: ObjectStore

store_by: str
store_type: str
__init__(config, config_dict=None, **kwargs)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

start()[source]

Call all postfork function(s) here. These functions spawn a thread. We register start(self) with app, so app starts the threads. Override this function in subclasses, as needed.

shutdown()[source]

Close any connections for this ObjectStore.

classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

classmethod from_xml(config, config_xml, **kwd)[source]
to_dict()[source]
exists(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) bool[source]

Return True if the object identified by obj exists, False otherwise.

construct_path(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False, in_cache: bool = False) str[source]
create(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False)[source]

Mark the object (obj) as existing in the store, but with no content.

This method will create a proper directory structure for the file if the directory does not already exist.

The method returns the concrete objectstore the supplied object is stored in.

empty(obj, base_dir=None, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) bool[source]

Test if the object identified by obj has content.

Raises ObjectNotFound if the object does not exist.

size(obj, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) int[source]

Return size of the object identified by obj.

If the object does not exist, return 0.

delete(obj, entire_dir: bool = False, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False) bool[source]

Delete the object identified by obj.

Parameters:

entire_dir (boolean) – If True, delete the entire directory pointed to by extra_dir. For safety reasons, this option applies only in conjunction with the extra_dir or obj_dir options.

get_data(obj, start=0, count=-1, base_dir=None, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False)[source]

Fetch count bytes of data offset by start bytes using obj.id.

Raises ObjectNotFound if the object does not exist.

Parameters:
  • start (int) – Set the position to start reading the dataset file

  • count (int) – Read at most count bytes from the dataset

get_filename(obj, base_dir=None, dir_only=False, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False, sync_cache: bool = True) str[source]

Get the expected filename with absolute path for object with id obj.id.

This can be used to access the contents of the object.

update_from_file(obj, base_dir=None, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False, file_name=None, create: bool = False, preserve_symlinks: bool = False) None[source]

Inform the store that the file associated with obj.id has been updated.

If file_name is provided, update from that file instead of the default. Raises ObjectNotFound if the object does not exist.

Parameters:
  • file_name (string) – Use file pointed to by file_name as the source for updating the dataset identified by obj

  • create (boolean) – If True and the default dataset does not exist, create it first.

get_object_url(obj, extra_dir=None, extra_dir_at_root=False, alt_name=None, obj_dir: bool = False)[source]

Return the URL for direct access if supported, otherwise return None.

Note: take care not to bypass dataset security with this.

get_concrete_store_name(obj)[source]

Return a display name or title of the objectstore corresponding to obj.

To accommodate nested objectstores, obj is passed in so this metadata can be returned for the ConcreteObjectStore corresponding to the object.

If the dataset is in a new or discarded state and an object_store_id has not yet been set, this may return None.

get_concrete_store_description_markdown(obj)[source]

Return a longer description of how data ‘obj’ is stored.

To accommodate nested objectstores, obj is passed in so this metadata can be returned for the ConcreteObjectStore corresponding to the object.

If the dataset is in a new or discarded state and an object_store_id has not yet been set, this may return None.

get_concrete_store_badges(obj) List[BadgeDict][source]

Return a list of dictified badges summarizing the object store configuration.

get_store_usage_percent()[source]

Return the percentage indicating how full the store is.

get_store_by(obj, **kwargs)[source]

Return how object is stored (by ‘uuid’, ‘id’, or None if not yet saved).

Certain Galaxy remote data features aren’t available if objects are stored by ‘id’.

is_private(obj) bool[source]

Return True if and only if the supplied object is stored in a private ConcreteObjectStore.

cache_targets() List[CacheTarget][source]

Return a list of CacheTargets used by this object store.

classmethod parse_private_from_config_xml(config_xml)[source]
classmethod parse_badges_from_config_xml(badges_xml)[source]
get_quota_source_map()[source]

Return QuotaSourceMap describing mapping of object store IDs to quota sources.

get_device_source_map()[source]

Return DeviceSourceMap describing mapping of object store IDs to device sources.

class galaxy.objectstore.ConcreteObjectStore(config, config_dict=None, **kwargs)[source]

Bases: BaseObjectStore

Subclass of ObjectStore for stores that don’t delegate (non-nested).

Adds store_by and quota_source functionality. These attributes do not make sense for the delegating object stores; they should describe where files are actually persisted, not how a file is routed to a persistence source.

cloud: bool = False
__init__(config, config_dict=None, **kwargs)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

device_id: str | None = None
badges: List[StoredBadgeDict]
to_dict()[source]
to_model(object_store_id: str) ConcreteObjectStoreModel[source]
property cache_target: CacheTarget | None
cache_targets() List[CacheTarget][source]

Return a list of CacheTargets used by this object store.

get_quota_source_map()[source]

Return QuotaSourceMap describing mapping of object store IDs to quota sources.

get_device_source_map() DeviceSourceMap[source]

Return DeviceSourceMap describing mapping of object store IDs to device sources.

class galaxy.objectstore.DiskObjectStore(config, config_dict)[source]

Bases: ConcreteObjectStore

Standard Galaxy object store.

Stores objects in files under a specific directory on disk.

>>> from galaxy.util.bunch import Bunch
>>> import tempfile
>>> file_path=tempfile.mkdtemp()
>>> obj = Bunch(id=1)
>>> s = DiskObjectStore(Bunch(umask=0o077, jobs_directory=file_path, new_file_path=file_path, object_store_check_old_style=False, enable_quotas=True), dict(files_dir=file_path))
>>> o = s.create(obj)
>>> s.exists(obj)
True
>>> assert s.get_filename(obj) == file_path + '/000/dataset_1.dat'
store_type: str = 'disk'
__init__(config, config_dict)[source]
Parameters:
  • config (object) –

    An object, most likely populated from galaxy/config.ini, having the same attributes needed by ObjectStore plus:

    • file_path – Default directory to store objects to disk in.

    • umask – the permission bits for newly created files.

  • file_path (str) – Override for the config.file_path value.

  • extra_dirs (dict) – Keys are strings, values are directory paths.

classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

to_dict()[source]
badges: List[StoredBadgeDict]
store_by: str
class galaxy.objectstore.NestedObjectStore(config, config_xml=None)[source]

Bases: BaseObjectStore

Base for ObjectStores that use other ObjectStores.

Example: DistributedObjectStore, HierarchicalObjectStore

__init__(config, config_xml=None)[source]

Extend ObjectStore’s constructor.

backends: Dict
shutdown()[source]

Shut down each backend.

cache_targets() List[CacheTarget][source]

Return a list of CacheTargets used by this object store.

galaxy.objectstore.user_object_store_configuration_to_config_dict(object_store_config: AwsS3ObjectStoreConfiguration | Boto3ObjectStoreConfiguration | DiskObjectStoreConfiguration | AzureObjectStoreConfiguration | GenericS3ObjectStoreConfiguration | OnedataObjectStoreConfiguration, id) Dict[str, Any][source]
class galaxy.objectstore.DistributedObjectStore(config, config_dict, fsmon=False, user_object_store_resolver: UserObjectStoreResolver | None = None)[source]

Bases: NestedObjectStore

ObjectStore that defers to a list of backends.

When getting objects the first store where the object exists is used. When creating objects they are created in a store selected randomly, but with weighting.
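Weighted random selection of a backend can be sketched as follows. This is an illustration of the idea only: sketch_pick_backend, the weights mapping, and the backend names are hypothetical, and the actual selection code in DistributedObjectStore may be structured differently.

```python
import random


def sketch_pick_backend(weights, rng=random):
    """Pick a backend id at random, proportionally to its weight."""
    backend_ids = list(weights)
    backend_weights = [weights[backend_id] for backend_id in backend_ids]
    return rng.choices(backend_ids, weights=backend_weights, k=1)[0]


# A backend with weight 3 is chosen roughly three times as often as one with weight 1.
rng = random.Random(0)
weights = {"fast_disk": 3, "slow_disk": 1}
picks = [sketch_pick_backend(weights, rng) for _ in range(1000)]
```

A backend given a weight of 0 is never selected in this sketch, provided at least one other weight is positive.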

backends: Dict[str, Any]
store_type: str = 'distributed'
__init__(config, config_dict, fsmon=False, user_object_store_resolver: UserObjectStoreResolver | None = None)[source]
Parameters:
  • config (object) –

    An object, most likely populated from galaxy/config.ini, having the same attributes needed by NestedObjectStore plus:

    • distributed_object_store_config_file

  • fsmon (bool) – If True, monitor the file system for free space, removing backends when they get too full.
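The effect of free-space monitoring can be sketched as a filter over backends. The function name, the usage mapping, and the max_percent_full threshold below are illustrative assumptions; the real monitor runs in the background and is not modelled here.

```python
def sketch_filter_backends(usage_percent_by_id, max_percent_full):
    """Keep only backend ids whose usage is below the threshold."""
    return [
        backend_id
        for backend_id, percent in usage_percent_by_id.items()
        if percent < max_percent_full
    ]


usable = sketch_filter_backends({"files1": 50.0, "files2": 97.5}, max_percent_full=90.0)
```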

classmethod parse_xml(config_xml, legacy=False)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

classmethod from_xml(config, config_xml, fsmon=False, user_object_store_resolver: UserObjectStoreResolver | None = None, **kwd)[source]
to_dict(object_store_uris: Set[str] | None = None) Dict[str, Any][source]
shutdown()[source]

Shut down. Kill the free space monitor if there is one.

get_quota_source_map() QuotaSourceMap[source]

Return QuotaSourceMap describing mapping of object store IDs to quota sources.

get_device_source_map() DeviceSourceMap[source]

Return DeviceSourceMap describing mapping of object store IDs to device sources.

object_store_ids(private=None)[source]

Return IDs of all concrete object stores - either private ones or non-private ones.

This should just return an empty list for non-DistributedObjectStore object stores, i.e. concrete objectstores and the HierarchicalObjectStore since these do not use the object_store_id column for objects (Galaxy Datasets).

object_store_allows_id_selection() bool[source]

Return True if this object store respects object_store_id and allows selection of it.

validate_selected_object_store_id(user, object_store_id: str | None) str | None[source]
object_store_ids_allowing_selection() List[str][source]

Return a non-empty list of allowed selectable object store IDs during creation.

get_concrete_store_by_object_store_id(object_store_id: str) ConcreteObjectStore | None[source]

If this is a distributed object store, get ConcreteObjectStore by object_store_id.

class galaxy.objectstore.HierarchicalObjectStore(config, config_dict, fsmon=False)[source]

Bases: NestedObjectStore

ObjectStore that defers to a list of backends.

When getting objects the first store where the object exists is used. When creating objects only the first store is used.
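The lookup and creation rules can be sketched with in-memory stand-ins. SketchBackend and SketchHierarchicalStore are hypothetical classes written for this illustration; they model only "first store where the object exists" and "create in the first store".

```python
class SketchBackend:
    """Hypothetical in-memory stand-in for a concrete store."""

    def __init__(self, name):
        self.name = name
        self.objects = set()

    def exists(self, obj_id):
        return obj_id in self.objects

    def create(self, obj_id):
        self.objects.add(obj_id)


class SketchHierarchicalStore:
    def __init__(self, backends):
        # Order backends by their integer key, lowest first.
        self.backends = dict(sorted(backends.items()))

    def find(self, obj_id):
        """Return the first backend that holds obj_id, or None."""
        for backend in self.backends.values():
            if backend.exists(obj_id):
                return backend
        return None

    def create(self, obj_id):
        """Always create in the first backend."""
        first = next(iter(self.backends.values()))
        first.create(obj_id)
        return first


primary, archive = SketchBackend("primary"), SketchBackend("archive")
store = SketchHierarchicalStore({1: archive, 0: primary})
archive.create(7)                 # an older object living only in the archive
found_in = store.find(7)          # archive
created_in = store.create(8)      # primary
```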

backends: Dict[int, BaseObjectStore]
store_type: str = 'hierarchical'
__init__(config, config_dict, fsmon=False)[source]

The default constructor. Extends NestedObjectStore.

classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

to_dict()[source]
get_quota_source_map()[source]

Return QuotaSourceMap describing mapping of object store IDs to quota sources.

galaxy.objectstore.serialize_static_object_store_config(object_store: ObjectStore, object_store_uris: Set[str]) Dict[str, Any][source]

Serialize a static object store configuration for database-less serialization.

The database-less part here comes from the fact these are used in job directories during extended metadata collection. Any database/vault/app config details should be unpacked and the result should be an object store configuration that doesn’t depend on those entities but which resolves to the same locations.

class galaxy.objectstore.QuotaModel(*, source: str | None = None, enabled: bool)[source]

Bases: BaseModel

source: str | None
enabled: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class galaxy.objectstore.ConcreteObjectStoreModel(*, object_store_id: str | None = None, private: bool, name: str | None = None, description: str | None = None, quota: QuotaModel, badges: List[BadgeDict], device: str | None = None)[source]

Bases: BaseModel

object_store_id: str | None
private: bool
name: str | None
description: str | None
quota: QuotaModel
badges: List[BadgeDict]
device: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

galaxy.objectstore.type_to_object_store_class(store: str, fsmon: bool = False, user_object_store_resolver: UserObjectStoreResolver | None = None) Tuple[Type[BaseObjectStore], Dict[str, Any]][source]
galaxy.objectstore.build_test_object_store_from_user_config(config, object_store_config: AwsS3ObjectStoreConfiguration | Boto3ObjectStoreConfiguration | DiskObjectStoreConfiguration | AzureObjectStoreConfiguration | GenericS3ObjectStoreConfiguration | OnedataObjectStoreConfiguration)[source]
galaxy.objectstore.build_object_store_from_config(config, fsmon=False, config_xml=None, config_dict=None, disable_process_management=False, user_object_store_resolver: UserObjectStoreResolver | None = None)[source]

Invoke the appropriate object store.

Will use the object_store_config_file attribute of the config object to configure a new object store from the specified XML file.

Or you can specify the object store type in the object_store attribute of the config object. Currently ‘disk’, ‘s3’, ‘swift’, ‘distributed’, ‘hierarchical’, ‘irods’, and ‘pulsar’ are supported values.
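Dispatching from a store type string to a class plus constructor keyword arguments, in the spirit of the type_to_object_store_class signature above, might look like the following sketch. The registry, the stand-in classes, and the error message are assumptions; only three of the supported type names are modelled, and fsmon is passed only to the nested stores because only their constructors list it.

```python
from typing import Any, Dict, Tuple, Type


class SketchDisk:
    store_type = "disk"


class SketchDistributed:
    store_type = "distributed"


class SketchHierarchical:
    store_type = "hierarchical"


# Hypothetical registry keyed by each class's store_type.
_REGISTRY = {cls.store_type: cls for cls in (SketchDisk, SketchDistributed, SketchHierarchical)}


def sketch_type_to_class(store: str, fsmon: bool = False) -> Tuple[Type, Dict[str, Any]]:
    """Return the class for a store type and the extra constructor kwargs."""
    try:
        cls = _REGISTRY[store]
    except KeyError:
        raise ValueError(f"Unrecognized object store type: {store}") from None
    kwargs: Dict[str, Any] = {}
    if store in ("distributed", "hierarchical"):
        kwargs["fsmon"] = fsmon
    return cls, kwargs


disk_cls, disk_kwargs = sketch_type_to_class("disk")
nested_cls, nested_kwargs = sketch_type_to_class("distributed", fsmon=True)
```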

class galaxy.objectstore.UserObjectStoresAppConfig(*, object_store_cache_path: str, object_store_cache_size: int, user_config_templates_use_saved_configuration: typing_extensions.Literal[fallback, preferred, never], jobs_directory: str, new_file_path: str, umask: int, gid: int)[source]

Bases: BaseModel

object_store_cache_path: str
object_store_cache_size: int
user_config_templates_use_saved_configuration: typing_extensions.Literal[fallback, preferred, never]
jobs_directory: str
new_file_path: str
umask: int
gid: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

galaxy.objectstore.concrete_object_store(object_store_configuration: AwsS3ObjectStoreConfiguration | Boto3ObjectStoreConfiguration | DiskObjectStoreConfiguration | AzureObjectStoreConfiguration | GenericS3ObjectStoreConfiguration | OnedataObjectStoreConfiguration, app_config: UserObjectStoresAppConfig) ConcreteObjectStore[source]
galaxy.objectstore.local_extra_dirs(func)[source]

Non-local plugin decorator using local directories for the extra_dirs (job_work and temp).

galaxy.objectstore.config_to_dict(config)[source]

Dict-ify the portion of a config object consumed by the ObjectStore class and its subclasses.

class galaxy.objectstore.QuotaSourceInfo(label, use)[source]

Bases: tuple

label: str | None

Alias for field number 0

use: bool

Alias for field number 1
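A tuple subclass with named, typed fields like this one can be written with typing.NamedTuple. The class below is a stand-in named QuotaSourceInfoSketch to make clear it is not the Galaxy class, and the "scratch" label is an invented example value.

```python
from typing import NamedTuple, Optional


class QuotaSourceInfoSketch(NamedTuple):
    label: Optional[str]  # field number 0
    use: bool             # field number 1


info = QuotaSourceInfoSketch(label="scratch", use=True)
label, use = info  # unpacks like a plain tuple
```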

class galaxy.objectstore.DeviceSourceMap(device_id=None)[source]

Bases: object

__init__(device_id=None)[source]
get_device_id(object_store_id: str) str | None[source]
class galaxy.objectstore.QuotaSourceMap(source=None, enabled=True)[source]

Bases: object

__init__(source=None, enabled=True)[source]
get_quota_source_info(object_store_id)[source]
get_quota_source_label(object_store_id)[source]
get_quota_source_labels()[source]
default_usage_excluded_ids()[source]
get_id_to_source_pairs(include_default_quota_source=False)[source]
ids_per_quota_source(include_default_quota_source=False)[source]
exception galaxy.objectstore.ObjectCreationProblem[source]

Bases: Exception

exception galaxy.objectstore.ObjectCreationProblemSharingDisabled[source]

Bases: ObjectCreationProblem

client_message = 'Job attempted to create sharable output datasets in a storage location with sharing disabled'
exception galaxy.objectstore.ObjectCreationProblemStoreFull[source]

Bases: ObjectCreationProblem

client_message = 'Job attempted to create output datasets in a full storage location, please contact your admin for more details'
class galaxy.objectstore.ObjectStorePopulator(has_object_store, user)[source]

Bases: object

Small helper for interacting with the object store and making sure all datasets from a job end up with the same object_store_id.
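The idea of pinning every dataset of a job to one object_store_id can be sketched as below. SketchDataset, SketchPopulator, and the choose_store_id callable are hypothetical; the sketch models only "remember the first id and reuse it", not how Galaxy selects or validates a store.

```python
class SketchDataset:
    def __init__(self):
        self.object_store_id = None


class SketchPopulator:
    def __init__(self, choose_store_id):
        # choose_store_id is a hypothetical callable that picks a store id.
        self._choose_store_id = choose_store_id
        self.object_store_id = None

    def set_object_store_id(self, dataset):
        if self.object_store_id is None:
            # First dataset of the job: pick an id and remember it.
            self.object_store_id = self._choose_store_id()
        dataset.object_store_id = self.object_store_id


candidate_ids = iter(["store_a", "store_b"])
populator = SketchPopulator(lambda: next(candidate_ids))
first, second = SketchDataset(), SketchDataset()
populator.set_object_store_id(first)
populator.set_object_store_id(second)
```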

__init__(has_object_store, user)[source]
set_object_store_id(data: DatasetInstance, require_shareable: bool = False) None[source]
set_dataset_object_store_id(dataset: Dataset, require_shareable: bool = True) None[source]
galaxy.objectstore.persist_extra_files(object_store: ObjectStore, src_extra_files_path: str, primary_data: DatasetInstance, extra_files_path_name: str | None = None) None[source]
galaxy.objectstore.persist_extra_files_for_dataset(object_store: ObjectStore, src_extra_files_path: str, dataset: Dataset, extra_files_path_name: str)[source]

Subpackages

Submodules

galaxy.objectstore.azure_blob module

Object Store plugin for the Microsoft Azure Block Blob Storage system

galaxy.objectstore.azure_blob.parse_config_xml(config_xml)[source]
class galaxy.objectstore.azure_blob.AzureBlobObjectStore(config, config_dict)[source]

Bases: CachingConcreteObjectStore

Object store that stores objects as blobs in an Azure Blob Container. A local cache exists that is used as an intermediate location for files between Galaxy and Azure.
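The role of the local cache can be sketched with a dictionary standing in for the remote container. SketchCachingStore is hypothetical and uses no Azure API; it shows only that a file is pulled into the cache on first access and served from there afterwards.

```python
import os
import shutil
import tempfile


class SketchCachingStore:
    def __init__(self, remote, cache_dir):
        self.remote = remote        # dict of key -> bytes, stand-in for the remote container
        self.cache_dir = cache_dir
        self.downloads = 0

    def get_filename(self, key):
        """Return a local path for key, fetching into the cache if needed."""
        path = os.path.join(self.cache_dir, key)
        if not os.path.exists(path):
            with open(path, "wb") as handle:
                handle.write(self.remote[key])
            self.downloads += 1
        return path


cache_dir = tempfile.mkdtemp()
store = SketchCachingStore({"dataset_1.dat": b"ACGT"}, cache_dir)
path = store.get_filename("dataset_1.dat")  # first access populates the cache
store.get_filename("dataset_1.dat")         # second access is served locally
with open(path, "rb") as handle:
    cached = handle.read()
shutil.rmtree(cache_dir)
```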

store_type: str = 'azure_blob'
cloud: bool = True
__init__(config, config_dict)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

enable_cache_monitor: bool
cache_monitor_interval: int
cache_size: int
staging_path: str
cache_updated_data: bool
to_dict()[source]
classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

shutdown()[source]

Close any connections for this ObjectStore.

extra_dirs: Dict[str, str]
config: Any

galaxy.objectstore.cloud module

Object Store plugin for Cloud storage.

class galaxy.objectstore.cloud.Cloud(config, config_dict)[source]

Bases: CachingConcreteObjectStore, UsesAxel

Object store that stores objects as items in a cloud storage. A local cache exists that is used as an intermediate location for files between Galaxy and the cloud storage.

store_type: str = 'cloud'
__init__(config, config_dict)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

enable_cache_monitor: bool
cache_monitor_interval: int
cache_size: int
staging_path: str
cache_updated_data: bool
classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

to_dict()[source]
shutdown()[source]

Close any connections for this ObjectStore.

extra_dirs: Dict[str, str]
config: Any

galaxy.objectstore.irods module

Object Store plugin for the Integrated Rule-Oriented Data System (iRODS)

galaxy.objectstore.irods.parse_config_xml(config_xml)[source]
class galaxy.objectstore.irods.IRODSObjectStore(config, config_dict)[source]

Bases: CachingConcreteObjectStore

Object store that stores files as data objects in an iRODS Zone. A local cache exists that is used as an intermediate location for files between Galaxy and iRODS.

store_type: str = 'irods'
__init__(config, config_dict)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

cache_size: int
staging_path: str
cache_updated_data: bool
shutdown()[source]

Close any connections for this ObjectStore.

classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

start_connection_pool_monitor()[source]
start()[source]

Call all postfork function(s) here. These functions spawn a thread. We register start(self) with app, so app starts the threads. Override this function in subclasses, as needed.

to_dict()[source]
extra_dirs: Dict[str, str]
config: Any
enable_cache_monitor: bool
cache_monitor_interval: int

galaxy.objectstore.pithos module

galaxy.objectstore.pithos.parse_config_xml(config_xml)[source]

Parse and validate config_xml, return dict for convenience.

Parameters:

config_xml (lxml.etree.Element) – root of XML subtree

Returns:

(dict) according to syntax

Raises:

various XML parse errors

class galaxy.objectstore.pithos.PithosObjectStore(config, config_dict)[source]

Bases: CachingConcreteObjectStore

Object store that stores objects as items in a Pithos+ container. Cache is ignored for the time being.

store_type: str = 'pithos'
__init__(config, config_dict)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

staging_path: str
classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

to_dict()[source]
extra_dirs: Dict[str, str]
config: Any
cache_updated_data: bool
enable_cache_monitor: bool
cache_size: int
cache_monitor_interval: int

galaxy.objectstore.pulsar module

class galaxy.objectstore.pulsar.PulsarObjectStore(config, config_xml)[source]

Bases: BaseObjectStore

Object store implementation that delegates to a remote Pulsar server.

This may be more aspirational than practical for now. It would be good to get Galaxy to a point where a handler thread could be set up that doesn’t attempt to access the disk files returned by this object store and instead passes them along to Pulsar unmodified. That modification, along with this implementation and Pulsar job destinations, would then allow Galaxy to fully manage jobs on remote servers with completely different mount points.

This implementation should be considered beta and may be dropped from Galaxy at some future point or significantly modified.

__init__(config, config_xml)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

file_ready(obj, **kwds)[source]
shutdown()[source]

Close any connections for this ObjectStore.

store_by: str
store_type: str

galaxy.objectstore.s3 module

Object Store plugin for the Amazon Simple Storage Service (S3)

galaxy.objectstore.s3.download_directory(bucket, remote_folder, local_path)[source]
galaxy.objectstore.s3.parse_config_xml(config_xml)[source]
class galaxy.objectstore.s3.CloudConfigMixin[source]

Bases: object

class galaxy.objectstore.s3.S3ObjectStore(config, config_dict)[source]

Bases: CachingConcreteObjectStore, CloudConfigMixin, UsesAxel

Object store that stores objects as items in an AWS S3 bucket. A local cache exists that is used as an intermediate location for files between Galaxy and S3.
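The role of the local cache as an intermediate location can be illustrated with a toy model. The sketch below is not Galaxy code: `CachedStoreSketch` and its methods are hypothetical names, and a plain directory plays the part of the remote bucket so that the example runs without any S3 access.

```python
import shutil
import tempfile
from pathlib import Path


class CachedStoreSketch:
    """Toy model: a 'remote' directory fronted by a local cache directory."""

    def __init__(self, remote_dir, cache_dir):
        self.remote_dir = Path(remote_dir)
        self.cache_dir = Path(cache_dir)

    def push(self, rel_path, source_file):
        """Stage a file in the cache, then copy it to the remote side."""
        cached = self.cache_dir / rel_path
        cached.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source_file, cached)
        remote = self.remote_dir / rel_path
        remote.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(cached, remote)

    def get_filename(self, rel_path):
        """Return a local path, pulling from the remote side on a cache miss."""
        cached = self.cache_dir / rel_path
        if not cached.exists():
            cached.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(self.remote_dir / rel_path, cached)
        return str(cached)


with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    (root / "remote").mkdir()
    (root / "cache").mkdir()
    source = root / "input.txt"
    source.write_text("hello")

    store = CachedStoreSketch(root / "remote", root / "cache")
    store.push("000/dataset_1.dat", source)

    # Simulate cache eviction, then read the file back through the cache.
    (root / "cache" / "000" / "dataset_1.dat").unlink()
    local_path = store.get_filename("000/dataset_1.dat")
    content = Path(local_path).read_text()
```

Callers always receive a path on the local filesystem, which matches the package-level requirement that data be accessible on disk for running tools.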

store_type: str = 'aws_s3'
cloud: bool = True
__init__(config, config_dict)[source]
Parameters:

config (object) –

An object, most likely populated from galaxy/config.ini, having the following attributes:

  • object_store_check_old_style (only used by the DiskObjectStore subclass)

  • jobs_directory – Each job is given a unique empty directory as its current working directory. This option defines in what parent directory those directories will be created.

  • new_file_path – Used to set the ‘temp’ extra_dir.

enable_cache_monitor: bool
cache_monitor_interval: int
cache_size: int
staging_path: str
cache_updated_data: bool
classmethod parse_xml(config_xml)[source]

Parse an XML description of a configuration for this object store.

Return a configuration dictionary (such as would correspond to the YAML configuration) for the object store.

to_dict()[source]
shutdown()[source]

Close any connections for this ObjectStore.

extra_dirs: Dict[str, str]
config: Any
class galaxy.objectstore.s3.GenericS3ObjectStore(config, config_dict)[source]

Bases: S3ObjectStore

Object store that stores objects as items in a generic (non-AWS) S3 bucket. A local cache exists that is used as an intermediate location for files between Galaxy and the S3 storage service.

store_type: str = 'generic_s3'
staging_path: str
extra_dirs: Dict[str, str]
config: Any
cache_updated_data: bool
enable_cache_monitor: bool
cache_size: int
cache_monitor_interval: int
badges: List[StoredBadgeDict]
store_by: str
use_axel: bool

galaxy.objectstore.s3_multipart_upload module

Split a large file into multiple pieces for upload to S3. This parallelizes the task over available cores using multiprocessing. Code mostly taken from CloudBioLinux.

galaxy.objectstore.s3_multipart_upload.mp_from_ids(s3server, mp_id, mp_keyname, mp_bucketname)[source]

Get the multipart upload from the bucket and multipart IDs.

This allows us to reconstitute a connection to the upload from within multiprocessing functions.

galaxy.objectstore.s3_multipart_upload.transfer_part(s3server, mp_id, mp_keyname, mp_bucketname, i, part)[source]

Transfer a part of a multipart upload. Designed to be run in parallel.

galaxy.objectstore.s3_multipart_upload.multipart_upload(s3server, bucket, s3_key_name, tarball, mb_size)[source]

Upload large files using Amazon’s multipart upload functionality.
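The split-and-transfer idea behind these functions can be sketched without touching S3. The example below is an illustration under stated assumptions, not the module's code: `part_ranges` and `fake_transfer_part` are hypothetical helpers, the "upload" only reads each slice and reports its size, and a thread pool is used in place of multiprocessing to keep the example short.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def part_ranges(total_size, mb_size):
    """Return (index, offset, length) tuples covering total_size bytes."""
    part_size = mb_size * 1024 * 1024
    ranges = []
    offset = 0
    index = 1  # S3 part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        ranges.append((index, offset, length))
        offset += length
        index += 1
    return ranges


def fake_transfer_part(path, index, offset, length):
    """Stand-in for a real upload: read the slice and report its size."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read(length)
    return index, len(data)


# Create a throwaway file of 5 MiB plus 10 bytes to split.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (5 * 1024 * 1024 + 10))
    path = tmp.name

try:
    ranges = part_ranges(os.path.getsize(path), mb_size=2)
    # Each part is independent, so the parts can be processed concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(lambda r: fake_transfer_part(path, *r), ranges))
finally:
    os.remove(path)
```

Because every part carries its own offset and length, a worker needs only the file path and its range, which is what makes the parts safe to hand to separate workers.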