Warning
This document is for an old release of Galaxy. You can alternatively view this page in the latest release if it exists or view the top of the latest release's documentation.
galaxy.jobs package¶
Support for running a tool in Galaxy via an internal job management system
-
class
galaxy.jobs.
JobDestination
(**kwds)[source]¶ Bases:
galaxy.util.bunch.Bunch
Provides details about where a job runs
-
class
galaxy.jobs.
JobToolConfiguration
(**kwds)[source]¶ Bases:
galaxy.util.bunch.Bunch
Provides details on what handler and destination a tool should use
A JobToolConfiguration will have the required attribute ‘id’ and optional attributes ‘handler’, ‘destination’, and ‘params’
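As a rough illustration of the attribute layout described above, the sketch below uses a stand-in class rather than the real galaxy.util.bunch.Bunch. The class name BunchSketch and the example values are assumptions made for this example only.

```python
class BunchSketch:
    """Hypothetical stand-in for a Bunch: keyword arguments become attributes."""

    def __init__(self, **kwds):
        self.__dict__.update(kwds)


# 'id' is the required attribute; 'handler', 'destination', and 'params'
# are the optional ones named in the description above.
tool_conf = BunchSketch(id="filter_tool", handler="handler0", destination="cluster", params={})
minimal_conf = BunchSketch(id="filter_tool")

print(tool_conf.id, tool_conf.handler, tool_conf.destination)
print(hasattr(minimal_conf, "handler"))
```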
-
class
galaxy.jobs.
JobConfiguration
(app)[source]¶ Bases:
object
,galaxy.util.handlers.ConfiguresHandlers
A parser and interface to advanced job management features.
These features are configured in the job configuration, by default,
job_conf.xml
-
DEFAULT_NWORKERS
= 4¶
-
JOB_RESOURCE_CONDITIONAL_XML
= '<conditional name="__job_resource">\n <param name="__job_resource__select" type="select" label="Job Resource Parameters">\n <option value="no">Use default job resource parameters</option>\n <option value="yes">Specify job resource parameters</option>\n </param>\n <when value="no"/>\n <when value="yes"/>\n </conditional>'¶
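To make the structure of this constant easier to see, the following sketch parses the string shown above with the standard library and lists its select options and when branches. It only inspects the XML and does not claim to show how Galaxy itself consumes it.

```python
import xml.etree.ElementTree as ET

# Copied from the JOB_RESOURCE_CONDITIONAL_XML value documented above.
JOB_RESOURCE_CONDITIONAL_XML = (
    '<conditional name="__job_resource">\n'
    ' <param name="__job_resource__select" type="select" label="Job Resource Parameters">\n'
    ' <option value="no">Use default job resource parameters</option>\n'
    ' <option value="yes">Specify job resource parameters</option>\n'
    ' </param>\n'
    ' <when value="no"/>\n'
    ' <when value="yes"/>\n'
    ' </conditional>'
)

conditional = ET.fromstring(JOB_RESOURCE_CONDITIONAL_XML)
select = conditional.find("param")

# The select offers two choices, and each choice has a matching <when> branch.
option_values = [option.get("value") for option in select.findall("option")]
when_values = [when.get("value") for when in conditional.findall("when")]

print(conditional.get("name"), select.get("name"), option_values, when_values)
```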
-
get_tool_resource_xml
(tool_id, tool_type)[source]¶ Given a tool id, return XML elements describing parameters to insert into job resources.
Tool id: A tool ID (a string)
Tool type: A tool type (a string)
Returns: List of parameter elements.
-
default_job_tool_configuration
¶ The default JobToolConfiguration, used if a tool does not have an explicit definition in the configuration. It consists of a reference to the default handler and default destination.
Returns: JobToolConfiguration – a representation of a <tool> element that uses the default handler and destination
-
get_job_tool_configurations
(ids)[source]¶ Get all configured JobToolConfigurations for a tool ID, or, if given a list of IDs, the JobToolConfigurations for the first ID in ids matching a tool definition.
Note
You should not mix tool shed tool IDs, versionless tool shed IDs, and tool config tool IDs that refer to the same tool.
Parameters: ids (list or str.) – Tool ID or IDs to fetch the JobToolConfiguration of.
Returns: list – JobToolConfiguration Bunches representing <tool> elements matching the specified ID(s).
Example tool ID strings include:
- Full tool shed id:
toolshed.example.org/repos/nate/filter_tool_repo/filter_tool/1.0.0
- Tool shed id less version:
toolshed.example.org/repos/nate/filter_tool_repo/filter_tool
- Tool config tool id:
filter_tool
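The relationship between the three ID forms listed above can be sketched as follows. The helper tool_id_forms is hypothetical and assumes the tool shed layout host/repos/owner/repo/tool/version seen in the example strings; it is not part of galaxy.jobs.

```python
def tool_id_forms(full_tool_shed_id):
    """Return (full id, versionless id, tool config id).

    Hypothetical helper: assumes the ID ends in .../<tool>/<version>.
    """
    parts = full_tool_shed_id.split("/")
    versionless = "/".join(parts[:-1])  # drop the trailing version segment
    tool_config_id = parts[-2]          # the bare tool name
    return full_tool_shed_id, versionless, tool_config_id


forms = tool_id_forms("toolshed.example.org/repos/nate/filter_tool_repo/filter_tool/1.0.0")
for form in forms:
    print(form)
```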
-
get_destination
(id_or_tag)[source]¶ Given a destination ID or tag, return the JobDestination matching the provided ID or tag
Parameters: id_or_tag (str) – A destination ID or tag.
Returns: JobDestination – A valid destination
Destinations are deepcopied as they are expected to be passed in to job runners, which will modify them for persisting params set at runtime.
-
get_destinations
(id_or_tag)[source]¶ Given a destination ID or tag, return all JobDestinations matching the provided ID or tag
Parameters: id_or_tag (str) – A destination ID or tag.
Returns: list or tuple of JobDestinations
Destinations are not deepcopied, so they should not be passed to anything which might modify them.
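The difference between a deep-copied and a shared destination can be illustrated with plain dictionaries. The registry, the function names, and the "cores" parameter below are hypothetical stand-ins, not the JobConfiguration API.

```python
import copy

# Hypothetical stand-in for configured destinations (real ones are JobDestination objects).
configured = {"cluster": {"id": "cluster", "params": {"cores": 1}}}


def get_destination_copy(dest_id):
    # Mirrors the documented get_destination behaviour: callers get a private copy.
    return copy.deepcopy(configured[dest_id])


def get_destinations_shared(dest_id):
    # Mirrors the documented get_destinations behaviour: no copy is made.
    return [configured[dest_id]]


runner_view = get_destination_copy("cluster")
runner_view["params"]["cores"] = 8  # a runner persisting a runtime parameter
cores_after_copy_edit = configured["cluster"]["params"]["cores"]

shared_view = get_destinations_shared("cluster")[0]
shared_is_same_object = shared_view is configured["cluster"]

print(cores_after_copy_edit, shared_is_same_object)
```

Modifying the copy leaves the configured destination untouched, whereas any change made through the shared object would be visible to every later caller.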
-
get_job_runner_plugins
(handler_id)[source]¶ Load all configured job runner plugins
Returns: list of job runner plugins
-
is_id
(collection)[source]¶ Given a collection of handlers or destinations, indicate whether the collection represents a tag or a real ID
Parameters: collection (tuple or list) – A representation of a destination or handler Returns: bool
-
-
class
galaxy.jobs.
JobWrapper
(job, queue, use_persisted_destination=False)[source]¶ Bases:
object
,galaxy.jobs.HasResourceParameters
Wraps a ‘model.Job’ with convenience methods for running processes and state management.
-
cleanup_job
¶ Remove the job after it is complete, should return “always”, “onsuccess”, or “never”.
-
requires_containerization
¶
-
shell
¶
-
disable_commands_in_new_shell
()[source]¶ Provide an extension point to disable this isolation; Pulsar builds its own job script, so this is not needed for remote jobs.
-
strict_shell
¶
-
commands_in_new_shell
¶
-
galaxy_lib_dir
¶
-
galaxy_virtual_env
¶
-
get_job_runner
()¶
-
job_destination
¶ Return the JobDestination that this job will use to run. This will either be a configured destination, a randomly selected destination if the configured destination was a tag, or a dynamically generated destination from the dynamic runner.
Calling this method for the first time causes the dynamic runner to do its calculation, if any.
Returns: JobDestination
-
prepare
(compute_environment=None)[source]¶ Prepare the job to run by creating the working directory and the config files.
-
fail
(message, exception=False, stdout='', stderr='', exit_code=None)[source]¶ Indicate job failure by setting state and message on all output datasets.
-
set_job_destination
(job_destination, external_id=None, flush=True, job=None)[source]¶ Persist job destination params in the database for recovery.
self.job_destination is not used because a runner may choose to rewrite parts of the destination (e.g. the params).
-
home_target
¶
-
tmp_target
¶
-
get_destination_configuration
(key, default=None)[source]¶ Get a destination parameter that can be defaulted back in app.config if it needs to be applied globally.
-
finish
(stdout, stderr, tool_exit_code=None, check_output_detected_state=None, remote_working_directory=None, remote_metadata_directory=None)[source]¶ Called to indicate that the associated command has been run. Updates the output datasets based on stderr and stdout from the command, and the contents of the output files.
-
tmp_dir_creation_statement
¶
-
setup_external_metadata
(exec_dir=None, tmp_dir=None, dataset_files_path=None, config_root=None, config_file=None, datatypes_config=None, resolve_metadata_dependencies=False, set_extension=True, **kwds)[source]¶
-
user
¶
-
user_system_pwent
¶
-
galaxy_system_pwent
¶
-
get_output_destination
(output_path)[source]¶ Destination for outputs marked as from_work_dir. This is the normal case, just copy these files directly to the ultimate destination.
-
requires_setting_metadata
¶
-
-
class
galaxy.jobs.
TaskWrapper
(task, queue)[source]¶ Bases:
galaxy.jobs.JobWrapper
Extension of JobWrapper intended for running tasks. Should be refactored into a generalized executable unit wrapper parent, then jobs and tasks.
-
prepare
(compute_environment=None)[source]¶ Prepare the job to run by creating the working directory and the config files.
-
finish
(stdout, stderr, tool_exit_code=None, **kwds)[source]¶ Called to indicate that the associated command has been run. Updates the output datasets based on stderr and stdout from the command, and the contents of the output files.
-
-
class
galaxy.jobs.
ComputeEnvironment
[source]¶ Bases:
object
Definition of the job as it will be run on the (potentially) remote compute server.
Bases:
galaxy.jobs.SimpleComputeEnvironment
Default ComputeEnvironment for job and task wrapper to pass to ToolEvaluator - valid when Galaxy and compute share all the relevant file systems.
-
class
galaxy.jobs.
NoopQueue
[source]¶ Bases:
object
Implements the JobQueue / JobStopQueue interface but does nothing
-
class
galaxy.jobs.
ParallelismInfo
(tag)[source]¶ Bases:
object
Stores the information (if any) for running multiple instances of the tool in parallel on the same set of inputs.
Subpackages¶
- galaxy.jobs.actions package
- galaxy.jobs.deferred package
- galaxy.jobs.metrics package
- galaxy.jobs.runners package
- Subpackages
- galaxy.jobs.runners.state_handlers package
- galaxy.jobs.runners.util package
- Subpackages
- galaxy.jobs.runners.util.cli package
- galaxy.jobs.runners.util.condor package
- galaxy.jobs.runners.util.drmaa package
- galaxy.jobs.runners.util.job_script package
- Submodules
- galaxy.jobs.runners.util.env module
- galaxy.jobs.runners.util.external module
- galaxy.jobs.runners.util.kill module
- galaxy.jobs.runners.util.retry module
- galaxy.jobs.runners.util.sudo module
- Subpackages
- Submodules
- galaxy.jobs.runners.chronos module
- galaxy.jobs.runners.cli module
- galaxy.jobs.runners.condor module
- galaxy.jobs.runners.drmaa module
- galaxy.jobs.runners.godocker module
- galaxy.jobs.runners.kubernetes module
- galaxy.jobs.runners.local module
- galaxy.jobs.runners.pbs module
- galaxy.jobs.runners.pulsar module
- galaxy.jobs.runners.slurm module
- galaxy.jobs.runners.state_handler_factory module
- galaxy.jobs.runners.tasks module
- Subpackages
- galaxy.jobs.splitters package
Submodules¶
galaxy.jobs.command_factory module¶
-
galaxy.jobs.command_factory.
build_command
(runner, job_wrapper, container=None, modify_command_for_container=True, include_metadata=False, include_work_dir_outputs=True, create_tool_working_directory=True, remote_command_params={}, metadata_directory=None)[source]¶ Compose the sequence of commands necessary to execute a job. This will currently include:
- environment settings corresponding to any requirement tags
- preparing input files
- command line taken from job wrapper
- commands to set metadata (if include_metadata is True)
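The ordering listed above can be sketched as a simple concatenation. The function compose_commands, its parameters, and the shell snippets are all hypothetical; the real build_command takes a runner and a job wrapper and does considerably more.

```python
def compose_commands(env_setup, input_prep, tool_command,
                     metadata_command=None, include_metadata=False):
    """Hypothetical sketch: join the command pieces in the documented order."""
    commands = list(env_setup) + list(input_prep) + [tool_command]
    # Metadata commands are appended only when explicitly requested.
    if include_metadata and metadata_command:
        commands.append(metadata_command)
    return "; ".join(commands)


script = compose_commands(
    env_setup=["export EXAMPLE_HOME=/opt/example"],   # illustrative only
    input_prep=["ln -s /data/in.dat input.dat"],      # illustrative only
    tool_command="example_tool input.dat > output.dat",
    metadata_command="python set_metadata.py",
    include_metadata=True,
)
print(script)
```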
galaxy.jobs.datasets module¶
Utility classes allowing Job interface to reason about datasets.
-
class
galaxy.jobs.datasets.
DatasetPath
(dataset_id, real_path, false_path=None, false_extra_files_path=None, mutable=True)[source]¶ Bases:
object
-
class
galaxy.jobs.datasets.
DatasetPathRewriter
[source]¶ Bases:
object
Used by runner to rewrite paths.
-
class
galaxy.jobs.datasets.
NullDatasetPathRewriter
[source]¶ Bases:
object
Used by default for jobwrapper, do not rewrite anything.
-
class
galaxy.jobs.datasets.
OutputsToWorkingDirectoryPathRewriter
(working_directory)[source]¶ Bases:
object
Rewrites all paths to place them in the specified working directory for normal jobs when Galaxy is configured with app.config.outputs_to_working_directory. Job runner base class is responsible for copying these out after job is complete.
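The idea of redirecting output paths into a working directory can be sketched as below. The class name, the rewrite method, and the choice to keep the original file name are assumptions for illustration and may differ from what Galaxy's rewriter actually produces.

```python
import posixpath


class WorkingDirectoryRewriterSketch:
    """Hypothetical sketch: map a dataset's real path into a working directory."""

    def __init__(self, working_directory):
        self.working_directory = working_directory

    def rewrite(self, real_path):
        # Assumption: keep the file name and change only the directory.
        return posixpath.join(self.working_directory, posixpath.basename(real_path))


rewriter = WorkingDirectoryRewriterSketch("/jobs/000/42/working")
rewritten = rewriter.rewrite("/galaxy/files/000/dataset_7.dat")
print(rewritten)
```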
galaxy.jobs.dynamic_tool_destination module¶
-
exception
galaxy.jobs.dynamic_tool_destination.
MalformedYMLException
[source]¶ Bases:
exceptions.Exception
-
exception
galaxy.jobs.dynamic_tool_destination.
ScannerError
[source]¶ Bases:
exceptions.Exception
-
galaxy.jobs.dynamic_tool_destination.
get_keys_from_dict
(dl, keys_list)[source]¶ This function builds a list using the keys from nested dictionaries
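A minimal sketch of collecting keys from nested dictionaries follows. The function collect_keys and the sample config are written for this example; the depth-first order and the handling of lists are assumptions, not a statement of what get_keys_from_dict does internally.

```python
def collect_keys(obj, keys_list):
    """Append every dictionary key found in obj (depth-first) to keys_list."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            keys_list.append(key)
            collect_keys(value, keys_list)
    elif isinstance(obj, list):
        for item in obj:
            collect_keys(item, keys_list)
    return keys_list


config = {
    "tools": {"example_tool": {"rules": [{"rule_type": "file_size"}]}},
    "default_destination": "cluster",
}
keys = collect_keys(config, [])
print(keys)
```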
-
class
galaxy.jobs.dynamic_tool_destination.
RuleValidator
[source]¶ This class is the primary facility for validating configs. It’s always called in map_tool_to_destination and it’s called for validating config directly through DynamicToolDestination.py
-
classmethod
validate_rule
(rule_type, return_bool=False, *args, **kwargs)[source]¶ This function is responsible for passing each rule to its relevant function.
@type rule_type: str @param rule_type: the current rule’s type
@type return_bool: bool @param return_bool: True when we are only interested in the result of the validation, and not the validated rule itself.
@rtype: bool, dict (depending on return_bool) @return: validated rule or result of validation (depending on return_bool)
-
galaxy.jobs.dynamic_tool_destination.
parse_yaml
(path='/config/tool_destinations.yml', test=False, return_bool=False)[source]¶ Get a yaml file from path and send it to validate_config for validation.
@type path: str @param path: the path to the config file
@type test: bool @param test: indicates whether to run in test mode or production mode
@type return_bool: bool @param return_bool: True when we are only interested in the result of the validation, and not the validated rule itself.
@rtype: bool, dict (depending on return_bool) @return: validated rule or result of validation (depending on return_bool)
-
galaxy.jobs.dynamic_tool_destination.
validate_config
(obj, return_bool=False)[source]¶ Validate received config.
@type obj: dict @param obj: the entire contents of the config
@type return_bool: bool @param return_bool: True when we are only interested in the result of the validation, and not the validated rule itself.
@rtype: bool, dict (depending on return_bool) @return: validated rule or result of validation (depending on return_bool)
-
galaxy.jobs.dynamic_tool_destination.
bytes_to_str
(size, unit='YB')[source]¶ Uses the bi convention: 1024 B = 1 KB since this method primarily has inputs of bytes for RAM
@type size: int @param size: the size in int (bytes) to be converted to str
@rtype: str @return return_str: the resulting string
-
galaxy.jobs.dynamic_tool_destination.
str_to_bytes
(size)[source]¶ Uses the bi convention: 1024 B = 1 KB since this method primarily has inputs of bytes for RAM
@type size: str @param size: the size in str to be converted to int (bytes)
@rtype: int @return curr_size: the resulting size converted from str
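The 1024 B = 1 KB convention used by bytes_to_str and str_to_bytes can be sketched as follows. These functions are written for illustration; the output format and the reading of the unit argument as the largest unit to scale up to are assumptions, not the library's documented behaviour.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def str_to_bytes_sketch(size):
    """Convert a string such as '1.5 GB' to a byte count (1 KB = 1024 B)."""
    text = size.strip().upper()
    # Check the largest units first so that 'KB' is not mistaken for 'B'.
    for exponent in range(len(UNITS) - 1, -1, -1):
        unit = UNITS[exponent]
        if text.endswith(unit):
            number = text[: -len(unit)].strip()
            return int(float(number) * 1024 ** exponent)
    return int(text)  # no unit suffix: treat the value as bytes


def bytes_to_str_sketch(size, max_unit="YB"):
    """Scale a byte count up by factors of 1024, stopping at max_unit."""
    value = float(size)
    index = 0
    limit = UNITS.index(max_unit)
    while value >= 1024 and index < limit:
        value /= 1024
        index += 1
    return "%.2f %s" % (value, UNITS[index])


print(str_to_bytes_sketch("10 KB"), bytes_to_str_sketch(1536))
```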
-
galaxy.jobs.dynamic_tool_destination.
importer
(test)[source]¶ Uses Mock galaxy for testing or real galaxy for production
@type test: bool @param test: True when being run from a test
-
galaxy.jobs.dynamic_tool_destination.
map_tool_to_destination
(job, app, tool, user_email, test=False, path=None)[source]¶ Dynamically allocate resources
@param job: galaxy job @param app: current app @param tool: current tool
@type test: bool @param test: True when running in test mode
@type path: str @param path: path to tool_destinations.yml
galaxy.jobs.error_level module¶
galaxy.jobs.handler module¶
Galaxy job handler, prepares, runs, tracks, and finishes Galaxy jobs
-
class
galaxy.jobs.handler.
JobHandler
(app)[source]¶ Bases:
object
Handle the preparation, running, tracking, and finishing of jobs
-
class
galaxy.jobs.handler.
JobHandlerQueue
(app, dispatcher)[source]¶ Bases:
galaxy.util.monitors.Monitors
,object
Job Handler’s Internal Queue, this is what actually implements waiting for jobs to be runnable and dispatching to a JobRunner.
-
STOP_SIGNAL
= <object object>¶
-
-
class
galaxy.jobs.handler.
JobHandlerStopQueue
(app, dispatcher)[source]¶ Bases:
galaxy.util.monitors.Monitors
A queue for jobs which need to be terminated prematurely.
-
STOP_SIGNAL
= <object object>¶
-
-
class
galaxy.jobs.handler.
DefaultJobDispatcher
(app)[source]¶ Bases:
object
galaxy.jobs.manager module¶
Top-level Galaxy job manager, moves jobs to handler(s)
-
class
galaxy.jobs.manager.
JobManager
(app)[source]¶ Bases:
object
Highest level interface to job management.
- TODO: Currently the app accesses “job_queue” and “job_stop_queue” directly.
- This should be decoupled.
-
class
galaxy.jobs.manager.
MessageJobHandler
(app)[source]¶ Bases:
galaxy.jobs.manager.NoopHandler
Implements the JobHandler interface but just to send setup messages on startup
TODO: It should be documented that starting two Galaxy uWSGI master processes simultaneously would result in a race condition that could cause two handlers to pick up the same job.
The recommended config for now will be webless handlers if running more than one uWSGI (master) process
galaxy.jobs.mapper module¶
-
exception
galaxy.jobs.mapper.
JobMappingException
(failure_message)[source]¶ Bases:
exceptions.Exception
-
exception
galaxy.jobs.mapper.
JobNotReadyException
(job_state=None, message=None)[source]¶ Bases:
exceptions.Exception
galaxy.jobs.output_checker module¶
-
galaxy.jobs.output_checker.
check_output
(tool, stdout, stderr, tool_exit_code, job)[source]¶ Check the output of a tool - given the stdout, stderr, and the tool’s exit code, return DETECTED_JOB_STATE.OK if the tool exited successfully or error type otherwise. No exceptions should be thrown. If this code encounters an exception, it returns OK so that the workflow can continue; otherwise, a bug in this code could halt workflow progress.
Note that, if the tool did not define any exit code handling or any stdio/stderr handling, then it reverts back to previous behavior: if stderr contains anything, then False is returned.
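Two properties described above, never raising and falling back to a stderr check when the tool defines no handling, can be sketched like this. The state constants, the function name, and the exit_code_is_error callback are hypothetical simplifications of the real tool-defined rules.

```python
OK, ERROR = "ok", "error"  # hypothetical stand-ins for DETECTED_JOB_STATE values


def check_output_sketch(stdout, stderr, exit_code, exit_code_is_error=None):
    """Hypothetical sketch of the documented fallback and exception policy."""
    try:
        if exit_code_is_error is None:
            # No handling defined by the tool: any stderr content means failure.
            return ERROR if stderr else OK
        return ERROR if exit_code_is_error(exit_code) else OK
    except Exception:
        # A bug in the checker must not halt progress, so report OK.
        return OK


def broken_rule(exit_code):
    raise ValueError("simulated bug in a checking rule")


print(check_output_sketch("", "warning text", 0))
print(check_output_sketch("done", "", 1, exit_code_is_error=lambda code: code != 0))
print(check_output_sketch("done", "", 1, exit_code_is_error=broken_rule))
```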
galaxy.jobs.rule_helper module¶
-
class
galaxy.jobs.rule_helper.
RuleHelper
(app)[source]¶ Bases:
object
Utility to allow job rules to interface cleanly with the rest of Galaxy and shield them from low-level details of models, metrics, etc.
Currently focus is on figuring out job statistics for a given user, but could interface with other stuff as well.
-
supports_docker
(job_or_tool)[source]¶ Job rules can pass this function a job, job_wrapper, or tool and determine if the underlying tool believes it can be containerized.
-
should_burst
(destination_ids, num_jobs, job_states=None)[source]¶ Check if the specified destinations destination_ids have at least num_jobs assigned to them. Send in job_states as queued to limit this check to the number of jobs queued.
See stock_rules for a simple example of using this function. To get the most out of it, it should probably be used with custom job rules that can respond to the bursting by allocating resources, launching cloud nodes, etc.
-
choose_one
(lst, hash_value=None)[source]¶ Choose a random value from supplied list. If hash_value is passed in then every request with that same hash_value would produce the same choice from the supplied list.
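One way to obtain a repeatable choice from a hash value is to reduce a digest modulo the list length, as in the sketch below. The use of MD5 and the function name are assumptions for illustration, not a description of RuleHelper's internals.

```python
import hashlib
import random


def choose_one_sketch(lst, hash_value=None):
    """Pick a random item, or a repeatable one when hash_value is given."""
    if hash_value is None:
        return random.choice(lst)
    # Assumption: derive a stable integer from the hash value's string form.
    digest = hashlib.md5(str(hash_value).encode("utf-8")).hexdigest()
    return lst[int(digest, 16) % len(lst)]


destinations = ["cluster_a", "cluster_b", "cluster_c"]
first = choose_one_sketch(destinations, hash_value="history:42")
second = choose_one_sketch(destinations, hash_value="history:42")
unhashed = choose_one_sketch(destinations)
print(first == second, unhashed in destinations)
```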
-
job_hash
(job, hash_by=None)[source]¶ Produce a reproducible hash for the given job on various criteria - for instance if hash_by is “workflow_invocation,history” - all jobs within the same workflow invocation will receive the same hash - for jobs outside of workflows all jobs within the same history will receive the same hash, other jobs will be hashed on job’s id randomly.
Primarily intended for use with choose_one above, to consistently route or schedule related jobs.
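The fallback order described for hash_by can be sketched as below. The FakeJob record, its field names, the string form of the returned value, and the default hash_by are all assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical stand-in for a job with the attributes the criteria refer to.
FakeJob = namedtuple("FakeJob", "id workflow_invocation_id history_id")


def job_hash_sketch(job, hash_by="workflow_invocation,history"):
    """Return a grouping key using the first criterion that applies to the job."""
    for criterion in hash_by.split(","):
        if criterion == "workflow_invocation" and job.workflow_invocation_id is not None:
            return "workflow_invocation:%s" % job.workflow_invocation_id
        if criterion == "history" and job.history_id is not None:
            return "history:%s" % job.history_id
    # Nothing matched: fall back to the job's own id.
    return "job:%s" % job.id


in_workflow_a = FakeJob(id=1, workflow_invocation_id=10, history_id=5)
in_workflow_b = FakeJob(id=2, workflow_invocation_id=10, history_id=6)
history_only = FakeJob(id=3, workflow_invocation_id=None, history_id=5)
standalone = FakeJob(id=4, workflow_invocation_id=None, history_id=None)

print(job_hash_sketch(in_workflow_a), job_hash_sketch(history_only), job_hash_sketch(standalone))
```

A key produced this way could be passed as the hash_value of a deterministic chooser so that related jobs land on the same destination.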
-
galaxy.jobs.stock_rules module¶
Stock job ‘dynamic’ rules for use in job_conf.xml. These may cover some simple use cases, but they just proxy into functions in rule_helper, so similar but more tailored and composable functionality can be used in custom rules.
galaxy.jobs.transfer_manager module¶
Manage transfers from arbitrary URLs to temporary files. Socket interface for IPC with multiple process configurations.
-
class
galaxy.jobs.transfer_manager.
TransferManager
(app)[source]¶ Bases:
object
Manage simple data transfers from URLs to temporary locations.