galaxy.datatypes package

Submodules

galaxy.datatypes.annotation module

class galaxy.datatypes.annotation.SnapHmm(**kwd)[source]

Bases: Text

file_ext = 'snaphmm'
edam_data = 'data_1364'
set_peek(dataset)[source]

Set the peek. This method is used by various subclasses of Text.

display_peek(dataset)[source]

Create HTML table, used for displaying peek

sniff_prefix(file_prefix: FilePrefix)[source]

SNAP model files start with zoeHMM
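As a rough illustration of the prefix check described above, the sketch below tests whether text begins with the `zoeHMM` marker. The function name `looks_like_snap_hmm` is hypothetical and is not Galaxy's implementation; it only assumes the caller has already read the start of the file as text.

```python
def looks_like_snap_hmm(prefix_text: str) -> bool:
    """Return True when the text starts with the SNAP marker 'zoeHMM'."""
    # Hypothetical helper: only the leading marker is inspected.
    return prefix_text.startswith("zoeHMM")
```

Checking only the leading bytes keeps the test cheap, which is the usual reason for sniffing on a file prefix rather than on the whole file.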

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.annotation.Augustus(**kwd)[source]

Bases: CompressedArchive

Class describing an Augustus prediction model

file_ext = 'augustus'
edam_data = 'data_0950'
compressed = True
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

sniff(filename)[source]

Augustus archives always contain the same files

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.anvio module

Datatypes for Anvi’o https://github.com/merenlab/anvio

class galaxy.datatypes.anvio.AnvioComposite(**kwd)[source]

Bases: Html

Base class to use for Anvi’o composite datatypes. These generally consist of a SQLite database plus optional additional files.

file_ext = 'anvio_composite'
composite_type: Optional[str] = 'auto_primary_file'
generate_primary_file(dataset=None)[source]

This is called only at upload to write the HTML file. The datasets cannot be renamed here; unfortunately, they come with the default names.

get_mime()[source]

Returns the mime type of the datatype

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML content, used for displaying peek.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioDB(*args, **kwd)[source]

Bases: AnvioComposite

Class for AnvioDB database files.

file_ext = 'anvio_db'
__init__(*args, **kwd)[source]

Initialize the datatype

set_meta(dataset, **kwd)[source]

Set the anvio_basename based upon actual extra_files_path contents.

metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioStructureDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Structure DB database files.

file_ext = 'anvio_structure_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioGenomesDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Genomes DB database files.

file_ext = 'anvio_genomes_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioContigsDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Contigs DB database files.

file_ext = 'anvio_contigs_db'
__init__(*args, **kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioProfileDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Profile DB database files.

file_ext = 'anvio_profile_db'
__init__(*args, **kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioPanDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Pan DB database files.

file_ext = 'anvio_pan_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioSamplesDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Samples DB database files.

file_ext = 'anvio_samples_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.assembly module

Velvet datatypes, by James E Johnson (University of Minnesota), for the Velvet assembler tool in Galaxy.

class galaxy.datatypes.assembly.Amos(**kwd)[source]

Bases: Text

Class describing the AMOS assembly file

edam_data = 'data_0925'
edam_format = 'format_3582'
file_ext = 'afg'
sniff_prefix(file_prefix: FilePrefix)[source]

Determines whether the file is in the AMOS assembly file format. Example:

{CTG
iid:1
eid:1
seq:
CCTCTCCTGTAGAGTTCAACCGA-GCCGGTAGAGTTTTATCA
.
qlt:
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
.
{TLE
src:1027
off:0
clr:618,0
gap:
250 612
.
}
}
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.assembly.Sequences(**kwd)[source]

Bases: Fasta

Class describing the Sequences file generated by velveth

edam_data = 'data_0925'
file_ext = 'sequences'
sniff_prefix(file_prefix: FilePrefix)[source]

Determines whether the file is a velveth-produced FASTA file. The id line has 3 fields separated by tabs (sequence_name, sequence_index, category):

>SEQUENCE_0_length_35   1       1
GGATATAGGGCCAACCCAACTCAACGGCCTGTCTT
>SEQUENCE_1_length_35   2       1
CGACGAATGACAGGTCACGAATTTGGCGGGGATTA
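The id-line layout described above can be checked with a few lines of Python. This is a minimal sketch, not Galaxy's code: the helper name `is_velveth_id_line` is hypothetical, and it assumes the index and category fields are non-negative integers, as they are in the example.

```python
def is_velveth_id_line(line: str) -> bool:
    """Check for '>name<TAB>index<TAB>category' on a single id line."""
    if not line.startswith(">"):
        return False
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 3:
        return False
    name, index, category = fields
    # Assumption: index and category are plain non-negative integers.
    return len(name) > 1 and index.isdigit() and category.isdigit()
```

An ordinary FASTA header without the two trailing tab-separated fields fails this check, which is what distinguishes velveth output from generic FASTA.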
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'sequences': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.assembly.Roadmaps(**kwd)[source]

Bases: Text

Class describing the Roadmaps file generated by velveth

edam_format = 'format_2561'
file_ext = 'roadmaps'
sniff_prefix(file_prefix: FilePrefix)[source]
Determines whether the file is a velveth produced RoadMap:

142858  21      1
ROADMAP 1
ROADMAP 2
…

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.assembly.Velvet(**kwd)[source]

Bases: Html

composite_type: Optional[str] = 'auto_primary_file'
file_ext = 'velvet'
__init__(**kwd)[source]

Initialize the datatype

generate_primary_file(dataset=None)[source]
regenerate_primary_file(dataset)[source]

Cannot do this until we are setting metadata.

set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'long_reads': <galaxy.model.metadata.MetadataElementSpec object>, 'paired_end_reads': <galaxy.model.metadata.MetadataElementSpec object>, 'short2_reads': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.binary module

Binary classes

class galaxy.datatypes.binary.Binary(**kwd)[source]

Bases: Data

Binary data

edam_format = 'format_2333'
file_ext = 'binary'
static register_sniffable_binary_format(data_type, ext, type_class)[source]

Deprecated method.

static register_unsniffable_binary_ext(ext)[source]

Deprecated method.

set_peek(dataset, **kwd)[source]

Set the peek and blurb text

get_mime()[source]

Returns the mime type of the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Ab1(**kwd)[source]

Bases: Binary

Class describing an ab1 binary sequence file

file_ext = 'ab1'
edam_format = 'format_3000'
edam_data = 'data_0924'
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Idat(**kwd)[source]

Bases: Binary

Binary data in idat format

file_ext = 'idat'
edam_format = 'format_2058'
edam_data = 'data_2603'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Cel(**kwd)[source]

Bases: Binary

Cel File format described at: http://media.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cel.html

is_binary: Union[bool, typing_extensions.Literal[maybe]] = 'maybe'
file_ext = 'cel'
edam_format = 'format_1638'
edam_data = 'data_3110'
sniff(filename)[source]

Try to guess if the file is a Cel file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_3.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_4.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('test.gal')
>>> Cel().sniff(fname)
False
set_meta(dataset, **kwd)[source]

Set metadata for Cel file.

set_peek(dataset)[source]

Set the peek and blurb text

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MashSketch(**kwd)[source]

Bases: Binary

Mash Sketch file. Sketches are used by the MinHash algorithm to allow fast distance estimations with low storage and memory requirements. To make a sketch, each k-mer in a sequence is hashed, which creates a pseudo-random identifier. By sorting these identifiers (hashes), a small subset from the top of the sorted list can represent the entire sequence (these are min-hashes). The more similar another sequence is, the more min-hashes it is likely to share.
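The sketching idea described above can be illustrated in plain Python. This is a toy sketch under stated assumptions: SHA-1 stands in for the hash function, the function names are hypothetical, and nothing here reads or writes the actual `msh` file format.

```python
import hashlib


def kmer_hashes(sequence: str, k: int) -> set:
    """Hash every k-mer of the sequence to a 64-bit integer."""
    hashes = set()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k].encode("ascii")
        # SHA-1 is a stand-in here; it is not the hash function Mash uses.
        digest = hashlib.sha1(kmer).digest()
        hashes.add(int.from_bytes(digest[:8], "big"))
    return hashes


def sketch(sequence: str, k: int = 4, size: int = 8) -> list:
    """Keep the `size` smallest hashes (the min-hashes), in sorted order."""
    return sorted(kmer_hashes(sequence, k))[:size]


def shared_fraction(sketch_a: list, sketch_b: list, size: int = 8) -> float:
    """Fraction of the smallest `size` hashes of the union found in both sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    both = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in both) / len(merged)
```

Two identical sequences share every min-hash, so `shared_fraction` returns 1.0 for them; sequences with no k-mers in common return 0.0.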

file_ext = 'msh'
is_binary: Union[bool, typing_extensions.Literal[maybe]] = 'maybe'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.CompressedArchive(**kwd)[source]

Bases: Binary

Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.

file_ext = 'compressed_archive'
compressed = True
is_binary: Union[bool, typing_extensions.Literal[maybe]] = 'maybe'
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Meryldb(**kwd)[source]

Bases: CompressedArchive

MerylDB is a tar.gz archive with 128 files: 64 data files and 64 index files.
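A check for the archive layout described above might look like the following sketch. It is not Galaxy's implementation: the function names are hypothetical, and the `.merylData` and `.merylIndex` suffixes are assumptions made for illustration only.

```python
import tarfile


def count_meryl_members(names):
    """Count member names that look like data files and index files.

    The '.merylData' and '.merylIndex' suffixes are assumed, not confirmed.
    """
    n_data = sum(1 for name in names if name.endswith(".merylData"))
    n_index = sum(1 for name in names if name.endswith(".merylIndex"))
    return n_data, n_index


def looks_like_meryldb(path: str) -> bool:
    """Return True if path is a tar archive holding 64 data and 64 index files."""
    if not tarfile.is_tarfile(path):
        return False
    # "r:*" lets tarfile detect the compression (gzip for a tar.gz) itself.
    with tarfile.open(path, "r:*") as archive:
        names = [member.name for member in archive.getmembers() if member.isfile()]
    return count_meryl_members(names) == (64, 64)
```

Counting only regular files means directory entries inside the archive do not affect the result.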

file_ext = 'meryldb'
sniff(filename)[source]

Try to guess if the file is a Meryldb file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Meryldb().sniff(fname)
False
>>> fname = get_test_fname('read-db.meryldb')
>>> Meryldb().sniff(fname)
True
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Bref3(**kwd)[source]

Bases: Binary

Bref3 format is a binary format for storing phased, non-missing genotypes for a list of samples.

file_ext = 'bref3'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(sniff_prefix)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.DynamicCompressedArchive(**kwd)[source]

Bases: CompressedArchive

compressed_format: str
uncompressed_datatype_instance: Data
matches_any(target_datatypes) bool[source]

Treat two aspects of compressed datatypes separately.

metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.GzDynamicCompressedArchive(**kwd)[source]

Bases: DynamicCompressedArchive

compressed_format: str = 'gzip'
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

uncompressed_datatype_instance: Data
class galaxy.datatypes.binary.Bz2DynamicCompressedArchive(**kwd)[source]

Bases: DynamicCompressedArchive

compressed_format: str = 'bz2'
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

uncompressed_datatype_instance: Data
class galaxy.datatypes.binary.CompressedZipArchive(**kwd)[source]

Bases: CompressedArchive

Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.

file_ext = 'zip'
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.GenericAsn1Binary(**kwd)[source]

Bases: Binary

Class for generic ASN.1 binary format

file_ext = 'asn1-binary'
edam_format = 'format_1966'
edam_data = 'data_0849'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BamNative(**kwd)[source]

Bases: CompressedArchive, _BamOrSam

Class describing a BAM binary file that is not necessarily sorted

edam_format = 'format_2572'
edam_data = 'data_0863'
file_ext = 'unsorted.bam'
sort_flag: Optional[str] = None
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

static merge(split_files, output_file)[source]

Merges BAM files

Parameters
  • split_files – List of bam file paths to merge

  • output_file – Write merged bam file to this location

init_meta(dataset, copy_from=None)[source]
sniff(filename)[source]
classmethod is_bam(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

to_archive(dataset, name='')[source]

Collect archive paths and file handles that need to be exported when archiving dataset.

Parameters
  • dataset – HistoryDatasetAssociation

  • name – archive name, in collection context corresponds to collection name(s) and element_identifier, joined by ‘/’, e.g. ‘fastq_collection/sample1/forward’

groom_dataset_content(file_name)[source]

Ensures that the BAM file contents are coordinate-sorted. This function is called on an output dataset after the content is initially generated.

get_chunk(trans, dataset, offset=0, ck_size=None)[source]
display_data(trans, dataset, preview=False, filename=None, to_ext=None, offset=None, ck_size=None, **kwd)[source]

Displays data in central pane if preview is True, else handles download.

Datatypes should be very careful if overriding this method, as this interface between datatypes and Galaxy will likely change.

TODO: Document alternatives to overriding this method (data providers?).

validate(dataset, **kwd)[source]
metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Bam(**kwd)[source]

Bases: BamNative

Class describing a BAM binary file

edam_format = 'format_2572'
edam_data = 'data_0863'
file_ext = 'bam'
track_type: Optional[str] = 'ReadTrack'
data_sources: Dict[str, str] = {'data': 'bai', 'index': 'bigwig'}
get_index_flag(file_name)[source]

Return the pysam flag for a BAI index (default) or a CSI index (needed when a contig is longer than 2**29 - 1 bases).
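The choice between the two index types can be sketched as a simple threshold test on the reference lengths. This is an illustrative helper, not the method's actual body: the name `choose_index_flag` is hypothetical, the lengths are assumed to be supplied by the caller, and the returned strings follow the `samtools index` options `-b` (BAI) and `-c` (CSI) as an assumption about what "flag" means here.

```python
BAI_MAX_CONTIG_LENGTH = 2**29 - 1  # longest contig a BAI index can address


def choose_index_flag(reference_lengths):
    """Return '-b' (BAI) unless some contig is too long, in which case '-c' (CSI)."""
    if any(length > BAI_MAX_CONTIG_LENGTH for length in reference_lengths):
        return "-c"
    return "-b"
```

A contig of exactly 2**29 - 1 bases still fits a BAI index; one base more requires CSI.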

dataset_content_needs_grooming(file_name)[source]

Check if file_name is a coordinate-sorted BAM file

set_meta(dataset, overwrite=True, metadata_tmp_files_dir=None, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(file_name)[source]
line_dataprovider(dataset, **settings)[source]
regex_line_dataprovider(dataset, **settings)[source]
column_dataprovider(dataset, **settings)[source]
dict_dataprovider(dataset, **settings)[source]
header_dataprovider(dataset, **settings)[source]
id_seq_qual_dataprovider(dataset, **settings)[source]
genomic_region_dataprovider(dataset, **settings)[source]
genomic_region_dict_dataprovider(dataset, **settings)[source]
samtools_dataprovider(dataset, **settings)[source]

Generic samtools interface - all options available through settings.

dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'column': <function Bam.column_dataprovider>, 'dict': <function Bam.dict_dataprovider>, 'genomic-region': <function Bam.genomic_region_dataprovider>, 'genomic-region-dict': <function Bam.genomic_region_dict_dataprovider>, 'header': <function Bam.header_dataprovider>, 'id-seq-qual': <function Bam.id_seq_qual_dataprovider>, 'line': <function Bam.line_dataprovider>, 'regex-line': <function Bam.regex_line_dataprovider>, 'samtools': <function Bam.samtools_dataprovider>}
metadata_spec: metadata.MetadataSpecCollection = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.ProBam(**kwd)[source]

Bases: Bam

Class describing a BAM binary file - extended for proteomics data

edam_format = 'format_3826'
edam_data = 'data_0863'
file_ext = 'probam'
metadata_spec: metadata.MetadataSpecCollection = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BamInputSorted(**kwd)[source]

Bases: BamNative

A class for BAM files that can formally be unsorted or queryname sorted. Alignments are either ordered based on the order in which the queries appear when producing the alignment, or ordered by their queryname. This notably keeps alignments produced by paired end sequencing adjacent.

sort_flag: Optional[str] = '-n'
file_ext = 'qname_input_sorted.bam'
sniff(file_name)[source]
dataset_content_needs_grooming(file_name)[source]

Groom if the file is coordinate sorted

metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BamQuerynameSorted(**kwd)[source]

Bases: BamInputSorted

A class for queryname sorted BAM files.

sort_flag: Optional[str] = '-n'
file_ext = 'qname_sorted.bam'
sniff(file_name)[source]
dataset_content_needs_grooming(file_name)[source]

Check if file_name is a queryname-sorted BAM file
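In the SAM/BAM header, the declared sort order is stored in the `SO` tag of the `@HD` line (`unknown`, `unsorted`, `queryname` or `coordinate`). The sketch below reads that tag from header text. It is not Galaxy's implementation: the function name is hypothetical, and it assumes the header has already been extracted as text, for example with `samtools view -H`.

```python
def sort_order_from_header(header_text: str) -> str:
    """Return the SO value of the @HD line, or 'unknown' if it is absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"
```

A grooming check can then compare the returned value with the order the datatype expects, such as `queryname` here or `coordinate` for `Bam`.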

metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.CRAM(**kwd)[source]

Bases: Binary

file_ext = 'cram'
edam_format = 'format_3462'
edam_data = 'data_0863'
set_meta(dataset, overwrite=True, metadata_tmp_files_dir=None, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

get_cram_version(filename)[source]
set_index_file(dataset, index_file)[source]
set_peek(dataset)[source]

Set the peek and blurb text

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'cram_index': <galaxy.model.metadata.MetadataElementSpec object>, 'cram_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BaseBcf(**kwd)[source]

Bases: CompressedArchive

edam_format = 'format_3020'
edam_data = 'data_3498'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Bcf(**kwd)[source]

Bases: BaseBcf

Class describing a (BGZF-compressed) BCF file

file_ext = 'bcf'
sniff(filename)[source]
set_meta(dataset, overwrite=True, metadata_tmp_files_dir=None, **kwd)[source]

Creates the index for the BCF file.

metadata_spec: MetadataSpecCollection = {'bcf_index': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BcfUncompressed(**kwd)[source]

Bases: BaseBcf

Class describing an uncompressed BCF file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('1.bcf_uncompressed')
>>> BcfUncompressed().sniff(fname)
True
>>> fname = get_test_fname('1.bcf')
>>> BcfUncompressed().sniff(fname)
False
file_ext = 'bcf_uncompressed'
compressed = False
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.H5(**kwd)[source]

Bases: Binary

Class describing an HDF5 file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.mz5')
>>> H5().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> H5().sniff(fname)
False
file_ext = 'h5'
edam_format = 'format_3590'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename)[source]
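An HDF5 file is normally recognisable by its 8-byte format signature. The sketch below checks for that signature at the start of the data. It is an illustration rather than Galaxy's code: the helper name is hypothetical, and files that place the signature after a user block at a later offset are deliberately not handled.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 format signature


def has_hdf5_signature(first_bytes: bytes) -> bool:
    """Return True if the data begins with the HDF5 format signature."""
    # Only offset 0 is checked; a user block would move the signature.
    return first_bytes.startswith(HDF5_SIGNATURE)
```

Subclasses such as Loom or Anndata would need additional checks on the file contents, since every HDF5-based format shares this signature.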
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Loom(**kwd)[source]

Bases: H5

Class describing a Loom file: http://loompy.org/

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.loom')
>>> Loom().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Loom().sniff(fname)
False
file_ext = 'loom'
edam_format = 'format_3590'
sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

metadata_spec: MetadataSpecCollection = {'col_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'col_attrs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'col_graphs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'col_graphs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'description': <galaxy.model.metadata.MetadataElementSpec object>, 'doi': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object>, 'loom_spec_version': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'row_graphs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'row_graphs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'title': <galaxy.model.metadata.MetadataElementSpec object>, 'url': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Anndata(**kwd)[source]

Bases: H5

Class describing HDF5 anndata files: http://anndata.rtfd.io

>>> from galaxy.datatypes.sniff import get_test_fname
>>> Anndata().sniff(get_test_fname('pbmc3k_tiny.h5ad'))
True
>>> Anndata().sniff(get_test_fname('test.mz5'))
False
>>> Anndata().sniff(get_test_fname('import.loom.krumsiek11.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk.h5ad'))
True
file_ext = 'h5ad'
sniff(filename)[source]
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'anndata_spec_version': <galaxy.model.metadata.MetadataElementSpec object>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'description': <galaxy.model.metadata.MetadataElementSpec object>, 'doi': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_size': <galaxy.model.metadata.MetadataElementSpec object>, 'obsm_count': <galaxy.model.metadata.MetadataElementSpec object>, 'obsm_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_count': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_size': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'title': <galaxy.model.metadata.MetadataElementSpec object>, 'uns_count': <galaxy.model.metadata.MetadataElementSpec object>, 'uns_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'url': <galaxy.model.metadata.MetadataElementSpec object>, 'var_count': <galaxy.model.metadata.MetadataElementSpec object>, 'var_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'var_size': <galaxy.model.metadata.MetadataElementSpec object>, 'varm_count': <galaxy.model.metadata.MetadataElementSpec object>, 'varm_layers': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Grib(**kwd)[source]

Bases: Binary

Class describing a GRIB file

>>> from galaxy.datatypes.sniff import FilePrefix, get_test_fname
>>> fname = get_test_fname('test.grib')
>>> Grib().sniff_prefix(FilePrefix(fname))
True
>>> fname = FilePrefix(get_test_fname('interval.interval'))
>>> Grib().sniff_prefix(fname)
False
file_ext = 'grib'
edam_format = 'format_2333'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

set_meta(dataset, **kwd)[source]

Set the GRIB edition.

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'grib_edition': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.GmxBinary(**kwd)[source]

Bases: Binary

Base class for GROMACS binary files - xtc, trr, cpt

magic_number: Optional[int] = None
file_ext = ''
sniff_prefix(sniff_prefix)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.Trr(**kwd)[source]

Bases: GmxBinary

Class describing a trr file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.trr')
>>> Trr().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Trr().sniff(fname)
False
file_ext = 'trr'
magic_number: Optional[int] = 1993
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Cpt(**kwd)[source]

Bases: GmxBinary

Class describing a checkpoint (.cpt) file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.cpt')
>>> Cpt().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Cpt().sniff(fname)
False
file_ext = 'cpt'
magic_number: Optional[int] = 171817
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Xtc(**kwd)[source]

Bases: GmxBinary

Class describing an xtc file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.xtc')
>>> Xtc().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Xtc().sniff(fname)
False
file_ext = 'xtc'
magic_number: Optional[int] = 1995
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Edr(**kwd)[source]

Bases: GmxBinary

Class describing an edr file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.edr')
>>> Edr().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Edr().sniff(fname)
False
file_ext = 'edr'
magic_number: Optional[int] = -55555
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
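
The GmxBinary subclasses above differ only in file_ext and magic_number. As a minimal sketch of how such a check could work, the snippet below assumes the magic number is stored as a big-endian signed 32-bit integer in the first four bytes of the file (GROMACS binary files are XDR encoded). The helper names read_magic_number and guess_gromacs_ext are hypothetical and not part of Galaxy.

```python
import struct
from typing import Optional

# Magic numbers as listed for the Trr, Cpt, Xtc and Edr classes above.
GROMACS_MAGIC_NUMBERS = {
    "trr": 1993,
    "cpt": 171817,
    "xtc": 1995,
    "edr": -55555,
}


def read_magic_number(header: bytes) -> Optional[int]:
    """Interpret the first 4 bytes as a big-endian signed 32-bit integer.

    Assumption: this byte layout is not stated in the documentation above.
    """
    if len(header) < 4:
        return None
    return struct.unpack(">i", header[:4])[0]


def guess_gromacs_ext(header: bytes) -> Optional[str]:
    """Return the extension whose magic number matches, or None."""
    magic = read_magic_number(header)
    for ext, expected in GROMACS_MAGIC_NUMBERS.items():
        if magic == expected:
            return ext
    return None
```

Because the four magic numbers are distinct, a file matches at most one of these datatypes, which is consistent with the doctests above where md.trr is rejected by Cpt, Xtc and Edr.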

class galaxy.datatypes.binary.Biom2(**kwd)[source]

Bases: H5

Class describing a biom2 file (http://biom-format.org/documentation/biom_format.html)

file_ext = 'biom2'
edam_format = 'format_3746'
sniff(filename)[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Biom2().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Biom2().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Biom2().sniff(fname)
False
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>, 'format_url': <galaxy.model.metadata.MetadataElementSpec object>, 'format_version': <galaxy.model.metadata.MetadataElementSpec object>, 'generated_by': <galaxy.model.metadata.MetadataElementSpec object>, 'id': <galaxy.model.metadata.MetadataElementSpec object>, 'nnz': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'type': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Cool(**kwd)[source]

Bases: H5

Class describing the cool format (https://github.com/mirnylab/cooler)

file_ext = 'cool'
sniff(filename)[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.cool')
>>> Cool().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Cool().sniff(fname)
False
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MCool(**kwd)[source]

Bases: H5

Class describing the multi-resolution cool format (https://github.com/mirnylab/cooler)

file_ext = 'mcool'
sniff(filename)[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.mcool')
>>> MCool().sniff(fname)
True
>>> fname = get_test_fname('matrix.cool')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('test.mz5')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> MCool().sniff(fname)
False
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.H5MLM(**kwd)[source]

Bases: H5

Machine learning model generated by Galaxy-ML.

file_ext = 'h5mlm'
URL = 'https://github.com/goeckslab/Galaxy-ML'
max_peek_size = 1000
max_preview_size = 1000000
set_meta(dataset, overwrite=True, metadata_tmp_files_dir=None, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
get_repr(filename)[source]
get_config_string(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

display_data(trans, dataset, preview=False, filename=None, to_ext=None, **kwd)[source]

Displays data in central pane if preview is True, else handles download.

Datatypes should be very careful when overriding this method; this interface between datatypes and Galaxy will likely change.

TODO: Document alternatives to overriding this method (data providers?).

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'hyper_params': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.LudwigModel(**kwd)[source]

Bases: Html

Composite datatype that encloses multiple files for a Ludwig trained model.

composite_type: Optional[str] = 'auto_primary_file'
file_ext = 'ludwig_model'
__init__(**kwd)[source]

Initialize the datatype

generate_primary_file(dataset=None)[source]
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.HexrdMaterials(**kwd)[source]

Bases: H5

Class describing a Hexrd Materials file: https://github.com/HEXRD/hexrd

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.materials.h5')
>>> HexrdMaterials().sniff(fname)
True
>>> fname = get_test_fname('test.loom')
>>> HexrdMaterials().sniff(fname)
False
file_ext = 'hexrd.materials.h5'
edam_format = 'format_3590'
sniff(filename)[source]
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

metadata_spec: MetadataSpecCollection = {'LatticeParameters': <galaxy.model.metadata.MetadataElementSpec object>, 'SpaceGroupNumber': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'materials': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Scf(**kwd)[source]

Bases: Binary

Class describing an scf binary sequence file

edam_format = 'format_1632'
edam_data = 'data_0924'
file_ext = 'scf'
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Sff(**kwd)[source]

Bases: Binary

Standard Flowgram Format (SFF)

edam_format = 'format_3284'
edam_data = 'data_0924'
file_ext = 'sff'
sniff_prefix(sniff_prefix)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.BigWig(**kwd)[source]

Bases: Binary

Accessing binary BigWig files from UCSC. The supplemental info in the paper has the binary details: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btq351v1

edam_format = 'format_3006'
edam_data = 'data_3002'
file_ext = 'bigwig'
track_type: Optional[str] = 'LineTrack'
data_sources: Dict[str, str] = {'data_standalone': 'bigwig'}
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(sniff_prefix)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.BigBed(**kwd)[source]

Bases: BigWig

BigBed support from UCSC.

edam_format = 'format_3004'
edam_data = 'data_3002'
file_ext = 'bigbed'
data_sources: Dict[str, str] = {'data_standalone': 'bigbed'}
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.TwoBit(**kwd)[source]

Bases: Binary

Class describing a TwoBit format nucleotide file

edam_format = 'format_3009'
edam_data = 'data_0848'
file_ext = 'twobit'
sniff_prefix(sniff_prefix)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.SQlite(**kwd)[source]

Bases: Binary

Class describing a Sqlite database

file_ext = 'sqlite'
edam_format = 'format_3621'
init_meta(dataset, copy_from=None)[source]
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
sniff_table_names(filename, table_names)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

sqlite_dataprovider(dataset, **settings)[source]
sqlite_datatableprovider(dataset, **settings)[source]
sqlite_datadictprovider(dataset, **settings)[source]
dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'sqlite': <function SQlite.sqlite_dataprovider>, 'sqlite-dict': <function SQlite.sqlite_datadictprovider>, 'sqlite-table': <function SQlite.sqlite_datatableprovider>}
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
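
The SQlite subclasses below are distinguished by the tables they contain. The following is a rough sketch of that idea using only the standard library; it is not Galaxy's implementation of sniff or sniff_table_names. The function names looks_like_sqlite, has_tables and demo are hypothetical. The 16-byte header string is the one defined by the SQLite 3 file format.

```python
import os
import sqlite3
import tempfile

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path: str) -> bool:
    """Check the file header without opening the file as a database."""
    with open(path, "rb") as handle:
        return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


def has_tables(path: str, required: set) -> bool:
    """Return True if the file is SQLite and contains every required table."""
    if not looks_like_sqlite(path):
        return False
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    except sqlite3.DatabaseError:
        return False
    finally:
        conn.close()
    return required.issubset({name for (name,) in rows})


def demo() -> tuple:
    """Build a throwaway database and a text file, then probe both."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "example.sqlite")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE entries (id INTEGER)")
        conn.commit()
        conn.close()
        text_path = os.path.join(tmp, "example.txt")
        with open(text_path, "w") as handle:
            handle.write("not a database")
        return (
            has_tables(db_path, {"entries"}),
            has_tables(db_path, {"entries", "metadata"}),
            has_tables(text_path, {"entries"}),
        )
```

Checking the header first avoids handing arbitrary uploads to the SQLite parser; the table query only runs on files that already carry the expected signature.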

class galaxy.datatypes.binary.GeminiSQLite(**kwd)[source]

Bases: SQlite

Class describing a Gemini Sqlite database

file_ext = 'gemini.sqlite'
edam_format = 'format_3622'
edam_data = 'data_3498'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'gemini_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.ChiraSQLite(**kwd)[source]

Bases: SQlite

Class describing a ChiRAViz Sqlite database

file_ext = 'chira.sqlite'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.CuffDiffSQlite(**kwd)[source]

Bases: SQlite

Class describing a CuffDiff SQLite database

file_ext = 'cuffdiff.sqlite'
edam_format = 'format_3621'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'cuffdiff_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'genes': <galaxy.model.metadata.MetadataElementSpec object>, 'samples': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MzSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Sqlite database

file_ext = 'mz.sqlite'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.PQP(**kwd)[source]

Bases: SQlite

Class describing a Peptide query parameters file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.pqp')
>>> PQP().sniff(fname)
True
>>> fname = get_test_fname('test.osw')
>>> PQP().sniff(fname)
False
file_ext = 'pqp'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]

The table definition follows the OpenMS source (https://github.com/grosenberger/OpenMS/blob/develop/src/openms/source/ANALYSIS/OPENSWATH/TransitionPQPFile.cpp#L264). For now, the VERSION, GENE and PEPTIDE_GENE_MAPPING tables are excluded, since there is test data without these tables; see also https://github.com/OpenMS/OpenMS/issues/4365

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OSW(**kwd)[source]

Bases: SQlite

Class describing OpenSwath output

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.osw')
>>> OSW().sniff(fname)
True
>>> fname = get_test_fname('test.sqmass')
>>> OSW().sniff(fname)
False
file_ext = 'osw'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.SQmass(**kwd)[source]

Bases: SQlite

Class describing a Sqmass database

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.sqmass')
>>> SQmass().sniff(fname)
True
>>> fname = get_test_fname('test.pqp')
>>> SQmass().sniff(fname)
False
file_ext = 'sqmass'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BlibSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Spectral Library Sqlite database

file_ext = 'blib'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'blib_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.DlibSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Spectral Library Sqlite database.

DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.dlib')
>>> DlibSQlite().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DlibSQlite().sniff(fname)
False
file_ext = 'dlib'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dlib_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.ElibSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Chromatogram Library Sqlite database.

DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.elib')
>>> ElibSQlite().sniff(fname)
True
>>> fname = get_test_fname('test.dlib')
>>> ElibSQlite().sniff(fname)
False
file_ext = 'elib'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
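
The DLIB/ELIB distinction described in the two docstrings above can be illustrated with a small sketch. It assumes that "populated" means "has at least one row" and that checking the two example tables named in the docstring is enough to tell the formats apart; the real sniffers may use different criteria. All function names here are hypothetical.

```python
import sqlite3

# Tables that both DLIB and ELIB files populate, per the docstrings above.
CORE_TABLES = ("entries", "metadata", "peptidetoprotein")
# Example tables that only ELIB files populate, per the docstrings above.
ELIB_ONLY_TABLES = ("peptidequants", "peptidescores")


def table_has_rows(conn: sqlite3.Connection, table: str) -> bool:
    """Return True if the table exists and holds at least one row."""
    found = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if found is None:
        return False
    # The table name comes from the fixed tuples above, never from user input.
    return conn.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchone() is not None


def classify_library(conn: sqlite3.Connection) -> str:
    """Classify a connection as 'dlib', 'elib' or 'unknown'."""
    if not all(table_has_rows(conn, table) for table in CORE_TABLES):
        return "unknown"
    if any(table_has_rows(conn, table) for table in ELIB_ONLY_TABLES):
        return "elib"
    return "dlib"


def example_library(populated: tuple) -> sqlite3.Connection:
    """Create an in-memory database with all tables, filling only some."""
    conn = sqlite3.connect(":memory:")
    for table in CORE_TABLES + ELIB_ONLY_TABLES:
        conn.execute(f'CREATE TABLE "{table}" (value INTEGER)')
    for table in populated:
        conn.execute(f'INSERT INTO "{table}" VALUES (1)')
    return conn
```

Under these assumptions a DLIB file is never classified as an ELIB, which mirrors the doctest above where ElibSQlite().sniff rejects test.dlib.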

class galaxy.datatypes.binary.IdpDB(**kwd)[source]

Bases: SQlite

Class describing an IDPicker 3 idpDB (sqlite) database

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.idpdb')
>>> IdpDB().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> IdpDB().sniff(fname)
False
file_ext = 'idpdb'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.GAFASQLite(**kwd)[source]

Bases: SQlite

Class describing a GAFA SQLite database

file_ext = 'gafa.sqlite'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'gafa_schema_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.NcbiTaxonomySQlite(**kwd)[source]

Bases: SQlite

Class describing the NCBI Taxonomy database stored in SQLite as done by rust-ncbitaxonomy

file_ext = 'ncbitaxonomy.sqlite'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'ncbitaxonomy_schema_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>, 'taxon_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Xlsx(**kwd)[source]

Bases: Binary

Class for Excel 2007 (xlsx) files

file_ext = 'xlsx'
compressed = True
sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.ExcelXls(**kwd)[source]

Bases: Binary

Class describing an Excel (xls) file

file_ext = 'excel.xls'
edam_format = 'format_3468'
sniff_prefix(sniff_prefix)[source]
get_mime()[source]

Returns the mime type of the datatype

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.Sra(**kwd)[source]

Bases: Binary

Sequence Read Archive (SRA) datatype originally from mdshw5/sra-tools-galaxy

file_ext = 'sra'
sniff_prefix(sniff_prefix)[source]

The first 8 bytes of any NCBI sra file are ‘NCBI.sra’, and the file is binary. For details about the format, see http://www.ncbi.nlm.nih.gov/books/n/helpsra/SRA_Overview_BK/#SRA_Overview_BK.4_SRA_Data_Structure

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
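
The signature check described for Sra.sniff_prefix can be sketched in a few lines. This version works on raw bytes or a file path instead of Galaxy's FilePrefix object, and the names looks_like_sra and sniff_sra_file are hypothetical.

```python
# Signature stated in the sniff_prefix description above.
SRA_MAGIC = b"NCBI.sra"


def looks_like_sra(header: bytes) -> bool:
    """Return True if the given bytes begin with the 8-byte SRA signature."""
    return header[: len(SRA_MAGIC)] == SRA_MAGIC


def sniff_sra_file(path: str) -> bool:
    """Read only the first 8 bytes of a file and test them."""
    with open(path, "rb") as handle:
        return looks_like_sra(handle.read(len(SRA_MAGIC)))
```

Reading only the first 8 bytes keeps the check cheap even for very large archives.
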
class galaxy.datatypes.binary.RData(**kwd)[source]

Bases: CompressedArchive

Generic R Data file datatype implementation, i.e. files generated with R’s save or save.image function. See https://www.loc.gov/preservation/digital/formats/fdd/fdd000470.shtml and https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.rdata')
>>> RData().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.file_name = fname
>>> dataset.has_data = lambda: True
>>> RData().set_meta(dataset)
>>> dataset.metadata.version
'3'
VERSION_2_PREFIX = b'RDX2\nX\n'
VERSION_3_PREFIX = b'RDX3\nX\n'
file_ext = 'rdata'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
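
The two prefix constants on RData suggest how the version metadata shown in the doctest could be derived. The sketch below is an illustration under stated assumptions, not the Galaxy implementation: it takes the whole file contents as bytes, handles only gzip compression (R can also write bzip2 or xz, which this sketch ignores), and the function name rdata_version is hypothetical.

```python
import gzip
from typing import Optional

# Prefixes copied from the RData class attributes above.
VERSION_2_PREFIX = b"RDX2\nX\n"
VERSION_3_PREFIX = b"RDX3\nX\n"

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def rdata_version(data: bytes) -> Optional[str]:
    """Return '2' or '3' for RData content, or None if no prefix matches.

    Assumption: data is the complete file, raw or gzip compressed.
    """
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    if data.startswith(VERSION_2_PREFIX):
        return "2"
    if data.startswith(VERSION_3_PREFIX):
        return "3"
    return None
```

The version is returned as a string to match the doctest above, where dataset.metadata.version is '3'.
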
class galaxy.datatypes.binary.RDS(**kwd)[source]

Bases: CompressedArchive

File using a serialized R object generated with R’s saveRDS function. See https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('int-r3.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r4.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r3-version2.rds')
>>> RDS().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.file_name = get_test_fname('int-r4.rds')
>>> dataset.has_data = lambda: True
>>> RDS().set_meta(dataset)
>>> dataset.metadata.version
'3'
>>> dataset.metadata.rversion
'4.1.1'
>>> dataset.metadata.minrversion
'3.5.0'
file_ext = 'rds'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'minrversion': <galaxy.model.metadata.MetadataElementSpec object>, 'rversion': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.OxliBinary(**kwd)[source]

Bases: Binary

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliCountGraph(**kwd)[source]

Bases: OxliBinary

OxliCountGraph starts with “OXLI” + one-byte version number + 8-bit binary ‘1’. Test file generated via:

load-into-counting.py --n_tables 1 --max-tablesize 1 \
    oxli_countgraph.oxlicg khmer/tests/test-data/100-reads.fq.bz2

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliCountGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_countgraph.oxlicg")
>>> OxliCountGraph().sniff(fname)
True
file_ext = 'oxlicg'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliNodeGraph(**kwd)[source]

Bases: OxliBinary

OxliNodeGraph starts with “OXLI” + one-byte version number + 8-bit binary ‘2’. Test file generated via:

load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliNodeGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_nodegraph.oxling")
>>> OxliNodeGraph().sniff(fname)
True
file_ext = 'oxling'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliTagSet(**kwd)[source]

Bases: OxliBinary

OxliTagSet starts with “OXLI” + one-byte version number + 8-bit binary ‘3’. Test file generated via:

load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2;
mv oxli_nodegraph.oxling.tagset oxli_tagset.oxlits

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliTagSet().sniff(fname)
False
>>> fname = get_test_fname("oxli_tagset.oxlits")
>>> OxliTagSet().sniff(fname)
True
file_ext = 'oxlits'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliStopTags(**kwd)[source]

Bases: OxliBinary

OxliStopTags starts with “OXLI” + one-byte version number + 8-bit binary ‘4’. Test file adapted from khmer 2.0’s “khmer/tests/test-data/goodversion-k32.stoptags”

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliStopTags().sniff(fname)
False
>>> fname = get_test_fname("oxli_stoptags.oxlist")
>>> OxliStopTags().sniff(fname)
True
file_ext = 'oxlist'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliSubset(**kwd)[source]

Bases: OxliBinary

OxliSubset starts with “OXLI” + one byte version number + 8-bit binary ‘5’.

Test file generated via:

load-graph.py -k 20 example tests/test-data/random-20-a.fa;
partition-graph.py example;
mv example.subset.0.pmap oxli_subset.oxliss

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliSubset().sniff(fname)
False
>>> fname = get_test_fname("oxli_subset.oxliss")
>>> OxliSubset().sniff(fname)
True
file_ext = 'oxliss'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliGraphLabels(**kwd)[source]

Bases: OxliBinary

OxliGraphLabels starts with “OXLI” + one byte version number + 8-bit binary ‘6’.

Test file generated via:

python -c "from khmer import GraphLabels; \
    gl = GraphLabels(20, 1e7, 4); \
    gl.consume_fasta_and_tag_with_labels('tests/test-data/test-labels.fa'); \
    gl.save_labels_and_tags('oxli_graphlabels.oxligl')"

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliGraphLabels().sniff(fname)
False
>>> fname = get_test_fname("oxli_graphlabels.oxligl")
>>> OxliGraphLabels().sniff(fname)
True
file_ext = 'oxligl'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.PostgresqlArchive(**kwd)[source]

Bases: CompressedArchive

Class describing a PostgreSQL database packed into a tar archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('postgresql_fake.tar.bz2')
>>> PostgresqlArchive().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> PostgresqlArchive().sniff(fname)
False
file_ext = 'postgresql'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Fast5Archive(**kwd)[source]

Bases: CompressedArchive

Class describing a FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5Archive().sniff(fname)
True
file_ext = 'fast5.tar'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
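
The `fast5_count` metadata element suggests counting FAST5 members inside the archive. A minimal sketch of that idea using only `tarfile`, assuming members are recognised by a `.fast5` suffix; `count_fast5_members` and `make_tar` are hypothetical helpers and Galaxy's own `set_meta` may work differently.

```python
import io
import tarfile


def count_fast5_members(fileobj):
    """Count regular files ending in '.fast5' in a (possibly compressed) tar stream."""
    # mode "r:*" lets tarfile detect plain, gzip or bzip2 compression itself.
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        return sum(
            1
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".fast5")
        )


def make_tar(names):
    """Build a small in-memory tar archive with placeholder contents (demo only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in names:
            payload = b"placeholder"
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


print(count_fast5_members(make_tar(["reads/a.fast5", "reads/b.fast5", "README.txt"])))
```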

class galaxy.datatypes.binary.Fast5ArchiveGz(**kwd)[source]

Bases: Fast5Archive

Class describing a gzip-compressed FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveGz().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveGz().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveGz().sniff(fname)
False
file_ext = 'fast5.tar.gz'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Fast5ArchiveBz2(**kwd)[source]

Bases: Fast5Archive

Class describing a bzip2-compressed FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveBz2().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveBz2().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveBz2().sniff(fname)
False
file_ext = 'fast5.tar.bz2'
sniff(filename)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.SearchGuiArchive(**kwd)[source]

Bases: CompressedArchive

Class describing a SearchGUI archive

file_ext = 'searchgui_archive'
set_meta(dataset, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'searchgui_major_version': <galaxy.model.metadata.MetadataElementSpec object>, 'searchgui_version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.NetCDF(**kwd)[source]

Bases: Binary

Binary data in netCDF format

file_ext = 'netcdf'
edam_format = 'format_3650'
edam_data = 'data_0943'
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.Dcd(**kwd)[source]

Bases: Binary

Class describing a dcd file from the CHARMM molecular simulation program

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_glucose_vacuum.dcd')
>>> Dcd().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Dcd().sniff(fname)
False
file_ext = 'dcd'
edam_data = 'data_3842'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Vel(**kwd)[source]

Bases: Binary

Class describing a velocity file from the CHARMM molecular simulation program

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_charmm.vel')
>>> Vel().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Vel().sniff(fname)
False
file_ext = 'vel'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.DAA(**kwd)[source]

Bases: Binary

Class describing a DAA (diamond alignment archive) file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.daa')
>>> DAA().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DAA().sniff(fname)
False
file_ext = 'daa'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.RMA6(**kwd)[source]

Bases: Binary

Class describing an RMA6 (MEGAN6 read-match archive) file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.rma6')
>>> RMA6().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> RMA6().sniff(fname)
False
file_ext = 'rma6'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.DMND(**kwd)[source]

Bases: Binary

Class describing a DMND file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond_db.dmnd')
>>> DMND().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DMND().sniff(fname)
False
file_ext = 'dmnd'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.ICM(**kwd)[source]

Bases: Binary

Class describing an ICM (interpolated context model) file, used by Glimmer

file_ext = 'icm'
edam_data = 'data_0950'
set_peek(dataset)[source]

Set the peek and blurb text

sniff(dataset)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Parquet(**kwd)[source]

Bases: Binary

Class describing an Apache Parquet file (https://parquet.apache.org/)

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('example.parquet')
>>> Parquet().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Parquet().sniff(fname)
False
file_ext = 'parquet'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(sniff_prefix)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
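
Parquet files are framed by the 4-byte magic `PAR1` at both the start and the end of the file, which is what makes signature-based detection possible. The sketch below checks both ends of a seekable stream; `has_parquet_magic` is a hypothetical helper and is not claimed to match what `Parquet.sniff_prefix` does internally.

```python
import io

PARQUET_MAGIC = b"PAR1"


def has_parquet_magic(stream):
    """Check for the PAR1 marker at the start and end of a seekable binary stream."""
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    if size < 2 * len(PARQUET_MAGIC):
        return False
    stream.seek(0)
    head = stream.read(len(PARQUET_MAGIC))
    stream.seek(-len(PARQUET_MAGIC), io.SEEK_END)
    tail = stream.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


# Synthetic bytes for illustration; this is not a readable Parquet file.
print(has_parquet_magic(io.BytesIO(b"PAR1" + b"\x00" * 16 + b"PAR1")))
```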
class galaxy.datatypes.binary.BafTar(**kwd)[source]

Bases: CompressedArchive

Base class for common behavior of tar files of directory-based raw file formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> BafTar().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> BafTar().sniff(fname)
False
edam_data = 'data_2536'
edam_format = 'format_3712'
file_ext = 'brukerbaf.d.tar'
get_signature_file()[source]
sniff(filename)[source]
get_type()[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
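
The tar-based raw formats in this family expose a `get_signature_file()` method, which points to detection by looking for a characteristic file inside the archive. A rough sketch of that idea, assuming the check amounts to finding a member with a given base name: `tar_has_member` and `build_tar` are hypothetical helpers, and the file name `analysis.baf` is an illustrative assumption rather than a value taken from this page.

```python
import io
import os
import tarfile


def tar_has_member(fileobj, signature_name):
    """Return True if any regular file in the tar has the given base name."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        return any(
            member.isfile() and os.path.basename(member.name) == signature_name
            for member in archive.getmembers()
        )


def build_tar(names):
    """Build an in-memory tar with empty placeholder files (demo only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in names:
            archive.addfile(tarfile.TarInfo(name=name), io.BytesIO(b""))
    buffer.seek(0)
    return buffer


# "analysis.baf" is an assumed signature name used purely for illustration.
print(tar_has_member(build_tar(["sample.d/analysis.baf"]), "analysis.baf"))
```

Subclasses could then differ only in the signature name they return, which fits the way the derived classes override `get_signature_file()` and `get_type()`.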

class galaxy.datatypes.binary.YepTar(**kwd)[source]

Bases: BafTar

A tar’d up .d directory containing Agilent/Bruker YEP format data

file_ext = 'agilentbrukeryep.d.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.TdfTar(**kwd)[source]

Bases: BafTar

A tar’d up .d directory containing Bruker TDF format data

file_ext = 'brukertdf.d.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MassHunterTar(**kwd)[source]

Bases: BafTar

A tar’d up .d directory containing Agilent MassHunter format data

file_ext = 'agilentmasshunter.d.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MassLynxTar(**kwd)[source]

Bases: BafTar

A tar’d up .d directory containing Waters MassLynx format data

file_ext = 'watersmasslynx.raw.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.WiffTar(**kwd)[source]

Bases: BafTar

A tar’d up .wiff/.scan pair containing Sciex WIFF format data

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('some.wiff.tar')
>>> WiffTar().sniff(fname)
True
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> WiffTar().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> WiffTar().sniff(fname)
False
file_ext = 'wiff.tar'
sniff(filename)[source]
get_type()[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Pretext(**kwd)[source]

Bases: Binary

PretextMap contact map file.

Try to guess if the file is a Pretext file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sample.pretext')
>>> Pretext().sniff(fname)
True
file_ext = 'pretext'
sniff_prefix(sniff_prefix)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.JP2(**kwd)[source]

Bases: Binary

JPEG 2000 binary image format

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.jp2')
>>> JP2().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> JP2().sniff(fname)
False
file_ext = 'jp2'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename)[source]
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Npz(**kwd)[source]

Bases: CompressedArchive

Class describing a NumPy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> Npz().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Npz().sniff(fname)
False
file_ext = 'npz'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename)[source]
set_meta(dataset, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
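
An NPZ file is a zip archive whose members are `.npy` arrays, so the kind of information held in the `files` and `nfiles` metadata can be gathered without loading any array data. A minimal sketch under that assumption, using only `zipfile`; `list_npz_arrays` is a hypothetical helper and Galaxy's `set_meta` may collect these values differently.

```python
import io
import zipfile


def list_npz_arrays(fileobj):
    """Return the array names stored in an NPZ archive, in archive order."""
    with zipfile.ZipFile(fileobj) as archive:
        # Each array is stored as '<name>.npy'; strip the 4-character suffix.
        return [name[:-4] for name in archive.namelist() if name.endswith(".npy")]


# Build a stand-in archive with placeholder members (not valid .npy payloads).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("images.npy", b"placeholder")
    archive.writestr("omegas.npy", b"placeholder")
buffer.seek(0)

names = list_npz_arrays(buffer)
print(names, len(names))
```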

class galaxy.datatypes.binary.HexrdImagesNpz(**kwd)[source]

Bases: Npz

Class describing an HEXRD Images NumPy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdImagesNpz().sniff(fname)
True
>>> fname = get_test_fname('eta_ome.npz')
>>> HexrdImagesNpz().sniff(fname)
False
file_ext = 'hexrd.images.npz'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename)[source]
set_meta(dataset, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object>, 'omegas': <galaxy.model.metadata.MetadataElementSpec object>, 'panel_id': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.HexrdEtaOmeNpz(**kwd)[source]

Bases: Npz

Class describing an HEXRD Eta-Ome NumPy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.eta_ome.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
True
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
False
file_ext = 'hexrd.eta_ome.npz'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename)[source]
set_meta(dataset, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'HKLs': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.blast module

NCBI BLAST datatypes.

Covers the blastxml format and the BLAST databases.

class galaxy.datatypes.blast.BlastXml(**kwd)[source]

Bases: GenericXml

NCBI Blast XML Output data

file_ext = 'blastxml'
edam_format = 'format_3331'
edam_data = 'data_0857'
set_peek(dataset)[source]

Set the peek and blurb text

sniff_prefix(file_prefix: FilePrefix)[source]

Determines whether the file is blastxml

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('megablast_xml_parser_test1.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('tblastn_four_human_vs_rhodopsin.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> BlastXml().sniff(fname)
False
static merge(split_files, output_file)[source]

Merging multiple XML files is non-trivial and must be done in subclasses.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.blast.BlastNucDb(**kwd)[source]

Bases: _BlastDb

Class for nucleotide BLAST database files.

file_ext = 'blastdbn'
composite_type: Optional[str] = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastProtDb(**kwd)[source]

Bases: _BlastDb

Class for protein BLAST database files.

file_ext = 'blastdbp'
composite_type: Optional[str] = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastDomainDb(**kwd)[source]

Bases: _BlastDb

Class for domain BLAST database files.

file_ext = 'blastdbd'
composite_type: Optional[str] = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.LastDb(**kwd)[source]

Bases: Data

Class for LAST database files.

file_ext = 'lastdb'
composite_type: Optional[str] = 'basic'
set_peek(dataset)[source]

Set the peek and blurb text.

display_peek(dataset)[source]

Create HTML content, used for displaying peek.

__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastNucDb5(**kwd)[source]

Bases: _BlastDb

Class for nucleotide BLAST database files.

file_ext = 'blastdbn5'
composite_type: Optional[str] = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastProtDb5(**kwd)[source]

Bases: _BlastDb

Class for protein BLAST database files.

file_ext = 'blastdbp5'
composite_type: Optional[str] = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastDomainDb5(**kwd)[source]

Bases: _BlastDb

Class for domain BLAST database files.

file_ext = 'blastdbd5'
composite_type: Optional[str] = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.checkers module

Module proxies galaxy.util.checkers for backward compatibility.

External datatypes may make use of these functions.

galaxy.datatypes.checkers.check_binary(name, file_path: bool = True) -> bool[source]
galaxy.datatypes.checkers.check_bz2(file_path: str, check_content: bool = True) -> Tuple[bool, bool][source]
galaxy.datatypes.checkers.check_gzip(file_path: str, check_content: bool = True) -> Tuple[bool, bool][source]
galaxy.datatypes.checkers.check_html(name, file_path: bool = True) -> bool[source]

Returns True if the file/string contains HTML code.

galaxy.datatypes.checkers.check_image(file_path: str)[source]

Simple wrapper around image_type to yield a True/False verdict

galaxy.datatypes.checkers.check_zip(file_path: str, check_content: bool = True, files=1) -> Tuple[bool, bool][source]
galaxy.datatypes.checkers.is_gzip(file_path: str) -> bool[source]
galaxy.datatypes.checkers.is_bz2(file_path: str) -> bool[source]
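
Compression checks of this kind usually come down to comparing the first bytes of a file with a known signature. The sketch below shows that magic-byte step with the standard library only; `looks_like_gzip`, `looks_like_bz2` and `write_temp` are hypothetical names, and nothing here is claimed to reproduce the behaviour or return values of the Galaxy functions listed above.

```python
import bz2
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip member
BZ2_MAGIC = b"BZh"        # bzip2 stream signature


def starts_with_magic(file_path, magic):
    """Return True if the file begins with the given byte signature."""
    with open(file_path, "rb") as handle:
        return handle.read(len(magic)) == magic


def looks_like_gzip(file_path):
    return starts_with_magic(file_path, GZIP_MAGIC)


def looks_like_bz2(file_path):
    return starts_with_magic(file_path, BZ2_MAGIC)


def write_temp(data):
    """Write bytes to a new temporary file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


gz_path = write_temp(gzip.compress(b"ACGT\n"))
bz_path = write_temp(bz2.compress(b"ACGT\n"))
try:
    print(looks_like_gzip(gz_path), looks_like_bz2(gz_path))
    print(looks_like_gzip(bz_path), looks_like_bz2(bz_path))
finally:
    os.remove(gz_path)
    os.remove(bz_path)
```

A signature match only says the file claims to be compressed; validating the decompressed content is a separate step.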

galaxy.datatypes.chrominfo module

class galaxy.datatypes.chrominfo.ChromInfo(**kwd)[source]

Bases: Tabular

file_ext = 'len'
metadata_spec: MetadataSpecCollection = {'chrom': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'length': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
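
The `chrom` and `length` metadata elements correspond to the columns of a `len` file. As a rough illustration, and assuming the common layout of one tab-separated `name<TAB>length` pair per line with `#` comment lines, such a file could be read as follows; `parse_chrom_lengths` is a hypothetical helper, not part of Galaxy.

```python
def parse_chrom_lengths(lines):
    """Map chromosome names to integer lengths from tab-separated lines."""
    lengths = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


example = ["# name\tlength\n", "chr1\t1000\n", "chrM\t200\n"]
print(parse_chrom_lengths(example))
```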

galaxy.datatypes.constructive_solid_geometry module

Constructive Solid Geometry file formats.

class galaxy.datatypes.constructive_solid_geometry.Ply(**kwd)[source]

Bases: object

The PLY format describes an object as a collection of vertices, faces and other elements, along with properties such as color and normal direction that can be attached to these elements. A PLY file contains the description of exactly one object.

subtype = ''
abstract __init__(**kwd)[source]
sniff_prefix(file_prefix: FilePrefix)[source]

The structure of a typical PLY file: Header, Vertex List, Face List, (lists of other elements)
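
The header is the only part needed to recognise a PLY file and to fill in element counts such as `vertex` and `face`. A minimal sketch of reading it, assuming the standard text header that opens with `ply` and closes with `end_header`; `parse_ply_header` is a hypothetical helper and does not reproduce Galaxy's `sniff_prefix` or `set_meta`.

```python
def parse_ply_header(lines):
    """Parse a PLY header; return {'format': ..., 'elements': {...}} or None."""
    iterator = iter(lines)
    if next(iterator, "").strip() != "ply":
        return None  # the first line must be the magic word
    info = {"format": None, "elements": {}}
    for line in iterator:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "end_header":
            return info
        if tokens[0] == "format" and len(tokens) >= 2:
            info["format"] = tokens[1]
        elif tokens[0] == "element" and len(tokens) == 3:
            info["elements"][tokens[1]] = int(tokens[2])
    return None  # header was never terminated


header = [
    "ply",
    "format ascii 1.0",
    "element vertex 8",
    "property float x",
    "element face 6",
    "property list uchar int vertex_index",
    "end_header",
]
print(parse_ply_header(header))
```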

set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
display_peek(dataset)[source]
sniff(filename)
class galaxy.datatypes.constructive_solid_geometry.PlyAscii(**kwd)[source]

Bases: Ply, Text

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.plyascii')
>>> PlyAscii().sniff(fname)
True
>>> fname = get_test_fname('test.vtkascii')
>>> PlyAscii().sniff(fname)
False
file_ext = 'plyascii'
subtype = 'ascii'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'face': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.PlyBinary(**kwd)[source]

Bases: Ply, Binary

file_ext = 'plybinary'
subtype = 'binary'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'face': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.Vtk(**kwd)[source]

Bases: object

The Visualization Toolkit provides a number of source and writer objects to read and write popular data file formats. The Visualization Toolkit also provides some of its own file formats.

There are two different styles of file formats available in VTK. The simplest are the legacy, serial formats that are easy to read and write either by hand or programmatically. However, these formats are less flexible than the XML based file formats which support random access, parallel I/O, and portable data compression and are preferred to the serial VTK file formats whenever possible.

All keyword phrases are written in ASCII form whether the file is binary or ASCII. The binary section of the file (if in binary form) is the data proper; i.e., the numbers that define point coordinates, scalars, cell indices, and so forth.

Binary data must be placed into the file immediately after the newline (‘\n’) character from the previous ASCII keyword and parameter sequence.

TODO: only legacy formats are currently supported and support for XML formats should be added.

subtype = ''
abstract __init__(**kwd)[source]
sniff_prefix(file_prefix: FilePrefix)[source]

VTK files can be either ASCII or binary, with two different styles of file formats: legacy or XML. We’ll assume if the file contains a valid VTK header, then it is a valid VTK file.

set_meta(dataset, **kwd)[source]
set_initial_metadata(i, line, dataset)[source]
set_structure_metadata(line, dataset, dataset_type)[source]

The fourth part of legacy VTK files is the dataset structure. The geometry part describes the geometry and topology of the dataset. This part begins with a line containing the keyword DATASET followed by a keyword describing the type of dataset. Then, depending upon the type of dataset, other keyword/data combinations define the actual data.
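
The opening parts of a legacy VTK file can be sketched as a small parser: a version line, a title line, the `ASCII` or `BINARY` keyword, and then the `DATASET` line. This assumes the four parts sit on the first four lines with nothing in between; `parse_vtk_legacy_header` is a hypothetical helper, not Galaxy's implementation.

```python
VTK_PREFIX = "# vtk DataFile Version "


def parse_vtk_legacy_header(lines):
    """Return version, title, format and dataset type from the first four lines, or None."""
    if len(lines) < 4 or not lines[0].startswith(VTK_PREFIX):
        return None
    file_format = lines[2].strip().upper()
    if file_format not in ("ASCII", "BINARY"):
        return None
    tokens = lines[3].split()
    if len(tokens) != 2 or tokens[0] != "DATASET":
        return None
    return {
        "vtk_version": lines[0][len(VTK_PREFIX):].strip(),
        "title": lines[1].strip(),
        "file_format": file_format,
        "dataset_type": tokens[1],
    }


example = [
    "# vtk DataFile Version 3.0",
    "example surface",
    "ASCII",
    "DATASET POLYDATA",
    "POINTS 8 float",
]
print(parse_vtk_legacy_header(example))
```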

get_blurb(dataset)[source]
set_peek(dataset)[source]
display_peek(dataset)[source]
sniff(filename)
class galaxy.datatypes.constructive_solid_geometry.VtkAscii(**kwd)[source]

Bases: Vtk, Text

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.vtkascii')
>>> VtkAscii().sniff(fname)
True
>>> fname = get_test_fname('test.vtkbinary')
>>> VtkAscii().sniff(fname)
False
file_ext = 'vtkascii'
subtype = 'ASCII'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dataset_type': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimensions': <galaxy.model.metadata.MetadataElementSpec object>, 'field_components': <galaxy.model.metadata.MetadataElementSpec object>, 'field_names': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'lines': <galaxy.model.metadata.MetadataElementSpec object>, 'origin': <galaxy.model.metadata.MetadataElementSpec object>, 'points': <galaxy.model.metadata.MetadataElementSpec object>, 'polygons': <galaxy.model.metadata.MetadataElementSpec object>, 'spacing': <galaxy.model.metadata.MetadataElementSpec object>, 'triangle_strips': <galaxy.model.metadata.MetadataElementSpec object>, 'vertices': <galaxy.model.metadata.MetadataElementSpec object>, 'vtk_version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.VtkBinary(**kwd)[source]

Bases: Vtk, Binary

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.vtkbinary')
>>> VtkBinary().sniff(fname)
True
>>> fname = get_test_fname('test.vtkascii')
>>> VtkBinary().sniff(fname)
False
file_ext = 'vtkbinary'
subtype = 'BINARY'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'dataset_type': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimensions': <galaxy.model.metadata.MetadataElementSpec object>, 'field_components': <galaxy.model.metadata.MetadataElementSpec object>, 'field_names': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'lines': <galaxy.model.metadata.MetadataElementSpec object>, 'origin': <galaxy.model.metadata.MetadataElementSpec object>, 'points': <galaxy.model.metadata.MetadataElementSpec object>, 'polygons': <galaxy.model.metadata.MetadataElementSpec object>, 'spacing': <galaxy.model.metadata.MetadataElementSpec object>, 'triangle_strips': <galaxy.model.metadata.MetadataElementSpec object>, 'vertices': <galaxy.model.metadata.MetadataElementSpec object>, 'vtk_version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.STL(**kwd)[source]

Bases: Data

file_ext = 'stl'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.NeperTess(**kwd)[source]

Bases: Text

Neper Tessellation File

Example:

***tess
**format
    format
**general
    dim type
**cell
    number_of_cells
file_ext = 'neper.tess'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix)[source]

Neper tess format, starts with ***tess

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.neper.tess')
>>> NeperTess().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tesr')
>>> NeperTess().sniff(fname)
False
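
Since the two Neper formats are told apart by their opening marker, the distinction can be sketched in a few lines. This assumes the marker stands alone on the first line, as in the examples; `neper_kind` is a hypothetical helper rather than the sniffer used here.

```python
def neper_kind(first_line):
    """Classify a Neper file by its first line: 'tess', 'tesr' or None."""
    marker = first_line.strip()
    if marker == "***tess":
        return "tess"
    if marker == "***tesr":
        return "tesr"
    return None


print(neper_kind("***tess\n"), neper_kind("***tesr\n"), neper_kind("ply\n"))
```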
set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

set_peek(dataset)[source]

Set the peek. This method is used by various subclasses of Text.

metadata_spec: MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.constructive_solid_geometry.NeperTesr(**kwd)[source]

Bases: Binary

Neper Raster Tessellation File

Example:

***tesr
**format
    format
**general
    dimension
    size_x size_y [size_z]
    voxsize_x voxsize_y [voxsize_z]
[*origin
    origin_x origin_y [origin_z]]
[*hasvoid has_void]
[**cell
    number_of_cells
file_ext = 'neper.tesr'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix)[source]

Neper tesr format, starts with ***tesr

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.neper.tesr')
>>> NeperTesr().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tess')
>>> NeperTesr().sniff(fname)
False
set_meta(dataset, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

metadata_spec: MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>, 'origin': <galaxy.model.metadata.MetadataElementSpec object>, 'size': <galaxy.model.metadata.MetadataElementSpec object>, 'voxsize': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.constructive_solid_geometry.NeperPoints(**kwd)[source]

Bases: Text

Neper Position File. The Neper position format has 1 to 3 floats per line, separated by whitespace.
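As a rough illustration of that line format, the sketch below parses a single line into 1 to 3 floats. `parse_point_line` is a hypothetical helper written for this example; it is not the validation Galaxy itself performs.

```python
from typing import List, Optional


def parse_point_line(line: str) -> Optional[List[float]]:
    """Parse one line into 1 to 3 floats, or return None if the line does not fit."""
    fields = line.split()  # splits on any run of whitespace, including TABs
    if not 1 <= len(fields) <= 3:
        return None
    try:
        return [float(field) for field in fields]
    except ValueError:
        # At least one field was not a number.
        return None
```

The number of values on a line corresponds to the dimension of the point, which is presumably what the `dimension` metadata element records.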

file_ext = 'neper.points'
__init__(**kwd)[source]

Initialize the datatype

set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

set_peek(dataset)[source]

Set the peek. This method is used by various subclasses of Text.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.NeperPointsTabular(**kwd)[source]

Bases: NeperPoints, Tabular

Neper Position File. The Neper position format has 1 to 3 floats per line, separated by TABs.

file_ext = 'neper.points.tsv'
__init__(**kwd)[source]

Initialize the datatype

set_meta(dataset, **kwd)[source]

Tries to determine the number of columns as well as those columns that contain numerical values in the dataset. A skip parameter is used because various tabular data types reuse this function, and their data type classes are responsible to determine how many invalid comment lines should be skipped. Using None for skip will cause skip to be zero, but the first line will be processed as a header. A max_data_lines parameter is used because various tabular data types reuse this function, and their data type classes are responsible to determine how many data lines should be processed to ensure that the non-optional metadata parameters are properly set; if used, optional metadata parameters will be set to None, unless the entire file has already been read. Using None for max_data_lines will process all data lines.

Items of interest:

  1. We treat ‘overwrite’ as always True (we always want to set tabular metadata when called).

  2. If a tabular file has no data, it will have one column of type ‘str’.

  3. We used to check only the first 100 lines when setting metadata and this class’s set_peek() method read the entire file to determine the number of lines in the file. Since metadata can now be processed on cluster nodes, we’ve merged the line count portion of the set_peek() processing here, and we now check the entire contents of the file.
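The column-type guessing described above can be sketched as follows. This is a simplified illustration and not Galaxy's implementation: it assumes only three types (int, float, str), TAB-separated fields and a single comment character, and `guess_type` and `guess_column_types` are hypothetical names.

```python
from typing import Iterable, List

# Order of generality: a column is widened when a later row needs a broader type.
_RANK = {"int": 0, "float": 1, "str": 2}


def guess_type(value: str) -> str:
    """Return the narrowest of 'int', 'float' or 'str' that can hold the value."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(value)
            return name
        except ValueError:
            continue
    return "str"


def guess_column_types(lines: Iterable[str], comment_char: str = "#") -> List[str]:
    """Scan every data line and return one guessed type per column."""
    column_types: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith(comment_char):
            continue  # skip blank lines and comment lines
        for index, value in enumerate(line.split("\t")):
            guessed = guess_type(value)
            if index >= len(column_types):
                column_types.append(guessed)
            elif _RANK[guessed] > _RANK[column_types[index]]:
                column_types[index] = guessed
    # Mirrors item 2 above: a file with no data has one column of type 'str'.
    return column_types or ["str"]
```

Scanning the whole input rather than a fixed number of lines matches item 3 above, at the cost of reading large files in full.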

set_peek(dataset)[source]

Set the peek. This method is used by various subclasses of Text.

metadata_spec: MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.NeperMultiScaleCell(**kwd)[source]

Bases: Text

Neper Multiscale Cell File

file_ext = 'neper.mscell'
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.GmshMsh(**kwd)[source]

Bases: Binary

Gmsh Mesh File

file_ext = 'gmsh.msh'
is_binary: Union[bool, typing_extensions.Literal[maybe]] = 'maybe'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix)[source]

Gmsh msh format, starts with $MeshFormat

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.gmsh.msh')
>>> GmshMsh().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tesr')
>>> GmshMsh().sniff(fname)
False
set_meta(dataset, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.constructive_solid_geometry.GmshGeo(**kwd)[source]

Bases: Text

Gmsh geometry File

file_ext = 'gmsh.geo'
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.ZsetGeof(**kwd)[source]

Bases: Text

Z-set geof File

file_ext = 'zset.geof'
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.constructive_solid_geometry.get_next_line(fh)[source]

galaxy.datatypes.coverage module

Coverage datatypes

class galaxy.datatypes.coverage.LastzCoverage(**kwd)[source]

Bases: Tabular

file_ext = 'coverage'
get_track_resolution(dataset, start, end)[source]
metadata_spec: MetadataSpecCollection = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'forwardCol': <galaxy.model.metadata.MetadataElementSpec object>, 'positionCol': <galaxy.model.metadata.MetadataElementSpec object>, 'reverseCol': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.data module

exception galaxy.datatypes.data.DatatypeConverterNotFoundException[source]

Bases: Exception

class galaxy.datatypes.data.DatatypeValidation(state, message)[source]

Bases: object

__init__(state, message)[source]
static validated()[source]
static invalid(message)[source]
static unvalidated()[source]
galaxy.datatypes.data.validate(dataset_instance)[source]
galaxy.datatypes.data.get_params_and_input_name(converter, deps, target_context=None)[source]
class galaxy.datatypes.data.DataMeta(name, bases, namespace, **kwargs)[source]

Bases: ABCMeta

Metaclass for Data class. Sets up metadata spec.

__init__(name, bases, dict_)[source]
class galaxy.datatypes.data.Data(**kwd)[source]

Bases: object

Base class for all datatypes. Implements basic interfaces as well as class methods for metadata.

>>> class DataTest( Data ):
...     MetadataElement( name="test" )
...
>>> DataTest.metadata_spec.test.name
'test'
>>> DataTest.metadata_spec.test.desc
'test'
>>> type( DataTest.metadata_spec.test.param )
<class 'galaxy.model.metadata.MetadataParameter'>
edam_data = 'data_0006'
edam_format = 'format_1915'
file_ext = 'data'
CHUNKABLE = False
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

copy_safe_peek = True
is_binary: Union[bool, typing_extensions.Literal[maybe]] = True
composite_type: Optional[str] = None
primary_file_name = 'index'
allow_datatype_change: Optional[bool] = None
track_type: Optional[str] = None
data_sources: Dict[str, str] = {}
dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>}
__init__(**kwd)[source]

Initialize the datatype

supported_display_apps: Dict[str, Any] = {}
composite_files: Dict[str, Any] = {}
classmethod is_datatype_change_allowed()[source]

Returns the value of the allow_datatype_change class attribute if set in a subclass, or True iff the datatype is not composite.
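The rule above can be illustrated with a small stand-alone sketch. The classes here are hypothetical stand-ins that only carry the two class attributes involved; they are not Galaxy's `Data` class.

```python
from typing import Optional


class SketchDatatype:
    """Hypothetical stand-in carrying only the two relevant class attributes."""

    composite_type: Optional[str] = None
    allow_datatype_change: Optional[bool] = None

    @classmethod
    def is_datatype_change_allowed(cls) -> bool:
        # An explicit setting in a subclass always wins.
        if cls.allow_datatype_change is not None:
            return cls.allow_datatype_change
        # Otherwise changes are allowed only for non-composite datatypes.
        return cls.composite_type is None


class SketchComposite(SketchDatatype):
    composite_type = "auto_primary_file"


class SketchCompositeOptIn(SketchComposite):
    allow_datatype_change = True
```

With these definitions the plain datatype allows a change, the composite one does not, and the opt-in subclass allows it again.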

get_raw_data(dataset)[source]

Returns the full data. To stream it, open the file_name and read/write as needed.

dataset_content_needs_grooming(file_name)[source]

This function is called on an output dataset file after the content is initially generated.

groom_dataset_content(file_name)[source]

This function is called on an output dataset file if dataset_content_needs_grooming returns True.

init_meta(dataset, copy_from=None)[source]
set_meta(dataset: Any, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

missing_meta(dataset, check=None, skip=None)[source]

Checks for empty metadata values. Returns False if no non-optional metadata is missing, and the missing metadata key otherwise. Specifying a list of ‘check’ values will only check those names provided; when used, optionality is ignored. Specifying a list of ‘skip’ items will return True even when a named metadata value is missing; when used, optionality is ignored.

set_max_optional_metadata_filesize(max_value)[source]
get_max_optional_metadata_filesize()[source]
property max_optional_metadata_filesize
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

to_archive(dataset, name='')[source]

Collect archive paths and file handles that need to be exported when archiving dataset.

Parameters
  • dataset – HistoryDatasetAssociation

  • name – archive name, in collection context corresponds to collection name(s) and element_identifier, joined by ‘/’, e.g ‘fastq_collection/sample1/forward’

display_data(trans, data, preview=False, filename=None, to_ext=None, **kwd)[source]

Displays data in central pane if preview is True, else handles download.

Datatypes should be very careful when overriding this method; this interface between datatypes and Galaxy will likely change.

TODO: Document alternatives to overriding this method (data providers?).

display_as_markdown(dataset_instance, markdown_format_helpers)[source]

Prepare for embedding dataset into a basic Markdown document.

This is a somewhat experimental interface and should not be implemented on datatypes not tightly tied to a Galaxy version (e.g. datatypes in the Tool Shed).

Speaking very loosely, the datatype should load a bounded amount of data from the supplied dataset instance and prepare it for embedding into Markdown. This should be relatively vanilla Markdown: the result is bleached, and it should not contain nested Galaxy Markdown directives.

If the data cannot reasonably be displayed, just indicate this and do not throw an exception.

display_name(dataset)[source]

Returns formatted html of dataset name

display_info(dataset)[source]

Returns formatted html of dataset info

repair_methods(dataset)[source]

Unimplemented method, returns dict with method/option for repairing errors

get_mime()[source]

Returns the mime type of the datatype

add_display_app(app_id, label, file_function, links_function)[source]

Adds a display app to the datatype.

  • app_id – a unique id

  • label – the primary display label, e.g., display at ‘UCSC’

  • file_function – a string containing the name of the function that returns a properly formatted display

  • links_function – a string containing the name of the function that returns a list of (link_name, link)

remove_display_app(app_id)[source]

Removes a display app from the datatype

clear_display_apps()[source]
add_display_application(display_application)[source]

New style display applications

get_display_application(key, default=None)[source]
get_display_applications_by_dataset(dataset, trans)[source]
get_display_types()[source]

Returns display types available

get_display_label(type)[source]

Returns primary label for display app

as_display_type(dataset, type, **kwd)[source]

Returns modified file contents for a particular display type

Returns a list of tuples of (name, link) for a particular display type. No check on ‘access’ permissions is done here - if you can view the dataset, you can also save it or send it to a destination outside of Galaxy, so Galaxy security restrictions do not apply anyway.

get_converter_types(original_dataset, datatypes_registry)[source]

Returns available converters by type for this dataset

find_conversion_destination(dataset, accepted_formats: List[str], datatypes_registry, **kwd) Tuple[bool, Optional[str], Optional[DatasetInstance]][source]

Returns ( direct_match, converted_ext, existing converted dataset )

convert_dataset(trans, original_dataset, target_type, return_output=False, visible=True, deps=None, target_context=None, history=None)[source]

This function adds a job to the queue to convert a dataset to another type. Returns a message about success/failure.

after_setting_metadata(dataset)[source]

This function is called on the dataset after metadata is set.

before_setting_metadata(dataset)[source]

This function is called on the dataset before metadata is set.

add_composite_file(name, **kwds)[source]
property writable_files
get_writable_files_for_dataset(dataset)[source]
get_composite_files(dataset=None)[source]
generate_primary_file(dataset=None)[source]
property has_resolution
matches_any(target_datatypes: List[Any]) bool[source]

Check if this datatype is of any of the target_datatypes or is a subtype thereof.
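One way to read this check is as an `isinstance` test against the classes of the targets. The sketch below assumes that targets may be given either as classes or as instances; that assumption, the function name and the three sketch classes are illustrative only.

```python
from inspect import isclass
from typing import Any, List


class SketchData:
    """Hypothetical base datatype."""


class SketchText(SketchData):
    """Hypothetical text datatype."""


class SketchTabular(SketchText):
    """Hypothetical tabular datatype, a subtype of text."""


def matches_any(datatype: Any, target_datatypes: List[Any]) -> bool:
    """Return True if datatype is an instance of any target class or a subclass of one."""
    # Normalise every target to a class so that instances are accepted too.
    target_classes = tuple(
        target if isclass(target) else type(target) for target in target_datatypes
    )
    return isinstance(datatype, target_classes)
```

A `SketchTabular` instance matches a `SketchText` target because it is a subtype, while the reverse does not hold.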

static merge(split_files, output_file)[source]

Merge files with shutil.copyfileobj(), which does not hit the maximum-argument limitation of cat. gz and bz2 files are also supported.
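A minimal sketch of this kind of merge, streaming each part into the output with the standard library's `shutil.copyfileobj()`. The function name `merge_files` is hypothetical and the error handling is an assumption; this is not the code of `Data.merge` itself.

```python
import shutil
from typing import Sequence


def merge_files(split_files: Sequence[str], output_file: str) -> None:
    """Concatenate the given files, in order, into output_file."""
    if not split_files:
        raise ValueError(f"Asked to merge zero files as {output_file}")
    with open(output_file, "wb") as destination:
        for path in split_files:
            with open(path, "rb") as source:
                # Copies in fixed-size chunks, so large parts never sit fully in memory
                # and no command line has to carry the list of file names.
                shutil.copyfileobj(source, destination)
```

Byte-level concatenation also works for gzip parts, because a sequence of gzip members is itself a valid gzip stream.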

get_visualizations(dataset)[source]

Returns a list of visualizations for datatype.

has_dataprovider(data_format)[source]

Returns True if data_format is available in dataproviders.

dataprovider(dataset, data_format, **settings)[source]

Base dataprovider factory for all datatypes that returns the proper provider for the given data_format or raises a NoProviderAvailable.
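The factory pattern described by `has_dataprovider` and `dataprovider` can be sketched as a lookup in a name-to-factory mapping. Everything below is a hypothetical stand-in, including the locally defined `NoProviderAvailable`; it only illustrates the dispatch, not Galaxy's provider classes.

```python
from typing import Any, Callable, Dict, Iterable, Iterator


class NoProviderAvailable(Exception):
    """Stand-in for the exception named above, defined here so the sketch runs alone."""


class SketchDatatype:
    """Hypothetical datatype holding a mapping from format name to provider factory."""

    def __init__(self, lines: Iterable[str]) -> None:
        self._lines = list(lines)
        self.dataproviders: Dict[str, Callable[..., Iterator[Any]]] = {
            "base": self._base_provider,
        }

    def _base_provider(self, **settings: Any) -> Iterator[Any]:
        # The simplest provider: iterate over the stored data unchanged.
        return iter(self._lines)

    def has_dataprovider(self, data_format: str) -> bool:
        return data_format in self.dataproviders

    def dataprovider(self, data_format: str, **settings: Any) -> Iterator[Any]:
        if not self.has_dataprovider(data_format):
            raise NoProviderAvailable(data_format)
        return self.dataproviders[data_format](**settings)
```

Callers that cannot handle the exception can test `has_dataprovider()` first and fall back to another format.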

validate(dataset, **kwd)[source]
base_dataprovider(dataset, **settings)[source]
chunk_dataprovider(dataset, **settings)[source]
chunk64_dataprovider(dataset, **settings)[source]
handle_dataset_as_image(hda) str[source]
class galaxy.datatypes.data.Text(**kwd)[source]

Bases: Data

edam_format = 'format_2330'
file_ext = 'txt'
line_class = 'line'
is_binary: Union[bool, typing_extensions.Literal[maybe]] = False
get_mime()[source]

Returns the mime type of the datatype

set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

estimate_file_lines(dataset)[source]

Perform a rough estimate by extrapolating number of lines from a small read.
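The extrapolation idea can be sketched as follows: count the newlines in a small sample from the start of the file and scale by the ratio of file size to sample size. The function name and the default sample size are assumptions made for this example.

```python
import os


def estimate_line_count(path: str, sample_size: int = 1024 * 1024) -> int:
    """Estimate the number of lines in a file from a sample read at its start."""
    file_size = os.path.getsize(path)
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    if not sample:
        return 0
    sample_lines = sample.count(b"\n")
    if len(sample) >= file_size:
        # The sample covered the whole file, so the count is exact.
        return sample_lines
    # Assume the rest of the file has the same average line length as the sample.
    return int(sample_lines * file_size / len(sample))
```

The estimate is only as good as the assumption that line lengths at the start of the file are representative of the rest.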

count_data_lines(dataset)[source]

Count the number of lines of data in dataset, skipping all blank lines and comments.
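A minimal sketch of such a count, assuming that a comment is any line whose first non-blank character is `#`. The function name and the `comment_char` parameter are hypothetical.

```python
from typing import Iterable


def count_non_comment_lines(lines: Iterable[str], comment_char: str = "#") -> int:
    """Count lines that are neither blank nor comments."""
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_char):
            count += 1
    return count
```

Since it accepts any iterable of strings, an open text file can be passed in directly and is read one line at a time.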

set_peek(dataset, line_count=None, WIDTH=256, skipchars=None, line_wrap=True, **kwd)[source]

Set the peek. This method is used by various subclasses of Text.

classmethod split(input_datasets, subdir_generator_function, split_params)[source]

Split the input files by line.

line_dataprovider(dataset, **settings)[source]

Returns an iterator over the dataset’s lines (that have been stripped) optionally excluding blank lines and lines that start with a comment character.

regex_line_dataprovider(dataset, **settings)[source]

Returns an iterator over the dataset’s lines optionally including/excluding lines that match one or more regex filters.
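Regex-based line filtering of this kind can be sketched with a generator. The parameter names `patterns` and `invert` are assumptions made for this example and are not the settings the Galaxy provider accepts.

```python
import re
from typing import Iterable, Iterator, Sequence


def filter_lines_by_regex(
    lines: Iterable[str], patterns: Sequence[str] = (), invert: bool = False
) -> Iterator[str]:
    """Yield lines matching any pattern, or only non-matching lines when invert is True."""
    compiled = [re.compile(pattern) for pattern in patterns]
    if not compiled:
        # No filters given: pass every line through unchanged.
        yield from lines
        return
    for line in lines:
        matched = any(regex.search(line) for regex in compiled)
        if matched != invert:
            yield line
```

For example, `list(filter_lines_by_regex(data, [r"^chr1\b"]))` keeps only lines for `chr1`, and adding `invert=True` keeps everything else.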

dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'line': <function Text.line_dataprovider>, 'regex-line': <function Text.regex_line_dataprovider>}
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.data.Directory(**kwd)[source]

Bases: Data

Class representing a directory of files.

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.data.GenericAsn1(**kwd)[source]

Bases: Text

Class for generic ASN.1 text format

edam_data = 'data_0849'
edam_format = 'format_1966'
file_ext = 'asn1'
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.data.LineCount(**kwd)[source]

Bases: Text

Dataset contains a single line with a single integer that denotes the line count for a related dataset. Used for custom builds.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.data.Newick(**kwd)[source]

Bases: Text

New Hampshire/Newick Format

edam_data = 'data_0872'
edam_format = 'format_1910'
file_ext = 'newick'
sniff(filename)[source]

Returns False because the Newick format is too general to be sniffed.

get_visualizations(dataset)[source]

Returns a list of visualizations for datatype.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.data.Nexus(**kwd)[source]

Bases: Text

Nexus format as used by PAUP, MrBayes, etc.

edam_data = 'data_0872'
edam_format = 'format_1912'
file_ext = 'nex'
sniff_prefix(file_prefix: FilePrefix)[source]

All Nexus files simply have ‘#NEXUS’ on their first line.

get_visualizations(dataset)[source]

Returns a list of visualizations for datatype.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
galaxy.datatypes.data.get_test_fname(fname)[source]

Returns test data filename

galaxy.datatypes.data.get_file_peek(file_name, WIDTH=256, LINE_COUNT=5, skipchars=None, line_wrap=True)[source]

Returns the first LINE_COUNT lines wrapped to WIDTH.

>>> def assert_peek_is(file_name, expected, *args, **kwd):
...     path = get_test_fname(file_name)
...     peek = get_file_peek(path, *args, **kwd)
...     assert peek == expected, "%s != %s" % (peek, expected)
>>> assert_peek_is('0_nonewline', u'0')
>>> assert_peek_is('0.txt', u'0\n')
>>> assert_peek_is('4.bed', u'chr22\t30128507\t31828507\tuc003bnx.1_cds_2_0_chr22_29227_f\t0\t+\n', LINE_COUNT=1)
>>> assert_peek_is('1.bed', u'chr1\t147962192\t147962580\tCCDS989.1_cds_0_0_chr1_147962193_r\t0\t-\nchr1\t147984545\t147984630\tCCDS990.1_cds_0_0_chr1_147984546_f\t0\t+\n', LINE_COUNT=2)
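The peek behaviour shown in the doctest can be approximated on any text stream. This is a simplified sketch rather than the actual `get_file_peek`: it ignores `skipchars` and `line_wrap`, it reads each line fully before wrapping it, and the name `peek_lines` is hypothetical.

```python
import io
from typing import List, TextIO


def peek_lines(stream: TextIO, width: int = 256, line_count: int = 5) -> str:
    """Return up to line_count display lines, breaking long lines at width characters."""
    pieces: List[str] = []
    while len(pieces) < line_count:
        line = stream.readline()
        if not line:
            break  # end of input
        body = line.rstrip("\n")
        newline = line[len(body):]  # "\n", or "" for a final line without a newline
        # Cut the line into width-sized segments; an empty line still yields one segment.
        segments = [body[i:i + width] for i in range(0, len(body), width)] or [""]
        for position, segment in enumerate(segments):
            if len(pieces) >= line_count:
                break
            is_last = position == len(segments) - 1
            pieces.append(segment + (newline if is_last else "\n"))
    return "".join(pieces)


print(repr(peek_lines(io.StringIO("a\nb\nc\n"), line_count=2)))
```

A trailing line without a newline is returned as it is, which mirrors the `'0_nonewline'` case in the doctest.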

galaxy.datatypes.flow module

Flow analysis datatypes.

class galaxy.datatypes.flow.FCS(**kwd)[source]

Bases: Binary

Class describing an FCS binary file

file_ext = 'fcs'
set_peek(dataset)[source]

Set the peek and blurb text

display_peek(dataset)[source]

Create HTML table, used for displaying peek

sniff_prefix(file_prefix: FilePrefix)[source]

Checking if the file is in FCS format. Should read FCS2.0, FCS3.0 and FCS3.1

Based on flowcore: https://github.com/RGLab/flowCore/blob/27141b792ad65ae8bd0aeeef26e757c39cdaefe7/R/IO.R#L667
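A much-reduced sketch of the version check: FCS files begin with a six-byte ASCII version identifier, so the three versions listed above can be recognised from the first bytes. The real sniffer may validate more of the header; `looks_like_fcs` is a hypothetical name.

```python
FCS_VERSIONS = (b"FCS2.0", b"FCS3.0", b"FCS3.1")


def looks_like_fcs(header: bytes) -> bool:
    """Return True if the data starts with one of the supported FCS version identifiers."""
    # The version identifier occupies the first six bytes of the header.
    return header[:6] in FCS_VERSIONS
```

Passing the first few bytes of a file is enough, so the check does not need to read the whole dataset.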

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)

galaxy.datatypes.genetics module

rgenetics datatypes. Use at your peril. Ross Lazarus, for the rgenetics and Galaxy projects.

Genome graphs datatypes are derived from Interval datatypes. Genome graphs datasets have a header row with appropriate column names. The first column is always the marker, e.g. column name = rs, first row = rs12345 if the rows are SNPs. Subsequent row values are all numeric! Will fail if there are any non-numeric (e.g. ‘+’ or ‘NA’) values. Ross Lazarus for rgenetics, August 20 2007.

class galaxy.datatypes.genetics.GenomeGraphs(**kwd)[source]

Bases: Tabular

Tab delimited data containing a marker id and any number of numeric values

file_ext = 'gg'
__init__(**kwd)[source]

Initialize gg datatype, by adding UCSC display apps

set_meta(dataset, **kwd)[source]

Tries to determine the number of columns as well as those columns that contain numerical values in the dataset. A skip parameter is used because various tabular data types reuse this function, and their data type classes are responsible to determine how many invalid comment lines should be skipped. Using None for skip will cause skip to be zero, but the first line will be processed as a header. A max_data_lines parameter is used because various tabular data types reuse this function, and their data type classes are responsible to determine how many data lines should be processed to ensure that the non-optional metadata parameters are properly set; if used, optional metadata parameters will be set to None, unless the entire file has already been read. Using None for max_data_lines will process all data lines.

Items of interest:

  1. We treat ‘overwrite’ as always True (we always want to set tabular metadata when called).

  2. If a tabular file has no data, it will have one column of type ‘str’.

  3. We used to check only the first 100 lines when setting metadata and this class’s set_peek() method read the entire file to determine the number of lines in the file. Since metadata can now be processed on cluster nodes, we’ve merged the line count portion of the set_peek() processing here, and we now check the entire contents of the file.

as_ucsc_display_file(dataset, **kwd)[source]

Returns file

From the ever-helpful Angie Hinrichs (angie@soe.ucsc.edu): a genome graphs call looks like this

http://genome.ucsc.edu/cgi-bin/hgGenome?clade=mammal&org=Human&db=hg18&hgGenome_dataSetName=dname&hgGenome_dataSetDescription=test&hgGenome_formatType=best%20guess&hgGenome_markerType=best%20guess&hgGenome_columnLabels=best%20guess&hgGenome_maxVal=&hgGenome_labelVals=&hgGenome_maxGapToFill=25000000&hgGenome_uploadFile=http://galaxy.esphealth.org/datasets/333/display/index&hgGenome_doSubmitUpload=submit

Galaxy gives this for an interval file

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&position=chr1:1-1000&hgt.customText=http%3A%2F%2Fgalaxy.esphealth.org%2Fdisplay_as%3Fid%3D339%26display_app%3Ducsc

make_html_table(dataset)[source]

Create HTML table, used for displaying peek

validate(dataset, **kwd)[source]

Validate a gg file - all numeric after header row
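The rule can be sketched as a check over TAB-separated lines. This assumes, following the module description, that the first column holds the marker id and is exempt from the numeric requirement; `all_numeric_after_header` is a hypothetical helper, not the `validate` method itself.

```python
from typing import Iterable


def all_numeric_after_header(lines: Iterable[str]) -> bool:
    """Return True if every value after the marker column is numeric in all data rows."""
    rows = [line.rstrip("\r\n").split("\t") for line in lines if line.strip()]
    for row in rows[1:]:  # rows[0] is the header row
        for value in row[1:]:  # row[0] is the marker id
            try:
                float(value)
            except ValueError:
                return False
    return True
```

Note that `float()` accepts strings such as `nan`, so a stricter validator might reject those as well; values like `NA` or `+` fail as described in the module notes.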

sniff_prefix(file_prefix: FilePrefix)[source]

Determines whether the file is in gg format

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'test_space.txt' )
>>> GenomeGraphs().sniff( fname )
False
>>> fname = get_test_fname( '1.gg' )
>>> GenomeGraphs().sniff( fname )
True
get_mime()[source]

Returns the mime type of the datatype

metadata_spec: MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'markerCol': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.genetics.rgTabList(**kwd)[source]

Bases: Tabular

For sampleid and featureid lists of exclusions or inclusions in the clean tool. Featureid subsets on statistical criteria -> specialized display such as gg.

file_ext = 'rgTList'
__init__(**kwd)[source]

Initialize featurelist datatype

display_peek(dataset)[source]

Returns formatted html of peek

get_mime()[source]

Returns the mime type of the datatype

metadata_spec: MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.rgSampleList(**kwd)[source]

Bases: rgTabList

For sampleid exclusions or inclusions in the clean tool. Output from QC, e.g. excess het, gender error, IBD pair member, eigen outlier, excess Mendel errors,… Since they can be uploaded, they should be flexible, but they are persistent at least. Same infrastructure for expression?

file_ext = 'rgSList'
__init__(**kwd)[source]

Initialize samplelist datatype

metadata_spec: MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.rgFeatureList(**kwd)[source]

Bases: rgTabList

For featureid lists of exclusions or inclusions in the clean tool. Output from QC, e.g. low MAF, high missingness, bad HWE in controls, excess Mendel errors,… Featureid subsets on statistical criteria -> specialized display such as gg. Same infrastructure for expression?

file_ext = 'rgFList'
__init__(**kwd)[source]

Initialize featurelist datatype

metadata_spec: MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Rgenetics(**kwd)[source]

Bases: Html

Base class to use for rgenetics datatypes. Derived from Html; composite datatype elements are stored in the extra files path.

composite_type: Optional[str] = 'auto_primary_file'
file_ext = 'rgenetics'
generate_primary_file(dataset=None)[source]
regenerate_primary_file(dataset)[source]

Cannot do this until we are setting metadata.

get_mime()[source]

Returns the mime type of the datatype

set_meta(dataset, **kwd)[source]

For lped/pbed, e.g.

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.SNPMatrix(**kwd)[source]

Bases: Rgenetics

BioC SNPMatrix Rgenetics data collections

file_ext = 'snpmatrix'
set_peek(dataset, **kwd)[source]

Set the peek. This method is used by various subclasses of Text.

sniff(filename)[source]

need to check the file header hex code

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Lped(**kwd)[source]

Bases: Rgenetics

linkage pedigree (ped,map) Rgenetics data collections

file_ext = 'lped'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Pphe(**kwd)[source]

Bases: Rgenetics

Plink phenotype file - header must have FID IID… (Rgenetics data collections)

file_ext = 'pphe'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Fphe(**kwd)[source]

Bases: Rgenetics

fbat pedigree file - mad format with ! as first char on header row Rgenetics data collections

file_ext = 'fphe'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Phe(**kwd)[source]

Bases: Rgenetics

Phenotype file

file_ext = 'phe'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Fped(**kwd)[source]

Bases: Rgenetics

FBAT pedigree format - single file, map is header row of rs numbers. Strange. Rgenetics data collections

file_ext = 'fped'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Pbed(**kwd)[source]

Bases: Rgenetics

Plink Binary compressed 2bit/geno Rgenetics data collections

file_ext = 'pbed'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.ldIndep(**kwd)[source]

Bases: Rgenetics

LD-depleted Plink binary, compressed 2bit/geno (LD is a good measure of redundancy of information). This is really a Plink binary, but some tools work better with less redundancy, so they are constrained to these files.

file_ext = 'ldreduced'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Eigenstratgeno(**kwd)[source]

Bases: Rgenetics

Eigenstrat format - may be able to get rid of this if we move to shellfish Rgenetics data collections

file_ext = 'eigenstratgeno'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Eigenstratpca(**kwd)[source]

Bases: Rgenetics

Eigenstrat PCA file for case control adjustment Rgenetics data collections

file_ext = 'eigenstratpca'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Snptest(**kwd)[source]

Bases: Rgenetics

BioC snptest Rgenetics data collections

file_ext = 'snptest'
metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.IdeasPre(**kwd)[source]

Bases: Html

This datatype defines the input format required by IDEAS: https://academic.oup.com/nar/article/44/14/6721/2468150

The IDEAS preprocessor tool produces an output using this format. The extra_files_path of the primary input dataset contains the following files and directories.

- chromosome_windows.txt (optional)
- chromosomes.bed (optional)
- IDEAS_input_config.txt
- compressed archived tmp directory containing a number of compressed bed files.

composite_type: Optional[str] = 'auto_primary_file'
file_ext = 'ideaspre'
__init__(**kwd)[source]

Initialize the datatype

set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

generate_primary_file(dataset=None)[source]
regenerate_primary_file(dataset)[source]
metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'chrom_bed': <galaxy.model.metadata.MetadataElementSpec object>, 'chrom_windows': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'input_config': <galaxy.model.metadata.MetadataElementSpec object>, 'tmp_archive': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
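
The extra files layout described for IdeasPre can be checked with a sketch like the one below. It is an illustration under stated assumptions, not part of the Galaxy API: `classify_extra_files` and the two constants are hypothetical names, and the compressed tmp archive is not checked because its file name is not given in the description.

```python
# File names taken from the IdeasPre description. The name of the compressed
# tmp archive is not stated there, so it ends up under "other".
REQUIRED_FILES = ("IDEAS_input_config.txt",)
OPTIONAL_FILES = ("chromosome_windows.txt", "chromosomes.bed")


def classify_extra_files(names):
    """Group file names found in an extra files directory (hypothetical helper)."""
    found = set(names)
    return {
        "missing_required": [n for n in REQUIRED_FILES if n not in found],
        "optional_present": [n for n in OPTIONAL_FILES if n in found],
        "other": sorted(found.difference(REQUIRED_FILES, OPTIONAL_FILES)),
    }
```

The names could come from `os.listdir()` on the directory that holds the dataset's extra files; an empty `missing_required` list means the one mandatory file is present.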

class galaxy.datatypes.genetics.Pheno(**kwd)[source]

Bases: Tabular

base class for pheno files

file_ext = 'pheno'
metadata_spec: MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.RexpBase(**kwd)[source]

Bases: Html

Base class for BioC data structures in Galaxy. Must be constructed with the pheno data in place, since that goes into the metadata for each instance.

file_ext = 'rexpbase'
html_table = None
composite_type: Optional[str] = 'auto_primary_file'
__init__(**kwd)[source]

Initialize the datatype

generate_primary_file(dataset=None)[source]

This is called only at upload to write the HTML file. The datasets cannot be renamed here; unfortunately, they come with the default name.

get_mime()[source]

Returns the mime type of the datatype

get_phecols(phenolist, maxConc=20)[source]

Sept 2009: cannot use whitespace to split - make a more complex structure here and adjust the methods that rely on this structure.

Return interesting phenotype column names for an rexpression eset or affybatch to use in array subsetting and so on. Returns a data structure for a dynamic Galaxy select parameter.

A column with only 1 value doesn't change, so is not interesting for analysis. A column with a different value in every row is equivalent to a unique identifier, so is also not interesting for anova or limma analysis. Both of these are removed after the concordance (count of unique terms) is constructed for each column. Then a complication: each remaining pair of columns is tested for redundancy. If two columns are always paired, then only one is needed :)
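
The column selection rules in the get_phecols description can be sketched as follows. This is a simplified illustration, not the Galaxy implementation: `interesting_columns` is a hypothetical helper, the input is assumed to be a header list plus rows of equal length, and the `maxConc` parameter and the select-parameter return structure are not modelled.

```python
def interesting_columns(header, rows):
    """Return names of columns worth offering for subsetting (hypothetical helper)."""
    n_rows = len(rows)
    columns = {name: [row[i] for row in rows] for i, name in enumerate(header)}

    # Drop constant columns (one unique value) and identifier-like columns
    # (a different value in every row).
    kept = [name for name in header if 1 < len(set(columns[name])) < n_rows]

    # Drop a column when it is always paired with an already accepted column,
    # i.e. the values of the two columns map one-to-one onto each other.
    result = []
    for name in kept:
        redundant = False
        for other in result:
            pairs = set(zip(columns[other], columns[name]))
            if len(pairs) == len(set(columns[other])) == len(set(columns[name])):
                redundant = True
                break
        if not redundant:
            result.append(name)
    return result
```

For example, with columns `id` (unique per row), `batch` (constant), `sex`, `gender` (always paired with `sex`) and `treatment`, only `sex` and `treatment` survive.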

get_pheno(dataset)[source]

Expects a .pheno file in the extra_files_dir - ugh. Note that R is weird and by default writes the row names into the file, so the columns no longer line up with the header unless you tell it not to. A file can be written as write.table(file='foo.pheno',pData(foo),sep='\t',quote=F,row.names=F)
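
A minimal sketch of reading such a file back in Python, assuming it is tab-separated with one header row and was written without row names. `read_pheno` is a hypothetical helper, not a Galaxy function; it raises an error when data rows are wider or narrower than the header, which is the symptom of row names having been written.

```python
import csv
import io


def read_pheno(handle, delimiter="\t"):
    """Return (column_names, rows) from an open pheno file (hypothetical helper)."""
    reader = csv.reader(handle, delimiter=delimiter)
    column_names = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError(
                "data row width differs from header; "
                "was the file written with row.names=F?"
            )
    return column_names, rows


# Usage with an in-memory file standing in for foo.pheno
names, data = read_pheno(io.StringIO("sex\ttreatment\nM\tctl\nF\ttrt\n"))
```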

set_peek(dataset, **kwd)[source]

Expects a .pheno file in the extra_files_dir - ugh. Note that R is weird and does not include the row.name in the header. Why?

get_peek(dataset)[source]

expects a .pheno file in the extra_files_dir - ugh

get_file_peek(filename)[source]

can’t really peek at a filename - need the extra_files_path and such?

regenerate_primary_file(dataset)[source]

cannot do this until we are setting metadata

init_meta(dataset, copy_from=None)[source]
set_meta(dataset, **kwd)[source]

NOTE: we apply the tabular machinery to the phenodata extracted from a BioC eSet or affybatch.

make_html_table(pp='nothing supplied from peek\n')[source]

Create HTML table, used for displaying peek

display_peek(dataset)[source]

Returns formatted html of peek

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.genetics.Affybatch(**kwd)[source]

Bases: RexpBase

derived class for BioC data structures in Galaxy

file_ext = 'affybatch'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'pheCols':