Warning

This document is for an old release of Galaxy.

galaxy.datatypes package

Subpackages

Submodules

galaxy.datatypes.annotation module

class galaxy.datatypes.annotation.SnapHmm(**kwd)[source]

Bases: Text

file_ext = 'snaphmm'
edam_data = 'data_1364'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek. This method is used by various subclasses of Text.

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

sniff_prefix(file_prefix: FilePrefix) bool[source]

SNAP model files start with zoeHMM
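A minimal sketch of the check described above, assuming it is enough to compare the start of the file with the literal string `zoeHMM`. The helper name `looks_like_snaphmm` is hypothetical and is not part of Galaxy; the real method receives a `FilePrefix` object rather than a plain string.

```python
def looks_like_snaphmm(prefix: str) -> bool:
    """Return True if the text prefix starts with the SNAP model marker.

    Hypothetical stand-alone helper for illustration; Galaxy's
    sniff_prefix() works on a FilePrefix object instead of a string.
    """
    return prefix.startswith("zoeHMM")
```

For example, `looks_like_snaphmm("zoeHMM ...")` is true, while a FASTA header such as `">seq1"` is rejected.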

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.annotation.Augustus(**kwd)[source]

Bases: CompressedArchive

Class describing an Augustus prediction model

file_ext = 'augustus'
edam_data = 'data_0950'
compressed = True
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

sniff(filename: str) bool[source]

Augustus archives always contain the same files

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.anvio module

Datatypes for Anvi’o https://github.com/merenlab/anvio

class galaxy.datatypes.anvio.AnvioComposite(**kwd)[source]

Bases: Html

Base class to use for Anvi’o composite datatypes. These generally consist of a SQLite database plus optional additional files.

file_ext = 'anvio_composite'
composite_type: str | None = 'auto_primary_file'
generate_primary_file(dataset: HasExtraFilesAndMetadata) str[source]

This is called only at upload to write the HTML file. The datasets cannot be renamed here; unfortunately, they come with the default names.

get_mime() str[source]

Returns the mime type of the datatype

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML content, used for displaying peek.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioDB(*args, **kwd)[source]

Bases: AnvioComposite

Class for AnvioDB database files.

file_ext = 'anvio_db'
__init__(*args, **kwd)[source]

Initialize the datatype

set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Set the anvio_basename based upon actual extra_files_path contents.

metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioStructureDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Structure DB database files.

file_ext = 'anvio_structure_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioGenomesDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Genomes DB database files.

file_ext = 'anvio_genomes_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioContigsDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Contigs DB database files.

file_ext = 'anvio_contigs_db'
__init__(*args, **kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioProfileDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Profile DB database files.

file_ext = 'anvio_profile_db'
__init__(*args, **kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioPanDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Pan DB database files.

file_ext = 'anvio_pan_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.anvio.AnvioSamplesDB(*args, **kwd)[source]

Bases: AnvioDB

Class for Anvio Samples DB database files.

file_ext = 'anvio_samples_db'
metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.assembly module

Velvet datatypes for the Velvet assembler tool in Galaxy. Author: James E Johnson, University of Minnesota.

class galaxy.datatypes.assembly.Amos(**kwd)[source]

Bases: Text

Class describing the AMOS assembly file

edam_data = 'data_0925'
edam_format = 'format_3582'
file_ext = 'afg'
sniff_prefix(file_prefix: FilePrefix) bool[source]

Determines whether the file is in the AMOS assembly file format. Example:

{CTG
iid:1
eid:1
seq:
CCTCTCCTGTAGAGTTCAACCGA-GCCGGTAGAGTTTTATCA
.
qlt:
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
.
{TLE
src:1027
off:0
clr:618,0
gap:
250 612
.
}
}
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.assembly.Sequences(**kwd)[source]

Bases: Fasta

Class describing the Sequences file generated by velveth

edam_data = 'data_0925'
file_ext = 'sequences'
sniff_prefix(file_prefix: FilePrefix) bool[source]

Determines whether the file is in the FASTA format produced by velveth. The ID line has 3 fields separated by tabs (sequence_name, sequence_index, category):

>SEQUENCE_0_length_35   1       1
GGATATAGGGCCAACCCAACTCAACGGCCTGTCTT
>SEQUENCE_1_length_35   2       1
CGACGAATGACAGGTCACGAATTTGGCGGGGATTA
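The ID-line rule above can be sketched as follows. This is an illustration only, not Galaxy's implementation: the helper name `looks_like_velveth_sequences` is hypothetical, and the assumption that the second and third fields are integers is taken from the example rather than from a specification.

```python
def looks_like_velveth_sequences(text: str) -> bool:
    """Check that every FASTA ID line has three tab-separated fields.

    Hypothetical helper. Assumes sequence_index and category are
    integers, as in the example shown above.
    """
    headers = [line for line in text.splitlines() if line.startswith(">")]
    if not headers:
        return False
    for header in headers:
        fields = header[1:].split("\t")
        if len(fields) != 3:
            return False
        name, index, category = fields
        if not name or not index.isdigit() or not category.isdigit():
            return False
    return True
```

A plain FASTA file whose ID lines carry no tab-separated index and category fields would fail this check, which is what distinguishes the velveth output from ordinary FASTA.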
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'sequences': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.assembly.Roadmaps(**kwd)[source]

Bases: Text

Class describing the Roadmaps file generated by velveth

edam_format = 'format_2561'
file_ext = 'roadmaps'
sniff_prefix(file_prefix: FilePrefix) bool[source]

Determines whether the file is a velveth-produced RoadMap file. Example:

142858  21      1
ROADMAP 1
ROADMAP 2
…

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.assembly.Velvet(**kwd)[source]

Bases: Html

composite_type: str | None = 'auto_primary_file'
file_ext = 'velvet'
__init__(**kwd)[source]

Initialize the datatype

generate_primary_file(dataset: HasExtraFilesAndMetadata) str[source]
regenerate_primary_file(dataset: DatasetProtocol) None[source]

This cannot be done until metadata is being set.

set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Set the number of lines of data in dataset.

metadata_spec: MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'long_reads': <galaxy.model.metadata.MetadataElementSpec object>, 'paired_end_reads': <galaxy.model.metadata.MetadataElementSpec object>, 'short2_reads': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.binary module

Binary classes

class galaxy.datatypes.binary.Binary(**kwd)[source]

Bases: Data

Binary data

edam_format = 'format_2333'
file_ext = 'binary'
static register_sniffable_binary_format(data_type, ext, type_class)[source]

Deprecated method.

static register_unsniffable_binary_ext(ext)[source]

Deprecated method.

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

get_mime() str[source]

Returns the mime type of the datatype

get_structured_content(dataset, content_type, **kwargs)[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Ab1(**kwd)[source]

Bases: Binary

Class describing an ab1 binary sequence file

file_ext = 'ab1'
edam_format = 'format_3000'
edam_data = 'data_0924'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Idat(**kwd)[source]

Bases: Binary

Binary data in idat format

file_ext = 'idat'
edam_format = 'format_2058'
edam_data = 'data_2603'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Cel(**kwd)[source]

Bases: Binary

Cel File format described at: http://media.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cel.html

is_binary: bool | typing_extensions.Literal[maybe] = 'maybe'
file_ext = 'cel'
edam_format = 'format_1638'
edam_data = 'data_3110'
sniff(filename: str) bool[source]

Try to guess if the file is a Cel file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_3.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_4.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('test.gal')
>>> Cel().sniff(fname)
False
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Set metadata for Cel file.

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MashSketch(**kwd)[source]

Bases: Binary

Mash Sketch file. Sketches are used by the MinHash algorithm to allow fast distance estimations with low storage and memory requirements. To make a sketch, each k-mer in a sequence is hashed, which creates a pseudo-random identifier. By sorting these identifiers (hashes), a small subset from the top of the sorted list can represent the entire sequence (these are min-hashes). The more similar another sequence is, the more min-hashes it is likely to share.
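The sketching idea described above can be illustrated with a short, self-contained example. It is a sketch under stated assumptions, not Mash's implementation: SHA-1 stands in for whatever hash function Mash uses, the default `k` and `size` values are arbitrary, and the function names (`kmers`, `sketch`, `shared_fraction`) are hypothetical.

```python
import hashlib
from typing import Iterator, List


def kmers(sequence: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of the sequence."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def sketch(sequence: str, k: int = 4, size: int = 8) -> List[int]:
    """Return the `size` smallest distinct k-mer hashes (the min-hashes)."""
    hashes = {
        # SHA-1 is only a stand-in hash for this illustration.
        int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")
        for kmer in kmers(sequence, k)
    }
    return sorted(hashes)[:size]


def shared_fraction(a: List[int], b: List[int]) -> float:
    """Fraction of min-hashes that two sketches have in common."""
    if not a or not b:
        return 0.0
    return len(set(a) & set(b)) / min(len(a), len(b))
```

Two identical sequences produce identical sketches, so `shared_fraction` returns 1.0; sequences with no k-mers in common share no min-hashes. The shared fraction here is a simplified similarity measure, not the distance estimate that Mash reports.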

file_ext = 'msh'
is_binary: bool | typing_extensions.Literal[maybe] = 'maybe'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.CompressedArchive(**kwd)[source]

Bases: Binary

Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.

file_ext = 'compressed_archive'
compressed = True
is_binary: bool | typing_extensions.Literal[maybe] = 'maybe'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Meryldb(**kwd)[source]

Bases: CompressedArchive

MerylDB is a tar.gz archive containing 128 files: 64 data files and 64 index files.
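A hedged sketch of how such an archive layout could be checked with the standard library. It is not Galaxy's `sniff` implementation: the function names are hypothetical, and the `.merylData` and `.merylIndex` suffixes are assumptions made for this illustration.

```python
import tarfile
from typing import Iterable


def has_meryldb_layout(member_names: Iterable[str]) -> bool:
    """Return True if the names include 64 data and 64 index files.

    The file suffixes are assumptions made for this illustration.
    """
    names = list(member_names)
    n_data = sum(name.endswith(".merylData") for name in names)
    n_index = sum(name.endswith(".merylIndex") for name in names)
    return n_data == 64 and n_index == 64


def looks_like_meryldb(path: str) -> bool:
    """Open a (possibly gzip-compressed) tar file and check its layout."""
    if not tarfile.is_tarfile(path):
        return False
    with tarfile.open(path, "r") as archive:
        return has_meryldb_layout(archive.getnames())
```

Separating the name check from the file access keeps the rule itself easy to test without building a real archive.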

file_ext = 'meryldb'
sniff(filename: str) bool[source]

Try to guess if the file is a Meryldb file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Meryldb().sniff(fname)
False
>>> fname = get_test_fname('read-db.meryldb')
>>> Meryldb().sniff(fname)
True
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Visium(**kwd)[source]

Bases: CompressedArchive

Visium is a tar.gz archive with at least a ‘Spatial’ subfolder, a filtered h5 file and a raw h5 file.

file_ext = 'visium.tar.gz'
sniff(filename: str) bool[source]

Check the data structure: the archive contains h5 files and a spatial folder.
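The structural check can be sketched on a list of archive member names. This is an illustration, not Galaxy's code: the function name is hypothetical, and matching the words "filtered" and "raw" in the h5 file names is an assumption about how the two h5 files are told apart.

```python
from pathlib import PurePosixPath
from typing import Iterable


def has_visium_layout(member_names: Iterable[str]) -> bool:
    """Return True if the names include a spatial folder and both h5 files.

    Matching "filtered" and "raw" in the file names is an assumption
    made for this illustration.
    """
    paths = [PurePosixPath(name) for name in member_names]
    has_spatial = any(
        part.lower() == "spatial" for path in paths for part in path.parts
    )
    h5_names = [p.name.lower() for p in paths if p.suffix.lower() == ".h5"]
    has_filtered = any("filtered" in name for name in h5_names)
    has_raw = any("raw" in name for name in h5_names)
    return has_spatial and has_filtered and has_raw
```

The member names could come from `tarfile.open(path).getnames()`; the folder comparison is case-insensitive so that both `Spatial` and `spatial` are accepted.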

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Bref3(**kwd)[source]

Bases: Binary

Bref3 format is a binary format for storing phased, non-missing genotypes for a list of samples.

file_ext = 'bref3'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.DynamicCompressedArchive(**kwd)[source]

Bases: CompressedArchive

compressed_format: str
uncompressed_datatype_instance: Data
matches_any(target_datatypes: List[Any]) bool[source]

Treat two aspects of compressed datatypes separately.

metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.GzDynamicCompressedArchive(**kwd)[source]

Bases: DynamicCompressedArchive

compressed_format: str = 'gzip'
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

uncompressed_datatype_instance: Data
class galaxy.datatypes.binary.Bz2DynamicCompressedArchive(**kwd)[source]

Bases: DynamicCompressedArchive

compressed_format: str = 'bz2'
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

uncompressed_datatype_instance: Data
class galaxy.datatypes.binary.CompressedZipArchive(**kwd)[source]

Bases: CompressedArchive

Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.

file_ext = 'zip'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.GenericAsn1Binary(**kwd)[source]

Bases: Binary

Class for generic ASN.1 binary format

file_ext = 'asn1-binary'
edam_format = 'format_1966'
edam_data = 'data_0849'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BamNative(**kwd)[source]

Bases: CompressedArchive, _BamOrSam

Class describing a BAM binary file that is not necessarily sorted

edam_format = 'format_2572'
edam_data = 'data_0863'
file_ext = 'unsorted.bam'
sort_flag: str | None = None
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

static merge(split_files: List[str], output_file: str) None[source]

Merges BAM files

Parameters:
  • split_files – List of bam file paths to merge

  • output_file – Write merged bam file to this location

init_meta(dataset: HasMetadata, copy_from: HasMetadata | None = None) None[source]
sniff(filename: str) bool[source]
classmethod is_bam(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

to_archive(dataset: DatasetProtocol, name: str = '') Iterable[source]

Collect archive paths and file handles that need to be exported when archiving dataset.

Parameters:
  • dataset – HistoryDatasetAssociation

  • name – archive name; in a collection context this corresponds to the collection name(s) and element_identifier, joined by ‘/’, e.g. ‘fastq_collection/sample1/forward’

groom_dataset_content(file_name: str) None[source]

Ensures that the BAM file contents are coordinate-sorted. This function is called on an output dataset after the content is initially generated.

get_chunk(trans, dataset: HasFileName, offset: int = 0, ck_size: int | None = None) str[source]
display_data(trans, dataset: DatasetHasHidProtocol, preview: bool = False, filename: str | None = None, to_ext: str | None = None, offset: int | None = None, ck_size: int | None = None, **kwd)[source]

Displays data in central pane if preview is True, else handles download.

Datatypes should be very careful if overriding this method and this interface between datatypes and Galaxy will likely change.

TODO: Document alternatives to overriding this method (data providers?).

validate(dataset: DatasetProtocol, **kwd) DatatypeValidation[source]
metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Bam(**kwd)[source]

Bases: BamNative

Class describing a BAM binary file

edam_format = 'format_2572'
edam_data = 'data_0863'
file_ext = 'bam'
track_type: str | None = 'ReadTrack'
data_sources: Dict[str, str] = {'data': 'bai', 'index': 'bigwig'}
get_index_flag(file_name: str) str[source]

Return the pysam flag for a bai index (default) or a csi index (used when a contig size exceeds 2**29 - 1).
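The size rule can be written as a small pure function. This is a sketch, not the method's actual body: `choose_index_flag` is a hypothetical name, and the flag strings `-b` (bai) and `-c` (csi) are assumed to be the `samtools index` options that are passed through pysam.

```python
from typing import Iterable

# Largest contig size for which the default bai index is used,
# taken from the description above.
BAI_MAX_CONTIG_LENGTH = 2**29 - 1


def choose_index_flag(reference_lengths: Iterable[int]) -> str:
    """Return "-b" (bai) unless a contig exceeds the limit, then "-c" (csi)."""
    longest = max(reference_lengths, default=0)
    return "-c" if longest > BAI_MAX_CONTIG_LENGTH else "-b"
```

In practice the reference lengths would be read from the BAM header; here they are passed in directly so the rule can be checked in isolation.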

dataset_content_needs_grooming(file_name: str) bool[source]

Check if file_name is a coordinate-sorted BAM file
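As an illustration of what a coordinate-sort check involves, the sketch below reads the `SO` tag from the `@HD` line of a SAM-style text header. It is not the method's implementation: both function names are hypothetical, and the header is passed in as plain text rather than read from a BAM file.

```python
from typing import Optional


def sort_order(header_text: str) -> Optional[str]:
    """Return the SO value from the @HD header line, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None


def needs_grooming(header_text: str) -> bool:
    """Assume grooming is needed unless the header says 'coordinate'."""
    return sort_order(header_text) != "coordinate"
```

A header without an `@HD` line, or without an `SO` tag, is treated as not coordinate-sorted in this sketch.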

set_meta(dataset: DatasetProtocol, overwrite: bool = True, metadata_tmp_files_dir: str | None = None, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
line_dataprovider(dataset: DatasetProtocol, **settings) FilteredLineDataProvider[source]
regex_line_dataprovider(dataset: DatasetProtocol, **settings) RegexLineDataProvider[source]
column_dataprovider(dataset: DatasetProtocol, **settings) ColumnarDataProvider[source]
dict_dataprovider(dataset: DatasetProtocol, **settings) DictDataProvider[source]
header_dataprovider(dataset: DatasetProtocol, **settings) RegexLineDataProvider[source]
id_seq_qual_dataprovider(dataset: DatasetProtocol, **settings) DictDataProvider[source]
genomic_region_dataprovider(dataset: DatasetProtocol, **settings) ColumnarDataProvider[source]
genomic_region_dict_dataprovider(dataset: DatasetProtocol, **settings) DictDataProvider[source]
samtools_dataprovider(dataset: DatasetProtocol, **settings) SamtoolsDataProvider[source]

Generic samtools interface - all options available through settings.

dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'column': <function Bam.column_dataprovider>, 'dict': <function Bam.dict_dataprovider>, 'genomic-region': <function Bam.genomic_region_dataprovider>, 'genomic-region-dict': <function Bam.genomic_region_dict_dataprovider>, 'header': <function Bam.header_dataprovider>, 'id-seq-qual': <function Bam.id_seq_qual_dataprovider>, 'line': <function Bam.line_dataprovider>, 'regex-line': <function Bam.regex_line_dataprovider>, 'samtools': <function Bam.samtools_dataprovider>}
metadata_spec: metadata.MetadataSpecCollection = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.ProBam(**kwd)[source]

Bases: Bam

Class describing a BAM binary file - extended for proteomics data

edam_format = 'format_3826'
edam_data = 'data_0863'
file_ext = 'probam'
metadata_spec: metadata.MetadataSpecCollection = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BamInputSorted(**kwd)[source]

Bases: BamNative

A class for BAM files that can formally be unsorted or queryname sorted. Alignments are ordered either by the order in which the queries appear when the alignment is produced, or by their queryname. This notably keeps alignments produced by paired-end sequencing adjacent.

sort_flag: str | None = '-n'
file_ext = 'qname_input_sorted.bam'
sniff(filename: str) bool[source]
dataset_content_needs_grooming(file_name: str) bool[source]

Groom if the file is coordinate sorted

metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BamQuerynameSorted(**kwd)[source]

Bases: BamInputSorted

A class for queryname sorted BAM files.

sort_flag: str | None = '-n'
file_ext = 'qname_sorted.bam'
sniff(filename: str) bool[source]
dataset_content_needs_grooming(file_name: str) bool[source]

Check if file_name is a queryname-sorted BAM file

metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.CRAM(**kwd)[source]

Bases: Binary

file_ext = 'cram'
edam_format = 'format_3462'
edam_data = 'data_0863'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, metadata_tmp_files_dir: str | None = None, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

get_cram_version(filename: str) Tuple[int, int][source]
set_index_file(dataset: HasFileName, index_file) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'cram_index': <galaxy.model.metadata.MetadataElementSpec object>, 'cram_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BaseBcf(**kwd)[source]

Bases: CompressedArchive

edam_format = 'format_3020'
edam_data = 'data_3498'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Bcf(**kwd)[source]

Bases: BaseBcf

Class describing a (BGZF-compressed) BCF file

file_ext = 'bcf'
sniff(filename: str) bool[source]
set_meta(dataset: DatasetProtocol, overwrite: bool = True, metadata_tmp_files_dir: str | None = None, **kwd) None[source]

Creates the index for the BCF file.

metadata_spec: MetadataSpecCollection = {'bcf_index': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BcfUncompressed(**kwd)[source]

Bases: BaseBcf

Class describing an uncompressed BCF file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('1.bcf_uncompressed')
>>> BcfUncompressed().sniff(fname)
True
>>> fname = get_test_fname('1.bcf')
>>> BcfUncompressed().sniff(fname)
False
file_ext = 'bcf_uncompressed'
compressed = False
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.H5(**kwd)[source]

Bases: Binary

Class describing an HDF5 file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.mz5')
>>> H5().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> H5().sniff(fname)
False
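For orientation, a minimal sketch of recognising an HDF5 file by its format signature. It assumes the signature sits at the very start of the file, which is the common case but not the only one the HDF5 format allows, and it is not necessarily how `H5.sniff` is implemented. The function name is hypothetical.

```python
# The 8-byte HDF5 format signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def starts_with_hdf5_signature(first_bytes: bytes) -> bool:
    """Return True if the given bytes begin with the HDF5 signature.

    Only offset 0 is checked in this illustration.
    """
    return first_bytes.startswith(HDF5_SIGNATURE)
```

A caller would typically pass the first eight bytes read from the file opened in binary mode.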
file_ext = 'h5'
edam_format = 'format_3590'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

get_structured_content(dataset, content_type=None, path='/', dtype='origin', format='json', flatten=False, selection=None, **kwargs)[source]

Implements h5grove protocol (https://silx-kit.github.io/h5grove/). This allows the h5web visualization tool (https://github.com/silx-kit/h5web) to be used directly with Galaxy datasets.

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Loom(**kwd)[source]

Bases: H5

Class describing a Loom file: http://loompy.org/

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.loom')
>>> Loom().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Loom().sniff(fname)
False
file_ext = 'loom'
edam_format = 'format_3590'
sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

metadata_spec: MetadataSpecCollection = {'col_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'col_attrs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'col_graphs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'col_graphs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'description': <galaxy.model.metadata.MetadataElementSpec object>, 'doi': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object>, 'loom_spec_version': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'row_graphs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'row_graphs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'title': <galaxy.model.metadata.MetadataElementSpec object>, 'url': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
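
Loom, like the other H5-derived classes in this module, describes a file stored in an HDF5 container. The doctest above rejects test.mz5, so a signature check alone cannot be the whole test; a format-specific check has to follow. As a minimal sketch of that first step only (not Galaxy's implementation), the helper below tests for the 8-byte HDF5 signature at offset 0. The name `looks_like_hdf5` is hypothetical, and the sketch assumes the superblock sits at the start of the file.

```python
# The fixed 8-byte signature that opens an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path: str) -> bool:
    """Return True if the file at `path` starts with the HDF5 signature.

    Hypothetical helper: it only inspects offset 0 and says nothing about
    which HDF5-based format (loom, h5ad, biom2, ...) the file holds.
    """
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

A format-specific sniffer would then have to inspect the groups or attributes inside the container to tell one HDF5-based format from another.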

class galaxy.datatypes.binary.Anndata(**kwd)[source]

Bases: H5

Class describing an HDF5 anndata file: http://anndata.rtfd.io

>>> from galaxy.datatypes.sniff import get_test_fname
>>> Anndata().sniff(get_test_fname('pbmc3k_tiny.h5ad'))
True
>>> Anndata().sniff(get_test_fname('test.mz5'))
False
>>> Anndata().sniff(get_test_fname('import.loom.krumsiek11.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk.h5ad'))
True
file_ext = 'h5ad'
sniff(filename: str) bool[source]
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'anndata_spec_version': <galaxy.model.metadata.MetadataElementSpec object>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'description': <galaxy.model.metadata.MetadataElementSpec object>, 'doi': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_size': <galaxy.model.metadata.MetadataElementSpec object>, 'obsm_count': <galaxy.model.metadata.MetadataElementSpec object>, 'obsm_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_count': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_size': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'title': <galaxy.model.metadata.MetadataElementSpec object>, 'uns_count': <galaxy.model.metadata.MetadataElementSpec object>, 'uns_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'url': <galaxy.model.metadata.MetadataElementSpec object>, 'var_count': <galaxy.model.metadata.MetadataElementSpec object>, 'var_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'var_size': <galaxy.model.metadata.MetadataElementSpec object>, 'varm_count': <galaxy.model.metadata.MetadataElementSpec object>, 'varm_layers': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Grib(**kwd)[source]

Bases: Binary

Class describing a GRIB file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.grib')
>>> Grib().sniff_prefix(FilePrefix(fname))
True
>>> fname = FilePrefix(get_test_fname('interval.interval'))
>>> Grib().sniff_prefix(fname)
False
file_ext = 'grib'
edam_format = 'format_2333'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Set the GRIB edition.
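
As a rough illustration of where the edition can be found, the sketch below relies on the GRIB indicator section layout: the message opens with the ASCII bytes "GRIB", and the eighth octet holds the edition number in both edition 1 and edition 2. The function name `read_grib_edition` is hypothetical, and this is not necessarily how Galaxy reads the value.

```python
from typing import Optional


def read_grib_edition(header: bytes) -> Optional[int]:
    """Return the GRIB edition number from a message header, or None.

    Hypothetical helper: expects at least the first 8 bytes of the file
    and assumes the message starts at offset 0.
    """
    if len(header) < 8 or header[:4] != b"GRIB":
        return None
    # Octet 8 (index 7) carries the edition number.
    return header[7]
```

For example, `read_grib_edition(b"GRIB\x00\x00\x00\x02")` returns `2`, while a header that does not start with "GRIB" yields `None`.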

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'grib_edition': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.GmxBinary(**kwd)[source]

Bases: Binary

Base class for GROMACS binary files - xtc, trr, cpt

magic_number: int | None = None
file_ext = ''
sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.Trr(**kwd)[source]

Bases: GmxBinary

Class describing a trr file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.trr')
>>> Trr().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Trr().sniff(fname)
False
file_ext = 'trr'
magic_number: int | None = 1993
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Cpt(**kwd)[source]

Bases: GmxBinary

Class describing a checkpoint (.cpt) file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.cpt')
>>> Cpt().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Cpt().sniff(fname)
False
file_ext = 'cpt'
magic_number: int | None = 171817
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Xtc(**kwd)[source]

Bases: GmxBinary

Class describing an xtc file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.xtc')
>>> Xtc().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Xtc().sniff(fname)
False
file_ext = 'xtc'
magic_number: int | None = 1995
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Edr(**kwd)[source]

Bases: GmxBinary

Class describing an edr file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.edr')
>>> Edr().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Edr().sniff(fname)
False
file_ext = 'edr'
magic_number: int | None = -55555
metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
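
The GmxBinary subclasses above differ only in file_ext and magic_number, which suggests that sniffing amounts to comparing a number read from the start of the file against that attribute. The sketch below illustrates the idea under one assumption that this page does not state: the magic number is stored as a big-endian signed 32-bit integer in the first four bytes. The names `read_magic` and `guess_gromacs_ext` are hypothetical.

```python
import struct
from typing import Optional

# Magic numbers copied from the class attributes documented above.
MAGIC_NUMBERS = {"trr": 1993, "cpt": 171817, "xtc": 1995, "edr": -55555}


def read_magic(header: bytes) -> Optional[int]:
    """Decode the first 4 bytes as a big-endian signed int (assumed layout)."""
    if len(header) < 4:
        return None
    return struct.unpack(">i", header[:4])[0]


def guess_gromacs_ext(header: bytes) -> Optional[str]:
    """Return the extension whose magic number matches, or None."""
    magic = read_magic(header)
    for ext, number in MAGIC_NUMBERS.items():
        if magic == number:
            return ext
    return None
```

The signed format code matters here because the edr magic number is negative.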

class galaxy.datatypes.binary.Biom2(**kwd)[source]

Bases: H5

Class describing a biom2 file (http://biom-format.org/documentation/biom_format.html)

file_ext = 'biom2'
edam_format = 'format_3746'
sniff(filename: str) bool[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Biom2().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Biom2().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Biom2().sniff(fname)
False
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>, 'format_url': <galaxy.model.metadata.MetadataElementSpec object>, 'format_version': <galaxy.model.metadata.MetadataElementSpec object>, 'generated_by': <galaxy.model.metadata.MetadataElementSpec object>, 'id': <galaxy.model.metadata.MetadataElementSpec object>, 'nnz': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'type': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Cool(**kwd)[source]

Bases: H5

Class describing the cool format (https://github.com/mirnylab/cooler)

file_ext = 'cool'
sniff(filename: str) bool[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.cool')
>>> Cool().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Cool().sniff(fname)
False
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MCool(**kwd)[source]

Bases: H5

Class describing the multi-resolution cool format (https://github.com/mirnylab/cooler)

file_ext = 'mcool'
sniff(filename: str) bool[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.mcool')
>>> MCool().sniff(fname)
True
>>> fname = get_test_fname('matrix.cool')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('test.mz5')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> MCool().sniff(fname)
False
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.H5MLM(**kwd)[source]

Bases: H5

Machine learning model generated by Galaxy-ML.

file_ext = 'h5mlm'
TARGET_URL = 'https://github.com/goeckslab/Galaxy-ML'
max_peek_size = 1000
max_preview_size = 1000000
CONFIG = '-model_config-'
HTTP_REPR = '-http_repr-'
HYPERPARAMETER = '-model_hyperparameters-'
REPR = '-repr-'
URL = '-URL-'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, metadata_tmp_files_dir: str | None = None, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
get_attribute(filename: str, attr_key: str) str[source]
get_repr(filename: str) str[source]
get_html_repr(filename: str) str[source]
get_config_string(filename: str) str[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

display_data(trans, dataset: DatasetHasHidProtocol, preview: bool = False, filename: str | None = None, to_ext: str | None = None, **kwd)[source]

Displays data in central pane if preview is True, else handles download.

Datatypes should be very careful when overriding this method, and this interface between datatypes and Galaxy will likely change.

TODO: Document alternatives to overriding this method (data providers?).

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'hyper_params': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.LudwigModel(**kwd)[source]

Bases: Html

Composite datatype that encloses multiple files for a Ludwig trained model.

composite_type: str | None = 'auto_primary_file'
file_ext = 'ludwig_model'
__init__(**kwd)[source]

Initialize the datatype

generate_primary_file(dataset: HasExtraFilesAndMetadata) str[source]
metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.HexrdMaterials(**kwd)[source]

Bases: H5

Class describing a Hexrd Materials file: https://github.com/HEXRD/hexrd

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.materials.h5')
>>> HexrdMaterials().sniff(fname)
True
>>> fname = get_test_fname('test.loom')
>>> HexrdMaterials().sniff(fname)
False
file_ext = 'hexrd.materials.h5'
edam_format = 'format_3590'
sniff(filename: str) bool[source]
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

metadata_spec: MetadataSpecCollection = {'LatticeParameters': <galaxy.model.metadata.MetadataElementSpec object>, 'SpaceGroupNumber': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'materials': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Scf(**kwd)[source]

Bases: Binary

Class describing an scf binary sequence file

edam_format = 'format_1632'
edam_data = 'data_0924'
file_ext = 'scf'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Sff(**kwd)[source]

Bases: Binary

Standard Flowgram Format (SFF)

edam_format = 'format_3284'
edam_data = 'data_0924'
file_ext = 'sff'
sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.BigWig(**kwd)[source]

Bases: Binary

Accessing binary BigWig files from UCSC. The supplemental info in the paper has the binary details: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btq351v1

edam_format = 'format_3006'
edam_data = 'data_3002'
file_ext = 'bigwig'
track_type: str | None = 'LineTrack'
data_sources: Dict[str, str] = {'data_standalone': 'bigwig'}
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.BigBed(**kwd)[source]

Bases: BigWig

BigBed support from UCSC.

edam_format = 'format_3004'
edam_data = 'data_3002'
file_ext = 'bigbed'
data_sources: Dict[str, str] = {'data_standalone': 'bigbed'}
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
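
BigWig and BigBed share a common binary container and are told apart by a 32-bit magic number at the start of the file. The sketch below is an illustration rather than Galaxy's code: the two magic values come from the UCSC format description, not from this page, and the helper name `classify_ucsc_binary` is hypothetical. Both byte orders are tried because the files may be written on either kind of machine.

```python
import struct
from typing import Optional

# Magic numbers from the UCSC bigWig/bigBed format description.
BIGWIG_MAGIC = 0x888FFC26
BIGBED_MAGIC = 0x8789F2EB


def classify_ucsc_binary(header: bytes) -> Optional[str]:
    """Return 'bigwig' or 'bigbed' based on the first 4 bytes, else None."""
    if len(header) < 4:
        return None
    # Try little-endian first, then big-endian.
    for byte_order in ("<", ">"):
        (magic,) = struct.unpack(byte_order + "I", header[:4])
        if magic == BIGWIG_MAGIC:
            return "bigwig"
        if magic == BIGBED_MAGIC:
            return "bigbed"
    return None
```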

class galaxy.datatypes.binary.TwoBit(**kwd)[source]

Bases: Binary

Class describing a TwoBit format nucleotide file

edam_format = 'format_3009'
edam_data = 'data_0848'
file_ext = 'twobit'
sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.SQlite(**kwd)[source]

Bases: Binary

Class describing a Sqlite database

file_ext = 'sqlite'
edam_format = 'format_3621'
init_meta(dataset: HasMetadata, copy_from: HasMetadata | None = None) None[source]
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
sniff_table_names(filename: str, table_names: Iterable) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

sqlite_dataprovider(dataset: DatasetProtocol, **settings) SQliteDataProvider[source]
sqlite_datatableprovider(dataset: DatasetProtocol, **settings) SQliteDataTableProvider[source]
sqlite_datadictprovider(dataset: DatasetProtocol, **settings) SQliteDataDictProvider[source]
dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'sqlite': <function SQlite.sqlite_dataprovider>, 'sqlite-dict': <function SQlite.sqlite_datadictprovider>, 'sqlite-table': <function SQlite.sqlite_datatableprovider>}
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
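
The sniff and sniff_table_names signatures above suggest a two-stage check: confirm that the file is an SQLite database, then confirm that it contains the tables a particular subclass expects. The sketch below illustrates that idea with the standard library only. It is not Galaxy's implementation, and the names `has_sqlite_header` and `has_tables` are hypothetical.

```python
import sqlite3
from typing import Iterable

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_HEADER = b"SQLite format 3\x00"


def has_sqlite_header(path: str) -> bool:
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as handle:
        return handle.read(len(SQLITE_HEADER)) == SQLITE_HEADER


def has_tables(path: str, table_names: Iterable[str]) -> bool:
    """Return True if `path` is an SQLite file containing all named tables."""
    if not has_sqlite_header(path):
        return False
    connection = sqlite3.connect(path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        connection.close()
    found = {row[0] for row in rows}
    return set(table_names) <= found
```

Checking the header first avoids handing an arbitrary file to `sqlite3.connect`, which would otherwise only fail once a query is executed.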

class galaxy.datatypes.binary.GeminiSQLite(**kwd)[source]

Bases: SQlite

Class describing a Gemini Sqlite database

file_ext = 'gemini.sqlite'
edam_format = 'format_3622'
edam_data = 'data_3498'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'gemini_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.ChiraSQLite(**kwd)[source]

Bases: SQlite

Class describing a ChiRAViz Sqlite database

file_ext = 'chira.sqlite'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.CuffDiffSQlite(**kwd)[source]

Bases: SQlite

Class describing a CuffDiff SQLite database

file_ext = 'cuffdiff.sqlite'
edam_format = 'format_3621'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'cuffdiff_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'genes': <galaxy.model.metadata.MetadataElementSpec object>, 'samples': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MzSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Sqlite database

file_ext = 'mz.sqlite'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.PQP(**kwd)[source]

Bases: SQlite

Class describing a Peptide query parameters file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.pqp')
>>> PQP().sniff(fname)
True
>>> fname = get_test_fname('test.osw')
>>> PQP().sniff(fname)
False
file_ext = 'pqp'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]

Table definition according to https://github.com/grosenberger/OpenMS/blob/develop/src/openms/source/ANALYSIS/OPENSWATH/TransitionPQPFile.cpp#L264. For now, the VERSION, GENE, and PEPTIDE_GENE_MAPPING tables are excluded, since there is test data without these tables; see also https://github.com/OpenMS/OpenMS/issues/4365

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OSW(**kwd)[source]

Bases: SQlite

Class describing OpenSwath output

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.osw')
>>> OSW().sniff(fname)
True
>>> fname = get_test_fname('test.sqmass')
>>> OSW().sniff(fname)
False
file_ext = 'osw'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.SQmass(**kwd)[source]

Bases: SQlite

Class describing a Sqmass database

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.sqmass')
>>> SQmass().sniff(fname)
True
>>> fname = get_test_fname('test.pqp')
>>> SQmass().sniff(fname)
False
file_ext = 'sqmass'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.BlibSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Spectral Library Sqlite database

file_ext = 'blib'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'blib_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.DlibSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Spectral Library Sqlite database. DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.dlib')
>>> DlibSQlite().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DlibSQlite().sniff(fname)
False
file_ext = 'dlib'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dlib_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.ElibSQlite(**kwd)[source]

Bases: SQlite

Class describing a Proteomics Chromatogram Library Sqlite database. DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.elib')
>>> ElibSQlite().sniff(fname)
True
>>> fname = get_test_fname('test.dlib')
>>> ElibSQlite().sniff(fname)
False
file_ext = 'elib'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
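
The DLIB and ELIB descriptions above distinguish the two formats by which tables are populated, not by which tables exist. One way to express that rule is sketched below, under the assumption that "populated" means "has at least one row" and that a row in either "peptidequants" or "peptidescores" is enough to indicate an ELIB. The function names are hypothetical, and the actual sniffers may apply different criteria.

```python
import sqlite3

# Tables that, per the description above, only ELIBs populate.
ELIB_ONLY_TABLES = ("peptidequants", "peptidescores")


def table_has_rows(connection: sqlite3.Connection, table: str) -> bool:
    """Return True if `table` exists and contains at least one row."""
    exists = connection.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if exists is None:
        return False
    # `table` comes from the fixed tuple above, so quoting it inline is safe.
    row = connection.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchone()
    return row is not None


def classify_library(connection: sqlite3.Connection) -> str:
    """Return 'elib' if any ELIB-only table is populated, else 'dlib'."""
    if any(table_has_rows(connection, name) for name in ELIB_ONLY_TABLES):
        return "elib"
    return "dlib"
```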

class galaxy.datatypes.binary.IdpDB(**kwd)[source]

Bases: SQlite

Class describing an IDPicker 3 idpDB (sqlite) database

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.idpdb')
>>> IdpDB().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> IdpDB().sniff(fname)
False
file_ext = 'idpdb'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.GAFASQLite(**kwd)[source]

Bases: SQlite

Class describing a GAFA SQLite database

file_ext = 'gafa.sqlite'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'gafa_schema_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.NcbiTaxonomySQlite(**kwd)[source]

Bases: SQlite

Class describing the NCBI Taxonomy database stored in SQLite as done by rust-ncbitaxonomy

file_ext = 'ncbitaxonomy.sqlite'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'ncbitaxonomy_schema_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>, 'taxon_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Xlsx(**kwd)[source]

Bases: Binary

Class for Excel 2007 (xlsx) files

file_ext = 'xlsx'
compressed = True
sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.ExcelXls(**kwd)[source]

Bases: Binary

Class describing an Excel (xls) file

file_ext = 'excel.xls'
edam_format = 'format_3468'
sniff_prefix(file_prefix: FilePrefix) bool[source]
get_mime() str[source]

Returns the mime type of the datatype

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.Sra(**kwd)[source]

Bases: Binary

Sequence Read Archive (SRA) datatype originally from mdshw5/sra-tools-galaxy

file_ext = 'sra'
sniff_prefix(file_prefix: FilePrefix) bool[source]

The first 8 bytes of any NCBI sra file are ‘NCBI.sra’, and the file is binary. For details about the format, see http://www.ncbi.nlm.nih.gov/books/n/helpsra/SRA_Overview_BK/#SRA_Overview_BK.4_SRA_Data_Structure
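
The prefix rule described here can be sketched in a few lines. The helper name `looks_like_sra` is hypothetical, and the sketch covers only the 8-byte prefix, not the additional check that the file is binary.

```python
SRA_MAGIC = b"NCBI.sra"


def looks_like_sra(header: bytes) -> bool:
    """Return True if the first 8 bytes equal the SRA magic string."""
    return header[:8] == SRA_MAGIC
```

Slicing before the comparison keeps the check correct for short inputs: a header with fewer than 8 bytes can never match.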

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.RData(**kwd)[source]

Bases: CompressedArchive

Generic R Data file datatype implementation, i.e. files generated with R’s save or save.image function. See https://www.loc.gov/preservation/digital/formats/fdd/fdd000470.shtml and https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.rdata')
>>> RData().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.get_file_name = lambda : fname
>>> dataset.has_data = lambda: True
>>> RData().set_meta(dataset)
>>> dataset.metadata.version
'3'
VERSION_2_PREFIX = b'RDX2\nX\n'
VERSION_3_PREFIX = b'RDX3\nX\n'
file_ext = 'rdata'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
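
As a rough illustration of how the two `RData` prefixes above could drive the `version` metadata, the sketch below matches a payload against them. It assumes the payload is either uncompressed or gzip-compressed; the helper name `rdata_version` is hypothetical and this is not Galaxy's code.

```python
import gzip

VERSION_2_PREFIX = b"RDX2\nX\n"
VERSION_3_PREFIX = b"RDX3\nX\n"


def rdata_version(raw: bytes):
    """Return '2' or '3' for an RData payload, or None if neither prefix matches."""
    # Assumption: a gzip magic number means the payload must be decompressed first.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    if raw.startswith(VERSION_3_PREFIX):
        return "3"
    if raw.startswith(VERSION_2_PREFIX):
        return "2"
    return None


print(rdata_version(gzip.compress(VERSION_3_PREFIX + b"\x00" * 4)))  # 3
```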
class galaxy.datatypes.binary.RDS(**kwd)[source]

Bases: CompressedArchive

File containing a serialized R object generated with R’s saveRDS function. See https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('int-r3.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r4.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r3-version2.rds')
>>> RDS().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.get_file_name = lambda : get_test_fname('int-r4.rds')
>>> dataset.has_data = lambda: True
>>> RDS().set_meta(dataset)
>>> dataset.metadata.version
'3'
>>> dataset.metadata.rversion
'4.1.1'
>>> dataset.metadata.minrversion
'3.5.0'
file_ext = 'rds'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'minrversion': <galaxy.model.metadata.MetadataElementSpec object>, 'rversion': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
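
The three `RDS` metadata values in the doctest above correspond to the leading fields of R's XDR serialization header: a format version followed by two packed R version integers. The sketch below decodes a synthetic, gzip-compressed header. It is an assumption-laden illustration (hypothetical helper names, gzip compression only), not Galaxy's `set_meta`.

```python
import gzip
import struct


def unpack_r_version(packed: int) -> str:
    """Decode R's packed version integer (major * 65536 + minor * 256 + patch)."""
    return f"{packed >> 16}.{(packed >> 8) & 0xFF}.{packed & 0xFF}"


def read_rds_header(raw: bytes) -> dict:
    """Parse the leading fields of a gzip-compressed, XDR-format RDS payload."""
    data = gzip.decompress(raw)
    if data[:2] != b"X\n":
        raise ValueError("not an XDR-format serialization")
    # Three big-endian 32-bit integers follow the two-byte format marker.
    version, writer, min_reader = struct.unpack(">iii", data[2:14])
    return {
        "version": str(version),
        "rversion": unpack_r_version(writer),
        "minrversion": unpack_r_version(min_reader),
    }


# Synthetic header mirroring the values shown in the doctest above.
sample = gzip.compress(b"X\n" + struct.pack(">iii", 3, 0x040101, 0x030500))
print(read_rds_header(sample))
```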
class galaxy.datatypes.binary.OxliBinary(**kwd)[source]

Bases: Binary

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliCountGraph(**kwd)[source]

Bases: OxliBinary

OxliCountGraph starts with “OXLI” + one-byte version number + 8-bit binary ‘1’. Test file generated via:

load-into-counting.py --n_tables 1 --max-tablesize 1 \
    oxli_countgraph.oxlicg khmer/tests/test-data/100-reads.fq.bz2

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliCountGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_countgraph.oxlicg")
>>> OxliCountGraph().sniff(fname)
True
file_ext = 'oxlicg'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliNodeGraph(**kwd)[source]

Bases: OxliBinary

OxliNodeGraph starts with “OXLI” + one-byte version number + 8-bit binary ‘2’. Test file generated via:

load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliNodeGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_nodegraph.oxling")
>>> OxliNodeGraph().sniff(fname)
True
file_ext = 'oxling'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliTagSet(**kwd)[source]

Bases: OxliBinary

OxliTagSet starts with “OXLI” + one-byte version number + 8-bit binary ‘3’. Test file generated via:

load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2;
mv oxli_nodegraph.oxling.tagset oxli_tagset.oxlits

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliTagSet().sniff(fname)
False
>>> fname = get_test_fname("oxli_tagset.oxlits")
>>> OxliTagSet().sniff(fname)
True
file_ext = 'oxlits'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliStopTags(**kwd)[source]

Bases: OxliBinary

OxliStopTags starts with “OXLI” + one-byte version number + 8-bit binary ‘4’. Test file adapted from khmer 2.0’s “khmer/tests/test-data/goodversion-k32.stoptags”

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliStopTags().sniff(fname)
False
>>> fname = get_test_fname("oxli_stoptags.oxlist")
>>> OxliStopTags().sniff(fname)
True
file_ext = 'oxlist'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliSubset(**kwd)[source]

Bases: OxliBinary

OxliSubset starts with “OXLI” + one-byte version number + 8-bit binary ‘5’. Test file generated via:

load-graph.py -k 20 example tests/test-data/random-20-a.fa;
partition-graph.py example;
mv example.subset.0.pmap oxli_subset.oxliss

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliSubset().sniff(fname)
False
>>> fname = get_test_fname("oxli_subset.oxliss")
>>> OxliSubset().sniff(fname)
True
file_ext = 'oxliss'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.OxliGraphLabels(**kwd)[source]

Bases: OxliBinary

OxliGraphLabels starts with “OXLI” + one-byte version number + 8-bit binary ‘6’. Test file generated via:

python -c "from khmer import GraphLabels; \
    gl = GraphLabels(20, 1e7, 4); \
    gl.consume_fasta_and_tag_with_labels('tests/test-data/test-labels.fa'); \
    gl.save_labels_and_tags('oxli_graphlabels.oxligl')"

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliGraphLabels().sniff(fname)
False
>>> fname = get_test_fname("oxli_graphlabels.oxligl")
>>> OxliGraphLabels().sniff(fname)
True
file_ext = 'oxligl'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
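
The six Oxli datatypes above share one header layout and differ only in the type byte. A minimal sketch of that dispatch follows, assuming the type is stored as a raw byte value (1 to 6) directly after the version byte; `parse_oxli_header` and the lookup table are hypothetical names, not part of Galaxy.

```python
# Type byte -> datatype name, taken from the class descriptions above.
OXLI_TYPES = {
    1: "OxliCountGraph",
    2: "OxliNodeGraph",
    3: "OxliTagSet",
    4: "OxliStopTags",
    5: "OxliSubset",
    6: "OxliGraphLabels",
}


def parse_oxli_header(header: bytes):
    """Return (version, datatype name), or None if the header does not match."""
    if len(header) < 6 or header[:4] != b"OXLI":
        return None
    version, type_code = header[4], header[5]
    name = OXLI_TYPES.get(type_code)
    if name is None:
        return None
    return version, name


print(parse_oxli_header(b"OXLI\x04\x02" + b"\x00" * 10))  # (4, 'OxliNodeGraph')
```

Keeping the mapping in one table mirrors the way each subclass only needs to supply its own type code.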

class galaxy.datatypes.binary.PostgresqlArchive(**kwd)[source]

Bases: CompressedArchive

Class describing a Postgresql database packed into a tar archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('postgresql_fake.tar.bz2')
>>> PostgresqlArchive().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> PostgresqlArchive().sniff(fname)
False
file_ext = 'postgresql'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MongoDBArchive(**kwd)[source]

Bases: CompressedArchive

Class describing a Mongo database packed into a tar archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('mongodb_fake.tar.bz2')
>>> MongoDBArchive().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> MongoDBArchive().sniff(fname)
False
file_ext = 'mongodb'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.GeneNoteBook(**kwd)[source]

Bases: MongoDBArchive

Class describing a bzip2-compressed GeneNoteBook archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('mongodb_fake.tar.bz2')
>>> GeneNoteBook().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> GeneNoteBook().sniff(fname)
False
file_ext = 'genenotebook'
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Fast5Archive(**kwd)[source]

Bases: CompressedArchive

Class describing a FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5Archive().sniff(fname)
True
file_ext = 'fast5.tar'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Fast5ArchiveGz(**kwd)[source]

Bases: Fast5Archive

Class describing a gzip-compressed FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveGz().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.xz')
>>> Fast5ArchiveGz().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveGz().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveGz().sniff(fname)
False
file_ext = 'fast5.tar.gz'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Fast5ArchiveXz(**kwd)[source]

Bases: Fast5Archive

Class describing an xz-compressed FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveXz().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar.xz')
>>> Fast5ArchiveXz().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveXz().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveXz().sniff(fname)
False
file_ext = 'fast5.tar.xz'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Fast5ArchiveBz2(**kwd)[source]

Bases: Fast5Archive

Class describing a bzip2-compressed FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveBz2().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.xz')
>>> Fast5ArchiveBz2().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveBz2().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveBz2().sniff(fname)
False
file_ext = 'fast5.tar.bz2'
sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
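
The doctests above show each FAST5 archive class accepting exactly one compression flavour. One way to get that behaviour is to combine a magic-number check with a look at the tar members, sketched below with the standard library only. The helper names and the member-name test are assumptions for illustration; Galaxy's own sniffers may work differently.

```python
import bz2
import gzip
import io
import lzma
import tarfile

# Well-known magic numbers for the three compression formats.
MAGIC = {b"\x1f\x8b": "gz", b"BZh": "bz2", b"\xfd7zXZ\x00": "xz"}


def compression_of(data: bytes) -> str:
    """Name the compression format of a buffer, or 'none' for a plain tar."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "none"


def has_fast5_member(data: bytes) -> bool:
    """Return True if the (possibly compressed) tar contains a .fast5 member."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
            return any(name.endswith(".fast5") for name in tar.getnames())
    except tarfile.TarError:
        return False


def make_fast5_tar() -> bytes:
    """Build a tiny in-memory tar holding one placeholder .fast5 member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        payload = b"placeholder"
        info = tarfile.TarInfo("reads/read_0.fast5")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


plain = make_fast5_tar()
for blob in (plain, gzip.compress(plain), bz2.compress(plain), lzma.compress(plain)):
    print(compression_of(blob), has_fast5_member(blob))
```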

class galaxy.datatypes.binary.SearchGuiArchive(**kwd)[source]

Bases: CompressedArchive

Class describing a SearchGUI archive

file_ext = 'searchgui_archive'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'searchgui_major_version': <galaxy.model.metadata.MetadataElementSpec object>, 'searchgui_version': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.NetCDF(**kwd)[source]

Bases: Binary

Binary data in netCDF format

file_ext = 'netcdf'
edam_format = 'format_3650'
edam_data = 'data_0943'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.Dcd(**kwd)[source]

Bases: Binary

Class describing a dcd file from the CHARMM molecular simulation program

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_glucose_vacuum.dcd')
>>> Dcd().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Dcd().sniff(fname)
False
file_ext = 'dcd'
edam_data = 'data_3842'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Vel(**kwd)[source]

Bases: Binary

Class describing a velocity file from the CHARMM molecular simulation program

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_charmm.vel')
>>> Vel().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Vel().sniff(fname)
False
file_ext = 'vel'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
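
`Dcd` and `Vel` describe the two kinds of CHARMM trajectory, which are commonly told apart by a four-character tag near the start of the file. The sketch below assumes a 4-byte record-length integer precedes the tag (files written with 8-byte integers would shift it), and `charmm_record_kind` is a hypothetical helper rather than Galaxy's sniffer.

```python
import struct


def charmm_record_kind(header: bytes):
    """Return 'CORD' (coordinates), 'VELD' (velocities) or None."""
    # Assumption: the tag sits right after a 4-byte record-length field.
    tag = header[4:8]
    if tag in (b"CORD", b"VELD"):
        return tag.decode("ascii")
    return None


coords = struct.pack("<i", 84) + b"CORD"
velocities = struct.pack("<i", 84) + b"VELD"
print(charmm_record_kind(coords), charmm_record_kind(velocities))
```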

class galaxy.datatypes.binary.DAA(**kwd)[source]

Bases: Binary

Class describing a DAA (diamond alignment archive) file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.daa')
>>> DAA().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DAA().sniff(fname)
False
file_ext = 'daa'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.RMA6(**kwd)[source]

Bases: Binary

Class describing an RMA6 (MEGAN6 read-match archive) file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.rma6')
>>> RMA6().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> RMA6().sniff(fname)
False
file_ext = 'rma6'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.DMND(**kwd)[source]

Bases: Binary

Class describing a DMND file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond_db.dmnd')
>>> DMND().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DMND().sniff(fname)
False
file_ext = 'dmnd'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.ICM(**kwd)[source]

Bases: Binary

Class describing an ICM (interpolated context model) file, used by Glimmer

file_ext = 'icm'
edam_data = 'data_0950'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

sniff(filename: str) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Parquet(**kwd)[source]

Bases: Binary

Class describing an Apache Parquet file (https://parquet.apache.org/)

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('example.parquet')
>>> Parquet().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Parquet().sniff(fname)
False
file_ext = 'parquet'
__init__(**kwd)[source]

Initialize the datatype

sniff_prefix(file_prefix: FilePrefix) bool[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
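
Parquet files begin (and end) with the four bytes `PAR1`, so a prefix-based check can be very small. The following is a sketch on a raw byte buffer with a hypothetical helper name, not the `Parquet.sniff_prefix` implementation.

```python
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(prefix: bytes) -> bool:
    """Return True when the buffer starts with the Parquet magic number."""
    return prefix[:4] == PARQUET_MAGIC


print(looks_like_parquet(b"PAR1" + b"\x15\x00"))  # True
print(looks_like_parquet(b"\x89HDF\r\n\x1a\n"))   # False
```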
class galaxy.datatypes.binary.BafTar(**kwd)[source]

Bases: CompressedArchive

Base class for common behavior of tar files of directory-based raw file formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> BafTar().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> BafTar().sniff(fname)
False
edam_data = 'data_2536'
edam_format = 'format_3712'
file_ext = 'brukerbaf.d.tar'
get_signature_file() str[source]
sniff(filename: str) bool[source]
get_type() str[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.YepTar(**kwd)[source]

Bases: BafTar

A tar’d up .d directory containing Agilent/Bruker YEP format data

file_ext = 'agilentbrukeryep.d.tar'
get_signature_file() str[source]
get_type() str[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.TdfTar(**kwd)[source]

Bases: BafTar

A tar’d up .d directory containing Bruker TDF format data

file_ext = 'brukertdf.d.tar'
get_signature_file() str[source]
get_type() str[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MassHunterTar(**kwd)[source]

Bases: BafTar

A tar’d up .d directory containing Agilent MassHunter format data

file_ext = 'agilentmasshunter.d.tar'
get_signature_file() str[source]
get_type() str[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.MassLynxTar(**kwd)[source]

Bases: BafTar

A tar’d up .raw directory containing Waters MassLynx format data

file_ext = 'watersmasslynx.raw.tar'
get_signature_file() str[source]
get_type() str[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
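
`BafTar` and the subclasses above each override `get_signature_file()`, which suggests a template-method design: the base class owns the tar inspection and each subclass only names the file that identifies its format. The sketch below illustrates that pattern. The class names, the `sniff_bytes` method and both signature file names are assumptions made for the example, not values taken from Galaxy.

```python
import io
import tarfile


class DirectoryTarSniffer:
    """Hypothetical stand-in for a BafTar-like base class."""

    def get_signature_file(self) -> str:
        return "analysis.baf"  # assumed name, for illustration only

    def sniff_bytes(self, data: bytes) -> bool:
        """Return True if the tar holds a regular file with the signature name."""
        signature = self.get_signature_file()
        try:
            with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
                return any(
                    member.isfile() and member.name.rsplit("/", 1)[-1] == signature
                    for member in tar
                )
        except tarfile.TarError:
            return False


class TdfLikeSniffer(DirectoryTarSniffer):
    def get_signature_file(self) -> str:
        return "analysis.tdf"  # assumed name, for illustration only


def make_tar(member_name: str) -> bytes:
    """Build an in-memory tar with a single empty regular file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.addfile(tarfile.TarInfo(member_name))
    return buf.getvalue()


archive = make_tar("run.d/analysis.baf")
print(DirectoryTarSniffer().sniff_bytes(archive))  # True
print(TdfLikeSniffer().sniff_bytes(archive))       # False
```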

class galaxy.datatypes.binary.WiffTar(**kwd)[source]

Bases: BafTar

A tar’d up .wiff/.scan pair containing Sciex WIFF format data

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('some.wiff.tar')
>>> WiffTar().sniff(fname)
True
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> WiffTar().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> WiffTar().sniff(fname)
False
file_ext = 'wiff.tar'
sniff(filename: str) bool[source]
get_type() str[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Wiff2Tar(**kwd)[source]

Bases: BafTar

A tar’d up .wiff2/.scan pair containing Sciex WIFF format data

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('some.wiff2.tar')
>>> Wiff2Tar().sniff(fname)
True
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> Wiff2Tar().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Wiff2Tar().sniff(fname)
False
file_ext = 'wiff2.tar'
sniff(filename: str) bool[source]
get_type() str[source]
metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Pretext(**kwd)[source]

Bases: Binary

PretextMap contact map file. Tries to guess whether the file is a Pretext file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sample.pretext')
>>> Pretext().sniff(fname)
True
file_ext = 'pretext'
sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.binary.JP2(**kwd)[source]

Bases: Binary

JPEG 2000 binary image format

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.jp2')
>>> JP2().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> JP2().sniff(fname)
False
file_ext = 'jp2'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.Npz(**kwd)[source]

Bases: CompressedArchive

Class describing a NumPy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> Npz().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Npz().sniff(fname)
False
file_ext = 'npz'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
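
An NPZ file is a zip archive whose members are `.npy` arrays, which is what makes the `files` and `nfiles` metadata above cheap to compute. The sketch below derives both with `zipfile` alone. The helper names are hypothetical, and the exact contents Galaxy stores in `files` are an assumption.

```python
import io
import zipfile


def npz_summary(data: bytes):
    """Return {'files': [...], 'nfiles': n} for a zip of .npy members, else None."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = [
                name[: -len(".npy")]
                for name in archive.namelist()
                if name.endswith(".npy")
            ]
    except zipfile.BadZipFile:
        return None
    if not names:
        return None
    return {"files": names, "nfiles": len(names)}


def make_fake_npz(*array_names: str) -> bytes:
    """Build a zip whose members are named like NPZ arrays (placeholder payloads)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name in array_names:
            archive.writestr(f"{name}.npy", b"\x93NUMPY\x01\x00")
    return buf.getvalue()


print(npz_summary(make_fake_npz("omegas", "shape")))
```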

class galaxy.datatypes.binary.HexrdImagesNpz(**kwd)[source]

Bases: Npz

Class describing an HEXRD Images Numpy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdImagesNpz().sniff(fname)
True
>>> fname = get_test_fname('eta_ome.npz')
>>> HexrdImagesNpz().sniff(fname)
False
file_ext = 'hexrd.images.npz'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object>, 'omegas': <galaxy.model.metadata.MetadataElementSpec object>, 'panel_id': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.HexrdEtaOmeNpz(**kwd)[source]

Bases: Npz

Class describing an HEXRD Eta-Ome Numpy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.eta_ome.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
True
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
False
file_ext = 'hexrd.eta_ome.npz'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'HKLs': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.binary.FITS(**kwd)[source]

Bases: Binary

FITS (Flexible Image Transport System) file data format, widely used in astronomy. Represents sky images (in celestial coordinates) and tables. See https://fits.gsfc.nasa.gov/fits_primer.html

file_ext = 'fits'
__init__(**kwd)[source]

Initialize the datatype

sniff(filename: str) bool[source]

Determines whether the file is a FITS file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fits')
>>> FITS().sniff(fname)
True
>>> fname = FilePrefix(get_test_fname('interval.interval'))
>>> FITS().sniff(fname)
False

set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'HDUs': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
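
A FITS file opens with an 80-character header card whose keyword is `SIMPLE`, with `T` in column 30 for a standard-conforming file. The sketch below checks that first card on a byte buffer. `looks_like_fits` is a hypothetical helper and is stricter than a plain magic-number match, so it should not be read as Galaxy's `FITS.sniff`.

```python
def looks_like_fits(prefix: bytes) -> bool:
    """Check the first header card: keyword SIMPLE, '= ' next, T in column 30."""
    card = prefix[:80]
    return (
        len(card) == 80
        and card[:8] == b"SIMPLE  "
        and card[8:10] == b"= "
        and card[29:30] == b"T"
    )


# Build a conforming first card: keyword, value indicator, right-justified T.
first_card = (b"SIMPLE  =" + b" " * 20 + b"T").ljust(80)
print(looks_like_fits(first_card))            # True
print(looks_like_fits(b"chr1\t10\t20\n"))     # False
```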

class galaxy.datatypes.binary.Numpy(**kwd)[source]

Bases: Binary

Class defining a numpy data file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.npy')
>>> Numpy().sniff(fname)
True
file_ext = 'npy'
set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]

Unimplemented method, allows guessing of metadata from contents of file

sniff_prefix(file_prefix: FilePrefix) bool[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

display_peek(dataset: DatasetProtocol) str[source]

Create HTML table, used for displaying peek

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version_str': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
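
An `.npy` file starts with the magic string `\x93NUMPY` followed by one byte each for the major and minor format version, which is a natural source for the `version_str` metadata above. The sketch below reads those bytes without importing NumPy. The helper name and the `major.minor` string format are assumptions.

```python
NPY_MAGIC = b"\x93NUMPY"


def npy_version(prefix: bytes):
    """Return the format version as 'major.minor', or None if the magic is absent."""
    if len(prefix) < 8 or not prefix.startswith(NPY_MAGIC):
        return None
    major, minor = prefix[6], prefix[7]
    return f"{major}.{minor}"


print(npy_version(b"\x93NUMPY\x01\x00" + b"v\x00"))  # 1.0
print(npy_version(b"PK\x03\x04"))                    # None
```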

galaxy.datatypes.blast module

NCBI BLAST datatypes.

Covers the blastxml format and the BLAST databases.

class galaxy.datatypes.blast.BlastXml(**kwd)[source]

Bases: GenericXml

NCBI Blast XML Output data

file_ext = 'blastxml'
edam_format = 'format_3331'
edam_data = 'data_0857'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text

sniff_prefix(file_prefix: FilePrefix) bool[source]

Determines whether the file is blastxml

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('megablast_xml_parser_test1.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('tblastn_four_human_vs_rhodopsin.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> BlastXml().sniff(fname)
False
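
For orientation, a BLAST XML report opens with an XML declaration, a `BlastOutput` DOCTYPE and a `<BlastOutput>` root element. The sketch below tests those three lines on a text prefix. It is a simplified assumption about what such a check could look like (the helper name is hypothetical), and it does not reproduce the exact rules used by `BlastXml.sniff_prefix`.

```python
def looks_like_blastxml(text: str) -> bool:
    """Return True if the first three lines look like a BLAST XML report header."""
    lines = [line.strip() for line in text.splitlines()]
    if len(lines) < 3:
        return False
    return (
        lines[0].startswith("<?xml")
        and lines[1].startswith("<!DOCTYPE BlastOutput")
        and lines[2] == "<BlastOutput>"
    )


sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" '
    '"http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">\n'
    "<BlastOutput>\n"
    "  <BlastOutput_program>blastn</BlastOutput_program>\n"
)
print(looks_like_blastxml(sample))             # True
print(looks_like_blastxml("chr1\t10\t20\n"))   # False
```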
static merge(split_files: List[str], output_file: str) None[source]

Merging multiple XML files is non-trivial and must be done in subclasses.

metadata_spec: MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

sniff(filename)
class galaxy.datatypes.blast.BlastNucDb(**kwd)[source]

Bases: _BlastDb

Class for nucleotide BLAST database files.

file_ext = 'blastdbn'
composite_type: str | None = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastProtDb(**kwd)[source]

Bases: _BlastDb

Class for protein BLAST database files.

file_ext = 'blastdbp'
composite_type: str | None = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastDomainDb(**kwd)[source]

Bases: _BlastDb

Class for domain BLAST database files.

file_ext = 'blastdbd'
composite_type: str | None = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.LastDb(**kwd)[source]

Bases: Data

Class for LAST database files.

file_ext = 'lastdb'
composite_type: str | None = 'basic'
set_peek(dataset: DatasetProtocol, **kwd) None[source]

Set the peek and blurb text.

display_peek(dataset: DatasetProtocol) str[source]

Create HTML content, used for displaying peek.

__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastNucDb5(**kwd)[source]

Bases: _BlastDb

Class for nucleotide BLAST database files.

file_ext = 'blastdbn5'
composite_type: str | None = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastProtDb5(**kwd)[source]

Bases: _BlastDb

Class for protein BLAST database files.

file_ext = 'blastdbp5'
composite_type: str | None = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.blast.BlastDomainDb5(**kwd)[source]

Bases: _BlastDb

Class for domain BLAST database files.

file_ext = 'blastdbd5'
composite_type: str | None = 'basic'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

galaxy.datatypes.checkers module

Module proxies galaxy.util.checkers for backward compatibility.

External datatypes may make use of these functions.

galaxy.datatypes.checkers.check_binary(name, file_path: bool = True) bool[source]
galaxy.datatypes.checkers.check_bz2(file_path: str, check_content: bool = True) Tuple[bool, bool][source]
galaxy.datatypes.checkers.check_gzip(file_path: str, check_content: bool = True) Tuple[bool, bool][source]
galaxy.datatypes.checkers.check_html(name, file_path: bool = True) bool[source]

Returns True if the file/string contains HTML code.

galaxy.datatypes.checkers.check_image(file_path: str) bool[source]

Simple wrapper around image_type that yields a True/False verdict.

galaxy.datatypes.checkers.check_zip(file_path: str, check_content: bool = True, files=1) Tuple[bool, bool][source]
galaxy.datatypes.checkers.is_gzip(file_path: str) bool[source]
galaxy.datatypes.checkers.is_bz2(file_path: str) bool[source]
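
As a rough illustration of how compression checks of this kind commonly work, the sketch below tests for the gzip and bzip2 magic numbers. The function names `has_gzip_magic` and `has_bz2_magic` are hypothetical and take bytes rather than a file path; whether the Galaxy functions above use exactly this logic is an assumption, not something stated in this reference.

```python
import bz2
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member
BZ2_MAGIC = b"BZh"        # first three bytes of a bzip2 stream


def has_gzip_magic(data: bytes) -> bool:
    """Return True if the data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def has_bz2_magic(data: bytes) -> bool:
    """Return True if the data starts with the bzip2 magic number."""
    return data[:3] == BZ2_MAGIC


if __name__ == "__main__":
    payload = b"ACGT" * 10
    print(has_gzip_magic(gzip.compress(payload)))
    print(has_bz2_magic(bz2.compress(payload)))
    print(has_gzip_magic(payload))
```

A magic-number test only identifies the container; validating the decompressed content is a separate step, which may be what the `check_content` parameter of the functions above controls.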

galaxy.datatypes.chrominfo module

class galaxy.datatypes.chrominfo.ChromInfo(**kwd)[source]

Bases: Tabular

file_ext = 'len'
metadata_spec: MetadataSpecCollection = {'chrom': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'length': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype
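
For orientation, a small sketch of reading chromosome lengths from a `len` file. It assumes the file is tab-separated with the chromosome name in the first column and its length in the second; the `chrom` and `length` metadata entries above suggest such columns exist, but their positions here are an assumption. The function name `parse_len_text` is hypothetical.

```python
from typing import Dict


def parse_len_text(text: str) -> Dict[str, int]:
    """Parse tab-separated 'chrom<TAB>length' lines into a dict.

    Assumes column 1 is the chromosome name and column 2 its length.
    """
    lengths: Dict[str, int] = {}
    for line in text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        lengths[fields[0]] = int(fields[1])
    return lengths


if __name__ == "__main__":
    print(parse_len_text("chr1\t248956422\nchrM\t16569\n"))
```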

galaxy.datatypes.constructive_solid_geometry module

Constructive Solid Geometry file formats.

class galaxy.datatypes.constructive_solid_geometry.Ply(**kwd)[source]

Bases: object

The PLY format describes an object as a collection of vertices, faces and other elements, along with properties such as color and normal direction that can be attached to these elements. A PLY file contains the description of exactly one object.

subtype = ''
abstract __init__(**kwd)[source]
sniff_prefix(file_prefix: FilePrefix) bool[source]

The structure of a typical PLY file: Header, Vertex List, Face List, (lists of other elements)
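
The header portion of that structure can be checked with a few string comparisons. The sketch below is an assumption-laden illustration, not the Galaxy implementation: `looks_like_ply_header` is a hypothetical name, and it relies on the PLY convention that the file starts with the line `ply` followed by a `format` line such as `format ascii 1.0` or `format binary_little_endian 1.0`.

```python
def looks_like_ply_header(prefix: str, subtype: str = "ascii") -> bool:
    """Return True if the prefix starts a PLY file of the given subtype.

    subtype is 'ascii' or 'binary'; binary files declare
    binary_little_endian or binary_big_endian in the format line.
    """
    lines = prefix.splitlines()
    if not lines or lines[0].strip() != "ply":
        return False
    for line in lines[1:]:
        tokens = line.split()
        if tokens[:1] == ["format"]:
            declared = tokens[1] if len(tokens) > 1 else ""
            return declared.startswith(subtype)
        if tokens[:1] == ["end_header"]:
            # Header ended without a format line.
            break
    return False


if __name__ == "__main__":
    header = (
        "ply\n"
        "format ascii 1.0\n"
        "element vertex 3\n"
        "property float x\n"
        "end_header\n"
    )
    print(looks_like_ply_header(header))
    print(looks_like_ply_header(header, subtype="binary"))
```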

set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]
set_peek(dataset: DatasetProtocol, **kwd) None[source]
display_peek(dataset: DatasetProtocol) str[source]
sniff(filename)
class galaxy.datatypes.constructive_solid_geometry.PlyAscii(**kwd)[source]

Bases: Ply, Text

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.plyascii')
>>> PlyAscii().sniff(fname)
True
>>> fname = get_test_fname('test.vtkascii')
>>> PlyAscii().sniff(fname)
False
file_ext = 'plyascii'
subtype = 'ascii'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'face': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.PlyBinary(**kwd)[source]

Bases: Ply, Binary

file_ext = 'plybinary'
subtype = 'binary'
__init__(**kwd)[source]

Initialize the datatype

metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'face': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object>}

Dictionary of metadata fields for this datatype

class galaxy.datatypes.constructive_solid_geometry.Vtk(**kwd)[source]

Bases: object

The Visualization Toolkit provides a number of source and writer objects to read and write popular data file formats. The Visualization Toolkit also provides some of its own file formats.

There are two different styles of file formats available in VTK. The simplest are the legacy, serial formats, which are easy to read and write either by hand or programmatically. However, these formats are less flexible than the XML-based file formats, which support random access, parallel I/O, and portable data compression; the XML-based formats are preferred over the serial VTK file formats whenever possible.

All keyword phrases are written in ASCII form whether the file is binary or ASCII. The binary section of the file (if in binary form) is the data proper, i.e., the numbers that define point coordinates, scalars, cell indices, and so forth.

Binary data must be placed into the file immediately after the newline ('\n') character from the previous ASCII keyword and parameter sequence.

TODO: only legacy formats are currently supported and support for XML formats should be added.

subtype = ''
abstract __init__(**kwd)[source]
sniff_prefix(file_prefix: FilePrefix) bool[source]

VTK files can be either ASCII or binary, with two different styles of file formats: legacy or XML. If the file contains a valid VTK header, it is assumed to be a valid VTK file.
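
A hedged sketch of what validating a legacy VTK header can involve. It assumes the legacy layout in which line 1 is `# vtk DataFile Version x.x`, line 2 is a free-text title, line 3 is `ASCII` or `BINARY`, and line 4 starts with the keyword `DATASET`. The function name `parse_legacy_vtk_header` is hypothetical, and the code works on a decoded text prefix, so it does not reflect how Galaxy handles binary content.

```python
import re
from typing import Optional, Tuple

# First line of a legacy VTK file, e.g. "# vtk DataFile Version 3.0".
HEADER_RE = re.compile(r"^# vtk DataFile Version \d+\.\d+$")


def parse_legacy_vtk_header(prefix: str) -> Optional[Tuple[str, str]]:
    """Return (file_format, dataset_type), or None if the header is invalid."""
    lines = [line.strip() for line in prefix.splitlines()]
    if len(lines) < 4 or not HEADER_RE.match(lines[0]):
        return None
    # Line 2 is a free-text title and is not validated.
    file_format = lines[2].upper()
    if file_format not in ("ASCII", "BINARY"):
        return None
    parts = lines[3].split()
    if len(parts) != 2 or parts[0] != "DATASET":
        return None
    return file_format, parts[1]


if __name__ == "__main__":
    sample = (
        "# vtk DataFile Version 3.0\n"
        "Cube example\n"
        "ASCII\n"
        "DATASET POLYDATA\n"
        "POINTS 8 float\n"
    )
    print(parse_legacy_vtk_header(sample))
```

Returning the format and dataset type together means a caller can both sniff the file and choose between ASCII and binary handling from a single pass over the header.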

set_meta(dataset: DatasetProtocol, overwrite: bool = True, **kwd) None[source]
set_initial_metadata(i: int, line: str, dataset: DatasetProtocol) DatasetProtocol[source]
set_structure_metadata(line: str, dataset: DatasetProtocol, dataset_type: str | None) Tuple[DatasetProtocol, str | None][source]

The fourth part of legacy VTK files is the dataset structure. The geometry part describes the geometry and topology of the dataset. This part begins with a line containing the keyword DATASET followed by a keyword describing the type of dataset. Then, depending upon the type of dataset, other keyword/data combinations define the actual data.

<