Warning

This document is for an in-development version of Galaxy.

galaxy.datatypes package

Subpackages

Submodules

galaxy.datatypes.annotation module

class galaxy.datatypes.annotation.Augustus(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Class describing an Augustus prediction model

compressed = True
display_peek(dataset)[source]
edam_data = 'data_0950'
file_ext = 'augustus'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd400e6d0>}
set_peek(dataset)[source]
sniff(filename)[source]

Augustus archives always contain the same files

class galaxy.datatypes.annotation.SnapHmm(**kwd)[source]

Bases: galaxy.datatypes.data.Text

display_peek(dataset)[source]
edam_data = 'data_1364'
file_ext = 'snaphmm'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd400e640>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

SNAP model files start with zoeHMM
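
The prefix check described above can be sketched in a few lines. This is only an illustration: `looks_like_snap_hmm` is a hypothetical helper that works on a plain string, whereas Galaxy's own sniffer receives a `FilePrefix` object, and the sample inputs are made up.

```python
def looks_like_snap_hmm(contents: str) -> bool:
    """Return True if the text begins with the SNAP model marker 'zoeHMM'.

    Hypothetical stand-alone helper, not part of Galaxy.
    """
    return contents.startswith("zoeHMM")


# Made-up inputs for illustration only.
print(looks_like_snap_hmm("zoeHMM example model\n"))
print(looks_like_snap_hmm(">seq1\nACGT\n"))
```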

galaxy.datatypes.anvio module

Datatypes for Anvi’o https://github.com/merenlab/anvio

class galaxy.datatypes.anvio.AnvioComposite(**kwd)[source]

Bases: galaxy.datatypes.text.Html

Base class for Anvi’o composite datatypes. These generally consist of a SQLite database plus optional additional files.

composite_type = 'auto_primary_file'
display_peek(dataset)[source]

Create HTML content, used for displaying peek.

file_ext = 'anvio_composite'
generate_primary_file(dataset=None)[source]

This is called only at upload to write the HTML file. The datasets cannot be renamed here; unfortunately, they arrive with their default names.

get_mime()[source]

Returns the mime type of the datatype

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
set_peek(dataset)[source]

Set the peek and blurb text

class galaxy.datatypes.anvio.AnvioContigsDB(*args, **kwd)[source]

Bases: galaxy.datatypes.anvio.AnvioDB

Class for Anvio Contigs DB database files.

__init__(*args, **kwd)[source]
file_ext = 'anvio_contigs_db'
metadata_spec = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd41f72e0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.anvio.AnvioDB(*args, **kwd)[source]

Bases: galaxy.datatypes.anvio.AnvioComposite

Class for AnvioDB database files.

__init__(*args, **kwd)[source]
file_ext = 'anvio_db'
metadata_spec = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd43cd730>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
set_meta(dataset, **kwd)[source]

Set the anvio_basename based upon actual extra_files_path contents.

class galaxy.datatypes.anvio.AnvioGenomesDB(*args, **kwd)[source]

Bases: galaxy.datatypes.anvio.AnvioDB

Class for Anvio Genomes DB database files.

file_ext = 'anvio_genomes_db'
metadata_spec = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd41f7b20>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.anvio.AnvioPanDB(*args, **kwd)[source]

Bases: galaxy.datatypes.anvio.AnvioDB

Class for Anvio Pan DB database files.

file_ext = 'anvio_pan_db'
metadata_spec = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd41c0d90>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.anvio.AnvioProfileDB(*args, **kwd)[source]

Bases: galaxy.datatypes.anvio.AnvioDB

Class for Anvio Profile DB database files.

__init__(*args, **kwd)[source]
file_ext = 'anvio_profile_db'
metadata_spec = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd41c0a60>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.anvio.AnvioSamplesDB(*args, **kwd)[source]

Bases: galaxy.datatypes.anvio.AnvioDB

Class for Anvio Samples DB database files.

file_ext = 'anvio_samples_db'
metadata_spec = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd42758b0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.anvio.AnvioStructureDB(*args, **kwd)[source]

Bases: galaxy.datatypes.anvio.AnvioDB

Class for Anvio Structure DB database files.

file_ext = 'anvio_structure_db'
metadata_spec = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd41f72b0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd44a9130>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}

galaxy.datatypes.assembly module

Velvet datatypes for the Velvet assembler tool in Galaxy, by James E Johnson, University of Minnesota.

class galaxy.datatypes.assembly.Amos(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Class describing the AMOS assembly file

edam_data = 'data_0925'
edam_format = 'format_3582'
file_ext = 'afg'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd4136580>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is in the AMOS assembly file format. Example:

{CTG
iid:1
eid:1
seq:
CCTCTCCTGTAGAGTTCAACCGA-GCCGGTAGAGTTTTATCA
.
qlt:
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
.
{TLE
src:1027
off:0
clr:618,0
gap:
250 612
.
}
}
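
A rough sketch of how such a file might be recognised follows. It assumes that an AMOS record opens with `{` followed by a three-letter uppercase tag, as `{CTG` and `{TLE` do in the example; both that rule and the helper name `looks_like_amos` are assumptions for illustration and may differ from what Galaxy's sniffer actually checks.

```python
import re

# Assumption: a record opens with "{" and a three-letter uppercase tag.
RECORD_OPEN = re.compile(r"^\{[A-Z]{3}$")


def looks_like_amos(text: str) -> bool:
    """Return True if the first non-blank line opens an AMOS-style record."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip leading blank lines
        return bool(RECORD_OPEN.match(line))
    return False  # empty input


sample = "{CTG\niid:1\neid:1\nseq:\nCCTCTCCTGTAGAG\n.\n}\n"
print(looks_like_amos(sample))
```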
class galaxy.datatypes.assembly.Roadmaps(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Class describing the Roadmaps file generated by velveth

edam_format = 'format_2561'
file_ext = 'roadmaps'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd54d1cd0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is a velveth-produced RoadMap:

142858 21 1
ROADMAP 1
ROADMAP 2
…
class galaxy.datatypes.assembly.Sequences(**kwd)[source]

Bases: galaxy.datatypes.sequence.Fasta

Class describing the Sequences file generated by velveth

edam_data = 'data_0925'
file_ext = 'sequences'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'sequences': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd4136910>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is a velveth-produced FASTA file. The id line has 3 fields separated by tabs: sequence_name, sequence_index, and category:

>SEQUENCE_0_length_35   1       1
GGATATAGGGCCAACCCAACTCAACGGCCTGTCTT
>SEQUENCE_1_length_35   2       1
CGACGAATGACAGGTCACGAATTTGGCGGGGATTA
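
The three-field id line can be checked with a short sketch like the one below. It takes the description at its word that tabs separate the fields, and it additionally assumes that the index and category are non-negative integers, as in the example. `is_velveth_id_line` is a hypothetical helper, not Galaxy's implementation.

```python
def is_velveth_id_line(line: str) -> bool:
    """Return True if the line looks like '>name<TAB>index<TAB>category'."""
    if not line.startswith(">"):
        return False
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return False
    # Assumption: sequence_index and category are plain integers.
    return fields[1].isdigit() and fields[2].isdigit()


print(is_velveth_id_line(">SEQUENCE_0_length_35\t1\t1"))
print(is_velveth_id_line(">seq1 an ordinary FASTA header"))
```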
class galaxy.datatypes.assembly.Velvet(**kwd)[source]

Bases: galaxy.datatypes.text.Html

__init__(**kwd)[source]
composite_type = 'auto_primary_file'
file_ext = 'velvet'
generate_primary_file(dataset=None)[source]
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd54d1df0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'long_reads': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd400e790>, 'paired_end_reads': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd400e880>, 'short2_reads': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd400e3d0>}
regenerate_primary_file(dataset)[source]

cannot do this until we are setting metadata

set_meta(dataset, **kwd)[source]

galaxy.datatypes.binary module

Binary classes

class galaxy.datatypes.binary.Ab1(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing an ab1 binary sequence file

display_peek(dataset)[source]
edam_data = 'data_0924'
edam_format = 'format_3000'
file_ext = 'ab1'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1970>}
set_peek(dataset)[source]
class galaxy.datatypes.binary.Anndata(**kwd)[source]

Bases: galaxy.datatypes.binary.H5

Class describing HDF5 anndata files: http://anndata.rtfd.io

>>> from galaxy.datatypes.sniff import get_test_fname
>>> Anndata().sniff(get_test_fname('pbmc3k_tiny.h5ad'))
True
>>> Anndata().sniff(get_test_fname('test.mz5'))
False
>>> Anndata().sniff(get_test_fname('import.loom.krumsiek11.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk.h5ad'))
True

display_peek(dataset)[source]
file_ext = 'h5ad'
metadata_spec = {'anndata_spec_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e474f0>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47eb0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149790>, 'description': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe715d2e0>, 'doi': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe715d0d0>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47ca0>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47c40>, 'obs_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e470d0>, 'obs_layers': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47460>, 'obs_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e473a0>, 'obs_size': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47040>, 'obsm_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47100>, 'obsm_layers': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47310>, 'raw_var_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47220>, 'raw_var_layers': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47190>, 'raw_var_size': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47d60>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e475b0>, 'shape': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd199e80>, 'title': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7079a00>, 'uns_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1992b0>, 'uns_layers': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1990d0>, 'url': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe715d220>, 'var_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47f40>, 'var_layers': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47df0>, 'var_size': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e47a00>, 'varm_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd199040>, 'varm_layers': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd199160>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.BafTar(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Base class for common behavior of tar files of directory-based raw file formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> BafTar().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> BafTar().sniff(fname)
False

display_peek(dataset)[source]
edam_data = 'data_2536'
edam_format = 'format_3712'
file_ext = 'brukerbaf.d.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9940>}
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Bam(**kwd)[source]

Bases: galaxy.datatypes.binary.BamNative

Class describing a BAM binary file

column_dataprovider(dataset, **settings)[source]
data_sources = {'data': 'bai', 'index': 'bigwig'}
dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'column': <function Bam.column_dataprovider at 0x7f6fdd1af820>, 'dict': <function Bam.dict_dataprovider at 0x7f6fdd1af9d0>, 'genomic-region': <function Bam.genomic_region_dataprovider at 0x7f6fdd1afee0>, 'genomic-region-dict': <function Bam.genomic_region_dict_dataprovider at 0x7f6fdcfe80d0>, 'header': <function Bam.header_dataprovider at 0x7f6fdd1afb80>, 'id-seq-qual': <function Bam.id_seq_qual_dataprovider at 0x7f6fdd1afd30>, 'line': <function Bam.line_dataprovider at 0x7f6fdd1af4c0>, 'regex-line': <function Bam.regex_line_dataprovider at 0x7f6fdd1af670>, 'samtools': <function Bam.samtools_dataprovider at 0x7f6fdcfe8280>}
dataset_content_needs_grooming(file_name)[source]

Check if file_name is a coordinate-sorted BAM file
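
To illustrate what "coordinate-sorted" means at the header level, here is a minimal sketch that reads the `SO` tag from the `@HD` line of SAM-style header text. It assumes the header has already been extracted as text; reading it from a BAM file needs a library such as pysam, which this sketch does not use. The helper names are hypothetical and this is not Galaxy's implementation.

```python
def sort_order_from_header(header_text: str) -> str:
    """Return the SO value from the @HD header line, or 'unknown' if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Fields after the record type are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


def is_coordinate_sorted(header_text: str) -> bool:
    """Hypothetical check: True only when the header declares SO:coordinate."""
    return sort_order_from_header(header_text) == "coordinate"


header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n"
print(is_coordinate_sorted(header))
```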

dict_dataprovider(dataset, **settings)[source]
edam_data = 'data_0863'
edam_format = 'format_2572'
file_ext = 'bam'
genomic_region_dataprovider(dataset, **settings)[source]
genomic_region_dict_dataprovider(dataset, **settings)[source]
get_index_flag(file_name)[source]

Return the pysam flag for a BAI index (default) or for a CSI index (used when a contig is longer than 2**29 - 1).
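
The choice between the two index types can be sketched as follows. The sketch assumes the reference lengths are already known and that the returned flags mirror the `-b` (BAI) and `-c` (CSI) options of `samtools index`; `choose_index_flag` is a hypothetical helper rather than the method documented here.

```python
# Longest contig length used as the BAI threshold in the description above.
BAI_MAX_CONTIG_LENGTH = 2**29 - 1


def choose_index_flag(reference_lengths) -> str:
    """Return '-c' (CSI) if any contig exceeds the BAI limit, else '-b' (BAI)."""
    if any(length > BAI_MAX_CONTIG_LENGTH for length in reference_lengths):
        return "-c"
    return "-b"


print(choose_index_flag([248_956_422, 242_193_529]))
print(choose_index_flag([600_000_000]))
```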

header_dataprovider(dataset, **settings)[source]
id_seq_qual_dataprovider(dataset, **settings)[source]
line_dataprovider(dataset, **settings)[source]
metadata_spec = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e2efd0>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18ff70>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e56d00>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f3d0>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f460>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f700>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f7c0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fdc0>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fee0>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fe50>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fd30>}
regex_line_dataprovider(dataset, **settings)[source]
samtools_dataprovider(dataset, **settings)[source]

Generic samtools interface - all options available through settings.

set_meta(dataset, overwrite=True, **kwd)[source]
sniff(file_name)[source]
track_type = 'ReadTrack'
class galaxy.datatypes.binary.BamInputSorted(**kwd)[source]

Bases: galaxy.datatypes.binary.BamNative

A class for BAM files that can formally be unsorted or queryname sorted. Alignments are either ordered based on the order in which the queries appear when producing the alignment, or ordered by their queryname. This notably keeps alignments produced by paired-end sequencing adjacent.

dataset_content_needs_grooming(file_name)[source]

Groom if the file is coordinate sorted

file_ext = 'qname_input_sorted.bam'
metadata_spec = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7157af0>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6f5b3a0>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6f5b490>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6f5b4f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6f5b2e0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7157820>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7157a00>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7157610>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7157b20>}
sniff(file_name)[source]
sort_flag = '-n'
class galaxy.datatypes.binary.BamNative(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive, galaxy.datatypes.binary._BamOrSam

Class describing a BAM binary file that is not necessarily sorted

display_data(trans, dataset, preview=False, filename=None, to_ext=None, offset=None, ck_size=None, **kwd)[source]
display_peek(dataset)[source]
edam_data = 'data_0863'
edam_format = 'format_2572'
file_ext = 'unsorted.bam'
get_chunk(trans, dataset, offset=0, ck_size=None)[source]
groom_dataset_content(file_name)[source]

Ensures that the BAM file contents are coordinate-sorted. This function is called on an output dataset after the content is initially generated.

init_meta(dataset, copy_from=None)[source]
classmethod is_bam(filename)[source]
static merge(split_files, output_file)[source]

Merges BAM files

Parameters:
  • split_files – List of bam file paths to merge
  • output_file – Write merged bam file to this location
metadata_spec = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18ff70>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f3d0>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f460>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f700>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f7c0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fdc0>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fee0>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fe50>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fd30>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
sort_flag = None
to_archive(dataset, name='')[source]
validate(dataset, **kwd)[source]
class galaxy.datatypes.binary.BamQuerynameSorted(**kwd)[source]

Bases: galaxy.datatypes.binary.BamInputSorted

A class for queryname sorted BAM files.

dataset_content_needs_grooming(file_name)[source]

Check if file_name is a queryname-sorted BAM file

file_ext = 'qname_sorted.bam'
metadata_spec = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe71499d0>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe70ace50>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe71570a0>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7157730>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7157100>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe70aceb0>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149340>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149730>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe70acf10>}
sniff(file_name)[source]
sort_flag = '-n'
class galaxy.datatypes.binary.BaseBcf(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

edam_data = 'data_3498'
edam_format = 'format_3020'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149a00>}
class galaxy.datatypes.binary.Bcf(**kwd)[source]

Bases: galaxy.datatypes.binary.BaseBcf

Class describing a (BGZF-compressed) BCF file

file_ext = 'bcf'
metadata_spec = {'bcf_index': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149670>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149a00>}
set_meta(dataset, overwrite=True, **kwd)[source]

Creates the index for the BCF file.

sniff(filename)[source]
class galaxy.datatypes.binary.BcfUncompressed(**kwd)[source]

Bases: galaxy.datatypes.binary.BaseBcf

Class describing an uncompressed BCF file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('1.bcf_uncompressed')
>>> BcfUncompressed().sniff(fname)
True
>>> fname = get_test_fname('1.bcf')
>>> BcfUncompressed().sniff(fname)
False
file_ext = 'bcf_uncompressed'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149130>}
sniff(filename)[source]
class galaxy.datatypes.binary.BigBed(**kwd)[source]

Bases: galaxy.datatypes.binary.BigWig

BigBed support from UCSC.

__init__(**kwd)[source]
data_sources = {'data_standalone': 'bigbed'}
edam_data = 'data_3002'
edam_format = 'format_3004'
file_ext = 'bigbed'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe430>}
class galaxy.datatypes.binary.BigWig(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Accessing binary BigWig files from UCSC. The supplemental info in the paper has the binary details: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btq351v1

__init__(**kwd)[source]
data_sources = {'data_standalone': 'bigwig'}
display_peek(dataset)[source]
edam_data = 'data_3002'
edam_format = 'format_3006'
file_ext = 'bigwig'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe520>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
track_type = 'LineTrack'
class galaxy.datatypes.binary.Binary(**kwd)[source]

Bases: galaxy.datatypes.data.Data

Binary data

edam_format = 'format_2333'
file_ext = 'binary'
get_mime()[source]

Returns the mime type of the datatype

metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>}
static register_sniffable_binary_format(data_type, ext, type_class)[source]

Deprecated method.

static register_unsniffable_binary_ext(ext)[source]

Deprecated method.

set_peek(dataset, **kwd)[source]

Set the peek and blurb text

class galaxy.datatypes.binary.Biom2(**kwd)[source]

Bases: galaxy.datatypes.binary.H5

Class describing a biom2 file (http://biom-format.org/documentation/biom_format.html)

display_peek(dataset)[source]
edam_format = 'format_3746'
file_ext = 'biom2'
metadata_spec = {'creation_date': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e1b910>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149790>, 'format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a15b0>, 'format_url': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1490>, 'format_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1520>, 'generated_by': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e3f970>, 'id': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1370>, 'nnz': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffef70>, 'shape': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffeee0>, 'type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a16a0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Biom2().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Biom2().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Biom2().sniff(fname)
False
class galaxy.datatypes.binary.BlibSQlite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a Proteomics Spectral Library Sqlite database

file_ext = 'blib'
metadata_spec = {'blib_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfeec70>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Bref3(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Bref3 format is a binary format for storing phased, non-missing genotypes for a list of samples.

__init__(**kwd)[source]
display_peek(dataset)[source]
file_ext = 'bref3'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dc10>}
set_peek(dataset)[source]
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.Bz2DynamicCompressedArchive(**kwd)[source]

Bases: galaxy.datatypes.binary.DynamicCompressedArchive

compressed_format = 'bz2'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19df10>}
class galaxy.datatypes.binary.CRAM(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

edam_data = 'data_0863'
edam_format = 'format_3462'
file_ext = 'cram'
get_cram_version(filename)[source]
metadata_spec = {'cram_index': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149580>, 'cram_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe71498e0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>}
set_index_file(dataset, index_file)[source]
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Cel(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Cel File format described at: http://media.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cel.html

edam_data = 'data_3110'
edam_format = 'format_1638'
file_ext = 'cel'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd011dc0>}
set_meta(dataset, **kwd)[source]

Set metadata for Cel file.

set_peek(dataset)[source]
sniff(filename)[source]

Try to guess if the file is a Cel file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_3.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_4.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('test.gal')
>>> Cel().sniff(fname)
False

class galaxy.datatypes.binary.ChiraSQLite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a ChiRAViz Sqlite database

file_ext = 'chira.sqlite'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5af0>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5a60>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5bb0>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.CompressedArchive(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.

compressed = True
display_peek(dataset)[source]
file_ext = 'compressed_archive'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>}
set_peek(dataset)[source]
class galaxy.datatypes.binary.CompressedZipArchive(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.

display_peek(dataset)[source]
file_ext = 'zip'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dfa0>}
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Cool(**kwd)[source]

Bases: galaxy.datatypes.binary.H5

Class describing the cool format (https://github.com/mirnylab/cooler)

display_peek(dataset)[source]
file_ext = 'cool'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffedf0>}
set_peek(dataset)[source]
sniff(filename)[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.cool')
>>> Cool().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Cool().sniff(fname)
False
class galaxy.datatypes.binary.Cpt(**kwd)[source]

Bases: galaxy.datatypes.binary.GmxBinary

Class describing a checkpoint (.cpt) file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.cpt')
>>> Cpt().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Cpt().sniff(fname)
False
file_ext = 'cpt'
magic_number = 171817
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e00940>}
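
As an illustration of how a `magic_number` such as 171817 could be used for sniffing, the sketch below compares it with the first four bytes of a file. It assumes the number is stored as a big-endian 32-bit signed integer at offset 0; that layout is an assumption rather than something this page states, and `starts_with_magic` is a hypothetical helper.

```python
import struct

CPT_MAGIC_NUMBER = 171817  # value listed for the Cpt datatype


def starts_with_magic(data: bytes, magic: int = CPT_MAGIC_NUMBER) -> bool:
    """Return True if the data begins with `magic` as a big-endian int32.

    Assumption: the magic number occupies the first four bytes, big-endian.
    """
    size = struct.calcsize(">i")
    if len(data) < size:
        return False
    (value,) = struct.unpack(">i", data[:size])
    return value == magic


# Made-up payload: the magic number followed by arbitrary bytes.
print(starts_with_magic(struct.pack(">i", CPT_MAGIC_NUMBER) + b"payload"))
```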
class galaxy.datatypes.binary.CuffDiffSQlite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a CuffDiff SQLite database

display_peek(dataset)[source]
edam_format = 'format_3621'
file_ext = 'cuffdiff.sqlite'
metadata_spec = {'cuffdiff_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5910>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'genes': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5850>, 'samples': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff57c0>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.DAA(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a DAA (diamond alignment archive) file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.daa')
>>> DAA().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DAA().sniff(fname)
False

__init__(**kwd)[source]
file_ext = 'daa'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9e50>}
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.DMND(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a DMND file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond_db.dmnd')
>>> DMND().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DMND().sniff(fname)
False

__init__(**kwd)[source]
file_ext = 'dmnd'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9c10>}
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.Dcd(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a dcd file from the CHARMM molecular simulation program

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_glucose_vacuum.dcd')
>>> Dcd().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Dcd().sniff(fname)
False
__init__(**kwd)[source]
display_peek(dataset)[source]
edam_data = 'data_3842'
file_ext = 'dcd'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4f40>}
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.DlibSQlite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a Proteomics Spectral Library SQLite database. DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.dlib')
>>> DlibSQlite().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DlibSQlite().sniff(fname)
False
file_ext = 'dlib'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'dlib_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfeeb20>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.DynamicCompressedArchive(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

matches_any(target_datatypes) → bool[source]

Treat two aspects of compressed datatypes separately.

metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dca0>}
class galaxy.datatypes.binary.Edr(**kwd)[source]

Bases: galaxy.datatypes.binary.GmxBinary

Class describing an edr file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.edr')
>>> Edr().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Edr().sniff(fname)
False
file_ext = 'edr'
magic_number = -55555
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1610>}
class galaxy.datatypes.binary.ElibSQlite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a Proteomics Chromatogram Library SQLite database. DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.elib')
>>> ElibSQlite().sniff(fname)
True
>>> fname = get_test_fname('test.dlib')
>>> ElibSQlite().sniff(fname)
False
file_ext = 'elib'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>, 'version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee9d0>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
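
The DLIB/ELIB distinction in the class descriptions can be sketched with the standard sqlite3 module. This is a minimal illustration and not Galaxy's sniffer: the helper names `classify_library` and `make_demo_db` are hypothetical, and treating any row in “peptidequants” or “peptidescores” as the ELIB marker is an assumption drawn only from the descriptions above.

```python
import sqlite3

# Tables that, per the class descriptions, are populated only in ELIBs.
ELIB_ONLY_TABLES = ("peptidequants", "peptidescores")


def classify_library(conn):
    """Return 'elib' if any ELIB-only table has rows, otherwise 'dlib'.

    Hypothetical helper for illustration; not part of galaxy.datatypes.
    """
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    for table in ELIB_ONLY_TABLES:
        if table not in existing:
            continue
        # Table names come from the fixed tuple above, so formatting is safe here.
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        if count > 0:
            return "elib"
    return "dlib"


def make_demo_db(with_quants):
    """Build a small in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (id INTEGER)")
    conn.execute("CREATE TABLE peptidequants (id INTEGER)")
    conn.execute("INSERT INTO entries VALUES (1)")
    if with_quants:
        conn.execute("INSERT INTO peptidequants VALUES (1)")
    return conn
```

Checking row counts rather than table existence matters here because, according to the descriptions, both formats share the same schema and differ only in which tables are filled.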
class galaxy.datatypes.binary.ExcelXls(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing an Excel (xls) file

display_peek(dataset)[source]
edam_format = 'format_3468'
file_ext = 'excel.xls'
get_mime()[source]

Returns the mime type of the datatype

metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee1f0>}
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Fast5Archive(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Class describing a FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5Archive().sniff(fname)
True
display_peek(dataset)[source]
file_ext = 'fast5.tar'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4a00>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Fast5ArchiveBz2(**kwd)[source]

Bases: galaxy.datatypes.binary.Fast5Archive

Class describing a bzip2-compressed FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveBz2().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveBz2().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveBz2().sniff(fname)
False
file_ext = 'fast5.tar.bz2'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4b80>}
sniff(filename)[source]
class galaxy.datatypes.binary.Fast5ArchiveGz(**kwd)[source]

Bases: galaxy.datatypes.binary.Fast5Archive

Class describing a gzip-compressed FAST5 archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveGz().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveGz().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveGz().sniff(fname)
False
file_ext = 'fast5.tar.gz'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4ac0>}
sniff(filename)[source]
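
The doctests for the three FAST5 archive classes show that each one accepts only its own compression variant. A rough sketch of telling the variants apart from the leading bytes follows. It relies on the well-known gzip (`1f 8b`) and bzip2 (`BZh`) signatures and the `ustar` marker at offset 257 of a tar header; the function names are hypothetical, and unlike a real sniffer it does not check that the archive actually contains FAST5 files.

```python
import bz2
import gzip
import io
import tarfile


def fast5_archive_ext(header):
    """Guess which archive extension matches the leading bytes, or None.

    Hypothetical helper; only the container signature is inspected.
    """
    if header[:2] == b"\x1f\x8b":
        return "fast5.tar.gz"
    if header[:3] == b"BZh":
        return "fast5.tar.bz2"
    if header[257:262] == b"ustar":
        return "fast5.tar"
    return None


def build_demo_tar():
    """Create an uncompressed in-memory tar holding one empty .fast5 member."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        archive.addfile(tarfile.TarInfo("reads/read_0.fast5"))
    return buffer.getvalue()
```

For example, `fast5_archive_ext(gzip.compress(build_demo_tar()))` classifies the made-up archive as the gzip variant, while plain tab-separated text matches none of the signatures.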
class galaxy.datatypes.binary.GAFASQLite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a GAFA SQLite database

file_ext = 'gafa.sqlite'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'gafa_schema_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee5e0>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.GeminiSQLite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a Gemini Sqlite database

display_peek(dataset)[source]
edam_data = 'data_3498'
edam_format = 'format_3622'
file_ext = 'gemini.sqlite'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'gemini_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5d30>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.GenericAsn1Binary(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class for generic ASN.1 binary format

edam_data = 'data_0849'
edam_format = 'format_1966'
file_ext = 'asn1-binary'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19de50>}
class galaxy.datatypes.binary.GmxBinary(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Base class for GROMACS binary files (xtc, trr, cpt, edr)

display_peek(dataset)[source]
file_ext = ''
magic_number = None
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd199190>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
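
Subclasses on this page set `magic_number` (for example 171817 for Cpt and -55555 for Edr), which suggests the sniffer compares a value read from the file header. A minimal sketch of such a check is shown below. It assumes the magic number is the first four bytes of the file, stored as a big-endian signed 32-bit integer (XDR encoding); the function names and the lookup table are hypothetical and are not Galaxy's implementation.

```python
import struct


def read_magic_number(header):
    """Interpret the first four bytes as a big-endian signed 32-bit integer.

    Assumes the magic number opens the file in XDR (big-endian) encoding.
    """
    if len(header) < 4:
        return None
    return struct.unpack(">i", header[:4])[0]


# magic_number values listed on this page for two subclasses.
KNOWN_MAGIC = {171817: "cpt", -55555: "edr"}


def guess_gmx_ext(header):
    """Hypothetical helper: map a header to a file_ext, or None if unknown."""
    return KNOWN_MAGIC.get(read_magic_number(header))
```

The base class keeps `magic_number = None`, so under this scheme it would never match a real file; only the concrete subclasses carry a value to compare against.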
class galaxy.datatypes.binary.Grib(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a GRIB file

>>> from galaxy.datatypes.sniff import FilePrefix, get_test_fname
>>> fname = get_test_fname('test.grib')
>>> Grib().sniff_prefix(FilePrefix(fname))
True
>>> fname = FilePrefix(get_test_fname('interval.interval'))
>>> Grib().sniff_prefix(fname)
False
__init__(**kwd)[source]
display_peek(dataset)[source]
edam_format = 'format_2333'
file_ext = 'grib'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'grib_edition': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd199460>}
set_meta(dataset, **kwd)[source]

Set the GRIB edition.

set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
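
The `grib_edition` metadata element can be derived from the GRIB indicator section, where the four ASCII bytes `GRIB` are followed by three bytes of other data and the edition number sits in octet 8. The sketch below reads that byte. It is an illustration under the assumption that the message starts at offset 0; the function name is hypothetical and this is not necessarily how `set_meta` is implemented.

```python
def grib_edition(header):
    """Return the GRIB edition number from the indicator section, or None.

    Assumes `header` holds at least the first 8 bytes of the file and that
    the GRIB message starts at offset 0.
    """
    if len(header) < 8 or header[:4] != b"GRIB":
        return None
    return header[7]  # octet 8 of the indicator section
```

Indexing a `bytes` object returns an integer in Python 3, so the edition comes back as `1` or `2` without further decoding.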
class galaxy.datatypes.binary.GzDynamicCompressedArchive(**kwd)[source]

Bases: galaxy.datatypes.binary.DynamicCompressedArchive

compressed_format = 'gzip'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dd30>}
class galaxy.datatypes.binary.H5(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing an HDF5 file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.mz5')
>>> H5().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> H5().sniff(fname)
False
__init__(**kwd)[source]
display_peek(dataset)[source]
edam_format = 'format_3590'
file_ext = 'h5'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149790>}
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.H5MLM(**kwd)[source]

Bases: galaxy.datatypes.binary.H5

Machine learning model generated by Galaxy-ML.

URL = 'https://github.com/goeckslab/Galaxy-ML'
display_data(trans, dataset, preview=False, filename=None, to_ext=None, **kwd)[source]
display_peek(dataset)[source]
file_ext = 'h5mlm'
get_config_string(filename)[source]
get_repr(filename)[source]
max_peek_size = 1000
max_preview_size = 1000000
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149790>, 'hyper_params': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffebb0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.HexrdEtaOmeNpz(**kwd)[source]

Bases: galaxy.datatypes.binary.Npz

Class describing an HEXRD Eta-Ome Numpy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.eta_ome.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
True
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
False
__init__(**kwd)[source]
display_peek(dataset)[source]
file_ext = 'hexrd.eta_ome.npz'
metadata_spec = {'HKLs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfaed30>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'files': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb91f0>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb92b0>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfaec70>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.HexrdImagesNpz(**kwd)[source]

Bases: galaxy.datatypes.binary.Npz

Class describing an HEXRD Images Numpy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdImagesNpz().sniff(fname)
True
>>> fname = get_test_fname('hexrd.eta_ome.npz')
>>> HexrdImagesNpz().sniff(fname)
False
__init__(**kwd)[source]
display_peek(dataset)[source]
file_ext = 'hexrd.images.npz'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'files': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb91f0>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb92b0>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfaef10>, 'omegas': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfaee80>, 'panel_id': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb90a0>, 'shape': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfaefa0>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.HexrdMaterials(**kwd)[source]

Bases: galaxy.datatypes.binary.H5

Class describing a Hexrd Materials file: https://github.com/HEXRD/hexrd

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.materials.h5')
>>> HexrdMaterials().sniff(fname)
True
>>> fname = get_test_fname('test.loom')
>>> HexrdMaterials().sniff(fname)
False
edam_format = 'format_3590'
file_ext = 'hexrd.materials.h5'
metadata_spec = {'LatticeParameters': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe7c0>, 'SpaceGroupNumber': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe850>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149790>, 'materials': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe940>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.ICM(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing an ICM (interpolated context model) file, used by Glimmer

edam_data = 'data_0950'
file_ext = 'icm'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9b20>}
set_peek(dataset)[source]
sniff(dataset)[source]
class galaxy.datatypes.binary.Idat(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Binary data in idat format

edam_data = 'data_2603'
edam_format = 'format_2058'
file_ext = 'idat'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc12e0>}
sniff(filename)[source]
class galaxy.datatypes.binary.IdpDB(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing an IDPicker 3 idpDB (sqlite) database

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.idpdb')
>>> IdpDB().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> IdpDB().sniff(fname)
False
display_peek(dataset)[source]
file_ext = 'idpdb'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee7c0>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee730>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee880>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.JP2(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

JPEG 2000 binary image format

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.jp2')
>>> JP2().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> JP2().sniff(fname)
False

__init__(**kwd)[source]
display_peek(dataset)[source]
file_ext = 'jp2'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb93d0>}
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Loom(**kwd)[source]

Bases: galaxy.datatypes.binary.H5

Class describing a Loom file: http://loompy.org/

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.loom')
>>> Loom().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Loom().sniff(fname)
False
display_peek(dataset)[source]
edam_format = 'format_3590'
file_ext = 'loom'
metadata_spec = {'col_attrs_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2340>, 'col_attrs_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2670>, 'col_graphs_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2580>, 'col_graphs_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7079190>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2160>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149790>, 'description': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149880>, 'doi': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149370>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2070>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2700>, 'loom_spec_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2100>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb2fd0>, 'row_attrs_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb22b0>, 'row_graphs_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe70792e0>, 'row_graphs_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7079400>, 'shape': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6eb21c0>, 'title': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7149490>, 'url': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe71490a0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.LudwigModel(**kwd)[source]

Bases: galaxy.datatypes.text.Html

Composite datatype that encloses multiple files for a Ludwig trained model.

__init__(**kwd)[source]
composite_type = 'auto_primary_file'
file_ext = 'ludwig_model'
generate_primary_file(dataset=None)[source]
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffea90>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.binary.MCool(**kwd)[source]

Bases: galaxy.datatypes.binary.H5

Class describing the multi-resolution cool format (https://github.com/mirnylab/cooler)

display_peek(dataset)[source]
file_ext = 'mcool'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffecd0>}
set_peek(dataset)[source]
sniff(filename)[source]
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.mcool')
>>> MCool().sniff(fname)
True
>>> fname = get_test_fname('matrix.cool')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('test.mz5')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> MCool().sniff(fname)
False
class galaxy.datatypes.binary.MashSketch(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Mash Sketch file. Sketches are used by the MinHash algorithm to allow fast distance estimations with low storage and memory requirements. To make a sketch, each k-mer in a sequence is hashed, which creates a pseudo-random identifier. By sorting these identifiers (hashes), a small subset from the top of the sorted list can represent the entire sequence (these are min-hashes). The more similar another sequence is, the more min-hashes it is likely to share.

file_ext = 'msh'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19d880>}
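
The sketching procedure described above (hash every k-mer, sort the hashes, keep a small subset) can be illustrated in a few lines. This is a toy sketch and not Mash itself: it swaps in BLAKE2b from the standard library for the hash function, and the function names, `k=4`, and `size=8` are arbitrary choices made for the example.

```python
import hashlib


def kmer_hashes(sequence, k):
    """Hash every k-mer of `sequence` to a 64-bit integer."""
    hashes = set()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i : i + k].encode("ascii")
        digest = hashlib.blake2b(kmer, digest_size=8).digest()
        hashes.add(int.from_bytes(digest, "big"))
    return hashes


def sketch(sequence, k=4, size=8):
    """Keep the `size` smallest hashes (the min-hashes) as the sketch."""
    return sorted(kmer_hashes(sequence, k))[:size]


def shared_fraction(sketch_a, sketch_b, size=8):
    """Estimate similarity from the merged bottom-`size` hashes of two sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    both = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in both) / len(merged)
```

Comparing a sketch with itself yields 1.0, and the value drops as fewer min-hashes are shared, which is the property the description relies on for fast distance estimation.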
class galaxy.datatypes.binary.MassHunterTar(**kwd)[source]

Bases: galaxy.datatypes.binary.BafTar

A tar’d up .d directory containing Agilent MassHunter format data

file_ext = 'agilentmasshunter.d.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9700>}
class galaxy.datatypes.binary.MassLynxTar(**kwd)[source]

Bases: galaxy.datatypes.binary.BafTar

A tar’d up .raw directory containing Waters MassLynx format data

file_ext = 'watersmasslynx.raw.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9640>}
class galaxy.datatypes.binary.Meryldb(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

MerylDB is a tar.gz archive containing 128 files: 64 data files and 64 index files.

file_ext = 'meryldb'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19db50>}
sniff(filename)[source]

Try to guess if the file is a Meryldb file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Meryldb().sniff(fname)
False
>>> fname = get_test_fname('read-db.meryldb')
>>> Meryldb().sniff(fname)
True
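
The archive layout stated in the class description (a tar.gz holding 64 data files and 64 index files) can be checked with the standard tarfile module. The sketch below is illustrative only: the `.merylData` and `.merylIndex` suffixes and all function names are assumptions made for the example, not details confirmed by this page.

```python
import io
import tarfile


def count_meryl_members(fileobj):
    """Count data and index members in a gzip-compressed tar archive.

    The '.merylData' and '.merylIndex' suffixes are assumed for illustration.
    """
    data = index = 0
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for name in archive.getnames():
            if name.endswith(".merylData"):
                data += 1
            elif name.endswith(".merylIndex"):
                index += 1
    return data, index


def looks_like_meryldb(fileobj):
    """True when the archive holds exactly 64 data and 64 index files."""
    try:
        return count_meryl_members(fileobj) == (64, 64)
    except (tarfile.TarError, OSError, EOFError):
        # Not a readable gzip-compressed tar archive.
        return False


def build_demo_archive(n_pairs):
    """Create an in-memory tar.gz with `n_pairs` empty data/index pairs."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for i in range(n_pairs):
            for suffix in (".merylData", ".merylIndex"):
                archive.addfile(tarfile.TarInfo(f"db/{i:02d}{suffix}"))
    buffer.seek(0)
    return buffer
```

Listing every member means the whole archive is decompressed once, so a production sniffer would likely want a cheaper check for large databases.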

class galaxy.datatypes.binary.MzSQlite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a Proteomics Sqlite database

file_ext = 'mz.sqlite'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff55b0>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5520>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5670>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.NcbiTaxonomySQlite(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing the NCBI Taxonomy database stored in SQLite as done by rust-ncbitaxonomy

display_peek(dataset)[source]
file_ext = 'ncbitaxonomy.sqlite'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'ncbitaxonomy_schema_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee490>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>, 'taxon_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee3d0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.NetCDF(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Binary data in netCDF format

display_peek(dataset)[source]
edam_data = 'data_0943'
edam_format = 'format_3650'
file_ext = 'netcdf'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4e20>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.Npz(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Class describing a NumPy NPZ file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> Npz().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Npz().sniff(fname)
False
__init__(**kwd)[source]
display_peek(dataset)[source]
file_ext = 'npz'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'files': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb91f0>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb92b0>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.OSW(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing OpenSwath output

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.osw')
>>> OSW().sniff(fname)
True
>>> fname = get_test_fname('test.sqmass')
>>> OSW().sniff(fname)
False
file_ext = 'osw'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff50d0>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5040>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5190>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.OxliBinary(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4340>}
class galaxy.datatypes.binary.OxliCountGraph(**kwd)[source]

Bases: galaxy.datatypes.binary.OxliBinary

OxliCountGraph starts with “OXLI” + one byte version number + 8-bit binary ‘1’.

Test file generated via:

load-into-counting.py --n_tables 1 --max-tablesize 1 \
    oxli_countgraph.oxlicg khmer/tests/test-data/100-reads.fq.bz2

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliCountGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_countgraph.oxlicg")
>>> OxliCountGraph().sniff(fname)
True
file_ext = 'oxlicg'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4430>}
sniff(filename)[source]
class galaxy.datatypes.binary.OxliGraphLabels(**kwd)[source]

Bases: galaxy.datatypes.binary.OxliBinary

OxliGraphLabels starts with “OXLI” + one byte version number + 8-bit binary ‘6’.

Test file generated via:

python -c "from khmer import GraphLabels; \
    gl = GraphLabels(20, 1e7, 4); \
    gl.consume_fasta_and_tag_with_labels('tests/test-data/test-labels.fa'); \
    gl.save_labels_and_tags('oxli_graphlabels.oxligl')"

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliGraphLabels().sniff(fname)
False
>>> fname = get_test_fname("oxli_graphlabels.oxligl")
>>> OxliGraphLabels().sniff(fname)
True
file_ext = 'oxligl'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc47f0>}
sniff(filename)[source]
class galaxy.datatypes.binary.OxliNodeGraph(**kwd)[source]

Bases: galaxy.datatypes.binary.OxliBinary

OxliNodeGraph starts with “OXLI” + one byte version number + 8-bit binary ‘2’.

Test file generated via:

load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliNodeGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_nodegraph.oxling")
>>> OxliNodeGraph().sniff(fname)
True
file_ext = 'oxling'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc44f0>}
sniff(filename)[source]
class galaxy.datatypes.binary.OxliStopTags(**kwd)[source]

Bases: galaxy.datatypes.binary.OxliBinary

OxliStopTags starts with “OXLI” + one byte version number + 8-bit binary ‘4’.

Test file adapted from khmer 2.0’s “khmer/tests/test-data/goodversion-k32.stoptags”

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliStopTags().sniff(fname)
False
>>> fname = get_test_fname("oxli_stoptags.oxlist")
>>> OxliStopTags().sniff(fname)
True
file_ext = 'oxlist'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4670>}
sniff(filename)[source]
class galaxy.datatypes.binary.OxliSubset(**kwd)[source]

Bases: galaxy.datatypes.binary.OxliBinary

OxliSubset starts with “OXLI” + one byte version number + 8-bit binary ‘5’.

Test file generated via:

load-graph.py -k 20 example tests/test-data/random-20-a.fa;
partition-graph.py example;
mv example.subset.0.pmap oxli_subset.oxliss

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliSubset().sniff(fname)
False
>>> fname = get_test_fname("oxli_subset.oxliss")
>>> OxliSubset().sniff(fname)
True
file_ext = 'oxliss'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4730>}
sniff(filename)[source]
class galaxy.datatypes.binary.OxliTagSet(**kwd)[source]

Bases: galaxy.datatypes.binary.OxliBinary

OxliTagSet starts with “OXLI” + one byte version number + 8-bit binary ‘3’.

Test file generated via:

load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2;
mv oxli_nodegraph.oxling.tagset oxli_tagset.oxlits

using khmer 2.0

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliTagSet().sniff(fname)
False
>>> fname = get_test_fname("oxli_tagset.oxlits")
>>> OxliTagSet().sniff(fname)
True
file_ext = 'oxlits'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc45b0>}
sniff(filename)[source]
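
All six Oxli datatypes above share the same header pattern and differ only in the final type byte. The following sketch parses that header. It assumes the layout is exactly a 4-byte `OXLI` signature, one version byte, and one file-type byte whose numeric value matches the digit given in each class description; the function name and lookup table are hypothetical and not taken from the Galaxy source.

```python
# File-type byte for each Oxli datatype, as listed in the class descriptions.
OXLI_FILE_TYPES = {
    1: "oxlicg",  # OxliCountGraph
    2: "oxling",  # OxliNodeGraph
    3: "oxlits",  # OxliTagSet
    4: "oxlist",  # OxliStopTags
    5: "oxliss",  # OxliSubset
    6: "oxligl",  # OxliGraphLabels
}


def parse_oxli_header(header):
    """Return (version, file_ext) for an Oxli header, or None if it does not match.

    Assumed layout: 4-byte signature b"OXLI", 1 version byte, 1 file-type byte.
    """
    if len(header) < 6 or header[:4] != b"OXLI":
        return None
    version, file_type = header[4], header[5]
    file_ext = OXLI_FILE_TYPES.get(file_type)
    if file_ext is None:
        return None
    return version, file_ext
```

A single table keyed on the type byte keeps the six formats in one place, whereas the classes on this page each implement their own `sniff` method.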
class galaxy.datatypes.binary.PQP(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a Peptide query parameters file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.pqp')
>>> PQP().sniff(fname)
True
>>> fname = get_test_fname('test.osw')
>>> PQP().sniff(fname)
False
file_ext = 'pqp'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5340>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff52b0>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5400>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]

The table definition follows https://github.com/grosenberger/OpenMS/blob/develop/src/openms/source/ANALYSIS/OPENSWATH/TransitionPQPFile.cpp#L264

For now, the VERSION, GENE, and PEPTIDE_GENE_MAPPING tables are excluded, because some test data lacks these tables. See also https://github.com/OpenMS/OpenMS/issues/4365

class galaxy.datatypes.binary.Parquet(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing an Apache Parquet file (https://parquet.apache.org/)

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('example.parquet')
>>> Parquet().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Parquet().sniff(fname)
False

__init__(**kwd)[source]
file_ext = 'parquet'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9a30>}
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.PostgresqlArchive(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Class describing a Postgresql database packed into a tar archive

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('postgresql_fake.tar.bz2')
>>> PostgresqlArchive().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> PostgresqlArchive().sniff(fname)
False
display_peek(dataset)[source]
file_ext = 'postgresql'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc48e0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Pretext(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

PretextMap contact map file

Try to guess if the file is a Pretext file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sample.pretext')
>>> Pretext().sniff(fname)
True

display_peek(dataset)[source]
file_ext = 'pretext'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb94c0>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.ProBam(**kwd)[source]

Bases: galaxy.datatypes.binary.Bam

Class describing a BAM binary file - extended for proteomics data

edam_data = 'data_0863'
edam_format = 'format_3826'
file_ext = 'probam'
metadata_spec = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6f5b400>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18ff70>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe70f2b50>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f3d0>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f460>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f700>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18f7c0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fdc0>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fee0>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fe50>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd18fd30>}
class galaxy.datatypes.binary.RDS(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

File containing a serialized R object generated with R’s saveRDS function; see https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('int-r3.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r4.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r3-version2.rds')
>>> RDS().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.file_name = get_test_fname('int-r4.rds')
>>> dataset.has_data = lambda: True
>>> RDS().set_meta(dataset)
>>> dataset.metadata.version
'3'
>>> dataset.metadata.rversion
'4.1.1'
>>> dataset.metadata.minrversion
'3.5.0'
check_required_metadata = True
file_ext = 'rds'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'minrversion': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4280>, 'rversion': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc41f0>, 'version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4130>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.RData(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Generic R Data file datatype implementation, i.e. files generated with R’s save or save.image function; see https://www.loc.gov/preservation/digital/formats/fdd/fdd000470.shtml and https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.rdata')
>>> RData().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.file_name = fname
>>> dataset.has_data = lambda: True
>>> RData().set_meta(dataset)
>>> dataset.metadata.version
'3'
VERSION_2_PREFIX = b'RDX2\nX\n'
VERSION_3_PREFIX = b'RDX3\nX\n'
check_required_metadata = True
file_ext = 'rdata'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee040>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
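
The two prefix constants above suggest how the serialization version can be told apart once the leading bytes of the data stream are available. The following is a minimal sketch, not Galaxy's implementation: the helper name `rdata_version` is hypothetical, and it assumes the bytes passed in have already been decompressed (R Data files are commonly stored compressed).

```python
from typing import Optional

# Prefixes copied from the RData class attributes listed above.
VERSION_2_PREFIX = b"RDX2\nX\n"
VERSION_3_PREFIX = b"RDX3\nX\n"


def rdata_version(prefix: bytes) -> Optional[str]:
    """Return '2' or '3' for a recognised prefix, otherwise None.

    Hypothetical helper: `prefix` is assumed to be the first bytes of the
    already-decompressed data stream.
    """
    if prefix.startswith(VERSION_2_PREFIX):
        return "2"
    if prefix.startswith(VERSION_3_PREFIX):
        return "3"
    return None
```

Keeping the check on raw bytes leaves the question of how the stream is decompressed to the caller.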
class galaxy.datatypes.binary.RMA6(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing an RMA6 (MEGAN6 read-match archive) file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.rma6')
>>> RMA6().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> RMA6().sniff(fname)
False

__init__(**kwd)[source]
file_ext = 'rma6'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9d30>}
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.SQlite(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a SQLite database

dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'sqlite': <function SQlite.sqlite_dataprovider at 0x7f6fdcff6160>, 'sqlite-dict': <function SQlite.sqlite_datadictprovider at 0x7f6fdcff64c0>, 'sqlite-table': <function SQlite.sqlite_datatableprovider at 0x7f6fdcff6310>}
display_peek(dataset)[source]
edam_format = 'format_3621'
file_ext = 'sqlite'
init_meta(dataset, copy_from=None)[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5f10>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5e80>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcff5fd0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
sniff_table_names(filename, table_names)[source]
sqlite_datadictprovider(dataset, **settings)[source]
sqlite_dataprovider(dataset, **settings)[source]
sqlite_datatableprovider(dataset, **settings)[source]
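
The metadata_spec above lists `tables`, `table_columns` and `table_row_count`. As a rough illustration of how such values can be collected, the sketch below uses only the standard-library `sqlite3` module. It is not Galaxy's `set_meta`; the function name `summarize_tables` and the shape of the returned dictionary are assumptions made for this example.

```python
import sqlite3


def summarize_tables(conn: sqlite3.Connection) -> dict:
    """Map each table name to its column names and row count (hypothetical helper)."""
    summary = {}
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    for name in names:
        # Quote the identifier so unusual table names do not break the SQL.
        quoted = '"' + name.replace('"', '""') + '"'
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        summary[name] = {"columns": columns, "rows": count}
    return summary
```

A caller would open the dataset file with `sqlite3.connect(path)` and pass the connection in.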
class galaxy.datatypes.binary.SQmass(**kwd)[source]

Bases: galaxy.datatypes.binary.SQlite

Class describing a Sqmass database

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.sqmass')
>>> SQmass().sniff(fname)
True
>>> fname = get_test_fname('test.pqp')
>>> SQmass().sniff(fname)
False
file_ext = 'sqmass'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfeee20>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfeed90>, 'tables': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfeeee0>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Scf(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing an scf binary sequence file

display_peek(dataset)[source]
edam_data = 'data_0924'
edam_format = 'format_1632'
file_ext = 'scf'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe700>}
set_peek(dataset)[source]
class galaxy.datatypes.binary.SearchGuiArchive(**kwd)[source]

Bases: galaxy.datatypes.binary.CompressedArchive

Class describing a SearchGUI archive

display_peek(dataset)[source]
file_ext = 'searchgui_archive'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd19dac0>, 'searchgui_major_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4d90>, 'searchgui_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc4cd0>}
set_meta(dataset, overwrite=True, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.Sff(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Standard Flowgram Format (SFF)

display_peek(dataset)[source]
edam_data = 'data_0924'
edam_format = 'format_3284'
file_ext = 'sff'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe610>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.Sra(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Sequence Read Archive (SRA) datatype originally from mdshw5/sra-tools-galaxy

display_peek(dataset)[source]
file_ext = 'sra'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee130>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]

The first 8 bytes of any NCBI sra file are ‘NCBI.sra’, and the file is binary. For details about the format, see http://www.ncbi.nlm.nih.gov/books/n/helpsra/SRA_Overview_BK/#SRA_Overview_BK.4_SRA_Data_Structure
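
The signature check described above can be sketched as follows. This is an illustration under the stated rule only (8 leading bytes equal to `NCBI.sra`); the helper name `has_sra_magic` is hypothetical and the additional "file is binary" check is left out.

```python
SRA_MAGIC = b"NCBI.sra"  # the 8-byte signature named in the description above


def has_sra_magic(prefix: bytes) -> bool:
    """Return True if the given leading bytes start with the SRA signature."""
    return prefix[:8] == SRA_MAGIC
```

For a file on disk, the prefix could be obtained with `open(path, "rb").read(8)`.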

class galaxy.datatypes.binary.TdfTar(**kwd)[source]

Bases: galaxy.datatypes.binary.BafTar

A tar’d up .d directory containing Bruker TDF format data

file_ext = 'brukertdf.d.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb97c0>}
class galaxy.datatypes.binary.Trr(**kwd)[source]

Bases: galaxy.datatypes.binary.GmxBinary

Class describing a trr file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.trr')
>>> Trr().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Trr().sniff(fname)
False
file_ext = 'trr'
magic_number = 1993
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd199f10>}
class galaxy.datatypes.binary.TwoBit(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a TwoBit format nucleotide file

display_peek(dataset)[source]
edam_data = 'data_0848'
edam_format = 'format_3009'
file_ext = 'twobit'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcffe340>}
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(sniff_prefix)[source]
class galaxy.datatypes.binary.Vel(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class describing a velocity file from the CHARMM molecular simulation program

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_charmm.vel')
>>> Vel().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Vel().sniff(fname)
False
__init__(**kwd)[source]
display_peek(dataset)[source]
file_ext = 'vel'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9f70>}
set_peek(dataset)[source]
sniff(filename)[source]
class galaxy.datatypes.binary.WiffTar(**kwd)[source]

Bases: galaxy.datatypes.binary.BafTar

A tar’d up .wiff/.scan pair containing Sciex WIFF format data

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('some.wiff.tar')
>>> WiffTar().sniff(fname)
True
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> WiffTar().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> WiffTar().sniff(fname)
False

file_ext = 'wiff.tar'
get_type()[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9580>}
sniff(filename)[source]
class galaxy.datatypes.binary.Xlsx(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Class for Excel 2007 (xlsx) files

compressed = True
file_ext = 'xlsx'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfee310>}
sniff(filename)[source]
class galaxy.datatypes.binary.Xtc(**kwd)[source]

Bases: galaxy.datatypes.binary.GmxBinary

Class describing an xtc file from the GROMACS suite

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.xtc')
>>> Xtc().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Xtc().sniff(fname)
False
file_ext = 'xtc'
magic_number = 1995
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe6e00b50>}
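
The `magic_number` attributes of Trr (1993) and Xtc (1995) point to a check on the integer stored at the start of the file. The sketch below shows one way such a check might look. It assumes the value is a big-endian 32-bit integer in the first four bytes; that byte order, and the helper name `gromacs_type`, are assumptions of this example rather than something this page states.

```python
import struct
from typing import Optional

# Magic numbers copied from the Trr and Xtc class attributes above.
MAGIC_NUMBERS = {"trr": 1993, "xtc": 1995}


def gromacs_type(header: bytes) -> Optional[str]:
    """Return 'trr' or 'xtc' when the first 4 bytes hold a known magic number."""
    if len(header) < 4:
        return None
    # Assumed layout: big-endian signed 32-bit integer at offset 0.
    (magic,) = struct.unpack(">i", header[:4])
    for ext, number in MAGIC_NUMBERS.items():
        if magic == number:
            return ext
    return None
```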
class galaxy.datatypes.binary.YepTar(**kwd)[source]

Bases: galaxy.datatypes.binary.BafTar

A tar’d up .d directory containing Agilent/Bruker YEP format data

file_ext = 'agilentbrukeryep.d.tar'
get_signature_file()[source]
get_type()[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfb9880>}

galaxy.datatypes.blast module

NCBI BLAST datatypes.

Covers the blastxml format and the BLAST databases.

class galaxy.datatypes.blast.BlastDomainDb(**kwd)[source]

Bases: galaxy.datatypes.blast._BlastDb

Class for domain BLAST database files.

__init__(**kwd)[source]
composite_type = 'basic'
file_ext = 'blastdbd'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd549a5b0>}
class galaxy.datatypes.blast.BlastDomainDb5(**kwd)[source]

Bases: galaxy.datatypes.blast._BlastDb

Class for domain BLAST database files.

__init__(**kwd)[source]
composite_type = 'basic'
file_ext = 'blastdbd5'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2fd8cd0>}
class galaxy.datatypes.blast.BlastNucDb(**kwd)[source]

Bases: galaxy.datatypes.blast._BlastDb

Class for nucleotide BLAST database files.

__init__(**kwd)[source]
composite_type = 'basic'
file_ext = 'blastdbn'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd36077c0>}
class galaxy.datatypes.blast.BlastNucDb5(**kwd)[source]

Bases: galaxy.datatypes.blast._BlastDb

Class for nucleotide BLAST database files.

__init__(**kwd)[source]
composite_type = 'basic'
file_ext = 'blastdbn5'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2fd8ac0>}
class galaxy.datatypes.blast.BlastProtDb(**kwd)[source]

Bases: galaxy.datatypes.blast._BlastDb

Class for protein BLAST database files.

__init__(**kwd)[source]
composite_type = 'basic'
file_ext = 'blastdbp'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd5a505b0>}
class galaxy.datatypes.blast.BlastProtDb5(**kwd)[source]

Bases: galaxy.datatypes.blast._BlastDb

Class for protein BLAST database files.

__init__(**kwd)[source]
composite_type = 'basic'
file_ext = 'blastdbp5'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2fd8bb0>}
class galaxy.datatypes.blast.BlastXml(**kwd)[source]

Bases: galaxy.datatypes.xml.GenericXml

NCBI Blast XML Output data

edam_data = 'data_0857'
edam_format = 'format_3331'
file_ext = 'blastxml'
static merge(split_files, output_file)[source]

Merging multiple XML files is non-trivial and must be done in subclasses.

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3179c40>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
set_peek(dataset)[source]

Set the peek and blurb text

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is blastxml

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('megablast_xml_parser_test1.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('tblastn_four_human_vs_rhodopsin.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> BlastXml().sniff(fname)
False
class galaxy.datatypes.blast.LastDb(**kwd)[source]

Bases: galaxy.datatypes.data.Data

Class for LAST database files.

__init__(**kwd)[source]
composite_type = 'basic'
display_peek(dataset)[source]

Create HTML content, used for displaying peek.

file_ext = 'lastdb'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2fd89d0>}
set_peek(dataset)[source]

Set the peek and blurb text.

galaxy.datatypes.checkers module

Module proxies galaxy.util.checkers for backward compatibility.

External datatypes may make use of these functions.

galaxy.datatypes.checkers.check_binary(name, file_path: bool = True) → bool[source]
galaxy.datatypes.checkers.check_bz2(file_path: str, check_content: bool = True) → typing.Tuple[bool, bool][source]
galaxy.datatypes.checkers.check_gzip(file_path: str, check_content: bool = True) → typing.Tuple[bool, bool][source]
galaxy.datatypes.checkers.check_html(name, file_path: bool = True) → bool[source]

Returns True if the file/string contains HTML code.

galaxy.datatypes.checkers.check_image(file_path: str)[source]

Simple wrapper around image_type to yield a True/False verdict

galaxy.datatypes.checkers.check_zip(file_path: str, check_content: bool = True, files=1) → typing.Tuple[bool, bool][source]
galaxy.datatypes.checkers.is_gzip(file_path: str) → bool[source]
galaxy.datatypes.checkers.is_bz2(file_path: str) → bool[source]
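
For readers unfamiliar with how compressed files are usually recognised, here is a small sketch based on the well-known leading bytes of the two formats. It is not the code behind `is_gzip` or `is_bz2` in `galaxy.util.checkers`; the function name `guess_compression` is hypothetical.

```python
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream
BZ2_MAGIC = b"BZh"        # first three bytes of a bzip2 stream


def guess_compression(prefix: bytes) -> Optional[str]:
    """Guess the compression format from the leading bytes of a file."""
    if prefix.startswith(GZIP_MAGIC):
        return "gzip"
    if prefix.startswith(BZ2_MAGIC):
        return "bz2"
    return None
```

A magic-byte match only shows that the file claims to be compressed; validating the content is a separate step.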

galaxy.datatypes.chrominfo module

class galaxy.datatypes.chrominfo.ChromInfo(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

file_ext = 'len'
metadata_spec = {'chrom': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd419bb20>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c160>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'length': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd419bd30>}

galaxy.datatypes.constructive_solid_geometry module

Constructive Solid Geometry file formats.

class galaxy.datatypes.constructive_solid_geometry.GmshGeo(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Gmsh geometry File

file_ext = 'gmsh.geo'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd40d6970>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.constructive_solid_geometry.GmshMsh(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Gmsh Mesh File

__init__(**kwd)[source]
file_ext = 'gmsh.msh'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e3a0>, 'version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e460>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Gmsh msh format starts with: $MeshFormat

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.gmsh.msh')
>>> GmshMsh().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tesr')
>>> GmshMsh().sniff(fname)
False
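
A minimal sketch of reading the `$MeshFormat` block follows. It assumes the line after `$MeshFormat` holds `version file-type data-size`, with file-type `0` meaning ASCII, as in the Gmsh MSH file format description. The helper name `parse_mesh_format` and the returned keys are illustrative choices, not Galaxy's `set_meta`.

```python
from typing import Optional


def parse_mesh_format(text: str) -> Optional[dict]:
    """Extract version and ASCII/binary flag from the start of an MSH file."""
    lines = text.splitlines()
    if len(lines) < 2 or lines[0].strip() != "$MeshFormat":
        return None
    fields = lines[1].split()
    if len(fields) < 2:
        return None
    version, file_type = fields[0], fields[1]
    # Assumed convention: file-type 0 is ASCII, anything else is binary.
    return {"version": version, "format": "ASCII" if file_type == "0" else "binary"}
```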

class galaxy.datatypes.constructive_solid_geometry.NeperMultiScaleCell(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Neper Multiscale Cell File

file_ext = 'neper.mscell'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e880>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.constructive_solid_geometry.NeperPoints(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Neper Position File

Neper position format has 1 - 3 floats per line separated by white space.

__init__(**kwd)[source]
file_ext = 'neper.points'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e0d0>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
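
The "1 - 3 floats per line" rule for Neper position files can be illustrated with a short validator. This is a sketch only: the function name `points_dimension` is hypothetical, and two choices (skipping blank lines, requiring every line to have the same number of values) are assumptions of this example, not documented behaviour.

```python
from typing import Optional


def points_dimension(text: str) -> Optional[int]:
    """Return the number of floats per line (1 to 3), or None if invalid."""
    dimension = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        try:
            for field in fields:
                float(field)
        except ValueError:
            return None
        if not 1 <= len(fields) <= 3:
            return None
        if dimension is None:
            dimension = len(fields)
        elif dimension != len(fields):
            return None  # assumption: all lines share one dimension
    return dimension
```

Because `str.split()` with no argument splits on any whitespace, the same check also accepts TAB-separated input.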
class galaxy.datatypes.constructive_solid_geometry.NeperPointsTabular(**kwd)[source]

Bases: galaxy.datatypes.constructive_solid_geometry.NeperPoints, galaxy.datatypes.tabular.Tabular

Neper Position File

Neper position format has 1 - 3 floats per line separated by TABs.

__init__(**kwd)[source]
file_ext = 'neper.points.tsv'
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c160>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e9d0>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
class galaxy.datatypes.constructive_solid_geometry.NeperTesr(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Neper Raster Tessellation File

***tesr

**format
format
**general
dimension size_x size_y [size_z] voxsize_x voxsize_y [voxsize_z]
[*origin
origin_x origin_y [origin_z]]

[*hasvoid has_void]

[**cell
number_of_cells
__init__(**kwd)[source]
file_ext = 'neper.tesr'
metadata_spec = {'cells': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e220>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e4c0>, 'format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e430>, 'origin': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e550>, 'size': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e100>, 'voxsize': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e820>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Neper tesr format starts with: ***tesr

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.neper.tesr')
>>> NeperTesr().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tess')
>>> NeperTesr().sniff(fname)
False

class galaxy.datatypes.constructive_solid_geometry.NeperTess(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Neper Tessellation File

***tess

**format
format
**general
dim type
**cell
number_of_cells
__init__(**kwd)[source]
file_ext = 'neper.tess'
metadata_spec = {'cells': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e580>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e610>, 'format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e7f0>}
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Neper tess format starts with: ***tess

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.neper.tess')
>>> NeperTess().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tesr')
>>> NeperTess().sniff(fname)
False
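
Both Neper sniffers rely on the marker at the very start of the file, and the two markers differ only in their last letter. A minimal sketch of telling them apart is shown here; the helper name `neper_kind` is hypothetical and the real `sniff_prefix` methods may apply further checks.

```python
from typing import Optional


def neper_kind(prefix: str) -> Optional[str]:
    """Classify a Neper file by its opening marker: 'tesr', 'tess' or None."""
    if prefix.startswith("***tesr"):
        return "tesr"
    if prefix.startswith("***tess"):
        return "tess"
    return None
```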

class galaxy.datatypes.constructive_solid_geometry.Ply(**kwd)[source]

Bases: object

The PLY format describes an object as a collection of vertices, faces and other elements, along with properties such as color and normal direction that can be attached to these elements. A PLY file contains the description of exactly one object.

__init__(**kwd)[source]
display_peek(dataset)[source]
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

The structure of a typical PLY file: Header, Vertex List, Face List, (lists of other elements)

subtype = ''
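
The header mentioned above is plain text even in binary PLY files: it opens with the line `ply`, declares the encoding with a `format` line, announces each element list with `element <name> <count>`, and ends with `end_header`. The sketch below parses just those lines. The function name `parse_ply_header` and the returned dictionary are illustrative assumptions, not the code used by the Ply class.

```python
from typing import Optional


def parse_ply_header(text: str) -> Optional[dict]:
    """Collect the format and element counts from a PLY header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "ply":
        return None
    info = {"file_format": None, "elements": {}}
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "end_header":
            return info
        if fields[0] == "format" and len(fields) >= 2:
            info["file_format"] = fields[1]
        elif fields[0] == "element" and len(fields) == 3:
            info["elements"][fields[1]] = int(fields[2])
    return None  # header was never terminated
```

Property and comment lines are skipped, since only the element counts are of interest here.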
class galaxy.datatypes.constructive_solid_geometry.PlyAscii(**kwd)[source]

Bases: galaxy.datatypes.constructive_solid_geometry.Ply, galaxy.datatypes.data.Text

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.plyascii')
>>> PlyAscii().sniff(fname)
True
>>> fname = get_test_fname('test.vtkascii')
>>> PlyAscii().sniff(fname)
False
__init__(**kwd)[source]
file_ext = 'plyascii'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'face': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd41c0370>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd4b61670>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd412beb0>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd4b61df0>}
subtype = 'ascii'
class galaxy.datatypes.constructive_solid_geometry.PlyBinary(**kwd)[source]

Bases: galaxy.datatypes.constructive_solid_geometry.Ply, galaxy.datatypes.binary.Binary

__init__(**kwd)[source]
file_ext = 'plybinary'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'face': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd412b6d0>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd412b100>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd4118ca0>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd412bbe0>}
subtype = 'binary'
class galaxy.datatypes.constructive_solid_geometry.STL(**kwd)[source]

Bases: galaxy.datatypes.data.Data

file_ext = 'stl'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd786e250>}
class galaxy.datatypes.constructive_solid_geometry.Vtk(**kwd)[source]

Bases: object

The Visualization Toolkit provides a number of source and writer objects to read and write popular data file formats. The Visualization Toolkit also provides some of its own file formats.

There are two different styles of file formats available in VTK. The simplest are the legacy, serial formats that are easy to read and write either by hand or programmatically. However, these formats are less flexible than the XML based file formats which support random access, parallel I/O, and portable data compression and are preferred to the serial VTK file formats whenever possible.

All keyword phrases are written in ASCII form whether the file is binary or ASCII. The binary section of the file (if in binary form) is the data proper; i.e., the numbers that define point coordinates, scalars, cell indices, and so forth.

Binary data must be placed into the file immediately after the newline (‘\n’) character from the previous ASCII keyword and parameter sequence.

TODO: only legacy formats are currently supported and support for XML formats should be added.

__init__(**kwd)[source]
display_peek(dataset)[source]
get_blurb(dataset)[source]
set_initial_metadata(i, line, dataset)[source]
set_meta(dataset, **kwd)[source]
set_peek(dataset)[source]
set_structure_metadata(line, dataset, dataset_type)[source]

The fourth part of legacy VTK files is the dataset structure. The geometry part describes the geometry and topology of the dataset. This part begins with a line containing the keyword DATASET followed by a keyword describing the type of dataset. Then, depending upon the type of dataset, other keyword/ data combinations define the actual data.

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

VTK files can be either ASCII or binary, with two different styles of file formats: legacy or XML. We’ll assume if the file contains a valid VTK header, then it is a valid VTK file.

subtype = ''
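
To make the legacy layout described above concrete, the sketch below reads the first four lines of a legacy VTK file: the version line (`# vtk DataFile Version x.x`), a free-text title, the `ASCII` or `BINARY` keyword, and the `DATASET` line. It assumes these appear on consecutive lines with nothing in between. The function name `parse_vtk_legacy_header` and the returned keys are chosen for illustration and are not Galaxy's implementation.

```python
from typing import Optional

VERSION_MARKER = "# vtk DataFile Version "


def parse_vtk_legacy_header(text: str) -> Optional[dict]:
    """Parse version, encoding and dataset type from a legacy VTK header."""
    lines = text.splitlines()
    if len(lines) < 4 or not lines[0].startswith(VERSION_MARKER):
        return None
    file_format = lines[2].strip()
    if file_format not in ("ASCII", "BINARY"):
        return None
    fields = lines[3].split()
    if len(fields) != 2 or fields[0] != "DATASET":
        return None
    return {
        "vtk_version": lines[0][len(VERSION_MARKER):].strip(),
        "file_format": file_format,
        "dataset_type": fields[1],
    }
```

Since the keyword lines are ASCII in both variants, the same parse applies to binary files as long as only the header portion is decoded.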
class galaxy.datatypes.constructive_solid_geometry.VtkAscii(**kwd)[source]

Bases: galaxy.datatypes.constructive_solid_geometry.Vtk, galaxy.datatypes.data.Text

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.vtkascii')
>>> VtkAscii().sniff(fname)
True
>>> fname = get_test_fname('test.vtkbinary')
>>> VtkAscii().sniff(fname)
False
__init__(**kwd)[source]
file_ext = 'vtkascii'
metadata_spec = {'cells': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928d310>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dataset_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd727e490>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'dimensions': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd9fe9640>, 'field_components': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928d3d0>, 'field_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928d9a0>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd727eb50>, 'lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928d3a0>, 'origin': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd9fe9520>, 'points': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd9fe9160>, 'polygons': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928d670>, 'spacing': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd9fe9c70>, 'triangle_strips': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928d2b0>, 'vertices': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd9fe9e50>, 'vtk_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd727e280>}
subtype = 'ASCII'
class galaxy.datatypes.constructive_solid_geometry.VtkBinary(**kwd)[source]

Bases: galaxy.datatypes.constructive_solid_geometry.Vtk, galaxy.datatypes.binary.Binary

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.vtkbinary')
>>> VtkBinary().sniff(fname)
True
>>> fname = get_test_fname('test.vtkascii')
>>> VtkBinary().sniff(fname)
False
__init__(**kwd)[source]
file_ext = 'vtkbinary'
metadata_spec = {'cells': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607e80>, 'dataset_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928da60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'dimensions': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd7274850>, 'field_components': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607550>, 'field_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607df0>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928d820>, 'lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607760>, 'origin': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607d00>, 'points': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607cd0>, 'polygons': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607ee0>, 'spacing': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607fd0>, 'triangle_strips': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607bb0>, 'vertices': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd3607490>, 'vtk_version': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd928dd30>}
subtype = 'BINARY'
class galaxy.datatypes.constructive_solid_geometry.ZsetGeof(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Z-set geof File

file_ext = 'zset.geof'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd40d6280>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
galaxy.datatypes.constructive_solid_geometry.get_next_line(fh)[source]

galaxy.datatypes.coverage module

Coverage datatypes

class galaxy.datatypes.coverage.LastzCoverage(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

file_ext = 'coverage'
get_track_resolution(dataset, start, end)[source]
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfeb700>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf32a90>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'forwardCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcead640>, 'positionCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfeb730>, 'reverseCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4ca00>}

galaxy.datatypes.data module

class galaxy.datatypes.data.Data(**kwd)[source]

Bases: object

Base class for all datatypes. Implements basic interfaces as well as class methods for metadata.

>>> class DataTest( Data ):
...     MetadataElement( name="test" )
...
>>> DataTest.metadata_spec.test.name
'test'
>>> DataTest.metadata_spec.test.desc
'test'
>>> type( DataTest.metadata_spec.test.param )
<class 'galaxy.model.metadata.MetadataParameter'>
CHUNKABLE = False
__init__(**kwd)[source]

Initialize the datatype

add_composite_file(name, **kwds)[source]
add_display_app(app_id, label, file_function, links_function)[source]

Adds a display app to the datatype. app_id is a unique id; label is the primary display label, e.g., display at ‘UCSC’; file_function is a string containing the name of the function that returns a properly formatted display; links_function is a string containing the name of the function that returns a list of (link_name, link).

add_display_application(display_application)[source]

New style display applications

after_setting_metadata(dataset)[source]

This function is called on the dataset after metadata is set.

allow_datatype_change = None
as_display_type(dataset, type, **kwd)[source]

Returns modified file contents for a particular display type

base_dataprovider(dataset, **settings)[source]
before_setting_metadata(dataset)[source]

This function is called on the dataset before metadata is set.

chunk64_dataprovider(dataset, **settings)[source]
chunk_dataprovider(dataset, **settings)[source]
clear_display_apps()[source]
composite_files = {}
composite_type = None
convert_dataset(trans, original_dataset, target_type, return_output=False, visible=True, deps=None, target_context=None, history=None)[source]

This function adds a job to the queue to convert a dataset to another type. Returns a message about success/failure.

copy_safe_peek = True
data_sources = {}
dataprovider(dataset, data_format, **settings)[source]

Base dataprovider factory for all datatypes that returns the proper provider for the given data_format or raises a NoProviderAvailable.
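
The factory behaviour described above can be sketched without Galaxy installed. This is a minimal stand-in, not Galaxy's implementation: `ToyDatatype`, its two provider functions, and the local `NoProviderAvailable` class are hypothetical names used only for illustration.

```python
class NoProviderAvailable(Exception):
    """Local stand-in for the exception named in the docstring above."""


class ToyDatatype:
    """Hypothetical datatype holding a name -> provider-function registry."""

    def base_provider(self, dataset, **settings):
        # Yield the dataset's items unchanged.
        return iter(dataset)

    def upper_provider(self, dataset, **settings):
        # Yield each item upper-cased.
        return (item.upper() for item in dataset)

    # Plain functions are stored here, so `self` is passed explicitly below.
    dataproviders = {"base": base_provider, "upper": upper_provider}

    def has_dataprovider(self, data_format):
        return data_format in self.dataproviders

    def dataprovider(self, dataset, data_format, **settings):
        # Look the provider up by name, or fail loudly if it is unknown.
        if not self.has_dataprovider(data_format):
            raise NoProviderAvailable(data_format)
        return self.dataproviders[data_format](self, dataset, **settings)
```

Keeping the registry as a class-level dict mirrors the `dataproviders` attribute listed for this class; how Galaxy actually populates that dict is not shown here.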

dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>}
dataset_content_needs_grooming(file_name)[source]

This function is called on an output dataset file after the content is initially generated.

display_as_markdown(dataset_instance, markdown_format_helpers)[source]

Prepare for embedding dataset into a basic Markdown document.

This is a somewhat experimental interface and should not be implemented on datatypes not tightly tied to a Galaxy version (e.g. datatypes in the Tool Shed).

Speaking very loosely, the datatype should load a bounded amount of data from the supplied dataset instance and prepare it for embedding into Markdown. This should be relatively vanilla Markdown: the result is bleached, and it should not contain nested Galaxy Markdown directives.

If the data cannot reasonably be displayed, just indicate this and do not throw an exception.

display_data(trans, data, preview=False, filename=None, to_ext=None, **kwd)[source]

Displays data in central pane if preview is True, else handles download.

Datatypes should be very careful when overriding this method; this interface between datatypes and Galaxy will likely change.

TODO: Document alternatives to overriding this method (data providers?).

display_info(dataset)[source]

Returns formatted html of dataset info

display_name(dataset)[source]

Returns formatted html of dataset name

display_peek(dataset)[source]

Create HTML table, used for displaying peek

edam_data = 'data_0006'
edam_format = 'format_1915'
file_ext = 'data'
find_conversion_destination()[source]

Returns ( direct_match, converted_ext, existing converted dataset )

generate_primary_file(dataset=None)[source]
get_composite_files(dataset=None)[source]
get_converter_types(original_dataset, datatypes_registry)[source]

Returns available converters by type for this dataset

get_display_application(key, default=None)[source]
get_display_applications_by_dataset(dataset, trans)[source]
get_display_label(type)[source]

Returns primary label for display app

Returns a list of tuples of (name, link) for a particular display type. No check on ‘access’ permissions is done here - if you can view the dataset, you can also save it or send it to a destination outside of Galaxy, so Galaxy security restrictions do not apply anyway.

get_display_types()[source]

Returns display types available

get_max_optional_metadata_filesize()[source]
get_mime()[source]

Returns the mime type of the datatype

get_raw_data(dataset)[source]

Returns the full data. To stream it, open the file_name and read/write as needed.

get_visualizations(dataset)[source]

Returns a list of visualizations for datatype.

groom_dataset_content(file_name)[source]

This function is called on an output dataset file if dataset_content_needs_grooming returns True.
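
The two grooming hooks work as a check-then-act pair. The sketch below is an illustration under assumptions, not Galaxy code: `ToyGroomedDatatype`, `finish_output`, and the "lines must be sorted" rule are all hypothetical.

```python
import os
import tempfile


class ToyGroomedDatatype:
    """Hypothetical datatype that wants its output files sorted by line."""

    def dataset_content_needs_grooming(self, file_name):
        # Grooming is needed when the lines are not already sorted.
        with open(file_name) as fh:
            lines = fh.readlines()
        return lines != sorted(lines)

    def groom_dataset_content(self, file_name):
        # Rewrite the file in place with its lines sorted.
        with open(file_name) as fh:
            lines = sorted(fh.readlines())
        with open(file_name, "w") as fh:
            fh.writelines(lines)


def finish_output(datatype, file_name):
    """Assumed call order: check first, groom only when the check says so."""
    if datatype.dataset_content_needs_grooming(file_name):
        datatype.groom_dataset_content(file_name)


def demo():
    # Write an unsorted file, run the hooks, and return the groomed content.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.txt")
        with open(path, "w") as fh:
            fh.write("b\na\n")
        finish_output(ToyGroomedDatatype(), path)
        with open(path) as fh:
            return fh.read()
```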

handle_dataset_as_image(hda) → str[source]
has_dataprovider(data_format)[source]

Returns True if data_format is available in dataproviders.

has_resolution
init_meta(dataset, copy_from=None)[source]
is_binary = True
classmethod is_datatype_change_allowed()[source]

Returns the value of the allow_datatype_change class attribute if set in a subclass, or True iff the datatype is not composite.
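
The rule above can be written in a few lines. This sketch assumes that `None` means "not set in a subclass", which matches the `allow_datatype_change = None` and `composite_type = None` defaults listed for this class; the `Toy*` class names are hypothetical.

```python
class ToyData:
    allow_datatype_change = None  # assumed to mean "not set by a subclass"
    composite_type = None

    @classmethod
    def is_datatype_change_allowed(cls):
        # An explicit setting wins; otherwise only non-composite types may change.
        if cls.allow_datatype_change is not None:
            return cls.allow_datatype_change
        return cls.composite_type is None


class ToyComposite(ToyData):
    composite_type = "auto_primary_file"


class ToyPinned(ToyData):
    allow_datatype_change = False
```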

matches_any(target_datatypes: typing.List[typing.Any]) → bool[source]

Check if this datatype is of any of the target_datatypes or is a subtype thereof.
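
A subtype check of this kind maps naturally onto `isinstance`. The following is a sketch with hypothetical `Toy*` classes; accepting both classes and instances as targets is an assumption made for the example.

```python
from inspect import isclass
from typing import Any, List


class ToyData:
    def matches_any(self, target_datatypes: List[Any]) -> bool:
        # Accept either classes or instances as targets.
        classes = tuple(t if isclass(t) else type(t) for t in target_datatypes)
        # isinstance() with an empty tuple is simply False.
        return isinstance(self, classes)


class ToyText(ToyData):
    pass


class ToyTabular(ToyText):
    pass
```

With these classes, a `ToyTabular` instance matches `ToyText` because it is a subtype, while a `ToyText` instance does not match `ToyTabular`.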

max_optional_metadata_filesize
static merge(split_files, output_file)[source]

Merges files with shutil.copyfileobj(), which does not hit the maximum argument limitation of cat. gz and bz2 files are also supported.
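
A stream-copy merge can be sketched with the standard library's `shutil.copyfileobj`. This is not Galaxy's implementation; `merge_files` and `demo` are hypothetical names. The demo uses gzip parts to show why compressed files can be concatenated byte for byte: a gzip file may contain several members, and Python's `gzip` module reads them as one stream.

```python
import gzip
import os
import shutil
import tempfile


def merge_files(split_files, output_file):
    """Append each input file to output_file, streaming in chunks."""
    if not split_files:
        raise ValueError("asked to merge zero files")
    with open(output_file, "wb") as dst:
        for path in split_files:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, dst)


def demo():
    # Build two small gzip parts, merge them, and read the result back.
    with tempfile.TemporaryDirectory() as tmp:
        parts = []
        for i, text in enumerate([b"first\n", b"second\n"]):
            part = os.path.join(tmp, f"part{i}.gz")
            with gzip.open(part, "wb") as fh:
                fh.write(text)
            parts.append(part)
        merged = os.path.join(tmp, "merged.gz")
        merge_files(parts, merged)
        with gzip.open(merged, "rb") as fh:
            return fh.read()
```

Because the file list is iterated in Python rather than passed on a command line, the number of parts is not limited by the shell's argument length.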

metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
missing_meta(dataset, check=None, skip=None)[source]

Checks for empty metadata values. Returns False if no non-optional metadata is missing and the missing metadata key otherwise. Specifying a list of ‘check’ values will only check those names provided; when used, optionality is ignored. Specifying a list of ‘skip’ items will return True even when a named metadata value is missing; when used, optionality is ignored.

primary_file_name = 'index'
remove_display_app(app_id)[source]

Removes a display app from the datatype

repair_methods(dataset)[source]

Unimplemented method, returns dict with method/option for repairing errors

set_max_optional_metadata_filesize(max_value)[source]
set_meta(dataset: typing.Any, overwrite=True, **kwd)[source]

Unimplemented method, allows guessing of metadata from contents of file

set_peek(dataset)[source]

Set the peek and blurb text

supported_display_apps = {}
to_archive(dataset, name='')[source]

Collect archive paths and file handles that need to be exported when archiving dataset.

Parameters:
  • dataset – HistoryDatasetAssociation
  • name – archive name; in a collection context this corresponds to the collection name(s) and element_identifier, joined by ‘/’, e.g. ‘fastq_collection/sample1/forward’
track_type = None
validate(dataset, **kwd)[source]
writable_files
class galaxy.datatypes.data.DataMeta(name, bases, dict_)[source]

Bases: abc.ABCMeta

Metaclass for Data class. Sets up metadata spec.

__init__(name, bases, dict_)[source]
exception galaxy.datatypes.data.DatatypeConverterNotFoundException[source]

Bases: Exception

class galaxy.datatypes.data.DatatypeValidation(state, message)[source]

Bases: object

__init__(state, message)[source]
static invalid(message)[source]
static unvalidated()[source]
static validated()[source]
class galaxy.datatypes.data.Directory(**kwd)[source]

Bases: galaxy.datatypes.data.Data

Class representing a directory of files.

metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1b20>}
class galaxy.datatypes.data.GenericAsn1(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Class for generic ASN.1 text format

edam_data = 'data_0849'
edam_format = 'format_1966'
file_ext = 'asn1'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1a00>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.data.LineCount(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Dataset contains a single line with a single integer that denotes the line count for a related dataset. Used for custom builds.

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1970>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.data.Newick(**kwd)[source]

Bases: galaxy.datatypes.data.Text

New Hampshire/Newick Format

edam_data = 'data_0872'
edam_format = 'format_1910'
file_ext = 'newick'
get_visualizations(dataset)[source]

Returns a list of visualizations for datatype.

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a18b0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)[source]

Returning false as the newick format is too general and cannot be sniffed.

class galaxy.datatypes.data.Nexus(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Nexus format as used by PAUP, MrBayes, etc.

edam_data = 'data_0872'
edam_format = 'format_1912'
file_ext = 'nex'
get_visualizations(dataset)[source]

Returns a list of visualizations for datatype.

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a17c0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

All Nexus files simply put ‘#NEXUS’ on their first line.
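
A prefix-based sniffer for this rule only needs the first few characters of the file. The sketch below takes the prefix as a plain string instead of Galaxy's `FilePrefix` object; `looks_like_nexus` is a hypothetical helper, and the case-sensitive comparison is an assumption.

```python
def looks_like_nexus(prefix: str) -> bool:
    """Return True when the text starts with the '#NEXUS' marker."""
    # Assumption: the marker is matched case-sensitively at offset 0.
    return prefix.startswith("#NEXUS")
```

Newick, by contrast, has no such marker, which is why its `sniff` simply returns False.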

class galaxy.datatypes.data.Text(**kwd)[source]

Bases: galaxy.datatypes.data.Data

count_data_lines(dataset)[source]

Count the number of lines of data in dataset, skipping all blank lines and comments.
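
Counting data lines amounts to a single filtered pass. In this sketch the function works on any iterable of lines, and the `#` comment prefix is an assumption; neither detail is taken from Galaxy's implementation.

```python
def count_data_lines(lines, comment_char="#"):
    """Count lines that are neither blank nor comments."""
    count = 0
    for line in lines:
        stripped = line.strip()
        # Skip whitespace-only lines and lines starting with the comment prefix.
        if stripped and not stripped.startswith(comment_char):
            count += 1
    return count
```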

dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'line': <function Text.line_dataprovider at 0x7f6fdd1a7b80>, 'regex-line': <function Text.regex_line_dataprovider at 0x7f6fdd1a7d30>}
edam_format = 'format_2330'
estimate_file_lines(dataset)[source]

Perform a rough estimate by extrapolating number of lines from a small read.
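
The extrapolation idea is: count newlines in a small sample, then scale by the ratio of file size to sample size. The sketch below is an illustration only; `estimate_lines`, its `sample_size` parameter, and the `None` return for files without newlines are assumptions.

```python
import os
import tempfile


def estimate_lines(path, sample_size=1024):
    """Estimate the line count of a file from its first sample_size bytes."""
    total = os.path.getsize(path)
    with open(path, "rb") as fh:
        sample = fh.read(sample_size)
    newlines = sample.count(b"\n")
    if newlines == 0:
        return None  # nothing to extrapolate from
    # Scale the sampled newline density up to the whole file.
    return int(newlines * total / len(sample))


def demo():
    # 1000 lines of exactly 10 bytes each, sampled over the first 1000 bytes.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.txt")
        with open(path, "wb") as fh:
            fh.write(b"abcdefghi\n" * 1000)
        return estimate_lines(path, sample_size=1000)
```

The estimate is exact here only because every line has the same length; files with uneven line lengths give an approximation.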

file_ext = 'txt'
get_mime()[source]

Returns the mime type of the datatype

is_binary = False
line_class = 'line'
line_dataprovider(dataset, **settings)[source]

Returns an iterator over the dataset’s lines (that have been stripped) optionally excluding blank lines and lines that start with a comment character.

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
regex_line_dataprovider(dataset, **settings)[source]

Returns an iterator over the dataset’s lines optionally including/excluding lines that match one or more regex filters.
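
Both line-based providers can be pictured as chained generators. The sketch below uses only the standard library; the function names and the `provide_blank`, `comment_char`, `regex_list`, and `invert` parameters are assumptions chosen for the example and may not match Galaxy's actual settings.

```python
import re


def iter_lines(lines, provide_blank=False, comment_char=None):
    """Yield stripped lines, optionally dropping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line and not provide_blank:
            continue
        if comment_char and line.startswith(comment_char):
            continue
        yield line


def iter_regex_lines(lines, regex_list=None, invert=False, **settings):
    """Yield lines matching any regex, or not matching when invert is True."""
    patterns = [re.compile(r) for r in (regex_list or [])]
    for line in iter_lines(lines, **settings):
        matched = any(p.search(line) for p in patterns)
        # With no patterns every line passes; otherwise apply the filter.
        if not patterns or matched != invert:
            yield line
```

Layering the regex filter on top of the plain line iterator keeps the blank and comment handling in one place.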

set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

set_peek(dataset, line_count=None, WIDTH=256, skipchars=None, line_wrap=True, **kwd)[source]

Set the peek. This method is used by various subclasses of Text.

classmethod split(input_datasets, subdir_generator_function, split_params)[source]

Split the input files by line.
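
Splitting by line can be sketched as chunking an iterable of lines. This is not the `split` classmethod itself, whose `subdir_generator_function` and `split_params` arguments are not described here; `split_lines` and `chunk_size` are hypothetical.

```python
def split_lines(lines, chunk_size):
    """Yield consecutive lists of at most chunk_size lines."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    # Emit the final, possibly shorter, chunk.
    if chunk:
        yield chunk
```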

galaxy.datatypes.data.get_file_peek(file_name, WIDTH=256, LINE_COUNT=5, skipchars=None, line_wrap=True)[source]

Returns the first LINE_COUNT lines wrapped to WIDTH.

>>> def assert_peek_is(file_name, expected, *args, **kwd):
...     path = get_test_fname(file_name)
...     peek = get_file_peek(path, *args, **kwd)
...     assert peek == expected, "%s != %s" % (peek, expected)
>>> assert_peek_is('0_nonewline', u'0')
>>> assert_peek_is('0.txt', u'0\n')
>>> assert_peek_is('4.bed', u'chr22\t30128507\t31828507\tuc003bnx.1_cds_2_0_chr22_29227_f\t0\t+\n', LINE_COUNT=1)
>>> assert_peek_is('1.bed', u'chr1\t147962192\t147962580\tCCDS989.1_cds_0_0_chr1_147962193_r\t0\t-\nchr1\t147984545\t147984630\tCCDS990.1_cds_0_0_chr1_147984546_f\t0\t+\n', LINE_COUNT=2)
galaxy.datatypes.data.get_params_and_input_name(converter, deps, target_context=None)[source]
galaxy.datatypes.data.get_test_fname(fname)[source]

Returns test data filename

galaxy.datatypes.data.validate(dataset_instance)[source]

galaxy.datatypes.genetics module

rgenetics datatypes. Use at your peril. Ross Lazarus, for the rgenetics and Galaxy projects.

Genome graphs datatypes are derived from Interval datatypes. Genome graphs datasets have a header row with appropriate column names. The first column is always the marker, e.g. column name = rs, first row = rs12345 if the rows are SNPs. Subsequent row values are all numeric! Will fail if there are any non-numeric values (e.g. ‘+’ or ‘NA’). Ross Lazarus, for rgenetics, August 20 2007.

class galaxy.datatypes.genetics.Affybatch(**kwd)[source]

Bases: galaxy.datatypes.genetics.RexpBase

derived class for BioC data structures in Galaxy

__init__(**kwd)[source]
file_ext = 'affybatch'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4eb20>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4e970>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ea00>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4e880>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ea60>}
class galaxy.datatypes.genetics.AllegroLOD(**kwd)[source]

Bases: galaxy.datatypes.genetics.LinkageStudies

Allegro output format for LOD scores

file_ext = 'allegro_fparam'
header_check(fio)[source]
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2bfb4f0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
>>> classname = AllegroLOD
>>> from galaxy.datatypes.sniff import get_test_fname
>>> extn_true = classname().file_ext
>>> file_true = get_test_fname("linkstudies." + extn_true)
>>> classname().sniff(file_true)
True
>>> false_files = list(LinkageStudies.test_files)
>>> false_files.remove("linkstudies." + extn_true)
>>> result_true = []
>>> for fname in false_files:
...     file_false = get_test_fname(fname)
...     res = classname().sniff(file_false)
...     if res:
...         result_true.append(fname)
>>>
>>> result_true
[]
class galaxy.datatypes.genetics.DataIn(**kwd)[source]

Bases: galaxy.datatypes.genetics.LinkageStudies

Common linkage input file for intermarker distances and recombination rates

__init__(**kwd)[source]
file_ext = 'linkage_datain'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2bfb400>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
>>> classname = DataIn
>>> from galaxy.datatypes.sniff import get_test_fname
>>> extn_true = classname().file_ext
>>> file_true = get_test_fname("linkstudies." + extn_true)
>>> classname().sniff(file_true)
True
>>> false_files = list(LinkageStudies.test_files)
>>> false_files.remove("linkstudies." + extn_true)
>>> result_true = []
>>> for fname in false_files:
...     file_false = get_test_fname(fname)
...     res = classname().sniff(file_false)
...     if res:
...         result_true.append(fname)
>>>
>>> result_true
[]
class galaxy.datatypes.genetics.Eigenstratgeno(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

Eigenstrat format - may be able to get rid of this if we move to shellfish. Rgenetics data collections.

__init__(**kwd)[source]
file_ext = 'eigenstratgeno'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d100>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.Eigenstratpca(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

Eigenstrat PCA file for case control adjustment. Rgenetics data collections.

__init__(**kwd)[source]
file_ext = 'eigenstratpca'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d1c0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.Eset(**kwd)[source]

Bases: galaxy.datatypes.genetics.RexpBase

derived class for BioC data structures in Galaxy

__init__(**kwd)[source]
file_ext = 'eset'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ed90>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ec70>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ebe0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ed00>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ee20>}
class galaxy.datatypes.genetics.Fped(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

FBAT pedigree format - single file, map is header row of rs numbers. Strange. Rgenetics data collections

__init__(**kwd)[source]
file_ext = 'fped'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94e80>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.Fphe(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

FBAT pedigree file - mad format with ! as first char on header row. Rgenetics data collections.

__init__(**kwd)[source]
file_ext = 'fphe'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94d00>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.GenomeGraphs(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

Tab delimited data containing a marker id and any number of numeric values

__init__(**kwd)[source]

Initialize gg datatype, by adding UCSC display apps

as_ucsc_display_file(dataset, **kwd)[source]

Returns file

file_ext = 'gg'
get_mime()[source]

Returns the mime type of the datatype

make_html_table(dataset)[source]

Create HTML table, used for displaying peek

metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5c70>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5c40>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'markerCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5be0>}
set_meta(dataset, **kwd)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is in gg format

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'test_space.txt' )
>>> GenomeGraphs().sniff( fname )
False
>>> fname = get_test_fname( '1.gg' )
>>> GenomeGraphs().sniff( fname )
True

From the ever-helpful Angie Hinrichs (angie@soe.ucsc.edu): a genome graphs call looks like this:

http://genome.ucsc.edu/cgi-bin/hgGenome?clade=mammal&org=Human&db=hg18&hgGenome_dataSetName=dname &hgGenome_dataSetDescription=test&hgGenome_formatType=best%20guess&hgGenome_markerType=best%20guess &hgGenome_columnLabels=best%20guess&hgGenome_maxVal=&hgGenome_labelVals= &hgGenome_maxGapToFill=25000000&hgGenome_uploadFile=http://galaxy.esphealth.org/datasets/333/display/index &hgGenome_doSubmitUpload=submit

Galaxy gives this for an interval file:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&position=chr1:1-1000&hgt.customText= http%3A%2F%2Fgalaxy.esphealth.org%2Fdisplay_as%3Fid%3D339%26display_app%3Ducsc

validate(dataset, **kwd)[source]

Validate a gg file - all numeric after header row
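
The "all numeric after header row" rule can be checked cell by cell. The sketch below works on rows that are already split into fields; `non_numeric_cells` is a hypothetical helper and says nothing about what Galaxy's `validate` returns.

```python
def non_numeric_cells(rows):
    """Return (row_number, value) pairs for non-numeric data cells.

    rows[0] is the header and column 0 is the marker id; both are skipped.
    Row numbers are 1-based, so the first data row is row 2.
    """
    bad = []
    for row_number, row in enumerate(rows[1:], start=2):
        for value in row[1:]:
            try:
                # Note: float() also accepts strings such as 'nan' and 'inf'.
                float(value)
            except ValueError:
                bad.append((row_number, value))
    return bad
```

The values ‘+’ and ‘NA’ mentioned in the module description both fail `float()`, so they are reported.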

class galaxy.datatypes.genetics.GenotypeMatrix(**kwd)[source]

Bases: galaxy.datatypes.genetics.LinkageStudies

Sample matrix of genotypes - GTs as columns

__init__(**kwd)[source]
file_ext = 'alohomora_gts'
header_check(fio)[source]
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2bfb250>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
>>> classname = GenotypeMatrix
>>> from galaxy.datatypes.sniff import get_test_fname
>>> extn_true = classname().file_ext
>>> file_true = get_test_fname("linkstudies." + extn_true)
>>> classname().sniff(file_true)
True
>>> false_files = list(LinkageStudies.test_files)
>>> false_files.remove("linkstudies." + extn_true)
>>> result_true = []
>>> for fname in false_files:
...     file_false = get_test_fname(fname)
...     res = classname().sniff(file_false)
...     if res:
...         result_true.append(fname)
>>>
>>> result_true
[]
class galaxy.datatypes.genetics.IdeasPre(**kwd)[source]

Bases: galaxy.datatypes.text.Html

This datatype defines the input format required by IDEAS: https://academic.oup.com/nar/article/44/14/6721/2468150

The IDEAS preprocessor tool produces an output using this format. The extra_files_path of the primary input dataset contains the following files and directories:

  • chromosome_windows.txt (optional)
  • chromosomes.bed (optional)
  • IDEAS_input_config.txt
  • compressed archived tmp directory containing a number of compressed bed files.

__init__(**kwd)[source]
composite_type = 'auto_primary_file'
file_ext = 'ideaspre'
generate_primary_file(dataset=None)[source]
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d310>, 'chrom_bed': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d3a0>, 'chrom_windows': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d430>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'input_config': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d4c0>, 'tmp_archive': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d550>}
regenerate_primary_file(dataset)[source]
set_meta(dataset, **kwd)[source]
class galaxy.datatypes.genetics.LinkageStudies(**kwd)[source]

Bases: galaxy.datatypes.data.Text

superclass for classical linkage analysis suites

__init__(**kwd)[source]
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2bfb190>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
test_files = ['linkstudies.allegro_fparam', 'linkstudies.alohomora_gts', 'linkstudies.linkage_datain', 'linkstudies.linkage_map']
class galaxy.datatypes.genetics.Lped(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

Linkage pedigree (ped, map). Rgenetics data collections.

__init__(**kwd)[source]
file_ext = 'lped'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94b80>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.MAlist(**kwd)[source]

Bases: galaxy.datatypes.genetics.RexpBase

derived class for BioC data structures in Galaxy

__init__(**kwd)[source]
file_ext = 'malist'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2bfb040>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4ef70>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4eee0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4e7c0>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2bfb0d0>}
class galaxy.datatypes.genetics.MarkerMap(**kwd)[source]

Bases: galaxy.datatypes.genetics.LinkageStudies

Map of genetic markers including physical and genetic distance. Common input format for linkage programs.

chrom, genetic pos, markername, physical pos, Nr

file_ext = 'linkage_map'
header_check(fio)[source]
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2bfb2e0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
>>> classname = MarkerMap
>>> from galaxy.datatypes.sniff import get_test_fname
>>> extn_true = classname().file_ext
>>> file_true = get_test_fname("linkstudies." + extn_true)
>>> classname().sniff(file_true)
True
>>> false_files = list(LinkageStudies.test_files)
>>> false_files.remove("linkstudies." + extn_true)
>>> result_true = []
>>> for fname in false_files:
...     file_false = get_test_fname(fname)
...     res = classname().sniff(file_false)
...     if res:
...         result_true.append(fname)
>>>
>>> result_true
[]
class galaxy.datatypes.genetics.Pbed(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

Plink binary compressed 2bit/geno. Rgenetics data collections.

__init__(**kwd)[source]
file_ext = 'pbed'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94f40>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.Phe(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

Phenotype file

__init__(**kwd)[source]
file_ext = 'phe'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94dc0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.Pheno(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

base class for pheno files

file_ext = 'pheno'
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d8e0>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d640>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d730>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d670>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d7c0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d820>}
class galaxy.datatypes.genetics.Pphe(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

Plink phenotype file - header must have FID IID… Rgenetics data collections

__init__(**kwd)[source]
file_ext = 'pphe'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94c40>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.RexpBase(**kwd)[source]

Bases: galaxy.datatypes.text.Html

Base class for BioC data structures in Galaxy. Must be constructed with the pheno data in place, since that goes into the metadata for each instance.

__init__(**kwd)[source]
composite_type = 'auto_primary_file'
display_peek(dataset)[source]

Returns formatted html of peek

file_ext = 'rexpbase'
generate_primary_file(dataset=None)[source]

This is called only at upload to write the html file. Cannot rename the datasets here - they come with the default, unfortunately.

get_file_peek(filename)[source]

can’t really peek at a filename - need the extra_files_path and such?

get_mime()[source]

Returns the mime type of the datatype

get_peek(dataset)[source]

expects a .pheno file in the extra_files_dir - ugh

get_phecols(phenolist, maxConc=20)[source]

sept 2009: cannot use whitespace to split - make a more complex structure here and adjust the methods that rely on this structure.

Returns interesting phenotype column names for an rexpression eset or affybatch to use in array subsetting and so on. Returns a data structure for a dynamic Galaxy select parameter. A column with only 1 value doesn’t change, so is not interesting for analysis. A column with a different value in every row is equivalent to a unique identifier, so is also not interesting for anova or limma analysis. Both of these are removed after the concordance (count of unique terms) is constructed for each column. Then a complication: each remaining pair of columns is tested for redundancy. If two columns are always paired, then only one is needed :)
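
The first filtering step described above (drop constant columns and identifier-like columns) can be sketched as follows. `interesting_columns` is a hypothetical helper; it omits the pairwise redundancy test and the `maxConc` limit, and it does not build the select-parameter structure the method returns.

```python
def interesting_columns(header, rows):
    """Return names of columns with more than one value but fewer than one per row."""
    keep = []
    for index, name in enumerate(header):
        # Concordance here is the number of distinct values in the column.
        distinct = {row[index] for row in rows}
        if 1 < len(distinct) < len(rows):
            keep.append(name)
    return keep
```

For a table with columns `id`, `group`, and `batch`, where `id` is unique per row and `batch` is constant, only `group` survives.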

get_pheno(dataset)[source]

Expects a .pheno file in the extra_files_dir - ugh. Note that R is weird and adds the row.name in the header, so the columns are all wrong unless you tell it not to. A file can be written as write.table(file='foo.pheno',pData(foo),sep=' ',quote=F,row.names=F)

html_table = None
init_meta(dataset, copy_from=None)[source]
make_html_table(pp='nothing supplied from peek\n')[source]

Create HTML table, used for displaying peek

metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9db50>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9da30>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d9a0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9dac0>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b4e820>}
regenerate_primary_file(dataset)[source]

cannot do this until we are setting metadata

set_meta(dataset, **kwd)[source]

NOTE: we apply the tabular machinery to the phenodata extracted from a BioC eSet or affybatch.

set_peek(dataset, **kwd)[source]

Expects a .pheno file in the extra_files_dir - ugh. Note that R is weird and does not include the row.name in the header. Why?

class galaxy.datatypes.genetics.Rgenetics(**kwd)[source]

Bases: galaxy.datatypes.text.Html

Base class to use for rgenetics datatypes. Derived from html - composite datatype elements are stored in the extra files path.

composite_type = 'auto_primary_file'
file_ext = 'rgenetics'
generate_primary_file(dataset=None)[source]
get_mime()[source]

Returns the mime type of the datatype

metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94a30>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
regenerate_primary_file(dataset)[source]

cannot do this until we are setting metadata

set_meta(dataset, **kwd)[source]

for lped/pbed eg

class galaxy.datatypes.genetics.SNPMatrix(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

BioC SNPMatrix Rgenetics data collections

file_ext = 'snpmatrix'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94ac0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
set_peek(dataset, **kwd)[source]
sniff(filename)[source]

need to check the file header hex code

class galaxy.datatypes.genetics.Snptest(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

BioC snptest Rgenetics data collections

file_ext = 'snptest'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d250>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.ldIndep(**kwd)[source]

Bases: galaxy.datatypes.genetics.Rgenetics

LD (a good measure of redundancy of information) depleted Plink Binary compressed 2bit/geno. This is really a plink binary, but some tools work better with less redundancy, so they are constrained to these files.

__init__(**kwd)[source]
file_ext = 'ldreduced'
metadata_spec = {'base_name': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b9d040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcfc1ca0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.genetics.rgFeatureList(**kwd)[source]

Bases: galaxy.datatypes.genetics.rgTabList

For featureid lists of exclusions or inclusions in the clean tool output from QC, e.g. low maf, high missingness, bad hwe in controls, excess mendel errors, ... Featureid subsets on statistical criteria -> specialized display such as gg. Same infrastructure for expression?

__init__(**kwd)[source]

Initialize featurelist datatype

file_ext = 'rgFList'
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b948e0>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94850>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b947c0>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b946a0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94730>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94970>}
class galaxy.datatypes.genetics.rgSampleList(**kwd)[source]

Bases: galaxy.datatypes.genetics.rgTabList

For sampleid exclusions or inclusions in the clean tool output from QC, e.g. excess het, gender error, ibd pair member, eigen outlier, excess mendel errors, ... Since they can be uploaded, they should be flexible, but they are persistent at least. Same infrastructure for expression?

__init__(**kwd)[source]

Initialize samplelist datatype

file_ext = 'rgSList'
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94550>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b944c0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94460>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94340>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b94400>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2b945e0>}
class galaxy.datatypes.genetics.rgTabList(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

for sampleid and for featureid lists of exclusions or inclusions in the clean tool featureid subsets on statistical criteria -> specialized display such as gg

__init__(**kwd)[source]

Initialize featurelist datatype

display_peek(dataset)[source]

Returns formatted HTML of peek

file_ext = 'rgTList'
get_mime()[source]

Returns the mime type of the datatype

metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5eb0>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5e20>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5d90>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5ca0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5fa0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2cc5f40>}

galaxy.datatypes.gis module

GIS classes

class galaxy.datatypes.gis.Shapefile(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

The Shapefile data format: For more information please see http://en.wikipedia.org/wiki/Shapefile

__init__(**kwd)[source]
composite_type = 'auto_primary_file'
display_peek(dataset)[source]

Create HTML content, used for displaying peek.

file_ext = 'shp'
generate_primary_file(dataset=None)[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd7e49370>}
set_peek(dataset)[source]

Set the peek and blurb text.

galaxy.datatypes.graph module

Graph content classes.

class galaxy.datatypes.graph.Xgmml(**kwd)[source]

Bases: galaxy.datatypes.xml.GenericXml

XGMML graph format (http://wiki.cytoscape.org/Cytoscape_User_Manual/Network_Formats).

file_ext = 'xgmml'
set_peek(dataset)[source]

Set the peek and blurb text

sniff(filename)[source]

Returns false and the user must manually set.

static merge(split_files, output_file)[source]

Merging multiple XML files is non-trivial and must be done in subclasses.

node_edge_dataprovider(dataset, **settings)[source]
dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'line': <function Text.line_dataprovider at 0x7f6fdd1a7b80>, 'node-edge': <function Xgmml.node_edge_dataprovider at 0x7f6fd31c54c0>, 'regex-line': <function Text.regex_line_dataprovider at 0x7f6fdd1a7d30>, 'xml': <function GenericXml.xml_dataprovider at 0x7f6fdce92ee0>}
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd72b8460>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.graph.Sif(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

SIF graph format (http://wiki.cytoscape.org/Cytoscape_User_Manual/Network_Formats).

First column: node id
Second column: relationship type
Third to Nth column: target ids for link

file_ext = 'sif'
set_peek(dataset)[source]

Set the peek and blurb text

sniff(filename)[source]

Returns false and the user must manually set.

static merge(split_files, output_file)[source]
node_edge_dataprovider(dataset, **settings)[source]
dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'column': <function TabularData.column_dataprovider at 0x7f6fdcfad8b0>, 'dataset-column': <function TabularData.dataset_column_dataprovider at 0x7f6fdcfada60>, 'dataset-dict': <function TabularData.dataset_dict_dataprovider at 0x7f6fdcfaddc0>, 'dict': <function TabularData.dict_dataprovider at 0x7f6fdcfadc10>, 'line': <function Text.line_dataprovider at 0x7f6fdd1a7b80>, 'node-edge': <function Sif.node_edge_dataprovider at 0x7f6fd31c5820>, 'regex-line': <function Text.regex_line_dataprovider at 0x7f6fdd1a7d30>}
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd293ffd0>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd293fbe0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2a30580>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2a30460>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2a30fd0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd293f820>}
class galaxy.datatypes.graph.XGMMLGraphDataProvider(source, selector=None, max_depth=None, **kwargs)[source]

Bases: galaxy.datatypes.dataproviders.hierarchy.XMLDataProvider

Provide two lists: nodes, edges:

'nodes': contains objects of the form:
    { 'id' : <some string id>, 'data': <any extra data> }
'edges': contains objects of the form:
    { 'source' : <an index into nodes>, 'target': <an index into nodes>, 'data': <any extra data> }
settings = {'limit': 'int', 'max_depth': 'int', 'offset': 'int', 'selector': 'str'}
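
As a rough illustration of the nodes/edges structure described above, the following sketch builds it from a small XGMML string using only the standard library. It is not the XGMMLGraphDataProvider implementation: the helper name `xgmml_to_graph`, the sample document, and the choice to store the remaining XML attributes under `'data'` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed sample input, not taken from Galaxy's test data.
SAMPLE = """<graph label="demo" xmlns="http://www.cs.rpi.edu/XGMML">
  <node id="a" label="A"/>
  <node id="b" label="B"/>
  <edge source="a" target="b" label="a-b"/>
</graph>"""


def _local(tag):
    # Drop any "{namespace}" prefix from an element tag.
    return tag.rsplit("}", 1)[-1]


def xgmml_to_graph(xml_text):
    """Hypothetical helper: return {'nodes': [...], 'edges': [...]}."""
    root = ET.fromstring(xml_text)
    nodes, edges, index_of = [], [], {}
    # First pass: collect nodes and remember each node's list index.
    for elem in root.iter():
        if _local(elem.tag) == "node":
            index_of[elem.get("id")] = len(nodes)
            data = {k: v for k, v in elem.attrib.items() if k != "id"}
            nodes.append({"id": elem.get("id"), "data": data})
    # Second pass: edges refer to nodes by index, not by id.
    for elem in root.iter():
        if _local(elem.tag) == "edge":
            data = {k: v for k, v in elem.attrib.items()
                    if k not in ("source", "target")}
            edges.append({
                "source": index_of[elem.get("source")],
                "target": index_of[elem.get("target")],
                "data": data,
            })
    return {"nodes": nodes, "edges": edges}


graph = xgmml_to_graph(SAMPLE)
```

An edge that names an unknown node id raises a KeyError in this sketch; a real provider would need to decide how to handle that case.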
class galaxy.datatypes.graph.SIFGraphDataProvider(source, indeces=None, column_count=None, column_types=None, parsers=None, parse_columns=True, deliminator='\t', filters=None, **kwargs)[source]

Bases: galaxy.datatypes.dataproviders.column.ColumnarDataProvider

Provide two lists: nodes, edges:

'nodes': contains objects of the form:
    { 'id' : <some string id>, 'data': <any extra data> }
'edges': contains objects of the form:
    { 'source' : <an index into nodes>, 'target': <an index into nodes>, 'data': <any extra data> }
settings = {'column_count': 'int', 'column_types': 'list:str', 'comment_char': 'str', 'deliminator': 'str', 'filters': 'list:str', 'indeces': 'list:int', 'invert': 'bool', 'limit': 'int', 'offset': 'int', 'parse_columns': 'bool', 'provide_blank': 'bool', 'regex_list': 'list:escaped', 'strip_lines': 'bool', 'strip_newlines': 'bool'}
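
The sketch below shows one way the SIF column layout (node id, relationship type, target ids) could be turned into the nodes/edges structure described above. It is not the SIFGraphDataProvider code: the helper name `sif_to_graph`, the tab delimiter, and storing the relationship type under `'data'` are assumptions for illustration.

```python
def sif_to_graph(lines, delimiter="\t"):
    """Hypothetical helper: build {'nodes': [...], 'edges': [...]} from SIF lines."""
    nodes, edges, index_of = [], [], {}

    def node_index(node_id):
        # Register a node the first time it is seen and return its list index.
        if node_id not in index_of:
            index_of[node_id] = len(nodes)
            nodes.append({"id": node_id, "data": {}})
        return index_of[node_id]

    for line in lines:
        fields = line.rstrip("\r\n").split(delimiter)
        if not fields[0]:
            continue  # skip blank lines
        source = node_index(fields[0])
        if len(fields) < 3:
            continue  # a lone node with no relationship or targets
        relation = fields[1]
        # Third to Nth columns are all targets of the same relationship.
        for target_id in fields[2:]:
            edges.append({
                "source": source,
                "target": node_index(target_id),
                "data": {"type": relation},
            })
    return {"nodes": nodes, "edges": edges}


graph = sif_to_graph(["a\tpp\tb\tc", "d"])
```

Here `graph["edges"]` holds two edges from node index 0, and the isolated node `d` appears only in `graph["nodes"]`.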

galaxy.datatypes.images module

Image classes

class galaxy.datatypes.images.Analyze75(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Mayo Analyze 7.5 files http://www.imzml.org

__init__(**kwd)[source]
composite_type = 'auto_primary_file'
file_ext = 'analyze75'
generate_primary_file(dataset=None)[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8bc70>}
class galaxy.datatypes.images.Bmp(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3592'
file_ext = 'bmp'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23b20>}
class galaxy.datatypes.images.Eps(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3466'
file_ext = 'eps'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8b490>}
class galaxy.datatypes.images.Gif(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3467'
file_ext = 'gif'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23a60>}
class galaxy.datatypes.images.Gifti(**kwd)[source]

Bases: galaxy.datatypes.xml.GenericXml

Class describing a Gifti format

file_ext = 'gii'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8ba00>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is a Gifti file

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('Human.colin.R.activations.label.gii')
>>> Gifti().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Gifti().sniff(fname)
False
>>> fname = get_test_fname('megablast_xml_parser_test1.blastxml')
>>> Gifti().sniff(fname)
False
>>> fname = get_test_fname('tblastn_four_human_vs_rhodopsin.blastxml')
>>> Gifti().sniff(fname)
False
class galaxy.datatypes.images.Gmaj(**kwd)[source]

Bases: galaxy.datatypes.data.Data

Class describing a GMAJ Applet

copy_safe_peek = False
display_peek(dataset)[source]
edam_format = 'format_3547'
file_ext = 'gmaj.zip'
get_mime()[source]

Returns the mime type of the datatype

metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8bd60>}
set_peek(dataset)[source]
sniff(filename)[source]

NOTE: the sniff.convert_newlines() call in the upload utility will keep Gmaj data types from being correctly sniffed, but the files can be uploaded (they’ll be sniffed as ‘txt’). This sniff function is here to provide an example of a sniffer for a zip file.

class galaxy.datatypes.images.Hamamatsu(**kwd)[source]

Bases: galaxy.datatypes.images.Image

file_ext = 'vms'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23190>}
class galaxy.datatypes.images.Html(**kwd)[source]

Bases: galaxy.datatypes.text.Html

Deprecated class. Use galaxy.datatypes.text:Html instead; this class is kept for backwards compatibility only.

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8b700>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
class galaxy.datatypes.images.Im(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3593'
file_ext = 'im'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce239a0>}
class galaxy.datatypes.images.Image(**kwd)[source]

Bases: galaxy.datatypes.data.Data

Class describing an image

__init__(**kwd)[source]
edam_data = 'data_2968'
edam_format = 'format_3547'
file_ext = ''
handle_dataset_as_image(hda)[source]
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce2f2b0>}
set_peek(dataset)[source]
sniff(filename)[source]

Determine if the file is in this format

class galaxy.datatypes.images.Jpg(**kwd)[source]

Bases: galaxy.datatypes.images.Image

__init__(**kwd)[source]
edam_format = 'format_3579'
file_ext = 'jpg'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce2ffd0>}
class galaxy.datatypes.images.Laj(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Class describing a LAJ Applet

copy_safe_peek = False
display_peek(dataset)[source]
file_ext = 'laj'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8b640>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
set_peek(dataset)[source]
class galaxy.datatypes.images.Mirax(**kwd)[source]

Bases: galaxy.datatypes.images.Image

file_ext = 'mrxs'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23250>}
class galaxy.datatypes.images.Mrc2014(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

MRC/CCP4 2014 file format (.mrc). https://www.ccpem.ac.uk/mrc_format/mrc2014.php

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('1.mrc')
>>> Mrc2014().sniff(fname)
True
>>> fname = get_test_fname('2.txt')
>>> Mrc2014().sniff(fname)
False
file_ext = 'mrc'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8be20>}
sniff(filename)[source]
class galaxy.datatypes.images.Nifti1(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Nifti1 format https://nifti.nimh.nih.gov/pub/dist/src/niftilib/nifti1.h

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('T1_top_350bytes.nii1')
>>> Nifti1().sniff( fname )
True
>>> fname = get_test_fname('2.txt')
>>> Nifti1().sniff( fname )
False
file_ext = 'nii1'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8bb80>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
class galaxy.datatypes.images.Nifti2(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Nifti2 format https://brainder.org/2015/04/03/the-nifti-2-file-format/

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('avg152T1_LR_nifti2_top_100bytes.nii2')
>>> Nifti2().sniff( fname )
True
>>> fname = get_test_fname('T1_top_350bytes.nii1')
>>> Nifti2().sniff( fname )
False
file_ext = 'nii2'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8bac0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
class galaxy.datatypes.images.Nrrd(**kwd)[source]

Bases: galaxy.datatypes.images.Image

file_ext = 'nrrd'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23be0>}
class galaxy.datatypes.images.OMETiff(**kwd)[source]

Bases: galaxy.datatypes.images.Tiff

file_ext = 'ome.tiff'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23070>, 'offsets': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23100>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.images.Pbm(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3601'
file_ext = 'pbm'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce233a0>}
class galaxy.datatypes.images.Pcd(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3594'
file_ext = 'pcd'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce238e0>}
class galaxy.datatypes.images.Pcx(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3595'
file_ext = 'pcx'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23820>}
class galaxy.datatypes.images.Pdf(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3508'
file_ext = 'pdf'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8b910>}
sniff(filename)[source]

Determine if the file is in pdf format.
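
A minimal sketch of such a check, assuming it is enough to look for the `%PDF-` marker that PDF files begin with. The helper name `has_pdf_header` is hypothetical, and Galaxy's own sniffer may test more or differently.

```python
import os
import tempfile


def has_pdf_header(path):
    """Hypothetical check: True if the file begins with the '%PDF-' marker."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


# Usage with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    pdf_path = os.path.join(tmp, "a.pdf")
    txt_path = os.path.join(tmp, "a.txt")
    with open(pdf_path, "wb") as fh:
        fh.write(b"%PDF-1.7\n")
    with open(txt_path, "wb") as fh:
        fh.write(b"hello\n")
    pdf_result = has_pdf_header(pdf_path)
    txt_result = has_pdf_header(txt_path)
```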

class galaxy.datatypes.images.Pgm(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3602'
file_ext = 'pgm'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8b3d0>}
class galaxy.datatypes.images.Png(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3603'
file_ext = 'png'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23eb0>}
class galaxy.datatypes.images.Ppm(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3596'
file_ext = 'ppm'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23760>}
class galaxy.datatypes.images.Psd(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3597'
file_ext = 'psd'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce236a0>}
class galaxy.datatypes.images.Rast(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3605'
file_ext = 'rast'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8b850>}
class galaxy.datatypes.images.Rgb(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3600'
file_ext = 'rgb'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23460>}
class galaxy.datatypes.images.Sakura(**kwd)[source]

Bases: galaxy.datatypes.images.Image

file_ext = 'svslide'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23310>}
class galaxy.datatypes.images.Star(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Base format class for Relion STAR (Self-defining Text Archiving and Retrieval) image files. https://relion.readthedocs.io/en/latest/Reference/Conventions.html

file_ext = 'star'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8b7c0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>}
set_peek(dataset)[source]

Set the peek and blurb text

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Each file must have one or more data blocks. The start of a data block is defined by the keyword “data_” followed by an optional string for identification (e.g., “data_images”). All text before the first “data_” keyword is treated as comments.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('1.star')
>>> Star().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Star().sniff(fname)
False
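
The data-block rule above can be sketched as a simple line scan. This is an assumption-laden illustration, not the Star sniffer itself: `looks_like_star` is a hypothetical name, and the real sniff_prefix may apply further checks beyond finding a `data_` keyword.

```python
def looks_like_star(text):
    """Hypothetical check: True if some line begins with the 'data_' keyword."""
    for line in text.splitlines():
        # Everything before the first data block is treated as comments,
        # so lines are skipped until one starts a block.
        if line.strip().startswith("data_"):
            return True
    return False


star_text = "# written by hand\n\ndata_images\n\nloop_\n_rlnImageName #1\n"
star_result = looks_like_star(star_text)
interval_result = looks_like_star("chr1\t10\t20\n")
```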
class galaxy.datatypes.images.Tck(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Tracks file format (.tck) https://mrtrix.readthedocs.io/en/latest/getting_started/image_data.html#tracks-file-format-tck

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('fibers_sparse_top_6_lines.tck')
>>> Tck().sniff( fname )
True
>>> fname = get_test_fname('2.txt')
>>> Tck().sniff( fname )
False
file_ext = 'tck'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8bfd0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
class galaxy.datatypes.images.Tiff(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3591'
file_ext = 'tiff'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23070>}
class galaxy.datatypes.images.Trk(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

Track File format (.trk) is the tractography file format. http://trackvis.org/docs/?subsect=fileformat

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('IIT2mean_top_2000bytes.trk')
>>> Trk().sniff( fname )
True
>>> fname = get_test_fname('2.txt')
>>> Trk().sniff( fname )
False
file_ext = 'trk'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce8bee0>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]
class galaxy.datatypes.images.Xbm(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3598'
file_ext = 'xbm'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce235e0>}
class galaxy.datatypes.images.Xpm(**kwd)[source]

Bases: galaxy.datatypes.images.Image

edam_format = 'format_3599'
file_ext = 'xpm'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce23520>}
galaxy.datatypes.images.create_applet_tag_peek(class_name, archive, params)[source]

galaxy.datatypes.interval module

Interval datatypes

class galaxy.datatypes.interval.Bed(**kwd)[source]

Bases: galaxy.datatypes.interval.Interval

Tab delimited data in BED format

as_ucsc_display_file(dataset, **kwd)[source]

Returns file contents with only the bed data. If bed 6+, treat as interval.

check_required_metadata = True
column_names = ['Chrom', 'Start', 'End', 'Name', 'Score', 'Strand', 'ThickStart', 'ThickEnd', 'ItemRGB', 'BlockCount', 'BlockSizes', 'BlockStarts']
data_sources = {'data': 'tabix', 'feature_search': 'fli', 'index': 'bigwig'}
edam_format = 'format_3003'
file_ext = 'bed'
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2520>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc23a0>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2460>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc28e0>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc24c0>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2400>, 'viz_filter_cols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2310>}
set_meta(dataset, overwrite=True, **kwd)[source]

Sets the metadata information for datasets previously determined to be in bed format.

sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Checks for ‘bedness’

BED lines have three required fields and nine additional optional fields. The number of fields per line must be consistent throughout any single set of data in an annotation track. The order of the optional fields is binding: lower-numbered fields must always be populated if higher-numbered fields are used. The data type of all 12 columns is: 1-str, 2-int, 3-int, 4-str, 5-int, 6-str, 7-int, 8-int, 9-int or list, 10-int, 11-list, 12-list

For complete details see http://genome.ucsc.edu/FAQ/FAQformat#format1

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'test_tab.bed' )
>>> Bed().sniff( fname )
True
>>> fname = get_test_fname( 'interv1.bed' )
>>> Bed().sniff( fname )
True
>>> fname = get_test_fname( 'complete.bed' )
>>> Bed().sniff( fname )
True
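
The per-column types listed above can be checked with a small table of validators. The sketch below is only an illustration of that idea, not the code behind Bed.sniff_prefix; the names `bed_line_ok` and `COLUMN_CHECKS` are hypothetical, and treating a trailing comma in list columns as acceptable is an assumption.

```python
def _is_int(value):
    try:
        int(value)
        return True
    except ValueError:
        return False


def _is_int_list(value):
    # Comma-separated integers; a single integer or a trailing comma is tolerated.
    items = [v for v in value.split(",") if v != ""]
    return bool(items) and all(_is_int(v) for v in items)


def _is_str(value):
    return value != ""


# One checker per column, in the order 1-str, 2-int, 3-int, ... 12-list.
COLUMN_CHECKS = [
    _is_str, _is_int, _is_int, _is_str, _is_int, _is_str,
    _is_int, _is_int, _is_int_list, _is_int, _is_int_list, _is_int_list,
]


def bed_line_ok(line):
    """Hypothetical check of one tab-delimited BED line against the column types."""
    fields = line.rstrip("\r\n").split("\t")
    if not 3 <= len(fields) <= 12:
        return False  # three required fields, at most nine optional ones
    return all(check(f) for check, f in zip(COLUMN_CHECKS, fields))
```

Because the optional fields are ordered, checking the first N fields against the first N validators is sufficient for a line with N columns.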
track_type = 'FeatureTrack'
class galaxy.datatypes.interval.Bed12(**kwd)[source]

Bases: galaxy.datatypes.interval.BedStrict

Tab delimited data in strict BED format - no non-standard columns allowed; column count forced to 12

edam_format = 'format_3586'
file_ext = 'bed12'
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd5e0>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd8b0>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd700>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd820>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd670>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd7c0>, 'viz_filter_cols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2310>}
class galaxy.datatypes.interval.Bed6(**kwd)[source]

Bases: galaxy.datatypes.interval.BedStrict

Tab delimited data in strict BED format - no non-standard columns allowed; column count forced to 6

edam_format = 'format_3585'
file_ext = 'bed6'
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd280>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd550>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd3a0>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd4c0>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd310>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd430>, 'viz_filter_cols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2310>}
class galaxy.datatypes.interval.BedGraph(**kwd)[source]

Bases: galaxy.datatypes.interval.Interval

Tab delimited chrom/start/end/datavalue dataset

as_ucsc_display_file(dataset, **kwd)[source]

Returns file contents as is with no modifications. TODO: this is a functional stub and will need to be enhanced moving forward to provide additional support for bedgraph.

data_sources = {'data': 'bigwig', 'index': 'bigwig'}
edam_format = 'format_3583'
file_ext = 'bedgraph'
get_estimated_display_viewport(dataset, chrom_col=0, start_col=1, end_col=2)[source]

Set viewport based on dataset’s first 100 lines.
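
As a hedged sketch of what estimating a viewport from the first 100 lines might look like: the function below scans up to `max_lines` lines and returns the first chromosome seen together with the smallest start and largest end on it. The name `estimate_viewport`, the skipping of `#`, `track` and `browser` lines, and the way lines are combined are assumptions, not Galaxy's actual logic.

```python
def estimate_viewport(lines, chrom_col=0, start_col=1, end_col=2, max_lines=100):
    """Hypothetical sketch: (chrom, start, end) spanning the first chromosome seen."""
    chrom, start, end = None, None, None
    for i, line in enumerate(lines):
        if i >= max_lines:
            break
        line = line.rstrip("\r\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue  # header or comment line
        fields = line.split("\t")
        try:
            c = fields[chrom_col]
            s = int(fields[start_col])
            e = int(fields[end_col])
        except (IndexError, ValueError):
            continue  # malformed line
        if chrom is None:
            chrom, start, end = c, s, e
        elif c == chrom:
            # Widen the window to cover every interval on the same chromosome.
            start, end = min(start, s), max(end, e)
    return chrom, start, end


viewport = estimate_viewport([
    "track type=bedGraph",
    "chr1\t100\t200\t1.0",
    "chr1\t50\t120\t2.0",
    "chr2\t1\t5\t0.1",
])
```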

metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc27c0>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc25e0>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2700>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2640>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2760>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc26a0>}
track_type = 'LineTrack'
class galaxy.datatypes.interval.BedStrict(**kwd)[source]

Bases: galaxy.datatypes.interval.Bed

Tab delimited data in strict BED format - no non-standard columns allowed

__init__(**kwd)[source]
allow_datatype_change = False
edam_format = 'format_3584'
file_ext = 'bedstrict'
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2040>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd1f0>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd040>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd160>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2e20>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbd0d0>, 'viz_filter_cols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2310>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)[source]
class galaxy.datatypes.interval.ChromatinInteractions(**kwd)[source]

Bases: galaxy.datatypes.interval.Interval

Chromatin interactions obtained from 3C/5C/Hi-C experiments.

column_names = ['Chrom1', 'Start1', 'End1', 'Chrom2', 'Start2', 'End2', 'Value']
data_sources = {'data': 'tabix', 'index': 'bigwig'}
file_ext = 'chrint'
metadata_spec = {'chrom1Col': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8af0>, 'chrom2Col': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8c40>, 'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdceb8370>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8dc0>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'end1Col': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8be0>, 'end2Col': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8d00>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc29a0>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc28e0>, 'start1Col': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8b80>, 'start2Col': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8ca0>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2a00>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2940>, 'valueCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8d60>}
sniff(filename)[source]
track_type = 'DiagonalHeatmapTrack'
class galaxy.datatypes.interval.CustomTrack(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

UCSC CustomTrack

__init__(**kwd)[source]

Initialize interval datatype, by adding UCSC display app

display_peek(dataset)[source]

Returns formatted HTML of peek

edam_format = 'format_3588'
file_ext = 'customtrack'
get_estimated_display_viewport(dataset, chrom_col=None, start_col=None, end_col=None)[source]

Return a chrom, start, stop tuple for viewing a file.

metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8760>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb86d0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8640>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb82b0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb85b0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb87f0>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is in customtrack format.

CustomTrack files are built within Galaxy and are basically bed or interval files with the first line looking something like this.

track name="User Track" description="User Supplied Track (from Galaxy)" color=0,0,0 visibility=1

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'complete.bed' )
>>> CustomTrack().sniff( fname )
False
>>> fname = get_test_fname( 'ucsc.customtrack' )
>>> CustomTrack().sniff( fname )
True
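
A minimal sketch of the first-line check described above, assuming that a leading `track` token is enough to flag a candidate file. The helper name `looks_like_customtrack` is hypothetical and is not Galaxy's implementation, which may also validate the data lines that follow.

```python
def looks_like_customtrack(text: str) -> bool:
    """Return True if the first non-blank line starts with the word 'track'."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        # Only the first non-blank line decides the outcome.
        return stripped.split()[0] == "track"
    return False  # empty input has no track line


header = 'track name="User Track" description="User Supplied Track (from Galaxy)" color=0,0,0 visibility=1'
sample = header + "\nchr1\t100\t200\n"
print(looks_like_customtrack(sample))
print(looks_like_customtrack("chr1\t100\t200\n"))
```
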
class galaxy.datatypes.interval.ENCODEPeak(**kwd)[source]

Bases: galaxy.datatypes.interval.Interval

Human ENCODE peak format. There are both broad and narrow peak formats. Formats are very similar; narrow peak has an additional column, though.

Broad peak ( http://genome.ucsc.edu/FAQ/FAQformat#format13 ): This format is used to provide called regions of signal enrichment based on pooled, normalized (interpreted) data. It is a BED 6+3 format.

Narrow peak ( http://genome.ucsc.edu/FAQ/FAQformat#format12 ): This format is used to provide called peaks of signal enrichment based on pooled, normalized (interpreted) data. It is a BED 6+4 format.
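
Since the two layouts differ by one trailing column, a line can be classified by counting fields. The sketch below assumes tab-delimited input and that column count alone is enough to tell the variants apart; `classify_peak_line` is a hypothetical helper, not part of Galaxy.

```python
from typing import Optional

BROAD_PEAK_COLUMNS = 9    # BED 6+3
NARROW_PEAK_COLUMNS = 10  # BED 6+4, adds the trailing 'Peak' column


def classify_peak_line(line: str) -> Optional[str]:
    """Return 'broad', 'narrow', or None based on the number of tab-separated fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == NARROW_PEAK_COLUMNS:
        return "narrow"
    if len(fields) == BROAD_PEAK_COLUMNS:
        return "broad"
    return None


broad_line = "chr1\t100\t200\tpeak1\t500\t.\t12.5\t3.2\t2.1"
narrow_line = broad_line + "\t50"
print(classify_peak_line(broad_line), classify_peak_line(narrow_line))
```
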

column_names = ['Chrom', 'Start', 'End', 'Name', 'Score', 'Strand', 'SignalValue', 'pValue', 'qValue', 'Peak']
data_sources = {'data': 'tabix', 'index': 'bigwig'}
edam_format = 'format_3612'
file_ext = 'encodepeak'
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8880>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8a60>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb89a0>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc28e0>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8940>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8a00>}
sniff(filename)[source]
class galaxy.datatypes.interval.Gff(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular, galaxy.datatypes.interval._RemoteCallMixin

Tab delimited data in Gff format

__init__(**kwd)[source]

Initialize datatype, by adding GBrowse display app

column_names = ['Seqname', 'Source', 'Feature', 'Start', 'End', 'Score', 'Strand', 'Frame', 'Group']
data_sources = {'data': 'interval_index', 'feature_search': 'fli', 'index': 'bigwig'}
dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'column': <function TabularData.column_dataprovider at 0x7f6fdcfad8b0>, 'dataset-column': <function TabularData.dataset_column_dataprovider at 0x7f6fdcfada60>, 'dataset-dict': <function TabularData.dataset_dict_dataprovider at 0x7f6fdcfaddc0>, 'dict': <function TabularData.dict_dataprovider at 0x7f6fdcfadc10>, 'genomic-region': <function Gff.genomic_region_dataprovider at 0x7f6fdce584c0>, 'genomic-region-dict': <function Gff.genomic_region_dict_dataprovider at 0x7f6fdce58670>, 'interval': <function Gff.interval_dataprovider at 0x7f6fdce58820>, 'interval-dict': <function Gff.interval_dict_dataprovider at 0x7f6fdce589d0>, 'line': <function Text.line_dataprovider at 0x7f6fdd1a7b80>, 'regex-line': <function Text.regex_line_dataprovider at 0x7f6fdd1a7d30>}
display_peek(dataset)[source]

Returns formatted HTML of peek

edam_data = 'data_1255'
edam_format = 'format_2305'
file_ext = 'gff'
genomic_region_dataprovider(dataset, **settings)[source]
genomic_region_dict_dataprovider(dataset, **settings)[source]
get_estimated_display_viewport(dataset)[source]

Return a chrom, start, stop tuple for viewing a file. There are slight differences between gff 2 and gff 3 formats. This function should correctly handle both…

interval_dataprovider(dataset, **settings)[source]
interval_dict_dataprovider(dataset, **settings)[source]
metadata_spec = {'attribute_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbddf0>, 'attributes': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbdd60>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbdcd0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbdc10>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>}
set_attribute_metadata(dataset)[source]

Sets metadata elements for dataset’s attributes.

set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is in gff format

GFF lines have nine required fields that must be tab-separated.

For complete details see http://genome.ucsc.edu/FAQ/FAQformat#format3

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('gff.gff3')
>>> Gff().sniff( fname )
False
>>> fname = get_test_fname('test.gff')
>>> Gff().sniff( fname )
True
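
The nine-field rule can be illustrated with a small standalone check. This is a sketch only: the frame values mirror `valid_gff_frame` as listed for this class, while the requirement that start and end are plain digits is an assumption made for the example, and `is_gff_data_line` is a hypothetical name.

```python
VALID_FRAMES = {".", "0", "1", "2"}  # mirrors valid_gff_frame


def is_gff_data_line(line: str) -> bool:
    """Check one data line for nine tab-separated fields with a valid frame."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return False
    start, end, frame = fields[3], fields[4], fields[7]
    # Assumption for this sketch: coordinates are non-negative integers.
    if not (start.isdigit() and end.isdigit()):
        return False
    return frame in VALID_FRAMES


print(is_gff_data_line("chr1\tsrc\texon\t100\t200\t.\t+\t0\tgene1"))
```
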
track_type = 'FeatureTrack'
valid_gff_frame = ['.', '0', '1', '2']
class galaxy.datatypes.interval.Gff3(**kwd)[source]

Bases: galaxy.datatypes.interval.Gff

Tab delimited data in Gff3 format

__init__(**kwd)[source]

Initialize datatype, by adding GBrowse display app

column_names = ['Seqid', 'Source', 'Type', 'Start', 'End', 'Score', 'Strand', 'Phase', 'Attributes']
edam_format = 'format_1975'
file_ext = 'gff3'
metadata_spec = {'attribute_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbddf0>, 'attributes': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbdd60>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbdf10>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbdc10>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is in GFF version 3 format

GFF 3 format:

  1. adds a mechanism for representing more than one level of hierarchical grouping of features and subfeatures.
  2. separates the ideas of group membership and feature name/id
  3. constrains the feature type field to be taken from a controlled vocabulary.
  4. allows a single feature, such as an exon, to belong to more than one group at a time.
  5. provides an explicit convention for pairwise alignments
  6. provides an explicit convention for features that occupy disjunct regions

The format consists of 9 columns, separated by tabs (NOT spaces).

Undefined fields are replaced with the “.” character, as described in the original GFF spec.

For complete details see http://song.sourceforge.net/gff3.shtml

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'test.gff' )
>>> Gff3().sniff( fname )
False
>>> fname = get_test_fname( 'test.gtf' )
>>> Gff3().sniff( fname )
False
>>> fname = get_test_fname('gff.gff3')
>>> Gff3().sniff( fname )
True
>>> fname = get_test_fname( 'grch37.75.gtf' )
>>> Gff3().sniff( fname )
False
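
As an illustration of the nine-column rule and the "." placeholder, the sketch below parses one GFF3 line into a dictionary. The column names and the allowed strand and phase values are taken from the attributes listed for this class; `parse_gff3_line` itself is a hypothetical helper and does not reproduce Galaxy's sniffer.

```python
GFF3_COLUMNS = ["Seqid", "Source", "Type", "Start", "End", "Score", "Strand", "Phase", "Attributes"]
VALID_PHASE = {".", "0", "1", "2"}     # mirrors valid_gff3_phase
VALID_STRAND = {"+", "-", ".", "?"}    # mirrors valid_gff3_strand


def parse_gff3_line(line: str) -> dict:
    """Split a GFF3 line on tabs and map undefined ('.') fields to None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    if fields[6] not in VALID_STRAND:
        raise ValueError(f"invalid strand: {fields[6]!r}")
    if fields[7] not in VALID_PHASE:
        raise ValueError(f"invalid phase: {fields[7]!r}")
    return {name: (None if value == "." else value) for name, value in zip(GFF3_COLUMNS, fields)}


record = parse_gff3_line("ctg123\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001")
print(record["Type"], record["Source"], record["Phase"])
```

A line that uses spaces instead of tabs yields a single field and is rejected with `ValueError`, which matches the "tabs, NOT spaces" requirement above.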
track_type = 'FeatureTrack'
valid_gff3_phase = ['.', '0', '1', '2']
valid_gff3_strand = ['+', '-', '.', '?']
class galaxy.datatypes.interval.Gtf(**kwd)[source]

Bases: galaxy.datatypes.interval.Gff

Tab delimited data in Gtf format

column_names = ['Seqname', 'Source', 'Feature', 'Start', 'End', 'Score', 'Strand', 'Frame', 'Attributes']
edam_format = 'format_2306'
file_ext = 'gtf'
metadata_spec = {'attribute_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbddf0>, 'attributes': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdbdd60>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8220>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8160>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>}
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is in gtf format

GTF lines have nine required fields that must be tab-separated. The first eight GTF fields are the same as GFF. The group field has been expanded into a list of attributes. Each attribute consists of a type/value pair. Attributes must end in a semi-colon, and be separated from any following attribute by exactly one space. The attribute list must begin with the two mandatory attributes:

  • gene_id value - A globally unique identifier for the genomic source of the sequence.
  • transcript_id value - A globally unique identifier for the predicted transcript.

For complete details see http://genome.ucsc.edu/FAQ/FAQformat#format4

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( '1.bed' )
>>> Gtf().sniff( fname )
False
>>> fname = get_test_fname( 'test.gff' )
>>> Gtf().sniff( fname )
False
>>> fname = get_test_fname( 'test.gtf' )
>>> Gtf().sniff( fname )
True
>>> fname = get_test_fname( 'grch37.75.gtf' )
>>> Gtf().sniff( fname )
True
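
The attribute rules above can be sketched as a small parser. It assumes values are optionally wrapped in double quotes and contain no semicolons; both helper names are hypothetical and this is not the logic Galaxy uses in `sniff_prefix`.

```python
def parse_gtf_attributes(attribute_field: str) -> dict:
    """Parse 'key value;' pairs from the ninth GTF column into an ordered dict."""
    attributes = {}
    for chunk in attribute_field.strip().split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the final semicolon
        key, _, value = chunk.partition(" ")
        # Assumption: quotes, when present, only wrap the whole value.
        attributes[key] = value.strip().strip('"')
    return attributes


def has_mandatory_attributes(attribute_field: str) -> bool:
    """True if the list begins with gene_id followed by transcript_id."""
    keys = list(parse_gtf_attributes(attribute_field))
    return keys[:2] == ["gene_id", "transcript_id"]


field = 'gene_id "ENSG01"; transcript_id "ENST01"; exon_number "1";'
print(parse_gtf_attributes(field))
print(has_mandatory_attributes(field))
```
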
track_type = 'FeatureTrack'
class galaxy.datatypes.interval.Interval(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

Tab delimited data containing interval information

__init__(**kwd)[source]

Initialize interval datatype, by adding UCSC display apps

as_ucsc_display_file(dataset, **kwd)[source]

Returns file contents with only the bed data

data_sources = {'data': 'tabix', 'index': 'bigwig'}
dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'column': <function TabularData.column_dataprovider at 0x7f6fdcfad8b0>, 'dataset-column': <function TabularData.dataset_column_dataprovider at 0x7f6fdcfada60>, 'dataset-dict': <function TabularData.dataset_dict_dataprovider at 0x7f6fdcfaddc0>, 'dict': <function TabularData.dict_dataprovider at 0x7f6fdcfadc10>, 'genomic-region': <function Interval.genomic_region_dataprovider at 0x7f6fdce403a0>, 'genomic-region-dict': <function Interval.genomic_region_dict_dataprovider at 0x7f6fdce40550>, 'interval': <function Interval.interval_dataprovider at 0x7f6fdce40700>, 'interval-dict': <function Interval.interval_dict_dataprovider at 0x7f6fdce408b0>, 'line': <function Text.line_dataprovider at 0x7f6fdd1a7b80>, 'regex-line': <function Text.regex_line_dataprovider at 0x7f6fdd1a7d30>}
display_peek(dataset)[source]

Returns formatted HTML of peek

displayable(dataset)[source]
edam_data = 'data_3002'
edam_format = 'format_3475'
file_ext = 'interval'
genomic_region_dataprovider(dataset, **settings)[source]
genomic_region_dict_dataprovider(dataset, **settings)[source]
get_estimated_display_viewport(dataset, chrom_col=None, start_col=None, end_col=None)[source]

Return a chrom, start, stop tuple for viewing a file.

get_track_resolution(dataset, start, end)[source]
init_meta(dataset, copy_from=None)[source]
interval_dataprovider(dataset, **settings)[source]
interval_dict_dataprovider(dataset, **settings)[source]
line_class = 'region'
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdceb8370>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2880>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc29a0>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc28e0>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2a00>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2940>}
repair_methods(dataset)[source]

Return options for removing errors along with a description

set_meta(dataset, overwrite=True, first_line_is_header=False, **kwd)[source]

Tries to guess from the line the location number of the column for the chromosome, region start-end and strand

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Checks for ‘intervalness’

This format is mostly used by galaxy itself. Valid interval files should include a valid header comment, but this seems to be loosely regulated.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'test_space.txt' )
>>> Interval().sniff( fname )
False
>>> fname = get_test_fname( 'interval.interval' )
>>> Interval().sniff( fname )
True
track_type = 'FeatureTrack'

Generate links to UCSC genome browser sites based on the dbkey and content of dataset.

validate(dataset, **kwd)[source]

Validate an interval file using the bx GenomicIntervalReader

class galaxy.datatypes.interval.ProBed(**kwd)[source]

Bases: galaxy.datatypes.interval.Bed

Tab delimited data in proBED format - adaptation of BED for proteomics data.

column_names = ['Chrom', 'Start', 'End', 'Name', 'Score', 'Strand', 'ThickStart', 'ThickEnd', 'ItemRGB', 'BlockCount', 'BlockSizes', 'BlockStarts', 'ProteinAccession', 'PeptideSequence', 'Uniqueness', 'GenomeReferenceVersion', 'PsmScore', 'Fdr', 'Modifications', 'Charge', 'ExpMassToCharge', 'CalcMassToCharge', 'PsmRank', 'DatasetID', 'Uri']
edam_format = 'format_3827'
file_ext = 'probed'
metadata_spec = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc22b0>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2130>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'endCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc21f0>, 'nameCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc28e0>, 'startCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2250>, 'strandCol': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc2190>, 'viz_filter_cols': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdc20a0>}
class galaxy.datatypes.interval.ScIdx(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

ScIdx files are 1-based and consist of strand-specific coordinate counts. They always have 5 columns, and the first row is the column labels: ‘chrom’, ‘index’, ‘forward’, ‘reverse’, ‘value’. Each line following the first consists of data: chromosome name (type str), peak index (type int), Forward strand peak count (type int), Reverse strand peak count (type int) and value (type int). The value of the 5th ‘value’ column is the sum of the forward and reverse peak count values.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('cntrl_hg19.scidx')
>>> ScIdx().sniff(fname)
True
>>> Bed().sniff(fname)
False
>>> fname = get_test_fname('empty.txt')
>>> ScIdx().sniff(fname)
False
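
The layout described above can be checked with a short standalone function. This is a sketch under stated assumptions: the file is tab-delimited, the header labels are exactly the lowercase names given in the description, and no comment lines precede the header. `is_consistent_scidx` is a hypothetical helper.

```python
EXPECTED_HEADER = ["chrom", "index", "forward", "reverse", "value"]


def is_consistent_scidx(lines) -> bool:
    """Check header, column count, 1-based index, and value == forward + reverse."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    if not rows or rows[0] != EXPECTED_HEADER:
        return False
    for row in rows[1:]:
        if len(row) != 5:
            return False
        try:
            index, forward, reverse, value = (int(cell) for cell in row[1:])
        except ValueError:
            return False  # a count column is not an integer
        if index < 1 or forward + reverse != value:
            return False
    return True


good = ["chrom\tindex\tforward\treverse\tvalue\n", "chr1\t10\t2\t3\t5\n"]
bad_sum = ["chrom\tindex\tforward\treverse\tvalue\n", "chr1\t10\t2\t3\t6\n"]
print(is_consistent_scidx(good), is_consistent_scidx(bad_sum))
```
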
__init__(**kwd)[source]

Initialize scidx datatype.

file_ext = 'scidx'
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8ee0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8e50>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>}
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Checks for ‘scidx-ness.’

class galaxy.datatypes.interval.Wiggle(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular, galaxy.datatypes.interval._RemoteCallMixin

Tab delimited data in wiggle format

__init__(**kwd)[source]
data_sources = {'data': 'bigwig', 'index': 'bigwig'}
dataproviders = {'base': <function Data.base_dataprovider at 0x7f6fdd1a71f0>, 'chunk': <function Data.chunk_dataprovider at 0x7f6fdd1a73a0>, 'chunk64': <function Data.chunk64_dataprovider at 0x7f6fdd1a7550>, 'column': <function TabularData.column_dataprovider at 0x7f6fdcfad8b0>, 'dataset-column': <function TabularData.dataset_column_dataprovider at 0x7f6fdcfada60>, 'dataset-dict': <function TabularData.dataset_dict_dataprovider at 0x7f6fdcfaddc0>, 'dict': <function TabularData.dict_dataprovider at 0x7f6fdcfadc10>, 'line': <function Text.line_dataprovider at 0x7f6fdd1a7b80>, 'regex-line': <function Text.regex_line_dataprovider at 0x7f6fdd1a7d30>, 'wiggle': <function Wiggle.wiggle_dataprovider at 0x7f6fdcdb9310>, 'wiggle-dict': <function Wiggle.wiggle_dict_dataprovider at 0x7f6fdcdb94c0>}
display_peek(dataset)[source]

Returns formatted HTML of peek

edam_format = 'format_3005'
file_ext = 'wig'
get_estimated_display_viewport(dataset)[source]

Return a chrom, start, stop tuple for viewing a file.

get_track_resolution(dataset, start, end)[source]
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c1f0>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcdb8430>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>}
set_meta(dataset, overwrite=True, **kwd)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Determines whether the file is in wiggle format

The .wig format is line-oriented. Wiggle data is preceded by a track definition line, which adds a number of options for controlling the default display of this track. Following the track definition line is the track data, which can be entered in several different formats.

The track definition line begins with the word ‘track’ followed by the track type. The track type with version is REQUIRED, and it currently must be wiggle_0. For example, track type=wiggle_0…

For complete details see http://genome.ucsc.edu/goldenPath/help/wiggle.html

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'interv1.bed' )
>>> Wiggle().sniff( fname )
False
>>> fname = get_test_fname( 'wiggle.wig' )
>>> Wiggle().sniff( fname )
True
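
A minimal sketch of the track-definition check described above, assuming the `type=wiggle_0` option appears as its own whitespace-separated token. `is_wiggle_track_line` is a hypothetical name and does not reproduce Galaxy's sniffer.

```python
def is_wiggle_track_line(line: str) -> bool:
    """True if the line starts with 'track' and declares type=wiggle_0."""
    tokens = line.strip().split()
    if not tokens or tokens[0] != "track":
        return False
    return "type=wiggle_0" in tokens[1:]


print(is_wiggle_track_line('track type=wiggle_0 name="example"'))
print(is_wiggle_track_line('track name="no type given"'))
```
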
track_type = 'LineTrack'
wiggle_dataprovider(dataset, **settings)[source]
wiggle_dict_dataprovider(dataset, **settings)[source]

galaxy.datatypes.isa module

ISA datatype

See https://github.com/ISA-tools

class galaxy.datatypes.isa.IsaJson(**kwd)[source]

Bases: galaxy.datatypes.isa._Isa

__init__(**kwd)[source]
file_ext = 'isa-json'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1cad880>}
class galaxy.datatypes.isa.IsaTab(**kwd)[source]

Bases: galaxy.datatypes.isa._Isa

__init__(**kwd)[source]
file_ext = 'isa-tab'
metadata_spec = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1cad430>}

galaxy.datatypes.media module

Video classes

galaxy.datatypes.media.ffprobe(path)[source]
class galaxy.datatypes.media.Audio(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

set_meta(dataset, **kwd)[source]
metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd250c280>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16190>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'duration': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd14e1c10>, 'sample_rates': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16070>}
class galaxy.datatypes.media.Video(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

set_meta(dataset, **kwd)[source]
metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16fa0>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16e20>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'fps': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16430>, 'resolution_h': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e162e0>, 'resolution_w': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16220>, 'video_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e164f0>, 'video_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16f10>}
class galaxy.datatypes.media.Mkv(**kwd)[source]

Bases: galaxy.datatypes.media.Video

file_ext = 'mkv'
sniff(filename)[source]
metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e6d0>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e460>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'fps': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e610>, 'resolution_h': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e400>, 'resolution_w': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215edc0>, 'video_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e580>, 'video_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e730>}
class galaxy.datatypes.media.Mp4(**kwd)[source]

Bases: galaxy.datatypes.media.Video

Class that reads MP4 video file.

>>> from galaxy.datatypes.sniff import sniff_with_cls
>>> sniff_with_cls(Mp4, 'video_1.mp4')
True
>>> sniff_with_cls(Mp4, 'audio_1.mp4')
False

file_ext = 'mp4'
sniff(filename)[source]
metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e130>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e430>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'fps': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e2b0>, 'resolution_h': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e340>, 'resolution_w': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e520>, 'video_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e280>, 'video_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215ee20>}
class galaxy.datatypes.media.Flv(**kwd)[source]

Bases: galaxy.datatypes.media.Video

file_ext = 'flv'
sniff(filename)[source]
metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e8e0>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215ed00>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'fps': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e880>, 'resolution_h': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215ea90>, 'resolution_w': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e970>, 'video_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215e9a0>, 'video_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215ebe0>}
class galaxy.datatypes.media.Mpg(**kwd)[source]

Bases: galaxy.datatypes.media.Video

file_ext = 'mpg'
sniff(filename)[source]
metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215ef70>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215ee50>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'fps': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1534e50>, 'resolution_h': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd217e5b0>, 'resolution_w': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215eb80>, 'video_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1cd37c0>, 'video_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215eeb0>}
class galaxy.datatypes.media.Mp3(**kwd)[source]

Bases: galaxy.datatypes.media.Audio

Class that reads MP3 audio file.

>>> from galaxy.datatypes.sniff import sniff_with_cls
>>> sniff_with_cls(Mp3, 'audio_2.mp3')
True
>>> sniff_with_cls(Mp3, 'audio_1.wav')
False

file_ext = 'mp3'
sniff(filename)[source]
metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d1c0>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d3a0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'duration': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d730>, 'sample_rates': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d910>}
class galaxy.datatypes.media.Wav(**kwd)[source]

Bases: galaxy.datatypes.media.Audio

Class that reads WAV audio file.

>>> from galaxy.datatypes.sniff import sniff_with_cls
>>> sniff_with_cls(Wav, 'hello.wav')
True
>>> sniff_with_cls(Wav, 'audio_2.mp3')
False
>>> sniff_with_cls(Wav, 'drugbank_drugs.cml')
False

file_ext = 'wav'
blurb = 'RIFF WAV Audio file'
is_binary = True
get_mime()[source]

Returns the mime type of the datatype.

sniff(filename)[source]
set_meta(dataset, overwrite=True, **kwd)[source]

Set the metadata for this dataset from the file contents.

metadata_spec = {'audio_codecs': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd250c280>, 'audio_streams': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16190>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fe7138f10>, 'duration': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd14e1c10>, 'nchannels': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d9d0>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d8b0>, 'rate': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d7c0>, 'sample_rates': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e16070>, 'sampwidth': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd215d970>}

galaxy.datatypes.metadata module

Expose the model metadata module as a datatype module as well. Allowing it to live in galaxy.model means the model module doesn’t have any dependencies on the datatypes module. This module will need to remain here for datatypes living in the Tool Shed, so we might as well keep and use this interface from the datatypes module.

class galaxy.datatypes.metadata.Statement(target)[source]

Bases: object

This class inserts its target into a list in the surrounding class. The data.Data class has a metaclass which executes these statements. This is how we shove the metadata element spec into the class.
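
The pattern described here can be sketched in standalone Python. Everything below is illustrative: `StatementSketch`, `DataMeta`, `register_element`, and the `STATEMENTS` attribute name are hypothetical, and reading the class body's namespace through `sys._getframe` is an assumption about how such a statement could record itself, not a description of Galaxy's code.

```python
import sys

STATEMENTS = "__sketch_statements__"  # hypothetical attribute name


class StatementSketch:
    def __init__(self, target):
        self.target = target

    def __call__(self, *args, **kwargs):
        # The caller is the class body being executed; its locals become
        # the namespace of the class under construction.
        class_locals = sys._getframe(1).f_locals
        class_locals.setdefault(STATEMENTS, []).append((self, args, kwargs))

    @classmethod
    def process(cls, element):
        # Run every recorded statement against the finished class.
        for statement, args, kwargs in getattr(element, STATEMENTS, []):
            statement.target(element, *args, **kwargs)


class DataMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        StatementSketch.process(cls)  # the metaclass executes the statements


def register_element(datatype, name, default=None):
    # Copy first so a subclass never mutates its parent's dictionary.
    spec = dict(getattr(datatype, "spec", {}))
    spec[name] = default
    datatype.spec = spec


Element = StatementSketch(register_element)


class Example(metaclass=DataMeta):
    Element("dbkey", default="?")
    Element("columns", default=0)


print(Example.spec)
```

The calls inside the class body only record what should happen; the real work is deferred until the metaclass has a finished class object to pass to the target.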

__init__(target)[source]
classmethod process(element)[source]
class galaxy.datatypes.metadata.MetadataCollection[source]

Bases: collections.abc.Mapping

MetadataCollection is not a collection at all, but rather a proxy to the real metadata which is stored as a Dictionary. This class handles processing the metadata elements when they are set and retrieved, returning default values in cases when metadata is not set.

__init__()[source]
element_is_set(name) → bool[source]

Check whether the metadata element with the given name is set, i.e.

  • if such a metadata element actually exists and
  • if its value differs from no_value
Parameters:name – the name of the metadata element
Returns:True if the value differs from no_value; False if it is equal or if no metadata element with that name is specified
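
The two conditions can be sketched with plain dictionaries. This is a simplified stand-in that assumes stored values and per-element `no_value` defaults are kept in two separate mappings; the function name and its arguments are hypothetical and do not mirror the real method signature.

```python
def element_is_set_sketch(stored: dict, no_values: dict, name: str) -> bool:
    """True only if `name` is stored and its value differs from its no_value."""
    if name not in stored:
        return False  # no metadata element with that name
    return stored[name] != no_values.get(name)


stored = {"dbkey": "hg19", "columns": 0}
no_values = {"dbkey": "?", "columns": 0}
print(element_is_set_sketch(stored, no_values, "dbkey"))
print(element_is_set_sketch(stored, no_values, "columns"))
print(element_is_set_sketch(stored, no_values, "missing"))
```
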
from_JSON_dict(filename=None, path_rewriter=None, json_dict=None)[source]
get_metadata_parameter(name, **kwd)[source]
get_parent()[source]
make_dict_copy(to_copy)[source]

Makes a deep copy of input iterable to_copy according to self.spec

parent
remove_key(name)[source]
requires_dataset_id
set_parent(parent)[source]
spec
to_JSON_dict(filename=None)[source]
class galaxy.datatypes.metadata.MetadataSpecCollection(*args, **kwds)[source]

Bases: collections.OrderedDict

A simple extension of OrderedDict which allows cleaner access to items and allows the values to be iterated over directly as if it were a list. append() is also implemented for simplicity and does not “append”.
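
A minimal sketch of such an extension, assuming items carry a `name` attribute and that `append()` stores each item under that name. `SpecCollectionSketch` is a hypothetical class written for illustration and is not the Galaxy implementation.

```python
from collections import OrderedDict
from types import SimpleNamespace


class SpecCollectionSketch(OrderedDict):
    def append(self, item):
        # Despite the name, this keys the item by item.name, so adding an
        # item with an existing name replaces the earlier entry.
        self[item.name] = item

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


specs = SpecCollectionSketch()
specs.append(SimpleNamespace(name="dbkey", default="?"))
specs.append(SimpleNamespace(name="columns", default=0))
specs.append(SimpleNamespace(name="dbkey", default="hg19"))  # replaces the first entry

print(specs.dbkey.default)
print([spec.name for spec in specs.values()])
```
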

__init__(*args, **kwds)[source]
append(item)[source]
class galaxy.datatypes.metadata.MetadataParameter(spec)[source]

Bases: object

__init__(spec)[source]
from_external_value(value, parent)[source]

Turns a value read from an external dict into its value to be pushed directly into the metadata dict.

get_field(value=None, context=None, other_values=None, **kwd)[source]
make_copy(value, target_context: galaxy.model.metadata.MetadataCollection, source_context)[source]
classmethod marshal(value)[source]

This method should/can be overridden to convert the incoming value to whatever type it is supposed to be.

to_external_value(value)[source]

Turns a value read from the metadata into the value to be pushed directly into the external dict.

to_safe_string(value)[source]
to_string(value)[source]
unwrap(form_value)[source]

Turns a value into its storable form.

validate(value)[source]

Throw an exception if the value is invalid.

wrap(value, session)[source]

Turns a value into its usable form.

class galaxy.datatypes.metadata.MetadataElementSpec(datatype, name=None, desc=None, param=<class 'galaxy.model.metadata.MetadataParameter'>, default=None, no_value=None, visible=True, set_in_upload=False, **kwargs)[source]

Bases: object

Defines a metadata element and adds it to the metadata_spec (which is a MetadataSpecCollection) of datatype.

__init__(datatype, name=None, desc=None, param=<class 'galaxy.model.metadata.MetadataParameter'>, default=None, no_value=None, visible=True, set_in_upload=False, **kwargs)[source]
get(name, default=None)[source]
unwrap(value)[source]

Turns an incoming value into its storable form.

wrap(value, session)[source]

Turns a stored value into its usable form.

class galaxy.datatypes.metadata.SelectParameter(spec)[source]

Bases: galaxy.model.metadata.MetadataParameter

__init__(spec)[source]
get_field(value=None, context=None, other_values=None, values=None, **kwd)[source]
classmethod marshal(value)[source]
to_string(value)[source]
wrap(value, session)[source]
class galaxy.datatypes.metadata.DBKeyParameter(spec)[source]

Bases: galaxy.model.metadata.SelectParameter

get_field(value=None, context=None, other_values=None, values=None, **kwd)[source]
class galaxy.datatypes.metadata.RangeParameter(spec)[source]

Bases: galaxy.model.metadata.SelectParameter

__init__(spec)[source]
get_field(value=None, context=None, other_values=None, values=None, **kwd)[source]
classmethod marshal(value)[source]
class galaxy.datatypes.metadata.ColumnParameter(spec)[source]

Bases: galaxy.model.metadata.RangeParameter

get_field(value=None, context=None, other_values=None, values=None, **kwd)[source]
class galaxy.datatypes.metadata.ColumnTypesParameter(spec)[source]

Bases: galaxy.model.metadata.MetadataParameter

to_string(value)[source]
class galaxy.datatypes.metadata.ListParameter(spec)[source]

Bases: galaxy.model.metadata.MetadataParameter

to_string(value)[source]
class galaxy.datatypes.metadata.DictParameter(spec)[source]

Bases: galaxy.model.metadata.MetadataParameter

to_safe_string(value)[source]
to_string(value)[source]
class galaxy.datatypes.metadata.PythonObjectParameter(spec)[source]

Bases: galaxy.model.metadata.MetadataParameter

get_field(value=None, context=None, other_values=None, **kwd)[source]
classmethod marshal(value)[source]
to_string(value)[source]
class galaxy.datatypes.metadata.FileParameter(spec)[source]

Bases: galaxy.model.metadata.MetadataParameter

from_external_value(value, parent, path_rewriter=None)[source]

Turns a value read from an external dict into its value to be pushed directly into the metadata dict.

get_field(value=None, context=None, other_values=None, **kwd)[source]
make_copy(value, target_context: galaxy.model.metadata.MetadataCollection, source_context)[source]
classmethod marshal(value)[source]
new_file(dataset=None, **kwds)[source]
to_external_value(value)[source]

Turns a value read from the metadata into the value to be pushed directly into the external dict.

to_safe_string(value)[source]
to_string(value)[source]
wrap(value, session)[source]
class galaxy.datatypes.metadata.MetadataTempFile(**kwds)[source]

Bases: object

__init__(**kwds)[source]
classmethod cleanup_from_JSON_dict_filename(filename)[source]
file_name
classmethod from_JSON(json_dict)[source]
classmethod is_JSONified_value(value)[source]
tmp_dir = 'database/tmp'
to_JSON()[source]

galaxy.datatypes.microarrays module

class galaxy.datatypes.microarrays.Gal(**kwd)[source]

Bases: galaxy.datatypes.microarrays.GenericMicroarrayFile

Gal File format described at: http://mdc.custhelp.com/app/answers/detail/a_id/18883/#gal

edam_data = 'data_3110'
edam_format = 'format_3829'
file_ext = 'gal'
metadata_spec = {'block_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd17fa8b0>, 'block_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd17faf70>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecaf0>, 'file_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd17fa3a0>, 'number_of_data_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd17faf10>, 'number_of_optional_header_records': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecfd0>, 'version_number': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecca0>}
set_meta(dataset, **kwd)[source]

Set metadata for Gal file.

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is a Gal file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.gal')
>>> Gal().sniff(fname)
True
>>> fname = get_test_fname('test.gpr')
>>> Gal().sniff(fname)
False

class galaxy.datatypes.microarrays.GenericMicroarrayFile(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Abstract class for most of the microarray files.

get_mime()[source]
metadata_spec = {'block_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecdf0>, 'block_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecd30>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1df8190>, 'file_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecb80>, 'number_of_data_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecb20>, 'number_of_optional_header_records': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ecdc0>, 'version_number': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1df8220>}
set_peek(dataset)[source]
class galaxy.datatypes.microarrays.Gpr(**kwd)[source]

Bases: galaxy.datatypes.microarrays.GenericMicroarrayFile

Gpr File format described at: http://mdc.custhelp.com/app/answers/detail/a_id/18883/#gpr

edam_data = 'data_3110'
edam_format = 'format_3829'
file_ext = 'gpr'
metadata_spec = {'block_count': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd15343d0>, 'block_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd18ec8e0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd2154e50>, 'file_type': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1534d30>, 'number_of_data_columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1cd3dc0>, 'number_of_optional_header_records': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd19f96a0>, 'version_number': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1c3e340>}
set_meta(dataset, **kwd)[source]

Set metadata for Gpr file.

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is a Gpr file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.gpr')
>>> Gpr().sniff(fname)
True
>>> fname = get_test_fname('test.gal')
>>> Gpr().sniff(fname)
False

galaxy.datatypes.molecules module

class galaxy.datatypes.molecules.CIF(**kwd)[source]

Bases: galaxy.datatypes.molecules.GenericMolFile

CIF format.

file_ext = 'cif'
get_dataset_info(metadata)[source]
meta_error = False
metadata_spec = {'atom_data': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd32d30d0>, 'chemical_formula': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd29f11f0>, 'data_block_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd32d3550>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'is_periodic': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd29f1b50>, 'lattice_parameters': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd29f1130>, 'number_of_atoms': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd29f1be0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e290d0>}
set_meta(dataset, **kwd)[source]

Find Atom IDs for metadata.

set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is a CIF file.

The CIF format and the Relion STAR format have a shared origin. Note therefore that STAR files and the STAR sniffer also use “data_” blocks. STAR files will not pass the CIF sniffer, but CIF files can pass the STAR sniffer.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('Si.cif')
>>> CIF().sniff(fname)
True
>>> fname = get_test_fname('Si_lowercase.cell')
>>> CIF().sniff(fname)
False
>>> fname = get_test_fname('1.star')
>>> CIF().sniff(fname)
False
class galaxy.datatypes.molecules.CML(**kwd)[source]

Bases: galaxy.datatypes.xml.GenericXml

Chemical Markup Language http://cml.sourceforge.net/

file_ext = 'cml'
static merge(split_files, output_file)[source]

Merging CML files.

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdce2f6a0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd524a2e0>}
set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is a CML file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('interval.interval')
>>> CML().sniff(fname)
False
>>> fname = get_test_fname('drugbank_drugs.cml')
>>> CML().sniff(fname)
True
classmethod split(input_datasets, subdir_generator_function, split_params)[source]

Split the input files by molecule records.

class galaxy.datatypes.molecules.Cell(**kwd)[source]

Bases: galaxy.datatypes.molecules.GenericMolFile

CASTEP CELL format.

file_ext = 'cell'
get_dataset_info(metadata)[source]
meta_error = False
metadata_spec = {'atom_data': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd32d3a30>, 'chemical_formula': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd32d3eb0>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'is_periodic': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd32d3c10>, 'lattice_parameters': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd32d34c0>, 'number_of_atoms': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd32d3610>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e290d0>}
set_meta(dataset, **kwd)[source]

Find Atom IDs for metadata.

set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is a CASTEP CELL file.

A fingerprint for CELL files is the use of %BLOCK and %ENDBLOCK to denote data blocks (not case sensitive).

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('Si_uppercase.cell')
>>> Cell().sniff(fname)
True
>>> fname = get_test_fname('Si_lowercase.cell')
>>> Cell().sniff(fname)
True
>>> fname = get_test_fname('Si.cif')
>>> Cell().sniff(fname)
False
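The case-insensitive %BLOCK / %ENDBLOCK fingerprint described for the CELL sniffer could be checked along the following lines. This is a sketch only: the function name looks_like_cell, the regular expression, and the rule of requiring both markers are assumptions, not Galaxy's actual sniffing logic.

```python
import re

# Match %BLOCK or %ENDBLOCK at the start of a line, ignoring case.
BLOCK_RE = re.compile(r"^\s*%(END)?BLOCK\b", re.IGNORECASE | re.MULTILINE)


def looks_like_cell(text: str) -> bool:
    """Hypothetical check: require at least one opening and one closing marker."""
    # group(1) is "END" (any case) for closing markers and None for opening ones.
    kinds = {bool(match.group(1)) for match in BLOCK_RE.finditer(text)}
    return kinds == {True, False}


upper = "%BLOCK LATTICE_CART\n 1 0 0\n%ENDBLOCK LATTICE_CART\n"
lower = "%block lattice_cart\n 1 0 0\n%endblock lattice_cart\n"
cif_like = "data_Si\n_cell_length_a 5.43\n"
```

With these made-up inputs, both the uppercase and lowercase CELL snippets are accepted and the CIF-like text is rejected.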
class galaxy.datatypes.molecules.DRF(**kwd)[source]

Bases: galaxy.datatypes.molecules.GenericMolFile

file_ext = 'drf'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd285cbb0>}
set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

class galaxy.datatypes.molecules.ExtendedXYZ(**kwd)[source]

Bases: galaxy.datatypes.molecules.XYZ

Extended XYZ format.

Uses specification from https://github.com/libAtoms/extxyz.

file_ext = 'extxyz'
metadata_spec = {'atom_data': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd806a670>, 'chemical_formula': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd806ad90>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'is_periodic': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd806adf0>, 'lattice_parameters': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd806a280>, 'number_of_atoms': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd806a3d0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e290d0>}
read_blocks(lines)[source]

Parses and returns a list of XYZ structure blocks (aka frames).

Raises IndexError, TypeError, ValueError
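To show how splitting into frames can surface the listed exceptions, here is a minimal sketch for plain XYZ-style text (an atom count line, a comment line, then one line per atom). The function read_xyz_frames and the returned dictionary layout are hypothetical; Galaxy's read_blocks also handles the Extended XYZ comment line, which this sketch does not parse.

```python
def read_xyz_frames(lines):
    """Hypothetical frame splitter for XYZ-style text; not Galaxy's read_blocks."""
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i])              # ValueError if not an integer
        comment = lines[i + 1].rstrip("\n")  # IndexError if the comment line is missing
        atoms = []
        for row in lines[i + 2 : i + 2 + n_atoms]:
            symbol, x, y, z = row.split()[:4]  # ValueError if fewer than 4 fields
            atoms.append((symbol, float(x), float(y), float(z)))
        if len(atoms) != n_atoms:
            raise IndexError("frame ends before all atoms were read")
        frames.append({"number_of_atoms": n_atoms, "comment": comment, "atoms": atoms})
        i += 2 + n_atoms
    return frames


sample = [
    "2\n", "frame one\n", "Si 0.0 0.0 0.0\n", "Si 1.3 1.3 1.3\n",
    "1\n", "frame two\n", "H 0.0 0.0 0.5\n",
]
frames = read_xyz_frames(sample)
```

The sample input is invented for illustration and yields two frames; a truncated final frame raises IndexError in this sketch.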

set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is an Extended XYZ file.

XYZ files will not pass the ExtendedXYZ sniffer, but ExtendedXYZ files can pass the XYZ sniffer.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('Si.extxyz')
>>> ExtendedXYZ().sniff(fname)
True
>>> fname = get_test_fname('Si.xyz')
>>> ExtendedXYZ().sniff(fname)
False
class galaxy.datatypes.molecules.FPS(**kwd)[source]

Bases: galaxy.datatypes.molecules.GenericMolFile

chemfp fingerprint file: http://code.google.com/p/chem-fingerprints/wiki/FPS

file_ext = 'fps'
static merge(split_files, output_file)[source]

Merging fps files requires merging the header manually. We take the header from the first file.
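The idea of keeping only the first file's header can be sketched as follows. It assumes that header lines are those starting with '#'; the function name merge_fps_sketch and the sample file contents are made up for illustration and are not taken from Galaxy.

```python
import os
import tempfile


def merge_fps_sketch(split_files, output_file):
    """Hypothetical merge: header from the first file, records from every file."""
    with open(output_file, "w") as out:
        for index, path in enumerate(split_files):
            with open(path) as handle:
                for line in handle:
                    # Assumption: header lines start with '#'. Skip them in
                    # every file except the first.
                    if index > 0 and line.startswith("#"):
                        continue
                    out.write(line)


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "part1.fps")
    second = os.path.join(tmp, "part2.fps")
    target = os.path.join(tmp, "merged.fps")
    with open(first, "w") as fh:
        fh.write("#FPS1\n#num_bits=8\n0f\tmol1\n")
    with open(second, "w") as fh:
        fh.write("#FPS1\n#num_bits=8\nf0\tmol2\n")
    merge_fps_sketch([first, second], target)
    with open(target) as fh:
        merged = fh.read()
```

The merged text contains one header followed by the records of both parts, in input order.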

metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd285c430>}
set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is an FPS file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('q.fps')
>>> FPS().sniff(fname)
True
>>> fname = get_test_fname('drugbank_drugs.cml')
>>> FPS().sniff(fname)
False
classmethod split(input_datasets, subdir_generator_function, split_params)[source]

Split the input files by fingerprint records.

class galaxy.datatypes.molecules.GRO(**kwd)[source]

Bases: galaxy.datatypes.molecules.GenericMolFile

GROMACS structure format. https://manual.gromacs.org/current/reference-manual/file-formats.html#gro

file_ext = 'gro'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd524a670>}
set_peek(dataset)[source]
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is a GRO file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('5e5z.gro')
>>> GRO().sniff_prefix(fname)
True
>>> fname = get_test_fname('5e5z.pdb')
>>> GRO().sniff_prefix(fname)
False
class galaxy.datatypes.molecules.GenericMolFile(**kwd)[source]

Bases: galaxy.datatypes.data.Text

Abstract class for most of the molecule files.

element_symbols = ['Ac', 'Ag', 'Al', 'Am', 'Ar', 'As', 'At', 'Au', 'B ', 'Ba', 'Be', 'Bh', 'Bi', 'Bk', 'Br', 'C ', 'Ca', 'Cd', 'Ce', 'Cf', 'Cl', 'Cm', 'Co', 'Cr', 'Cs', 'Cu', 'Ds', 'Db', 'Dy', 'Er', 'Es', 'Eu', 'F ', 'Fe', 'Fm', 'Fr', 'Ga', 'Gd', 'Ge', 'H ', 'He', 'Hf', 'Hg', 'Ho', 'Hs', 'I ', 'In', 'Ir', 'K ', 'Kr', 'La', 'Li', 'Lr', 'Lu', 'Md', 'Mg', 'Mn', 'Mo', 'Mt', 'N ', 'Na', 'Nb', 'Nd', 'Ne', 'Ni', 'No', 'Np', 'O ', 'Os', 'P ', 'Pa', 'Pb', 'Pd', 'Pm', 'Po', 'Pr', 'Pt', 'Pu', 'Ra', 'Rb', 'Re', 'Rf', 'Rg', 'Rh', 'Rn', 'Ru', 'S ', 'Sb', 'Sc', 'Se', 'Sg', 'Si', 'Sm', 'Sn', 'Sr', 'Ta', 'Tb', 'Tc', 'Te', 'Th', 'Ti', 'Tl', 'Tm', 'U ', 'V ', 'W ', 'Xe', 'Y ', 'Yb', 'Zn', 'Zr']
get_mime()[source]
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e290d0>}
set_peek(dataset)[source]
class galaxy.datatypes.molecules.InChI(**kwd)[source]

Bases: galaxy.datatypes.tabular.Tabular

column_names = ['InChI']
file_ext = 'inchi'
metadata_spec = {'column_names': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c280>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd524a130>, 'columns': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd524af40>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c040>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c0d0>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdcf4c310>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd524af10>}
set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

set_peek(dataset)[source]
sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is an InChI file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('drugbank_drugs.inchi')
>>> InChI().sniff(fname)
True
>>> fname = get_test_fname('drugbank_drugs.cml')
>>> InChI().sniff(fname)
False
class galaxy.datatypes.molecules.MOL(**kwd)[source]

Bases: galaxy.datatypes.molecules.GenericMolFile

file_ext = 'mol'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd1e29130>}
set_meta(dataset, **kwd)[source]

Set the number of molecules; in the case of MOL it is always one.

class galaxy.datatypes.molecules.MOL2(**kwd)[source]

Bases: galaxy.datatypes.molecules.GenericMolFile

file_ext = 'mol2'
metadata_spec = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a1d60>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fdd1a11c0>, 'number_of_molecules': <galaxy.model.metadata.MetadataElementSpec object at 0x7f6fd5f5bf10>}
set_meta(dataset, **kwd)[source]

Set the number of lines of data in dataset.

sniff(filename)
sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]

Try to guess if the file is a MOL2 file.

>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('drugbank_drugs.mol2')
>>> MOL2().sniff(fname)
True
>>> fname = get_test_fname('drugbank_drugs.cml')
>>> MOL2().sniff(fname)
False
classmethod split(input_datasets, subdir_generator_function, split_params)[source]

Split the input files by molecule records.

class galaxy.datatypes.molecules.OBFS(**kwd)[source]

Bases: galaxy.datatypes.binary.Binary

OpenBabel Fastsearch format (fs).

__init__(**kwd)[source]