Warning
This document is for an old release of Galaxy. You can alternatively view this page in the latest release if it exists or view the top of the latest release's documentation.
galaxy.datatypes package¶
Subpackages¶
- galaxy.datatypes.converters package
- Submodules
- galaxy.datatypes.converters.bed_to_gff_converter module
- galaxy.datatypes.converters.bgzip module
- galaxy.datatypes.converters.cram_to_bam module
- galaxy.datatypes.converters.fasta_to_len module
- galaxy.datatypes.converters.fasta_to_tabular_converter module
- galaxy.datatypes.converters.fastq_to_fqtoc module
- galaxy.datatypes.converters.fastqsolexa_to_fasta_converter module
- galaxy.datatypes.converters.fastqsolexa_to_qual_converter module
- galaxy.datatypes.converters.gff_to_bed_converter module
- galaxy.datatypes.converters.gff_to_interval_index_converter module
- galaxy.datatypes.converters.interval_to_bed_converter module
- galaxy.datatypes.converters.interval_to_bedstrict_converter module
- galaxy.datatypes.converters.interval_to_fli module
- galaxy.datatypes.converters.interval_to_interval_index_converter module
- galaxy.datatypes.converters.interval_to_tabix_converter module
- galaxy.datatypes.converters.lped_to_fped_converter module
- galaxy.datatypes.converters.lped_to_pbed_converter module
- galaxy.datatypes.converters.maf_to_fasta_converter module
- galaxy.datatypes.converters.maf_to_interval_converter module
- galaxy.datatypes.converters.parquet_to_csv_converter module
- galaxy.datatypes.converters.pbed_ldreduced_converter module
- galaxy.datatypes.converters.pbed_to_lped_converter module
- galaxy.datatypes.converters.picard_interval_list_to_bed6_converter module
- galaxy.datatypes.converters.pileup_to_interval_index_converter module
- galaxy.datatypes.converters.ref_to_seq_taxonomy_converter module
- galaxy.datatypes.converters.tabular_csv module
- galaxy.datatypes.converters.tabular_to_dbnsfp module
- galaxy.datatypes.converters.vcf_to_interval_index_converter module
- galaxy.datatypes.converters.vcf_to_vcf_bgzip module
- galaxy.datatypes.converters.wiggle_to_simple_converter module
- galaxy.datatypes.dataproviders package
- Submodules
- galaxy.datatypes.dataproviders.base module
- galaxy.datatypes.dataproviders.chunk module
- galaxy.datatypes.dataproviders.column module
- galaxy.datatypes.dataproviders.dataset module
- galaxy.datatypes.dataproviders.decorators module
- galaxy.datatypes.dataproviders.exceptions module
- galaxy.datatypes.dataproviders.external module
- galaxy.datatypes.dataproviders.hierarchy module
- galaxy.datatypes.dataproviders.line module
- galaxy.datatypes.display_applications package
- galaxy.datatypes.util package
Submodules¶
galaxy.datatypes.annotation module¶
- class galaxy.datatypes.annotation.SnapHmm(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
- file_ext = 'snaphmm'¶
- edam_data = 'data_1364'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
SNAP model files start with zoeHMM
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
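The prefix check described for `sniff_prefix` can be illustrated with a minimal, standalone sketch. It does not use Galaxy's `FilePrefix` class; the helper name `looks_like_snap_hmm` is hypothetical, and it only tests whether the text begins with `zoeHMM`.

```python
def looks_like_snap_hmm(text: str) -> bool:
    """Return True if the text begins with the SNAP model marker."""
    # Hypothetical helper: Galaxy's own check runs against a FilePrefix object,
    # not a plain string.
    return text.startswith("zoeHMM")


print(looks_like_snap_hmm("zoeHMM rest of header"))
print(looks_like_snap_hmm(">seq1\nACGT\n"))
```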
- class galaxy.datatypes.annotation.Augustus(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Class describing an Augustus prediction model
- file_ext = 'augustus'¶
- edam_data = 'data_0950'¶
- compressed = True¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.anvio module¶
Datatypes for Anvi’o https://github.com/merenlab/anvio
- class galaxy.datatypes.anvio.AnvioComposite(**kwd)[source]¶
Bases:
galaxy.datatypes.text.Html
Base class to use for Anvi’o composite datatypes. These generally consist of a SQLite database plus optional additional files.
- file_ext = 'anvio_composite'¶
- generate_primary_file(dataset=None)[source]¶
This is called only at upload to write the HTML file. The datasets cannot be renamed here; unfortunately, they come with the default names.
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.anvio.AnvioDB(*args, **kwd)[source]¶
Bases:
galaxy.datatypes.anvio.AnvioComposite
Class for AnvioDB database files.
- file_ext = 'anvio_db'¶
- set_meta(dataset, **kwd)[source]¶
Set the anvio_basename based upon actual extra_files_path contents.
- metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.anvio.AnvioStructureDB(*args, **kwd)[source]¶
Bases:
galaxy.datatypes.anvio.AnvioDB
Class for Anvio Structure DB database files.
- file_ext = 'anvio_structure_db'¶
- metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.anvio.AnvioGenomesDB(*args, **kwd)[source]¶
Bases:
galaxy.datatypes.anvio.AnvioDB
Class for Anvio Genomes DB database files.
- file_ext = 'anvio_genomes_db'¶
- metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.anvio.AnvioContigsDB(*args, **kwd)[source]¶
Bases:
galaxy.datatypes.anvio.AnvioDB
Class for Anvio Contigs DB database files.
- file_ext = 'anvio_contigs_db'¶
- metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.anvio.AnvioProfileDB(*args, **kwd)[source]¶
Bases:
galaxy.datatypes.anvio.AnvioDB
Class for Anvio Profile DB database files.
- file_ext = 'anvio_profile_db'¶
- metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.anvio.AnvioPanDB(*args, **kwd)[source]¶
Bases:
galaxy.datatypes.anvio.AnvioDB
Class for Anvio Pan DB database files.
- file_ext = 'anvio_pan_db'¶
- metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.anvio.AnvioSamplesDB(*args, **kwd)[source]¶
Bases:
galaxy.datatypes.anvio.AnvioDB
Class for Anvio Samples DB database files.
- file_ext = 'anvio_samples_db'¶
- metadata_spec: metadata.MetadataSpecCollection = {'anvio_basename': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.assembly module¶
Velvet datatypes by James E Johnson (University of Minnesota), for the Velvet assembler tool in Galaxy.
- class galaxy.datatypes.assembly.Amos(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Class describing the AMOS assembly file
- edam_data = 'data_0925'¶
- edam_format = 'format_3582'¶
- file_ext = 'afg'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Determines whether the file is an amos assembly file format.
Example:
    {CTG
    iid:1
    eid:1
    seq:
    CCTCTCCTGTAGAGTTCAACCGA-GCCGGTAGAGTTTTATCA
    .
    qlt:
    DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
    .
    {TLE
    src:1027
    off:0
    clr:618,0
    gap:
    250 612
    .
    }
    }
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.assembly.Sequences(**kwd)[source]¶
Bases:
galaxy.datatypes.sequence.Fasta
Class describing the Sequences file generated by velveth
- edam_data = 'data_0925'¶
- file_ext = 'sequences'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Determines whether the file is a velveth produced fasta format.
The id line has 3 fields separated by tabs: sequence_name sequence_index category:
    >SEQUENCE_0_length_35	1	1
    GGATATAGGGCCAACCCAACTCAACGGCCTGTCTT
    >SEQUENCE_1_length_35	2	1
    CGACGAATGACAGGTCACGAATTTGGCGGGGATTA
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'sequences': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
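The id-line layout described for velveth `Sequences` files can be checked with a short sketch. This is not Galaxy's implementation; the function name `is_velveth_id_line` is hypothetical, and treating `sequence_index` and `category` as integers is an assumption drawn from the example above.

```python
def is_velveth_id_line(line: str) -> bool:
    """Check for '>name<TAB>index<TAB>category' on a FASTA id line."""
    if not line.startswith(">"):
        return False
    fields = line.rstrip("\n").split("\t")
    # Assumption: the index and category fields are non-negative integers.
    return len(fields) == 3 and fields[1].isdigit() and fields[2].isdigit()


print(is_velveth_id_line(">SEQUENCE_0_length_35\t1\t1"))
print(is_velveth_id_line(">seq1 an ordinary FASTA header"))
```

An ordinary FASTA header without the two extra tab-separated fields is rejected, which is what distinguishes this datatype from plain `Fasta`.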
- class galaxy.datatypes.assembly.Roadmaps(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Class describing the Roadmaps file generated by velveth
- edam_format = 'format_2561'¶
- file_ext = 'roadmaps'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Determines whether the file is a velveth produced RoadMap:
    142858 21 1
    ROADMAP 1
    ROADMAP 2
    …
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
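As a rough illustration of the RoadMap layout shown for `sniff_prefix`, the sketch below accepts text whose first line holds three integers and whose second line starts with `ROADMAP`. The helper name `looks_like_roadmaps` is hypothetical, and splitting the first line on arbitrary whitespace is an assumption; Galaxy's own check may be stricter or looser.

```python
def looks_like_roadmaps(text: str) -> bool:
    """Heuristic check for the start of a velveth Roadmaps file."""
    lines = text.splitlines()
    if len(lines) < 2:
        return False
    header = lines[0].split()  # assumption: whitespace-separated fields
    return (
        len(header) == 3
        and all(field.isdigit() for field in header)
        and lines[1].startswith("ROADMAP")
    )


print(looks_like_roadmaps("142858\t21\t1\nROADMAP 1\nROADMAP 2\n"))
```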
- class galaxy.datatypes.assembly.Velvet(**kwd)[source]¶
Bases:
galaxy.datatypes.text.Html
- file_ext = 'velvet'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'long_reads': <galaxy.model.metadata.MetadataElementSpec object>, 'paired_end_reads': <galaxy.model.metadata.MetadataElementSpec object>, 'short2_reads': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.binary module¶
Binary classes
- class galaxy.datatypes.binary.Binary(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Data
Binary data
- edam_format = 'format_2333'¶
- file_ext = 'binary'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Ab1(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an ab1 binary sequence file
- file_ext = 'ab1'¶
- edam_format = 'format_3000'¶
- edam_data = 'data_0924'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Idat(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Binary data in idat format
- file_ext = 'idat'¶
- edam_format = 'format_2058'¶
- edam_data = 'data_2603'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Cel(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Cel File format described at: http://media.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cel.html
- file_ext = 'cel'¶
- edam_format = 'format_1638'¶
- edam_data = 'data_3110'¶
- sniff(filename)[source]¶
Try to guess if the file is a Cel file.
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_3.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('affy_v_4.cel')
>>> Cel().sniff(fname)
True
>>> fname = get_test_fname('test.gal')
>>> Cel().sniff(fname)
False
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.MashSketch(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Mash Sketch file. Sketches are used by the MinHash algorithm to allow fast distance estimations with low storage and memory requirements. To make a sketch, each k-mer in a sequence is hashed, which creates a pseudo-random identifier. By sorting these identifiers (hashes), a small subset from the top of the sorted list can represent the entire sequence (these are min-hashes). The more similar another sequence is, the more min-hashes it is likely to share.
- file_ext = 'msh'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
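The sketching idea described for `MashSketch` can be shown with a small, self-contained example. This is only an illustration of the bottom-k MinHash idea, not Mash's implementation: SHA-1 stands in for the hash function, the function names and the `k` and `size` defaults are hypothetical, and the similarity measure is a simple shared fraction rather than Mash's distance estimate.

```python
import hashlib


def kmers(sequence: str, k: int):
    """Yield every k-mer of the sequence."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def sketch(sequence: str, k: int = 4, size: int = 8) -> set:
    """Hash each k-mer and keep the `size` smallest hashes (the min-hashes)."""
    hashes = {
        int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")
        for kmer in kmers(sequence, k)
    }
    return set(sorted(hashes)[:size])


def shared_fraction(a: set, b: set) -> float:
    """Fraction of min-hashes in `a` that also appear in `b`."""
    return len(a & b) / len(a) if a else 0.0


first = sketch("ACGTACGTTGCAAGCT")
second = sketch("ACGTACGTTGCAAGCA")
print(shared_fraction(first, first))
print(shared_fraction(first, second))
```

Identical sequences share every min-hash, so the first value printed is 1.0; the second depends on which k-mers the two sequences have in common.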
- class galaxy.datatypes.binary.CompressedArchive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.
- file_ext = 'compressed_archive'¶
- compressed = True¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Meryldb(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
MerylDB is a tar.gz archive with 128 files: 64 data files and 64 index files.
- file_ext = 'meryldb'¶
- sniff(filename)[source]¶
Try to guess if the file is a Meryldb file.
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('affy_v_agcc.cel')
>>> Meryldb().sniff(fname)
False
>>> fname = get_test_fname('read-db.meryldb')
>>> Meryldb().sniff(fname)
True
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
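The archive layout described for `Meryldb` (64 data files and 64 index files in a tar.gz) suggests a simple structural check. The sketch below is an assumption-laden illustration, not Galaxy's `sniff`: the function names are hypothetical, and the `.data` and `.index` suffixes are placeholders, since the real member names are not given here.

```python
import io
import tarfile


def count_members(archive_bytes: bytes, data_suffix: str, index_suffix: str):
    """Count regular files in a tar.gz archive by filename suffix."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    data = sum(name.endswith(data_suffix) for name in names)
    index = sum(name.endswith(index_suffix) for name in names)
    return data, index


def build_example_archive(data_suffix: str, index_suffix: str) -> bytes:
    """Build a toy archive with 64 empty data files and 64 empty index files."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for i in range(64):
            for suffix in (data_suffix, index_suffix):
                # Placeholder member names; real MerylDB names may differ.
                tar.addfile(tarfile.TarInfo(name=f"db/{i:02d}{suffix}"))
    return buffer.getvalue()


archive = build_example_archive(".data", ".index")
print(count_members(archive, ".data", ".index") == (64, 64))
```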
- class galaxy.datatypes.binary.Bref3(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Bref3 format is a binary format for storing phased, non-missing genotypes for a list of samples.
- file_ext = 'bref3'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.DynamicCompressedArchive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
- uncompressed_datatype_instance: galaxy.datatypes.data.Data¶
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.GzDynamicCompressedArchive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.DynamicCompressedArchive
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- uncompressed_datatype_instance: galaxy.datatypes.data.Data¶
- class galaxy.datatypes.binary.Bz2DynamicCompressedArchive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.DynamicCompressedArchive
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- uncompressed_datatype_instance: galaxy.datatypes.data.Data¶
- class galaxy.datatypes.binary.CompressedZipArchive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Class describing a compressed binary file. This class can be subclassed to implement archive filetypes that will not be unpacked by upload.py.
- file_ext = 'zip'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.GenericAsn1Binary(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class for generic ASN.1 binary format
- file_ext = 'asn1-binary'¶
- edam_format = 'format_1966'¶
- edam_data = 'data_0849'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.BamNative(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive, galaxy.datatypes.binary._BamOrSam
Class describing a BAM binary file that is not necessarily sorted
- edam_format = 'format_2572'¶
- edam_data = 'data_0863'¶
- file_ext = 'unsorted.bam'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- static merge(split_files, output_file)[source]¶
Merges BAM files
- Parameters
split_files – List of bam file paths to merge
output_file – Write merged bam file to this location
- to_archive(dataset, name='')[source]¶
Collect archive paths and file handles that need to be exported when archiving dataset.
- Parameters
dataset – HistoryDatasetAssociation
name – archive name; in a collection context this corresponds to the collection name(s) and element_identifier, joined by ‘/’, e.g. ‘fastq_collection/sample1/forward’
- groom_dataset_content(file_name)[source]¶
Ensures that the BAM file contents are coordinate-sorted. This function is called on an output dataset after the content is initially generated.
- display_data(trans, dataset, preview=False, filename=None, to_ext=None, offset=None, ck_size=None, **kwd)[source]¶
Displays data in central pane if preview is True, else handles download.
Datatypes should be very careful when overriding this method, and this interface between datatypes and Galaxy will likely change.
TODO: Document alternatives to overriding this method (data providers?).
- metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
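Whether BAM content needs to be coordinate-sorted, as `groom_dataset_content` ensures, can in principle be read from the header. The sketch below parses SAM-style header text and reads the `SO` tag on the `@HD` line, which the SAM specification defines with the values `unknown`, `unsorted`, `queryname` and `coordinate`. The function names are hypothetical and this is not how Galaxy itself performs the check.

```python
def sort_order_from_header(header_text: str) -> str:
    """Return the SO value from the @HD line, or 'unknown' if it is absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


def needs_coordinate_sort(header_text: str) -> bool:
    """True unless the header declares coordinate sorting."""
    return sort_order_from_header(header_text) != "coordinate"


header = "@HD\tVN:1.6\tSO:queryname\n@SQ\tSN:chr1\tLN:1000"
print(sort_order_from_header(header))
print(needs_coordinate_sort(header))
```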
- class galaxy.datatypes.binary.Bam(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BamNative
Class describing a BAM binary file
- edam_format = 'format_2572'¶
- edam_data = 'data_0863'¶
- file_ext = 'bam'¶
- get_index_flag(file_name)[source]¶
Return pysam flag for bai index (default) or csi index (contig size > (2**29 - 1) )
- dataset_content_needs_grooming(file_name)[source]¶
Check if file_name is a coordinate-sorted BAM file
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- samtools_dataprovider(dataset, **settings)[source]¶
Generic samtools interface - all options available through settings.
- dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'column': <function Bam.column_dataprovider>, 'dict': <function Bam.dict_dataprovider>, 'genomic-region': <function Bam.genomic_region_dataprovider>, 'genomic-region-dict': <function Bam.genomic_region_dict_dataprovider>, 'header': <function Bam.header_dataprovider>, 'id-seq-qual': <function Bam.id_seq_qual_dataprovider>, 'line': <function Bam.line_dataprovider>, 'regex-line': <function Bam.regex_line_dataprovider>, 'samtools': <function Bam.samtools_dataprovider>}¶
- metadata_spec: metadata.MetadataSpecCollection = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
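The rule stated for `get_index_flag` (BAI by default, CSI when a contig is longer than 2**29 - 1) can be sketched without pysam. The function name `choose_index_flag` is hypothetical, and returning `-b` and `-c` assumes the flags in question are the `samtools index` options for BAI and CSI output.

```python
BAI_MAX_CONTIG_LENGTH = 2**29 - 1  # threshold quoted in the docstring above


def choose_index_flag(reference_lengths) -> str:
    """Return '-c' (CSI) if any contig exceeds the BAI limit, else '-b' (BAI)."""
    if reference_lengths and max(reference_lengths) > BAI_MAX_CONTIG_LENGTH:
        return "-c"
    # Assumption: an empty or missing header falls back to the BAI default.
    return "-b"


print(choose_index_flag([248_956_422, 242_193_529]))
print(choose_index_flag([2**29]))
```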
- class galaxy.datatypes.binary.ProBam(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Bam
Class describing a BAM binary file - extended for proteomics data
- edam_format = 'format_3826'¶
- edam_data = 'data_0863'¶
- file_ext = 'probam'¶
- metadata_spec: metadata.MetadataSpecCollection = {'bam_csi_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_index': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.BamInputSorted(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BamNative
A class for BAM files that can formally be unsorted or queryname sorted. Alignments are either ordered based on the order in which the queries appear when producing the alignment, or ordered by their queryname. This notably keeps alignments produced by paired-end sequencing adjacent.
- file_ext = 'qname_input_sorted.bam'¶
- metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.BamQuerynameSorted(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BamInputSorted
A class for queryname sorted BAM files.
- file_ext = 'qname_sorted.bam'¶
- dataset_content_needs_grooming(file_name)[source]¶
Check if file_name is a queryname-sorted BAM file
- metadata_spec: metadata.MetadataSpecCollection = {'bam_header': <galaxy.model.metadata.MetadataElementSpec object>, 'bam_version': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'read_groups': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_lengths': <galaxy.model.metadata.MetadataElementSpec object>, 'reference_names': <galaxy.model.metadata.MetadataElementSpec object>, 'sort_order': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.CRAM(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
- file_ext = 'cram'¶
- edam_format = 'format_3462'¶
- edam_data = 'data_0863'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'cram_index': <galaxy.model.metadata.MetadataElementSpec object>, 'cram_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.BaseBcf(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
- edam_format = 'format_3020'¶
- edam_data = 'data_3498'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Bcf(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BaseBcf
Class describing a (BGZF-compressed) BCF file
- file_ext = 'bcf'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'bcf_index': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.BcfUncompressed(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BaseBcf
Class describing an uncompressed BCF file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('1.bcf_uncompressed')
>>> BcfUncompressed().sniff(fname)
True
>>> fname = get_test_fname('1.bcf')
>>> BcfUncompressed().sniff(fname)
False
- file_ext = 'bcf_uncompressed'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.H5(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an HDF5 file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.mz5')
>>> H5().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> H5().sniff(fname)
False
- file_ext = 'h5'¶
- edam_format = 'format_3590'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Loom(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.H5
Class describing a Loom file: http://loompy.org/
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.loom')
>>> Loom().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Loom().sniff(fname)
False
- file_ext = 'loom'¶
- edam_format = 'format_3590'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'col_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'col_attrs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'col_graphs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'col_graphs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'description': <galaxy.model.metadata.MetadataElementSpec object>, 'doi': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object>, 'loom_spec_version': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'row_graphs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'row_graphs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'title': <galaxy.model.metadata.MetadataElementSpec object>, 'url': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Anndata(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.H5
Class describing an HDF5 anndata file: http://anndata.rtfd.io
>>> from galaxy.datatypes.sniff import get_test_fname
>>> Anndata().sniff(get_test_fname('pbmc3k_tiny.h5ad'))
True
>>> Anndata().sniff(get_test_fname('test.mz5'))
False
>>> Anndata().sniff(get_test_fname('import.loom.krumsiek11.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_6_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_0_7_4_small.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk2.h5ad'))
True
>>> Anndata().sniff(get_test_fname('adata_unk.h5ad'))
True
- file_ext = 'h5ad'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'anndata_spec_version': <galaxy.model.metadata.MetadataElementSpec object>, 'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'description': <galaxy.model.metadata.MetadataElementSpec object>, 'doi': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_count': <galaxy.model.metadata.MetadataElementSpec object>, 'layers_names': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_names': <galaxy.model.metadata.MetadataElementSpec object>, 'obs_size': <galaxy.model.metadata.MetadataElementSpec object>, 'obsm_count': <galaxy.model.metadata.MetadataElementSpec object>, 'obsm_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_count': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'raw_var_size': <galaxy.model.metadata.MetadataElementSpec object>, 'row_attrs_count': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'title': <galaxy.model.metadata.MetadataElementSpec object>, 'uns_count': <galaxy.model.metadata.MetadataElementSpec object>, 'uns_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'url': <galaxy.model.metadata.MetadataElementSpec object>, 'var_count': <galaxy.model.metadata.MetadataElementSpec object>, 'var_layers': <galaxy.model.metadata.MetadataElementSpec object>, 'var_size': <galaxy.model.metadata.MetadataElementSpec object>, 'varm_count': <galaxy.model.metadata.MetadataElementSpec object>, 'varm_layers': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Grib(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a GRIB file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.grib')
>>> Grib().sniff_prefix(FilePrefix(fname))
True
>>> fname = FilePrefix(get_test_fname('interval.interval'))
>>> Grib().sniff_prefix(fname)
False
- file_ext = 'grib'¶
- edam_format = 'format_2333'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'grib_edition': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.GmxBinary(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Base class for GROMACS binary files - xtc, trr, cpt
- file_ext = ''¶
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.Trr(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.GmxBinary
Class describing a trr file from the GROMACS suite
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.trr')
>>> Trr().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Trr().sniff(fname)
False
- file_ext = 'trr'¶
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Cpt(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.GmxBinary
Class describing a checkpoint (.cpt) file from the GROMACS suite
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.cpt')
>>> Cpt().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Cpt().sniff(fname)
False
- file_ext = 'cpt'¶
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Xtc(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.GmxBinary
Class describing an xtc file from the GROMACS suite
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.xtc')
>>> Xtc().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Xtc().sniff(fname)
False
- file_ext = 'xtc'¶
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Edr(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.GmxBinary
Class describing an edr file from the GROMACS suite
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('md.edr')
>>> Edr().sniff(fname)
True
>>> fname = get_test_fname('md.trr')
>>> Edr().sniff(fname)
False
- file_ext = 'edr'¶
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Biom2(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.H5
Class describing a biom2 file (http://biom-format.org/documentation/biom_format.html)
- file_ext = 'biom2'¶
- edam_format = 'format_3746'¶
- sniff(filename)[source]¶
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Biom2().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Biom2().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Biom2().sniff(fname)
False
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'creation_date': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>, 'format_url': <galaxy.model.metadata.MetadataElementSpec object>, 'format_version': <galaxy.model.metadata.MetadataElementSpec object>, 'generated_by': <galaxy.model.metadata.MetadataElementSpec object>, 'id': <galaxy.model.metadata.MetadataElementSpec object>, 'nnz': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>, 'type': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Cool(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.H5
Class describing the cool format (https://github.com/mirnylab/cooler)
- file_ext = 'cool'¶
- sniff(filename)[source]¶
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.cool')
>>> Cool().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> Cool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> Cool().sniff(fname)
False
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.MCool(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.H5
Class describing the multi-resolution cool format (https://github.com/mirnylab/cooler)
- file_ext = 'mcool'¶
- sniff(filename)[source]¶
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('matrix.mcool')
>>> MCool().sniff(fname)
True
>>> fname = get_test_fname('matrix.cool')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('test.mz5')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('wiggle.wig')
>>> MCool().sniff(fname)
False
>>> fname = get_test_fname('biom2_sparse_otu_table_hdf5.biom2')
>>> MCool().sniff(fname)
False
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.H5MLM(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.H5
Machine learning model generated by Galaxy-ML.
- file_ext = 'h5mlm'¶
- URL = 'https://github.com/goeckslab/Galaxy-ML'¶
- max_peek_size = 1000¶
- max_preview_size = 1000000¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- display_data(trans, dataset, preview=False, filename=None, to_ext=None, **kwd)[source]¶
Displays data in central pane if preview is True, else handles download.
Datatypes should be very careful when overriding this method, and this interface between datatypes and Galaxy will likely change.
TODO: Document alternatives to overriding this method (data providers?).
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'hyper_params': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.LudwigModel(**kwd)[source]¶
Bases:
galaxy.datatypes.text.Html
Composite datatype that encloses multiple files for a Ludwig trained model.
- file_ext = 'ludwig_model'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.HexrdMaterials(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.H5
Class describing a Hexrd Materials file: https://github.com/HEXRD/hexrd
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.materials.h5')
>>> HexrdMaterials().sniff(fname)
True
>>> fname = get_test_fname('test.loom')
>>> HexrdMaterials().sniff(fname)
False
- file_ext = 'hexrd.materials.h5'¶
- edam_format = 'format_3590'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'LatticeParameters': <galaxy.model.metadata.MetadataElementSpec object>, 'SpaceGroupNumber': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'materials': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Scf(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an scf binary sequence file
- edam_format = 'format_1632'¶
- edam_data = 'data_0924'¶
- file_ext = 'scf'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Sff(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Standard Flowgram Format (SFF)
- edam_format = 'format_3284'¶
- edam_data = 'data_0924'¶
- file_ext = 'sff'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.BigWig(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Accessing binary BigWig files from UCSC. The supplemental info in the paper has the binary details: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btq351v1
- edam_format = 'format_3006'¶
- edam_data = 'data_3002'¶
- file_ext = 'bigwig'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.BigBed(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BigWig
BigBed support from UCSC.
- edam_format = 'format_3004'¶
- edam_data = 'data_3002'¶
- file_ext = 'bigbed'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.TwoBit(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a TwoBit format nucleotide file
- edam_format = 'format_3009'¶
- edam_data = 'data_0848'¶
- file_ext = 'twobit'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.SQlite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a Sqlite database
- file_ext = 'sqlite'¶
- edam_format = 'format_3621'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'sqlite': <function SQlite.sqlite_dataprovider>, 'sqlite-dict': <function SQlite.sqlite_datadictprovider>, 'sqlite-table': <function SQlite.sqlite_datatableprovider>}¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
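The `tables`, `table_columns`, and `table_row_count` metadata fields listed for `SQlite` describe the schema of the database file. The sketch below shows one way such a summary could be gathered with Python's standard `sqlite3` module. It is an illustration only, not Galaxy's `set_meta` implementation; the function name `summarize_sqlite` and the example table are hypothetical.

```python
import sqlite3


def summarize_sqlite(conn):
    """Collect table names, column names, and row counts from a connection.

    Hypothetical helper for illustration; not part of galaxy.datatypes.
    """
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
    ]
    table_columns = {}
    table_row_count = {}
    for table in tables:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + table.replace('"', '""') + '"'
        # Column 1 of each PRAGMA table_info row is the column name.
        table_columns[table] = [
            row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")
        ]
        table_row_count[table] = conn.execute(
            f"SELECT COUNT(*) FROM {quoted}"
        ).fetchone()[0]
    return tables, table_columns, table_row_count


# Build a small in-memory database to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (chrom TEXT, pos INTEGER)")
conn.executemany(
    "INSERT INTO variants VALUES (?, ?)", [("chr1", 100), ("chr2", 200)]
)
tables, table_columns, table_row_count = summarize_sqlite(conn)
print(tables, table_columns, table_row_count)
```

Querying `sqlite_master` keeps the sketch independent of any particular schema, which is why the same approach would apply to the more specialised SQLite subclasses documented below.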
- class galaxy.datatypes.binary.GeminiSQLite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a Gemini Sqlite database
- file_ext = 'gemini.sqlite'¶
- edam_format = 'format_3622'¶
- edam_data = 'data_3498'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'gemini_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.ChiraSQLite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a ChiRAViz Sqlite database
- file_ext = 'chira.sqlite'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.CuffDiffSQlite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a CuffDiff SQLite database
- file_ext = 'cuffdiff.sqlite'¶
- edam_format = 'format_3621'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'cuffdiff_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'genes': <galaxy.model.metadata.MetadataElementSpec object>, 'samples': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.MzSQlite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a Proteomics Sqlite database
- file_ext = 'mz.sqlite'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.PQP(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a Peptide query parameters file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.pqp')
>>> PQP().sniff(fname)
True
>>> fname = get_test_fname('test.osw')
>>> PQP().sniff(fname)
False
- file_ext = 'pqp'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- sniff(filename)[source]¶
Table definition according to https://github.com/grosenberger/OpenMS/blob/develop/src/openms/source/ANALYSIS/OPENSWATH/TransitionPQPFile.cpp#L264. For now, the VERSION, GENE, and PEPTIDE_GENE_MAPPING tables are excluded, since there is test data without these tables; see also https://github.com/OpenMS/OpenMS/issues/4365
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.OSW(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing OpenSwath output
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.osw')
>>> OSW().sniff(fname)
True
>>> fname = get_test_fname('test.sqmass')
>>> OSW().sniff(fname)
False
- file_ext = 'osw'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.SQmass(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a Sqmass database
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.sqmass')
>>> SQmass().sniff(fname)
True
>>> fname = get_test_fname('test.pqp')
>>> SQmass().sniff(fname)
False
- file_ext = 'sqmass'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.BlibSQlite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a Proteomics Spectral Library Sqlite database
- file_ext = 'blib'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'blib_version': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.DlibSQlite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a Proteomics Spectral Library Sqlite database.
DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.dlib')
>>> DlibSQlite().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DlibSQlite().sniff(fname)
False
- file_ext = 'dlib'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dlib_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.ElibSQlite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a Proteomics Chromatogram Library Sqlite database.
DLIBs only have the “entries”, “metadata”, and “peptidetoprotein” tables populated. ELIBs have the rest of the tables populated too, such as “peptidequants” or “peptidescores”.
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.elib')
>>> ElibSQlite().sniff(fname)
True
>>> fname = get_test_fname('test.dlib')
>>> ElibSQlite().sniff(fname)
False
- file_ext = 'elib'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
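The DLIB/ELIB distinction described for `DlibSQlite` and `ElibSQlite` (which tables are populated) can be sketched with the standard `sqlite3` module. This is a rough illustration under stated assumptions, not Galaxy's `sniff` logic: the function names are hypothetical, the single-column table layout is invented for the demo, and only the five table names quoted in the class descriptions are checked.

```python
import sqlite3

# Table names taken from the class descriptions above.
CORE_TABLES = ("entries", "metadata", "peptidetoprotein")
EXTRA_TABLES = ("peptidequants", "peptidescores")


def classify_library(conn):
    """Return 'dlib', 'elib', or None for a SQLite connection.

    Hypothetical helper: a library counts as ELIB when at least one of the
    extra tables holds rows, and as DLIB when only the core tables do.
    """

    def populated(table):
        try:
            row = conn.execute(f"SELECT 1 FROM {table} LIMIT 1").fetchone()
        except sqlite3.OperationalError:
            return False  # the table does not exist
        return row is not None

    if not all(populated(table) for table in CORE_TABLES):
        return None
    if any(populated(table) for table in EXTRA_TABLES):
        return "elib"
    return "dlib"


def make_library(with_quant_rows):
    """Build a toy in-memory library; the one-column schema is made up."""
    conn = sqlite3.connect(":memory:")
    for table in CORE_TABLES + EXTRA_TABLES:
        conn.execute(f"CREATE TABLE {table} (value TEXT)")
    for table in CORE_TABLES:
        conn.execute(f"INSERT INTO {table} VALUES ('x')")
    if with_quant_rows:
        conn.execute("INSERT INTO peptidequants VALUES ('x')")
    return conn


print(classify_library(make_library(False)))
print(classify_library(make_library(True)))
```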
- class galaxy.datatypes.binary.IdpDB(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing an IDPicker 3 idpDB (sqlite) database
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.idpdb')
>>> IdpDB().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> IdpDB().sniff(fname)
False
- file_ext = 'idpdb'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.GAFASQLite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing a GAFA SQLite database
- file_ext = 'gafa.sqlite'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'gafa_schema_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.NcbiTaxonomySQlite(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.SQlite
Class describing the NCBI Taxonomy database stored in SQLite as done by rust-ncbitaxonomy
- file_ext = 'ncbitaxonomy.sqlite'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'ncbitaxonomy_schema_version': <galaxy.model.metadata.MetadataElementSpec object>, 'table_columns': <galaxy.model.metadata.MetadataElementSpec object>, 'table_row_count': <galaxy.model.metadata.MetadataElementSpec object>, 'tables': <galaxy.model.metadata.MetadataElementSpec object>, 'taxon_count': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Xlsx(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class for Excel 2007 (xlsx) files
- file_ext = 'xlsx'¶
- compressed = True¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.ExcelXls(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an Excel (xls) file
- file_ext = 'excel.xls'¶
- edam_format = 'format_3468'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Sra(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Sequence Read Archive (SRA) datatype originally from mdshw5/sra-tools-galaxy
- file_ext = 'sra'¶
- sniff_prefix(sniff_prefix)[source]¶
The first 8 bytes of any NCBI sra file are ‘NCBI.sra’, and the file is binary. For details about the format, see http://www.ncbi.nlm.nih.gov/books/n/helpsra/SRA_Overview_BK/#SRA_Overview_BK.4_SRA_Data_Structure
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
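The signature check described for `Sra.sniff_prefix` amounts to comparing the first 8 bytes of the file against a fixed marker. A minimal sketch of that comparison follows; the helper name `has_sra_signature` is hypothetical and the sample byte strings are made up for the demonstration, so this should not be read as Galaxy's actual implementation.

```python
SRA_SIGNATURE = b"NCBI.sra"  # the 8-byte marker described above


def has_sra_signature(prefix: bytes) -> bool:
    """Return True if the given leading bytes start with the SRA marker.

    Hypothetical helper for illustration; not part of galaxy.datatypes.
    """
    return prefix[: len(SRA_SIGNATURE)] == SRA_SIGNATURE


# Made-up inputs: one SRA-like header and one plain-text interval line.
sra_like = SRA_SIGNATURE + b"\x00\x01\x02\x03"
text_like = b"chr1\t10\t20\n"
print(has_sra_signature(sra_like), has_sra_signature(text_like))
```

In practice the prefix would come from reading the first bytes of the file in binary mode, for example `open(path, "rb").read(8)`.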
- class galaxy.datatypes.binary.RData(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Generic R Data file datatype implementation, i.e. files generated with R’s save or save.image function. See https://www.loc.gov/preservation/digital/formats/fdd/fdd000470.shtml and https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.rdata')
>>> RData().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.file_name = fname
>>> dataset.has_data = lambda: True
>>> RData().set_meta(dataset)
>>> dataset.metadata.version
'3'
- VERSION_2_PREFIX = b'RDX2\nX\n'¶
- VERSION_3_PREFIX = b'RDX3\nX\n'¶
- file_ext = 'rdata'¶
- check_required_metadata = True¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
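The `VERSION_2_PREFIX` and `VERSION_3_PREFIX` constants of `RData` suggest that the format version can be read from the leading bytes of the stream. The sketch below shows that idea. It assumes the data is either uncompressed or gzip-compressed, and the function name `rdata_version` is hypothetical; Galaxy's own `sniff` and `set_meta` may handle more cases.

```python
import gzip

# Values copied from the RData class attributes documented above.
VERSION_2_PREFIX = b"RDX2\nX\n"
VERSION_3_PREFIX = b"RDX3\nX\n"


def rdata_version(raw: bytes):
    """Guess the RData format version ('2' or '3') from raw file bytes.

    Hypothetical helper. Assumption: the stream is plain or gzip-compressed.
    Returns None when neither prefix matches.
    """
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    if raw.startswith(VERSION_2_PREFIX):
        return "2"
    if raw.startswith(VERSION_3_PREFIX):
        return "3"
    return None


# Made-up payloads that carry only the prefix plus a few filler bytes.
compressed_v3 = gzip.compress(VERSION_3_PREFIX + b"\x00\x00\x00\x03")
plain_v2 = VERSION_2_PREFIX + b"\x00\x00\x00\x02"
print(rdata_version(compressed_v3), rdata_version(plain_v2))
```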
- class galaxy.datatypes.binary.RDS(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
File using a serialized R object generated with R’s saveRDS function. See https://cran.r-project.org/doc/manuals/r-patched/R-ints.html#Serialization-Formats
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('int-r3.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r4.rds')
>>> RDS().sniff(fname)
True
>>> fname = get_test_fname('int-r3-version2.rds')
>>> RDS().sniff(fname)
True
>>> from galaxy.util.bunch import Bunch
>>> dataset = Bunch()
>>> dataset.metadata = Bunch
>>> dataset.file_name = get_test_fname('int-r4.rds')
>>> dataset.has_data = lambda: True
>>> RDS().set_meta(dataset)
>>> dataset.metadata.version
'3'
>>> dataset.metadata.rversion
'4.1.1'
>>> dataset.metadata.minrversion
'3.5.0'
- file_ext = 'rds'¶
- check_required_metadata = True¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'minrversion': <galaxy.model.metadata.MetadataElementSpec object>, 'rversion': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.OxliBinary(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.OxliCountGraph(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.OxliBinary
OxliCountGraph starts with “OXLI” + one byte version number + 8-bit binary ‘1’.
Test file generated via:
load-into-counting.py --n_tables 1 --max-tablesize 1 \
    oxli_countgraph.oxlicg khmer/tests/test-data/100-reads.fq.bz2
using khmer 2.0
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliCountGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_countgraph.oxlicg")
>>> OxliCountGraph().sniff(fname)
True
- file_ext = 'oxlicg'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.OxliNodeGraph(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.OxliBinary
OxliNodeGraph starts with “OXLI” + one byte version number + 8-bit binary ‘2’.
Test file generated via:
load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2
using khmer 2.0
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliNodeGraph().sniff(fname)
False
>>> fname = get_test_fname("oxli_nodegraph.oxling")
>>> OxliNodeGraph().sniff(fname)
True
- file_ext = 'oxling'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.OxliTagSet(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.OxliBinary
OxliTagSet starts with “OXLI” + one byte version number + 8-bit binary ‘3’.
Test file generated via:
load-graph.py --n_tables 1 --max-tablesize 1 oxli_nodegraph.oxling \
    khmer/tests/test-data/100-reads.fq.bz2;
mv oxli_nodegraph.oxling.tagset oxli_tagset.oxlits
using khmer 2.0
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliTagSet().sniff(fname)
False
>>> fname = get_test_fname("oxli_tagset.oxlits")
>>> OxliTagSet().sniff(fname)
True
- file_ext = 'oxlits'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.OxliStopTags(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.OxliBinary
OxliStopTags starts with “OXLI” + one byte version number + 8-bit binary ‘4’.
Test file adapted from khmer 2.0’s “khmer/tests/test-data/goodversion-k32.stoptags”
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliStopTags().sniff(fname)
False
>>> fname = get_test_fname("oxli_stoptags.oxlist")
>>> OxliStopTags().sniff(fname)
True
- file_ext = 'oxlist'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.OxliSubset(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.OxliBinary
OxliSubset starts with “OXLI” + one byte version number + 8-bit binary ‘5’.
Test file generated via:
load-graph.py -k 20 example tests/test-data/random-20-a.fa;
partition-graph.py example;
mv example.subset.0.pmap oxli_subset.oxliss
using khmer 2.0
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliSubset().sniff(fname)
False
>>> fname = get_test_fname("oxli_subset.oxliss")
>>> OxliSubset().sniff(fname)
True
- file_ext = 'oxliss'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.OxliGraphLabels(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.OxliBinary
OxliGraphLabels starts with “OXLI” + one byte version number + 8-bit binary ‘6’.
Test file generated via:
python -c "from khmer import GraphLabels; \
    gl = GraphLabels(20, 1e7, 4); \
    gl.consume_fasta_and_tag_with_labels('tests/test-data/test-labels.fa'); \
    gl.save_labels_and_tags('oxli_graphlabels.oxligl')"
using khmer 2.0
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sequence.csfasta')
>>> OxliGraphLabels().sniff(fname)
False
>>> fname = get_test_fname("oxli_graphlabels.oxligl")
>>> OxliGraphLabels().sniff(fname)
True
- file_ext = 'oxligl'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.PostgresqlArchive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Class describing a Postgresql database packed into a tar archive
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('postgresql_fake.tar.bz2')
>>> PostgresqlArchive().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> PostgresqlArchive().sniff(fname)
False
- file_ext = 'postgresql'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Fast5Archive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Class describing a FAST5 archive
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5Archive().sniff(fname)
True
- file_ext = 'fast5.tar'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Fast5ArchiveGz(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Fast5Archive
Class describing a gzip-compressed FAST5 archive
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveGz().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveGz().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveGz().sniff(fname)
False
- file_ext = 'fast5.tar.gz'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Fast5ArchiveBz2(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Fast5Archive
Class describing a bzip2-compressed FAST5 archive
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.fast5.tar.bz2')
>>> Fast5ArchiveBz2().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar.gz')
>>> Fast5ArchiveBz2().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> Fast5ArchiveBz2().sniff(fname)
False
- file_ext = 'fast5.tar.bz2'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'fast5_count': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
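The three FAST5 archive classes differ only in how the tar file is compressed. One way such archives can be told apart is by their leading magic bytes. The sketch below uses only the standard library; `tar_compression`, `count_fast5_members`, and `make_archive` are hypothetical helpers, and treating the number of `.fast5` members as the kind of value a `fast5_count` field would hold is an assumption, not something this page states.

```python
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream
BZ2_MAGIC = b"BZh"        # first three bytes of any bzip2 stream


def tar_compression(data: bytes) -> str:
    """Classify archive bytes as 'gz', 'bz2', or 'none' by magic number."""
    if data.startswith(GZIP_MAGIC):
        return "gz"
    if data.startswith(BZ2_MAGIC):
        return "bz2"
    return "none"


def count_fast5_members(data: bytes) -> int:
    """Count regular files ending in .fast5 (assumed meaning of fast5_count)."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
        return sum(
            1 for member in tar.getmembers()
            if member.isfile() and member.name.endswith(".fast5")
        )


def make_archive(mode: str) -> bytes:
    """Build a tiny in-memory archive with one placeholder .fast5 member."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as tar:
        payload = b"placeholder"
        info = tarfile.TarInfo("reads/read_0.fast5")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


for mode in ("w", "w:gz", "w:bz2"):
    archive = make_archive(mode)
    print(mode, tar_compression(archive), count_fast5_members(archive))
```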
- class galaxy.datatypes.binary.SearchGuiArchive(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Class describing a SearchGUI archive
- file_ext = 'searchgui_archive'¶
- set_meta(dataset, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'searchgui_major_version': <galaxy.model.metadata.MetadataElementSpec object>, 'searchgui_version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.NetCDF(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Binary data in netCDF format
- file_ext = 'netcdf'¶
- edam_format = 'format_3650'¶
- edam_data = 'data_0943'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
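For orientation, netCDF files can be recognised from their first bytes. The sketch below reflects the published netCDF and HDF5 signatures; `netcdf_kind` is a hypothetical helper, and which of these variants Galaxy's own `sniff` accepts is not stated on this page.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # netCDF-4 files are HDF5 containers


def netcdf_kind(header: bytes) -> str:
    """Classify a file header as classic netCDF, HDF5-based, or unknown."""
    # Classic formats start with b"CDF" followed by a version byte:
    # 1 (classic), 2 (64-bit offset) or 5 (64-bit data).
    if header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02", b"\x05"):
        return "classic"
    if header[:8] == HDF5_SIGNATURE:
        return "hdf5-based"
    return "unknown"


print(netcdf_kind(b"CDF\x01\x00\x00\x00\x00"))
print(netcdf_kind(b"plain text"))
```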
- class galaxy.datatypes.binary.Dcd(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a dcd file from the CHARMM molecular simulation program
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_glucose_vacuum.dcd')
>>> Dcd().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Dcd().sniff(fname)
False
- file_ext = 'dcd'¶
- edam_data = 'data_3842'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Vel(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a velocity file from the CHARMM molecular simulation program
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test_charmm.vel')
>>> Vel().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Vel().sniff(fname)
False
- file_ext = 'vel'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.DAA(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a DAA (DIAMOND alignment archive) file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.daa')
>>> DAA().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DAA().sniff(fname)
False
- file_ext = 'daa'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.RMA6(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an RMA6 (MEGAN6 read-match archive) file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond.rma6')
>>> RMA6().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> RMA6().sniff(fname)
False
- file_ext = 'rma6'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.DMND(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing a DMND file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('diamond_db.dmnd')
>>> DMND().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> DMND().sniff(fname)
False
- file_ext = 'dmnd'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.ICM(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an ICM (interpolated context model) file, used by Glimmer
- file_ext = 'icm'¶
- edam_data = 'data_0950'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Parquet(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an Apache Parquet file (https://parquet.apache.org/)
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('example.parquet')
>>> Parquet().sniff(fname)
True
>>> fname = get_test_fname('test.mz5')
>>> Parquet().sniff(fname)
False
- file_ext = 'parquet'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
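The Parquet format specification places the 4-byte marker `PAR1` at both the start and the end of an unencrypted file, which makes a cheap content check possible. The following is a minimal sketch under that assumption; `has_parquet_magic` is a hypothetical helper and is not taken from Galaxy's `sniff` implementation.

```python
PARQUET_MAGIC = b"PAR1"


def has_parquet_magic(data: bytes) -> bool:
    """Check for the PAR1 marker at both ends of the byte string."""
    # Smallest plausible layout: magic (4) + footer length (4) + magic (4).
    if len(data) < 12:
        return False
    return data[:4] == PARQUET_MAGIC and data[-4:] == PARQUET_MAGIC


# Fabricated bytes for illustration; this is not a readable Parquet file.
fake_file = PARQUET_MAGIC + b"\x00" * 8 + PARQUET_MAGIC
print(has_parquet_magic(fake_file))
```

For large files only the first and last four bytes need to be read, for example by seeking to offset -4 from the end.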
- class galaxy.datatypes.binary.BafTar(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Base class for common behavior of tar files of directory-based raw file formats
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> BafTar().sniff(fname)
True
>>> fname = get_test_fname('test.fast5.tar')
>>> BafTar().sniff(fname)
False
- edam_data = 'data_2536'¶
- edam_format = 'format_3712'¶
- file_ext = 'brukerbaf.d.tar'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.YepTar(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BafTar
A tar’d up .d directory containing Agilent/Bruker YEP format data
- file_ext = 'agilentbrukeryep.d.tar'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.TdfTar(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BafTar
A tar’d up .d directory containing Bruker TDF format data
- file_ext = 'brukertdf.d.tar'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.MassHunterTar(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BafTar
A tar’d up .d directory containing Agilent MassHunter format data
- file_ext = 'agilentmasshunter.d.tar'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.MassLynxTar(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BafTar
A tar’d up .d directory containing Waters MassLynx format data
- file_ext = 'watersmasslynx.raw.tar'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.WiffTar(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.BafTar
A tar’d up .wiff/.scan pair containing Sciex WIFF format data
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('some.wiff.tar')
>>> WiffTar().sniff(fname)
True
>>> fname = get_test_fname('brukerbaf.d.tar')
>>> WiffTar().sniff(fname)
False
>>> fname = get_test_fname('test.fast5.tar')
>>> WiffTar().sniff(fname)
False
- file_ext = 'wiff.tar'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.Pretext(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
PretextMap contact map file.
Try to guess if the file is a Pretext file.
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('sample.pretext')
>>> Pretext().sniff(fname)
True
- file_ext = 'pretext'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.binary.JP2(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
JPEG 2000 binary image format
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.jp2')
>>> JP2().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> JP2().sniff(fname)
False
- file_ext = 'jp2'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
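JPEG 2000 (JP2) files begin with a fixed 12-byte signature box, so a content check can be very small. This is a sketch of the general idea only; `has_jp2_signature` is a hypothetical helper, and Galaxy's `JP2` class may perform its check differently.

```python
# JP2 signature box: length 0x0000000C, type "jP  ", content 0x0D0A870A.
JP2_SIGNATURE = bytes.fromhex("0000000c6a5020200d0a870a")


def has_jp2_signature(header: bytes) -> bool:
    """Return True if the first 12 bytes match the JP2 signature box."""
    return header[:12] == JP2_SIGNATURE


print(has_jp2_signature(JP2_SIGNATURE + b"rest of file"))
print(has_jp2_signature(b"chr1\t100\t200\n"))
```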
- class galaxy.datatypes.binary.Npz(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.CompressedArchive
Class describing a NumPy NPZ file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> Npz().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> Npz().sniff(fname)
False
- file_ext = 'npz'¶
- set_meta(dataset, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
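An NPZ file is a zip archive whose members are `.npy` arrays, so the array names can be listed without loading any data. The sketch below uses only `zipfile`; `npz_array_names` and `make_fake_npz` are hypothetical helpers, and mapping the result onto the `files` and `nfiles` metadata fields is an assumption about what those fields contain.

```python
import io
import zipfile
from typing import List


def npz_array_names(data: bytes) -> List[str]:
    """List array names in an NPZ archive (member names minus '.npy')."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]


def make_fake_npz() -> bytes:
    """Build a zip with NPZ-style member names; payloads are placeholders."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("omegas.npy", b"placeholder")
        archive.writestr("shape.npy", b"placeholder")
    return buffer.getvalue()


names = npz_array_names(make_fake_npz())
print(names, len(names))
```

With NumPy installed, `numpy.load(path).files` returns the same kind of name list for a real archive.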
- class galaxy.datatypes.binary.HexrdImagesNpz(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Npz
Class describing an HEXRD Images NumPy NPZ file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdImagesNpz().sniff(fname)
True
>>> fname = get_test_fname('eta_ome.npz')
>>> HexrdImagesNpz().sniff(fname)
False
- file_ext = 'hexrd.images.npz'¶
- set_meta(dataset, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object>, 'omegas': <galaxy.model.metadata.MetadataElementSpec object>, 'panel_id': <galaxy.model.metadata.MetadataElementSpec object>, 'shape': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.binary.HexrdEtaOmeNpz(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Npz
Class describing an HEXRD Eta-Ome NumPy NPZ file
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('hexrd.eta_ome.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
True
>>> fname = get_test_fname('hexrd.images.npz')
>>> HexrdEtaOmeNpz().sniff(fname)
False
- file_ext = 'hexrd.eta_ome.npz'¶
- set_meta(dataset, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'HKLs': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'files': <galaxy.model.metadata.MetadataElementSpec object>, 'nfiles': <galaxy.model.metadata.MetadataElementSpec object>, 'nframes': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.blast module¶
NCBI BLAST datatypes.
Covers the blastxml format and the BLAST databases.
- class galaxy.datatypes.blast.BlastXml(**kwd)[source]¶
Bases:
galaxy.datatypes.xml.GenericXml
NCBI Blast XML Output data
- file_ext = 'blastxml'¶
- edam_format = 'format_3331'¶
- edam_data = 'data_0857'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Determines whether the file is blastxml
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('megablast_xml_parser_test1.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('tblastn_four_human_vs_rhodopsin.blastxml')
>>> BlastXml().sniff(fname)
True
>>> fname = get_test_fname('interval.interval')
>>> BlastXml().sniff(fname)
False
- static merge(split_files, output_file)[source]¶
Merging multiple XML files is non-trivial and must be done in subclasses.
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
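To illustrate what deciding "whether the file is blastxml" can involve, the sketch below inspects the first three non-empty lines of a document. It assumes the layout commonly produced by NCBI BLAST+ (XML declaration, a `BlastOutput` DOCTYPE, then the root element); `looks_like_blastxml` is a hypothetical helper and the exact rules used by `BlastXml.sniff_prefix` are not given on this page.

```python
def looks_like_blastxml(text: str) -> bool:
    """Heuristic check of the opening lines of a BLAST XML document."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 3:
        return False
    return (
        lines[0].startswith("<?xml")
        and lines[1].startswith("<!DOCTYPE BlastOutput")
        and lines[2] == "<BlastOutput>"
    )


sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" '
    '"http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">\n'
    "<BlastOutput>\n"
)
print(looks_like_blastxml(sample))
print(looks_like_blastxml("chr1\t100\t200\n"))
```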
- class galaxy.datatypes.blast.BlastNucDb(**kwd)[source]¶
Bases:
galaxy.datatypes.blast._BlastDb
Class for nucleotide BLAST database files.
- file_ext = 'blastdbn'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.blast.BlastProtDb(**kwd)[source]¶
Bases:
galaxy.datatypes.blast._BlastDb
Class for protein BLAST database files.
- file_ext = 'blastdbp'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.blast.BlastDomainDb(**kwd)[source]¶
Bases:
galaxy.datatypes.blast._BlastDb
Class for domain BLAST database files.
- file_ext = 'blastdbd'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.blast.LastDb(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Data
Class for LAST database files.
- file_ext = 'lastdb'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.blast.BlastNucDb5(**kwd)[source]¶
Bases:
galaxy.datatypes.blast._BlastDb
Class for nucleotide BLAST database files.
- file_ext = 'blastdbn5'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.blast.BlastProtDb5(**kwd)[source]¶
Bases:
galaxy.datatypes.blast._BlastDb
Class for protein BLAST database files.
- file_ext = 'blastdbp5'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.blast.BlastDomainDb5(**kwd)[source]¶
Bases:
galaxy.datatypes.blast._BlastDb
Class for domain BLAST database files.
- file_ext = 'blastdbd5'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.checkers module¶
Module proxies galaxy.util.checkers for backward compatibility.
External datatypes may make use of these functions.
- galaxy.datatypes.checkers.check_bz2(file_path: str, check_content: bool = True) → Tuple[bool, bool] [source]¶
- galaxy.datatypes.checkers.check_gzip(file_path: str, check_content: bool = True) → Tuple[bool, bool] [source]¶
- galaxy.datatypes.checkers.check_html(name, file_path: bool = True) → bool [source]¶
Returns True if the file/string contains HTML code.
- galaxy.datatypes.checkers.check_image(file_path: str)[source]¶
Simple wrapper around image_type to yield a True/False verdict
galaxy.datatypes.chrominfo module¶
- class galaxy.datatypes.chrominfo.ChromInfo(**kwd)[source]¶
Bases:
galaxy.datatypes.tabular.Tabular
- file_ext = 'len'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'chrom': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'length': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.constructive_solid_geometry module¶
Constructive Solid Geometry file formats.
- class galaxy.datatypes.constructive_solid_geometry.Ply(**kwd)[source]¶
Bases:
object
The PLY format describes an object as a collection of vertices, faces and other elements, along with properties such as color and normal direction that can be attached to these elements. A PLY file contains the description of exactly one object.
- subtype = ''¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
The structure of a typical PLY file: Header, Vertex List, Face List, (lists of other elements)
- sniff(filename)¶
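The header portion of that structure is plain text in both the ASCII and binary variants, so the element counts can be read without touching the vertex or face data. The following sketch follows the public PLY format description; `parse_ply_header` is a hypothetical helper and is not Galaxy's parser, and the sample header is made up.

```python
from typing import Dict, Optional, Union

HeaderInfo = Dict[str, Union[Optional[str], Dict[str, int]]]


def parse_ply_header(text: str) -> HeaderInfo:
    """Extract the format keyword and element counts from a PLY header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "ply":
        raise ValueError("missing 'ply' magic line")
    file_format: Optional[str] = None
    elements: Dict[str, int] = {}
    for line in lines[1:]:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "format" and len(parts) >= 2:
            file_format = parts[1]  # e.g. "ascii" or "binary_little_endian"
        elif parts[0] == "element" and len(parts) == 3:
            elements[parts[1]] = int(parts[2])  # e.g. vertex -> 8
        elif parts[0] == "end_header":
            break
    return {"file_format": file_format, "elements": elements}


header = """ply
format ascii 1.0
element vertex 8
property float x
property float y
property float z
element face 6
property list uchar int vertex_index
end_header
"""
print(parse_ply_header(header))
```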
- class galaxy.datatypes.constructive_solid_geometry.PlyAscii(**kwd)[source]¶
Bases:
galaxy.datatypes.constructive_solid_geometry.Ply, galaxy.datatypes.data.Text
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.plyascii')
>>> PlyAscii().sniff(fname)
True
>>> fname = get_test_fname('test.vtkascii')
>>> PlyAscii().sniff(fname)
False
- file_ext = 'plyascii'¶
- subtype = 'ascii'¶
- metadata_spec: metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'face': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.PlyBinary(**kwd)[source]¶
Bases:
galaxy.datatypes.constructive_solid_geometry.Ply, galaxy.datatypes.binary.Binary
- file_ext = 'plybinary'¶
- subtype = 'binary'¶
- metadata_spec: metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'face': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'other_elements': <galaxy.model.metadata.MetadataElementSpec object>, 'vertex': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.Vtk(**kwd)[source]¶
Bases:
object
The Visualization Toolkit provides a number of source and writer objects to read and write popular data file formats. The Visualization Toolkit also provides some of its own file formats.
There are two different styles of file formats available in VTK. The simplest are the legacy, serial formats that are easy to read and write either by hand or programmatically. However, these formats are less flexible than the XML based file formats which support random access, parallel I/O, and portable data compression and are preferred to the serial VTK file formats whenever possible.
All keyword phrases are written in ASCII form whether the file is binary or ASCII. The binary section of the file (if in binary form) is the data proper; i.e., the numbers that define point coordinates, scalars, cell indices, and so forth.
Binary data must be placed into the file immediately after the newline (‘\n’) character from the previous ASCII keyword and parameter sequence.
TODO: only legacy formats are currently supported and support for XML formats should be added.
- subtype = ''¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
VTK files can be either ASCII or binary, with two different styles of file formats: legacy or XML. We’ll assume if the file contains a valid VTK header, then it is a valid VTK file.
- set_structure_metadata(line, dataset, dataset_type)[source]¶
The fourth part of legacy VTK files is the dataset structure. The geometry part describes the geometry and topology of the dataset. This part begins with a line containing the keyword DATASET followed by a keyword describing the type of dataset. Then, depending upon the type of dataset, other keyword/data combinations define the actual data.
- sniff(filename)¶
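The leading parts of a legacy VTK file, up to the DATASET line mentioned above, can be read with a small amount of string handling. This sketch follows the published legacy file layout (version line, title, ASCII or BINARY, then DATASET); `parse_legacy_vtk_header` is a hypothetical helper, the sample text is made up, and Galaxy's own metadata extraction may differ.

```python
from typing import Dict, Optional


def parse_legacy_vtk_header(text: str) -> Dict[str, Optional[str]]:
    """Read version, title, encoding and dataset type from a legacy VTK file."""
    lines = text.splitlines()
    if len(lines) < 4 or not lines[0].startswith("# vtk DataFile Version"):
        raise ValueError("not a legacy VTK header")
    file_format = lines[2].strip().upper()
    if file_format not in ("ASCII", "BINARY"):
        raise ValueError("third line must be ASCII or BINARY")
    dataset_type = None
    for line in lines[3:]:
        parts = line.split()
        # Fourth part: the DATASET keyword followed by the dataset type.
        if len(parts) >= 2 and parts[0] == "DATASET":
            dataset_type = parts[1]
            break
    return {
        "vtk_version": lines[0].split()[-1],
        "title": lines[1],
        "file_format": file_format,
        "dataset_type": dataset_type,
    }


sample = """# vtk DataFile Version 3.0
Cube example
ASCII
DATASET POLYDATA
POINTS 8 float
"""
print(parse_legacy_vtk_header(sample))
```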
- class galaxy.datatypes.constructive_solid_geometry.VtkAscii(**kwd)[source]¶
Bases:
galaxy.datatypes.constructive_solid_geometry.Vtk, galaxy.datatypes.data.Text
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.vtkascii')
>>> VtkAscii().sniff(fname)
True
>>> fname = get_test_fname('test.vtkbinary')
>>> VtkAscii().sniff(fname)
False
- file_ext = 'vtkascii'¶
- subtype = 'ASCII'¶
- metadata_spec: metadata.MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dataset_type': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimensions': <galaxy.model.metadata.MetadataElementSpec object>, 'field_components': <galaxy.model.metadata.MetadataElementSpec object>, 'field_names': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'lines': <galaxy.model.metadata.MetadataElementSpec object>, 'origin': <galaxy.model.metadata.MetadataElementSpec object>, 'points': <galaxy.model.metadata.MetadataElementSpec object>, 'polygons': <galaxy.model.metadata.MetadataElementSpec object>, 'spacing': <galaxy.model.metadata.MetadataElementSpec object>, 'triangle_strips': <galaxy.model.metadata.MetadataElementSpec object>, 'vertices': <galaxy.model.metadata.MetadataElementSpec object>, 'vtk_version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.VtkBinary(**kwd)[source]¶
Bases:
galaxy.datatypes.constructive_solid_geometry.Vtk, galaxy.datatypes.binary.Binary
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.vtkbinary')
>>> VtkBinary().sniff(fname)
True
>>> fname = get_test_fname('test.vtkascii')
>>> VtkBinary().sniff(fname)
False
- file_ext = 'vtkbinary'¶
- subtype = 'BINARY'¶
- metadata_spec: metadata.MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'dataset_type': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimensions': <galaxy.model.metadata.MetadataElementSpec object>, 'field_components': <galaxy.model.metadata.MetadataElementSpec object>, 'field_names': <galaxy.model.metadata.MetadataElementSpec object>, 'file_format': <galaxy.model.metadata.MetadataElementSpec object>, 'lines': <galaxy.model.metadata.MetadataElementSpec object>, 'origin': <galaxy.model.metadata.MetadataElementSpec object>, 'points': <galaxy.model.metadata.MetadataElementSpec object>, 'polygons': <galaxy.model.metadata.MetadataElementSpec object>, 'spacing': <galaxy.model.metadata.MetadataElementSpec object>, 'triangle_strips': <galaxy.model.metadata.MetadataElementSpec object>, 'vertices': <galaxy.model.metadata.MetadataElementSpec object>, 'vtk_version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.STL(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Data
- file_ext = 'stl'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.NeperTess(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Neper Tessellation File
Example:
***tess
 **format
   format
 **general
   dim type
 **cell
   number_of_cells
- file_ext = 'neper.tess'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Neper tess format, starts with ***tess
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.neper.tess')
>>> NeperTess().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tesr')
>>> NeperTess().sniff(fname)
False
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
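The keyword layout in the NeperTess example (`***tess`, `**format`, `**general`, `**cell`) lends itself to a simple token-based reader. The sketch below is one hedged way to pull out the values that follow each keyword; `parse_tess_header` is a hypothetical helper, the sample values are invented, and the mapping onto the `format`, `dimension`, and `cells` metadata fields is an assumption.

```python
from typing import Dict, List, Union


def parse_tess_header(text: str) -> Dict[str, Union[str, int]]:
    """Read format, dimension, type and cell count from a .tess header."""
    tokens = text.split()
    if not tokens or tokens[0] != "***tess":
        raise ValueError("file does not start with ***tess")

    def values_after(keyword: str, count: int) -> List[str]:
        # list.index raises ValueError if the keyword is missing.
        start = tokens.index(keyword) + 1
        return tokens[start:start + count]

    (fmt,) = values_after("**format", 1)
    dim, cell_type = values_after("**general", 2)
    (cells,) = values_after("**cell", 1)
    return {
        "format": fmt,
        "dimension": int(dim),
        "type": cell_type,
        "cells": int(cells),
    }


# Invented sample values, for illustration only.
sample = """***tess
 **format
   3.4
 **general
   3 standard
 **cell
   10
"""
print(parse_tess_header(sample))
```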
- class galaxy.datatypes.constructive_solid_geometry.NeperTesr(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Neper Raster Tessellation File
Example:
***tesr
 **format
   format
 **general
   dimension
   size_x size_y [size_z]
   voxsize_x voxsize_y [voxsize_z]
  [*origin
   origin_x origin_y [origin_z]]
  [*hasvoid has_void]
 [**cell
   number_of_cells
- file_ext = 'neper.tesr'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Neper tesr format, starts with ***tesr
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.neper.tesr')
>>> NeperTesr().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tess')
>>> NeperTesr().sniff(fname)
False
- set_meta(dataset, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'cells': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>, 'origin': <galaxy.model.metadata.MetadataElementSpec object>, 'size': <galaxy.model.metadata.MetadataElementSpec object>, 'voxsize': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.constructive_solid_geometry.NeperPoints(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Neper Position File.
Neper position format has 1 to 3 floats per line, separated by whitespace.
- file_ext = 'neper.points'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
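The "1 to 3 floats per line" rule for Neper position files translates directly into a validation loop. The sketch below is illustrative only; `points_dimension` is a hypothetical helper, and requiring every line to have the same number of values is an added assumption that the description does not state.

```python
from typing import Optional


def points_dimension(text: str) -> Optional[int]:
    """Return the number of floats per line (1-3), or None if no data lines."""
    dimension = None
    for line in text.splitlines():
        if not line.strip():
            continue
        values = [float(field) for field in line.split()]  # ValueError if not numeric
        if not 1 <= len(values) <= 3:
            raise ValueError("expected 1 to 3 floats per line")
        if dimension is None:
            dimension = len(values)
        elif len(values) != dimension:
            # Assumption: all points share one dimension.
            raise ValueError("inconsistent number of coordinates")
    return dimension


print(points_dimension("0.1 0.2 0.3\n0.4 0.5 0.6\n"))
```

Splitting on `"\t"` instead of arbitrary whitespace would give the stricter tab-separated variant.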
- class galaxy.datatypes.constructive_solid_geometry.NeperPointsTabular(**kwd)[source]¶
Bases:
galaxy.datatypes.constructive_solid_geometry.NeperPoints, galaxy.datatypes.tabular.Tabular
Neper Position File.
Neper position format has 1 to 3 floats per line, separated by TABs.
- file_ext = 'neper.points.tsv'¶
- set_meta(dataset, **kwd)[source]¶
Tries to determine the number of columns as well as those columns that contain numerical values in the dataset.
A skip parameter is used because various tabular data types reuse this function, and their data type classes are responsible for determining how many invalid comment lines should be skipped. Using None for skip will cause skip to be zero, but the first line will be processed as a header.
A max_data_lines parameter is used because various tabular data types reuse this function, and their data type classes are responsible for determining how many data lines should be processed to ensure that the non-optional metadata parameters are properly set; if used, optional metadata parameters will be set to None, unless the entire file has already been read. Using None for max_data_lines will process all data lines.
Items of interest:
We treat ‘overwrite’ as always True (we always want to set tabular metadata when called).
If a tabular file has no data, it will have one column of type ‘str’.
We used to check only the first 100 lines when setting metadata and this class’s set_peek() method read the entire file to determine the number of lines in the file. Since metadata can now be processed on cluster nodes, we’ve merged the line count portion of the set_peek() processing here, and we now check the entire contents of the file.
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'dimension': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.NeperMultiScaleCell(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Neper Multiscale Cell File
- file_ext = 'neper.mscell'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.GmshMsh(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Gmsh Mesh File
- file_ext = 'gmsh.msh'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Gmsh msh format, starts with
$MeshFormat
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname('test.gmsh.msh')
>>> GmshMsh().sniff(fname)
True
>>> fname = get_test_fname('test.neper.tesr')
>>> GmshMsh().sniff(fname)
False
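As a rough, standalone illustration of the check described above (not the Galaxy code itself), a prefix test on the leading text of the file is enough. The function name `looks_like_gmsh_msh` is hypothetical.

```python
def looks_like_gmsh_msh(prefix: str) -> bool:
    """Return True when the leading text of a file starts with $MeshFormat."""
    return prefix.startswith("$MeshFormat")
```

Only the first few bytes are inspected, so the check stays cheap even for very large meshes.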
- set_meta(dataset, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'format': <galaxy.model.metadata.MetadataElementSpec object>, 'version': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.constructive_solid_geometry.GmshGeo(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Gmsh geometry File
- file_ext = 'gmsh.geo'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.constructive_solid_geometry.ZsetGeof(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Z-set geof File
- file_ext = 'zset.geof'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.coverage module¶
Coverage datatypes
- class galaxy.datatypes.coverage.LastzCoverage(**kwd)[source]¶
Bases:
galaxy.datatypes.tabular.Tabular
- file_ext = 'coverage'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'chromCol': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'forwardCol': <galaxy.model.metadata.MetadataElementSpec object>, 'positionCol': <galaxy.model.metadata.MetadataElementSpec object>, 'reverseCol': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
galaxy.datatypes.data module¶
- class galaxy.datatypes.data.DataMeta(name, bases, namespace, **kwargs)[source]¶
Bases:
abc.ABCMeta
Metaclass for Data class. Sets up metadata spec.
- class galaxy.datatypes.data.Data(**kwd)[source]¶
Bases:
object
Base class for all datatypes. Implements basic interfaces as well as class methods for metadata.
>>> class DataTest( Data ):
...     MetadataElement( name="test" )
...
>>> DataTest.metadata_spec.test.name
'test'
>>> DataTest.metadata_spec.test.desc
'test'
>>> type( DataTest.metadata_spec.test.param )
<class 'galaxy.model.metadata.MetadataParameter'>
- edam_data = 'data_0006'¶
- edam_format = 'format_1915'¶
- file_ext = 'data'¶
- CHUNKABLE = False¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- copy_safe_peek = True¶
- is_binary = True¶
- primary_file_name = 'index'¶
- dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>}¶
- classmethod is_datatype_change_allowed()[source]¶
Returns the value of the allow_datatype_change class attribute if set in a subclass, or True iff the datatype is not composite.
- get_raw_data(dataset)[source]¶
Returns the full data. To stream it open the file_name and read/write as needed
- dataset_content_needs_grooming(file_name)[source]¶
This function is called on an output dataset file after the content is initially generated.
- groom_dataset_content(file_name)[source]¶
This function is called on an output dataset file if dataset_content_needs_grooming returns True.
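The two grooming hooks work as a pair: the first decides, the second repairs. The sketch below shows that contract with a made-up datatype whose files must have sorted lines; `SortedLines` and `finish_output` are hypothetical names and do not exist in Galaxy.

```python
class SortedLines:
    """Hypothetical datatype whose files must contain sorted lines."""

    def dataset_content_needs_grooming(self, file_name: str) -> bool:
        # Grooming is needed only when the lines are out of order.
        with open(file_name) as handle:
            lines = handle.readlines()
        return lines != sorted(lines)

    def groom_dataset_content(self, file_name: str) -> None:
        # Rewrite the file in place with its lines sorted.
        with open(file_name) as handle:
            lines = sorted(handle.readlines())
        with open(file_name, "w") as handle:
            handle.writelines(lines)


def finish_output(datatype, file_name: str) -> None:
    """Stand-in for the caller: groom only when the datatype asks for it."""
    if datatype.dataset_content_needs_grooming(file_name):
        datatype.groom_dataset_content(file_name)
```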
- set_meta(dataset: Any, overwrite=True, **kwd)[source]¶
Unimplemented method, allows guessing of metadata from contents of file
- missing_meta(dataset, check=None, skip=None)[source]¶
Checks for empty metadata values. Returns False if no non-optional metadata is missing, and the missing metadata key otherwise. Specifying a list of ‘check’ values will only check those names provided; when used, optionality is ignored. Specifying a list of ‘skip’ items will return True even when a named metadata value is missing; when used, optionality is ignored.
- property max_optional_metadata_filesize¶
- to_archive(dataset, name='')[source]¶
Collect archive paths and file handles that need to be exported when archiving dataset.
- Parameters
dataset – HistoryDatasetAssociation
name – archive name, in collection context corresponds to collection name(s) and element_identifier, joined by ‘/’, e.g ‘fastq_collection/sample1/forward’
- display_data(trans, data, preview=False, filename=None, to_ext=None, **kwd)[source]¶
Displays data in central pane if preview is True, else handles download.
Datatypes should be very careful when overriding this method; this interface between datatypes and Galaxy will likely change.
TODO: Document alternatives to overriding this method (data providers?).
- display_as_markdown(dataset_instance, markdown_format_helpers)[source]¶
Prepare for embedding dataset into a basic Markdown document.
This is a somewhat experimental interface and should not be implemented on datatypes not tightly tied to a Galaxy version (e.g. datatypes in the Tool Shed).
Speaking very loosely: the datatype should load a bounded amount of data from the supplied dataset instance and prepare it for embedding into Markdown. This should be relatively vanilla Markdown; the result is bleached and should not contain nested Galaxy Markdown directives.
If the data cannot reasonably be displayed, just indicate this and do not throw an exception.
- repair_methods(dataset)[source]¶
Unimplemented method, returns dict with method/option for repairing errors
- add_display_app(app_id, label, file_function, links_function)[source]¶
Adds a display app to the datatype.
app_id is a unique id.
label is the primary display label, e.g., display at ‘UCSC’.
file_function is a string containing the name of the function that returns a properly formatted display.
links_function is a string containing the name of the function that returns a list of (link_name, link).
- as_display_type(dataset, type, **kwd)[source]¶
Returns modified file contents for a particular display type
- get_display_links(dataset, type, app, base_url, target_frame='_blank', **kwd)[source]¶
Returns a list of tuples of (name, link) for a particular display type. No check on ‘access’ permissions is done here - if you can view the dataset, you can also save it or send it to a destination outside of Galaxy, so Galaxy security restrictions do not apply anyway.
- get_converter_types(original_dataset, datatypes_registry)[source]¶
Returns available converters by type for this dataset
- find_conversion_destination(dataset, accepted_formats: List[str], datatypes_registry, **kwd) Tuple[bool, Optional[str], Optional[DatasetInstance]] [source]¶
Returns ( direct_match, converted_ext, existing converted dataset )
- convert_dataset(trans, original_dataset, target_type, return_output=False, visible=True, deps=None, target_context=None, history=None)[source]¶
This function adds a job to the queue to convert a dataset to another type. Returns a message about success/failure.
- after_setting_metadata(dataset)[source]¶
This function is called on the dataset after metadata is set.
- before_setting_metadata(dataset)[source]¶
This function is called on the dataset before metadata is set.
- property writable_files¶
- property has_resolution¶
- matches_any(target_datatypes: List[Any]) bool [source]¶
Check if this datatype is of any of the target_datatypes or is a subtype thereof.
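One plausible way to express this check is a single isinstance() call over the target classes. The sketch below uses stand-in classes (`Data`, `Text`, `Tabular`, `Binary` are redefined locally, not imported from Galaxy) and assumes targets may be given either as classes or as instances.

```python
from inspect import isclass
from typing import Any, List


# Local stand-ins for a small datatype hierarchy (not the Galaxy classes).
class Data: ...
class Text(Data): ...
class Tabular(Text): ...
class Binary(Data): ...


def matches_any(datatype: Any, target_datatypes: List[Any]) -> bool:
    """True if datatype is an instance of any target class or a subclass of one."""
    # Accept both classes and instances as targets.
    classes = tuple(t if isclass(t) else type(t) for t in target_datatypes)
    return isinstance(datatype, classes)
```

The check is one-directional: a `Tabular` matches `Text`, but a `Text` does not match `Tabular`.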
- static merge(split_files, output_file)[source]¶
Merges files with shutil.copyfileobj(), which does not hit the maximum argument limitation of cat. gz and bz2 files also work.
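A minimal sketch of this kind of merge, assuming the standard library's shutil.copyfileobj() is the copying primitive; the function name `merge_files` is chosen for the example.

```python
import shutil
from typing import Iterable


def merge_files(split_files: Iterable[str], output_file: str) -> None:
    """Concatenate split_files, in order, into output_file."""
    with open(output_file, "wb") as out:
        for path in split_files:
            with open(path, "rb") as part:
                # Streams in chunks, so neither file size nor the number of
                # parts runs into a command-line argument limit.
                shutil.copyfileobj(part, out)
```

Byte-level concatenation also works for gzip parts because a sequence of gzip members is itself a valid gzip stream.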
- class galaxy.datatypes.data.Text(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Data
- edam_format = 'format_2330'¶
- file_ext = 'txt'¶
- line_class = 'line'¶
- is_binary = False¶
- estimate_file_lines(dataset)[source]¶
Perform a rough estimate by extrapolating number of lines from a small read.
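The extrapolation can be illustrated as follows. This is a sketch under stated assumptions: the sample size, the standalone signature taking a path, and the fallback for a sample with no newline are choices made for the example, not Galaxy's values.

```python
import os


def estimate_file_lines(path: str, sample_size: int = 1024 * 1024) -> int:
    """Estimate the number of lines in path from its first sample_size bytes."""
    total_size = os.path.getsize(path)
    if total_size == 0:
        return 0
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    sample_lines = sample.count(b"\n")
    if len(sample) == total_size:
        # The whole file fit in the sample, so the count is exact;
        # add one for an unterminated last line.
        return sample_lines + (0 if sample.endswith(b"\n") else 1)
    if sample_lines == 0:
        return 1  # assumption: treat one very long line as a single line
    # Scale the sampled line density up to the full file size.
    return sample_lines * total_size // len(sample)
```

The estimate is only as good as the assumption that line lengths in the sample are representative of the rest of the file.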
- count_data_lines(dataset)[source]¶
Count the number of lines of data in dataset, skipping all blank lines and comments.
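A small sketch of the counting rule, assuming `#` marks a comment line; the real method works on a dataset object, whereas this version takes any iterable of lines.

```python
from typing import Iterable


def count_data_lines(lines: Iterable[str], comment_char: str = "#") -> int:
    """Count lines that are neither blank nor comments."""
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_char):
            count += 1
    return count
```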
- set_peek(dataset, line_count=None, WIDTH=256, skipchars=None, line_wrap=True, **kwd)[source]¶
Set the peek. This method is used by various subclasses of Text.
- classmethod split(input_datasets, subdir_generator_function, split_params)[source]¶
Split the input files by line.
- line_dataprovider(dataset, **settings)[source]¶
Returns an iterator over the dataset’s lines (that have been stripped) optionally excluding blank lines and lines that start with a comment character.
- regex_line_dataprovider(dataset, **settings)[source]¶
Returns an iterator over the dataset’s lines optionally including/excluding lines that match one or more regex filters.
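Both providers are, in essence, filtering generators. The sketch below shows the idea with plain Python generators; the names `iter_lines` and `iter_regex_lines` and their keyword arguments are invented for illustration and are not the data provider settings Galaxy accepts.

```python
import re
from typing import Iterable, Iterator, Sequence


def iter_lines(lines: Iterable[str], skip_blank: bool = True, comment_char: str = "#") -> Iterator[str]:
    """Yield stripped lines, optionally dropping blank and comment lines."""
    for line in lines:
        line = line.strip()
        if skip_blank and not line:
            continue
        if comment_char and line.startswith(comment_char):
            continue
        yield line


def iter_regex_lines(lines: Iterable[str], patterns: Sequence[str], invert: bool = False) -> Iterator[str]:
    """Yield lines matching any pattern, or only non-matching lines if invert is set."""
    compiled = [re.compile(p) for p in patterns]
    for line in lines:
        matched = any(rx.search(line) for rx in compiled)
        if matched != invert:
            yield line
```

Because both are generators, they can be chained without reading the whole dataset into memory.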
- dataproviders: Dict[str, Any] = {'base': <function Data.base_dataprovider>, 'chunk': <function Data.chunk_dataprovider>, 'chunk64': <function Data.chunk64_dataprovider>, 'line': <function Text.line_dataprovider>, 'regex-line': <function Text.regex_line_dataprovider>}¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.data.Directory(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Data
Class representing a directory of files.
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.data.GenericAsn1(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Class for generic ASN.1 text format
- edam_data = 'data_0849'¶
- edam_format = 'format_1966'¶
- file_ext = 'asn1'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.data.LineCount(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Dataset contains a single line with a single integer that denotes the line count for a related dataset. Used for custom builds.
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.data.Newick(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
New Hampshire/Newick Format
- edam_data = 'data_0872'¶
- edam_format = 'format_1910'¶
- file_ext = 'newick'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.data.Nexus(**kwd)[source]¶
Bases:
galaxy.datatypes.data.Text
Nexus format as used by PAUP, MrBayes, etc.
- edam_data = 'data_0872'¶
- edam_format = 'format_1912'¶
- file_ext = 'nex'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Every Nexus file simply has ‘#NEXUS’ on its first line.
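A standalone sketch of that test follows. The function name is hypothetical, and tolerating surrounding whitespace and lower-case spellings is an assumption; the text above does not say how strict the real sniffer is.

```python
def looks_like_nexus(prefix: str) -> bool:
    """Return True when the first line of prefix begins with #NEXUS."""
    first_line = prefix.split("\n", 1)[0]
    # Assumption: ignore surrounding whitespace and letter case.
    return first_line.strip().upper().startswith("#NEXUS")
```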
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- galaxy.datatypes.data.get_file_peek(file_name, WIDTH=256, LINE_COUNT=5, skipchars=None, line_wrap=True)[source]¶
Returns the first LINE_COUNT lines wrapped to WIDTH.
>>> def assert_peek_is(file_name, expected, *args, **kwd):
...     path = get_test_fname(file_name)
...     peek = get_file_peek(path, *args, **kwd)
...     assert peek == expected, "%s != %s" % (peek, expected)
>>> assert_peek_is('0_nonewline', u'0')
>>> assert_peek_is('0.txt', u'0\n')
>>> assert_peek_is('4.bed', u'chr22\t30128507\t31828507\tuc003bnx.1_cds_2_0_chr22_29227_f\t0\t+\n', LINE_COUNT=1)
>>> assert_peek_is('1.bed', u'chr1\t147962192\t147962580\tCCDS989.1_cds_0_0_chr1_147962193_r\t0\t-\nchr1\t147984545\t147984630\tCCDS990.1_cds_0_0_chr1_147984546_f\t0\t+\n', LINE_COUNT=2)
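The wrapping behaviour can be sketched independently of Galaxy. The simplified function below, with the hypothetical name `peek_lines`, returns a list of display lines rather than a single string, and it ignores the skipchars and line_wrap options of the real function.

```python
from typing import Iterable, List


def peek_lines(lines: Iterable[str], width: int = 256, line_count: int = 5) -> List[str]:
    """Return at most line_count display lines, none longer than width."""
    out: List[str] = []
    if line_count <= 0:
        return out
    for line in lines:
        text = line.rstrip("\r\n")
        # Wrap an over-long line into width-sized pieces; keep empty lines.
        pieces = [text[i:i + width] for i in range(0, len(text), width)] or [""]
        for piece in pieces:
            out.append(piece)
            if len(out) == line_count:
                return out
    return out
```

In this sketch each wrapped piece counts toward the limit, so a single long line can fill the whole peek.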
galaxy.datatypes.flow module¶
Flow analysis datatypes.
- class galaxy.datatypes.flow.FCS(**kwd)[source]¶
Bases:
galaxy.datatypes.binary.Binary
Class describing an FCS binary file
- file_ext = 'fcs'¶
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Checking if the file is in FCS format. Should read FCS2.0, FCS3.0 and FCS3.1
Based on flowcore: https://github.com/RGLab/flowCore/blob/27141b792ad65ae8bd0aeeef26e757c39cdaefe7/R/IO.R#L667
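An FCS file begins with a six-character version identifier, so the core of such a check can be sketched as below. The function name is hypothetical, and the real sniffer, which follows flowCore, may validate more of the header than this.

```python
# Version identifiers named in the description above.
FCS_VERSIONS = (b"FCS2.0", b"FCS3.0", b"FCS3.1")


def looks_like_fcs(header: bytes) -> bool:
    """Return True when the first six bytes are a supported FCS version tag."""
    return header[:6] in FCS_VERSIONS
```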
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
galaxy.datatypes.genetics module¶
rgenetics datatypes. Use at your peril. Ross Lazarus, for the rgenetics and Galaxy projects.
Genome graphs datatypes are derived from Interval datatypes. Genome graphs datasets have a header row with appropriate column names. The first column is always the marker, e.g. column name = rs and first row = rs12345 if the rows are SNPs. Subsequent row values are all numeric! Will fail if any value is non-numeric (e.g. ‘+’ or ‘NA’). Ross Lazarus for rgenetics, August 20 2007.
- class galaxy.datatypes.genetics.GenomeGraphs(**kwd)[source]¶
Bases:
galaxy.datatypes.tabular.Tabular
Tab delimited data containing a marker id and any number of numeric values
- file_ext = 'gg'¶
- set_meta(dataset, **kwd)[source]¶
Tries to determine the number of columns as well as those columns that contain numerical values in the dataset. A skip parameter is used because various tabular data types reuse this function, and their data type classes are responsible to determine how many invalid comment lines should be skipped. Using None for skip will cause skip to be zero, but the first line will be processed as a header. A max_data_lines parameter is used because various tabular data types reuse this function, and their data type classes are responsible to determine how many data lines should be processed to ensure that the non-optional metadata parameters are properly set; if used, optional metadata parameters will be set to None, unless the entire file has already been read. Using None for max_data_lines will process all data lines.
Items of interest:
We treat ‘overwrite’ as always True (we always want to set tabular metadata when called).
If a tabular file has no data, it will have one column of type ‘str’.
We used to check only the first 100 lines when setting metadata and this class’s set_peek() method read the entire file to determine the number of lines in the file. Since metadata can now be processed on cluster nodes, we’ve merged the line count portion of the set_peek() processing here, and we now check the entire contents of the file.
- ucsc_links(dataset, type, app, base_url)[source]¶
From the ever-helpful Angie Hinrichs (angie@soe.ucsc.edu): a genome graphs call looks like this:
http://genome.ucsc.edu/cgi-bin/hgGenome?clade=mammal&org=Human&db=hg18&hgGenome_dataSetName=dname &hgGenome_dataSetDescription=test&hgGenome_formatType=best%20guess&hgGenome_markerType=best%20guess &hgGenome_columnLabels=best%20guess&hgGenome_maxVal=&hgGenome_labelVals= &hgGenome_maxGapToFill=25000000&hgGenome_uploadFile=http://galaxy.esphealth.org/datasets/333/display/index &hgGenome_doSubmitUpload=submit
Galaxy gives this for an interval file
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&position=chr1:1-1000&hgt.customText= http%3A%2F%2Fgalaxy.esphealth.org%2Fdisplay_as%3Fid%3D339%26display_app%3Ducsc
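For illustration, a genome graphs link of the first form can be assembled with the standard library. This is only a sketch: the function name `genome_graphs_url` and its defaults are made up, the parameter names are copied from the example URL above, and unlike that example the dataset URL is percent-encoded here.

```python
from urllib.parse import quote, urlencode


def genome_graphs_url(db: str, dataset_url: str, name: str = "dname", description: str = "test") -> str:
    """Build an hgGenome upload link using the parameters shown above."""
    params = [
        ("clade", "mammal"),
        ("org", "Human"),
        ("db", db),
        ("hgGenome_dataSetName", name),
        ("hgGenome_dataSetDescription", description),
        ("hgGenome_formatType", "best guess"),
        ("hgGenome_markerType", "best guess"),
        ("hgGenome_columnLabels", "best guess"),
        ("hgGenome_maxVal", ""),
        ("hgGenome_labelVals", ""),
        ("hgGenome_maxGapToFill", "25000000"),
        ("hgGenome_uploadFile", dataset_url),
        ("hgGenome_doSubmitUpload", "submit"),
    ]
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return "http://genome.ucsc.edu/cgi-bin/hgGenome?" + urlencode(params, quote_via=quote)
```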
- sniff_prefix(file_prefix: galaxy.datatypes.sniff.FilePrefix)[source]¶
Determines whether the file is in gg format
>>> from galaxy.datatypes.sniff import get_test_fname
>>> fname = get_test_fname( 'test_space.txt' )
>>> GenomeGraphs().sniff( fname )
False
>>> fname = get_test_fname( '1.gg' )
>>> GenomeGraphs().sniff( fname )
True
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>, 'markerCol': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- sniff(filename)¶
- class galaxy.datatypes.genetics.rgTabList(**kwd)[source]¶
Bases:
galaxy.datatypes.tabular.Tabular
for sampleid and for featureid lists of exclusions or inclusions in the clean tool featureid subsets on statistical criteria -> specialized display such as gg
- file_ext = 'rgTList'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.rgSampleList(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.rgTabList
for sampleid exclusions or inclusions in the clean tool output from QC eg excess het, gender error, ibd pair member,eigen outlier,excess mendel errors,… since they can be uploaded, should be flexible but they are persistent at least same infrastructure for expression?
- file_ext = 'rgSList'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.rgFeatureList(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.rgTabList
for featureid lists of exclusions or inclusions in the clean tool output from QC eg low maf, high missingness, bad hwe in controls, excess mendel errors,… featureid subsets on statistical criteria -> specialized display such as gg same infrastructure for expression?
- file_ext = 'rgFList'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Rgenetics(**kwd)[source]¶
Bases:
galaxy.datatypes.text.Html
base class to use for rgenetics datatypes derived from html - composite datatype elements stored in extra files path
- file_ext = 'rgenetics'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.SNPMatrix(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
BioC SNPMatrix Rgenetics data collections
- file_ext = 'snpmatrix'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Lped(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
linkage pedigree (ped,map) Rgenetics data collections
- file_ext = 'lped'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Pphe(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
Plink phenotype file - header must have FID IID… Rgenetics data collections
- file_ext = 'pphe'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Fphe(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
fbat pedigree file - mad format with ! as first char on header row Rgenetics data collections
- file_ext = 'fphe'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Phe(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
Phenotype file
- file_ext = 'phe'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Fped(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
FBAT pedigree format - single file, map is header row of rs numbers. Strange. Rgenetics data collections
- file_ext = 'fped'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Pbed(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
Plink Binary compressed 2bit/geno Rgenetics data collections
- file_ext = 'pbed'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.ldIndep(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
LD (a good measure of redundancy of information) depleted Plink Binary compressed 2bit/geno This is really a plink binary, but some tools work better with less redundancy so are constrained to these files
- file_ext = 'ldreduced'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Eigenstratgeno(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
Eigenstrat format - may be able to get rid of this if we move to shellfish Rgenetics data collections
- file_ext = 'eigenstratgeno'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Eigenstratpca(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
Eigenstrat PCA file for case control adjustment Rgenetics data collections
- file_ext = 'eigenstratpca'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Snptest(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.Rgenetics
BioC snptest Rgenetics data collections
- file_ext = 'snptest'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.IdeasPre(**kwd)[source]¶
Bases:
galaxy.datatypes.text.Html
This datatype defines the input format required by IDEAS: https://academic.oup.com/nar/article/44/14/6721/2468150
The IDEAS preprocessor tool produces an output using this format. The extra_files_path of the primary input dataset contains the following files and directories.
- chromosome_windows.txt (optional)
- chromosomes.bed (optional)
- IDEAS_input_config.txt
- compressed archived tmp directory containing a number of compressed bed files.
- file_ext = 'ideaspre'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'chrom_bed': <galaxy.model.metadata.MetadataElementSpec object>, 'chrom_windows': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'input_config': <galaxy.model.metadata.MetadataElementSpec object>, 'tmp_archive': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Pheno(**kwd)[source]¶
Bases:
galaxy.datatypes.tabular.Tabular
base class for pheno files
- file_ext = 'pheno'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'column_types': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'comment_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'delimiter': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.RexpBase(**kwd)[source]¶
Bases:
galaxy.datatypes.text.Html
base class for BioC data structures in Galaxy must be constructed with the pheno data in place since that goes into the metadata for each instance
- file_ext = 'rexpbase'¶
- html_table = None¶
- generate_primary_file(dataset=None)[source]¶
This is called only at upload to write the html file. The datasets cannot be renamed here; unfortunately, they come with the default name.
- get_phecols(phenolist, maxConc=20)[source]¶
Sept 2009: cannot use whitespace to split - make a more complex structure here and adjust the methods that rely on this structure. Returns interesting phenotype column names for an rexpression eset or affybatch to use in array subsetting and so on. Returns a data structure for a dynamic Galaxy select parameter. A column with only 1 value doesn’t change, so is not interesting for analysis. A column with a different value in every row is equivalent to a unique identifier, so is also not interesting for anova or limma analysis. Both of these are removed after the concordance (count of unique terms) is constructed for each column. Then a complication: each remaining pair of columns is tested for redundancy. If two columns are always paired, then only one is needed :)
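The selection rule described above can be sketched on a plain dictionary of columns. This is an illustration only: `interesting_columns` is a hypothetical name, it returns bare column names rather than the select-parameter structure, it ignores maxConc, and it reads "always paired" as a one-to-one correspondence between the values of two columns.

```python
from typing import Dict, List


def interesting_columns(table: Dict[str, List[str]]) -> List[str]:
    """Return names of columns that are neither constant, unique per row, nor redundant."""
    n_rows = len(next(iter(table.values()), []))
    candidates = []
    for name, values in table.items():
        distinct = len(set(values))
        # One distinct value never changes; n_rows distinct values act as an identifier.
        if 1 < distinct < n_rows:
            candidates.append(name)
    kept: List[str] = []
    for name in candidates:
        redundant = False
        for other in kept:
            pairs = set(zip(table[name], table[other]))
            # One-to-one pairing: as many distinct pairs as distinct values in each column.
            if len(pairs) == len(set(table[name])) == len(set(table[other])):
                redundant = True
                break
        if not redundant:
            kept.append(name)
    return kept
```

For a table with a sample id, a constant species column, sex, a numeric sex code, and dose, only sex and dose would survive: the id and species are dropped by the first pass and the sex code by the pairing test.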
- get_pheno(dataset)[source]¶
Expects a .pheno file in the extra_files_dir - ugh. Note that R is weird and adds the row.name in the header, so the columns are all wrong unless you tell it not to. A file can be written as write.table(file=’foo.pheno’,pData(foo),sep=’ ‘,quote=F,row.names=F)
- set_peek(dataset, **kwd)[source]¶
expects a .pheno file in the extra_files_dir - ugh note that R is weird and does not include the row.name in the header. why?
- get_file_peek(filename)[source]¶
can’t really peek at a filename - need the extra_files_path and such?
- set_meta(dataset, **kwd)[source]¶
NOTE: we apply the tabular machinery to the phenodata extracted from a BioC eSet or affybatch.
- make_html_table(pp='nothing supplied from peek\n')[source]¶
Create HTML table, used for displaying peek
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Affybatch(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.RexpBase
derived class for BioC data structures in Galaxy
- file_ext = 'affybatch'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.Eset(**kwd)[source]¶
Bases:
galaxy.datatypes.genetics.RexpBase
derived class for BioC data structures in Galaxy
- file_ext = 'eset'¶
- metadata_spec: galaxy.model.metadata.MetadataSpecCollection = {'base_name': <galaxy.model.metadata.MetadataElementSpec object>, 'column_names': <galaxy.model.metadata.MetadataElementSpec object>, 'columns': <galaxy.model.metadata.MetadataElementSpec object>, 'data_lines': <galaxy.model.metadata.MetadataElementSpec object>, 'dbkey': <galaxy.model.metadata.MetadataElementSpec object>, 'pheCols': <galaxy.model.metadata.MetadataElementSpec object>, 'pheno_path': <galaxy.model.metadata.MetadataElementSpec object>}¶
Dictionary of metadata fields for this datatype
- class galaxy.datatypes.genetics.MAlist(