Warning
This document is for an in-development version of Galaxy. You can alternatively view this page in the latest release if it exists or view the top of the latest release's documentation.
galaxy.datatypes.util package
Utilities for Galaxy datatypes.
Submodules
galaxy.datatypes.util.generic_util module
galaxy.datatypes.util.gff_util module
Provides utilities for working with GFF files.
- class galaxy.datatypes.util.gff_util.GFFInterval(reader, fields, chrom_col=0, feature_col=2, start_col=3, end_col=4, strand_col=6, score_col=5, default_strand='.', fix_strand=False)[source]
Bases: GenomicInterval
A GFF interval, including attributes. If the file is strictly a GFF file, the only attribute is ‘group’.
- class galaxy.datatypes.util.gff_util.GFFFeature(reader, chrom_col=0, feature_col=2, start_col=3, end_col=4, strand_col=6, score_col=5, default_strand='.', fix_strand=False, intervals=None, raw_size=0)[source]
Bases: GFFInterval
A GFF feature, which can include multiple intervals.
- class galaxy.datatypes.util.gff_util.GFFIntervalToBEDReaderWrapper(reader, **kwargs)[source]
Bases: NiceReaderWrapper
Reader wrapper that reads GFF intervals/lines and automatically converts them to BED format.
- class galaxy.datatypes.util.gff_util.GFFReaderWrapper(reader, chrom_col=0, feature_col=2, start_col=3, end_col=4, strand_col=6, score_col=5, fix_strand=False, convert_to_bed_coord=False, **kwargs)[source]
Bases: NiceReaderWrapper
Reader wrapper for GFF files.
The wrapper has two major functions:
  - group entries for a GFF file (via the group column), a GFF3 file (via the id attribute), or a GTF file (via gene_id/transcript_id);
  - convert coordinates from GFF format (start and end coordinates are 1-based, closed) to the ‘traditional’/BED interval format (0-based, half-open). This is useful when using GFF files as inputs to tools that expect the traditional interval format.
- galaxy.datatypes.util.gff_util.convert_bed_coords_to_gff(interval)[source]
Converts an interval object’s coordinates from BED format to GFF format. Accepted object types include GenomicInterval and list (where the first element in the list is the interval’s start, and the second element is the interval’s end).
- galaxy.datatypes.util.gff_util.convert_gff_coords_to_bed(interval)[source]
Converts an interval object’s coordinates from GFF format to BED format. Accepted object types include GFFFeature, GenomicInterval, and list (where the first element in the list is the interval’s start, and the second element is the interval’s end).
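The coordinate arithmetic behind the two conversion functions can be sketched in plain Python. This is a minimal illustration of the 1-based, closed versus 0-based, half-open conventions described above; the helper names `gff_to_bed` and `bed_to_gff` are hypothetical and are not part of `gff_util`, which operates on interval objects and lists rather than bare integers.

```python
def gff_to_bed(start, end):
    """Convert 1-based, closed GFF coordinates to 0-based, half-open BED coordinates.

    Hypothetical helper for illustration only; not Galaxy's implementation.
    """
    # Only the start shifts: the closed end in 1-based numbering equals
    # the exclusive end in 0-based numbering.
    return start - 1, end


def bed_to_gff(start, end):
    """Convert 0-based, half-open BED coordinates to 1-based, closed GFF coordinates."""
    return start + 1, end


# A GFF interval covering bases 11 through 20 inclusive (10 bases).
bed_start, bed_end = gff_to_bed(11, 20)
print(bed_start, bed_end)               # 10 20
print(bed_end - bed_start)              # 10, the interval length in BED terms
print(bed_to_gff(bed_start, bed_end))   # (11, 20), the round trip is lossless
```

Because only the start coordinate changes, the length of a BED interval is `end - start`, while the length of a GFF interval is `end - start + 1`.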
- galaxy.datatypes.util.gff_util.parse_gff_attributes(attr_str)[source]
Parses a GFF/GTF attribute string and returns a dictionary of name-value pairs. The general format for a GFF3 attributes string is
name1=value1;name2=value2
The general format for a GTF attribute string is
name1 "value1" ; name2 "value2"
The general format for a GFF attribute string is a single string that denotes the interval’s group; in this case, the function returns a dictionary with a single key-value pair whose key is ‘group’.
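The three attribute formats described above can be told apart with a few string operations. The following is a rough sketch under simplifying assumptions (no escaped characters, no `=` inside GTF values); `parse_attributes_sketch` is a hypothetical name, and the code is not Galaxy's actual implementation of `parse_gff_attributes`.

```python
def parse_attributes_sketch(attr_str):
    """Illustrative parser for GFF3, GTF, and plain GFF attribute strings.

    Hypothetical stand-in; it only mirrors the behaviour described in the text.
    """
    attributes = {}
    for field in attr_str.strip().split(";"):
        field = field.strip()
        if not field:
            continue
        if "=" in field:
            # GFF3 style: name=value
            name, value = field.split("=", 1)
        elif " " in field:
            # GTF style: name "value"
            name, value = field.split(None, 1)
        else:
            # Neither style matched; handled by the plain GFF fallback below.
            continue
        attributes[name.strip()] = value.strip().strip('"')
    if not attributes:
        # Plain GFF: the whole string names the interval's group.
        attributes["group"] = attr_str.strip()
    return attributes


print(parse_attributes_sketch("ID=gene1;Name=abc"))
# {'ID': 'gene1', 'Name': 'abc'}
print(parse_attributes_sketch('gene_id "g1" ; transcript_id "t1"'))
# {'gene_id': 'g1', 'transcript_id': 't1'}
print(parse_attributes_sketch("group1"))
# {'group': 'group1'}
```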
- galaxy.datatypes.util.gff_util.parse_gff3_attributes(attr_str)[source]
Parses a GFF3 attribute string and returns a dictionary of name-value pairs. The general format for a GFF3 attributes string is
name1=value1;name2=value2
galaxy.datatypes.util.maf_utilities module
Provides wrappers and utilities for working with MAF files and alignments.
- galaxy.datatypes.util.maf_utilities.maketrans()
Return a translation table usable for str.translate().
If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters to Unicode ordinals, strings, or None. Character keys will then be converted to ordinals. If there are two arguments, x and y, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.
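As a short illustration of the two- and three-argument forms, the sketch below uses the built-in `str.maketrans`. The two-argument call happens to produce the same ordinal mapping as the `DNA_COMPLEMENT` table listed for `RegionAlignment` on this page; whether Galaxy builds its table this way is an assumption, and `reverse_complement` is a hypothetical helper.

```python
# Two-argument form: each character of the first string maps to the
# character at the same position in the second string.
complement = str.maketrans("ACGTacgt", "TGCAtgca")
print(complement == {65: 84, 67: 71, 71: 67, 84: 65,
                     97: 116, 99: 103, 103: 99, 116: 97})  # True


def reverse_complement(seq):
    """Reverse a DNA string and complement each base (hypothetical helper)."""
    return seq[::-1].translate(complement)


print(reverse_complement("AACGt"))  # aCGTT

# Three-argument form: characters in the third string are mapped to None,
# so str.translate() deletes them. Here alignment gap characters are dropped.
strip_gaps = str.maketrans("", "", "-")
print("AC--GT".translate(strip_gaps))  # ACGT
```

Characters without an entry in the table, such as `N`, pass through `str.translate()` unchanged.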
- class galaxy.datatypes.util.maf_utilities.TempFileHandler(max_open_files=None, **kwds)[source]
Bases: object
Handles creating, opening, closing, and deleting of Temp files, with a maximum number of files open at one time.
- DEFAULT_MAX_OPEN_FILES = 32768.0
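The idea of capping the number of simultaneously open temporary files can be sketched as follows. This is a simplified illustration, not `TempFileHandler` itself: the class name `BoundedTempFiles`, its methods, and the oldest-first eviction policy are all assumptions made for the example.

```python
import os
import tempfile
from collections import OrderedDict


class BoundedTempFiles:
    """Hypothetical sketch: keep at most max_open temp files open at once."""

    def __init__(self, max_open=2):
        self.max_open = max_open
        self.paths = []                  # every temp file created, by index
        self.open_files = OrderedDict()  # index -> file object, least recently used first

    def new_file(self):
        """Create a temp file on disk and return its index without keeping it open."""
        fd, path = tempfile.mkstemp()
        os.close(fd)
        self.paths.append(path)
        return len(self.paths) - 1

    def open(self, index):
        """Return an open handle for the file, closing older handles if over the limit."""
        if index in self.open_files:
            self.open_files.move_to_end(index)
            return self.open_files[index]
        while len(self.open_files) >= self.max_open:
            _, oldest = self.open_files.popitem(last=False)
            oldest.close()
        # Append mode, so reopening a file after eviction keeps earlier writes.
        handle = open(self.paths[index], "a+")
        self.open_files[index] = handle
        return handle

    def cleanup(self):
        """Close all open handles and delete every temp file."""
        for handle in self.open_files.values():
            handle.close()
        self.open_files.clear()
        for path in self.paths:
            os.remove(path)
        self.paths.clear()


handler = BoundedTempFiles(max_open=2)
indexes = [handler.new_file() for _ in range(3)]
for i in indexes:
    handler.open(i).write(f"block {i}\n")
open_count = len(handler.open_files)
print(open_count)  # 2, the first handle was closed to stay under the limit
handler.cleanup()
```

Closing a handle flushes its contents to disk, so data written before eviction is still present when the file is reopened.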
- class galaxy.datatypes.util.maf_utilities.RegionAlignment(size: int, species=None, temp_file_handler=None)[source]
Bases: object
- DNA_COMPLEMENT = {65: 84, 67: 71, 71: 67, 84: 65, 97: 116, 99: 103, 103: 99, 116: 97}
- MAX_SEQUENCE_SIZE = 9223372036854775807
- class galaxy.datatypes.util.maf_utilities.GenomicRegionAlignment(start, end, species=None, temp_file_handler=None)[source]
Bases: RegionAlignment
- class galaxy.datatypes.util.maf_utilities.SplicedAlignment(exon_starts, exon_ends, species=None, temp_file_handler=None)[source]
Bases: object
- DNA_COMPLEMENT = {65: 84, 67: 71, 71: 67, 84: 65, 97: 116, 99: 103, 103: 99, 116: 97}
- property start
- property end
- galaxy.datatypes.util.maf_utilities.open_or_build_maf_index(maf_file, index_filename, species=None)[source]
- galaxy.datatypes.util.maf_utilities.build_maf_index_species_chromosomes(filename, index_species=None)[source]
- galaxy.datatypes.util.maf_utilities.chop_block_by_region(block, src, region, species=None, mincols=0)[source]
- galaxy.datatypes.util.maf_utilities.orient_block_by_region(block, src, region, force_strand=None)[source]
- galaxy.datatypes.util.maf_utilities.get_oriented_chopped_blocks_for_region(index, src, region, species=None, mincols=0, force_strand=None)[source]
- galaxy.datatypes.util.maf_utilities.get_oriented_chopped_blocks_with_index_offset_for_region(index, src, region, species=None, mincols=0, force_strand=None)[source]
- galaxy.datatypes.util.maf_utilities.get_chopped_blocks_for_region(index, src, region, species=None, mincols=0)[source]
- galaxy.datatypes.util.maf_utilities.get_chopped_blocks_with_index_offset_for_region(index, src, region, species=None, mincols=0)[source]
- galaxy.datatypes.util.maf_utilities.get_region_alignment(index, primary_species, chrom, start, end, strand='+', species=None, mincols=0, overwrite_with_gaps=True, temp_file_handler=None)[source]
- galaxy.datatypes.util.maf_utilities.reduce_block_by_primary_genome(block, species, chromosome, region_start)[source]
- galaxy.datatypes.util.maf_utilities.fill_region_alignment(alignment, index, primary_species, chrom, start, end, strand='+', species=None, mincols=0, overwrite_with_gaps=True)[source]
- galaxy.datatypes.util.maf_utilities.get_spliced_region_alignment(index, primary_species, chrom, starts, ends, strand='+', species=None, mincols=0, overwrite_with_gaps=True, temp_file_handler=None)[source]