Warning

This document is for an in-development version of Galaxy. You can alternatively view this page in the latest release if it exists or view the top of the latest release's documentation.

galaxy.datatypes.converters package

Submodules

galaxy.datatypes.converters.bed_to_gff_converter module

galaxy.datatypes.converters.bgzip module

Uses pysam to bgzip a file.

usage: %prog in_file out_file

galaxy.datatypes.converters.bgzip.main()

galaxy.datatypes.converters.cram_to_bam module

Uses pysam to convert a CRAM file to a sorted BAM file.

usage: %prog in_file out_file

galaxy.datatypes.converters.cram_to_bam.main()

galaxy.datatypes.converters.fasta_to_len module

Input: fasta, int
Output: tabular

Returns sequence titles with the lengths of the corresponding sequences.

galaxy.datatypes.converters.fasta_to_len.compute_fasta_length(fasta_file, out_file, keep_first_char, keep_first_word=False)
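
The title-to-length mapping described above can be illustrated with a minimal sketch. This is not Galaxy's implementation: the function name `fasta_lengths` is hypothetical, only the `keep_first_word` behaviour is modelled (as "keep the text before the first whitespace", which is an assumption), and `keep_first_char` is left out.

```python
def fasta_lengths(lines, keep_first_word=False):
    """Yield (title, sequence_length) for each record in FASTA-formatted lines."""
    title = None
    length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if title is not None:
                yield title, length  # emit the record that just ended
            title = line[1:]
            if keep_first_word:
                # Assumption: "first word" means the text before the first whitespace.
                parts = title.split()
                title = parts[0] if parts else ""
            length = 0
        elif title is not None:
            length += len(line)  # sequences may span several lines
    if title is not None:
        yield title, length


example = [">seq1 human", "ACGT", "AC", ">seq2", "GGG"]
rows = list(fasta_lengths(example, keep_first_word=True))
# One tab-separated line per sequence, matching the tabular output type.
tabular = "\n".join(f"{title}\t{length}" for title, length in rows)
```

Writing `tabular` to a file would give one title and length per line.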

galaxy.datatypes.converters.fasta_to_tabular_converter module

Input: fasta
Output: tabular

galaxy.datatypes.converters.fastq_to_fqtoc module

galaxy.datatypes.converters.fastq_to_fqtoc.main()

The format of the file is JSON:

{ "sections" : [
        { "start" : "x", "end" : "y", "sequences" : "z" },
        ...
]}

This works only for UNCOMPRESSED fastq files. The Python GzipFile does not provide seekable offsets via tell(), so for compressed files clients just have to split the slow way.
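
A table of contents in this shape could be built roughly as follows. This is a sketch under stated assumptions, not the module's code: the function name `build_fastq_toc` and the `sequences_per_section` parameter are hypothetical, records are assumed to be strictly four lines each, and the values are stored as strings only to mirror the quoted placeholders in the format above.

```python
import io
import json


def build_fastq_toc(handle, sequences_per_section=2):
    """Return a TOC dict for an uncompressed FASTQ stream opened in binary mode."""
    sections = []
    start = handle.tell()
    count = 0
    while True:
        # Assumption: every record is exactly four lines long.
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            break  # end of file
        count += 1
        if count == sequences_per_section:
            end = handle.tell()  # byte offset just past this section
            sections.append({"start": str(start), "end": str(end), "sequences": str(count)})
            start, count = end, 0
    if count:
        # Final, shorter section.
        sections.append({"start": str(start), "end": str(handle.tell()), "sequences": str(count)})
    return {"sections": sections}


data = b"@r1\nACGT\n+\nIIII\n@r2\nGG\n+\nII\n@r3\nT\n+\nI\n"
toc = build_fastq_toc(io.BytesIO(data), sequences_per_section=2)
toc_json = json.dumps(toc)
```

Because the offsets are plain byte positions, a client can `seek()` to a section's `start` and read up to `end` without scanning the whole file, which is what makes uncompressed input a requirement.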

galaxy.datatypes.converters.fastqsolexa_to_fasta_converter module

Convert a fastqsolexa file to separate sequence and quality files.

Assumes each sequence and each quality score string is contained in one line. The order should be:

1st line: @title_of_seq
2nd line: nucleotides
3rd line: +title_of_qualityscore (might be skipped)
4th line: quality scores (in three forms: a. digits, b. ASCII codes, the first char as the coding base, c. ASCII codes without the first char.)

Usage: %python fastqsolexa_to_fasta_converter.py <your_fastqsolexa_filename> <output_seq_filename> <output_score_filename>

galaxy.datatypes.converters.fastqsolexa_to_fasta_converter.stop_err(msg)
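
The four-line record layout above lends itself to a small illustration of the sequence half of the conversion. The sketch below is an assumption-laden example rather than the converter itself: `fastq_records` and `to_fasta` are hypothetical names, and the third (`+`) line is required to be present here even though the description says it might be skipped.

```python
def fastq_records(lines):
    """Yield (title, sequence, quality) from strict four-line records."""
    lines = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record at non-blank line {i + 1}")
        yield header[1:], seq, qual


def to_fasta(records):
    """Render records as FASTA text, dropping the quality strings."""
    return "".join(f">{title}\n{seq}\n" for title, seq, _ in records)


sample = [
    "@read1\n", "ACGT\n", "+read1\n", "hhhh\n",
    "@read2\n", "GGCC\n", "+\n", "hhgg\n",
]
fasta_text = to_fasta(fastq_records(sample))
```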

galaxy.datatypes.converters.fastqsolexa_to_qual_converter module

Convert a fastqsolexa file to separate sequence and quality files.

Assumes each sequence and each quality score string is contained in one line. The order should be:

1st line: @title_of_seq
2nd line: nucleotides
3rd line: +title_of_qualityscore (might be skipped)
4th line: quality scores (in three forms: a. digits, b. ASCII codes, the first char as the coding base, c. ASCII codes without the first char.)

Usage: %python fastqsolexa_to_qual_converter.py <your_fastqsolexa_filename> <output_seq_filename> <output_score_filename>

galaxy.datatypes.converters.fastqsolexa_to_qual_converter.stop_err(msg)
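
For the quality half, a line has to be recognised as either digits or ASCII codes before it can be written out as numbers. The following is only a sketch of that decision: `decode_quality` and `format_qual` are hypothetical names, the ASCII offset of 64 is the conventional Solexa offset and is assumed here, and the "first char as the coding base" variant from the description is not modelled.

```python
def decode_quality(qual_line, ascii_offset=64):
    """Return a list of integer scores from one quality line."""
    fields = qual_line.split()
    try:
        # Form (a): whitespace-separated integers, e.g. "40 40 -5".
        return [int(field) for field in fields]
    except ValueError:
        # Otherwise treat every character as one ASCII-encoded score.
        # Assumption: scores are encoded as chr(score + 64).
        return [ord(char) - ascii_offset for char in qual_line.strip()]


def format_qual(title, scores):
    """Render one record as a title line followed by space-separated scores."""
    return f">{title}\n{' '.join(str(score) for score in scores)}\n"


digits = decode_quality("40 40 -5")
ascii_coded = decode_quality("hh@;")
qual_text = format_qual("read1", ascii_coded)
```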

galaxy.datatypes.converters.gff_to_bed_converter module

galaxy.datatypes.converters.gff_to_interval_index_converter module

Convert from GFF file to interval index file.

usage: python gff_to_interval_index_converter.py [input] [output]

galaxy.datatypes.converters.gff_to_interval_index_converter.main()

galaxy.datatypes.converters.interval_to_bed_converter module

galaxy.datatypes.converters.interval_to_bed_converter.stop_err(msg)

galaxy.datatypes.converters.interval_to_bedstrict_converter module

galaxy.datatypes.converters.interval_to_bedstrict_converter.stop_err(msg)

galaxy.datatypes.converters.interval_to_bedstrict_converter.force_bed_field_count(fields, region_count, force_num_columns)

galaxy.datatypes.converters.interval_to_fli module

Creates a feature location index (FLI) for a given BED/GFF file. The FLI index has the form:

[line_length]
<symbol1_in_lowercase><tab><symbol1><tab><location>
<symbol2_in_lowercase><tab><symbol2><tab><location>
...

where location is formatted as:

contig:start-end

and symbols are sorted in lexicographical order.

galaxy.datatypes.converters.interval_to_fli.main()
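
The index layout above can be sketched as follows. This is an illustration, not the module's code: `build_fli` is a hypothetical name, the input is assumed to be already-parsed `(symbol, contig, start, end)` tuples, and padding every line to the same width (so that the `[line_length]` header allows fixed-offset seeking) is an assumption about how the header is meant to be used.

```python
def build_fli(features):
    """Return FLI text for an iterable of (symbol, contig, start, end) tuples."""
    # Each entry starts with the lowercased symbol, so sorting the entry
    # strings orders them by lowercased symbol.
    entries = sorted(
        f"{symbol.lower()}\t{symbol}\t{contig}:{start}-{end}"
        for symbol, contig, start, end in features
    )
    width = max((len(entry) for entry in entries), default=0)
    # Assumption: pad all lines to one width; the header stores that width
    # plus one for the newline character.
    lines = [str(width + 1).ljust(width)] + [entry.ljust(width) for entry in entries]
    return "".join(line + "\n" for line in lines)


features = [
    ("TP53", "chr17", 7565097, 7590856),
    ("BRCA1", "chr17", 41196312, 41277500),
]
fli_text = build_fli(features)
fli_lines = fli_text.split("\n")[:-1]
```

With fixed-width lines, a reader could binary-search the sorted symbols by seeking to `line_length * i` instead of scanning the file.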

galaxy.datatypes.converters.interval_to_interval_index_converter module

Convert from interval file to interval index file.

usage: %prog <options> in_file out_file
    -c, --chr-col: chromosome column, default=1
    -s, --start-col: start column, default=2
    -e, --end-col: end column, default=3

galaxy.datatypes.converters.interval_to_interval_index_converter.main()
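
To show how the three column options select fields from an interval line, here is a small sketch. It uses `argparse` purely for illustration (the converter's own option parsing is not shown in this page), the helper names `parse_args` and `extract_interval` are hypothetical, and the columns are assumed to be 1-based, as the defaults of 1, 2 and 3 suggest.

```python
import argparse


def parse_args(argv):
    """Parse the documented options; a stand-in for the converter's own parser."""
    parser = argparse.ArgumentParser(prog="interval_index_sketch")
    parser.add_argument("-c", "--chr-col", type=int, default=1)
    parser.add_argument("-s", "--start-col", type=int, default=2)
    parser.add_argument("-e", "--end-col", type=int, default=3)
    parser.add_argument("in_file")
    parser.add_argument("out_file")
    return parser.parse_args(argv)


def extract_interval(line, args):
    """Return (chrom, start, end) from one tab-separated line."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: command-line columns are 1-based; list indices are 0-based.
    return (
        fields[args.chr_col - 1],
        int(fields[args.start_col - 1]),
        int(fields[args.end_col - 1]),
    )


args = parse_args(["-c", "2", "-s", "3", "-e", "4", "in.interval", "out.index"])
interval = extract_interval("gene1\tchr1\t100\t200\n", args)
defaults = parse_args(["in.interval", "out.index"])
```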

galaxy.datatypes.converters.interval_to_tabix_converter module

Uses pysam to index a bgzipped interval file with tabix.
Supported presets: bed, gff, vcf

usage: %prog in_file out_file

galaxy.datatypes.converters.interval_to_tabix_converter.main()

galaxy.datatypes.converters.interval_to_tabix_converter.to_tabix(bgzip_fname, out_fname, preset=None, chrom_col=None, start_col=None, end_col=None)

galaxy.datatypes.converters.lped_to_fped_converter module

galaxy.datatypes.converters.lped_to_fped_converter.timenow()

Return the current time as a string.

galaxy.datatypes.converters.lped_to_fped_converter.rgConv(inpedfilepath, outhtmlname, outfilepath)

Convert linkage ped/map to fbat.

galaxy.datatypes.converters.lped_to_fped_converter.main()

Call fbater. This needs to work with rgenetics composite datatypes, so the input and output are html files with the data in the extra files path.

<command>python '$__tool_directory__/rg_convert_lped_fped.py' '$input1/$input1.metadata.base_name' '$output1' '$output1.extra_files_path' </command>

galaxy.datatypes.converters.lped_to_pbed_converter module

galaxy.datatypes.converters.lped_to_pbed_converter.timenow()

Return the current time as a string.

galaxy.datatypes.converters.lped_to_pbed_converter.getMissval(inped='')

Read some lines of the ped file and try to guess the missing value (an ugly hack). It should be N or 0, but might be . or -
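
The guessing heuristic described above might look something like the sketch below. It is not the module's implementation: `guess_missing_value`, `max_lines` and the fallback to `N` are hypothetical, and the sketch assumes the usual linkage ped layout in which the first six whitespace-separated columns are pedigree fields and the rest are alleles.

```python
import io

# Candidate missing-value codes named in the description above.
CANDIDATES = ("N", "0", ".", "-")


def guess_missing_value(handle, max_lines=100, default="N"):
    """Return the first candidate code seen among the allele columns."""
    for line_number, line in enumerate(handle):
        if line_number >= max_lines:
            break  # only sample the start of the file
        # Assumption: columns 1-6 are family, individual, father, mother,
        # sex and phenotype, so allele calls start at column 7.
        for allele in line.split()[6:]:
            if allele in CANDIDATES:
                return allele
    return default


ped = io.StringIO(
    "fam1 id1 0 0 1 2 A A G G\n"
    "fam1 id2 0 0 2 1 A C 0 0\n"
)
missing = guess_missing_value(ped)
no_missing = guess_missing_value(io.StringIO("fam1 id1 0 0 1 2 A A G G\n"))
```

Skipping the six pedigree columns matters because `0` is also a common code for an unknown parent there.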

galaxy.datatypes.converters.lped_to_pbed_converter.rgConv(inpedfilepath, outhtmlname, outfilepath, plink)

galaxy.datatypes.converters.lped_to_pbed_converter.main()

This needs to work with rgenetics composite datatypes, so the input and output are html files with the data in the extra files path.

<command>python '$__tool_directory__/lped_to_pbed_converter.py' '$input1/$input1.metadata.base_name' '$output1' '$output1.extra_files_path' '${GALAXY_DATA_INDEX_DIR}/rg/bin/plink' </command>

galaxy.datatypes.converters.maf_to_fasta_converter module

galaxy.datatypes.converters.maf_to_interval_converter module

galaxy.datatypes.converters.pbed_ldreduced_converter module

galaxy.datatypes.converters.pbed_ldreduced_converter.timenow()

Return the current time as a string.

galaxy.datatypes.converters.pbed_ldreduced_converter.pruneLD(plinktasks=[], cd='./', vclbase=[])

galaxy.datatypes.converters.pbed_ldreduced_converter.makeLDreduced(basename, infpath=None, outfpath=None, plinke='plink', forcerebuild=False, returnFname=False, winsize='60', winmove='40', r2thresh='0.1')

If the LD-reduced output is not there, make it and leave it in the output dir for the post job hook to copy back into the input extra files path for next time.

galaxy.datatypes.converters.pbed_ldreduced_converter.main()

This needs to work with rgenetics composite datatypes, so the input and output are html files with the data in the extra files path.

galaxy.datatypes.converters.pbed_to_lped_converter module

galaxy.datatypes.converters.pbed_to_lped_converter.timenow()

Return the current time as a string.

galaxy.datatypes.converters.pbed_to_lped_converter.rgConv(inpedfilepath, outhtmlname, outfilepath, plink)

galaxy.datatypes.converters.pbed_to_lped_converter.main()

This needs to work with rgenetics composite datatypes, so the input and output are html files with the data in the extra files path.

<command>python '$__tool_directory__/pbed_to_lped_converter.py' '$input1/$input1.metadata.base_name' '$output1' '$output1.extra_files_path' '${GALAXY_DATA_INDEX_DIR}/rg/bin/plink' </command>

galaxy.datatypes.converters.picard_interval_list_to_bed6_converter module

galaxy.datatypes.converters.pileup_to_interval_index_converter module

Convert from pileup file to interval index file.

usage: %prog <options> in_file out_file

galaxy.datatypes.converters.pileup_to_interval_index_converter.main()

galaxy.datatypes.converters.ref_to_seq_taxonomy_converter module

Convert a ref.taxonomy file to a seq.taxonomy file.

Usage: %python ref_to_seq_taxonomy_converter.py <ref.taxonomy> <seq.taxonomy>

galaxy.datatypes.converters.tabular_csv module

Uses the Python csv library to convert to and from tabular.

usage: %prog [--from-tabular] -i in_file -o out_file

galaxy.datatypes.converters.tabular_csv.main()

galaxy.datatypes.converters.tabular_csv.convert_to_tsv(input_fname, output_fname)

galaxy.datatypes.converters.tabular_csv.convert_to_csv(input_fname, output_fname)
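
A conversion in both directions with the standard `csv` module can be sketched as below. The function names `tsv_to_csv` and `csv_to_tsv` are hypothetical and operate on open text handles rather than file names; the dialect choices (default quoting, `\n` line endings) are assumptions, not necessarily what the converter uses.

```python
import csv
import io


def tsv_to_csv(src, dst):
    """Copy tab-separated rows from src to dst as comma-separated rows."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerows(reader)


def csv_to_tsv(src, dst):
    """Copy comma-separated rows from src to dst as tab-separated rows."""
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)


tab_text = "name\tnote\ngene1\thas, comma\n"

csv_out = io.StringIO()
tsv_to_csv(io.StringIO(tab_text), csv_out)
csv_text = csv_out.getvalue()

tsv_out = io.StringIO()
csv_to_tsv(io.StringIO(csv_text), tsv_out)
round_trip = tsv_out.getvalue()
```

Letting the `csv` writer handle quoting is the point of the example: a field containing a comma is quoted on the way to CSV and unquoted again on the way back.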

galaxy.datatypes.converters.tabular_to_dbnsfp module

Uses pysam to bgzip a file.

usage: %prog in_file out_file

galaxy.datatypes.converters.tabular_to_dbnsfp.main()

galaxy.datatypes.converters.vcf_to_interval_index_converter module

Convert from VCF file to interval index file.

galaxy.datatypes.converters.vcf_to_interval_index_converter.main()

galaxy.datatypes.converters.vcf_to_vcf_bgzip module

Uses pysam to bgzip a vcf file as-is. Headers, which are important, are kept. Original ordering, which may be specifically needed by tools or external display applications, is also maintained.

usage: %prog in_file out_file

galaxy.datatypes.converters.vcf_to_vcf_bgzip.main()

galaxy.datatypes.converters.wiggle_to_simple_converter module

Read a wiggle track and print out a series of lines containing “chrom position score”. Ignores track lines; handles bed, variableStep and fixedStep wiggle lines.

galaxy.datatypes.converters.wiggle_to_simple_converter.main()
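
The three wiggle line styles named above can be flattened with a parser along these lines. This is a simplified sketch, not the converter's code: `wiggle_to_simple` is a hypothetical name, data lines are classified by field count (1 for fixedStep, 2 for variableStep, 4 for bed), one record is emitted per covered base, and coordinates are passed through as written in the file without any 0-based/1-based normalisation.

```python
def wiggle_to_simple(lines):
    """Yield (chrom, position, score) for each base covered by wiggle data lines."""
    chrom = None
    position = None  # next start position in a fixedStep block
    step = 1
    span = 1
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue  # ignore track definitions, browser lines and comments
        fields = line.split()
        if fields[0] in ("fixedStep", "variableStep"):
            # Declaration line: key=value parameters for the block that follows.
            params = dict(field.split("=", 1) for field in fields[1:])
            chrom = params["chrom"]
            span = int(params.get("span", 1))
            if fields[0] == "fixedStep":
                position = int(params["start"])
                step = int(params.get("step", 1))
        elif len(fields) == 4:
            # bed style: chrom start end score
            for pos in range(int(fields[1]), int(fields[2])):
                yield fields[0], pos, float(fields[3])
        elif len(fields) == 2:
            # variableStep data: position score
            for offset in range(span):
                yield chrom, int(fields[0]) + offset, float(fields[1])
        elif len(fields) == 1:
            # fixedStep data: score only, position advances by step
            for offset in range(span):
                yield chrom, position + offset, float(fields[0])
            position += step


wiggle = [
    "track type=wiggle_0",
    "fixedStep chrom=chr1 start=10 step=5 span=2",
    "1.5",
    "2.5",
    "variableStep chrom=chr2",
    "100 7",
    "chr3 0 2 0.5",
]
simple = list(wiggle_to_simple(wiggle))
simple_lines = [f"{chrom} {pos} {score}" for chrom, pos, score in simple]
```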