Delledonne Lab Homepage - Short Reads ToolKit

Short Reads ToolKit

Written by Administrator
Monday, 07 December 2009
Short Reads Toolkit is a set of programs developed to work with RNA-seq data (short reads). The package is a collection of Python and Perl scripts released under the GPLv3 license and is downloadable HERE.

coverage.pl
SYNTAX: coverage.pl <file.vmf>
The script calculates the length of the short reads (R), the length of the aligned reference sequence covered by short reads (L), the number of mapped reads (N) and the depth of coverage (D), where D = (N x R) / L.

gff2knowngene.pl
SYNTAX: gff2knowngene.pl <annotation.gff>
The script parses the file <annotation.gff> and prints out a knownGene.txt file.

res2fa.pl
SYNTAX: res2fa.pl <solexaOutput.txt>
The script parses the file <solexaOutput.txt> and prints out a FASTA file.

runMaq.pl
SYNTAX: runMaq.pl <reference.bfa> <solexaFile1.txt> [solexaFile2.txt, ...]
SYNTAX (paired-end): runMaq.pl <reference.bfa> -p <solexaFile1.txt> <solexaFile2.txt>
The script runs all the commands needed to identify candidate SNPs from RNA-seq data.

mapreadslocation.py
SYNTAX: mapreadslocation.py [-onlyMulti] <DataBase> <file.vmf>
The script maps each short read onto an exon, splice junction, intron/UTR, external exon or intergenic region. By default it works only with uniquely matching short reads; use the -onlyMulti option to work with the multiple-match short reads instead. The <DataBase> file is a sqlite3 database created by the createDB.py script.

createDB.py
SYNTAX: createDB.py <DataBase> <knownGene.txt> <''/geneSymbols> <nearDistance[0/x]>
Parameters:
knownGene.txt: the knownGene.txt file
geneSymbols: file containing the gene name conversion symbols (use '' if it does not exist)
nearDistance: the range distance used by erange (short reads mapped within this range will be marked as NEARGENE)

mapSNPlocation.py
SYNTAX: mapSNPlocation.py <DataBase> <snpDB.sqlite> <outputMAQ.snp>
The script maps each candidate SNP reported by the maq analysis onto an exon, splice junction, intron/UTR, external exon or intergenic region.
The <DataBase> file is a sqlite3 database created by the createDB.py script. The <snpDB.sqlite> file is a sqlite3 database created by the createSNPdb.py script.

createSNPdb.py
SYNTAX: createSNPdb.py <snp.txt> <snpDB.sqlite>
The <snp.txt> file is downloadable from the UCSC site.

A patched version of Cistematic 2.5 containing the vitisvinifera.py script is available HERE. The set of scripts getallsplice.tgz, needed to create a dataset of all possible splice junctions, is available HERE; you need a knownGene.txt file to run it. The project is also hosted on Google Code at http://code.google.com/p/shortreadstoolkit/

Other scripts
filter_oligos.py
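
Worked example of the depth-of-coverage formula given for coverage.pl, D = (N x R) / L. This is only a minimal Python sketch of the arithmetic, not the code of coverage.pl itself. The function name depth_of_coverage and the sample numbers are hypothetical and chosen for illustration only.

```python
def depth_of_coverage(num_reads, read_length, covered_length):
    """Return D = (N x R) / L.

    num_reads      -- N, the number of mapped reads
    read_length    -- R, the length of each short read (bases)
    covered_length -- L, the length of reference sequence covered by reads (bases)
    """
    if covered_length <= 0:
        raise ValueError("covered_length (L) must be positive")
    return (num_reads * read_length) / covered_length


# Hypothetical values, not taken from any real data set.
n_reads = 1_000_000      # N
read_len = 36            # R
covered = 12_000_000     # L

print(depth_of_coverage(n_reads, read_len, covered))  # prints 3.0
```

With these made-up values, one million 36-base reads spread over 12 Mb of covered reference give an average depth of 3 reads per base.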
Last Updated ( Thursday, 20 January 2011 )