HMMSplicer: a tool for efficient and sensitive discovery of known and novel splice junctions in RNA-Seq data.
PLoS One; 5(11): e13875, 2010 Nov 08.
Article in English | MEDLINE | ID: mdl-21079731
BACKGROUND: High-throughput sequencing of an organism's transcriptome, or RNA-Seq, is a valuable and versatile new strategy for capturing snapshots of gene expression. However, transcriptome sequencing creates a new class of alignment problem: mapping short reads that span exon-exon junctions back to the reference genome, especially when a splice junction is previously unknown. METHODOLOGY/PRINCIPAL FINDINGS: Here we introduce HMMSplicer, an accurate and efficient algorithm for discovering canonical and non-canonical splice junctions in short read datasets. HMMSplicer identifies more splice junctions than currently available algorithms when tested on publicly available A. thaliana, P. falciparum, and H. sapiens datasets, without a reduction in specificity. CONCLUSIONS/SIGNIFICANCE: HMMSplicer was found to perform especially well in compact genomes and on genes with low expression levels, alternative splice isoforms, or non-canonical splice junctions. Because HMMSplicer does not rely on pre-built gene models, the products of inexact splicing are also detected. For H. sapiens, we find that 3.6% of 3' splice sites and 1.4% of 5' splice sites are inexact, typically differing by 3 bases in either direction. In addition, HMMSplicer provides a score for every predicted junction, allowing the user to set a threshold to tune false positive rates depending on the needs of the experiment. HMMSplicer is implemented in Python. Code and documentation are freely available at http://derisilab.ucsf.edu/software/hmmsplicer.
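The score threshold and the canonical versus non-canonical distinction mentioned in the abstract can be illustrated with a minimal Python sketch. This is not HMMSplicer's code or its output format: the `Junction` record, its field names, the score scale, and the toy genome are all hypothetical, chosen only to show the idea of filtering scored junction calls and checking intron boundary dinucleotides (GT-AG on the forward strand, which reads as CT-AC for a gene on the reverse strand).

```python
from typing import Dict, List, NamedTuple


class Junction(NamedTuple):
    """Hypothetical junction record (not HMMSplicer's actual format)."""
    chrom: str
    intron_start: int  # 0-based index of the first intron base
    intron_end: int    # 0-based, exclusive end of the intron
    score: float       # higher means more confident (assumed scale)


# GT-AG introns; (CT, AC) is the same motif seen from the forward
# strand when the gene lies on the reverse strand.
CANONICAL_MOTIFS = {("GT", "AG"), ("CT", "AC")}


def is_canonical(genome: Dict[str, str], junction: Junction) -> bool:
    """Return True if the intron starts and ends with a canonical motif."""
    intron = genome[junction.chrom][junction.intron_start:junction.intron_end]
    intron = intron.upper()
    return (intron[:2], intron[-2:]) in CANONICAL_MOTIFS


def filter_by_score(junctions: List[Junction], threshold: float) -> List[Junction]:
    """Keep only junctions whose score meets the user-chosen threshold."""
    return [j for j in junctions if j.score >= threshold]


# Toy data, invented for illustration only.
genome = {"chr1": "AAAGTCCCCAGTTT"}
junctions = [
    Junction("chr1", 3, 11, 900.0),  # intron GTCCCCAG
    Junction("chr1", 2, 10, 300.0),  # intron AGTCCCCA
]

confident = filter_by_score(junctions, threshold=500.0)
canonical_flags = [is_canonical(genome, j) for j in junctions]
```

Raising `threshold` trades sensitivity for a lower false positive rate, which is the tuning the abstract describes; the appropriate value depends on the dataset and is not specified here.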
Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Algorithms / RNA Splicing / Gene Expression Profiling / RNA Splice Sites
Type of study: Diagnostic_studies / Prognostic_studies
Limits: Animals / Humans
Language: En
Publication year: 2010
Document type: Article