GIIRA--RNA-Seq driven gene finding incorporating ambiguous reads.
Bioinformatics; 30(5): 606-13, 2014 Mar 01.
Article in En | MEDLINE | ID: mdl-24123675
ABSTRACT
MOTIVATION: The reliable identification of genes is a major challenge in genome research, as further analysis depends on the correctness of this initial step. With high-throughput RNA-Seq data reflecting currently expressed genes, a particularly meaningful source of information has become commonly available for gene finding. However, practical application in automated gene identification is still not the standard case. A particular challenge in including RNA-Seq data is the difficult handling of ambiguously mapped reads.
RESULTS: We present GIIRA (Gene Identification Incorporating RNA-Seq data and Ambiguous reads), a novel prokaryotic and eukaryotic gene finder that is exclusively based on an RNA-Seq mapping and inherently includes ambiguously mapped reads. GIIRA extracts candidate regions supported by a sufficient number of mappings and reassigns ambiguous reads to their most likely origin using a maximum-flow approach. This avoids the exclusion of genes that are predominantly supported by ambiguous mappings. Evaluation on simulated and real data and comparison with existing methods incorporating RNA-Seq information highlight the accuracy of GIIRA in identifying the expressed genes.
AVAILABILITY AND IMPLEMENTATION: GIIRA is implemented in Java and is available from https://sourceforge.net/projects/giira/.
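The idea of reassigning ambiguous reads with a maximum flow can be illustrated with a small, generic sketch. This is not GIIRA's actual formulation, which the abstract does not detail. It assumes a simple bipartite network (source, reads, candidate regions, sink) in which each read carries one unit of flow and each region has a hypothetical capacity, for example derived from its unique-read support. The names `max_flow`, `assign_ambiguous_reads`, `read_hits` and `region_capacity` are illustrative only, and Python is used for brevity even though GIIRA itself is written in Java.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    # residual[u][v] holds the remaining capacity on edge u -> v
    residual = defaultdict(lambda: defaultdict(int))
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual[u][v] += cap
            residual[v][u] += 0  # ensure the reverse edge exists
    total = 0
    while True:
        # breadth-first search for a path that still has spare capacity
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total, residual
        # find the bottleneck along the path, then update the residual graph
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def assign_ambiguous_reads(read_hits, region_capacity):
    """Assign each ambiguous read to at most one of its candidate regions.

    read_hits: read id -> list of candidate regions it maps to (assumed input)
    region_capacity: region -> number of reads it may absorb (assumed input)
    """
    capacity = defaultdict(dict)
    for read, regions in read_hits.items():
        capacity["source"][("read", read)] = 1  # one unit of flow per read
        for region in regions:
            capacity[("read", read)][("region", region)] = 1
    for region, cap in region_capacity.items():
        capacity[("region", region)]["sink"] = cap
    _, residual = max_flow(capacity, "source", "sink")
    assignment = {}
    for read, regions in read_hits.items():
        for region in regions:
            # a saturated read -> region edge means the read's flow went there
            if residual[("read", read)][("region", region)] == 0:
                assignment[read] = region
    return assignment


hits = {"r1": ["A", "B"], "r2": ["A"], "r3": ["A", "B"]}
print(assign_ambiguous_reads(hits, {"A": 1, "B": 2}))
```

In this toy input, read `r2` maps only to region `A`, which can absorb a single read, so a maximum flow has to route `r1` and `r3` to region `B`. Reads that cannot be placed because capacity is exhausted are simply absent from the returned mapping.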
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Sequence Analysis, RNA / Gene Expression Profiling / Genes
Type of study: Diagnostic_studies / Prognostic_studies
Limits: Animals / Humans
Language: En
Journal: Bioinformatics
Journal subject: MEDICAL INFORMATICS
Year: 2014
Document type: Article
Affiliation country: