ASElux: an ultra-fast and accurate allelic reads counter.
Bioinformatics; 34(8): 1313-1320, 2018 Apr 15.
Article | En | MEDLINE | ID: mdl-29186329
Motivation: Mapping bias causes preferential alignment to the reference allele and is a major obstacle in allele-specific expression (ASE) analysis. Existing methods, such as simulation and SNP-aware alignment, are either inaccurate or relatively slow. To count allelic reads quickly and accurately for ASE analysis, we developed a novel approach, ASElux, which uses personal single nucleotide polymorphism (SNP) information and counts allelic reads directly from unmapped RNA-sequencing (RNA-seq) data. ASElux significantly reduces runtime by disregarding reads that fall outside SNPs during alignment.
Results: When compared with other tools on simulated and experimental data, ASElux achieves higher accuracy in ASE estimation than non-SNP-aware aligners and requires much less time than the benchmark SNP-aware aligner, GSNAP, with only a slight loss in performance. ASElux can process 40 million read pairs from an RNA-seq sample and count allelic reads within 10 min, which is comparable to the time needed to count allelic reads directly from alignments produced by other tools. Furthermore, processing an RNA-seq sample using ASElux in conjunction with a general aligner, such as STAR, is more accurate and still ~4× faster than STAR + WASP, and ~33× faster than the leading SNP-aware aligner, GSNAP, making ASElux ideal for ASE analysis in large-scale transcriptomic studies. We applied ASElux to 273 lung RNA-seq samples from GTEx and identified a splice-QTL, rs11078928, in lung that explains the mechanism underlying an asthma GWAS SNP, rs11078927. Thus, our analysis demonstrates that ASE is a highly powerful complementary tool to cis-expression quantitative trait locus (eQTL) analysis.
Availability and implementation: The software can be downloaded from https://github.com/abl0719/ASElux.
Contact: zmiao@ucla.edu or a5ko@ucla.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
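The general idea of counting allelic reads directly from unaligned reads, while ignoring reads that do not cover a SNP, can be illustrated with a toy sketch. This is not ASElux's implementation, which the abstract does not describe in detail. The function names, the k-mer lookup strategy and the example sequences are all assumptions made for illustration. The sketch also ignores reverse-complement reads, sequencing errors, paired ends and SNPs closer than `flank` bases to a sequence end.

```python
from collections import defaultdict


def build_allele_index(reference, snps, flank):
    """Map each SNP-centred k-mer (k = 2 * flank + 1) to (snp_id, allele).

    `snps` maps a hypothetical SNP id to (0-based position, ref base, alt base).
    Both the reference and the alternate k-mer are indexed, so neither
    allele is favoured when reads are matched.
    """
    index = {}
    for snp_id, (pos, ref_allele, alt_allele) in snps.items():
        left = reference[pos - flank:pos]
        right = reference[pos + 1:pos + 1 + flank]
        index[left + ref_allele + right] = (snp_id, "ref")
        index[left + alt_allele + right] = (snp_id, "alt")
    return index


def count_allelic_reads(reads, index, k):
    """Count reads supporting each allele; reads with no SNP k-mer are skipped."""
    counts = defaultdict(lambda: {"ref": 0, "alt": 0})
    for read in reads:
        seen = set()  # count each read at most once per (SNP, allele)
        for i in range(len(read) - k + 1):
            hit = index.get(read[i:i + k])
            if hit is not None and hit not in seen:
                seen.add(hit)
                snp_id, allele = hit
                counts[snp_id][allele] += 1
    return {snp_id: dict(c) for snp_id, c in counts.items()}


# Made-up 20-base reference with one made-up SNP (G>A at position 10).
reference = "ACGTACGTTAGCCTAGGATC"
snps = {"snp1": (10, "G", "A")}
flank = 3
index = build_allele_index(reference, snps, flank)

reads = [
    "CGTTAGCCTAG",  # carries the reference allele
    "GTTAACCTAGG",  # carries the alternate allele
    "TTAACCTAGGA",  # carries the alternate allele
    "ACGTACGT",     # does not cover the SNP, so it is ignored
]
print(count_allelic_reads(reads, index, 2 * flank + 1))
```

Because only k-mers that span a known SNP are indexed, most reads in a real sample would miss the index entirely and be discarded early, which is the intuition behind the runtime saving described in the abstract.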
Full text: 1
Database: MEDLINE
Main subject: Software / Gene Expression Profiling / Polymorphism, Single Nucleotide / Quantitative Trait Loci / Alleles
Type of study: Prognostic_studies
Limits: Humans
Language: En
Journal: Bioinformatics
Journal subject: MEDICAL INFORMATICS
Year: 2018
Document type: Article
Affiliation country: United States