1.
Bioinformatics ; 31(24): 4035-7, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26315902

ABSTRACT

Rapid advances in next-generation sequencing technology have led to the integration of genetic information with clinical care. Knowledge of the genetic basis of diseases and of drug response provides new approaches to disease diagnosis and safer drug use. This integration reveals an urgent need for effective and accurate tools to analyze genetic variants. Because annotation sources are numerous and diverse, automating variant analysis is a challenging task. Here, we present database.bio, a web application that combines variant annotation, prioritization and visualization to support insight into an individual's genetic characteristics. It speeds up annotation by preprocessing data on a supercomputer, and reduces database space via a unified database representation with compressed fields. AVAILABILITY AND IMPLEMENTATION: Freely available at https://database.bio.
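
The abstract does not say how database.bio compresses its fields. As a rough illustration of the general idea only, the sketch below stores one variant's annotations as compact JSON compressed with zlib. The record layout, field names, and the helpers `pack_annotation` and `unpack_annotation` are hypothetical and are not taken from database.bio.

```python
import json
import zlib


def pack_annotation(annotation: dict) -> bytes:
    """Serialize an annotation record to compact JSON and compress it with zlib."""
    raw = json.dumps(annotation, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return zlib.compress(raw, 9)  # 9 = highest zlib compression level


def unpack_annotation(blob: bytes) -> dict:
    """Reverse pack_annotation: decompress and parse the JSON payload."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical annotation record; real annotation sources differ in content.
record = {
    "chrom": "1",
    "pos": 123456,
    "ref": "A",
    "alt": "G",
    "consequences": ["missense_variant"] * 40,  # repetitive text compresses well
}

raw_size = len(json.dumps(record, separators=(",", ":"), sort_keys=True).encode("utf-8"))
blob = pack_annotation(record)
restored = unpack_annotation(blob)
print(f"raw: {raw_size} bytes, compressed: {len(blob)} bytes")
```

Repetitive annotation text such as repeated consequence terms is where this kind of field-level compression pays off; very short fields may not shrink at all.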


Subjects
Databases, Nucleic Acid , Genetic Variation , Software , High-Throughput Nucleotide Sequencing , Humans , Internet , Molecular Sequence Annotation
2.
BMC Bioinformatics ; 16 Suppl 7: S10, 2015.
Article in English | MEDLINE | ID: mdl-25952019

ABSTRACT

BACKGROUND: Short-read aligners have recently gained considerable speed by exploiting the massive parallelism of GPUs. An emerging alternative to the GPU is Intel MIC; the Tianhe-2 supercomputer, currently at the top of the TOP500, is built with 48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be parallelized easily; however, performance is often inferior to GPU counterparts, as a MIC card contains only ~60 cores (while a GPU card typically has over a thousand cores). RESULTS: To better utilize MIC-enabled computers for NGS data analysis, we developed a new short-read aligner, MICA, that is optimized in view of MIC's limitation and the extra parallelism inside each MIC core. MICA utilizes the 512-bit vector units in the MIC and implements a new seeding strategy; experiments on aligning 150 bp paired-end reads show that MICA using one MIC card is 4.9 times faster than BWA-MEM (using 6 cores of a top-end CPU), and slightly faster than SOAP3-dp (using a GPU). Furthermore, MICA's simplicity allows very efficient scale-up when multiple MIC cards are used in a node (3 cards give a 14.1-fold speedup over BWA-MEM). SUMMARY: MICA can be readily used by MIC-enabled supercomputers for production purposes. We have tested MICA on Tianhe-2 with 90 WGS samples (17.47 Tera-bases), which can be aligned in an hour using 400 nodes. MICA has impressive performance even though MIC is only in its initial stage of development. AVAILABILITY AND IMPLEMENTATION: MICA's source code is freely available at http://sourceforge.net/projects/mica-aligner under GPL v3. SUPPLEMENTARY INFORMATION: Supplementary information is available as "Additional File 1". Datasets are available at www.bio8.cs.hku.hk/dataset/mica.
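
The scaling and throughput figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below uses only numbers stated in the abstract; it assumes that "an hour" means exactly 3600 seconds and that "Tera" means 10^12. The function names are illustrative.

```python
# Figures quoted in the abstract.
SPEEDUP_ONE_CARD = 4.9      # MICA on 1 MIC card vs. BWA-MEM on 6 CPU cores
SPEEDUP_THREE_CARDS = 14.1  # MICA on 3 MIC cards vs. the same baseline
TOTAL_BASES = 17.47e12      # 90 WGS samples, 17.47 Tera-bases
NODES = 400
WALL_SECONDS = 3600         # assumption: "an hour" taken as exactly 3600 s


def scaling_efficiency(speedup_one: float, speedup_n: float, n_cards: int) -> float:
    """Fraction of ideal linear scaling achieved when going from 1 to n cards."""
    return (speedup_n / speedup_one) / n_cards


def bases_per_node_per_second(total_bases: float, nodes: int, seconds: float) -> float:
    """Average alignment throughput of a single node."""
    return total_bases / nodes / seconds


efficiency = scaling_efficiency(SPEEDUP_ONE_CARD, SPEEDUP_THREE_CARDS, 3)
throughput = bases_per_node_per_second(TOTAL_BASES, NODES, WALL_SECONDS)
print(f"3-card scaling efficiency: {efficiency:.1%}")
print(f"per-node throughput: {throughput / 1e6:.1f} Mbases/s")
```

Under these assumptions, three cards reach about 96% of ideal linear scaling, and each Tianhe-2 node averages roughly 12 million bases per second.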


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Humans , Programming Languages
3.
PeerJ ; 2: e421, 2014.
Article in English | MEDLINE | ID: mdl-24949238

ABSTRACT

This paper reports an integrated solution, called BALSA, for the secondary analysis of next-generation sequencing data; it exploits the computational power of the GPU and intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (~750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA's speed is rooted in its parallel algorithms, which effectively exploit a GPU to speed up processes such as alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the development of apps for downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.
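
As a sanity check on the whole-genome figures, the sketch below estimates coverage and throughput from the numbers in the abstract. It rests on two assumptions that the abstract does not state: that "750 million paired-end reads" counts read pairs (two 100 bp reads each), and that the human genome is about 3.1 billion bases long.

```python
# Figures quoted in the abstract.
READ_PAIRS = 750e6   # assumption: the count refers to read pairs
READ_LENGTH = 100    # bp per read
HOURS = 5.5          # time from raw reads to variants

# Assumption: approximate human genome length, not given in the abstract.
GENOME_SIZE = 3.1e9


def total_bases(pairs: float, read_length: int) -> float:
    """Total sequenced bases for paired-end data (two reads per pair)."""
    return pairs * 2 * read_length


def coverage(bases: float, genome_size: float) -> float:
    """Average sequencing depth: sequenced bases divided by genome length."""
    return bases / genome_size


bases = total_bases(READ_PAIRS, READ_LENGTH)
depth = coverage(bases, GENOME_SIZE)
bases_per_second = bases / (HOURS * 3600)
print(f"total bases: {bases:.3g}")
print(f"estimated coverage: {depth:.1f}x")
print(f"throughput: {bases_per_second / 1e6:.1f} Mbases/s")
```

With these assumptions the estimate is about 48-fold, in line with the stated 50-fold coverage. If the count referred to individual reads instead of pairs, the estimate would halve and no longer match, which is why the read-pair interpretation is used here.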
