1.
IEEE/ACM Trans Comput Biol Bioinform ; 16(4): 1117-1131, 2019.
Article in English | MEDLINE | ID: mdl-28991750

ABSTRACT

Counting and indexing fixed length substrings, or $k$-mers, in biological sequences is a key step in many bioinformatics tasks including genome alignment and mapping, genome assembly, and error correction. While advances in next generation sequencing technologies have dramatically reduced the cost and improved latency and throughput, few bioinformatics tools can efficiently process the datasets at the current generation rate of 1.8 terabases per 3-day experiment from a single sequencer. We present Kmerind, a high performance parallel $k$-mer indexing library for distributed memory environments. The Kmerind library provides a set of simple and consistent APIs with sequential semantics and parallel implementations that are designed to be flexible and extensible. Kmerind's $k$-mer counter performs similarly or better than the best existing $k$-mer counting tools even on shared memory systems. In a distributed memory environment, Kmerind counts $k$-mers in a 120 GB sequence read dataset in less than 13 seconds on 1024 Xeon CPU cores, and fully indexes their positions in approximately 17 seconds. Querying for 1 percent of the $k$-mers in these indices can be completed in 0.23 seconds and 28 seconds, respectively. Kmerind is the first $k$-mer indexing library for distributed memory environments, and the first extensible library for general $k$-mer indexing and counting. Kmerind is available at https://github.com/ParBLiSS/kmerind.


Subject(s)
Computational Biology/methods , Genome, Human , High-Throughput Nucleotide Sequencing , Algorithms , Computer Communication Networks , Computers , Gene Library , Genome , Humans , Programming Languages , Semantics , Software
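
For readers unfamiliar with the two operations named in the abstract above, the following is a minimal single-process sketch of $k$-mer counting and position indexing. It is not Kmerind's API or algorithm (Kmerind is a parallel, distributed-memory library); the function names `count_kmers` and `index_kmers` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def count_kmers(sequence, k):
    """Count every length-k substring (k-mer) of a sequence.

    Hypothetical helper for illustration; not part of Kmerind.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence of length n contains n - k + 1 overlapping k-mers.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def index_kmers(sequence, k):
    """Map each k-mer to the 0-based start positions where it occurs.

    Hypothetical helper for illustration; not part of Kmerind.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return dict(positions)


if __name__ == "__main__":
    read = "ACGTACGT"
    print(count_kmers(read, 3))
    print(index_kmers(read, 3))
```

A counting query returns how often a $k$-mer occurs, while a position query returns where it occurs, which is why a full position index holds more data than a count table.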
2.
Sci Rep ; 8(1): 10872, 2018 Jul 18.
Article in English | MEDLINE | ID: mdl-30022098

ABSTRACT

The biological interpretation of gene lists with interesting shared properties, such as up- or down-regulation in a particular experiment, is typically accomplished using gene ontology enrichment analysis tools. Given a list of genes, a gene ontology (GO) enrichment analysis may return hundreds of statistically significant GO results in a "flat" list, which can be challenging to summarize. It can also be difficult to keep pace with rapidly expanding biological knowledge, which often results in daily changes to any of the over 47,000 gene ontologies that describe biological knowledge. GOATOOLS, a Python-based library, makes it more efficient to stay current with the latest ontologies and annotations, perform gene ontology enrichment analyses to determine over- and under-represented terms, and organize results for greater clarity and easier interpretation using a novel GOATOOLS GO grouping method. We performed functional analyses on both stochastic simulation data and real data from a published RNA-seq study to compare the enrichment results from GOATOOLS to two other popular tools: DAVID and GOstats. GOATOOLS is freely available through GitHub: https://github.com/tanghaibao/goatools.


Subject(s)
Alzheimer Disease/genetics , Biomarkers/analysis , Computational Biology/methods , Disease Models, Animal , Gene Expression Regulation, Developmental , Gene Ontology , Software , Algorithms , Alzheimer Disease/pathology , Animals , Gene Expression Profiling , Mice
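
As a rough illustration of what testing a GO term for over-representation means in the abstract above, the sketch below computes a one-sided hypergeometric tail probability for a single term. It is one common formulation of an enrichment test, not the GOATOOLS API and not necessarily the test GOATOOLS applies; the function name `enrichment_p_value` and its parameters are hypothetical. Under-representation and multiple-testing correction are not covered.

```python
from math import comb


def enrichment_p_value(study_hits, study_size, pop_hits, pop_size):
    """One-sided hypergeometric p-value for over-representation.

    Hypothetical helper for illustration; not part of GOATOOLS.

    study_hits: genes in the study list annotated with the GO term
    study_size: total genes in the study list
    pop_hits:   genes in the background population with the GO term
    pop_size:   total genes in the background population

    Returns P(X >= study_hits) when study_size genes are drawn at
    random, without replacement, from the population.
    """
    if not (0 <= study_hits <= study_size <= pop_size):
        raise ValueError("need 0 <= study_hits <= study_size <= pop_size")
    if not (0 <= pop_hits <= pop_size):
        raise ValueError("need 0 <= pop_hits <= pop_size")
    upper = min(study_size, pop_hits)
    total = comb(pop_size, study_size)
    # Sum the probability of every outcome at least as extreme as observed.
    tail = sum(
        comb(pop_hits, i) * comb(pop_size - pop_hits, study_size - i)
        for i in range(study_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers chosen for illustration only.
    print(enrichment_p_value(study_hits=3, study_size=3, pop_hits=3, pop_size=10))
```

A small p-value suggests the term appears in the study list more often than random sampling from the background would explain. With hundreds of terms tested at once, a real analysis would also adjust for multiple comparisons.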