1.
Mol Ecol ; 21(13): 3363-78, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22486918

ABSTRACT

Symbiotic bacteria often help their hosts acquire nutrients from their diet, showing trends of co-evolution and independent acquisition by hosts from the same trophic levels. While these trends hint at important roles for biotic factors, the effects of the abiotic environment on symbiotic community composition remain comparatively understudied. In this investigation, we examined the influence of abiotic and biotic factors on the gut bacterial communities of fish from different taxa, trophic levels and habitats. Phylogenetic and statistical analyses of 25 16S rRNA libraries revealed that salinity, trophic level and possibly host phylogeny shape the composition of fish gut bacteria. When analysed alongside bacterial communities from other environments, fish gut communities typically clustered with gut communities from mammals and insects. Similar consideration of individual phylotypes (vs. communities) revealed evolutionary ties between fish gut microbes and symbionts of animals, as many of the bacteria from the guts of herbivorous fish were closely related to those from mammals. Our results indicate that fish harbour more specialized gut communities than previously recognized. They also highlight a trend of convergent acquisition of similar bacterial communities by fish and mammals, raising the possibility that fish were the first to evolve symbioses resembling those found among extant gut-fermenting mammals.


Subject(s)
Bacteria/genetics , Fishes/microbiology , Gastrointestinal Tract/microbiology , Metagenome , Animals , Bacteria/classification , Molecular Sequence Data , Phylogeny , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA , Symbiosis
2.
J Biomed Biotechnol ; 2011: 495849, 2011.
Article in English | MEDLINE | ID: mdl-21541181

ABSTRACT

High-throughput sequencing technologies enable metagenome profiling, the simultaneous sequencing of multiple microbial species present within an environmental sample. Since metagenomic data include sequence fragments ("reads") from organisms that are absent from any database, new algorithms must be developed for the identification and annotation of novel sequence fragments. Homology-based techniques have been modified to detect novel species and genera, but composition-based methods have not been adapted. We develop a detection technique that can discriminate between "known" and "unknown" taxa and that can be used with composition-based methods as well as with a hybrid method. Unlike previous studies, we rigorously evaluate all algorithms for their ability to detect novel taxa. First, we show that the integration of a detector with a composition-based method performs significantly better than homology-based methods for the detection of novel species and genera, with best performance at finer taxonomic resolutions. Most importantly, we evaluate all the algorithms by introducing an "unknown" class and show that the modified version of PhymmBL has similar or better overall classification performance than the other modified algorithms, especially at the species level and for ultrashort reads. Finally, we evaluate the performance of several algorithms on a real acid mine drainage dataset.


Subject(s)
DNA Barcoding, Taxonomic/methods , Sequence Analysis, DNA/methods , Algorithms , Bacteria/genetics , Databases, Nucleic Acid , Genome/genetics , Metagenomics , Mining , Open Reading Frames/genetics , ROC Curve , Species Specificity , Waste Disposal, Fluid
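
Note on entry 2: the abstract does not say how its detector separates "known" from "unknown" taxa. One simple possibility, shown here purely as an assumption and not as the authors' method, is to reject a read when the best score from a composition-based classifier falls below a cutoff. The function name, the score values and the threshold are all hypothetical.

```python
def label_with_unknown(scores, threshold):
    """Return the best-scoring taxon, or "unknown" if that score is too low.

    scores    -- dict mapping taxon name to a classifier score
                 (higher means a better match); values here are hypothetical
    threshold -- assumed cutoff below which a read is treated as novel
    """
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"


# Illustrative per-read scores (e.g. length-normalized log-likelihoods).
confident_read = {"taxon_A": -1.2, "taxon_B": -3.5}
ambiguous_read = {"taxon_A": -4.1, "taxon_B": -4.3}

print(label_with_unknown(confident_read, threshold=-2.0))
print(label_with_unknown(ambiguous_read, threshold=-2.0))
```

In practice the cutoff would have to be tuned on held-out data, for example by tracing an ROC curve over candidate thresholds.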
3.
PLoS One ; 10(1): e0109277, 2015.
Article in English | MEDLINE | ID: mdl-25607539

ABSTRACT

Researchers are perpetually amassing biological sequence data. The computational approaches employed by ecologists for organizing this data (e.g. alignment, phylogeny) typically scale nonlinearly in execution time with the size of the dataset. This often serves as a bottleneck for processing experimental data, since many molecular studies are characterized by massive datasets. To keep up with experimental data demands, ecologists are forced to choose between continually upgrading expensive in-house computer hardware and outsourcing the most demanding computations to the cloud. Outsourcing is attractive since it is the least expensive option, but it does not necessarily allow direct user interaction with the data for exploratory analysis. Desktop analytical tools such as ARB are indispensable for this purpose, but they do not necessarily offer a convenient solution for the coordination and integration of datasets between local and outsourced destinations. Therefore, researchers are currently left with an undesirable tradeoff between computational throughput and analytical capability. To mitigate this tradeoff, we introduce a software package that combines the interactive exploratory tools offered by ARB with the computational throughput of cloud-based resources. Our pipeline serves as middleware between the desktop and the cloud, allowing researchers to form local custom databases containing sequences and metadata from multiple resources and providing a method for linking data outsourced for computation back to the local database. A tutorial implementation of the toolkit is provided in the supporting information, S1 Tutorial. AVAILABILITY: http://www.ece.drexel.edu/gailr/EESI/tutorial.php.


Subject(s)
Models, Genetic , Phylogeny , Sequence Analysis, DNA/methods , Software
4.
Pac Symp Biocomput ; : 10-20, 2010.
Article in English | MEDLINE | ID: mdl-19908353

ABSTRACT

Metagenomics is the study of environmental samples. Because few tools exist for metagenomic analysis, a natural step has been to utilize the popular homology tool, BLAST, to search for sequence similarity between sample fragments and an administered database. Most biologists use this method today without knowing BLAST's accuracy, especially when a particular taxonomic class is under-represented in the database. The aim of this paper is to benchmark the performance of BLAST for taxonomic classification of metagenomic datasets in a supervised setting, meaning that the database contains microbes of the same class as the 'unknown' query fragments. We examine well- and under-represented genera and phyla in order to study their effect on the accuracy of BLAST. We conclude that for fine-resolution classes, such as genera, the accuracy of BLAST does not degrade very much with under-representation, but for a highly variant class, such as phyla, performance degrades significantly. Our analysis includes five-fold cross validation to substantiate our findings.


Subject(s)
Metagenomics/methods , Algorithms , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Computational Biology , Crenarchaeota/classification , Crenarchaeota/genetics , Crenarchaeota/isolation & purification , DNA, Archaeal/genetics , DNA, Bacterial/genetics , Databases, Nucleic Acid , Humans , Metagenome/genetics , Metagenomics/statistics & numerical data , Phylogeny , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data
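
Note on entry 4: the five-fold cross validation mentioned in the abstract can be sketched as a partition of item indices into five held-out folds. This is a generic illustration, not the authors' protocol; the function name and the round-robin assignment of items to folds are assumptions.

```python
def k_fold_indices(n_items, n_folds=5):
    """Yield (train, test) index lists, holding out each fold exactly once.

    Items are dealt to folds round-robin (an assumption made for simplicity);
    a real benchmark would typically shuffle or stratify by taxon first.
    """
    folds = [list(range(start, n_items, n_folds)) for start in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [idx
                 for fold_id, fold in enumerate(folds) if fold_id != held_out
                 for idx in fold]
        yield train, test


# Ten hypothetical genomes split into five train/test rounds.
for train, test in k_fold_indices(10):
    print("train:", train, "test:", test)
```

Accuracy is then computed on each held-out fold and averaged over the five rounds.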
5.
IEEE Trans Nanobioscience ; 9(4): 310-6, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20876033

ABSTRACT

"Binning" (or taxonomic classification) of DNA sequence reads is an initial step in analyzing an environmental biological sample. Currently, a homology-based tool, BLAST, is one of the most commonly used tools to label DNA reads, but it is argued that BLAST will quickly lose its classification ability as the genome databases grow. In this paper, we compare the accuracies of a naïve Bayes classifier (NBC) and a statistical language model to BLAST for binning reads and demonstrate that NBC obtains good performance at low computational cost. On the other hand, the back-off n-gram language model can improve accuracy when only partial training data is available (such as in-progress sequencing projects). NBC demonstrates performance comparable to BLAST and can also be optimized on partial training datasets by adjusting the word feature size. A fivefold cross validation is conducted to compare each method's accuracy for determining novel genomes at different taxonomic levels, with NBC outperforming BLAST for species-level classification but BLAST outperforming NBC for genus-level and phylum-level classification. In conclusion, the NBC is a competitive taxonomic classifier, and language models can improve performance when only partial training data is available.


Subject(s)
Metagenomics/methods , Models, Statistical , Sequence Analysis, DNA/methods , Bayes Theorem , Databases, Genetic , Genome , Peptide Fragments/classification
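
Note on entry 5: a naïve Bayes classifier over fixed-length DNA words (k-mers) can be sketched as below, where `k` plays the role of the "word feature size" mentioned in the abstract. This is a minimal illustration rather than the authors' implementation: the add-one smoothing, the uniform prior over genomes, the function names and the toy sequences are all assumptions.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping words of length k in a DNA sequence."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(genomes, k):
    """Count k-mers per genome; `genomes` maps a label to a sequence."""
    return {label: Counter(kmers(seq, k)) for label, seq in genomes.items()}


def log_score(read, counts, k):
    """Log-likelihood of a read under one genome (add-one smoothing assumed)."""
    total = sum(counts.values())
    vocab = 4 ** k  # number of possible DNA words of length k
    return sum(math.log((counts[w] + 1) / (total + vocab))
               for w in kmers(read, k))


def classify(read, model, k):
    """Assign the read to the genome with the highest log-likelihood.

    A uniform prior over genomes is assumed, so the prior term is omitted.
    """
    return max(model, key=lambda label: log_score(read, model[label], k))


# Toy, made-up sequences used only to exercise the code.
genomes = {
    "genome_A": "ATATATATATATATATATATATAT",
    "genome_B": "GCGCGCGCGCGCGCGCGCGCGCGC",
}
model = train(genomes, k=3)
print(classify("TATATATA", model, k=3))
```

Because training reduces to counting words and scoring to summing logarithms, the cost per read grows linearly with read length and with the number of candidate genomes.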