1.
Nature ; 551(7681): 457-463, 2017 Nov 23.
Article in English | MEDLINE | ID: mdl-29088705

ABSTRACT

Our growing awareness of the microbial world's importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth's microbial diversity.


Subject(s)
Biodiversity; Earth (Planet); Microbiota/genetics; Animals; Archaea/genetics; Archaea/isolation & purification; Bacteria/genetics; Bacteria/isolation & purification; Ecology/methods; Gene Dosage; Geographic Mapping; Humans; Plants/microbiology; RNA, Ribosomal, 16S/analysis; RNA, Ribosomal, 16S/genetics
2.
BMC Bioinformatics ; 17: 155, 2016 Apr 08.
Article in English | MEDLINE | ID: mdl-27059896

ABSTRACT

BACKGROUND: Understanding the interactions between antibodies and the linear epitopes that they recognize is an important task in the study of immunological diseases. We present a novel computational method for the design of linear epitopes of specified binding affinity to Intravenous Immunoglobulin (IVIg). RESULTS: We show that the method, called Pythia-design, can accurately design peptides with both high binding affinity and low binding affinity to IVIg. To show this, we experimentally constructed and tested the computationally constructed designs. We further show experimentally that these designed peptides are more accurate than those produced by a recent method for the same task. Pythia-design is based on combining random walks with an ensemble of probabilistic support vector machine (SVM) classifiers, and we show that it produces a diverse set of designed peptides, an important property for developing robust sets of candidates for construction. We show that by combining Pythia-design and the method of (PloS ONE 6(8):23616, 2011), we are able to produce an even more accurate collection of designed peptides. Analysis of the experimental validation of Pythia-design peptides indicates that binding of IVIg is favored by epitopes that contain tryptophan and cysteine. CONCLUSIONS: Our method, Pythia-design, is able to generate a diverse set of binding and non-binding peptides, and its designs have been experimentally shown to be accurate.


Subject(s)
Computational Biology/methods; Epitopes/chemistry; Immunoglobulins, Intravenous/chemistry; Peptides, Cyclic/chemistry; Citrulline/chemistry; Cysteine/chemistry; Humans; Models, Molecular; Reproducibility of Results; Support Vector Machine; Tryptophan/chemistry
3.
Mol Ecol ; 25(19): 4963-77, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27588381

ABSTRACT

Blacklegged ticks (Ixodes scapularis) are one of the most important pathogen vectors in the United States, responsible for transmitting Lyme disease and other tick-borne diseases. The structure of a host's microbial community has the potential to affect the ecology and evolution of the host. We employed high-throughput sequencing of the 16S rRNA gene V3-V4 hypervariable regions in the first study to investigate the tick microbiome across all developmental stages (larvae, nymphs, adults). In addition to field-collected life stages, newly hatched laboratory-reared larvae were studied to determine the baseline microbial community structure and to assess transovarial transmission. We also targeted midguts and salivary glands due to their importance in pathogen maintenance and transmission. Over 100 000 sequences were produced per life stage replicate. Rickettsia was the most abundant bacterial genus across all sample types, matching mostly the Ixodes rickettsial endosymbionts, and its proportion decreased as developmental stage progressed, with the exception of adult females, which harboured a mean relative abundance of 97.9%. Laboratory-reared larvae displayed the lowest bacterial diversity, containing almost exclusively Rickettsia. Many of the remaining bacteria included genera associated with soil, water and plants, suggesting environmental acquisition while off-host. Female organs exhibited significantly different β-diversity than the whole tick from which they were derived. Our results demonstrate clear differences in both α- and β-diversity among tick developmental stages and between tick organs and the tick as a whole. Furthermore, field-acquired bacteria appear to be very important to the overall internal bacterial community of this tick species, with influence from the host bloodmeal appearing limited.


Subject(s)
Bacteria/classification; Ixodes/microbiology; Microbiota; Animals; Female; Larva/microbiology; New York; Nymph/microbiology; RNA, Ribosomal, 16S/genetics; Rickettsia/classification; Rickettsia/isolation & purification
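
The abstract above compares α-diversity (within a sample) and β-diversity (between samples) but does not say which indices were used. As a minimal sketch, assuming the Shannon index for α-diversity and Bray-Curtis dissimilarity for β-diversity (both are assumptions, as are the toy counts), the two quantities could be computed like this:

```python
import math

def shannon_index(counts):
    """Alpha diversity: Shannon index (natural log) of one sample's taxon counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(counts_a, counts_b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples.

    Both lists must give counts for the same taxa in the same order.
    """
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    return 1 - 2 * shared / (sum(counts_a) + sum(counts_b))

# Hypothetical counts for three genera (illustrative only, not data from the study).
larva = [98, 1, 1]    # dominated by a single genus
nymph = [40, 30, 30]  # more even community

print(shannon_index(larva), shannon_index(nymph))
print(bray_curtis(larva, nymph))
```

A sample dominated by one genus gets a low Shannon index, which is the pattern the abstract reports for laboratory-reared larvae.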
4.
Nat Methods ; 9(8): 796-804, 2012 Jul 15.
Article in English | MEDLINE | ID: mdl-22796662

ABSTRACT

Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ~1,700 transcriptional interactions at a precision of ~50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.


Subject(s)
Computational Biology; Gene Expression Regulation, Bacterial/genetics; Gene Regulatory Networks; Oligonucleotide Array Sequence Analysis; Algorithms; Escherichia coli/genetics; Saccharomyces cerevisiae/genetics; Software; Staphylococcus aureus/genetics; Transcription, Genetic/genetics
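
The abstract above reports that integrating predictions from multiple inference methods is more robust than any single method, but it does not spell out the integration scheme. A minimal sketch, assuming a simple average-rank aggregation (an assumption, not necessarily the authors' procedure) and hypothetical toy edge lists:

```python
from collections import defaultdict

def average_rank_ensemble(rankings):
    """Combine edge rankings from several methods by their mean rank.

    rankings: list of lists; each inner list holds (regulator, target)
    edges ordered from most to least confident by one method.
    An edge that a method did not rank gets that method's worst rank + 1.
    """
    all_edges = set()
    for ranked in rankings:
        all_edges.update(ranked)

    total_rank = defaultdict(float)
    for ranked in rankings:
        position = {edge: i + 1 for i, edge in enumerate(ranked)}
        missing_rank = len(ranked) + 1
        for edge in all_edges:
            total_rank[edge] += position.get(edge, missing_rank)

    n_methods = len(rankings)
    # Sort by mean rank; ties are broken alphabetically so the output is deterministic.
    return sorted(all_edges, key=lambda e: (total_rank[e] / n_methods, e))

# Hypothetical rankings from three methods (placeholder edges for illustration).
method_a = [("lexA", "recA"), ("crp", "lacZ"), ("fur", "fepA")]
method_b = [("crp", "lacZ"), ("lexA", "recA"), ("arcA", "sdhC")]
method_c = [("lexA", "recA"), ("fur", "fepA"), ("crp", "lacZ")]

consensus = average_rank_ensemble([method_a, method_b, method_c])
print(consensus)
```

Edges ranked highly by several methods rise to the top, while an edge proposed by only one method is pushed down.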
5.
Bioinformatics ; 29(1): 135-6, 2013 Jan 01.
Article in English | MEDLINE | ID: mdl-23202745

ABSTRACT

SUMMARY: Computational workloads for genome-wide association studies (GWAS) are growing in scale and complexity, outpacing the capabilities of single-threaded software designed for personal computers. The BlueSNP R package implements GWAS statistical tests in the R programming language and executes the calculations across computer clusters configured with Apache Hadoop, a de facto standard framework for distributed data processing using the MapReduce formalism. BlueSNP makes computationally intensive analyses, such as estimating empirical p-values via data permutation and searching for expression quantitative trait loci over thousands of genes, feasible for large genotype-phenotype datasets. AVAILABILITY AND IMPLEMENTATION: http://github.com/ibm-bioinformatics/bluesnp


Subject(s)
Genome-Wide Association Study; Software; Humans; Phenotype; Quantitative Trait Loci
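
The abstract above mentions estimating empirical p-values via data permutation. The sketch below illustrates that general idea on a single machine in Python. It is not BlueSNP's R API, and the test statistic (absolute difference in group means), the function names, and the toy data are all assumptions made for illustration:

```python
import random

def mean_difference(values, labels):
    """Absolute difference in mean phenotype between the two genotype groups."""
    group1 = [v for v, g in zip(values, labels) if g == 1]
    group0 = [v for v, g in zip(values, labels) if g == 0]
    return abs(sum(group1) / len(group1) - sum(group0) / len(group0))

def empirical_p_value(values, labels, n_permutations=1000, seed=0):
    """Estimate a p-value by shuffling the labels and recomputing the statistic."""
    rng = random.Random(seed)
    observed = mean_difference(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if mean_difference(values, shuffled) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical phenotype values and binary genotype labels.
phenotype = [5.1, 4.9, 5.3, 5.0, 7.2, 7.0, 6.9, 7.4]
genotype = [0, 0, 0, 0, 1, 1, 1, 1]

p = empirical_p_value(phenotype, genotype)
print(p)
```

Each permutation is independent of the others, which is why this kind of workload parallelizes naturally across a cluster.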
6.
Proc Natl Acad Sci U S A ; 107(14): 6286-91, 2010 Apr 06.
Article in English | MEDLINE | ID: mdl-20308593

ABSTRACT

Numerous methods have been developed for inferring gene regulatory networks from expression data; however, both their absolute and comparative performance remain poorly understood. In this paper, we introduce a framework for critical performance assessment of methods for gene network inference. We present an in silico benchmark suite that we provided as a blinded, community-wide challenge within the context of the DREAM (Dialogue on Reverse Engineering Assessment and Methods) project. We assess the performance of 29 gene-network-inference methods, which have been applied independently by participating teams. Performance profiling reveals that current inference methods are affected, to various degrees, by different types of systematic prediction errors. In particular, all but the best-performing method failed to accurately infer multiple regulatory inputs (combinatorial regulation) of genes. The results of this community-wide experiment show that reliable network inference from gene expression data remains an unsolved problem, and they indicate potential ways to improve network reconstruction.


Subject(s)
Computational Biology/methods; Gene Regulatory Networks; Biometry; Gene Expression Profiling
7.
NPJ Sci Food ; 5(1): 3, 2021 Feb 08.
Article in English | MEDLINE | ID: mdl-33558514

ABSTRACT

In this work, we hypothesized that shifts in the food microbiome can be used as an indicator of unexpected contaminants or environmental changes. To test this hypothesis, we sequenced the total RNA of 31 high protein powder (HPP) samples of poultry meal pet food ingredients. We developed a microbiome analysis pipeline employing a key eukaryotic matrix filtering step that improved microbe detection specificity to >99.96% during in silico validation. The pipeline identified 119 microbial genera per HPP sample on average with 65 genera present in all samples. The most abundant of these were Bacteroides, Clostridium, Lactococcus, Aeromonas, and Citrobacter. We also observed shifts in the microbial community corresponding to ingredient composition differences. When comparing culture-based results for Salmonella with total RNA sequencing, we found that Salmonella growth did not correlate with multiple sequence analyses. We conclude that microbiome sequencing is useful to characterize complex food microbial communities, while additional work is required for predicting specific species' viability from total RNA sequencing.

8.
NPJ Sci Food ; 3: 24, 2019.
Article in English | MEDLINE | ID: mdl-31754632

ABSTRACT

Here we propose that using shotgun sequencing to examine food leads to accurate authentication of ingredients and detection of contaminants. To demonstrate this, we developed a bioinformatic pipeline, FASER (Food Authentication from SEquencing Reads), designed to resolve the relative composition of mixtures of eukaryotic species using RNA or DNA sequencing. Our comprehensive database includes >6000 plants and animals that may be present in food. FASER accurately identified eukaryotic species with 0.4% median absolute difference between observed and expected proportions on sequence data from various sources including sausage meat, plants, and fish. FASER was applied to 31 high protein powder raw factory ingredient total RNA samples. The samples mostly contained the expected source ingredient, chicken, while three samples unexpectedly contained pork and beef. Our results demonstrate that DNA/RNA sequencing of food ingredients, combined with a robust analysis, can be used to find contaminants and authenticate food ingredients in a single assay.
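
The accuracy figure quoted above is a median absolute difference between observed and expected proportions. The abstract does not give the exact computation, so the following is only a sketch of one plausible reading, with a hypothetical function name and made-up proportions:

```python
from statistics import median

def median_absolute_difference(observed, expected):
    """Median of |observed - expected| over every species seen in either profile.

    A species missing from one profile is treated as having proportion 0.
    """
    species = set(observed) | set(expected)
    return median(
        abs(observed.get(s, 0.0) - expected.get(s, 0.0)) for s in species
    )

# Hypothetical mixture (illustrative proportions, not data from the study).
expected = {"chicken": 0.70, "pork": 0.20, "beef": 0.10}
observed = {"chicken": 0.69, "pork": 0.205, "beef": 0.105}

mad = median_absolute_difference(observed, expected)
print(f"{mad:.3%}")
```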

10.
PLoS Biol ; 3(11): e343, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16187794

ABSTRACT

Biological networks, such as those describing gene regulation, signal transduction, and neural synapses, are representations of large-scale dynamic systems. Discovery of organizing principles of biological networks can be enhanced by embracing the notion that there is a deep interplay between network structure and system dynamics. Recently, many structural characteristics of these non-random networks have been identified, but dynamical implications of the features have not been explored comprehensively. We demonstrate by exhaustive computational analysis that a dynamical property (stability or robustness to small perturbations) is highly correlated with the relative abundance of small subnetworks (network motifs) in several previously determined biological networks. We propose that robust dynamical stability is an influential property that can determine the non-random structure of biological networks.


Subject(s)
Computer Simulation; Models, Biological; Signal Transduction; Systems Theory; Transcription, Genetic; Animals; Caenorhabditis elegans/physiology; Computational Biology/methods; Drosophila melanogaster/physiology; Escherichia coli/physiology; Nerve Net; Saccharomyces cerevisiae/physiology; Statistics as Topic
11.
PLoS One ; 12(4): e0175527, 2017.
Article in English | MEDLINE | ID: mdl-28384336

ABSTRACT

BACKGROUND: Paper currency by its very nature is frequently transferred from one person to another and represents an important medium for human contact with, and potential exchange of, microbes. In this pilot study, we swabbed circulating $1 bills obtained from a New York City bank in February (Winter) and June (Summer) 2013 and used shotgun metagenomic sequencing to profile the communities found on their surface. Using basic culture conditions, we also tested whether viable microbes could be recovered from bills. RESULTS: Shotgun metagenomics identified eukaryotes as the most abundant sequences on money, followed by bacteria, viruses and archaea. Eukaryotic assemblages were dominated by human, other metazoan and fungal taxa. The currency investigated harbored a diverse microbial population that was dominated by human skin and oral commensals, including Propionibacterium acnes, Staphylococcus epidermidis and Micrococcus luteus. Other taxa detected not associated with humans included Lactococcus lactis and Streptococcus thermophilus, microbes typically associated with dairy production and fermentation. Culturing results indicated that viable microbes can be isolated from paper currency. CONCLUSIONS: We conducted the first metagenomic characterization of the surface of paper money in the United States, establishing a baseline for microbes found on $1 bills circulating in New York City. Our results suggest that money amalgamates DNA from sources inhabiting the human microbiome, food, and other environmental inputs, some of which can be recovered as viable organisms. These monetary communities may be maintained through contact with human skin, and DNA obtained from money may provide a record of human behavior and health. Understanding these microbial profiles is especially relevant to public health as money could potentially mediate interpersonal transfer of microbes.


Subject(s)
Bacteria/isolation & purification; Metagenomics; Humans; New York City; Pilot Projects; Surface Properties
12.
Genome Biol ; 18(1): 182, 2017 Sep 21.
Article in English | MEDLINE | ID: mdl-28934964

ABSTRACT

BACKGROUND: One of the main challenges in metagenomics is the identification of microorganisms in clinical and environmental samples. While an extensive and heterogeneous set of computational tools is available to classify microorganisms using whole-genome shotgun sequencing data, comprehensive comparisons of these methods are limited. RESULTS: In this study, we use the largest-to-date set of laboratory-generated and simulated controls across 846 species to evaluate the performance of 11 metagenomic classifiers. Tools were characterized on the basis of their ability to identify taxa at the genus, species, and strain levels, quantify relative abundances of taxa, and classify individual reads to the species level. Strikingly, the number of species identified by the 11 tools can differ by over three orders of magnitude on the same datasets. Various strategies can ameliorate taxonomic misclassification, including abundance filtering, ensemble approaches, and tool intersection. Nevertheless, these strategies were often insufficient to completely eliminate false positives from environmental samples, which are especially important where they concern medically relevant species. Overall, pairing tools with different classification strategies (k-mer, alignment, marker) can combine their respective advantages. CONCLUSIONS: This study provides positive and negative controls, titrated standards, and a guide for selecting tools for metagenomic analyses by comparing ranges of precision, accuracy, and recall. We show that proper experimental design and analysis parameters can reduce false positives, provide greater resolution of species in complex metagenomic samples, and improve the interpretation of results.


Subject(s)
Benchmarking/methods; Contig Mapping/methods; DNA Barcoding, Taxonomic/methods; Metagenome; Sequence Analysis, DNA/methods; Software; Benchmarking/standards; Contig Mapping/standards; DNA Barcoding, Taxonomic/standards; Humans; Microbiota; Phylogeny; Sequence Analysis, DNA/standards
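
The abstract above names abundance filtering and tool intersection as ways to reduce false-positive species calls, and compares tools by precision and recall. A minimal sketch of those ideas follows. The tool outputs, the 5% threshold, and the function names are hypothetical and do not reflect any specific classifier:

```python
def precision_recall(predicted, truth):
    """Precision and recall of a predicted species set against the known composition."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

def abundance_filter(abundances, min_fraction):
    """Keep only taxa whose estimated relative abundance reaches the threshold."""
    return {taxon for taxon, frac in abundances.items() if frac >= min_fraction}

# Hypothetical mock community and the outputs of two imaginary classifiers.
truth = {"Escherichia coli", "Staphylococcus aureus", "Bacillus subtilis"}
tool_a = {
    "Escherichia coli": 0.50,
    "Staphylococcus aureus": 0.30,
    "Bacillus subtilis": 0.15,
    "Shigella flexneri": 0.04,  # low-abundance false positive
    "Bacillus cereus": 0.01,    # low-abundance false positive
}
tool_b = {
    "Escherichia coli": 0.55,
    "Staphylococcus aureus": 0.28,
    "Bacillus subtilis": 0.12,
    "Staphylococcus epidermidis": 0.05,  # false positive unique to tool B
}

raw = precision_recall(set(tool_a), truth)
filtered = precision_recall(abundance_filter(tool_a, 0.05), truth)
intersected = precision_recall(set(tool_a) & set(tool_b), truth)
print(raw, filtered, intersected)
```

In this toy case both strategies remove the false positives without losing true species. On real samples a threshold can also discard genuine low-abundance taxa, which is one reason such strategies may not be sufficient on their own.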
13.
PLoS One ; 10(6): e0125777, 2015.
Article in English | MEDLINE | ID: mdl-26030907

ABSTRACT

Single-cell RNA and protein concentrations dynamically fluctuate because of stochastic ("noisy") regulation. Consequently, biological signaling and genetic networks propagate not only functional responses to stimuli but also random fluctuations. Intuitively, this feature manifests as the accumulation of fluctuations from the network source to the target. Taking advantage of the fact that noise propagates directionally, we developed a method for causation prediction that does not require time-lagged observations and therefore can be applied to data generated by destructive assays such as immunohistochemistry. Our method for causation prediction, "Inference of Network Directionality Using Covariance Elements (INDUCE)," exploits the theoretical relationship between a change in the strength of a causal interaction and the associated changes in the single-cell measured entries of the covariance matrix of protein concentrations. We validated our method for causation prediction in two experimental systems where causation is well established: in an E. coli synthetic gene network, and in MEK to ERK signaling in mammalian cells. We report the first analysis of covariance elements documenting noise propagation from a kinase to a phosphorylated substrate in an endogenous mammalian signaling network.


Subject(s)
Algorithms; Noise; Escherichia coli; Models, Biological
14.
F1000Res ; 4: 1030, 2015.
Article in English | MEDLINE | ID: mdl-27134723

ABSTRACT

DREAM challenges are community competitions designed to advance computational methods and address fundamental questions in systems biology and translational medicine. Each challenge asks participants to develop and apply computational methods to either predict unobserved outcomes or to identify unknown model parameters given a set of training data. Computational methods are evaluated using an automated scoring metric, scores are posted to a public leaderboard, and methods are published to facilitate community discussions on how to build improved methods. By engaging participants from a wide range of science and engineering backgrounds, DREAM challenges can comparatively evaluate a wide range of statistical, machine learning, and biophysical methods. Here, we describe DREAMTools, a Python package for evaluating DREAM challenge scoring metrics. DREAMTools provides a command line interface that enables researchers to test new methods on past challenges, as well as a framework for scoring new challenges. As of March 2016, DREAMTools includes more than 80% of completed DREAM challenges. DREAMTools complements the data, metadata, and software tools available at the DREAM website http://dreamchallenges.org and on the Synapse platform at https://www.synapse.org. AVAILABILITY: DREAMTools is a Python package. Releases and documentation are available at http://pypi.python.org/pypi/dreamtools. The source code is available at http://github.com/dreamtools/dreamtools.

15.
Cell Syst ; 1(1): 72-87, 2015 Jul 29.
Article in English | MEDLINE | ID: mdl-26594662

ABSTRACT

The panoply of microorganisms and other species present in our environment influences human health and disease, especially in cities, but has not been profiled with metagenomics at a city-wide scale. We sequenced DNA from surfaces across the entire New York City (NYC) subway system, the Gowanus Canal, and public parks. Nearly half of the DNA (48%) does not match any known organism; identified organisms spanned 1,688 bacterial, viral, archaeal, and eukaryotic taxa, which were enriched for harmless genera associated with skin (e.g., Acinetobacter). Predicted ancestry of human DNA left on subway surfaces can recapitulate U.S. Census demographic data, and bacterial signatures can reveal a station's history, such as marine-associated bacteria in a hurricane-flooded station. Some evidence of pathogens was found (Bacillus anthracis), but a lack of reported cases in NYC suggests that the pathogens represent a normal, urban microbiome. This baseline metagenomic map of NYC could help long-term disease surveillance, bioterrorism threat mitigation, and health management in the built environment of cities.

16.
Sci Signal ; 4(189): mr7, 2011 Aug 30.
Article in English | MEDLINE | ID: mdl-21900204

ABSTRACT

Computational analyses of systematic measurements on the states and activities of signaling proteins (as captured by phosphoproteomic data, for example) have the potential to uncover uncharacterized protein-protein interactions and to identify the subset that are important for cellular response to specific biological stimuli. However, inferring mechanistically plausible protein signaling networks (PSNs) from phosphoproteomics data is a difficult task, owing in part to the lack of sufficiently comprehensive experimental measurements, the inherent limitations of network inference algorithms, and a lack of standards for assessing the accuracy of inferred PSNs. A case study in which 12 research groups inferred PSNs from a phosphoproteomics data set demonstrates an assessment of inferred PSNs on the basis of the accuracy of their predictions. The concurrent prediction of the same previously unreported signaling interactions by different participating teams suggests relevant validation experiments and establishes a framework for combining PSNs inferred by multiple research groups into a composite PSN. We conclude that crowdsourcing the construction of PSNs (that is, outsourcing the task to the interested community) may be an effective strategy for network inference.


Subject(s)
Computational Biology/methods; Cooperative Behavior; Phosphoproteins/metabolism; Protein Interaction Maps/genetics; Signal Transduction/genetics; Proteomics/methods
17.
PLoS One ; 5(2): e9202, 2010 Feb 23.
Article in English | MEDLINE | ID: mdl-20186320

ABSTRACT

BACKGROUND: Systems biology has embraced computational modeling in response to the quantitative nature and increasing scale of contemporary data sets. The onslaught of data is accelerating as molecular profiling technology evolves. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) is a community effort to catalyze discussion about the design, application, and assessment of systems biology models through annual reverse-engineering challenges. METHODOLOGY AND PRINCIPAL FINDINGS: We describe our assessments of the four challenges associated with the third DREAM conference, which came to be known as the DREAM3 challenges: signaling cascade identification, signaling response prediction, gene expression prediction, and the DREAM3 in silico network challenge. The challenges, based on anonymized data sets, tested participants in network inference and prediction of measurements. Forty teams submitted 413 predicted networks and measurement test sets. Overall, a handful of best-performer teams were identified, while a majority of teams made predictions that were equivalent to random. Counterintuitively, combining the predictions of multiple teams (including the weaker teams) can in some cases improve predictive power beyond that of any single method. CONCLUSIONS: DREAM provides valuable feedback to practitioners of systems biology modeling. Lessons learned from the predictions of the community provide much-needed context for interpreting claims of efficacy of algorithms described in the scientific literature.


Subject(s)
Algorithms; Computational Biology/methods; Models, Biological; Systems Biology/methods; Animals; Cluster Analysis; Gene Expression Profiling/methods; Gene Regulatory Networks; Humans; Protein Interaction Mapping/methods; Reproducibility of Results; Signal Transduction; Software
20.
Ann N Y Acad Sci ; 1158: 159-95, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19348640

ABSTRACT

Regardless of how creative, innovative, and elegant our computational methods, the ultimate proof of an algorithm's worth is the experimentally validated quality of its predictions. Unfortunately, this truism is hard to reduce to practice. Usually, modelers produce hundreds to hundreds of thousands of predictions, most (if not all) of which go untested. In a best-case scenario, a small subsample of predictions (three to ten usually) is experimentally validated, as a quality control step to attest to the global soundness of the full set of predictions. However, whether this small set is even representative of the global algorithm's performance is a question usually left unaddressed. Thus, a clear understanding of the strengths and weaknesses of an algorithm most often remains elusive, especially to the experimental biologists who must decide which tool to use to address a specific problem. In this chapter, we describe the first systematic set of challenges posed to the systems biology community in the framework of the DREAM (Dialogue for Reverse Engineering Assessments and Methods) project. These tests, which came to be known as the DREAM2 challenges, consist of data generously donated by participants to the DREAM project and curated in such a way as to become problems of network reconstruction and whose solutions, the actual networks behind the data, were withheld from the participants. The explanation of the resulting five challenges, a global comparison of the submissions, and a discussion of the best performing strategies are the main topics discussed.


Subject(s)
Algorithms; Computational Biology/methods; Gene Regulatory Networks; Models, Biological; Systems Biology; Cluster Analysis; Gene Expression Profiling/methods; Humans; Oligonucleotide Array Sequence Analysis/methods; Protein Interaction Mapping; Proto-Oncogene Proteins/genetics; Repressor Proteins/genetics; Reproducibility of Results; Software; Two-Hybrid System Techniques