1.
Anal Chem ; 91(7): 4346-4356, 2019 04 02.
Article in English | MEDLINE | ID: mdl-30741529

ABSTRACT

High-throughput, comprehensive, and confident identifications of metabolites and other chemicals in biological and environmental samples will revolutionize our understanding of the role these chemically diverse molecules play in biological systems. Despite recent technological advances, metabolomics studies still result in the detection of a disproportionate number of features that cannot be confidently assigned to a chemical structure. This inadequacy is driven by the single most significant limitation in metabolomics, the reliance on reference libraries constructed by analysis of authentic reference materials with limited commercial availability. To this end, we have developed the in silico chemical library engine (ISiCLE), a high-performance computing-friendly cheminformatics workflow for generating libraries of chemical properties. In the instantiation described here, we predict probable three-dimensional molecular conformers (i.e., conformational isomers) using chemical identifiers as input, from which collision cross sections (CCS) are derived. The approach employs first-principles simulation, distinguished by the use of molecular dynamics, quantum chemistry, and ion mobility calculations, to generate structures and chemical property libraries, all without training data. Importantly, optimization of ISiCLE included a refactoring of the popular MOBCAL code for trajectory-based mobility calculations, improving its computational efficiency by over 2 orders of magnitude. Calculated CCS values were validated against 1983 experimentally measured CCS values and compared to previously reported CCS calculation approaches. Average calculated CCS error for the validation set is 3.2% using standard parameters, outperforming other density functional theory (DFT)-based methods and machine learning methods (e.g., MetCCS). 
An online database is introduced for sharing both calculated and experimental CCS values (metabolomics.pnnl.gov), initially including a CCS library with over 1 million entries. Finally, three successful applications of molecule characterization using calculated CCS are described, including providing evidence for the presence of an environmental degradation product, the separation of molecular isomers, and an initial characterization of complex blinded mixtures of exposure chemicals. This work represents a method to address the limitations of small molecule identification and offers an alternative to generating chemical identification libraries experimentally by analyzing authentic reference materials. All code is available at github.com/pnnl.
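The "average calculated CCS error" quoted in this abstract is most naturally read as a mean absolute percent error between calculated and measured values. The sketch below shows that arithmetic under this assumption; the function name and the sample values are hypothetical and are not taken from ISiCLE or from the paper's validation set.

```python
def mean_abs_percent_error(calculated, measured):
    """Mean absolute percent error between paired CCS values (same units)."""
    if len(calculated) != len(measured) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Percent error of each calculated value relative to its measured reference.
    errors = [abs(c - m) / m * 100.0 for c, m in zip(calculated, measured)]
    return sum(errors) / len(errors)


# Hypothetical values for illustration only (not data from the paper).
calculated_ccs = [153.0, 196.0, 101.0]
measured_ccs = [150.0, 200.0, 100.0]
error = mean_abs_percent_error(calculated_ccs, measured_ccs)
print(f"mean absolute percent error: {error:.2f}%")
```

Whether the published figure uses absolute or signed errors, or a median rather than a mean, is not stated in the abstract, so the definition here should be checked against the full text before comparing numbers.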


Subjects
Cheminformatics/methods , Density Functional Theory , Small Molecule Libraries/chemistry , Machine Learning , Chemical Models , Molecular Dynamics Simulation
2.
Bioinformatics ; 33(19): 3137-3139, 2017 10 01.
Article in English | MEDLINE | ID: mdl-28605449

ABSTRACT

Summary: FQC is software that facilitates quality control of FASTQ files by carrying out a QC protocol using FastQC, parsing results, and aggregating quality metrics into an interactive dashboard designed to richly summarize individual sequencing runs. The dashboard groups samples in dropdowns for navigation among the data sets, utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data. Availability and Implementation: FQC is implemented in Python 3 and Javascript, and is maintained under an MIT license. Documentation and source code are available at: https://github.com/pnnl/fqc. Contact: joseph.brown@pnnl.gov.

3.
Nucleic Acids Res ; 43(15): 7504-20, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-26130723

ABSTRACT

Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson-Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download.
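The abstract says that scores of randomly generated sequences are used to set acceptance thresholds and so control the false positive rate. One common way to do this is an empirical quantile cutoff, sketched below. This is an assumption about the general technique, not JAR3D's documented procedure: the function name is hypothetical, and Gaussian random numbers stand in for SCFG/MRF scores of random sequences.

```python
import random


def acceptance_threshold(null_scores, false_positive_rate):
    """Return a cutoff such that at most the given fraction of null scores exceed it."""
    if not null_scores:
        raise ValueError("null_scores must be non-empty")
    if not 0.0 <= false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be in [0, 1)")
    ranked = sorted(null_scores, reverse=True)
    # Number of null (random-sequence) scores allowed to pass the cutoff.
    allowed = int(false_positive_rate * len(ranked))
    # Accept only scores strictly greater than the returned value.
    return ranked[allowed]


# Stand-in null distribution: hypothetical scores for random sequences.
random.seed(0)
null_scores = [random.gauss(0.0, 1.0) for _ in range(10_000)]
cutoff = acceptance_threshold(null_scores, 0.05)
passed = sum(score > cutoff for score in null_scores) / len(null_scores)
print(f"cutoff={cutoff:.3f}, fraction of null scores accepted={passed:.4f}")
```

A candidate sequence would then be accepted for a motif group only if its score exceeds that group's cutoff, with a separate null distribution generated per group.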


Subjects
Statistical Models , RNA/chemistry , RNA Sequence Analysis/methods , Base Sequence , Genetic Variation , Markov Chains , Nucleotide Motifs , Sequence Alignment , Software
4.
Proc Natl Acad Sci U S A ; 110(12): 4651-5, 2013 Mar 19.
Article in English | MEDLINE | ID: mdl-23487761

ABSTRACT

Do bacterial taxa demonstrate clear endemism, like macroorganisms, or can one site's bacterial community recapture the total phylogenetic diversity of the world's oceans? Here we compare a deep bacterial community characterization from one site in the English Channel (L4-DeepSeq) with 356 datasets from the International Census of Marine Microbes (ICoMM) taken from around the globe (ranging from marine pelagic and sediment samples to sponge-associated environments). At the L4-DeepSeq site, increasing sequencing depth uncovers greater phylogenetic overlap with the global ICoMM data. This site contained 31.7-66.2% of operational taxonomic units identified in a given ICoMM biome. Extrapolation of this overlap suggests that 1.93 × 10^11 sequences from the L4 site would capture all ICoMM bacterial phylogenetic diversity. Current technology trends suggest this limit may be attainable within 3 y. These results strongly suggest the marine biosphere maintains a previously undetected, persistent microbial seed bank.


Subjects
Bacteria , Biodiversity , Metagenome , Oceans and Seas , Phylogeny , Water Microbiology
5.
Bioessays ; 35(9): 810-7, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23836415

ABSTRACT

Large-scale characterization of the human microbiota has largely focused on Western adults, yet these populations may be uncharacteristic because of their diets and lifestyles. In particular, the rise of "Western diseases" may in part stem from reduced exposure to, or even loss of, microbes with which humans have coevolved. Here, we review beneficial microbes associated with pathogen resistance, highlighting the emerging role of complex microbial communities in protecting against disease. We discuss ways in which modern lifestyles and practices may deplete physiologically important microbiota, and explore prospects for reintroducing or encouraging the growth of beneficial microbes to promote the restoration of healthy microbial ecosystems.


Subjects
Gastrointestinal Tract/microbiology , Metagenome , Microbiota , Animals , Disease , Ecosystem , Gastrointestinal Tract/immunology , Host-Pathogen Interactions/immunology , Humans , Probiotics/metabolism
6.
Bioinformatics ; 27(21): 3067-9, 2011 Nov 01.
Article in English | MEDLINE | ID: mdl-21903626

ABSTRACT

MOTIVATION: Microbial community profiling is a highly active area of research, but tools that facilitate visualization of phylogenetic trees and associated environmental data have not kept up with the increasing quantity of data generated in these studies. RESULTS: TopiaryExplorer supports the visualization of very large phylogenetic trees, including features such as the automated coloring of branches by environmental data, manipulation of trees and incorporation of per-tip metadata (e.g. taxonomic labels). AVAILABILITY: http://topiaryexplorer.sourceforge.net. CONTACT: rob.knight@colorado.edu.


Subjects
Phylogeny , Software , Environment , Proteobacteria/classification , Proteobacteria/isolation & purification
8.
Gigascience ; 2(1): 16, 2013 Nov 26.
Article in English | MEDLINE | ID: mdl-24280061

ABSTRACT

BACKGROUND: As microbial ecologists take advantage of high-throughput sequencing technologies to describe microbial communities across ever-increasing numbers of samples, new analysis tools are required to relate the distribution of microbes among larger numbers of communities, and to use increasingly rich and standards-compliant metadata to understand the biological factors driving these relationships. In particular, the Earth Microbiome Project drives these needs by profiling the genomic content of tens of thousands of samples across multiple environment types. FINDINGS: Features of EMPeror include: ability to visualize gradients and categorical data, visualize different principal coordinates axes, present the data in the form of parallel coordinates, show taxa as well as environmental samples, dynamically adjust the size and transparency of the spheres representing the communities on a per-category basis, dynamically scale the axes according to the fraction of variance each explains, show, hide or recolor points according to arbitrary metadata including that compliant with the MIxS family of standards developed by the Genomic Standards Consortium, display jackknifed-resampled data to assess statistical confidence in clustering, perform coordinate comparisons (useful for procrustes analysis plots), and greatly reduce loading times and overall memory footprint compared with existing approaches. Additionally, ease of sharing, given EMPeror's small output file size, enables agile collaboration by allowing users to embed these visualizations via emails or web pages without the need for extra plugins. CONCLUSIONS: Here we present EMPeror, an open source and web browser enabled tool with a versatile command line interface that allows researchers to perform rapid exploratory investigations of 3D visualizations of microbial community data, such as the widely used principal coordinates plots. 
EMPeror includes a rich set of controllers to modify features as a function of the metadata. By being specifically tailored to the requirements of microbial ecologists, EMPeror thus increases the speed with which insight can be gained from large microbiome datasets.
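Among the features listed in this abstract is scaling the axes "according to the fraction of variance each explains". A minimal sketch of one way to do that is shown below, assuming each principal coordinate is simply multiplied by its proportion of variance explained. The function name and sample numbers are hypothetical, and EMPeror's own implementation may normalize differently.

```python
def scale_by_variance_explained(coords, proportion_explained):
    """Multiply each axis of every sample by that axis's proportion of variance explained."""
    scaled = []
    for row in coords:
        if len(row) != len(proportion_explained):
            raise ValueError("each sample needs one coordinate per axis")
        scaled.append([value * weight for value, weight in zip(row, proportion_explained)])
    return scaled


# Hypothetical principal coordinates for two samples on three axes.
coords = [[1.0, 2.0, 4.0], [-2.0, 0.0, 8.0]]
proportion_explained = [0.5, 0.25, 0.125]
print(scale_by_variance_explained(coords, proportion_explained))
```

With this weighting, axes that explain little variance are visually compressed, so distances in the plot are dominated by the most informative coordinates.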
