Results 1 - 9 of 9
1.
Anal Chem ; 91(7): 4346-4356, 2019 04 02.
Article in English | MEDLINE | ID: mdl-30741529

ABSTRACT

High-throughput, comprehensive, and confident identifications of metabolites and other chemicals in biological and environmental samples will revolutionize our understanding of the role these chemically diverse molecules play in biological systems. Despite recent technological advances, metabolomics studies still result in the detection of a disproportionate number of features that cannot be confidently assigned to a chemical structure. This inadequacy is driven by the single most significant limitation in metabolomics, the reliance on reference libraries constructed by analysis of authentic reference materials with limited commercial availability. To this end, we have developed the in silico chemical library engine (ISiCLE), a high-performance computing-friendly cheminformatics workflow for generating libraries of chemical properties. In the instantiation described here, we predict probable three-dimensional molecular conformers (i.e., conformational isomers) using chemical identifiers as input, from which collision cross sections (CCS) are derived. The approach employs first-principles simulation, distinguished by the use of molecular dynamics, quantum chemistry, and ion mobility calculations, to generate structures and chemical property libraries, all without training data. Importantly, optimization of ISiCLE included a refactoring of the popular MOBCAL code for trajectory-based mobility calculations, improving its computational efficiency by over 2 orders of magnitude. Calculated CCS values were validated against 1983 experimentally measured CCS values and compared to previously reported CCS calculation approaches. Average calculated CCS error for the validation set is 3.2% using standard parameters, outperforming other density functional theory (DFT)-based methods and machine learning methods (e.g., MetCCS). 
An online database is introduced for sharing both calculated and experimental CCS values (metabolomics.pnnl.gov), initially including a CCS library with over 1 million entries. Finally, three successful applications of molecule characterization using calculated CCS are described, including providing evidence for the presence of an environmental degradation product, the separation of molecular isomers, and an initial characterization of complex blinded mixtures of exposure chemicals. This work represents a method to address the limitations of small molecule identification and offers an alternative to generating chemical identification libraries experimentally by analyzing authentic reference materials. All code is available at github.com/pnnl.
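The validation figure quoted in this abstract (average calculated CCS error) can be illustrated with a short sketch. It assumes the error is the mean absolute percent difference between paired calculated and measured values, which the abstract does not spell out. The function name and the sample numbers are hypothetical and are not taken from ISiCLE.

```python
def mean_abs_percent_error(calculated, measured):
    """Mean absolute percent error between paired CCS values (same units)."""
    if len(calculated) != len(measured) or not calculated:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(c - m) / m for c, m in zip(calculated, measured))
    return 100.0 * total / len(measured)


# Made-up illustration values, not data from the paper.
calc = [102.0, 147.0, 210.0]
meas = [100.0, 150.0, 200.0]
error = mean_abs_percent_error(calc, meas)
```

Dividing by the measured value treats the experimental CCS as the reference, which is one common convention; other definitions (for example, signed error) would give different summaries.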


Subjects
Cheminformatics/methods , Density Functional Theory , Small Molecule Libraries/chemistry , Machine Learning , Models, Chemical , Molecular Dynamics Simulation
2.
Processes (Basel) ; 6(6)2018 Jun.
Article in English | MEDLINE | ID: mdl-33824861

ABSTRACT

We report the application of a recently proposed approach for modeling biological systems using a maximum entropy production rate principle in lieu of having in vivo rate constants. The method is applied in four steps: (1) a new ordinary differential equation (ODE)-based optimization approach built on Marcelin's 1910 mass action equation is used to obtain the maximum entropy distribution; (2) the predicted metabolite concentrations are compared to those generally expected from experiments using a loss function from which post-translational regulation of enzymes is inferred; (3) the system is re-optimized with the inferred regulation from which rate constants are determined from the metabolite concentrations and reaction fluxes; and finally (4) a full ODE-based, mass action simulation with rate parameters and allosteric regulation is obtained. From the last step, the power characteristics and resistance of each reaction can be determined. The method is applied to the central metabolism of Neurospora crassa, and the flow of material through the three competing pathways of upper glycolysis, the non-oxidative pentose phosphate pathway, and the oxidative pentose phosphate pathway is evaluated as a function of the NADP/NADPH ratio. It is predicted that regulation of phosphofructokinase (PFK) and flow through the pentose phosphate pathway are essential for preventing an extreme level of fructose 1,6-bisphosphate accumulation. Such an extreme level of fructose 1,6-bisphosphate would otherwise result in a glassy cytoplasm with limited diffusion, dramatically decreasing the entropy and energy production rate and, consequently, biological competitiveness.
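As a rough illustration of what an ODE-based mass action simulation looks like (step 4 of this abstract), the sketch below integrates a toy reversible chain A <-> B <-> C with explicit Euler steps. It is not the paper's maximum entropy optimization and includes no regulation; the function name, rate constants, and step size are all hypothetical choices made for the example.

```python
def simulate_chain(conc, k_fwd, k_rev, dt=0.001, steps=50_000):
    """Explicit-Euler integration of A <-> B <-> C under mass action kinetics."""
    a, b, c = conc
    for _ in range(steps):
        j1 = k_fwd[0] * a - k_rev[0] * b  # net flux of A -> B
        j2 = k_fwd[1] * b - k_rev[1] * c  # net flux of B -> C
        a -= j1 * dt
        b += (j1 - j2) * dt
        c += j2 * dt
    return a, b, c


# Hypothetical rate constants; both steps have equilibrium constant 2.
a, b, c = simulate_chain((1.0, 0.0, 0.0), k_fwd=(2.0, 1.0), k_rev=(1.0, 0.5))
```

In this closed toy system the net fluxes decay to zero and the concentration ratios settle at the ratios of forward to reverse rate constants. A metabolic model of the kind described in the abstract would instead be held away from equilibrium by fixed boundary concentrations.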

3.
J Phys Chem B ; 118(51): 14745-60, 2014 Dec 26.
Article in English | MEDLINE | ID: mdl-25495377

ABSTRACT

We have applied a new stochastic simulation approach to predict the metabolite levels, material flux, and thermodynamic profiles of the oxidative TCA cycles found in E. coli and Synechococcus sp. PCC 7002, and in the reductive TCA cycle typical of chemolithoautotrophs and phototrophic green sulfur bacteria such as Chlorobaculum tepidum. The simulation approach is based on modeling states using statistical thermodynamics and employs an assumption similar to that used in transition state theory. The ability to evaluate the thermodynamics of metabolic pathways allows one to understand the relationship between coupling of energy and material gradients in the environment and the self-organization of stable biological systems, and it is shown that each cycle operates in the direction expected due to its environmental niche. The simulations predict changes in metabolite levels and flux in response to changes in cofactor concentrations that would be hard to predict without an elaborate model based on the law of mass action. In fact, we show that a thermodynamically unfavorable reaction can still have flux in the forward direction when it is part of a reaction network. The ability to predict metabolite levels, energy flow, and material flux should be significant for understanding the dynamics of natural systems and for understanding principles for engineering organisms for production of specialty chemicals.
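The observation that a thermodynamically unfavorable reaction can still carry forward flux inside a network follows from the textbook relation ΔG = ΔG° + RT ln Q. The sketch below applies that relation to a single step A -> B. It is not the stochastic simulation method of the paper, and the ΔG° value and concentrations are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def reaction_free_energy(dg0_kj_mol, product_conc, reactant_conc, temp_k=298.15):
    """Delta G = Delta G0 + RT ln(Q) for A -> B, with Q = [B]/[A]."""
    q = product_conc / reactant_conc
    return dg0_kj_mol + R * temp_k * math.log(q)


# Hypothetical step with an unfavorable standard free energy of +5 kJ/mol.
dg_isolated = reaction_free_energy(5.0, 1.0, 1.0)     # Q = 1, so Delta G equals Delta G0
dg_in_network = reaction_free_energy(5.0, 1e-3, 1.0)  # a downstream step keeps [B] low
```

When a downstream reaction drains the product, Q falls well below 1, the logarithmic term becomes strongly negative, and the overall ΔG changes sign even though ΔG° is positive.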


Subjects
Chlorobi/metabolism , Citric Acid Cycle , Cyanobacteria/metabolism , Escherichia coli/metabolism , Models, Chemical , Thermodynamics , Adenosine Triphosphate/metabolism , Carbon Dioxide/metabolism , Ferredoxins/metabolism , Oxidation-Reduction
4.
Bioinformatics ; 29(6): 797-8, 2013 Mar 15.
Article in English | MEDLINE | ID: mdl-23361326

ABSTRACT

MOTIVATION: BLAST remains one of the most widely used tools in computational biology. The rate at which new sequence data is available continues to grow exponentially, driving the emergence of new fields of biological research. At the same time, multicore systems and conventional clusters are more accessible. ScalaBLAST has been designed to run on conventional multiprocessor systems with an eye to extreme parallelism, enabling parallel BLAST calculations using >16 000 processing cores with a portable, robust, fault-resilient design that introduces little to no overhead with respect to serial BLAST.


Subjects
Sequence Alignment/methods , Software , Algorithms , Computational Biology/methods
5.
Pac Symp Biocomput ; : 225-34, 2012.
Article in English | MEDLINE | ID: mdl-22174278

ABSTRACT

We report the development of a novel high performance computing method for the identification of proteins from unknown (environmental) samples. The method uses computational optimization to provide an effective way to control the false discovery rate for environmental samples and complements de novo peptide sequencing. Furthermore, the method provides information based on the expressed protein in a microbial community, and thus complements DNA-based identification methods. Testing on blind samples demonstrates that the method provides 79-95% overlap with analogous results from searches involving only the correct genomes. We provide scaling and performance evaluations for the software that demonstrate the ability to carry out large-scale optimizations on 1258 genomes containing 4.2M proteins.


Subjects
Microbiota , Proteomics/statistics & numerical data , Tandem Mass Spectrometry/statistics & numerical data , Computational Biology , Computing Methodologies , Data Interpretation, Statistical , Likelihood Functions , Microbiota/genetics , Proteins/genetics , Proteins/isolation & purification , Proteome/genetics , Proteome/isolation & purification , Software
6.
Bioinformatics ; 27(21): 3072-3, 2011 Nov 01.
Article in English | MEDLINE | ID: mdl-21926122

ABSTRACT

SUMMARY: A MapReduce-based implementation called MR-MSPolygraph for parallelizing peptide identification from mass spectrometry data is presented. The underlying serial method, MSPolygraph, uses a novel hybrid approach to match an experimental spectrum against a combination of a protein sequence database and a spectral library. Our MapReduce implementation can run on any Hadoop cluster environment. Experimental results demonstrate that, relative to the serial version, MR-MSPolygraph reduces the time to solution from weeks to hours, for processing tens of thousands of experimental spectra. Speedup and other related performance studies are also reported on a 400-core Hadoop cluster using spectral datasets from environmental microbial communities as inputs. AVAILABILITY: The source code along with user documentation are available on http://compbio.eecs.wsu.edu/MR-MSPolygraph. CONTACT: ananth@eecs.wsu.edu; william.cannon@pnnl.gov. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
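The MapReduce pattern named in this abstract can be sketched in plain Python. The map step scores each spectrum against candidate peptides, and the reduce step keeps the best match per spectrum. This is only a single-process illustration of the pattern. It does not use the Hadoop API and does not reproduce how MR-MSPolygraph partitions its work; `map_phase`, `reduce_phase`, and the toy score table are hypothetical.

```python
from collections import defaultdict


def map_phase(spectrum_id, candidates, score_fn):
    """Emit (spectrum_id, (peptide, score)) for every candidate peptide."""
    for peptide in candidates:
        yield spectrum_id, (peptide, score_fn(spectrum_id, peptide))


def reduce_phase(pairs):
    """Group emitted pairs by spectrum and keep the highest-scoring peptide."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: max(values, key=lambda pv: pv[1]) for key, values in grouped.items()}


# Toy scores standing in for a real spectrum-matching function.
toy_scores = {
    ("s1", "PEPTIDE"): 0.9, ("s1", "PROTEIN"): 0.4,
    ("s2", "PEPTIDE"): 0.2, ("s2", "PROTEIN"): 0.7,
}


def toy_score(spectrum_id, peptide):
    return toy_scores[(spectrum_id, peptide)]


pairs = [kv for sid in ("s1", "s2")
         for kv in map_phase(sid, ["PEPTIDE", "PROTEIN"], toy_score)]
best = reduce_phase(pairs)
```

Because each map call depends only on its own spectrum, the calls can be distributed across workers without coordination, which is what makes this workload a natural fit for a cluster.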


Subjects
Mass Spectrometry/methods , Peptides/chemistry , Software , Databases, Protein , Sequence Analysis, Protein
7.
J Proteome Res ; 10(5): 2306-17, 2011 May 06.
Article in English | MEDLINE | ID: mdl-21391700

ABSTRACT

We report a hybrid search method combining database and spectral library searches that allows for a straightforward approach to characterizing the error rates from the combined data. Using these methods, we demonstrate significantly increased sensitivity and specificity in matching peptides to tandem mass spectra. The hybrid search method increased the number of spectra that can be assigned to a peptide in a global proteomics study by 57-147% at an estimated false discovery rate of 5%, with clear room for even greater improvements. The approach combines the general utility of using consensus model spectra typical of database search methods with the accuracy of the intensity information contained in spectral libraries. A common scoring metric based on recent developments linking data analysis and statistical thermodynamics is used, which allows the use of a conservative estimate of error rates for the combined data. We applied this approach to proteomics analysis of Synechococcus sp. PCC 7002, a cyanobacterium that is a model organism for studies of photosynthetic carbon fixation and biofuels development. The increased specificity and sensitivity of this approach allowed us to identify many more peptides involved in the processes important for photoautotrophic growth.
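To make the idea of reporting identifications "at an estimated false discovery rate of 5%" concrete, the sketch below picks a score cutoff from scored matches. It assumes a target-decoy style estimate (decoy count divided by target count above the cutoff), which is a common convention but is not stated in the abstract; the paper's own error-rate estimate may differ. The function name and the toy scores are hypothetical.

```python
def score_threshold_at_fdr(matches, max_fdr=0.05):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    matches: iterable of (score, is_decoy) pairs.
    The FDR estimate is decoys / targets among matches at or above the cutoff.
    Returns None if no cutoff satisfies the limit.
    """
    cutoff = None
    targets = decoys = 0
    for score, is_decoy in sorted(matches, key=lambda m: m[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy data: 20 target hits (scores 100..81), a decoy, a target, then another decoy.
matches = [(float(s), False) for s in range(100, 80, -1)]
matches += [(80.0, True), (79.0, False), (78.0, True)]
cutoff = score_threshold_at_fdr(matches)
```

Lowering the cutoff admits more identifications but also more decoys, so the cutoff marks the most permissive score at which the estimated error rate is still within the limit. Tied scores are not treated specially in this sketch.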


Subjects
Computational Biology/methods , Peptides/isolation & purification , Proteomics/methods , Synechococcus/chemistry , Tandem Mass Spectrometry/methods , Likelihood Functions , Models, Chemical , Peptide Library , Sensitivity and Specificity , Synechococcus/metabolism , Thermodynamics
8.
J Am Soc Mass Spectrom ; 18(9): 1625-37, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17651984

ABSTRACT

The dynamical behavior of model peptides was evaluated with respect to their ability to form internal proton donor-acceptor pairs using molecular dynamics simulations. The proton donor-acceptor pairs are postulated to be prerequisites for peptide bond cleavage resulting in formation of b and y ions during low-energy collision-induced dissociation in tandem mass spectrometry (MS/MS). The simulations for the polyalanine pentamer Ala(5)H(+) were compared with experimental data from energy-resolved surface induced dissociation (SID) studies. The simulation results provide insight into the events that likely lead up to the fragmentation of peptides. Nine-mer polyalanine-based model peptides were used to examine the dynamical effect of each of the 20 common amino acids on the probability to form donor-acceptor pairs at labile peptide bonds. A range of probabilities was observed as a function of the substituted amino acid. However, the location of the peptide bond involved in the donor-acceptor pair plays a critical role in the dynamical behavior. This influence of position on the probability of forming a donor-acceptor pair would be hard to predict from statistical analyses on experimental spectra of aggregate, diverse peptides. In addition, the inclusion of basic side chains in the model peptides alters the probability of forming donor-acceptor pairs across the entire backbone. In this case, there are still more ionizing protons than basic residues, but the side chains of the basic amino acids form stable hydrogen bond networks with the peptide carbonyl oxygens and thus act to prevent free access of "mobile protons" to labile peptide bonds. It is clear from the work that the identification of peptides from low-energy CID using automated computational methods should consider the location of the fragmenting bond as well as the amino acid composition.


Subjects
Amino Acids/chemistry , Models, Chemical , Models, Molecular , Peptides/chemistry , Amino Acid Sequence , Computer Simulation , Molecular Sequence Data , Protein Conformation , Protein Denaturation , Protein Folding , Structure-Activity Relationship
9.
J Proteome Res ; 4(5): 1687-98, 2005.
Article in English | MEDLINE | ID: mdl-16212422

ABSTRACT

We evaluate statistical models used in two-hypothesis tests for identifying peptides from tandem mass spectrometry data. The null hypothesis H(0), that a peptide matches a spectrum by chance, requires information on the probability of by-chance matches between peptide fragments and peaks in the spectrum. Likewise, the alternate hypothesis H(A), that the spectrum is due to a particular peptide, requires probabilities that the peptide fragments would indeed be observed if it was the causative agent. We compare models for these probabilities by determining the identification rates produced by the models using an independent data set. The initial models use different probabilities depending on fragment ion type, but uniform probabilities for each ion type across all of the labile bonds along the backbone. More sophisticated models for probabilities under both H(A) and H(0) are introduced that do not assume uniform probabilities for each ion type. In addition, the performance of these models using a standard likelihood model is compared to an information theory approach derived from the likelihood model. Also, a simple but effective model for incorporating peak intensities is described. Finally, a support-vector machine is used to discriminate between correct and incorrect identifications based on multiple characteristics of the scoring functions. The results are shown to reduce the misidentification rate significantly when compared to a benchmark cross-correlation based approach.
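The two-hypothesis test described in this abstract can be sketched as a log-likelihood ratio over predicted fragment ions. The sketch assumes the simplest case the abstract mentions: one uniform probability of observing a fragment under H(A) and one uniform by-chance match probability under H(0), with fragments treated as independent. The function name and the probability values are hypothetical, and the sketch ignores peak intensities and ion-type differences.

```python
import math


def log_likelihood_ratio(observed, p_match, p_chance):
    """Score a peptide-spectrum match as log[P(data | H_A) / P(data | H_0)].

    observed: list of bools, one per predicted fragment ion (peak found or not).
    p_match:  probability a predicted fragment is observed if the peptide is correct.
    p_chance: probability a predicted fragment matches a peak by chance.
    """
    score = 0.0
    for hit in observed:
        if hit:
            score += math.log(p_match / p_chance)
        else:
            score += math.log((1.0 - p_match) / (1.0 - p_chance))
    return score


# Hypothetical peptide with 8 predicted fragments, 6 of which match peaks.
good_match = log_likelihood_ratio([True] * 6 + [False] * 2, p_match=0.7, p_chance=0.1)
poor_match = log_likelihood_ratio([False] * 8, p_match=0.7, p_chance=0.1)
```

A positive score favors H(A) and a negative score favors H(0). The more sophisticated models in the abstract would replace the two constants with probabilities that vary by ion type and position along the backbone.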


Subjects
Proteome , Proteomics/methods , Databases, Protein , Deinococcus/metabolism , Likelihood Functions , Mass Spectrometry , Models, Statistical , Peptides/chemistry , Probability , ROC Curve