Results 1 - 14 of 14
1.
bioRxiv ; 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39131324

ABSTRACT

Methods for assessing compound identification confidence in metabolomics and related studies have been debated and actively researched for the past two decades. The earliest effort in 2007 focused primarily on mass spectrometry and nuclear magnetic resonance spectroscopy and resulted in four recommended levels of metabolite identification confidence - the Metabolomics Standards Initiative (MSI) Levels. In 2014, the original MSI Levels were expanded to five levels (including two sublevels) to facilitate communication of compound identification confidence in high resolution mass spectrometry studies. Further refinements in identification levels have occurred, for example to accommodate use of ion mobility spectrometry in metabolomics workflows, and alternate approaches to communicate compound identification confidence also have been developed based on identification points schema. However, neither qualitative levels of identification confidence nor quantitative scoring systems address the degree of ambiguity in compound identifications in the context of the chemical space being considered, are easily automated, or are transferable between analytical platforms. In this perspective, we propose that the metabolomics and related communities consider identification probability as an approach for automated and transferable assessment of compound identification and ambiguity in metabolomics and related studies. Identification probability is defined simply as 1/N, where N is the number of compounds in a reference library or chemical space that match an experimentally measured molecule within user-defined measurement precision(s), for example mass measurement or retention time accuracy.
We demonstrate the utility of identification probability in an in silico analysis of multi-property reference libraries constructed from the Human Metabolome Database and computational property predictions, provide guidance to the community in transparent implementation of the concept, and invite the community to further evaluate this concept in parallel with their current preferred methods for assessing metabolite identification confidence.
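
The 1/N definition in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the library entries, the property names (`mass`, `rt`), the tolerance defaults, and the choice to return 0.0 when nothing matches are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryEntry:
    name: str
    mass: float  # monoisotopic mass in Da (illustrative value)
    rt: float    # retention time in minutes (illustrative value)


def identification_probability(library, mass, rt, ppm_tol=5.0, rt_tol=0.2):
    """Return (probability, matches), where probability = 1/N and N is the
    number of library entries within both tolerances of the measurement.
    Returning 0.0 for N = 0 is a convention chosen for this sketch."""
    matches = [
        entry for entry in library
        if abs(entry.mass - mass) / mass * 1e6 <= ppm_tol
        and abs(entry.rt - rt) <= rt_tol
    ]
    probability = 1.0 / len(matches) if matches else 0.0
    return probability, matches


# Hypothetical four-compound library; A, B, and C are isomers (same mass).
LIBRARY = [
    LibraryEntry("compound_A", 180.0634, 1.20),
    LibraryEntry("compound_B", 180.0634, 1.25),
    LibraryEntry("compound_C", 180.0634, 3.80),
    LibraryEntry("compound_D", 146.0579, 1.22),
]

# Mass alone leaves three candidates; adding retention time leaves two.
prob_mass_only, _ = identification_probability(
    LIBRARY, 180.0634, 1.22, rt_tol=float("inf")
)
prob_mass_rt, hits = identification_probability(LIBRARY, 180.0634, 1.22)
```

In this toy case, adding a second measured property raises the identification probability because fewer library entries remain consistent with the measurement, which is the behavior the abstract describes for multi-property libraries.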

2.
J Mass Spectrom ; 57(12): e4898, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36463891

ABSTRACT

Mass spectrometry imaging (MSI) data visualization relies on heatmaps to show the spatial distribution and measured abundances of molecules within a sample. Nonuniform color gradients such as jet are still commonly used to visualize MSI data, increasing the probability of data misinterpretation and false conclusions. Also, the use of nonuniform color gradients and the combination of hues used in common colormaps make it challenging for people with color vision deficiencies (CVDs) to visualize and accurately interpret data. Here we present best practices for choosing a colormap to accurately display MSI data, improve readability, and accommodate all CVDs. We also provide other resources on the misuse of color in the scientific field and resources on scientifically derived colormaps presented herein.


Subjects
Mass Spectrometry, Humans
3.
Anal Chem ; 94(16): 6130-6138, 2022 04 26.
Article in English | MEDLINE | ID: mdl-35430813

ABSTRACT

We present DEIMoS: Data Extraction for Integrated Multidimensional Spectrometry, a Python application programming interface (API) and command-line tool for high-dimensional mass spectrometry data analysis workflows that offers ease of development and access to efficient algorithmic implementations. Functionality includes feature detection, feature alignment, collision cross section (CCS) calibration, isotope detection, and MS/MS spectral deconvolution, with the output comprising detected features aligned across study samples and characterized by mass, CCS, tandem mass spectra, and isotopic signature. Notably, DEIMoS operates on N-dimensional data, largely agnostic to acquisition instrumentation; algorithm implementations simultaneously utilize all dimensions to (i) offer greater separation between features, thus improving detection sensitivity, (ii) increase alignment/feature matching confidence among data sets, and (iii) mitigate convolution artifacts in tandem mass spectra. We demonstrate DEIMoS with LC-IMS-MS/MS metabolomics data to illustrate the advantages of a multidimensional approach in each data processing step.


Subjects
Metabolomics, Tandem Mass Spectrometry, Algorithms, Liquid Chromatography/methods, Metabolomics/methods, Software, Tandem Mass Spectrometry/methods
4.
J Chem Inf Model ; 61(12): 5721-5725, 2021 12 27.
Article in English | MEDLINE | ID: mdl-34842435

ABSTRACT

We describe the Mass Spectrometry Adduct Calculator (MSAC), an automated Python tool to calculate the adduct ion masses of a parent molecule. Here, adduct refers to a version of a parent molecule [M] that is charged due to addition or loss of atoms and electrons resulting in a charged ion, for example, [M + H]+. MSAC includes a database of 147 potential adducts and adduct/neutral loss combinations and their mass-to-charge ratios (m/z) as extracted from the NIST/EPA/NIH Mass Spectral Library (NIST17), Global Natural Products Social Molecular Networking Public Spectral Libraries (GNPS), and MassBank of North America (MoNA). The calculator relies on user-selected subsets of the combined database to calculate expected m/z for adducts of molecules supplied as formulas. This tool is intended to help researchers create identification libraries to collect evidence for the presence of molecules in mass spectrometry data. While the included adduct database focuses on adducts typically detected during liquid chromatography-mass spectrometry analyses, users may supply their own lists of adducts and charge states for calculating expected m/z. We also analyzed statistics on adducts from spectra contained in the three selected mass spectral libraries. MSAC is freely available at https://github.com/pnnl/MSAC.
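
The adduct calculation described in this abstract can be illustrated with a short Python sketch. It does not use or reproduce the MSAC API: the function names, the four-entry adduct table, and the simple formula parser (no parentheses or isotope labels) are assumptions made for illustration. The atomic masses are standard monoisotopic values.

```python
import re

PROTON_MASS = 1.007276467     # Da
ELECTRON_MASS = 0.000548580   # Da

# Monoisotopic masses (Da) for a few common elements.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "P": 30.97376199842,
    "S": 31.9720711744,
}

# Hypothetical adduct table: name -> (mass shift in Da, charge).
# The shift already accounts for electrons gained or lost.
ADDUCTS = {
    "[M+H]+": (PROTON_MASS, 1),
    "[M+Na]+": (22.989769282 - ELECTRON_MASS, 1),
    "[M-H]-": (-PROTON_MASS, -1),
    "[M+2H]2+": (2 * PROTON_MASS, 2),
}


def neutral_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C6H12O6'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def adduct_mz(formula, adduct):
    """Expected m/z for the given adduct of the parent molecule."""
    shift, charge = ADDUCTS[adduct]
    return (neutral_mass(formula) + shift) / abs(charge)


# Expected m/z values for glucose under each adduct in the table.
glucose_mz = {name: adduct_mz("C6H12O6", name) for name in ADDUCTS}
```

A user-supplied adduct list, as the abstract mentions, would correspond here to adding entries to `ADDUCTS`.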


Subjects
Mass Spectrometry, Liquid Chromatography/methods
5.
Chem Rev ; 121(10): 5633-5670, 2021 05 26.
Article in English | MEDLINE | ID: mdl-33979149

ABSTRACT

A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials ("standards"), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for "standards-free" identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices and sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.


Subjects
Metabolomics, Quantum Theory
7.
Phys Chem Chem Phys ; 23(2): 1197-1214, 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33355332

ABSTRACT

Uncompetitive antagonists of the N-methyl d-aspartate receptor (NMDAR) have demonstrated therapeutic benefit in the treatment of neurological diseases such as Parkinson's and Alzheimer's, but some also cause dissociative effects that have led to the synthesis of illicit drugs. The ability to generate NMDAR antagonists in silico is therefore desirable for both new medication development and preempting and identifying new designer drugs. Recently, generative deep learning models have been applied to de novo drug design as a means to expand the amount of chemical space that can be explored for potential drug-like compounds. In this study, we assess the application of a generative model to the NMDAR to achieve two primary objectives: (i) the creation and release of a comprehensive library of experimentally validated NMDAR phencyclidine (PCP) site antagonists to assist the drug discovery community and (ii) an analysis of both the advantages conferred by applying such generative artificial intelligence models to drug design and the current limitations of the approach. We apply, and provide source code for, a variety of ligand- and structure-based assessment techniques used in standard drug discovery analyses to the deep learning-generated compounds. We present twelve candidate antagonists that are not available in existing chemical databases to provide an example of what this type of workflow can achieve, though synthesis and experimental validation of these compounds are still required.


Subjects
Deep Learning, N-Methyl-D-Aspartate Receptors/antagonists & inhibitors, Small Molecule Libraries/chemistry, Animals, Binding Sites, Drug Design, Ligands, Mice, Molecular Structure, N-Methyl-D-Aspartate Receptors/chemistry, Xenopus laevis
8.
J Chem Inf Model ; 60(12): 6251-6257, 2020 12 28.
Article in English | MEDLINE | ID: mdl-33283505

ABSTRACT

Thousands of chemical properties can be calculated for small molecules, which can be used to place the molecules within the context of a broader "chemical space." These definitions vary based on compounds of interest and the goals for the given chemical space definition. Here, we introduce a customizable Python module, chespa, built to easily assess different chemical space definitions through clustering of compounds in these spaces and visualizing trends of these clusters. To demonstrate this, chespa currently streamlines prediction of various molecular descriptors (predicted chemical properties, molecular substructures, AI-based chemical space, and chemical class ontology) in order to test six different chemical space definitions. Furthermore, we investigated how these varying definitions trend with mass spectrometry (MS)-based observability, that is, the ability of a molecule to be observed with MS (e.g., as a function of the molecule ionizability), using an example data set from the U.S. EPA's nontargeted analysis collaborative trial, where blinded samples had been analyzed previously, providing 1398 data points. Improved understanding of observability would offer many advantages in small-molecule identification, such as (i) a priori selection of experimental conditions based on suspected sample composition, (ii) the ability to reduce the number of candidate structures during compound identification by removing those less likely to ionize, and, in turn, (iii) a reduced false discovery rate and increased confidence in identifications. Factors controlling observability are not fully understood, making prediction of this property nontrivial and a prime candidate for chemical space analysis. Chespa is available at github.com/pnnl/chespa.


Subjects
Mass Spectrometry
9.
mSystems ; 5(3)2020 Jun 09.
Article in English | MEDLINE | ID: mdl-32518194

ABSTRACT

Increasing anthropogenic inputs of fixed nitrogen are leading to greater eutrophication of aquatic environments, but it is unclear how this impacts the flux and fate of carbon in lacustrine and riverine systems. Here, we present evidence that the form of nitrogen governs the partitioning of carbon among members in a genome-sequenced, model phototrophic biofilm of 20 members. Consumption of NO₃⁻ as the sole nitrogen source unexpectedly resulted in more rapid transfer of carbon to heterotrophs than when NH₄⁺ was also provided, suggesting alterations in the form of carbon exchanged. The form of nitrogen dramatically impacted net community nitrogen, but not carbon, uptake rates. Furthermore, this alteration in nitrogen form caused very large but focused alterations to community structure, strongly impacting the abundance of only two species within the biofilm and modestly impacting a third member species. Our data suggest that nitrogen metabolism may coordinate coupled carbon-nitrogen biogeochemical cycling in benthic biofilms and, potentially, in phototroph-heterotroph consortia more broadly. They further indicate that the form of nitrogen inputs may significantly impact the contribution of these communities to carbon partitioning across the terrestrial-aquatic interface.

IMPORTANCE: Anthropogenic inputs of nitrogen into aquatic ecosystems, and especially those of agricultural origin, involve a mix of chemical species. Although it is well-known in general that nitrogen eutrophication markedly influences the metabolism of aquatic phototrophic communities, relatively little is known regarding whether the specific chemical form of nitrogen inputs matters. Our data suggest that the nitrogen form alters the rate of nitrogen uptake significantly, whereas corresponding alterations in carbon uptake were minor. However, differences imposed by uptake of divergent nitrogen forms may result in alterations among phototroph-heterotroph interactions that rewire community metabolism.
Furthermore, our data hint that availability of other nutrients (i.e., iron) might mediate the linkage between carbon and nitrogen cycling in these communities. Taken together, our data suggest that different nitrogen forms should be examined for divergent impacts on phototrophic communities in fluvial systems and that these anthropogenic nitrogen inputs may significantly differ in their ultimate biogeochemical impacts.

10.
Anal Chem ; 92(2): 1720-1729, 2020 01 21.
Article in English | MEDLINE | ID: mdl-31661259

ABSTRACT

Comprehensive and unambiguous identification of small molecules in complex samples will revolutionize our understanding of the role of metabolites in biological systems. Existing and emerging technologies have enabled measurement of chemical properties of molecules in complex mixtures and, in concert, are sensitive enough to resolve even stereoisomers. Despite these experimental advances, small molecule identification is inhibited by (i) chemical reference libraries (e.g., mass spectra, collision cross section, and other measurable property libraries) representing <1% of known molecules, limiting the number of possible identifications, and (ii) the lack of a method to generate candidate matches directly from experimental features (i.e., without a library). To this end, we developed a variational autoencoder (VAE) to learn a continuous numerical, or latent, representation of molecular structure to expand reference libraries for small molecule identification. We extended the VAE to include a chemical property decoder, trained as a multitask network, in order to shape the latent representation such that it assembles according to desired chemical properties. The approach is unique in its application to metabolomics and small molecule identification, with its focus on properties that can be obtained from experimental measurements (m/z, CCS) paired with its training paradigm, which involved a cascade of transfer learning iterations. First, molecular representation is learned from a large data set of structures with m/z labels. Next, in silico property values are used to continue training, as experimental property data is limited. Finally, the network is further refined by being trained with the experimental data. This allows the network to learn as much as possible at each stage, enabling success with progressively smaller data sets without overfitting. 
Once trained, the network can be used to predict chemical properties directly from structure, as well as generate candidate structures with desired chemical properties. Our approach is orders of magnitude faster than first-principles simulation for CCS property prediction. Additionally, the ability to generate novel molecules along manifolds, defined by chemical property analogues, positions DarkChem as highly useful in a number of application areas, including metabolomics and small molecule identification, drug discovery and design, chemical forensics, and beyond.


Subjects
Computer Simulation, Deep Learning, Small Molecule Libraries/analysis, Metabolomics, Molecular Structure, Small Molecule Libraries/metabolism
11.
J Chem Inf Model ; 59(9): 4052-4060, 2019 09 23.
Article in English | MEDLINE | ID: mdl-31430141

ABSTRACT

The current gold standard for unambiguous molecular identification in metabolomics analysis is comparing two or more orthogonal properties from the analysis of authentic reference materials (standards) to experimental data acquired in the same laboratory with the same analytical methods. This represents a significant limitation for comprehensive chemical identification of small molecules in complex samples. The process is time consuming and costly, and the majority of molecules are not yet represented by standards. Thus, there is a need to assemble evidence for the presence of small molecules in complex samples through the use of libraries containing calculated chemical properties. To address this need, we developed a Multi-Attribute Matching Engine (MAME) and a library derived in part from our in silico chemical library engine (ISiCLE). Here, we describe an initial evaluation of these methods in a blinded analysis of synthetic chemical mixtures as part of the U.S. Environmental Protection Agency's (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT, Phase 1). For molecules in all mixtures, the initial blinded false negative rate (FNR), false discovery rate (FDR), and accuracy were 57%, 77%, and 91%, respectively. For high evidence scores, the FDR was 35%. After unblinding of the sample compositions, we optimized the scoring parameters to better exploit the available evidence and increased the accuracy for molecules suspected as present. The final FNR, FDR, and accuracy were 67%, 53%, and 96%, respectively. For high evidence scores, the FDR was 10%. This study demonstrates that multiattribute matching methods in conjunction with in silico libraries may one day enable reduced reliance on experimentally derived libraries for building evidence for the presence of molecules in complex samples.
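
The abstract reports FNR, FDR, and accuracy without defining them. The sketch below uses the conventional confusion-matrix definitions, which is an assumption about how the study computed them, and the counts are invented purely to show how accuracy can stay high while FDR is substantial when true negatives dominate.

```python
def identification_metrics(tp, fp, tn, fn):
    """Conventional definitions (assumed, not taken from the paper):
    FNR = FN / (FN + TP), FDR = FP / (FP + TP),
    accuracy = (TP + TN) / (TP + FP + TN + FN)."""
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    fdr = fp / (fp + tp) if (fp + tp) else float("nan")
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"FNR": fnr, "FDR": fdr, "accuracy": accuracy}


# Hypothetical counts: a large library in which most candidates are
# correctly rejected (true negatives), so accuracy is dominated by TN.
metrics = identification_metrics(tp=30, fp=10, tn=950, fn=10)
```

With these made-up counts, a quarter of the reported identifications are wrong and a quarter of the true compounds are missed, yet accuracy is 98% because the 950 correct rejections outweigh everything else.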


Subjects
Computational Biology/methods, Computer Simulation, Small Molecule Libraries/chemistry, Algorithms, Small Molecule Libraries/metabolism
12.
Anal Chem ; 91(7): 4346-4356, 2019 04 02.
Article in English | MEDLINE | ID: mdl-30741529

ABSTRACT

High-throughput, comprehensive, and confident identifications of metabolites and other chemicals in biological and environmental samples will revolutionize our understanding of the role these chemically diverse molecules play in biological systems. Despite recent technological advances, metabolomics studies still result in the detection of a disproportionate number of features that cannot be confidently assigned to a chemical structure. This inadequacy is driven by the single most significant limitation in metabolomics, the reliance on reference libraries constructed by analysis of authentic reference materials with limited commercial availability. To this end, we have developed the in silico chemical library engine (ISiCLE), a high-performance computing-friendly cheminformatics workflow for generating libraries of chemical properties. In the instantiation described here, we predict probable three-dimensional molecular conformers (i.e., conformational isomers) using chemical identifiers as input, from which collision cross sections (CCS) are derived. The approach employs first-principles simulation, distinguished by the use of molecular dynamics, quantum chemistry, and ion mobility calculations, to generate structures and chemical property libraries, all without training data. Importantly, optimization of ISiCLE included a refactoring of the popular MOBCAL code for trajectory-based mobility calculations, improving its computational efficiency by over 2 orders of magnitude. Calculated CCS values were validated against 1983 experimentally measured CCS values and compared to previously reported CCS calculation approaches. Average calculated CCS error for the validation set is 3.2% using standard parameters, outperforming other density functional theory (DFT)-based methods and machine learning methods (e.g., MetCCS). 
An online database is introduced for sharing both calculated and experimental CCS values (metabolomics.pnnl.gov), initially including a CCS library with over 1 million entries. Finally, three successful applications of molecule characterization using calculated CCS are described, including providing evidence for the presence of an environmental degradation product, the separation of molecular isomers, and an initial characterization of complex blinded mixtures of exposure chemicals. This work represents a method to address the limitations of small molecule identification and offers an alternative to generating chemical identification libraries experimentally by analyzing authentic reference materials. All code is available at github.com/pnnl.


Subjects
Cheminformatics/methods, Density Functional Theory, Small Molecule Libraries/chemistry, Machine Learning, Chemical Models, Molecular Dynamics Simulation
13.
J Cheminform ; 10(1): 52, 2018 Oct 26.
Article in English | MEDLINE | ID: mdl-30367288

ABSTRACT

When using nuclear magnetic resonance (NMR) to assist in chemical identification in complex samples, researchers commonly rely on databases for chemical shift spectra. However, authentic standards are typically depended upon to build libraries experimentally. Considering complex biological samples, such as blood and soil, the entirety of NMR spectra required for all possible compounds would be infeasible to ascertain due to limitations of available standards and experimental processing time. As an alternative, we introduce the in silico Chemical Library Engine (ISiCLE) NMR chemical shift module to accurately and automatically calculate NMR chemical shifts of small organic molecules through use of quantum chemical calculations. ISiCLE performs density functional theory (DFT)-based calculations for predicting chemical properties (specifically, NMR chemical shifts in this manuscript) via the open source, high-performance computational chemistry software, NWChem. ISiCLE calculates the NMR chemical shifts of sets of molecules using any available combination of DFT method, solvent, and NMR-active nuclei, using user-selected reference compounds and/or linear regression methods. Calculated NMR chemical shifts are provided to the user for each molecule, along with comparisons with respect to a number of metrics commonly used in the literature. Here, we demonstrate ISiCLE using a set of 312 molecules, ranging in size up to 90 carbon atoms. For each, calculation of NMR chemical shifts has been performed with 8 different levels of DFT theory, and with solvation effects using the implicit solvent Conductor-like Screening Model. The DFT method dependence of the calculated chemical shifts has been systematically investigated through benchmarking and subsequently compared to experimental data available in the literature. Furthermore, ISiCLE has been applied to a set of 80 methylcyclohexane conformers, combined via Boltzmann weighting and compared to experimental values.
We demonstrate that our protocol shows promise in the automation of chemical shift calculations and, ultimately, the expansion of chemical shift libraries.
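
The Boltzmann weighting of conformers mentioned in this abstract can be sketched as follows. This is a generic textbook formulation, not ISiCLE code: the function names, the choice of kcal/mol for relative energies, the default temperature, and the example numbers are all assumptions for illustration.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol*K)


def boltzmann_weights(energies_kcal, temperature=298.15):
    """Population weights w_i = exp(-dE_i / RT) / sum_j exp(-dE_j / RT),
    with energies taken relative to the lowest-energy conformer."""
    e_min = min(energies_kcal)
    rt = GAS_CONSTANT_KCAL * temperature
    factors = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def weighted_shift(shifts_ppm, energies_kcal, temperature=298.15):
    """Boltzmann-weighted average chemical shift over conformers."""
    weights = boltzmann_weights(energies_kcal, temperature)
    return sum(w * s for w, s in zip(weights, shifts_ppm))


# Hypothetical two-conformer example: the higher-energy conformer
# (1.0 kcal/mol above the minimum) contributes less to the average.
energies = [0.0, 1.0]   # kcal/mol, illustrative
shifts = [30.0, 26.0]   # ppm for one nucleus, illustrative
weights = boltzmann_weights(energies)
average_shift = weighted_shift(shifts, energies)
```

Subtracting the minimum energy before exponentiating leaves the weights unchanged but avoids overflow or underflow when absolute energies are large, as they typically are for DFT total energies.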

14.
PLoS One ; 13(7): e0199239, 2018.
Article in English | MEDLINE | ID: mdl-30067751

ABSTRACT

Color vision deficiency (CVD) affects more than 4% of the population and leads to a different visual perception of colors. Though this has been known for decades, colormaps with many colors across the visual spectrum are often used to represent data, leading to the potential for misinterpretation or difficulty with interpretation by someone with this deficiency. Until the creation of the module presented here, there were no colormaps mathematically optimized for CVD using modern color appearance models. While there have been some attempts to make aesthetically pleasing or subjectively tolerable colormaps for those with CVD, our goal was to make optimized colormaps for the most accurate perception of scientific data by as many viewers as possible. We developed a Python module, cmaputil, to create CVD-optimized colormaps, which imports colormaps and modifies them to be perceptually uniform in CVD-safe colorspace while linearizing and maximizing the brightness range. The module is made available to the science community to enable others to easily create their own CVD-optimized colormaps. Here, we present an example CVD-optimized colormap created with this module that is optimized for viewing by those without a CVD as well as those with red-green colorblindness. This colormap, cividis, enables nearly identical visual-data interpretation for both groups, is perceptually uniform in hue and brightness, and increases in brightness linearly.


Subjects
Algorithms, Color Perception, Color Vision Defects/physiopathology, Ocular Vision/physiology, Color, Humans, Photic Stimulation, Software