Search | BVS Integralidade em Saúde

1.

MSNovelist: de novo structure generation from mass spectra.

Stravs, Michael A; Dührkop, Kai; Böcker, Sebastian; Zamboni, Nicola.

Nat Methods ; 19(7): 865-870, 2022 07.

Article in English | MEDLINE | ID: mdl-35637304

ABSTRACT

Current methods for structure elucidation of small molecules rely on finding similarity with spectra of known compounds, but do not predict structures de novo for unknown compound classes. We present MSNovelist, which combines fingerprint prediction with an encoder-decoder neural network to generate structures de novo solely from tandem mass spectrometry (MS2) spectra. In an evaluation with 3,863 MS2 spectra from the Global Natural Product Social Molecular Networking site, MSNovelist predicted 25% of structures correctly on first rank, retrieved 45% of structures overall and reproduced 61% of correct database annotations, without having ever seen the structure in the training phase. Similarly, for the CASMI 2016 challenge, MSNovelist correctly predicted 26% and retrieved 57% of structures, recovering 64% of correct database annotations. Finally, we illustrate the application of MSNovelist in a bryophyte MS2 dataset, in which de novo structure prediction substantially outscored the best database candidate for seven spectra. MSNovelist is ideally suited to complement library-based annotation in the case of poorly represented analyte classes and novel compounds.

Subjects

Tandem Mass Spectrometry , Factual Databases

2.

Combining Experimental with Computational Infrared and Mass Spectra for High-Throughput Nontargeted Chemical Structure Identification.

Karunaratne, Erandika; Hill, Dennis W; Dührkop, Kai; Böcker, Sebastian; Grant, David F.

Anal Chem ; 95(32): 11901-11907, 2023 08 15.

Article in English | MEDLINE | ID: mdl-37540774

ABSTRACT

The inability to identify the structures of most metabolites detected in environmental or biological samples limits the utility of nontargeted metabolomics. The most widely used analytical approaches combine mass spectrometry and machine learning methods to rank candidate structures contained in large chemical databases. Given the large chemical space typically searched, the use of additional orthogonal data may improve the identification rates and reliability. Here, we present results of combining experimental and computational mass and IR spectral data for high-throughput nontargeted chemical structure identification. Experimental MS/MS and gas-phase IR data for 148 test compounds were obtained from NIST. Candidate structures for each of the test compounds were obtained from PubChem (mean = 4444 candidate structures per test compound). Our workflow used CSI:FingerID to initially score and rank the candidate structures. The top 1000 ranked candidates were subsequently used for IR spectra prediction, scoring, and ranking using density functional theory (DFT-IR). Final ranking of the candidates was based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. This approach resulted in the correct identification of 88 of the 148 test compounds (59%). 129 of the 148 test compounds (87%) were ranked within the top 20 candidates. These identification rates are the highest yet reported when candidate structures are used from PubChem. Combining experimental and computational MS/MS and IR spectral data is a potentially powerful option for prioritizing candidates for final structure verification.
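The composite ranking described in this abstract (the average of the two per-method rankings) can be illustrated with a minimal Python sketch. It assumes each method assigns every candidate a score where higher is better. The function names, candidate labels, and score values are hypothetical and are not taken from the paper, CSI:FingerID, or any DFT package. Breaking ties alphabetically is also an assumption.

```python
def ranks(scores):
    """Map each candidate to its 1-based rank (1 = best); higher score is better."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {cand: i + 1 for i, cand in enumerate(ordered)}


def composite_ranking(ms_scores, ir_scores):
    """Order candidates by the average of their two per-method ranks.

    Ties in the averaged rank are broken alphabetically (an assumption).
    """
    ms_rank = ranks(ms_scores)
    ir_rank = ranks(ir_scores)
    avg = {c: (ms_rank[c] + ir_rank[c]) / 2 for c in ms_scores}
    return sorted(avg, key=lambda c: (avg[c], c))


# Hypothetical scores for four candidate structures.
ms_scores = {"A": 0.9, "B": 0.8, "C": 0.5, "D": 0.1}  # stands in for the MS/MS-based score
ir_scores = {"A": 0.4, "B": 0.9, "C": 0.7, "D": 0.2}  # stands in for the IR-based score

print(composite_ranking(ms_scores, ir_scores))
```

Averaging ranks rather than raw scores avoids having to put two differently scaled scores on a common scale, at the cost of discarding how large the score gaps between candidates are.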

Subjects

Chemical Databases , Tandem Mass Spectrometry , Reproducibility of Results , Metabolomics/methods , Machine Learning

3.

Feature-based molecular networking in the GNPS analysis environment.

Nothias, Louis-Félix; Petras, Daniel; Schmid, Robin; Dührkop, Kai; Rainer, Johannes; Sarvepalli, Abinesh; Protsyuk, Ivan; Ernst, Madeleine; Tsugawa, Hiroshi; Fleischauer, Markus; Aicheler, Fabian; Aksenov, Alexander A; Alka, Oliver; Allard, Pierre-Marie; Barsch, Aiko; Cachet, Xavier; Caraballo-Rodriguez, Andres Mauricio; Da Silva, Ricardo R; Dang, Tam; Garg, Neha; Gauglitz, Julia M; Gurevich, Alexey; Isaac, Giorgis; Jarmusch, Alan K; Kameník, Zdenek; Kang, Kyo Bin; Kessler, Nikolas; Koester, Irina; Korf, Ansgar; Le Gouellec, Audrey; Ludwig, Marcus; Martin H, Christian; McCall, Laura-Isobel; McSayles, Jonathan; Meyer, Sven W; Mohimani, Hosein; Morsy, Mustafa; Moyne, Oriane; Neumann, Steffen; Neuweger, Heiko; Nguyen, Ngoc Hung; Nothias-Esposito, Melissa; Paolini, Julien; Phelan, Vanessa V; Pluskal, Tomás; Quinn, Robert A; Rogers, Simon; Shrestha, Bindesh; Tripathi, Anupriya; van der Hooft, Justin J J.

Nat Methods ; 17(9): 905-908, 2020 09.

Article in English | MEDLINE | ID: mdl-32839597

ABSTRACT

Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.

Subjects

Biological Products/chemistry , Mass Spectrometry , Computational Biology/methods , Factual Databases , Metabolomics/methods , Software

4.

Deep kernel learning improves molecular fingerprint prediction from tandem mass spectra.

Dührkop, Kai.

Bioinformatics ; 38(Suppl 1): i342-i349, 2022 06 24.

Article in English | MEDLINE | ID: mdl-35758813

ABSTRACT

MOTIVATION: Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but these libraries are vastly incomplete; in silico methods search in structure databases, allowing us to overcome this limitation. The best-performing in silico methods use machine learning to predict a molecular fingerprint from tandem mass spectra, then use the predicted fingerprint to search in a molecular structure database. Predicted molecular fingerprints are also of great interest for compound class annotation, de novo structure elucidation, and other tasks. So far, kernel support vector machines are the best tool for fingerprint prediction. However, they cannot be trained on all publicly available reference spectra because their training time scales cubically with the number of training data. RESULTS: We use the Nyström approximation to transform the kernel into a linear feature map. We evaluate two methods that use this feature map as input: a linear support vector machine and a deep neural network (DNN). For evaluation, we use a cross-validated dataset of 156 017 compounds and three independent datasets with 1734 compounds. We show that the combination of kernel method and DNN outperforms the kernel support vector machine, which is the current gold standard, as well as a DNN on tandem mass spectra on all evaluation datasets. AVAILABILITY AND IMPLEMENTATION: The deep kernel learning method for fingerprint prediction is part of the SIRIUS software, available at https://bio.informatik.uni-jena.de/software/sirius.

Subjects

Metabolomics , Tandem Mass Spectrometry , Chemical Databases , Machine Learning , Metabolomics/methods , Computer Neural Networks , Tandem Mass Spectrometry/methods

5.

Chemically informed analyses of metabolomics mass spectrometry data with Qemistree.

Tripathi, Anupriya; Vázquez-Baeza, Yoshiki; Gauglitz, Julia M; Wang, Mingxun; Dührkop, Kai; Nothias-Esposito, Mélissa; Acharya, Deepa D; Ernst, Madeleine; van der Hooft, Justin J J; Zhu, Qiyun; McDonald, Daniel; Brejnrod, Asker D; Gonzalez, Antonio; Handelsman, Jo; Fleischauer, Markus; Ludwig, Marcus; Böcker, Sebastian; Nothias, Louis-Félix; Knight, Rob; Dorrestein, Pieter C.

Nat Chem Biol ; 17(2): 146-151, 2021 02.

Article in English | MEDLINE | ID: mdl-33199911

ABSTRACT

Untargeted mass spectrometry is employed to detect small molecules in complex biospecimens, generating data that are difficult to interpret. We developed Qemistree, a data exploration strategy based on the hierarchical organization of molecular fingerprints predicted from fragmentation spectra. Qemistree allows mass spectrometry data to be represented in the context of sample metadata and chemical ontologies. By expressing molecular relationships as a tree, we can apply ecological tools that are designed to analyze and visualize the relatedness of DNA sequences to metabolomics data. Here we demonstrate the use of tree-guided data exploration tools to compare metabolomics samples across different experimental conditions such as chromatographic shifts. Additionally, we leverage a tree representation to visualize chemical diversity in a heterogeneous collection of samples. The Qemistree software pipeline is freely available to the microbiome and metabolomics communities in the form of a QIIME2 plugin, and a global natural products social molecular networking workflow.

Subjects

Mass Spectrometry/methods , Metabolomics , Algorithms , Cluster Analysis , DNA/chemistry , DNA Fingerprinting , Factual Databases , Ecology , Food Analysis , Microbiota , Multivariate Analysis , Software , Tandem Mass Spectrometry , Workflow

6.

Illuminating the dark metabolome of Pseudo-nitzschia-microbiome associations.

Koester, Irina; Quinlan, Zachary A; Nothias, Louis-Félix; White, Margot E; Rabines, Ariel; Petras, Daniel; Brunson, John K; Dührkop, Kai; Ludwig, Marcus; Böcker, Sebastian; Azam, Farooq; Allen, Andrew E; Dorrestein, Pieter C; Aluwihare, Lihini I.

Environ Microbiol ; 24(11): 5408-5424, 2022 11.

Article in English | MEDLINE | ID: mdl-36222155

ABSTRACT

The exchange of metabolites mediates algal and bacterial interactions that maintain ecosystem function. Yet, while thousands of metabolites are produced, only a few molecules have been identified in these associations. Using the ubiquitous microalgae Pseudo-nitzschia sp., as a model, we employed an untargeted metabolomics strategy to assign structural characteristics to the metabolites that distinguished specific diatom-microbiome associations. We cultured five species of Pseudo-nitzschia, including two species that produced the toxin domoic acid, and examined their microbiomes and metabolomes. A total of 4826 molecular features were detected by tandem mass spectrometry. Only 229 of these could be annotated using available mass spectral libraries, but by applying new in silico annotation tools, characterization was expanded to 2710 features. The metabolomes of the Pseudo-nitzschia-microbiome associations were distinct and distinguished by structurally diverse nitrogen compounds, ranging from simple amines and amides to cyclic compounds such as imidazoles, pyrrolidines and lactams. By illuminating the dark metabolomes, this study expands our capacity to discover new chemical targets that facilitate microbial partnerships and uncovers the chemical diversity that underpins algae-bacteria interactions.

Subjects

Diatoms , Microbiota , Diatoms/metabolism , Tandem Mass Spectrometry , Metabolome

7.

SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information.

Dührkop, Kai; Fleischauer, Markus; Ludwig, Marcus; Aksenov, Alexander A; Melnik, Alexey V; Meusel, Marvin; Dorrestein, Pieter C; Rousu, Juho; Böcker, Sebastian.

Nat Methods ; 16(4): 299-302, 2019 04.

Article in English | MEDLINE | ID: mdl-30886413

ABSTRACT

Mass spectrometry is a predominant experimental technique in metabolomics and related fields, but metabolite structural elucidation remains highly challenging. We report SIRIUS 4 (https://bio.informatik.uni-jena.de/sirius/), which provides a fast computational approach for molecular structure identification. SIRIUS 4 integrates CSI:FingerID for searching in molecular structure databases. Using SIRIUS 4, we achieved identification rates of more than 70% on challenging metabolomics datasets.

Subjects

Metabolomics/methods , Molecular Structure , Computer-Assisted Signal Processing , Tandem Mass Spectrometry/methods , Algorithms , Bayes Theorem , Biomarkers , Cluster Analysis , Computational Biology/methods , Computer Graphics , Factual Databases , Electronic Data Processing , Internet , Isotopes , Likelihood Functions , Metabolome , Computer Neural Networks , Programming Languages , User-Computer Interface

8.

Mass Difference Matching Unfolds Hidden Molecular Structures of Dissolved Organic Matter.

Simon, Carsten; Dührkop, Kai; Petras, Daniel; Roth, Vanessa-Nina; Böcker, Sebastian; Dorrestein, Pieter C; Gleixner, Gerd.

Environ Sci Technol ; 56(15): 11027-11040, 2022 08 02.

Article in English | MEDLINE | ID: mdl-35834352

ABSTRACT

Ultrahigh-resolution Fourier transform mass spectrometry (FTMS) has revealed unprecedented details of natural complex mixtures such as dissolved organic matter (DOM) on a molecular formula level, but we lack approaches to access the underlying structural complexity. We here explore the hypothesis that every DOM precursor ion is potentially linked with all emerging product ions in FTMS2 experiments. The resulting mass difference (Δm) matrix is deconvoluted to isolate individual precursor ion Δm profiles and matched with structural information, which was derived from 42 Δm features from 14 in-house reference compounds and a global set of 11 477 Δm features with assigned structure specificities, using a dataset of ~18 000 unique structures. We show that Δm matching is highly sensitive in predicting potential precursor ion identities in terms of molecular and structural composition. Additionally, the approach identified unresolved precursor ions and missing elements in molecular formula annotation (P, Cl, F). Our study provides first results on how Δm matching refines structural annotations in van Krevelen space but simultaneously demonstrates the wide overlap between potential structural classes. We show that this effect is likely driven by chemodiversity and offers an explanation for the observed ubiquitous presence of molecules in the center of the van Krevelen space. Our promising first results suggest that Δm matching can both unfold the structural information encrypted in DOM and assess the quality of FTMS-derived molecular formulas of complex mixtures in general.

Subjects

Dissolved Organic Matter , Electrospray Ionization Mass Spectrometry , Complex Mixtures , Molecular Structure , Electrospray Ionization Mass Spectrometry/methods

9.

Bayesian networks for mass spectrometric metabolite identification via molecular fingerprints.

Ludwig, Marcus; Dührkop, Kai; Böcker, Sebastian.

Bioinformatics ; 34(13): i333-i340, 2018 07 01.

Article in English | MEDLINE | ID: mdl-29949965

ABSTRACT

Motivation: Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presented CSI:FingerID for searching in molecular structure databases using tandem mass spectrometry data. CSI:FingerID predicts a molecular fingerprint that encodes the structure of the query compound, then uses this to search a molecular structure database such as PubChem. Scoring of the predicted query fingerprint and deterministic target fingerprints is carried out assuming independence between the molecular properties constituting the fingerprint. Results: We present a scoring that takes into account dependencies between molecular properties. As before, we predict posterior probabilities of molecular properties using machine learning. Dependencies between molecular properties are modeled as a Bayesian tree network; the tree structure is estimated on the fly from the instance data. For each edge, we also estimate the expected covariance between the two random variables. For fixed marginal probabilities, we then estimate conditional probabilities using the known covariance. Now, the corrected posterior probability of each candidate can be computed, and candidates are ranked by this score. Modeling dependencies improves identification rates of CSI:FingerID by 2.85 percentage points. Availability and implementation: The new scoring Bayesian (fixed tree) is integrated into SIRIUS 4.0 (https://bio.informatik.uni-jena.de/software/sirius/).
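The independence assumption mentioned in this abstract can be illustrated with a minimal Python sketch: if the molecular properties are treated as independent, the score of a candidate is the product of per-property probabilities, computed here as a sum of logarithms. This is a simplified, naive-Bayes-style illustration and not the actual CSI:FingerID scoring or the Bayesian tree scoring of the paper. The function names, the probability values, and the candidate fingerprints are all hypothetical.

```python
import math


def log_likelihood(predicted_probs, candidate_bits, eps=1e-9):
    """Log-probability of a candidate fingerprint, assuming independent properties.

    predicted_probs[i] is the predicted probability that property i is present;
    candidate_bits[i] is 1 if the candidate structure has property i, else 0.
    """
    total = 0.0
    for p, bit in zip(predicted_probs, candidate_bits):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += math.log(p if bit else 1.0 - p)
    return total


def rank_candidates(predicted_probs, candidates):
    """Return candidate names ordered from most to least likely."""
    return sorted(
        candidates,
        key=lambda name: log_likelihood(predicted_probs, candidates[name]),
        reverse=True,
    )


# Hypothetical predicted probabilities for three molecular properties.
predicted = [0.9, 0.1, 0.8]
# Hypothetical deterministic fingerprints of three database candidates.
candidates = {"X": [1, 0, 1], "Y": [1, 1, 1], "Z": [0, 0, 0]}

print(rank_candidates(predicted, candidates))
```

Under this assumption, strongly correlated properties are counted as separate evidence, which is the limitation that modeling dependencies between properties is meant to address.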

Subjects

Chemical Databases , Metabolomics , Tandem Mass Spectrometry , Bayes Theorem , Machine Learning , Metabolomics/methods , Software

10.

Searching molecular structure databases with tandem mass spectra using CSI:FingerID.

Dührkop, Kai; Shen, Huibin; Meusel, Marvin; Rousu, Juho; Böcker, Sebastian.

Proc Natl Acad Sci U S A ; 112(41): 12580-5, 2015 Oct 13.

Article in English | MEDLINE | ID: mdl-26392543

ABSTRACT

Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem MS to identify the thousands of compounds in a biological sample. Today, the vast majority of metabolites remain unknown. We present a method for searching molecular structure databases using tandem MS data of small molecules. Our method computes a fragmentation tree that best explains the fragmentation spectrum of an unknown molecule. We use the fragmentation tree to predict the molecular structure fingerprint of the unknown compound using machine learning. This fingerprint is then used to search a molecular structure database such as PubChem. Our method is shown to improve on the competing methods for computational metabolite identification by a considerable margin.

Subjects

Protein Databases , Machine Learning , Mass Spectrometry , Metabolomics , Animals , Humans

11.

Current Challenges in Plant Eco-Metabolomics.

Peters, Kristian; Worrich, Anja; Weinhold, Alexander; Alka, Oliver; Balcke, Gerd; Birkemeyer, Claudia; Bruelheide, Helge; Calf, Onno W; Dietz, Sophie; Dührkop, Kai; Gaquerel, Emmanuel; Heinig, Uwe; Kücklich, Marlen; Macel, Mirka; Müller, Caroline; Poeschl, Yvonne; Pohnert, Georg; Ristok, Christian; Rodríguez, Victor Manuel; Ruttkies, Christoph; Schuman, Meredith; Schweiger, Rabea; Shahaf, Nir; Steinbeck, Christoph; Tortosa, Maria; Treutler, Hendrik; Ueberschaar, Nico; Velasco, Pablo; Weiß, Brigitte M; Widdig, Anja; Neumann, Steffen; Dam, Nicole M van.

Int J Mol Sci ; 19(5)2018 May 06.

Article in English | MEDLINE | ID: mdl-29734799

ABSTRACT

The relatively new research discipline of Eco-Metabolomics is the application of metabolomics techniques to ecology with the aim to characterise biochemical interactions of organisms across different spatial and temporal scales. Metabolomics is an untargeted biochemical approach to measure many thousands of metabolites in different species, including plants and animals. Changes in metabolite concentrations can provide mechanistic evidence for biochemical processes that are relevant at ecological scales. These include physiological, phenotypic and morphological responses of plants and communities to environmental changes and also interactions with other organisms. Traditionally, research in biochemistry and ecology comes from two different directions and is performed at distinct spatiotemporal scales. Biochemical studies most often focus on intrinsic processes in individuals at physiological and cellular scales. Generally, they take a bottom-up approach scaling up cellular processes from spatiotemporally fine to coarser scales. Ecological studies usually focus on extrinsic processes acting upon organisms at population and community scales and typically study top-down and bottom-up processes in combination. Eco-Metabolomics is a transdisciplinary research discipline that links biochemistry and ecology and connects the distinct spatiotemporal scales. In this review, we focus on approaches to study chemical and biochemical interactions of plants at various ecological levels, mainly plant–organismal interactions, and discuss related examples from other domains. We present recent developments and highlight advancements in Eco-Metabolomics over the last decade from various angles. We further address the five key challenges: (1) complex experimental designs and large variation of metabolite profiles; (2) feature extraction; (3) metabolite identification; (4) statistical analyses; and (5) bioinformatics software tools and workflows. The presented solutions to these challenges will advance connecting the distinct spatiotemporal scales and bridging biochemistry and ecology.

Subjects

Ecology , Metabolomics/trends , Plants/genetics , Plants/metabolism

12.

Fast metabolite identification with Input Output Kernel Regression.

Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho.

Bioinformatics ; 32(12): i28-i36, 2016 06 15.

Article in English | MEDLINE | ID: mdl-27307628

ABSTRACT

MOTIVATION: An important problematic of metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output space and can handle structured output space such as the molecule space. RESULTS: We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated to the output kernel. The second phase is a preimage problem, consisting in mapping back the predicted output feature vectors to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. CONTACT: celine.brouard@aalto.fi SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Computational Biology/methods , Machine Learning , Metabolomics , Molecular Structure , Tandem Mass Spectrometry , Algorithms , Chemical Databases

13.

Metabolite identification through multiple kernel learning on fragmentation trees.

Shen, Huibin; Dührkop, Kai; Böcker, Sebastian; Rousu, Juho.

Bioinformatics ; 30(12): i157-64, 2014 Jun 15.

Article in English | MEDLINE | ID: mdl-24931979

ABSTRACT

MOTIVATION: Metabolite identification from tandem mass spectrometric data is a key task in metabolomics. Various computational methods have been proposed for the identification of metabolites from tandem mass spectra. Fragmentation tree methods explore the space of possible ways in which the metabolite can fragment, and base the metabolite identification on scoring of these fragmentation trees. Machine learning methods have been used to map mass spectra to molecular fingerprints; predicted fingerprints, in turn, can be used to score candidate molecular structures. RESULTS: Here, we combine fragmentation tree computations with kernel-based machine learning to predict molecular fingerprints and identify molecular structures. We introduce a family of kernels capturing the similarity of fragmentation trees, and combine these kernels using recently proposed multiple kernel learning approaches. Experiments on two large reference datasets show that the new methods significantly improve molecular fingerprint prediction accuracy. These improvements result in better metabolite identification, doubling the number of metabolites ranked at the top position of the candidates list.

Subjects

Artificial Intelligence , Metabolomics/methods , Tandem Mass Spectrometry/methods , Algorithms , Molecular Structure , Software

14.

Fast alignment of fragmentation trees.

Hufsky, Franziska; Dührkop, Kai; Rasche, Florian; Chimani, Markus; Böcker, Sebastian.

Bioinformatics ; 28(12): i265-73, 2012 Jun 15.

Article in English | MEDLINE | ID: mdl-22689771

ABSTRACT

MOTIVATION: Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules such as metabolites. One major bottleneck in metabolomics is the identification of 'unknown' small molecules not in any database. Recently, fragmentation tree alignments have been introduced for the automated comparison of the fragmentation patterns of small molecules. Fragmentation pattern similarities are strongly correlated with the chemical similarity of the molecules, and allow us to cluster compounds based solely on their fragmentation patterns. RESULTS: Aligning fragmentation trees is computationally hard. Nevertheless, we present three exact algorithms for the problem: a dynamic programming (DP) algorithm, a sparse variant of the DP, and an Integer Linear Program (ILP). Evaluation of our methods on three different datasets showed that thousands of alignments can be computed in a matter of minutes using DP, even for 'challenging' instances. Running times of the sparse DP were an order of magnitude better than for the classical DP. The ILP was clearly outperformed by both DP approaches. We also found that for both DP algorithms, computing the 1% slowest alignments required as much time as computing the 99% fastest.

Subjects

Algorithms , Computational Biology/methods , Mass Spectrometry , Metabolomics/methods , Factual Databases

15.

High-confidence structural annotation of metabolites absent from spectral libraries.

Hoffmann, Martin A; Nothias, Louis-Félix; Ludwig, Marcus; Fleischauer, Markus; Gentry, Emily C; Witting, Michael; Dorrestein, Pieter C; Dührkop, Kai; Böcker, Sebastian.

Nat Biotechnol ; 40(3): 411-421, 2022 03.

Article in English | MEDLINE | ID: mdl-34650271

ABSTRACT

Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but, typically, only a small fraction of spectra can be matched. Previous in silico methods search in structure databases but cannot distinguish between correct and incorrect annotations. Here we introduce the COSMIC workflow that combines in silico structure database generation and annotation with a confidence score consisting of kernel density P value estimation and a support vector machine with enforced directionality of features. On diverse datasets, COSMIC annotates a substantial number of hits at low false discovery rates and outperforms spectral library search. To demonstrate that COSMIC can annotate structures never reported before, we annotated 12 natural bile acids. The annotation of nine structures was confirmed by manual evaluation and two structures using synthetic standards. In human samples, we annotated and manually validated 315 molecular structures currently absent from the Human Metabolome Database. Application of COSMIC to data from 17,400 metabolomics experiments led to 1,715 high-confidence structural annotations that were absent from spectral libraries.

Subjects

Metabolomics , Tandem Mass Spectrometry , Factual Databases , Humans , Metabolome , Metabolomics/methods , Molecular Structure

16.

Standardized multi-omics of Earth's microbiomes reveals microbial and metabolite diversity.

Shaffer, Justin P; Nothias, Louis-Félix; Thompson, Luke R; Sanders, Jon G; Salido, Rodolfo A; Couvillion, Sneha P; Brejnrod, Asker D; Lejzerowicz, Franck; Haiminen, Niina; Huang, Shi; Lutz, Holly L; Zhu, Qiyun; Martino, Cameron; Morton, James T; Karthikeyan, Smruthi; Nothias-Esposito, Mélissa; Dührkop, Kai; Böcker, Sebastian; Kim, Hyun Woo; Aksenov, Alexander A; Bittremieux, Wout; Minich, Jeremiah J; Marotz, Clarisse; Bryant, MacKenzie M; Sanders, Karenina; Schwartz, Tara; Humphrey, Greg; Vásquez-Baeza, Yoshiki; Tripathi, Anupriya; Parida, Laxmi; Carrieri, Anna Paola; Beck, Kristen L; Das, Promi; González, Antonio; McDonald, Daniel; Ladau, Joshua; Karst, Søren M; Albertsen, Mads; Ackermann, Gail; DeReus, Jeff; Thomas, Torsten; Petras, Daniel; Shade, Ashley; Stegen, James; Song, Se Jin; Metz, Thomas O; Swafford, Austin D; Dorrestein, Pieter C; Jansson, Janet K; Gilbert, Jack A.

Nat Microbiol ; 7(12): 2128-2150, 2022 12.

Article in English | MEDLINE | ID: mdl-36443458

ABSTRACT

Despite advances in sequencing, lack of standardization makes comparisons across studies challenging and hampers insights into the structure and function of microbial communities across multiple habitats on a planetary scale. Here we present a multi-omics analysis of a diverse set of 880 microbial community samples collected for the Earth Microbiome Project. We include amplicon (16S, 18S, ITS) and shotgun metagenomic sequence data, and untargeted metabolomics data (liquid chromatography-tandem mass spectrometry and gas chromatography mass spectrometry). We used standardized protocols and analytical methods to characterize microbial communities, focusing on relationships and co-occurrences of microbially related metabolites and microbial taxa across environments, thus allowing us to explore diversity at extraordinary scale. In addition to a reference database for metagenomic and metabolomic data, we provide a framework for incorporating additional studies, enabling the expansion of existing knowledge in the form of an evolving community resource. We demonstrate the utility of this database by testing the hypothesis that every microbe and metabolite is everywhere but the environment selects. Our results show that metabolite diversity exhibits turnover and nestedness related to both microbial communities and the environment, whereas the relative abundances of microbially related metabolites vary and co-occur with specific microbial consortia in a habitat-specific manner. We additionally show the power of certain chemistry, in particular terpenoids, in distinguishing Earth's environments (for example, terrestrial plant surfaces and soils, freshwater and marine animal stool), as well as that of certain microbes including Conexibacter woesei (terrestrial soils), Haloquadratum walsbyi (marine deposits) and Pantoea dispersa (terrestrial plant detritus). This Resource provides insight into the taxa and metabolites within microbial communities from diverse habitats across Earth, informing both microbial and chemical ecology, and provides a foundation and methods for multi-omics microbiome studies of hosts and the environment.

Subjects

Microbiota , Animals , Microbiota/genetics , Metagenome , Metagenomics , Earth (Planet) , Soil

17.

Studying Charge Migration Fragmentation of Sodiated Precursor Ions in Collision-Induced Dissociation at the Library Scale.

Ludwig, Marcus; Broeckling, Corey D; Dorrestein, Pieter C; Dührkop, Kai; Schymanski, Emma L; Böcker, Sebastian; Nothias, Louis-Félix.

J Am Soc Mass Spectrom ; 32(1): 180-186, 2021 Jan 06.

Article in English | MEDLINE | ID: mdl-33186010

ABSTRACT

Interpretation of fragmentation mass spectra depends on our knowledge of collision-induced dissociation mechanisms. Computational methods for the annotation of fragmentation mechanisms operate within the boundaries of recognized fragmentation pathways. The prevalence of charge migration fragmentation (CMF) in sodiated ion fragmentation spectra, which produces nonsodiated fragment ions, is unknown. Here, we investigated the extent of CMF in the fragmentation spectra of sodiated precursors by mining the NIST17 spectral library using a diagnostic mass difference. Our results showed that a substantial amount of fragment ions in sodiated precursor spectra are derived from CMF, indicating that this fragmentation mechanism should be commonly considered by computational methods for compound annotation.

18.

Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra.

Dührkop, Kai; Nothias, Louis-Félix; Fleischauer, Markus; Reher, Raphael; Ludwig, Marcus; Hoffmann, Martin A; Petras, Daniel; Gerwick, William H; Rousu, Juho; Dorrestein, Pieter C; Böcker, Sebastian.

Nat Biotechnol ; 39(4): 462-471, 2021 04.

Article in English | MEDLINE | ID: mdl-33230292

ABSTRACT

Metabolomics using nontargeted tandem mass spectrometry can detect thousands of molecules in a biological sample. However, structural molecule annotation is limited to structures present in libraries or databases, restricting analysis and interpretation of experimental data. Here we describe CANOPUS (class assignment and ontology prediction using mass spectrometry), a computational tool for systematic compound class annotation. CANOPUS uses a deep neural network to predict 2,497 compound classes from fragmentation spectra, including all biologically relevant classes. CANOPUS explicitly targets compounds for which neither spectral nor structural reference data are available and predicts classes lacking tandem mass spectrometry training data. In evaluation using reference data, CANOPUS reached very high prediction performance (average accuracy of 99.7% in cross-validation) and outperformed four baseline methods. We demonstrate the broad utility of CANOPUS by investigating the effect of microbial colonization in the mouse digestive system, through analysis of the chemodiversity of different Euphorbia plants and regarding the discovery of a marine natural product, revealing biological insights at the compound class level.

Subjects

Aquatic Organisms/chemistry , Biological Products/analysis , Computational Biology/methods , Euphorbia/chemistry , Metabolomics/methods , Animals , Liquid Chromatography , Gastrointestinal Microbiome , Mice , Computer Neural Networks , Tandem Mass Spectrometry

19.

Ion identity molecular networking for mass spectrometry-based metabolomics in the GNPS environment.

Schmid, Robin; Petras, Daniel; Nothias, Louis-Félix; Wang, Mingxun; Aron, Allegra T; Jagels, Annika; Tsugawa, Hiroshi; Rainer, Johannes; Garcia-Aloy, Mar; Dührkop, Kai; Korf, Ansgar; Pluskal, Tomás; Kameník, Zdenek; Jarmusch, Alan K; Caraballo-Rodríguez, Andrés Mauricio; Weldon, Kelly C; Nothias-Esposito, Melissa; Aksenov, Alexander A; Bauermeister, Anelize; Albarracin Orio, Andrea; Grundmann, Carlismari O; Vargas, Fernando; Koester, Irina; Gauglitz, Julia M; Gentry, Emily C; Hövelmann, Yannick; Kalinina, Svetlana A; Pendergraft, Matthew A; Panitchpakdi, Morgan; Tehan, Richard; Le Gouellec, Audrey; Aleti, Gajender; Mannochio Russo, Helena; Arndt, Birgit; Hübner, Florian; Hayen, Heiko; Zhi, Hui; Raffatellu, Manuela; Prather, Kimberly A; Aluwihare, Lihini I; Böcker, Sebastian; McPhail, Kerry L; Humpf, Hans-Ulrich; Karst, Uwe; Dorrestein, Pieter C.

Nat Commun ; 12(1): 3832, 2021 06 22.

Article in English | MEDLINE | ID: mdl-34158495

ABSTRACT

Molecular networking connects mass spectra of molecules based on the similarity of their fragmentation patterns. However, during ionization, molecules commonly form multiple ion species with different fragmentation behavior. As a result, the fragmentation spectra of these ion species often remain unconnected in tandem mass spectrometry-based molecular networks, leading to redundant and disconnected sub-networks of the same compound classes. To overcome this bottleneck, we develop Ion Identity Molecular Networking (IIMN) that integrates chromatographic peak shape correlation analysis into molecular networks to connect and collapse different ion species of the same molecule. The new feature relationships improve network connectivity for structurally related molecules, can be used to reveal unknown ion-ligand complexes, enhance annotation within molecular networks, and facilitate the expansion of spectral reference libraries. IIMN is integrated into various open source feature finding tools and the GNPS environment. Moreover, IIMN-based spectral libraries with a broad coverage of ion species are publicly available.

Subjects

Computational Biology/methods , Ions/metabolism , Mass Spectrometry/methods , Metabolic Networks and Pathways , Metabolomics/methods , Animals , Internet , Ions/chemistry , Molecular Structure , Reproducibility of Results , Software

20.

De Novo Molecular Formula Annotation and Structure Elucidation Using SIRIUS 4.

Ludwig, Marcus; Fleischauer, Markus; Dührkop, Kai; Hoffmann, Martin A; Böcker, Sebastian.

Methods Mol Biol ; 2104: 185-207, 2020.

Article in English | MEDLINE | ID: mdl-31953819

ABSTRACT

SIRIUS 4 is the best-in-class computational tool for metabolite identification from high-resolution tandem mass spectrometry data. It offers de novo molecular formula annotation with outstanding accuracy. When searching fragmentation spectra in a structure database, it reaches over 70% correct identifications. A predicted fingerprint, which indicates the presence or absence of thousands of molecular properties, helps to deduce information about the compound of interest even if it is not contained in any structure database. Here, we present best practices and describe how to leverage the full potential of SIRIUS 4, how to incorporate it into your own workflow, and how it adds value to the analysis of mass spectrometry data beyond spectral library search.

Subjects

Computational Biology , Factual Databases , Metabolomics , Software , Liquid Chromatography , Computational Biology/methods , Humans , Metabolomics/methods , Molecular Structure , Electrospray Ionization Mass Spectrometry , Structure-Activity Relationship , Tandem Mass Spectrometry , User-Computer Interface , Workflow
