Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 5 de 5
Filtrar
Mais filtros

Bases de dados
Tipo de documento
País de afiliação
Intervalo de ano de publicação
2.
Metabolites ; 13(3)2023 Feb 21.
Artigo em Inglês | MEDLINE | ID: mdl-36984753

RESUMO

Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics usually relies on mass spectrometry, a technology capable of detecting thousands of compounds in a biological sample. Metabolite annotation is executed using tandem mass spectrometry. Spectral library search is far from comprehensive, and numerous compounds remain unannotated. So-called in silico methods allow us to overcome the restrictions of spectral libraries, by searching in much larger molecular structure databases. Yet, after more than a decade of method development, in silico methods still do not reach the correct annotation rates that users would wish for. Here, we present a novel computational method called Mad Hatter for this task. Mad Hatter combines CSI:FingerID results with information from the searched structure database via a metascore. Compound information includes the melting point, and the number of words in the compound description starting with the letter 'u'. We then show that Mad Hatter reaches a stunning 97.6% correct annotations when searching PubChem, one of the largest and most comprehensive molecular structure databases. Unfortunately, Mad Hatter is not a real method. Rather, we developed Mad Hatter solely for the purpose of demonstrating common issues in computational method development and evaluation. We explain what evaluation glitches were necessary for Mad Hatter to reach this annotation level, what is wrong with similar metascores in general, and why metascores may screw up not only method evaluations but also the analysis of biological experiments. This paper may serve as an example of problems in the development and evaluation of machine learning models for metabolite annotation.

3.
Nat Biotechnol ; 40(3): 411-421, 2022 03.
Artigo em Inglês | MEDLINE | ID: mdl-34650271

RESUMO

Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but, typically, only a small fraction of spectra can be matched. Previous in silico methods search in structure databases but cannot distinguish between correct and incorrect annotations. Here we introduce the COSMIC workflow that combines in silico structure database generation and annotation with a confidence score consisting of kernel density P value estimation and a support vector machine with enforced directionality of features. On diverse datasets, COSMIC annotates a substantial number of hits at low false discovery rates and outperforms spectral library search. To demonstrate that COSMIC can annotate structures never reported before, we annotated 12 natural bile acids. The annotation of nine structures was confirmed by manual evaluation and two structures using synthetic standards. In human samples, we annotated and manually validated 315 molecular structures currently absent from the Human Metabolome Database. Application of COSMIC to data from 17,400 metabolomics experiments led to 1,715 high-confidence structural annotations that were absent from spectral libraries.


Assuntos
Metabolômica , Espectrometria de Massas em Tandem , Bases de Dados Factuais , Humanos , Metaboloma , Metabolômica/métodos , Estrutura Molecular
4.
Nat Biotechnol ; 39(4): 462-471, 2021 04.
Artigo em Inglês | MEDLINE | ID: mdl-33230292

RESUMO

Metabolomics using nontargeted tandem mass spectrometry can detect thousands of molecules in a biological sample. However, structural molecule annotation is limited to structures present in libraries or databases, restricting analysis and interpretation of experimental data. Here we describe CANOPUS (class assignment and ontology prediction using mass spectrometry), a computational tool for systematic compound class annotation. CANOPUS uses a deep neural network to predict 2,497 compound classes from fragmentation spectra, including all biologically relevant classes. CANOPUS explicitly targets compounds for which neither spectral nor structural reference data are available and predicts classes lacking tandem mass spectrometry training data. In evaluation using reference data, CANOPUS reached very high prediction performance (average accuracy of 99.7% in cross-validation) and outperformed four baseline methods. We demonstrate the broad utility of CANOPUS by investigating the effect of microbial colonization in the mouse digestive system, through analysis of the chemodiversity of different Euphorbia plants and regarding the discovery of a marine natural product, revealing biological insights at the compound class level.


Assuntos
Organismos Aquáticos/química , Produtos Biológicos/análise , Biologia Computacional/métodos , Euphorbia/química , Metabolômica/métodos , Animais , Cromatografia Líquida , Microbioma Gastrointestinal , Camundongos , Redes Neurais de Computação , Espectrometria de Massas em Tandem
5.
Methods Mol Biol ; 2104: 185-207, 2020.
Artigo em Inglês | MEDLINE | ID: mdl-31953819

RESUMO

SIRIUS 4 is the best-in-class computational tool for metabolite identification from high-resolution tandem mass spectrometry data. It offers de novo molecular formula annotation with outstanding accuracy. When searching fragmentation spectra in a structure database, it reaches over 70% correct identifications. A predicted fingerprint, which indicates the presence or absence of thousands of molecular properties, helps to deduce information about the compound of interest even if it is not contained in any structure database. Here, we present best practices and describe how to leverage the full potential of SIRIUS 4, how to incorporate it into your own workflow, and how it adds value to the analysis of mass spectrometry data beyond spectral library search.


Assuntos
Biologia Computacional , Bases de Dados Factuais , Metabolômica , Software , Cromatografia Líquida , Biologia Computacional/métodos , Humanos , Metabolômica/métodos , Estrutura Molecular , Espectrometria de Massas por Ionização por Electrospray , Relação Estrutura-Atividade , Espectrometria de Massas em Tandem , Interface Usuário-Computador , Fluxo de Trabalho
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA