1.
bioRxiv ; 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39131324

ABSTRACT

Methods for assessing compound identification confidence in metabolomics and related studies have been debated and actively researched for the past two decades. The earliest effort in 2007 focused primarily on mass spectrometry and nuclear magnetic resonance spectroscopy and resulted in four recommended levels of metabolite identification confidence: the Metabolomics Standards Initiative (MSI) Levels. In 2014, the original MSI Levels were expanded to five levels (including two sublevels) to facilitate communication of compound identification confidence in high resolution mass spectrometry studies. Further refinements in identification levels have occurred, for example to accommodate use of ion mobility spectrometry in metabolomics workflows, and alternate approaches to communicate compound identification confidence also have been developed based on identification points schema. However, neither qualitative levels of identification confidence nor quantitative scoring systems address the degree of ambiguity in compound identifications in the context of the chemical space being considered, are easily automated, or are transferable between analytical platforms. In this perspective, we propose that the metabolomics and related communities consider identification probability as an approach for automated and transferable assessment of compound identification and ambiguity in metabolomics and related studies. Identification probability is defined simply as 1/N, where N is the number of compounds in a reference library or chemical space that match an experimentally measured molecule within user-defined measurement precision(s), for example mass measurement or retention time accuracy. We demonstrate the utility of identification probability in an in silico analysis of multi-property reference libraries constructed from the Human Metabolome Database and computational property predictions, provide guidance to the community in transparent implementation of the concept, and invite the community to further evaluate this concept in parallel with their current preferred methods for assessing metabolite identification confidence.
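
The 1/N definition above can be illustrated with a minimal Python sketch. Everything here beyond the 1/N formula is an assumption made for illustration: the `LibraryEntry` type, the ppm and retention-time tolerances, the toy library values, and the choice to return 0.0 when nothing matches (the abstract does not say how N = 0 is handled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryEntry:
    """Hypothetical reference-library record (illustrative only)."""
    name: str
    mass: float            # monoisotopic mass, Da
    retention_time: float  # minutes (made-up values below)


def identification_probability(library, measured_mass, measured_rt,
                               mass_tol_ppm=5.0, rt_tol_min=0.5):
    """Return (1/N, matches), where N counts library entries that fall
    within the user-defined mass and retention-time tolerances."""
    matches = [
        entry for entry in library
        if abs(entry.mass - measured_mass) / measured_mass * 1e6 <= mass_tol_ppm
        and abs(entry.retention_time - measured_rt) <= rt_tol_min
    ]
    n = len(matches)
    # Assumption: report 0.0 when no library entry matches (N = 0).
    probability = 1.0 / n if n else 0.0
    return probability, matches


# Toy library: leucine and isoleucine are isomers with identical mass.
toy_library = [
    LibraryEntry("leucine", 131.094629, 5.2),
    LibraryEntry("isoleucine", 131.094629, 5.0),
    LibraryEntry("glutamine", 146.069142, 2.1),
]

prob_loose, _ = identification_probability(toy_library, 131.0946, 5.1)
prob_tight, _ = identification_probability(toy_library, 131.0946, 5.2,
                                           rt_tol_min=0.1)
print(prob_loose, prob_tight)
```

With the loose retention-time window both isomers match (N = 2), while the tighter window leaves a single candidate (N = 1), showing how the probability depends on the stated measurement precision.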

2.
Nucleic Acids Res ; 52(W1): W381-W389, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38783107

ABSTRACT

GCMS-ID (Gas Chromatography Mass Spectrometry compound IDentifier) is a webserver designed to enable the identification of compounds from GC-MS experiments. GC-MS instruments produce both electron impact mass spectra (EI-MS) and retention index (RI) data for as few as one, to as many as hundreds of different compounds. Matching the measured EI-MS, RI or EI-MS + RI data to experimentally collected EI-MS and/or RI reference libraries allows facile compound identification. However, the number of available experimental RI and EI-MS reference spectra, especially for metabolomics or exposomics-related studies, is disappointingly small. Using machine learning to accurately predict the EI-MS spectra and/or RIs for millions of metabolomics and/or exposomics-relevant compounds could (partially) solve this spectral matching problem. This computational approach to compound identification is called in silico metabolomics. GCMS-ID brings this concept of in silico metabolomics closer to reality by intelligently integrating two of our previously published webservers: CFM-EI and RIpred. CFM-EI is an EI-MS spectral prediction webserver, and RIpred is a Kovats RI prediction webserver. We have found that GCMS-ID can accurately identify compounds from experimental RI, EI-MS or RI + EI-MS data through matching to its own large library of >1 million predicted RI/EI-MS values generated for metabolomics/exposomics-relevant compounds. GCMS-ID can also predict the RI or EI-MS spectrum from a user-submitted structure or annotate a user-submitted EI-MS spectrum. GCMS-ID is freely available at https://gcms-id.ca/.
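
As a rough illustration of matching combined RI + EI-MS data against a reference library, the following Python sketch filters candidates by a retention-index window and ranks the survivors by cosine similarity of their spectra. This is not GCMS-ID's actual scoring method, which the abstract does not describe; the function names, the RI tolerance, and the toy spectra are all assumptions.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_spec, query_ri, library, ri_tol=50.0):
    """Keep library entries whose RI lies within ri_tol of the query,
    then sort them by spectral similarity, best first."""
    hits = []
    for name, (ref_spec, ref_ri) in library.items():
        if abs(ref_ri - query_ri) > ri_tol:
            continue  # outside the assumed RI window
        hits.append((cosine_similarity(query_spec, ref_spec), name))
    return sorted(hits, reverse=True)


# Hypothetical library: name -> (spectrum, retention index).
toy_library = {
    "compound_A": ({43: 100.0, 57: 80.0, 71: 40.0}, 1000.0),
    "compound_B": ({43: 100.0, 57: 80.0, 71: 40.0}, 1200.0),
    "compound_C": ({43: 100.0, 57: 10.0, 71: 5.0}, 1010.0),
}

query = {43: 100.0, 57: 80.0, 71: 40.0}
for score, name in rank_candidates(query, 1005.0, toy_library):
    print(f"{name}: {score:.3f}")
```

In this toy case compound_B has an identical spectrum but is excluded by the RI window, which is the kind of disambiguation that combining RI with EI-MS data is meant to provide.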


Subjects
Gas Chromatography-Mass Spectrometry; Internet; Metabolomics; Software; Gas Chromatography-Mass Spectrometry/methods; Metabolomics/methods; Machine Learning
3.
Life (Basel) ; 14(5)2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38792560

ABSTRACT

We show that the nucleic acid bases adenine, cytosine, guanine, thymine, and uracil, as well as 2,6-diaminopurine, and the "core" nucleic acid bases purine and pyrimidine, are stable for more than one year in concentrated sulfuric acid at room temperature and at acid concentrations relevant for Venus clouds (81% w/w to 98% w/w acid, the rest water). This work builds on our initial stability studies and is the first ever to test the reactivity and structural integrity of organic molecules subjected to extended incubation in concentrated sulfuric acid. The one-year-long stability of nucleic acid bases supports the notion that the Venus cloud environment, composed of concentrated sulfuric acid, may be able to support complex organic chemicals for extended periods of time.

4.
Metabolites ; 14(5)2024 May 19.
Article in English | MEDLINE | ID: mdl-38786767

ABSTRACT

NMR is widely considered the gold standard for organic compound structure determination. As such, NMR is routinely used in organic compound identification, drug metabolite characterization, natural product discovery, and the deconvolution of metabolite mixtures in biofluids (metabolomics and exposomics). In many cases, compound identification by NMR is achieved by matching measured NMR spectra to experimentally collected NMR spectral reference libraries. Unfortunately, the number of available experimental NMR reference spectra, especially for metabolomics, medical diagnostics, or drug-related studies, is quite small. This experimental gap could be filled by predicting NMR chemical shifts for known compounds using computational methods such as machine learning (ML). Here, we describe how a deep learning algorithm that is trained on a high-quality, "solvent-aware" experimental dataset can be used to predict 1H chemical shifts more accurately than any other known method. The new program, called PROSPRE (PROton Shift PREdictor) can accurately (mean absolute error of <0.10 ppm) predict 1H chemical shifts in water (at neutral pH), chloroform, dimethyl sulfoxide, and methanol from a user-submitted chemical structure. PROSPRE (pronounced "prosper") has also been used to predict 1H chemical shifts for >600,000 molecules in many popular metabolomic, drug, and natural product databases.

6.
J Agric Food Chem ; 72(25): 14099-14113, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38181219

ABSTRACT

Cannabis is widely used for medicinal and recreational purposes. As a result, there is increased interest in its chemical components and their physiological effects. However, current information on cannabis chemistry is often outdated or scattered across many books and journals. To address this issue, we used modern metabolomics and bioinformatics techniques to compile a comprehensive list of >6000 chemical constituents in commercial cannabis. The metabolomics methods included a combination of high- and low-resolution liquid chromatography-mass spectrometry (MS), gas chromatography-MS, and inductively coupled plasma-MS. The bioinformatics methods included computer-aided text mining and computational genome-scale metabolic inference. This information, along with detailed compound descriptions, physicochemical data, known physiological effects, protein targets, and referential compound spectra, has been made available through a publicly accessible database called the Cannabis Compound Database (https://cannabisdatabase.ca). Such a centralized, open-access resource should prove to be quite useful for the cannabis community.


Subjects
Cannabis; Cannabis/chemistry; Metabolomics; Gas Chromatography-Mass Spectrometry; Plant Extracts/chemistry; Mass Spectrometry; Computational Biology
7.
Nucleic Acids Res ; 52(D1): D654-D662, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37962386

ABSTRACT

PathBank (https://pathbank.org) and its predecessor database, the Small Molecule Pathway Database (SMPDB), have been providing comprehensive metabolite pathway information for the metabolomics community since 2010. Over the past 14 years, these pathway databases have grown and evolved significantly to meet the needs of the metabolomics community and respond to continuing changes in computing technology. This year's update, PathBank 2.0, brings a number of important improvements and upgrades that should make the database more useful and more appealing to a larger cross-section of users. In particular, these improvements include: (i) a significant increase in the number of primary or canonical pathways (from 1720 to 6951); (ii) a massive increase in the total number of pathways (from 110 234 to 605 359); (iii) significant improvements to the quality of pathway diagrams and pathway descriptions; (iv) a strong emphasis on drug metabolism and drug mechanism pathways; (v) making most pathway images more slide-compatible and manuscript-compatible; (vi) adding tools to support better pathway filtering and selecting through a more complete pathway taxonomy; (vii) adding pathway analysis tools for visualizing and calculating pathway enrichment. Many other minor improvements and updates to the content, the interface and general performance of the PathBank website have also been made. Overall, we believe these upgrades and updates should greatly enhance PathBank's ease of use and its potential applications for interpreting metabolomics data.


Subjects
Databases, Genetic; Metabolic Networks and Pathways; Metabolomics; Metabolic Networks and Pathways/genetics; Metabolome; Metabolomics/methods; Internet
8.
Nucleic Acids Res ; 52(D1): D1265-D1275, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37953279

ABSTRACT

First released in 2006, DrugBank (https://go.drugbank.com) has grown to become the 'gold standard' knowledge resource for drug, drug-target and related pharmaceutical information. DrugBank is widely used across many diverse biomedical research and clinical applications, and averages more than 30 million views/year. Since its last update in 2018, we have been actively enhancing the quantity and quality of the drug data in this knowledgebase. In this latest release (DrugBank 6.0), the number of FDA approved drugs has grown from 2646 to 4563 (a 72% increase), the number of investigational drugs has grown from 3394 to 6231 (a 38% increase), the number of drug-drug interactions increased from 365 984 to 1 413 413 (a 300% increase), and the number of drug-food interactions expanded from 1195 to 2475 (a 200% increase). In addition to this notable expansion in database size, we have added thousands of new, colorful, richly annotated pathways depicting drug mechanisms and drug metabolism. Likewise, existing datasets have been significantly improved and expanded, by adding more information on drug indications, drug-drug interactions, drug-food interactions and many other relevant data types for 11 891 drugs. We have also added experimental and predicted MS/MS spectra, 1D/2D-NMR spectra, CCS (collision cross section), RT (retention time) and RI (retention index) data for 9464 of DrugBank's 11 710 small molecule drugs. These and other improvements should make DrugBank 6.0 even more useful to a much wider research audience ranging from medicinal chemists to metabolomics specialists to pharmacologists.


Subjects
Knowledge Bases; Metabolomics; Tandem Mass Spectrometry; Databases, Factual; Food-Drug Interactions
9.
Anal Chem ; 95(50): 18326-18334, 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38048435

ABSTRACT

The market for illicit drugs has been reshaped by the emergence of more than 1100 new psychoactive substances (NPS) over the past decade, posing a major challenge to the forensic and toxicological laboratories tasked with detecting and identifying them. Tandem mass spectrometry (MS/MS) is the primary method used to screen for NPS within seized materials or biological samples. Most contemporary workflows require labor-intensive and expensive MS/MS reference standards, which may not be available for recently emerged NPS on the illicit market. Here, we present NPS-MS, a deep learning method capable of accurately predicting the MS/MS spectra of known and hypothesized NPS from their chemical structures alone. NPS-MS is trained by transfer learning from a generic MS/MS prediction model on a large data set of MS/MS spectra. We show that this approach enables a more accurate identification of NPS from experimentally acquired MS/MS spectra than any existing method. We demonstrate the application of NPS-MS to identify a novel derivative of phencyclidine (PCP) within an unknown powder seized in Denmark without the use of any reference standards. We anticipate that NPS-MS will allow forensic laboratories to identify more rapidly both known and newly emerging NPS. NPS-MS is available as a web server at https://nps-ms.ca/, which provides MS/MS spectra prediction capabilities for given NPS compounds. Additionally, it offers MS/MS spectra identification against a vast database comprising approximately 8.7 million predicted NPS compounds from DarkNPS and 24.5 million predicted ESI-QToF-MS/MS spectra for these compounds.


Subjects
Deep Learning; Illicit Drugs; Tandem Mass Spectrometry/methods; Psychotropic Drugs/analysis; Illicit Drugs/analysis; Spectrometry, Mass, Electrospray Ionization
10.
J Nat Prod ; 86(11): 2554-2561, 2023 Nov 24.
Article in English | MEDLINE | ID: mdl-37935005

ABSTRACT

Nuclear magnetic resonance (NMR) data are rarely deposited in open databases, leading to loss of critical scientific knowledge. Existing data reporting methods (images, tables, lists of values) contain less information than raw data and are poorly standardized. Together, these issues limit FAIR (findable, accessible, interoperable, reusable) access to these data, which in turn creates barriers for compound dereplication and the development of new data-driven discovery tools. Existing NMR databases either are not designed for natural products data or employ complex deposition interfaces that disincentivize deposition. Journals, including the Journal of Natural Products (JNP), are now requiring data submission as part of the publication process, creating the need for a streamlined, user-friendly mechanism to deposit and distribute NMR data.


Subjects
Biological Products; Databases, Factual; Magnetic Resonance Spectroscopy
11.
J Chromatogr A ; 1705: 464176, 2023 Aug 30.
Article in English | MEDLINE | ID: mdl-37413909

ABSTRACT

We describe a freely available web server called Retention Index Predictor (RIpred) (https://ripred.ca) that rapidly and accurately predicts Gas Chromatographic Kováts Retention Indices (RI) using SMILES strings as chemical structure input. RIpred performs RI prediction for three different stationary phases (semi-standard non-polar (SSNP), standard non-polar (SNP), and standard polar (SP)) for both derivatized (trimethylsilyl (TMS) and tert-butyldimethylsilyl (TBDMS) derivatized) and underivatized (base compound) forms of GC-amenable structures. RIpred was developed to address the need for freely available, fast, highly accurate RI predictions for a wide range of derivatized and underivatized chemicals for all common GC stationary phases. RIpred was trained using a Graph Neural Network (GNN) that used compound structures, their extracted features (mostly atom-level features) and the GC-RI data from the National Institute of Standards and Technology databases (NIST 17 and NIST 20). We curated this NIST 17 and NIST 20 GC-RI data, which is available for all three stationary phases, to create appropriate inputs (molecular graphs in this case) needed to enhance our model performance. The performance of different RIpred predictive models was evaluated using 10-fold cross validation (CV). The best performing RIpred models were identified and, when tested on hold-out test sets from all stationary phases, achieved a Mean Absolute Error (MAE) of <73 RI units (SSNP: 16.5-29.5, SNP: 38.5-45.9, SP: 46.52-72.53). The Mean Absolute Percentage Error (MAPE) values of these models were typically within 3% (SSNP: 0.78-1.62%, SNP: 1.87-2.88%, SP: 2.34-4.05%). When compared to the best performing model by Qu et al., 2021, RIpred performed similarly (MAE of 16.57 RI units [RIpred] vs. 16.84 RI units [Qu et al., 2021 predictor] for derivatized compounds). RIpred also includes ∼5 million predicted RI values for all GC-amenable compounds (∼57,000) in the Human Metabolome Database HMDB 5.0 (Wishart et al., 2022).
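
For readers unfamiliar with the two error metrics quoted above, the following Python sketch shows the standard definitions of MAE and MAPE applied to retention indices. The observed and predicted values are made-up numbers for illustration, not RIpred results, and the function names are our own.

```python
def mean_absolute_error(observed, predicted):
    """MAE: average of |observed - predicted|, here in RI units."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_absolute_percentage_error(observed, predicted):
    """MAPE: average of |observed - predicted| / |observed|, in percent."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical retention indices (illustrative values only).
observed_ri = [1000.0, 1500.0, 2000.0]
predicted_ri = [1010.0, 1470.0, 2040.0]

print(f"MAE:  {mean_absolute_error(observed_ri, predicted_ri):.2f} RI units")
print(f"MAPE: {mean_absolute_percentage_error(observed_ri, predicted_ri):.2f} %")
```

Because MAPE divides each error by the observed value, the same absolute error counts for less at higher retention indices, which is why the abstract reports both metrics.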


Subjects
Metabolome; Neural Networks, Computer; Humans; Chromatography, Gas/methods; Databases, Factual
12.
Proc Natl Acad Sci U S A ; 120(25): e2220007120, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37307485

ABSTRACT

What constitutes a habitable planet is a frontier to be explored and requires pushing the boundaries of our terracentric viewpoint for what we deem to be a habitable environment. Despite Venus' 700 K surface temperature being too hot for any plausible solvent and most organic covalent chemistry, Venus' cloud-filled atmosphere layers at 48 to 60 km above the surface hold the main requirements for life: suitable temperatures for covalent bonds; an energy source (sunlight); and a liquid solvent. Yet, the Venus clouds are widely thought to be incapable of supporting life because the droplets are composed of concentrated liquid sulfuric acid, an aggressive solvent that is assumed to rapidly destroy most biochemicals of life on Earth. Recent work, however, demonstrates that a rich organic chemistry can evolve from simple precursor molecules seeded into concentrated sulfuric acid, a result that is corroborated by domain knowledge in industry that such chemistry leads to complex molecules, including aromatics. We aim to expand the set of molecules known to be stable in concentrated sulfuric acid. Here, we show that nucleic acid bases adenine, cytosine, guanine, thymine, and uracil, as well as 2,6-diaminopurine and the "core" nucleic acid bases purine and pyrimidine, are stable in sulfuric acid in the Venus cloud temperature and sulfuric acid concentration range, using UV spectroscopy and combinations of 1D and 2D 1H, 13C, and 15N NMR spectroscopy. The stability of nucleic acid bases in concentrated sulfuric acid advances the idea that chemistry to support life may exist in the Venus cloud particle environment.


Subjects
Bivalvia; Venus; Adenine; Aggression; Sulfuric Acids
13.
Magn Reson Chem ; 61(12): 681-704, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37265034

ABSTRACT

Nuclear magnetic resonance (NMR) spectral analysis of biofluids can be a time-consuming process, requiring the expertise of a trained operator. With NMR becoming increasingly popular in the field of metabolomics, there is a growing need to change this paradigm and to automate the process. Here we introduce MagMet, an online web server that automates the processing and quantification of 1D 1H NMR spectra from biofluids, specifically human serum/plasma metabolites, including those associated with inborn errors of metabolism (IEM). MagMet uses a highly efficient data processing procedure that performs automatic Fourier Transformation, phase correction, baseline optimization, chemical shift referencing, water signal removal, and peak picking/peak alignment. MagMet then uses the peak positions, linewidth information, and J-couplings from its own specially prepared standard metabolite reference spectral NMR library of 85 serum/plasma compounds to identify and quantify compounds from experimentally acquired NMR spectra of serum/plasma. MagMet employs linewidth adjustment for more consistent quantification of metabolites from higher field instruments and incorporates a highly efficient data processing procedure for more rapid and accurate detection and quantification of metabolites. This optimized algorithm allows the MagMet webserver to quickly detect and quantify 58 serum/plasma metabolites in 2.6 min per spectrum (when processing a dataset of 50-100 spectra). MagMet's performance was also assessed using spectra collected from defined mixtures (simulating other biofluids), with >100 previously measured plasma spectra, and from spiked serum/plasma samples simulating known IEMs. In all cases, MagMet performed with precision and accuracy matching the performance of human spectral profiling experts. MagMet is available at http://magmet.ca.


Subjects
Magnetic Resonance Imaging; Metabolomics; Humans; Magnetic Resonance Spectroscopy/methods; Metabolomics/methods; Serum; Algorithms
14.
Anal Chem ; 95(23): 8998-9005, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37262385

ABSTRACT

Infrared ion spectroscopy (IRIS) continues to see increasing use as an analytical tool for small-molecule identification in conjunction with mass spectrometry (MS). The IR spectrum of an m/z selected population of ions constitutes a unique fingerprint that is specific to the molecular structure. However, direct translation of an IR spectrum to a molecular structure remains challenging, as reference libraries of IR spectra of molecular ions largely do not exist. Quantum-chemically computed spectra can reliably be used as reference, but the challenge of selecting the candidate structures remains. Here, we introduce an in silico library of vibrational spectra of common MS adducts of over 4500 compounds found in the Human Metabolome Database. In total, the library currently contains more than 75,000 spectra computed at the DFT level that can be queried with an experimental IR spectrum. Moreover, we introduce a database of 189 experimental IRIS spectra, which is employed to validate the automated spectral matching routines. This demonstrates that 75% of the metabolites in the experimental data set are correctly identified, based solely on their exact m/z and IRIS spectrum. Additionally, we demonstrate an approach for specifically identifying substructures by performing a search without m/z constraints to find structural analogues. Such an unsupervised search paves the way toward the de novo identification of unknowns that are absent in spectral libraries. We apply the in silico spectral library to identify an unknown in a plasma sample as 3-hydroxyhexanoic acid, highlighting the potential of the method.


Subjects
Metabolome; Metabolomics; Humans; Metabolomics/methods; Mass Spectrometry/methods; Gene Library; Ions
15.
Nucleic Acids Res ; 51(W1): W443-W450, 2023 Jul 05.
Article in English | MEDLINE | ID: mdl-37194694

ABSTRACT

PHASTEST (PHAge Search Tool with Enhanced Sequence Translation) is the successor to the PHAST and PHASTER prophage finding web servers. PHASTEST is designed to support the rapid identification, annotation and visualization of prophage sequences within bacterial genomes and plasmids. PHASTEST also supports rapid annotation and interactive visualization of all other genes (protein coding regions, tRNA/tmRNA/rRNA sequences) in bacterial genomes. Given that bacterial genome sequencing has become so routine, the need for fast tools to comprehensively annotate bacterial genomes has become progressively more important. PHASTEST not only offers faster and more accurate prophage annotations than its predecessors, it also provides more complete whole genome annotations and much improved genome visualization capabilities. In standardized tests, we found that PHASTEST is 31% faster and 2-3% more accurate in prophage identification than PHASTER. Specifically, PHASTEST can process a typical bacterial genome in 3.2 min (raw sequence) or in 1.3 min when given a pre-annotated GenBank file. Improvements in PHASTEST's ability to annotate bacterial genomes now make it a particularly powerful tool for whole genome annotation. In addition, PHASTEST now offers a much more modern and responsive visualization interface that allows users to generate, edit, annotate and interactively visualize (via zooming, rotating, dragging, panning, resetting), colourful, publication quality genome maps. PHASTEST continues to offer popular options such as an API for programmatic queries, a Docker image for local installations, support for multiple (metagenomic) queries and the ability to perform automated look-ups against thousands of previously PHAST-annotated bacterial genomes. PHASTEST is available online at https://phastest.ca.


Subjects
Databases, Nucleic Acid; Prophages; Search Engine; Software; Genome, Bacterial; Molecular Sequence Annotation; Plasmids; Prophages/genetics
16.
Nucleic Acids Res ; 51(W1): W459-W467, 2023 Jul 05.
Article in English | MEDLINE | ID: mdl-37099365

ABSTRACT

PlasMapper 3.0 is a web server that allows users to generate, edit, annotate and interactively visualize publication quality plasmid maps. Plasmid maps are used to plan, design, share and publish critical information about gene cloning experiments. PlasMapper 3.0 is the successor to PlasMapper 2.0 and offers many features found only in commercial plasmid mapping/editing packages. PlasMapper 3.0 allows users to paste or upload plasmid sequences as input or to upload existing plasmid maps from its large database of >2000 pre-annotated plasmids (PlasMapDB). This database can be searched by plasmid names, sequence features, restriction sites, preferred host organisms, and sequence length. PlasMapper 3.0 also supports the annotation of new or never-before-seen plasmids using its own feature database that contains common promoters, terminators, regulatory sequences, replication origins, selectable markers and other features found in most cloning plasmids. PlasMapper 3.0 has several interactive sequence editors/viewers that allow users to select and view plasmid regions, insert genes, modify restriction sites or perform codon optimization. The graphics for PlasMapper 3.0 have also been substantially upgraded. It now offers an interactive, full-color plasmid viewer/editor that allows users to zoom, rotate, re-color, linearize, circularize, edit annotated features and modify plasmid images or labels to improve the esthetic qualities of their plasmid map and textual displays. All the plasmid images and textual displays are downloadable in multiple formats. PlasMapper 3.0 is available online at https://plasmapper.ca.


Subjects
Software; User-Computer Interface; Plasmids/genetics; Computers; Base Sequence; Internet
17.
Nucleic Acids Res ; 51(D1): D1220-D1229, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36305829

ABSTRACT

The Chemical Functional Ontology (ChemFOnt), located at https://www.chemfont.ca, is a hierarchical, OWL-compatible ontology describing the functions and actions of >341 000 biologically important chemicals. These include primary metabolites, secondary metabolites, natural products, food chemicals, synthetic food additives, drugs, herbicides, pesticides and environmental chemicals. ChemFOnt is a FAIR-compliant resource intended to bring the same rigor, standardization and formal structure to the terms and terminology used in biochemistry, food chemistry and environmental chemistry as the gene ontology (GO) has brought to molecular biology. ChemFOnt is available as both a freely accessible, web-enabled database and a downloadable Web Ontology Language (OWL) file. Users may download and deploy ChemFOnt within their own chemical databases or integrate ChemFOnt into their own analytical software to generate machine readable relationships that can be used to make new inferences, enrich their omics data sets or make new, non-obvious connections between chemicals and their direct or indirect effects. The web version of the ChemFOnt database has been designed to be easy to search, browse and navigate. Currently ChemFOnt contains data on 341 627 chemicals, including 515 332 terms or definitions. The functional hierarchy for ChemFOnt consists of four functional 'aspects', 12 functional super-categories and a total of 173 705 functional terms. In addition, each of the chemicals is classified into 4825 structure-based chemical classes. ChemFOnt currently contains 3.9 million protein-chemical relationships and ∼10.3 million chemical-functional relationships. The long-term goal for ChemFOnt is for it to be adopted by databases and software tools used by the general chemistry community as well as the metabolomics, exposomics, metagenomics, genomics and proteomics communities.


Subjects
Databases, Chemical; Software; Databases, Factual; Gene Ontology; Genomics; Proteomics
18.
Nucleic Acids Res ; 51(D1): D611-D620, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36215042

ABSTRACT

The Human Microbial Metabolome Database (MiMeDB) (https://mimedb.org) is a comprehensive, multi-omic, microbiome resource that connects: (i) microbes to microbial genomes; (ii) microbial genomes to microbial metabolites; (iii) microbial metabolites to the human exposome and (iv) all of these 'omes' to human health. MiMeDB was established to consolidate the growing body of data connecting the human microbiome and the chemicals it produces to both health and disease. MiMeDB contains detailed taxonomic, microbiological and body-site location data on most known human microbes (bacteria and fungi). This microbial data is linked to extensive genomic and proteomic sequence data that is closely coupled to colourful interactive chromosomal maps. The database also houses detailed information about all the known metabolites generated by these microbes, their structural, chemical and spectral properties, the reactions and enzymes responsible for these metabolites and the primary exposome sources (food, drug, cosmetic, pollutant, etc.) that ultimately lead to the observed microbial metabolites in humans. Additional, extensively referenced data about the known or presumptive health effects, measured biosample concentrations and human protein targets for these compounds is provided. All of this information is housed in a richly annotated, highly interactive, visually pleasing database that has been designed to be easy to search, easy to browse and easy to navigate. Currently MiMeDB contains data on 626 health effects or bioactivities, 1904 microbes, 3112 references, 22 054 reactions, 24 254 metabolites or exposure chemicals, 648 861 MS and NMR spectra, 6.4 million genes and 7.6 billion DNA bases. We believe that MiMeDB represents the kind of integrated, multi-omic or systems biology database that is needed to enable comprehensive multi-omic integration.


Subjects
Metabolomics; Proteomics; Humans; Metabolome/genetics; Databases, Factual; Data Management
19.
Nucleic Acids Res ; 50(W1): W165-W174, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35610037

ABSTRACT

The CFM-ID 4.0 web server (https://cfmid.wishartlab.com) is an online tool for predicting, annotating and interpreting tandem mass (MS/MS) spectra of small molecules. It is specifically designed to assist researchers pursuing studies in metabolomics, exposomics and analytical chemistry. More specifically, CFM-ID 4.0 supports the: 1) prediction of electrospray ionization quadrupole time-of-flight tandem mass spectra (ESI-QTOF-MS/MS) for small molecules over multiple collision energies (10 eV, 20 eV, and 40 eV); 2) annotation of ESI-QTOF-MS/MS spectra given the structure of the compound; and 3) identification of a small molecule that generated a given ESI-QTOF-MS/MS spectrum at one or more collision energies. The CFM-ID 4.0 web server makes use of a substantially improved MS fragmentation algorithm, a much larger database of experimental and in silico predicted MS/MS spectra and improved scoring methods to offer more accurate MS/MS spectral prediction and MS/MS-based compound identification. Compared to earlier versions of CFM-ID, this new version has an MS/MS spectral prediction performance that is ∼22% better and a compound identification accuracy that is ∼35% better on a standard (CASMI 2016) testing dataset. CFM-ID 4.0 also features a neutral loss function that allows users to identify similar or substituent compounds where no match can be found using CFM-ID's regular MS/MS-to-compound identification utility. Finally, the CFM-ID 4.0 web server now offers a much more refined user interface that is easier to use, supports molecular formula identification (from MS/MS data), provides more interactively viewable data (including proposed fragment ion structures) and displays MS mirror plots for comparing predicted with observed MS/MS spectra. These improvements should make CFM-ID 4.0 much more useful to the community and should make small molecule identification much easier, faster, and more accurate.
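
The neutral loss idea mentioned above can be sketched in a few lines of Python: a neutral loss is the difference between the precursor m/z and a fragment m/z, and two compounds that share losses may be structurally related even when their fragment peaks differ. This is only a conceptual sketch and not CFM-ID's neutral loss function, whose internals the abstract does not describe; the function names, rounding, tolerance, and m/z values are assumptions.

```python
def neutral_losses(precursor_mz, fragment_mzs, decimals=2):
    """Return the sorted, de-duplicated neutral losses
    (precursor m/z minus fragment m/z) for fragments below the precursor."""
    return sorted({
        round(precursor_mz - mz, decimals)
        for mz in fragment_mzs
        if mz < precursor_mz
    })


def shared_neutral_losses(losses_a, losses_b, tol=0.01):
    """Losses in losses_a that have a counterpart in losses_b within tol."""
    return [
        loss for loss in losses_a
        if any(abs(loss - other) <= tol for other in losses_b)
    ]


# Hypothetical spectra (illustrative m/z values only).
losses_query = neutral_losses(180.06, [162.05, 136.07, 180.06])
losses_reference = neutral_losses(194.08, [176.07, 150.06])

print(losses_query)       # neutral losses of the query compound
print(losses_reference)   # neutral losses of the reference compound
print(shared_neutral_losses(losses_query, losses_reference))
```

Here the two toy spectra share no fragment peaks, but both show a loss of about 18.01 (consistent with water), which is the kind of evidence a neutral-loss search can use to suggest related compounds when no direct spectral match exists.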


Subjects
Algorithms; Metabolomics; Software; Tandem Mass Spectrometry; Computers; Metabolomics/methods; Spectrometry, Mass, Electrospray Ionization; Tandem Mass Spectrometry/methods; Internet
20.
Nucleic Acids Res ; 50(W1): W115-W123, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35536252

ABSTRACT

BioTransformer 3.0 (https://biotransformer.ca) is a freely available web server that supports accurate, rapid and comprehensive in silico metabolism prediction. It combines machine learning approaches with a rule-based system to predict small-molecule metabolism in human tissues, the human gut as well as the external environment (soil and water microbiota). Simply stated, BioTransformer takes a molecular structure as input (SMILES or SDF) and outputs an interactively sortable table of the predicted metabolites or transformation products (SMILES, PNG images) along with the enzymes that are predicted to be responsible for those reactions and richly annotated downloadable files (CSV and JSON). The entire process typically takes less than a minute. Previous versions of BioTransformer focused exclusively on predicting the metabolism of xenobiotics (such as plant natural products, drugs, cosmetics and other synthetic compounds) using a limited number of pre-defined steps and somewhat limited rule-based methods. BioTransformer 3.0 uses much more sophisticated methods and incorporates new databases, new constraints and new prediction modules to not only more accurately predict the metabolic transformation products of exogenous xenobiotics but also the transformation products of endogenous metabolites, such as amino acids, peptides, carbohydrates, organic acids, and lipids. BioTransformer 3.0 can also support customized sequential combinations of these transformations along with multiple iterations to simulate multi-step human biotransformation events. Performance tests indicate that BioTransformer 3.0 is 40-50% more accurate, far less prone to combinatorial 'explosions' and much more comprehensive in terms of metabolite coverage/capabilities than previous versions of BioTransformer.


Subjects
Computational Biology; Xenobiotics; Humans; Computational Biology/methods; Biotransformation; Databases, Factual; Molecular Structure; Xenobiotics/metabolism