Results 1 - 20 of 92
1.
Proc Natl Acad Sci U S A ; 120(25): e2219373120, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37319116

ABSTRACT

Fungus-growing ants depend on a fungal mutualist that can fall prey to fungal pathogens. This mutualist is cultivated by these ants in structures called fungus gardens. Ants exhibit weeding behaviors that keep their fungus gardens healthy by physically removing compromised pieces. However, how ants detect diseases of their fungus gardens is unknown. Here, we applied the logic of Koch's postulates using environmental fungal community gene sequencing, fungal isolation, and laboratory infection experiments to establish that Trichoderma spp. can act as previously unrecognized pathogens of Trachymyrmex septentrionalis fungus gardens. Our environmental data showed that Trichoderma are the most abundant noncultivar fungi in wild T. septentrionalis fungus gardens. We further determined that metabolites produced by Trichoderma induce an ant weeding response that mirrors their response to live Trichoderma. Combining ant behavioral experiments with bioactivity-guided fractionation and statistical prioritization of metabolites in Trichoderma extracts demonstrated that T. septentrionalis ants weed in response to peptaibols, a specific class of secondary metabolites known to be produced by Trichoderma fungi. Similar assays conducted using purified peptaibols, including the two previously undescribed peptaibols trichokindins VIII and IX, suggested that weeding is likely induced by peptaibols as a class rather than by a single peptaibol metabolite. In addition to their presence in laboratory experiments, we detected peptaibols in wild fungus gardens. Our combination of environmental data and laboratory infection experiments strongly support that peptaibols act as chemical cues of Trichoderma pathogenesis in T. septentrionalis fungus gardens.


Subject(s)
Ants; Laboratory Infection; Trichoderma; Animals; Ants/physiology; Gardens; Cues; Symbiosis; Peptaibols
2.
Nucleic Acids Res ; 51(D1): D603-D610, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399496

ABSTRACT

With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.


Subject(s)
Genome; Genomics; Multigene Family; Biosynthetic Pathways/genetics
3.
Anal Chem ; 96(15): 5798-5806, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38564584

ABSTRACT

Untargeted metabolomics promises comprehensive characterization of small molecules in biological samples. However, the field is hampered by low annotation rates and abstract spectral data. Despite recent advances in computational metabolomics, manual annotations and manual confirmation of in-silico annotations remain important in the field. Here, exploratory data analysis methods for mass spectral data provide overviews, prioritization, and structural hypothesis starting points to researchers facing large quantities of spectral data. In this research, we propose a fluid means of dealing with mass spectral data using specXplore, an interactive Python dashboard providing interactive and complementary visualizations facilitating mass spectral similarity matrix exploration. Specifically, specXplore provides a two-dimensional t-distributed stochastic neighbor embedding (t-SNE) as a jumping board for local connectivity exploration using complementary interactive visualizations in the form of partial network drawings, similarity heatmaps, and fragmentation overview maps. SpecXplore makes use of state-of-the-art ms2deepscore pairwise spectral similarities as a quantitative backbone while allowing fast changes of threshold and connectivity limitation settings, providing flexibility in adjusting settings to suit the localized node environment being explored. We believe that specXplore can become an integral part of mass spectral data exploration efforts and assist users in the generation of structural hypotheses for compounds of interest.
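
The "threshold and connectivity limitation settings" mentioned in this abstract can be pictured with a small sketch. It is not specXplore's code: the function name, parameter names, and the toy similarity matrix are hypothetical, and the only assumption is that a symmetric pairwise similarity matrix is already available.

```python
def local_neighbours(sim, node, threshold=0.7, max_links=3):
    """Return up to max_links neighbours of `node` whose similarity is at
    least `threshold`, ordered from most to least similar.

    `sim` is a square list-of-lists similarity matrix (hypothetical input).
    """
    candidates = [
        (score, other)
        for other, score in enumerate(sim[node])
        if other != node and score >= threshold
    ]
    # Highest similarity first; ties broken by node index for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in candidates[:max_links]]


# Toy 4-spectrum similarity matrix (made-up values, for illustration only).
similarities = [
    [1.00, 0.92, 0.75, 0.10],
    [0.92, 1.00, 0.60, 0.20],
    [0.75, 0.60, 1.00, 0.85],
    [0.10, 0.20, 0.85, 1.00],
]

print(local_neighbours(similarities, 0))                 # [1, 2]
print(local_neighbours(similarities, 0, threshold=0.8))  # [1]
```

Raising the threshold or lowering `max_links` thins the drawn neighbourhood, which is the kind of local adjustment the abstract describes.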

4.
Metabolomics ; 20(3): 62, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796627

ABSTRACT

INTRODUCTION: The chemical classification of Cannabis is typically confined to the cannabinoid content, whilst Cannabis encompasses diverse chemical classes that vary in abundance among all its varieties. Hence, neglecting other chemical classes within Cannabis strains results in a restricted and biased comprehension of elements that may contribute to chemical intricacy and the resultant medicinal qualities of the plant. OBJECTIVES: Thus, herein, we report a computational metabolomics study to elucidate the Cannabis metabolic map beyond the cannabinoids. METHODS: Mass spectrometry-based computational tools were used to mine and evaluate the methanolic leaf and flower extracts of two Cannabis cultivars: Amnesia haze (AMNH) and Royal dutch cheese (RDC). RESULTS: The results revealed the presence of different chemical compound classes including cannabinoids, but extending it to flavonoids and phospholipids at varying distributions across the cultivar plant tissues, where the phenylpropanoid superclass was more abundant in the leaves than in the flowers. Therefore, the two cultivars were differentiated based on the overall chemical content of their plant tissues where AMNH was observed to be more dominant in the flavonoid content while RDC was more dominant in the lipid-like molecules. Additionally, in silico molecular docking studies in combination with biological assay studies indicated the potentially differing anti-cancer properties of the two cultivars resulting from the elucidated chemical profiles. CONCLUSION: These findings highlight distinctive chemical profiles beyond cannabinoids in Cannabis strains. This novel mapping of the metabolomic landscape of Cannabis provides actionable insights into plant biochemistry and justifies selecting certain varieties for medicinal use.


Subject(s)
Cannabis; Metabolomics; Plant Leaves; Cannabis/chemistry; Cannabis/metabolism; Metabolomics/methods; Plant Leaves/metabolism; Plant Leaves/chemistry; Flowers/metabolism; Flowers/chemistry; Plant Extracts/metabolism; Plant Extracts/chemistry; Plant Extracts/pharmacology; Cannabinoids/metabolism; Cannabinoids/analysis; Molecular Docking Simulation; Flavonoids/metabolism; Flavonoids/analysis; Mass Spectrometry/methods
5.
Nat Chem Biol ; 18(3): 295-304, 2022 03.
Article in English | MEDLINE | ID: mdl-34969972

ABSTRACT

Major advances in genome sequencing and large-scale biosynthetic gene cluster (BGC) analysis have prompted an age of natural product discovery driven by genome mining. Still, connecting molecules to their cognate BGCs is a substantial bottleneck for this approach. We have developed a mass-spectrometry-based parallel stable isotope labeling platform, termed IsoAnalyst, which assists in associating metabolite stable isotope labeling patterns with BGC structure prediction to connect natural products to their corresponding BGCs. Here we show that IsoAnalyst can quickly associate both known metabolites and unknown analytes with BGCs to elucidate the complex chemical phenotypes of these biosynthetic systems. We validate this approach for a range of compound classes, using both the type strain Saccharopolyspora erythraea and an environmentally isolated Micromonospora sp. We further demonstrate the utility of this tool with the discovery of lobosamide D, a new and structurally unique member of the family of lobosamide macrolactams.


Subject(s)
Biological Products; Micromonospora; Biosynthetic Pathways/genetics; Isotope Labeling; Multigene Family
6.
PLoS Comput Biol ; 19(2): e1010462, 2023 02.
Article in English | MEDLINE | ID: mdl-36758069

ABSTRACT

Microbial specialised metabolism is full of valuable natural products that are applied clinically, agriculturally, and industrially. The genes that encode their biosynthesis are often physically clustered on the genome in biosynthetic gene clusters (BGCs). Many BGCs consist of multiple groups of co-evolving genes called sub-clusters that are responsible for the biosynthesis of a specific chemical moiety in a natural product. Sub-clusters therefore provide an important link between the structures of a natural product and its BGC, which can be leveraged for predicting natural product structures from sequence, as well as for linking chemical structures and metabolomics-derived mass features to BGCs. While some initial computational methodologies have been devised for sub-cluster detection, current approaches are not scalable, have only been run on small and outdated datasets, or produce an impractically large number of possible sub-clusters to mine through. Here, we constructed a scalable method for unsupervised sub-cluster detection, called iPRESTO, based on topic modelling and statistical analysis of co-occurrence patterns of enzyme-coding protein families. iPRESTO was used to mine sub-clusters across 150,000 prokaryotic BGCs from antiSMASH-DB. After annotating a fraction of the resulting sub-cluster families, we could predict a substructure for 16% of the antiSMASH-DB BGCs. Additionally, our method was able to confirm 83% of the experimentally characterised sub-clusters in MIBiG reference BGCs. Based on iPRESTO-detected sub-clusters, we could correctly identify the BGCs for xenorhabdin and salbostatin biosynthesis (which had not yet been annotated in BGC databases), as well as propose a candidate BGC for akashin biosynthesis. Additionally, we show for a collection of 145 actinobacteria how substructures can aid in linking BGCs to molecules by correlating iPRESTO-detected sub-clusters to MS/MS-derived Mass2Motifs substructure patterns. 
This work paves the way for deeper functional and structural annotation of microbial BGCs by improved linking of orphan molecules to their cognate gene clusters, thus facilitating accelerated natural product discovery.


Subject(s)
Biological Products; Tandem Mass Spectrometry; Metabolomics; Bacteria/genetics; Multigene Family
7.
Nat Methods ; 17(9): 901-904, 2020 09.
Article in English | MEDLINE | ID: mdl-32807955

ABSTRACT

We present ReDU ( https://redu.ucsd.edu/ ), a system for metadata capture of public mass spectrometry-based metabolomics data, with validated controlled vocabularies. Systematic capture of knowledge enables the reanalysis of public data and/or co-analysis of one's own data. ReDU enables multiple types of analyses, including finding chemicals and associated metadata, comparing the shared and different chemicals between groups of samples, and metadata-filtered, repository-scale molecular networking.


Subject(s)
Databases, Chemical; Mass Spectrometry; Metabolomics/methods; Software; Metadata; Models, Chemical
8.
Nat Methods ; 17(9): 905-908, 2020 09.
Article in English | MEDLINE | ID: mdl-32839597

ABSTRACT

Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.


Subject(s)
Biological Products/chemistry; Mass Spectrometry; Computational Biology/methods; Databases, Factual; Metabolomics/methods; Software
9.
Bioinformatics ; 38(22): 5139-5140, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36165687

ABSTRACT

SUMMARY: Untargeted metabolomics data analysis is highly labour intensive and can be severely frustrated by both experimental noise and redundant features. Homologous polymer series is a particular case of features that can either represent large numbers of noise features or alternatively represent features of interest with large peak redundancy. Here, we present homologueDiscoverer, an R package that allows for the targeted and untargeted detection of homologue series as well as their evaluation and management using interactive plots and simple local database functionalities. AVAILABILITY AND IMPLEMENTATION: homologueDiscoverer is freely available at GitHub https://github.com/kevinmildau/homologueDiscoverer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software; Tandem Mass Spectrometry; Chromatography, Liquid; Metabolomics; Data Analysis
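
To make the idea of a homologue series concrete, the sketch below shows a simple targeted search: features are chained when consecutive m/z values differ by a known repeating-unit mass and retention time increases. This is an illustrative Python sketch under those assumptions, not the algorithm of the homologueDiscoverer R package; the function name, tolerances, and feature values are hypothetical.

```python
def find_homologue_series(features, increment, mz_tol=0.005, min_length=3):
    """Greedy targeted search for homologue series.

    features  : list of (mz, retention_time) tuples
    increment : mass of the repeating unit (e.g. 44.0262 for C2H4O)
    Returns a list of series, each a list of (mz, retention_time) tuples.
    """
    ordered = sorted(features)          # ascending m/z
    used = set()                        # indices already assigned to a series
    series_found = []
    for i, start in enumerate(ordered):
        if i in used:
            continue
        chain = [i]
        current = start
        for j in range(i + 1, len(ordered)):
            if j in used:
                continue
            mz, rt = ordered[j]
            # Next member: one repeating unit heavier and eluting later.
            if abs(mz - current[0] - increment) <= mz_tol and rt > current[1]:
                chain.append(j)
                current = ordered[j]
        if len(chain) >= min_length:
            used.update(chain)
            series_found.append([ordered[k] for k in chain])
    return series_found


# Made-up feature list: four members spaced by C2H4O plus two unrelated peaks.
features = [
    (300.0000, 2.0), (344.0262, 2.5), (388.0524, 3.0), (432.0786, 3.4),
    (350.1234, 1.0), (401.2000, 5.0),
]
for series in find_homologue_series(features, increment=44.0262):
    print([mz for mz, _ in series])
```

The untargeted case mentioned in the abstract would also have to search over candidate increments, which this sketch leaves out.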
10.
Nat Chem Biol ; 17(2): 146-151, 2021 02.
Article in English | MEDLINE | ID: mdl-33199911

ABSTRACT

Untargeted mass spectrometry is employed to detect small molecules in complex biospecimens, generating data that are difficult to interpret. We developed Qemistree, a data exploration strategy based on the hierarchical organization of molecular fingerprints predicted from fragmentation spectra. Qemistree allows mass spectrometry data to be represented in the context of sample metadata and chemical ontologies. By expressing molecular relationships as a tree, we can apply ecological tools that are designed to analyze and visualize the relatedness of DNA sequences to metabolomics data. Here we demonstrate the use of tree-guided data exploration tools to compare metabolomics samples across different experimental conditions such as chromatographic shifts. Additionally, we leverage a tree representation to visualize chemical diversity in a heterogeneous collection of samples. The Qemistree software pipeline is freely available to the microbiome and metabolomics communities in the form of a QIIME2 plugin, and a global natural products social molecular networking workflow.


Subject(s)
Mass Spectrometry/methods; Metabolomics; Algorithms; Cluster Analysis; DNA/chemistry; DNA Fingerprinting; Databases, Factual; Ecology; Food Analysis; Microbiota; Multivariate Analysis; Software; Tandem Mass Spectrometry; Workflow
11.
Photochem Photobiol Sci ; 22(10): 2341-2356, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37505444

ABSTRACT

UV-B radiation regulates numerous morphogenic, biochemical and physiological responses in plants, and can stimulate some responses typically associated with other abiotic and biotic stimuli, including invertebrate herbivory. Removal of UV-B from the growing environment of various plant species has been found to increase their susceptibility to consumption by invertebrate pests, however, to date, little research has been conducted to investigate the effects of UV-B on crop susceptibility to field pests. Here, we report findings from a multi-omic and genetic-based study investigating the mechanisms of UV-B-stimulated resistance of the crop, Brassica napus (oilseed rape), to herbivory from an economically important lepidopteran specialist of the Brassicaceae, Plutella xylostella (diamondback moth). The UV-B photoreceptor, UV RESISTANCE LOCUS 8 (UVR8), was not found to mediate resistance to this pest. RNA-Seq and untargeted metabolomics identified components of the sinapate/lignin biosynthetic pathway that were similarly regulated by UV-B and herbivory. Arabidopsis mutants in genes encoding two enzymes in the sinapate/lignin biosynthetic pathway, CAFFEATE O-METHYLTRANSFERASE 1 (COMT1) and ELICITOR-ACTIVATED GENE 3-2 (ELI3-2), retained UV-B-mediated resistance to P. xylostella herbivory. However, the overexpression of B. napus COMT1 in Arabidopsis further reduced plant susceptibility to P. xylostella herbivory in a UV-B-dependent manner. These findings demonstrate that overexpression of a component of the sinapate/lignin biosynthetic pathway in a member of the Brassicaceae can enhance UV-B-stimulated resistance to herbivory from P. xylostella.


Subject(s)
Arabidopsis; Brassica napus; Moths; Animals; Arabidopsis/genetics; Arabidopsis/radiation effects; Brassica napus/genetics; Herbivory; Lignin; Moths/physiology; Plants
12.
Nat Prod Rep ; 39(9): 1876-1896, 2022 09 21.
Article in English | MEDLINE | ID: mdl-35997060

ABSTRACT

Covering: up to 2022
With the emergence of large amounts of omics data, computational approaches for the identification of plant natural product biosynthetic pathways and their genetic regulation have become increasingly important. While genomes provide clues regarding functional associations between genes based on gene clustering, metabolome mining provides a foundational technology to chart natural product structural diversity in plants, and transcriptomics has been successfully used to identify new members of their biosynthetic pathways based on coexpression. Thus far, most approaches utilizing transcriptomics and metabolomics have been targeted towards specific pathways and use one type of omics data at a time. Recent technological advances now provide new opportunities for integration of multiple omics types and untargeted pathway discovery. Here, we review advances in plant biosynthetic pathway discovery using genomics, transcriptomics, and metabolomics, as well as recent efforts towards omics integration. We highlight how transcriptomics and metabolomics provide complementary information to link genes to metabolites, by associating temporal and spatial gene expression levels with metabolite abundance levels across samples, and by matching mass-spectral features to enzyme families. Furthermore, we suggest that elucidation of gene regulatory networks using time-series data may prove useful for efforts to unwire the complexities of biosynthetic pathway components based on regulatory interactions and events.


Subject(s)
Biological Products; Biosynthetic Pathways; Biological Products/metabolism; Biosynthetic Pathways/genetics; Genomics; Metabolome; Metabolomics; Plants/genetics; Plants/metabolism
13.
PLoS Pathog ; 16(11): e1008932, 2020 11.
Article in English | MEDLINE | ID: mdl-33141865

ABSTRACT

Livestock diseases caused by Trypanosoma congolense, T. vivax and T. brucei, collectively known as nagana, are responsible for billions of dollars in lost food production annually. There is an urgent need for novel therapeutics. Encouragingly, promising antitrypanosomal benzoxaboroles are under veterinary development. Here, we show that the most efficacious subclass of these compounds are prodrugs activated by trypanosome serine carboxypeptidases (CBPs). Drug-resistance to a development candidate, AN11736, emerged readily in T. brucei, due to partial deletion within the locus containing three tandem copies of the CBP genes. T. congolense parasites, which possess a larger array of related CBPs, also developed resistance to AN11736 through deletion within the locus. A genome-scale screen in T. brucei confirmed CBP loss-of-function as the primary mechanism of resistance and CRISPR-Cas9 editing proved that partial deletion within the locus was sufficient to confer resistance. CBP re-expression in either T. brucei or T. congolense AN11736-resistant lines restored drug-susceptibility. CBPs act by cleaving the benzoxaborole AN11736 to a carboxylic acid derivative, revealing a prodrug activation mechanism. Loss of CBP activity results in massive reduction in net uptake of AN11736, indicating that entry is facilitated by the concentration gradient created by prodrug metabolism.


Subject(s)
Boron Compounds/metabolism; Carboxypeptidases/metabolism; Trypanocidal Agents/metabolism; Trypanosoma brucei brucei/enzymology; Trypanosoma congolense/enzymology; Trypanosoma vivax/enzymology; Trypanosomiasis, African/veterinary; Valine/analogs & derivatives; Animals; Carboxylic Acids/metabolism; Drug Resistance; Female; Livestock; Mice; Parasitemia/veterinary; Prodrugs/metabolism; Protozoan Proteins/metabolism; Trypanosoma brucei brucei/drug effects; Trypanosoma congolense/drug effects; Trypanosoma vivax/drug effects; Trypanosomiasis, African/drug therapy; Trypanosomiasis, African/parasitology; Valine/metabolism
14.
Metabolomics ; 18(12): 103, 2022 12 05.
Article in English | MEDLINE | ID: mdl-36469190

ABSTRACT

BACKGROUND: Untargeted metabolomics approaches based on mass spectrometry obtain comprehensive profiles of complex biological samples. However, on average only 10% of the molecules can be annotated. This low annotation rate hampers biochemical interpretation and effective comparison of metabolomics studies. Furthermore, de novo structural characterization of mass spectral data remains a complicated and time-intensive process. Recently, the field of computational metabolomics has gained traction and novel methods have started to enable large-scale and reliable metabolite annotation. Molecular networking and machine learning-based in-silico annotation tools have been shown to greatly assist metabolite characterization in diverse fields such as clinical metabolomics and natural product discovery. AIM OF REVIEW: We highlight recent advances in computational metabolite annotation workflows with a special focus on their evaluation and comparison with other tools. Whilst the progress is substantial and promising, we also argue that inconsistencies in benchmarking different tools hamper users from selecting the most appropriate and promising method for their research. We summarize benchmarking strategies of the different tools and outline several recommendations for benchmarking and comparing novel tools. KEY SCIENTIFIC CONCEPTS OF REVIEW: This review focuses on recent advances in mass spectral library-based and machine learning-supported metabolite annotation workflows. We discuss large-scale library matching and analogue search, the current bloom of mass spectral similarity scores, and how molecular networking has changed the field. In addition, the potentials and challenges of machine learning-supported metabolite annotation workflows are highlighted. 
Overall, recent developments in computational metabolomics have started to fundamentally change metabolomics workflows, and we expect that as a community we will be able to overcome current method performance ambiguities and annotation bottlenecks.


Subject(s)
Benchmarking; Metabolomics; Metabolomics/methods; Mass Spectrometry; Machine Learning
15.
PLoS Comput Biol ; 17(2): e1008724, 2021 02.
Article in English | MEDLINE | ID: mdl-33591968

ABSTRACT

Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses such as library matching and molecular networking. Although weaknesses in the relationship between spectral similarity scores and the true structural similarities have been described, little development of alternative scores has been undertaken. Here, we introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm, Word2Vec. Spec2Vec learns fragmental relationships within a large set of spectral data to derive abstract spectral embeddings that can be used to assess spectral similarities. Using data derived from GNPS MS/MS libraries including spectra for nearly 13,000 unique molecules, we show how Spec2Vec scores correlate better with structural similarity than cosine-based scores. We demonstrate the advantages of Spec2Vec in library matching and molecular networking. Spec2Vec is computationally more scalable allowing structural analogue searches in large databases within seconds.


Subject(s)
Algorithms; Computational Biology/methods; Gene Library; Metabolomics/methods; Tandem Mass Spectrometry/methods; Computer Simulation; Databases, Factual; False Positive Reactions; Machine Learning; Natural Language Processing; Reproducibility of Results
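
For orientation, the cosine-based baseline that this abstract compares against can be sketched as follows. This is a simplified illustration, not Spec2Vec and not any library's implementation: peaks are placed in fixed-width m/z bins (an assumption made here for brevity; practical tools usually match peaks within a tolerance), and the function name and example spectra are hypothetical.

```python
import math


def cosine_score(spectrum_a, spectrum_b, bin_width=0.01):
    """Cosine similarity between two spectra given as lists of (mz, intensity)."""

    def binned(spectrum):
        # Sum intensities that fall into the same fixed-width m/z bin.
        vector = {}
        for mz, intensity in spectrum:
            key = round(mz / bin_width)
            vector[key] = vector.get(key, 0.0) + intensity
        return vector

    a, b = binned(spectrum_a), binned(spectrum_b)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two made-up spectra sharing one of their two peaks.
spec_1 = [(100.0, 1.0), (150.0, 0.5)]
spec_2 = [(100.0, 1.0), (200.0, 0.5)]
print(cosine_score(spec_1, spec_2))  # approximately 0.8
```

A score of this kind only rewards peaks that coincide in m/z. An embedding-based score, as the abstract describes, instead compares learned vector representations of whole spectra.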
16.
PLoS Comput Biol ; 17(5): e1008920, 2021 05.
Article in English | MEDLINE | ID: mdl-33945539

ABSTRACT

Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. Based on standardising a commonly used score, we introduce a new, more effective score, and introduce a novel score using an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.


Subject(s)
Genetics, Microbial/statistics & numerical data; Genomics/statistics & numerical data; Metabolomics/statistics & numerical data; Software; Biosynthetic Pathways/genetics; Computational Biology; Data Mining; Databases, Factual; Databases, Genetic; Genome, Microbial; Microbiological Phenomena; Multigene Family; Regression Analysis
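
The general idea of standardising a co-occurrence score can be illustrated with a deliberately simplified case. The sketch below is not NPLinker's scoring function, and the raw score it uses (a plain count of strains shared by a gene cluster family and a spectrum) is an assumption chosen for illustration. The count is converted to a z-score using the mean and variance of a hypergeometric null model; all names are hypothetical.

```python
import math


def standardised_overlap(shared, n_gcf, n_spec, n_total):
    """z-score of a strain-overlap count under a hypergeometric null.

    shared  : strains containing both the gene cluster family and the spectrum
    n_gcf   : strains containing the gene cluster family
    n_spec  : strains in which the spectrum was observed
    n_total : total number of strains in the data set
    """
    expected = n_gcf * n_spec / n_total
    variance = (
        n_gcf * n_spec * (n_total - n_gcf) * (n_total - n_spec)
        / (n_total ** 2 * (n_total - 1))
    )
    if variance == 0:
        return 0.0  # overlap is fully determined; no information in the count
    return (shared - expected) / math.sqrt(variance)


# 10 strains; the family and the spectrum each occur in 5 of them.
print(standardised_overlap(shared=5, n_gcf=5, n_spec=5, n_total=10))
```

Standardising in this way makes scores comparable between objects that occur in many strains and objects that occur in few, which is the motivation the abstract gives for moving beyond a raw score.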
17.
Nucleic Acids Res ; 48(D1): D454-D458, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31612915

ABSTRACT

Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.


Subject(s)
Databases, Genetic; Genome, Bacterial; Genomics/methods; Multigene Family; Software; Biosynthetic Pathways/genetics; Molecular Sequence Annotation
18.
Nat Prod Rep ; 38(11): 2066-2082, 2021 11 17.
Article in English | MEDLINE | ID: mdl-34612288

ABSTRACT

Covering: 2016 up to 2021
Mass spectrometry (MS) is an essential technology in natural products research with MS fragmentation (MS/MS) approaches becoming a key tool. Recent advancements in MS yield dense metabolomics datasets which have been, conventionally, used by individual labs for individual projects; however, a shift is brewing. The movement towards open MS data (and other structural characterization data) and accessible data mining tools is emerging in natural products research. Over the past 5 years, this movement has rapidly expanded and evolved with no slowdown in sight; the capabilities of today vastly exceed those of 5 years ago. Herein, we address the analysis of individual datasets, a situation we are calling the '2021 status quo', and the emergent framework to systematically capture sample information (metadata) and perform repository-scale analyses. We evaluate public data deposition, discuss the challenges of working in the repository scale, highlight the challenges of metadata capture and provide illustrative examples of the power of utilizing repository data and the tools that enable it. We conclude that the advancements in MS data collection must be met with advancements in how we utilize data; therefore, we argue that open data and data mining is the next evolution in obtaining the maximum potential in natural products research.


Subject(s)
Biological Products/chemistry; Data Mining; Tandem Mass Spectrometry/methods; Biological Products/metabolism; Data Analysis; Metabolomics
19.
Nat Prod Rep ; 38(11): 1967-1993, 2021 11 17.
Article in English | MEDLINE | ID: mdl-34821250

ABSTRACT

Covering: up to the end of 2020
Recently introduced computational metabolome mining tools have started to positively impact the chemical and biological interpretation of untargeted metabolomics analyses. We believe that these current advances make it possible to start decomposing complex metabolite mixtures into substructure and chemical class information, thereby supporting pivotal tasks in metabolomics analysis including metabolite annotation, the comparison of metabolic profiles, and network analyses. In this review, we highlight and explain key tools and emerging strategies covering 2015 up to the end of 2020. The majority of these tools aim at processing and analyzing liquid chromatography coupled to mass spectrometry fragmentation data. We start with defining what substructures are, how they relate to molecular fingerprints, and how recognizing them helps to decompose complex mixtures. We continue with chemical classes that are based on the presence or absence of particular molecular scaffolds and/or functional groups and are thus intrinsically related to substructures. We discuss novel tools to mine substructures, annotate chemical compound classes, and create mass spectral networks from metabolomics data and demonstrate them using two case studies. We also review and speculate about the opportunities that NMR spectroscopy-based metabolome mining of complex metabolite mixtures offers to discover substructures and chemical classes. Finally, we will describe the main benefits and limitations of the current tools and strategies that rely on them, and our vision on how this exciting field can develop toward repository-scale-sized metabolomics analyses. Complementary sources of structural information from genomics analyses and well-curated taxonomic records are also discussed.
Many research fields such as natural products discovery, pharmacokinetic and drug metabolism studies, and environmental metabolomics increasingly rely on untargeted metabolomics to gain biochemical and biological insights. The here described technical advances will benefit all those metabolomics disciplines by transforming spectral data into knowledge that can answer biological questions.


Subject(s)
Complex Mixtures/chemistry; Metabolomics/methods; Chromatography, Liquid; Flavones/analysis; Magnetic Resonance Spectroscopy; Sideritis/chemistry; Tandem Mass Spectrometry
20.
Anal Chem ; 93(14): 5676-5683, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33784814

ABSTRACT

Tandem mass spectrometry (LC-MS/MS) is widely used to identify unknown ions in untargeted metabolomics. Data-dependent acquisition (DDA) chooses which ions to fragment based upon intensities observed in MS1 survey scans and typically only fragments a small subset of the ions present. Despite this inefficiency, relatively little work has addressed the development of new DDA methods, partly due to the high overhead associated with running the many extracts necessary to optimize approaches in busy MS facilities. In this work, we first provide theoretical results that show how much improvement is possible over current DDA strategies. We then describe an in silico framework for fast and cost-efficient development of new DDA strategies using a previously developed virtual metabolomics mass spectrometer (ViMMS). Additional functionality is added to ViMMS to allow methods to be used both in simulation and on real samples via an Instrument Application Programming Interface (IAPI). We demonstrate this framework through the development and optimization of two new DDA methods that introduce new advanced ion prioritization strategies. Upon application of these developed methods to two complex metabolite mixtures, our results show that they are able to fragment more unique ions than standard DDA strategies.
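
The intensity-based selection that this abstract attributes to data-dependent acquisition can be sketched as a generic Top-N rule with dynamic exclusion. This is a baseline illustration only: it is neither of the two new methods the paper develops nor the ViMMS API, and the function name, parameters, default values, and scan data are all hypothetical.

```python
def top_n_with_exclusion(survey_scan, excluded, scan_time, n=2,
                         min_intensity=1000.0, mz_tol=0.01,
                         exclusion_window=15.0):
    """Pick up to n precursor m/z values to fragment from one MS1 survey scan.

    survey_scan : list of (mz, intensity) tuples
    excluded    : list of (mz, expiry_time) tuples, updated in place
    scan_time   : time of this survey scan, in the same units as the window
    """
    # Forget exclusions whose window has elapsed.
    excluded[:] = [(mz, expiry) for mz, expiry in excluded if expiry > scan_time]

    selected = []
    # Visit ions from most to least intense.
    for mz, intensity in sorted(survey_scan, key=lambda peak: -peak[1]):
        if len(selected) == n or intensity < min_intensity:
            break
        if any(abs(mz - ex_mz) <= mz_tol for ex_mz, _ in excluded):
            continue  # fragmented recently; skip so other ions get a turn
        selected.append(mz)
        excluded.append((mz, scan_time + exclusion_window))
    return selected


# The same made-up survey scan observed at three time points.
scan = [(100.0, 5000.0), (200.0, 9000.0), (300.0, 2000.0), (400.0, 500.0)]
exclusion_list = []
print(top_n_with_exclusion(scan, exclusion_list, scan_time=0.0))   # [200.0, 100.0]
print(top_n_with_exclusion(scan, exclusion_list, scan_time=5.0))   # [300.0]
print(top_n_with_exclusion(scan, exclusion_list, scan_time=16.0))  # [200.0, 100.0]
```

In this example the ion at m/z 400 is never fragmented because it stays below the intensity threshold, a small instance of the incomplete coverage the abstract describes.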
