Results 1 - 16 of 16
1.
Bioinformatics ; 39(7), 2023 07 01.
Article in English | MEDLINE | ID: mdl-37364005

ABSTRACT

MOTIVATION: Liquid Chromatography Tandem Mass Spectrometry experiments aim to produce high-quality fragmentation spectra, which can be used to annotate metabolites. However, current Data-Dependent Acquisition approaches may fail to collect spectra of sufficient quality and quantity for experimental outcomes, and extend poorly across multiple samples by failing to share information across samples or by requiring manual expert input. RESULTS: We present TopNEXt, a real-time scan prioritization framework that improves data acquisition in multi-sample Liquid Chromatography Tandem Mass Spectrometry metabolomics experiments. TopNEXt extends traditional Data-Dependent Acquisition exclusion methods across multiple samples by using a Region of Interest and intensity-based scoring system. Through both simulated and lab experiments, we show that methods incorporating these novel concepts acquire fragmentation spectra for an additional 10% of our set of target peaks and with an additional 20% of acquisition intensity. By increasing the quality and quantity of fragmentation spectra, TopNEXt can help improve metabolite identification with a potential impact across a variety of experimental contexts. AVAILABILITY AND IMPLEMENTATION: TopNEXt is implemented as part of the ViMMS framework and the latest version can be found at https://github.com/glasgowcompbio/vimms. A stable version used to produce our results can be found at 10.5281/zenodo.7468914.


Subject(s)
Metabolomics , Mass Spectrometry/methods , Liquid Chromatography/methods , Metabolomics/methods
2.
PLoS Comput Biol ; 17(5): e1008920, 2021 05.
Article in English | MEDLINE | ID: mdl-33945539

ABSTRACT

Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. Based on standardising a commonly used score, we introduce a new, more effective score, and introduce a novel score using an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.


Subject(s)
Microbial Genetics/statistics & numerical data , Genomics/statistics & numerical data , Metabolomics/statistics & numerical data , Software , Biosynthetic Pathways/genetics , Computational Biology , Data Mining , Factual Databases , Genetic Databases , Microbial Genome , Microbiological Phenomena , Multigene Family , Regression Analysis
3.
BMC Bioinformatics ; 22(1): 603, 2021 Dec 18.
Article in English | MEDLINE | ID: mdl-34922446

ABSTRACT

BACKGROUND: An increasing number of studies now produce multiple omics measurements that require using sophisticated computational methods for analysis. While each omics dataset can be examined separately, jointly integrating multiple omics datasets allows deeper understanding and insights to be gained from the study. In particular, data integration can be performed horizontally, where biological entities from multiple omics measurements are mapped to common reactions and pathways. However, data integration remains a challenge due to the complexity of the data and the difficulty in interpreting analysis results. RESULTS: Here we present GraphOmics, a user-friendly platform to explore and integrate multiple omics datasets and support hypothesis generation. Users can upload transcriptomics, proteomics and metabolomics data to GraphOmics. Relevant entities are connected based on their biochemical relationships, and mapped to reactions and pathways from Reactome. From the Data Browser in GraphOmics, mapped entities and pathways can be ranked, sorted and filtered according to their statistical significance (p values) and fold changes. Context-sensitive panels provide information on the currently selected entities, while interactive heatmaps and clustering functionalities are also available. As a case study, we demonstrated how GraphOmics was used to interactively explore multi-omics data and support hypothesis generation using two complex datasets from existing Zebrafish regeneration and Covid-19 human studies. CONCLUSIONS: GraphOmics is fully open source and freely accessible from https://graphomics.glasgowcompbio.org/. It can be used to integrate multiple omics data horizontally by mapping entities across omics to reactions and pathways. Our demonstration showed that by using interactive explorations from GraphOmics, interesting insights and biological hypotheses could be rapidly revealed.


Subject(s)
COVID-19 , Animals , Humans , Metabolomics , Proteomics , SARS-CoV-2 , Zebrafish/genetics
4.
Anal Chem ; 93(14): 5676-5683, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33784814

ABSTRACT

Liquid chromatography tandem mass spectrometry (LC-MS/MS) is widely used to identify unknown ions in untargeted metabolomics. Data-dependent acquisition (DDA) chooses which ions to fragment based upon intensities observed in MS1 survey scans and typically only fragments a small subset of the ions present. Despite this inefficiency, relatively little work has addressed the development of new DDA methods, partly due to the high overhead associated with running the many extracts necessary to optimize approaches in busy MS facilities. In this work, we first provide theoretical results that show how much improvement is possible over current DDA strategies. We then describe an in silico framework for fast and cost-efficient development of new DDA strategies using a previously developed virtual metabolomics mass spectrometer (ViMMS). Additional functionality is added to ViMMS to allow methods to be used both in simulation and on real samples via an Instrument Application Programming Interface (IAPI). We demonstrate this framework through the development and optimization of two new DDA methods that introduce new advanced ion prioritization strategies. Upon application of these developed methods to two complex metabolite mixtures, our results show that they are able to fragment more unique ions than standard DDA strategies.
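
The intensity-based selection that the abstract above attributes to standard DDA can be illustrated with a minimal Top-N sketch with dynamic exclusion. This is not the ViMMS code or either of the two new methods from the paper; the function name, the tolerance values and the data layout are all assumptions made for illustration.

```python
def top_n_targets(ms1_peaks, n, min_intensity, excluded, rt,
                  mz_tol=0.01, exclusion_s=15.0):
    """Pick up to n precursor m/z values from one MS1 survey scan.

    ms1_peaks: list of (mz, intensity) tuples (hypothetical layout).
    excluded:  mutable list of (mz, expiry_rt) entries, updated in place.
    rt:        retention time of the current scan, in seconds.
    """
    # Forget exclusions whose time window has passed.
    excluded[:] = [(mz, expiry) for mz, expiry in excluded if expiry > rt]

    def is_excluded(mz):
        return any(abs(mz - ex_mz) <= mz_tol for ex_mz, _ in excluded)

    # Keep peaks above the intensity threshold that are not excluded.
    candidates = [(mz, inten) for mz, inten in ms1_peaks
                  if inten >= min_intensity and not is_excluded(mz)]
    # Most intense first, then take the top n.
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    chosen = [mz for mz, _ in candidates[:n]]

    # Recently fragmented ions are skipped for the next exclusion_s seconds.
    for mz in chosen:
        excluded.append((mz, rt + exclusion_s))
    return chosen


scan = [(100.0, 5e5), (200.0, 3e5), (300.0, 1e5), (400.0, 5e3)]
exclusion_list = []
first = top_n_targets(scan, 2, 1e4, exclusion_list, rt=0.0)
second = top_n_targets(scan, 2, 1e4, exclusion_list, rt=5.0)
third = top_n_targets(scan, 2, 1e4, exclusion_list, rt=16.0)
```

In this toy run the second scan falls back to the third most intense ion because the first two are still excluded, which shows why only a subset of ions is ever fragmented.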

5.
Bioinformatics ; 34(13): 2314-2315, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29490021

ABSTRACT

Motivation: Mathematical modelling based on ordinary differential equations (ODEs) is widely used to describe the dynamics of biological systems, particularly in systems and pathway biology. Often the kinetic parameters of these ODE systems are unknown and have to be inferred from the data. Approximate parameter inference methods based on gradient matching (which do not require performing computationally expensive numerical integration of the ODEs) have become popular in recent years, but many implementations are difficult to run without expert knowledge. Here, we introduce ShinyKGode, an interactive web application to perform fast parameter inference on ODEs using gradient matching. Results: ShinyKGode can be used to infer ODE parameters on simulated and observed data using gradient matching. Users can easily load their own models in Systems Biology Markup Language format, and a set of pre-defined ODE benchmark models is provided in the application. Inferred parameters are visualized alongside diagnostic plots to assess convergence. Availability and implementation: The R package for ShinyKGode can be installed through the Comprehensive R Archive Network (CRAN). Installation instructions, as well as tutorial videos and source code, are available at https://joewandy.github.io/shinyKGode. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Kinetics , Systems Biology/methods
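
The gradient-matching idea mentioned in the ShinyKGode abstract above can be illustrated on a toy problem. The sketch below is not the ShinyKGode method: it assumes a one-parameter decay model dx/dt = -k x, noise-free data, and plain central differences in place of the smoother interpolant a real gradient-matching method would use. The function name is hypothetical.

```python
import math


def estimate_decay_rate(times, values):
    """Estimate k in dx/dt = -k * x without integrating the ODE.

    Slopes are estimated from the data by central differences, then k is
    chosen to minimise sum_i (slope_i + k * x_i)^2, which has the
    closed-form solution k = -sum(slope_i * x_i) / sum(x_i^2).
    """
    numerator = 0.0
    denominator = 0.0
    for i in range(1, len(times) - 1):
        slope = (values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1])
        numerator += slope * values[i]
        denominator += values[i] ** 2
    return -numerator / denominator


# Simulated observations from x(t) = 2 * exp(-0.5 * t), so the true k is 0.5.
times = [0.1 * i for i in range(51)]
values = [2.0 * math.exp(-0.5 * t) for t in times]
k_hat = estimate_decay_rate(times, values)
```

The estimate lands close to the true rate because the data gradients are matched directly against the right-hand side of the ODE, so no numerical integration is needed at any point.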
6.
Bioinformatics ; 34(2): 317-318, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-28968802

ABSTRACT

MOTIVATION: We recently published MS2LDA, a method for the decomposition of sets of molecular fragment data derived from large metabolomics experiments. To make the method more widely available to the community, here we present ms2lda.org, a web application that allows users to upload their data, run MS2LDA analyses and explore the results through interactive visualizations. RESULTS: Ms2lda.org takes tandem mass spectrometry data in many standard formats and allows the user to infer the sets of fragment and neutral loss features that co-occur together (Mass2Motifs). As an alternative workflow, the user can also decompose a data set onto predefined Mass2Motifs. This is accomplished through the web interface or programmatically from our web service. AVAILABILITY AND IMPLEMENTATION: The website can be found at http://ms2lda.org, while the source code is available at https://github.com/sdrogers/ms2ldaviz under the MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

7.
Faraday Discuss ; 218(0): 284-302, 2019 08 15.
Article in English | MEDLINE | ID: mdl-31120050

ABSTRACT

Complex metabolite mixtures are challenging to unravel. Mass spectrometry (MS) is a widely used and sensitive technique for obtaining structural information of complex mixtures. However, just knowing the molecular masses of the mixture's constituents is almost always insufficient for confident assignment of the associated chemical structures. Structural information can be augmented through MS fragmentation experiments whereby detected metabolites are fragmented, giving rise to MS/MS spectra. However, how can we maximize the structural information we gain from fragmentation spectra? We recently proposed a substructure-based strategy to enhance metabolite annotation for complex mixtures by considering metabolites as the sum of (bio)chemically relevant moieties that we can detect through mass spectrometry fragmentation approaches. Our MS2LDA tool allows us to discover - unsupervised - groups of mass fragments and/or neutral losses, termed Mass2Motifs, that often correspond to substructures. After manual annotation, these Mass2Motifs can be used in subsequent MS2LDA analyses of new datasets, thereby providing structural annotations for many molecules that are not present in spectral databases. Here, we describe how additional strategies, taking advantage of (i) combinatorial in silico matching of experimental mass features to substructures of candidate molecules, and (ii) automated machine learning classification of molecules, can facilitate semi-automated annotation of substructures. We show how our approach accelerates the Mass2Motif annotation process and therefore broadens the chemical space spanned by characterized motifs. Our machine learning model used to classify fragmentation spectra learns the relationships between fragment spectra and chemical features. Classification prediction on these features can be aggregated for all molecules that contribute to a particular Mass2Motif and guide Mass2Motif annotations. 
To make annotated Mass2Motifs available to the community, we also present MotifDB: an open database of Mass2Motifs that can be browsed and accessed programmatically through an Application Programming Interface (API). MotifDB is integrated within ms2lda.org, allowing users to efficiently search for characterized motifs in their own experiments. We expect that with an increasing number of Mass2Motif annotations available through a growing database, we can more quickly gain insight into the constituents of complex mixtures. This will allow prioritization towards novel or unexpected chemistries and faster recognition of known biochemical building blocks.


Subject(s)
Automation , Complex Mixtures/analysis , Complex Mixtures/metabolism , Supervised Machine Learning , Unsupervised Machine Learning , Factual Databases , Tandem Mass Spectrometry
8.
Proc Natl Acad Sci U S A ; 113(48): 13738-13743, 2016 11 29.
Article in English | MEDLINE | ID: mdl-27856765

ABSTRACT

The potential of untargeted metabolomics to answer important questions across the life sciences is hindered because of a paucity of computational tools that enable extraction of key biochemically relevant information. Available tools focus on using mass spectrometry fragmentation spectra to identify molecules whose behavior suggests they are relevant to the system under study. Unfortunately, fragmentation spectra cannot identify molecules in isolation but require authentic standards or databases of known fragmented molecules. Fragmentation spectra are, however, replete with information pertaining to the biochemical processes present, much of which is currently neglected. Here, we present an analytical workflow that exploits all fragmentation data from a given experiment to extract biochemically relevant features in an unsupervised manner. We demonstrate that an algorithm originally used for text mining, latent Dirichlet allocation, can be adapted to handle metabolomics datasets. Our approach extracts biochemically relevant molecular substructures ("Mass2Motifs") from spectra as sets of co-occurring molecular fragments and neutral losses. The analysis allows us to isolate molecular substructures, whose presence allows molecules to be grouped based on shared substructures regardless of classical spectral similarity. These substructures, in turn, support putative de novo structural annotation of molecules. Combining this spectral connectivity to orthogonal correlations (e.g., common abundance changes under system perturbation) significantly enhances our ability to provide mechanistic explanations for biological behavior.


Subject(s)
Metabolomics/methods , Tandem Mass Spectrometry/methods , Workflow , Algorithms , Factual Databases , Metabolomics/standards , Tandem Mass Spectrometry/standards
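
The text-mining analogy in the abstract above (spectra as documents, fragments and neutral losses as words) can be made concrete with a short sketch. This shows only one plausible way to build the word counts that a topic model such as latent Dirichlet allocation would take as input. The two-decimal binning, the intensity threshold and the word naming scheme are assumptions, not the published MS2LDA preprocessing.

```python
from collections import Counter


def spectrum_to_document(precursor_mz, peaks, min_intensity=10.0):
    """Turn one fragmentation spectrum into a bag of 'words'.

    peaks: list of (mz, intensity) tuples (hypothetical layout).
    Each retained peak yields a fragment word and, if the difference to the
    precursor is positive, a neutral-loss word.
    """
    words = Counter()
    for mz, intensity in peaks:
        if intensity < min_intensity:
            continue  # treat weak peaks as noise
        words[f"fragment_{mz:.2f}"] += 1
        loss = precursor_mz - mz
        if loss > 0:
            words[f"loss_{loss:.2f}"] += 1
    return words


doc = spectrum_to_document(
    180.06, [(163.04, 100.0), (145.03, 40.0), (89.02, 5.0)]
)
```

Words that repeatedly co-occur across many such documents are what a topic model would group together, which is the role the abstract assigns to Mass2Motifs.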
9.
Bioinformatics ; 33(24): 4007-4009, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-28961954

ABSTRACT

SUMMARY: The Polyomics integrated Metabolomics Pipeline (PiMP) fulfils an unmet need in metabolomics data analysis. PiMP offers automated and user-friendly analysis from mass spectrometry data acquisition to biological interpretation. Our key innovations are the Summary Page, which provides a simple overview of the experiment in the format of a scientific paper, containing the key findings of the experiment along with associated metadata; and the Metabolite Page, which provides a list of each metabolite accompanied by 'evidence cards', which provide a variety of criteria behind metabolite annotation including peak shapes, intensities in different sample groups and database information. AVAILABILITY AND IMPLEMENTATION: PiMP is available at http://polyomics.mvls.gla.ac.uk, and access is freely available on request. 50 GB of space is allocated for data storage, with an unrestricted number of samples and analyses per user. Source code is available at https://github.com/RonanDaly/pimp and licensed under the GPL. CONTACT: karl.burgess@glasgow.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Liquid Chromatography , Mass Spectrometry , Metabolomics/methods , Software , Internet , Metabolome
10.
Anal Chem ; 89(14): 7569-7577, 2017 07 18.
Article in English | MEDLINE | ID: mdl-28621528

ABSTRACT

In untargeted metabolomics approaches, the inability to structurally annotate relevant features and map them to biochemical pathways is hampering the full exploitation of many metabolomics experiments. Furthermore, variable metabolic content across samples results in sparse feature matrices that are statistically hard to handle. Here, we introduce MS2LDA+, which tackles both of the above-mentioned problems. Previously, we presented MS2LDA, which extracts biochemically relevant molecular substructures ("Mass2Motifs") from a collection of fragmentation spectra as sets of co-occurring molecular fragments and neutral losses, thereby recognizing building blocks of metabolomics. Here, we extend MS2LDA to handle multiple metabolomics experiments in one analysis, resulting in MS2LDA+. By linking Mass2Motifs across samples, we expose the variability in prevalence of structurally related metabolite families. We validate the differential prevalence of substructures between two distinct sample groups and apply it to fecal samples. Subsequently, within one sample group of urines, we rank the Mass2Motifs based on their variance to assess whether xenobiotic-derived substructures are among the most-variant Mass2Motifs. Indeed, we could ascribe 22 out of the 30 most-variant Mass2Motifs to xenobiotic-derived substructures including paracetamol/acetaminophen mercapturate and dimethylpyrogallol. In total, we structurally characterized 101 Mass2Motifs with biochemically or chemically relevant substructures. Finally, we combined the discovered metabolite families with full scan feature intensity information to obtain insight into core metabolites present in most samples and rare metabolites present in small subsets now linked through their common substructures. We conclude that by biochemical grouping of metabolites across samples MS2LDA+ aids in structural annotation of metabolites and guides prioritization of analysis by using Mass2Motif prevalence.


Subject(s)
Antihypertensive Agents/metabolism , Drug Discovery , Metabolomics , Statistical Models , Adolescent , Aged , Aged 80 and over , Antihypertensive Agents/analysis , Beer/analysis , Child , Liquid Chromatography , Feces/chemistry , Female , Humans , Male , Mass Spectrometry , Middle Aged , Molecular Structure
11.
Bioinformatics ; 31(12): 1999-2006, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25649621

ABSTRACT

MOTIVATION: The combination of liquid chromatography and mass spectrometry (LC/MS) has been widely used for large-scale comparative studies in systems biology, including proteomics, glycomics and metabolomics. In almost all experimental designs, it is necessary to compare chromatograms across biological or technical replicates and across sample groups. Central to this is the peak alignment step, which is one of the most important but challenging preprocessing steps. Existing alignment tools do not take into account the structural dependencies between related peaks that coelute and are derived from the same metabolite or peptide. We propose a direct matching peak alignment method for LC/MS data that incorporates related peaks information (within each LC/MS run) and investigate its effect on alignment performance (across runs). The groupings of related peaks necessary for our method can be obtained from any peak clustering method and are built into a pair-wise peak similarity score function. The similarity score matrix produced is used by an approximation algorithm for the weighted matching problem to produce the actual alignment result. RESULTS: We demonstrate that related peak information can improve alignment performance. The performance is evaluated on a set of benchmark datasets, where our method performs competitively compared to other popular alignment tools. AVAILABILITY: The proposed alignment method has been implemented as a stand-alone application in Python, available for download at http://github.com/joewandy/peak-grouping-alignment.


Subject(s)
Algorithms , Liquid Chromatography/methods , Glycomics/methods , Mass Spectrometry/methods , Metabolomics/methods , Peptide Fragments/analysis , Proteomics/methods , Humans
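
The direct-matching step described in the abstract above (a pairwise similarity score fed to an approximate weighted matching) might look roughly like the sketch below. The abstract does not say which approximation algorithm is used, so a simple greedy matching stands in here. The similarity function uses only m/z and retention time and omits the related-peak grouping information that is the paper's contribution. All names and tolerances are assumptions.

```python
def similarity(p, q, mz_tol=0.01, rt_tol=30.0):
    """Score two peaks given as (mz, rt); 0.0 means they cannot be matched."""
    d_mz = abs(p[0] - q[0])
    d_rt = abs(p[1] - q[1])
    if d_mz > mz_tol or d_rt > rt_tol:
        return 0.0
    return (1.0 - d_mz / mz_tol) * (1.0 - d_rt / rt_tol)


def greedy_align(run_a, run_b):
    """Greedy approximate maximum-weight matching between two runs.

    Returns a sorted list of (index_in_a, index_in_b) pairs.
    """
    scored = []
    for i, p in enumerate(run_a):
        for j, q in enumerate(run_b):
            score = similarity(p, q)
            if score > 0.0:
                scored.append((score, i, j))
    # Highest-scoring pairs claim their peaks first.
    scored.sort(reverse=True)

    used_a, used_b, matches = set(), set(), []
    for _, i, j in scored:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j))
    return sorted(matches)


run_a = [(100.000, 60.0), (150.005, 120.0), (200.000, 300.0)]
run_b = [(100.002, 65.0), (150.000, 118.0), (250.000, 300.0)]
alignment = greedy_align(run_a, run_b)
```

Peaks with no partner inside both tolerances, such as the last peak of each run here, simply stay unmatched.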
12.
Bioinformatics ; 30(19): 2764-71, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24916385

ABSTRACT

MOTIVATION: The use of liquid chromatography coupled to mass spectrometry has enabled the high-throughput profiling of the metabolite composition of biological samples. However, the large amount of data obtained can be difficult to analyse and often requires computational processing to understand which metabolites are present in a sample. This article looks at the dual problem of annotating peaks in a sample with a metabolite, together with putatively annotating whether a metabolite is present in the sample. The starting point of the approach is a Bayesian clustering of peaks into groups, each corresponding to putative adducts and isotopes of a single metabolite. RESULTS: The Bayesian modelling introduced here combines information from the mass-to-charge ratio, retention time and intensity of each peak, together with a model of the inter-peak dependency structure, to increase the accuracy of peak annotation. The results inherently contain a quantitative estimate of confidence in the peak annotations and allow an accurate trade-off between precision and recall. Extensive validation experiments using authentic chemical standards show that this system is able to produce more accurate putative identifications than other state-of-the-art systems, while at the same time giving a probabilistic measure of confidence in the annotations. AVAILABILITY AND IMPLEMENTATION: The software has been implemented as part of the mzMatch metabolomics analysis pipeline, which is available for download at http://mzmatch.sourceforge.net/.


Subject(s)
Liquid Chromatography/methods , Mass Spectrometry/methods , Metabolomics , Algorithms , Bayes Theorem , Cluster Analysis , Cysteic Acid/analysis , Statistical Data Interpretation , Normal Distribution , Probability , Reproducibility of Results , Software , Triazoles/analysis
13.
Front Mol Biosci ; 10: 1130781, 2023.
Article in English | MEDLINE | ID: mdl-36959982

ABSTRACT

Data-Dependent and Data-Independent Acquisition modes (DDA and DIA, respectively) are both widely used to acquire MS2 spectra in untargeted liquid chromatography tandem mass spectrometry (LC-MS/MS) metabolomics analyses. Despite their wide use, little work has been attempted to systematically compare their MS/MS spectral annotation performance in untargeted settings due to the lack of ground truth and the costs involved in running a large number of acquisitions. Here, we present a systematic in silico comparison of these two acquisition methods in untargeted metabolomics by extending our Virtual Metabolomics Mass Spectrometer (ViMMS) framework with a DIA module. Our results show that the performance of these methods varies with the average number of co-eluting ions as the most important factor. At low numbers, DIA outperforms DDA, but at higher numbers, DDA has an advantage as DIA can no longer deal with the large amount of overlapping ion chromatograms. Results from simulation were further validated on an actual mass spectrometer, demonstrating that using ViMMS we can draw conclusions from simulation that translate well into the real world. The versatility of the Virtual Metabolomics Mass Spectrometer (ViMMS) framework in simulating different parameters of both Data-Dependent and Data-Independent Acquisition (DDA and DIA) modes is a key advantage of this work. Researchers can easily explore and compare the performance of different acquisition methods within the ViMMS framework, without the need for expensive and time-consuming experiments with real experimental data. By identifying the strengths and limitations of each acquisition method, researchers can optimize their choice and obtain more accurate and robust results. Furthermore, the ability to simulate and validate results using the ViMMS framework can save significant time and resources, as it eliminates the need for numerous experiments. 
This work not only provides valuable insights into the performance of DDA and DIA, but it also opens the door for further advancements in LC-MS/MS data acquisition methods.

14.
Metabolites ; 11(2), 2021 Feb 11.
Article in English | MEDLINE | ID: mdl-33670102

ABSTRACT

Related metabolites can be grouped into sets in many ways, e.g., by their participation in series of chemical reactions (forming metabolic pathways), or based on fragmentation spectral similarities or shared chemical substructures. Understanding how such metabolite sets change in relation to experimental factors can be incredibly useful in the interpretation and understanding of complex metabolomics data sets. However, many of the available tools that are used to perform this analysis are not entirely suitable for the analysis of untargeted metabolomics measurements. Here, we present PALS (Pathway Activity Level Scoring), a Python library, command line tool, and Web application that performs the ranking of significantly changing metabolite sets over different experimental conditions. The main algorithm in PALS is based on the pathway level analysis of gene expression (PLAGE) factorisation method and is denoted as mPLAGE (PLAGE for metabolomics). As an example of an application, PALS is used to analyse metabolites grouped as metabolic pathways and by shared tandem mass spectrometry fragmentation patterns. A comparison of mPLAGE with two other commonly used methods (overrepresentation analysis (ORA) and gene set enrichment analysis (GSEA)) is also given and reveals that mPLAGE is more robust to missing features and noisy data than the alternatives. As further examples, PALS is also applied to human African trypanosomiasis, Rhamnaceae, and American Gut Project data. In addition, normalisation can have a significant impact on pathway analysis results, and PALS offers a framework to further investigate this. PALS is freely available from our project Web site.
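
Of the methods compared in the abstract above, overrepresentation analysis (ORA) is the simplest to illustrate. The sketch below computes the usual one-sided hypergeometric tail probability for one metabolite set. It is a textbook formulation, not the PALS implementation and not the mPLAGE algorithm, and the function and argument names are hypothetical.

```python
from math import comb


def ora_p_value(n_background, n_in_set, n_hits, n_hits_in_set):
    """P(at least n_hits_in_set of the n_hits changing metabolites fall in
    a set of size n_in_set), drawing without replacement from a background
    of n_background metabolites."""
    upper = min(n_in_set, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_in_set, i) * comb(n_background - n_in_set, n_hits - i)
        for i in range(n_hits_in_set, upper + 1)
    )
    return tail / total


# 10 measured metabolites, a pathway of 3, and 3 significant metabolites
# that all fall inside the pathway.
p = ora_p_value(10, 3, 3, 3)
```

ORA counts only which metabolites pass a significance cut-off, so any metabolite that was not measured changes those counts. That dependence is one reason to expect the sensitivity to missing features that the abstract reports.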

15.
Metabolites ; 9(10), 2019 Oct 09.
Article in English | MEDLINE | ID: mdl-31600991

ABSTRACT

Liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) is widely used in identifying small molecules in untargeted metabolomics. Various strategies exist to acquire MS/MS fragmentation spectra; however, the development of new acquisition strategies is hampered by the lack of simulators that let researchers prototype, compare, and optimize strategies before validations on real machines. We introduce Virtual Metabolomics Mass Spectrometer (ViMMS), a metabolomics LC-MS/MS simulator framework that allows for scan-level control of the MS2 acquisition process in silico. ViMMS can generate new LC-MS/MS data based on empirical data or virtually re-run a previous LC-MS/MS analysis using pre-existing data to allow the testing of different fragmentation strategies. To demonstrate its utility, we show how ViMMS can be used to optimize N for Top-N data-dependent acquisition (DDA), giving results comparable to modifying N on the mass spectrometer. We expect that ViMMS will save method development time by allowing for offline evaluation of novel fragmentation strategies and optimization of the fragmentation strategy for a particular experiment.

16.
Metabolites ; 9(7), 2019 Jul 16.
Article in English | MEDLINE | ID: mdl-31315242

ABSTRACT

Metabolomics has started to embrace computational approaches for chemical interpretation of large data sets. Yet, metabolite annotation remains a key challenge. Recently, molecular networking and MS2LDA emerged as molecular mining tools that find molecular families and substructures in mass spectrometry fragmentation data. Moreover, in silico annotation tools obtain and rank candidate molecules for fragmentation spectra. Ideally, all structural information obtained and inferred from these computational tools could be combined to increase the resulting chemical insight one can obtain from a data set. However, integration is currently hampered as each tool has its own output format and efficient matching of data across these tools is lacking. Here, we introduce MolNetEnhancer, a workflow that combines the outputs from molecular networking, MS2LDA, in silico annotation tools (such as Network Annotation Propagation or DEREPLICATOR), and the automated chemical classification through ClassyFire to provide a more comprehensive chemical overview of metabolomics data whilst at the same time illuminating structural details for each fragmentation spectrum. We present examples from four plant and bacterial case studies and show how MolNetEnhancer enables the chemical annotation, visualization, and discovery of the subtle substructural diversity within molecular families. We conclude that MolNetEnhancer is a useful tool that greatly assists the metabolomics researcher in deciphering the metabolome through combination of multiple independent in silico pipelines.
