1.
PLoS Comput Biol ; 17(2): e1008724, 2021 02.
Article in English | MEDLINE | ID: mdl-33591968

ABSTRACT

Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses such as library matching and molecular networking. Although weaknesses in the relationship between spectral similarity scores and the true structural similarities have been described, little development of alternative scores has been undertaken. Here, we introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm, Word2Vec. Spec2Vec learns fragmental relationships within a large set of spectral data to derive abstract spectral embeddings that can be used to assess spectral similarities. Using data derived from GNPS MS/MS libraries including spectra for nearly 13,000 unique molecules, we show how Spec2Vec scores correlate better with structural similarity than cosine-based scores. We demonstrate the advantages of Spec2Vec in library matching and molecular networking. Spec2Vec is computationally more scalable, allowing structural analogue searches in large databases within seconds.
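As a rough illustration of how two spectral embeddings might be compared, the sketch below computes the cosine similarity between two vectors. It assumes that embeddings are compared by cosine similarity, which the abstract does not state; the function name `cosine_similarity` and the example vectors are hypothetical, and this is not the Spec2Vec implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # A zero vector has no direction; return 0.0 by convention.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up embeddings, for illustration only.
embedding_a = [0.2, 0.7, 0.1]
embedding_b = [0.25, 0.6, 0.0]
print(cosine_similarity(embedding_a, embedding_b))
```

Because each comparison is a single dot product over fixed-length vectors, scoring one query against a large library is cheap, which is one plausible reading of the scalability claim.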


Subject(s)
Algorithms , Computational Biology/methods , Gene Library , Metabolomics/methods , Tandem Mass Spectrometry/methods , Computer Simulation , Databases, Factual , False Positive Reactions , Machine Learning , Natural Language Processing , Reproducibility of Results
2.
Faraday Discuss ; 218(0): 284-302, 2019 08 15.
Article in English | MEDLINE | ID: mdl-31120050

ABSTRACT

Complex metabolite mixtures are challenging to unravel. Mass spectrometry (MS) is a widely used and sensitive technique for obtaining structural information of complex mixtures. However, just knowing the molecular masses of the mixture's constituents is almost always insufficient for confident assignment of the associated chemical structures. Structural information can be augmented through MS fragmentation experiments whereby detected metabolites are fragmented, giving rise to MS/MS spectra. However, how can we maximize the structural information we gain from fragmentation spectra? We recently proposed a substructure-based strategy to enhance metabolite annotation for complex mixtures by considering metabolites as the sum of (bio)chemically relevant moieties that we can detect through mass spectrometry fragmentation approaches. Our MS2LDA tool allows us to discover - unsupervised - groups of mass fragments and/or neutral losses, termed Mass2Motifs, that often correspond to substructures. After manual annotation, these Mass2Motifs can be used in subsequent MS2LDA analyses of new datasets, thereby providing structural annotations for many molecules that are not present in spectral databases. Here, we describe how additional strategies, taking advantage of (i) combinatorial in silico matching of experimental mass features to substructures of candidate molecules, and (ii) automated machine learning classification of molecules, can facilitate semi-automated annotation of substructures. We show how our approach accelerates the Mass2Motif annotation process and therefore broadens the chemical space spanned by characterized motifs. Our machine learning model used to classify fragmentation spectra learns the relationships between fragment spectra and chemical features. Classification prediction on these features can be aggregated for all molecules that contribute to a particular Mass2Motif and guide Mass2Motif annotations. 
To make annotated Mass2Motifs available to the community, we also present MotifDB: an open database of Mass2Motifs that can be browsed and accessed programmatically through an Application Programming Interface (API). MotifDB is integrated within ms2lda.org, allowing users to efficiently search for characterized motifs in their own experiments. We expect that with an increasing number of Mass2Motif annotations available through a growing database, we can more quickly gain insight into the constituents of complex mixtures. This will allow prioritization towards novel or unexpected chemistries and faster recognition of known biochemical building blocks.


Subject(s)
Automation , Complex Mixtures/analysis , Complex Mixtures/metabolism , Supervised Machine Learning , Unsupervised Machine Learning , Databases, Factual , Tandem Mass Spectrometry
3.
J Chem Inf Model ; 59(7): 3191-3197, 2019 07 22.
Article in English | MEDLINE | ID: mdl-31260292

ABSTRACT

We present the QMflows Python package for quantum chemistry workflow automatization. QMflows allows users to write complex workflows in terms of simple Python scripts. It supports the development of interoperable workflows involving multiple quantum chemistry codes and executes them efficiently on large scale parallel computers. This open source library provides standardized interfaces to a number of quantum chemistry packages and can be easily extended to accommodate additional codes. QMflows features are described and illustrated with a number of representative applications.


Subject(s)
Chemical Phenomena , Organic Chemicals/chemistry , Automation , Computer Simulation , Models, Chemical , Software
5.
Plant Physiol ; 172(4): 2516-2529, 2016 12.
Article in English | MEDLINE | ID: mdl-27803191

ABSTRACT

Somatic embryogenesis receptor kinases (SERKs) are ligand-binding coreceptors that are able to combine with different ligand-perceiving receptors such as BRASSINOSTEROID INSENSITIVE1 (BRI1) and FLAGELLIN-SENSITIVE2. Phenotypical analysis of serk single mutants is not straightforward because multiple pathways can be affected, while redundancy is observed for a single phenotype. For example, serk1serk3 double mutant roots are insensitive toward brassinosteroids but have a phenotype different from bri1 mutant roots. To decipher these effects, 4-d-old Arabidopsis (Arabidopsis thaliana) roots were studied using microarray analysis. A total of 698 genes, involved in multiple biological processes, were found to be differentially regulated in serk1-3serk3-2 double mutants. About half of these are related to brassinosteroid signaling. The remainder appear to be unlinked to brassinosteroids and related to primary and secondary metabolism. In addition, methionine-derived glucosinolate biosynthesis genes are up-regulated, which was verified by metabolite profiling. The results also show that the gene expression pattern in serk3-2 mutant roots is similar to that of the serk1-3serk3-2 double mutant roots. This confirms the existence of partial redundancy between SERK3 and SERK1 as well as the promoting or repressive activity of a single coreceptor in multiple simultaneously active pathways.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Gene Expression Regulation, Plant , Mutation/genetics , Protein Kinases/genetics , Protein Serine-Threonine Kinases/genetics , Transcription, Genetic , Alleles , Brassinosteroids/metabolism , Gene Expression Profiling , Gene Expression Regulation, Plant/drug effects , Glucosinolates/pharmacology , Metabolome/drug effects , Multivariate Analysis , Phenotype , Plant Roots/drug effects , Plant Roots/genetics , Transcription, Genetic/drug effects
6.
Exp Physiol ; 102(1): 86-99, 2017 01 01.
Article in English | MEDLINE | ID: mdl-27808433

ABSTRACT

NEW FINDINGS: What is the central question of this study? Exercise is known to induce stress-related physiological responses, such as changes in intestinal barrier function. Our aim was to determine the test-retest repeatability of these responses in well-trained individuals. What is the main finding and its importance? Responses to strenuous exercise, as indicated by stress-related markers such as intestinal integrity markers and myokines, showed high test-retest variation. Even in well-trained young men an adapted response is seen after a single repetition after 1 week. This finding has implications for the design of studies aimed at evaluating physiological responses to exercise. Strenuous exercise induces different stress-related physiological changes, potentially including changes in intestinal barrier function. In the Protégé Study (ISRCTN14236739; www.isrctn.com), we determined the test-retest repeatability in responses to exercise in well-trained individuals. Eleven well-trained men (27 ± 4 years old) completed an exercise protocol that consisted of intensive cycling intervals, followed by an overnight fast and an additional 90 min cycling phase at 50% of maximal workload the next morning. The day before (rest), and immediately after the exercise protocol (exercise) a lactulose and rhamnose solution was ingested. Markers of energy metabolism, lactulose-to-rhamnose ratio, several cytokines and potential stress-related markers were measured at rest and during exercise. In addition, untargeted urine metabolite profiles were obtained. The complete procedure (Test) was repeated 1 week later (Retest) to assess repeatability. Metabolic effect parameters with regard to energy metabolism and urine metabolomics were similar for both the Test and Retest period, underlining comparable exercise load. 
Following exercise, intestinal permeability (1 h plasma lactulose-to-rhamnose ratio) and the serum interleukin-6, interleukin-10, fibroblast growth factor-21 and muscle creatine kinase concentrations were significantly increased compared with rest only during the first test and not when the test was repeated. Responses to strenuous exercise in well-trained young men, as indicated by intestinal markers and myokines, show adaptation in Test-Retest outcome. This might be attributable to a carry-over effect of the defense mechanisms triggered during the Test. This finding has implications for the design of studies aimed at evaluating physiological responses to exercise.


Subject(s)
Adaptation, Physiological/physiology , Exercise/physiology , Stress, Physiological/physiology , Adult , Biomarkers/metabolism , Creatine Kinase/metabolism , Cytokines/metabolism , Energy Metabolism/physiology , Fibroblast Growth Factors/metabolism , Humans , Interleukin-10/metabolism , Interleukin-6/metabolism , Intestinal Mucosa/metabolism , Lactulose/metabolism , Male , Permeability , Rest/physiology , Rhamnose/metabolism , Urine/chemistry , Young Adult
7.
J Chem Inf Model ; 57(2): 115-121, 2017 02 27.
Article in English | MEDLINE | ID: mdl-28125221

ABSTRACT

3D-e-Chem-VM is an open source, freely available Virtual Machine ( http://3d-e-chem.github.io/3D-e-Chem-VM/ ) that integrates cheminformatics and bioinformatics tools for the analysis of protein-ligand interaction data. 3D-e-Chem-VM consists of software libraries, and database and workflow tools that can analyze and combine small molecule and protein structural information in a graphical programming environment. New chemical and biological data analytics tools and workflows have been developed for the efficient exploitation of structural and pharmacological protein-ligand interaction data from proteomewide databases (e.g., ChEMBLdb and PDB), as well as customized information systems focused on, e.g., G protein-coupled receptors (GPCRdb) and protein kinases (KLIFS). The integrated structural cheminformatics research infrastructure compiled in the 3D-e-Chem-VM enables the design of new approaches in virtual ligand screening (Chemdb4VS), ligand-based metabolism prediction (SyGMa), and structure-based protein binding site comparison and bioisosteric replacement for ligand design (KRIPOdb).


Subject(s)
Informatics/methods , Drug Design , Ligands , Protein Kinases/metabolism , Receptors, G-Protein-Coupled/metabolism , Software , User-Computer Interface
8.
Anal Chem ; 86(10): 4767-74, 2014 May 20.
Article in English | MEDLINE | ID: mdl-24779709

ABSTRACT

The colonic breakdown and human biotransformation of small molecules present in food can give rise to a large variety of potentially bioactive metabolites in the human body. However, the absence of reference data for many of these components limits their identification in complex biological samples, such as plasma and urine. We present an in silico workflow for automatic chemical annotation of metabolite profiling data from liquid chromatography coupled with multistage accurate mass spectrometry (LC-MS(n)), which we used to systematically screen for the presence of tea-derived metabolites in human urine samples after green tea consumption. Reaction rules for intestinal degradation and human biotransformation were systematically applied to chemical structures of 75 green tea components, resulting in a virtual library of 27,245 potential metabolites. All matching precursor ions in the urine LC-MS(n) data sets, as well as the corresponding fragment ions, were automatically annotated by in silico generated (sub)structures. The results were evaluated based on 74 previously identified urinary metabolites and led to the putative identification of 26 additional green tea-derived metabolites. A total of 77% of all annotated metabolites were not present in the PubChem database, demonstrating the benefit of in silico metabolite prediction for the automatic annotation of yet unknown metabolites in LC-MS(n) data from nutritional metabolite profiling experiments.


Subject(s)
Tea/chemistry , Urine/chemistry , Biotransformation , Chromatography, Liquid , Computer Simulation , Humans , Intestinal Mucosa/metabolism , Tandem Mass Spectrometry
9.
Anal Chem ; 85(12): 6033-40, 2013 Jun 18.
Article in English | MEDLINE | ID: mdl-23662787

ABSTRACT

Liquid chromatography coupled with multistage accurate mass spectrometry (LC-MS(n)) can generate comprehensive spectral information of metabolites in crude extracts. To support structural characterization of the many metabolites present in such complex samples, we present a novel method ( http://www.emetabolomics.org/magma ) to automatically process and annotate the LC-MS(n) data sets on the basis of candidate molecules from chemical databases, such as PubChem or the Human Metabolite Database. Multistage MS(n) spectral data is automatically annotated with hierarchical trees of in silico generated substructures of candidate molecules to explain the observed fragment ions and alternative candidates are ranked on the basis of the calculated matching score. We tested this method on an untargeted LC-MS(n) (n ≤ 3) data set of a green tea extract, generated on an LC-LTQ/Orbitrap hybrid MS system. For the 623 spectral trees obtained in a single LC-MS(n) run, a total of 116,240 candidate molecules with monoisotopic masses matching within 5 ppm mass accuracy were retrieved from the PubChem database, ranging from 4 to 1327 candidates per molecular ion. The matching scores were used to rank the candidate molecules for each LC-MS(n) component. The median and third quartile fractional ranks for 85 previously identified tea compounds were 3.5 and 7.5, respectively. The substructure annotations and rankings provided detailed structural information of the detected components, beyond annotation with elemental formula only. Twenty-four additional components were putatively identified by expert interpretation of the automatically annotated data set, illustrating the potential to support systematic and untargeted metabolite identification.
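The candidate retrieval step, which keeps molecules whose monoisotopic mass matches within 5 ppm, can be sketched as follows. This is a minimal illustration and not the MAGMa code: the function names, the candidate labels, and the masses are hypothetical, and only the 5 ppm tolerance comes from the abstract.

```python
def ppm_error(observed_mass, candidate_mass):
    """Signed mass error of the observation relative to the candidate, in ppm."""
    return (observed_mass - candidate_mass) / candidate_mass * 1e6


def within_tolerance(observed_mass, candidate_mass, tol_ppm=5.0):
    """True if the observed mass lies within tol_ppm of the candidate mass."""
    return abs(ppm_error(observed_mass, candidate_mass)) <= tol_ppm


def filter_candidates(observed_mass, candidates, tol_ppm=5.0):
    """Keep the names of candidates whose mass matches the observed mass.

    candidates is an iterable of (name, monoisotopic_mass) pairs.
    """
    return [
        name
        for name, mass in candidates
        if within_tolerance(observed_mass, mass, tol_ppm)
    ]


# Hypothetical candidates and masses, for illustration only.
candidates = [("candidate_A", 290.0800), ("candidate_B", 290.0810)]
print(filter_candidates(290.0790, candidates))
```

A relative (ppm) tolerance scales with the mass, so the absolute window widens for heavier ions; this is why the number of retrieved candidates per molecular ion can vary so widely.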


Subject(s)
Metabolome/physiology , Plant Extracts/chemistry , Plant Extracts/metabolism , Tandem Mass Spectrometry/methods , Tea/chemistry , Tea/metabolism , Automation, Laboratory/methods , Chromatography, Liquid/methods , Mass Spectrometry/methods , Plant Extracts/analysis
10.
Biochemistry ; 51(8): 1774-86, 2012 Feb 28.
Article in English | MEDLINE | ID: mdl-22280021

ABSTRACT

Soluble epoxide hydrolase (sEH) is an enzyme involved in drug metabolism that catalyzes the hydrolysis of epoxides to form their corresponding diols. sEH has a broad substrate range and shows high regio- and enantioselectivity for nucleophilic ring opening by Asp333. Epoxide hydrolases therefore have potential synthetic applications. We have used combined quantum mechanics/molecular mechanics (QM/MM) umbrella sampling molecular dynamics (MD) simulations (at the AM1/CHARMM22 level) and high-level ab initio (SCS-MP2) QM/MM calculations to analyze the reactions, and determinants of selectivity, for two substrates: trans-stilbene oxide (t-SO) and trans-diphenylpropene oxide (t-DPPO). The calculated free energy barriers from the QM/MM (AM1/CHARMM22) umbrella sampling MD simulations show a lower barrier for phenyl attack in t-DPPO, compared with that for benzylic attack, in agreement with experiment. Activation barriers in agreement with experimental rate constants are obtained only with the highest level of QM theory (SCS-MP2) used. Our results show that the selectivity of the ring-opening reaction is influenced by several factors, including proximity to the nucleophile, electronic stabilization of the transition state, and hydrogen bonding to two active site tyrosine residues. The protonation state of His523 during nucleophilic attack has also been investigated, and our results show that the protonated form is most consistent with experimental findings. The work presented here illustrates how determinants of selectivity can be identified from QM/MM simulations. These insights may also provide useful information for the design of novel catalysts for use in the synthesis of enantiopure compounds.


Subject(s)
Epoxide Hydrolases/chemistry , Epoxy Compounds/chemistry , Molecular Dynamics Simulation , Catalysis , Catalytic Domain , Epoxide Hydrolases/metabolism , Hydrogen Bonding , Models, Molecular , Quantum Theory , Stereoisomerism , Stilbenes/chemistry
11.
Anal Chem ; 84(16): 7263-71, 2012 Aug 21.
Article in English | MEDLINE | ID: mdl-22827565

ABSTRACT

In dietary polyphenol exposure studies, annotation and identification of urinary metabolites present at low (micromolar) concentrations are major obstacles. To determine the biological activity of specific components, it is necessary to have the correct structures and the quantification of the polyphenol-derived conjugates present in the human body. We present a procedure for identification and quantification of metabolites and conjugates excreted in human urine after single bolus intake of black or green tea. A combination of a solid-phase extraction (SPE) preparation step and two high pressure liquid chromatography (HPLC)-based analytical platforms was used, namely, accurate mass fragmentation (HPLC-FTMS(n)) and mass-guided SPE-trapping of selected compounds for nuclear magnetic resonance spectroscopy (NMR) measurements (HPLC-TOFMS-SPE-NMR). HPLC-FTMS(n) analysis led to the annotation of 138 urinary metabolites, including 48 valerolactone and valeric acid conjugates. By combining the results from MS(n) fragmentation with the one-dimensional (1D)-(1)H NMR spectra of HPLC-TOFMS-SPE-trapped compounds, we elucidated the structures of 36 phenolic conjugates, including the glucuronides of 3',4'-di- and 3',4',5'-trihydroxyphenyl-γ-valerolactone, three urolithin glucuronides, and indole-3-acetic acid glucuronide. We also obtained 26 h-quantitative excretion profiles for specific valerolactone conjugates. The combination of the HPLC-FTMS(n) and HPLC-TOFMS-SPE-NMR platforms results in the efficient identification and quantification of less abundant phenolic conjugates down to nanomoles of trapped amounts of metabolite corresponding to micromolar metabolite concentrations in urine.


Subject(s)
Drinking , Phenol/chemistry , Phenol/urine , Tea/chemistry , Urinalysis/methods , Chromatography, High Pressure Liquid , Humans , Magnetic Resonance Spectroscopy , Mass Spectrometry , Phenol/metabolism , Solid Phase Extraction , Tilidine/chemistry
12.
Rapid Commun Mass Spectrom ; 26(20): 2461-71, 2012 Oct 30.
Article in English | MEDLINE | ID: mdl-22976213

ABSTRACT

RATIONALE: High-resolution multistage MS(n) data contains detailed information that can be used for structural elucidation of compounds observed in metabolomics studies. However, full exploitation of this complex data requires significant analysis efforts by human experts. In silico methods currently used to support data annotation by assigning substructures of candidate molecules are limited to a single level of MS fragmentation. METHODS: We present an extended substructure-based approach which allows annotation of hierarchical spectral trees obtained from high-resolution multistage MS(n) experiments. The algorithm yields a hierarchical tree of substructures of a candidate molecule to explain the fragment peaks observed at consecutive levels of the multistage MS(n) spectral tree. A matching score is calculated that indicates how well the candidate structure can explain the observed hierarchical fragmentation pattern. RESULTS: The method is applied to MS(n) spectral trees of a set of compounds representing important chemical classes in metabolomics. Based on the calculated score, the correct molecules were successfully prioritized among extensive sets of candidate structures retrieved from the PubChem database. CONCLUSIONS: The results indicate that the inclusion of subsequent levels of fragmentation in the automatic annotation of MS(n) data improves the identification of the correct compounds. We show that, especially in the case of lower mass accuracy, this improvement is not only due to the inclusion of additional fragment ions in the analysis, but also to the specific hierarchical information present in the MS(n) spectral trees. This method may significantly reduce the time required by MS experts to analyze complex MS(n) data.


Subject(s)
Algorithms , Mass Spectrometry/methods , Databases, Factual , Metabolomics/methods
13.
J Cheminform ; 13(1): 84, 2021 Oct 29.
Article in English | MEDLINE | ID: mdl-34715914

ABSTRACT

Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity but this approach is strongly limited by a generally poor correlation between both metrics. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of > 100,000 mass spectra of about 15,000 unique known compounds, we trained MS2DeepScore to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model's prediction uncertainty. On 3600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly-reliable structural matches and to predict Tanimoto scores for pairs of molecules based on their fragment spectra with a root mean squared error of about 0.15. Furthermore, the prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. Furthermore, we demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. 
Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity measures have great potential for a range of metabolomics data processing pipelines.
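The two quantities the abstract reports, Tanimoto scores and root mean squared error, can be sketched in a few lines. This is a minimal illustration under stated assumptions: fingerprints are represented as sets of "on" bit indices, and the function names and example values are hypothetical. It shows only how the evaluation metrics are defined, not the MS2DeepScore model.

```python
import math


def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) score between two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    # Two empty fingerprints share nothing measurable; return 0.0 by convention.
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rmse(predicted, actual):
    """Root mean squared error between predicted and reference scores."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical fingerprints and predictions, for illustration only.
reference = tanimoto({1, 2, 3}, {2, 3, 4})  # 2 shared bits out of 4 in total
print(reference, rmse([0.6], [reference]))
```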

14.
Nat Commun ; 12(1): 7068, 2021 12 03.
Article in English | MEDLINE | ID: mdl-34862392

ABSTRACT

Three-dimensional (3D) structures of protein complexes provide fundamental information to decipher biological processes at the molecular scale. The vast amount of experimentally and computationally resolved protein-protein interfaces (PPIs) offers the possibility of training deep learning models to aid the predictions of their biological relevance. We present here DeepRank, a general, configurable deep learning framework for data mining PPIs using 3D convolutional neural networks (CNNs). DeepRank maps features of PPIs onto 3D grids and trains a user-specified CNN on these 3D grids. DeepRank allows for efficient training of 3D CNNs with data sets containing millions of PPIs and supports both classification and regression. We demonstrate the performance of DeepRank on two distinct challenges: The classification of biological versus crystallographic PPIs, and the ranking of docking models. For both problems DeepRank is competitive with, or outperforms, state-of-the-art methods, demonstrating the versatility of the framework for research in structural biology.


Subject(s)
Data Mining/methods , Deep Learning , Protein Interaction Mapping/methods , Crystallography , Datasets as Topic , Molecular Docking Simulation , Protein Interaction Domains and Motifs , Protein Interaction Maps
15.
Sci Rep ; 11(1): 24, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33420133

ABSTRACT

Accurate and low-cost sleep measurement tools are needed in both clinical and epidemiological research. To this end, wearable accelerometers are widely used as they are both low in price and provide reasonably accurate estimates of movement. Techniques to classify sleep from the high-resolution accelerometer data primarily rely on heuristic algorithms. In this paper, we explore the potential of detecting sleep using Random forests. Models were trained using data from three different studies where 134 adult participants (70 with sleep disorder and 64 good healthy sleepers) wore an accelerometer on their wrist during a one-night polysomnography recording in the clinic. The Random forests were able to distinguish sleep-wake states with an F1 score of 73.93% on a previously unseen test set of 24 participants. Detecting when the accelerometer is not worn was also successful using machine learning ([Formula: see text]), and when combined with our sleep detection models on day-time data provide a sleep estimate that is correlated with self-reported habitual nap behaviour ([Formula: see text]). These Random forest models have been made open-source to aid further research. In line with literature, sleep stage classification turned out to be difficult using only accelerometer data.
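For readers unfamiliar with the reported metric, the sketch below shows how an F1 score for a single positive class is computed from per-epoch labels. The label names, the function name `f1_score`, and the example data are hypothetical, and the abstract does not say whether its F1 score was computed for one class or averaged over classes.

```python
def f1_score(true_labels, predicted_labels, positive="sleep"):
    """F1 score (harmonic mean of precision and recall) for one positive class."""
    tp = fp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive and truth == positive:
            tp += 1  # correctly detected positive epoch
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # missed positive epoch
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-epoch labels, for illustration only.
truth = ["sleep", "sleep", "wake", "wake"]
guess = ["sleep", "wake", "sleep", "wake"]
print(f1_score(truth, guess))
```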


Subject(s)
Accelerometry/methods , Polysomnography/methods , Sleep/physiology , Accelerometry/instrumentation , Accelerometry/statistics & numerical data , Adolescent , Adult , Aged , Algorithms , Deep Learning , Female , Humans , Machine Learning , Male , Middle Aged , Polysomnography/instrumentation , Polysomnography/statistics & numerical data , Sleep Stages , Sleep Wake Disorders/diagnosis , Wearable Electronic Devices , Young Adult
16.
PeerJ Comput Sci ; 6: e281, 2020.
Article in English | MEDLINE | ID: mdl-33816932

ABSTRACT

It is essential for the advancement of science that researchers share, reuse and reproduce each other's workflows and protocols. The FAIR principles are a set of guidelines that aim to maximize the value and usefulness of research data, and emphasize the importance of making digital objects findable and reusable by others. The question of how to apply these principles not just to data but also to the workflows and protocols that consume and produce them is still under debate and poses a number of challenges. In this paper we describe a two-fold approach of simultaneously applying the FAIR principles to scientific workflows as well as the involved data. We apply and evaluate our approach on the case of the PREDICT workflow, a highly cited drug repurposing workflow. This includes FAIRification of the involved datasets, as well as applying semantic technologies to represent and store data about the detailed versions of the general protocol, of the concrete workflow instructions, and of their execution traces. We propose a semantic model to address these specific requirements and evaluate it by answering competency questions. This semantic model consists of classes and relations from a number of existing ontologies, including Workflow4ever, PROV, EDAM, and BPMN. This allowed us then to formulate and answer new kinds of competency questions. Our evaluation shows the high degree to which our FAIRified OpenPREDICT workflow now adheres to the FAIR principles and the practicality and usefulness of being able to answer our new competency questions.

17.
Article in English | MEDLINE | ID: mdl-25806366

ABSTRACT

Metabolite annotation and identification are primary challenges in untargeted metabolomics experiments. Rigorous workflows for reliable annotation of mass features with chemical structures or compound classes are needed to enhance the power of untargeted mass spectrometry. High-resolution mass spectrometry considerably improves the confidence in assigning elemental formulas to mass features in comparison to nominal mass spectrometry, and embedding of fragmentation methods enables more reliable metabolite annotations and facilitates metabolite classification. However, the analysis of mass fragmentation spectra can be a time-consuming step and requires expert knowledge. This study demonstrates how characteristic fragmentations, specific to compound classes, can be used to systematically analyze their presence in complex biological extracts like urine that have undergone untargeted mass spectrometry combined with data dependent or targeted fragmentation. Human urine extracts were analyzed using normal phase liquid chromatography (hydrophilic interaction chromatography) coupled to an Ion Trap-Orbitrap hybrid instrument. Subsequently, mass chromatograms and collision-induced dissociation and higher-energy collisional dissociation (HCD) fragments were annotated using the freely available MAGMa software. Acylcarnitines play a central role in energy metabolism by transporting fatty acids into the mitochondrial matrix. By filtering on a combination of a mass fragment and neutral loss designed based on the MAGMa fragment annotations, we were able to classify and annotate 50 acylcarnitines in human urine extracts, based on high-resolution mass spectrometry HCD fragmentation spectra at different energies for all of them. Of these annotated acylcarnitines, 31 are not described in HMDB yet and for only 4 annotated acylcarnitines the fragmentation spectra could be matched to reference spectra. 
Therefore, we conclude that the use of mass fragmentation filters within the context of untargeted metabolomics experiments is a valuable tool to enhance the annotation of small metabolites.
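A mass fragmentation filter of the kind described, requiring both a diagnostic fragment and a diagnostic neutral loss, might look like the sketch below. The abstract does not give the m/z values used, so the fragment, the neutral loss, the tolerance, and the example spectrum here are hypothetical placeholders, as is the function name `passes_filter`.

```python
def passes_filter(precursor_mz, fragment_mzs, diagnostic_fragment,
                  diagnostic_loss, tol=0.005):
    """True if a spectrum shows both the diagnostic fragment and the neutral loss.

    A neutral loss is the difference between the precursor m/z and a fragment m/z.
    """
    has_fragment = any(
        abs(mz - diagnostic_fragment) <= tol for mz in fragment_mzs
    )
    has_loss = any(
        abs((precursor_mz - mz) - diagnostic_loss) <= tol for mz in fragment_mzs
    )
    return has_fragment and has_loss


# Placeholder values, not taken from the abstract.
DIAGNOSTIC_FRAGMENT = 85.03
DIAGNOSTIC_LOSS = 59.07

spectrum = [85.03, 145.05]  # hypothetical fragment m/z values
print(passes_filter(204.12, spectrum, DIAGNOSTIC_FRAGMENT, DIAGNOSTIC_LOSS, tol=0.01))
```

Requiring both conditions, rather than either one, trades some sensitivity for specificity when screening a whole run for one compound class.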

18.
Curr Top Med Chem ; 3(11): 1241-56, 2003.
Article in English | MEDLINE | ID: mdl-12769703

ABSTRACT

An overview of the combined quantum mechanical/molecular mechanical (QM/MM) approach and its application to studies of biotransformation enzymes and drug metabolism is given. Theoretical methods to simulate enzymatic reactions have rapidly developed during the last decade. In particular, QM/MM methods provide detailed insights into enzyme catalyzed reactions, which can be extremely valuable in complementing experimental research. QM/MM methods allow the reacting groups in the active site of an enzyme to be studied at a quantum mechanical level, while the surrounding protein and solvent are included at a classical (and computationally less expensive) molecular mechanical level. Existing QM/MM implementations vary in the level of interaction between the QM and MM regions and in the way the partitioning into QM and MM regions is set up. Some general considerations concerning reaction modeling are discussed and a number of QM/MM studies related to drug metabolism are described. These studies illustrate that theoretical modeling of important metabolic reactions provides detailed insights into mechanisms of reaction and specific catalytic effects of enzyme residues as well as explaining variation in rates of conversion of different metabolites. Such information is essential in the development of methods to predict metabolism of drugs and to understand metabolic effects of genetic polymorphism in biotransformation enzymes.


Subject(s)
Enzymes/chemistry , Enzymes/metabolism , Models, Chemical , Animals , Biotransformation , Catalysis , Computer Simulation , Humans , Models, Molecular , Quantum Theory , Structure-Activity Relationship , Thermodynamics
19.
Mass Spectrom (Tokyo) ; 3(Spec Iss 2): S0033, 2014.
Article in English | MEDLINE | ID: mdl-26819876

ABSTRACT

The MAGMa software for automatic annotation of mass spectrometry-based fragmentation data was applied to 16 MS/MS datasets of the CASMI 2013 contest. Eight solutions were submitted in category 1 (molecular formula assignment) and twelve in category 2 (molecular structure assignment). The MS/MS peaks of each challenge were matched with in silico generated substructures of candidate molecules from PubChem, resulting in penalty scores that were used for candidate ranking. In 6 of the 12 submitted solutions in category 2, the correct chemical structure obtained the best score, whereas 3 molecules were ranked outside the top 5. All top-ranked molecular formulas submitted in category 1 were correct. In addition, we present MAGMa results generated retrospectively for the remaining challenges. Successful application of the MAGMa algorithm required inclusion of the relevant candidate molecules, application of the appropriate mass tolerance and a sufficient degree of in silico fragmentation of the candidate molecules. Furthermore, the effect of the exhaustiveness of the candidate lists and limitations of substructure-based scoring are discussed.
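The general idea of matching MS/MS peaks to in silico substructures within a mass tolerance and ranking candidates by a penalty score can be sketched as below. This is a toy illustration only and not MAGMa's actual scoring: the penalty values, the fixed `unmatched_penalty`, the ppm tolerance, and all function names are assumptions made for the example.

```python
def within_tolerance(peak_mz, fragment_mz, ppm):
    """True if the observed peak lies within +/- ppm of the substructure m/z."""
    return abs(peak_mz - fragment_mz) <= fragment_mz * ppm * 1e-6


def candidate_penalty(peaks, fragments, ppm=5.0, unmatched_penalty=10.0):
    """Sum, over all peaks, the lowest penalty of any matching substructure.

    fragments: list of (m/z, penalty) pairs for one candidate's in silico
    substructures. Peaks with no match receive a fixed (assumed) penalty.
    """
    total = 0.0
    for peak_mz in peaks:
        matches = [penalty for frag_mz, penalty in fragments
                   if within_tolerance(peak_mz, frag_mz, ppm)]
        total += min(matches) if matches else unmatched_penalty
    return total


def rank_candidates(peaks, candidates, ppm=5.0):
    """Return (name, penalty) pairs sorted so the best (lowest) score is first."""
    scored = [(name, candidate_penalty(peaks, frags, ppm))
              for name, frags in candidates.items()]
    return sorted(scored, key=lambda item: item[1])
```

Even in this simplified form, the sketch shows why the abstract stresses the mass tolerance and the completeness of the candidate list: a tolerance that is too tight leaves peaks unmatched, and a correct structure that is absent from `candidates` can never be ranked.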

20.
Bioanalysis ; 5(17): 2115-28, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23962251

ABSTRACT

BACKGROUND: Comprehensive identification of human drug metabolites in first-in-man studies is crucial to avoid delays in later stages of drug development. We developed an efficient workflow for systematic identification of human metabolites in plasma or serum that combines metabolite prediction, high-resolution accurate-mass LC-MS and MS vendor-independent data processing. Retrospective evaluation of predictions for 14 ¹⁴C-ADME studies published in the period 2007-January 2012 indicates that on average 90% of the major metabolites in human plasma can be identified by searching for accurate masses of predicted metabolites. Furthermore, the workflow can identify unexpected metabolites in the same processing run, by differential analysis of samples of drug-dosed subjects and (placebo-dosed, pre-dose or otherwise blank) control samples. To demonstrate the utility of the workflow, we applied it to identify tamoxifen metabolites in serum of a breast cancer patient treated with tamoxifen. RESULTS & CONCLUSION: Previously published metabolites were confirmed in this study and additional metabolites were identified, two of which are discussed to illustrate the advantages of the workflow.
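The two search strategies mentioned in this abstract (looking up accurate masses of predicted metabolites, and differential analysis of dosed versus control samples) can be sketched as follows. This is a simplified illustration under stated assumptions, not the published workflow: the function names, the 5 ppm tolerance, the use of [M+H]+ ions only, and the idea of expressing predictions as mass shifts relative to the parent drug are all hypothetical choices for the example.

```python
PROTON_MASS = 1.007276  # mass added for an [M+H]+ ion (assumed ionization mode)


def predicted_mz(parent_mass, shifts):
    """Map each predicted biotransformation to its expected [M+H]+ m/z.

    shifts: dict of {transformation name: mass difference vs. the parent drug}.
    """
    return {name: parent_mass + delta + PROTON_MASS
            for name, delta in shifts.items()}


def same_mass(a, b, ppm):
    """True if a lies within +/- ppm of b."""
    return abs(a - b) <= b * ppm * 1e-6


def annotate(features, targets, ppm=5.0):
    """Return (observed m/z, transformation name) for features matching a prediction."""
    return [(mz, name)
            for mz in features
            for name, target in targets.items()
            if same_mass(mz, target, ppm)]


def drug_related(dosed, control, ppm=5.0):
    """Differential analysis: keep features seen in dosed but not in control samples."""
    return [mz for mz in dosed
            if not any(same_mass(mz, c, ppm) for c in control)]
```

In practice, features returned by `drug_related` that `annotate` leaves unmatched would be the candidates for unexpected metabolites, which is the case the abstract describes as being handled in the same processing run.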


Subject(s)
Antineoplastic Agents, Hormonal/blood , Breast Neoplasms/blood , Tamoxifen/blood , Antineoplastic Agents, Hormonal/therapeutic use , Biotransformation , Breast Neoplasms/drug therapy , Chromatography, Liquid , Data Interpretation, Statistical , Female , Humans , Mass Spectrometry , Tamoxifen/therapeutic use