Results 1 - 19 of 19
1.
Environ Sci Process Impacts ; 21(11): 1835-1851, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31576380

ABSTRACT

Per- and polyfluoroalkyl substances (PFASs) are a large and diverse class of chemicals of great interest due to their wide commercial applicability, as well as increasing public concern regarding their adverse impacts. A common terminology for PFASs was recommended in 2011, including broad categorization and detailed naming for many PFASs with rather simple molecular structures. Recent advancements in chemical analysis have enabled identification of a wide variety of PFASs that are not covered by this common terminology. The resulting inconsistency in categorizing and naming of PFASs is preventing efficient assimilation of reported information. This article explores how a combination of expert knowledge and cheminformatics approaches could help address this challenge in a systematic manner. First, the "splitPFAS" approach was developed to systematically subdivide PFASs (for eventual categorization) following a CnF2n+1-X-R pattern into their various parts, with a particular focus on 4 PFAS categories where X is CO, SO2, CH2 and CH2CH2. Then, the open, ontology-based "ClassyFire" approach was tested for potential applicability to categorizing and naming PFASs using five scenarios of original and simplified structures based on the "splitPFAS" output. This workflow was applied to a set of 770 PFASs from the latest OECD PFAS list. While splitPFAS categorized PFASs as intended, the ClassyFire results were mixed. These results reveal that open cheminformatics approaches have the potential to assist in categorizing PFASs in a consistent manner, while much development is needed for future systematic naming of PFASs. The "splitPFAS" tool and related code are publicly available, and include options to extend this proof-of-concept to encompass further PFASs in the future.


Subject(s)
Cheminformatics , Fluorocarbons , Fluorocarbons/chemistry , Fluorocarbons/classification , Humans , Molecular Structure
2.
BMC Bioinformatics ; 20(1): 376, 2019 Jul 05.
Article in English | MEDLINE | ID: mdl-31277571

ABSTRACT

BACKGROUND: Molecule identification is a crucial step in metabolomics and environmental sciences. Besides in silico fragmentation, as performed by MetFrag, machine learning and statistical methods have also evolved, showing an improvement in molecule annotation based on MS/MS data. In this work we present a new statistical scoring method where annotations of m/z fragment peaks to fragment-structures are learned in a training step. Based on a Bayesian model, two additional scoring terms are integrated into the new MetFrag2.4.5 and evaluated on the test data set of the CASMI 2016 contest. RESULTS: The results on the 87 MS/MS spectra from positive and negative mode show a substantial improvement compared to submissions made by the former MetFrag approach. Top1 rankings increased from 5 to 21 and Top10 rankings from 39 to 55, both showing higher values than for CSI:IOKR, the winner of the CASMI 2016 contest. For the negative mode spectra, MetFrag's statistical scoring outperforms all other participants that submitted results for this type of spectra. CONCLUSIONS: This study shows how statistical learning can improve molecular structure identification based on MS/MS data compared with the same method using combinatorial in silico fragmentation only. MetFrag2.4.5 performs better than the other participating approaches, especially in negative mode.


Subject(s)
Metabolomics/methods , Tandem Mass Spectrometry/methods , Bayes Theorem , Computer Simulation , Molecular Structure
3.
Anal Bioanal Chem ; 411(19): 4683-4700, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31209548

ABSTRACT

Liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) is increasingly popular for the non-targeted exploration of complex samples, where tandem mass spectrometry (MS/MS) is used to characterize the structure of unknown compounds. However, mass spectra do not always contain sufficient information to unequivocally identify the correct structure. This study investigated how much additional information can be gained using hydrogen deuterium exchange (HDX) experiments. The exchange of "easily exchangeable" hydrogen atoms (connected to heteroatoms), with predominantly [M+D]+ ions in positive mode and [M-D]- in negative mode was observed. To enable high-throughput processing, new scoring terms were incorporated into the in silico fragmenter MetFrag. These were initially developed on small datasets and then tested on 762 compounds of environmental interest. Pairs of spectra (normal and deuterated) were found for 593 of these substances (506 positive mode, 155 negative mode spectra). The new scoring terms resulted in 29 additional correct identifications (78 vs 49) for positive mode and an increase in top 10 rankings from 80 to 106 in negative mode. Compounds with dual functionality (polar head group, long apolar tail) exhibited dramatic retention time (RT) shifts of up to several minutes, compared with an average 0.04 min RT shift. For a smaller dataset of 80 metabolites, top 10 rankings improved from 13 to 24 (positive mode, 57 spectra) and from 14 to 31 (negative mode, 63 spectra) when including HDX information. The results of standard measurements were confirmed using targets and tentatively identified surfactant species in an environmental sample collected from the river Danube near Novi Sad (Serbia). The changes to MetFrag have been integrated into the command line version available at http://c-ruttkies.github.io/MetFrag and all resulting spectra and compounds are available in online resources and in the Electronic Supplementary Material (ESM).

4.
Bioinformatics ; 35(19): 3752-3760, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-30851093

ABSTRACT

MOTIVATION: Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. RESULTS: We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and developing scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry, one nuclear magnetic resonance spectroscopy and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability using integration of the major software suites resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens up for new types of large-scale integrative science. AVAILABILITY AND IMPLEMENTATION: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Analysis , Metabolomics , Computational Biology , Software , Workflow
5.
Gigascience ; 8(2)2019 Feb 01.
Article in English | MEDLINE | ID: mdl-30535405

ABSTRACT

BACKGROUND: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding with applications across biomedical, biotechnological, and many other applied biological domains. Its computationally intensive nature has driven requirements for open data formats, data repositories, and data analysis tools. However, the rapid progress has resulted in a mosaic of independent, and sometimes incompatible, analysis methods that are difficult to connect into a useful and complete data analysis solution. FINDINGS: PhenoMeNal (Phenome and Metabolome aNalysis) is an advanced and complete solution to set up Infrastructure-as-a-Service (IaaS) that brings workflow-oriented, interoperable metabolomics data analysis platforms into the cloud. PhenoMeNal seamlessly integrates a wide array of existing open-source tools that are tested and packaged as Docker containers through the project's continuous integration process and deployed based on a Kubernetes orchestration framework. It also provides a number of standardized, automated, and published analysis workflows in the user interfaces Galaxy, Jupyter, Luigi, and Pachyderm. CONCLUSIONS: PhenoMeNal constitutes a keystone solution in cloud e-infrastructures available for metabolomics. PhenoMeNal is a unique and complete solution for setting up cloud e-infrastructures through easy-to-use web interfaces that can be scaled to any custom public and private cloud environment. By harmonizing and automating software installation and configuration and through ready-to-use scientific workflow user interfaces, PhenoMeNal has succeeded in providing scientists with workflow-driven, reproducible, and shareable metabolomics data analysis platforms that are interfaced through standard data formats, representative datasets, versioned, and have been tested for reproducibility and interoperability. The elastic implementation of PhenoMeNal further allows easy adaptation of the infrastructure to other application areas and 'omics research domains.


Subject(s)
Metabolomics/methods , Software , Cloud Computing , Humans , Workflow
6.
J Cheminform ; 10(1): 45, 2018 Aug 30.
Article in English | MEDLINE | ID: mdl-30167882

ABSTRACT

Chemical database searching has become a fixture in many non-targeted identification workflows based on high-resolution mass spectrometry (HRMS). However, the form of a chemical structure observed in HRMS does not always match the form stored in a database (e.g., the neutral form versus a salt; one component of a mixture rather than the mixture form used in a consumer product). Linking the form of a structure observed via HRMS to its related form(s) within a database will enable the return of all relevant variants of a structure, as well as the related metadata, in a single query. A Konstanz Information Miner (KNIME) workflow has been developed to produce structural representations observed using HRMS ("MS-Ready structures") and links them to those stored in a database. These MS-Ready structures, and associated mappings to the full chemical representations, are surfaced via the US EPA's Chemistry Dashboard ( https://comptox.epa.gov/dashboard/ ). This article describes the workflow for the generation and linking of ~ 700,000 MS-Ready structures (derived from ~ 760,000 original structures) as well as download, search and export capabilities to serve structure identification using HRMS. The importance of this form of structural representation for HRMS is demonstrated with several examples, including integration with the in silico fragmentation software application MetFrag. The structures, search, download and export functionality are all available through the CompTox Chemistry Dashboard, while the MetFrag implementation can be viewed at https://msbi.ipb-halle.de/MetFragBeta/ .

7.
Int J Mol Sci ; 19(5)2018 May 06.
Article in English | MEDLINE | ID: mdl-29734799

ABSTRACT

The relatively new research discipline of Eco-Metabolomics is the application of metabolomics techniques to ecology with the aim to characterise biochemical interactions of organisms across different spatial and temporal scales. Metabolomics is an untargeted biochemical approach to measure many thousands of metabolites in different species, including plants and animals. Changes in metabolite concentrations can provide mechanistic evidence for biochemical processes that are relevant at ecological scales. These include physiological, phenotypic and morphological responses of plants and communities to environmental changes and also interactions with other organisms. Traditionally, research in biochemistry and ecology comes from two different directions and is performed at distinct spatiotemporal scales. Biochemical studies most often focus on intrinsic processes in individuals at physiological and cellular scales. Generally, they take a bottom-up approach scaling up cellular processes from spatiotemporally fine to coarser scales. Ecological studies usually focus on extrinsic processes acting upon organisms at population and community scales and typically study top-down and bottom-up processes in combination. Eco-Metabolomics is a transdisciplinary research discipline that links biochemistry and ecology and connects the distinct spatiotemporal scales. In this review, we focus on approaches to study chemical and biochemical interactions of plants at various ecological levels, mainly plant-organismal interactions, and discuss related examples from other domains. We present recent developments and highlight advancements in Eco-Metabolomics over the last decade from various angles. We further address the five key challenges: (1) complex experimental designs and large variation of metabolite profiles; (2) feature extraction; (3) metabolite identification; (4) statistical analyses; and (5) bioinformatics software tools and workflows. The presented solutions to these challenges will advance connecting the distinct spatiotemporal scales and bridging biochemistry and ecology.


Subject(s)
Ecology , Metabolomics/trends , Plants/genetics , Plants/metabolism
8.
Anal Bioanal Chem ; 410(7): 1931-1941, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29380019

ABSTRACT

In nontarget screening, structure elucidation of small molecules from high resolution mass spectrometry (HRMS) data is challenging, particularly the selection of the most likely candidate structure among the many retrieved from compound databases. Several fragmentation and retention prediction methods have been developed to improve this candidate selection. In order to evaluate their performance, we compared two in silico fragmenters (MetFrag and CFM-ID) and two retention time prediction models (based on the chromatographic hydrophobicity index (CHI) and on log D). A set of 78 known organic micropollutants was analyzed by liquid chromatography coupled to a LTQ Orbitrap HRMS with electrospray ionization (ESI) in positive and negative mode using two fragmentation techniques with different collision energies. Both fragmenters (MetFrag and CFM-ID) performed well for most compounds, on average ranking the correct candidate structure within the top 25% for ESI+ and within the top 22 to 37% for ESI- mode. The rank of the correct candidate structure slightly improved when MetFrag and CFM-ID were combined. For unknown compounds detected in both ESI+ and ESI-, generally positive mode mass spectra were better for further structure elucidation. Both retention prediction models performed reasonably well for more hydrophobic compounds but not for early eluting hydrophilic substances. The log D prediction showed a better accuracy than the CHI model. Although the two fragmentation prediction methods are more diagnostic and sensitive for candidate selection, the inclusion of retention prediction by calculating a consensus score with optimized weighting can improve the ranking of correct candidates as compared to the individual methods. Graphical abstract: Consensus workflow for combining fragmentation and retention prediction in LC-HRMS-based micropollutant identification.
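
The consensus score mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the max-normalisation step and the weights `w_frag` and `w_rt` are assumptions chosen for illustration, and the abstract does not state the optimized weights. Scores are assumed to be non-negative.

```python
def normalise(scores):
    """Scale non-negative raw scores to [0, 1] by dividing by the maximum."""
    top = max(scores)
    return [s / top if top > 0 else 0.0 for s in scores]


def consensus_ranking(candidates, w_frag=0.7, w_rt=0.3):
    """Rank candidates by a weighted sum of normalised scores.

    candidates: list of dicts with hypothetical keys 'name',
    'frag_score' (fragmentation score) and 'rt_score' (retention score).
    w_frag, w_rt: illustrative weights, not values from the paper.
    Returns (name, consensus score) pairs, best first.
    """
    frag = normalise([c["frag_score"] for c in candidates])
    rt = normalise([c["rt_score"] for c in candidates])
    combined = [
        (c["name"], w_frag * f + w_rt * r)
        for c, f, r in zip(candidates, frag, rt)
    ]
    return sorted(combined, key=lambda pair: pair[1], reverse=True)


# Made-up example: A fragments slightly better, B fits retention far better.
example = [
    {"name": "A", "frag_score": 0.9, "rt_score": 0.2},
    {"name": "B", "frag_score": 0.8, "rt_score": 0.9},
]
ranking = consensus_ranking(example)
```

With these made-up numbers, candidate B overtakes A once retention information is weighted in, whereas setting `w_rt=0` reproduces the fragmentation-only order.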

9.
J Cheminform ; 9(1): 22, 2017 Mar 27.
Article in English | MEDLINE | ID: mdl-29086042

ABSTRACT

BACKGROUND: The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest ( www.casmi-contest.org ) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification. RESULTS: The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in "Category 2: Best Automatic Structural Identification-In Silico Fragmentation Only", won by Team Brouard with 41% challenge wins. The winner of "Category 3: Best Automatic Structural Identification-Full Information" was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways. CONCLUSIONS: The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for "known unknowns". As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for "real life" annotations. The true "unknown unknowns" remain to be evaluated in future CASMI contests.
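
The Top 1 and Top 10 figures quoted in abstracts such as this one are counts of challenges in which the correct candidate was ranked at or above a cut-off. A minimal Python sketch of that bookkeeping follows; the function name and the convention of using `None` for a correct candidate missing from the list are assumptions, not part of the contest rules.

```python
def top_k_counts(ranks, ks=(1, 10)):
    """Count challenges whose correct candidate ranks within each cut-off.

    ranks: rank of the correct candidate per challenge (1 = best),
    or None when the correct candidate was not in the candidate list.
    Returns a dict mapping each cut-off k to the number of hits.
    """
    return {
        k: sum(1 for r in ranks if r is not None and r <= k)
        for k in ks
    }


# Made-up ranks for six challenges.
example_ranks = [1, 3, 12, None, 1, 10]
counts = top_k_counts(example_ranks)  # {1: 2, 10: 4}
```

Dividing each count by the number of challenges gives the percentage form used in the abstract.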

10.
Chemosphere ; 184: 1186-1193, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28672699

ABSTRACT

The inclusion of new psychoactive substances (NPS) in the wastewater-based epidemiology approach presents challenges, such as the reduced number of users that translates into low concentrations of residues and the limited pharmacokinetics information available, which renders the choice of target biomarker difficult. The sampling during special social settings, the analysis with improved analytical techniques, and data processing with specific workflow to narrow the search, are required approaches for a successful monitoring. This work presents the application of a qualitative screening technique to wastewater samples collected during a city festival, where likely users of recreational substances gather and consequently higher residual concentrations of used NPS are expected. The analysis was performed using liquid chromatography coupled to high-resolution mass spectrometry. Data were processed using an algorithm that involves the extraction of accurate masses (calculated based on molecular formula) of expected m/z from an in-house database containing about 2,000 entries, including NPS and transformation products. We positively identified eight NPS belonging to the classes of synthetic cathinones, phenethylamines and opioids. In addition, the presence of benzodiazepine analogues, classical drugs and other licit substances with potential for abuse was confirmed. The screening workflow based on a database search was useful in the identification of NPS biomarkers in wastewater. The findings highlight the specific classical drugs and low NPS use in the Netherlands. Additionally, meta-chlorophenylpiperazine (mCPP), 2,5-dimethoxy-4-bromophenethylamine (2C-B), and 4-fluoroamphetamine (FA) were identified in wastewater for the first time.


Subject(s)
Psychotropic Drugs/analysis , Water Pollutants, Chemical/analysis , Amphetamines/analysis , Chromatography, Liquid/methods , Cities , Environmental Monitoring/methods , Holidays , Netherlands , Piperazines/analysis , Substance Abuse Detection/methods , Tandem Mass Spectrometry/methods , Wastewater/chemistry
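
The screening step described in the abstract above, extracting accurate masses calculated from molecular formulae, can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's algorithm: the function names, the simple formula parser (no brackets or charges), the small element table and the 5 ppm tolerance are all assumptions. The formula used in the example, C10H14BrNO2, is that of 2C-B.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes; table is
# deliberately limited to a few elements for this sketch.
MONO_MASS = {
    "H": 1.00782503207, "C": 12.0, "N": 14.0030740048,
    "O": 15.99491461956, "F": 18.99840322, "S": 31.97207100,
    "Cl": 34.96885268, "Br": 78.9183371,
}
PROTON_MASS = 1.00727646688  # Da


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C10H14BrNO2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO_MASS[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula):
    """Expected m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON_MASS


def within_ppm(observed, theoretical, tol_ppm=5.0):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


print(f"{monoisotopic_mass('C10H14BrNO2'):.4f}")  # 259.0208
print(f"{mz_protonated('C10H14BrNO2'):.4f}")      # 260.0281
```

A database search of this kind would then compare each measured m/z against the list of expected values using a tolerance check such as `within_ppm`.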
11.
J Biotechnol ; 261: 137-141, 2017 Nov 10.
Article in English | MEDLINE | ID: mdl-28554829

ABSTRACT

Metabolomics is the modern term for the field of small molecule research in biology and biochemistry. Currently, metabolomics is undergoing a transition where the classic analytical chemistry is combined with modern cheminformatics and bioinformatics methods, paving the way for large-scale data analysis. We give some background on past developments, highlight current state-of-the-art approaches, and give a perspective on future requirements.


Subject(s)
Biomedical Research , Computational Biology , Metabolomics , Mass Spectrometry
12.
PLoS One ; 12(3): e0172311, 2017.
Article in English | MEDLINE | ID: mdl-28278196

ABSTRACT

Lipid identification is a major bottleneck in high-throughput lipidomics studies. However, tools for the analysis of lipid tandem MS spectra are rather limited. While the comparison against spectra in reference libraries is one of the preferred methods, these libraries are far from being complete. In order to improve identification rates, the in silico fragmentation tool MetFrag was combined with Lipid Maps and lipid-class specific classifiers which calculate probabilities for lipid class assignments. The resulting LipidFrag workflow was trained and evaluated on different commercially available lipid standard materials, measured with data dependent UPLC-Q-ToF-MS/MS acquisition. The automatic analysis was compared against manual MS/MS spectra interpretation. With the lipid class specific models, identification of the true positives was improved especially for cases where candidate lipids from different lipid classes had similar MetFrag scores by removing up to 56% of false positive results. This LipidFrag approach was then applied to MS/MS spectra of lipid extracts of the nematode Caenorhabditis elegans. Fragments explained by LipidFrag match known fragmentation pathways, e.g., neutral losses of lipid headgroups and fatty acid side chain fragments. Based on prediction models trained on standard lipid materials, high probabilities for correct annotations were achieved, which makes LipidFrag a good choice for automated lipid data analysis and reliability testing of lipid identifications.


Subject(s)
Caenorhabditis elegans/metabolism , Lipids/analysis , Lipids/chemistry , Tandem Mass Spectrometry/methods , Animals , Caenorhabditis elegans/growth & development , Computer Simulation , Reproducibility of Results
13.
J Cheminform ; 8: 3, 2016.
Article in English | MEDLINE | ID: mdl-26834843

ABSTRACT

BACKGROUND: The in silico fragmenter MetFrag, launched in 2010, was one of the first approaches combining compound database searching and fragmentation prediction for small molecule identification from tandem mass spectrometry data. Since then many new approaches have evolved, as has MetFrag itself. This article details the latest developments to MetFrag and its use in small molecule identification since the original publication. RESULTS: MetFrag has gone through algorithmic and scoring refinements. New features include the retrieval of reference, data source and patent information via ChemSpider and PubChem web services, as well as InChIKey filtering to reduce candidate redundancy due to stereoisomerism. Candidates can be filtered or scored differently based on criteria like occurrence of certain elements and/or substructures prior to fragmentation, or presence in so-called "suspect lists". Retention time information can now be calculated either within MetFrag with a sufficient amount of user-provided retention times, or incorporated separately as "user-defined scores" to be included in candidate ranking. The changes to MetFrag were evaluated on the original dataset as well as a dataset of 473 merged high resolution tandem mass spectra (HR-MS/MS) and compared with another open source in silico fragmenter, CFM-ID. Using HR-MS/MS information only, MetFrag2.2 and CFM-ID had 30 and 43 Top 1 ranks, respectively, using PubChem as a database. Including reference and retention information in MetFrag2.2 improved this to 420 and 336 Top 1 ranks with ChemSpider and PubChem (89 and 71%), respectively, and even up to 343 Top 1 ranks (PubChem) when combining with CFM-ID. The optimal parameters and weights were verified using three additional datasets of 824 merged HR-MS/MS spectra in total. Further examples are given to demonstrate flexibility of the enhanced features. CONCLUSIONS: In many cases additional information is available from the experimental context to add to small molecule identification, which is especially useful where the mass spectrum alone is not sufficient for candidate selection from a large number of candidates. The results achieved with MetFrag2.2 clearly show the benefit of considering this additional information. The new functions greatly enhance the chance of identification success and have been incorporated into a command line interface in a flexible way designed to be integrated into high throughput workflows. Feedback on the command line version of MetFrag2.2 available at http://c-ruttkies.github.io/MetFrag/ is welcome.
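
The InChIKey filtering mentioned in this abstract relies on the fact that the first block of a standard InChIKey (the 14 characters before the first hyphen) encodes the molecular skeleton, so stereoisomers share it. A minimal Python sketch of such a filter is given below. It is not MetFrag's code: the function name, the dictionary keys and the choice to keep the highest-scoring entry per block are assumptions, and the InChIKeys in the example are made-up placeholder strings rather than real compounds.

```python
def filter_by_first_block(candidates):
    """Keep one candidate per InChIKey first block (highest score wins).

    candidates: list of dicts with hypothetical keys 'inchikey' and 'score'.
    Returns the surviving candidates sorted by descending score.
    """
    best = {}
    for cand in candidates:
        block = cand["inchikey"].split("-")[0]  # connectivity layer only
        if block not in best or cand["score"] > best[block]["score"]:
            best[block] = cand
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)


# Placeholder keys: the first two differ only in the second block, as
# stereoisomers would; the third has a different skeleton.
example = [
    {"inchikey": "AAAAAAAAAAAAAA-BBBBBBBBSA-N", "score": 0.61},
    {"inchikey": "AAAAAAAAAAAAAA-CCCCCCCCSA-N", "score": 0.64},
    {"inchikey": "DDDDDDDDDDDDDD-UHFFFAOYSA-N", "score": 0.58},
]
filtered = filter_by_first_block(example)
```

In this made-up case the three candidates collapse to two, which shortens the candidate list without discarding any distinct skeleton.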

14.
Sci Total Environ ; 544: 1073-118, 2016 Feb 15.
Article in English | MEDLINE | ID: mdl-26779957

ABSTRACT

Aquatic environments are often contaminated with complex mixtures of chemicals that may pose a risk to ecosystems and human health. This contamination cannot be addressed with target analysis alone but tools are required to reduce this complexity and identify those chemicals that might cause adverse effects. Effect-directed analysis (EDA) is designed to meet this challenge and faces increasing interest in water and sediment quality monitoring. Thus, the present paper summarizes current experience with the EDA approach and the tools required, and provides practical advice on their application. The paper highlights the need for proper problem formulation and gives general advice for study design. As the EDA approach is directed by toxicity, basic principles for the selection of bioassays are given as well as a comprehensive compilation of appropriate assays, including their strengths and weaknesses. A specific focus is given to strategies for sampling, extraction and bioassay dosing since they strongly impact prioritization of toxicants in EDA. Reduction of sample complexity mainly relies on fractionation procedures, which are discussed in this paper, including quality assurance and quality control. Automated combinations of fractionation, biotesting and chemical analysis using so-called hyphenated tools can enhance the throughput and might reduce the risk of artifacts in laboratory work. The key to determining the chemical structures causing effects is analytical toxicant identification. The latest approaches, tools, software and databases for target-, suspect and non-target screening as well as unknown identification are discussed together with analytical and toxicological confirmation approaches. A better understanding of optimal use and combination of EDA tools will help to design efficient and successful toxicant identification studies in the context of quality monitoring in multiply stressed environments.


Subject(s)
Environmental Monitoring/methods , Biological Assay , Ecosystem , Hazardous Substances/analysis , Risk Assessment
16.
Rapid Commun Mass Spectrom ; 29(16): 1521-9, 2015 Aug 30.
Article in English | MEDLINE | ID: mdl-26212167

ABSTRACT

RATIONALE: Gas chromatography (GC) coupled to atmospheric pressure chemical ionization quadrupole time-of-flight mass spectrometry (APCI-QTOFMS) is an emerging technology in metabolomics. Reference spectra for GC/APCI-MS/MS barely exist; therefore, in silico fragmentation approaches and structure databases are prerequisites for annotation. To expand the limited coverage of derivatised structures in structure databases, in silico derivatisation procedures are required. METHODS: A cheminformatics workflow has been developed for in silico derivatisation of compounds found in KEGG and PubChem, and validated on the Golm Metabolome Database (GMD). To demonstrate this workflow, these in silico generated databases were applied together with MetFrag to APCI-MS/MS spectra acquired from GC/APCI-MS/MS profiles of Arabidopsis thaliana and Solanum tuberosum. The Metabolite-Likeness of the original candidate structure was included as additional scoring term aiming at candidate structures of natural origin. RESULTS: The validation of our in silico derivatisation workflow on the GMD showed a true positive rate of 94%. MetFrag was applied to two datasets. In silico derivatisation of the KEGG and PubChem database served as a candidate source. For both datasets the Metabolite-Likeness score improved the identification performance. The derivatised data sources have been included into the MetFrag web application for the annotation of GC/APCI-MS/MS spectra. CONCLUSIONS: We demonstrated that MetFrag can support the identification of components from GC/APCI-MS/MS profiles, especially in the (common) case where reference spectra are not available. This workflow can be easily adapted to other types of derivatisation and is freely accessible together with the generated structure databases.


Subject(s)
Data Curation/methods , Databases, Chemical , Gas Chromatography-Mass Spectrometry/methods , Software , Tandem Mass Spectrometry/methods , Computer Simulation , Internet , Models, Chemical , Plant Extracts/analysis , Plant Extracts/chemistry , Reproducibility of Results
17.
Mass Spectrom (Tokyo) ; 3(Spec Iss 2): S0036, 2014.
Article in English | MEDLINE | ID: mdl-26819879

ABSTRACT

The second Critical Assessment of Small Molecule Identification (CASMI) contest took place in 2013. A joint team from the Swiss Federal Institute of Aquatic Science and Technology (Eawag) and Leibniz Institute of Plant Biochemistry (IPB) participated in CASMI 2013 with an automatic workflow-style entry. MOLGEN-MS/MS was used for Category 1, molecular formula calculation, restricted by the information given for each challenge. MetFrag and MetFusion were used for Category 2, structure identification, retrieving candidates from the compound databases KEGG, PubChem and ChemSpider and joining these lists pre-submission. The results from Category 1 were used to guide whether formula or exact mass searches were performed for Category 2. The Category 2 results were impressive considering the database size and automated regime used, although these could not compete with the manual approach of the contest winner. The Category 1 results were affected by large m/z and ppm values in the challenge data, where strategies beyond pure enumeration from other participants were more successful. However, the combination used for the CASMI 2013 entries was extremely useful for developing decision-making criteria for automatic, high throughput general unknown (non-target) identification and for future contests.

18.
Biochim Biophys Acta ; 1830(4): 2994-3004, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23375722

ABSTRACT

BACKGROUND: Elastin is a vital protein and the major component of elastic fibers which provides resilience to many vertebrate tissues. Elastin's structure and function are influenced by extensive cross-linking, however, the cross-linking pattern is still unknown. METHODS: Small peptides containing reactive allysine residues based on sequences of cross-linking domains of human elastin were incubated in vitro to form cross-links characteristic of mature elastin. The resultant insoluble polymeric biomaterials were studied by scanning electron microscopy. Both, the supernatants of the samples and the insoluble polymers, after digestion with pancreatic elastase or trypsin, were furthermore comprehensively characterized on the molecular level using MALDI-TOF/TOF mass spectrometry. RESULTS: MS(2) data was used to develop the software PolyLinX, which is able to sequence not only linear and bifunctionally cross-linked peptides, but for the first time also tri- and tetrafunctionally cross-linked species. Thus, it was possible to identify intra- and intermolecular cross-links including allysine aldols, dehydrolysinonorleucines and dehydromerodesmosines. The formation of the tetrafunctional cross-link desmosine or isodesmosine was unexpected, however, could be confirmed by tandem mass spectrometry and molecular dynamics simulations. CONCLUSIONS: The study demonstrated that it is possible to produce biopolymers containing polyfunctional cross-links characteristic of mature elastin from small elastin peptides. MALDI-TOF/TOF mass spectrometry and the newly developed software PolyLinX proved suitable for sequencing of native cross-links in proteolytic digests of elastin-like biomaterials. GENERAL SIGNIFICANCE: The study provides important insight into the formation of native elastin cross-links and represents a considerable step towards the characterization of the complex cross-linking pattern of mature elastin.


Subject(s)
Elastin/chemistry , Amino Acid Sequence , Humans , Molecular Dynamics Simulation , Molecular Sequence Data , Protein Structure, Tertiary , Software , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
19.
Metabolites ; 3(3): 623-36, 2013 Aug 05.
Article in English | MEDLINE | ID: mdl-24958142

ABSTRACT

The task in the critical assessment of small molecule identification (CASMI) contest category 2 was to determine the identification of (initially) unknown compounds for which high-resolution tandem mass spectra were published. We focused on computer-assisted methods that tried to correctly identify the compound automatically and entered the contest with MetFrag and MetFusion to score candidate structures retrieved from the PubChem structure database. MetFrag was combined with the metabolite-likeness score, which helped to improve the performance for the natural product challenges. We present the results, discuss the performance, and give details of how to interpret the MetFrag and MetFusion output.
