ABSTRACT
Progress in mass spectrometry lipidomics has led to a rapid proliferation of studies across biology and biomedicine. These generate extremely large raw datasets requiring sophisticated solutions to support automated data processing. To address this, numerous software tools have been developed and tailored for specific tasks. However, for researchers, deciding which approach best suits their application relies on ad hoc testing, which is inefficient and time consuming. Here we first review the data processing pipeline, summarizing the scope of available tools. Next, to support researchers, LIPID MAPS provides an interactive online portal listing open-access tools with a graphical user interface. This guides users towards appropriate solutions within major areas in data processing, including (1) lipid-oriented databases, (2) mass spectrometry data repositories, (3) analysis of targeted lipidomics datasets, (4) lipid identification and (5) quantification from untargeted lipidomics datasets, (6) statistical analysis and visualization, and (7) data integration solutions. Detailed descriptions of functions and requirements are provided to guide customized data analysis workflows.
Subjects
Computational Biology , Lipidomics , Computational Biology/methods , Software , Informatics , Lipids/chemistry
ABSTRACT
BACKGROUND: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. RESULTS: Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. CONCLUSIONS: transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.
Subjects
Software , Transcriptome , RNA Sequence Analysis/methods , RNA-Seq , Gene Expression Profiling , Molecular Sequence Annotation
ABSTRACT
Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.
Subjects
Biological Products/chemistry , Mass Spectrometry , Computational Biology/methods , Factual Databases , Metabolomics/methods , Software
ABSTRACT
As a means to maintain their sessile lifestyle amid challenging environments, plants produce an enormous diversity of compounds as chemical defenses against biotic and abiotic insults. The underpinning metabolic pathways that support the biosynthesis of these specialized chemicals in divergent plant species provide a rich arena for understanding the molecular evolution of complex metabolic traits. Rosmarinic acid (RA) is a phenolic natural product first discovered in plants of the mint family (Lamiaceae) and is recognized for its wide range of medicinal properties and potential applications in human dietary and medical interventions. Interestingly, the RA chemotype is present sporadically in multiple taxa of flowering plants as well as some hornworts and ferns, prompting the question whether its biosynthesis arose independently across different lineages. Here we report the elucidation of the RA biosynthetic pathway in Phacelia campanularia (desert bells). This species represents the borage family (Boraginaceae), an RA-producing family closely related to the Lamiaceae within the Lamiids clade. Using a multi-omics approach in combination with functional characterization of candidate genes both in vitro and in vivo, we found that RA biosynthesis in P. campanularia involves specific activities of a BAHD acyltransferase and two cytochrome P450 hydroxylases. Further phylogenetic and comparative structure-function analyses of the P. campanularia RA biosynthetic enzymes clearly indicate that RA biosynthesis has evolved independently at least twice in the Lamiids, an exemplary case of chemotypic convergence through disparate evolutionary trajectories.
Subjects
Cinnamates/metabolism , Depsides/metabolism , Molecular Evolution , Lamiaceae/metabolism , Acyltransferases/genetics , Acyltransferases/metabolism , Biosynthetic Pathways , Cytochrome P-450 Enzyme System/genetics , Cytochrome P-450 Enzyme System/metabolism , Lamiaceae/classification , Lamiaceae/genetics , Metabolic Networks and Pathways , Phylogeny , Plant Proteins/genetics , Plant Proteins/metabolism , Rosmarinic Acid
ABSTRACT
Comparing newly obtained and previously known nucleotide and amino-acid sequences underpins modern biological research. BLAST is a well-established tool for such comparisons but is challenging to use on new data sets. We combined a user-centric design philosophy with sustainable software development approaches to create Sequenceserver, a tool for running BLAST and visually inspecting BLAST results for biological interpretation. Sequenceserver uses simple algorithms to prevent potential analysis errors and provides flexible text-based and visual outputs to support researcher productivity. Our software can be rapidly installed for use by individuals or on shared servers.
Subjects
Computational Biology/methods , Genetic Techniques , Software
ABSTRACT
Humans perceive physical information about the surrounding environment through their senses. This physical information is registered by a collection of highly evolved and finely tuned molecular sensory receptors. A multitude of bioactive, structurally diverse ligands have evolved in nature that bind these molecular receptors. The complex, dynamic interactions between the ligands and the receptors lead to changes in our sensory perception or mood. Here, we review our current knowledge of natural products and their derived analogues that interact specifically with human G protein-coupled receptors, ion channels, and nuclear hormone receptors to modulate the sensations of taste, smell, temperature, pain, and itch, as well as mood and its associated behaviour. We discuss the molecular and structural mechanisms underlying such interactions and highlight cases where subtle differences in natural product chemistry produce drastic changes in functional outcome. We also discuss cases where a single compound triggers complex sensory or behavioural changes in humans through multiple mechanistic targets. Finally, we comment on the therapeutic potential of the reviewed area of research and draw attention to recent technological developments in genomics, metabolomics, and metabolic engineering that allow us to tap the medicinal properties of natural product chemistry without taxing nature.
Subjects
Biological Products/pharmacology , Sensation/drug effects , Biological Products/chemistry , Humans , Ion Channels/metabolism , Cytoplasmic and Nuclear Receptors/metabolism , G Protein-Coupled Receptors/metabolism
ABSTRACT
Sesquiterpene scaffolds are the core backbones of many medicinally and industrially important natural products. A plethora of sesquiterpene synthases, widely present in bacteria, fungi, and plants, catalyze the formation of these intricate structures often with multiple stereocenters starting from linear farnesyl diphosphate substrates. Recent advances in next-generation sequencing and metabolomics technologies have greatly facilitated gene discovery for sesquiterpene synthases. However, a major bottleneck limits biochemical characterization of recombinant sesquiterpene synthases: the absolute structural elucidation of the derived sesquiterpene products. Here, we report the identification and biochemical characterization of LphTPS-A, a sesquiterpene synthase from the red macroalga Laurencia pacifica. Using the combination of transcriptomics, sesquiterpene synthase expression in yeast, and microgram-scale nuclear magnetic resonance-coupled crystalline sponge X-ray diffraction analysis, we resolved the absolute stereochemistry of prespatane, the major sesquiterpene product of LphTPS-A, and thereby functionally define LphTPS-A as the first bourbonane-producing sesquiterpene synthase and the first biochemically characterized sesquiterpene synthase from red algae. Our study showcases a workflow integrating multiomics approaches, synthetic biology, and the crystalline sponge method, which is generally applicable for uncovering new terpene chemistry and biochemistry from source-limited living organisms.
ABSTRACT
Living organisms have evolved multiple sophisticated mechanisms to deal with reactive oxygen species. We constructed a collection of twelve single-gene deletion strains of the fission yeast Schizosaccharomyces pombe designed for the study of oxidative and heavy metal stress responses. This collection contains deletions of biosynthetic enzymes of glutathione (Δgcs1 and Δgsa1), phytochelatin (Δpcs2), ubiquinone (Δabc1) and ergothioneine (Δegt1), as well as catalase (Δctt1), thioredoxins (Δtrx1 and Δtrx2), Cu/Zn- and Mn- superoxide dismutases (SODs; Δsod1 and Δsod2), sulfiredoxin (Δsrx1) and sulfide-quinone oxidoreductase (Δhmt2). First, we employed metabolomic analysis to examine the mutants of the glutathione biosynthetic pathway. We found that ophthalmic acid was produced by the same enzymes as glutathione in S. pombe. The identical genetic background of the strains allowed us to assess the severity of the individual gene knockouts by treating the deletion strains with oxidative agents. Among other results, we found that glutathione deletion strains were not particularly sensitive to peroxide or superoxide, but highly sensitive to cadmium stress. Our results show the astonishing diversity in cellular adaptation mechanisms to various types of oxidative and metal stress and provide a useful tool for further research into stress responses.
Subjects
Heavy Metals/toxicity , Oxidative Stress , Schizosaccharomyces/physiology , Biosynthetic Pathways , Gene Deletion , Glutathione/genetics , Oligopeptides/biosynthesis , Schizosaccharomyces/classification , Schizosaccharomyces/drug effects , Schizosaccharomyces/genetics , Schizosaccharomyces pombe Proteins/genetics , Schizosaccharomyces pombe Proteins/metabolism , Physiological Stress
ABSTRACT
Modern separation methods in conjunction with high-resolution accurate mass (HRAM) spectrometry can provide an enormous number of features characterized by exact mass and chromatographic behavior. Higher mass resolving power usually requires longer scanning times, and thus fewer data points are acquired across the target peak. This could cause difficulties for quantification, feature detection and deconvolution. The aim of this work was to describe the influence of mass spectrometry resolving power on profiling metabolomics experiments. From metabolic databases (HMDB, LipidMaps, KEGG), a list of compounds (41,474) was compiled and potential adducts and isotopes were calculated (622,110 features). The number of distinguishable masses was calculated for up to 3840k resolution. To evaluate these models, human plasma samples were analyzed by LC-HRMS on an Orbitrap Elite hybrid mass spectrometer (Thermo Fisher Scientific, CA, USA) at resolving power settings of 15k (7.8 Hz) up to a maximum of 480k (1.2 Hz). Software XCMS 1.44, MZmine 2.13.1, and Compound Discoverer 2.0.0.303 were used for evaluation. In plasma samples, the number of detected features increased sharply up to 60k in both positive and negative mode. However, beyond these values, it either flattened out or decreased owing to technical limitations. In conclusion, the most effective mass resolving powers for profiling analyses of metabolite-rich biofluids on the Orbitrap Elite were around 60,000-120,000 FWHM to retrieve the highest amount of information. The region between 400-800 m/z was influenced the most by resolution.
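As a rough illustration of the "number of distinguishable masses" idea in this abstract, the sketch below counts how many signals remain separated at a given resolving power. It assumes the FWHM definition R = m/Δm and treats two neighbouring masses as merged when their spacing is smaller than m/R. The merging criterion, the constant resolving power across the mass range, and the example m/z values are illustrative assumptions, not the authors' model.

```python
def count_distinguishable(masses, resolving_power):
    """Count signals that stay separated at a given resolving power.

    Assumption: two neighbouring masses merge into one signal when their
    spacing is smaller than the peak width m / R (FWHM definition).
    """
    groups = 0
    previous = None
    for m in sorted(masses):
        # A new signal starts when the gap to the previous mass is resolvable.
        if previous is None or (m - previous) >= m / resolving_power:
            groups += 1
        previous = m
    return groups


# Hypothetical example: three near-isobaric m/z values around 400.
example = [400.0000, 400.0040, 400.0300]
for r in (15_000, 60_000, 120_000):
    print(r, count_distinguishable(example, r))
```

In this toy case the three values collapse into one signal at 15k, two at 60k and three at 120k. Real instruments add further constraints, such as the scan-speed trade-off the abstract describes.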
Subjects
Lipids/blood , Metabolomics , Liquid Chromatography , Computer Simulation , Factual Databases , Healthy Volunteers , Humans , Mass Spectrometry , Molecular Structure
ABSTRACT
The fission yeast Schizosaccharomyces pombe is a popular model organism in molecular biology and cell physiology. With its ease of genetic manipulation and growth, supported by in-depth functional annotations in the PomBase database and genome-wide metabolic models, S. pombe is an attractive option for synthetic biology applications. However, S. pombe currently lacks modular tools for generating genetic circuits with more than 1 transcriptional unit. We developed a toolkit to address this gap. Adapted from the MoClo-YTK plasmid kit for Saccharomyces cerevisiae and using the same modular cloning grammar, our POMBOX toolkit is designed to facilitate fast, efficient, and modular construction of genetic circuits in S. pombe. It allows for interoperability when working with DNA sequences that are functional in both S. cerevisiae and S. pombe (e.g., protein tags, antibiotic resistance cassettes, and coding sequences). Moreover, POMBOX enables the modular assembly of multigene pathways and increases the possible pathway length from 6 to 12 transcriptional units. We also adapted the stable integration vector homology arms to Golden Gate assembly and tested the genomic integration success rates depending on different sequence sizes, from 4 to 24 kb. We included 14 S. pombe promoters that we characterized using two fluorescent proteins, in both minimally defined (EMM2, Edinburgh minimal media) and complex (YES, yeast extract with supplements) media. Then, we examined the efficacy of 6 S. cerevisiae and 6 synthetic terminators in S. pombe. Finally, we used the POMBOX kit for a synthetic biology application in metabolic engineering and expressed plant enzymes in S. pombe to produce specialized metabolite precursors, namely, methylxanthine, amorpha-4,11-diene, and cinnamic acid from the purine, mevalonate, and aromatic amino acid pathways.
Subjects
Schizosaccharomyces , Schizosaccharomyces/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Synthetic Biology , Plasmids/genetics , Molecular Cloning
ABSTRACT
Plant specialized metabolites have diversified vastly over the course of plant evolution, and they are considered key players in complex interactions between plants and their environment. The chemical diversity of these metabolites has been widely explored and utilized in agriculture and crop enhancement, the food industry, and drug development, among other areas. However, the immensity of the plant metabolome can make its exploration challenging. Here we describe a protocol for exploring plant specialized metabolites that combines high-resolution mass spectrometry and computational metabolomics strategies, including molecular networking, identification of structural motifs, as well as prediction of chemical structures and metabolite classes.
Subjects
Mass Spectrometry , Metabolome , Metabolomics , Plants , Metabolomics/methods , Plants/metabolism , Mass Spectrometry/methods , Computational Biology/methods
ABSTRACT
Untargeted mass spectrometry (MS) experiments produce complex, multidimensional data that are practically impossible to investigate manually. For this reason, computational pipelines are needed to extract relevant information from raw spectral data and convert it into a more comprehensible format. Depending on the sample type and/or goal of the study, a variety of MS platforms can be used for such analysis. MZmine is an open-source software for the processing of raw spectral data generated by different MS platforms. Examples include liquid chromatography-MS, gas chromatography-MS and MS-imaging. These data might typically be associated with various applications including metabolomics and lipidomics. Moreover, the third version of the software, described herein, supports the processing of ion mobility spectrometry (IMS) data. The present protocol provides three distinct procedures to perform feature detection and annotation of untargeted MS data produced by different instrumental setups: liquid chromatography-(IMS-)MS, gas chromatography-MS and (IMS-)MS imaging. For training purposes, example datasets are provided together with configuration batch files (i.e., list of processing steps and parameters) to allow new users to easily replicate the described workflows. Depending on the number of data files and available computing resources, we anticipate this to take between 2 and 24 h for new MZmine users and nonexperts. Within each procedure, we provide a detailed description for all processing parameters together with instructions/recommendations for their optimization. The main generated outputs are represented by aligned feature tables and fragmentation spectra lists that can be used by other third-party tools for further downstream analysis.
Subjects
Mass Spectrometry , Software , Mass Spectrometry/methods , Liquid Chromatography/methods , Metabolomics/methods , Reproducibility of Results , Ion Mobility Spectrometry/methods , Gas Chromatography-Mass Spectrometry/methods
ABSTRACT
Feature-based molecular networking (FBMN) is a popular analysis approach for liquid chromatography-tandem mass spectrometry-based non-targeted metabolomics data. While processing liquid chromatography-tandem mass spectrometry data through FBMN is fairly streamlined, downstream data handling and statistical interrogation are often a key bottleneck. Especially users new to statistical analysis struggle to effectively handle and analyze complex data matrices. Here we provide a comprehensive guide for the statistical analysis of FBMN results, focusing on the downstream analysis of the FBMN output table. We explain the data structure and principles of data cleanup and normalization, as well as uni- and multivariate statistical analysis of FBMN results. We provide explanations and code in two scripting languages (R and Python) as well as the QIIME2 framework for all protocol steps, from data clean-up to statistical analysis. All code is shared in the form of Jupyter Notebooks ( https://github.com/Functional-Metabolomics-Lab/FBMN-STATS ). Additionally, the protocol is accompanied by a web application with a graphical user interface ( https://fbmn-statsguide.gnps2.org/ ) to lower the barrier of entry for new users and for educational purposes. Finally, we also show users how to integrate their statistical results into the molecular network using the Cytoscape visualization tool. Throughout the protocol, we use a previously published environmental metabolomics dataset for demonstration purposes. Together, the protocol, code and web application provide a complete guide and toolbox for FBMN data integration, cleanup and advanced statistical analysis, enabling new users to uncover molecular insights from their non-targeted metabolomics data. Our protocol is tailored for the seamless analysis of FBMN results from Global Natural Products Social Molecular Networking and can be easily adapted to other mass spectrometry feature detection, annotation and networking tools.
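The data cleanup and normalization steps mentioned in this abstract can be sketched on a toy feature table. The abstract does not say which methods the protocol uses, so the two shown here (blank-feature removal by a blank-to-sample intensity ratio, then total-ion-current normalization) are assumed for illustration. The table layout, the function names and the 0.3 cutoff are also hypothetical.

```python
def remove_blank_features(table, blank_cols, sample_cols, cutoff=0.3):
    """Keep features whose mean blank intensity is below cutoff * mean sample intensity.

    `cutoff` is a hypothetical default chosen for this example.
    """
    kept = {}
    for feature, row in table.items():
        blank_mean = sum(row[c] for c in blank_cols) / len(blank_cols)
        sample_mean = sum(row[c] for c in sample_cols) / len(sample_cols)
        if sample_mean > 0 and blank_mean / sample_mean < cutoff:
            kept[feature] = row
    return kept


def tic_normalize(table, sample_cols):
    """Divide each intensity by the summed intensity of its sample column."""
    totals = {c: sum(row[c] for row in table.values()) for c in sample_cols}
    return {
        feature: {c: row[c] / totals[c] for c in sample_cols}
        for feature, row in table.items()
    }


# Toy feature table: feature ID -> intensities per file (made-up numbers).
table = {
    "F1": {"blank": 5, "s1": 100, "s2": 300},
    "F2": {"blank": 90, "s1": 100, "s2": 100},  # mostly background
    "F3": {"blank": 0, "s1": 300, "s2": 100},
}
cleaned = remove_blank_features(table, ["blank"], ["s1", "s2"])
normalized = tic_normalize(cleaned, ["s1", "s2"])
print(sorted(cleaned))
print(normalized["F1"])
```

Here F2 is discarded as background, and the remaining intensities are rescaled so that each sample column sums to 1.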
ABSTRACT
Regulation of proliferation and quiescence in response to nutritional cues is important for medicine and basic biology. The fission yeast Schizosaccharomyces pombe serves as a model, owing to the shift of proliferating cells to metabolically active quiescence (designated G0 phase hereafter) in response to a low nitrogen source. S. pombe G0 phase cells remain alive for months without growth or division. Nitrogen replenishment reinstates the vegetative proliferation phase (designated VEG). Some 40 genes required for G0 maintenance have been identified, but many more remain to be identified. Here we show, using mutants, that the proteasome is required for maintaining G0 quiescence. The functional outcomes of the proteasome in the G0 and VEG phases appear to be distinct. Upon proteasome dysfunction, a number of antioxidant proteins and compounds responsive to ROS (reactive oxygen species) are produced. In addition, autophagy-mediated destruction of mitochondria occurs, which suppresses the loss of viability by eliminating ROS-generating mitochondria. These defensive responses are found in G0 but not in VEG, suggesting that the main function of the proteasome in G0 phase homeostasis is to minimize ROS. The proteasome and autophagy thus collaborate to support the lifespan of S. pombe in G0 phase.
Subjects
Autophagy/physiology , Longevity/physiology , Mitochondria/physiology , Proteasome Endopeptidase Complex/physiology , Schizosaccharomyces/growth & development , Autophagy/genetics , Cell Proliferation , Fungal Gene Expression Regulation , Longevity/genetics , Mitochondria/genetics , Nitrogen/metabolism , Proteasome Endopeptidase Complex/genetics , Reactive Oxygen Species/metabolism , Cell Cycle Resting Phase/genetics , Cell Cycle Resting Phase/physiology , Schizosaccharomyces/genetics
ABSTRACT
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
ABSTRACT
Mass spectrometry is commonly applied to qualitatively and quantitatively profile small molecules, such as peptides, metabolites, or lipids. Modern mass spectrometers provide accurate measurements of mass-to-charge ratios of ions, with errors as low as 1 ppm. Even such high mass accuracy, however, is not sufficient to determine the unique chemical formula of each ion, and additional algorithms are necessary. Here we present a universal software tool for predicting chemical formulas from high-resolution mass spectrometry data, developed within the MZmine 2 framework. The tool is based on the use of a combination of heuristic techniques, including MS/MS fragmentation analysis and isotope pattern matching. The performance of the tool was evaluated using a real metabolomic data set obtained with the Orbitrap MS detector. The true formula was correctly determined as the highest-ranking candidate for 79% of the tested compounds. The novel isotope pattern-scoring algorithm outperformed a previously published method in 64% of the tested Orbitrap spectra. The software described in this manuscript is freely available and its source code can be accessed within the MZmine 2 source code repository.
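To make the mass-accuracy argument concrete, the following sketch filters candidate formulas by their ppm deviation from a measured mass. The helper names, the two candidate formulas and the measured value are hypothetical. The sketch shows only the mass-tolerance step, not the MS/MS or isotope-pattern heuristics of the tool described above.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def monoisotopic_mass(formula):
    """Mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(measured, theoretical):
    """Relative deviation of a measured mass in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def candidates_within(measured, formulas, tolerance_ppm):
    """Names of formulas whose mass lies within the ppm tolerance."""
    return [
        name
        for name, formula in formulas.items()
        if abs(ppm_error(measured, monoisotopic_mass(formula))) <= tolerance_ppm
    ]


# Hypothetical candidates with nominal mass 180.
formulas = {
    "C6H12O6": {"C": 6, "H": 12, "O": 6},
    "C7H8N4O2": {"C": 7, "H": 8, "N": 4, "O": 2},
}
measured = 180.0635  # hypothetical measured neutral mass
print(candidates_within(measured, formulas, 10.0))
print(candidates_within(measured, formulas, 1.0))
```

At this low mass a 1 ppm window leaves a single candidate. The number of formulas inside a fixed ppm window grows with mass and with the number of elements considered, which is why further evidence such as isotope patterns is needed.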
Subjects
Mass Spectrometry , Algorithms , Carbon Isotopes/chemistry , High-Pressure Liquid Chromatography , Schizosaccharomyces/metabolism , Software
ABSTRACT
Molecular networking connects mass spectra of molecules based on the similarity of their fragmentation patterns. However, during ionization, molecules commonly form multiple ion species with different fragmentation behavior. As a result, the fragmentation spectra of these ion species often remain unconnected in tandem mass spectrometry-based molecular networks, leading to redundant and disconnected sub-networks of the same compound classes. To overcome this bottleneck, we develop Ion Identity Molecular Networking (IIMN) that integrates chromatographic peak shape correlation analysis into molecular networks to connect and collapse different ion species of the same molecule. The new feature relationships improve network connectivity for structurally related molecules, can be used to reveal unknown ion-ligand complexes, enhance annotation within molecular networks, and facilitate the expansion of spectral reference libraries. IIMN is integrated into various open source feature finding tools and the GNPS environment. Moreover, IIMN-based spectral libraries with a broad coverage of ion species are publicly available.
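The peak shape correlation idea can be sketched as a Pearson correlation between the intensity profiles of two co-eluting features. This is a minimal illustration, not the IIMN implementation: the intensity profiles and the 0.85 threshold are hypothetical, and any additional criteria the real workflow applies are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long intensity profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def same_peak_shape(x, y, min_correlation=0.85):
    """Hypothetical criterion: profiles correlate above a chosen threshold."""
    return pearson(x, y) >= min_correlation


# Hypothetical intensities sampled at the same retention times.
protonated = [0, 10, 50, 100, 50, 10, 0]
sodiated = [0, 2, 10, 20, 10, 2, 0]       # same shape, lower intensity
later_peak = [0, 0, 10, 50, 100, 50, 10]  # shifted by one scan

print(same_peak_shape(protonated, sodiated))
print(same_peak_shape(protonated, later_peak))
```

The two profiles with identical shape are grouped despite their different intensities, while the shifted peak is not.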
Subjects
Computational Biology/methods , Ions/metabolism , Mass Spectrometry/methods , Metabolic Networks and Pathways , Metabolomics/methods , Animals , Internet , Ions/chemistry , Molecular Structure , Reproducibility of Results , Software
ABSTRACT
BACKGROUND: Mass spectrometry (MS) coupled with online separation methods is commonly applied for differential and quantitative profiling of biological samples in metabolomic as well as proteomic research. Such approaches are used for systems biology, functional genomics, and biomarker discovery, among others. An ongoing challenge of these molecular profiling approaches, however, is the development of better data processing methods. Here we introduce a new generation of a popular open-source data processing toolbox, MZmine 2. RESULTS: A key concept of the MZmine 2 software design is the strict separation of core functionality and data processing modules, with emphasis on easy usability and support for high-resolution spectra processing. Data processing modules take advantage of embedded visualization tools, allowing for immediate previews of parameter settings. Newly introduced functionality includes the identification of peaks using online databases, MSn data support, improved isotope pattern support, scatter plot visualization, and a new method for peak list alignment based on the random sample consensus (RANSAC) algorithm. The performance of the RANSAC alignment was evaluated using synthetic datasets as well as actual experimental data, and the results were compared to those obtained using other alignment algorithms. CONCLUSIONS: MZmine 2 is freely available under a GNU GPL license and can be obtained from the project website at: http://mzmine.sourceforge.net/. The current version of MZmine 2 is suitable for processing large batches of data and has been applied to both targeted and non-targeted metabolomic analyses.
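The RANSAC principle behind the peak list alignment can be illustrated with a small retention-time example. This is a generic sketch, not the MZmine 2 code. It assumes a linear retention-time relationship between two peak lists, and the candidate pairs, the 0.1 min inlier threshold, the iteration count and the fixed random seed are illustrative choices.

```python
import random


def ransac_line(pairs, n_iter=200, threshold=0.1, seed=0):
    """Fit rt_b = slope * rt_a + intercept while ignoring outlier pairs.

    Repeatedly fits a line through two random pairs and keeps the model
    that explains the largest number of pairs within `threshold`.
    """
    rng = random.Random(seed)
    best_model = None
    best_inliers = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(pairs, 2)
        if x1 == x2:
            continue  # vertical line, no usable model
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [
            (x, y) for x, y in pairs
            if abs(y - (slope * x + intercept)) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model = (slope, intercept)
            best_inliers = inliers
    return best_model, best_inliers


# Hypothetical candidate matches (retention time in run A, run B), in minutes.
# Six follow a 0.2 min shift; the last two are false matches.
pairs = [
    (1.0, 1.2), (2.0, 2.2), (3.0, 3.2), (4.0, 4.2), (5.0, 5.2), (6.0, 6.2),
    (2.5, 4.0), (5.5, 3.0),
]
model, inliers = ransac_line(pairs)
print(model, len(inliers))
```

The fitted model recovers the 0.2 min shift from the six consistent pairs and leaves out the two false matches, which would distort an ordinary least-squares fit.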