ABSTRACT
Plant responses to environmental change are mediated via changes in cellular metabolomes. However, <5% of signals obtained from liquid chromatography tandem mass spectrometry (LC-MS/MS) can be identified, limiting our understanding of how metabolomes change under biotic/abiotic stress. To address this challenge, we performed untargeted LC-MS/MS of leaves, roots, and other organs of Brachypodium distachyon (Poaceae) under 17 organ-condition combinations, including copper deficiency, heat stress, low phosphate, and arbuscular mycorrhizal symbiosis. We found that both leaf and root metabolomes were significantly affected by the growth medium. Leaf metabolomes were more diverse than root metabolomes, but the latter were more specialized and more responsive to environmental change. We found that 1 week of copper deficiency shielded the root, but not the leaf metabolome, from perturbation due to heat stress. Machine learning (ML)-based analysis annotated approximately 81% of the fragmented peaks versus approximately 6% using spectral matches alone. We performed one of the most extensive validations of ML-based peak annotations in plants using thousands of authentic standards, and analyzed approximately 37% of the annotated peaks based on these assessments. Analyzing responsiveness of each predicted metabolite class to environmental change revealed significant perturbations of glycerophospholipids, sphingolipids, and flavonoids. Co-accumulation analysis further identified condition-specific biomarkers. To make these results accessible, we developed a visualization platform on the Bio-Analytic Resource for Plant Biology website (https://bar.utoronto.ca/efp_brachypodium_metabolites/cgi-bin/efpWeb.cgi), where perturbed metabolite classes can be readily visualized. Overall, our study illustrates how emerging chemoinformatic methods can be applied to reveal novel insights into the dynamic plant metabolome and stress adaptation.
Subjects
Brachypodium, Brachypodium/metabolism, Liquid Chromatography, Information Theory, Copper/metabolism, Tandem Mass Spectrometry, Metabolomics/methods, Metabolome
ABSTRACT
Untargeted tandem mass spectrometry (MS/MS) is an essential technique in modern analytical chemistry, providing a comprehensive snapshot of chemical entities in complex samples and identifying unknowns through their fragmentation patterns. This high-throughput approach generates large data sets that can be challenging to interpret. Molecular networks (MNs) have been developed as a computational tool to aid in the organization and visualization of complex chemical space in untargeted mass spectrometry data, thereby supporting comprehensive data analysis and interpretation. MNs group related compounds with potentially similar structures from MS/MS data by calculating all pairwise MS/MS similarities and filtering these connections to produce an MN. Such networks are instrumental in metabolomics for identifying novel metabolites, elucidating metabolic pathways, and even discovering biomarkers for disease. While MS/MS similarity metrics have been explored in the literature, the influence of network topology approaches on MN construction remains unexplored. This manuscript introduces metrics for evaluating MN construction, benchmarks state-of-the-art approaches, and proposes the Transitive Alignments approach to improve MN construction. The Transitive Alignments approach leverages the MN topology to realign MS/MS spectra of related compounds that differ by multiple structural modifications. Combining the Transitive Alignments approach with pseudoclique finding, a method for identifying highly connected groups of nodes in a network, resulted in more complete and higher-quality molecular families. Finally, we also introduce a targeted network construction technique called induced transitive alignments and demonstrate its effectiveness in a real-world natural product discovery application. We release the transitive alignment technique as a high-throughput workflow that can be used by the wider research community.
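The baseline construction described in this abstract (score all spectrum pairs, then filter the connections) can be illustrated with a minimal sketch. It is not the authors' workflow and does not implement Transitive Alignments. It assumes a plain binned cosine score in place of whatever similarity metric a real pipeline would use, and the function names, the `min_score` and `top_k` thresholds, and the toy spectra are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def binned_vector(peaks, bin_width=0.01):
    """Turn (m/z, intensity) peaks into a sparse vector keyed by m/z bin."""
    vector = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        vector[key] = vector.get(key, 0.0) + intensity
    return vector


def cosine(a, b):
    """Plain cosine similarity between two sparse vectors (an assumed metric)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_network(spectra, min_score=0.7, top_k=10):
    """Score every pair, then keep edges above min_score that rank in the
    top_k neighbours of both endpoints (thresholds are illustrative)."""
    vectors = {name: binned_vector(peaks) for name, peaks in spectra.items()}
    scores = {}
    for u, v in combinations(sorted(vectors), 2):
        score = cosine(vectors[u], vectors[v])
        if score >= min_score:
            scores[(u, v)] = score
    neighbours = {name: [] for name in vectors}
    for (u, v), score in scores.items():
        neighbours[u].append((score, v))
        neighbours[v].append((score, u))
    top = {
        name: {other for _, other in sorted(ranked, reverse=True)[:top_k]}
        for name, ranked in neighbours.items()
    }
    return {(u, v): s for (u, v), s in scores.items() if v in top[u] and u in top[v]}


def families(nodes, edges):
    """Group nodes into connected components (the 'molecular families')."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, result = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in adjacency[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        result.append(sorted(component))
    return result


# Toy spectra (made up): "a" and "b" share fragments, "c" does not.
example = {
    "a": [(100.0, 1.0), (150.0, 0.5)],
    "b": [(100.0, 1.0), (150.0, 0.4)],
    "c": [(200.0, 1.0)],
}
network = build_network(example)
print(families(example, network))  # [['a', 'b'], ['c']]
```

Requiring an edge to rank in the top neighbours of both endpoints is one common way to keep dense regions from collapsing into a single cluster. The topology-aware methods the abstract proposes would operate on a network like this one after it is built.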
Subjects
Metabolomics, Tandem Mass Spectrometry, Tandem Mass Spectrometry/methods, Metabolomics/methods, Algorithms, Metabolic Networks and Pathways
ABSTRACT
Acylsugars are a class of plant defense compounds produced across many distantly related families. Members of the horticulturally important morning glory (Convolvulaceae) family produce a diverse sub-class of acylsugars called resin glycosides (RGs), which comprise oligosaccharide cores, hydroxyacyl chain(s), and decorating aliphatic and aromatic acyl chains. While many RG structures are characterized, the extent of structural diversity of this class in different genera and species is not known. In this study, we asked whether there has been lineage-specific diversification of RG structures in different Convolvulaceae species that may suggest diversification of the underlying biosynthetic pathways. Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) was performed on root and leaf extracts of 26 species sampled in a phylogeny-guided manner. LC-MS/MS revealed thousands of peaks with signature RG fragmentation patterns, with one species producing over 300 signals, mirroring the diversity in Solanaceae-type acylsugars. A novel RG from Dichondra argentea was characterized using nuclear magnetic resonance (NMR) spectroscopy, supporting previous observations of RGs with open hydroxyacyl chains instead of closed macrolactone ring structures. Substantial lineage-specific differentiation in utilization of sugars, hydroxyacyl chains, and decorating acyl chains was discovered, especially among Ipomoea and Convolvulus, the two largest genera in Convolvulaceae. Adopting a computational, knowledge-based strategy, we further developed a high-recall workflow that successfully explained ~72% of the MS/MS fragments, predicted the structural components of 11/13 previously characterized RGs, and partially annotated ~45% of the RGs. Overall, this study improves our understanding of phytochemical diversity and lays a foundation for characterizing the evolutionary mechanisms underlying RG diversification.
ABSTRACT
Anthocyanins are economically valuable phytochemicals of significant relevance to human health. They are extracted industrially from multiple fruit and vegetable sources, and their yield and profiles can vary between sources and growing conditions. In this study, we focused on three purple-fleshed and one orange-fleshed cultivars of sweet potato (a warm-weather, nutritious crop of substantial interest to growers in northern, cooler latitudes) to determine the yield and diversity of anthocyanins and flavonoids. Acidified ethanol extraction of lyophilized roots yielded ~800 mg average anthocyanins/100 g dry weight from all three cultivars. UHPLC-DAD-Orbitrap analysis of sweet potato extracts identified 18 high-confidence, mostly acylated peonidin and cyanidin derivatives contributing to >90% of the total anthocyanin signal. Further assessment of the untargeted liquid chromatography-tandem mass spectrometry (LC-MS/MS) data using deep learning and molecular networking identified over 350 flavonoid peaks with variable distributions in different sweet potato cultivars. These results provide novel insight into the anthocyanin content of purple-fleshed sweet potatoes grown in the northern latitudes, and reveal the large structural diversity of anthocyanins and flavonoids in this popular crop.
Subjects
Anthocyanins/metabolism, Flavonoids/metabolism, Ipomoea batatas/metabolism, High-Pressure Liquid Chromatography/methods, Color, Metabolomics/methods, Plant Extracts/metabolism, Tandem Mass Spectrometry/methods
ABSTRACT
Recent advances in sequencing and informatic technologies have led to a deluge of publicly available genomic data. While it is now relatively easy to sequence, assemble, and identify genic regions in diploid plant genomes, functional annotation of these genes is still a challenge. Over the past decade, there has been a steady increase in studies utilizing machine learning algorithms for various aspects of functional prediction, because these algorithms are able to integrate large amounts of heterogeneous data and detect patterns inconspicuous through rule-based approaches. The goal of this review is to introduce experimental plant biologists to machine learning, by describing how it is currently being used in gene function prediction to gain novel biological insights. In this review, we discuss specific applications of machine learning in identifying structural features in sequenced genomes, predicting interactions between different cellular components, and predicting gene function and organismal phenotypes. Finally, we also propose strategies for stimulating functional discovery using machine learning-based approaches in plants.