ABSTRACT
Natural products research increasingly applies -omics technologies to guide molecular discovery. While the combined analysis of genomic and metabolomic datasets has proved valuable for identifying natural products and their biosynthetic gene clusters (BGCs) in bacteria, this integrated approach lacks application to fungi. Because fungi are hyper-diverse and underexplored for new chemistry and bioactivities, we created a linked genomics-metabolomics dataset for 110 Ascomycetes, and optimized both gene cluster family (GCF) networking parameters and correlation-based scoring for pairing fungal natural products with their BGCs. Using a network of 3,007 GCFs (organized from 7,020 BGCs), we examined 25 known natural products originating from 16 known BGCs and observed statistically significant associations between 21 of these compounds and their validated BGCs. Furthermore, the scalable platform identified the BGC for the pestalamides, demystifying its biogenesis, and revealed more than 200 high-scoring natural product-GCF linkages to direct future discovery.
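The abstract does not spell out its correlation-based scoring function, so the following is only a minimal sketch of the general idea: treat a metabolite and a GCF as sets of strains in which each was observed, and ask how surprising their overlap is. It uses a one-sided hypergeometric (Fisher's exact) test as a stand-in, which is an assumption and not necessarily the authors' metric. The function name and the strain identifiers are hypothetical.

```python
from math import comb


def cooccurrence_pvalue(metabolite_strains, gcf_strains, n_strains):
    """One-sided hypergeometric p-value for metabolite/GCF co-occurrence.

    metabolite_strains, gcf_strains: sets of strain identifiers in which the
    metabolite ion or the gene cluster family was detected (hypothetical input).
    n_strains: total number of strains in the dataset.
    Returns P(overlap >= observed) if the two sets were independent.
    """
    k = len(metabolite_strains & gcf_strains)  # strains carrying both
    m = len(metabolite_strains)                # strains with the metabolite
    g = len(gcf_strains)                       # strains with the GCF
    total = comb(n_strains, m)                 # ways to place the metabolite
    tail = sum(
        comb(g, i) * comb(n_strains - g, m - i)
        for i in range(k, min(m, g) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative values only: a metabolite and a GCF seen in the same
    # three strains out of 110.
    p = cooccurrence_pvalue({"s1", "s2", "s3"}, {"s1", "s2", "s3"}, 110)
    print(f"p = {p:.2e}")
```

A small p-value flags a metabolite-GCF pair as a candidate linkage. With thousands of GCFs and many ions, a multiple-testing correction would also be needed in practice.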
Subject(s)
Biological Products; Genomics; Metabolomics; Multigene Family; Fungi/genetics
ABSTRACT
Liquid chromatography-mass spectrometry (LC-MS) intact mass analysis and LC-MS/MS peptide mapping are decisional assays for developing biological drugs and other commercial protein products. Certain PTM types, such as truncation and oxidation, increase the difficulty of precise proteoform characterization owing to inherent limitations in peptide and intact protein analyses. Top-down MS (TDMS) can resolve this ambiguity via fragmentation of specific proteoforms. We leveraged the strengths of flow-programmed (fp) denaturing online buffer exchange (dOBE) chromatography, including robust automation, relatively high ESI sensitivity, and long MS/MS window time, to support a TDMS platform for industrial protein characterization. We tested data-dependent (DDA) and targeted strategies using 14 different MS/MS scan types featuring combinations of collisional- and electron-based fragmentation as well as proton transfer charge reduction. This large, focused dataset was processed using a new software platform, named TDAcquireX, that improves proteoform characterization through TDMS data aggregation. A DDA-based workflow provided objective identification of αLac truncation proteoforms with a two-termini clipping search. A targeted TDMS workflow facilitated the characterization of αLac oxidation positional isomers. This strategy relied on using sliding window-based fragment ion deconvolution to generate composite proteoform spectral match (cPrSM) results amenable to fragment noise filtering, which is a fundamental enhancement relevant to TDMS applications generally.
ABSTRACT
Existing mass spectrometric assays used for sensitive and specific measurements of target proteins across multiple samples, such as selected/multiple reaction monitoring (SRM/MRM) or parallel reaction monitoring (PRM), are peptide-based methods for bottom-up proteomics. Here, we describe an approach based on the principle of PRM for the measurement of intact proteoforms by targeted top-down proteomics, termed proteoform reaction monitoring (PfRM). We explore the ability of our method to circumvent traditional limitations of top-down proteomics, such as sensitivity and reproducibility. We also introduce a new software program, Proteoform Finder (part of ProSight Native), specifically designed for the easy analysis of PfRM data. PfRM was initially benchmarked by quantifying three standard proteins. The linearity of the assay was shown over almost 3 orders of magnitude in the femtomole range, with limits of detection and quantification in the low femtomolar range. We later applied our multiplexed PfRM assay to complex samples to quantify biomarker candidates in peripheral blood mononuclear cells (PBMCs) from liver-transplanted patients, suggesting their possible translational applications. These results demonstrate that PfRM has the potential to contribute to the accurate quantification of protein biomarkers for diagnostic purposes and to improve our understanding of disease etiology at the proteoform level.
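The abstract reports linearity and limits of detection and quantification without stating how they were computed. As a hedged illustration, the sketch below fits an ordinary least-squares calibration line and derives LOD and LOQ with the common 3.3σ/slope and 10σ/slope convention, where σ is the residual standard deviation. That convention is an assumption here, and the function name and the data are hypothetical.

```python
from math import sqrt


def calibration_summary(amounts, responses):
    """Least-squares calibration line with LOD/LOQ estimates.

    amounts: known analyte amounts (e.g. fmol on column).
    responses: measured signal for each amount.
    LOD = 3.3 * sigma / slope and LOQ = 10 * sigma / slope, with sigma the
    residual standard deviation (an assumed convention, not the paper's).
    """
    n = len(amounts)
    if n < 3:
        raise ValueError("need at least three calibration points")
    mean_x = sum(amounts) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(amounts, responses))
    ss_tot = sum((y - mean_y) ** 2 for y in responses)
    sigma = sqrt(ss_res / (n - 2))  # residual standard deviation
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1 - ss_res / ss_tot,
        "lod": 3.3 * sigma / slope,
        "loq": 10 * sigma / slope,
    }


if __name__ == "__main__":
    # Made-up calibration points for illustration only.
    print(calibration_summary([1, 2, 3, 4], [3.1, 4.9, 7.2, 8.8]))
```

When the calibration spans several orders of magnitude, weighted regression (for example 1/x weighting) is often preferred; the unweighted fit is used here only to keep the sketch short.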
Subject(s)
Leukocytes, Mononuclear; Proteins; Humans; Leukocytes, Mononuclear/chemistry; Reproducibility of Results; Mass Spectrometry; Proteomics/methods; Protein Processing, Post-Translational; Proteome/analysis
ABSTRACT
INTRODUCTION: Fungi biosynthesize chemically diverse secondary metabolites with a wide range of biological activities. Natural product scientists have increasingly turned towards bioinformatics approaches, combining metabolomics and genomics to target secondary metabolites and their biosynthetic machinery. We recently applied an integrated metabologenomics workflow to 110 fungi and identified more than 230 high-confidence linkages between metabolites and their biosynthetic pathways. OBJECTIVES: To prioritize the discovery of bioactive natural products and their biosynthetic pathways from these hundreds of high-confidence linkages, we developed a bioactivity-driven metabologenomics workflow combining quantitative chemical information, antiproliferative bioactivity data, and genome sequences. METHODS: The 110 fungi from our metabologenomics study were tested against multiple cancer cell lines to identify which strains produced antiproliferative natural products. Three strains were selected for further study, fractionated using flash chromatography, and subjected to an additional round of bioactivity testing and mass spectral analysis. Data were overlaid using biochemometrics analysis to predict active constituents early in the fractionation process, after which their biosynthetic pathways were identified using metabologenomics. RESULTS: We isolated three new-to-nature stemphone analogs, 19-acetylstemphones G (1), B (2) and E (3), that demonstrated antiproliferative activity ranging from 3 to 5 µM against human melanoma (MDA-MB-435) and ovarian cancer (OVCAR3) cells. We proposed a rational biosynthetic pathway for these compounds, highlighting the potential of using bioactivity as a filter for the analysis of integrated -omics datasets. CONCLUSIONS: This work demonstrates how the incorporation of biochemometrics as a third dimension into the metabologenomics workflow can identify bioactive metabolites and link them to their biosynthetic machinery.
Subject(s)
Biosynthetic Pathways; Fungi; Metabolomics; Multigene Family; Humans; Metabolomics/methods; Fungi/metabolism; Cell Line, Tumor; Cell Proliferation/drug effects; Biological Products/pharmacology; Biological Products/metabolism; Antineoplastic Agents/pharmacology; Antineoplastic Agents/chemistry; Antineoplastic Agents/metabolism
ABSTRACT
The Human Proteoform Atlas (HPfA) is a web-based repository of experimentally verified human proteoforms, available online at http://human-proteoform-atlas.org, and is a direct descendant of the Consortium of Top-Down Proteomics' (CTDP) Proteoform Atlas. Proteoforms are the specific forms of protein molecules expressed by our cells and include the unique combination of post-translational modifications (PTMs), alternative splicing and other sources of variation deriving from a specific gene. The HPfA uses a FAIR system to assign persistent identifiers to proteoforms, which allows for redundancy calling and tracking from prior and future studies in the growing community of proteoform biology and measurement. The HPfA is organized around open ontologies and enables flexible classification of proteoforms. To achieve this, a public registry of experimentally verified proteoforms was also created. Submission of new proteoforms can be processed through email via nrtdphelp@northwestern.edu, and future iterations of these proteoform atlases will help to organize and assign function to proteoforms, their PTMs and their complexes in the years ahead.
Subject(s)
Alternative Splicing; Databases, Protein; Protein Processing, Post-Translational; Proteome/chemistry; Proto-Oncogene Proteins p21(ras)/chemistry; User-Computer Interface; Amino Acid Sequence; Atlases as Topic; Gene Ontology; Humans; Models, Molecular; Molecular Sequence Annotation; Polymorphism, Single Nucleotide; Protein Conformation; Protein Isoforms/chemistry; Protein Isoforms/genetics; Protein Isoforms/metabolism; Proteome/classification; Proteome/genetics; Proteome/metabolism; Proto-Oncogene Proteins p21(ras)/genetics; Proto-Oncogene Proteins p21(ras)/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism
ABSTRACT
Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1,000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here, we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1,037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.
Subject(s)
Ascomycota/genetics; Ascomycota/metabolism; Genome, Fungal; Multigene Family; Bacteria/genetics; Bacteria/metabolism; Biological Products/metabolism; Biosynthetic Pathways/genetics; Genes, Fungal; Genomics; Phylogeny; Secondary Metabolism; Species Specificity
ABSTRACT
Native mass spectrometry has recently moved alongside traditional structural biology techniques in its ability to provide clear insights into the composition of protein complexes. However, to date, limited software tools are available for the comprehensive analysis of native mass spectrometry data on protein complexes, particularly for experiments aimed at elucidating the composition of an intact protein complex. Here, we introduce ProSight Native as a start-to-finish informatics platform for analyzing native protein and protein complex data. Combining mass determination via spectral deconvolution with a top-down database search and stoichiometry calculations, ProSight Native can determine the complete composition of protein complexes. To demonstrate its features, we used ProSight Native to successfully determine the composition of the homotetrameric membrane complex Aquaporin Z. We also revisited previously published spectra and were able to decipher the composition of a heterodimer complex bound with two noncovalently associated ligands. In addition to determining complex composition, we developed new tools in the software for validating native mass spectrometry fragment ions and mapping top-down fragmentation data onto three-dimensional protein structures. Taken together, ProSight Native will reduce the informatics burden on the growing field of native mass spectrometry, enabling the technology to further its reach.
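The stoichiometry step mentioned above can be pictured as a small combinatorial search: given candidate subunit masses and one deconvoluted complex mass, list the copy-number combinations whose summed mass falls within a tolerance. The sketch below is a brute-force illustration under that assumption. It is not ProSight Native's algorithm, and the function name, masses, and tolerance are hypothetical.

```python
from itertools import product


def candidate_stoichiometries(subunit_masses, observed_mass, tol_da, max_copies=6):
    """Enumerate subunit copy numbers that explain an observed complex mass.

    subunit_masses: dict mapping a subunit or ligand name to its mass in Da.
    observed_mass: deconvoluted mass of the intact complex in Da.
    tol_da: absolute mass tolerance in Da.
    Returns (copy-number dict, mass error) pairs, smallest |error| first.
    """
    names = list(subunit_masses)
    hits = []
    for counts in product(range(max_copies + 1), repeat=len(names)):
        if not any(counts):
            continue  # skip the empty complex
        total = sum(c * subunit_masses[n] for c, n in zip(counts, names))
        error = total - observed_mass
        if abs(error) <= tol_da:
            hits.append((dict(zip(names, counts)), error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # Hypothetical subunits A and B and a hypothetical observed mass.
    masses = {"A": 10000.0, "B": 25000.0}
    for combo, err in candidate_stoichiometries(masses, 45010.0, tol_da=20.0):
        print(combo, f"error = {err:+.1f} Da")
```

The search grows exponentially with the number of distinct subunits, so a real tool would prune it, for example by using subunit identities from the top-down database search to restrict the candidates.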
Subject(s)
Proteins; Software; Mass Spectrometry/methods; Proteins/analysis
ABSTRACT
Analysis of intact proteins by mass spectrometry enables direct quantitation of the specific proteoforms present in a sample and is an increasingly important tool for biopharmaceutical and academic research. Interpreting and quantifying intact protein species from mass spectra typically involves many challenges including mass deconvolution and peak processing as well as determining optimal spectral averaging parameters and matching masses to theoretical proteoforms. Each of these steps can present informatic hurdles, as parameters often need to be tailored specifically to the data sets. To reduce intact mass deconvolution data analysis burdens, we built upon the widely used "sliding window" mass deconvolution technique with several additional concepts. First, we found that how spectra are averaged and the overlap in spectral windows can be tuned to favor either sensitivity or speed. A multiple window averaging approach was found to be the most effective way to increase mass detection and yielded a >2-fold increase in the number of masses detected. We also developed a targeted feature-finding routine that boosted sensitivity by >2-fold, decreased coefficient of variation across replicates by 50%, and increased the quality of mass elution profiles through 3-fold more detected time points. Lastly, we furthered existing approaches for annotating detected masses with potential proteoforms through spectral fitting for possible proteoform family modifications and network viewing. These proteoform annotation approaches ultimately produced a more accurate way of finding related, but previously unknown proteoforms from intact mass-only data. Together, these quantitation workflow improvements advance the information obtainable from intact protein mass spectrometry analyses.
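To make the sensitivity-versus-speed trade-off concrete, the sketch below generates overlapping scan windows and averages the spectra inside each one. More overlap produces more windows (more deconvolution work, more chances to catch a briefly eluting mass); less overlap produces fewer. This is an illustrative sketch only, not the authors' implementation. The function names are hypothetical, and spectra are assumed to be intensity lists already binned onto a shared m/z axis.

```python
def sliding_windows(n_scans, width, overlap):
    """Return (start, stop) scan-index ranges for sliding-window averaging.

    width: number of scans per window.
    overlap: fraction of each window shared with the next one, 0 <= overlap < 1.
    Trailing scans that do not fill a complete window are not covered.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, round(width * (1 - overlap)))
    last_start = max(n_scans - width, 0)
    return [(s, min(s + width, n_scans)) for s in range(0, last_start + 1, step)]


def average_spectra(spectra, start, stop):
    """Average intensity vectors for scans start..stop-1 (shared m/z axis assumed)."""
    n = stop - start
    return [sum(column) / n for column in zip(*spectra[start:stop])]


if __name__ == "__main__":
    # Ten scans, four scans per window, 50% overlap.
    for start, stop in sliding_windows(10, 4, 0.5):
        print(start, stop)
```

Each averaged window would then be passed to the mass deconvolution step, and masses found in consecutive windows could be stitched into elution profiles.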
Subject(s)
Proteome; Tandem Mass Spectrometry; Tandem Mass Spectrometry/methods; Proteome/analysis
ABSTRACT
The genomes of filamentous fungi contain up to 90 biosynthetic gene clusters (BGCs) encoding diverse secondary metabolites, an enormous reservoir of untapped chemical potential. However, the recalcitrant genetics, cryptic expression, and unculturability of these fungi prevent scientists from systematically exploiting these gene clusters and harvesting their products. As heterologous expression of fungal BGCs is largely limited to the expression of single or partial clusters, we established a scalable process for the expression of large numbers of full-length gene clusters, called FAC-MS. Using fungal artificial chromosomes (FACs) and metabolomic scoring (MS), we screened 56 secondary metabolite BGCs from diverse fungal species for expression in Aspergillus nidulans. We discovered 15 new metabolites and assigned them with confidence to their BGCs. Using the FAC-MS platform, we extensively characterized a new macrolactone, valactamide A, and its hybrid nonribosomal peptide synthetase-polyketide synthase (NRPS-PKS). The ability to regularize access to fungal secondary metabolites at an unprecedented scale stands to revitalize drug discovery platforms with renewable sources of natural products.
Subject(s)
Aspergillus/genetics; Aspergillus/metabolism; Genes, Fungal/genetics; Multigene Family; Secondary Metabolism/genetics; Sesterterpenes/analysis; Benzodiazepines/analysis; Benzodiazepines/metabolism; Pyrimidinones/analysis; Pyrimidinones/metabolism; Sesterterpenes/metabolism
ABSTRACT
The benzodiazepine benzomalvin A/D is a fungally derived specialized metabolite and inhibitor of the substance P receptor NK1, biosynthesized by a three-gene nonribosomal peptide synthetase cluster. Here, we utilize fungal artificial chromosomes with metabolomic scoring (FAC-MS) to perform molecular genetic pathway dissection and targeted metabolomics analysis to assign the in vivo role of each domain in the benzomalvin biosynthetic pathway. The use of FAC-MS identified the terminal cyclizing condensation domain as BenY-CT and the internal C-domains as BenZ-C1 and BenZ-C2. Unexpectedly, we also uncovered evidence suggesting BenY-CT or a yet to be identified protein mediates benzodiazepine formation, representing the first reported benzodiazepine synthase enzymatic activity. This work informs understanding of what defines a fungal CT domain and shows how the FAC-MS platform can be used as a tool for in vivo analyses of specialized metabolite biosynthesis and for the discovery and dissection of new enzyme activities.
Subject(s)
Aspergillus nidulans; Benzodiazepines/metabolism; Chromosomes, Artificial/genetics; Chromosomes, Fungal/genetics; Fungal Proteins; Peptide Synthases; Pyrimidinones/metabolism; Aspergillus nidulans/enzymology; Aspergillus nidulans/genetics; Chromosomes, Artificial/metabolism; Chromosomes, Fungal/metabolism; Fungal Proteins/chemistry; Fungal Proteins/genetics; Fungal Proteins/metabolism; Peptide Synthases/chemistry; Peptide Synthases/genetics; Peptide Synthases/metabolism; Protein Domains
ABSTRACT
Covering: up to 2018. Thioester reductase domains catalyze two- and four-electron reductions to release natural products following assembly on nonribosomal peptide synthetases, polyketide synthases, and their hybrid biosynthetic complexes. This reductive off-loading of a natural product yields an aldehyde or alcohol, can initiate the formation of a macrocyclic imine, and contributes to important intermediates in a variety of biosyntheses, including those for polyketide alkaloids and pyrrolobenzodiazepines. Compounds that arise from reductase-terminated biosynthetic gene clusters are often reactive and exhibit biological activity. Biomedically important examples include the cancer therapeutic Yondelis (ecteinascidin 743), peptide aldehydes that inspired the first therapeutic proteasome inhibitor bortezomib, and numerous synthetic derivatives and antibody drug conjugates of the pyrrolobenzodiazepines. Recent advances in microbial genomics, metabolomics, bioinformatics, and reactivity-based labeling have facilitated the detection of these compounds for targeted isolation. Herein, we summarize known natural products arising from this important category, highlighting their occurrence in Nature, biosyntheses, biological activities, and the technologies used for their detection and identification. Additionally, we review publicly available genomic data to highlight the remaining potential for novel reductively tailored compounds and drug leads from microorganisms. This thorough retrospective highlights various molecular families with especially privileged bioactivity while illuminating challenges and prospects toward accelerating the discovery of new, high value natural products.
Subject(s)
Biological Products/metabolism; Peptide Synthases/metabolism; Polyketide Synthases/metabolism; Alkaloids/biosynthesis; Alkaloids/chemistry; Azabicyclo Compounds/chemistry; Azabicyclo Compounds/metabolism; Benzodiazepinones/chemistry; Benzodiazepinones/metabolism; Biological Products/chemistry; Biological Products/pharmacology; Biosynthetic Pathways/genetics; Cyclization; Depsipeptides/chemistry; Depsipeptides/metabolism; Dipeptides/chemistry; Dipeptides/metabolism; Indoles/chemistry; Indoles/metabolism; Lactams/chemistry; Lactams/metabolism; Leupeptins/chemistry; Leupeptins/metabolism; Lysine/analogs & derivatives; Lysine/chemistry; Lysine/metabolism; Multigene Family; Peptide Synthases/genetics; Polyketide Synthases/genetics; Protein Domains
ABSTRACT
O-linked β-N-acetylglucosamine (O-GlcNAc) glycosylation, the covalent attachment of N-acetylglucosamine to serine and threonine residues of proteins, is a post-translational modification that shares many features with protein phosphorylation. O-GlcNAc is essential for cell survival and plays an important role in many biological processes (e.g. transcription, translation, cell division) and human diseases (e.g. diabetes, Alzheimer's disease, cancer). However, detection of O-GlcNAc is challenging. Here, a method for O-GlcNAc detection using in vitro sulfation with two N-acetylglucosamine (GlcNAc)-specific sulfotransferases, carbohydrate sulfotransferase 2 and carbohydrate sulfotransferase 4, and the radioisotope ³⁵S is described. Sulfation is first demonstrated on free GlcNAc, and then on O-GlcNAc residues of peptides as well as nuclear and cytoplasmic proteins. It is also demonstrated that the sulfation on O-GlcNAc is sensitive to OGT and O-β-N-acetylglucosaminidase treatment. The labeled samples are separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and visualized by autoradiography. Overall, the method is sensitive, specific and convenient.
Subject(s)
Acetylglucosamine/analysis; Acetylglucosaminidase/metabolism; Sulfates/metabolism; Sulfotransferases/metabolism; Acetylglucosamine/metabolism; Glycosylation; HEK293 Cells; Humans; Carbohydrate Sulfotransferases
ABSTRACT
Top-down mass spectrometry (TDMS) of intact proteins and antibodies enables direct determination of truncations, sequence variants, post-translational modifications, and disulfides without the need for any proteolytic cleavage. While mass deconvolution of top-down tandem mass spectra is typically used to identify fragment masses for matching to candidate proteoforms, larger molecules such as monoclonal antibodies can produce many fragment ions, making spectral interpretation challenging. Here, we explore an alternative approach for proteoform spectral matching that is better suited for larger protein analysis. This workflow uses direct matching of theoretical proteoform isotopic distributions to TDMS spectra, avoiding drawbacks of mass deconvolution such as poor sensitivity and problems differentiating overlapping distributions. Using a data set that analyzed an intact NIST monoclonal antibody across different fragmentation modes, we show that this isotope fitting strategy increased the sequence coverage of both light and heavy chain sequences >3-fold. We further found that isotope fitting is particularly amenable to identifying large fragments, including those near the hinge region that have been traditionally difficult to analyze by top-down methods. These advances in proteoform spectral matching can greatly increase the power of top-down analyses for intact biotherapeutics and other large molecules.
ABSTRACT
Deconvolution from intact protein mass-to-charge spectra to mass spectra is essential to generate interpretable data for mass spectrometry (MS) platforms coupled to ionization sources that produce multiply charged species. Infrared matrix-assisted laser desorption electrospray ionization (IR-MALDESI) can be used to analyze intact proteins in multiwell microtiter plates with speed matching small molecule analyses (at least 1 Hz). However, the lack of compatible deconvolution software has limited its use in high-throughput screening applications. Most existing automated deconvolution software packages work best for data generated from LC-MS, and to the best of our knowledge, there is no software capable of performing fast plate-based mass spectral deconvolution. Herein we present the use of a new workflow in ProSight Native for the deconvolution of protein spectra from entire well plates that can be completed within 3 s. First, we successfully demonstrated the potential increased throughput benefits produced by the combined IR-MALDESI-MS - ProSight Native platform using protein standards. We then conducted a screen for Bruton's tyrosine kinase (BTK) covalent binders against a well-annotated compound collection consisting of 2232 compounds and applied ProSight Native to deconvolute the protein spectra. Seventeen hits including five known BTK covalent inhibitors in the compound set were identified. By alleviating the data processing bottleneck using ProSight Native, it may be feasible to analyze and report covalent screening results for >200,000 samples in a single day.
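For readers unfamiliar with why deconvolution is needed, the textbook relationship behind it is m/z = (M + z·m_p)/z for a protein of neutral mass M carrying z protons of mass m_p. Two adjacent charge-state peaks are enough to solve for z and then M. The sketch below shows only this two-peak calculation, assuming positive-ion mode with protonation. It is not the ProSight Native algorithm, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (positive-ion mode assumed)


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Infer the charge of the higher-m/z peak from two adjacent charge states.

    mz_high carries z charges and mz_low carries z + 1, so
    z = (mz_low - m_p) / (mz_high - mz_low).
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass in Da of a species observed at m/z with z protons."""
    return z * (mz - PROTON_MASS)


if __name__ == "__main__":
    # Synthetic peaks for a hypothetical 20 kDa protein at charges 10 and 11.
    mz_10 = 20000.0 / 10 + PROTON_MASS
    mz_11 = 20000.0 / 11 + PROTON_MASS
    z = charge_from_adjacent_peaks(mz_10, mz_11)
    print(z, round(neutral_mass(mz_10, z), 3))
```

Production deconvolution tools fit entire charge-state envelopes and handle noise, adducts, and overlapping species. The two-peak version is shown only to make the underlying arithmetic explicit.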
Subject(s)
Mass Spectrometry; Proteins; Proteins/chemistry; Software
ABSTRACT
Human biology is tightly linked to proteins, yet most measurements do not precisely determine alternatively spliced sequences or posttranslational modifications. Here, we present the primary structures of ~30,000 unique proteoforms, nearly 10 times more than in previous studies, expressed from 1690 human genes across 21 cell types and plasma from human blood and bone marrow. The results, compiled in the Blood Proteoform Atlas (BPA), indicate that proteoforms better describe protein-level biology and are more specific indicators of differentiation than their corresponding proteins, which are more broadly expressed across cell types. We demonstrate the potential for clinical application, by interrogating the BPA in the context of liver transplantation and identifying cell and proteoform signatures that distinguish normal graft function from acute rejection and other causes of graft dysfunction.
Subject(s)
Blood Cells/chemistry; Blood Proteins/chemistry; Bone Marrow Cells/chemistry; Databases, Protein; Protein Isoforms/chemistry; Proteome/chemistry; Alternative Splicing; B-Lymphocytes/chemistry; Blood Proteins/genetics; Cell Lineage; Humans; Leukocytes, Mononuclear/chemistry; Liver Transplantation; Plasma/chemistry; Protein Isoforms/genetics; Protein Processing, Post-Translational; Proteomics; T-Lymphocytes/chemistry
ABSTRACT
We report the metabolomics-driven genome mining of a new cyclic-guanidino incorporating non-ribosomal peptide synthetase (NRPS) gene cluster and full structure elucidation of its associated hexapeptide product, faulknamycin. Structural studies unveiled that this natural product contained the previously unknown (R,S)-stereoisomer of capreomycidine, d-capreomycidine. Furthermore, heterologous expression of the identified gene cluster successfully reproduces faulknamycin production without an observed homologue of VioD, the pyridoxal phosphate (PLP)-dependent enzyme found in all previous l-capreomycidine biosynthesis. An alternative NRPS-dependent pathway for d-capreomycidine biosynthesis is proposed.
Subject(s)
Arginine/analogs & derivatives; Multigene Family; Streptomyces/genetics; Arginine/genetics; Arginine/metabolism; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Biosynthetic Pathways; Genomics; Metabolomics; Peptide Synthases/genetics; Peptide Synthases/metabolism; Streptomyces/metabolism
ABSTRACT
Advances in genome sequencing have revitalized natural product discovery efforts, revealing the untapped biosynthetic potential of fungi. While the volume of genomic data continues to expand, discovery efforts are slowed by the time-consuming nature of the experiments required to characterize new molecules. To direct efforts toward uncharacterized biosynthetic gene clusters most likely to encode novel chemical scaffolds, we took advantage of comparative metabolomics and heterologous gene expression using fungal artificial chromosomes (FACs). By linking mass spectral profiles with structural clues provided by FAC-encoded gene clusters, we targeted a compound originating from an unusual gene cluster containing an indoleamine 2,3-dioxygenase (IDO). With this approach, we isolate and characterize R and S forms of the new molecule terreazepine, which contains a novel chemical scaffold resulting from cyclization of the IDO-supplied kynurenine. The discovery of terreazepine illustrates that FAC-based approaches targeting unusual biosynthetic machinery provide a promising avenue forward for targeted discovery of novel scaffolds and their biosynthetic enzymes, and it also represents another example of a biosynthetic gene cluster "repurposing" a primary metabolic enzyme to diversify its secondary metabolite arsenal.
IMPORTANCE: Here, we provide evidence that Aspergillus terreus encodes a biosynthetic gene cluster containing a repurposed indoleamine 2,3-dioxygenase (IDO) dedicated to secondary metabolite synthesis. The discovery of this neofunctionalized IDO not only enabled discovery of a new compound with an unusual chemical scaffold but also provided insight into the numerous strategies fungi employ for diversifying and protecting themselves against secondary metabolites. The observations in this study set the stage for further in-depth studies into the function of duplicated IDOs present in fungal biosynthetic gene clusters and present a strategy for accessing the biosynthetic potential of gene clusters containing duplicated primary metabolic genes.
Subject(s)
Aspergillus/chemistry; Biological Products/chemistry; Biosynthetic Pathways/genetics; Multigene Family; Aspergillus/genetics; Biological Products/isolation & purification; Chromosomes, Artificial/genetics; Gene Expression; Kynurenine/metabolism; Metabolomics; Secondary Metabolism/genetics
ABSTRACT
Filamentous fungi are prolific producers of secondary metabolites with drug-like properties, and their genome sequences have revealed an untapped wealth of potential therapeutic leads. To better access these secondary metabolites and characterize their biosynthetic gene clusters, we applied a new platform for screening and heterologous expression of intact gene clusters that uses fungal artificial chromosomes and metabolomic scoring (FAC-MS). We leverage FAC-MS technology to identify the biosynthetic machinery responsible for production of acu-dioxomorpholine, a metabolite produced by the fungus Aspergillus aculeatus. The acu-dioxomorpholine nonribosomal peptide synthetase features a new type of condensation domain (designated CR) proposed to use a noncanonical arginine active site for ester bond formation. Using stable isotope labeling and MS, we determine that a phenyllactate monomer deriving from phenylalanine is incorporated into the diketomorpholine scaffold. Acu-dioxomorpholine is highly related to orphan inhibitors of P-glycoprotein targets in multidrug-resistant cancers, and identification of the biosynthetic pathway for this compound class enables genome mining for additional derivatives.