2.
Nat Biotechnol ; 2024 Apr 23.
Article En | MEDLINE | ID: mdl-38653796

In recent years, generative protein sequence models have been developed to sample novel sequences. However, predicting whether generated proteins will fold and function remains challenging. We evaluate a set of 20 diverse computational metrics to assess the quality of enzyme sequences produced by three contrasting generative models: ancestral sequence reconstruction, a generative adversarial network and a protein language model. Focusing on two enzyme families, we expressed and purified over 500 natural and generated sequences with 70-90% identity to the most similar natural sequences to benchmark computational metrics for predicting in vitro enzyme activity. Over three rounds of experiments, we developed a computational filter that improved the rate of experimental success by 50-150%. The proposed metrics and models will drive protein engineering research by serving as a benchmark for generative protein sequence models and helping to select active variants for experimental testing.

3.
Elife ; 12, 2024 Mar 15.
Article En | MEDLINE | ID: mdl-38488154

Accurately detecting distant evolutionary relationships between proteins remains an ongoing challenge in bioinformatics. Search methods based on primary sequence struggle to accurately detect homology between sequences with less than 20% amino acid identity. Profile- and structure-based strategies extend sensitive search capabilities into this twilight zone of sequence similarity but require slow pre-processing steps. Recently, whole-protein and positional embeddings from deep neural networks have shown promise for providing sensitive sequence comparison and annotation at long evolutionary distances. Embeddings are generally faster to compute than profiles and predicted structures but still suffer several drawbacks related to the ability of whole-protein embeddings to discriminate domain-level homology, and the database size and search speed of methods using positional embeddings. In this work, we show that low-dimensionality positional embeddings can be used directly in speed-optimized local search algorithms. As a proof of concept, we use the ESM2 3B model to convert primary sequences directly into the 3D interaction (3Di) alphabet or amino acid profiles and use these embeddings as input to the highly optimized Foldseek, HMMER3, and HH-suite search algorithms. Our results suggest that positional embeddings as small as a single byte can provide sufficient information for dramatically improved sensitivity over amino acid sequence searches without sacrificing search speed.


Algorithms , Proteins , Sequence Alignment , Proteins/genetics , Proteins/chemistry , Amino Acid Sequence , Computational Biology/methods , Amino Acids
4.
ACS Synth Biol ; 13(3): 745-751, 2024 03 15.
Article En | MEDLINE | ID: mdl-38377591

Commercially synthesized genes are typically made using variations of homology-based cloning techniques, including polymerase cycling assembly from chemically synthesized microarray-derived oligonucleotides. Here, we apply Data-optimized Assembly Design (DAD) to the synthesis of hundreds of codon-optimized genes in both constitutive and inducible vectors using Golden Gate Assembly. Starting from oligonucleotide pools, we synthesize genes in three simple steps: (1) amplification of parts belonging to individual assemblies in parallel from a single pool; (2) Golden Gate Assembly of parts for each construct; and (3) transformation. We construct genes from receiving DNA to sequence confirmed isolates in as little as 4 days. By leveraging the ligation fidelity afforded by T4 DNA ligase, we expect to be able to construct a larger breadth of sequences not currently supported by homology-based methods, which require stability of extensive single-stranded DNA overhangs.


Oligonucleotides , Synthetic Biology , Oligonucleotides/genetics , Synthetic Biology/methods , DNA/genetics , DNA, Single-Stranded/genetics , Cloning, Molecular , Genetic Vectors
5.
Mol Cell ; 84(5): 854-866.e7, 2024 Mar 07.
Article En | MEDLINE | ID: mdl-38402612

Deaminases have important uses in modification detection and genome editing. However, the range of applications is limited by the small number of characterized enzymes. To expand the toolkit of deaminases, we developed an in vitro approach that bypasses a major hurdle with their toxicity in cells. We assayed 175 putative cytosine deaminases on a variety of substrates and found a broad range of activity on double- and single-stranded DNA in various sequence contexts, including CpG-specific deaminases and enzymes without sequence preference. We also characterized enzyme selectivity across six DNA modifications and reported enzymes that do not deaminate modified cytosines. The detailed analysis of diverse deaminases opens new avenues for biotechnological and medical applications. As a demonstration, we developed SEM-seq, a non-destructive single-enzyme methylation sequencing method using a modification-sensitive double-stranded DNA deaminase. The streamlined protocol enables accurate, base-resolution methylome mapping of scarce biological material, including cell-free DNA and 10 pg input DNA.


Cytosine Deaminase , Epigenome , DNA/genetics , Cytosine , DNA, Single-Stranded/genetics , Cytidine Deaminase/genetics
6.
ACS Omega ; 8(20): 18339, 2023 May 23.
Article En | MEDLINE | ID: mdl-37251133

[This corrects the article DOI: 10.1021/acsomega.9b00488.].

7.
Microorganisms ; 10(1), 2022 Jan 16.
Article En | MEDLINE | ID: mdl-35056642

Specialised metabolites produced during plant-fungal associations often define how symbiosis between the plant and the fungus proceeds. They also play a role in the establishment of additional interactions between the symbionts and other organisms present in the niche. However, specialised metabolism and its products are sometimes overlooked when studying plant-microbe interactions. This limits our understanding of the specific symbiotic associations and, potentially, future prospects for their application in agriculture. In this study, we used the interaction between the root endophyte Serendipita indica and tomato (Solanum lycopersicum) plants to explore how specialised metabolism of the host plant is regulated upon a mutualistic symbiotic association. To do so, tomato seedlings were inoculated with S. indica chlamydospores and subjected to RNAseq analysis. Gene expression of the main tomato specialised metabolism pathways was compared between roots and leaves of endophyte-colonised plants and tissues of endophyte-free plants. S. indica colonisation resulted in a strong transcriptional response in the leaves of colonised plants. Furthermore, the presence of the fungus in plant roots appears to induce expression of genes involved in the biosynthesis of lignin-derived compounds, polyacetylenes, and specific terpenes in both roots and leaves, whereas pathways producing glycoalkaloids and flavonoids were expressed at lower or basal levels.

8.
Biomolecules ; 11(6), 2021 06 16.
Article En | MEDLINE | ID: mdl-34208762

Interactions between plant-associated fungi and their hosts are characterized by a continuous crosstalk of chemical molecules. Specialized metabolites are often produced during these associations and play important roles in the symbiosis between the plant and the fungus, as well as in the establishment of additional interactions between the symbionts and other organisms present in the niche. Serendipita indica, a root endophytic fungus from the phylum Basidiomycota, is able to colonize a wide range of plant species, conferring many benefits to its hosts. The genome of S. indica possesses only few genes predicted to be involved in specialized metabolite biosynthesis, including a putative terpenoid synthase gene (SiTPS). In our experimental setup, SiTPS expression was upregulated when the fungus colonized tomato roots compared to its expression in fungal biomass growing on synthetic medium. Heterologous expression of SiTPS in Escherichia coli showed that the produced protein catalyzes the synthesis of a few sesquiterpenoids, with the alcohol viridiflorol being the main product. To investigate the role of SiTPS in the plant-endophyte interaction, an SiTPS-over-expressing mutant line was created and assessed for its ability to colonize tomato roots. Although overexpression of SiTPS did not lead to improved fungal colonization ability, an in vitro growth-inhibition assay showed that viridiflorol has antifungal properties. Addition of viridiflorol to the culture medium inhibited the germination of spores from a phytopathogenic fungus, indicating that SiTPS and its products could provide S. indica with a competitive advantage over other plant-associated fungi during root colonization.


Alkyl and Aryl Transferases/isolation & purification , Basidiomycota/enzymology , Sesquiterpenes/metabolism , Alkyl and Aryl Transferases/genetics , Alkyl and Aryl Transferases/metabolism , Basidiomycota/metabolism , Endophytes/metabolism , Gene Expression Regulation, Plant/genetics , Solanum lycopersicum/metabolism , Plant Roots/metabolism , Symbiosis/genetics , Terpenes/chemistry , Terpenes/metabolism
9.
Plant J ; 104(3): 693-705, 2020 11.
Article En | MEDLINE | ID: mdl-32777127

Serrulatane diterpenoids are natural products found in plants from a subset of genera within the figwort family (Scrophulariaceae). Many of these compounds have been characterized as having anti-microbial properties and share a common diterpene backbone. One example, leubethanol from Texas sage (Leucophyllum frutescens) has demonstrated activity against multi-drug-resistant tuberculosis. Leubethanol is the only serrulatane diterpenoid identified from this genus; however, a range of such compounds have been found throughout the closely related Eremophila genus. Despite their potential therapeutic relevance, the biosynthesis of serrulatane diterpenoids has not been previously reported. Here we leverage the simple product profile and high accumulation of leubethanol in the roots of L. frutescens and compare tissue-specific transcriptomes with existing data from Eremophila serrulata to decipher the biosynthesis of leubethanol. A short-chain cis-prenyl transferase (LfCPT1) first produces the rare diterpene precursor nerylneryl diphosphate, which is cyclized by an unusual plastidial terpene synthase (LfTPS1) into the characteristic serrulatane diterpene backbone. Final conversion to leubethanol is catalyzed by a cytochrome P450 (CYP71D616) of the CYP71 clan. This pathway documents the presence of a short-chain cis-prenyl diphosphate synthase, previously only found in Solanaceae, which is likely involved in the biosynthesis of other known diterpene backbones in Eremophila. LfTPS1 represents neofunctionalization of a compartment-switching terpene synthase accepting a novel substrate in the plastid. Biosynthetic access to leubethanol will enable pathway discovery to more complex serrulatane diterpenoids which share this common starting structure and provide a platform for the production and diversification of this class of promising anti-microbial therapeutics in heterologous systems.


Diterpenes/metabolism , Scrophulariaceae/metabolism , Alkyl and Aryl Transferases/metabolism , Cytochrome P-450 Enzyme System/genetics , Cytochrome P-450 Enzyme System/metabolism , Eremophila Plant/genetics , Escherichia coli/genetics , Neoprene/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Plant Roots/metabolism , Plants, Genetically Modified , Polyisoprenyl Phosphates/metabolism , Scrophulariaceae/genetics , Nicotiana/genetics , Nicotiana/metabolism , Transferases/genetics , Transferases/metabolism
10.
ACS Omega ; 4(4): 7323-7329, 2019 Apr 30.
Article En | MEDLINE | ID: mdl-31459832

Descriptions of molecular environments have many applications in chemoinformatics, including chemical shift prediction. Hierarchically ordered spherical environment (HOSE) codes are the most popular such descriptions. We developed a method to extend these with stereochemistry information. It enables distinguishing atoms which would be considered identical in traditional HOSE codes. The use of our method is demonstrated by chemical shift predictions for molecules in the nmrshiftdb2 database. We give a full specification and an implementation.
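The idea of a spherical environment can be illustrated with a minimal sketch. It is not the HOSE specification and not the authors' implementation: the function names, the input format (a dict of atom symbols plus a list of bonds), and the output string layout are all assumptions made for illustration. It groups the atoms around a centre by bond distance, which is the connectivity-only view that, as the abstract notes, cannot tell stereochemically distinct atoms apart.

```python
from collections import deque


def sphere_environment(atoms, bonds, center, max_spheres=3):
    """Group neighbouring atoms by bond distance (sphere) from a centre atom.

    atoms: dict mapping atom index -> element symbol (hypothetical input format)
    bonds: iterable of (i, j) index pairs
    Returns one sorted list of element symbols per sphere.
    """
    neighbors = {i: set() for i in atoms}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    distance = {center: 0}
    queue = deque([center])
    spheres = [[] for _ in range(max_spheres)]
    while queue:
        current = queue.popleft()
        if distance[current] == max_spheres:
            continue  # do not expand beyond the outermost sphere
        for nxt in neighbors[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                spheres[distance[nxt] - 1].append(atoms[nxt])
                queue.append(nxt)
    return [sorted(s) for s in spheres]


def environment_code(atoms, bonds, center, max_spheres=3):
    """Join the spheres into a simple string (illustrative layout, not real HOSE syntax)."""
    spheres = sphere_environment(atoms, bonds, center, max_spheres)
    return atoms[center] + ";" + "/".join("".join(s) for s in spheres)


# Heavy atoms of ethanol: C(0)-C(1)-O(2)
ethanol_atoms = {0: "C", 1: "C", 2: "O"}
ethanol_bonds = [(0, 1), (1, 2)]
methyl_code = environment_code(ethanol_atoms, ethanol_bonds, 0)
```

Because only connectivity enters the string, two atoms that differ solely in spatial arrangement would receive the same code here; a stereo-aware extension has to add information beyond what this sketch records.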

11.
New Phytol ; 223(1): 323-335, 2019 07.
Article En | MEDLINE | ID: mdl-30843212

The mint family (Lamiaceae) is well documented as a rich source of terpene natural products. More than 200 diterpene skeletons have been reported from mints, but biosynthetic pathways are known for just a few of these. We crossreferenced chemotaxonomic data with publicly available transcriptomes to select common selfheal (Prunella vulgaris) and its highly unusual vulgarisin diterpenoids as a case study for exploring the origins of diterpene skeletal diversity in Lamiaceae. Four terpene synthases (TPS) from the TPS-a subfamily, including two localised to the plastid, were cloned and functionally characterised. Previous examples of TPS-a enzymes from Lamiaceae were cytosolic and reported to act on the 15-carbon farnesyl diphosphate. Plastidial TPS-a enzymes using the 20-carbon geranylgeranyl diphosphate are known from other plant families, having apparently arisen independently in each family. All four new enzymes were found to be active on multiple prenyl-diphosphate substrates with different chain lengths and stereochemistries. One of the new enzymes catalysed the cyclisation of geranylgeranyl diphosphate into 11-hydroxy vulgarisane, the likely biosynthetic precursor of the vulgarisins. We uncovered the pathway to a rare diterpene skeleton. Our results support an emerging paradigm of substrate and compartment switching as important aspects of TPS evolution and diversification.


Alkyl and Aryl Transferases/metabolism , Evolution, Molecular , Prunella/enzymology , Alkyl and Aryl Transferases/genetics , Gene Expression Regulation, Plant , Peptides/metabolism , Phylogeny , Plant Leaves/genetics , Plant Roots/genetics , Polyisoprenyl Phosphates/metabolism , Prunella/genetics , Recombinant Fusion Proteins/metabolism , Substrate Specificity , Terpenes/chemistry , Terpenes/metabolism , Transcriptome/genetics
12.
Gigascience ; 8(3), 2019 03 01.
Article En | MEDLINE | ID: mdl-30698701

BACKGROUND: Teak, a member of the Lamiaceae family, produces one of the most expensive hardwoods in the world. High demand coupled with deforestation has caused a decrease in natural teak forests, and future supplies will be reliant on teak plantations. Hence, selection of teak tree varieties for clonal propagation with superior growth performance is of great importance, and access to high-quality genetic and genomic resources can accelerate the selection process by identifying genes underlying desired traits. FINDINGS: To facilitate teak research and variety improvement, we generated a highly contiguous, chromosomal-scale genome assembly using high-coverage Pacific Biosciences long reads coupled with high-throughput chromatin conformation capture. Of the 18 teak chromosomes, we generated 17 near-complete pseudomolecules with one chromosome present as two chromosome arm scaffolds. Genome annotation yielded 31,168 genes encoding 46,826 gene models, of which 39,930 and 41,155 had Pfam domain and expression evidence, respectively. We identified 14 clusters of tandem-duplicated terpene synthases (TPSs), genes central to the biosynthesis of terpenes, which are involved in plant defense and pollinator attraction. Transcriptome analysis revealed 10 TPSs highly expressed in woody tissues, of which 8 were in tandem, revealing the importance of resolving tandemly duplicated genes and the quality of the assembly and annotation. We also validated the enzymatic activity of four TPSs to demonstrate the function of key TPSs. CONCLUSIONS: In summary, this high-quality chromosomal-scale assembly and functional annotation of the teak genome will facilitate the discovery of candidate genes related to traits critical for sustainable production of teak and for anti-insecticidal natural products.


Biological Products/metabolism , Biosynthetic Pathways/genetics , Chromosomes, Plant/genetics , Cytochrome P-450 Enzyme System/metabolism , Gene Duplication , Genome, Plant , Lamiaceae/genetics , Alkyl and Aryl Transferases/genetics , Gene Expression Regulation, Plant , Genes, Plant , Molecular Sequence Annotation , Phylogeny , Transcriptome/genetics
13.
J Biol Chem ; 294(4): 1349-1362, 2019 01 25.
Article En | MEDLINE | ID: mdl-30498089

Members of the mint family (Lamiaceae) accumulate a wide variety of industrially and medicinally relevant diterpenes. We recently sequenced leaf transcriptomes from 48 phylogenetically diverse Lamiaceae species. Here, we summarize the available chemotaxonomic and enzyme activity data for diterpene synthases (diTPSs) in the Lamiaceae and leverage the new transcriptomes to explore the diTPS sequence and functional space. Candidate genes were selected with an intent to evenly sample the sequence homology space and to focus on species in which diTPS transcripts were found, yet from which no diterpene structures have been previously reported. We functionally characterized nine class II diTPSs and 10 class I diTPSs from 11 distinct plant species and found five class II activities, including two novel activities, as well as a spectrum of class I activities. Among the class II diTPSs, we identified a neo-cleroda-4(18),13E-dienyl diphosphate synthase from Ajuga reptans, catalyzing the likely first step in the biosynthesis of a variety of insect-antifeedant compounds. Among the class I diTPSs was a palustradiene synthase from Origanum majorana, leading to the discovery of specialized diterpenes in that species. Our results provide insights into the diversification of diterpene biosynthesis in the mint family and establish a comprehensive foundation for continued investigation of diterpene biosynthesis in the Lamiaceae.


Alkyl and Aryl Transferases/metabolism , Databases, Pharmaceutical , Diterpenes/metabolism , Lamiaceae/enzymology , Plant Leaves/metabolism , Plant Proteins/metabolism , Gene Expression Regulation, Plant , Lamiaceae/genetics , Lamiaceae/growth & development , Phylogeny , Plant Leaves/genetics , Plant Leaves/growth & development , Plant Proteins/genetics
14.
Planta ; 249(1): 221-233, 2019 Jan.
Article En | MEDLINE | ID: mdl-30470899

MAIN CONCLUSION: Modular assembly and heterologous expression in the moss Physcomitrella patens of pairs of diterpene synthases results in accumulation of modern land plant diterpenoids. Physcomitrella patens is a representative of the ancient bryophyte plant lineage with a genome size of 511 Mb, dominant haploid life cycle and limited chemical and metabolic complexity. For these plants, exceptional capacity for genome editing through homologous recombination is met with recently demonstrated in vivo assembly of multiple heterologous DNA fragments. These traits earlier made P. patens an attractive choice as a biotechnological chassis for photosynthesis-driven production of recombinant peptides. The lack of diterpene gibberellic acid phytohormones in P. patens combined with the recent targeted disruption of the single bifunctional diterpene synthase yielded lines devoid of endogenous diterpenoid metabolites and well-suited for engineering of terpenoid production. Here, we mimicked the modular nature of diterpene biosynthetic pathways found in modern land plants by developing a flexible pipeline to install three combinations of class II and class I diterpene synthases in P. patens to access industrially relevant diterpene biomaterials. In addition to a well-established neutral locus for targeted integration, we also explored loci created by a class of Long Terminal Repeat Retrotransposon present at moderate number in the genome of P. patens. Assembly of the pathways and production of the enzymes from the neutral locus led to accumulation of diterpenes matching the reported activities in the angiosperm sources. In contrast, insights gained with the retrotransposon loci indicate their suitability for targeting, but reveal potentially inherent complications which may require adaptation of the experimental design.


Biotechnology/methods , Diterpenes/metabolism , Bryopsida/metabolism , Gibberellins/metabolism , Life Cycle Stages/physiology , Photosynthesis/physiology , Plant Growth Regulators/metabolism , Retroelements/genetics , Synthetic Biology/methods
15.
Plant Physiol ; 175(2): 681-695, 2017 Oct.
Article En | MEDLINE | ID: mdl-28838953

The commercially important essential oils of peppermint (Mentha × piperita) and its relatives in the mint family (Lamiaceae) are accumulated in specialized anatomical structures called glandular trichomes (GTs). A genome-scale stoichiometric model of secretory phase metabolism in peppermint GTs was constructed based on current biochemical and physiological knowledge. Fluxes through the network were predicted based on metabolomic and transcriptomic data. Using simulated reaction deletions, this model predicted that two processes, the regeneration of ATP and ferredoxin (in its reduced form), exert substantial control over flux toward monoterpenes. Follow-up biochemical assays with isolated GTs indicated that oxidative phosphorylation and ethanolic fermentation were active and that cooperation to provide ATP depended on the concentration of the carbon source. We also report that GTs with high flux toward monoterpenes express, at very high levels, genes coding for a unique pair of ferredoxin and ferredoxin-NADP+ reductase isoforms. This study provides, to our knowledge, the first evidence of how bioenergetic processes determine flux through monoterpene biosynthesis in GTs.


Biosynthetic Pathways , Energy Metabolism , Mentha piperita/metabolism , Monoterpenes/metabolism , Oils, Volatile/metabolism , Trichomes/metabolism , Adenosine Triphosphate/metabolism , Amino Acid Sequence , Carbon/metabolism , Computer Simulation , Ferredoxins/metabolism , Mentha piperita/chemistry , Models, Molecular , Oxidative Phosphorylation , Plant Leaves/chemistry , Plant Leaves/metabolism , Sequence Alignment , Trichomes/chemistry
16.
Mol Plant ; 10(2): 323-339, 2017 02 13.
Article En | MEDLINE | ID: mdl-27867107

The genus Mentha encompasses mint species cultivated for their essential oils, which are formulated into a vast array of consumer products. Desirable oil characteristics and resistance to the fungal disease Verticillium wilt are top priorities for the mint industry. However, cultivated mints have complex polyploid genomes and are sterile. Breeding efforts, therefore, require the development of genomic resources for fertile mint species. Here, we present draft de novo genome and plastome assemblies for a wilt-resistant South African accession of Mentha longifolia (L.) Huds., a diploid species ancestral to cultivated peppermint and spearmint. The 353 Mb genome contains 35 597 predicted protein-coding genes, including 292 disease resistance gene homologs, and nine genes determining essential oil characteristics. A genetic linkage map ordered 1397 genome scaffolds on 12 pseudochromosomes. More than two million simple sequence repeats were identified, which will facilitate molecular marker development. The M. longifolia genome is a valuable resource for both metabolic engineering and molecular breeding. This is exemplified by employing the genome sequence to clone and functionally characterize the promoters in a peppermint cultivar, and demonstrating the utility of a glandular trichome-specific promoter to increase expression of a biosynthetic gene, thereby modulating essential oil composition.


Genome, Plant , Mentha/genetics , Base Sequence , Plant Breeding , Plant Diseases/genetics , Promoter Regions, Genetic
17.
Article En | MEDLINE | ID: mdl-25789275

Various databases have been developed to aid in assigning structures to spectral peaks observed in metabolomics experiments. In this review article, we discuss the utility of currently available open-access spectral and chemical databases for natural products discovery. We also provide recommendations on how the research community can contribute to further improvements.

19.
Phytochemistry ; 113: 87-95, 2015 May.
Article En | MEDLINE | ID: mdl-25534952

Development and testing of Spektraris-NMR, an online spectral resource, is reported for the NMR-based structural identification of plant natural products (PNPs). Spektraris-NMR allows users to search with multiple spectra at once and returns a table with a list of hits arranged according to the goodness of fit between query data and database entries. For each hit, a link to a tabulated alignment of ¹H NMR and ¹³C NMR spectroscopic peaks (query versus database entry) is provided. Furthermore, full spectroscopic records and experimental meta information about each database entry can be accessed online. To test the utility of Spektraris-NMR for PNP identification, the database was populated with NMR data (total of 466 spectra) for ~250 taxanes, which are structurally complex diterpenoids (including the anticancer drug taxol) commonly found in the genus Taxus. NMR data generated with metabolites purified from Taxus cell suspension cultures were then used to search Spektraris-NMR, and enabled the identification of eight taxanes with high confidence. A ninth isolated metabolite could be assigned, based on spectral searches, to a taxane skeletal class, but no high confidence hit was produced. Using various spectroscopic methods, this metabolite was characterized as 2-deacetylbaccatin IV, a novel taxane. These results indicate that Spektraris-NMR is a valuable resource for rapid and reliable identification of known metabolites and has the potential to contribute to de-replication efforts in novel PNP discovery.


Biological Products/isolation & purification , Diterpenes/isolation & purification , Taxoids/isolation & purification , Taxus/chemistry , Biological Products/chemistry , Bridged-Ring Compounds , Diterpenes/chemistry , Molecular Structure , Nuclear Magnetic Resonance, Biomolecular , Taxoids/chemistry
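
Ranking database entries by goodness of fit to a query peak list, as the Spektraris-NMR abstract above describes, can be sketched as follows. The abstract does not state the scoring function, so this is a hypothetical stand-in: the score is the fraction of query peaks that find an unused reference peak within a tolerance, using greedy matching. The function names, the 0.5 ppm tolerance, and the example shifts are invented for illustration and are not measured data.

```python
def match_score(query_peaks, reference_peaks, tolerance=0.5):
    """Fraction of query peaks matched to a distinct reference peak within tolerance (ppm)."""
    if not query_peaks:
        return 0.0
    unused = sorted(reference_peaks)
    matched = 0
    for shift in sorted(query_peaks):
        # Greedy choice: the closest reference peak that has not been used yet
        best = min(unused, key=lambda ref: abs(ref - shift), default=None)
        if best is not None and abs(best - shift) <= tolerance:
            matched += 1
            unused.remove(best)
    return matched / len(query_peaks)


def rank_hits(query_peaks, database, tolerance=0.5):
    """Return (name, score) pairs, best score first; ties are broken by name."""
    scored = [
        (match_score(query_peaks, peaks, tolerance), name)
        for name, peaks in database.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored]


# Made-up chemical shifts, for illustration only
database = {
    "compound_a": [10.0, 20.0, 30.0],
    "compound_b": [10.2, 50.0, 80.0],
}
ranking = rank_hits([10.1, 20.3, 29.8], database)
```

Greedy matching is simple but not optimal when peaks are crowded; a production search would more likely use an assignment algorithm and weight the size of each deviation rather than counting matches.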