ABSTRACT
Achieving a single, fully harmonised research data set suitable for analysis from data collected at multiple sites requires not only semantic integration of collection concepts and convergence onto single collection units, but also harmonisation of data collection processes. We describe our experience of identifying harmonisation challenges in the Precision ALS project, with particular focus on process alignment challenges in a multi-site, multi-national research data collection project.
Subjects
Data Collection; Humans; Amyotrophic Lateral Sclerosis/therapy; Biomedical Research
ABSTRACT
Tea, one of the most widely consumed beverages globally, exhibits remarkable genomic diversity in its underlying flavour and health-related compounds. In this study, we present the construction and analysis of a tea pangenome comprising a total of 11 genomes, with a focus on three newly sequenced genomes: the purple-leaved assamica cultivar "Zijuan", the temperature-sensitive sinensis cultivar "Anjibaicha" and the wild accession "L618", whose assemblies achieved excellent quality scores thanks to the latest sequencing technologies. Our analysis incorporates a detailed investigation of transposon complement across the tea pangenome, revealing shared patterns of transposon distribution among the studied genomes and improved transposon resolution with long read technologies, as shown by long terminal repeat (LTR) Assembly Index analysis. Furthermore, our study encompasses a gene-centric exploration of the pangenome, examining the genomic landscape of the catechin pathway and providing insights into copy number alterations and gene-centric variants, especially for anthocyanidin synthases. We constructed a gene-centric pangenome by structurally and functionally annotating all available genomes using an identical pipeline, which both increased gene completeness and allowed for a high functional annotation rate. This improved and consistently annotated gene set will allow for a better comparison between tea genomes. We used this improved pangenome to capture the core and dispensable gene repertoire, elucidating the functional diversity present within the tea species. This pangenome might serve as a valuable resource for understanding the fundamental genetic basis of traits such as flavour, stress tolerance, and disease resistance, with implications for tea breeding programmes.
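The split into a core and a dispensable gene repertoire mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each genome has already been reduced to a set of gene-family identifiers; the genome names, family IDs and the function `classify_gene_families` are hypothetical and are not taken from the study's pipeline. Here "core" means present in every genome and "dispensable" means present in at least one but not all.

```python
from collections import Counter


def classify_gene_families(presence):
    """Split gene families into core and dispensable sets.

    presence: dict mapping a genome name to the set of gene-family IDs
    found in that genome (hypothetical input format).
    """
    n_genomes = len(presence)
    # Each genome contributes a family at most once because values are sets.
    counts = Counter(fam for families in presence.values() for fam in families)
    core = {fam for fam, n in counts.items() if n == n_genomes}
    dispensable = set(counts) - core
    return core, dispensable


# Toy data, not real tea genomes.
toy_presence = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2"},
    "genome_C": {"fam1", "fam3", "fam4"},
}
core, dispensable = classify_gene_families(toy_presence)
print(sorted(core))         # families shared by all three toy genomes
print(sorted(dispensable))  # families missing from at least one toy genome
```

Real pangenome analyses usually relax the "present in every genome" rule to tolerate annotation gaps; the strict definition is used here only to keep the sketch short.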
Subjects
Camellia sinensis; DNA Transposable Elements; Genome, Plant; Camellia sinensis/genetics; Genome, Plant/genetics; DNA Transposable Elements/genetics; Genetic Variation; Tea/genetics; Genomics; Catechin/genetics
ABSTRACT
BACKGROUND: Plant immunity relies on the perception of immunogenic signals by cell-surface and intracellular receptors and subsequent activation of defense responses like programmed cell death. Under certain circumstances, the fine-tuned innate immune system of plants results in the activation of autoimmune responses that cause constitutive defense responses and spontaneous cell death in the absence of pathogens. RESULTS: Here, we characterized the onset of leaf death 12 (old12) mutant that was identified in the Arabidopsis accession Landsberg erecta. The old12 mutant is characterized by a growth defect, spontaneous cell death, plant-defense gene activation, and early senescence. In addition, the old12 phenotype is temperature reversible, thereby exhibiting all characteristics of an autoimmune mutant. Mapping the mutated locus revealed that the old12 phenotype is caused by a mutation in the Lectin Receptor Kinase P2-TYPE PURINERGIC RECEPTOR 2 (P2K2) gene. Interestingly, the P2K2 allele from Landsberg erecta is conserved among Brassicaceae. P2K2 has been implicated in pathogen tolerance and sensing extracellular ATP. The constitutive activation of defense responses in old12 results in improved resistance against Pseudomonas syringae pv. tomato DC3000. CONCLUSION: We demonstrate that old12 is an auto-immune mutant and that allelic variation of P2K2 contributes to diversity in Arabidopsis immune responses.
Subjects
Arabidopsis Proteins; Arabidopsis; Arabidopsis/metabolism; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; Lectins/genetics; Lectins/metabolism; Disease Resistance/physiology; Plant Leaves/metabolism; Mutation; Carrier Proteins/genetics; Phenotype; Receptors, Mitogen/genetics; Receptors, Mitogen/metabolism; Pseudomonas syringae/metabolism; Plant Diseases/genetics; Gene Expression Regulation, Plant
ABSTRACT
Amyotrophic Lateral Sclerosis (ALS) is an incurable neurodegenerative condition. Despite significant advances in pre-clinical models that enhance understanding of disease pathobiology, translation of candidate drugs to effective human therapies has been disappointing. There is increasing recognition of the need for a precision medicine approach toward drug development, as many failures in translation can be attributed in part to disease heterogeneity in humans. PRECISION-ALS is an academic-industry collaboration between clinicians, computer scientists, information engineers, technologists, data scientists and industry partners that will address the key clinical, computational, data science and technology-associated research questions to generate a sustainable precision-medicine-based approach toward new drug development. Using extant and prospectively collected population-based clinical data across nine European sites, PRECISION-ALS provides a General Data Protection Regulation (GDPR) compliant framework that seamlessly collects, processes and analyses research-quality multimodal and multi-sourced clinical, patient and caregiver journey, digitally acquired data through remote monitoring, imaging, neuro-electric-signaling, genomic and biomarker datasets using machine learning and artificial intelligence. PRECISION-ALS represents a first-in-kind modular transferable pan-European ICT framework for ALS that can be easily adapted to other regions that face similar precision-medicine-related challenges in multimodal data collection and analysis.
Subjects
Amyotrophic Lateral Sclerosis; Humans; Amyotrophic Lateral Sclerosis/diagnosis; Amyotrophic Lateral Sclerosis/epidemiology; Amyotrophic Lateral Sclerosis/genetics; Artificial Intelligence; Biomarkers; Machine Learning
ABSTRACT
The common foodstuff garlic produces the potent antibiotic defense substance allicin after tissue damage. Allicin is a redox toxin that oxidizes glutathione and cellular proteins and makes garlic a highly hostile environment for non-adapted microbes. Genomic clones from a highly allicin-resistant Pseudomonas fluorescens (PfAR-1), which was isolated from garlic, conferred allicin resistance to Pseudomonas syringae and even to Escherichia coli. Resistance-conferring genes had redox-related functions and were on core fragments from three similar genomic islands identified by sequencing and in silico analysis. Transposon mutagenesis and overexpression analyses revealed the contribution of individual candidate genes to allicin resistance. Taken together, our data define a multicomponent resistance mechanism against allicin in PfAR-1, achieved through horizontal gene transfer.
Subjects
Disulfides/pharmacology; Drug Resistance, Bacterial/genetics; Pseudomonas/genetics; Sulfinic Acids/pharmacology; Anti-Bacterial Agents/metabolism; Disulfides/metabolism; Garlic/metabolism; Glutathione/metabolism; Oxidation-Reduction; Sulfinic Acids/metabolism
ABSTRACT
Upon local infection, plants activate a systemic immune response called systemic acquired resistance (SAR). During SAR, systemic leaves become primed for the superinduction of defense genes upon reinfection. We used formaldehyde-assisted isolation of regulatory DNA elements coupled to next-generation sequencing to identify SAR regulators. Our bioinformatic analysis produced 10,129 priming-associated open chromatin sites in the 5' region of 3,025 genes in the systemic leaves of Arabidopsis (Arabidopsis thaliana) plants locally infected with Pseudomonas syringae pv. maculicola. Whole transcriptome shotgun sequencing analysis of the systemic leaves after challenge enabled the identification of genes with priming-linked open chromatin before (contained in the formaldehyde-assisted isolation of regulatory DNA elements sequencing dataset) and enhanced expression after (included in the whole transcriptome shotgun sequencing dataset) the systemic challenge. Among them, Arabidopsis MILDEW RESISTANCE LOCUS O3 (MLO3) was identified as a previously unidentified positive regulator of SAR. Further in silico analysis disclosed two previously unknown cis-regulatory DNA elements in the 5' region of genes. The P-box was mainly associated with priming-responsive genes, whereas the C-box was mostly linked to challenge. We found that the P- or W-box, the latter recruiting WRKY transcription factors, or combinations of these boxes, characterize the 5' region of most primed genes. Therefore, this study provides a genome-wide record of genes with open and accessible chromatin during SAR and identifies MLO3 and two previously unidentified DNA boxes as likely regulators of this immune response.
Subjects
Arabidopsis Proteins/metabolism; Arabidopsis/immunology; Calmodulin-Binding Proteins/metabolism; Plant Immunity; Arabidopsis/metabolism; Pseudomonas syringae; Regulatory Elements, Transcriptional
ABSTRACT
Natural light environments are highly variable. Flexible adjustment between light energy utilization and photoprotection is therefore of vital importance for plant performance and fitness in the field. Short-term reactions to changing light intensity are triggered inside chloroplasts and leaves within seconds to minutes, whereas long-term adjustments proceed over hours and days, integrating multiple signals. While the mechanisms of long-term acclimation to light intensity have been studied by changing constant growth light intensity during the day, responses to fluctuating growth light intensity have rarely been inspected in detail. We performed transcriptome profiling in Arabidopsis (Arabidopsis thaliana) leaves to investigate long-term gene expression responses to fluctuating light (FL). In particular, we examined whether responses differ between young and mature leaves or between morning and the end of the day. Our results highlight global reprogramming of gene expression under FL, including that of genes related to photoprotection, photosynthesis, and photorespiration and to pigment, prenylquinone, and vitamin metabolism. The FL-induced changes in gene expression varied between young and mature leaves at the same time point and between the same leaves in the morning and at the end of the day, indicating interactions of FL acclimation with leaf development stage and time of day. Only 46 genes were up- or down-regulated in both young and mature leaves at both time points. Combined analyses of gene coexpression and cis-elements pointed to a role of the circadian clock and light in coordinating the acclimatory responses of functionally related genes. Our results also suggest a possible cross talk between FL acclimation and systemic acquired resistance-like gene expression in young leaves.
Subjects
Arabidopsis/radiation effects; Gene Expression Regulation, Plant/radiation effects; Acclimatization/genetics; Arabidopsis/genetics; Arabidopsis/growth & development; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; Arabidopsis Proteins/physiology; Gene Expression Profiling; Oxidative Stress/genetics; Oxidative Stress/radiation effects; Photosynthesis/genetics; Plant Leaves/genetics; Plant Leaves/growth & development; Plant Leaves/radiation effects; Sunlight; Time Factors
ABSTRACT
Genome sequences from over 200 plant species have already been published, with this number expected to increase rapidly due to advances in sequencing technologies. Once a new genome has been assembled and the genes identified, the functional annotation of their putative translational products, proteins, using ontologies is of key importance as it places the sequencing data in a biological context. Furthermore, to keep pace with rapid production of genome sequences, this functional annotation process must be fully automated. Here we present a redesigned and significantly enhanced MapMan4 framework, together with a revised version of the associated online Mercator annotation tool. Compared with the original MapMan, the new ontology has been expanded almost threefold and enforces stricter assignment rules. This framework was then incorporated into Mercator4, which has been upgraded to reflect current knowledge across the land plant group, providing protein annotations for all embryophytes with a comparably high quality. The annotation process has been optimized to allow a plant genome to be annotated in a matter of minutes. The output results continue to be compatible with the established MapMan desktop application.
Subjects
Databases, Genetic; Genome, Plant/genetics; Data Analysis; Transcriptome/genetics
ABSTRACT
Recent advances in genomics technologies have greatly accelerated the progress in both fundamental plant science and applied breeding research. Concurrently, high-throughput plant phenotyping is becoming widely adopted in the plant community, promising to alleviate the phenotypic bottleneck. While these technological breakthroughs are significantly accelerating quantitative trait locus (QTL) and causal gene identification, challenges to enable even more sophisticated analyses remain. In particular, care needs to be taken to standardize, describe and conduct experiments robustly while relying on plant physiology expertise. In this article, we review the state of the art regarding genome assembly and the future potential of pangenomics in plant research. We also describe the necessity of standardizing and describing phenotypic studies using the Minimum Information About a Plant Phenotyping Experiment (MIAPPE) standard to enable the reuse and integration of phenotypic data. In addition, we show how deep phenotypic data might yield novel trait-trait correlations and review how to link phenotypic data to genomic data. Finally, we provide perspectives on the golden future of machine learning and its potential in linking phenotypes to genomic features.
Subjects
Genetic Association Studies; Genome, Plant/genetics; Genomics; Machine Learning; Phenomics; Plants/genetics; Phenotype; Quantitative Trait Loci/genetics
ABSTRACT
Global warming is becoming a significant problem for food security, particularly in the Mediterranean basin. The use of molecular techniques to study gene-level responses to environmental changes in non-model organisms is increasing and may help to improve the mechanistic understanding of durum wheat response to elevated CO2 and high temperature. With this purpose, we performed transcriptome RNA sequencing (RNA-Seq) analyses combined with physiological and biochemical studies in the flag leaf of plants grown in field chambers at ear emergence. Enhanced photosynthesis by elevated CO2 was accompanied by an increase in biomass and starch and fructan content, and a decrease in N compounds, such as chlorophyll, soluble proteins, and Rubisco content, in association with a decline of nitrate reductase and initial and total Rubisco activities. While high temperature led to a decline of chlorophyll, Rubisco activity, and protein content, the glucose content increased and starch decreased. Furthermore, elevated CO2 induced several genes involved in mitochondrial electron transport, a few genes for photosynthesis and fructan synthesis, and most of the genes involved in secondary metabolism and gibberellin and jasmonate metabolism, whereas those related to light harvesting, N assimilation, and other hormone pathways were repressed. High temperature repressed genes for C, energy, N, lipid, secondary, and hormone metabolisms. Under the combined increases in atmospheric CO2 and temperature, the transcript profile resembled that previously reported for high temperature, although elevated CO2 partly alleviated the downregulation of primary and secondary metabolism genes. The results suggest that there was a reprogramming of primary and secondary metabolism under the future climatic scenario, leading to coordinated regulation of C-N metabolism towards C-rich metabolites at elevated CO2 and a shift away from C-rich secondary metabolites at high temperature. Several differentially expressed candidate genes were identified, including protein kinases, receptor kinases, and transcription factors.
ABSTRACT
A parasitic lifestyle, where plants procure some or all of their nutrients from other living plants, has evolved independently in many dicotyledonous plant families and is a major threat for agriculture globally. Nevertheless, no genome sequence of a parasitic plant has been reported to date. Here we describe the genome sequence of the parasitic field dodder, Cuscuta campestris. The genome contains signatures of a fairly recent whole-genome duplication and lacks genes for pathways superfluous to a parasitic lifestyle. Specifically, genes needed for high photosynthetic activity are lost, explaining the low photosynthesis rates displayed by the parasite. Moreover, several genes involved in nutrient uptake processes from the soil are lost. On the other hand, evidence for horizontal gene transfer by way of genomic DNA integration from the parasite's hosts is found. We conclude that the parasitic lifestyle has left characteristic footprints in the C. campestris genome.
Subjects
Cuscuta/genetics; Gene Duplication; Gene Expression Regulation, Plant; Genome, Plant; Host-Parasite Interactions; Plant Proteins/genetics; Carrier Proteins/genetics; Carrier Proteins/metabolism; Cuscuta/classification; Gene Deletion; Gene Ontology; Karyotype; Metabolic Networks and Pathways/genetics; Molecular Sequence Annotation; Pelargonium/parasitology; Photosynthesis/genetics; Phylogeny; Plant Proteins/metabolism
ABSTRACT
Dinitrogen fixation by Nostoc azollae residing in specialized leaf pockets supports prolific growth of the floating fern Azolla filiculoides. To evaluate contributions by further microorganisms, the A. filiculoides microbiome and nitrogen metabolism in bacteria persistently associated with Azolla ferns were characterized. A metagenomic approach was taken complemented by detection of N2O released and nitrogen isotope determinations of fern biomass. Ribosomal RNA genes in sequenced DNA of natural ferns, their enriched leaf pockets and water filtrate from the surrounding ditch established that bacteria of A. filiculoides differed entirely from surrounding water and revealed species of the order Rhizobiales. Analyses of seven cultivated Azolla species confirmed persistent association with Rhizobiales. Two distinct nearly full-length Rhizobiales genomes were identified in leaf-pocket-enriched samples from ditch grown A. filiculoides. Their annotation revealed genes for denitrification but not N2-fixation. 15N2 incorporation was active in ferns with N. azollae but not in ferns without. N2O was not detectably released from surface-sterilized ferns with the Rhizobiales. N2-fixing N. azollae, we conclude, dominated the microbiome of Azolla ferns. The persistent but less abundant heterotrophic Rhizobiales bacteria possibly contributed to lowering O2 levels in leaf pockets but did not release detectable amounts of the strong greenhouse gas N2O.
Subjects
Alphaproteobacteria/physiology; Ferns/microbiology; Nitrogen/metabolism; Nostoc/physiology; Oxygen/metabolism; Alphaproteobacteria/genetics; Alphaproteobacteria/isolation & purification; Biomass; Denitrification; Endophytes; Ferns/growth & development; Metagenome; Microbiota; Nitrogen Fixation; Nitrogen Isotopes/analysis; Nostoc/genetics; Nostoc/isolation & purification; Plant Leaves/growth & development; Plant Leaves/microbiology; Water; Water Microbiology
ABSTRACT
Updates in nanopore technology have made it possible to obtain gigabases of sequence data. Prior to this, nanopore sequencing technology was mainly used to analyze microbial samples. Here, we describe the generation of a comprehensive nanopore sequencing data set with a median read length of 11,979 bp for a self-compatible accession of the wild tomato species Solanum pennellii. We describe the assembly of its genome to a contig N50 of 2.5 MB. The assembly pipeline comprised initial read correction with Canu and assembly with SMARTdenovo. The resulting raw nanopore-based de novo genome was structurally highly similar to that of the reference S. pennellii LA716 accession but had a high error rate and was rich in homopolymer deletions. After polishing the assembly with Illumina reads, we obtained an error rate of <0.02% when assessed versus the same Illumina data. We obtained a gene completeness of 96.53%, slightly surpassing that of the reference S. pennellii. Taken together, our data indicate that such long read sequencing data can be used to affordably sequence and assemble gigabase-sized plant genomes.
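For readers unfamiliar with the contig N50 statistic quoted above, the following minimal sketch shows how it is commonly computed from a list of contig lengths: sort the contigs from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the contig length L such that contigs of length >= L
    together cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (arbitrary units), not real assembly data.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70: the two longest contigs reach half of 300
```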
Subjects
High-Throughput Nucleotide Sequencing/methods; Nanopores; Solanum/genetics; Sequence Analysis, DNA
ABSTRACT
Sustainable agriculture demands reduced input of man-made nitrogen (N) fertilizer, yet N2 fixation limits the productivity of crops with heterotrophic diazotrophic bacterial symbionts. We investigated floating ferns from the genus Azolla that host phototrophic diazotrophic Nostoc azollae in leaf pockets and are among the fastest-growing plants. Experimental production reported here demonstrated N-fertilizer independent production of nitrogen-rich biomass with an annual yield potential per ha of 1200 kg-1 N fixed and 35 t dry biomass. 15N2 fixation peaked at noon, reaching 0.4 mg N g-1 dry weight h-1. Azolla ferns therefore merit consideration as protein crops, although little is known about the ferns' physiology to enable domestication. To gain an understanding of their nitrogen physiology, analyses of fern diel transcript profiles under differing nitrogen fertilizer regimes were combined with microscopic observations. Results established that the ferns adapted to the phototrophic N2-fixing symbionts N. azollae by (1) adjusting metabolically to nightly absence of N supply using responses ancestral to ferns and seed plants; (2) developing a specialized xylem-rich vasculature surrounding the leaf-pocket organ; (3) responding to N-supply by controlling transcripts of genes mediating nutrient transport, allocation and vasculature development. Unlike other non-seed plants, the Azolla fern clock is shown to contain both the morning and evening loops; the evening loop is known to control rhythmic gene expression in the vasculature of seed plants and therefore may have evolved along with the vasculature in the ancestor of ferns and seed plants.
ABSTRACT
The extreme sensitivity of the microsporogenesis process to moderately high or low temperatures is a major hindrance for tomato (Solanum lycopersicum) sexual reproduction and hence year-round cropping. Consequently, breeding for parthenocarpy, namely, fertilization-independent fruit set, is considered a valuable goal, especially for maintaining sustainable agriculture in the face of global warming. A mutant capable of setting high-quality seedless (parthenocarpic) fruit was found following a screen of an EMS-mutagenized tomato population for yielding under heat stress. Next-generation sequencing followed by marker-assisted mapping and CRISPR/Cas9 gene knockout confirmed that a mutation in SlAGAMOUS-LIKE 6 (SlAGL6) was responsible for the parthenocarpic phenotype. The mutant is capable of fruit production under heat stress conditions that severely hamper fertilization-dependent fruit set. Unlike other tomato recessive monogenic mutants for parthenocarpy, Slagl6 mutations impose no homeotic changes, the seedless fruits are of normal weight and shape, pollen viability is unaffected, and sexual reproduction capacity is maintained, thus making Slagl6 an attractive gene for facultative parthenocarpy. The characteristics of the analysed mutant combined with the gene's mode of expression implicate SlAGL6 as a key regulator of the transition between the state of 'ovary arrest' imposed towards anthesis and the fertilization-triggered fruit set.
Subjects
Fruit/genetics; Plant Proteins/genetics; Solanum lycopersicum/genetics; CRISPR-Cas Systems; Fruit/growth & development; Gene Expression Regulation, Plant; Heat-Shock Response/genetics; Solanum lycopersicum/physiology; Mutation; Plant Proteins/metabolism; Plants, Genetically Modified; Seeds/genetics
ABSTRACT
Streptomyces thermoautotrophicus UBT1 has been described as a moderately thermophilic chemolithoautotroph with a novel nitrogenase enzyme that is oxygen-insensitive. We have cultured the UBT1 strain, and have isolated two new strains (H1 and P1-2) of very similar phenotypic and genetic characters. These strains show minimal growth on ammonium-free media, and fail to incorporate isotopically labeled N2 gas into biomass in multiple independent assays. The sdn genes previously published as the putative nitrogenase of S. thermoautotrophicus have little similarity to anything found in draft genome sequences, published here, for strains H1 and UBT1, but share >99% nucleotide identity with genes from Hydrogenibacillus schlegelii, a draft genome for which is also presented here. H. schlegelii similarly lacks nitrogenase genes and is a non-diazotroph. We propose reclassification of the species containing strains UBT1, H1, and P1-2 as a non-Streptomycete, non-diazotrophic, facultative chemolithoautotroph and conclude that the existence of the previously proposed oxygen-tolerant nitrogenase is extremely unlikely.
Subjects
Genes, Bacterial; Nitrogen Fixation; Streptomyces/genetics; Streptomyces/metabolism; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Isotope Labeling; Nitrogen/metabolism; Nitrogenase/genetics; Nitrogenase/metabolism; Sequence Homology, Nucleic Acid
ABSTRACT
Solanum pennellii is a wild tomato species endemic to Andean regions in South America, where it has evolved to thrive in arid habitats. Because of its extreme stress tolerance and unusual morphology, it is an important donor of germplasm for the cultivated tomato Solanum lycopersicum. Introgression lines (ILs) in which large genomic regions of S. lycopersicum are replaced with the corresponding segments from S. pennellii can show remarkably superior agronomic performance. Here we describe a high-quality genome assembly of the parents of the IL population. By anchoring the S. pennellii genome to the genetic map, we define candidate genes for stress tolerance and provide evidence that transposable elements had a role in the evolution of these traits. Our work paves a path toward further tomato improvement and for deciphering the mechanisms underlying the myriad other agronomic traits that can be improved with S. pennellii germplasm.
Subjects
Genome, Plant; Solanum/genetics; Stress, Physiological/genetics; Chromosome Mapping/methods; Chromosomes, Plant; DNA Transposable Elements; Quantitative Trait Loci
ABSTRACT
MOTIVATION: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data. RESULTS: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested. AVAILABILITY AND IMPLEMENTATION: Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic CONTACT: usadel@bio1.rwth-aachen.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
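To make concrete what "correct handling of paired-end data" involves, the sketch below shows one common convention: when a filter discards only one mate of a pair, the surviving mate is routed to a separate "unpaired" output so that the paired outputs stay in step with each other. This is an illustrative Python toy, not Trimmomatic's code (Trimmomatic is a Java tool); the function `filter_pairs`, the simple minimum-length filter and the toy reads are all assumptions made for the example.

```python
def filter_pairs(pairs, min_len):
    """Apply a minimum-length filter to paired reads.

    pairs: iterable of (forward_seq, reverse_seq) tuples.
    Returns (paired, forward_only, reverse_only), where `paired`
    keeps only pairs in which both mates passed the filter.
    """
    paired, forward_only, reverse_only = [], [], []
    for fwd, rev in pairs:
        keep_fwd = len(fwd) >= min_len
        keep_rev = len(rev) >= min_len
        if keep_fwd and keep_rev:
            paired.append((fwd, rev))   # both mates survive together
        elif keep_fwd:
            forward_only.append(fwd)    # orphaned forward read
        elif keep_rev:
            reverse_only.append(rev)    # orphaned reverse read
        # if neither mate passes, the pair is dropped entirely
    return paired, forward_only, reverse_only


# Toy reads, not real sequencing data.
toy_pairs = [
    ("ACGTAC", "TTGACA"),  # both long enough
    ("ACGTAC", "TT"),      # reverse mate too short
    ("AC", "TTGACA"),      # forward mate too short
    ("A", "T"),            # both too short
]
paired, fwd_only, rev_only = filter_pairs(toy_pairs, min_len=4)
print(len(paired), len(fwd_only), len(rev_only))
```

Keeping orphaned reads in separate outputs means downstream tools that expect synchronized forward and reverse files are not given mismatched mates.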
Subjects
High-Throughput Nucleotide Sequencing/methods; Software; Computational Biology; Databases, Genetic
ABSTRACT
Functional gene clusters, containing two or more genes encoding different enzymes for the same pathway, are sometimes observed in plant genomes, most often when the genes specify the synthesis of specialized defensive metabolites. Here, we show that a cluster of genes in tomato (Solanum lycopersicum; Solanaceae) contains genes for terpene synthases (TPSs) that specify the synthesis of monoterpenes and diterpenes from cis-prenyl diphosphates, substrates that are synthesized by enzymes encoded by cis-prenyl transferase (CPT) genes also located within the same cluster. The monoterpene synthase genes in the cluster likely evolved from a diterpene synthase gene in the cluster by duplication and divergence. In the orthologous cluster in Solanum habrochaites, a new sesquiterpene synthase gene was created by a duplication event of a monoterpene synthase followed by a localized gene conversion event directed by a diterpene synthase gene. The TPS genes in the Solanum cluster encoding cis-prenyl diphosphate-utilizing enzymes are closely related to a tobacco (Nicotiana tabacum; Solanaceae) diterpene synthase encoding Z-abienol synthase (Nt-ABS). Nt-ABS uses the substrate copal-8-ol diphosphate, which is made from the all-trans geranylgeranyl diphosphate by copal-8-ol diphosphate synthase (Nt-CPS2). The Solanum gene cluster also contains an ortholog of Nt-CPS2, but it appears to encode a nonfunctional protein. Thus, the Solanum functional gene cluster evolved by duplication and divergence of TPS genes, together with alterations in substrate specificity to utilize cis-prenyl diphosphates and through the acquisition of CPT genes.
Subjects
Multigene Family; Plant Proteins/genetics; Solanum/genetics; Terpenes/metabolism; Alkyl and Aryl Transferases/classification; Alkyl and Aryl Transferases/genetics; Alkyl and Aryl Transferases/metabolism; Base Sequence; Biosynthetic Pathways/genetics; Chromosome Mapping; Chromosomes, Plant/genetics; Diterpenes/chemistry; Diterpenes/metabolism; Evolution, Molecular; Gene Conversion; Gene Duplication; Gene Expression Regulation, Plant; Genetic Variation; Solanum lycopersicum/genetics; Solanum lycopersicum/metabolism; Molecular Sequence Data; Molecular Structure; Monoterpenes/chemistry; Monoterpenes/metabolism; Phylogeny; Plant Proteins/classification; Plant Proteins/metabolism; Reverse Transcriptase Polymerase Chain Reaction; Solanum/classification; Solanum/metabolism; Species Specificity; Substrate Specificity; Terpenes/chemistry; Transferases/classification; Transferases/genetics; Transferases/metabolism
ABSTRACT
Although applied over extremely short timescales, artificial selection has dramatically altered the form, physiology, and life history of cultivated plants. We have used RNAseq to define both gene sequence and expression divergence between cultivated tomato and five related wild species. Based on sequence differences, we detect footprints of positive selection in over 50 genes. We also document thousands of shifts in gene-expression level, many of which resulted from changes in selection pressure. These rapidly evolving genes are commonly associated with environmental response and stress tolerance. The importance of environmental inputs during evolution of gene expression is further highlighted by large-scale alteration of the light response coexpression network between wild and cultivated accessions. Human manipulation of the genome has heavily impacted the tomato transcriptome through directed admixture and by indirectly favoring nonsynonymous over synonymous substitutions. Taken together, our results shed light on the pervasive effects artificial and natural selection have had on the transcriptomes of tomato and its wild relatives.