ABSTRACT
The Dicer protein is an indispensable player in fundamental cellular pathways such as miRNA biogenesis and the regulation of protein expression. Recently, both germline and somatic mutations in DICER1 have been identified in diverse types of cancer, which suggests that Dicer mutations can lead to cancer progression. In addition to the well-known hotspot mutations in the RNase III domains, DICER1 is characterized by a wide spectrum of variants in all of its functional domains; most are of uncertain significance, with no stated clinical effect. Moreover, new somatic DICER1 mutations continue to appear in cancer genome sequencing. Current methods of variant effect prediction apply machine learning algorithms to bulk data and yield suboptimal correlation with biological data. Consequently, such analysis should be based on the functional and structural characteristics of each protein, using a well-grounded targeted dataset rather than large amounts of unsupervised data. Domains are the functional and evolutionary units of a protein; the analysis of the whole protein should therefore rest on a separate, independent examination of each domain through its evolutionary reconstruction. Dicer is a hallmark example of a multidomain protein, and we confirmed that the phylogenetic multidomain approach is beneficial for predicting the clinical effect of Dicer variants. Because Dicer has been suggested to have a putative role in hematological malignancies, we examined DICER1 variants occurring outside the well-known hotspots of the RNase III domain in this type of cancer, using phylogenetic reconstruction of individual domain history. The examined substitutions might disrupt Dicer function, as demonstrated by molecular dynamics simulation, in which distinct structural alterations were observed for each mutation. Our approach can be used to study other multidomain proteins and to improve clinical effect evaluation.
ABSTRACT
Global warming is intensifying the environmental stresses that affect plant yield and distribution, including those of the loquat tree (Eriobotrya japonica Lindl.). E. japonica, a member of the Rosaceae family, is valued not only for its nutritious fruit but also for its medicinal, pharmacological, and landscape uses. Nonetheless, the productivity of E. japonica has become a serious concern under adverse environmental conditions. Understanding the characteristics of the LRR-RLK gene family in loquat is crucial, as these genes play vital roles in plant stress responses. In this study, 283 LRR-RLK genes were identified in the genome of E. japonica, randomly positioned on 17 chromosomes and 24 contigs. In the phylogenetic analysis, the 283 EjLRR-RLK proteins clustered into 21 classes and subclasses based on domain and protein arrangements. Further exploration of the promoter regions of the EjLRR-RLK genes showed an abundance of cis-regulatory elements that function in growth and development, phytohormone responses, and biotic and abiotic stress responses. Most cis-elements were associated with biotic and abiotic responses, suggesting that the EjLRR-RLK genes are involved in regulating both biotic and abiotic stresses. RT-qPCR analysis showed that EjLRR-RLK genes respond to abiotic stress, especially heat and salt stress. In particular, EjapXI-1.6 and EjapI-2.5 were consistently upregulated under all stresses analyzed, indicating that they may take an active role in regulating abiotic stress responses. Our findings suggest pivotal functions for EjLRR-RLK genes, although additional research is still required. This research provides useful information on the characterization of EjLRR-RLK genes and their responses to environmental stresses, establishing a solid basis for future research.
ABSTRACT
Here we investigate the evolutionary dynamics of five enzyme superfamilies (CYPs, GSTs, UGTs, CCEs and ABCs) involved in detoxification in Helicoverpa armigera. The reference assembly for an African isolate of the major lineage, H. a. armigera, has 373 genes in the five superfamilies. Most of its CYPs, GSTs, UGTs and CCEs and a few of its ABCs occur in blocks, and most of the clustered genes are in subfamilies specifically implicated in detoxification. Most of the genes have orthologues in the reference genome for the Oceania lineage, H. a. conferta. However, clustered orthologues and subfamilies specifically implicated in detoxification show greater sequence divergence and less constraint on non-synonymous differences between the two assemblies than do other members of the five superfamilies. Two duplicated CYPs, which were found in the H. a. armigera but not the H. a. conferta reference genome, were also missing in 16 Chinese populations spanning two different lineages of H. a. armigera. The enzyme produced by one of these duplicates has higher activity against esfenvalerate than a previously described chimeric CYP mutant conferring pyrethroid resistance. Various transposable elements were found in the introns of most detoxification genes, generating diverse gene structures. Extensive resequencing data for the Chinese H. a. armigera and H. a. conferta lineages also revealed complex copy number polymorphisms in 17 CCE001s in a cluster also implicated in pyrethroid metabolism, with substantial haplotype differences between all three lineages. Our results suggest that cotton bollworm has a versatile complement of detoxification genes which are evolving in diverse ways across its range.
Subjects
Cytochrome P-450 Enzyme System, Helicoverpa armigera, Animals, China, Cytochrome P-450 Enzyme System/genetics, Molecular Evolution, Gene Duplication, Helicoverpa armigera/enzymology, Helicoverpa armigera/genetics, Metabolic Inactivation/genetics, Phylogeny
ABSTRACT
ATP-BINDING CASSETTE SUBFAMILY E MEMBER (ABCE) proteins are among the most conserved proteins across eukaryotes and archaea. Yeast and most animals possess a single ABCE gene encoding the critical translational factor ABCE1. In several plant species, including Arabidopsis thaliana and Oryza sativa, two or more ABCE gene copies have been identified; however, information on the plant ABCE gene family is still missing. In this study we retrieved ABCE gene sequences of 76 plant species from public genome databases and comprehensively analyzed them with reference to the A. thaliana ABCE2 gene (AtABCE2). Using a bioinformatic approach, we assessed the conservation and phylogeny of plant ABCEs. In addition, we performed haplotype analysis of AtABCE2 and its paralogue AtABCE1 using genomic sequences of 1,135 A. thaliana ecotypes. Plant ABCE proteins showed overall high sequence conservation, sharing at least 78% amino acid sequence identity with AtABCE2. We found that over half of the selected species have two to eight ABCE genes, suggesting that in plants ABCE genes can be classified as a low-copy gene family rather than a single-copy gene family. The phylogenetic trees of ABCE protein sequences and the corresponding coding sequences demonstrated that the Brassicaceae and Poaceae families have independently undergone a lineage-specific split of the ancestral ABCE gene. Other plant species have gained ABCE gene copies through more recent duplication events. We also noticed that ploidy level, but not the ancient whole genome duplications experienced by a species, affects ABCE gene family size. Deeper analysis of AtABCE2 and AtABCE1 from 1,135 A. thaliana ecotypes revealed four and 35 non-synonymous SNPs, respectively. The lower natural variation in AtABCE2 compared to AtABCE1 is consistent with its crucial role in plant viability. Overall, while the sequence of the ABCE protein family is highly conserved in the plant kingdom, many plants have evolved to have more than one copy of this essential translational factor.
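The identity figure reported above (at least 78% amino acid identity with AtABCE2) can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length with "-" marking gaps, and it counts identity only over columns where neither sequence has a gap, which is one of several conventions in use. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the study's pipeline.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of a pairwise alignment.

    Assumption: both inputs come from the same alignment, so they have
    equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example (hypothetical sequences): 4 matches over 5 gap-free columns.
print(percent_identity("MKT-AL", "MKSWAL"))
```

Other definitions divide by the alignment length or by the shorter sequence length, so a reported threshold is only comparable when the same denominator is used.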
ABSTRACT
BACKGROUND: Tubulins play crucial roles in numerous fundamental processes of plant development. In flowering plants, tubulins are grouped into α-, β- and γ-subfamilies, and α- and β-tubulins show large isotype diversity and gene number variation among species. This leads to insufficient recognition of orthologous isotypes, significantly complicates extrapolation of experimental results, and makes it difficult to identify the function of a particular tubulin isotype. The aim of this research is to identify and characterize the tubulins of an emerging biofuel crop, Camelina sativa. RESULTS: We report the comprehensive identification and characterization of the tubulin gene family in C. sativa, including analyses of exon-intron organization, comparison of duplicated genes, proper isotype designation, phylogenetic analysis, and expression patterns in different tissues. In total, 17 α-, 34 β- and 6 γ-tubulin genes were identified and each was assigned to a particular isotype. Recognition of orthologous tubulin isotypes was cross-checked using phylogeny, synteny analyses and the allocation of genes on reconstructed genomic blocks of the Ancestral Crucifer Karyotype. An investigation of the expression patterns of tubulin homeologs revealed the predominant role of the N6 (A) and N7 (B) subgenomes in tubulin expression at various developmental stages, contrary to the general dominance of transcripts of the H7 (C) subgenome. CONCLUSIONS: For the first time, a complete set of tubulin gene family members was identified and characterized for the allohexaploid species C. sativa. The study demonstrates a comprehensive approach to the precise inference of gene orthology. The applied technique not only allowed C. sativa tubulin orthologs to be identified in the model species Arabidopsis thaliana and tubulin gene evolution to be tracked, but also revealed that A. thaliana is missing orthologs for several particular isotypes of α- and β-tubulins.
Subjects
Molecular Evolution, Plant Genome, Multigene Family, Phylogeny, Tubulin, Tubulin/genetics, Brassicaceae/genetics, Plant Proteins/genetics, Plant Proteins/metabolism, Synteny, Plant Gene Expression Regulation, Gene Duplication, Introns/genetics, Exons/genetics
ABSTRACT
For protein-coding genes to emerge de novo from non-genic DNA, the DNA sequence must gain an open reading frame (ORF) and the ability to be transcribed. The newborn de novo gene can then evolve further and accumulate changes in its sequence. Consequently, it can also elongate or shrink with time. Existing literature shows that older de novo genes have longer ORFs, but it is not clear whether they elongated with time or have kept the same length since their inception. To address this question we developed a mathematical model of ORF elongation as a Markov-jump process, and we show that ORFs tend to keep their length over short evolutionary timescales. We also show that if a change occurs, it is likely to be a truncation. Our genomic and transcriptomic analyses of seven Drosophila melanogaster populations agree with the model's prediction. We conclude that selection could facilitate ORF length extension, which may explain why longer ORFs were observed in old de novo genes in studies analysing longer evolutionary timescales. Alternatively, shorter ORFs may be purged because they may be less likely to yield functional proteins.
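The idea of treating ORF length as a Markov-jump process can be sketched with a toy continuous-time simulation. This is not the authors' model: the two event types, the rate parameters (`stop_gain_rate`, `stop_loss_rate`), the extension cap `max_extension` and the function name are all hypothetical choices made for illustration. Truncation is assumed to occur at a rate proportional to the current length (any codon may mutate to a stop), and elongation at a constant rate (loss of the existing stop codon).

```python
import random


def simulate_orf_length(start_len, t_max, stop_gain_rate, stop_loss_rate,
                        max_extension=20, rng=None):
    """Gillespie-style simulation of ORF length (in codons) over time.

    Hypothetical toy model, not the published one:
      - truncation rate = stop_gain_rate * current length
      - elongation rate = stop_loss_rate (constant)
    Returns a list of (time, length) states, starting at (0.0, start_len).
    """
    if rng is None:
        rng = random.Random()
    t = 0.0
    length = start_len
    history = [(t, length)]
    while length > 0:  # length 0 means the ORF has been lost
        trunc_rate = stop_gain_rate * length
        elong_rate = stop_loss_rate
        total = trunc_rate + elong_rate
        if total <= 0:
            break  # no event can happen
        t += rng.expovariate(total)  # exponential waiting time to next jump
        if t > t_max:
            break
        if rng.random() < trunc_rate / total:
            # A new stop codon at a uniformly chosen position shortens the ORF.
            length = rng.randrange(length)
        else:
            # Loss of the stop codon extends the ORF by a random amount.
            length += rng.randint(1, max_extension)
        history.append((t, length))
    return history


# Example run with arbitrary illustrative parameters.
trajectory = simulate_orf_length(100, 50.0, 0.001, 0.05, rng=random.Random(0))
print(trajectory[:5])
```

With small rates relative to the time window, most runs record no jump at all, which is the qualitative behaviour the abstract describes for short timescales; the relative frequency of truncation and elongation here depends entirely on the assumed rates.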
Subjects
Drosophila melanogaster, Molecular Evolution, Genetic Models, Open Reading Frames, Animals, Drosophila melanogaster/genetics, Markov Chains
ABSTRACT
Heaps' law (or Herdan-Heaps' law) is a linguistic law stating that vocabulary or dictionary size (types) grows as a power-law function of word count (tokens). Whether it holds in genomes under a given definition of DNA words is unclear, partly because the dictionary size in a genome could be much smaller than that of a human language. We define a DNA word as a coding region in a genome that codes for a protein domain. Using human chromosomes and chromosome arms as individual samples, we establish the existence of Heaps' law in the human genome within a limited range. Our definition of words in a genomic or proteomic context differs from other definitions, such as over-represented k-mers, which are much shorter. Although an approximate power-law distribution of protein domain sizes due to gene duplication, and the related Zipf's law, are well known, their translation to Heaps' law for DNA words is not automatic. Several other animal genomes are shown here to also exhibit range-limited Heaps' law under our definition of DNA words, though with various exponents. When tokens were randomly sampled and sample sizes approached the maximum level, a deviation from Heaps' law was observed, but a quadratic regression in the log-log type-token plot fits the data perfectly. Investigation of the type-token plot and its regression coefficients could provide an alternative narrative, from a linguistic perspective, of the reuse and redundancy of protein domains as well as the creation of new protein domains.
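As a minimal sketch of how such a type-token relationship can be examined, the code below builds a type-token curve from a sequence of tokens and fits Heaps' law in its usual form, V(N) = K * N^beta, by ordinary least squares on the log-log values. The function names and the toy token list are hypothetical; in the setting of the abstract each token would be a protein-domain occurrence along a chromosome, but the sampling scheme and the quadratic log-log fit used in the study are not reproduced here.

```python
import math


def type_token_curve(tokens):
    """Return [(N, V)] where V is the number of distinct types among the first N tokens."""
    seen = set()
    curve = []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        curve.append((n, len(seen)))
    return curve


def fit_heaps(curve):
    """Least-squares fit of log V = log K + beta * log N; returns (K, beta)."""
    if len(curve) < 2:
        raise ValueError("need at least two points to fit a line")
    xs = [math.log(n) for n, _ in curve]
    ys = [math.log(v) for _, v in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


# Toy example with hypothetical domain labels.
tokens = ["kinase", "SH2", "kinase", "zinc_finger", "SH2", "PDZ", "kinase"]
curve = type_token_curve(tokens)
print(curve)
print(fit_heaps(curve))
```

An exponent beta close to 1 means almost every new token is a new type, while smaller values indicate heavier reuse of existing types.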
Subjects
DNA, Human Genome, Humans, DNA/genetics, Animals, Linguistics, Protein Domains
ABSTRACT
The Stat (signal transducer and activator of transcription) gene family plays a vital role in regulating immunity and the processes of cellular proliferation, differentiation, and apoptosis across diverse organisms. Although the functions of Stat genes in immunity have been extensively documented in many mammals, limited data are available for reptiles. We used phylogenetic analysis to identify eight putative members of the Stat family (Stat1-1, Stat1-2, Stat2, Stat3, Stat4, Stat5b, Stat6-1, and Stat6-2) within the genome of M. reevesii, a freshwater turtle found in East Asia. Sequence analysis showed that the Stat genes contain four conserved structural domains: a protein interaction domain, a coiled-coil domain, a DNA-binding domain, and a Src homology 2 domain. In addition, Stat1, Stat2, and Stat6 contain TAZ2bind, Apolipo_F, and TALPID3 structural domains. The mRNA levels of Stat genes were upregulated in spleen tissue at 4, 8, 12, and 16 h after administration of lipopolysaccharide (LPS), a potent activator of the immune system. Stat5b expression at 12 h after LPS injection showed the most substantial difference from the control. The expression of Stat5b in spleen tissue cells was verified by immunofluorescence. These results suggest that Stat5b plays a role in the immune response of M. reevesii and may prove to be a positive marker of an immune response in future studies.
ABSTRACT
Studying gene family evolution strongly benefits from insightful visualizations. However, the ever-growing number of sequenced genomes is leading to increasingly larger gene families, which challenges existing gene tree visualizations. Indeed, most of them present users with a dilemma: display complete but intractable gene trees, or collapse subtrees, thereby hiding their children's information. Here, we introduce Matreex, a new dynamic tool to scale up the visualization of gene families. Matreex's key idea is to use "phylogenetic" profiles, which are dense representations of gene repertoires, to minimize the information loss when collapsing subtrees. We illustrate Matreex's usefulness with three biological applications. First, we demonstrate on the MutS family the power of combining gene trees and phylogenetic profiles to delve into precise evolutionary analyses of large multicopy gene families. Second, by displaying 22 intraflagellar transport gene families across 622 species cumulating 5,500 representatives, we show how Matreex can be used to automate large-scale analyses of gene presence-absence. Notably, we report for the first time the complete loss of intraflagellar transport in the myxozoan Thelohanellus kitauei. Finally, using the textbook example of visual opsins, we show Matreex's potential to create easily interpretable figures for teaching and outreach. Matreex is available from the Python Package Index (pip install Matreex) with the source code and documentation available at https://github.com/DessimozLab/matreex.
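The notion of a phylogenetic profile as a dense summary of a collapsed subtree can be sketched as follows. This does not use or reproduce the Matreex API; the function name `phylogenetic_profile`, the input layout (a list of gene and species pairs for the leaves under a collapsed node) and the example identifiers are all hypothetical. The sketch simply counts gene copies per species in a fixed species order, so that presence, absence and copy number remain visible after the subtree is hidden.

```python
from collections import Counter


def phylogenetic_profile(leaves, species_order):
    """Copy-number profile of a collapsed subtree.

    leaves: iterable of (gene_id, species) pairs under the collapsed node.
    species_order: species names in the order used for every profile,
                   so that profiles of different subtrees are comparable.
    Returns a list of gene counts aligned with species_order (0 = absent).
    """
    counts = Counter(species for _gene_id, species in leaves)
    return [counts[species] for species in species_order]


# Toy example with hypothetical gene identifiers.
leaves = [("geneA1", "human"), ("geneA2", "human"), ("geneA3", "mouse")]
species_order = ["human", "mouse", "fly"]
print(phylogenetic_profile(leaves, species_order))
```

Stacking such vectors for several subtrees or gene families gives a presence-absence matrix of the kind used for large-scale loss analyses.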
Subjects
Multigene Family, Phylogeny, Software, Molecular Evolution, Animals
ABSTRACT
Lyme borreliosis (LB), the most common vector-borne disease in the Northern Hemisphere, is caused by spirochetes belonging to the Borrelia burgdorferi sensu lato (Bbsl) complex. Borrelia spirochetes circulate in obligatory transmission cycles between tick vectors and different vertebrate hosts. To successfully complete this complex transmission cycle, Bbsl encodes an arsenal of proteins, including the PFam54 protein family, with known or proposed influences on reservoir host and/or vector adaptation. Even so, only fragmentary information is available on the naturally occurring level of variation in the PFam54 gene array, especially in relation to Eurasian-distributed species. Using whole genome data from isolates (n = 141) originating from three major LB-causing Borrelia species across Eurasia (B. afzelii, B. bavariensis, and B. garinii), we aimed to characterize the diversity of the PFam54 gene array in these isolates in order to understand the evolution of PFam54 paralogs at the intra- and interspecies level. We found an extraordinarily high level of variation in the PFam54 gene array, with 39 PFam54 paralogs belonging to 23 orthologous groups, including five novel paralogs. Even so, the gene array appears to have remained fairly stable over the evolutionary history of the studied Borrelia species. Interestingly, genes outside Clade IV, which contains genes encoding proteins associated with Borrelia pathogenesis, more frequently displayed signatures of diversifying selection between clades that differ in hypothesized vector or host species. This could suggest that non-Clade IV paralogs play a more important role in host and/or vector adaptation than previously expected, which would require future lab-based studies to validate.
ABSTRACT
In the human genome, two short open reading frames (ORFs) separated by a transcriptional silencer and a small intervening sequence stem from the gene SMIM45. The two ORFs show different translational characteristics, and they also show divergent patterns of evolutionary development. The studies presented here describe the evolution of the components of SMIM45. One ORF consists of an ultra-conserved 68 amino acid (aa) sequence, whose origins can be traced beyond the evolutionary age of divergence of the elephant shark, ~462 MYA. The silencer also has ancient origins, but it has a complex and divergent pattern of evolutionary formation, as it overlaps both at the 68 aa ORF and the intervening sequence. The other ORF consists of 107 aa. It develops during primate evolution but is found to originate de novo from an ancestral non-coding genomic region with root origins within the Afrothere clade of placental mammals, whose evolutionary age of divergence is ~99 MYA. The formation of the complete 107 aa ORF during primate evolution is outlined, whereby sequence development is found to occur through biased mutations, with disruptive random mutations that also occur but lead to a dead-end. The 107 aa ORF is of particular significance, as there is evidence to suggest it is a protein that may function in human brain development. Its evolutionary formation presents a view of a human-specific ORF and its linked silencer that were predetermined in non-primate ancestral species. The genomic position of the silencer offers interesting possibilities for the regulation of transcription of the 107 aa ORF. A hypothesis is presented with respect to possible spatiotemporal expression of the 107 aa ORF in embryonic tissues.
Subjects
Human Genome, Placenta, Female, Pregnancy, Animals, Humans, Open Reading Frames/genetics, Amino Acid Sequence, Primates, Mammals
ABSTRACT
Genome organization is intricately tied to regulating genes and associated cell fate decisions. In this study, we examine the positioning and functional significance of human genes, grouped by their evolutionary age, within the 3D organization of the genome. We reveal that genes of different evolutionary origin have distinct positioning relationships with both domains and loop anchors, and remarkably consistent relationships with boundaries across cell types. While the functional associations of each group of genes are primarily cell type-specific, such associations of conserved genes maintain greater stability across 3D genomic features and disease than recently evolved genes. Furthermore, the expression of these genes across various tissues follows an evolutionary progression, such that RNA levels increase from young genes to ancient genes. Thus, the distinct relationships of gene evolutionary age, function, and positioning within 3D genomic features contribute to tissue-specific gene regulation in development and disease.
ABSTRACT
Cellular and organism survival depends upon the regulation of pH, which is regulated by highly specialized cell membrane transporters, the solute carriers (SLC) (For a comprehensive list of the solute carrier family members, see: https://www.bioparadigms.org/slc/ ). The SLC4 family of bicarbonate (HCO3-) transporters consists of ten members, sorted by their coupling to either sodium (NBCe1, NBCe2, NBCn1, NBCn2, NDCBE), chloride (AE1, AE2, AE3), or borate (BTR1). The ionic coupling of SLC4A9 (AE4) remains controversial. These SLC4 bicarbonate transporters may be controlled by cellular ionic gradients, cellular membrane voltage, and signaling molecules to maintain critical cellular and systemic pH (acid-base) balance. There are profound consequences when blood pH deviates even a small amount outside the normal range (7.35-7.45). Chiefly, Na+-coupled bicarbonate transporters (NCBT) control intracellular pH in nearly every living cell, maintaining the biological pH required for life. Additionally, NCBTs have important roles to regulate cell volume and maintain salt balance as well as absorption and secretion of acid-base equivalents. Due to their varied tissue expression, NCBTs have roles in pathophysiology, which become apparent in physiologic responses when their expression is reduced or genetically deleted. Variations in physiological pH are seen in a wide variety of conditions, from canonically acid-base related conditions to pathologies not necessarily associated with acid-base dysfunction such as cancer, glaucoma, or various neurological diseases. The membranous location of the SLC4 transporters as well as recent advances in discovering their structural biology makes them accessible and attractive as a druggable target in a disease context. 
The role of sodium-coupled bicarbonate transporters in such a large array of conditions illustrates the potential of treating a wide range of disease states by modifying function of these transporters, whether that be through inhibition or enhancement.
Subjects
Bicarbonates, Sodium-Bicarbonate Symporters, Sodium-Bicarbonate Symporters/genetics, Sodium-Bicarbonate Symporters/metabolism, Bicarbonates/metabolism, Sodium Bicarbonate, Sodium/metabolism, Membrane Transport Proteins, Hydrogen-Ion Concentration
ABSTRACT
BACKGROUND: Caffeic acid O-methyltransferase (COMT) is a key enzyme that regulates melatonin synthesis and is involved in regulating growth, development, and the response to abiotic stress in plants. Tea, made from the tea plant, is a popular beverage consumed worldwide and has been used for centuries for its medicinal properties, including its ability to reduce inflammation, improve digestion, and boost immune function. Analyzing genetic variation within the COMT family may help tea plants resist adversity and may also give a deeper understanding of how different tea varieties produce and metabolize catechins, which can then be used to develop new tea cultivars with desired flavor profiles and health benefits. RESULTS: In this study, a total of 25 CsCOMT genes were identified based on the high-quality tea plant (Camellia sinensis) genome database. Phylogenetic tree analysis of CsCOMTs with COMTs from other species showed that COMTs divide into four subfamilies (Class I, II, III, IV), and that CsCOMTs are distributed in Class I, Class II, and Class III. CsCOMT genes have not only undergone large-scale pairwise gene recombination within the tea plant genome, but also share 2 and 7 collinear genes with Arabidopsis thaliana and poplar (Populus trichocarpa), respectively. The promoter regions of CsCOMTs were found to be rich in cis-acting elements associated with plant growth and stress response. Analysis of previously published transcriptome data showed that some members of the CsCOMT family exhibit significant tissue-specific expression and differential expression under different stress treatments. Subsequently, we selected six CsCOMTs and further validated their expression levels in different tissues and organs using qRT-PCR. In addition, we silenced CsCOMT19 using virus-induced gene silencing (VIGS) and found that CsCOMT19 positively regulates the synthesis of melatonin in tea plant. CONCLUSION: These results will contribute to understanding the functions of the CsCOMT gene family and provide valuable information for further research on the role of CsCOMT genes in regulating tea plant growth, development, and response to abiotic stress.
Subjects
Camellia sinensis, Melatonin, Methyltransferases, Camellia sinensis/physiology, Melatonin/genetics, Phylogeny, Tea, Plant Proteins/genetics, Plant Proteins/metabolism, Plant Gene Expression Regulation
ABSTRACT
Pungent capsaicinoid is synthesized only in chili pepper (Capsicum spp.). The production of vanillylamine from vanillin is a unique reaction in the capsaicinoid biosynthesis pathway. Although putative aminotransferase (pAMT) has been isolated as the vanillylamine synthase gene, it is unclear how Capsicum acquired pAMT. Here, we present a phylogenetic overview of pAMT and its homologs. The Capsicum genome contained 5 homologs, including pAMT, CaGABA-T1, CaGABA-T3, and two pseudogenes. Phylogenetic analysis indicated that pAMT is a member of the Solanaceae cytoplasmic GABA-Ts. Comparative genome analysis found that multiple copies of GABA-T exist in a specific Solanaceae genomic region, and the cytoplasmic GABA-Ts other than pAMT are located in the region. The cytoplasmic GABA-T was phylogenetically close to pseudo-GABA-T harboring a plastid transit peptide (pseudo-GABA-T3). This suggested that Solanaceae cytoplasmic GABA-Ts occurred via duplication of a chloroplastic GABA-T ancestor and subsequent loss of the plastid transit signal. The cytoplasmic GABA-T may have been translocated from the specific Solanaceae genomic region during Capsicum divergence, resulting in the current pAMT locus. A recombinant protein assay demonstrated that pAMT had higher vanillylamine synthase activity than those of other plant GABA-Ts. pAMT was expressed exclusively in the placental septum of mature green fruit, whereas tomato orthologs SlGABA-T2/4 exhibit a ubiquitous expression pattern in plants. These findings suggested that both the increased catalytic efficiency and transcriptional changes in pAMT may have contributed to establish vanillylamine synthesis in the capsaicinoid biosynthesis pathway. This study provides insights into the establishment of pungency in the evolution of chili peppers.
Subjects
Benzylamines, Capsicum, Solanaceae, Pregnancy, Female, Humans, Capsicum/metabolism, Capsaicin/metabolism, Transaminases/metabolism, Phylogeny, Placenta/metabolism, Solanaceae/genetics, Solanaceae/metabolism, Nitric Oxide Synthase/genetics, gamma-Aminobutyric Acid/metabolism, Fruit/genetics, Fruit/metabolism
ABSTRACT
Duplication of genes at different time periods, through recurrent and frequent polyploidization events, has played a major role in plant evolution, adaptation and diversification. Interestingly, some of the ancestral duplicated genes (referred to as paleologs) have been maintained for millions of years, and the reasons for their retention are still poorly understood, especially when the phenotypic effect of individual copies is tested using functional genetic approaches. To fill this gap, we performed functional genetic (CRISPR-Cas9), physiological, transcriptomic and evolutionary studies to investigate this open question in detail, taking as an example the petC gene (involved in cytochrome b6/f and thus affecting photosynthesis), which is present in four paleologous copies in the oilseed crop Brassica napus. RNA-Seq and selective pressure analyses suggested that all paleologous copies have conserved the same function and that all are highly transcribed. The knockout (KO) of one, several or all petC copies then showed that all paleologous copies have to be knocked out to suppress the gene function. In addition, we determined that phenotypic effects in single and double mutants could only be detected under high light conditions. Interestingly, we did not detect any significant differences between single KO mutants for either the A03 or the A09 copy (despite their being differentially transcribed), or even between mutants for one or two petC copies. Altogether, this work revealed that petC paleologs have retained their ancestral function and that the retention of these copies is explained by their compensatory role, especially in optimal environmental conditions.
Subjects
Brassica napus, Brassica napus/genetics, Plant Genome/genetics, Plant Genes/genetics, Duplicate Genes/genetics, Polyploidy
ABSTRACT
The Orchidaceae is a mega-diverse plant family of ca. 29,000 species with a large variety of life forms that can colonize transitory habitats. Despite this diversity, little is known about their flowering integrators in response to specific environmental factors. During the reproductive transition in flowering plants, a vegetative shoot apical meristem (SAM) transforms into an inflorescence meristem (IM) that forms bracts and flowers. In model grasses such as rice, a flowering genetic regulatory network (FGRN) controlling reproductive transitions has been identified, but little is known about it in the Orchidaceae. To analyze the players of the FGRN in orchids, we performed comprehensive phylogenetic analyses of the CONSTANS-like/CONSTANS-like 4 (COL/COL4), FLOWERING LOCUS D (FD), FLOWERING LOCUS C/FRUITFULL (FLC/FUL) and SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) gene lineages. In addition to the previously analyzed PEBP and AGL24/SVP genes, here we identify an increase in orchid homologs belonging to the COL4 and FUL gene lineages in comparison with other monocots, including grasses, due to orchid-specific gene lineage duplications. By contrast, local duplications in Orchidaceae are less frequent in the COL, FD and SOC1 gene lineages, which points to the retention of key functions under strong purifying selection in essential signaling factors. We also identified changes in the protein sequences after such duplications, variation in the evolutionary rates of the resulting paralogous clades, and targeted expression of isolated homologs in different orchids. Interestingly, vernalization-response genes like VERNALIZATION1 (VRN1) and FLOWERING LOCUS C (FLC) are completely lacking in orchids, or are reduced in number, as is the case for VERNALIZATION2/GHD7 (VRN2). Our findings point to non-canonical factors sensing temperature changes in orchids during the reproductive transition. Expression data for key factors gathered from Elleanthus aurantiacus, a terrestrial orchid of the high Andean mountains, allow us to characterize which copies are actually active during flowering. Altogether, our data lay down a comprehensive framework for assessing the gene function of the restricted number of homologs identified as most likely to play key roles during the flowering transition, and the changes in the FGRN in neotropical orchids in comparison with temperate grasses.
ABSTRACT
BACKGROUND: Venoms, which have evolved numerous times in animals, are ideal models of convergent trait evolution. However, detailed genomic studies of toxin-encoding genes exist for only a few animal groups. The hyper-diverse hymenopteran insects are the most speciose venomous clade, but investigation of the origin of their venom genes has been largely neglected. RESULTS: Utilizing a combination of genomic and proteo-transcriptomic data, we investigated the origin of 11 toxin genes in 29 published and 3 new hymenopteran genomes and compiled an up-to-date list of prevalent bee venom proteins. Observed patterns indicate that bee venom genes predominantly originate through single-gene co-option, with gene duplication contributing to subsequent diversification. CONCLUSIONS: Most Hymenoptera venom genes are shared by all members of the clade, and only melittin and the new venom protein family anthophilin1 appear unique to the bee lineage. Most venom proteins thus predate the mega-radiation of hymenopterans and the evolution of the aculeate stinger.
Subjects
Bee Venoms, Bees/genetics, Animals, Gene Expression Profiling, Transcriptome, Genomics, Gene Duplication
ABSTRACT
Family I84 serine protease inhibitors are believed to be mollusk-specific proteins involved in host defense. The molecular evolution of the family, however, remains to be understood. In this study, the genes of Family I84 protease inhibitors in 3 bivalves, Crassostrea gigas, Crassostrea virginica and Tegillarca granosa, were analyzed at the genomic level. A total of 66 Family I84 genes (22 in C. gigas, 28 in C. virginica and 16 in T. granosa) were identified from the 3 species. They were distributed unevenly in the genomes, on 4 chromosomes in C. gigas and 5 chromosomes each in C. virginica and T. granosa, and some genes were tandemly duplicated. Most genes had 3 exons, while 12 genes had 4 exons and 1 gene had 2 exons. All genes but 1 from C. gigas and 1 from T. granosa encoded peptides with a signal sequence at the N-terminus, and the properties of the predicted mature molecules were similar. Four conserved motifs were identified in the 66 amino acid sequences. Collinearity analysis revealed higher collinearity between the 2 oyster species, both in genes overall and in Family I84 genes. Phylogenetic analysis of the 66 genes together with those previously reported from 3 other bivalves and 1 gastropod showed that Family I84 protease inhibitor genes from the same species tended to group together in terminal branches of the maximum-likelihood tree, but most internal nodes were poorly supported by bootstrap values. In addition, differences in expression patterns between genes of the same species were observed across the developmental stages and tissues of C. gigas and T. granosa. Moreover, co-expression among Family I84 genes and between Family I84 and non-Family I84 genes was also detected in C. gigas and T. granosa. These results suggest that Family I84 protease inhibitor genes evolved by active duplication and by structural and functional diversification after the speciation of the related mollusks, and that the diversified protease inhibitor family is likely multifunctional.
Subjects
Bivalvia, Crassostrea, Animals, Protease Inhibitors, Phylogeny, Genome, Amino Acid Sequence, Antiviral Agents, Bivalvia/genetics, Crassostrea/genetics
ABSTRACT
Maleae is one of the most widespread tribes of Rosaceae and includes several important fruit crops and ornamental plants. We used nuclear genes from 62 transcriptomes/genomes, including 26 newly generated transcriptomes, to reconstruct a well-supported phylogeny and study the evolution of fruit and leaf morphology and the possible effect of whole genome duplication (WGD). Our phylogeny recovered 11 well-supported clades and supported the monophyly of most genera with at least two sampled species (except Malus, Sorbus, and Pourthiaea). A WGD was placed at the most recent common ancestor (MRCA) of Maleae and dated to c. 54 million years ago (Ma), near the Early Eocene Climatic Optimum, supporting Gillenieae (x = 9) being a parental lineage of Maleae (x = 17) and including duplicate regulatory genes related to the origin of the fleshy pome fruit. WGD-derived paralogs that are retained in specific lineages but lost in others are predicted to function in development, metabolism, and other processes. An upshift in diversification and innovations in fruit and leaf morphologies occurred at the MRCA of the Malinae subtribe, coinciding with the Eocene-Oligocene transition (c. 34 Ma), following a lag from the time of the WGD event. Our results provide new insights into the Maleae phylogeny, its rapid diversification, and morphological and molecular evolution.