ABSTRACT
Long noncoding RNAs (lncRNAs) are essential regulatory elements of sex chromosomes that act to equalize gene expression levels between males and females. XIST, RSX, and roX2 regulate X chromosomes in placental mammals, marsupials, and Drosophila, respectively. Because the green anole (Anolis carolinensis) shows complete dosage compensation of its X chromosome, we tested whether a lncRNA was involved. We found an ancient lncRNA, MAYEX, that gained male-specific expression more than 89 million years ago. MAYEX evolved a notable association with the acetylated histone 4 lysine 16 (H4K16ac) epigenetic mark and the ability to loop its locus to the totality of the X chromosome to increase expression levels. MAYEX is the first lncRNA in reptiles linked to a dosage compensation mechanism that balances the expression of sex chromosomes.
Subjects
Genetic Dosage Compensation , Lizards , Long Noncoding RNA , X Chromosome , Animals , Female , Male , Acetylation , Genetic Epigenesis , Molecular Evolution , Histones/metabolism , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , X Chromosome/genetics , Lizards/genetics
ABSTRACT
BACKGROUND: Plant long non-coding RNAs (lncRNAs) have important regulatory roles in responses to various biotic and abiotic stresses, including light quality. However, no lncRNAs have been specifically linked to the Shade Avoidance Syndrome (SAS). RESULTS: To better understand the involvement of lncRNAs in shade avoidance, we examined RNA-seq libraries for lncRNAs with the potential to function in the neighbor proximity phenomenon in Arabidopsis thaliana (A. thaliana). Using transcriptomes generated from seedlings exposed to high and low red/far-red (R/FR) light conditions, we identified 13 lncRNA genes differentially expressed in cotyledons and 138 in hypocotyls. To infer possible functions for these lncRNAs, we used a 'guilt-by-association' approach to identify genes co-expressed with lncRNAs in a weighted gene co-expression network. Of 34 co-expression modules, 10 showed biological functions related to differential growth. We identified three potential lncRNAs co-regulated with genes related to SAS. T-DNA insertions in two of these lncRNAs were correlated with morphological differences in seedling responses to increased FR light, supporting our strategy for computational identification of lncRNAs involved in SAS. CONCLUSIONS: Using a computational approach, we identified multiple lncRNAs in Arabidopsis involved in SAS. T-DNA insertions caused altered phenotypes under low R/FR light, suggesting functional roles in shade avoidance. Further experiments are needed to determine the specific mechanisms of these lncRNAs in SAS.
Subjects
Arabidopsis , Plant Gene Expression Regulation , Light , Long Noncoding RNA , Arabidopsis/genetics , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , Gene Regulatory Networks , Gene Expression Profiling , Hypocotyl/genetics , Hypocotyl/growth & development , Seedlings/genetics , Seedlings/growth & development , Seedlings/radiation effects , Transcriptome , Cotyledon/genetics
ABSTRACT
Latin America continues to be severely underrepresented in genomics research, and fine-scale genetic histories and complex trait architectures remain hidden owing to insufficient data1. To fill this gap, the Mexican Biobank project genotyped 6,057 individuals from 898 rural and urban localities across all 32 states in Mexico at a resolution of 1.8 million genome-wide markers with linked complex trait and disease information creating a valuable nationwide genotype-phenotype database. Here, using ancestry deconvolution and inference of identity-by-descent segments, we inferred ancestral population sizes across Mesoamerican regions over time, unravelling Indigenous, colonial and postcolonial demographic dynamics2-6. We observed variation in runs of homozygosity among genomic regions with different ancestries reflecting distinct demographic histories and, in turn, different distributions of rare deleterious variants. We conducted genome-wide association studies (GWAS) for 22 complex traits and found that several traits are better predicted using the Mexican Biobank GWAS compared to the UK Biobank GWAS7,8. We identified genetic and environmental factors associating with trait variation, such as the length of the genome in runs of homozygosity as a predictor for body mass index, triglycerides, glucose and height. This study provides insights into the genetic histories of individuals in Mexico and dissects their complex trait architectures, both crucial for making precision and preventive medicine initiatives accessible worldwide.
Subjects
Biological Specimen Banks , Medical Genetics , Human Genome , Genomics , Hispanic or Latino , Humans , Blood Glucose/genetics , Blood Glucose/metabolism , Body Height/genetics , Body Mass Index , Gene-Environment Interaction , Genetic Markers/genetics , Genome-Wide Association Study , Hispanic or Latino/classification , Hispanic or Latino/genetics , Homozygote , Mexico , Phenotype , Triglycerides/blood , Triglycerides/genetics , United Kingdom , Human Genome/genetics
ABSTRACT
BACKGROUND: Long non-coding RNAs (lncRNAs) are defined as transcribed molecules longer than 200 nucleotides with little to no protein-coding potential. LncRNAs can regulate gene expression of nearby genes (cis-acting) or genes located on other chromosomes (trans-acting). Several methodologies have been developed to capture lncRNAs associated with chromatin at a genome-wide level. Analysis of RNA-DNA contacts can be combined with epigenetic and RNA-seq data to define potential lncRNAs involved in the regulation of gene expression. RESULTS: We performed Chromatin Associated RNA sequencing (ChAR-seq) in Anolis carolinensis to obtain the genome-wide map of the associations that RNA molecules have with chromatin. We analyzed the frequency of DNA contacts for different classes of RNAs and were able to define cis- and trans-acting lncRNAs. We integrated the ChAR-seq map of RNA-DNA contacts with epigenetic data for the acetylation of lysine 16 on histone H4 (H4K16ac), a mark connected to actively transcribed chromatin in lizards. We successfully identified three trans-acting lncRNAs significantly associated with the H4K16ac signal, which are likely involved in the regulation of gene expression in A. carolinensis. CONCLUSIONS: We show that the ChAR-seq method is a powerful tool to explore the RNA-DNA map of interactions. Moreover, in combination with epigenetic data, ChAR-seq can be applied in non-model species to establish potential roles for predicted lncRNAs that lack functional annotations.
Subjects
Lizards , Long Noncoding RNA , Animals , Chromatin/genetics , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , Lizards/genetics , Lizards/metabolism , DNA/genetics , Genome
ABSTRACT
Long non-coding RNAs (lncRNAs) are a prominent class of eukaryotic regulatory genes. Despite the numerous available transcriptomic datasets, the annotation of plant lncRNAs remains based on dated annotations that have been historically carried over. We present a substantially improved annotation of Arabidopsis thaliana lncRNAs, generated by integrating 224 transcriptomes in multiple tissues, conditions, and developmental stages. We annotate 6764 lncRNA genes, including 3772 that are novel. We characterize their tissue expression patterns and find 1425 lncRNAs are co-expressed with coding genes, with enriched functional categories such as chloroplast organization, photosynthesis, RNA regulation, transcription, and root development. This improved transcription-guided annotation constitutes a valuable resource for studying lncRNAs and the biological processes they may regulate.
Subjects
Arabidopsis , Long Noncoding RNA , Arabidopsis/metabolism , Molecular Sequence Annotation , Long Noncoding RNA/metabolism , Transcriptome/genetics
ABSTRACT
Hi-C enables the characterization of the conformation of the genome in the three-dimensional nuclear space. This technique has revolutionized our ability to detect interactions between linearly distant genomic sites on a genome-wide scale. Here, we detail a protocol to carry out in situ Hi-C in plants and describe a straightforward bioinformatics pipeline for the analysis of such data, in particular for comparing samples from different organs or conditions.
Subjects
Chromatin , Computational Biology , Cell Nucleus/genetics , Computational Biology/methods , Genome , Genomics/methods , Plants/genetics
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pcbi.1009218.].
ABSTRACT
Although specialized mechanosensory cells are found across animal phylogeny, early evolutionary histories of mechanoreceptor development remain enigmatic. Cnidaria (e.g. sea anemones and jellyfishes) is the sister group to well-studied Bilateria (e.g. flies and vertebrates), and has two mechanosensory cell types - a lineage-specific sensory effector known as the cnidocyte, and a classical mechanosensory neuron referred to as the hair cell. While developmental genetics of cnidocytes is increasingly understood, genes essential for cnidarian hair cell development are unknown. Here, we show that the class IV POU homeodomain transcription factor (POU-IV) - an indispensable regulator of mechanosensory cell differentiation in Bilateria and cnidocyte differentiation in Cnidaria - controls hair cell development in the sea anemone cnidarian Nematostella vectensis. N. vectensis POU-IV is postmitotically expressed in tentacular hair cells, and is necessary for development of the apical mechanosensory apparatus, but not of neurites, in hair cells. Moreover, it binds to deeply conserved DNA recognition elements, and turns on a unique set of effector genes - including the transmembrane receptor-encoding gene polycystin 1 - specifically in hair cells. Our results suggest that POU-IV directs differentiation of cnidarian hair cells and cnidocytes via distinct gene regulatory mechanisms, and support an evolutionarily ancient role for POU-IV in defining the mature state of mechanosensory neurons.
Subjects
Cell Differentiation/genetics , Mechanoreceptors/metabolism , POU Domain Factors/genetics , Sea Anemones/growth & development , Animals , Biological Evolution , POU Domain Factors/metabolism , Sea Anemones/genetics
ABSTRACT
Current Genome-Wide Association Studies (GWAS) rely on genotype imputation to increase statistical power, improve fine-mapping of association signals, and facilitate meta-analyses. Due to the complex demographic history of Latin America and the lack of balanced representation of Native American genomes in current imputation panels, the discovery of locally relevant disease variants is likely to be missed, limiting the scope and impact of biomedical research in these populations. Therefore, the necessity of better diversity representation in genomic databases is a scientific imperative. Here, we expand the 1,000 Genomes reference panel (1KGP) with 134 Native American genomes (1KGP + NAT) to assess imputation performance in Latin American individuals of mixed ancestry. Our panel increased the number of SNPs above the GWAS quality threshold, thus improving statistical power for association studies in the region. It also increased imputation accuracy, particularly in low-frequency variants segregating in Native American ancestry tracts. The improvement is subtle but consistent across countries and proportional to the number of genomes added from local source populations. To project the potential improvement with a higher number of reference genomes, we performed simulations and found that at least 3,000 Native American genomes are needed to equal the imputation performance of variants in European ancestry tracts. This reflects the concerning imbalance of diversity in current references and highlights the contribution of our work to reducing it while complementing efforts to improve global equity in genomic research.
ABSTRACT
Long non-coding RNAs (lncRNAs) have important regulatory functions across eukarya. It is now clear that many of these functions are related to gene expression regulation through their capacity to recruit epigenetic modifiers and establish chromatin interactions. Several lncRNAs have been recently shown to participate in modulating chromatin within the spatial organization of the genome in the three-dimensional space of the nucleus. The identification of lncRNA candidates is challenging, as is their functional characterization. Conservation signatures of lncRNAs are different from those of protein-coding genes, making identifying lncRNAs under selection a difficult task, and the homology between lncRNAs may not be readily apparent. Here, we review the evidence for these higher-order genome organization functions of lncRNAs in animals and the evolutionary signatures they display.
ABSTRACT
While colocalization within a bacterial operon enables coexpression of the constituent genes, the mechanistic logic of clustering of nonhomologous monocistronic genes in eukaryotes is not immediately obvious. Biosynthetic gene clusters that encode pathways for specialized metabolites are an exception to the classical eukaryote rule of random gene location and provide paradigmatic exemplars with which to understand eukaryotic cluster dynamics and regulation. Here, using 3C, Hi-C, and Capture Hi-C (CHi-C) organ-specific chromosome conformation capture techniques along with high-resolution microscopy, we investigate how chromosome topology relates to transcriptional activity of clustered biosynthetic pathway genes in Arabidopsis thaliana. Our analyses reveal that biosynthetic gene clusters are embedded in local hot spots of 3D contacts that segregate cluster regions from the surrounding chromosome environment. The spatial conformation of these cluster-associated domains differs between transcriptionally active and silenced clusters. We further show that silenced clusters associate with heterochromatic chromosomal domains toward the periphery of the nucleus, while transcriptionally active clusters relocate away from the nuclear periphery. Examination of chromosome structure at unrelated clusters in maize, rice, and tomato indicates that integration of clustered pathway genes into distinct topological domains is a common feature in plant genomes. Our results shed light on the potential mechanisms that constrain coexpression within clusters of nonhomologous eukaryotic genes and suggest that gene clustering in the one-dimensional chromosome is accompanied by compartmentalization of the 3D chromosome.
Subjects
Arabidopsis/genetics , Plant Chromosomes/genetics , Multigene Family , Plant Proteins/genetics , Solanum lycopersicum/genetics , Zea mays/genetics , Arabidopsis/metabolism , Plant Chromosomes/metabolism , Plant Genome , Solanum lycopersicum/metabolism , Oryza/genetics , Oryza/metabolism , Plant Proteins/metabolism , Zea mays/metabolism
ABSTRACT
Long noncoding RNAs (lncRNAs) have recently emerged as prominent regulators of gene expression in eukaryotes. LncRNAs often drive the modification and maintenance of gene activation or gene silencing states via chromatin conformation rearrangements. In plants, lncRNAs have been shown to participate in gene regulation, and are essential to processes such as vernalization and photomorphogenesis. Despite their prominent functions, just over a dozen lncRNAs have been experimentally and functionally characterized. As in their animal counterparts, the rates of sequence divergence are much higher in plant lncRNAs than in protein-coding mRNAs, making it difficult to identify lncRNA conservation using traditional sequence comparison methods. Beyond this, little is known about the evolutionary patterns of lncRNAs in plants. Here, we characterized the splicing conservation of lncRNAs in Brassicaceae. We generated a whole-genome alignment of 16 Brassica species and used it to identify syntenic lncRNA orthologs. Using a scoring system trained on transcriptomes from A. thaliana and B. oleracea, we identified splice sites across the whole alignment and measured their conservation. Our analysis revealed that 17.9% (112/627) of all intergenic lncRNAs display splicing conservation in at least one exon, an estimate that is substantially higher than previous estimates of lncRNA conservation in this group. Our findings agree with similar studies in vertebrates, demonstrating that splicing conservation can be evidence of stabilizing selection. We provide conclusive evidence for the existence of evolutionarily deeply conserved lncRNAs in plants and describe a generally applicable computational workflow to identify functional lncRNAs in plants.
Subjects
Conserved Sequence/genetics , RNA Splicing/genetics , Long Noncoding RNA/genetics , Plant RNA/genetics , Arabidopsis/genetics , Brassica/genetics , Molecular Evolution , Plant Genome/genetics , Messenger RNA/genetics
ABSTRACT
An Amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
A widely held, but rarely tested, hypothesis for the origin of animals is that they evolved from a unicellular ancestor, with an apical cilium surrounded by a microvillar collar, that structurally resembled modern sponge choanocytes and choanoflagellates1-4. Here we test this view of animal origins by comparing the transcriptomes, fates and behaviours of the three primary sponge cell types (choanocytes, pluripotent mesenchymal archaeocytes and epithelial pinacocytes) with choanoflagellates and other unicellular holozoans. Unexpectedly, we find that the transcriptome of sponge choanocytes is the least similar to the transcriptomes of choanoflagellates and is significantly enriched in genes unique to either animals or sponges alone. By contrast, pluripotent archaeocytes upregulate genes that control cell proliferation and gene expression, as in other metazoan stem cells and in the proliferating stages of two unicellular holozoans, including a colonial choanoflagellate. Choanocytes in the sponge Amphimedon queenslandica exist in a transient metastable state and readily transdifferentiate into archaeocytes, which can differentiate into a range of other cell types. These sponge cell-type conversions are similar to the temporal cell-state changes that occur in unicellular holozoans5. Together, these analyses argue against homology of sponge choanocytes and choanoflagellates, and the view that the first multicellular animals were simple balls of cells with limited capacity to differentiate. Instead, our results are consistent with the first animal cell being able to transition between multiple states in a manner similar to modern transdifferentiating and stem cells.
Subjects
Cell Transdifferentiation , Biological Models , Phylogeny , Pluripotent Stem Cells/cytology , Porifera/cytology , Animals , Cell Proliferation , Epithelial Cells/cytology , Epithelial Cells/metabolism , Molecular Evolution , Pluripotent Stem Cells/metabolism , Porifera/metabolism , Reproducibility of Results , Transcriptome
ABSTRACT
BACKGROUND: Micro RNAs (miRNAs) and piwi interacting RNAs (piRNAs), along with the more ancient eukaryotic endogenous small interfering RNAs (endo-siRNAs) constitute the principal components of the RNA interference (RNAi) repertoire of most animals. RNAi in non-bilaterians - sponges, ctenophores, placozoans and cnidarians - appears to be more diverse than that of bilaterians, and includes structurally variable miRNAs in sponges, an enormous number of piRNAs in cnidarians and the absence of miRNAs in ctenophores and placozoans. RESULTS: Here we identify thousands of endo-siRNAs and piRNAs from the sponge Amphimedon queenslandica, the ctenophore Mnemiopsis leidyi and the cnidarian Nematostella vectensis using a computational approach that clusters mapped small RNA sequences and annotates each cluster based on the read length and relative abundance of the constituent reads. This approach was validated on 11 small RNA libraries in Drosophila melanogaster, demonstrating the successful annotation of RNAi-associated loci with properties consistent with previous reports. In the non-bilaterians we uncover seven new miRNAs from Amphimedon and four from Nematostella as well as sub-populations of candidate cis-natural antisense transcript (cis-NAT) endo-siRNAs. We confirmed the absence of miRNAs in Mnemiopsis but detected an abundance of endo-siRNAs in this ctenophore. Analysis of putative piRNA structure suggests that conserved localised secondary structures in primary transcripts may be important for the production of mature piRNAs in Amphimedon and Nematostella, as is also the case for endo-siRNAs. CONCLUSION: Together, these findings suggest that the last common ancestor of extant animals did not have the entrained RNAi system that typifies bilaterians. Instead it appears that bilaterians, cnidarians, ctenophores and sponges express unique repertoires and combinations of miRNAs, piRNAs and endo-siRNAs.
Subjects
Biological Evolution , RNA Interference , Animals , Ctenophora/genetics , Drosophila/genetics , Gene Library , Genome , MicroRNAs/genetics , Molecular Sequence Annotation , Small Interfering RNA/metabolism , Sea Anemones/genetics
ABSTRACT
The advent of high-throughput sequencing (HTS) technologies has revolutionized the way we understand the transformation of genetic information into morphological traits. Elucidating the network of interactions between genes that govern cell differentiation through development is one of the core challenges in genome research. These networks are known as developmental gene regulatory networks (dGRNs) and consist largely of the functional linkage between developmental control genes, cis-regulatory modules, and differentiation genes, which generate spatially and temporally refined patterns of gene expression. Over the last 20 years, great advances have been made in determining these gene interactions mainly in classical model systems, including human, mouse, sea urchin, fruit fly, and worm. This has brought about a radical transformation in the fields of developmental biology and evolutionary biology, allowing the generation of high-resolution gene regulatory maps to analyze cell differentiation during animal development. Such maps have enabled the identification of gene regulatory circuits and have led to the development of network inference methods that can recapitulate the differentiation of specific cell-types or developmental stages. In contrast, dGRN research in non-classical model systems has been limited to the identification of developmental control genes via the candidate gene approach and the characterization of their spatiotemporal expression patterns, as well as to the discovery of cis-regulatory modules via patterns of sequence conservation and/or predicted transcription-factor binding sites. However, thanks to the continuous advances in HTS technologies, this scenario is rapidly changing. Here, we give a historical overview on the architecture and elucidation of the dGRNs. 
Subsequently, we summarize the approaches available to unravel these regulatory networks, highlighting the vast range of possibilities of integrating multiple technical advances and theoretical approaches to expand our understanding on the global gene regulation during animal development in non-classical model systems. Such new knowledge will not only lead to greater insights into the evolution of molecular mechanisms underlying cell identity and animal body plans, but also into the evolution of morphological key innovations in animals.
Subjects
Developmental Gene Expression Regulation , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/methods , Invertebrates/genetics , Vertebrates/genetics , Animals , Invertebrates/growth & development , Animal Models , Vertebrates/growth & development
ABSTRACT
Combinatorial patterns of histone modifications regulate developmental and cell type-specific gene expression and underpin animal complexity, but it is unclear when this regulatory system evolved. By analysing histone modifications in a morphologically simple, early branching animal, the sponge Amphimedon queenslandica, we show that the regulatory landscape used by complex bilaterians was already in place at the dawn of animal multicellularity. This includes distal enhancers, repressive chromatin and transcriptional units marked by H3K4me3 that vary with levels of developmental regulation. Strikingly, Amphimedon enhancers are enriched in metazoan-specific microsyntenic units, suggesting that their genomic location is extremely ancient and likely to place constraints on the evolution of surrounding genes. These results suggest that the regulatory foundation for spatiotemporal gene expression evolved prior to the divergence of sponges and eumetazoans, and was necessary for the evolution of animal multicellularity.