Results 1 - 20 of 4,760
1.
Genome Biol ; 25(1): 83, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566111

ABSTRACT

BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution. RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery. CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes.


Subject(s)
DNA , Regulatory Sequences, Nucleic Acid , Binding Sites , Sequence Alignment , Algorithms , Conserved Sequence/genetics , Evolution, Molecular
2.
PLoS Comput Biol ; 20(4): e1012028, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38662765

ABSTRACT

Intrinsically disordered regions (IDRs) are segments of proteins without stable three-dimensional structures. As this flexibility allows them to interact with diverse binding partners, IDRs play key roles in cell signaling and gene expression. Despite the prevalence and importance of IDRs in eukaryotic proteomes and various biological processes, associating them with specific molecular functions remains a significant challenge due to their high rates of sequence evolution. However, by comparing the observed values of various IDR-associated properties against those generated under a simulated model of evolution, a recent study found most IDRs across the entire yeast proteome contain conserved features. Furthermore, it showed clusters of IDRs with common "evolutionary signatures," i.e. patterns of conserved features, were associated with specific biological functions. To determine if similar patterns of conservation are found in the IDRs of other systems, in this work we applied a series of phylogenetic models to over 7,500 orthologous IDRs identified in the Drosophila genome to dissect the forces driving their evolution. By comparing models of unconstrained and constrained continuous trait evolution using the Brownian motion and Ornstein-Uhlenbeck models, respectively, we identified signals of widespread constraint, indicating that conservation of distributed features is a mechanism of IDR evolution common to multiple biological systems. In contrast to the previous study in yeast, however, we observed limited evidence of IDR clusters with specific biological functions, which suggests a more complex relationship between evolutionary constraints and function in the IDRs of multicellular organisms.
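
For readers unfamiliar with the two trait-evolution models named in this abstract, their standard textbook forms are sketched below. The symbols ($X_t$ for the trait value along a branch, $\sigma$, $\alpha$, $\theta$, $W_t$) are notation assumed here for illustration; the abstract does not state the parameterization the authors used.

```latex
% Brownian motion (unconstrained): trait variance grows without bound over time.
\[
  dX_t = \sigma \, dW_t,
  \qquad
  \operatorname{Var}[X_t \mid X_0] = \sigma^2 t
\]

% Ornstein-Uhlenbeck (constrained): the trait is pulled toward an optimum \theta
% with strength \alpha > 0, so the variance saturates.
\[
  dX_t = \alpha(\theta - X_t)\,dt + \sigma \, dW_t,
  \qquad
  \operatorname{Var}[X_t \mid X_0] = \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha t}\right)
  \;\xrightarrow[t \to \infty]{}\; \frac{\sigma^2}{2\alpha}
\]
```

As $\alpha \to 0$ the Ornstein-Uhlenbeck variance reduces to $\sigma^2 t$, so Brownian motion is the limiting case without constraint; a better fit of the Ornstein-Uhlenbeck model is what would typically be read as a signal of constraint.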


Subject(s)
Evolution, Molecular , Intrinsically Disordered Proteins , Phylogeny , Animals , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/genetics , Intrinsically Disordered Proteins/metabolism , Conserved Sequence/genetics , Computational Biology/methods , Drosophila/genetics , Proteome/chemistry , Proteome/metabolism , Proteome/genetics , Drosophila Proteins/genetics , Drosophila Proteins/chemistry , Drosophila Proteins/metabolism
3.
Genome Biol Evol ; 16(4)2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38502060

ABSTRACT

Conserved noncoding elements (CNEs) are DNA sequences located outside of protein-coding genes that can remain under purifying selection for up to hundreds of millions of years. Studies in vertebrate genomes have revealed that most CNEs carry out regulatory functions. Notably, many of them are enhancers that control the expression of homeodomain transcription factors and other genes that play crucial roles in embryonic development. To further our knowledge of CNEs in other parts of the animal tree, we conducted a large-scale characterization of CNEs in more than 50 genomes from three of the main branches of the metazoan tree: Cnidaria, Mollusca, and Arthropoda. We identified hundreds of thousands of CNEs and reconstructed the temporal dynamics of their appearance in each lineage, as well as determining their spatial distribution across genomes. We show that CNEs evolve repeatedly around the same genes across the Metazoa, including around homeodomain genes and other transcription factors; they also evolve repeatedly around genes involved in neural development. We also show that transposons are a major source of CNEs, confirming previous observations from vertebrates and suggesting that they have played a major role in wiring developmental gene regulatory mechanisms since the dawn of animal evolution.


Subject(s)
Regulatory Sequences, Nucleic Acid , Vertebrates , Animals , Conserved Sequence/genetics , Vertebrates/genetics , Base Sequence , Transcription Factors/genetics , Evolution, Molecular
5.
Nucleic Acids Res ; 52(6): 3121-3136, 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38375870

ABSTRACT

MicroRNAs (miRNAs) are important and ubiquitous regulators of gene expression in both plants and animals. They are thought to have evolved convergently in these lineages and hypothesized to have played a role in the evolution of multicellularity. In line with this hypothesis, miRNAs have so far only been described in few unicellular eukaryotes. Here, we investigate the presence and evolution of miRNAs in Amoebozoa, focusing on species belonging to Acanthamoeba, Physarum and dictyostelid taxonomic groups, representing a range of unicellular and multicellular lifestyles. miRNAs that adhere to both the stringent plant and animal miRNA criteria were identified in all examined amoebae, expanding the total number of protists harbouring miRNAs from 7 to 15. We found conserved miRNAs between closely related species, but the majority of species feature only unique miRNAs. This shows rapid gain and/or loss of miRNAs in Amoebozoa, further illustrated by a detailed comparison between two evolutionary closely related dictyostelids. Additionally, loss of miRNAs in the Dictyostelium discoideum drnB mutant did not seem to affect multicellular development and, hence, demonstrates that the presence of miRNAs does not appear to be a strict requirement for the transition from uni- to multicellular life.


Subject(s)
Amoebozoa , Evolution, Molecular , MicroRNAs , RNA, Protozoan , Amoebozoa/classification , Amoebozoa/genetics , Dictyostelium/genetics , MicroRNAs/genetics , Phylogeny , RNA, Protozoan/genetics , Conserved Sequence/genetics , RNA Interference
6.
Dev Growth Differ ; 66(1): 75-88, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37925606

ABSTRACT

Abnormal expression of the transcriptional regulator and hedgehog (Hh) signaling pathway effector Gli3 is known to trigger congenital disease, most frequently affecting the central nervous system (CNS) and the limbs. Accurate delineation of the genomic cis-regulatory landscape controlling Gli3 transcription during embryonic development is critical for the interpretation of noncoding variants associated with congenital defects. Here, we employed a comparative genomic analysis on fish species with a slow rate of molecular evolution to identify seven previously unknown conserved noncoding elements (CNEs) in Gli3 intronic intervals (CNE15-21). Transgenic assays in zebrafish revealed that most of these elements drive activities in Gli3 expressing tissues, predominantly the fins, CNS, and the heart. Intersection of these CNEs with human disease associated SNPs identified CNE15 as a putative mammalian craniofacial enhancer, with conserved activity in vertebrates and potentially affected by mutation associated with human craniofacial morphology. Finally, comparative functional dissection of an appendage-specific CNE conserved in slowly evolving fish (elephant shark), but not in teleost (CNE14/hs1586) indicates co-option of limb specificity from other tissues prior to the divergence of amniotes and lobe-finned fish. These results uncover a novel subset of intronic Gli3 enhancers that arose in the common ancestor of gnathostomes and whose sequence components were likely gradually modified in other species during the process of evolutionary diversification.


Subject(s)
Enhancer Elements, Genetic , Zebrafish , Animals , Humans , Zebrafish/genetics , Zebrafish/metabolism , Enhancer Elements, Genetic/genetics , Hedgehog Proteins/genetics , Hedgehog Proteins/metabolism , Animals, Genetically Modified , Mammals , Evolution, Molecular , Conserved Sequence/genetics
7.
Nature ; 625(7996): 735-742, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38030727

ABSTRACT

Noncoding DNA is central to our understanding of human gene regulation and complex diseases1,2, and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome3-9. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA10, the relatively short timescales separating primate species11, and the previously limited availability of whole-genome sequences12. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.


Subject(s)
Conserved Sequence , Evolution, Molecular , Genome , Primates , Animals , Female , Humans , Pregnancy , Conserved Sequence/genetics , Deoxyribonuclease I/metabolism , DNA/genetics , DNA/metabolism , Genome/genetics , Mammals/classification , Mammals/genetics , Placenta , Primates/classification , Primates/genetics , Regulatory Sequences, Nucleic Acid/genetics , Reproducibility of Results , Transcription Factors/metabolism , Proteins/genetics , Gene Expression Regulation/genetics
8.
J Biol Chem ; 300(2): 105611, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38159848

ABSTRACT

During growth, bacteria remodel and recycle their peptidoglycan (PG). A key family of PG-degrading enzymes is the lytic transglycosylases, which produce anhydromuropeptides, a modification that caps the PG chains and contributes to bacterial virulence. Previously, it was reported that the polar-growing Gram-negative plant pathogen Agrobacterium tumefaciens lacks anhydromuropeptides. Here, we report the identification of an enzyme, MdaA (MurNAc deacetylase A), which specifically removes the acetyl group from anhydromuropeptide chain termini in A. tumefaciens, resolving this apparent anomaly. A. tumefaciens lacking MdaA accumulates canonical anhydromuropeptides, whereas MdaA was able to deacetylate anhydro-N-acetyl muramic acid in purified sacculi that lack this modification. As for other PG deacetylases, MdaA belongs to the CE4 family of carbohydrate esterases but harbors an unusual Cys residue in its active site. MdaA is conserved in other polar-growing bacteria, suggesting a possible link between PG chain terminus deacetylation and polar growth.


Subject(s)
Agrobacterium tumefaciens , Bacterial Proteins , Agrobacterium tumefaciens/classification , Agrobacterium tumefaciens/enzymology , Agrobacterium tumefaciens/genetics , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Cell Wall , Peptidoglycan , Amidohydrolases/genetics , Amidohydrolases/metabolism , Bacteria/classification , Bacteria/genetics , Bacteria/metabolism , Conserved Sequence/genetics , Gene Deletion
9.
Nature ; 624(7991): 390-402, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38092918

ABSTRACT

Divergence of cis-regulatory elements drives species-specific traits1, but how this manifests in the evolution of the neocortex at the molecular and cellular level remains unclear. Here we investigated the gene regulatory programs in the primary motor cortex of human, macaque, marmoset and mouse using single-cell multiomics assays, generating gene expression, chromatin accessibility, DNA methylome and chromosomal conformation profiles from a total of over 200,000 cells. From these data, we show evidence that divergence of transcription factor expression corresponds to species-specific epigenome landscapes. We find that conserved and divergent gene regulatory features are reflected in the evolution of the three-dimensional genome. Transposable elements contribute to nearly 80% of the human-specific candidate cis-regulatory elements in cortical cells. Through machine learning, we develop sequence-based predictors of candidate cis-regulatory elements in different species and demonstrate that the genomic regulatory syntax is highly preserved from rodents to primates. Finally, we show that epigenetic conservation combined with sequence similarity helps to uncover functional cis-regulatory elements and enhances our ability to interpret genetic variants contributing to neurological disease and traits.


Subject(s)
Conserved Sequence , Evolution, Molecular , Gene Expression Regulation , Gene Regulatory Networks , Mammals , Neocortex , Animals , Humans , Mice , Callithrix/genetics , Chromatin/genetics , Chromatin/metabolism , Conserved Sequence/genetics , DNA Methylation , DNA Transposable Elements/genetics , Epigenome , Gene Expression Regulation/genetics , Macaca/genetics , Mammals/genetics , Motor Cortex/cytology , Motor Cortex/metabolism , Multiomics , Neocortex/cytology , Neocortex/metabolism , Regulatory Sequences, Nucleic Acid/genetics , Single-Cell Analysis , Transcription Factors/metabolism , Genetic Variation/genetics
10.
Mol Biol Evol ; 40(12)2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38085182

ABSTRACT

DNA that controls gene expression (e.g. enhancers, promoters) has seemed almost never to be conserved between distantly related animals, like vertebrates and arthropods. This is mysterious, because development of such animals is partly organized by homologous genes with similar complex expression patterns, termed "deep homology." Here, we report 25 regulatory DNA segments conserved across bilaterian animals, of which 7 are also conserved in cnidaria (coral and sea anemone). They control developmental genes (e.g. Nr2f, Ptch, Rfx1/3, Sall, Smad6, Sp5, Tbx2/3), including six homeobox genes: Gsx, Hmx, Meis, Msx, Six1/2, and Zfhx3/4. The segments contain perfectly or near-perfectly conserved CCAAT boxes, E-boxes, and other sequences recognized by regulatory proteins. More such DNA conservation will surely be found soon, as more genomes are published and sequence comparison is optimized. This reveals a control system for animal development conserved since the Precambrian.


Subject(s)
Anthozoa , Genes, Homeobox , Animals , DNA , Transcription Factors/genetics , Anthozoa/genetics , Embryonic Development/genetics , Conserved Sequence/genetics
11.
PeerJ ; 11: e15632, 2023.
Article in English | MEDLINE | ID: mdl-37456878

ABSTRACT

MicroRNAs (miRNAs) are endogenous non-coding small RNAs, 19-24 nucleotides (nt) in length, that play an essential role in regulating gene expression at the post-transcriptional level. As one of the first miRNAs found in plants, miR171 is a typical class of conserved miRNAs. The miR171 sequences among different species are highly similar, and the vast majority of them have both "GAGCCG" and "CAAUAU" fragments. In addition to being involved in plant growth and development, hormone signaling and stress response, miR171 also plays multiple and important roles in plants through interactions with microbes and other small RNAs. The miRNA functions by regulating the expression of target genes. Most of miR171's target genes belong to the GRAS gene family, but they also include some NSP, miRNA, lncRNA, and other genes. This review is intended to summarize recent updates on miR171 regarding its function in plant life and hopefully provide new ideas for understanding miR171 function and regulatory mechanisms.
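
The shared-fragment observation in this abstract lends itself to a simple check. The following is a minimal sketch, not a method from the review: the function names are hypothetical, and the example sequence is an illustrative string chosen to contain both fragments rather than a record taken from a database.

```python
# Fragments the abstract reports in the vast majority of miR171 sequences.
MIR171_FRAGMENTS = ("GAGCCG", "CAAUAU")


def normalize_rna(seq: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet (T -> U)."""
    return seq.strip().upper().replace("T", "U")


def has_mir171_fragments(seq: str) -> bool:
    """Return True if the sequence contains both reported fragments."""
    rna = normalize_rna(seq)
    return all(fragment in rna for fragment in MIR171_FRAGMENTS)


if __name__ == "__main__":
    # Hypothetical example sequence (an assumption for illustration only).
    example = "UGAUUGAGCCGCGCCAAUAUC"
    print(has_mir171_fragments(example))
```

Substring matching is deliberately strict here; a real survey across species would likely need to tolerate mismatches, which this sketch does not attempt.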


Subject(s)
MicroRNAs , Plant Development , Plants , Gene Expression Regulation, Plant/genetics , MicroRNAs/genetics , MicroRNAs/metabolism , Plant Development/genetics , Signal Transduction/genetics , Plants/classification , Plants/genetics , Phylogeny , Conserved Sequence/genetics , Stress, Physiological/genetics
12.
Int J Mol Sci ; 24(13)2023 Jul 04.
Article in English | MEDLINE | ID: mdl-37446254

ABSTRACT

Glutathione peroxidase-like enzyme is an important enzymatic antioxidant in plants. It is involved in scavenging reactive oxygen species, which can effectively prevent oxidative damage and improve resistance. GPXL has been studied in many plants but has not been reported in potatoes, the world's fourth-largest food crop. This study identified eight StGPXL genes in potatoes for the first time through genome-wide bioinformatics analysis and further studied the expression patterns of these genes using qRT-PCR. The results showed that the expression of StGPXL1 was significantly upregulated under high-temperature stress, indicating its involvement in potato defense against high-temperature stress, while the expression levels of StGPXL4 and StGPXL5 were significantly downregulated. The expression of StGPXL1, StGPXL2, StGPXL3, and StGPXL6 was significantly upregulated under drought stress, indicating their involvement in potato defense against drought stress. After MeJA hormone treatment, the expression level of StGPXL6 was significantly upregulated, indicating its involvement in the chemical defense mechanism of potatoes. The expression of all StGPXL genes is inhibited under biotic stress, which indicates that GPXL is a multifunctional gene family, which may endow plants with resistance to various stresses. This study will help deepen the understanding of the function of the potato GPXL gene family, provide comprehensive information for the further analysis of the molecular function of the potato GPXL gene family as well as a theoretical basis for potato molecular breeding.


Subject(s)
Gene Expression Regulation, Plant , Genome-Wide Association Study , Glutathione Peroxidase , Plant Proteins , Solanum tuberosum , Gene Expression Profiling , Glutathione Peroxidase/genetics , Glutathione Peroxidase/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Solanum tuberosum/classification , Solanum tuberosum/enzymology , Solanum tuberosum/genetics , Stress, Physiological/genetics , Gene Duplication/genetics , Conserved Sequence/genetics , Amino Acid Motifs/genetics , Arabidopsis Proteins/genetics , Gene Ontology
13.
Sci China Life Sci ; 66(10): 2399-2414, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37256419

ABSTRACT

Limb loss shows recurrent phenotypic evolution across squamate lineages. Here, based on three de novo-assembled genomes of limbless lizards from different lineages, we showed that divergence of conserved non-coding elements (CNEs) played an important role in limb development. These CNEs were associated with genes required for limb initiation and outgrowth, and with regulatory signals in the early stage of limb development. Importantly, we identified the extensive existence of insertions and deletions (InDels) in the CNEs, with the numbers ranging from 111 to 756. Most of these CNEs with InDels were lineage-specific in the limbless squamates. Nearby genes of these InDel CNEs were important to early limb formation, such as Tbx4, Fgf10, and Gli3. Based on functional experiments, we found that nucleotide mutations and InDels both affected the regulatory function of the CNEs. Our study provides molecular evidence underlying limb loss in squamate reptiles from a developmental perspective and sheds light on the importance of regulatory element InDels in phenotypic evolution.


Subject(s)
Genome , Reptiles , Animals , Reptiles/genetics , Transcription Factors/genetics , Evolution, Molecular , Conserved Sequence/genetics , Biological Evolution
14.
Life Sci Alliance ; 6(6)2023 Jun.
Article in English | MEDLINE | ID: mdl-37024123

ABSTRACT

Although long noncoding RNAs (lncRNAs) experience weaker evolutionary constraints and exhibit lower sequence conservation than coding genes, they can still conserve their features in various aspects. Here, we used multiple approaches to systemically evaluate the conservation between human and mouse lncRNAs from various dimensions including sequences, promoter, global synteny, and local synteny, which led to the identification of 1,731 conserved lncRNAs with 427 high-confidence ones meeting multiple criteria. Conserved lncRNAs, compared with non-conserved ones, generally have longer gene bodies, more exons and transcripts, stronger connections with human diseases, and are more abundant and widespread across different tissues. Transcription factor (TF) profile analysis revealed a significant enrichment of TF types and numbers in the promoters of conserved lncRNAs. We further identified a set of TFs that preferentially bind to conserved lncRNAs and exert stronger regulation on conserved than non-conserved lncRNAs. Our study has reconciled some discrepant interpretations of lncRNA conservation and revealed a new set of transcriptional factors ruling the expression of conserved lncRNAs.


Subject(s)
RNA, Long Noncoding , Mice , Humans , Animals , Conserved Sequence/genetics , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Gene Expression Regulation/genetics , Transcription Factors/genetics , Biological Evolution
15.
Science ; 380(6643): eabn2253, 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37104592

ABSTRACT

Conserved genomic sequences disrupted in humans may underlie uniquely human phenotypic traits. We identified and characterized 10,032 human-specific conserved deletions (hCONDELs). These short (average 2.56 base pairs) deletions are enriched for human brain functions across genetic, epigenomic, and transcriptomic datasets. Using massively parallel reporter assays in six cell types, we discovered 800 hCONDELs conferring significant differences in regulatory activity, half of which enhance rather than disrupt regulatory function. We highlight several hCONDELs with putative human-specific effects on brain development, including HDAC5, CPEB4, and PPP2CA. Reverting an hCONDEL to the ancestral sequence alters the expression of LOXL2 and developmental genes involved in myelination and synaptic function. Our data provide a rich resource to investigate the evolutionary mechanisms driving new traits in humans and other species.


Subject(s)
Brain , Evolution, Molecular , Gene Expression Regulation, Developmental , Sequence Deletion , Humans , Conserved Sequence/genetics , Genome , Genomics , RNA-Binding Proteins/genetics , Brain/growth & development
16.
Science ; 380(6643): eabn3943, 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37104599

ABSTRACT

Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.


Subject(s)
Eutheria , Evolution, Molecular , Animals , Female , Humans , Conserved Sequence/genetics , Eutheria/genetics , Genome, Human
17.
Plant Cell Physiol ; 64(6): 604-621, 2023 Jun 15.
Article in English | MEDLINE | ID: mdl-36943747

ABSTRACT

In plants, microRNA (miRNA)-target interactions (MTIs) require high complementarity, a feature from which bioinformatic programs have predicted numerous and diverse targets for any given miRNA, promoting the idea of complex miRNA networks. Opposing this is a hypothesis of constrained miRNA specificity, in which functional MTIs are restricted to the few targets whose required expression output is compatible with the expression of the miRNA. To explore these opposing views, the bioinformatic pipeline Targets Ranked Using Experimental Evidence was applied to strongly conserved miRNAs to identify their high-evidence (HE) targets across species. For each miRNA family, HE targets predominantly consisted of homologs from one conserved target gene family (primary family). These primary families corresponded to the known canonical miRNA-target families, validating the approach. Very few additional HE target families were identified (secondary family), and if so, they were likely functionally related to the primary family. Many primary target families contained highly conserved nucleotide sequences flanking their miRNA-binding sites that were enriched in HE homologs across species. A number of these flanking sequences are predicted to form conserved RNA secondary structures that preferentially base pair with the miRNA-binding site, implying that these sites are highly structured. Our findings support a target landscape view that is dominated by the conserved primary target families, with a minority of either secondary target families or non-conserved targets. This is consistent with the constrained hypothesis of functional miRNA specificity, which potentially in part is being facilitated by features beyond complementarity.


Subject(s)
MicroRNAs , MicroRNAs/genetics , MicroRNAs/metabolism , Plants/genetics , Plants/metabolism , Conserved Sequence/genetics , Binding Sites , RNA, Plant/genetics , RNA, Plant/metabolism , Gene Expression Regulation, Plant
18.
J Integr Plant Biol ; 65(6): 1467-1478, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36762577

ABSTRACT

Physical contact between genes distant on chromosomes is a potentially important way for genes to coordinate their expression. To investigate the potential importance of distant contacts, we performed high-throughput chromatin conformation capture (Hi-C) experiments on leaf nuclei isolated from Brassica rapa and Brassica oleracea. We then combined our results with published Hi-C data from Arabidopsis thaliana. We found that distant genes come into physical contact and do so preferentially between the proximal promoter of one gene and the downstream region of another gene. Genes with higher numbers of conserved noncoding sequences (CNSs) nearby were more likely to have contact with distant genes. With more CNSs came higher numbers of transcription factor binding sites and more histone modifications associated with activity. In addition, for the genes we studied, distant contacting genes with CNSs were more likely to be transcriptionally coordinated. These observations suggest that CNSs may enrich active histone modifications and recruit transcription factors, correlating with distant contacts to ensure coordinated expression. This study advances our knowledge of gene contacts and provides insights into the relationship between CNSs and distant gene contacts in plants.


Subject(s)
Arabidopsis , Brassica , Arabidopsis/genetics , Arabidopsis/metabolism , Brassica/genetics , Brassica/metabolism , Conserved Sequence/genetics , Transcription Factors/metabolism , Promoter Regions, Genetic/genetics , Genome, Plant
19.
Biomolecules ; 13(2)2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36830634

ABSTRACT

Lnc-uc.147, a long non-coding RNA derived from a transcribed ultraconserved region (T-UCR), was previously reported in breast cancer. However, the role of this region in other tumor types has not previously been investigated. The present study aimed to investigate lnc-uc.147 in different types of cancer, as well as to suggest functional and regulatory aspects of lnc-uc.147. In an analysis of solid tumor datasets from The Cancer Genome Atlas (TCGA), deregulated lnc-uc.147 expression was associated with the histologic grade of hepatocellular carcinoma, and with the tumor stage of clear cell renal and gastric adenocarcinoma. Given the epidemiologic relevance of liver cancer, lnc-uc.147 was silenced in HepG2 cells, which reduced their viability and clonogenic capacity. Additionally, we suggest a relation between the transcription factor TEAD4 and lnc-uc.147 in liver and breast cancer cells.


Subject(s)
Breast Neoplasms , Carcinoma, Hepatocellular , Carcinoma, Renal Cell , Kidney Neoplasms , RNA, Long Noncoding , Humans , Female , Conserved Sequence/genetics , Carcinoma, Hepatocellular/genetics , RNA, Long Noncoding/genetics , Carcinoma, Renal Cell/genetics , Kidney Neoplasms/genetics , Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic , TEA Domain Transcription Factors
20.
New Phytol ; 238(4): 1722-1732, 2023 May.
Article in English | MEDLINE | ID: mdl-36751910

ABSTRACT

Understanding the evolutionary conservation of complex eukaryotic transcriptomes significantly illuminates the physiological relevance of alternative splicing (AS). Examining the evolutionary depth of a given AS event with ordinary homology searches is generally challenging and time-consuming. Here, we present Catsnap, an algorithmic pipeline for assessing the conservation of putative protein isoforms generated by AS. It employs a machine learning approach following a database search with the provided pair of protein sequences. We used the Catsnap algorithm for analyzing the conservation of emerging experimentally characterized alternative proteins from plants and animals. Indeed, most of them are conserved among other species. Catsnap can detect the conserved functional protein isoforms regardless of the AS type by which they are generated. Notably, we found that while the primary amino acid sequence is maintained, the type of AS determining the inclusion or exclusion of protein regions varies throughout plant phylogenetic lineages in these proteins. We also document that this phenomenon is less seen among animals. In sum, our algorithm highlights the presence of unexpectedly frequent hotspots where protein isoforms recurrently arise to carry physiologically relevant functions. The user web interface is available at https://catsnap.cesnet.cz/.


Subject(s)
Algorithms , Alternative Splicing , Animals , Alternative Splicing/genetics , Phylogeny , Protein Isoforms/genetics , Amino Acid Sequence , Mutant Proteins , Plants , Evolution, Molecular , Conserved Sequence/genetics