1.
Nucleic Acids Res ; 52(D1): D72-D80, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37904589

ABSTRACT

G-quadruplexes (G4s) are non-canonical four-stranded structures and are emerging as novel genetic regulatory elements. However, a comprehensive genomic annotation of endogenous G4s (eG4s) and systematic characterization of their regulatory network are still lacking, posing major challenges for eG4 research. Here, we present EndoQuad (https://EndoQuad.chenzxlab.cn/) to address these pressing issues by integrating high-throughput experimental data. First, based on high-quality genome-wide eG4 mapping datasets (human: 1181; mouse: 24; chicken: 2) generated by G4 ChIP-seq/CUT&Tag, we generate a reference set of genome-wide eG4s. Our multi-omics analyses show that most eG4s are identified in one or a few cell types. The eG4s with higher occurrences across samples are more structurally stable, evolutionarily conserved, and enriched in promoter regions; they mark highly expressed genes and associate with complex regulatory programs, indicating a higher confidence level for further experiments. Finally, we integrate millions of functional genomic variants and prioritize eG4s with regulatory functions in disease and cancer contexts. These efforts have culminated in a comprehensive and interactive database of experimentally validated DNA eG4s. As such, EndoQuad enables users to easily access, download and repurpose these data for their own research. EndoQuad will become a one-stop resource for eG4 research and lay the foundation for future functional studies.


Subject(s)
Databases, Genetic , G-Quadruplexes , Regulatory Sequences, Nucleic Acid , Animals , Humans , Mice , Genome , Genomics
2.
Cell Biosci ; 13(1): 117, 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37381029

ABSTRACT

G-quadruplex (G4) is a four-stranded helical DNA secondary structure formed by guanine-rich sequence folding, and G4 has been computationally predicted to exist in a wide range of species. Substantial evidence has supported the formation of endogenous G4 (eG4) in living cells and revealed its regulatory dynamics and critical roles in several important biological processes, making eG4 a regulator of gene expression perturbation and a promising therapeutic target in disease biology. Here, we reviewed the methods for prediction of potential G4 sequences (PQS) and detection of eG4s. We also highlighted the factors affecting the dynamics of eG4s and the effects of eG4 dynamics. Finally, we discussed the future applications of eG4 dynamics in disease therapy.

3.
Genome Biol ; 23(1): 235, 2022 Nov 08.
Article in English | MEDLINE | ID: mdl-36348461

ABSTRACT

BACKGROUND: Pseudogenes are excellent markers of genome evolution and are emerging as crucial regulators of development and disease, especially cancer. However, the systematic functional characterization and evolution of pseudogenes remain largely unexplored. RESULTS: To systematically characterize pseudogenes, we date the origin of human and mouse pseudogenes across vertebrates and observe a burst of pseudogene gain in these two lineages. Based on a hybrid sequencing dataset combining full-length PacBio sequencing, sample-matched Illumina sequencing, and public time-course transcriptome data, we observe that abundant mammalian pseudogenes can be transcribed and contribute to the establishment of organ identity. Our analyses reveal that developmentally dynamic pseudogenes are evolutionarily conserved and show an increasing weight during development. In addition, they are involved in complex transcriptional and post-transcriptional modulation, exhibiting the signatures of functional enrichment. Coding potential evaluation suggests that 19% of human pseudogenes could be translated, thus serving as a new route to protein innovation. Moreover, pseudogenes carry disease-associated SNPs and contribute to cancer transcriptome perturbation. CONCLUSIONS: Our discovery reveals an unexpectedly high abundance of mammalian pseudogenes that can be transcribed and translated, and these pseudogenes represent a novel regulatory layer. Our study also prioritizes developmentally dynamic pseudogenes with signatures of functional enrichment and provides a hybrid sequencing dataset for further unraveling their biological mechanisms in organ development and carcinogenesis.


Subject(s)
Neoplasms , Pseudogenes , Humans , Mice , Animals , Genome , Mammals/genetics , Sequence Analysis, DNA , Neoplasms/genetics
4.
J Genet Genomics ; 49(1): 20-29, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34601118

ABSTRACT

G-quadruplexes in viral genomes can serve as targets for antiviral therapies and have attracted wide interest. However, it is still not clear whether the abundance of such elements in the viral world is the result of natural selection for functionality. In this study, we identified putative quadruplex-forming sequences (PQSs) across the known viral genomes and analyzed the abundance, structural stability, and conservation of viral PQSs. A Viral Putative G-quadruplex Database (http://jsjds.hzau.edu.cn/MBPC/ViPGD/index.php/home/index) was constructed to collect the details of each viral PQS, which provides guidance for selecting the desirable PQS. PQSs with two putative G-tetrads (G2-PQSs) were significantly enriched in both eukaryotic viruses and prokaryotic viruses, whereas PQSs with three putative G-tetrads (G3-PQSs) were enriched only in eukaryotic viruses and depleted in prokaryotic viruses. The structural stability of PQSs in prokaryotic viruses was significantly lower than that in eukaryotic viruses. Conservation analysis showed that G2-PQSs, rather than G3-PQSs, were highly conserved within the genus. This suggests that the G2-quadruplex might play an important role in viral biology, and that the difference in the occurrence of G-quadruplexes between eukaryotic viruses and prokaryotic viruses may result from the different selection pressures from hosts.


Subject(s)
G-Quadruplexes , Viruses , Eukaryota , Genome, Viral/genetics , Viruses/genetics
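
The abstract of record 4 does not state how PQSs were identified. As a minimal sketch only, the widely used tract-and-loop motif (four G-tracts separated by three loops) can be scanned with a regular expression. The function name `find_pqs`, the loop length bound of 1 to 7 nt, and the choice to scan only the given strand are assumptions for illustration, not the authors' pipeline.

```python
import re

def find_pqs(sequence, min_tetrads=3, max_loop=7):
    """Return (start, matched_text) for non-overlapping putative
    quadruplex-forming sequences on the given strand.

    Assumed motif: four G-tracts of at least `min_tetrads` guanines,
    separated by three loops of 1..max_loop nucleotides.
    min_tetrads=2 corresponds to a G2-PQS, min_tetrads=3 to a G3-PQS.
    """
    tract = f"G{{{min_tetrads},}}"        # e.g. G{3,}
    loop = f"[ACGT]{{1,{max_loop}}}"      # e.g. [ACGT]{1,7}
    pattern = re.compile(tract + (loop + tract) * 3)
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]

# Hypothetical example sequences, not taken from any viral genome.
g3_hits = find_pqs("TTGGGAGGGTGGGAGGGTT", min_tetrads=3)
g2_hits = find_pqs("TTGGAGGTGGAGGTT", min_tetrads=2)
```

Scanning the reverse complement in the same way would cover PQSs on the opposite strand; that step is omitted here to keep the sketch short.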
5.
Article in English | MEDLINE | ID: mdl-36031057

ABSTRACT

In the evolutionary model of dosage compensation, the per-allele expression level of the X chromosome has been proposed to be up-regulated twofold to compensate for its dose reduction in males (XY) compared to females (XX). However, the expression regulation of X-linked genes is still controversial, and comprehensive evaluations are still lacking. By integrating multi-omics datasets in mammals, we investigated the expression ratios including X to autosomes (X:AA ratio) and X to orthologs (X:XX ratio) at the transcriptome, translatome, and proteome levels. We revealed a dynamic spatial-temporal X:AA ratio during development in humans and mice. Meanwhile, by tracing the evolution of orthologous gene expression in chickens, platypuses, and opossums, we found a stable expression ratio of X-linked genes in humans to their autosomal orthologs in other species (X:XX ≈ 1) across tissues and developmental stages, demonstrating stable dosage compensation in mammals. We also found that different epigenetic regulations contributed to the high tissue specificity and stage specificity of X-linked gene expression, thus affecting X:AA ratios. We conclude that the dynamics of X:AA ratios are attributable to the different gene contents and expression preferences of the X chromosome, rather than to changes in the stable dosage compensation.
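
The abstract does not give the exact formula used for the X:AA ratio. One common way to express such a ratio, shown here only as a sketch, is the median expression of X-linked genes divided by the median expression of autosomal genes within one sample. The function name `x_to_autosome_ratio`, the TPM unit, and the `min_tpm` expression filter are assumptions, not details from the study.

```python
from statistics import median

def x_to_autosome_ratio(expression, min_tpm=1.0):
    """Median X-linked expression divided by median autosomal expression.

    `expression` is an iterable of (chromosome, tpm) pairs for one sample.
    Genes below `min_tpm` are excluded (an assumed filter); Y-linked and
    mitochondrial genes are left out of both groups.
    """
    non_autosomal = {"X", "Y", "MT"}
    x_values = [tpm for chrom, tpm in expression
                if chrom == "X" and tpm >= min_tpm]
    autosomal_values = [tpm for chrom, tpm in expression
                        if chrom not in non_autosomal and tpm >= min_tpm]
    if not x_values or not autosomal_values:
        raise ValueError("need at least one expressed X-linked and autosomal gene")
    return median(x_values) / median(autosomal_values)

# Hypothetical toy sample, values chosen only to make the arithmetic visible.
sample = [("X", 10.0), ("X", 20.0), ("1", 10.0), ("2", 30.0),
          ("3", 20.0), ("Y", 100.0), ("1", 0.1)]
ratio = x_to_autosome_ratio(sample)
```

In this toy sample the X-linked median is 15 and the autosomal median is 20, so the ratio is 0.75. A ratio near 1 would be consistent with twofold up-regulation of the single active X.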

6.
J Genet Genomics ; 48(12): 1122-1129, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34538772

ABSTRACT

The origination of new genes contributes to the biological diversity of life. New genes may quickly build their network, exert important functions, and generate novel phenotypes. Dating gene age and inferring the origination mechanisms of new genes, such as primate-specific genes, is the basis for the functional study of these genes. However, no comprehensive resource of gene age estimates across species is available. Here, we systematically date the age of 9,102,113 protein-coding genes from 565 species in the Ensembl and Ensembl Genomes databases, including 82 bacteria, 57 protists, 134 fungi, 58 plants, 56 metazoa, and 178 vertebrates, using a protein-family-based pipeline with the Wagner parsimony algorithm. We also collect gene age estimate data from other studies and uniformly convert the gene age estimates to time ranges in millions of years for comparison across studies. All the data are cataloged into GenOrigin (http://genorigin.chenzxlab.cn/), a user-friendly new database of gene age estimates, where users can browse gene age estimates by species, age, and gene ontology. GenOrigin provides gene age estimates, annotation, gene ontology, orthologs, and paralogs, as well as detailed gene presence/absence views for gene age inference based on the species tree with an evolutionary timescale, so that researchers can explore gene functions.


Subject(s)
Evolution, Molecular , Vertebrates , Algorithms , Animals , Phylogeny , Software , Vertebrates/genetics
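
The abstract of record 6 names Wagner parsimony but gives no implementation details. The sketch below illustrates the general idea with a Sankoff-style dynamic program over presence/absence states on a species tree, using separate gain and loss penalties, and reports the oldest node inferred to carry the gene family. The tree encoding, the function name `date_gene_family`, the penalty values, and the tie-breaking rule (prefer absence) are all assumptions for illustration, not the GenOrigin pipeline.

```python
import math

def date_gene_family(tree, present, gain_cost=1.0, loss_cost=1.0):
    """Return the oldest node inferred to carry the family, or None.

    `tree` is (name, [child_trees]); leaves have an empty child list.
    `present` is the set of leaf names in which the family is observed.
    """
    cost = {}  # node name -> (cost if absent, cost if present)

    def transition(parent_state, child_state):
        if parent_state == child_state:
            return 0.0
        return gain_cost if child_state == 1 else loss_cost

    def score(node):
        # Bottom-up pass: minimal cost of each state at each node.
        name, children = node
        if not children:
            cost[name] = (math.inf, 0.0) if name in present else (0.0, math.inf)
            return
        for child in children:
            score(child)
        cost[name] = tuple(
            sum(min(cost[c[0]][t] + transition(s, t) for t in (0, 1))
                for c in children)
            for s in (0, 1))

    carriers = []  # (depth, name) of nodes assigned the "present" state

    def assign(node, state, depth):
        # Top-down pass: pick the cheapest child state; ties prefer absence.
        name, children = node
        if state == 1:
            carriers.append((depth, name))
        for child in children:
            best = min((0, 1), key=lambda t: cost[child[0]][t] + transition(state, t))
            assign(child, best, depth + 1)

    score(tree)
    root_state = min((0, 1), key=lambda s: cost[tree[0]][s])
    assign(tree, root_state, 0)
    return min(carriers)[1] if carriers else None

# Hypothetical four-species tree for illustration.
species_tree = ("Vertebrata", [
    ("Amniota", [
        ("Mammalia", [("human", []), ("mouse", [])]),
        ("chicken", []),
    ]),
    ("zebrafish", []),
])
mammal_specific = date_gene_family(species_tree, {"human", "mouse"})
ancient = date_gene_family(species_tree, {"human", "zebrafish"}, gain_cost=2.0)
```

With equal penalties, a family seen only in human and mouse is dated to the Mammalia node. Raising the gain penalty makes independent gains less parsimonious than losses, so a family seen in human and zebrafish is dated to the root.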