Results 1 - 20 of 212
1.
J Virol ; 98(10): e0116024, 2024 Oct 22.
Article in English | MEDLINE | ID: mdl-39315813

ABSTRACT

HIV-1 must generate infectious virions to spread to new hosts and HIV-1 unspliced RNA (HIV-1 RNA) plays two central roles in this process. HIV-1 RNA serves as an mRNA that is translated to generate proteins essential for particle production and replication, and it is packaged into particles as the viral genome. HIV-1 uses several transcription start sites to generate multiple RNAs that differ by a few nucleotides at the 5' end, including those with one (1G) or three (3G) 5' guanosines. The virus relies on host machinery to translate its RNAs in a cap-dependent manner. Here, we demonstrate that the 5' context of HIV-1 RNA affects the efficiency of translation both in vitro and in cells. Although both RNAs are competent for translation, 3G RNA is translated more efficiently than 1G RNA. The 5' untranslated region (UTR) of 1G and 3G RNAs has previously been shown to fold into distinct structural ensembles. We show that HIV-1 mutants in which the 5' UTR of 1G and 3G RNAs fold into similar structures were translated at similar efficiencies. Thus, the host machinery translates two 99.9% identical HIV-1 RNAs with different efficiencies, and the translation efficiency is regulated by the 5' UTR structure. IMPORTANCE: HIV-1 unspliced RNA contains all the viral genetic information and encodes virion structural proteins and enzymes. Thus, the unspliced RNA serves distinct roles as viral genome and translation template, both critical for viral replication. HIV-1 generates two major unspliced RNAs with a 2-nt difference at the 5' end (3G RNA and 1G RNA). The 1G transcript is known to be preferentially packaged over the 3G transcript. Here, we showed that 3G RNA is favorably translated over 1G RNA based on its 5' untranslated region (UTR) RNA structure. In HIV-1 mutants in which the two major transcripts have similar 5' UTR structures, 1G and 3G RNAs are translated similarly. Therefore, HIV-1 generates two 9-kb RNAs with a 2-nt difference, each serving a distinct role dictated by differential 5' UTR structures.


Subjects
5' Untranslated Regions, HIV-1, Protein Biosynthesis, Viral RNA, HIV-1/genetics, 5' Untranslated Regions/genetics, Viral RNA/genetics, Viral RNA/metabolism, Humans, Virus Replication, Nucleic Acid Conformation, Viral Gene Expression Regulation, HEK293 Cells, Viral Genome, Mutation
2.
J Comput Biol ; 31(5): 445-457, 2024 05.
Article in English | MEDLINE | ID: mdl-38752891

ABSTRACT

An alternative transcription start site (ATSS) is a major driving force for increasing the complexity of transcripts in human tissues. As a transcriptional regulatory mechanism, ATSS has biological significance. Many studies have confirmed that ATSS plays an important role in diseases and cell development and differentiation. However, exploration of its dynamic mechanisms remains insufficient. Identifying ATSS change points during cell differentiation is critical for elucidating potential dynamic mechanisms. For relative ATSS usage as percentage data, the existing methods lack sensitivity to detect the change point for ATSS longitudinal data. In addition, some methods have strict requirements for data distribution and cannot be applied to deal with this problem. In this study, the Bayesian change point detection model was first constructed using reparameterization techniques for two parameters of a beta distribution for the percentage data type, and the posterior distributions of parameters and change points were obtained using Markov Chain Monte Carlo (MCMC) sampling. With comprehensive simulation studies, the performance of the Bayesian change point detection model is found to be consistently powerful and robust across most scenarios with different sample sizes and beta distributions. Second, differential ATSS events in the real data, whose change points were identified using our method, were clustered according to their change points. Last, for each change point, pathway and transcription factor motif analyses were performed on its differential ATSS events. The results of our analyses demonstrated the effectiveness of the Bayesian change point detection model and provided biological insights into cell differentiation.
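
The change point idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single change point, a mean/precision reparameterization of the beta distribution (a = mu * phi, b = (1 - mu) * phi), a fixed precision phi, and it swaps MCMC sampling for an exhaustive grid over the segment means. All function names and default values are hypothetical.

```python
import math


def beta_logpdf(y, mu, phi):
    """Log density of Beta(a, b) with a = mu * phi and b = (1 - mu) * phi."""
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def log_marginal(segment, phi, mu_grid):
    """Log likelihood of one segment, averaged over a uniform grid prior on mu."""
    logs = [sum(beta_logpdf(y, mu, phi) for y in segment) for mu in mu_grid]
    top = max(logs)  # log-sum-exp trick for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in logs) / len(logs))


def change_point_posterior(ys, phi=20.0, n_grid=49):
    """Posterior over tau, the number of observations before the change.

    Assumes a uniform prior on tau and values strictly inside (0, 1).
    """
    mu_grid = [(i + 1) / (n_grid + 1) for i in range(n_grid)]
    taus = list(range(1, len(ys)))
    scores = [log_marginal(ys[:t], phi, mu_grid) + log_marginal(ys[t:], phi, mu_grid)
              for t in taus]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {t: w / total for t, w in zip(taus, weights)}


# Toy relative-usage series (fractions, not real data) with a shift after 5 points.
usage = [0.20, 0.22, 0.18, 0.21, 0.19, 0.70, 0.72, 0.68, 0.71, 0.69]
posterior = change_point_posterior(usage)
best_tau = max(posterior, key=posterior.get)
```

A grid keeps the example deterministic and short; a sampler such as the MCMC approach named in the abstract would be the natural choice once phi and further parameters are also treated as unknown.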


Subjects
Bayes Theorem, Cell Differentiation, Transcription Initiation Site, Cell Differentiation/genetics, Humans, Markov Chains, Monte Carlo Method, Genetic Models, Algorithms, Computer Simulation
3.
Aging Cell ; 23(8): e14200, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38757354

ABSTRACT

The sperm epigenome is thought to affect the developmental programming of the resulting embryo, influencing health and disease in later life. Age-related methylation changes in the sperm of old fathers may mediate the increased risks for reproductive and offspring medical problems. The impact of paternal age on sperm methylation has been extensively studied in humans and, to a lesser extent, in rodents and cattle. Here, we performed a comparative analysis of paternal age effects on protein-coding genes in the human and marmoset sperm methylomes. The marmoset has gained growing importance as a non-human primate model of aging and age-related diseases. Using reduced representation bisulfite sequencing, we identified age-related differentially methylated transcription start site (ageTSS) regions in 204 marmoset and 27 human genes. The direction of methylation changes was the opposite, increasing with age in marmosets and decreasing in humans. None of the identified ageTSS was differentially methylated in both species. Although the average methylation levels of all TSS regions were highly correlated between marmosets and humans, with the majority of TSS being hypomethylated in sperm, more than 300 protein-coding genes were endowed with species-specifically (hypo)methylated TSS. Several genes of the glycosphingolipid (GSL) biosynthesis pathway, which plays a role in embryonic stem cell differentiation and regulation of development, were hypomethylated (<5%) in human and fully methylated (>95%) in marmoset sperm. The expression levels and patterns of defined sets of GSL genes differed considerably between human and marmoset pre-implantation embryo stages and blastocyst tissues, respectively.


Subjects
Aging, Callithrix, DNA Methylation, Epigenome, Species Specificity, Spermatozoa, Animals, Callithrix/genetics, Male, DNA Methylation/genetics, Humans, Spermatozoa/metabolism, Aging/genetics, Transcription Initiation Site, Genetic Epigenesis
4.
BMC Genomics ; 25(1): 368, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622509

ABSTRACT

BACKGROUND: We recently developed two high-resolution methods for genome-wide mapping of two prominent types of DNA damage, single-strand DNA breaks (SSBs) and abasic (AP) sites, and found highly complex and non-random patterns of these lesions in mammalian genomes. One salient feature of SSB and AP sites was the existence of single-nucleotide hotspots for both lesions. RESULTS: In this work, we show that SSB hotspots are enriched in the immediate vicinity of transcriptional start sites (TSSs) in multiple normal mammalian tissues; however, the magnitude of enrichment varies significantly with tissue type and appears to be limited to a subset of genes. SSB hotspots around TSSs are enriched on the template strand and associate with higher expression of the corresponding genes. Interestingly, SSB hotspots appear to be at least in part generated by the base-excision repair (BER) pathway from the AP sites. CONCLUSIONS: Our results highlight a complex relationship between DNA damage and regulation of gene expression and suggest an exciting possibility that SSBs at TSSs might function as sensors of DNA damage to activate genes important for the DNA damage response.


Subjects
Single-Stranded DNA Breaks, DNA Repair, Animals, DNA Repair/genetics, DNA Damage, Single-Stranded DNA, Mammals
5.
Genes Genet Syst ; 99, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38447993

ABSTRACT

The budding yeast Saccharomyces cerevisiae is an excellent model organism for studying chromatin regulation with high-resolution genome-wide analyses. Since newly generated genome-wide data are often compared with publicly available datasets, expanding our dataset repertoire will be beneficial for the field. Information on transcription start sites (TSSs) determined at base pair resolution is essential for elucidating mechanisms of transcription and related chromatin regulation, yet no datasets that cover two different cell types are available. Here, we present a CAGE (cap analysis of gene expression) dataset for a-cells and α-cells grown in defined and rich media. Cell type-specific genes were differentially expressed as expected, ensuring the reliability of the data. Some of the differentially expressed TSSs were medium-specific or detected due to unrecognized chromosome rearrangement. By comparing the CAGE data with a high-resolution nucleosome map, major TSSs were primarily found in +1 nucleosomes, with a peak approximately 30 bp from the promoter-proximal end of the nucleosome. The dataset is available at DDBJ/GEA.


Subjects
Genome-Wide Association Study, Nucleosomes, Reproducibility of Results, Chromatin/metabolism, Saccharomyces cerevisiae/genetics
6.
Biochim Biophys Acta Gene Regul Mech ; 1867(2): 195021, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38417480

ABSTRACT

The lysine 4 of histone H3 (H3K4) can be methylated or acetylated into four states: H3K4me1, H3K4me2, H3K4me3, or H3K4ac. Unlike H3K4 methylation, the genome-wide distribution and functional roles of H3K4ac remain unclear. To understand the relationship of acetylation with methylation at H3K4 and to explore the roles of H3K4ac in the context of chromatin, we analyzed H3K4ac across the human genome and compared it with H3K4 methylation in K562 cells. H3K4ac was positively correlated with H3K4me1/2/3 in reciprocal analysis. A decrease in H3K4ac through the mutation of the histone acetyltransferase p300 reduced H3K4me1 and H3K4me3 at the H3K4ac peaks. H3K4ac was also impaired by H3K4me depletion in the histone methyltransferase MLL3/4-mutated cells. H3K4ac peaks were enriched at enhancers in addition to the transcription start sites (TSSs) of genes. H3K4ac of TSSs and enhancers was positively correlated with mRNA and eRNA transcription. A decrease in H3K4ac reduced H3K4me3 and H3K4me1 in TSSs and enhancers, respectively, and inhibited the eviction of histone H3 from them. The mRNA transcription of highly transcribed genes was affected by the reduced H3K4ac. Interestingly, H3K4ac played a redundant role with regard to H3K27ac in eRNA transcription. These results indicate that H3K4ac serves as a marker of both active TSSs and enhancers and plays a role in histone eviction and RNA transcription by leading to H3K4me1/3.


Subjects
Genetic Enhancer Elements, Histones, Transcription Initiation Site, Genetic Transcription, Histones/metabolism, Humans, K562 Cells, Acetylation, Methylation, Chromatin/metabolism, RNA/metabolism, RNA/genetics
7.
mBio ; 15(4): e0086123, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38411060

ABSTRACT

A member of the Retroviridae, human immunodeficiency virus type 1 (HIV-1), uses the RNA genome packaged into nascent virions to transfer genetic information to its progeny. The genome packaging step is a highly regulated and extremely efficient process as a vast majority of virus particles contain two copies of full-length unspliced HIV-1 RNA that form a dimer. Thus, during virus assembly HIV-1 can identify and selectively encapsidate HIV-1 unspliced RNA from an abundant pool of cellular RNAs and various spliced HIV-1 RNAs. Several "G" features facilitate the packaging of a dimeric RNA genome. The viral polyprotein Gag orchestrates virus assembly and mediates RNA genome packaging. During this process, Gag preferentially binds unpaired guanosines within the highly structured 5' untranslated region (UTR) of HIV-1 RNA. In addition, the HIV-1 unspliced RNA provides a scaffold that promotes Gag:Gag interactions and virus assembly, thereby ensuring its packaging. Intriguingly, recent studies have shown that the use of different guanosines at the junction of U3 and R as transcription start sites results in HIV-1 unspliced RNA species with 99.9% identical sequences but dramatically distinct 5' UTR conformations. Consequently, one species of unspliced RNA is preferentially packaged over other nearly identical RNAs. These studies reveal how conformations affect the functions of HIV-1 RNA elements and the complex regulation of HIV-1 replication. In this review, we summarize cis- and trans-acting elements critical for HIV-1 RNA packaging, locations of Gag:RNA interactions that mediate genome encapsidation, and the effects of transcription start sites on the structure and packaging of HIV-1 RNA.


Subjects
HIV-1, Humans, HIV-1/physiology, Viral RNA/metabolism, Virus Assembly, Viral Genome
8.
Trends Biochem Sci ; 49(2): 145-155, 2024 02.
Article in English | MEDLINE | ID: mdl-38218671

ABSTRACT

Eukaryotic transcription starts with the assembly of a preinitiation complex (PIC) on core promoters. Flanking this region is the +1 nucleosome, the first nucleosome downstream of the core promoter. While this nucleosome is rich in epigenetic marks and plays a key role in transcription regulation, how the +1 nucleosome interacts with the transcription machinery has been a long-standing question. Here, we summarize recent structural and functional studies of the +1 nucleosome in complex with the PIC. We specifically focus on how differently organized promoter-nucleosome templates affect the assembly of the PIC and PIC-Mediator on chromatin and result in distinct transcription initiation.


Subjects
Chromatin, Nucleosomes, Nucleosomes/genetics, Chromatin/genetics, Genetic Promoter Regions, Genetic Transcription, RNA Polymerase II/metabolism
9.
Geroscience ; 46(2): 2063-2081, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37817005

ABSTRACT

While some older adults stay healthy and non-frail late into life, others experience multimorbidity and frailty, often accompanied by a pro-inflammatory state. The molecular mechanisms underlying these differences are still obscure. Here, we used gene expression analysis to understand the molecular differences between non-frail and frail individuals in old age. Twenty-four adults (50% non-frail and 50% frail) from the InCHIANTI study were included. Total RNA extracted from whole blood was analyzed by Cap Analysis of Gene Expression (CAGE). CAGE identified transcription start site (TSS) and active enhancer regions. We identified a set of differentially expressed (DE) TSSs and enhancers between non-frail and frail participants and between male and female participants. Several DE TSSs were annotated as lncRNAs (XIST and TTTY14) and antisense RNAs (ZFX-AS1 and OVCH1 Antisense RNA 1). The promoter region chr6:36,678,654-36,678,797;+ was DE and overlapped the longevity gene CDKN1A. GWAS-LD enrichment analysis identified LD blocks that overlap the DE regions and carry traits reported in the GWAS Catalog (isovolumetric relaxation time and urinary tract infection frequency). Furthermore, we used weighted gene co-expression network analysis (WGCNA) to identify changes in gene expression associated with clinical traits and to identify key gene modules. We performed functional enrichment analysis of the gene modules with significant trait/module correlation. One gene module showed a very distinct pattern in hub genes. Glycogen Phosphorylase L (PYGL) was the top-ranked hub gene between non-frail and frail participants. We predicted transcription factor binding sites (TFBS) and motif activity. TFs involved in age-related pathways (e.g., FOXO3 and MYC) showed different expression patterns between non-frail and frail participants. Expanding the study of OVCH1 Antisense RNA 1 and PYGL may help in understanding the mechanisms leading to the loss of homeostasis that ultimately causes frailty.


Subjects
Frailty, Long Noncoding RNA, Humans, Male, Female, Aged, Frail Elderly, Frailty/genetics, Gene Expression Profiling, Long Noncoding RNA/genetics, Antisense RNA/genetics
10.
Am J Physiol Renal Physiol ; 326(3): F394-F410, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38153851

ABSTRACT

Nuclear factor of activated T cells 5 (NFAT5; also called TonEBP/OREBP) is a transcription factor that is activated by hypertonicity and induces osmoprotective genes to protect cells against hypertonic conditions. In the kidney, renal tubular NFAT5 is known to be involved in the urine concentration mechanism. Previous studies have suggested that NFAT5 modulates the immune system and exerts various effects on organ damage, depending on organ and disease states. Pathophysiological roles of NFAT5 in renal tubular cells, however, still remain obscure. We conducted comprehensive analysis by performing transcription start site (TSS) sequencing on the kidney of inducible and renal tubular cell-specific NFAT5 knockout (KO) mice. Mice were subjected to unilateral ureteral obstruction to examine the relevance of renal tubular NFAT5 in renal fibrosis. TSS sequencing analysis identified 722 downregulated TSSs and 1,360 upregulated TSSs, with log2 fold changes of ≤ -1.0 and ≥ 1.0, respectively. Those TSSs were annotated to 532 downregulated genes and 944 upregulated genes, respectively. Motif analysis showed that sequences that possibly bind to NFAT5 were enriched in TSSs of downregulated genes. Gene Ontology analysis with the upregulated genes suggested disorder of innate and adaptive immune systems in the kidney. Unilateral ureteral obstruction significantly exacerbated renal fibrosis in the renal medulla in KO mice compared with wild-type mice, accompanied by enhanced activation of immune responses. In conclusion, NFAT5 in renal tubules could have pathophysiological roles in renal fibrosis through modulating innate and adaptive immune systems in the kidney. NEW & NOTEWORTHY: TSS-Seq analysis of the kidney from renal tubular cell-specific NFAT5 KO mice uncovered novel genes that are possibly regulated by NFAT5 in the kidney under physiological conditions. The study further implied disorders of innate and adaptive immune systems in NFAT5 KO mice, thereby exacerbating renal fibrosis in pathological states. Our results may implicate renal tubular NFAT5 in the progression of renal fibrosis. Further studies would be worthwhile for the development of novel therapies to treat chronic kidney disease.
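
The thresholding step reported in the abstract above (log2 fold change of at most -1.0 for downregulated and at least 1.0 for upregulated TSSs) can be sketched as follows. The abstract does not say how counts were normalized or whether a pseudocount was used, so the pseudocount and the function names here are assumptions for illustration only.

```python
import math


def log2_fold_change(control, knockout, pseudocount=1.0):
    """log2(KO / control); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((knockout + pseudocount) / (control + pseudocount))


def classify_tss(control, knockout, threshold=1.0):
    """Label a TSS using the symmetric cutoffs quoted in the abstract."""
    lfc = log2_fold_change(control, knockout)
    if lfc <= -threshold:
        return "down"
    if lfc >= threshold:
        return "up"
    return "unchanged"


# Toy expression values, not data from the study.
labels = [classify_tss(7, 15), classify_tss(15, 7), classify_tss(10, 12)]
```

A cutoff of 1.0 on the log2 scale corresponds to at least a doubling or a halving of expression between conditions.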


Subjects
Ureteral Obstruction, Animals, Mice, Fibrosis, Gene Expression, Kidney, Knockout Mice
12.
Comput Struct Biotechnol J ; 21: 4887-4894, 2023.
Article in English | MEDLINE | ID: mdl-37860228

ABSTRACT

Mutations and gene expression are the two most studied genomic features in cancer research. In the last decade, the combined advances in genomic technology and computational algorithms have broadened mutation research with the concept of mutation density and expanded the traditional scope of protein-coding RNA to noncoding RNAs. However, mutation density analysis had yet to be integrated with noncoding RNAs. In this study, we examined long noncoding RNA (lncRNA) mutation density patterns of 57 unique cancer types using 80 cancer cohorts. Our analysis revealed that lncRNAs exhibit mutation density patterns reminiscent of those of protein-coding mRNAs. These patterns include a mutation peak and dip around the transcription start sites of lncRNAs. In many cohorts, these patterns corresponded to statistically significant transcription strand bias, and the transcription strand bias was shared between lncRNAs and mRNAs. We further quantified transcription strand biases with a log odds ratio metric and showed that some of these biases are associated with patient prognosis. The prognostic effect may arise from strong transcription-coupled repair mechanisms associated with the individual patient. For the first time, our study combined mutational density patterns with lncRNA mutations, and the results demonstrated remarkably comparable patterns between protein-coding mRNA and lncRNA, further illustrating lncRNA's potential roles in cancer research.
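
The abstract above mentions a log odds ratio metric for transcription strand bias without defining it. One plausible form, shown here purely as an assumption, compares the odds of a site being mutated on the transcribed strand with the odds on the untranscribed strand, using a 0.5 continuity correction so that zero counts do not break the logarithm. The function name and arguments are hypothetical.

```python
import math


def strand_log_odds_ratio(mut_transcribed, sites_transcribed,
                          mut_untranscribed, sites_untranscribed,
                          correction=0.5):
    """Log odds ratio of mutation on the transcribed vs. untranscribed strand.

    Positive values mean relatively more mutations on the transcribed strand.
    The 2x2 layout and the continuity correction are illustrative assumptions.
    """
    a = mut_transcribed + correction
    b = (sites_transcribed - mut_transcribed) + correction
    c = mut_untranscribed + correction
    d = (sites_untranscribed - mut_untranscribed) + correction
    return math.log((a * d) / (b * c))


# Toy counts, not values from the study.
biased = strand_log_odds_ratio(30, 1000, 10, 1000)
reverse = strand_log_odds_ratio(10, 1000, 30, 1000)
balanced = strand_log_odds_ratio(20, 1000, 20, 1000)
```

The metric is antisymmetric: swapping the two strands flips the sign, and equal mutation rates give a value of zero.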

13.
Curr Biol ; 33(20): 4381-4391.e3, 2023 10 23.
Article in English | MEDLINE | ID: mdl-37729909

ABSTRACT

Noncoding polymorphism frequently associates with phenotypic variation, but causation and mechanism are rarely established. Noncoding single-nucleotide polymorphisms (SNPs) characterize the major haplotypes of the Arabidopsis thaliana floral repressor gene FLOWERING LOCUS C (FLC). This noncoding polymorphism generates a range of FLC expression levels, determining the requirement for and the response to winter cold. The major adaptive determinant of these FLC haplotypes was shown to be the autumnal levels of FLC expression. Here, we investigate how noncoding SNPs influence FLC transcriptional output. We identify an upstream transcription start site (uTSS) cluster at FLC, whose usage is increased by an A variant at the promoter SNP-230. This variant is present in relatively few Arabidopsis accessions, with the majority containing G at this site. We demonstrate a causal role for the A variant at -230 in reduced FLC transcriptional output. The G variant upregulates FLC expression redundantly with the major transcriptional activator FRIGIDA (FRI). We demonstrate an additive interaction of SNP-230 with an intronic SNP+259, which also differentially influences uTSS usage. Combinatorial interactions between noncoding SNPs and transcriptional activators thus generate quantitative variation in FLC transcription that has facilitated the adaptation of Arabidopsis accessions to distinct climates.


Subjects
Arabidopsis Proteins, Arabidopsis, Arabidopsis/metabolism, Arabidopsis Proteins/metabolism, MADS Domain Proteins/genetics, MADS Domain Proteins/metabolism, Flowers/physiology, Transcription Factors/metabolism, Single Nucleotide Polymorphism, Plant Gene Expression Regulation
14.
Genome Biol ; 24(1): 213, 2023 09 20.
Article in English | MEDLINE | ID: mdl-37730643

ABSTRACT

In birds, sex is genetically determined; however, the molecular mechanism is not well-understood. The avian Z sex chromosome (chrZ) lacks whole chromosome inactivation, in contrast to the mammalian chrX. To investigate chrZ dosage compensation and its role in sex specification, we use a highly quantitative method and analyze transcriptional activities of male and female fibroblast cells from seven bird species. Our data indicate that three fourths of chrZ genes are strictly compensated across Aves, similar to mammalian chrX. We also present a complete list of non-compensated chrZ genes and identify Ribosomal Protein S6 (RPS6) as a conserved sex-dimorphic gene in birds.


Subjects
Genetic Epigenesis, Sex Chromosomes, Animals, Female, Male, Sex Chromosomes/genetics, Birds/genetics, Fibroblasts, Mammals
15.
Trends Biochem Sci ; 48(10): 839-848, 2023 10.
Article in English | MEDLINE | ID: mdl-37574371

ABSTRACT

Core promoters are sites where transcriptional regulatory inputs of a gene are integrated to direct the assembly of the preinitiation complex (PIC) and RNA polymerase II (Pol II) transcription output. Until now, core promoter functions have been investigated by distinct methods, including Pol II transcription initiation site mappings and structural characterization of PICs on distinct promoters. Here, we bring together these previously unconnected observations and hypothesize how, on metazoan TATA promoters, the precisely structured building up of transcription factor (TF) IID-based PICs results in sharp transcription start site (TSS) selection; or, in contrast, how the less strictly controlled positioning of the TATA-less promoter DNA relative to TFIID-core PIC components results in alternative broad TSS selections by Pol II.


Subjects
Transcription Factor TFIID, Genetic Transcription, Animals, Transcription Factor TFIID/genetics, Transcription Factor TFIID/metabolism, TATA Box, Genetic Promoter Regions, RNA Polymerase II/metabolism
16.
Comput Biol Chem ; 105: 107904, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37327560

ABSTRACT

MOTIVATION: Computational promoter prediction (CPP) tools designed to classify prokaryotic promoter regions usually assume that a transcription start site (TSS) is located at a predefined position within each promoter region. Such CPP tools are sensitive to any positional shifting of the TSS in a windowed region, and they are unsuitable for determining the boundaries of prokaryotic promoters. RESULTS: TSSUNet-MB is a deep learning model developed to identify the TSSs of σ70 promoters. Mononucleotide and bendability were used to encode input sequences. TSSUNet-MB outperforms other CPP tools when assessed using the sequences obtained from the neighborhood of real promoters. TSSUNet-MB achieved a sensitivity of 0.839 and specificity of 0.768 on sliding sequences, while other CPP tools cannot maintain both sensitivity and specificity in a comparable range. Furthermore, TSSUNet-MB can precisely predict the TSS position of σ70 promoter-containing regions with a 10-base accuracy of 77.6%. By leveraging the sliding window scanning approach, we further computed the confidence score of each predicted TSS, which allows for more accurate identification of TSS locations. Our results suggest that TSSUNet-MB is a robust tool for finding σ70 promoters and identifying TSSs.
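
Two ingredients named in the abstract above, mononucleotide encoding and sliding window scanning, can be sketched in a few lines. This is not TSSUNet-MB itself: the bendability channel is omitted because its parameter table is not given in the abstract, and "10-base accuracy" is interpreted here, as an assumption, as the fraction of predictions within 10 bases of the true TSS. All names are hypothetical.

```python
# One-hot (mononucleotide) code; unknown bases map to an all-zero vector.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def encode(seq):
    """Encode a DNA string as a list of 4-element one-hot tuples."""
    return [ONE_HOT.get(base, (0, 0, 0, 0)) for base in seq.upper()]


def sliding_windows(seq, width, step=1):
    """Yield (start, subsequence) pairs covering seq with fixed-width windows."""
    for start in range(0, len(seq) - width + 1, step):
        yield start, seq[start:start + width]


def k_base_accuracy(predicted, actual, k=10):
    """Fraction of predicted TSS positions within k bases of the true position."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= k)
    return hits / len(actual)


windows = list(sliding_windows("ACGTAC", 4))
encoded = encode("AC")
accuracy = k_base_accuracy([100, 250], [105, 200])
```

In a scanning setup, each window would be encoded and scored by a model, and the per-position scores from overlapping windows could then be combined into a confidence value for each candidate TSS.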


Subjects
Escherichia coli, Transcription Initiation Site, Genetic Promoter Regions/genetics, Escherichia coli/genetics
17.
PeerJ Comput Sci ; 9: e1340, 2023.
Article in English | MEDLINE | ID: mdl-37346545

ABSTRACT

Recognizing transcription start sites is key to gene identification. Several approaches have been employed in related problems such as detecting translation initiation sites or promoters, many of the most recent ones based on machine learning. Deep learning methods have proven exceptionally effective for this task, but their use in transcription start site identification has not yet been explored in depth. Also, the very few existing works neither compare their methods to support vector machines (SVMs), the most established technique in this area of study, nor provide the curated dataset used in the study. The small number of published papers on this specific problem could be explained by this lack of datasets. Given that both support vector machines and deep neural networks have been applied in related problems with remarkable results, we compared their performance in transcription start site prediction, concluding that SVMs are computationally much slower and that deep learning methods, especially long short-term memory neural networks (LSTMs), are better suited to work with sequences than SVMs. For this purpose, we used the reference human genome GRCh38. Additionally, we studied two different aspects related to data processing: the proper way to generate training examples and the imbalanced nature of the data. Furthermore, the generalization performance of the models studied was also tested using the mouse genome, where the LSTM neural network stood out from the rest of the algorithms. To sum up, this article provides an analysis of the best architecture choices in transcription start site identification, as well as a method to generate transcription start site datasets including negative instances for any species available in Ensembl. We found that deep learning methods are better suited than SVMs to solve this problem, being more efficient and better adapted to long sequences and large amounts of data. We also created a transcription start site (TSS) dataset large enough to be used in deep learning experiments.
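
The two data-processing questions raised in the abstract above, how to generate training examples and how to handle class imbalance, can be illustrated with a small sketch. The abstract does not describe the authors' sampling scheme, so the approach below (fixed-width windows centered on annotated TSSs as positives, randomly placed windows as negatives, with a configurable negative-to-positive ratio) is an assumption, and every name in it is hypothetical.

```python
import random


def make_examples(genome, tss_positions, flank=5, negatives_per_positive=2, seed=0):
    """Build (window, label) pairs: label 1 for TSS-centered windows, 0 otherwise.

    Windows span `flank` bases on each side of the center position. Negative
    centers are sampled without replacement from positions that are not TSSs.
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    tss = set(tss_positions)
    examples = []
    for pos in sorted(tss):
        if pos - flank < 0 or pos + flank >= len(genome):
            continue  # skip TSSs too close to the sequence ends
        examples.append((genome[pos - flank:pos + flank + 1], 1))
    candidates = [p for p in range(flank, len(genome) - flank) if p not in tss]
    n_neg = min(len(candidates), negatives_per_positive * len(examples))
    for pos in rng.sample(candidates, n_neg):
        examples.append((genome[pos - flank:pos + flank + 1], 0))
    return examples


# Toy sequence and TSS coordinates, not real annotation.
toy_genome = "ACGT" * 10
examples = make_examples(toy_genome, [10, 20], flank=3, negatives_per_positive=2)
```

Raising `negatives_per_positive` moves the class balance toward the strong imbalance found in real genomes, where positions that are not TSSs vastly outnumber those that are.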

18.
Exp Eye Res ; 233: 109520, 2023 08.
Article in English | MEDLINE | ID: mdl-37236522

ABSTRACT

More than half of mammalian protein-coding genes have multiple transcription start sites. Alternative transcription start sites (TSSs) modulate mRNA stability, localization, and translation efficiency at the post-transcriptional level, and can even generate novel protein isoforms. However, differential TSS usage among cell types in healthy and diabetic retina remains poorly characterized. In this study, using 5'-tag-based single-cell RNA sequencing, we identified cell type-specific alternative TSS events and key transcription factors for each retinal cell type. We observed that lengthened 5' UTRs in retinal cell types are enriched for multiple RNA-binding protein binding sites, including those of the splicing regulators Rbfox1/2/3 and Nova1. Furthermore, by comparing TSS expression between healthy and diabetic retina, we identified an elevated apoptosis signal in Müller glia and microglia, which may serve as a putative early sign of diabetic retinopathy. By measuring 5' UTR isoforms in a retinal single-cell dataset, our work provides a comprehensive panorama of alternative TSSs and their potential consequences for post-transcriptional regulation. We anticipate that our assay can not only provide insights into cellular heterogeneity driven by transcriptional initiation, but also open up perspectives for the identification of novel diagnostic indexes for diabetic retinopathy.


Subjects
Diabetes Mellitus, Diabetic Retinopathy, Animals, Transcription Initiation Site, Diabetic Retinopathy/genetics, Retina, Transcription Factors/genetics, Protein Isoforms/genetics, Mammals
19.
G3 (Bethesda) ; 13(8), 2023 08 09.
Article in English | MEDLINE | ID: mdl-37216666

ABSTRACT

Understanding the genomic control of tissue-specific gene expression and regulation can help to inform the application of genomic technologies in farm animal breeding programs. The fine mapping of promoters [transcription start sites (TSS)] and enhancers (divergent amplifying segments of the genome local to TSS) in different populations of cattle across a wide diversity of tissues provides information to locate and understand the genomic drivers of breed- and tissue-specific characteristics. To this aim, we used Cap Analysis Gene Expression (CAGE) sequencing, of 24 different tissues from 3 populations of cattle, to define TSS and their coexpressed short-range enhancers (<1 kb) in the ARS-UCD1.2_Btau5.0.1Y reference genome (1000bulls run9) and analyzed tissue and population specificity of expressed promoters. We identified 51,295 TSS and 2,328 TSS-Enhancer regions shared across the 3 populations (dairy, beef-dairy cross, and Canadian Kinsella composite cattle from 2 individuals, 1 of each sex, per population). Cross-species comparative analysis of CAGE data from 7 other species, including sheep, revealed a set of TSS and TSS-Enhancers that were specific to cattle. The CAGE data set will be combined with other transcriptomic information for the same tissues to create a new high-resolution map of transcript diversity across tissues and populations in cattle for the BovReg project. Here we provide the CAGE data set and annotation tracks for TSS and TSS-Enhancers in the cattle genome. This new annotation information will improve our understanding of the drivers of gene expression and regulation in cattle and help to inform the application of genomic technologies in breeding programs.


Subjects
Domestic Animals, Genomics, Animals, Cattle/genetics, Sheep, Transcription Initiation Site, Canada, Transcriptome
20.
Proc Natl Acad Sci U S A ; 120(23): e2305103120, 2023 06 06.
Article in English | MEDLINE | ID: mdl-37252967

ABSTRACT

HIV-1 relies on host RNA polymerase II (Pol II) to transcribe its genome and uses multiple transcription start sites (TSS), including three consecutive guanosines located near the U3-R junction, to generate transcripts containing three, two, and one guanosine at the 5' end, referred to as 3G, 2G, and 1G RNA, respectively. The 1G RNA is preferentially selected for packaging, indicating that these 99.9% identical RNAs exhibit functional differences and highlighting the importance of TSS selection. Here, we demonstrate that TSS selection is regulated by sequences between the CATA/TATA box and the beginning of R. Furthermore, we have generated two HIV-1 mutants with distinct 2-nucleotide modifications that predominantly express 3G RNA or 1G RNA. Both mutants can generate infectious viruses and undergo multiple rounds of replication in T cells. However, both mutants exhibit replication defects compared to the wild-type virus. The 3G-RNA-expressing mutant displays an RNA genome-packaging defect and delayed replication kinetics, whereas the 1G-RNA-expressing mutant exhibits reduced Gag expression and a replication fitness defect. Additionally, reversion of the latter mutant is frequently observed, consistent with sequence correction by plus-strand DNA transfer during reverse transcription. These findings demonstrate that HIV-1 maximizes its replication fitness by usurping the TSS heterogeneity of host RNA Pol II to generate unspliced RNAs with different specialized roles in viral replication. The three consecutive guanosines at the junction of U3 and R may also maintain HIV-1 genome integrity during reverse transcription. These studies reveal the intricate regulation of HIV-1 RNA and its complex replication strategy.


Subjects
HIV-1, RNA Polymerase II, RNA Polymerase II/genetics, RNA Polymerase II/metabolism, HIV-1/physiology, Transcription Initiation Site, Viral RNA/genetics, Viral RNA/metabolism, Virus Replication/genetics