Results 1 - 20 of 72
1.
Cell ; 185(16): 3025-3040.e6, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35882231

ABSTRACT

Non-allelic recombination between homologous repetitive elements contributes to evolution and human genetic disorders. Here, we combine short- and long-DNA read sequencing of repeat elements with a new bioinformatics pipeline to show that somatic recombination of Alu and L1 elements is widespread in the human genome. Our analysis uncovers tissue-specific non-allelic homologous recombination hallmarks; moreover, we find that centromeres and cancer-associated genes are enriched for retroelements that may act as recombination hotspots. We compare recombination profiles in human-induced pluripotent stem cells and differentiated neurons and find that the neuron-specific recombination of repeat elements accompanies chromatin changes during cell-fate determination. Finally, we report that somatic recombination profiles are altered in Parkinson's and Alzheimer's disease, suggesting a link between retroelement recombination and genomic instability in neurodegeneration. This work highlights a significant contribution of the somatic recombination of repeat elements to genomic diversity in health and disease.


Subjects
Genome, Human , Retroelements , Alu Elements/genetics , Homologous Recombination , Humans , Long Interspersed Nucleotide Elements , Repetitive Sequences, Nucleic Acid
2.
Genome Res ; 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-35961773

ABSTRACT

In eukaryotes, capped RNAs include long transcripts such as messenger RNAs and long noncoding RNAs, as well as shorter transcripts such as spliceosomal RNAs, small nucleolar RNAs, and enhancer RNAs. Long capped transcripts can be profiled using cap analysis gene expression (CAGE) sequencing and other methods. Here, we describe a sequencing library preparation protocol for short capped RNAs, apply it to a differentiation time course of the human cell line THP-1, and systematically compare the landscape of short capped RNAs to that of long capped RNAs. Transcription initiation peaks associated with genes in the sense direction have a strong preference to produce either long or short capped RNAs, with one out of six peaks detected in the short capped RNA libraries only. Gene-associated short capped RNAs have highly specific 3' ends, typically overlapping splice sites. Enhancers also preferentially generate either short or long capped RNAs, with 10% of enhancers observed in the short capped RNA libraries only. Enhancers producing either short or long capped RNAs show enrichment for GWAS-associated disease SNPs. We conclude that deep sequencing of short capped RNAs reveals new families of noncoding RNAs and elucidates the diversity of transcripts generated at known and novel promoters and enhancers.

3.
BMC Genomics ; 24(1): 574, 2023 Sep 27.
Article in English | MEDLINE | ID: mdl-37759202

ABSTRACT

BACKGROUND: Super-enhancers (SEs), which activate genes involved in cell-type specificity, have mainly been defined as genomic regions with top-ranked enrichment(s) of histone H3 with acetylated K27 (H3K27ac) and/or transcription coactivator(s) including a bromodomain and extra-terminal domain (BET) family protein, BRD4. However, BRD4 preferentially binds to multi-acetylated histone H4, typically with acetylated K5 and K8 (H4K5acK8ac), leading us to hypothesize that SEs should be defined by high H4K5acK8ac enrichment at least as well as by that of H3K27ac. RESULTS: Here, we conducted genome-wide profiling of H4K5acK8ac and H3K27ac, BRD4 binding, and the transcriptome by using a BET inhibitor, JQ1, in three human glial cell lines. When SEs were defined as having the top ranks for H4K5acK8ac or H3K27ac signal, 43% of H4K5acK8ac-ranked SEs were distinct from H3K27ac-ranked SEs in a glioblastoma stem-like cell (GSC) line. CRISPR-Cas9-mediated deletion of the H4K5acK8ac-preferred SEs associated with MYCN and NFIC decreased the stem-like properties in GSCs. CONCLUSIONS: Collectively, our data highlight the utility of H4K5acK8ac for identifying genes regulating cell-type specificity.


Subjects
Glioblastoma , Transcription Factors , Humans , Transcription Factors/metabolism , Histones/metabolism , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Glioblastoma/genetics , Acetylation , Cell Cycle Proteins/genetics , Cell Cycle Proteins/metabolism
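
The comparison in entry 3 (rank regions by each histone mark, take the top-ranked ones as SEs, then ask what fraction of one set is absent from the other) can be sketched as follows. This is only an illustration under stated assumptions: the region identifiers and signal values are made up, a fixed top-n cutoff stands in for whatever threshold the authors used, and regions are matched by identifier rather than by genomic overlap. It does not reproduce the 43% figure reported in the abstract.

```python
def top_ranked(signal_by_region, n):
    """Return the set of the n regions with the highest signal."""
    ranked = sorted(signal_by_region, key=signal_by_region.get, reverse=True)
    return set(ranked[:n])


def fraction_distinct(set_a, set_b):
    """Fraction of set_a that is absent from set_b."""
    if not set_a:
        return 0.0
    return len(set_a - set_b) / len(set_a)


# Hypothetical per-region signal (arbitrary units); not real data.
h4k5ack8ac = {"r1": 90, "r2": 80, "r3": 10, "r4": 70, "r5": 5}
h3k27ac = {"r1": 85, "r2": 15, "r3": 75, "r4": 60, "r5": 8}

se_h4 = top_ranked(h4k5ack8ac, 3)  # regions ranked as SEs by H4K5acK8ac
se_h3 = top_ranked(h3k27ac, 3)     # regions ranked as SEs by H3K27ac
distinct = fraction_distinct(se_h4, se_h3)
```

With these toy values, one of the three H4K5acK8ac-ranked regions (r2) is missing from the H3K27ac-ranked set.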
4.
Genome Res ; 30(7): 1073-1081, 2020 07.
Article in English | MEDLINE | ID: mdl-32079618

ABSTRACT

Long noncoding RNAs (lncRNAs) have emerged as key coordinators of biological and cellular processes. Characterizing lncRNA expression across cells and tissues is key to understanding their role in determining phenotypes, including human diseases. We present here FC-R2, a comprehensive expression atlas across a broadly defined human transcriptome, inclusive of over 109,000 coding and noncoding genes, as described in the FANTOM CAGE-Associated Transcriptome (FANTOM-CAT) study. This atlas greatly extends the gene annotation used in the original recount2 resource. We demonstrate the utility of the FC-R2 atlas by reproducing key findings from published large studies and by generating new results across normal and diseased human samples. In particular, we (a) identify tissue-specific transcription profiles for distinct classes of coding and noncoding genes, (b) perform differential expression analysis across thirteen cancer types, identifying novel noncoding genes potentially involved in tumor pathogenesis and progression, and (c) confirm the prognostic value of the expression of several enhancer lncRNAs in cancer. Our resource is instrumental for the systematic molecular characterization of lncRNA by the FANTOM6 Consortium. In conclusion, comprising over 70,000 samples, the FC-R2 atlas will empower other researchers to investigate functions and biological roles of both known coding genes and novel lncRNAs.


Subjects
Transcriptome , Databases, Genetic , Enhancer Elements, Genetic , Gene Expression Profiling , Genome, Human , Humans , Neoplasms/genetics , Organ Specificity , Prognosis , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/metabolism
5.
Bioinformatics ; 38(22): 5126-5128, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36173306

ABSTRACT

MOTIVATION: Cell type-specific activities of cis-regulatory elements (CRE) are central to understanding gene regulation and disease predisposition. Single-cell RNA 5'end sequencing (sc-end5-seq) captures the transcription start sites (TSS) which can be used as a proxy to measure the activity of transcribed CREs (tCREs). However, a substantial fraction of TSS identified from sc-end5-seq data may not be genuine due to various artifacts, hindering the use of sc-end5-seq for de novo discovery of tCREs. RESULTS: We developed SCAFE (Single-Cell Analysis of Five-prime Ends), a software suite that processes sc-end5-seq data to de novo identify TSS clusters based on multiple logistic regression. It annotates tCREs based on the identified TSS clusters and generates a tCRE-by-cell count matrix for downstream analyses. The software suite consists of a set of flexible tools that could either be run independently or as pre-configured workflows. AVAILABILITY AND IMPLEMENTATION: SCAFE is implemented in Perl and R. The source code and documentation are freely available for download under the MIT License from https://github.com/chung-lab/SCAFE. Docker images are available from https://hub.docker.com/r/cchon/scafe. The submitted software version and test data are archived at https://doi.org/10.5281/zenodo.7023163 and https://doi.org/10.5281/zenodo.7024060, respectively. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Regulatory Sequences, Nucleic Acid , Software , Workflow , Transcription Initiation Site
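
Entry 5 states that TSS clusters are identified with multiple logistic regression. The sketch below shows that general technique only; it is not SCAFE's code (SCAFE is implemented in Perl and R), and the two features, the training values, and the function names are all hypothetical. It fits a logistic model by plain batch gradient descent and scores each cluster with a probability of being genuine.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, features):
    """Probability that a TSS cluster is genuine; weights[0] is the bias."""
    score = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(score)


def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit a multiple logistic regression by batch gradient descent on the mean log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict(weights, features) - label
            grad[0] += error
            for j, x in enumerate(features):
                grad[j + 1] += error * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical features per TSS cluster: [log10 read count, fraction of reads
# carrying some artifact-related signature]. Labels: 1 = genuine, 0 = artifact.
# Illustrative values only, not taken from the paper.
rows = [[3.0, 0.9], [2.5, 0.8], [2.8, 0.7], [0.5, 0.1], [1.0, 0.2], [0.8, 0.3]]
labels = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(rows, labels)
probabilities = [predict(weights, features) for features in rows]
```

In practice one would use an established statistics package rather than a hand-written optimizer; the loop is spelled out here only to make the model explicit.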
6.
Biochem Soc Trans ; 51(5): 1975-1988, 2023 10 31.
Article in English | MEDLINE | ID: mdl-37830459

ABSTRACT

Enhancers are genomic regions that regulate gene transcription and are located far away from the transcription start sites of their target genes. Enhancers are highly enriched in disease-associated variants and thus deciphering the interactions between enhancers and genes is crucial to understanding the molecular basis of genetic predispositions to diseases. Experimental validations of enhancer targets can be laborious. Computational methods have thus emerged as a valuable alternative for studying enhancer-gene interactions. A variety of computational methods have been developed to predict enhancer targets by incorporating genomic features (e.g. conservation, distance, and sequence), epigenomic features (e.g. histone marks and chromatin contacts) and activity measurements (e.g. covariations of enhancer activity and gene expression). With the recent advances in genome perturbation and chromatin conformation capture technologies, data on experimentally validated enhancer targets are becoming available for supervised training of these methods and evaluation of their performance. In this review, we categorize enhancer target prediction methods based on their rationales and approaches. Then we discuss their merits and limitations and highlight the future directions for enhancer targets prediction.


Subjects
Enhancer Elements, Genetic , Histones , Histones/metabolism , Chromatin , Genomics/methods , Epigenomics
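
One of the rationales named in entry 6, covariation of enhancer activity and gene expression combined with genomic distance, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the 1 Mb distance cutoff, and the sample values are hypothetical and do not correspond to any specific method covered by the review.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_targets(enhancer_activity, gene_expression, distances, max_distance=1_000_000):
    """Rank genes within max_distance (bp) of the enhancer by activity-expression correlation."""
    scored = [
        (gene, pearson(enhancer_activity, expression))
        for gene, expression in gene_expression.items()
        if distances[gene] <= max_distance
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical activity/expression across five samples; not real data.
enhancer = [1, 2, 3, 4, 5]
genes = {"geneA": [2, 4, 6, 8, 10], "geneB": [5, 4, 3, 2, 1], "geneC": [1, 3, 2, 5, 4]}
distance_bp = {"geneA": 50_000, "geneB": 200_000, "geneC": 2_000_000}

ranked = rank_targets(enhancer, genes, distance_bp)
```

Here geneA ranks first because its expression rises with enhancer activity, geneB ranks last because it is anti-correlated, and geneC is excluded by the distance filter.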
7.
Nature ; 543(7644): 199-204, 2017 03 09.
Article in English | MEDLINE | ID: mdl-28241135

ABSTRACT

Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.


Subjects
Databases, Genetic , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/genetics , Transcriptome/genetics , Cells, Cultured , Conserved Sequence/genetics , Datasets as Topic , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Gene Expression Profiling , Gene Expression Regulation , Genome, Human/genetics , Genome-Wide Association Study , Genomics , Humans , Internet , Molecular Sequence Annotation , Organ Specificity/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Quantitative Trait Loci/genetics , RNA Stability , RNA, Messenger/genetics
8.
Nucleic Acids Res ; 49(D1): D892-D898, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33211864

ABSTRACT

The Functional ANnoTation Of the Mammalian genome (FANTOM) Consortium has continued to provide extensive resources in the pursuit of understanding the transcriptome, and transcriptional regulation, of mammalian genomes for the last 20 years. To share these resources with the research community, the FANTOM web-interfaces and databases are being regularly updated, enhanced and expanded with new data types. In recent years, the FANTOM Consortium's efforts have been mainly focused on creating new non-coding RNA datasets and resources. The existing FANTOM5 human and mouse miRNA atlas was supplemented with rat, dog, and chicken datasets. The sixth (latest) edition of the FANTOM project was launched to assess the function of human long non-coding RNAs (lncRNAs). From its creation until 2020, FANTOM6 has contributed to the research community a large dataset generated from the knock-down of 285 lncRNAs in human dermal fibroblasts; this is followed by extensive expression profiling and cellular phenotyping. Other updates to the FANTOM resource include the reprocessing of the miRNA and promoter atlases of human, mouse and chicken with the latest reference genome assemblies. To facilitate the use and accessibility of all the above resources, we further enhanced FANTOM data viewers and web interfaces. The updated FANTOM web resource is publicly available at https://fantom.gsc.riken.jp/.


Subjects
Molecular Sequence Annotation , RNA, Long Noncoding/genetics , Transcriptome/genetics , Animals , Binding Sites , Chromatin/metabolism , Drosophila/genetics , Fibroblasts/cytology , Fibroblasts/metabolism , Genome , Humans , Metadata , Mice , MicroRNAs/genetics , MicroRNAs/metabolism , Promoter Regions, Genetic , RNA, Long Noncoding/metabolism , Transcription Factors/metabolism , User-Computer Interface
10.
Nucleic Acids Res ; 47(D1): D752-D758, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407557

ABSTRACT

The FANTOM web resource (http://fantom.gsc.riken.jp/) was developed to provide easy access to the data produced by the FANTOM project. It contains the most complete and comprehensive sets of actively transcribed enhancers and promoters in the human and mouse genomes. We determined the transcription activities of these regulatory elements by CAGE (Cap Analysis of Gene Expression) for both steady and dynamic cellular states in all major and some rare cell types, consecutive stages of differentiation and responses to stimuli. We have expanded the resource by employing different assays, such as RNA-seq, short RNA-seq and a paired-end protocol for CAGE (CAGEscan), to provide new angles to study the transcriptome. That yielded additional atlases of long noncoding RNAs, miRNAs and their promoters. We have also expanded the CAGE analysis to cover rat, dog, chicken, and macaque species for a limited number of cell types. The CAGE data obtained from human and mouse were reprocessed to make them available on the latest genome assemblies. Here, we report the recent updates of both data and interfaces in the FANTOM web resource.


Subjects
Databases, Genetic , Genome/genetics , Internet , Transcriptome/genetics , Animals , Cell Differentiation/genetics , Chickens/genetics , Dogs , Gene Expression Regulation/genetics , Genomics/trends , Humans , Mice , MicroRNAs/genetics , Promoter Regions, Genetic/genetics , RNA, Long Noncoding/genetics , Rats , User-Computer Interface
11.
BMC Genomics ; 21(1): 766, 2020 Nov 04.
Article in English | MEDLINE | ID: mdl-33148170

ABSTRACT

BACKGROUND: Protein Disulfide Isomerases are thiol oxidoreductase chaperones from the thioredoxin superfamily with crucial roles in endoplasmic reticulum proteostasis, implicated in many diseases. The family prototype PDIA1 is also involved in vascular redox cell signaling. PDIA1 is coded by the P4HB gene. While forced changes in P4HB gene expression promote physiological effects, little is known about endogenous P4HB gene regulation and, in particular, gene modulation by alternative splicing. This study addressed the P4HB splice variant landscape. RESULTS: Ten protein coding sequences (Ensembl) of the P4HB gene originating from alternative splicing were characterized. Structural features suggest that except for P4HB-021, other splice variants are unlikely to exert thiol isomerase activity at the endoplasmic reticulum. Extensive analyses using FANTOM5, ENCODE Consortium and GTEx project databases as RNA-seq data sources were performed. These indicated widespread expression but significant variability in the degree of isoform expression among distinct tissues and even among distinct locations of the same cell, e.g., vascular smooth muscle cells from different origins. P4HB-02, P4HB-027 and P4HB-021 were relatively more expressed across each database, the latter particularly in vascular smooth muscle. Expression of such variants was validated by qRT-PCR in some cell types. The most consistently expressed splice variant was P4HB-021 in human mammary artery vascular smooth muscle which, together with the canonical P4HB gene, had its expression enhanced by serum starvation. CONCLUSIONS: Our study details the splice variant landscape of the P4HB gene, indicating their potential role to diversify the functional reach of this crucial gene. The P4HB-021 splice variant deserves further investigation in vascular smooth muscle cells.


Subjects
Procollagen-Proline Dioxygenase , Protein Disulfide-Isomerases , Endoplasmic Reticulum/genetics , Endoplasmic Reticulum/metabolism , Humans , Mutation , Procollagen-Proline Dioxygenase/genetics , Procollagen-Proline Dioxygenase/metabolism , Protein Disulfide-Isomerases/genetics , Signal Transduction
12.
Nature ; 513(7518): 431-5, 2014 Sep 18.
Article in English | MEDLINE | ID: mdl-25043062

ABSTRACT

Antigenic variation of the Plasmodium falciparum multicopy var gene family enables parasite evasion of immune destruction by host antibodies. Expression of a particular var subgroup, termed upsA, is linked to the obstruction of blood vessels in the brain and to the pathogenesis of human cerebral malaria. The mechanism determining upsA activation remains unknown. Here we show that an entirely new type of gene silencing mechanism involving an exonuclease-mediated degradation of nascent RNA controls the silencing of genes linked to severe malaria. We identify a novel chromatin-associated exoribonuclease, termed PfRNase II, that controls the silencing of upsA var genes by marking their transcription start site and intron-promoter regions leading to short-lived cryptic RNA. Parasites carrying a deficient PfRNase II gene produce full-length upsA var transcripts and intron-derived antisense long non-coding RNA. The presence of stable upsA var transcripts overcomes monoallelic expression, resulting in the simultaneous expression of both upsA and upsC type PfEMP1 proteins on the surface of individual infected red blood cells. In addition, we observe an inverse relationship between transcript levels of PfRNase II and upsA-type var genes in parasites from severe malaria patients, implying a crucial role of PfRNase II in severe malaria. Our results uncover a previously unknown type of post-transcriptional gene silencing mechanism in malaria parasites with repercussions for other organisms. Additionally, the identification of RNase II as a parasite protein controlling the expression of virulence genes involved in pathogenesis in patients with severe malaria may provide new strategies for reducing malaria mortality.


Subjects
Exoribonucleases/metabolism , Gene Silencing , Genes, Protozoan/genetics , Malaria, Cerebral/parasitology , Plasmodium falciparum/enzymology , Plasmodium falciparum/genetics , RNA, Protozoan/metabolism , Alleles , Antigenic Variation/genetics , Chromatin/enzymology , Down-Regulation/genetics , Erythrocytes/parasitology , Exoribonucleases/deficiency , Exoribonucleases/genetics , Humans , Introns/genetics , Malaria, Falciparum/parasitology , Plasmodium falciparum/pathogenicity , Promoter Regions, Genetic/genetics , Protozoan Proteins/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Protozoan/genetics , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Transcription Initiation Site , Virulence/genetics , Virulence Factors/genetics
13.
PLoS Genet ; 10(4): e1004261, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24743168

ABSTRACT

Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (i.e. grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence/structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacities. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the micro-evolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.


Subjects
Cryptococcus neoformans/genetics , Genome, Fungal/genetics , RNA, Fungal/genetics , Transcriptome/genetics , Virulence/genetics , Chromosomes, Fungal/genetics , DNA, Fungal/genetics , Introns/genetics
14.
Nucleic Acids Res ; 42(6): 3623-37, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24442674

ABSTRACT

While gene expression is a fundamental and tightly controlled cellular process that is regulated at multiple steps, the exact contribution of each step remains unknown in any organism. The absence of transcription initiation regulation for RNA polymerase II in the protozoan parasite Trypanosoma brucei greatly simplifies the task of elucidating the contribution of translation to global gene expression. Therefore, we have sequenced ribosome-protected mRNA fragments in T. brucei, permitting the genome-wide analysis of RNA translation and translational efficiency. We find that the latter varies greatly between life cycle stages of the parasite and ∼100-fold between genes, thus contributing to gene expression to a similar extent as RNA stability. The ability to map ribosome positions at sub-codon resolution revealed extensive translation from upstream open reading frames located within 5' UTRs and enabled the identification of hundreds of previously un-annotated putative coding sequences (CDSs). Evaluation of existing proteomics and genome-wide RNAi data confirmed the translation of previously un-annotated CDSs and suggested an important role for >200 of those CDSs in parasite survival, especially in the form that is infective to mammals. Overall our data show that translational control plays a prevalent and important role in different parasite life cycle stages of T. brucei.


Subjects
Gene Expression Regulation, Developmental , Protein Biosynthesis , Ribosomes/metabolism , Trypanosoma brucei brucei/genetics , Codon , Life Cycle Stages/genetics , Open Reading Frames , Peptide Chain Initiation, Translational , Trypanosoma brucei brucei/growth & development , Trypanosoma brucei brucei/metabolism
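
Entry 14 reports translational efficiency measured from ribosome-protected fragments. A common way to express it, assumed here rather than taken from the paper, is the ratio of length- and depth-normalized ribosome footprint density to mRNA density for the same gene. The read counts, lengths, and library sizes below are hypothetical.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of sequence per million mapped reads."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)


def translational_efficiency(footprint_rpkm, mrna_rpkm):
    """Footprint density divided by mRNA density; None when mRNA is undetected."""
    if mrna_rpkm == 0:
        return None
    return footprint_rpkm / mrna_rpkm


# Hypothetical counts for two genes with 1,000-nt coding sequences,
# a 10-million-read footprint library and a 20-million-read mRNA library.
fp_gene1 = rpkm(200, 1000, 10_000_000)
fp_gene2 = rpkm(50, 1000, 10_000_000)
mrna_gene1 = rpkm(400, 1000, 20_000_000)
mrna_gene2 = rpkm(400, 1000, 20_000_000)

te_gene1 = translational_efficiency(fp_gene1, mrna_gene1)
te_gene2 = translational_efficiency(fp_gene2, mrna_gene2)
```

Both toy genes have the same mRNA density, so the fourfold difference in footprint density shows up directly as a fourfold difference in translational efficiency.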
15.
Nucleic Acids Res ; 42(15): 9717-29, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25104019

ABSTRACT

Base J, β-D-glucosyl-hydroxymethyluracil, is an epigenetic modification of thymine in the nuclear DNA of flagellated protozoa of the order Kinetoplastida. J is enriched at sites involved in RNA polymerase (RNAP) II initiation and termination. Reduction of J in Leishmania tarentolae via growth in BrdU resulted in cell death and indicated a role of J in the regulation of RNAP II termination. To further explore J function in RNAP II termination among kinetoplastids and avoid indirect effects associated with BrdU toxicity and genetic deletions, we inhibited J synthesis in Leishmania major and Trypanosoma brucei using DMOG. Reduction of J in L. major resulted in genome-wide defects in transcription termination at the end of polycistronic gene clusters and the generation of antisense RNAs, without cell death. In contrast, loss of J in T. brucei did not lead to genome-wide termination defects; however, the loss of J at specific sites within polycistronic gene clusters led to altered transcription termination and increased expression of downstream genes. Thus, J regulation of RNAP II transcription termination genome-wide is restricted to Leishmania spp., while in T. brucei it regulates termination and gene expression at specific sites within polycistronic gene clusters.


Subjects
Gene Expression Regulation , Leishmania major/genetics , Transcription Termination, Genetic , Trypanosoma brucei brucei/genetics , Uracil/analogs & derivatives , Cell Line , Glucosides , Leishmania major/enzymology , RNA Polymerase II/metabolism , RNA, Protozoan/analysis , Trypanosoma brucei brucei/enzymology , Uracil/physiology
16.
PLoS Pathog ; 9(12): e1003824, 2013.
Article in English | MEDLINE | ID: mdl-24385905

ABSTRACT

Entamoeba histolytica is the pathogenic amoeba responsible for amoebiasis, an infectious disease targeting human tissues. Amoebiasis arises when virulent trophozoites start to destroy the muco-epithelial barrier by first crossing the mucus, then killing host cells, triggering inflammation and subsequently causing dysentery. The main goal of this study was to analyse pathophysiology and gene expression changes related to virulent (i.e. HM1:IMSS) and non-virulent (i.e. Rahman) strains when they are in contact with the human colon. Transcriptome comparisons between the two strains, both in culture conditions and upon contact with human colon explants, provide a global view of gene expression changes that might contribute to the observed phenotypic differences. The most remarkable feature of the virulent phenotype resides in the up-regulation of genes implicated in carbohydrate metabolism and processing of glycosylated residues. Consequently, inhibition of gene expression by RNA interference of a glycoside hydrolase (β-amylase absent from humans) abolishes mucus depletion and tissue invasion by HM1:IMSS. In summary, our data suggest a potential role of carbohydrate metabolism in colon invasion by virulent E. histolytica.


Subjects
Colon/parasitology , Dysentery, Amebic/parasitology , Entamoeba histolytica/growth & development , Entamoeba histolytica/pathogenicity , Virulence Factors/genetics , Adult , Amino Acid Sequence , Animals , Cloning, Molecular , Colon/pathology , Cricetinae , Dysentery, Amebic/genetics , Entamoeba histolytica/genetics , Host-Parasite Interactions/genetics , Humans , Male , Mesocricetus , Models, Molecular , Molecular Sequence Data , Sequence Homology, Amino Acid , Virulence Factors/metabolism , beta-Amylase/genetics , beta-Amylase/metabolism
17.
Nucleic Acids Res ; 41(3): 1936-52, 2013 Feb 01.
Article in English | MEDLINE | ID: mdl-23258700

ABSTRACT

Alternative splicing and polyadenylation were observed pervasively in eukaryotic messenger RNAs. These alternative isoforms could either be consequences of physiological regulation or stochastic noise of RNA processing. To quantify the extent of stochastic noise in splicing and polyadenylation, we analyzed the alternative usage of splicing and polyadenylation sites in Entamoeba histolytica using RNA-Seq. First, we identified a large number of rarely spliced alternative junctions and then showed that the occurrence of these alternative splicing events is correlated with splicing site sequence, occurrence of constitutive splicing events and messenger RNA abundance. Our results implied the majority of these alternative splicing events are likely to be stochastic error of splicing machineries, and we estimated the corresponding error rates. Second, we observed extensive microheterogeneity of polyadenylation cleavage sites, and the extent of such microheterogeneity is correlated with the occurrence of constitutive cleavage events, suggesting most of such microheterogeneity is likely to be stochastic. Overall, we only observed a small fraction of alternative splicing and polyadenylation isoforms that are unlikely to be solely stochastic, implying the functional relevance of alternative splicing and polyadenylation in E. histolytica is limited. Lastly, we revised the gene models and annotated their 3'UTR in AmoebaDB, providing valuable resources to the community.


Subjects
Alternative Splicing , Entamoeba histolytica/genetics , Polyadenylation , Entamoeba histolytica/metabolism , Exons , Introns , Models, Genetic , Nucleotide Motifs , Poly A/analysis , RNA Isoforms/analysis , RNA, Messenger/chemistry , Stochastic Processes
18.
BMC Genomics ; 15: 150, 2014 Feb 22.
Article in English | MEDLINE | ID: mdl-24559473

ABSTRACT

BACKGROUND: Advances in high-throughput sequencing have led to the discovery of widespread transcription of natural antisense transcripts (NATs) in a large number of organisms, where these transcripts have been shown to play important roles in the regulation of gene expression. Likewise, the existence of NATs has been observed in Plasmodium but our understanding of their genome-wide distribution remains incomplete due to the limited depth and uncertainties in the level of strand specificity of previous datasets. RESULTS: To gain insights into the genome-wide distribution of NATs in P. falciparum, we performed RNA-ligation based strand-specific RNA sequencing at unprecedented depth. Our data indicate that 78.3% of the genome is transcribed during blood-stage development. Moreover, our analysis reveals significant levels of antisense transcription from at least 24% of protein-coding genes and that while expression levels of NATs change during the intraerythrocytic developmental cycle (IDC), they do not correlate with the corresponding mRNA levels. Interestingly, antisense transcription is not evenly distributed across coding regions (CDSs) but strongly clustered towards the 3'-end of CDSs. Furthermore, for a significant subset of NATs, transcript levels correlate with mRNA levels of neighboring genes. Finally, we were able to identify the polyadenylation sites (PASs) for a subset of NATs, demonstrating that at least some NATs are polyadenylated. We also mapped the PASs of 3443 coding genes, yielding an average 3' untranslated region length of 523 bp. CONCLUSIONS: Our strand-specific analysis of the P. falciparum transcriptome expands and strengthens the existing body of evidence that antisense transcription is a substantial phenomenon in P. falciparum. For a subset of neighboring genes we find that sense and antisense transcript levels are intricately linked while other NATs appear to be regulated independently of mRNA transcription. Our deep strand-specific dataset will provide a valuable resource for the precise determination of expression levels as it separates sense from antisense transcript levels, which we find to often significantly differ. In addition, the extensive novel data on 3' UTR length will allow others to perform searches for regulatory motifs in the UTRs and help understand post-translational regulation in P. falciparum.


Subjects
Plasmodium falciparum/genetics , RNA, Antisense , RNA, Protozoan , Transcription, Genetic , 3' Untranslated Regions , Cell Nucleus/metabolism , Cluster Analysis , Gene Expression Profiling , Gene Expression Regulation , Gene Library , High-Throughput Nucleotide Sequencing , Polyadenylation , RNA Splicing , RNA, Messenger/genetics , RNA, Messenger/metabolism
19.
PLoS One ; 19(5): e0295971, 2024.
Article in English | MEDLINE | ID: mdl-38709794

ABSTRACT

The human genome is pervasively transcribed and produces a wide variety of long non-coding RNAs (lncRNAs), constituting the majority of transcripts across human cell types. Some specific nuclear lncRNAs have been shown to be important regulatory components acting locally. As RNA-chromatin interaction and Hi-C chromatin conformation data showed that chromatin interactions of nuclear lncRNAs are determined by the local chromatin 3D conformation, we used Hi-C data to identify potential target genes of lncRNAs. RNA-protein interaction data suggested that nuclear lncRNAs act as scaffolds to recruit regulatory proteins to target promoters and enhancers. Nuclear lncRNAs may therefore play a role in directing regulatory factors to locations spatially close to the lncRNA gene. We provide the analysis results through an interactive visualization web portal at https://fantom.gsc.riken.jp/zenbu/reports/#F6_3D_lncRNA.


Subjects
Chromatin , RNA, Long Noncoding , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Chromatin/metabolism , Chromatin/genetics , Humans , Molecular Sequence Annotation , Cell Nucleus/metabolism , Cell Nucleus/genetics , Genome, Human , Promoter Regions, Genetic
20.
Nat Commun ; 14(1): 7240, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37945584

ABSTRACT

Five-prime single-cell RNA-seq (scRNA-seq) has been widely employed to profile cellular transcriptomes; however, its power for analysing transcription start sites (TSS) has not been fully utilised. Here, we present a computational method suite, CamoTSS, to precisely identify TSS and quantify its expression by leveraging the cDNA on read 1, which enables effective detection of alternative TSS usage. With various experimental data sets, we have demonstrated that CamoTSS can accurately identify TSS and that the detected alternative TSS usages showed strong specificity in different biological processes, including cell types across human organs, the development of human thymus, and cancer conditions. As evidenced in nasopharyngeal cancer, alternative TSS usage can also reveal regulatory patterns including systematic TSS dysregulations.


Subjects
Nasopharyngeal Neoplasms , Humans , Transcription Initiation Site , Single-Cell Gene Expression Analysis , Transcriptome/genetics , Phenotype , Single-Cell Analysis/methods