1.
Nat Methods ; 20(8): 1159-1169, 2023 08.
Article in English | MEDLINE | ID: mdl-37443337

ABSTRACT

The detection of circular RNA molecules (circRNAs) is typically based on short-read RNA sequencing data processed using computational tools. Numerous such tools have been developed, but a systematic comparison with orthogonal validation is missing. Here, we set up a circRNA detection tool benchmarking study, in which 16 tools detected more than 315,000 unique circRNAs in three deeply sequenced human cell types. Next, 1,516 predicted circRNAs were validated using three orthogonal methods. Generally, tool-specific precision is high and similar (median of 98.8%, 96.3% and 95.5% for qPCR, RNase R and amplicon sequencing, respectively) whereas the sensitivity and number of predicted circRNAs (ranging from 1,372 to 58,032) are the most significant differentiators. Of note, precision values are lower when evaluating low-abundance circRNAs. We also show that the tools can be used complementarily to increase detection sensitivity. Finally, we offer recommendations for future circRNA detection and validation.
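
As an illustration of the metrics quoted in this abstract, the sketch below computes precision and sensitivity from validation counts and shows how pooling the calls of several tools can raise sensitivity. It is not the study's code: the function names, tool names, and circRNA identifiers are hypothetical.

```python
def precision(confirmed: int, refuted: int) -> float:
    """Share of experimentally tested predictions that were confirmed."""
    tested = confirmed + refuted
    if tested == 0:
        raise ValueError("no predictions were tested")
    return confirmed / tested


def sensitivity(calls: set[str], truth: set[str]) -> float:
    """Share of the validated circRNAs in `truth` that appear in `calls`."""
    if not truth:
        raise ValueError("truth set is empty")
    return len(calls & truth) / len(truth)


# Hypothetical validated circRNAs and per-tool predictions (illustration only).
truth = {"circ1", "circ2", "circ3", "circ4"}
tool_a = {"circ1", "circ2"}
tool_b = {"circ2", "circ3", "circ9"}

# Complementary use: take the union of both call sets.
combined = tool_a | tool_b

print(sensitivity(tool_a, truth), sensitivity(tool_b, truth), sensitivity(combined, truth))
```

In this toy case each tool alone recovers half of the validated set, while the union recovers three quarters, at the cost of also carrying over any false calls (here `circ9`).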


Subject(s)
Benchmarking , RNA, Circular , Humans , RNA, Circular/genetics , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods
2.
Nucleic Acids Res ; 52(D1): D115-D123, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37823705

ABSTRACT

Circular RNAs (circRNAs) are RNA molecules with a continuous loop structure characterized by back-splice junctions (BSJs). While analyses of short-read RNA sequencing have identified millions of BSJ events, it is inherently challenging to determine exact full-length sequences and alternatively spliced (AS) isoforms of circRNAs. Recent advances in nanopore long-read sequencing with circRNA enrichment bring an unprecedented opportunity for investigating the issues. Here, we developed FL-circAS (https://cosbi.ee.ncku.edu.tw/FL-circAS/), which collected such long-read sequencing data of 20 cell lines/tissues and thereby identified 884 636 BSJs with 1 853 692 full-length circRNA isoforms in human and 115 173 BSJs with 135 617 full-length circRNA isoforms in mouse. FL-circAS also provides multiple circRNA features. For circRNA expression, FL-circAS calculates expression levels for each circRNA isoform, cell line/tissue specificity at both the BSJ and isoform levels, and AS entropy for each BSJ across samples. For circRNA biogenesis, FL-circAS identifies reverse complementary sequences and RNA binding protein (RBP) binding sites residing in flanking sequences of BSJs. For functional patterns, FL-circAS identifies potential microRNA/RBP binding sites and several types of evidence for circRNA translation on each full-length circRNA isoform. FL-circAS provides user-friendly interfaces for browsing, searching, analyzing, and downloading data, serving as the first resource for discovering full-length circRNAs at the isoform level.
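
The abstract mentions an AS entropy per BSJ without giving its formula. A minimal sketch, assuming it is the Shannon entropy of the isoform expression proportions at one BSJ; this is an assumption for illustration, not the database's documented definition, and `isoform_entropy` is a hypothetical name.

```python
import math


def isoform_entropy(expression: list[float]) -> float:
    """Shannon entropy (in bits) of isoform usage at a single BSJ.

    `expression` holds one non-negative expression value per isoform.
    0 bits means a single isoform dominates completely; higher values
    mean expression is spread more evenly across isoforms.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("total expression must be positive")
    entropy = 0.0
    for value in expression:
        if value > 0:  # isoforms with zero expression contribute nothing
            p = value / total
            entropy -= p * math.log2(p)
    return entropy


print(isoform_entropy([10.0]))            # one isoform only
print(isoform_entropy([5.0, 5.0]))        # two equally used isoforms
print(isoform_entropy([1.0, 1.0, 1.0, 1.0]))
```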


Subject(s)
Databases, Nucleic Acid , RNA, Circular , Animals , Humans , Mice , Alternative Splicing/genetics , MicroRNAs/genetics , MicroRNAs/metabolism , Nanopore Sequencing , RNA, Circular/genetics , RNA Isoforms/genetics
3.
Nucleic Acids Res ; 51(15): 7777-7797, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37497782

ABSTRACT

Trans-spliced RNAs (ts-RNAs) are a type of non-co-linear (NCL) transcripts that consist of exons in an order topologically inconsistent with the corresponding DNA template. Detection of ts-RNAs is often confounded by experimental artifacts, circular RNAs (circRNAs) and genetic rearrangements. Particularly, intragenic ts-RNAs, which are derived from separate precursor mRNA molecules of the same gene, are often mistaken for circRNAs through analyses of RNA-seq data. Here we developed a bioinformatics pipeline (NCLscan-hybrid), which integrated short and long RNA-seq reads to minimize false positives and proposed out-of-circle and rolling-circle long reads to distinguish between intragenic ts-RNAs and circRNAs. Combining NCLscan-hybrid screening and multiple experimental validation steps successfully confirmed that four NCL events, which were previously regarded as circRNAs in databases, originated from trans-splicing. CRISPR-based endogenous genome modification experiments further showed that flanking intronic complementary sequences can significantly contribute to ts-RNA formation, providing an efficient/specific method to deplete ts-RNAs. We also experimentally validated that one ts-RNA (ts-ARFGEF1) played an important role in p53-mediated apoptosis by affecting the PERK/eIF2a/ATF4/CHOP signaling pathway in breast cancer cells. This study thus described both bioinformatics procedures and experimental validation steps for rigorous characterization of ts-RNAs, expanding future studies for identification, biogenesis, and function of these important but understudied transcripts.


Subject(s)
Sequence Analysis, RNA , Trans-Splicing , Genome , RNA Splicing , RNA, Circular , Sequence Analysis, RNA/methods
4.
Genome Res ; 30(3): 375-391, 2020 03.
Article in English | MEDLINE | ID: mdl-32127416

ABSTRACT

Circular RNAs (circRNAs), a class of long noncoding RNAs, are known to be enriched in mammalian neural tissues. Although widespread dysregulation of gene expression in autism spectrum disorder (ASD) has been reported, the role of circRNAs in ASD remains largely unknown. Here, we performed genome-wide circRNA expression profiling in postmortem brains from individuals with ASD and controls and identified 60 circRNAs and three coregulated modules that were perturbed in ASD. By integrating circRNA, microRNA, and mRNA dysregulation data derived from the same cortex samples, we identified 8170 ASD-associated circRNA-microRNA-mRNA interactions. Putative targets of the axes were enriched for ASD risk genes and genes encoding inhibitory postsynaptic density (PSD) proteins, but not for genes implicated in monogenic forms of other brain disorders or genes encoding excitatory PSD proteins. This reflects the previous observation that ASD-derived organoids show overproduction of inhibitory neurons. We further confirmed that some ASD risk genes (NLGN1, STAG1, HSD11B1, VIP, and UBA6) were regulated by an up-regulated circRNA (circARID1A) via sponging a down-regulated microRNA (miR-204-3p) in human neuronal cells. Particularly, alteration of NLGN1 expression is known to affect the dynamic processes of memory consolidation and strengthening. To the best of our knowledge, this is the first systems-level view of circRNA regulatory networks in ASD cortex samples. We provided a rich set of ASD-associated circRNA candidates and the corresponding circRNA-microRNA-mRNA axes, particularly those involving ASD risk genes. Our findings thus support a role for circRNA dysregulation and the corresponding circRNA-microRNA-mRNA axes in ASD pathophysiology.


Subject(s)
Autism Spectrum Disorder/genetics , Gene Expression Regulation , MicroRNAs/metabolism , RNA, Circular/metabolism , RNA, Messenger/metabolism , Astrocytes/metabolism , Autism Spectrum Disorder/metabolism , Brain/metabolism , Cell Line , Genome, Human , Humans , Neural Stem Cells/metabolism , Neurons/metabolism
5.
Mol Psychiatry ; 27(11): 4695-4706, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35962193

ABSTRACT

Genetic risk variants and transcriptional expression changes in autism spectrum disorder (ASD) were widely investigated, but their causal relationship remains largely unknown. Circular RNAs (circRNAs) are abundant in brain and often serve as upstream regulators of mRNAs. By integrating RNA-sequencing with genotype data from autistic brains, we assessed expression quantitative trait loci of circRNAs (circQTLs) that cis-regulated expression of nearby circRNAs and trans-regulated expression of distant genes (trans-eGenes) simultaneously. We thus identified 3619 circQTLs that were also trans-eQTLs and constructed 19,804 circQTL-circRNA-trans-eGene regulatory axes. We conducted two different types of approaches, mediation and partial correlation tests (MPT), to determine the axes with mediation effects of circQTLs on trans-eGene expression through circRNA expression. We showed that the mediation effects of the circQTLs (trans-eQTLs) on circRNA expression were positively correlated with the magnitude of circRNA-trans-eGene correlation of expression profile. The positive correlation became more significant after adjustment for the circQTLs. Of the 19,804 axes, 8103 passed MPT. Meanwhile, we performed causal inference test (CIT) and identified 2070 circQTL-trans-eGene-ASD diagnosis propagation paths. We showed that the CIT-passing genes were significantly enriched for ASD risk genes, genes encoding postsynaptic density proteins, and other ASD-relevant genes, supporting the relevance of the CIT-passing genes to ASD pathophysiology. Integration of MPT- and CIT-passing axes further constructed 352 circQTL-circRNA-trans-eGene-ASD diagnosis propagation paths, wherein the circRNA-trans-eGene axes may act as causal mediators for the circQTL-ASD diagnosis associations. These analyses were also successfully applied to an independent dataset from schizophrenia brains. 
Collectively, this study provided the first framework for systematically investigating trans-genetic effects of circQTLs and inferring the corresponding causal relations in diseases. The identified circQTL-circRNA-trans-eGene regulatory interactions, particularly the internal modules that were previously implicated in the examined disorders, also provided a helpful dataset for further investigating causative biology and cryptic regulatory mechanisms underlying the neuropsychiatric diseases.
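
To illustrate what adjusting a correlation for a third variable means in the partial correlation tests mentioned above, here is a minimal sketch using the standard first-order partial correlation formula. It is not the authors' MPT procedure. The variable roles (x as circRNA expression, y as trans-eGene expression, z as genotype dosage at a circQTL) and the toy data are assumptions for illustration only.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x: list[float], y: list[float], z: list[float]) -> float:
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Toy data: x and y are both driven by z plus mutually uncorrelated noise.
z = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
x = [1.0, -1.0, 1.0, -1.0, 11.0, 9.0, 11.0, 9.0]
y = [1.0, 1.0, -1.0, -1.0, 11.0, 11.0, 9.0, 9.0]

print(pearson(x, y))          # high raw correlation
print(partial_corr(x, y, z))  # close to zero once z is accounted for
```

The toy data are built so that the x-y association is explained entirely by z. Real expression data can behave differently, and the abstract reports that the correlation became more significant after adjustment.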


Subject(s)
Autism Spectrum Disorder , MicroRNAs , Humans , RNA, Circular/genetics , Autism Spectrum Disorder/genetics , Quantitative Trait Loci/genetics , RNA, Messenger/genetics , Sequence Analysis, RNA , MicroRNAs/genetics , Gene Expression Profiling , Gene Regulatory Networks , RNA/genetics
6.
BMC Bioinformatics ; 23(1): 164, 2022 May 06.
Article in English | MEDLINE | ID: mdl-35524165

ABSTRACT

BACKGROUND: Circular RNAs (circRNAs) are a class of non-coding RNAs formed by pre-mRNA back-splicing, which are widely expressed in animal/plant cells and often play an important role in regulating microRNA (miRNA) activities. While numerous databases have collected a large amount of predicted circRNA candidates and provided the corresponding circRNA-regulated interactions, a stand-alone package for constructing circRNA-miRNA-mRNA interactions based on user-identified circRNAs across species is lacking. RESULTS: We present CircMiMi (circRNA-miRNA-mRNA interactions), a modular, Python-based software to identify circRNA-miRNA-mRNA interactions across 18 species (including 16 animals and 2 plants) with the given coordinates of circRNA junctions. The CircMiMi-constructed circRNA-miRNA-mRNA interactions are derived from circRNA-miRNA and miRNA-mRNA axes with the support of computational predictions and/or experimental data. CircMiMi also allows users to examine alignment ambiguity of back-splice junctions for checking circRNA reliability and examine reverse complementary sequences residing in the sequences flanking the circularized exons for investigating circRNA formation. We further employ CircMiMi to identify circRNA-miRNA-mRNA interactions based on the circRNAs collected in NeuroCirc, a large-scale database of circRNAs in the human brain. We construct circRNA-miRNA-mRNA interactions comprising differentially expressed circRNAs, and miRNAs in autism spectrum disorder (ASD) and cross-species analyze the relevance of the targets to ASD. We thus provide a rich set of ASD-associated circRNA-miRNA-mRNA axes and a useful starting point for investigation of regulatory mechanisms in ASD pathophysiology. CONCLUSIONS: CircMiMi allows users to identify circRNA-mediated interactions in multiple species, shedding light on regulatory roles of circRNAs. 
The software package and web interface are freely available at https://github.com/TreesLab/CircMiMi and http://circmimi.genomics.sinica.edu.tw/ , respectively.
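
The abstract describes circRNA-miRNA-mRNA interactions as derived from circRNA-miRNA and miRNA-mRNA axes. A minimal sketch of that idea is a join of two pair lists on the shared miRNA. This is not CircMiMi's API: `build_axes` and all identifiers below are hypothetical.

```python
from collections import defaultdict


def build_axes(
    circ_mirna: list[tuple[str, str]],
    mirna_mrna: list[tuple[str, str]],
) -> list[tuple[str, str, str]]:
    """Join (circRNA, miRNA) and (miRNA, mRNA) pairs on the shared miRNA."""
    targets: dict[str, set[str]] = defaultdict(set)
    for mirna, mrna in mirna_mrna:
        targets[mirna].add(mrna)

    axes = set()
    for circ, mirna in circ_mirna:
        # .get avoids creating empty entries for miRNAs without known targets
        for mrna in targets.get(mirna, ()):
            axes.add((circ, mirna, mrna))
    return sorted(axes)


# Hypothetical input pairs (illustration only).
circ_mirna = [("circA", "miR-1"), ("circB", "miR-2")]
mirna_mrna = [("miR-1", "GENE1"), ("miR-1", "GENE2"), ("miR-3", "GENE3")]

for axis in build_axes(circ_mirna, mirna_mrna):
    print(axis)
```

Only circRNAs whose miRNA also has at least one target yield an axis, so `circB` and `miR-3` drop out of this toy result.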


Subject(s)
Autism Spectrum Disorder , MicroRNAs , Animals , Gene Regulatory Networks , MicroRNAs/genetics , RNA, Circular , RNA, Messenger/genetics , Reproducibility of Results , Software
7.
Genome Res ; 29(11): 1766-1776, 2019 11.
Article in English | MEDLINE | ID: mdl-31515285

ABSTRACT

Adenosine-to-inosine (A-to-I) RNA editing is a very common co-/posttranscriptional modification that can lead to A-to-G changes at the RNA level and compensate for G-to-A genomic changes to a certain extent. It has been shown that each healthy individual can carry dozens of missense variants predicted to be severely deleterious. Why strongly detrimental variants are preserved in a population and not eliminated by negative natural selection remains mostly unclear. Here, we ask if RNA editing correlates with the burden of deleterious A/G polymorphisms in a population. Integrating genome and transcriptome sequencing data from 447 human lymphoblastoid cell lines, we show that nonsynonymous editing activities (prevalence/level) are negatively correlated with the deleteriousness of A-to-G genomic changes and positively correlated with that of G-to-A genomic changes within the population. We find a significantly negative correlation between nonsynonymous editing activities and allele frequency of A within the population. This negative editing-allele frequency correlation is particularly strong when editing sites are located in highly important genes/loci. Examinations of deleterious missense variants from the 1000 Genomes Project further show a significantly higher proportion of rare missense mutations for G-to-A changes than for other types of changes. The proportion for G-to-A changes increases with increasing deleterious effects of the changes. Moreover, the deleteriousness of G-to-A changes is significantly positively correlated with the percentage of editing enzyme binding motifs at the variants. Overall, we show that nonsynonymous editing is associated with the increased burden of G-to-A missense mutations in healthy individuals, expanding RNA editing in pathogenomics studies.


Subject(s)
Adenosine/genetics , Inosine/genetics , Mutation, Missense , RNA Editing , RNA/genetics , Gene Frequency , Humans
8.
Nucleic Acids Res ; 46(7): 3671-3691, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29385530

ABSTRACT

Transcriptionally non-co-linear (NCL) transcripts can originate from trans-splicing (trans-spliced RNA; 'tsRNA') or cis-backsplicing (circular RNA; 'circRNA'). While numerous circRNAs have been detected in various species, tsRNAs remain largely uninvestigated. Here, we utilize integrative transcriptome sequencing of poly(A)- and non-poly(A)-selected RNA-seq data from diverse human cell lines to distinguish between tsRNAs and circRNAs. We identified 24,498 NCL events and found that a considerable proportion (20-35%) of them arise from both tsRNAs and circRNAs, representing extensive alternative trans-splicing and cis-backsplicing in human cells. We show that sequence generalities of exon circularization are also observed in tsRNAs. Recapitulation of NCL RNAs further shows that inverted Alu repeats can simultaneously promote the formation of tsRNAs and circRNAs. However, tsRNAs and circRNAs exhibit quite different, or even opposite, expression patterns, in terms of correlation with the expression of their co-linear counterparts, expression breadth/abundance, transcript stability, and subcellular localization preference. These results indicate that tsRNAs and circRNAs may play different regulatory roles and analysis of NCL events should take the joint effects of different NCL-splicing types and joint effects of multiple NCL events into consideration. This study describes the first transcriptome-wide analysis of trans-splicing and cis-backsplicing, expanding our understanding of the complexity of the human transcriptome.


Subject(s)
Alternative Splicing/genetics , RNA/genetics , Trans-Splicing/genetics , Transcriptome/genetics , Exons/genetics , Gene Expression Profiling , Humans , RNA Splicing/genetics , RNA, Circular
9.
BMC Bioinformatics ; 20(1): 3, 2019 Jan 03.
Article in English | MEDLINE | ID: mdl-30606103

ABSTRACT

BACKGROUND: Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq data, numerous NCL event detectors have been developed and have detected thousands of NCL events in diverse species. However, there are great discrepancies in the identification results among detectors, indicating a considerable proportion of false positives in the detected NCL events. Although several helpful guidelines for evaluating the performance of NCL event detectors have been provided, a systematic guideline for measurement of NCL events identified by existing tools has not been available. RESULTS: We develop a software tool, NCLcomparator, for systematically post-screening the intragenic or intergenic NCL events identified by various NCL detectors. NCLcomparator first examines whether the input NCL events are potentially false positives derived from ambiguous alignments (i.e., the NCL events have an alternative co-linear explanation or multiple matches against the reference genome). To evaluate the reliability of the identified NCL events, we define the NCL score (NCLscore) based on the variation in the number of supporting NCL junction reads identified by the tools examined. Of the input NCL events, we show that the ambiguous alignment-derived events have relatively lower NCLscore values than the other events, indicating that an NCL event with a higher NCLscore has a higher level of reliability. To help select highly expressed NCL events, NCLcomparator also provides a series of useful measurements such as the expression levels of the detected NCL events and their corresponding host genes and the junction usage of the co-linear splice junctions at both NCL donor and acceptor sites. 
CONCLUSION: NCLcomparator provides useful guidelines, with the input of identified NCL events from various detectors and the corresponding paired-end RNA-seq data only, to help users select potentially high-confidence NCL events for further functional investigation. The software thus helps to facilitate future studies into NCL events, shedding light on the fundamental biology of this important but understudied class of transcripts. NCLcomparator is freely accessible at https://github.com/TreesLab/NCLcomparator .


Subject(s)
Gene Fusion/genetics , Genome/genetics , RNA Splicing/genetics , RNA/genetics , Sequence Analysis, RNA/methods
10.
Nucleic Acids Res ; 44(3): e29, 2016 Feb 18.
Article in English | MEDLINE | ID: mdl-26442529

ABSTRACT

Analysis of RNA-seq data often detects numerous 'non-co-linear' (NCL) transcripts, which comprise sequence segments that are topologically inconsistent with their corresponding DNA sequences in the reference genome. However, detection of NCL transcripts involves two major challenges: removal of false positives arising from alignment artifacts and discrimination between different types of NCL transcripts (trans-spliced, circular or fusion transcripts). Here, we developed a new NCL-transcript-detecting method ('NCLscan'), which utilized a stepwise alignment strategy to almost completely eliminate false calls (>98% precision) without sacrificing true positives, enabling NCLscan to outperform 18 other publicly-available tools (including fusion- and circular-RNA-detecting tools) in terms of sensitivity and precision, regardless of the generation strategy of simulated dataset, type of intragenic or intergenic NCL event, read depth of coverage, read length or expression level of NCL transcript. With this high accuracy, NCLscan was applied to distinguish between trans-spliced, circular and fusion transcripts on the basis of poly(A)- and nonpoly(A)-selected RNA-seq data. We showed that circular RNAs were expressed more ubiquitously, more abundantly and less cell type-specifically than trans-spliced and fusion transcripts. Our study thus describes a robust pipeline for the discovery of NCL transcripts, and sheds light on the fundamental biology of these non-canonical RNA events in the human transcriptome.


Subject(s)
RNA Splicing , RNA, Messenger/genetics , RNA/genetics , Limit of Detection , RNA, Circular , Reproducibility of Results
11.
Genome Res ; 24(1): 25-36, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24131564

ABSTRACT

Trans-splicing is a post-transcriptional event that joins exons from separate pre-mRNAs. Detection of trans-splicing is usually severely hampered by experimental artifacts and genetic rearrangements. Here, we develop a new computational pipeline, TSscan, which integrates different types of high-throughput long-/short-read transcriptome sequencing of different human embryonic stem cell (hESC) lines to effectively minimize false positives while detecting trans-splicing. Combining TSscan screening with multiple experimental validation steps revealed that most chimeric RNA products were platform-dependent experimental artifacts of RNA sequencing. We successfully identified and confirmed four trans-spliced RNAs, including the first reported trans-spliced large intergenic noncoding RNA ("tsRMST"). We showed that these trans-spliced RNAs were all highly expressed in human pluripotent stem cells and differentially expressed during hESC differentiation. Our results further indicated that tsRMST can contribute to pluripotency maintenance of hESCs by suppressing lineage-specific gene expression through the recruitment of NANOG and the PRC2 complex factor, SUZ12. Taken together, our findings provide important insights into the role of trans-splicing in pluripotency maintenance of hESCs and help to facilitate future studies into trans-splicing, opening up this important but understudied class of post-transcriptional events for comprehensive characterization.


Subject(s)
Embryonic Stem Cells/physiology , High-Throughput Nucleotide Sequencing , Pluripotent Stem Cells/physiology , RNA, Long Noncoding/metabolism , Sequence Analysis, RNA , Trans-Splicing , Transcriptome , Animals , Cell Line , Embryonic Stem Cells/cytology , Gene Expression Profiling , Gene Expression Regulation, Developmental , Genome , Histones/metabolism , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Humans , Mice , Nanog Homeobox Protein , Neoplasm Proteins , Oligonucleotide Array Sequence Analysis , Organ Specificity , Pluripotent Stem Cells/cytology , Polycomb Repressive Complex 2/genetics , Polycomb Repressive Complex 2/metabolism , RNA, Long Noncoding/genetics , Reproducibility of Results , Software , Transcription Factors
12.
Nucleic Acids Res ; 42(14): 9410-23, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25053845

ABSTRACT

Global transcriptome investigations often result in the detection of an enormous number of transcripts composed of non-co-linear sequence fragments. Such 'aberrant' transcript products may arise from post-transcriptional events or genetic rearrangements, or may otherwise be false positives (sequencing/alignment errors or in vitro artifacts). Moreover, post-transcriptionally non-co-linear ('PtNcl') transcripts can arise from trans-splicing or back-splicing in cis (to generate so-called 'circular RNA'). Here, we collected previously-predicted human non-co-linear RNA candidates, and designed a validation procedure integrating in silico filters with multiple experimental validation steps to examine their authenticity. We showed that >50% of the tested candidates were in vitro artifacts, even though some had been previously validated by RT-PCR. After excluding the possibility of genetic rearrangements, we distinguished between trans-spliced and circular RNAs, and confirmed that these two splicing forms can share the same non-co-linear junction. Importantly, the experimentally-confirmed PtNcl RNA events and their corresponding PtNcl splicing types (i.e. trans-splicing, circular RNA, or both sharing the same junction) were all expressed in rhesus macaque, and some were even expressed in mouse. Our study thus describes an essential procedure for confirming PtNcl transcripts, and provides further insight into the evolutionary role of PtNcl RNA events, opening up this important, but understudied, class of post-transcriptional events for comprehensive characterization.


Subject(s)
Artifacts , RNA Splicing , Reverse Transcriptase Polymerase Chain Reaction , Trans-Splicing , Animals , Cells, Cultured , Evolution, Molecular , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Macaca mulatta , Mice , RNA/chemistry , RNA/isolation & purification , RNA Splice Sites , Sequence Analysis, RNA
13.
Mol Biol Evol ; 31(2): 387-96, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24157417

ABSTRACT

DNA methylation at CpG dinucleotides can significantly increase the rate of cytosine-to-thymine mutations and the level of sequence divergence. Although the correlations between DNA methylation and genomic sequence evolution have been widely studied, an unaddressed yet fundamental question is how DNA methylation is associated with the conservation of individual nucleotides in different sequence contexts. Here, we demonstrate that in mammalian exons, the correlations between DNA methylation and the conservation of individual nucleotides are dependent on the type of exonic sequence (coding or untranslated), the degeneracy of coding nucleotides, background selection pressure, and the relative position (first or nonfirst exon in the transcript) where the nucleotides are located. For untranslated and nonzero-fold degenerate nucleotides, methylated sites are less conserved than unmethylated sites regardless of background selection pressure and the relative position of the exon. For zero-fold degenerate (or nondegenerate) nucleotides, however, the reverse trend is observed in nonfirst coding exons and first coding exons that are under stringent background selection pressure. Furthermore, cytosine-to-thymine mutations at methylated zero-fold degenerate nucleotides are predicted to be more detrimental than those that occur at unmethylated nucleotides. As zero-fold and nonzero-fold degenerate nucleotides are very close to each other, our results suggest that the "functional resolution" of DNA methylation may be finer than previously recognized. In addition, the positive correlation between CpG methylation and the level of conservation at zero-fold degenerate nucleotides implies that CpG methylation may serve as an "indicator" of functional importance of these nucleotides.


Subject(s)
CpG Islands/genetics , DNA Methylation , DNA/genetics , Mammals/genetics , Nucleotides/genetics , Animals , Cells, Cultured , Epigenesis, Genetic , Evolution, Molecular , Exons , Genomics , Humans , Mutation Rate , Selection, Genetic
14.
BMC Plant Biol ; 15: 39, 2015 Feb 05.
Article in English | MEDLINE | ID: mdl-25652661

ABSTRACT

BACKGROUND: Crop plants such as rice, maize and sorghum play economically-important roles as main sources of food, fuel, and animal feed. However, current genome annotations of crop plants still suffer from false-positive predictions; a more comprehensive registry of alternative splicing (AS) events is also in demand. Comparative genomics of crop plants is largely unexplored. RESULTS: We performed a large-scale comparative analysis (ExonFinder) of the expressed sequence tag (EST) library from nine grass plants against three crop genomes (rice, maize, and sorghum) and identified 2,879 previously-unannotated exons (i.e., novel exons) in the three crops. We validated 81% of the tested exons by RT-PCR-sequencing, supporting the effectiveness of our in silico strategy. Evolutionary analysis reveals that the novel exons, compared with their flanking annotated ones, are generally under weaker selection pressure at the protein level, but under stronger pressure at the RNA level, suggesting that most of the novel exons also represent novel alternatively spliced variants (ASVs). However, we also observed consistent evolutionary rates between certain novel exons and their flanking exons, which provided further evidence of their co-occurrence in the transcripts, suggesting that previously-annotated isoforms might be subject to erroneous predictions. Our validation showed that 54% of the tested genes expressed the newly-identified isoforms that contained the novel exons, rather than the previously-annotated isoforms that excluded them. Consistent results were observed across cultivated (Oryza sativa and O. glaberrima) and wild (O. rufipogon and O. nivara) rice species, asserting the necessity of our curation of the crop genome annotations. Our comparative analyses also inferred the common ancestral transcriptome of grass plants and gain- and loss-of-ASV events. 
CONCLUSIONS: We have reannotated the rice, maize, and sorghum genomes, and showed that evolutionary rates might serve as an indicator for determining whether the identified exons were alternatively spliced. This study not only presents an effective in silico strategy for the improvement of plant annotations, but also provides further insights into the role of AS events in the evolution and domestication of crop plants. ExonFinder and the novel exons/ASVs identified are publicly accessible at http://exonfinder.sourceforge.net/ .


Subject(s)
Crops, Agricultural/genetics , Expressed Sequence Tags/chemistry , Genome, Plant , Plant Proteins/genetics , Poaceae/genetics , Alternative Splicing , Exons , Oryza/genetics , Protein Isoforms/genetics , Real-Time Polymerase Chain Reaction , Sorghum/genetics , Zea mays/genetics
15.
16.
Nucleic Acids Res ; 41(13): 6371-80, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23658220

ABSTRACT

Transcription factors (TFs) and microRNAs (miRNAs) are two crucial classes of trans-regulatory factors that coordinately control gene expression. Understanding the impacts of these two factors on the rate of protein sequence evolution is of great importance in evolutionary biology. While many biological factors associated with evolutionary rate variations have been studied, an evolutionary analysis that simultaneously accounts for TF and miRNA regulation across metazoans has not yet been carried out. Here, we provide a series of statistical analyses to assess the influences of TF and miRNA regulations on evolutionary rates across metazoans (human, mouse and fruit fly). Our results reveal that the negative correlations between trans-regulation and evolutionary rates hold well across metazoans, but the strength of TF regulation as a rate indicator becomes weak when the other confounding factors that may affect evolutionary rates are controlled. We show that miRNA regulation tends to be a more essential indicator of evolutionary rates than TF regulation, and the combination of TF and miRNA regulations has a significant dependent effect on protein evolutionary rates. We also show that trans-regulation (especially miRNA regulation) is much more important in human/mouse than in fruit fly in determining protein evolutionary rates, suggesting a considerable variation in rate determinants between vertebrates and invertebrates.


Subject(s)
Molecular Evolution , Gene Expression Regulation , MicroRNAs/metabolism , Transcription Factors/metabolism , Animals , Binding Sites , Drosophila melanogaster/genetics , Humans , Mice , Proteins/genetics
17.
Proc Natl Acad Sci U S A ; 109(39): 15841-6, 2012 Sep 25.
Article in English | MEDLINE | ID: mdl-23019368

ABSTRACT

DNA cytosine methylation is a central epigenetic marker that is usually mutagenic and may increase the level of sequence divergence. However, methylated genes have been reported to evolve more slowly than unmethylated genes. Hence, there is a controversy on whether DNA methylation is correlated with increased or decreased protein evolutionary rates. We hypothesize that this controversy has resulted from the differential correlations between DNA methylation and the evolutionary rates of coding exons in different genic positions. To test this hypothesis, we compare human-mouse and human-macaque exonic evolutionary rates against experimentally determined single-base resolution DNA methylation data derived from multiple human cell types. We show that DNA methylation is significantly related to within-gene variations in evolutionary rates. First, DNA methylation level is more strongly correlated with C-to-T mutations at CpG dinucleotides in the first coding exons than in the internal and last exons, although it is positively correlated with the synonymous substitution rate in all exon positions. Second, for the first exons, DNA methylation level is negatively correlated with exonic expression level, but positively correlated with both nonsynonymous substitution rate and the sample specificity of DNA methylation level. For the internal and last exons, however, we observe the opposite correlations. Our results imply that DNA methylation level is differentially correlated with the biological (and evolutionary) features of coding exons in different genic positions. The first exons appear more prone to the mutagenic effects, whereas the other exons are more influenced by the regulatory effects of DNA methylation.


Subject(s)
DNA Methylation/physiology , Molecular Evolution , Exons/physiology , Gene Expression Regulation/physiology , Open Reading Frames/physiology , Animals , Cell Line , Humans , Macaca , Mice , Missense Mutation , Species Specificity
18.
BMC Evol Biol ; 14: 145, 2014 Jun 25.
Article in English | MEDLINE | ID: mdl-24965500

ABSTRACT

BACKGROUND: The evolution of the coding exome is a major driving force of functional divergence both between species and between protein isoforms. Exons at different positions in the transcript or in different transcript isoforms may (1) mutate at different rates due to variations in DNA methylation level; and (2) serve distinct biological roles, and thus be differentially targeted by natural selection. Furthermore, intrinsic exonic features, such as exon length, may also affect the evolution of individual exons. Importantly, the evolutionary effects of these intrinsic/extrinsic features may differ significantly between animals and plants. Such inter-lineage differences, however, have not been systematically examined. RESULTS: Here we examine how DNA methylation at CpG dinucleotides (CpG methylation), in the context of intrinsic exonic features (exon length and relative exon position in the transcript), influences the evolution of coding exons of Arabidopsis thaliana. We observed fairly different evolutionary patterns in A. thaliana as compared with those reported for animals. Firstly, the mutagenic effect of CpG methylation is the strongest for internal exons and the weakest for first exons despite the stringent selective constraints on the former group. Secondly, the mutagenic effect of CpG methylation increases significantly with length in first exons but not in the other two exon groups. Thirdly, CpG methylation level is correlated with evolutionary rates (dS, dN, and the dN/dS ratio) with markedly different patterns among the three exon groups. The correlations are generally positive, negative, and mixed for first, last, and internal exons, respectively. Fourthly, exon length is a CpG methylation-independent indicator of evolutionary rates, particularly for dN and the dN/dS ratio in last and internal exons. Finally, the evolutionary patterns of coding exons with regard to CpG methylation differ significantly between Arabidopsis species and mammals. 
CONCLUSIONS: Our results suggest that intrinsic features, including relative exonic position in the transcript and exon length, play an important role in the evolution of A. thaliana coding exons. Furthermore, CpG methylation is correlated with exonic evolutionary rates differentially between A. thaliana and animals, and may have served different biological roles in the two lineages.


Subject(s)
Arabidopsis/genetics , DNA Methylation , Molecular Evolution , Exome , Animals , CpG Islands , Exons , Genetic Selection
19.
Acta Neuropathol Commun ; 12(1): 77, 2024 05 18.
Article in English | MEDLINE | ID: mdl-38762464

ABSTRACT

Glioblastoma (GBM) is the most common malignant brain tumor in adults, which remains incurable and often recurs rapidly after initial therapy. While large efforts have been dedicated to uncovering genomic/transcriptomic alterations associated with the recurrence of GBMs, the evolutionary trajectories of matched pairs of primary and recurrent (P-R) GBMs remain largely elusive. It remains challenging to identify genes associated with time to relapse (TTR) and construct a stable and effective prognostic model for predicting TTR of primary GBM patients. By integrating RNA-sequencing and genomic data from multiple datasets of patient-matched longitudinal GBMs of isocitrate dehydrogenase wild-type (IDH-wt), here we examined the associations of TTR with heterogeneities between paired P-R GBMs in gene expression profiles, tumor mutation burden (TMB), and microenvironment. Our results revealed a positive correlation between TTR and transcriptomic/genomic differences between paired P-R GBMs, higher percentages of non-mesenchymal-to-mesenchymal transition and mesenchymal subtype for patients with a short TTR than for those with a long TTR, a high correlation between paired P-R GBMs in gene expression profiles and TMB, and a negative correlation between the fitting level of such a paired P-R GBM correlation and TTR. According to these observations, we identified 55 TTR-associated genes and thereby constructed a seven-gene (ZSCAN10, SIGLEC14, GHRHR, TBX15, TAS2R1, CDKL1, and CD101) prognostic model for predicting TTR of primary IDH-wt GBM patients using univariate/multivariate Cox regression analyses. The risk scores estimated by the model were significantly negatively correlated with TTR in the training set and two independent testing sets. The model also segregated IDH-wt GBM patients into two groups with significantly divergent progression-free survival outcomes and showed promising performance for predicting 1-, 2-, and 3-year progression-free survival rates in all training and testing sets. Our findings provide new insights into the molecular understanding of GBM progression at recurrence and potential targets for therapeutic treatments.


Subject(s)
Brain Neoplasms , Glioblastoma , Isocitrate Dehydrogenase , Local Neoplasm Recurrence , Transcriptome , Humans , Glioblastoma/genetics , Glioblastoma/pathology , Isocitrate Dehydrogenase/genetics , Brain Neoplasms/genetics , Brain Neoplasms/pathology , Local Neoplasm Recurrence/genetics , Male , Female , Genomics/methods , Mutation , Middle Aged , Time Factors
20.
Elife ; 12, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38752723

ABSTRACT

A causal relationship exists among the aging process, organ decay and dysfunction, and the occurrence of various diseases including cancer. A genetically engineered mouse model, termed Klf1K74R/K74R or Klf1(K74R), carrying a mutation at the well-conserved sumoylation site of the hematopoietic transcription factor KLF1/EKLF has been generated that possesses extended lifespan and healthy characteristics, including cancer resistance. We show that the healthy longevity characteristics of the Klf1(K74R) mice, as exemplified by their higher anti-cancer capability, are likely gender-, age-, and genetic background-independent. Significantly, the anti-cancer capability, in particular that against melanoma as well as hepatocellular carcinoma, and lifespan-extending property of Klf1(K74R) mice could be transferred to wild-type mice via transplantation of their bone marrow mononuclear cells when the recipients are young. Furthermore, NK(K74R) cells carry higher in vitro cancer cell-killing ability than wild-type NK cells. Targeted/global gene expression profiling analysis has identified changes in the expression of specific proteins, including the immune checkpoint factors PDCD and CD274, and cellular pathways in the leukocytes of the Klf1(K74R) mice that are in the directions of anti-cancer and/or anti-aging. This study demonstrates the feasibility of developing a transferable hematopoietic/blood system for long-term anti-cancer and, potentially, for anti-aging.


Subject(s)
Kruppel-Like Transcription Factors , Longevity , Animals , Kruppel-Like Transcription Factors/genetics , Kruppel-Like Transcription Factors/metabolism , Mice , Longevity/genetics , Natural Killer Cells/immunology , Neoplasms/genetics , Genetic Engineering , Bone Marrow Transplantation , Female , Gene Expression Profiling , Male , Transgenic Mice