Results 1 - 20 of 34
1.
Nucleic Acids Res ; 52(W1): W341-W347, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38709877

ABSTRACT

Genes commonly express multiple RNA products (RNA isoforms), which differ in exonic content and can have different functions. Making sense of the plethora of known and novel RNA isoforms being identified by transcriptomic approaches requires a user-friendly way to visualize gene isoforms and how they differ in exonic content, expression levels and potential functions. Here we introduce IsoVis, a freely available webserver that accepts user-supplied transcriptomic data and visualizes the expressed isoforms in a clear, intuitive manner. IsoVis contains numerous features, including the ability to visualize all RNA isoforms of a gene and their expression levels; the annotation of known isoforms from external databases; mapping of protein domains and features to exons, allowing changes to protein sequence and function between isoforms to be established; and extensive species compatibility. Datasets visualised on IsoVis remain private to the user, allowing analysis of sensitive data. IsoVis visualisations can be downloaded to create publication-ready figures. The IsoVis webserver enables researchers to perform isoform analyses without requiring programming skills, is free to use, and available at https://isomix.org/isovis/.


Subject(s)
Internet , Molecular Sequence Annotation , RNA Isoforms , Software , RNA Isoforms/genetics , RNA Isoforms/metabolism , RNA Isoforms/chemistry , Humans , Animals , Exons/genetics , Transcriptome/genetics , Alternative Splicing
2.
Nat Protoc ; 19(6): 1835-1865, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38347203

ABSTRACT

RNA structure determination is essential to understand how RNA carries out its diverse biological functions. In cells, RNA isoforms are readily expressed with partial variations within their sequences due, for example, to alternative splicing, heterogeneity in the transcription start site, RNA processing or differential termination/polyadenylation. Nanopore dimethyl sulfate mutational profiling (Nano-DMS-MaP) is a method for in situ isoform-specific RNA structure determination. Unlike similar methods that rely on short sequencing reads, Nano-DMS-MaP employs nanopore sequencing to resolve the structures of long and highly similar RNA molecules to reveal their previously hidden structural differences. This Protocol describes the development and applications of Nano-DMS-MaP and outlines the main considerations for designing and implementing a successful experiment: from bench to data analysis. In-cell probing experiments can be carried out by an experienced molecular biologist in 3-4 d. Data analysis requires good knowledge of command line tools and Python scripts and takes a further 3-5 d.


Subject(s)
Nucleic Acid Conformation , RNA , Sulfuric Acid Esters , Sulfuric Acid Esters/chemistry , RNA/chemistry , RNA/genetics , RNA Isoforms/genetics , RNA Isoforms/chemistry , Sequence Analysis, RNA/methods , Humans , Nanopores , Nanopore Sequencing/methods
4.
RNA Biol ; 19(1): 279-289, 2022.
Article in English | MEDLINE | ID: mdl-35188062

ABSTRACT

The Drosha cleavage of a pri-miRNA defines the mature microRNA sequence. Drosha cleavage at alternative positions generates 5' isoforms (isomiRs) which have distinctive functions. To understand how pri-miRNA structures influence Drosha cleavage, we performed a systematic analysis of the maturation of endogenous pri-miRNAs and their variants both in vitro and in vivo. We show that in addition to previously known features, the overall structural flexibility of pri-miRNA impacts Drosha cleavage fidelity. Internal loops and nearby G · U wobble pairs on the pri-miRNA stem induce the use of non-canonical cleavage sites by Drosha, resulting in 5' isomiR production. By analysing patient data deposited in the Cancer Genome Atlas, we provide evidence that alternative Drosha cleavage of pri-miRNAs is a tunable process that responds to the level of pri-miRNA-associated RNA-binding proteins. Together, our findings reveal that Drosha cleavage fidelity can be modulated by altering pri-miRNA structure, a potential mechanism underlying 5' isomiR biogenesis in tumours.


Subject(s)
MicroRNAs/chemistry , Nucleic Acid Conformation , RNA Isoforms/chemistry , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Cleavage , RNA Isoforms/genetics , RNA Isoforms/metabolism , Ribonuclease III/metabolism , Structure-Activity Relationship
5.
RNA ; 28(2): 162-176, 2022 02.
Article in English | MEDLINE | ID: mdl-34728536

ABSTRACT

Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional direct RNA nanopore sequencing, the 5' and 3' ends of poly(A) RNA cannot be identified unambiguously. This is due in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoforms among ∼4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5' m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nt oligomer. This oligomer adaptation method improved 5' end sequencing and ensured correct identification of the 5' m7G capped ends. Second, among these 5'-capped nanopore reads, we screened for features consistent with a 3' polyadenylation site. Combining these two steps, we identified 294,107 individual high-confidence full-length RNA scaffolds from human GM12878 cells, most of which (257,721) aligned to protein-coding genes. Of these, 4876 scaffolds indicated unannotated isoforms that were often internal to longer, previously identified RNA isoforms. Orthogonal data for m7G caps and open chromatin, such as CAGE and DNase-HS seq, confirmed the validity of these high-confidence RNA scaffolds.


Subject(s)
RNA Isoforms/chemistry , RNA, Messenger/chemistry , Cell Line, Tumor , Humans , Nanopore Sequencing/methods , RNA 3' Polyadenylation Signals , RNA Isoforms/genetics , RNA, Messenger/genetics , Transcriptome
6.
Genome Res ; 30(9): 1332-1344, 2020 09.
Article in English | MEDLINE | ID: mdl-32887688

ABSTRACT

Eukaryotic genes often generate a variety of RNA isoforms that can lead to functionally distinct protein variants. The synthesis and stability of RNA isoforms is poorly characterized because current methods to quantify RNA metabolism use short-read sequencing and cannot detect RNA isoforms. Here we present nanopore sequencing-based isoform dynamics (nano-ID), a method that detects newly synthesized RNA isoforms and monitors isoform metabolism. Nano-ID combines metabolic RNA labeling, long-read nanopore sequencing of native RNA molecules, and machine learning. Nano-ID derives RNA stability estimates and evaluates stability determining factors such as RNA sequence, poly(A)-tail length, secondary structure, translation efficiency, and RNA-binding proteins. Application of nano-ID to the heat shock response in human cells reveals that many RNA isoforms change their stability. Nano-ID also shows that the metabolism of individual RNA isoforms differs strongly from that estimated for the combined RNA signal at a specific gene locus. Nano-ID enables studies of RNA metabolism at the level of single RNA molecules and isoforms in different cell states and conditions.


Subject(s)
Nanopore Sequencing/methods , RNA Isoforms/chemistry , RNA Stability , Cell Line, Tumor , Humans , Machine Learning , Neural Networks, Computer , RNA Isoforms/chemical synthesis , Uridine/chemistry
7.
Nucleic Acids Res ; 48(14): 7700-7711, 2020 08 20.
Article in English | MEDLINE | ID: mdl-32652016

ABSTRACT

Arabidopsis thaliana transcriptomes have been extensively studied and characterized under different conditions. However, most of the current 'RNA-sequencing' technologies produce a relatively short read length and demand a reverse-transcription step, preventing effective characterization of transcriptome complexity. Here, we performed Direct RNA Sequencing (DRS) using the latest Oxford Nanopore Technology (ONT) with exceptional read length. We demonstrate that the complexity of the A. thaliana transcriptomes has been substantially under-estimated. The ONT direct RNA sequencing identified novel transcript isoforms at both the vegetative (14-day old seedlings, stage 1.04) and reproductive stages (stage 6.00-6.10) of development. Using in-house software called TrackCluster, we determined alternative transcription initiation (ATI), alternative polyadenylation (APA), alternative splicing (AS), and fusion transcripts. More than 38 500 novel transcript isoforms were identified, including six categories of fusion-transcripts that may result from differential RNA processing mechanisms. Aided by the Tombo algorithm, we found an enrichment of m5C modifications in the mobile mRNAs, consistent with a recent finding that m5C modification in mRNAs is crucial for their long-distance movement. In summary, ONT DRS offers an advantage in the identification and functional characterization of novel RNA isoforms and RNA base modifications, significantly improving annotation of the A. thaliana genome.


Subject(s)
Arabidopsis/genetics , Nanopore Sequencing/methods , RNA, Plant/chemistry , RNA, Plant/metabolism , Sequence Analysis, RNA/methods , Transcriptome , Cytosine/metabolism , Methylation , RNA Isoforms/chemistry , RNA Isoforms/metabolism , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA-Seq
8.
Biochim Biophys Acta Gene Regul Mech ; 1863(4): 194373, 2020 04.
Article in English | MEDLINE | ID: mdl-30953728

ABSTRACT

MicroRNAs (miRNAs) are a class of small non-coding RNAs that play increasingly appreciated roles in gene regulation. In animals, miRNAs silence gene expression by binding to partially complementary sequences within target mRNAs. It is well established that miRNAs recognize canonical target sites by base-pairing in the 5' region. However, the development of biochemical methods has identified many novel, non-canonical target sites, suggesting additional modes of miRNA-target association. Here, we review the current knowledge of miRNA-target recognition and how new evidence supports or challenges existing models. We also review the process by which microRNA isoforms achieve functional diversification via modulation of target recognition.


Subject(s)
MicroRNAs/chemistry , MicroRNAs/metabolism , Gene Expression Regulation , RNA Isoforms/chemistry , RNA Isoforms/metabolism , RNA, Messenger/chemistry , RNA, Messenger/metabolism
9.
Curr Protoc Mol Biol ; 128(1): e99, 2019 09.
Article in English | MEDLINE | ID: mdl-31503415

ABSTRACT

The DMS region extraction and deep sequencing (DREADS) procedure was designed to probe RNA structure in vivo and to link this structural information to specific 3' isoforms. Growing cells are treated with the alkylating agent dimethyl sulfate (DMS), which enters easily into cells and modifies RNA molecules at solvent-exposed A and C residues. RNA is isolated, and sequencing libraries are constructed in a manner that preserves the identities of individual mRNA isoforms arising from alternative cleavage/polyadenylation sites. During the cDNA synthesis step of library construction, the progress of reverse transcriptase (RT) is blocked when it encounters a DMS modification on the RNA, leading to disproportionate cDNA termination adjacent to DMS-modified positions. After paired-end deep sequencing, the downstream end of each sequenced fragment is mapped to a specific cleavage/poly(A) site representing an individual mRNA 3' isoform. The upstream mapped end of the sequenced fragment defines where the RT reaction stopped. Over the population of all sequenced fragments derived from a particular isoform, A and C positions that are overrepresented next to the upstream endpoints in the DMS sample (relative to a parallel untreated control) are inferred to have been DMS modified, and hence solvent exposed. This method thus allows in vivo structural information obtained using DMS to be linked to individual mRNA 3' isoforms. © 2019 by John Wiley & Sons, Inc.


Subject(s)
Genetic Techniques , Nucleic Acid Conformation , RNA Isoforms/chemistry , Sulfuric Acid Esters/chemistry , Gene Library , High-Throughput Nucleotide Sequencing , RNA, Fungal/chemistry , Saccharomyces cerevisiae/genetics , Sequence Analysis, RNA
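
Illustrative sketch (not from the paper): the abstract above infers DMS modification from A and C positions whose reverse-transcriptase stops are overrepresented in the treated sample relative to an untreated control. The Python below shows one simple way such a comparison could be computed for a single isoform. The function name, the pseudocount, and the assumption that stop counts have already been assigned to the modified position are all hypothetical and are not taken from the DREADS protocol.

```python
def dms_enrichment(sequence, treated_stops, control_stops, pseudocount=1.0):
    """Per-position ratio of RT-stop frequency (DMS-treated / untreated).

    sequence       -- RNA sequence of one isoform (string of A, C, G, U)
    treated_stops  -- stop counts per position in the DMS sample
    control_stops  -- stop counts per position in the untreated control
    Returns a list with a ratio for A/C positions and None for G/U,
    because DMS is described as modifying solvent-exposed A and C residues.
    """
    n = len(sequence)
    # Normalise by library size so the two samples are comparable.
    total_treated = sum(treated_stops) + pseudocount * n
    total_control = sum(control_stops) + pseudocount * n
    scores = []
    for base, t, c in zip(sequence, treated_stops, control_stops):
        if base.upper() not in "AC":
            scores.append(None)
            continue
        freq_treated = (t + pseudocount) / total_treated
        freq_control = (c + pseudocount) / total_control
        scores.append(freq_treated / freq_control)
    return scores


scores = dms_enrichment("ACGU", [10, 0, 5, 0], [1, 1, 1, 1])
```

A ratio well above 1 at an A or C would be read as evidence of solvent exposure under these assumptions; a real analysis would also need replicate handling and a statistical threshold.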
10.
Genome Biol ; 20(1): 196, 2019 09 26.
Article in English | MEDLINE | ID: mdl-31554518

ABSTRACT

BACKGROUND: DNA methylation (DNAm) is a critical regulator of both development and cellular identity and shows unique patterns in neurons. To better characterize maturational changes in DNAm patterns in these cells, we profile the DNAm landscape at single-base resolution across the first two decades of human neocortical development in NeuN+ neurons using whole-genome bisulfite sequencing and compare them to non-neurons (primarily glia) and prenatal homogenate cortex. RESULTS: We show that DNAm changes more dramatically during the first 5 years of postnatal life than during the entire remaining period. We further refine global patterns of increasingly divergent neuronal CpG and CpH methylation (mCpG and mCpH) into six developmental trajectories and find that in contrast to genome-wide patterns, neighboring mCpG and mCpH levels within these regions are highly correlated. We integrate paired RNA-seq data and identify putative regulation of hundreds of transcripts and their splicing events exclusively by mCpH levels, independently from mCpG levels, across this period. We finally explore the relationship between DNAm patterns and development of brain-related phenotypes and find enriched heritability for many phenotypes within identified DNAm features. CONCLUSIONS: By profiling DNAm changes in NeuN-sorted neurons over the span of human cortical development, we identify novel, dynamic regions of DNAm that would be masked in homogenate DNAm data; expand on the relationship between CpG methylation, CpH methylation, and gene expression; and find enrichment particularly for neuropsychiatric diseases in genomic regions with cell type-specific, developmentally dynamic DNAm patterns.


Subject(s)
Brain/growth & development , DNA Methylation , Neurons/metabolism , Adolescent , Brain/embryology , Brain/metabolism , Brain/physiology , Child , Child, Preschool , CpG Islands , Gene Expression , Genomics , Humans , Infant , Infant, Newborn , Neuronal Plasticity , RNA Isoforms/chemistry , RNA Isoforms/metabolism , RNA Splicing , Young Adult
11.
Nucleic Acids Res ; 47(14): 7262-7275, 2019 08 22.
Article in English | MEDLINE | ID: mdl-31305886

ABSTRACT

RNA-Seq is a powerful transcriptome profiling technology enabling transcript discovery and quantification. Whilst most commonly used for gene-level quantification, the data can be used for the analysis of transcript isoforms. However, when the underlying transcript assemblies are complex, current visualization approaches can be limiting, with splicing events a challenge to interpret. Here, we report on the development of a graph-based visualization method as a complementary approach to understanding transcript diversity from short-read RNA-Seq data. Following the mapping of reads to a reference genome, a read-to-read comparison is performed on all reads mapping to a given gene, producing a weighted similarity matrix between reads. This is used to produce an RNA assembly graph, where nodes represent reads and edges similarity scores between them. The resulting graphs are visualized in 3D space to better appreciate their sometimes large and complex topology, with other information being overlaid on to nodes, e.g. transcript models. Here we demonstrate the utility of this approach, including the unusual structure of these graphs and how they can be used to identify issues in assembly, repetitive sequences within transcripts and splice variants. We believe this approach has the potential to significantly improve our understanding of transcript complexity.


Subject(s)
Alternative Splicing , Computer Graphics , Gene Expression Profiling/methods , RNA, Messenger/genetics , Sequence Analysis, RNA/methods , Genome, Human/genetics , Humans , Models, Genetic , Models, Molecular , Nucleic Acid Conformation , RNA Isoforms/chemistry , RNA Isoforms/genetics , RNA Isoforms/metabolism , RNA, Messenger/chemistry , RNA, Messenger/metabolism
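
Illustrative sketch (not from the paper): the abstract above describes a graph in which nodes are reads and weighted edges are read-to-read similarity scores. The Python below builds such a graph for reads mapped to one gene. As a stand-in for the authors' similarity measure, which the abstract does not specify, it uses the Jaccard overlap of mapped coordinate intervals; the function names and the `min_similarity` cutoff are hypothetical.

```python
from itertools import combinations


def interval_jaccard(a, b):
    """Jaccard overlap of two half-open intervals (start, end)."""
    start_a, end_a = a
    start_b, end_b = b
    intersection = max(0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - intersection
    return intersection / union if union else 0.0


def build_read_graph(reads, min_similarity=0.2):
    """Return an adjacency dict {read: {neighbour: weight}}.

    reads -- dict mapping read name to its mapped (start, end) interval.
    An edge is kept only if the similarity reaches min_similarity.
    """
    graph = {name: {} for name in reads}
    for (name_1, iv_1), (name_2, iv_2) in combinations(reads.items(), 2):
        weight = interval_jaccard(iv_1, iv_2)
        if weight >= min_similarity:
            graph[name_1][name_2] = weight
            graph[name_2][name_1] = weight
    return graph


graph = build_read_graph({"r1": (0, 100), "r2": (50, 150), "r3": (500, 600)})
```

In this toy input `r1` and `r2` are linked while `r3` stays isolated; a layout algorithm would then place connected reads close together.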
12.
Mol Cell ; 72(5): 849-861.e6, 2018 12 06.
Article in English | MEDLINE | ID: mdl-30318446

ABSTRACT

Alternative polyadenylation generates numerous 3' mRNA isoforms that can vary in biological properties, such as stability and localization. We developed methods to obtain transcriptome-scale structural information and protein binding on individual 3' mRNA isoforms in vivo. Strikingly, near-identical mRNA isoforms can possess dramatically different structures throughout the 3' UTR. Analyses of identical mRNAs in different species or refolded in vitro indicate that structural differences in vivo are often due to trans-acting factors. The level of Pab1 binding to poly(A)-containing isoforms is surprisingly variable, and differences in Pab1 binding correlate with the extent of structural variation for closely spaced isoforms. A pattern encompassing single-strandedness near the 3' terminus, double-strandedness of the poly(A) tail, and low Pab1 binding is associated with mRNA stability. Thus, individual 3' mRNA isoforms can be remarkably different physical entities in vivo. Sequences responsible for isoform-specific structures, differential Pab1 binding, and mRNA stability are evolutionarily conserved, indicating biological function.


Subject(s)
Gene Expression Regulation, Fungal , Poly(A)-Binding Proteins/genetics , RNA Isoforms/chemistry , RNA, Fungal/chemistry , RNA, Messenger/chemistry , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Base Sequence , Nucleic Acid Conformation , Poly(A)-Binding Proteins/metabolism , Polyadenylation , Protein Binding , RNA Isoforms/genetics , RNA Isoforms/metabolism , RNA Stability , RNA, Fungal/genetics , RNA, Fungal/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Transcriptome
13.
Gene ; 669: 1-7, 2018 Aug 30.
Article in English | MEDLINE | ID: mdl-29800732

ABSTRACT

Pituitary homeobox 2 (PITX2) plays crucial roles in embryogenesis, ontogenesis, growth, and development via the Wnt/beta-catenin and POU1F1 pathways. To better understand the characteristics and genetic effects of the cattle PITX2 gene, we identified alternative PITX2 splicings, examined the effects of the spliced variants on mRNA expression levels in tissues, and then used association analyses to explore the relationships between a PITX2 deletion genetic variant and growth traits in 750 native Chinese cattle. An unreported spliced variant of PITX2, designated here as PITX2-V1, was identified in cattle using in silico cloning and RT-PCR. The entire coding sequence of PITX2 is 978 bp, encoding 325 amino acids, whereas that of PITX2-V1 is 357 bp encoding 118 amino acids. Cattle PITX2 exhibited both a perfect homeodomain and an OAR domain, but PITX2-V1 lacked the homeodomain. Analyses with qRT-PCR showed that the expression level of PITX2 in cattle testis was very low, and PITX2-V1 was only very slightly expressed in the brain and testis. Furthermore, a 24 bp deletion was detected within PITX2 intron, and the different genotypes were significantly associated with growth traits (e.g., body height, body length, heart girth) in four cattle breeds (P < 0.05). These results are of direct benefit to future cattle breeding, and provide new insights into the characteristics and functions of cattle PITX2 gene.


Subject(s)
Alternative Splicing , Cattle/genetics , Homeodomain Proteins/genetics , Transcription Factors/genetics , Amino Acid Sequence , Animals , Cattle/growth & development , Cattle/metabolism , Female , Gene Frequency , Genotype , Homeodomain Proteins/chemistry , Homeodomain Proteins/metabolism , Male , RNA Isoforms/chemistry , RNA Isoforms/metabolism , RNA, Messenger/chemistry , RNA, Messenger/metabolism , Sequence Alignment , Sequence Deletion , Transcription Factors/chemistry , Transcription Factors/metabolism , Transcriptome
14.
Proc Natl Acad Sci U S A ; 114(47): E10244-E10253, 2017 11 21.
Article in English | MEDLINE | ID: mdl-29109288

ABSTRACT

Chronic obstructive pulmonary disease (COPD) affects over 65 million individuals worldwide, where α-1-antitrypsin deficiency is a major genetic cause of the disease. The α-1-antitrypsin gene, SERPINA1, expresses an exceptional number of mRNA isoforms generated entirely by alternative splicing in the 5'-untranslated region (5'-UTR). Although all SERPINA1 mRNAs encode exactly the same protein, expression levels of the individual mRNAs vary substantially in different human tissues. We hypothesize that these transcripts behave unequally due to a posttranscriptional regulatory program governed by their distinct 5'-UTRs and that this regulation ultimately determines α-1-antitrypsin expression. Using whole-transcript selective 2'-hydroxyl acylation by primer extension (SHAPE) chemical probing, we show that splicing yields distinct local 5'-UTR secondary structures in SERPINA1 transcripts. Splicing in the 5'-UTR also changes the inclusion of long upstream ORFs (uORFs). We demonstrate that disrupting the uORFs results in markedly increased translation efficiencies in luciferase reporter assays. These uORF-dependent changes suggest that α-1-antitrypsin protein expression levels are controlled at the posttranscriptional level. A leaky-scanning model of translation based on Kozak translation initiation sequences alone does not adequately explain our quantitative expression data. However, when we incorporate the experimentally derived RNA structure data, the model accurately predicts translation efficiencies in reporter assays and improves α-1-antitrypsin expression prediction in primary human tissues. Our results reveal that RNA structure governs a complex posttranscriptional regulatory program of α-1-antitrypsin expression. Crucially, these findings describe a mechanism by which genetic alterations in noncoding gene regions may result in α-1-antitrypsin deficiency.


Subject(s)
Alternative Splicing/genetics , Models, Biological , Protein Biosynthesis/genetics , RNA, Messenger/chemistry , alpha 1-Antitrypsin/genetics , 5' Untranslated Regions/genetics , A549 Cells , Base Sequence , Hep G2 Cells , Humans , Mutagenesis , Open Reading Frames/genetics , Pulmonary Disease, Chronic Obstructive/genetics , Quantitative Structure-Activity Relationship , RNA Isoforms/chemistry , RNA Isoforms/genetics , RNA, Messenger/genetics , alpha 1-Antitrypsin Deficiency/genetics
15.
Sci Rep ; 6: 28977, 2016 06 29.
Article in English | MEDLINE | ID: mdl-27353836

ABSTRACT

Transcriptional heterogeneity is extensive in the genome, and most genes express variable transcript isoforms. However, whether variable transcript isoforms of one gene are regulated by common promoter elements remains to be elucidated. Here, we investigated whether isoform promoters of one gene have separate DNA signals for transcription and translation initiation. We found that TATA box and nucleosome-disfavored DNA sequences are prevalent in distinct transcript isoform promoters of one gene. These DNA signals are conserved among species. Each transcript isoform has an RNA-determined unstructured region around its start site. We found that these DNA/RNA features facilitate isoform transcription and translation. These results suggest a DNA-encoded mechanism by which transcript isoforms are generated.


Subject(s)
Promoter Regions, Genetic , RNA Isoforms/genetics , Yeasts/genetics , Base Sequence , Genome, Fungal , Peptide Chain Initiation, Translational , RNA Isoforms/chemistry , TATA Box , Transcription, Genetic
16.
Nucleic Acids Res ; 43(15): e96, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-25953852

ABSTRACT

Most mammalian genes have mRNA variants due to alternative promoter usage, alternative splicing, and alternative cleavage and polyadenylation. Expression of alternative RNA isoforms has been found to be associated with tumorigenesis, proliferation and differentiation. Detection of condition-associated transcription variation requires association methods. Traditional association methods such as Pearson chi-square test and Fisher Exact test are single test methods and do not work on count data with replicates. Although the Cochran Mantel Haenszel (CMH) approach can handle replicated count data, our simulations showed that multiple CMH tests still had very low power. To identify condition-associated variation of transcription, we here proposed a ranking analysis of chi-squares (RAX2) for large-scale association analysis. RAX2 is a nonparametric method and has accurate and conservative estimation of FDR profile. Simulations demonstrated that RAX2 performs well in finding condition-associated transcription variants. We applied RAX2 to primary T-cell transcriptomic data and identified 1610 (16.3%) tags associated in transcription with immune stimulation at FDR < 0.05. Most of these tags also had differential expression. Analysis of two and three tags within genes revealed that under immune stimulation short RNA isoforms were preferably used.


Subject(s)
Alternative Splicing , Gene Expression Profiling/methods , Polyadenylation , CD4-Positive T-Lymphocytes/metabolism , Cell Line , Chi-Square Distribution , Genetic Variation , Genomics/methods , Humans , RNA Isoforms/chemistry , RNA Isoforms/metabolism , Statistics, Nonparametric , Transcription, Genetic
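
Illustrative sketch (not from the paper): the abstract above names the Pearson chi-square test as a traditional single-test association method. The Python below computes that statistic for a contingency table of tag counts (rows: tags or isoforms, columns: conditions). It shows only the classical statistic, not the RAX2 ranking procedure, whose details the abstract does not give; the function name and the example counts are hypothetical.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts.

    table -- list of rows, each a list of non-negative counts.
    Expected counts come from the product of row and column totals
    divided by the grand total (the independence model).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:  # skip empty margins to avoid division by zero
                statistic += (observed - expected) ** 2 / expected
    return statistic


# Two tags of one gene counted in two conditions (made-up numbers).
stat = pearson_chi_square([[10, 20], [20, 10]])
```

For a 2 x 2 table the statistic has one degree of freedom; converting it to a p-value, and handling replicates as the abstract discusses, is outside this sketch.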
17.
Bioinformatics ; 31(14): 2400-2, 2015 Jul 15.
Article in English | MEDLINE | ID: mdl-25617416

ABSTRACT

MOTIVATION: Analysis of RNA sequencing (RNA-Seq) data revealed that the vast majority of human genes express multiple mRNA isoforms, produced by alternative pre-mRNA splicing and other mechanisms, and that most alternative isoforms vary in expression between human tissues. As RNA-Seq datasets grow in size, it remains challenging to visualize isoform expression across multiple samples. RESULTS: To help address this problem, we present Sashimi plots, a quantitative visualization of aligned RNA-Seq reads that enables quantitative comparison of exon usage across samples or experimental conditions. Sashimi plots can be made using the Broad Integrated Genome Viewer or with a stand-alone command line program. AVAILABILITY AND IMPLEMENTATION: Software code and documentation freely available here: http://miso.readthedocs.org/en/fastmiso/sashimi.html


Subject(s)
Alternative Splicing , Exons , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Computer Graphics , Humans , RNA Isoforms/chemistry , RNA Isoforms/metabolism , Sequence Alignment
18.
Nucleic Acids Res ; 43(1): e1, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25056322

ABSTRACT

The preparation and high-throughput sequencing of cDNA libraries from samples of small RNA is a powerful tool to quantify known small RNAs (such as microRNAs) and to discover novel RNA species. Interest in identifying the small RNA repertoire present in tissues and in biofluids has grown substantially with the findings that small RNAs can serve as indicators of biological conditions and disease states. Here we describe a novel and straightforward method to clone cDNA libraries from small quantities of input RNA. This method permits the generation of cDNA libraries from sub-picogram quantities of RNA robustly, efficiently and reproducibly. We demonstrate that the method provides a significant improvement in sensitivity compared to previous cloning methods while maintaining reproducible identification of diverse small RNA species. This method should have widespread applications in a variety of contexts, including biomarker discovery from scarce samples of human tissue or body fluids.


Subject(s)
Cloning, Molecular/methods , Gene Library , MicroRNAs/blood , Biotinylation , DNA, Complementary/chemistry , DNA, Complementary/isolation & purification , Deoxyuracil Nucleotides , High-Throughput Nucleotide Sequencing , Humans , MicroRNAs/chemistry , MicroRNAs/isolation & purification , RNA Isoforms/chemistry , Reverse Transcriptase Polymerase Chain Reaction
19.
BMC Bioinformatics ; 15: 135, 2014 May 09.
Article in English | MEDLINE | ID: mdl-24885830

ABSTRACT

BACKGROUND: The main goal of the whole transcriptome analysis is to correctly identify all expressed transcripts within a specific cell/tissue--at a particular stage and condition--to determine their structures and to measure their abundances. RNA-seq data promise to allow identification and quantification of transcriptome at unprecedented level of resolution, accuracy and low cost. Several computational methods have been proposed to achieve such purposes. However, it is still not clear which promises are already met and which challenges are still open and require further methodological developments. RESULTS: We carried out a simulation study to assess the performance of 5 widely used tools, such as: CEM, Cufflinks, iReckon, RSEM, and SLIDE. All of them have been used with default parameters. In particular, we considered the effect of the following three different scenarios: the availability of complete annotation, incomplete annotation, and no annotation at all. Moreover, comparisons were carried out using the methods in three different modes of action. In the first mode, the methods were forced to only deal with those isoforms that are present in the annotation; in the second mode, they were allowed to detect novel isoforms using the annotation as guide; in the third mode, they were operating in fully data driven way (although with the support of the alignment on the reference genome). In the latter modality, precision and recall are quite poor. On the contrary, results are better with the support of the annotation, even though it is not complete. Finally, abundance estimation error often shows a very skewed distribution. The performance strongly depends on the true real abundance of the isoforms. Lowly (and sometimes also moderately) expressed isoforms are poorly detected and estimated. In particular, lowly expressed isoforms are identified mainly if they are provided in the original annotation as potential isoforms. 
CONCLUSIONS: Both detection and quantification of all isoforms from RNA-seq data are still hard problems and they are affected by many factors. Overall, the performance significantly changes since it depends on the modes of action and on the type of available annotation. Results obtained using complete or partial annotation are able to detect most of the expressed isoforms, even though the number of false positives is often high. Fully data driven approaches require more attention, at least for complex eucaryotic genomes. Improvements are desirable especially for isoform quantification and for isoform detection with low abundance.


Subject(s)
RNA Isoforms/analysis , Software , Algorithms , Gene Expression Profiling , Humans , RNA Isoforms/chemistry , RNA Isoforms/metabolism , Sequence Analysis, RNA/methods
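
Illustrative sketch (not from the paper): the abstract above evaluates isoform detection by precision and recall against simulated ground truth. The Python below shows the standard definitions applied to sets of transcript identifiers. The function name and the identifiers are hypothetical, and exact identifier matching is an assumption; the study may have matched isoforms by structure instead.

```python
def precision_recall(predicted, truth):
    """Precision and recall of predicted isoform IDs against true ones.

    precision = true positives / number predicted
    recall    = true positives / number truly expressed
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Four isoforms called, three truly expressed, two in common.
precision, recall = precision_recall({"t1", "t2", "t3", "t4"}, {"t1", "t2", "t5"})
```

A tool that reports many false positives, as the abstract notes for annotation-guided modes, would show high recall but reduced precision under these definitions.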
20.
Bioinformatics ; 30(17): 2447-55, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-24813214

ABSTRACT

MOTIVATION: Several state-of-the-art methods for isoform identification and quantification are based on ℓ1-regularized regression, such as the Lasso. However, explicitly listing the (possibly exponentially large) set of candidate transcripts is intractable for genes with many exons. For this reason, existing approaches using the ℓ1-penalty are either restricted to genes with few exons or only run the regression algorithm on a small set of preselected isoforms. RESULTS: We introduce a new technique called FlipFlop, which can efficiently tackle the sparse estimation problem on the full set of candidate isoforms by using network flow optimization. Our technique removes the need for a preselection step, leading to better isoform identification while keeping a low computational cost. Experiments with synthetic and real RNA-Seq data confirm that our approach is more accurate than alternative methods and one of the fastest available. AVAILABILITY AND IMPLEMENTATION: Source code is freely available as an R package from the Bioconductor Web site (http://www.bioconductor.org/), and more information is available at http://cbio.ensmp.fr/flipflop. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling/methods , RNA Isoforms/chemistry , Sequence Analysis, RNA/methods , Algorithms , Exons , Humans , Models, Statistical , RNA Isoforms/metabolism , Software
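
Illustrative sketch (not from the paper): the abstract above refers to Lasso-type regression over an explicit list of candidate isoforms, which is the baseline FlipFlop improves on. The Python below is a plain coordinate-descent Lasso minimising 0.5 * ||y - X b||^2 + lam * ||b||_1, where each column of X would describe one candidate isoform and y the observed read counts. It is not the FlipFlop network-flow algorithm, and the function names and toy data are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam (the Lasso proximal step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Coordinate-descent Lasso on a small dense design matrix.

    X -- list of rows (observations) by columns (candidate isoforms)
    y -- list of observed values, one per row of X
    Returns the coefficient list; zeros mark isoforms dropped by the penalty.
    """
    n_rows, n_cols = len(X), len(X[0])
    beta = [0.0] * n_cols
    for _ in range(n_iter):
        for j in range(n_cols):
            rho = 0.0  # correlation of column j with the partial residual
            norm_sq = 0.0
            for i in range(n_rows):
                fit_without_j = sum(
                    X[i][k] * beta[k] for k in range(n_cols) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                norm_sq += X[i][j] ** 2
            beta[j] = soft_threshold(rho, lam) / norm_sq if norm_sq > 0 else 0.0
    return beta


# Toy case: two non-overlapping candidates, one weakly supported.
beta = lasso_coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0)
```

With this toy input the weakly supported candidate is shrunk to exactly zero, which is the sparsity effect the abstract relies on; the cost of enumerating every candidate column is what makes this direct approach intractable for genes with many exons.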