1.
Nat Methods ; 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38849569

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. Using different protocols and sequencing platforms, the consortium generated over 427 million long-read sequences from complementary DNA and direct RNA datasets, encompassing human, mouse and manatee species. Developers utilized these data to address challenges in transcript isoform detection, quantification and de novo transcript detection. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. Incorporating additional orthogonal data and replicate samples is advised when aiming to detect rare and novel transcripts or using reference-free approaches. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.

3.
Nat Methods ; 16(12): 1297-1305, 2019 12.
Article in English | MEDLINE | ID: mdl-31740818

ABSTRACT

High-throughput complementary DNA sequencing technologies have advanced our understanding of transcriptome complexity and regulation. However, these methods lose information contained in biological RNA because the copied reads are often short and modifications are not retained. We address these limitations using a native poly(A) RNA sequencing strategy developed by Oxford Nanopore Technologies. Our study generated 9.9 million aligned sequence reads for the human cell line GM12878, using thirty MinION flow cells at six institutions. These native RNA reads had a median length of 771 bases, and a maximum aligned length of over 21,000 bases. Mitochondrial poly(A) reads provided an internal measure of read-length quality. We combined these long nanopore reads with higher accuracy short-reads and annotated GM12878 promoter regions to identify 33,984 plausible RNA isoforms. We describe strategies for assessing 3' poly(A) tail length, base modifications and transcript haplotypes.


Subject(s)
Nanopore Sequencing/methods , Poly A/genetics , Sequence Analysis, RNA/methods , Transcriptome , Cells, Cultured , Humans
4.
Mol Biol Evol ; 33(12): 3308-3313, 2016 12.
Article in English | MEDLINE | ID: mdl-27687565

ABSTRACT

The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can be addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations.


Subject(s)
Drosophila melanogaster/genetics , Genome, Insect , Animals , Databases, Nucleic Acid , Gene Frequency , Genetic Variation , Reference Standards , Selection, Genetic
5.
Genome Biol ; 25(1): 173, 2024 07 02.
Article in English | MEDLINE | ID: mdl-38956576

ABSTRACT

BACKGROUND: RNA-seq has brought forth significant discoveries regarding aberrations in RNA processing, implicating these RNA variants in a variety of diseases. Aberrant splicing and single nucleotide variants (SNVs) in RNA have been demonstrated to alter transcript stability, localization, and function. In particular, the upregulation of ADAR, an enzyme that mediates adenosine-to-inosine editing, has been previously linked to an increase in the invasiveness of lung adenocarcinoma cells and associated with splicing regulation. Despite the functional importance of studying splicing and SNVs, the use of short-read RNA-seq has limited the community's ability to interrogate both forms of RNA variation simultaneously. RESULTS: We employ long-read sequencing technology to obtain full-length transcript sequences, elucidating cis-effects of variants on splicing changes at a single molecule level. We develop a computational workflow that augments FLAIR, a tool that calls isoform models expressed in long-read data, to integrate RNA variant calls with the associated isoforms that bear them. We generate nanopore data with high sequence accuracy from H1975 lung adenocarcinoma cells with and without knockdown of ADAR. We apply our workflow to identify key inosine isoform associations to help clarify the prominence of ADAR in tumorigenesis. CONCLUSIONS: Ultimately, we find that a long-read approach provides valuable insight toward characterizing the relationship between RNA variants and splicing patterns.


Subject(s)
Haplotypes , Humans , Cell Line, Tumor , Polymorphism, Single Nucleotide , Adenosine Deaminase/genetics , Adenosine Deaminase/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Lung Neoplasms/genetics , RNA Splicing , Inosine/metabolism , Inosine/genetics , Sequence Analysis, RNA , Adenocarcinoma of Lung/genetics , RNA Editing , Software
6.
bioRxiv ; 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37398362

ABSTRACT

Background: RNA-Seq has brought forth significant discoveries regarding aberrations in RNA processing, implicating these RNA variants in a variety of diseases. Aberrant splicing and single nucleotide variants in RNA have been demonstrated to alter transcript stability, localization, and function. In particular, the upregulation of ADAR, an enzyme which mediates adenosine-to-inosine editing, has been previously linked to an increase in the invasiveness of lung ADC cells and associated with splicing regulation. Despite the functional importance of studying splicing and SNVs, short read RNA-Seq has limited the community's ability to interrogate both forms of RNA variation simultaneously. Results: We employed long-read technology to obtain full-length transcript sequences, elucidating cis-effects of variants on splicing changes at a single molecule level. We have developed a computational workflow that augments FLAIR, a tool that calls isoform models expressed in long-read data, to integrate RNA variant calls with the associated isoforms that bear them. We generated nanopore data with high sequence accuracy of H1975 lung adenocarcinoma cells with and without knockdown of ADAR. We applied our workflow to identify key inosine-isoform associations to help clarify the prominence of ADAR in tumorigenesis. Conclusions: Ultimately, we find that a long-read approach provides valuable insight toward characterizing the relationship between RNA variants and splicing patterns.

7.
Life Sci Alliance ; 6(10)2023 10.
Article in English | MEDLINE | ID: mdl-37487637

ABSTRACT

U2AF1 is one of the most recurrently mutated splicing factors in lung adenocarcinoma and has been shown to cause transcriptome-wide pre-mRNA splicing alterations; however, the full-length altered mRNA isoforms associated with the mutation are largely unknown. To better understand the impact U2AF1 has on full-length isoform fate and function, we conducted high-throughput long-read cDNA sequencing from isogenic human bronchial epithelial cells with and without a U2AF1 S34F mutation. We identified 49,366 multi-exon transcript isoforms, more than half of which did not match GENCODE or short-read-assembled isoforms. We found 198 transcript isoforms with significant expression and usage changes relative to WT, only 68% of which were assembled by short reads. Expression of isoforms from immune-related genes is largely down-regulated in mutant cells and without observed splicing changes. Finally, we reveal that isoforms likely targeted by nonsense-mediated decay are down-regulated in U2AF1 S34F cells, suggesting that isoform changes may alter the translational output of those affected genes. Altogether, our work provides a resource of full-length isoforms associated with U2AF1 S34F in lung cells.


Subject(s)
Epithelial Cells , RNA Splicing , Humans , Splicing Factor U2AF/genetics , Splicing Factor U2AF/metabolism , RNA Splicing/genetics , Protein Isoforms/genetics , Protein Isoforms/metabolism , Epithelial Cells/metabolism , Mutation/genetics
8.
bioRxiv ; 2023 Jul 27.
Article in English | MEDLINE | ID: mdl-37546854

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.

9.
Dev Cell ; 57(5): 624-637.e4, 2022 03 14.
Article in English | MEDLINE | ID: mdl-35202586

ABSTRACT

Alternative splicing generates distinct mRNA variants and is essential for development, homeostasis, and renewal. Proteins of the serine/arginine (SR)-rich splicing factor family are major splicing regulators that are broadly required for organ development as well as cell and organism viability. However, how these proteins support adult organ function remains largely unknown. Here, we used the continuously growing mouse incisor as a model to dissect the functions of the prototypical SR family protein SRSF1 during tissue homeostasis and renewal. We identified an SRSF1-governed alternative splicing network that is specifically required for dental proliferation and survival of progenitors but dispensable for the viability of differentiated cells. We also observed a similar progenitor-specific role of SRSF1 in the small intestinal epithelium, indicating a conserved function of SRSF1 across adult epithelial tissues. Thus, our findings define a regulatory mechanism by which SRSF1 specifically controls progenitor-specific alternative splicing events to support adult tissue homeostasis and renewal.


Subject(s)
Alternative Splicing , RNA Splicing , Alternative Splicing/genetics , Animals , Epithelium/metabolism , Homeostasis , Mice , Serine-Arginine Splicing Factors/genetics , Serine-Arginine Splicing Factors/metabolism
10.
Nat Commun ; 11(1): 1438, 2020 03 18.
Article in English | MEDLINE | ID: mdl-32188845

ABSTRACT

While splicing changes caused by somatic mutations in SF3B1 are known, identifying full-length isoform changes may better elucidate the functional consequences of these mutations. We report nanopore sequencing of full-length cDNA from CLL samples with and without SF3B1 mutation, as well as normal B cell samples, giving a total of 149 million pass reads. We present FLAIR (Full-Length Alternative Isoform analysis of RNA), a computational workflow to identify high-confidence transcripts, perform differential splicing event analysis, and differential isoform analysis. Using nanopore reads, we demonstrate differential 3' splice site changes associated with SF3B1 mutation, agreeing with previous studies. We also observe a strong downregulation of intron retention events associated with SF3B1 mutation. Full-length transcript analysis links multiple alternative splicing events together and allows for better estimates of the abundance of productive versus unproductive isoforms. Our work demonstrates the potential utility of nanopore sequencing for cancer and splicing research.


Subject(s)
Down-Regulation/genetics , Introns/genetics , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Mutation/genetics , Phosphoproteins/genetics , RNA Splicing Factors/genetics , Adult , Alternative Splicing/genetics , Base Sequence , Humans , Nanopore Sequencing , Protein Isoforms/genetics , RNA Splice Sites/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism