Results 1 - 20 of 36
1.
PLoS Biol ; 21(6): e3002133, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37390046

ABSTRACT

Characterizing cellular diversity at different levels of biological organization and across data modalities is a prerequisite to understanding the function of cell types in the brain. Classification of neurons is also essential to manipulate cell types in controlled ways and to understand their variation and vulnerability in brain disorders. The BRAIN Initiative Cell Census Network (BICCN) is an integrated network of data-generating centers, data archives, and data standards developers, with the goal of systematic multimodal brain cell type profiling and characterization. Emphasis of the BICCN is on the whole mouse brain with demonstration of prototype feasibility for human and nonhuman primate (NHP) brains. Here, we provide a guide to the cellular and spatial approaches employed by the BICCN, and to accessing and using these data and extensive resources, including the BRAIN Cell Data Center (BCDC), which serves to manage and integrate data across the ecosystem. We illustrate the power of the BICCN data ecosystem through vignettes highlighting several BICCN analysis and visualization tools. Finally, we present emerging standards that have been developed or adopted toward Findable, Accessible, Interoperable, and Reusable (FAIR) neuroscience. The combined BICCN ecosystem provides a comprehensive resource for the exploration and analysis of cell types in the brain.


Subject(s)
Brain; Neurosciences; Animals; Humans; Mice; Ecosystem; Neurons
2.
Genome Res ; 32(4): 726-737, 2022 04.
Article in English | MEDLINE | ID: mdl-35301264

ABSTRACT

Long-read transcriptomics require understanding error sources inherent to technologies. Current approaches cannot compare methods for an individual RNA molecule. Here, we present a novel platform-comparison method that combines barcoding strategies and long-read sequencing to sequence cDNA copies representing an individual RNA molecule on both Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). We compare these long-read pairs in terms of sequence content and isoform patterns. Although individual read pairs show high similarity, we find differences in (1) aligned length, (2) transcription start site (TSS), (3) polyadenylation site (poly(A)-site) assignment, and (4) exon-intron structures. Overall, 25% of read pairs disagree on either TSS, poly(A)-site, or splice site. Intron-chain disagreement typically arises from alignment errors of microexons and complicated splice sites. Our single-molecule technology comparison reveals that inconsistencies are often caused by sequencing error-induced inaccurate ONT alignments, especially to downstream GUNNGU donor motifs. However, annotation-disagreeing upstream shifts in NAGNAG acceptors in ONT are often confirmed by PacBio and are thus likely real. In both barcoded and nonbarcoded ONT reads, we find that intron number and proximity of GU/AGs better predict inconsistencies with the annotation than read quality alone. We summarize these findings in an annotation-based algorithm for spliced alignment correction that improves subsequent transcript construction with ONT reads.


Subject(s)
Nanopores; DNA, Complementary; High-Throughput Nucleotide Sequencing/methods; RNA; Sequence Analysis, DNA/methods; Technology
3.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38262343

ABSTRACT

MOTIVATION: Recent advancements in long-read RNA sequencing have enabled the examination of full-length isoforms, previously uncaptured by short-read sequencing methods. An alternative powerful method for studying isoforms is through the use of barcoded short-read RNA reads, for which a barcode indicates whether two short reads arise from the same molecule or not. Such techniques include the 10x Genomics linked-read based SParse Isoform Sequencing (SPIso-seq), as well as Loop-Seq and Tell-Seq. Some applications, such as novel-isoform discovery, require very high coverage. Obtaining high coverage using long reads can be difficult, making barcoded RNA-seq data a valuable alternative for this task. However, most annotation pipelines cannot work with a set of short reads in place of a single transcript, nor can they handle coverage gaps within a molecule. To overcome this challenge, we present an RNA-seq assembler that allows the determination of the expressed isoform per barcode. RESULTS: In this article, we present cloudrnaSPAdes, a tool for assembling full-length isoforms from barcoded RNA-seq linked-read data in a reference-free fashion. Evaluating it on simulated and real human data, we found that cloudrnaSPAdes accurately assembles isoforms, even for genes with high isoform diversity. AVAILABILITY AND IMPLEMENTATION: cloudrnaSPAdes is a feature release of the SPAdes assembler, and the version used for this article is available at https://github.com/1dayac/cloudrnaSPAdes-release.


Subject(s)
Genomics; RNA; Humans; RNA/genetics; Sequence Analysis, RNA/methods; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA-Seq; Genomics/methods; High-Throughput Nucleotide Sequencing; Transcriptome
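
The abstract of entry 3 relies on a barcode to tell whether two short reads come from the same molecule. A minimal sketch of that grouping step is given here, assuming reads arrive as (barcode, sequence) pairs; the function name and the toy reads are hypothetical and this is not cloudrnaSPAdes code, only an illustration of collecting reads per barcode before any assembly.

```python
from collections import defaultdict

def group_reads_by_barcode(reads):
    """Collect read sequences under their barcode (hypothetical helper).

    `reads` is an iterable of (barcode, sequence) tuples. Reads that share
    a barcode are assumed to derive from the same molecule or partition.
    """
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)
    return dict(groups)

# Toy input: two reads share barcode BC01, one read carries BC02.
reads = [("BC01", "ACGT"), ("BC02", "GGCA"), ("BC01", "TTAG")]
clouds = group_reads_by_barcode(reads)
```

Each value in `clouds` is the set of short reads an assembler would then try to merge into one isoform, which is why gaps in coverage within a barcode matter.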
4.
Bioinformatics ; 38(13): 3474-3476, 2022 06 27.
Article in English | MEDLINE | ID: mdl-35604081

ABSTRACT

SUMMARY: RNA isoforms contribute to the diverse functionality of the proteins they encode within the cell. Visualizing how isoform expression differs across cell types and brain regions can inform our understanding of disease and gain or loss of functionality caused by alternative splicing with potential negative impacts. However, the extent to which this occurs in specific cell types and brain regions is largely unknown. This is the kind of information that ScisorWiz plots can provide in an informative and easily communicable manner. ScisorWiz affords its user the opportunity to visualize specific genes across any number of cell types, and provides various sorting options for the user to gain different ways to understand their data. ScisorWiz provides a clear picture of differential isoform expression through various clustering methods and highlights features such as alternative exons and single-nucleotide variants. Tools like ScisorWiz are key for interpreting single-cell isoform sequencing data. This tool applies to any single-cell long-read RNA sequencing data in any cell type, tissue or species. AVAILABILITY AND IMPLEMENTATION: Source code is available at http://github.com/ans4013/ScisorWiz. No new data were generated for this publication. Data used to generate figures was sourced from GEO accession token GSE158450 and available on GitHub as example data.


Subject(s)
Alternative Splicing; Software; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA Isoforms/metabolism; Exons; Sequence Analysis, RNA
5.
Mol Psychiatry ; 27(3): 1416-1434, 2022 03.
Article in English | MEDLINE | ID: mdl-34789849

ABSTRACT

Due to an inability to ethically access developing human brain tissue as well as identify prospective cases, early-arising neurodevelopmental and cell-specific signatures of Schizophrenia (Scz) have remained unknown and thus undefined. To overcome these challenges, we utilized patient-derived induced pluripotent stem cells (iPSCs) to generate 3D cerebral organoids to model neuropathology of Scz during this critical period. We discovered that Scz organoids exhibited ventricular neuropathology resulting in altered progenitor survival and disrupted neurogenesis. This ultimately yielded fewer neurons within developing cortical fields of Scz organoids. Single-cell sequencing revealed that Scz progenitors were specifically depleted of neuronal programming factors leading to a remodeling of cell-lineages, altered differentiation trajectories, and distorted cortical cell-type diversity. While Scz organoids were similar in their macromolecular diversity to organoids generated from healthy controls (Ctrls), four GWAS factors (PTN, COMT, PLCL1, and PODXL) and peptide fragments belonging to the POU-domain transcription factor family (e.g., POU3F2/BRN2) were altered. This revealed that Scz organoids principally differed not in their proteomic diversity, but specifically in their total quantity of disease and neurodevelopmental factors at the molecular level. Single-cell sequencing subsequently identified cell-type specific alterations in neuronal programming factors as well as a developmental switch in neurotrophic growth factor expression, indicating that Scz neuropathology can be encoded on a cell-type-by-cell-type basis. Furthermore, single-cell sequencing also specifically replicated the depletion of BRN2 (POU3F2) and PTN in both Scz progenitors and neurons. 
Subsequently, in two mechanistic rescue experiments we identified that the transcription factor BRN2 and growth factor PTN operate as mechanistic substrates of neurogenesis and cellular survival, respectively, in Scz organoids. Collectively, our work suggests that multiple mechanisms of Scz exist in patient-derived organoids, and that these disparate mechanisms converge upon primordial brain developmental pathways such as neuronal differentiation, survival, and growth factor support, which may amalgamate to elevate intrinsic risk of Scz.


Subject(s)
Induced Pluripotent Stem Cells; Schizophrenia; Humans; Organoids/metabolism; Proteomics; Schizophrenia/metabolism; Transcription Factors/metabolism
6.
Genome Res ; 28(2): 231-242, 2018 02.
Article in English | MEDLINE | ID: mdl-29196558

ABSTRACT

Understanding transcriptome complexity is crucial for understanding human biology and disease. Technologies such as synthetic long-read RNA sequencing (SLR-RNA-seq) delivered 5 million isoforms and allowed assessing splicing coordination. Pacific Biosciences and Oxford Nanopore also increase throughput but require high input amounts or amplification. Our new droplet-based method, sparse isoform sequencing (spISO-seq), sequences 100k-200k partitions of 10-200 molecules at a time, enabling analysis of 10-100 million RNA molecules. SpISO-seq requires less than 1 ng of input cDNA, limiting or removing the need for prior amplification with its associated biases. Adjusting the number of reads devoted to each molecule reduces sequencing lanes and cost, with little loss in detection power. The increased number of molecules expands our understanding of isoform complexity. In addition to confirming our previously published cases of splicing coordination (e.g., BIN1), the greater depth reveals many new cases, such as MAPT. Coordination of internal exons is found to be extensive among protein coding genes: 23.5%-59.3% (95% confidence interval) of highly expressed genes with distant alternative exons exhibit coordination, showcasing the need for long-read transcriptomics. However, coordination is less frequent for noncoding sequences, suggesting a larger role of splicing coordination in shaping proteins. Groups of genes with coordination are involved in protein-protein interactions with each other, raising the possibility that coordination facilitates complex formation and/or function. We also find new splicing coordination types, involving initial and terminal exons. Our results provide a more comprehensive understanding of the human transcriptome and a general, cost-effective method to analyze it.


Subject(s)
Alternative Splicing/genetics; Microfluidics/methods; RNA Splicing/genetics; Transcriptome/genetics; Computational Biology; Gene Expression Regulation; High-Throughput Nucleotide Sequencing/methods; Humans; Molecular Sequence Annotation; Protein Isoforms/genetics; Sequence Analysis, RNA
8.
Genome Res ; 25(11): 1610-21, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26297486

ABSTRACT

Elucidating the consequences of genetic differences between humans is essential for understanding phenotypic diversity and personalized medicine. Although variation in RNA levels, transcription factor binding, and chromatin have been explored, little is known about global variation in translation and its genetic determinants. We used ribosome profiling, RNA sequencing, and mass spectrometry to perform an integrated analysis in lymphoblastoid cell lines from a diverse group of individuals. We find significant differences in RNA, translation, and protein levels suggesting diverse mechanisms of personalized gene expression control. Combined analysis of RNA expression and ribosome occupancy improves the identification of individual protein level differences. Finally, we identify genetic differences that specifically modulate ribosome occupancy--many of these differences lie close to start codons and upstream ORFs. Our results reveal a new level of gene expression variation among humans and indicate that genetic variants can cause changes in protein levels through effects on translation.


Subject(s)
Polymorphism, Single Nucleotide; Protein Biosynthesis; RNA/metabolism; Chromatin/genetics; Chromatin/metabolism; Gene Expression Profiling; Gene Expression Regulation; Humans; Proteomics; Quantitative Trait Loci; RNA, Messenger/genetics; RNA, Messenger/metabolism; Ribosomes/genetics; Ribosomes/metabolism; Sequence Alignment; Sequence Analysis, RNA
9.
RNA ; 21(6): 1187-202, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25904137

ABSTRACT

The OLR1 gene encodes the oxidized low-density lipoprotein receptor (LOX-1), which is responsible for the cellular uptake of oxidized LDL (Ox-LDL), foam cell formation in atheroma plaques and atherosclerotic plaque rupture. Alternative splicing (AS) of OLR1 exon 5 generates two protein isoforms with antagonistic functions in Ox-LDL uptake. Previous work identified six single nucleotide polymorphisms (SNPs) in linkage disequilibrium that influence the inclusion levels of OLR1 exon 5 and correlate with the risk of cardiovascular disease. Here we use minigenes to recapitulate the effects of two allelic series (Low- and High-Risk) on OLR1 AS and identify one SNP in intron 4 (rs3736234) as the main contributor to the differences in exon 5 inclusion, while the other SNPs in the allelic series attenuate the drastic effects of this key SNP. Bioinformatic, proteomic, mutational and functional high-throughput analyses allowed us to define regulatory sequence motifs and identify SR protein family members (SRSF1, SRSF2) and HMGA1 as factors involved in the regulation of OLR1 AS. Our results suggest that antagonism between SRSF1 and SRSF2/HMGA1, and differential recognition of their regulatory motifs depending on the identity of the rs3736234 polymorphism, influence OLR1 exon 5 inclusion and the efficiency of Ox-LDL uptake, with potential implications for atherosclerosis and coronary disease.


Subject(s)
Alternative Splicing; HMGA1a Protein/metabolism; Lipoproteins, LDL/metabolism; Nuclear Proteins/metabolism; RNA-Binding Proteins/metabolism; Ribonucleoproteins/metabolism; Scavenger Receptors, Class E/genetics; Computational Biology/methods; Coronary Disease/genetics; Coronary Disease/metabolism; Genetic Predisposition to Disease; HMGA1a Protein/genetics; Humans; Introns; Linkage Disequilibrium; Nuclear Proteins/genetics; Polymorphism, Single Nucleotide; RNA-Binding Proteins/genetics; Regulatory Sequences, Ribonucleic Acid; Ribonucleoproteins/genetics; Scavenger Receptors, Class E/metabolism; Serine-Arginine Splicing Factors
10.
Proc Natl Acad Sci U S A ; 111(27): 9869-74, 2014 Jul 08.
Article in English | MEDLINE | ID: mdl-24961374

ABSTRACT

Personal transcriptomes in which all of an individual's genetic variants (e.g., single nucleotide variants) and transcript isoforms (transcription start sites, splice sites, and polyA sites) are defined and quantified for full-length transcripts are expected to be important for understanding individual biology and disease, but have not been described previously. To obtain such transcriptomes, we sequenced the lymphoblastoid transcriptomes of three family members (GM12878 and the parents GM12891 and GM12892) by using a Pacific Biosciences long-read approach complemented with Illumina 101-bp sequencing and made the following observations. First, we found that reads representing all splice sites of a transcript are evident for most sufficiently expressed genes ≤3 kb and often for genes longer than that. Second, we added and quantified previously unidentified splicing isoforms to an existing annotation, thus creating the first personalized annotation to our knowledge. Third, we determined SNVs in a de novo manner and connected them to RNA haplotypes, including HLA haplotypes, thereby assigning single full-length RNA molecules to their transcribed allele, and demonstrated Mendelian inheritance of RNA molecules. Fourth, we show how RNA molecules can be linked to personal variants on a one-by-one basis, which allows us to assess differential allelic expression (DAE) and differential allelic isoforms (DAI) from the phased full-length isoform reads. The DAI method is largely independent of the distance between exon and SNV--in contrast to fragmentation-based methods. Overall, in addition to improving eukaryotic transcriptome annotation, these results describe, to our knowledge, the first large-scale and full-length personal transcriptome.


Subject(s)
Alleles; Transcriptome; Gene Expression; Haplotypes; Humans; Polymorphism, Single Nucleotide; RNA/genetics
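
Entry 10 describes assessing differential allelic expression (DAE) once each full-length read has been assigned to its transcribed allele. The abstract does not say which statistic is used, so the following is only a sketch under an assumption: that per-gene read counts for the two alleles are compared against a 50:50 expectation with an exact two-sided binomial test. The function name and the counts are hypothetical.

```python
from math import comb

def allelic_binomial_p(maternal: int, paternal: int) -> float:
    """Exact two-sided binomial p-value against a 50:50 allelic ratio.

    Assumed test, not taken from the paper. With p = 0.5 every outcome k
    has probability comb(n, k) / 2**n, so outcomes can be ranked by
    comb(n, k) using exact integer arithmetic.
    """
    n = maternal + paternal
    if n == 0:
        return 1.0  # no phased reads, no evidence of imbalance
    observed = comb(n, maternal)
    # Sum all outcomes that are no more likely than the observed one.
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return tail / 2 ** n

# Hypothetical gene with 9 reads phased to one allele and 1 to the other.
p_skewed = allelic_binomial_p(9, 1)
# Hypothetical gene with balanced allelic reads.
p_balanced = allelic_binomial_p(5, 5)
```

In practice such p-values would need multiple-testing correction across genes, which is omitted here.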
11.
Genome Res ; 22(9): 1616-25, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955974

ABSTRACT

Splicing remains an incompletely understood process. Recent findings suggest that chromatin structure participates in its regulation. Here, we analyze the RNA from subcellular fractions obtained through RNA-seq in the cell line K562. We show that in the human genome, splicing occurs predominantly during transcription. We introduce the coSI measure, based on RNA-seq reads mapping to exon junctions and borders, to assess the degree of splicing completion around internal exons. We show that, as expected, splicing is almost fully completed in cytosolic polyA+ RNA. In chromatin-associated RNA (which includes the RNA that is being transcribed), for 5.6% of exons, the removal of the surrounding introns is fully completed, compared with 0.3% of exons for which no intron-removal has occurred. The remaining exons exist as a mixture of spliced and fewer unspliced molecules, with a median coSI of 0.75. Thus, most RNAs undergo splicing while being transcribed: "co-transcriptional splicing." Consistent with co-transcriptional spliceosome assembly and splicing, we have found significant enrichment of spliceosomal snRNAs in chromatin-associated RNA compared with other cellular RNA fractions and other nonspliceosomal snRNAs. CoSI scores decrease along the gene, pointing to a "first transcribed, first spliced" rule, yet more downstream exons carry other characteristics, favoring rapid, co-transcriptional intron removal. Exons with low coSI values, that is, in the process of being spliced, are enriched with chromatin marks, consistent with a role for chromatin in splicing during transcription. For alternative exons and long noncoding RNAs, splicing tends to occur later, and the latter might remain unspliced in some cases.


Subject(s)
Genome, Human; RNA Splicing; RNA, Long Noncoding/metabolism; Transcription, Genetic; Chromatin/metabolism; Cluster Analysis; Computational Biology/methods; Exons; High-Throughput Nucleotide Sequencing; Humans; RNA/genetics; RNA/metabolism; Sequence Analysis, RNA; Spliceosomes/genetics; Spliceosomes/metabolism; Subcellular Fractions/chemistry
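
Entry 11 introduces the coSI measure, based on RNA-seq reads mapping to exon junctions and exon borders, to score how far splicing has progressed around an internal exon. The abstract does not give the formula, so the sketch below assumes a simplified ratio: reads spanning exon-exon junctions divided by those reads plus reads spanning the exon-intron borders. The function name and this exact weighting are assumptions; the published definition may treat read classes differently.

```python
from typing import Optional

def cosi(junction_reads: int, border_reads: int) -> Optional[float]:
    """Simplified splicing-completion score for one internal exon.

    junction_reads: reads spanning exon-exon junctions (spliced evidence).
    border_reads:   reads spanning exon-intron borders (unspliced evidence).
    Returns None when no informative reads cover the exon.
    """
    total = junction_reads + border_reads
    if total == 0:
        return None
    return junction_reads / total

# Hypothetical exon with 75 spliced and 25 unspliced informative reads.
score = cosi(75, 25)
```

Under this simplification a score of 1 means only spliced evidence was seen and 0 means only unspliced evidence, matching the abstract's description of fully completed versus not yet started intron removal.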
12.
Genome Res ; 22(9): 1775-89, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955988

ABSTRACT

The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to those of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences, particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally lower expressed than protein-coding genes, and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.


Subject(s)
Databases, Genetic; RNA, Long Noncoding/genetics; Alternative Splicing; Animals; Cell Nucleus/genetics; Cell Nucleus/metabolism; Cluster Analysis; Evolution, Molecular; Exons; Gene Expression Profiling; Gene Expression Regulation; Histones/metabolism; Humans; Molecular Sequence Annotation; Open Reading Frames; Organ Specificity/genetics; Primates/genetics; RNA Processing, Post-Transcriptional; RNA Splice Sites; RNA, Messenger/genetics; Selection, Genetic; Transcription, Genetic
13.
Nat Neurosci ; 27(6): 1051-1063, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38594596

ABSTRACT

RNA isoforms influence cell identity and function. However, a comprehensive brain isoform map was lacking. We analyze single-cell RNA isoforms across brain regions, cell subtypes, developmental time points and species. For 72% of genes, full-length isoform expression varies along one or more axes. Splicing, transcription start and polyadenylation sites vary strongly between cell types, influence protein architecture and associate with disease-linked variation. Additionally, neurotransmitter transport and synapse turnover genes harbor cell-type variability across anatomical regions. Regulation of cell-type-specific splicing is pronounced in the postnatal day 21-to-postnatal day 28 adolescent transition. Developmental isoform regulation is stronger than regional regulation for the same cell type. Cell-type-specific isoform regulation in mice is mostly maintained in the human hippocampus, allowing extrapolation to the human brain. Conversely, the human brain harbors additional cell-type specificity, suggesting gain-of-function isoforms. Together, this detailed single-cell atlas of full-length isoform regulation across development, anatomical regions and species reveals an unappreciated degree of isoform variability across multiple axes.


Subject(s)
Brain; Single-Cell Analysis; Animals; Humans; Mice; Brain/metabolism; Brain/growth & development; Single-Cell Analysis/methods; RNA Splicing/genetics; RNA Isoforms/genetics; Alternative Splicing/genetics; Male; Mice, Inbred C57BL
14.
bioRxiv ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38464236

ABSTRACT

Multimodal measurements have become widespread in genomics, however measuring open chromatin accessibility and splicing simultaneously in frozen brain tissues remains unconquered. Hence, we devised Single-Cell-ISOform-RNA sequencing coupled with the Assay-for-Transposase-Accessible-Chromatin (ScISOr-ATAC). We utilized ScISOr-ATAC to assess whether chromatin and splicing alterations in the brain convergently affect the same cell types or divergently different ones. We applied ScISOr-ATAC to three major conditions: comparing (i) the Rhesus macaque (Macaca mulatta) prefrontal cortex (PFC) and visual cortex (VIS), (ii) cross species divergence of Rhesus macaque versus human PFC, as well as (iii) dysregulation in Alzheimer's disease in human PFC. We found that among cortical-layer biased excitatory neuron subtypes, splicing is highly brain-region specific for L3-5/L6 IT_RORB neurons, moderately specific in L2-3 IT_CUX2.RORB neurons and unspecific in L2-3 IT_CUX2 neurons. In contrast, at the chromatin level, L2-3 IT_CUX2.RORB neurons show the highest brain-region specificity compared to other subtypes. Likewise, when comparing human and macaque PFC, strong evolutionary divergence on one molecular modality does not necessarily imply strong such divergence on another molecular level in the same cell type. Finally, in Alzheimer's disease, oligodendrocytes show convergently high dysregulation in both chromatin and splicing. However, chromatin and splicing dysregulation most strongly affect distinct oligodendrocyte subtypes. Overall, these results indicate that chromatin and splicing can show convergent or divergent results depending on the performed comparison, justifying the need for their concurrent measurement to investigate complex systems. Taken together, ScISOr-ATAC allows for the characterization of single-cell splicing and chromatin patterns and the comparison of sample groups in frozen brain samples.

15.
Transcription ; 14(3-5): 92-104, 2023.
Article in English | MEDLINE | ID: mdl-37314295

ABSTRACT

The profiling of gene expression patterns to glean biological insights from single cells has become commonplace over the last few years. However, this approach overlooks the transcript contents that can differ between individual cells and cell populations. In this review, we describe early work in the field of single-cell short-read sequencing as well as full-length isoforms from single cells. We then describe recent work in single-cell long-read sequencing wherein some transcript elements have been observed to work in tandem. Based on earlier work in bulk tissue, we motivate the study of combination patterns of other RNA variables. Given that we are still blind to some aspects of isoform biology, we suggest possible future avenues such as CRISPR screens which can further illuminate the function of RNA variables in distinct cell populations.


Subject(s)
Gene Expression Profiling; Transcriptome; Alternative Splicing; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA/genetics; Sequence Analysis, RNA; High-Throughput Nucleotide Sequencing
16.
bioRxiv ; 2023 Jul 27.
Article in English | MEDLINE | ID: mdl-37546844

ABSTRACT

Motivation: Recent advancements in long-read RNA sequencing have enabled the examination of full-length isoforms, previously uncaptured by short-read sequencing methods. An alternative powerful method for studying isoforms is through the use of barcoded short-read RNA reads, for which a barcode indicates whether two short reads arise from the same molecule or not. Such techniques include the 10x Genomics linked-read based SParse Isoform Sequencing (SPIso-seq), as well as Loop-Seq and Tell-Seq. Some applications, such as novel-isoform discovery, require very high coverage. Obtaining high coverage using long reads can be difficult, making barcoded RNA-seq data a valuable alternative for this task. However, most annotation pipelines cannot work with a set of short reads in place of a single transcript, nor can they handle coverage gaps within a molecule. To overcome this challenge, we present an RNA-seq assembler allowing the determination of the expressed isoform per barcode. Results: In this paper, we present cloudrnaSPAdes, a tool for assembling full-length isoforms from barcoded RNA-seq linked-read data in a reference-free fashion. Evaluating it on simulated and real human data, we found that cloudrnaSPAdes accurately assembles isoforms, even for genes with high isoform diversity. Availability: cloudrnaSPAdes is a feature release of the SPAdes assembler and is available at https://cab.spbu.ru/software/cloudrnaspades/.

17.
Nat Biotechnol ; 41(7): 915-918, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36593406

ABSTRACT

Annotating newly sequenced genomes and determining alternative isoforms from long-read RNA data are complex and incompletely solved problems. Here we present IsoQuant, a computational tool using intron graphs that accurately reconstructs transcripts both with and without reference genome annotation. For novel transcript discovery, IsoQuant reduces the false-positive rate fivefold and 2.5-fold for Oxford Nanopore reference-based or reference-free mode, respectively. IsoQuant also improves performance for Pacific Biosciences data.


Subject(s)
High-Throughput Nucleotide Sequencing; RNA; Protein Isoforms/genetics; Sequence Analysis, RNA; Genome; Sequence Analysis, DNA
18.
bioRxiv ; 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37066387

ABSTRACT

RNA isoforms influence cell identity and function. Until recently, technological limitations prevented a genome-wide appraisal of isoform influence on cell identity in various parts of the brain. Using enhanced long-read single-cell isoform sequencing, we comprehensively analyze RNA isoforms in multiple mouse brain regions, cell subtypes, and developmental timepoints from postnatal day 14 (P14) to adult (P56). For 75% of genes, full-length isoform expression varies along one or more axes of phenotypic origin, underscoring the pervasiveness of isoform regulation across multiple scales. As expected, splicing varies strongly between cell types. However, certain gene classes including neurotransmitter release and reuptake as well as synapse turnover, harbor significant variability in the same cell type across anatomical regions, suggesting differences in network activity may influence cell-type identity. Glial brain-region specificity in isoform expression includes strong poly(A)-site regulation, whereas neurons have stronger TSS regulation. Furthermore, developmental patterns of cell-type specific splicing are especially pronounced in the murine adolescent transition from P21 to P28. The same cell type traced across development shows more isoform variability than across adult anatomical regions, indicating a coordinated modulation of functional programs dictating neural development. As most cell-type specific exons in P56 mouse hippocampus behave similarly in newly generated data from human hippocampi, these principles may be extrapolated to human brain. However, human brains have evolved additional cell-type specificity in splicing, suggesting gain-of-function isoforms. Taken together, we present a detailed single-cell atlas of full-length brain isoform regulation across development and anatomical regions, providing a previously unappreciated degree of isoform variability across multiple scales of the brain.

19.
Sci Rep ; 12(1): 4369, 2022 03 14.
Article in English | MEDLINE | ID: mdl-35288582

ABSTRACT

The zebra finch is one of the most commonly studied songbirds in biology, particularly in genomics, neuroscience and vocal communication. However, this species lacks a robust cell line for molecular biology research and reagent optimization. We generated a cell line, designated CFS414, from zebra finch embryonic fibroblasts using the SV40 large and small T antigens. This cell line demonstrates an improvement over previous songbird cell lines through continuous and density-independent growth, allowing for indefinite culture and monoclonal line derivation. Cytogenetic, genomic, and transcriptomic profiling established the provenance of this cell line and identified the expression of genes relevant to ongoing songbird research. Using this cell line, we disrupted endogenous gene sequences using S. aureus Cas9 and confirmed a stress-dependent localization response of a song system specialized gene, SAP30L. The utility of CFS414 cells enhances the comprehensive molecular potential of the zebra finch and validates cell immortalization strategies in a songbird species.


Subject(s)
Finches; Animals; CRISPR-Cas Systems; Cell Line; Finches/genetics; Genome; Genomics
20.
Nat Biotechnol ; 40(7): 1082-1092, 2022 07.
Article in English | MEDLINE | ID: mdl-35256815

ABSTRACT

Single-nuclei RNA sequencing characterizes cell types at the gene level. However, compared to single-cell approaches, many single-nuclei cDNAs are purely intronic, lack barcodes and hinder the study of isoforms. Here we present single-nuclei isoform RNA sequencing (SnISOr-Seq). Using microfluidics, PCR-based artifact removal, target enrichment and long-read sequencing, SnISOr-Seq increased barcoded, exon-spanning long reads 7.5-fold compared to naive long-read single-nuclei sequencing. We applied SnISOr-Seq to adult human frontal cortex and found that exons associated with autism exhibit coordinated and highly cell-type-specific inclusion. We found two distinct combination patterns: those distinguishing neural cell types, enriched in TSS-exon, exon-polyadenylation-site and non-adjacent exon pairs, and those with multiple configurations within one cell type, enriched in adjacent exon pairs. Finally, we observed that human-specific exons are almost as tightly coordinated as conserved exons, implying that coordination can be rapidly established during evolution. SnISOr-Seq enables cell-type-specific long-read isoform analysis in human brain and in any frozen or hard-to-dissociate sample.


Subject(s)
Brain; RNA; Alternative Splicing/genetics; Brain/metabolism; Exons/genetics; Humans; Protein Isoforms/genetics; RNA/genetics; Sequence Analysis, RNA