ABSTRACT
Since the discovery of RNA splicing and its role in gene expression, researchers have sought a set of rules, an algorithm or a computational model that could predict the splice isoforms, and their frequencies, produced from any transcribed gene in a specific cellular context. Over the past 30 years, these models have evolved from simple position weight matrices to deep-learning models capable of integrating sequence data across vast genomic distances. Most recently, new model architectures are moving the field closer to context-specific alternative splicing predictions, and advances in sequencing technologies are expanding the type of data that can be used to inform and interpret such models. Together, these developments are driving improved understanding of splicing regulatory mechanisms and emerging applications of the splicing code to the rational design of RNA- and splicing-based therapeutics.
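The position weight matrices mentioned above as the earliest splicing models can be illustrated with a minimal sketch. The aligned sites below are made-up toy sequences that only loosely resemble a donor splice site, and the uniform background frequency and pseudocount are assumptions; this is not a model from the article.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against a uniform background (assumed).
        pwm.append({b: math.log2((c / total) / background) for b, c in counts.items()})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds for a window as long as the matrix."""
    return sum(column[base] for column, base in zip(pwm, seq))

def best_site(pwm, sequence):
    """Slide the matrix along the sequence; return (best score, start index)."""
    width = len(pwm)
    windows = range(len(sequence) - width + 1)
    return max((score(pwm, sequence[i:i + width]), i) for i in windows)

# Toy training sites (hypothetical, for illustration only).
toy_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT"]
pwm = build_pwm(toy_sites)

best_score, best_start = best_site(pwm, "TTTTCAGGTAAGTTTTT")
print(best_start, round(best_score, 2))
```

A matrix like this scores each position independently, which is one reason such models cannot capture the long-range sequence context that the deep-learning models described above integrate.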
ABSTRACT
Loss of function of the RNA-binding protein TDP-43 (TDP-LOF) is a hallmark of amyotrophic lateral sclerosis (ALS) and other neurodegenerative disorders. Here we describe TDP-REG, which exploits the specificity of cryptic splicing induced by TDP-LOF to drive protein expression when and where the disease process occurs. The SpliceNouveau algorithm combines deep learning with rational design to generate customizable cryptic splicing events within protein-coding sequences. We demonstrate that expression of TDP-REG reporters is tightly coupled to TDP-LOF in vitro and in vivo. TDP-REG enables genomic prime editing to ablate the UNC13A cryptic donor splice site specifically upon TDP-LOF. Finally, we design TDP-REG vectors encoding a TDP-43/Raver1 fusion protein that rescues key pathological cryptic splicing events, paving the way for the development of precision therapies for TDP-43-related disorders.
Subject(s)
Amyotrophic Lateral Sclerosis , DNA-Binding Proteins , Frontotemporal Dementia , Precision Medicine , RNA Splicing , Animals , Humans , Mice , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/therapy , Deep Learning , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Frontotemporal Dementia/genetics , Frontotemporal Dementia/therapy , Gene Editing , HEK293 Cells , RNA Splice Sites , RNA-Binding Proteins/metabolism , RNA-Binding Proteins/genetics
ABSTRACT
TDP-43 loss of function induces multiple splicing changes, including a cryptic exon in the amyotrophic lateral sclerosis and fronto-temporal lobar degeneration risk gene UNC13A, leading to nonsense-mediated decay of UNC13A transcripts and loss of protein. UNC13A is an active zone protein with an integral role in coordinating pre-synaptic function. Here, we show TDP-43 depletion induces a severe reduction in synaptic transmission, leading to an asynchronous pattern of network activity. We demonstrate that these deficits are largely driven by a single cryptic exon in UNC13A. Antisense oligonucleotides targeting the UNC13A cryptic exon robustly rescue UNC13A protein levels and restore normal synaptic function, providing a potential new therapeutic approach for ALS and other TDP-43-related disorders.
ABSTRACT
Ribosome profiling is a powerful technique to study translation at a transcriptome-wide level. However, ensuring good data quality is paramount for accurate interpretation, as is ensuring that the analyses are reproducible. We introduce a new Nextflow DSL2 pipeline, riboseq-flow, designed for processing and comprehensive quality control of ribosome profiling experiments. Riboseq-flow is user-friendly, versatile and upholds high standards in reproducibility, scalability, portability, version control and continuous integration. It enables users to efficiently analyse multiple samples in parallel and helps them evaluate the quality and utility of their data based on the detailed metrics and visualisations that are automatically generated. Riboseq-flow is available at https://github.com/iraiosub/riboseq-flow.
Ribosome profiling is a cutting-edge method that provides a detailed view of protein synthesis across the entire set of RNA molecules within cells. To ensure the reliability of such studies, high-quality data and the ability to replicate analyses are crucial. To address this, we present riboseq-flow, a new tool built with Nextflow DSL2, tailored for analysing data from ribosome profiling experiments. This pipeline stands out for its ease of use, flexibility, and commitment to high reproducibility standards. It's designed to handle multiple samples simultaneously, ensuring efficient analysis for large-scale studies. Moreover, riboseq-flow automatically generates detailed reports and visual representations to assess the data quality, enhancing researchers' understanding of their experiments and guiding future decisions. This valuable resource is freely accessible at https://github.com/iraiosub/riboseq-flow.
ABSTRACT
Nuclear depletion and cytoplasmic aggregation of the RNA-binding protein TDP-43 is the hallmark of ALS, occurring in over 97% of cases. A key consequence of TDP-43 nuclear loss is the de-repression of cryptic exons. Whilst TDP-43-regulated cryptic splicing is increasingly well catalogued, cryptic alternative polyadenylation (APA) events, which define the 3' end of last exons, have been largely overlooked, especially when not associated with novel upstream splice junctions. We developed a novel bioinformatic approach to reliably identify distinct APA event types: alternative last exons (ALE), 3'UTR extensions (3'Ext) and intronic polyadenylation (IPA) events. We identified novel neuronal cryptic APA sites induced by TDP-43 loss of function by systematically applying our pipeline to a compendium of publicly available and in-house datasets. We find that TDP-43 binding sites and target motifs are enriched at these cryptic events and that TDP-43 can have both repressive and enhancing action on APA. Importantly, all categories of cryptic APA can also be identified in ALS and FTD post mortem brain regions with TDP-43 proteinopathy, underlining their potential disease relevance. RNA-seq and Ribo-seq analyses indicate that distinct cryptic APA categories have different downstream effects on transcript and translation. Intriguingly, cryptic 3'Exts occur in multiple transcription factors, such as ELK1, SIX3, and TLX1, and lead to an increase in wild-type protein levels and function. Finally, we show that an increase in RNA stability leading to a higher cytoplasmic localisation underlies these observations. In summary, we demonstrate that TDP-43 nuclear depletion induces a novel category of cryptic RNA processing events and we expand the palette of TDP-43 loss consequences by showing this can also lead to an increase in normal protein translation.
ABSTRACT
Functional loss of TDP-43, an RNA binding protein genetically and pathologically linked to amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), leads to the inclusion of cryptic exons in hundreds of transcripts during disease. Cryptic exons can promote the degradation of affected transcripts, deleteriously altering cellular function through loss-of-function mechanisms. Here, we show that mRNA transcripts harboring cryptic exons generated de novo proteins in TDP-43-depleted human iPSC-derived neurons in vitro, and de novo peptides were found in cerebrospinal fluid (CSF) samples from patients with ALS or FTD. Using coordinated transcriptomic and proteomic studies of TDP-43-depleted human iPSC-derived neurons, we identified 65 peptides that mapped to 12 cryptic exons. Cryptic exons identified in TDP-43-depleted human iPSC-derived neurons were predictive of cryptic exons expressed in postmortem brain tissue from patients with TDP-43 proteinopathy. These cryptic exons produced transcript variants that generated de novo proteins. We found that the inclusion of cryptic peptide sequences in proteins altered their interactions with other proteins, thereby likely altering their function. Last, we showed that 18 de novo peptides across 13 genes were present in CSF samples from patients with ALS/FTD spectrum disorders. The demonstration of cryptic exon translation suggests new mechanisms for ALS/FTD pathophysiology downstream of TDP-43 dysfunction and may provide a potential strategy to assay TDP-43 function in patient CSF.
Subject(s)
Amyotrophic Lateral Sclerosis , Frontotemporal Dementia , Humans , Amyotrophic Lateral Sclerosis/genetics , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Frontotemporal Dementia/genetics , Peptides , Proteomics
ABSTRACT
A system enabling the expression of therapeutic proteins specifically in diseased cells would be transformative, providing greatly increased safety and the possibility of pre-emptive treatment. Here we describe "TDP-REG", a precision medicine approach primarily for amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), which exploits the cryptic splicing events that occur in cells with TDP-43 loss-of-function (TDP-LOF) in order to drive expression specifically in diseased cells. In addition to modifying existing cryptic exons for this purpose, we develop a deep-learning-powered algorithm for generating customisable cryptic splicing events, which can be embedded within virtually any coding sequence. By placing part of a coding sequence within a novel cryptic exon, we tightly couple protein expression to TDP-LOF. Protein expression is activated by TDP-LOF in vitro and in vivo, including TDP-LOF induced by cytoplasmic TDP-43 aggregation. In addition to generating a variety of fluorescent and luminescent reporters, we use this system to perform TDP-LOF-dependent genomic prime editing to ablate the UNC13A cryptic donor splice site. Furthermore, we design a panel of tightly gated, autoregulating vectors encoding a TDP-43/Raver1 fusion protein, which rescue key pathological cryptic splicing events. In summary, we combine deep-learning and rational design to create sophisticated splicing sensors, resulting in a platform that provides far safer therapeutics for neurodegeneration, potentially even enabling preemptive treatment of at-risk individuals.
ABSTRACT
Functional loss of TDP-43, an RNA-binding protein genetically and pathologically linked to ALS and FTD, leads to inclusion of cryptic exons in hundreds of transcripts during disease. Cryptic exons can promote degradation of affected transcripts, deleteriously altering cellular function through loss-of-function mechanisms. However, the possibility of de novo protein synthesis from cryptic exon transcripts has not been explored. Here, we show that mRNA transcripts harboring cryptic exons generate de novo proteins both in TDP-43 deficient cellular models and in disease. Using coordinated transcriptomic and proteomic studies of TDP-43 depleted iPSC-derived neurons, we identified numerous peptides that mapped to cryptic exons. Cryptic exons identified in iPSC models were highly predictive of cryptic exons expressed in brains of patients with TDP-43 proteinopathy, including cryptic transcripts that generated de novo proteins. We discovered that inclusion of cryptic peptide sequences in proteins altered their interactions with other proteins, thereby likely altering their function. Finally, we showed that these de novo peptides were present in CSF from patients with ALS. The demonstration of cryptic exon translation suggests new mechanisms for ALS pathophysiology downstream of TDP-43 dysfunction and may provide a strategy for novel biomarker development.
ABSTRACT
Recent studies have revealed multiple mechanisms that can lead to heterogeneity in ribosomal composition. This heterogeneity can lead to preferential translation of specific panels of mRNAs, and is defined in large part by the ribosomal protein (RP) content, amongst other things. However, it is currently unknown to what extent ribosomal composition is heterogeneous across tissues, which is compounded by a lack of tools available to study it. Here we present dripARF, a method for detecting differential RP incorporation into the ribosome using Ribosome Profiling (Ribo-seq) data. We combine the 'waste' rRNA fragment data generated in Ribo-seq with the known 3D structure of the human ribosome to predict differences in the composition of ribosomes in the material being studied. We have validated this approach using publicly available data, and have revealed a potential role for eS25/RPS25 in development. Our results indicate that ribosome heterogeneity can be detected in Ribo-seq data, providing a new method to study this phenomenon. Furthermore, with dripARF, previously published Ribo-seq data provides a wealth of new information, allowing the identification of RPs of interest in many disease and normal contexts. dripARF is available as part of the ARF R package and can be accessed through https://github.com/fallerlab/ARF.
Subject(s)
Ribosomes/chemistry , Humans , Messenger RNA , Ribosomal RNA/analysis , Ribosomal Proteins/analysis , Ribosomes/genetics
ABSTRACT
Variants of UNC13A, a critical gene for synapse function, increase the risk of amyotrophic lateral sclerosis and frontotemporal dementia [1-3], two related neurodegenerative diseases defined by mislocalization of the RNA-binding protein TDP-43 [4,5]. Here we show that TDP-43 depletion induces robust inclusion of a cryptic exon in UNC13A, resulting in nonsense-mediated decay and loss of UNC13A protein. Two common intronic UNC13A polymorphisms strongly associated with amyotrophic lateral sclerosis and frontotemporal dementia risk overlap with TDP-43 binding sites. These polymorphisms potentiate cryptic exon inclusion, both in cultured cells and in brains and spinal cords from patients with these conditions. Our findings, which demonstrate a genetic link between loss of nuclear TDP-43 function and disease, reveal the mechanism by which UNC13A variants exacerbate the effects of decreased TDP-43 function. They further provide a promising therapeutic target for TDP-43 proteinopathies.
Subject(s)
Amyotrophic Lateral Sclerosis , Frontotemporal Dementia , TDP-43 Proteinopathies , Alternative Splicing , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Nonsense Codon , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Frontotemporal Dementia/genetics , Frontotemporal Dementia/metabolism , Humans , Nerve Tissue Proteins , Single Nucleotide Polymorphism/genetics
ABSTRACT
Background: The first step of virtually all next-generation sequencing analysis involves the splitting of the raw sequencing data into separate files using sample-specific barcodes, a process known as "demultiplexing". However, we found that existing software for this purpose was either too inflexible or too computationally intensive for fast, streamlined processing of raw, single-end FASTQ files containing combinatorial barcodes. Results: Here, we introduce a fast and uniquely flexible demultiplexer, named Ultraplex, which splits a raw FASTQ file containing barcodes either at a single end or at both 5' and 3' ends of reads, trims the sequencing adaptors and low-quality bases, and moves unique molecular identifiers (UMIs) into the read header, allowing subsequent removal of PCR duplicates. Ultraplex is able to perform such single or combinatorial demultiplexing on both single- and paired-end sequencing data, and can process an entire Illumina HiSeq lane, consisting of nearly 500 million reads, in less than 20 minutes. Conclusions: Ultraplex greatly reduces computational burden and pipeline complexity for the demultiplexing of complex sequencing libraries, such as those produced by various CLIP and ribosome profiling protocols, and is also very user-friendly, enabling streamlined, robust data processing. Ultraplex is available on PyPI and Conda and via GitHub.
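The demultiplexing steps described above (matching a 5' barcode, trimming it, and moving the UMI into the read header) can be sketched in a few lines of Python. This is an illustrative toy, not Ultraplex's implementation or interface: the read layout (a 3-nt UMI followed by a 4-nt barcode), the `_UMI` header suffix, the sample names and the example reads are all assumptions made for the sketch, and adaptor or quality trimming is omitted.

```python
from collections import defaultdict

UMI_LEN = 3      # assumed layout: a 3-nt UMI at the 5' end ...
BARCODE_LEN = 4  # ... followed by a 4-nt sample barcode

def parse_fastq(text):
    """Yield (header, sequence, quality) from FASTQ text with 4-line records."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines), 4):
        yield lines[i], lines[i + 1], lines[i + 3]

def demultiplex(records, barcode_to_sample):
    """Bin reads by 5' barcode, trim UMI and barcode, append the UMI to the header."""
    bins = defaultdict(list)
    cut = UMI_LEN + BARCODE_LEN
    for header, seq, qual in records:
        umi = seq[:UMI_LEN]
        barcode = seq[UMI_LEN:cut]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            # Unmatched reads are kept unmodified in their own bin.
            bins["no_match"].append((header, seq, qual))
            continue
        # Keeping the UMI in the header lets PCR duplicates be collapsed later.
        bins[sample].append((f"{header}_{umi}", seq[cut:], qual[cut:]))
    return bins

# Hypothetical example reads and barcode sheet.
toy_fastq = """@read1
ACGAAAATTTTGGGG
+
IIIIIIIIIIIIIII
@read2
TTTCCCCGGGGAAAA
+
IIIIIIIIIIIIIII
@read3
GGGTTTTACGTACGT
+
IIIIIIIIIIIIIII
"""
barcodes = {"AAAA": "sampleA", "CCCC": "sampleB"}
bins = demultiplex(parse_fastq(toy_fastq), barcodes)
for sample, reads in sorted(bins.items()):
    print(sample, reads)
```

This sketch requires exact barcode matches and holds all reads in memory; a tool processing a full sequencing lane would stream records to per-sample output files instead.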
ABSTRACT
FUsed in Sarcoma (FUS) is a multifunctional RNA binding protein (RBP). FUS mutations lead to its cytoplasmic mislocalization and cause the neurodegenerative disease amyotrophic lateral sclerosis (ALS). Here, we use mouse and human models with endogenous ALS-associated mutations to study the early consequences of increased cytoplasmic FUS. We show that in axons, mutant FUS condensates sequester and promote the phase separation of fragile X mental retardation protein (FMRP), another RBP associated with neurodegeneration. This leads to repression of translation in mouse and human FUS-ALS motor neurons and is corroborated in vitro, where FUS and FMRP copartition and repress translation. Last, we show that translation of FMRP-bound RNAs is reduced in vivo in FUS-ALS motor neurons. Our results unravel new pathomechanisms of FUS-ALS and identify a novel paradigm by which mutations in one RBP favor the formation of condensates sequestering other RBPs, affecting crucial biological functions, such as protein translation.
Subject(s)
Amyotrophic Lateral Sclerosis , Neurodegenerative Diseases , Amyotrophic Lateral Sclerosis/genetics , Animals , Fragile X Mental Retardation Protein/genetics , Mice , Mutation , Protein Biosynthesis , RNA-Binding Protein FUS/genetics
ABSTRACT
The human genome expresses thousands of natural antisense transcripts (NAT) that can regulate epigenetic state, transcription, RNA stability or translation of their overlapping genes [1,2]. Here we describe MAPT-AS1, a brain-enriched NAT that is conserved in primates and contains an embedded mammalian-wide interspersed repeat (MIR), which represses tau translation by competing for ribosomal RNA pairing with the MAPT mRNA internal ribosome entry site [3]. MAPT encodes tau, a neuronal intrinsically disordered protein (IDP) that stabilizes axonal microtubules. Hyperphosphorylated, aggregation-prone tau forms the hallmark inclusions of tauopathies [4]. Mutations in MAPT cause familial frontotemporal dementia, and common variations forming the MAPT H1 haplotype are a significant risk factor in many tauopathies [5] and Parkinson's disease. Notably, expression of MAPT-AS1 or minimal essential sequences from MAPT-AS1 (including MIR) reduces, whereas silencing MAPT-AS1 expression increases, neuronal tau levels, and correlate with tau pathology in human brain. Moreover, we identified many additional NATs with embedded MIRs (MIR-NATs), which are overrepresented at coding genes linked to neurodegeneration and/or encoding IDPs, and confirmed MIR-NAT-mediated translational control of one such gene, PLCG1. These results demonstrate a key role for MAPT-AS1 in tauopathies and reveal a potentially broad contribution of MIR-NATs to the tightly controlled translation of IDPs [6], with particular relevance for proteostasis in neurodegeneration.
Subject(s)
Protein Biosynthesis/genetics , Proteostasis/genetics , Antisense RNA/genetics , Tauopathies/genetics , Tauopathies/metabolism , tau Proteins/genetics , tau Proteins/metabolism , Aged , Animals , Binding Sites , Brain/metabolism , Brain/pathology , Case-Control Studies , Cell Differentiation , Disease Progression , Female , Humans , Internal Ribosome Entry Sites/genetics , Male , Mice , Transgenic Mice , Middle Aged , Neurons/metabolism , Neurons/pathology , Ribosomes/metabolism , tau Proteins/biosynthesis
ABSTRACT
Disordered proteins play an essential role in a wide variety of biological processes, and are often posttranslationally modified. One such protein is histone H1; its highly disordered C-terminal tail (CH1) condenses internucleosomal linker DNA in chromatin in a way that is still poorly understood. Moreover, CH1 is phosphorylated in a cell cycle-dependent manner that correlates with changes in the chromatin condensation level. Here we present a model system that recapitulates key aspects of the in vivo process, and also allows a detailed structural and biophysical analysis of the stages before and after condensation. CH1 remains disordered in the DNA-bound state, despite its nanomolar affinity. Phase-separated droplets (coacervates) form, containing higher-order assemblies of CH1/DNA complexes. Phosphorylation at three serine residues, spaced along the length of the tail, has little effect on the local properties of the condensate. However, it dramatically alters higher-order structure in the coacervate and reduces partitioning to the coacervate phase. These observations show that disordered proteins can bind tightly to DNA without a disorder-to-order transition. Importantly, they also provide mechanistic insights into how higher-order structures can be exquisitely sensitive to perturbation by posttranslational modifications, thus broadening the repertoire of mechanisms that might regulate chromatin and other macromolecular assemblies.