ABSTRACT
The importance of genomic sequence context in generating transcriptome diversity through RNA splicing is independently unmasked by two studies in this issue (Jaganathan et al., 2019; Baeza-Centurion et al., 2019).
Subjects
Deep Learning, RNA Splicing, Genome, Genomics, Transcriptome
ABSTRACT
In this issue of Molecular Cell, Gonatopoulos-Pournatzis et al. (2020) report a neuron-specific microexon in eIF4G translation initiation factors that dampens synaptic protein translation. Autism-associated disruption of this exon results in increased protein production, likely through reduced coalescence with cytoplasmic ribonucleoprotein granule components, including FMRP.
Subjects
Autistic Disorder, Eukaryotic Initiation Factor-4G, Animals, Brain, Cognition, Feathers
ABSTRACT
Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes, and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for the quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural network model optimized to learn the impact of mutations on quantitative changes in alternative splicing from the comparative analysis of homologous genes. The model architecture enables DeltaSplice to perform "reference-informed prediction" by incorporating the known splice site usage of a reference gene sequence to improve its prediction on splicing-altering mutations. We benchmarked DeltaSplice and several other state-of-the-art methods on various prediction tasks, including evolutionary sequence divergence on lineage-specific splicing and splicing-altering mutations in human populations and neurodevelopmental disorders, and demonstrated that DeltaSplice consistently outperformed them. DeltaSplice predicted ~15% of splicing quantitative trait loci (sQTLs) in the human brain as causal splicing-altering variants. It also predicted splicing-altering de novo mutations outside the splice sites in a subset of patients affected by autism and other neurodevelopmental disorders (NDDs), including 19 genes with recurrent splicing-altering mutations. Integration of splicing-altering mutations with other types of de novo mutation burdens allowed the prediction of eight novel NDD-risk genes. Our work expanded the capacity of in silico splicing models with potential applications in genetic diagnosis and the development of splicing-based precision medicine.
Subjects
Alternative Splicing, Mutation, Quantitative Trait Loci, RNA Splice Sites, Humans, Computational Biology/methods, Neurodevelopmental Disorders/genetics
ABSTRACT
RNA-binding proteins (RBPs) regulate post-transcriptional gene expression by recognizing short and degenerate sequence motifs in their target transcripts, but precisely defining their binding specificity remains challenging. Crosslinking and immunoprecipitation (CLIP) allows for mapping of the exact protein-RNA crosslink sites, which frequently reside at specific positions in RBP motifs at single-nucleotide resolution. Here, we have developed a computational method, named mCross, to jointly model RBP binding specificity while precisely registering the crosslinking position in motif sites. We applied mCross to 112 RBPs using ENCODE eCLIP data and validated the reliability of the discovered motifs by genome-wide analysis of allelic binding sites. Our analyses revealed that the prototypical SR protein SRSF1 recognizes clusters of GGA half-sites in addition to its canonical GGAGGA motif. Therefore, SRSF1 regulates splicing of a much larger repertoire of transcripts than previously appreciated, including HNRNPD and HNRNPDL, which are involved in multivalent protein assemblies and phase separation.
Subjects
Heterogeneous-Nuclear Ribonucleoprotein D/chemistry, Molecular Models, RNA/chemistry, Serine-Arginine Splicing Factors/chemistry, Base Sequence, Binding Sites, Cross-Linking Reagents/chemistry, Gene Expression, HeLa Cells, Hep G2 Cells, Heterogeneous Nuclear Ribonucleoprotein D0, Heterogeneous-Nuclear Ribonucleoprotein D/genetics, Heterogeneous-Nuclear Ribonucleoprotein D/metabolism, Humans, K562 Cells, Nucleic Acid Conformation, Protein Binding, Alpha-Helical Protein Conformation, Beta-Strand Protein Conformation, Protein Interaction Domains and Motifs, RNA/genetics, RNA/metabolism, Sequence Alignment, Nucleic Acid Sequence Homology, Serine-Arginine Splicing Factors/genetics, Serine-Arginine Splicing Factors/metabolism
ABSTRACT
FMRP loss of function causes Fragile X syndrome (FXS) and autistic features. FMRP is a polyribosome-associated neuronal RNA-binding protein, suggesting that it plays a key role in regulating neuronal translation, but there has been little consensus regarding either its RNA targets or mechanism of action. Here, we use high-throughput sequencing of RNAs isolated by crosslinking immunoprecipitation (HITS-CLIP) to identify FMRP interactions with mouse brain polyribosomal mRNAs. FMRP interacts with the coding region of transcripts encoding pre- and postsynaptic proteins and transcripts implicated in autism spectrum disorders (ASD). We developed a brain polyribosome-programmed translation system, revealing that FMRP reversibly stalls ribosomes specifically on its target mRNAs. Our results suggest that loss of a translational brake on the synthesis of a subset of synaptic proteins contributes to FXS. In addition, they provide insight into the molecular basis of the cognitive and allied defects in FXS and ASD and suggest multiple targets for clinical intervention.
Subjects
Autistic Disorder/metabolism, Brain/metabolism, Fragile X Mental Retardation Protein/metabolism, Fragile X Syndrome/metabolism, Ribosomes/metabolism, Synapses/metabolism, Animals, Autistic Disorder/physiopathology, Fragile X Mental Retardation Protein/genetics, Fragile X Syndrome/physiopathology, Humans, Mice, Knockout Mice, Polyribosomes/metabolism, Protein Biosynthesis, RNA-Binding Proteins, RNA Sequence Analysis
ABSTRACT
LIN28 is a bipartite RNA-binding protein that post-transcriptionally inhibits the biogenesis of let-7 microRNAs to regulate development and influence disease states. However, the mechanisms of let-7 suppression remain poorly understood because LIN28 recognition depends on coordinated targeting by both the zinc knuckle domain (ZKD), which binds a GGAG-like element in the precursor, and the cold shock domain (CSD), whose binding sites have not been systematically characterized. By leveraging single-nucleotide-resolution mapping of LIN28 binding sites in vivo, we determined that the CSD recognizes a (U)GAU motif. This motif partitions the let-7 microRNAs into two subclasses: precursors with both CSD and ZKD binding sites (CSD+) and precursors with ZKD but no CSD binding sites (CSD-). In vivo recognition of CSD+ precursors by LIN28, and their subsequent 3' uridylation and degradation, is more efficient, leading to their stronger suppression in LIN28-activated cells and cancers. Thus, CSD binding sites amplify the regulatory effects of LIN28.
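The partition of precursors into CSD+ and CSD- subclasses described above can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes that an exact `GGAG` match stands in for the "GGAG-like element" and that the core `GAU` stands in for the (U)GAU motif, it ignores where in the precursor the motifs occur, and the function name `classify_precursor` and the sequences are hypothetical toy inputs, not real let-7 precursors.

```python
import re

# Simplifying assumptions (not from the abstract):
# - the ZKD site is approximated by an exact GGAG match;
# - the CSD site is approximated by the core GAU of the (U)GAU motif,
#   since the leading U is optional.
ZKD_MOTIF = re.compile(r"GGAG")
CSD_MOTIF = re.compile(r"GAU")


def classify_precursor(seq: str) -> str:
    """Label a precursor sequence as CSD+, CSD-, or lacking a ZKD site."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    has_zkd = ZKD_MOTIF.search(rna) is not None
    has_csd = CSD_MOTIF.search(rna) is not None
    if has_zkd and has_csd:
        return "CSD+"
    if has_zkd:
        return "CSD-"
    return "no ZKD site"


# Toy sequences for illustration only.
for name, seq in [("toy_a", "ACUGAUCCGGAGCC"), ("toy_b", "ACCCACCGGAGCC")]:
    print(name, classify_precursor(seq))
```

A position-aware model (for example, requiring the motifs in particular regions of the precursor) would be closer to the biology, but the abstract does not give enough detail to sketch one.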
Subjects
MicroRNAs/metabolism, RNA-Binding Proteins/metabolism, Animals, Base Sequence, Embryonic Stem Cells, Hep G2 Cells, Humans, K562 Cells, Mice, MicroRNAs/genetics, Molecular Models, Nucleic Acid Conformation, Protein Domains, Tertiary Protein Structure, RNA Precursors/metabolism, RNA-Binding Proteins/genetics
ABSTRACT
Control over gene expression is exerted, in multiple stages of spermatogenesis, at the post-transcriptional level by RNA binding proteins (RBPs). We identify here an essential role in mammalian spermatogenesis and male fertility for 'RNA binding protein 46' (RBM46). A highly evolutionarily conserved gene, Rbm46 is also essential for fertility in both flies and fish. We found Rbm46 expression was restricted to the mouse germline, detectable in males in the cytoplasm of premeiotic spermatogonia and meiotic spermatocytes. To define its requirement for spermatogenesis, we generated Rbm46 knockout (KO, Rbm46-/-) mice; although male Rbm46-/- mice were viable and appeared grossly normal, they were infertile. Testes from adult Rbm46-/- mice were small, with seminiferous tubules containing only Sertoli cells and few undifferentiated spermatogonia. Using genome-wide unbiased high throughput assays RNA-seq and 'enhanced crosslinking immunoprecipitation' coupled with RNA-seq (eCLIP-seq), we discovered RBM46 could bind, via a U-rich conserved consensus sequence, to a cohort of mRNAs encoding proteins required for completion of differentiation and subsequent meiotic initiation. In summary, our studies support an essential role for RBM46 in regulating target mRNAs during spermatogonia differentiation prior to the commitment to meiosis in mice.
Subjects
RNA-Binding Proteins/metabolism, Spermatogenesis, Spermatogonia, Animals, Cell Differentiation/genetics, Male, Mammals/genetics, Meiosis/genetics, Mice, Knockout Mice, Messenger RNA/genetics, Messenger RNA/metabolism, RNA-Binding Proteins/genetics, Spermatocytes/metabolism, Spermatogenesis/genetics, Spermatogonia/metabolism, Testis
ABSTRACT
The enormous cellular diversity in the mammalian brain, which is highly prototypical and organized in a hierarchical manner, is dictated by cell-type-specific gene-regulatory programs at the molecular level. Although prevalent in the brain, the contribution of alternative splicing (AS) to the molecular diversity across neuronal cell types is just starting to emerge. Here, we systematically investigated AS regulation across over 100 transcriptomically defined neuronal types of the adult mouse cortex using deep single-cell RNA-sequencing data. We found distinct splicing programs between glutamatergic and GABAergic neurons and between subclasses within each neuronal class. These programs consist of overlapping sets of alternative exons showing differential splicing at multiple hierarchical levels. Using an integrative approach, our analysis suggests that RNA-binding proteins (RBPs) Celf1/2, Mbnl2, and Khdrbs3 are preferentially expressed and more active in glutamatergic neurons, while Elavl2 and Qk are preferentially expressed and more active in GABAergic neurons. Importantly, these and additional RBPs also contribute to differential splicing between neuronal subclasses at multiple hierarchical levels, and some RBPs contribute to splicing dynamics that do not conform to the hierarchical structure defined by the transcriptional profiles. Thus, our results suggest graded regulation of AS across neuronal cell types, which may provide a molecular mechanism to specify neuronal identity and function that are orthogonal to established classifications based on transcriptional regulation.
Subjects
Cerebral Cortex/metabolism, GABAergic Neurons/metabolism, Nerve Tissue Proteins/biosynthesis, RNA Splicing, RNA-Seq, Single-Cell Analysis, Animals, Cerebral Cortex/cytology, GABAergic Neurons/cytology, Mice, Nerve Tissue Proteins/genetics
ABSTRACT
PURPOSE: To compare machine learning (ML) models with a logistic regression model in order to identify the optimal factors associated with mammography-occult (i.e. false-negative mammographic findings) magnetic resonance imaging (MRI)-detected newly diagnosed breast cancer (BC). MATERIAL AND METHODS: The present single-centre retrospective study included consecutive women with BC who underwent mammography and MRI (no more than 45 days apart) for breast cancer between January 2018 and May 2023. Various ML algorithms and binary logistic regression analysis were utilized to extract features linked to mammography-occult BC. These features were subsequently employed to create different models. The predictive value of these models was assessed using receiver operating characteristic curve analysis. RESULTS: This study included 1957 malignant lesions from 1914 patients, with an average age of 51.64 ± 9.92 years and a range of 20-86 years. Among these lesions, there were 485 mammography-occult BCs. The optimal features of mammography-occult BC included calcification status, tumour size, mammographic density, age, lesion enhancement type on MRI, and histological type. Among the different ML models (ANN, L1-LR, RF, and SVM) and the LR-based combined model, the ANN model with RF features was found to be the optimal model. It demonstrated the best discriminative performance in predicting mammography false-negative findings, with an AUC of 0.912, an accuracy of 86.90%, a sensitivity of 85.85%, and a specificity of 84.18%. CONCLUSION: Mammography-occult MRI-detected breast cancers have features that should be considered when performing breast MRI to improve the detection rate for breast cancer and aid in clinician management.
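For readers less familiar with the performance measures reported above, the following minimal Python sketch shows how sensitivity, specificity, and accuracy are conventionally derived from confusion-matrix counts. The function name `diagnostic_metrics` and the counts are hypothetical illustrations; they are not taken from the study, whose per-class counts the abstract does not report.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: positive cases predicted correctly / missed
    tn/fp: negative cases predicted correctly / flagged in error
    """
    sensitivity = tp / (tp + fn)   # fraction of true positives detected
    specificity = tn / (tn + fp)   # fraction of true negatives detected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (NOT the study's data).
sens, spec, acc = diagnostic_metrics(tp=80, fp=30, tn=170, fn=20)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

The AUC, by contrast, summarizes performance across all decision thresholds and cannot be computed from a single confusion matrix.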
Subjects
Breast Neoplasms, Machine Learning, Magnetic Resonance Imaging, Mammography, Humans, Breast Neoplasms/diagnostic imaging, Female, Middle Aged, Magnetic Resonance Imaging/methods, Mammography/methods, Retrospective Studies, Adult, Aged, Logistic Models, Aged 80 and over, Young Adult, False Negative Reactions, ROC Curve
ABSTRACT
Phase separation is an important mechanism that mediates the spatial distribution of proteins in different cellular compartments. While phase-separated proteins share certain sequence characteristics, including intrinsically disordered regions (IDRs) and prion-like domains, such characteristics are insufficient for making accurate predictions; thus, a proteome-wide understanding of phase separation is currently lacking. Here, we define phase-separated proteomes based on the systematic analysis of immunofluorescence images of 12 073 proteins in the Human Protein Atlas. The analysis of these proteins reveals that phase-separated candidate proteins exhibit higher IDR contents, higher mean net charge and lower hydropathy and prefer to bind to RNA. Kinases and transcription factors are also enriched among these candidate proteins. Strikingly, both phase-separated kinases and phase-separated transcription factors display significantly reduced substrate specificity. Our work provides the first global view of the phase-separated proteome and suggests that the spatial proximity resulting from phase separation reduces the requirement for motif specificity and expands the repertoire of substrates. The source code and data are available at https://github.com/cheneyyu/deepphase.
Subjects
Intrinsically Disordered Proteins/chemistry, Proteome, Deep Learning, Immunofluorescence, Humans, Intrinsically Disordered Proteins/isolation & purification, Intrinsically Disordered Proteins/metabolism, Liquid-Liquid Extraction, Organelles/metabolism, Protein Conformation, Post-Translational Protein Processing
ABSTRACT
Inhibition of muscleblind-like (MBNL) activity due to sequestration by microsatellite expansion RNAs is a major pathogenic event in the RNA-mediated disease myotonic dystrophy (DM). Although MBNL1 and MBNL2 bind to nascent transcripts to regulate alternative splicing during muscle and brain development, another major binding site for the MBNL protein family is the 3' untranslated region of target RNAs. Here, we report that depletion of Mbnl proteins in mouse embryo fibroblasts leads to misregulation of thousands of alternative polyadenylation events. HITS-CLIP and minigene reporter analyses indicate that these polyadenylation switches are a direct consequence of MBNL binding to target RNAs. Misregulated alternative polyadenylation also occurs in skeletal muscle in a mouse polyCUG model and human DM, resulting in the persistence of neonatal polyadenylation patterns. These findings reveal an additional developmental function for MBNL proteins and demonstrate that DM is characterized by misregulation of pre-mRNA processing at multiple levels.
Subjects
Alternative Splicing/genetics, Carrier Proteins/genetics, DNA-Binding Proteins/genetics, Polyadenylation/genetics, RNA-Binding Proteins/genetics, 3' Untranslated Regions/genetics, Animals, Binding Sites/genetics, Carrier Proteins/metabolism, Cultured Cells, DNA-Binding Proteins/metabolism, Developmental Gene Expression Regulation, Humans, Mice, Microsatellite Repeats/genetics, Skeletal Muscle/cytology, Skeletal Muscle/metabolism, Myotonic Dystrophy/genetics, Protein Binding, RNA Interference, RNA Precursors/genetics, Post-Transcriptional RNA Processing/genetics, Messenger RNA/metabolism, Small Interfering RNA, RNA-Binding Proteins/metabolism
ABSTRACT
We combine the labeling of newly transcribed RNAs with 5-ethynyluridine with the characterization of bound proteins. This approach, named capture of the newly transcribed RNA interactome using click chemistry (RICK), systematically captures proteins bound to a wide range of RNAs, including nascent RNAs and traditionally neglected nonpolyadenylated RNAs. RICK has identified mitotic regulators amongst other novel RNA-binding proteins with preferential affinity for nonpolyadenylated RNAs, revealed a link between metabolic enzymes/factors and nascent RNAs, and expanded the known RNA-bound proteome of mouse embryonic stem cells. RICK will facilitate an in-depth interrogation of the total RNA-bound proteome in different cells and systems.
Subjects
Click Chemistry/methods, Proteome/metabolism, RNA-Binding Proteins/metabolism, RNA/metabolism, Animals, Embryonic Stem Cells/cytology, Embryonic Stem Cells/metabolism, HeLa Cells, High-Throughput Nucleotide Sequencing/methods, Humans, Mass Spectrometry/methods, Mice, Protein Interaction Maps, RNA/genetics, RNA-Binding Proteins/genetics, Uridine/analogs & derivatives, Uridine/chemistry
ABSTRACT
Chemotherapy is one of the conventional treatment methods for breast cancer, but drug toxicity and side effects have severely limited its clinical application. Photothermal therapy has emerged as a promising method that, in combination with chemotherapy, can better treat breast cancer. In this context, a biodegradable mesoporous silica nanoparticle (bMSN NP) system was developed for loading doxorubicin (DOX) and IR780, with potential application in the treatment of breast cancer. IR780 is encapsulated in the pores of the bMSN NPs by hydrophobic adsorption, while DOX is adsorbed electrostatically on the surface of the bMSN NPs via hyaluronic acid, forming the bMID NPs. Transmission electron microscopy, fluorescence spectroscopy, and UV absorption spectroscopy were used to confirm the successful encapsulation of IR780 and the loading of DOX. In vitro experiments showed that bMID NPs have an excellent therapeutic effect on breast cancer cells. In vivo fluorescence imaging indicated that bMID NPs gradually accumulate at tumor sites and achieve long-term circulation and continuous drug release in vivo. Furthermore, bMID NPs produced clear antitumor effects in breast cancer mouse models, making them an efficient platform for breast cancer therapy.
Subjects
Antineoplastic Agents/therapeutic use, Biocompatible Materials/chemistry, Breast Neoplasms/therapy, Hyaluronic Acid/chemistry, Induced Hyperthermia, Nanocomposites/chemistry, Phototherapy, Silicon Dioxide/chemistry, Animals, Cell Death/drug effects, Endocytosis, Female, Humans, MCF-7 Cells, Nude Mice, Nanoparticles/chemistry, Nanoparticles/ultrastructure, Porosity, Static Electricity, Tissue Distribution, Acute Toxicity Tests, Tumor Stem Cell Assay
ABSTRACT
Proprioceptive feedback from Group Ia/II muscle spindle afferents and Group Ib Golgi tendon afferents is critical for the normal execution of most motor tasks, yet how these distinct proprioceptor subtypes emerge during development remains poorly understood. Using molecular genetic approaches in mice of either sex, we identified 24 transcripts that have not previously been associated with a proprioceptor identity. Combinatorial expression analyses of these markers reveal at least three molecularly distinct proprioceptor subtypes. In addition, we find that 12 of these transcripts are expressed well after proprioceptors innervate their respective sensory receptors, and expression of three of these markers, including the heart development molecule Heg1, is significantly reduced in mice that lack muscle spindles. These data reveal Heg1 as a putative marker for proprioceptive muscle spindle afferents. Moreover, they suggest that the phenotypic specialization of functionally distinct proprioceptor subtypes depends, in part, on extrinsic sensory receptor organ-derived signals. SIGNIFICANCE STATEMENT: Sensory feedback from muscle spindle (MS) and Golgi tendon organ (GTO) sensory end organs is critical for normal motor control, but how distinct MS and GTO afferent sensory neurons emerge during development remains poorly understood. Using (bulk) transcriptome analysis of genetically identified proprioceptors, this work reveals molecular markers for distinct proprioceptor subsets, including some that appear selectively expressed in MS afferents. Detailed analysis of the expression of these transcripts provides evidence that MS/GTO afferent subtype phenotypes may, at least in part, emerge through extrinsic, sensory end organ-derived signals.
Subjects
Sensory Feedback/physiology, Mechanoreceptors/physiology, Muscle Spindles/physiology, Proprioception/physiology, Animals, Female, Male, Membrane Proteins/metabolism, Mice, Muscle Spindles/innervation, Phenotype
ABSTRACT
Coal and gas outbursts are among the most severe disasters threatening the safety of coal mines around the world. They are dynamic phenomena characterized by large quantities of coal and gas ejected from working faces within a short time. Numerous researchers have conducted studies on outburst prediction, and a variety of indices have been developed to this end. However, these indices are usually empirical or based on local experience, and the accurate prediction of outbursts is not feasible due to the complicated mechanisms of outbursts. To develop a more robust outburst prediction method, this study conducts outburst experiments using large-scale multifunctional equipment developed in the laboratory. The coal temperature during the outburst process was monitored using temperature sensors. The results show that the coal temperature decreased rapidly as the outburst progressed. Meanwhile, the coal temperature in locations far from the outburst mouth increased. The breaking of coal under stress concentration is the main factor causing the abnormal temperature rise. The discovery of these phenomena lays a theoretical foundation and provides an experimental basis for an effective outburst prediction method. We propose an outburst prediction method based on temperature monitoring, which has a simpler and faster operation process and is not easily disturbed by coal mining activities. Moreover, the critical values of coal temperature rises or temperature gradients can be flexibly adjusted according to the actual situations of different coal mines to predict outbursts more effectively and accurately.
ABSTRACT
Two polypyrimidine tract RNA-binding proteins (PTBs), one near-ubiquitously expressed (Ptbp1) and another highly tissue-restricted (Ptbp2), regulate RNA in interrelated but incompletely understood ways. Ptbp1, a splicing regulator, is replaced in the brain and differentiated neuronal cell lines by Ptbp2. To define the roles of Ptbp2 in the nervous system, we generated two independent Ptbp2-null strains, unexpectedly revealing that Ptbp2 is expressed in neuronal progenitors and is essential for postnatal survival. A HITS-CLIP (high-throughput sequencing cross-linking immunoprecipitation)-generated map of reproducible Ptbp2-RNA interactions in the developing mouse neocortex, combined with results from splicing-sensitive microarrays, demonstrated that the major action of Ptbp2 is to inhibit adult-specific alternative exons by binding pyrimidine-rich sequences upstream of and/or within them. These regulated exons are present in mRNAs encoding proteins associated with control of cell fate, proliferation, and the actin cytoskeleton, suggesting a role for Ptbp2 in neurogenesis. Indeed, neuronal progenitors in the Ptbp2-null brain exhibited an aberrant polarity and were associated with regions of premature neurogenesis and reduced progenitor pools. Thus, Ptbp2 inhibition of a discrete set of adult neuronal exons underlies early brain development prior to neuronal differentiation and is essential for postnatal survival.
Subjects
Alternative Splicing/physiology, Brain/embryology, Cell Differentiation/physiology, Nerve Tissue Proteins/metabolism, Neural Stem Cells/metabolism, Polypyrimidine Tract-Binding Protein/metabolism, Messenger RNA/metabolism, Animals, Brain/metabolism, Exons/physiology, Heterogeneous-Nuclear Ribonucleoproteins/genetics, Heterogeneous-Nuclear Ribonucleoproteins/metabolism, Mice, Mutant Mice, Nerve Tissue Proteins/genetics, Neural Stem Cells/cytology, Polypyrimidine Tract-Binding Protein/genetics, Messenger RNA/genetics
ABSTRACT
Summary: UV cross-linking and immunoprecipitation (CLIP), followed by high-throughput sequencing, is a powerful biochemical assay that maps in vivo protein-RNA interactions on a genome-wide scale. The CLIP Tool Kit (CTK) aims at providing a set of tools for flexible, streamlined and comprehensive CLIP data analysis. This software package extends the scope of our original CIMS package. Availability and Implementation: The software is implemented in Perl. The source code and detailed documentation are available at http://zhanglab.c2b2.columbia.edu/index.php/CTK . Contact: cz2294@columbia.edu.
Subjects
High-Throughput Nucleotide Sequencing/methods, Immunoprecipitation/methods, RNA-Binding Proteins/metabolism, RNA/metabolism, Software, Humans, Protein Binding, RNA Sequence Analysis/methods
ABSTRACT
Alternative splicing (AS) dramatically expands the complexity of the mammalian brain transcriptome, but its atlas remains incomplete. Here we performed deep mRNA sequencing of mouse cortex to discover and characterize alternative exons with potential functional significance. Our analysis expands the list of AS events over 10-fold compared with previous annotations, demonstrating that 72% of multiexon genes express multiple splice variants in this single tissue. To evaluate functionality of the newly discovered AS events, we conducted comprehensive analyses on central nervous system (CNS) cell type-specific splicing, targets of tissue- or cell type-specific RNA binding proteins (RBPs), evolutionary selection pressure, and coupling of AS with nonsense-mediated decay (AS-NMD). We show that newly discovered events account for 23-42% of all cassette exons under tissue- or cell type-specific regulation. Furthermore, over 7,000 cassette exons are under evolutionary selection for regulated AS in mammals, 70% of which are new. Among these are 3,058 highly conserved cassette exons, including 1,014 NMD exons that may function directly to control gene expression levels. These NMD exons are particularly enriched in RBPs including splicing factors and interestingly also regulators for other steps of RNA metabolism. Unexpectedly, a second group of NMD exons reside in genes encoding chromatin regulators. Although the conservation of NMD exons in RBPs frequently extends into lower vertebrates, NMD exons in chromatin regulators are introduced later into the mammalian lineage, implying the emergence of a novel mechanism coupling AS and epigenetics. Our results highlight previously uncharacterized complexity and evolution in the mammalian brain transcriptome.
Subjects
Alternative Splicing/genetics, Brain/metabolism, Chromatin/metabolism, Conserved Sequence/genetics, Exons/genetics, Mammals/genetics, Nonsense-Mediated mRNA Decay/genetics, Animals, Base Sequence, Cerebral Cortex/metabolism, Molecular Evolution, High-Throughput Nucleotide Sequencing, Humans, Male, Inbred C57BL Mice, Molecular Sequence Data, Open Reading Frames/genetics, Organ Specificity/genetics, Genetic Selection, Transcriptome/genetics
ABSTRACT
Recent studies have identified many genes with rare de novo mutations in autism, but a limited number of these have been conclusively established as disease-susceptibility genes due to the lack of recurrence and confounding background mutations. Such extreme genetic heterogeneity severely limits recurrence-based statistical power even in studies with a large sample size. Here, we use cell-type specific expression profiles to differentiate mutations in autism patients from those in unaffected siblings. We report a gene expression signature in different neuronal cell types shared by genes with likely gene-disrupting (LGD) mutations in autism cases. This signature reflects haploinsufficiency of risk genes enriched in transcriptional and post-transcriptional regulators, with the strongest positive associations with specific types of neurons in different brain regions, including cortical neurons, cerebellar granule cells, and striatal medium spiny neurons. When used to prioritize genes with a single LGD mutation in cases, a D-score derived from the signature achieved a precision of 40% as compared with the 15% baseline with a minimal loss in sensitivity. An ensemble model combining D-score with mutation intolerance metrics from Exome Aggregation Consortium further improved the precision to 60%, resulting in 117 high-priority candidates. These prioritized lists can facilitate identification of additional autism-susceptibility genes.
Subjects
Autistic Disorder/genetics, Genetic Association Studies, Genetic Predisposition to Disease, Haploinsufficiency, Transcriptome, Autistic Disorder/diagnosis, Human Chromosome Pair 16, Computational Biology/methods, DNA Copy Number Variations, DNA Mutational Analysis, Female, Gene Expression Profiling, Humans, Male, Mutation, Organ Specificity/genetics, Sensitivity and Specificity, Sex Factors, Exome Sequencing
ABSTRACT
The major cell classes of the brain differ in their developmental processes, metabolism, signaling, and function. To better understand the functions and interactions of the cell types that comprise these classes, we acutely purified representative populations of neurons, astrocytes, oligodendrocyte precursor cells, newly formed oligodendrocytes, myelinating oligodendrocytes, microglia, endothelial cells, and pericytes from mouse cerebral cortex. We generated a transcriptome database for these eight cell types by RNA sequencing and used a sensitive algorithm to detect alternative splicing events in each cell type. Bioinformatic analyses identified thousands of new cell type-enriched genes and splicing isoforms that will provide novel markers for cell identification, tools for genetic manipulation, and insights into the biology of the brain. For example, our data provide clues as to how neurons and astrocytes differ in their ability to dynamically regulate glycolytic flux and lactate generation attributable to unique splicing of PKM2, the gene encoding the glycolytic enzyme pyruvate kinase. This dataset will provide a powerful new resource for understanding the development and function of the brain. To ensure the widespread distribution of these datasets, we have created a user-friendly website (http://web.stanford.edu/group/barres_lab/brain_rnaseq.html) that provides a platform for analyzing and comparing transcription and alternative splicing profiles for various cell classes in the brain.