Abstract
Antibodies that gain specificity through a large insert encoding an extra domain were first described in 2016. In malaria-exposed individuals, an exon derived from the leukocyte-associated immunoglobulin-like receptor 1 (LAIR1) gene was integrated via a copy-and-paste insertion into the immunoglobulin heavy chain encoding region. A few years later, a second example was identified: a dual-exon integration from the leukocyte immunoglobulin-like receptor B1 (LILRB1) gene, which is located in close proximity to LAIR1. A dedicated high-throughput characterization of chimeric immunoglobulin heavy chain transcripts revealed that insertions from distant genomic regions (including mitochondrial DNA) can contribute to human antibody diversity. This review describes the modalities of insert-containing antibodies. The role of known DNA mobility aspects, such as genomic translocation, gene conversion, and DNA fragility, is discussed in the context of insert-antibody generation. Finally, the review covers why insert antibodies were omitted from past repertoire analyses and how insert antibodies can contribute to protective immunity or an autoreactive response.
Subjects
Exons; V(D)J Recombination; Humans; V(D)J Recombination/genetics; Exons/genetics; Animals; Antibodies/immunology; Antibodies/genetics; Receptors, Immunologic/genetics; Receptors, Immunologic/metabolism; Receptors, Immunologic/immunology; Antibody Diversity/genetics
Abstract
The activities of RNA polymerase and the spliceosome are responsible for the heterogeneity in the abundance and isoform composition of mRNA in human cells. However, the dynamics of these megadalton enzymatic complexes working in concert on endogenous genes have not been described. Here, we establish a quasi-genome-scale platform for observing synthesis and processing kinetics of single nascent RNA molecules in real time. We find that all observed genes show transcriptional bursting. We also observe large kinetic variation in intron removal for single introns in single cells, which is inconsistent with deterministic splice site selection. Transcriptome-wide footprinting of the U2AF complex, nascent RNA profiling, long-read sequencing, and lariat sequencing further reveal widespread stochastic recursive splicing within introns. We propose and validate a unified theoretical model to explain the general features of transcription and pervasive stochastic splice site selection.
Subjects
RNA Precursors/genetics; RNA Splice Sites/physiology; Transcription, Genetic; Exons/genetics; Humans; Introns/genetics; RNA Precursors/metabolism; RNA Splice Sites/genetics; RNA Splicing/genetics; RNA Splicing/physiology; RNA, Messenger/metabolism; Spliceosomes/metabolism; Transcriptome
Abstract
The splicing of pre-mRNAs into mature transcripts is remarkable for its precision, but the mechanisms by which the cellular machinery achieves such specificity are incompletely understood. Here, we describe a deep neural network that accurately predicts splice junctions from an arbitrary pre-mRNA transcript sequence, enabling precise prediction of noncoding genetic variants that cause cryptic splicing. Synonymous and intronic mutations with predicted splice-altering consequence validate at a high rate on RNA-seq and are strongly deleterious in the human population. De novo mutations with predicted splice-altering consequence are significantly enriched in patients with autism and intellectual disability compared to healthy controls and validate against RNA-seq in 21 out of 28 of these patients. We estimate that 9%-11% of pathogenic mutations in patients with rare genetic disorders are caused by this previously underappreciated class of disease variation.
Subjects
Forecasting/methods; RNA Precursors/genetics; RNA Splicing/genetics; Algorithms; Alternative Splicing/genetics; Autistic Disorder/genetics; Deep Learning; Exons/genetics; Humans; Intellectual Disability/genetics; Introns/genetics; Neural Networks, Computer; RNA Precursors/metabolism; RNA Splice Sites/genetics; RNA Splice Sites/physiology
Abstract
Despite a wealth of molecular knowledge, quantitative laws for accurate prediction of biological phenomena remain rare. Alternative pre-mRNA splicing is an important regulated step in gene expression frequently perturbed in human disease. To understand the combined effects of mutations during evolution, we quantified the effects of all possible combinations of exonic mutations accumulated during the emergence of an alternatively spliced human exon. This revealed that mutation effects scale non-monotonically with the inclusion level of an exon, with each mutation having maximum effect at a predictable intermediate inclusion level. This scaling is observed genome-wide for cis and trans perturbations of splicing, including for natural and disease-associated variants. Mathematical modeling suggests that competition between alternative splice sites is sufficient to cause this non-linearity in the genotype-phenotype map. Combining the global scaling law with specific pairwise interactions between neighboring mutations allows accurate prediction of the effects of complex genotype changes involving >10 mutations.
Subjects
Alternative Splicing/genetics; RNA Splicing/genetics; fas Receptor/genetics; Animals; Exons/genetics; Genetic Techniques; Genetics; Genotype; Humans; Introns/genetics; Mice; Models, Theoretical; Mutation/genetics; Phenotype; RNA Precursors/metabolism; RNA Splice Sites/genetics; RNA, Messenger/metabolism
Abstract
Many protein-coding genes in higher eukaryotes can produce circular RNAs (circRNAs) through back-splicing of exons. CircRNAs differ from mRNAs in their production, structure and turnover and thereby have unique cellular functions and potential biomedical applications. In this Review, I discuss recent progress in our understanding of the biogenesis of circRNAs and the regulation of their abundance and of their biological functions, including in transcription and splicing, sequestering or scaffolding of macromolecules to interfere with microRNA activities or signalling pathways, and serving as templates for translation. I further discuss the emerging roles of circRNAs in regulating immune responses and cell proliferation, and the possibilities of applying circRNA technologies in biomedical research.
Subjects
RNA, Circular/genetics; RNA, Circular/metabolism; RNA, Circular/physiology; Alternative Splicing/genetics; Animals; Exons/genetics; Gene Expression/genetics; Gene Expression Regulation/genetics; Humans; MicroRNAs/metabolism; RNA/genetics; RNA Splicing/genetics; RNA, Messenger/metabolism
Abstract
CRISPR-Cas technology has transformed functional genomics, yet understanding of how individual exons differentially shape cellular phenotypes remains limited. Here, we optimized and conducted massively parallel exon deletion and splice-site mutation screens in human cell lines to identify exons that regulate cellular fitness. Fitness-promoting exons are prevalent in essential and highly expressed genes and commonly overlap with protein domains and interaction interfaces. Conversely, fitness-suppressing exons are enriched in nonessential genes, exhibiting lower inclusion levels, and overlap with intrinsically disordered regions and disease-associated mutations. In-depth mechanistic investigation of the screen-hit TAF5 alternative exon-8 revealed that its inclusion is required for assembly of the TFIID general transcription initiation complex, thereby regulating global gene expression output. Collectively, our orthogonal exon perturbation screens established a comprehensive repository of phenotypically important exons and uncovered regulatory mechanisms governing cellular fitness and gene expression.
Subjects
Exons; Humans; Exons/genetics; CRISPR-Cas Systems; Transcription Factor TFIID/genetics; Transcription Factor TFIID/metabolism; Genetic Fitness; HEK293 Cells; TATA-Binding Protein Associated Factors/genetics; TATA-Binding Protein Associated Factors/metabolism; RNA Splice Sites; Mutation; Gene Expression Regulation; Alternative Splicing
Abstract
Alternative splicing significantly expands biological complexity, particularly in the vertebrate nervous system. Increasing evidence indicates that developmental and tissue-dependent alternative exons often control protein-protein interactions; yet, only a minor fraction of these events have been characterized. Using affinity purification-mass spectrometry (AP-MS), we show that approximately 60% of analyzed neural-differential exons in proteins previously implicated in transcriptional regulation result in the gain or loss of interaction partners, which in some cases form unexpected links with coupled processes. Notably, a neural exon in Chtop regulates its interaction with the Prmt1 methyltransferase and DExD-Box helicases Ddx39b/a, affecting its methylation and activity in promoting RNA export. Additionally, a neural exon in Sap30bp affects interactions with RNA processing factors, modulating a critical function of Sap30bp in promoting the splicing of <100 nt "mini-introns" that control nuclear RNA levels. AP-MS is thus a powerful approach for elucidating the multifaceted functions of proteins imparted by context-dependent alternative exons.
Subjects
Alternative Splicing; RNA Splicing; Exons/genetics; Introns; RNA
Abstract
N6-methyladenosine (m6A), a widespread destabilizing mark on mRNA, is non-uniformly distributed across the transcriptome, yet the basis for its selective deposition is unknown. Here, we propose that m6A deposition is not selective. Instead, it is exclusion based: m6A consensus motifs are methylated by default, unless they are within a window of ~100 nt from a splice junction. A simple model which we extensively validate, relying exclusively on presence of m6A motifs and exon-intron architecture, allows in silico recapitulation of experimentally measured m6A profiles. We provide evidence that exclusion from splice junctions is mediated by the exon junction complex (EJC), potentially via physical occlusion, and that previously observed associations between exon-intron architecture and mRNA decay are mechanistically mediated via m6A. Our findings establish a mechanism coupling nuclear mRNA splicing and packaging with the covalent installation of m6A, in turn controlling cytoplasmic decay.
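The exclusion rule stated in this abstract (consensus motifs are methylated by default unless they lie within roughly 100 nt of a splice junction) can be illustrated with a minimal sketch. This is not the authors' model: the function name `predict_m6a_sites`, the use of the DRACH pattern as the consensus motif, the hard 100-nt cutoff, and the choice of a spliced transcript with junction coordinates as input are all assumptions made for illustration.

```python
import re

# Assumption: the m6A consensus is taken to be DRACH
# (D = A/G/T, R = A/G, H = A/C/T), written here in DNA letters.
# A lookahead is used so that overlapping motifs are all reported.
DRACH = re.compile(r"(?=([AGT][AG]AC[ACT]))")


def predict_m6a_sites(transcript, junction_positions, exclusion_nt=100):
    """Return 0-based positions of motif adenosines predicted to be methylated.

    transcript: spliced mRNA sequence (hypothetical input, DNA alphabet).
    junction_positions: 0-based coordinates of exon-exon junctions.
    exclusion_nt: assumed half-width of the window around each junction
                  in which methylation is taken to be blocked.
    """
    sites = []
    for match in DRACH.finditer(transcript.upper()):
        a_pos = match.start() + 2  # the A is the third base of DRACH
        # Methylated by default, unless any junction is within the window.
        if all(abs(a_pos - j) > exclusion_nt for j in junction_positions):
            sites.append(a_pos)
    return sites
```

Under these assumptions, a motif deep inside a long exon is called methylated, while an identical motif a few nucleotides from a junction is not, which is the qualitative behaviour the abstract describes.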
Subjects
RNA Splicing; Transcriptome; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA Stability; Exons/genetics
Abstract
4.5SH RNA is a highly abundant, small rodent-specific noncoding RNA that localizes to nuclear speckles enriched in pre-mRNA-splicing regulators. To investigate the physiological functions of 4.5SH RNA, we have created mutant mice that lack the expression of 4.5SH RNA. The mutant mice exhibited embryonic lethality, suggesting that 4.5SH RNA is an essential species-specific noncoding RNA in mice. RNA-sequencing analyses revealed that 4.5SH RNA protects the transcriptome from abnormal exonizations of the antisense insertions of the retrotransposon SINE B1 (asB1), which would otherwise introduce deleterious premature stop codons or frameshift mutations. Mechanistically, 4.5SH RNA base pairs with complementary asB1-containing exons via the target recognition region and recruits effector proteins including Hnrnpm via its 5' stem loop region. The modular organization of 4.5SH RNA allows us to engineer a programmable splicing regulator to induce the skipping of target exons of interest. Our results also suggest the general existence of splicing regulatory noncoding RNAs.
Subjects
RNA Splicing; RNA, Small Untranslated; Mice; Animals; RNA Splicing/genetics; Exons/genetics; Retroelements/genetics; Codon, Nonsense; Alternative Splicing
Abstract
Early spliceosome assembly can occur through an intron-defined pathway, whereby U1 and U2 small nuclear ribonucleoprotein particles (snRNPs) assemble across the intron [1]. Alternatively, it can occur through an exon-defined pathway [2-5], whereby U2 binds the branch site located upstream of the defined exon and U1 snRNP interacts with the 5' splice site located directly downstream of it. The U4/U6.U5 tri-snRNP subsequently binds to produce a cross-intron (CI) or cross-exon (CE) pre-B complex, which is then converted to the spliceosomal B complex [6,7]. Exon definition promotes the splicing of upstream introns [2,8,9] and plays a key part in alternative splicing regulation [10-16]. However, the three-dimensional structure of exon-defined spliceosomal complexes and the molecular mechanism of the conversion from a CE-organized to a CI-organized spliceosome, a pre-requisite for splicing catalysis, remain poorly understood. Here cryo-electron microscopy analyses of human CE pre-B complex and B-like complexes reveal extensive structural similarities with their CI counterparts. The results indicate that the CE and CI spliceosome assembly pathways converge already at the pre-B stage. Add-back experiments using purified CE pre-B complexes, coupled with cryo-electron microscopy, elucidate the order of the extensive remodelling events that accompany the formation of B complexes and B-like complexes. The molecular triggers and roles of B-specific proteins in these rearrangements are also identified. We show that CE pre-B complexes can productively bind in trans to a U1 snRNP-bound 5' splice site. Together, our studies provide new mechanistic insights into the CE to CI switch during spliceosome assembly and its effect on pre-mRNA splice site pairing at this stage.
Subjects
Exons; Introns; RNA Splicing; Spliceosomes; Humans; Alternative Splicing; Cryoelectron Microscopy; Exons/genetics; Introns/genetics; Models, Molecular; RNA Splice Sites/genetics; RNA Splicing/genetics; Spliceosomes/metabolism; Spliceosomes/chemistry; Spliceosomes/ultrastructure; Ribonucleoproteins, Small Nuclear/chemistry; Ribonucleoproteins, Small Nuclear/metabolism; Ribonucleoproteins, Small Nuclear/ultrastructure
Abstract
The loss of the tail is among the most notable anatomical changes to have occurred along the evolutionary lineage leading to humans and to the 'anthropomorphous apes' [1-3], with a proposed role in contributing to human bipedalism [4-6]. Yet, the genetic mechanism that facilitated tail-loss evolution in hominoids remains unknown. Here we present evidence that an individual insertion of an Alu element in the genome of the hominoid ancestor may have contributed to tail-loss evolution. We demonstrate that this Alu element, inserted into an intron of the TBXT gene [7-9], pairs with a neighbouring ancestral Alu element encoded in the reverse genomic orientation and leads to a hominoid-specific alternative splicing event. To study the effect of this splicing event, we generated multiple mouse models that express both full-length and exon-skipped isoforms of Tbxt, mimicking the expression pattern of its hominoid orthologue TBXT. Mice expressing both Tbxt isoforms exhibit a complete absence of the tail or a shortened tail depending on the relative abundance of Tbxt isoforms expressed at the embryonic tail bud. These results support the notion that the exon-skipped transcript is sufficient to induce a tail-loss phenotype. Moreover, mice expressing the exon-skipped Tbxt isoform develop neural tube defects, a condition that affects approximately 1 in 1,000 neonates in humans [10]. Thus, tail-loss evolution may have been associated with an adaptive cost of the potential for neural tube defects, which continue to affect human health today.
Subjects
Alternative Splicing; Evolution, Molecular; Hominidae; T-Box Domain Proteins; Tail; Animals; Humans; Mice; Alternative Splicing/genetics; Alu Elements/genetics; Disease Models, Animal; Genome/genetics; Hominidae/anatomy & histology; Hominidae/genetics; Introns/genetics; Neural Tube Defects/genetics; Neural Tube Defects/metabolism; Phenotype; Protein Isoforms/deficiency; Protein Isoforms/genetics; Protein Isoforms/metabolism; T-Box Domain Proteins/deficiency; T-Box Domain Proteins/genetics; T-Box Domain Proteins/metabolism; Tail/anatomy & histology; Tail/embryology; Exons/genetics
Abstract
Transposable elements (TEs) are a major constituent of human genes, occupying approximately half of the intronic space. During pre-messenger RNA synthesis, intronic TEs are transcribed along with their host genes but rarely contribute to the final mRNA product because they are spliced out together with the intron and rapidly degraded. Paradoxically, TEs are an abundant source of RNA-processing signals through which they can create new introns [1], and also functional [2] or non-functional chimeric transcripts [3]. The rarity of these events implies the existence of a resilient splicing code that is able to suppress TE exonization without compromising host pre-mRNA processing. Here we show that SAFB proteins protect genome integrity by preventing retrotransposition of L1 elements while maintaining splicing integrity, via prevention of the exonization of previously integrated TEs. This unique dual role is possible because of L1's conserved adenosine-rich coding sequences that are bound by SAFB proteins. The suppressive activity of SAFB extends to tissue-specific, giant protein-coding cassette exons, nested genes and Tigger DNA transposons. Moreover, SAFB also suppresses LTR/ERV elements in species in which they are still active, such as mice and flies. A significant subset of splicing events suppressed by SAFB in somatic cells are activated in the testis, coinciding with low SAFB expression in postmeiotic spermatids. Reminiscent of the division of labour between innate and adaptive immune systems that fight external pathogens, our results uncover SAFB proteins as an RNA-based, pattern-guided, non-adaptive defence system against TEs in the soma, complementing the RNA-based, adaptive Piwi-interacting RNA pathway of the germline.
Subjects
DNA Transposable Elements; Introns; RNA Precursors; RNA Splicing; RNA, Messenger; Animals; Humans; Male; Mice; DNA Transposable Elements/genetics; Drosophila melanogaster/genetics; Exons/genetics; Genome/genetics; Introns/genetics; Organ Specificity/genetics; Piwi-Interacting RNA/genetics; Piwi-Interacting RNA/metabolism; RNA Precursors/genetics; RNA Precursors/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; Spermatids/cytology; Spermatids/metabolism; RNA Splicing/genetics; Testis; Meiosis
Abstract
Timothy syndrome (TS) is a severe, multisystem disorder characterized by autism, epilepsy, long-QT syndrome and other neuropsychiatric conditions [1]. TS type 1 (TS1) is caused by a gain-of-function variant in the alternatively spliced and developmentally enriched CACNA1C exon 8A, as opposed to its counterpart exon 8. We previously uncovered several phenotypes in neurons derived from patients with TS1, including delayed channel inactivation, prolonged depolarization-induced calcium rise, impaired interneuron migration, activity-dependent dendrite retraction and an unanticipated persistent expression of exon 8A [2-6]. We reasoned that switching CACNA1C exon utilization from 8A to 8 would represent a potential therapeutic strategy. Here we developed antisense oligonucleotides (ASOs) to effectively decrease the inclusion of exon 8A in human cells both in vitro and, following transplantation, in vivo. We discovered that the ASO-mediated switch from exon 8A to 8 robustly rescued defects in patient-derived cortical organoids and migration in forebrain assembloids. Leveraging a transplantation platform previously developed [7], we found that a single intrathecal ASO administration rescued calcium changes and in vivo dendrite retraction of patient neurons, suggesting that suppression of CACNA1C exon 8A expression is a potential treatment for TS1. Broadly, these experiments illustrate how a multilevel, in vivo and in vitro stem cell model-based approach can identify strategies to reverse disease-relevant neural pathophysiology.
Subjects
Autistic Disorder; Long QT Syndrome; Oligonucleotides, Antisense; Syndactyly; Animals; Female; Humans; Male; Mice; Alternative Splicing/drug effects; Alternative Splicing/genetics; Autistic Disorder/drug therapy; Autistic Disorder/genetics; Calcium/metabolism; Calcium Channels, L-Type/metabolism; Calcium Channels, L-Type/genetics; Cell Movement/drug effects; Dendrites/metabolism; Exons/genetics; Long QT Syndrome/drug therapy; Long QT Syndrome/genetics; Neurons/metabolism; Neurons/drug effects; Oligonucleotides, Antisense/pharmacology; Oligonucleotides, Antisense/therapeutic use; Organoids/drug effects; Organoids/metabolism; Prosencephalon/metabolism; Prosencephalon/cytology; Syndactyly/drug therapy; Syndactyly/genetics; Interneurons/cytology; Interneurons/drug effects
Abstract
Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that are determinative for splicing timing.
Subjects
Alternative Splicing; RNA Splicing; Humans; Base Sequence; Introns/genetics; Exons/genetics
Abstract
How the splicing machinery defines exons or introns as the spliced unit has remained a puzzle for 30 years. Here, we demonstrate that peripheral and central regions of the nucleus harbor genes with two distinct exon-intron GC content architectures that differ in the splicing outcome. Genes with low GC content exons, flanked by long introns with lower GC content, are localized in the periphery, and the exons are defined as the spliced unit. Alternative splicing of these genes results in exon skipping. In contrast, the nuclear center contains genes with a high GC content in the exons and short flanking introns. Most splicing of these genes occurs via intron definition, and aberrant splicing leads to intron retention. We demonstrate that the nuclear periphery and center generate different environments for the regulation of alternative splicing and that two sets of splicing factors form discrete regulatory subnetworks for the two gene architectures. Our study connects 3D genome organization and splicing, thus demonstrating that exon and intron definition modes of splicing occur in different nuclear regions.
Subjects
Alternative Splicing; RNA Splicing; Base Composition; Exons/genetics; Introns/genetics
Abstract
Alternative splicing (AS) is a critical regulatory layer; yet, factors controlling functionally coordinated splicing programs during developmental transitions are poorly understood. Here, we employ a screening strategy to identify factors controlling dynamic splicing events important for mammalian neurogenesis. Among previously unknown regulators, Rbm38 acts widely to negatively control neural AS, in part through interactions mediated by the established repressor of splicing, Ptbp1. Puf60, a ubiquitous factor, is surprisingly found to promote neural splicing patterns. This activity requires a conserved, neural-differential exon that remodels Puf60 co-factor interactions. Ablation of this exon rewires distinct AS networks in embryonic stem cells and at different stages of mouse neurogenesis. Single-cell transcriptome analyses further reveal distinct roles for Rbm38 and Puf60 isoforms in establishing neuronal identity. Our results describe important roles for previously unknown regulators of neurogenesis and establish how an alternative exon in a widely expressed splicing factor orchestrates temporal control over cell differentiation.
Subjects
Neurogenesis; RNA Splicing; Alternative Splicing; Animals; Exons/genetics; Mammals; Mice; Neurogenesis/genetics; Neurons; RNA-Binding Proteins/genetics
Abstract
Pre-mRNA splicing involves two sequential reactions: branching and exon ligation. The C complex after branching undergoes remodeling to become the C∗ complex, which executes exon ligation. Here, we report cryo-EM structures of two intermediate human spliceosomal complexes, pre-C∗-I and pre-C∗-II, both at 3.6 Å. In both structures, the 3' splice site is already docked into the active site, the ensuing 3' exon sequences are anchored on PRP8, and the step II factor FAM192A contacts the duplex between U2 snRNA and the branch site. In the transition of pre-C∗-I to pre-C∗-II, the step II factors Cactin, FAM32A, PRKRIP1, and SLU7 are recruited. Notably, the RNA helicase PRP22 is positioned quite differently in the pre-C∗-I, pre-C∗-II, and C∗ complexes, suggesting a role in 3' exon binding and proofreading. Together with information on human C and C∗ complexes, our studies recapitulate a molecular choreography of the C-to-C∗ transition, revealing mechanistic insights into exon ligation.
Subjects
Saccharomyces cerevisiae Proteins; Spliceosomes; Exons/genetics; Humans; RNA Precursors/metabolism; RNA Splice Sites; RNA Splicing; RNA Splicing Factors/genetics; RNA Splicing Factors/metabolism; RNA, Small Nuclear/genetics; Saccharomyces cerevisiae Proteins/metabolism; Spliceosomes/metabolism
Abstract
Despite a long appreciation for the role of nonsense-mediated mRNA decay (NMD) in destroying faulty, disease-causing mRNAs and maintaining normal, physiologic mRNA abundance, additional effectors that regulate NMD activity in mammalian cells continue to be identified. Here, we describe a haploid-cell genetic screen for NMD effectors that has unexpectedly identified 13 proteins constituting the AKT signaling pathway. We show that AKT supersedes UPF2 in exon-junction complexes (EJCs) that are devoid of RNPS1 but contain CASC3, defining an unanticipated insulin-stimulated EJC. Without altering UPF1 RNA binding or ATPase activity, AKT-mediated phosphorylation of the UPF1 CH domain at T151 augments UPF1 helicase activity, which is critical for NMD and also decreases the dependence of helicase activity on ATP. We demonstrate that upregulation of AKT signaling contributes to the hyperactivation of NMD that typifies Fragile X syndrome, as exemplified using FMR1-KO neural stem cells derived from induced pluripotent stem cells.
Subjects
Nonsense Mediated mRNA Decay; Proto-Oncogene Proteins c-akt; Animals; Codon, Nonsense/genetics; Exons/genetics; Mammals/metabolism; Proto-Oncogene Proteins c-akt/genetics; Proto-Oncogene Proteins c-akt/metabolism; RNA Helicases/genetics; RNA Helicases/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; Trans-Activators/genetics; Trans-Activators/metabolism; Transcription Factors/metabolism
Abstract
Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data [1,2]. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions [3,4]. We find that human SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence converted on average per human haplotype. We develop a genome-wide map of IGC donors and acceptors, including 498 acceptor and 454 donor hotspots affecting the exons of about 800 protein-coding genes. These include 171 genes that have 'relocated' on average 1.61 megabase pairs in a subset of human haplotypes. Using a coalescent framework, we show that SD regions are slightly evolutionarily older when compared to unique sequences, probably owing to IGC. SNVs in SDs, however, show a distinct mutational spectrum: a 27.1% increase in transversions that convert cytosine to guanine or the reverse across all triplet contexts and a 7.6% reduction in the frequency of CpG-associated mutations when compared to unique DNA. We reason that these distinct mutational properties help to maintain an overall higher GC content of SD DNA compared to that of unique DNA, probably driven by GC-biased conversion between paralogous sequences [5,6].
Subjects
Gene Conversion; Mutation; Segmental Duplications, Genomic; Humans; Gene Conversion/genetics; Genome, Human/genetics; Polymorphism, Single Nucleotide/genetics; Haplotypes/genetics; Exons/genetics; Cytosine/chemistry; Guanine/chemistry; CpG Islands/genetics
Abstract
Exitron splicing (EIS) creates a cryptic intron (called an exitron) within a protein-coding exon to increase proteome diversity. EIS is poorly characterized, but emerging evidence suggests a role for EIS in cancer. Through a systematic investigation of EIS across 33 cancers from 9,599 tumor transcriptomes, we discovered that EIS affected 63% of human coding genes and that 95% of those events were tumor specific. Notably, we observed a mutually exclusive pattern between EIS and somatic mutations in their affected genes. Functionally, we discovered that EIS altered known and novel cancer driver genes for causing gain- or loss-of-function, which promotes tumor progression. Importantly, we identified EIS-derived neoepitopes that bind to major histocompatibility complex (MHC) class I or II. Analysis of clinical data from a clear cell renal cell carcinoma cohort revealed an association between EIS-derived neoantigen load and checkpoint inhibitor response. Our findings establish the importance of considering EIS alterations when nominating cancer driver events and neoantigens.