ABSTRACT
The pause-release model of transcription proposes that 40-100 bases from the start site RNA Pol II pauses, followed by release into productive elongation. Pause release is facilitated by the PTEFb phosphorylation of the RNA Pol II elongation factor, Spt5. We mapped paused polymerases by eNET-seq and found frequent pausing in zones that extend ~0.3-3 kb into genes even when PTEFb is inhibited. The fraction of paused polymerases or pausing propensity declines gradually over several kb and not abruptly as predicted for a discrete pause-release event. Spt5 depletion extends pausing zones, suggesting that it promotes the maturation of elongation complexes to a low-pausing state. The expression of mutants after Spt5 depletion showed that phosphomimetic substitutions in the CTR1 domain diminished pausing throughout genes. By contrast, mutants that prevent the phosphorylation of the Spt5 RNA-binding domain strengthened pausing. Thus, distinct Spt5 phospho-isoforms set the balance between pausing and elongation.
Subjects
RNA Polymerase II , Transcriptional Elongation Factors , Peptide Elongation Factors/metabolism , Phosphorylation , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Transcription, Genetic , Transcriptional Elongation Factors/genetics , Transcriptional Elongation Factors/metabolism
ABSTRACT
The exonuclease torpedo Xrn2 loads onto nascent RNA 5'-PO4 ends and chases down pol II to promote termination downstream from polyA sites. We report that Xrn2 is recruited to preinitiation complexes and "travels" to 3' ends of genes. Mapping of 5'-PO4 ends in nascent RNA identified Xrn2 loading sites stabilized by an active site mutant, Xrn2(D235A). Xrn2 loading sites are approximately two to 20 bases downstream from where CPSF73 cleaves at polyA sites and histone 3' ends. We propose that processing of all mRNA 3' ends comprises cleavage and limited 5'-3' trimming by CPSF73, followed by handoff to Xrn2. A similar handoff occurs at tRNA 3' ends, where cotranscriptional RNase Z cleavage generates novel Xrn2 substrates. Exonuclease-dead Xrn2 increased transcription in 3' flanking regions by inhibiting polyA site-dependent termination. Surprisingly, the mutant Xrn2 also rescued transcription in promoter-proximal regions to the same extent as in 3' flanking regions. eNET-seq revealed Xrn2-mediated degradation of sense and antisense nascent RNA within a few bases of the TSS, where 5'-PO4 ends may be generated by decapping or endonucleolytic cleavage. These results suggest that a major fraction of pol II complexes terminates prematurely close to the start site under normal conditions by an Xrn2-mediated torpedo mechanism.
Subjects
Poly A , RNA Polymerase II , RNA Polymerase II/genetics , Cell Nucleus , Exonucleases , RNA, Antisense
ABSTRACT
Most RNA processing occurs co-transcriptionally. We interrogated nascent pol II transcripts by chemical and enzymatic probing and determined how the "nascent RNA structureome" relates to splicing, A-I editing and transcription speed. RNA folding within introns and steep structural transitions at splice sites are associated with efficient co-transcriptional splicing. A slow pol II mutant elicits extensive remodeling into more folded conformations with increased A-I editing. Introns that become more structured at their 3' splice sites get co-transcriptionally excised more efficiently. Slow pol II altered folding of intronic Alu elements where cryptic splicing and intron retention are stimulated, an outcome mimicked by UV, which decelerates transcription. Slow transcription also remodeled RNA folding around alternative exons in distinct ways that predict whether skipping or inclusion is favored, even though it occurs post-transcriptionally. Hence, co-transcriptional RNA folding modulates post-transcriptional alternative splicing. In summary, the plasticity of nascent transcripts has widespread effects on RNA processing.
Subjects
Alternative Splicing/genetics , RNA Processing, Post-Transcriptional/genetics , RNA/genetics , Transcription, Genetic/genetics , Cell Line , Exons/genetics , HEK293 Cells , Humans , Introns/genetics , RNA Folding/genetics , RNA Polymerase II/genetics , RNA Precursors/genetics , RNA Splice Sites/genetics
ABSTRACT
Critical COVID-19 is caused by immune-mediated inflammatory lung injury. Host genetic variation influences the development of illness requiring critical care [1] or hospitalization [2-4] after infection with SARS-CoV-2. The GenOMICC (Genetics of Mortality in Critical Care) study enables the comparison of genomes from individuals who are critically ill with those of population controls to find underlying disease mechanisms. Here we use whole-genome sequencing in 7,491 critically ill individuals compared with 48,400 controls to discover and replicate 23 independent variants that significantly predispose to critical COVID-19. We identify 16 new independent associations, including variants within genes that are involved in interferon signalling (IL10RB and PLSCR1), leucocyte differentiation (BCL11A) and blood-type antigen secretor status (FUT2). Using transcriptome-wide association and colocalization to infer the effect of gene expression on disease severity, we find evidence that implicates multiple genes, including reduced expression of a membrane flippase (ATP11A) and increased expression of a mucin (MUC1), in critical disease. Mendelian randomization provides evidence in support of causal roles for myeloid cell adhesion molecules (SELE, ICAM5 and CD209) and the coagulation factor F8, all of which are potentially druggable targets. Our results are broadly consistent with a multi-component model of COVID-19 pathophysiology, in which at least two distinct mechanisms can predispose to life-threatening disease: failure to control viral replication; or an enhanced tendency towards pulmonary inflammation and intravascular coagulation. We show that comparison between cases of critical illness and population controls is highly efficient for the detection of therapeutically relevant mechanisms of disease.
Subjects
COVID-19 , Critical Illness , Genome, Human , Host-Pathogen Interactions , Whole Genome Sequencing , ATP-Binding Cassette Transporters , COVID-19/genetics , COVID-19/mortality , COVID-19/pathology , COVID-19/virology , Cell Adhesion Molecules , Critical Care , Critical Illness/mortality , E-Selectin , Factor VIII , Fucosyltransferases , Genome, Human/genetics , Genome-Wide Association Study , Host-Pathogen Interactions/genetics , Humans , Interleukin-10 Receptor beta Subunit , Lectins, C-Type , Mucin-1 , Nerve Tissue Proteins , Phospholipid Transfer Proteins , Receptors, Cell Surface , Repressor Proteins , SARS-CoV-2/pathogenicity , Galactoside 2-alpha-L-fucosyltransferase
ABSTRACT
CDK7 associates with the 10-subunit TFIIH complex and regulates transcription by phosphorylating the C-terminal domain (CTD) of RNA polymerase II (RNAPII). Few additional CDK7 substrates are known. Here, using the covalent inhibitor SY-351 and quantitative phosphoproteomics, we identified CDK7 kinase substrates in human cells. Among hundreds of high-confidence targets, the vast majority are unique to CDK7 (i.e., distinct from other transcription-associated kinases), with a subset that suggest novel cellular functions. Transcription-associated factors were predominant CDK7 substrates, including SF3B1, U2AF2, and other splicing components. Accordingly, widespread and diverse splicing defects, such as alternative exon inclusion and intron retention, were characterized in CDK7-inhibited cells. Combined with biochemical assays, we establish that CDK7 directly activates other transcription-associated kinases CDK9, CDK12, and CDK13, invoking a "master regulator" role in transcription. We further demonstrate that TFIIH restricts CDK7 kinase function to the RNAPII CTD, whereas other substrates (e.g., SPT5 and SF3B1) are phosphorylated by the three-subunit CDK-activating kinase (CAK; CCNH, MAT1, and CDK7). These results suggest new models for CDK7 function in transcription and implicate CAK dissociation from TFIIH as essential for kinase activation. This straightforward regulatory strategy ensures CDK7 activation is spatially and temporally linked to transcription, and may apply toward other transcription-associated kinases.
Subjects
Cyclin-Dependent Kinases/metabolism , Models, Biological , Transcription Factor TFIIH/metabolism , Transcription, Genetic/genetics , Alternative Splicing/genetics , Cell Survival/drug effects , Cyclin-Dependent Kinases/antagonists & inhibitors , Cyclin-Dependent Kinases/genetics , Enzyme Activation/genetics , HL-60 Cells , Humans , Cyclin-Dependent Kinase-Activating Kinase
ABSTRACT
In addition to phosphodiester bond formation, RNA polymerase II has an RNA endonuclease activity, stimulated by TFIIS, which rescues complexes that have arrested and backtracked. How TFIIS affects transcription under normal conditions is poorly understood. We identified backtracking sites in human cells using a dominant-negative TFIIS (TFIISDN) that inhibits RNA cleavage and stabilizes backtracked complexes. Backtracking is most frequent within 2 kb of start sites, consistent with slow elongation early in transcription, and in 3' flanking regions where termination is enhanced by TFIISDN, suggesting that backtracked pol II is a favorable substrate for termination. Rescue from backtracking by RNA cleavage also promotes escape from 5' pause sites, prevents premature termination of long transcripts, and enhances activation of stress-inducible genes. TFIISDN slowed elongation rates genome-wide by half, suggesting that rescue of backtracked pol II by TFIIS is a major stimulus of elongation under normal conditions.
Subjects
RNA Cleavage , RNA Polymerase II/metabolism , RNA/metabolism , Transcription Elongation, Genetic , Transcription Termination, Genetic , Transcriptional Activation , 3' Flanking Region , Animals , Gene Expression Regulation , HEK293 Cells , Humans , Kinetics , Mice , Mutation , RNA/genetics , RNA Polymerase II/genetics , Transcriptional Elongation Factors/genetics , Transcriptional Elongation Factors/metabolism
ABSTRACT
Control of transcription speed, which influences many co-transcriptional processes, is poorly understood. We report that PNUTS-PP1 phosphatase is a negative regulator of RNA polymerase II (Pol II) elongation rate. The PNUTS W401A mutation, which disrupts PP1 binding, causes genome-wide acceleration of transcription associated with hyper-phosphorylation of the Spt5 elongation factor. Immediately downstream of poly(A) sites, Pol II decelerates from >2 kb/min to <1 kb/min, which correlates with Spt5 dephosphorylation. Pol II deceleration and Spt5 dephosphorylation require poly(A) site recognition and the PNUTS-PP1 complex, which is in turn necessary for transcription termination. These results lead to a model for termination, the "sitting duck torpedo" mechanism, where poly(A) site-dependent deceleration caused by PNUTS-PP1 and Spt5 dephosphorylation is required to convert Pol II into a viable target for the Xrn2 terminator exonuclease. Spt5 and its bacterial homolog NusG therefore have related functions controlling kinetic competition between RNA polymerases and the termination factors that pursue them.
Subjects
DNA-Binding Proteins/metabolism , Exoribonucleases/metabolism , Protein Phosphatase 1/metabolism , Protein Processing, Post-Translational , RNA Polymerase II/metabolism , RNA, Messenger/biosynthesis , RNA-Binding Proteins/metabolism , Transcription Termination, Genetic , Binding Sites , DNA-Binding Proteins/genetics , Exoribonucleases/genetics , HEK293 Cells , Humans , Kinetics , Nuclear Proteins/genetics , Phosphorylation , Poly A/metabolism , Protein Binding , Protein Phosphatase 1/genetics , RNA, Messenger/genetics , RNA-Binding Proteins/genetics , Signal Transduction , Transcriptional Elongation Factors/genetics
ABSTRACT
Transcription of eukaryotic genes by RNA polymerase II (Pol II) yields RNA precursors containing introns that must be spliced out and the flanking exons ligated together. Splicing is catalyzed by a dynamic ribonucleoprotein complex called the spliceosome. Recent evidence has shown that a large fraction of splicing occurs cotranscriptionally as the RNA chain is extruded from Pol II at speeds of up to 5 kb/minute. Splicing is more efficient when it is tethered to the transcription elongation complex, and this linkage permits functional coupling of splicing with transcription. We discuss recent progress that has uncovered a network of connections that link splicing to transcript elongation and other cotranscriptional RNA processing events.
Subjects
RNA Precursors , Transcription, Genetic , RNA Precursors/genetics , RNA Splicing/genetics , Spliceosomes/genetics , Spliceosomes/metabolism , Introns
ABSTRACT
Transcription elongation rate influences cotranscriptional pre-mRNA maturation, but how such kinetic coupling works is poorly understood. The formation of nonadenylated histone mRNA 3' ends requires recognition of an RNA structure by stem-loop-binding protein (SLBP). We report that slow transcription by mutant RNA polymerase II (Pol II) caused accumulation of polyadenylated histone mRNAs that extend past the stem-loop processing site. UV irradiation, which decelerates Pol II elongation, also induced long poly(A)+ histone transcripts. Inhibition of 3' processing by slow Pol II correlates with failure to recruit SLBP to histone genes. Chemical probing of nascent RNA structure showed that the stem-loop fails to fold in transcripts made by slow Pol II, thereby explaining the absence of SLBP and failure to process 3' ends. These results show that regulation of transcription speed can modulate pre-mRNA processing by changing nascent RNA structure and suggest a mechanism by which alternative processing could be controlled.
Subjects
Histones/genetics , RNA 3' End Processing , RNA Precursors/metabolism , RNA, Messenger/metabolism , Transcription Elongation, Genetic , HEK293 Cells , Histones/metabolism , Humans , Kinetics , Nuclear Proteins/metabolism , RNA Folding , RNA Precursors/chemistry , RNA, Messenger/chemistry , Transcription, Genetic/radiation effects , Ultraviolet Rays , mRNA Cleavage and Polyadenylation Factors/metabolism
ABSTRACT
Paused RNA polymerase II (Pol II) that piles up near most human promoters is the target of mechanisms that control entry into productive elongation. Whether paused Pol II is a stable or dynamic target remains unresolved. We report that most 5' paused Pol II throughout the genome is turned over within 2 min. This process is revealed under hypertonic conditions that prevent Pol II recruitment to promoters. This turnover requires cell viability but is not prevented by inhibiting transcription elongation, suggesting that it is mediated at the level of termination. When initiation was prevented by triptolide during recovery from high salt, a novel preinitiated state of Pol II lacking the pausing factor Spt5 accumulated at transcription start sites. We propose that Pol II occupancy near 5' ends is governed by a cycle of ongoing assembly of preinitiated complexes that transition to pause sites followed by eviction from the DNA template. This model suggests that mechanisms regulating the transition to productive elongation at pause sites operate on a dynamic population of Pol II that is turning over at rates far higher than previously suspected. We suggest that a plausible alternative to elongation control via escape from a stable pause is by escape from premature termination.
Subjects
Promoter Regions, Genetic , RNA Polymerase II/metabolism , Transcription Initiation, Genetic , Diterpenes/pharmacology , Epoxy Compounds/pharmacology , HCT116 Cells , Humans , Isotonic Solutions , Phenanthrenes/pharmacology , Saline Solution, Hypertonic , Transcription Elongation, Genetic/drug effects , Transcription Initiation, Genetic/drug effects
ABSTRACT
Intrachromosomal amplification of chromosome 21 defines a subtype of high-risk childhood acute lymphoblastic leukemia (iAMP21-ALL) characterized by copy number changes and complex rearrangements of chromosome 21. The genomic basis of iAMP21-ALL and the pathogenic role of the region of amplification of chromosome 21 in leukemogenesis remain incompletely understood. In this study, using integrated whole genome and transcriptome sequencing of 124 patients with iAMP21-ALL, including rare cases arising in the context of constitutional chromosomal aberrations, we identified subgroups of iAMP21-ALL based on the patterns of copy number alteration and structural variation. This large data set enabled formal delineation of a 7.8 Mb common region of amplification harboring 71 genes, 43 of which were differentially expressed compared with non-iAMP21-ALL ones, including multiple genes implicated in the pathogenesis of acute leukemia (CHAF1B, DYRK1A, ERG, HMGN1, and RUNX1). Using multimodal single-cell genomic profiling, including single-cell whole genome sequencing of 2 cases, we documented clonal heterogeneity and genomic evolution, demonstrating that the acquisition of the iAMP21 chromosome is an early event that may undergo progressive amplification during disease ontogeny. We show that UV-mutational signatures and high mutation load are characteristic secondary genetic features. Although the genomic alterations of chromosome 21 are variable, these integrated genomic analyses and demonstration of an extended common minimal region of amplification broaden the definition of iAMP21-ALL for more precise diagnosis using cytogenetic or genomic methods to inform clinical management.
Subjects
Chromosomes, Human, Pair 21 , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Humans , Child , Chromosomes, Human, Pair 21/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Chromosome Aberrations , Cytogenetics , Genomics , Chromatin Assembly Factor-1/genetics
ABSTRACT
Eukaryotic genes are marked by conserved post-translational modifications on the RNA pol II C-terminal domain (CTD) and the chromatin template. How the 5'-3' profiles of these marks are established is poorly understood. Using pol II mutants in human cells, we found that slow transcription repositioned specific co-transcriptionally deposited chromatin modifications; histone H3 lysine 36 trimethyl (H3K36me3) shifted within genes toward 5' ends, and histone H3 lysine 4 dimethyl (H3K4me2) extended farther upstream of start sites. Slow transcription also evoked a hyperphosphorylation of CTD Ser2 residues at 5' ends of genes that is conserved in yeast. We propose a "dwell time in the target zone" model to explain the effects of transcriptional dynamics on the establishment of co-transcriptionally deposited protein modifications. Promoter-proximal Ser2 phosphorylation is associated with a longer pol II dwell time at start sites and reduced transcriptional polarity because of strongly enhanced divergent antisense transcription at promoters. These results demonstrate that pol II dynamics help govern the decision between sense and divergent antisense transcription.
Subjects
Chromatin Assembly and Disassembly , Chromatin/enzymology , DNA, Fungal/metabolism , RNA Polymerase II/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/enzymology , Transcription, Genetic , Chromatin/genetics , DNA, Fungal/genetics , Gene Expression Regulation, Fungal , HEK293 Cells , Humans , Mutation , Phosphorylation , Protein Domains , RNA Polymerase II/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Time Factors
ABSTRACT
DNA damage induces apoptosis and many apoptotic genes are regulated via alternative splicing (AS), but little is known about the control mechanisms. Here we show that ultraviolet irradiation (UV) affects cotranscriptional AS in a p53-independent way, through the hyperphosphorylation of RNA polymerase II carboxy-terminal domain (CTD) and a subsequent inhibition of transcriptional elongation, estimated in vivo and in real time. Phosphomimetic CTD mutants not only display lower elongation but also duplicate the UV effect on AS. Consistently, nonphosphorylatable mutants prevent the UV effect. Apoptosis promoted by UV in cells lacking p53 is prevented when the change in AS of the apoptotic gene bcl-x is reverted, confirming the relevance of this mechanism. Splicing-sensitive microarrays revealed a significant overlap of the subsets of genes that have changed AS with UV and those that have reduced expression, suggesting that transcriptional coupling to AS is a key feature of the DNA-damage response.
Subjects
Alternative Splicing/radiation effects , RNA Polymerase II/metabolism , Ultraviolet Rays , Apoptosis , Cell Line, Tumor , DNA Damage , Dichlororibofuranosylbenzimidazole/pharmacology , Fibronectins/genetics , Fibronectins/metabolism , Fluorescence Recovery After Photobleaching , Humans , Mutation , Oligonucleotide Array Sequence Analysis , Phosphorylation/drug effects , Phosphorylation/radiation effects , RNA Polymerase II/chemistry , Transcription, Genetic
ABSTRACT
PURPOSE: This study investigated the effect of menthol (MEN) mouth rinsing (MR) on cycling performance during a modified variable cycle test (M-VCT) in adolescent athletes under hot conditions (31.4 ± 0.9 °C, 23.4 ± 3.7% relative humidity). METHODS: Trained adolescent male cyclists (n = 11, 16.7 ± 1.3 years, height 176.6 ± 8.8 cm, body mass 65.8 ± 11.6 kg, maximal oxygen uptake 62.97 ± 7.47 ml·kg⁻¹·min⁻¹) voluntarily completed three trials (familiarization and two experimental) of a 30-min M-VCT, which included five 6-min laps consisting of three 6-s accelerations and three 10-s sprints throughout each lap. In a randomized crossover design, MEN (0.01%) or placebo (PLA) (crystal-light), was swilled for 5 s before the start of each lap (total of 6 MR). Power output, distance (in kilometers), core temperature, heart rate, perceptual exertion, thermal stimulation (thermal comfort and thermal sensation), and blood lactate concentration were recorded. RESULTS: MEN MR significantly improved M-VCT mean power output by 1.81 ± 1.57% compared to PLA (MEN, 177.8 ± 31.4 W; PLA, 174.7 ± 30.5 W, p < .001, 95% confidence interval [1.73, 4.46], d = 1.53). For maximal intermittent sprints, 6- and 10-s mean power output was significantly higher with MEN than PLA (6 s, p = .041, 95% confidence interval [0.73, 27.19], d = 0.71; 10 s, p = .002, 95% confidence interval [11.08, 35.22], d = 1.29). There was no significant difference in core temperature, heart rate, blood lactate concentration, or any perceptual measure between trials (p > .05) despite significantly higher work with MEN. CONCLUSION: 64% of athletes (7/11) improved M-VCT performance with MEN. The results of this investigation suggest that a MEN MR may improve power output during a sport-specific stochastic cycling task in elite adolescent male cyclists.
ABSTRACT
The torpedo model of transcription termination asserts that the exonuclease Xrn2 attacks the 5'PO4-end exposed by nascent RNA cleavage and chases down the RNA polymerase. We tested this mechanism using a dominant-negative human Xrn2 mutant and found that it delayed termination genome-wide. Xrn2 nuclease inactivation caused strong termination defects downstream of most poly(A) sites and modest delays at some histone and U snRNA genes, suggesting that the torpedo mechanism is not limited to poly(A) site-dependent termination. A central untested feature of the torpedo model is that there is kinetic competition between the exonuclease and the pol II elongation complex. Using pol II rate mutants, we found that slow transcription robustly shifts termination upstream, and fast elongation extends the zone of termination further downstream. These results suggest that kinetic competition between elongating pol II and the Xrn2 exonuclease is integral to termination of transcription on most human genes.
Subjects
Exoribonucleases/genetics , Poly A/genetics , RNA Polymerase II/genetics , RNA, Messenger/genetics , Transcription Elongation, Genetic , Transcription Termination, Genetic , Cell Line , Epithelial Cells/cytology , Epithelial Cells/metabolism , Exoribonucleases/metabolism , Genome, Human , HEK293 Cells , HeLa Cells , Humans , Kinetics , Lymphocytes/cytology , Lymphocytes/metabolism , Models, Genetic , Mutation , Poly A/metabolism , RNA Polymerase II/metabolism , RNA, Messenger/metabolism
ABSTRACT
Alternative splicing modulates expression of most human genes. The kinetic model of cotranscriptional splicing suggests that slow elongation expands and that fast elongation compresses the "window of opportunity" for recognition of upstream splice sites, thereby increasing or decreasing inclusion of alternative exons. We tested the model using RNA polymerase II mutants that change average elongation rates genome-wide. Slow and fast elongation affected constitutive and alternative splicing, frequently altering exon inclusion and intron retention in ways not predicted by the model. Cassette exons included by slow and excluded by fast elongation (type I) have weaker splice sites, shorter flanking introns, and distinct sequence motifs relative to "slow-excluded" and "fast-included" exons (type II). Many rate-sensitive exons are misspliced in tumors. Unexpectedly, slow and fast elongation often both increased or both decreased inclusion of a particular exon or retained intron. These results suggest that an optimal rate of transcriptional elongation is required for normal cotranscriptional pre-mRNA splicing.
Subjects
RNA Polymerase II/metabolism , RNA Precursors/metabolism , RNA Splicing , Transcription Elongation, Genetic/physiology , Exons/genetics , HEK293 Cells , Humans , Introns/genetics , Mutation , RNA Polymerase II/genetics , RNA Precursors/genetics
ABSTRACT
BACKGROUND: The sequence content of the 3' UTRs of many mRNA transcripts is regulated through alternative polyadenylation (APA). The study of this process using RNAseq data, though, has been historically challenging. RESULTS: To combat this problem, we developed LABRAT, an APA isoform quantification method. LABRAT takes advantage of newly developed transcriptome quantification techniques to accurately determine relative APA site usage and how it varies across conditions. Using LABRAT, we found consistent relationships between gene-distal APA and subcellular RNA localization in multiple cell types. We also observed connections between transcription speed and APA site choice as well as tumor-specific transcriptome-wide shifts in APA isoform abundance in hundreds of patient-derived tumor samples that were associated with patient prognosis. We investigated the effects of APA on transcript expression and found a weak overall relationship, although many individual genes showed strong correlations between relative APA isoform abundance and overall gene expression. We interrogated the roles of 191 RNA-binding proteins in the regulation of APA isoforms, finding that dozens promote broad, directional shifts in relative APA isoform abundance both in vitro and in patient-derived samples. Finally, we find that APA site shifts in the two classes of APA, tandem UTRs and alternative last exons, are strongly correlated across many contexts, suggesting that they are coregulated. CONCLUSIONS: We conclude that LABRAT has the ability to accurately quantify APA isoform ratios from RNAseq data across a variety of sample types. Further, LABRAT is able to derive biologically meaningful insights that connect APA isoform regulation to cellular and molecular phenotypes.
Subjects
Neoplasms , Polyadenylation , 3' Untranslated Regions , Humans , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics
ABSTRACT
Responsible for the metabolism of ~21% of clinically used drugs, CYP2D6 is a critical component of personalized medicine initiatives. Genotyping CYP2D6 is challenging due to sequence similarity with its pseudogene paralog CYP2D7 and a high number and variety of common structural variants (SVs). Here we describe a novel bioinformatics method, Cyrius, that accurately genotypes CYP2D6 using whole-genome sequencing (WGS) data. We show that Cyrius has superior performance (96.5% concordance with truth genotypes) compared to existing methods (84-86.8%). After implementing the improvements identified from the comparison against the truth data, Cyrius's accuracy has since been improved to 99.3%. Using Cyrius, we built a haplotype frequency database from 2504 ethnically diverse samples and estimate that SV-containing star alleles are more frequent than previously reported. Cyrius will be an important tool to incorporate pharmacogenomics in WGS-based precision medicine initiatives.
Subjects
Cytochrome P-450 CYP2D6/genetics , Genotyping Techniques/methods , Alleles , Computational Biology/methods , Ethnicity/genetics , Genotype , Haplotypes/genetics , Humans , Polymorphism, Genetic/genetics , Whole Genome Sequencing/methods
ABSTRACT
Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of 17 individuals in a three-generation pedigree and called variants in each genome using a range of currently available algorithms. We used haplotype transmission information to create a phased "Platinum" variant catalog of 4.7 million single-nucleotide variants (SNVs) plus 0.7 million small (1-50 bp) insertions and deletions (indels) that are consistent with the pattern of inheritance in the parents and 11 children of this pedigree. Platinum genotypes are highly concordant with the current catalog of the National Institute of Standards and Technology for both SNVs (>99.99%) and indels (99.92%) and add a validated truth catalog that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission ("nonplatinum") revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions. The reference materials from this study are a resource for objective assessment of the accuracy of variant calls throughout genomes.