ABSTRACT
Throughout their lifetimes, messenger RNAs (mRNAs) associate with proteins to form ribonucleoproteins (mRNPs). Since the discovery of the first mRNP component more than 40 years ago, what is known as the mRNA interactome now comprises >1,000 proteins. These proteins bind mRNAs in myriad ways with varying affinities and stoichiometries, with many assembling onto nascent RNAs in a highly ordered process during transcription and precursor mRNA (pre-mRNA) processing. The nonrandom distribution of major mRNP proteins observed in transcriptome-wide studies leads us to propose that mRNPs are organized into three major domains loosely corresponding to 5' untranslated regions (UTRs), open reading frames, and 3' UTRs. Moving from the nucleus to the cytoplasm, mRNPs undergo extensive remodeling as they are first acted upon by the nuclear pore complex and then by the ribosome. When not being actively translated, cytoplasmic mRNPs can assemble into large multi-mRNP assemblies or be permanently disassembled and degraded. In this review, we aim to give the reader a thorough understanding of past and current eukaryotic mRNP research.
Subjects
Ribonucleoproteins/chemistry, Cell Nucleus Active Transport, Animals, Humans, Protein Biosynthesis, RNA Splicing, RNA Stability, Messenger RNA/metabolism, Genetic Transcription
ABSTRACT
Directed differentiation of human embryonic stem cells (ESCs) into cardiovascular cells provides a model for studying molecular mechanisms of human cardiovascular development. Although it is known that chromatin modification patterns in ESCs differ markedly from those in lineage-committed progenitors and differentiated cells, the temporal dynamics of chromatin alterations during differentiation along a defined lineage have not been studied. We show that differentiation of human ESCs into cardiovascular cells is accompanied by programmed temporal alterations in chromatin structure that distinguish key regulators of cardiovascular development from other genes. We used this temporal chromatin signature to identify regulators of cardiac development, including the homeobox gene MEIS2. Using the zebrafish model, we demonstrate that MEIS2 is critical for proper heart tube formation and subsequent cardiac looping. Temporal chromatin signatures should be broadly applicable to other models of stem cell differentiation to identify regulators and provide key insights into major developmental decisions.
Subjects
Cell Differentiation, Chromatin, Embryonic Stem Cells/metabolism, Heart/embryology, Myocardium/cytology, Animals, Genetic Epigenesis, Homeodomain Proteins/metabolism, Humans, Zebrafish/embryology, Zebrafish Proteins/metabolism
ABSTRACT
Many proteins regulate the expression of genes by binding to specific regions encoded in the genome1. Here we introduce a new data set of RNA elements in the human genome that are recognized by RNA-binding proteins (RBPs), generated as part of the Encyclopedia of DNA Elements (ENCODE) project phase III. This class of regulatory elements functions only when transcribed into RNA, as they serve as the binding sites for RBPs that control post-transcriptional processes such as splicing, cleavage and polyadenylation, and the editing, localization, stability and translation of mRNAs. We describe the mapping and characterization of RNA elements recognized by a large collection of human RBPs in K562 and HepG2 cells. Integrative analyses using five assays identify RBP binding sites on RNA and chromatin in vivo, the in vitro binding preferences of RBPs, the function of RBP binding sites and the subcellular localization of RBPs, producing 1,223 replicated data sets for 356 RBPs. We describe the spectrum of RBP binding throughout the transcriptome and the connections between these interactions and various aspects of RNA biology, including RNA stability, splicing regulation and RNA localization. These data expand the catalogue of functional elements encoded in the human genome by the addition of a large set of elements that function at the RNA level by interacting with RBPs.
Subjects
RNA-Binding Proteins/chemistry, RNA-Binding Proteins/metabolism, Transcriptome/genetics, Alternative Splicing/genetics, Base Sequence, Binding Sites, Cell Line, Chromatin/genetics, Chromatin/metabolism, Genetic Databases, Female, Gene Knockdown Techniques, Humans, Intracellular Space/genetics, Male, Protein Binding, Messenger RNA/chemistry, Messenger RNA/genetics, Messenger RNA/metabolism, RNA-Binding Proteins/genetics, Substrate Specificity
ABSTRACT
RNA binding proteins (RBPs) orchestrate the production, processing, and function of mRNAs. Here, we present the affinity landscapes of 78 human RBPs using an unbiased assay that determines the sequence, structure, and context preferences of these proteins in vitro by deep sequencing of bound RNAs. These data enable construction of "RNA maps" of RBP activity without requiring crosslinking-based assays. We found an unexpectedly low diversity of RNA motifs, implying frequent convergence of binding specificity toward a relatively small set of RNA motifs, many with low compositional complexity. Offsetting this trend, however, we observed extensive preferences for contextual features distinct from short linear RNA motifs, including spaced "bipartite" motifs, biased flanking nucleotide composition, and bias away from or toward RNA structure. Our results emphasize the importance of contextual features in RNA recognition, which likely enable targeting of distinct subsets of transcripts by different RBPs that recognize the same linear motif.
Subjects
RNA Recognition Motif Proteins/metabolism, RNA/metabolism, Base Sequence, Binding Sites, High-Throughput Nucleotide Sequencing, Humans, Nucleic Acid Conformation, Nucleotide Motifs, Protein Binding, RNA/chemistry, RNA/genetics, RNA Recognition Motif Proteins/chemistry, RNA Recognition Motif Proteins/genetics, Structure-Activity Relationship
ABSTRACT
RNA metabolism is controlled by an expanding, yet incomplete, catalog of RNA-binding proteins (RBPs), many of which lack characterized RNA binding domains. Approaches to expand the RBP repertoire to discover non-canonical RBPs are currently needed. Here, HaloTag fusion pull down of 12 nuclear and cytoplasmic RBPs followed by quantitative mass spectrometry (MS) demonstrates that proteins interacting with multiple RBPs in an RNA-dependent manner are enriched for RBPs. This motivated SONAR, a computational approach that predicts RNA binding activity by analyzing large-scale affinity precipitation-MS protein-protein interactomes. Without relying on sequence or structure information, SONAR identifies 1,923 human, 489 fly, and 745 yeast RBPs, including over 100 human candidate RBPs that contain zinc finger domains. Enhanced CLIP confirms RNA binding activity and identifies transcriptome-wide RNA binding sites for SONAR-predicted RBPs, revealing unexpected RNA binding activity for disease-relevant proteins and DNA binding proteins.
Subjects
Algorithms, Molecular Sequence Annotation, RNA-Binding Proteins/chemistry, RNA-Binding Proteins/classification, RNA/chemistry, Animals, Binding Sites, Cell Nucleus/chemistry, Cell Nucleus/metabolism, Cytoplasm/chemistry, Cytoplasm/metabolism, Drosophila melanogaster/genetics, Drosophila melanogaster/metabolism, Gene Expression, Gene Ontology, HEK293 Cells, Humans, Nucleotide Motifs, Protein Binding, Protein Interaction Domains and Motifs, RNA/genetics, RNA/metabolism, RNA-Binding Proteins/genetics, RNA-Binding Proteins/metabolism, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Software, Zinc Fingers
ABSTRACT
Transcriptome-wide maps of RNA binding protein (RBP)-RNA interactions by immunoprecipitation (IP)-based methods such as RNA IP (RIP) and crosslinking and IP (CLIP) are key starting points for evaluating the molecular roles of the thousands of human RBPs. A significant bottleneck to the application of these methods in diverse cell lines, tissues, and developmental stages is the availability of validated IP-quality antibodies. Using IP followed by immunoblot assays, we have developed a validated repository of 438 commercially available antibodies that interrogate 365 unique RBPs. In parallel, 362 short-hairpin RNA (shRNA) constructs against 276 unique RBPs were also used to confirm specificity of these antibodies. These antibodies can characterize subcellular RBP localization. With the burgeoning interest in the roles of RBPs in cancer, neurobiology, and development, these resources are invaluable to the broad scientific community. Detailed information about these resources is publicly available at the ENCODE portal (https://www.encodeproject.org/).
Subjects
Genetic Databases, RNA-Binding Proteins/genetics, RNA/metabolism, Transcriptome/genetics, Binding Sites, Humans, Protein Binding, RNA/genetics, Small Interfering RNA/classification, Small Interfering RNA/genetics, RNA-Binding Proteins/metabolism
ABSTRACT
RNA quality-control pathways get rid of faulty RNAs and therefore must be able to discriminate these RNAs from those that are normal. Here we present evidence that the adenosine triphosphatase (ATPase) cycle of the SF1 helicase Upf1 is required for mRNA discrimination during nonsense-mediated decay (NMD). Mutations affecting the Upf1 ATPase cycle disrupt the mRNA selectivity of Upf1, leading to indiscriminate accumulation of NMD complexes on both NMD target and non-target mRNAs. In addition, two modulators of NMD (translation and termination codon-proximal poly(A) binding protein) depend on the ATPase activity of Upf1 to limit the association of Upf1 with non-target mRNAs. Preferential ATPase-dependent dissociation of Upf1 from non-target mRNAs in vitro suggests that selective release of Upf1 contributes to the ATPase dependence of Upf1 target discrimination. Given the prevalence of helicases in RNA regulation, ATP hydrolysis may be a widely used activity in target RNA discrimination.
Subjects
Adenosine Triphosphate/metabolism, Nonsense Mediated mRNA Decay, Messenger RNA/metabolism, Trans-Activators/genetics, Trans-Activators/metabolism, 3' Untranslated Regions, Catalytic Domain, HEK293 Cells, Humans, In Vitro Techniques, Molecular Sequence Data, Mutation, RNA Helicases, Messenger RNA/genetics, Substrate Specificity
ABSTRACT
Umbilical cord blood-derived haematopoietic stem cells (HSCs) are essential for many life-saving regenerative therapies. However, despite their advantages for transplantation, their clinical use is restricted because HSCs in cord blood are found only in small numbers. Small molecules that enhance haematopoietic stem and progenitor cell (HSPC) expansion in culture have been identified, but in many cases their mechanisms of action or the nature of the pathways they impinge on are poorly understood. A greater understanding of the molecular circuitry that underpins the self-renewal of human HSCs will facilitate the development of targeted strategies that expand HSCs for regenerative therapies. Whereas transcription factor networks have been shown to influence the self-renewal and lineage decisions of human HSCs, the post-transcriptional mechanisms that guide HSC fate have not been closely investigated. Here we show that overexpression of the RNA-binding protein Musashi-2 (MSI2) induces multiple pro-self-renewal phenotypes, including a 17-fold increase in short-term repopulating cells and a net 23-fold ex vivo expansion of long-term repopulating HSCs. By performing a global analysis of MSI2-RNA interactions, we show that MSI2 directly attenuates aryl hydrocarbon receptor (AHR) signalling through post-transcriptional downregulation of canonical AHR pathway components in cord blood HSPCs. Our study gives mechanistic insight into RNA networks controlled by RNA-binding proteins that underlie self-renewal and provides evidence that manipulating such networks ex vivo can enhance the regenerative potential of human HSCs.
Subjects
Basic Helix-Loop-Helix Transcription Factors/metabolism, Cell Self Renewal, Hematopoietic Stem Cells/cytology, Hematopoietic Stem Cells/metabolism, RNA-Binding Proteins/metabolism, Aryl Hydrocarbon Receptors/metabolism, Signal Transduction, Animals, Base Sequence, Basic Helix-Loop-Helix Transcription Factors/genetics, Cell Count, Cell Self Renewal/genetics, Down-Regulation/genetics, Female, Fetal Blood/cytology, Gene Knockdown Techniques, Humans, Male, Mice, Protein Binding, Messenger RNA/genetics, Messenger RNA/metabolism, RNA-Binding Proteins/genetics, Aryl Hydrocarbon Receptors/genetics, Signal Transduction/genetics
ABSTRACT
Alternative splicing of pre-messenger RNA transcripts enables the generation of multiple protein isoforms from the same gene locus, providing a major source of protein diversity in mammalian genomes. RNA binding proteins (RBPs) bind to RNA to control splice site choice and define which exons are included in the resulting mature RNA transcript. However, depending on where the RBPs bind relative to splice sites, they can activate or repress splice site usage. To explore this position-specific regulation, in vivo binding sites identified by methods such as cross-linking and immunoprecipitation (CLIP) are integrated with alternative splicing events identified by RNA-seq or microarray. Merging these data sets enables the generation of a "splicing map," where CLIP signal relative to a merged meta-exon provides a simple summary of the position-specific effect of binding on splicing regulation. Here, we provide RBP-Maps, a software tool to simplify generation of these maps and enable researchers to rapidly query regulatory patterns of an RBP of interest. Further, we discuss various alternative approaches to generate such splicing maps, focusing on how decisions in construction (such as the use of peak versus read density, or whole-reads versus only single-nucleotide candidate crosslink positions) can affect the interpretation of these maps using example eCLIP data from the 150 RBPs profiled by the ENCODE consortium.
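The "splicing map" idea described above can be illustrated with a minimal sketch. It is not the RBP-Maps implementation: the function names, the per-event scaling (each window scaled to sum to 1), and the subtraction of a background profile are all assumptions made for illustration. The input is assumed to be per-nucleotide CLIP read densities already extracted over fixed-width windows around each exon.

```python
def meta_profile(event_windows):
    """Average position-wise CLIP signal across events.

    event_windows: list of equal-length lists of read densities,
    one list per splicing event (hypothetical input format).
    Each event is scaled to sum to 1 so that highly expressed
    transcripts do not dominate the average (an assumed choice).
    """
    if not event_windows:
        raise ValueError("need at least one event window")
    width = len(event_windows[0])
    totals = [0.0] * width
    for window in event_windows:
        if len(window) != width:
            raise ValueError("all windows must have the same width")
        signal = sum(window)
        if signal == 0:
            continue  # an event with no reads contributes zeros
        for i, value in enumerate(window):
            totals[i] += value / signal
    return [t / len(event_windows) for t in totals]


def splicing_map(included, excluded, background):
    """Position-wise signal for RBP-activated and RBP-repressed exons,
    each expressed relative to a background set of unchanged exons."""
    bg = meta_profile(background)
    return {
        "included": [a - b for a, b in zip(meta_profile(included), bg)],
        "excluded": [a - b for a, b in zip(meta_profile(excluded), bg)],
    }


if __name__ == "__main__":
    # Toy 4-nt windows; real windows would span the exon and flanking introns.
    print(meta_profile([[0, 2, 2, 0], [0, 0, 4, 0]]))
```

Whether to feed such a function peak calls, whole-read density, or single-nucleotide crosslink positions is exactly the kind of construction decision the abstract says can change how the resulting map is interpreted.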
Subjects
Alternative Splicing/genetics, Computational Biology/methods, Protein Isoforms/genetics, RNA Splice Sites/genetics, RNA-Binding Proteins/chemistry, Software, Gene Expression Regulation/genetics, Humans, Messenger RNA/genetics, RNA Sequence Analysis
ABSTRACT
As RNA-binding proteins (RBPs) play essential roles in cellular physiology by interacting with target RNA molecules, binding site identification by UV crosslinking and immunoprecipitation (CLIP) of ribonucleoprotein complexes is critical to understanding RBP function. However, current CLIP protocols are technically demanding and yield low-complexity libraries with high experimental failure rates. We have developed an enhanced CLIP (eCLIP) protocol that decreases requisite amplification by ~1,000-fold, decreasing discarded PCR duplicate reads by ~60% while maintaining single-nucleotide binding resolution. By simplifying the generation of paired IgG and size-matched input controls, eCLIP improves specificity in the discovery of authentic binding sites. We generated 102 eCLIP experiments for 73 diverse RBPs in HepG2 and K562 cells (available at https://www.encodeproject.org), demonstrating that eCLIP enables large-scale and robust profiling, with amplification and sample requirements similar to those of ChIP-seq. eCLIP enables integrative analysis of diverse RBPs to reveal factor-specific profiles, common artifacts for CLIP and RNA-centric perspectives on RBP activity.
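To make the role of the size-matched input control concrete, the following is a minimal sketch of input normalization: each candidate peak is scored by the log2 ratio of its library-size-normalized read count in the IP sample over the paired input. The function names, the pseudocount, and the fold-enrichment cutoff are illustrative assumptions, not values or code from the eCLIP protocol.

```python
import math


def log2_fold_enrichment(ip_reads, input_reads, ip_total, input_total,
                         pseudocount=1.0):
    """log2 of (IP read fraction / input read fraction) for one peak.

    ip_total and input_total are the usable read counts of each library;
    the pseudocount (an assumed value) avoids division by zero when the
    input has no reads in the peak.
    """
    if ip_total <= 0 or input_total <= 0:
        raise ValueError("library sizes must be positive")
    ip_fraction = (ip_reads + pseudocount) / ip_total
    input_fraction = (input_reads + pseudocount) / input_total
    return math.log2(ip_fraction / input_fraction)


def enriched_peaks(peaks, ip_total, input_total, min_log2_fold=3.0):
    """Keep peaks whose enrichment over input meets a hypothetical cutoff.

    peaks: iterable of (name, ip_reads, input_reads) tuples.
    Returns a list of (name, log2_fold) for the peaks that pass.
    """
    kept = []
    for name, ip_reads, input_reads in peaks:
        score = log2_fold_enrichment(ip_reads, input_reads,
                                     ip_total, input_total)
        if score >= min_log2_fold:
            kept.append((name, score))
    return kept


if __name__ == "__main__":
    candidates = [("peak_a", 159, 9), ("peak_b", 20, 18)]
    for name, score in enriched_peaks(candidates, 1_000_000, 1_000_000):
        print(name, round(score, 2))
```

A peak that is equally abundant in the input, such as a fragment of a highly expressed RNA, scores near zero and is dropped. A real analysis would normally pair the fold change with a statistical test.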
Subjects
Gene Expression Profiling/methods, Immunoprecipitation/methods, RNA-Binding Proteins/genetics, Transcriptome, Binding Sites, Cross-Linking Reagents/chemistry, Hep G2 Cells, Humans, K562 Cells, Photochemical Processes, Ultraviolet Rays
ABSTRACT
Crosslinking and immunoprecipitation (CLIP) followed by high-throughput sequencing identifies the binding sites of RNA binding proteins on RNAs. The covalent RNA-amino acid adducts produced by UV irradiation can cause premature reverse transcription termination and deletions (referred to as crosslink-induced mutation sites (CIMS)), which may decrease overall cDNA yield but are exploited in state-of-the-art CLIP methods to identify these crosslink sites at single-nucleotide resolution. Here, we show that the rate of crosslinked base deletion and the ratio of read-through to termination are both highly dependent on the identity of the reverse transcriptase enzyme as well as on the buffer conditions used. AffinityScript and TGIRT showed a lack of deletion of the crosslinked base, with other enzymes showing variable rates, indicating that utilization and interpretation of CIMS analysis requires knowledge of the reverse transcriptase enzyme used. Commonly used enzymes, including Superscript III and AffinityScript, show high termination rates in standard magnesium buffer conditions, but show a single-base difference in the position of termination for TARDBP motifs. In contrast, manganese-containing buffer promoted read-through at the adduct site. These results validate the use of standard enzymes and also propose alternative enzyme and buffer choices for particularly challenging samples that contain extensive RNA adducts or other modifications that inhibit standard reverse transcription.
Subjects
RNA-Binding Proteins/genetics, RNA-Binding Proteins/metabolism, Reverse Transcription/physiology, RNA Sequence Analysis/methods, Base Sequence/physiology, Binding Sites/physiology, Humans
ABSTRACT
Identification of in vivo direct RNA targets for RNA binding proteins (RBPs) provides critical insight into their regulatory activities and mechanisms. Recently, we described a methodology for enhanced crosslinking and immunoprecipitation followed by high-throughput sequencing (eCLIP) using antibodies against endogenous RNA binding proteins. However, in many cases it is desirable to profile targets of an RNA binding protein for which an immunoprecipitation-grade antibody is lacking. Here we describe a scalable method for using CRISPR/Cas9-mediated homologous recombination to insert a peptide tag into the endogenous RNA binding protein locus. Further, we show that TAG-eCLIP performed using tag-specific antibodies can yield the same robust binding profiles after proper control normalization as eCLIP with antibodies against endogenous proteins. Finally, we note that antibodies against commonly used tags can immunoprecipitate significant amounts of antibody-specific RNA, emphasizing the need for paired controls alongside each experiment for normalization. TAG-eCLIP enables eCLIP profiling of new native proteins where no suitable antibody exists, expanding the RBP-RNA interaction landscape.
Subjects
CRISPR-Cas Systems, Gene Library, High-Throughput Nucleotide Sequencing/methods, RNA-Binding Proteins/genetics, RNA/chemistry, Staining and Labeling/methods, Antibodies/chemistry, Base Sequence, Binding Sites, Molecular Cloning, Endonucleases/chemistry, HEK293 Cells, Homologous Recombination, Humans, K562 Cells, Peptides/chemistry, Polymerase Chain Reaction, Protein Binding, RNA/genetics, RNA/metabolism, RNA-Binding Proteins/metabolism, RNA Sequence Analysis/methods, Transcriptome
ABSTRACT
Mitochondrial genomes co-evolve with the nuclear genome over evolutionary timescales and are shaped by selection in the female germline. Here we investigate how mismatching between nuclear and mitochondrial ancestry impacts the somatic evolution of the mitochondrial genome in different tissues throughout ageing. We used ultrasensitive duplex sequencing to profile ~2.5 million mitochondrial genomes across five mitochondrial haplotypes and three tissues in young and aged mice, cataloguing ~1.2 million mitochondrial somatic and ultralow-frequency inherited mutations, of which 81,097 are unique. We identify haplotype-specific mutational patterns and several mutational hotspots, including at the light strand origin of replication, which consistently exhibits the highest mutation frequency. We show that rodents exhibit a distinct mitochondrial somatic mutational spectrum compared with primates with a surfeit of reactive oxygen species-associated G > T/C > A mutations, and that somatic mutations in protein-coding genes exhibit signatures of negative selection. Lastly, we identify an extensive enrichment in somatic reversion mutations that 're-align' mito-nuclear ancestry within an organism's lifespan. Together, our findings demonstrate that mitochondrial genomes are a dynamically evolving subcellular population shaped by somatic mutation and selection throughout organismal lifetimes.
Subjects
Aging, Mitochondrial Genome, Haplotypes, Mutation, Genetic Selection, Animals, Aging/genetics, Mice, Mitochondrial DNA/genetics, Cell Nucleus/genetics, Female, Mitochondria/genetics, Inbred C57BL Mice, Male
ABSTRACT
Mitochondrial genomes co-evolve with the nuclear genome over evolutionary timescales and are shaped by selection in the female germline. Here, we investigate how mismatching between nuclear and mitochondrial ancestry impacts the somatic evolution of the mt-genome in different tissues throughout aging. We used ultra-sensitive Duplex Sequencing to profile ~2.5 million mt-genomes across five mitochondrial haplotypes and three tissues in young and aged mice, cataloging ~1.2 million mitochondrial somatic and ultra low frequency inherited mutations, of which 81,097 are unique. We identify haplotype-specific mutational patterns and several mutational hotspots, including at the Light Strand Origin of Replication, which consistently exhibits the highest mutation frequency. We show that rodents exhibit a distinct mitochondrial somatic mutational spectrum compared to primates with a surfeit of reactive oxygen species-associated G>T/C>A mutations, and that somatic mutations in protein coding genes exhibit signatures of negative selection. Lastly, we identify an extensive enrichment in somatic reversion mutations that "re-align" mito-nuclear ancestry within an organism's lifespan. Together, our findings demonstrate that mitochondrial genomes are a dynamically evolving subcellular population shaped by somatic mutation and selection throughout organismal lifetimes.
ABSTRACT
Duplex sequencing (DS) is an error-corrected next-generation sequencing method in which molecular barcodes informatically link PCR copies back to their source DNA strands, enabling computational removal of errors in consensus sequences. The resulting background of less than one artifactual mutation per 10^7 nucleotides allows for direct detection of somatic mutations. TwinStrand Biosciences, Inc. has developed a DS-based mutagenesis assay to sample the rat genome, which can be applied to genetic toxicity testing. To evaluate this assay for early detection of mutagenesis, a time-course study was conducted using male Hsd:Sprague Dawley SD rats (3 per group) administered a single dose of 40 mg/kg N-ethyl-N-nitrosourea (ENU) via gavage, with mutation frequency (MF) and spectrum analyzed in stomach, bone marrow, blood, and liver tissues at 3 h, 24 h, 7 d, and 28 d post-exposure. Significant increases in MF were observed in ENU-exposed rats as early as 24 h for stomach (site of contact) and bone marrow (a highly proliferative tissue) and at 7 d for liver and blood. The canonical mutational signature of ENU was established by 7 d post-exposure in all four tissues. Interlaboratory analysis of a subset of samples from different tissues and time points demonstrated remarkable reproducibility for both MF and spectrum. These results demonstrate that MF and spectrum can be evaluated successfully by directly sequencing targeted regions of DNA obtained from various tissues, a considerable advancement compared to currently used in vivo gene mutation assays.
Subjects
Ethylnitrosourea, Nitrosourea Compounds, Rats, Male, Animals, Ethylnitrosourea/toxicity, Reproducibility of Results, Sprague-Dawley Rats, Mutagenesis, Mutation, Mutagens/toxicity
ABSTRACT
Duplex sequencing (DuplexSeq) is an error-corrected next-generation sequencing (ecNGS) method in which molecular barcodes informatically link PCR copies back to their source DNA strands, enabling computational removal of errors by comparing grouped strand sequencing reads. The resulting background of less than one artifactual mutation per 10^7 nucleotides allows for direct detection of somatic mutations. TwinStrand Biosciences, Inc. has developed a DuplexSeq-based mutagenesis assay to sample the rat genome, which can be applied to genetic toxicity testing. To evaluate this assay for early detection of mutagenesis, a time-course study was conducted using male Hsd:Sprague Dawley SD rats (3 per group) administered a single dose of 40 mg/kg N-ethyl-N-nitrosourea (ENU) via gavage, with mutation frequency (MF) and spectrum analyzed in stomach, bone marrow, blood, and liver tissues at 3 h, 24 h, 7 d, and 28 d post-exposure. Significant increases in MF were observed in ENU-exposed rats as early as 24 h for stomach (site of contact) and bone marrow (a highly proliferative tissue) and at 7 d for liver and blood. The canonical mutational signature of ENU was established by 7 d post-exposure in all four tissues. Interlaboratory analysis of a subset of samples from different tissues and time points demonstrated remarkable reproducibility for both MF and spectrum. These results demonstrate that MF and spectrum can be evaluated successfully by directly sequencing targeted regions of DNA obtained from various tissues, a considerable advancement compared to currently used in vivo gene mutation assays.
HIGHLIGHTS: DuplexSeq is an ultra-accurate NGS technology that directly quantifies mutations. ENU-dependent mutagenesis was detected 24 h post-exposure in proliferative tissues. Multiple tissues exhibited the canonical ENU mutation spectrum 7 d after exposure. Results obtained with DuplexSeq were highly concordant between laboratories. The Rat-50 Mutagenesis Assay is promising for applications in genetic toxicology.
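The error-correction principle described in the duplex sequencing abstracts above (grouping PCR copies by molecular barcode and strand, then keeping only bases on which both strands agree) can be sketched as follows. This is a simplified illustration, not the TwinStrand pipeline: the input tuple format, the minimum family size, and the majority threshold are assumptions, and reads are assumed to be pre-aligned, equal in length, and oriented to the same reference strand.

```python
from collections import Counter, defaultdict


def strand_consensus(reads, min_fraction=0.6):
    """Majority base at each position across PCR copies of one strand.

    Positions without a clear majority are masked as 'N'.
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_consensus(tagged_reads, min_family=3):
    """Build one duplex consensus sequence per barcode.

    tagged_reads: iterable of (barcode, strand, sequence) with strand
    '+' or '-' (hypothetical input format). A barcode is reported only
    if both strand families have at least min_family reads; a base is
    kept only where the two strand consensuses agree, otherwise 'N'.
    """
    families = defaultdict(list)
    for barcode, strand, sequence in tagged_reads:
        families[(barcode, strand)].append(sequence)

    results = {}
    for barcode in {b for b, _ in families}:
        top = families.get((barcode, "+"), [])
        bottom = families.get((barcode, "-"), [])
        if len(top) < min_family or len(bottom) < min_family:
            continue  # cannot confirm the molecule on both strands
        a = strand_consensus(top)
        b = strand_consensus(bottom)
        results[barcode] = "".join(
            x if x == y and x != "N" else "N" for x, y in zip(a, b)
        )
    return results


if __name__ == "__main__":
    reads = [
        ("AAGT", "+", "ACGT"), ("AAGT", "+", "ACGT"), ("AAGT", "+", "ACTT"),
        ("AAGT", "-", "ACGT"), ("AAGT", "-", "ACGT"), ("AAGT", "-", "ACGT"),
    ]
    print(duplex_consensus(reads))
```

In this toy input, a change seen in only one PCR copy is outvoted within its strand family. A change present on only one strand would be masked at the duplex step, which is what distinguishes a true mutation from single-strand damage or a polymerase error.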
ABSTRACT
Discovery of interaction sites between RNA-binding proteins (RBPs) and their RNA targets plays a critical role in enabling our understanding of how these RBPs control RNA processing and regulation. Cross-linking and immunoprecipitation (CLIP) provides a generalizable, transcriptome-wide method by which RBP/RNA complexes are purified and sequenced to identify sites of intermolecular contact. By simplifying technical challenges in prior CLIP methods and incorporating the generation of and quantitative comparison against size-matched input controls, the single-end enhanced CLIP (seCLIP) protocol allows for the profiling of these interactions with high resolution, efficiency and scalability. Here, we present a step-by-step guide to the seCLIP method, detailing critical steps and offering insights regarding troubleshooting and expected results while carrying out the ~4-d protocol. Furthermore, we describe a comprehensive bioinformatics pipeline that offers users the tools necessary to process two replicate datasets and identify reproducible and significant peaks for an RBP of interest in ~2 d.
Subjects
RNA, Transcriptome, Binding Sites, High-Throughput Nucleotide Sequencing/methods, Immunoprecipitation, Protein Binding, RNA/genetics, RNA-Binding Proteins/metabolism
ABSTRACT
The molecular functions of the majority of RNA-binding proteins (RBPs) remain unclear, highlighting a major bottleneck to a full understanding of gene expression regulation. Here, we develop a plasmid resource of 690 human RBPs that we subject to luciferase-based 3'-untranslated-region tethered function assays to pinpoint RBPs that regulate RNA stability or translation. Enhanced UV-cross-linking and immunoprecipitation of these RBPs identifies thousands of endogenous mRNA targets that respond to changes in RBP level, recapitulating effects observed in tethered function assays. Among these RBPs, the ubiquitin-associated protein 2-like (UBAP2L) protein interacts with RNA via its RGG domain and cross-links to mRNA and rRNA. Fusion of UBAP2L to RNA-targeting CRISPR-Cas9 demonstrates programmable translational enhancement. Polysome profiling indicates that UBAP2L promotes translation of target mRNAs, particularly global regulators of translation. Our tethering survey allows rapid assignment of the molecular activity of proteins, such as UBAP2L, to specific steps of mRNA metabolism.
Subjects
Carrier Proteins/metabolism, Protein Biosynthesis, RNA Stability, RNA-Binding Proteins/metabolism, 3' Untranslated Regions, Binding Sites, CRISPR-Cas Systems, Carrier Proteins/chemistry, Carrier Proteins/genetics, Cell Line, Humans, Luciferases/genetics, Luciferases/metabolism, Open Reading Frames, Polyribosomes/genetics, Polyribosomes/metabolism, RNA-Binding Proteins/chemistry, RNA-Binding Proteins/genetics, Recombinant Proteins/genetics, Recombinant Proteins/metabolism, Ultraviolet Rays
ABSTRACT
BACKGROUND: A critical step in uncovering rules of RNA processing is to study the in vivo regulatory networks of RNA binding proteins (RBPs). Crosslinking and immunoprecipitation (CLIP) methods enable mapping RBP targets transcriptome-wide, but methodological differences present challenges to large-scale analysis across datasets. The development of enhanced CLIP (eCLIP) enabled the mapping of targets for 150 RBPs in K562 and HepG2, creating a unique resource of RBP interactomes profiled with a standardized methodology in the same cell types. RESULTS: Our analysis of 223 eCLIP datasets reveals a range of binding modalities, including highly resolved positioning around splicing signals and mRNA untranslated regions that associate with distinct RBP functions. Quantification of enrichment for repetitive and abundant multicopy elements reveals 70% of RBPs have enrichment for non-mRNA element classes, enables identification of novel ribosomal RNA processing factors and sites, and suggests that association with retrotransposable elements reflects multiple RBP mechanisms of action. Analysis of spliceosomal RBPs indicates that eCLIP resolves AQR association after intronic lariat formation, enabling identification of branch points with single-nucleotide resolution, and provides genome-wide validation for a branch point-based scanning model for 3' splice site recognition. Finally, we show that eCLIP peak co-occurrences across RBPs enable the discovery of novel co-interacting RBPs. CONCLUSIONS: This work reveals novel insights into RNA biology by integrated analysis of eCLIP profiling of 150 RBPs with distinct functions. Further, our quantification of both mRNA and other element association will enable further research to identify novel roles of RBPs in regulating RNA processing.