ABSTRACT
RNA-binding proteins (RBPs) participate in diverse cellular processes and have important roles in human development and disease. The human genome, like that of many other eukaryotes, encodes hundreds of RBPs that contain canonical sequence-specific RNA-binding domains (RBDs), as well as numerous unconventional RNA-binding proteins (ucRBPs). ucRBPs physically associate with RNA but lack common RBDs. The degree to which these proteins bind RNA in a sequence-specific manner is unknown. Here, we provide a detailed description of both the laboratory and data-processing methods for RNAcompete, a method we have previously used to analyze the RNA-binding preferences of hundreds of RBD-containing RBPs from diverse eukaryotes. We also determine the RNA-binding preferences of two human ucRBPs, NUDT21 and CNBP, and use this analysis to exemplify the RNAcompete pipeline. The results of our RNAcompete experiments are consistent with independent RNA-binding data for these proteins and demonstrate the utility of RNAcompete for analyzing the growing repertoire of ucRBPs.
Subject(s)
Cleavage And Polyadenylation Specificity Factor/genetics, Microarray Analysis/methods, RNA-Binding Proteins/genetics, RNA/chemistry, Animals, Base Sequence, Binding Sites, Cleavage And Polyadenylation Specificity Factor/metabolism, Cloning, Molecular, DNA Primers/chemistry, DNA Primers/metabolism, Drosophila melanogaster/genetics, Drosophila melanogaster/metabolism, Escherichia coli/genetics, Escherichia coli/metabolism, Gene Expression, Humans, Protein Binding, Protein Domains, RNA/genetics, RNA/metabolism, RNA-Binding Proteins/metabolism, Recombinant Proteins/genetics, Recombinant Proteins/metabolism, Sequence Alignment
ABSTRACT
RNA-binding proteins (RBPs) are key regulators of gene expression. Here, we introduce EuPRI (Eukaryotic Protein-RNA Interactions), a freely available resource of RNA motifs for 34,736 RBPs from 690 eukaryotes. EuPRI includes in vitro binding data for 504 RBPs, including newly collected RNAcompete data for 174 RBPs, along with thousands of reconstructed motifs. We reconstruct these motifs with a new computational platform, Joint Protein-Ligand Embedding (JPLE), which can detect distant homology relationships and map specificity-determining peptides. EuPRI quadruples the number of known RBP motifs, expanding the motif repertoire across all major eukaryotic clades, and assigning motifs to the majority of human RBPs. EuPRI drastically improves knowledge of RBP motifs in flowering plants. For example, it increases the number of Arabidopsis thaliana RBP motifs 7-fold, from 14 to 105. EuPRI also has broad utility for inferring post-transcriptional function and evolutionary relationships. We demonstrate this by predicting a role for 12 Arabidopsis thaliana RBPs in RNA stability and identifying rapid and recent evolution of post-transcriptional regulatory networks in worms and plants. In contrast, the vertebrate RNA motif set has remained relatively stable after its drastic expansion between the metazoan and vertebrate ancestors. EuPRI represents a powerful resource for the study of gene regulation across eukaryotes.
ABSTRACT
Thousands of RNA-binding proteins (RBPs) crosslink to cellular mRNA. Among these are numerous unconventional RBPs (ucRBPs), proteins that associate with RNA but lack known RNA-binding domains (RBDs). The vast majority of ucRBPs have uncharacterized RNA-binding specificities. We analyzed 492 human ucRBPs for intrinsic RNA binding in vitro and identified 23 that bind specific RNA sequences. Most (17/23), including 8 ribosomal proteins, were previously associated with RNA-related function. We identified the RBDs responsible for sequence-specific RNA binding for several of these 23 ucRBPs and surveyed whether corresponding domains from homologous proteins also display RNA sequence specificity. CCHC-zf domains from seven human proteins recognized specific RNA motifs, indicating that this is a major class of RBD. For Nudix, HABP4, TPR, RanBP2-zf, and L7Ae domains, however, only isolated members or closely related homologs yielded motifs, consistent with RNA binding as a derived function. The lack of sequence specificity for most ucRBPs is striking, and we suggest that many may function analogously to chromatin factors, which often crosslink efficiently to cellular DNA, presumably via indirect recruitment. Finally, we show that ucRBPs tend to be highly abundant proteins and suggest their identification in RNA interactome capture studies could also result from weak nonspecific interactions with RNA.