Abstract
Splicing of pre-mRNAs critically contributes to gene regulation and proteome expansion in eukaryotes, but our understanding of the recognition and pairing of splice sites during spliceosome assembly lacks detail. Here, we identify the multidomain RNA-binding protein FUBP1 as a key splicing factor that binds to a hitherto unknown cis-regulatory motif. By collecting NMR, structural, and in vivo interaction data, we demonstrate that FUBP1 stabilizes U2AF2 and SF1, key components at the 3' splice site, through multivalent binding interfaces located within its disordered regions. Transcriptional profiling and kinetic modeling reveal that FUBP1 is required for efficient splicing of long introns, which is impaired in cancer patients harboring FUBP1 mutations. Notably, FUBP1 interacts with numerous U1 snRNP-associated proteins, suggesting a unique role for FUBP1 in splice site bridging for long introns. We propose a compelling model for 3' splice site recognition of long introns, which represent 80% of all human introns.
Subjects
RNA Splice Sites, RNA Splicing, Humans, RNA Splice Sites/genetics, Introns/genetics, RNA Splicing Factors/genetics, RNA Splicing Factors/metabolism, RNA-Binding Proteins/genetics, RNA-Binding Proteins/metabolism, RNA Precursors/genetics, RNA Precursors/metabolism, DNA-Binding Proteins/genetics, DNA-Binding Proteins/metabolism
Abstract
Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a "broader" human interactome network than currently appreciated. The map also uncovers significant interconnectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high-quality interactome models will help "connect the dots" of the genomic revolution.
Subjects
Protein Interaction Maps, Proteome/metabolism, Animals, Protein Databases, Genome-Wide Association Study, Humans, Mice, Neoplasms/metabolism
Abstract
Piwi-interacting RNAs (piRNAs) direct PIWI proteins to transposons to silence them, thereby preserving genome integrity and fertility. The piRNA population can be expanded in the ping-pong amplification loop. Within this process, piRNA-associated PIWI proteins (piRISC) enter a membraneless organelle called nuage to cleave their target RNA, which is stimulated by Gtsf proteins. The resulting cleavage product gets loaded into an empty PIWI protein to form a new piRISC complex. However, for piRNA amplification to occur, the new RNA substrates, Gtsf-piRISC, and empty PIWI proteins have to be in physical proximity. In this study, we show that in silkworm cells, the Gtsf1 homolog BmGtsf1L binds to piRNA-loaded BmAgo3 and localizes to granules positive for BmAgo3 and BmVreteno. Biochemical assays further revealed that conserved residues within the unstructured tail of BmGtsf1L directly interact with BmVreteno. Using a combination of AlphaFold modeling, atomistic molecular dynamics simulations, and in vitro assays, we identified a novel binding interface on the BmVreteno-eTudor domain, which is required for BmGtsf1L binding. Our study reveals that a single eTudor domain within BmVreteno provides two binding interfaces and thereby interconnects piRNA-loaded BmAgo3 and BmGtsf1L.
Subjects
Bombyx, Animals, Argonaute Proteins/genetics, Argonaute Proteins/metabolism, Bombyx/genetics, Bombyx/metabolism, Piwi-Interacting RNA, Small Interfering RNA/genetics, Small Interfering RNA/metabolism, Tudor Domain
Abstract
Global insights into cellular organization and genome function require comprehensive understanding of the interactome networks that mediate genotype-phenotype relationships [1,2]. Here we present a human 'all-by-all' reference interactome map of human binary protein interactions, or 'HuRI'. With approximately 53,000 protein-protein interactions, HuRI has approximately four times as many such interactions as there are high-quality curated interactions from small-scale studies. The integration of HuRI with genome [3], transcriptome [4] and proteome [5] data enables cellular function to be studied within most physiological or pathological cellular contexts. We demonstrate the utility of HuRI in identifying the specific subcellular roles of protein-protein interactions. Inferred tissue-specific networks reveal general principles for the formation of cellular context-specific functions and elucidate potential molecular mechanisms that might underlie tissue-specific phenotypes of Mendelian diseases. HuRI is a systematic proteome-wide reference that links genomic variation to phenotypic outcomes.
Subjects
Proteome/metabolism, Extracellular Space/metabolism, Humans, Organ Specificity, Protein Interaction Mapping
Abstract
MacroH2A has been linked to transcriptional silencing and cell identity, and is a hallmark of the inactive X chromosome (Xi). However, it remains unclear whether macroH2A plays a role in DNA replication. Using knockdown/knockout cells for each macroH2A isoform, we show that macroH2A-containing nucleosomes slow down the replication progression rate in the Xi, reflecting their higher nucleosome stability. Moreover, macroH2A1, but not macroH2A2, regulates the number of nano replication foci in the Xi, and macroH2A1 downregulation increases DNA loop sizes corresponding to replicons. This relates to macroH2A1 regulating replicative helicase loading during G1 by interacting with it. We mapped this interaction to a phenylalanine in macroH2A1 that is not conserved in macroH2A2 and to the C-terminus of the Mcm3 helicase subunit. We propose that macroH2A1 enhances the licensing of pre-replication complexes via DNA helicase interaction and loading onto the Xi.
Abstract
MOTIVATION: While the release of AlphaFold (AF) represented a breakthrough for the prediction of protein complex structures, its sensitivity, especially when using full-length protein sequences, still remains limited. Modeling success rates might increase if AF predictions were guided by likely interacting protein fragments. This approach requires available sets of highly confident protein-protein interface types. Computational resources, such as 3did, infer interacting globular domain types from observed contacts in protein structures. Assessing the accuracy of these predicted interface types is difficult because we lack hand-curated reference sets of verified domain-domain interface (DDI) types. RESULTS: To improve protein complex modeling of DDIs by AF, we manually inspected 80 randomly selected DDI types from the 3did resource to generate a first reference set of DDI types. Identified cases of DDI type nonapproval (40%) primarily resulted from inaccurate Pfam domain matches, crystal contacts, and synthetic protein constructs. Using logistic regression, we predicted a subset of 2411 out of 5724 considered DDI types in 3did to be of high confidence, which we subsequently applied to 53,000 human protein interactions to predict DDIs followed by AF modeling. We obtained highly confident AF models for 604 out of 1129 predicted DDIs. Of note, for 47% of them no confident AF structural model could be obtained using full-length protein sequences. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/KatjaLuckLab/DDI_manuscript.
Subjects
Proteins, Proteins/chemistry, Proteins/metabolism, Protein Domains, Molecular Models, Protein Databases, Software, Computational Biology/methods, Protein Conformation
Abstract
Structural resolution of protein interactions enables mechanistic and functional studies as well as interpretation of disease variants. However, structural data is still missing for most protein interactions because we lack computational and experimental tools at scale. This is particularly true for interactions mediated by short linear motifs occurring in disordered regions of proteins. We find that AlphaFold-Multimer predicts with high sensitivity but limited specificity structures of domain-motif interactions when using small protein fragments as input. Sensitivity decreased substantially when using long protein fragments or full-length proteins. We delineated a protein fragmentation strategy particularly suited for the prediction of domain-motif interfaces and applied it to interactions between human proteins associated with neurodevelopmental disorders. This enabled the prediction of highly confident and likely disease-related novel interfaces, which we further experimentally corroborated for FBXO23-STX1B, STX1B-VAMP2, ESRRG-PSMC5, PEX3-PEX19, PEX3-PEX16, and SNRPB-GIGYF1, providing novel molecular insights for diverse biological processes. Our work highlights exciting perspectives, but also reveals clear limitations and the need for future developments to maximize the power of AlphaFold-Multimer for interface predictions.
Subjects
Carrier Proteins, Proteins, Humans, Proteins/metabolism, Membrane Proteins/metabolism
Abstract
RNA-binding proteins (RBPs) form highly diverse and dynamic ribonucleoprotein complexes, whose functions determine the molecular fate of the bound RNA. In the model organism Saccharomyces cerevisiae, the number of proteins identified as RBPs has greatly increased over the last decade. However, the cellular function of most of these novel RBPs remains largely unexplored. We used mass spectrometry-based quantitative proteomics to systematically identify protein-protein interactions (PPIs) and RNA-dependent interactions (RDIs) to create a novel dataset for 40 RBPs that are associated with the mRNA life cycle. Domain, functional and pathway enrichment analyses revealed an over-representation of RNA functionalities among the enriched interactors. Using our extensive PPI and RDI networks, we revealed putative new members of RNA-associated pathways, and highlighted potential new roles for several RBPs. Our RBP interactome resource is available through an online interactive platform as a community tool to guide further in-depth functional studies and RBP network analysis (https://www.butterlab.org/RINE).
Subjects
RNA-Binding Proteins, RNA, Saccharomyces cerevisiae, Proteomics, RNA/genetics, Messenger RNA/genetics, Messenger RNA/metabolism, RNA-Binding Proteins/chemistry, RNA-Binding Proteins/metabolism, Saccharomyces cerevisiae/chemistry, Saccharomyces cerevisiae/metabolism, Protein Interaction Mapping, Saccharomyces cerevisiae Proteins/chemistry, Saccharomyces cerevisiae Proteins/metabolism
Abstract
PARP1 mediates poly-ADP-ribosylation of proteins on chromatin in response to different types of DNA lesions. PARP inhibitors are used for the treatment of BRCA1/2-deficient breast, ovarian, and prostate cancer. Loss of DNA replication fork protection is proposed as one mechanism that contributes to the vulnerability of BRCA1/2-deficient cells to PARP inhibitors. However, the mechanisms that regulate PARP1 activity at stressed replication forks remain poorly understood. Here, we performed proximity proteomics of PARP1 and isolation of proteins on stressed replication forks to map putative PARP1 regulators. We identified TPX2 as a direct PARP1-binding protein that regulates the auto-ADP-ribosylation activity of PARP1. TPX2 interacts with DNA damage response proteins and promotes homology-directed repair of DNA double-strand breaks. Moreover, TPX2 mRNA levels are increased in BRCA1/2-mutated breast and prostate cancers, and high TPX2 expression levels correlate with the sensitivity of cancer cells to PARP-trapping inhibitors. We propose that TPX2 confers a mitosis-independent function in the cellular response to replication stress by interacting with PARP1.
Subjects
DNA Replication, Poly(ADP-Ribose) Polymerase-1, Proteomics, Double-Stranded DNA Breaks, DNA Repair, Poly(ADP-Ribose) Polymerase-1/genetics, Poly(ADP-Ribose) Polymerase Inhibitors/pharmacology
Abstract
Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life.
Subjects
Protein Interaction Mapping, Protein Interaction Maps, Proteins/metabolism, Proteome/metabolism, Humans, Protein Binding, Proteins/chemistry, Proteomics
Abstract
BAF complexes are multi-subunit chromatin remodelers, which have a fundamental role in genomic regulation. Large-scale sequencing efforts have revealed frequent BAF complex mutations in many human diseases, particularly in cancer and neurological disorders. These findings not only underscore the importance of the BAF chromatin remodelers in cellular physiological processes, but urge a more detailed understanding of their structure and molecular action to enable the development of targeted therapeutic approaches for diseases with BAF complex alterations. Here, we review recent progress in understanding the composition, assembly, structure, and function of BAF complexes, and the consequences of their disease-associated mutations. Furthermore, we highlight intra-complex subunit dependencies and synthetic lethal interactions, which have emerged as promising treatment modalities for BAF-related diseases.
Subjects
Chromatin Assembly and Disassembly, Transcription Factors/metabolism, Humans, Neoplasms/metabolism, Nervous System Diseases/metabolism, Protein Conformation
Abstract
Many protein interactions are mediated by small linear motifs interacting specifically with defined families of globular domains. Quantifying the specificity of a motif requires measuring and comparing its binding affinities to all its putative target domains. To this end, we developed the high-throughput holdup assay, a chromatographic approach that can measure up to 1,000 domain-motif equilibrium binding affinities per day. After benchmarking the approach on 210 PDZ-peptide pairs with known affinities, we determined the affinities of two viral PDZ-binding motifs derived from human papillomavirus E6 oncoproteins for 209 PDZ domains covering 79% of the human 'PDZome'. We obtained sharply sequence-dependent binding profiles that quantitatively describe the PDZome recognition specificity of each motif. This approach, applicable to many categories of domain-ligand interactions, has wide potential for quantifying the specificities of interactomes.
Subjects
High-Throughput Screening Assays, PDZ Domains, Protein Interaction Mapping/methods, Proteins/chemistry, Amino Acid Motifs, Chromatography, DNA-Binding Proteins/chemistry, Humans, Kinetics, Ligands, Viral Oncogene Proteins/chemistry, Protein Conformation, Proteome, Repressor Proteins/chemistry, Systems Biology
Abstract
Linear motifs are short, evolutionarily plastic components of regulatory proteins and provide low-affinity interaction interfaces. These compact modules play central roles in mediating every aspect of the regulatory functionality of the cell. They are particularly prominent in mediating cell signaling, controlling protein turnover and directing protein localization. Given their importance, our understanding of motifs is surprisingly limited, largely as a result of the difficulty of discovery, both experimentally and computationally. The Eukaryotic Linear Motif (ELM) resource at http://elm.eu.org provides the biological community with a comprehensive database of known experimentally validated motifs, and an exploratory tool to discover putative linear motifs in user-submitted protein sequences. The current update of the ELM database comprises 1800 annotated motif instances representing 170 distinct functional classes, including approximately 500 novel instances and 24 novel classes. Several older motif class entries have also been revisited, improving annotation and adding novel instances. Furthermore, the addition of full-text search capabilities, an enhanced interface and simplified batch download has improved the overall accessibility of the ELM data. The motif discovery portion of the ELM resource has added conservation, and structural attributes have been incorporated to help users discriminate biologically relevant motifs from stochastically occurring non-functional instances.
Subjects
Amino Acid Motifs, Protein Databases, Computer Graphics, Disease/genetics, Eukaryota, Protein Sequence Analysis, User-Computer Interface, Viral Proteins/chemistry
Abstract
Generating reference maps of interactome networks illuminates genetic studies by providing a protein-centric approach to finding new components of existing pathways, complexes, and processes. We apply state-of-the-art methods to identify binary protein-protein interactions (PPIs) for Drosophila melanogaster. Four all-by-all yeast two-hybrid (Y2H) screens of > 10,000 Drosophila proteins result in the 'FlyBi' dataset of 8723 PPIs among 2939 proteins. Testing subsets of data from FlyBi and previous PPI studies using an orthogonal assay allows for normalization of data quality; subsequent integration of FlyBi and previous data results in an expanded binary Drosophila reference interaction network, DroRI, comprising 17,232 interactions among 6511 proteins. We use FlyBi data to generate an autophagy network, then validate in vivo using autophagy-related assays. The deformed wings (dwg) gene encodes a protein that is both a regulator and a target of autophagy. Altogether, these resources provide a foundation for building new hypotheses regarding protein networks and function.
Subjects
Drosophila Proteins, Protein Interaction Maps, Animals, Protein Interaction Maps/genetics, Drosophila melanogaster/genetics, Drosophila melanogaster/metabolism, Drosophila/genetics, Saccharomyces cerevisiae/metabolism, Drosophila Proteins/genetics, Drosophila Proteins/metabolism, Protein Interaction Mapping/methods, Two-Hybrid System Techniques
Abstract
MOTIVATION: The phage display peptide selection approach is widely used for defining binding specificities of globular domains. PDZ domains recognize partner proteins via C-terminal motifs and are often used as a model for interaction predictions. Here, we investigated to what extent phage display data that were recently published for 54 human PDZ domains can be applied to the prediction of human PDZ-peptide interactions. RESULTS: Promising predictions were obtained for one-third of the 54 PDZ domains. For the other two-thirds, we detected in the phage display peptides an important bias for hydrophobic amino acids that seemed to impair correct predictions. Therefore, phage display-selected peptides may be over-hydrophobic and of high affinity, while natural interaction motifs are rather hydrophilic and mostly combine low affinity with high specificity. We suggest that potential amino acid composition bias should systematically be investigated when applying phage display data to the prediction of specific natural domain-linear motif interactions.
Subjects
PDZ Domains, Peptide Library, Peptides/chemistry, Amino Acid Motifs, Amino Acid Sequence, Amino Acids/analysis, Amino Acids/chemistry, Humans, Hydrophobic and Hydrophilic Interactions, Protein Binding, Protein Interaction Domains and Motifs
Abstract
Protein abundance is controlled at the transcriptional, translational and post-translational levels, and its regulatory principles are starting to emerge. Investigating these principles requires large-scale proteomics data and cannot just be done with transcriptional outcomes that are commonly used as a proxy for protein abundance. Here, we determine proteome changes resulting from the individual knockout of 3308 nonessential genes in the yeast Schizosaccharomyces pombe. We use similarity clustering of global proteome changes to infer gene functionality that can be extended to other species, such as humans or baker's yeast. Furthermore, we analyze a selected set of deletion mutants by paired transcriptome and proteome measurements and show that upregulation of proteins under stable transcript expression utilizes optimal codons.
Subjects
Schizosaccharomyces pombe Proteins, Schizosaccharomyces, Humans, Proteome/genetics, Proteome/metabolism, Schizosaccharomyces/genetics, Schizosaccharomyces/metabolism, Schizosaccharomyces pombe Proteins/genetics, Schizosaccharomyces pombe Proteins/metabolism, Proteomics/methods, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism
Abstract
Knowing which proteins interact with each other is essential information for understanding how most biological processes at the cellular and organismal level operate and how their perturbation can cause disease. Continuous technical and methodological advances over the last two decades have led to many genome-wide systematically-generated protein-protein interaction (PPI) maps. To help store, visualize, analyze and disseminate these specialized experimental datasets via the web, we developed the freely-available Open-source Protein Interaction Platform (openPIP) as a customizable web portal designed to host experimental PPI maps. Such a portal is often required to accompany a paper describing the experimental data set, in addition to depositing the data in a standard repository. No coding skills are required to set up and customize the database and web portal. OpenPIP has been used to build the databases and web portals of two major protein interactome maps, the Human and Yeast Reference Protein Interactome maps (HuRI and YeRI, respectively). OpenPIP is freely available as a ready-to-use Docker container for hosting and sharing PPI data with the scientific community at http://openpip.baderlab.org/ and the source code can be downloaded from https://github.com/BaderLab/openPIP/.
Subjects
Internet Use, Protein Interaction Maps, Software, Factual Databases, Human Genome, Humans
Abstract
MOTIVATION: We noted that the sumoylation site in C/EBP homologues is conserved beyond the canonical consensus sequence for sumoylation. Therefore, we investigated whether this pattern might define a more general protein motif. RESULTS: We undertook a survey of the human proteome using a regular expression based on the C/EBP motif. This revealed significant enrichment of the motif using different Gene Ontology terms (e.g. 'transcription') that pertain to the nucleus. When considering requirements for the motif to be functional (evolutionary conservation, structural accessibility of the motif and proper cell localization of the protein), more than 130 human proteins were retrieved from the UniProt/Swiss-Prot database. These candidates were particularly enriched in transcription factors, including FOS, JUN, Hif-1alpha, MLL2 and members of the KLF, MAF and NFATC families; chromatin modifiers like CHD-8, HDAC4 and DNA Top1; and the transcriptional regulatory kinases HIPK1 and HIPK2. The KEPE motif appears to be restricted to the metazoan lineage and has three length variants (short, medium and long), which do not appear to interchange.
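A proteome survey with a regular expression, as described in this abstract, can be sketched roughly as follows. The abstract does not give the actual C/EBP-derived expression, so this sketch substitutes a simplified form of the canonical sumoylation consensus (hydrophobic residue, K, any residue, D/E) purely as a stand-in; the pattern, the function name `scan_proteome`, and the toy sequences are all assumptions made for illustration.

```python
import re

# Stand-in pattern (assumption): simplified canonical sumoylation consensus,
# NOT the C/EBP-derived expression used in the study.
# The lookahead lets overlapping occurrences be reported.
CONSENSUS = re.compile(r"(?=([VILMF]K.[DE]))")


def scan_proteome(proteome):
    """Return {protein_id: [(1-based position, matched peptide), ...]}.

    Only proteins with at least one match are included.
    """
    hits = {}
    for protein_id, sequence in proteome.items():
        found = [(m.start() + 1, m.group(1)) for m in CONSENSUS.finditer(sequence)]
        if found:
            hits[protein_id] = found
    return hits


# Hypothetical toy sequences, not real proteins.
toy_proteome = {
    "toy1": "MAAVKSEPLLIKTEQ",
    "toy2": "MGGSSKKAAQ",
}

if __name__ == "__main__":
    for protein_id, found in scan_proteome(toy_proteome).items():
        print(protein_id, found)
```

In a real survey the hits would then be filtered by the functional requirements the abstract lists (conservation, structural accessibility, localization); those steps are not sketched here.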
Subjects
Chromatin/metabolism, Nucleoproteins/chemistry, Small Ubiquitin-Related Modifier Proteins/chemistry, Transcription Factors/chemistry, Amino Acid Motifs, Amino Acid Sequence, Animals, Conserved Sequence, Protein Databases, Gene Expression Regulation, Humans, Molecular Sequence Data, Mutation/genetics, Proteome/chemistry, Saccharomyces cerevisiae/chemistry, Transcription Factors/metabolism
Abstract
Genetic variants are often not predictive of the phenotypic outcome. Individuals carrying the same pathogenic variant, associated with Mendelian or complex disease, can manifest disease to different extents, ranging from severe to mild to no disease. Improving the accuracy of predicted clinical manifestations of genetic variants has emerged as one of the biggest challenges in precision medicine, which can only be addressed by understanding the mechanisms underlying genotype-phenotype relationships. Efforts to understand the molecular basis of these relationships have identified complex systems of interacting biomolecules that underlie cellular function. Here, we review recent advances in how modeling cellular systems as networks of interacting proteins has fueled identification of disease-associated processes, delineation of underlying molecular mechanisms, and prediction of the pathogenicity of variants. This review is intended to be inspiring for clinicians, geneticists, and network biologists alike who aim to jointly advance our understanding of human disease and accelerate progress toward precision medicine.
Subjects
Genetic Databases, Precision Medicine, Humans
Abstract
Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease. Computational tools offer a promising alternative, helping identify biologically significant, yet unmapped protein-protein interactions (PPIs). While link prediction methods connect proteins on the basis of biological or network-based similarity, interacting proteins are not necessarily similar and similar proteins do not necessarily interact. Here, we offer structural and evolutionary evidence that proteins interact not if they are similar to each other, but if one of them is similar to the other's partners. This approach, which mathematically relies on network paths of length three (L3), significantly outperforms all existing link prediction methods. Given its high accuracy, we show that L3 can offer insights into disease mechanisms and can complement future experimental efforts to complete the human interactome.
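The idea of scoring candidate interactions by paths of length three can be illustrated with a minimal sketch. The abstract states only that the method relies on such paths; the degree normalization used below (dividing each path X-U-V-Y by the square root of the product of the degrees of U and V) is one commonly used formulation and is an assumption here, as are the function name `l3_scores` and the toy network.

```python
from collections import defaultdict
from math import sqrt


def l3_scores(edges):
    """Score every non-adjacent node pair by its degree-normalized
    number of length-3 paths X-U-V-Y (normalization is an assumption)."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions in this sketch
            adj[a].add(b)
            adj[b].add(a)

    scores = defaultdict(float)
    for x in list(adj):
        for u in adj[x]:
            for v in adj[u]:
                weight = 1.0 / sqrt(len(adj[u]) * len(adj[v]))
                for y in adj[v]:
                    # Keep each unordered pair once (x < y) and skip pairs
                    # that already interact; this also rules out walks
                    # that revisit x or u.
                    if x < y and y not in adj[x]:
                        scores[(x, y)] += weight
    return dict(scores)


# Hypothetical toy network: A and B share partners P1 and P2,
# and B additionally interacts with Y.
toy_edges = [("A", "P1"), ("A", "P2"), ("B", "P1"), ("B", "P2"), ("B", "Y")]

if __name__ == "__main__":
    ranked = sorted(l3_scores(toy_edges).items(), key=lambda kv: -kv[1])
    for pair, score in ranked:
        print(pair, round(score, 3))
```

In the toy network, A resembles B (they share both partners), and B is a partner of Y, so A-Y is the only pair that receives a score; the pair A-B, although similar, is not predicted to interact, which mirrors the principle stated in the abstract.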