Results 1 - 20 of 48
1.
Genome Biol Evol ; 15(11), 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37936309

ABSTRACT

The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2. Here, we show the usefulness of this strategy for mutational outcome prediction through a large-scale assessment of 1.5M missense variants across 72 protein families. Our study demonstrates the feasibility of producing alignment-based mutational landscape predictions that are both high-quality and compute-efficient for entire proteomes. We provide the community with the whole human proteome mutational landscape and simplified access to our predictive pipeline.


Subjects
Computational Biology, Proteins, Humans, Computational Biology/methods, Proteins/chemistry, Genomics, Sequence Alignment, Missense Mutation
2.
PLoS Genet ; 19(8): e1010848, 2023 08.
Article in English | MEDLINE | ID: mdl-37585488

ABSTRACT

N-terminal ends of polypeptides are critical for the selective co-translational recruitment of N-terminal modification enzymes. However, it is unknown whether specific N-terminal signatures differentially regulate protein fate according to their cellular functions. In this work, we developed an in-silico approach to detect functional preferences in cellular N-terminomes, and identified in S. cerevisiae more than 200 Gene Ontology terms with specific N-terminal signatures. In particular, we discovered that Mitochondrial Targeting Sequences (MTS) show a strong and specific over-representation, at position 2, of hydrophobic residues known to define potential substrates of the N-terminal acetyltransferase NatC. We validated mitochondrial precursors as co-translational targets of NatC by selective purification of translating ribosomes, and found that their N-terminal signature is conserved in Saccharomycotina yeasts. Finally, systematic mutagenesis of position 2 in a prototypal yeast mitochondrial protein confirmed its critical role in mitochondrial protein import. Our work highlights the hydrophobicity of MTS N-terminal residues and their targeting by NatC as important features for the definition of the mitochondrial proteome, providing a molecular explanation for mitochondrial defects observed in yeast or human NatC-depleted cells. Functional mapping of N-terminal residues thus has the potential to support the discovery of novel mechanisms of protein regulation or targeting.


Subjects
Proteome, Saccharomyces cerevisiae, Humans, Saccharomyces cerevisiae/genetics, Amino Acid Sequence, Proteome/metabolism, Protein Transport, Fungal Proteins/metabolism, Mitochondrial Proteins/metabolism
3.
Proteomics ; 23(17): e2200159, 2023 09.
Article in English | MEDLINE | ID: mdl-37403279

ABSTRACT

Physical interactions between proteins are central to all biological processes. Yet, the current knowledge of who interacts with whom in the cell and in what manner relies on partial, noisy, and highly heterogeneous data. Thus, there is a need for methods comprehensively describing and organizing such data. LEVELNET is a versatile and interactive tool for visualizing, exploring, and comparing protein-protein interaction (PPI) networks inferred from different types of evidence. LEVELNET helps to break down the complexity of PPI networks by representing them as multi-layered graphs and by facilitating the direct comparison of their subnetworks toward biological interpretation. It focuses primarily on the protein chains whose 3D structures are available in the Protein Data Bank. We showcase some potential applications, such as investigating the structural evidence supporting PPIs associated with specific biological processes, assessing the co-localization of interaction partners, comparing the PPI networks obtained through computational experiments versus homology transfer, and creating PPI benchmarks with desired properties.


Subjects
Protein Interaction Mapping, Protein Interaction Maps, Protein Interaction Mapping/methods, Proteins/metabolism, Protein Databases, Computational Biology
4.
J Struct Biol ; 215(3): 107997, 2023 09.
Article in English | MEDLINE | ID: mdl-37453591

ABSTRACT

Alternative splicing of repeats in proteins provides a mechanism for rewiring and fine-tuning protein interaction networks. In this work, we developed a robust and versatile method, ASPRING, to identify alternatively spliced protein repeats from gene annotations. ASPRING leverages evolutionarily meaningful alternative splicing-aware hierarchical graphs to provide maps between protein repeat sequences and 3D structures. We re-think the definition of repeats by explicitly accounting for transcript diversity across several genes/species. Using a stringent sequence-based similarity criterion, we detected over 5,000 evolutionarily conserved repeats by screening virtually all human protein-coding genes and their orthologs across a dozen species. Through a joint analysis of their sequences and structures, we extracted specificity-determining sequence signatures and assessed their implication in experimentally resolved and modelled protein interactions. Our findings demonstrate the widespread alternative usage of protein repeats in modulating protein interactions and open avenues for targeting repeat-mediated interactions.


Subjects
Alternative Splicing, Proteins, Humans, Alternative Splicing/genetics, Proteins/genetics
5.
Nature ; 620(7973): 434-444, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37468638

ABSTRACT

Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale [1]. However, the energetics driving folding are invisible in these structures and remain largely unknown [2]. The hidden thermodynamics of folding can drive disease [3,4], shape protein evolution [5-7] and guide protein engineering [8-10], and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40-72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability.


Subjects
Biology, Protein Engineering, Protein Folding, Proteins, Amino Acids/genetics, Amino Acids/metabolism, Biology/methods, Complementary DNA/genetics, Protein Stability, Proteins/chemistry, Proteins/genetics, Proteins/metabolism, Thermodynamics, Proteolysis, Protein Engineering/methods, Protein Domains/genetics, Mutation
6.
Bioinformatics ; 39(39 Suppl 1): i544-i552, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387162

ABSTRACT

MOTIVATION: The spectacular recent advances in protein and protein complex structure prediction hold promise for reconstructing interactomes at large scale and residue resolution. Beyond determining the 3D arrangement of interacting partners, modeling approaches should be able to unravel the impact of sequence variations on the strength of the association. RESULTS: In this work, we report on Deep Local Analysis, a novel and efficient deep learning framework that relies on a strikingly simple deconstruction of protein interfaces into small locally oriented residue-centered cubes and on 3D convolutions recognizing patterns within cubes. Merely based on the two cubes associated with the wild-type and the mutant residues, DLA accurately estimates the binding affinity change for the associated complexes. It achieves a Pearson correlation coefficient of 0.735 on about 400 mutations on unseen complexes. Its generalization capability on blind datasets of complexes is higher than that of state-of-the-art methods. We show that taking into account the evolutionary constraints on residues contributes to predictions. We also discuss the influence of conformational variability on performance. Beyond the predictive power on the effects of mutations, DLA is a general framework for transferring the knowledge gained from the available non-redundant set of complex protein structures to various tasks. For instance, given a single partially masked cube, it recovers the identity and physicochemical class of the central residue. Given an ensemble of cubes representing an interface, it predicts the function of the complex. AVAILABILITY AND IMPLEMENTATION: Source code and models are available at http://gitlab.lcqb.upmc.fr/DLA/DLA.git.


Subjects
Biological Evolution, Software, Mutation
7.
Bioinformatics ; 38(19): 4505-4512, 2022 09 30.
Article in English | MEDLINE | ID: mdl-35962985

ABSTRACT

MOTIVATION: With the recent advances in protein 3D structure prediction, protein interactions are becoming more central than ever before. Here, we address the problem of determining how proteins interact with one another. More specifically, we investigate the possibility of discriminating near-native protein complex conformations from incorrect ones by exploiting local environments around interfacial residues. RESULTS: Deep Local Analysis (DLA)-Ranker is a deep learning framework applying 3D convolutions to a set of locally oriented cubes representing the protein interface. It explicitly considers the local geometry of the interfacial residues along with their neighboring atoms and the regions of the interface with different solvent accessibility. We assessed its performance on three docking benchmarks made of half a million acceptable and incorrect conformations. We show that DLA-Ranker successfully identifies near-native conformations from ensembles generated by molecular docking. It surpasses or competes with other deep learning-based scoring functions. We also showcase its usefulness to discover alternative interfaces. AVAILABILITY AND IMPLEMENTATION: http://gitlab.lcqb.upmc.fr/dla-ranker/DLA-Ranker.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Proteins, Molecular Docking Simulation, Protein Conformation, Proteins/chemistry, Protein Binding
8.
Bioinformatics ; 38(9): 2615-2616, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35188186

ABSTRACT

SUMMARY: ASES is a versatile tool for assessing the impact of alternative splicing (AS), initiation and termination of transcription on protein diversity in evolution. It identifies exon and transcript orthogroups from a set of input genes/species for comparative transcriptomics analyses. It computes an evolutionary splicing graph, where the nodes are exon orthogroups, allowing for a direct evaluation of AS conservation. It also reconstructs a transcripts' phylogenetic forest to date the appearance of specific transcripts and explore the events that have shaped them. ASES web server features a highly interactive interface enabling the synchronous selection of events, exons or transcripts in the different outputs, and the visualization and retrieval of the corresponding amino acid sequences, for subsequent 3D structure prediction. AVAILABILITY AND IMPLEMENTATION: http://www.lcqb.upmc.fr/Ases. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Alternative Splicing, Proteins, Phylogeny, Exons, Proteins/chemistry, RNA Splicing
9.
PLoS Comput Biol ; 18(1): e1009825, 2022 01.
Article in English | MEDLINE | ID: mdl-35089918

ABSTRACT

Proteins ensure their biological functions by interacting with each other. Hence, characterising protein interactions is fundamental for our understanding of the cellular machinery, and for improving medicine and bioengineering. Over the past years, a large body of experimental data has been accumulated on who interacts with whom and in what manner. However, these data are highly heterogeneous and sometimes contradictory, noisy, and biased. Ab initio methods provide a means to a "blind" protein-protein interaction network reconstruction. Here, we report on a molecular cross-docking-based approach for the identification of protein partners. The docking algorithm uses a coarse-grained representation of the protein structures and treats them as rigid bodies. We applied the approach to a few hundred proteins in their unbound conformations, and we systematically investigated the influence of several key ingredients, such as the size and quality of the interfaces, and the scoring function. We achieved some significant improvement compared to previous works, and a very high discriminative power on some specific functional classes. We provide a readout of the contributions of shape and physico-chemical complementarity, interface matching, and specificity, in the predictions. In addition, we assessed the ability of the approach to account for protein surface multiple usages, and we compared it with a sequence-based deep learning method. This work may contribute to guiding the exploitation of the large amounts of protein structural models now available toward the discovery of unexpected partners and their complex structure characterisation.


Subjects
Binding Sites/physiology, Molecular Docking Simulation, Protein Conformation, Protein Interaction Maps/physiology, Proteins, Algorithms, Computational Biology, Protein Databases, Protein Interaction Mapping, Proteins/chemistry, Proteins/metabolism
10.
Proteins ; 89(12): 1770-1786, 2021 12.
Article in English | MEDLINE | ID: mdl-34519095

ABSTRACT

The potential of deep learning has been recognized in the protein structure prediction community for some time, and became indisputable after CASP13. In CASP14, deep learning has boosted the field to unanticipated levels reaching near-experimental accuracy. This success comes from advances transferred from other machine learning areas, as well as methods specifically designed to deal with protein sequences and structures, and their abstractions. Novel emerging approaches include (i) geometric learning, that is, learning on representations such as graphs, three-dimensional (3D) Voronoi tessellations, and point clouds; (ii) pretrained protein language models leveraging attention; (iii) equivariant architectures preserving the symmetry of 3D space; (iv) use of large meta-genome databases; (v) combinations of protein representations; and (vi) finally truly end-to-end architectures, that is, differentiable models starting from a sequence and returning a 3D structure. Here, we provide an overview and our opinion of the novel deep learning approaches developed in the last 2 years and widely used in CASP14.


Subjects
Amino Acid Sequence, Protein Conformation, Proteins, Software, Computational Biology, Protein Databases, Deep Learning, Proteins/chemistry, Proteins/metabolism, Protein Sequence Analysis
11.
Genome Res ; 31(8): 1462-1473, 2021 08.
Article in English | MEDLINE | ID: mdl-34266979

ABSTRACT

Understanding how protein function has evolved and diversified is of great importance for human genetics and medicine. Here, we tackle the problem of describing the whole transcript variability observed in several species by generalizing the definition of splicing graph. We provide a practical solution to construct parsimonious evolutionary splicing graphs where each node is a minimal transcript building block defined across species. We show a clear link between the functional relevance, tissue regulation, and conservation of alternative transcripts on a set of 50 genes. By scaling up to the whole human protein-coding genome, we identify a few thousand genes where alternative splicing modulates the number and composition of pseudorepeats. We have implemented our approach in ThorAxe, an efficient, versatile, robust, and freely available computational tool.


Subjects
Alternative Splicing, RNA Splicing, Human Genome, Humans
12.
J Phys Chem B ; 125(10): 2577-2588, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33687221

ABSTRACT

In light of the recent very rapid progress in protein structure prediction, accessing the multitude of functional protein states is becoming more central than ever before. Indeed, proteins are flexible macromolecules, and they often perform their function by switching between different conformations. However, high-resolution experimental techniques such as X-ray crystallography and cryogenic electron microscopy can catch relatively few protein functional states. Many others are only accessible under physiological conditions in solution. Therefore, there is a pressing need to fill this gap with computational approaches. We present HOPMA, a novel method to predict protein functional states and transitions by using a modified elastic network model. The method exploits patterns in a protein contact map, taking its 3D structure as input, and excludes some disconnected patches from the elastic network. Combined with nonlinear normal mode analysis, this strategy boosts the protein conformational space exploration, especially when the input structure is highly constrained, as we demonstrate on a set of more than 400 transitions. Our results let us envision the discovery of new functional conformations, which were unreachable previously, starting from the experimentally known protein structures. The method is computationally efficient and available at https://github.com/elolaine/HOPMA and https://team.inria.fr/nano-d/software/nolb-normal-modes.


Subjects
Proteins, X-Ray Crystallography, Macromolecular Substances, Molecular Models, Protein Conformation
13.
Trends Pharmacol Sci ; 42(1): 3-6, 2021 01.
Article in English | MEDLINE | ID: mdl-33234336

ABSTRACT

Solute carrier (SLC) transporters are emerging drug targets. Identifying the molecular determinants responsible for their specific and selective transport activities and describing key interactions with their ligands are crucial steps towards the design of potential new drugs. A general functional mapping across more than 400 human SLC transporters would pave the way to the rational and systematic design of molecules modulating cellular transport.


Subjects
Membrane Transport Proteins, Solute Carrier Proteins, Humans, Ligands
14.
BMC Bioinformatics ; 21(Suppl 19): 573, 2020 Dec 21.
Article in English | MEDLINE | ID: mdl-33349244

ABSTRACT

BACKGROUND: Coiled-coils are described as stable structural motifs, where two or more helices wind around each other. However, coiled-coils are associated with local mobility and intrinsic disorder. Intrinsically disordered regions in proteins are characterized by lack of stable secondary and tertiary structure under physiological conditions in vitro. They are increasingly recognized as important for protein function. However, characterizing their behaviour in solution and determining precisely the extent of disorder of a protein region remains challenging, both experimentally and computationally. RESULTS: In this work, we propose a computational framework to quantify the extent of disorder within a coiled-coil in solution and to help design substitutions modulating such disorder. Our method relies on the analysis of conformational ensembles generated by relatively short all-atom Molecular Dynamics (MD) simulations. We apply it to the phosphoprotein multimerisation domains (PMD) of Measles virus (MeV) and Nipah virus (NiV), both forming tetrameric left-handed coiled-coils. We show that our method can help quantify the extent of disorder of the C-terminus region of MeV and NiV PMDs from MD simulations of a few tens of nanoseconds, and without requiring an extensive exploration of the conformational space. Moreover, this study provided a conceptual framework for the rational design of substitutions aimed at modulating the stability of the coiled-coils. By assessing the impact of four substitutions known to destabilize coiled-coils, we derive a set of rules to control MeV PMD structural stability and cohesiveness. We therefore design two contrasting substitutions, one increasing the stability of the tetramer and the other increasing its flexibility. CONCLUSIONS: Our method can be considered as a platform to reason about how to design substitutions aimed at regulating flexibility and stability.


Subjects
Computational Biology/methods, Viral Proteins/chemistry, Amino Acid Sequence, Measles virus/metabolism, Molecular Dynamics Simulation, Nipah Virus/metabolism, Protein Domains, Protein Stability, Secondary Protein Structure, Viral Proteins/metabolism
15.
Biophys J ; 118(10): 2513-2525, 2020 05 19.
Article in English | MEDLINE | ID: mdl-32330413

ABSTRACT

Large macromolecules, including proteins and their complexes, very often adopt multiple conformations. Some of them can be seen experimentally, for example with x-ray crystallography or cryo-electron microscopy. This structural heterogeneity is not occasional and is frequently linked with specific biological function. Thus, the accurate description of macromolecular conformational transitions is crucial for understanding fundamental mechanisms of life's machinery. We report on a real-time method to predict such transitions by extrapolating from instantaneous eigen motions, computed using the normal mode analysis, to a series of twists. We demonstrate the applicability of our approach to the prediction of a wide range of motions, including large collective opening-closing transitions and conformational changes induced by partner binding. We also highlight particularly difficult cases of very small transitions between crystal and solution structures. Our method guarantees preservation of the protein structure during the transition and allows accessing conformations that are unreachable with classical normal mode analysis. We provide practical solutions to describe localized motions with a few low-frequency modes and to relax some geometrical constraints along the predicted transitions. This work opens the way to the systematic description of protein motions, whatever their degree of collectivity. Our method is freely available as a part of the NOn-Linear rigid Block (NOLB) package.


Subjects
Proteins, Cryoelectron Microscopy, X-Ray Crystallography, Molecular Models, Protein Conformation
16.
J Mol Biol ; 432(7): 2121-2140, 2020 03 27.
Article in English | MEDLINE | ID: mdl-32067951

ABSTRACT

Alternative splicing and alternative initiation/termination transcription sites have the potential to greatly expand the proteome in eukaryotes by producing several transcript isoforms from the same gene. Although these mechanisms are well described at the genomic level, little is known about their contribution to protein evolution and their impact at the protein structure level. Here, we address both issues by reconstructing the evolutionary history of transcripts and by modeling the tertiary structures of the corresponding protein isoforms. We reconstruct phylogenetic forests relating 60 protein-coding transcripts from the c-Jun N-terminal kinase (JNK) family observed in seven species. We identify two alternative splicing events of ancient origin and show that they induce subtle changes in the protein's structural dynamics. We highlight a previously uncharacterized transcript whose predicted structure seems stable in solution. We further demonstrate that orphan transcripts, for which no phylogeny could be reconstructed, display peculiar sequence and structural properties. Our approach is implemented in PhyloSofS (Phylogenies of Splicing Isoforms Structures), a fully automated computational tool freely available at https://github.com/PhyloSofS-Team/PhyloSofS.


Subjects
Computational Biology/methods, Molecular Evolution, MAP Kinase Kinase 4/genetics, MAP Kinase Kinase 4/metabolism, Protein Conformation, Proteome/analysis, Transcriptome, Alternative Splicing, Animals, Humans, MAP Kinase Kinase 4/chemistry, MAP Kinase Kinase 4/classification, Phylogeny, Protein Isoforms, Genetic Transcription
17.
PLoS Comput Biol ; 16(2): e1007624, 2020 02.
Article in English | MEDLINE | ID: mdl-32012150

ABSTRACT

Interactions between proteins and nucleic acids are at the heart of many essential biological processes. Despite increasing structural information about how these interactions may take place, our understanding of the usage made of protein surfaces by nucleic acids is still very limited. This is in part due to the inherent complexity associated with protein surface deformability and evolution. In this work, we present a method that contributes to deciphering such complexity by predicting protein-DNA interfaces and characterizing their properties. It relies on three biologically and physically meaningful descriptors, namely evolutionary conservation, physico-chemical properties and surface geometry. We carefully assessed its performance on several hundred protein structures and compared it to several machine-learning state-of-the-art methods. Our approach achieves a higher sensitivity compared to the other methods, with a similar precision. Importantly, we show that it is able to unravel 'hidden' binding sites by applying it to unbound protein structures and to proteins binding to DNA via multiple sites and in different conformations. It is also applicable to the detection of RNA-binding sites, without significant loss of performance. This confirms that DNA and RNA-binding sites share similar properties. Our method is implemented as a fully automated tool, JET2DNA, freely accessible at: http://www.lcqb.upmc.fr/JET2DNA. We also provide a new dataset of 187 protein-DNA complex structures, along with a subset of 82 associated unbound structures. The set represents the largest body of high-resolution crystallographic structures of protein-DNA complexes, uses biological protein assemblies as DNA-binding units, and covers all major types of protein-DNA interactions. It is available at: http://www.lcqb.upmc.fr/PDNAbenchmarks.


Subjects
Biological Evolution, DNA-Binding Proteins/metabolism, DNA/metabolism, Proteins/metabolism, Algorithms, Machine Learning
18.
Proteins ; 87(12): 1200-1221, 2019 12.
Article in English | MEDLINE | ID: mdl-31612567

ABSTRACT

We present the results for CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 heterocomplexes. Eight of the homo-oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and two higher-order assemblies. These were more difficult to model, as their prediction mainly involved "ab-initio" docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictor groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance "gap" was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.


Subjects
Computational Biology, Protein Conformation, Proteins/ultrastructure, Software, Algorithms, Binding Sites/genetics, Protein Databases, Molecular Models, Protein Binding/genetics, Protein Interaction Mapping, Proteins/chemistry, Proteins/genetics, Protein Structural Homology
19.
Mol Biol Evol ; 36(11): 2604-2619, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31406981

ABSTRACT

The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME) (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it overall performs similarly to or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, highly conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at www.lcqb.upmc.fr/GEMME/.

20.
Proteins ; 87(11): 952-965, 2019 11.
Article in English | MEDLINE | ID: mdl-31199528

ABSTRACT

The growing body of experimental and computational data describing how proteins interact with each other has emphasized the multiplicity of protein interactions and the complexity underlying protein surface usage and deformability. In this work, we propose new concepts and methods toward deciphering such complexity. We introduce the notion of interacting region to account for the multiple usage of a protein's surface residues by several partners and for the variability of protein interfaces coming from molecular flexibility. We predict interacting patches by crossing evolutionary, physicochemical and geometrical properties of the protein surface with information coming from complete cross-docking (CC-D) simulations. We show that our predictions match interacting regions well and that the different sources of information are complementary. We further propose an indicator of whether a protein has a few or many partners. Our prediction strategies are implemented in the dynJET2 algorithm and assessed on a new dataset of 262 proteins on which we performed CC-D. The code and the data are available at: http://www.lcqb.upmc.fr/dynJET2/.


Subjects
Proteins/metabolism, Algorithms, Animals, Binding Sites, Humans, Molecular Docking Simulation, Protein Binding, Protein Conformation, Protein Interaction Domains and Motifs, Protein Interaction Mapping/methods, Protein Interaction Maps, Proteins/chemistry, Software