Results 1 - 20 of 73
1.
Bioinformatics ; 39(Suppl 1): i544-i552, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387162

ABSTRACT

MOTIVATION: The spectacular recent advances in protein and protein complex structure prediction hold promise for reconstructing interactomes at large-scale and residue resolution. Beyond determining the 3D arrangement of interacting partners, modeling approaches should be able to unravel the impact of sequence variations on the strength of the association. RESULTS: In this work, we report on Deep Local Analysis, a novel and efficient deep learning framework that relies on a strikingly simple deconstruction of protein interfaces into small locally oriented residue-centered cubes and on 3D convolutions recognizing patterns within cubes. Merely based on the two cubes associated with the wild-type and the mutant residues, DLA accurately estimates the binding affinity change for the associated complexes. It achieves a Pearson correlation coefficient of 0.735 on about 400 mutations on unseen complexes. Its generalization capability on blind datasets of complexes is higher than the state-of-the-art methods. We show that taking into account the evolutionary constraints on residues contributes to predictions. We also discuss the influence of conformational variability on performance. Beyond the predictive power on the effects of mutations, DLA is a general framework for transferring the knowledge gained from the available non-redundant set of complex protein structures to various tasks. For instance, given a single partially masked cube, it recovers the identity and physicochemical class of the central residue. Given an ensemble of cubes representing an interface, it predicts the function of the complex. AVAILABILITY AND IMPLEMENTATION: Source code and models are available at http://gitlab.lcqb.upmc.fr/DLA/DLA.git.


Subject(s)
Biological Evolution , Software , Mutation
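The abstract above reports performance as a Pearson correlation coefficient between predicted and measured binding affinity changes. As a reminder of what that metric measures, here is a minimal sketch in plain Python. The function name `pearson` and the numbers in the usage note are hypothetical illustrations, not data or code from the DLA paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, all left unnormalised (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

For example, `pearson(predicted_ddg, measured_ddg)` on two hypothetical lists of affinity changes returns a value in [-1, 1], where values near 1 indicate that predictions increase linearly with measurements.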
2.
Nucleic Acids Res ; 50(W1): W412-W419, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35670671

ABSTRACT

Residue coevolution within and between proteins is used as a marker of physical interaction and/or residue functional cooperation. Pairs or groups of coevolving residues are extracted from multiple sequence alignments based on a variety of computational approaches. However, coevolution signals emerging in subsets of sequences might be lost if the full alignment is considered. iBIS2Analyzer is a web server dedicated to a phylogeny-driven coevolution analysis of protein families with different evolutionary pressure. It is based on the iterative version, iBIS2, of the coevolution analysis method BIS, Blocks in Sequences. iBIS2 is designed to iteratively select and analyse subtrees in phylogenetic trees, possibly large and comprising thousands of sequences. With iBIS2Analyzer, openly accessible at http://ibis2analyzer.lcqb.upmc.fr/, the user visualizes, compares and inspects clusters of coevolving residues by mapping them onto sequences, alignments or structures of choice, greatly simplifying downstream analysis steps. A rich and interactive graphic interface facilitates the biological interpretation of the results.


Subject(s)
Computers , Molecular Evolution , Internet , Phylogeny , Proteins , Sequence Alignment , Software , Proteins/chemistry , Proteins/classification , Amino Acid Sequence , Data Visualization
3.
Proteomics ; 23(17): e2200159, 2023 09.
Article in English | MEDLINE | ID: mdl-37403279

ABSTRACT

Physical interactions between proteins are central to all biological processes. Yet, the current knowledge of who interacts with whom in the cell and in what manner relies on partial, noisy, and highly heterogeneous data. Thus, there is a need for methods comprehensively describing and organizing such data. LEVELNET is a versatile and interactive tool for visualizing, exploring, and comparing protein-protein interaction (PPI) networks inferred from different types of evidence. LEVELNET helps to break down the complexity of PPI networks by representing them as multi-layered graphs and by facilitating the direct comparison of their subnetworks toward biological interpretation. It focuses primarily on the protein chains whose 3D structures are available in the Protein Data Bank. We showcase some potential applications, such as investigating the structural evidence supporting PPIs associated to specific biological processes, assessing the co-localization of interaction partners, comparing the PPI networks obtained through computational experiments versus homology transfer, and creating PPI benchmarks with desired properties.


Subject(s)
Protein Interaction Mapping , Protein Interaction Maps , Protein Interaction Mapping/methods , Proteins/metabolism , Protein Databases , Computational Biology
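The abstract above describes representing PPI networks as multi-layered graphs and comparing their subnetworks. The sketch below illustrates that general idea only; it is not LEVELNET's code or data model. It assumes each layer is a list of undirected edges, and the names `normalize`, `compare_layers`, the layer labels and the protein identifiers are all hypothetical.

```python
def normalize(edges):
    """Turn (u, v) pairs into a set of unordered edges."""
    return {frozenset(edge) for edge in edges}


def compare_layers(layers, first, second):
    """Compare two layers of a multi-layer network by their edge sets."""
    edges_first = normalize(layers[first])
    edges_second = normalize(layers[second])
    return {
        "shared": edges_first & edges_second,        # supported by both layers
        "only_first": edges_first - edges_second,    # evidence in first layer only
        "only_second": edges_second - edges_first,   # evidence in second layer only
    }


# Hypothetical layers: one edge list per type of evidence.
layers = {
    "structure": [("P1", "P2"), ("P2", "P3")],
    "docking": [("P2", "P1"), ("P3", "P4")],
}
result = compare_layers(layers, "structure", "docking")
```

Storing edges as `frozenset` pairs makes `("P1", "P2")` and `("P2", "P1")` the same interaction, so the comparison does not depend on the order in which partners are listed.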
4.
Bioinformatics ; 38(19): 4505-4512, 2022 09 30.
Article in English | MEDLINE | ID: mdl-35962985

ABSTRACT

MOTIVATION: With the recent advances in protein 3D structure prediction, protein interactions are becoming more central than ever before. Here, we address the problem of determining how proteins interact with one another. More specifically, we investigate the possibility of discriminating near-native protein complex conformations from incorrect ones by exploiting local environments around interfacial residues. RESULTS: Deep Local Analysis (DLA)-Ranker is a deep learning framework applying 3D convolutions to a set of locally oriented cubes representing the protein interface. It explicitly considers the local geometry of the interfacial residues along with their neighboring atoms and the regions of the interface with different solvent accessibility. We assessed its performance on three docking benchmarks made of half a million acceptable and incorrect conformations. We show that DLA-Ranker successfully identifies near-native conformations from ensembles generated by molecular docking. It surpasses or competes with other deep learning-based scoring functions. We also showcase its usefulness to discover alternative interfaces. AVAILABILITY AND IMPLEMENTATION: http://gitlab.lcqb.upmc.fr/dla-ranker/DLA-Ranker.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteins , Molecular Docking Simulation , Protein Conformation , Proteins/chemistry , Protein Binding
5.
PLoS Comput Biol ; 18(11): e1010713, 2022 11.
Article in English | MEDLINE | ID: mdl-36395332

ABSTRACT

The relationship between interactions, flexibility and disorder in proteins has been explored from many angles over the years: folding upon binding, flexibility of the core relative to the periphery, entropy changes, etc. In this work, we provide statistical evidence for the involvement of highly mobile and disordered regions in complex assembly. We ordered the entire set of X-ray crystallographic structures in the Protein Data Bank into hierarchies of progressive interactions involving identical or very similar protein chains, yielding 40205 hierarchies of protein complexes with increasing numbers of partners. We then examine them as proxies for the assembly pathways. Using this database, we show that upon oligomerisation, the new interfaces tend to be observed at residues that were characterised as softly disordered (flexible, amorphous or missing residues) in the complexes preceding them in the hierarchy. We also rule out the possibility that this correlation is just a surface effect by restricting the analysis to residues on the surface of the complexes. Interestingly, we find that the location of soft disordered residues in the sequence changes as the number of partners increases. Our results show that there is a general mechanism for protein assembly that involves soft disorder and modulates the way protein complexes are assembled. This work highlights the difficulty of predicting the structure of large protein complexes from sequence and emphasises the importance of linking predictors of soft disorder to the next generation of predictors of complex structure. Finally, we investigate the relationship between AlphaFold2's confidence metric for structure prediction, pLDDT, in unbound versus bound structures, and soft disorder. We show a strong correlation between AlphaFold2 low-confidence residues and the union of all regions of soft disorder observed in the hierarchy. This paves the way for using the pLDDT metric as a proxy for predicting interfaces and assembly paths.


Subject(s)
Mental Processes , Protein Databases , X-Ray Crystallography , Entropy
6.
PLoS Comput Biol ; 18(1): e1009825, 2022 01.
Article in English | MEDLINE | ID: mdl-35089918

ABSTRACT

Proteins ensure their biological functions by interacting with each other. Hence, characterising protein interactions is fundamental for our understanding of the cellular machinery, and for improving medicine and bioengineering. Over the past years, a large body of experimental data has been accumulated on who interacts with whom and in what manner. However, these data are highly heterogeneous and sometimes contradictory, noisy, and biased. Ab initio methods provide a means for "blind" protein-protein interaction network reconstruction. Here, we report on a molecular cross-docking-based approach for the identification of protein partners. The docking algorithm uses a coarse-grained representation of the protein structures and treats them as rigid bodies. We applied the approach to a few hundred proteins in their unbound conformations, and we systematically investigated the influence of several key ingredients, such as the size and quality of the interfaces, and the scoring function. We achieved some significant improvement compared to previous works, and a very high discriminative power on some specific functional classes. We provide a readout of the contributions of shape and physico-chemical complementarity, interface matching, and specificity, in the predictions. In addition, we assessed the ability of the approach to account for multiple usages of protein surfaces, and we compared it with a sequence-based deep learning method. This work may contribute to guiding the exploitation of the large amounts of protein structural models now available toward the discovery of unexpected partners and their complex structure characterisation.


Subject(s)
Binding Sites/physiology , Molecular Docking Simulation , Protein Conformation , Protein Interaction Maps/physiology , Proteins , Algorithms , Computational Biology , Protein Databases , Protein Interaction Mapping , Proteins/chemistry , Proteins/metabolism
7.
Nucleic Acids Res ; 49(W1): W452-W458, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34023906

ABSTRACT

The ever-increasing number of genomic and metagenomic sequences accumulating in our databases requires accurate approaches to explore their content against specific domain targets. MyCLADE is a user-friendly webserver designed for targeted functional profiling of genomic and metagenomic sequences based on a database of a few million probabilistic models of Pfam domains. It uses the MetaCLADE multi-source domain annotation strategy, modelling domains based on multiple probabilistic profiles. MyCLADE takes a list of protein sequences and possibly a target set of domains/clans as input and, for each sequence, it provides a domain architecture built from the targeted domains or from all Pfam domains. It is linked to the Pfam and QuickGO databases in multiple ways for easy retrieval of domain and clan information. E-value, bit-score, domain-dependent probability scores and logos representing the match of the model with the sequence are provided to help the user to assess the quality of each annotation. Availability and implementation: MyCLADE is freely available at http://www.lcqb.upmc.fr/myclade.


Subject(s)
Molecular Sequence Annotation , Protein Domains , Software , Genomics , Metagenomics , Protein Sequence Analysis/methods , Staphylococcus aureus/genetics
8.
J Struct Biol ; 214(3): 107873, 2022 09.
Article in English | MEDLINE | ID: mdl-35680033

ABSTRACT

The Calvin-Benson cycle fixes carbon dioxide into organic triosephosphates through the collective action of eleven conserved enzymes. Regeneration of ribulose-1,5-bisphosphate, the substrate of Rubisco-mediated carboxylation, requires two lyase reactions catalyzed by fructose-1,6-bisphosphate aldolase (FBA). While cytoplasmic FBA has been extensively studied in non-photosynthetic organisms, functional and structural details are limited for chloroplast FBA encoded by oxygenic phototrophs. Here we determined the crystal structure of plastidial FBA from the unicellular green alga Chlamydomonas reinhardtii (Cr). We confirm that CrFBA folds as a TIM barrel, describe its catalytic pocket and homo-tetrameric state. Multiple sequence profiling classified the photosynthetic paralogs of FBA in a distinct group from non-photosynthetic paralogs. We mapped the sites of thiol- and phospho-based post-translational modifications known from photosynthetic organisms and predict their effects on enzyme catalysis.


Subject(s)
Chlamydomonas reinhardtii , Carbon Dioxide , Chlamydomonas reinhardtii/metabolism , Chloroplasts , Fructose , Fructose-Bisphosphate Aldolase , Photosynthesis , Ribulose-Bisphosphate Carboxylase/chemistry , Ribulose-Bisphosphate Carboxylase/metabolism
9.
PLoS Comput Biol ; 17(1): e1008546, 2021 01.
Article in English | MEDLINE | ID: mdl-33417598

ABSTRACT

The importance of unstructured biology has quickly grown during the last decades accompanying the explosion of the number of experimentally resolved protein structures. The idea that structural disorder might be a novel mechanism of protein interaction is widespread in the literature, although the number of statistically significant structural studies supporting this idea is surprisingly low. At variance with previous works, our conclusions rely exclusively on a large-scale analysis of all the 134337 X-ray crystallographic structures of the Protein Data Bank averaged over clusters of almost identical protein sequences. In this work, we explore the complexity of the organisation of all the interaction interfaces observed when a protein lies in alternative complexes, showing that interfaces progressively add up in a hierarchical way, which is reflected in a logarithmic law for the size of the union of the interface regions on the number of distinct interfaces. We further investigate the connection of this complexity with different measures of structural disorder: the standard missing residues and a new definition, called "soft disorder", that covers all the flexible and structurally amorphous residues of a protein. We show evidence that both the interaction interfaces and the soft disordered regions tend to involve roughly the same amino acids of the protein, and preliminary results suggesting that soft disorder spots those surface regions where new interfaces are progressively accommodated by complex formation. In fact, our results suggest that structurally disordered regions not only carry crucial information about the location of alternative interfaces within complexes, but also about the order of the assembly. We verify these hypotheses in several examples, such as the DNA binding domains of P53 and P73, the C3 exoenzyme, and two known biological orders of assembly. We finally compare our measures of structural disorder with several disorder bioinformatics predictors, showing that the latter are optimised to predict the residues that are missing in all the alternative structures of a protein and are not able to catch the progressive evolution of the disordered regions upon complex formation. Yet, the predicted residues, when not missing, tend to be characterised as soft disordered regions.


Subject(s)
Computational Biology/methods , Intrinsically Disordered Proteins/chemistry , Protein Conformation , Amino Acid Sequence , Cluster Analysis , X-Ray Crystallography , Protein Databases , Intrinsically Disordered Proteins/metabolism , Protein Binding
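The abstract above mentions a logarithmic law relating the size of the union of interface regions to the number of distinct interfaces. Under the assumption that this means a relation of the form U(n) ≈ a·ln(n) + b, the sketch below shows how such a law could be fitted by ordinary least squares. The function name `fit_log_law` and the parameters `a` and `b` are hypothetical; the paper's own fitting procedure is not described in the abstract.

```python
import math


def fit_log_law(n_interfaces, union_sizes):
    """Least-squares fit of union_size ~ a * ln(n) + b.

    n_interfaces: numbers of distinct interfaces (positive integers).
    union_sizes:  sizes of the union of interface regions (e.g. residue counts).
    Returns the pair (a, b).
    """
    if len(n_interfaces) != len(union_sizes) or len(n_interfaces) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    xs = [math.log(n) for n in n_interfaces]  # the law is linear in ln(n)
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(union_sizes) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct interface counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, union_sizes))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b
```

A fitted slope `a` well above zero with small residuals would be consistent with interfaces adding up progressively, each new interface contributing fewer new residues than the previous one.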
10.
Nucleic Acids Res ; 48(W1): W558-W565, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32374885

ABSTRACT

Overlapping genes are commonplace in viruses and play an important role in their function and evolution. For these genes, molecular coevolution may be seen as a mechanism to decrease the evolutionary constraints of amino acid positions in the overlapping regions and to tolerate or compensate unfavorable mutations. Tracing these mutational sites could help to gain insight into the direct or indirect effect of the mutations in the corresponding overlapping proteins. In the past, coevolution analysis has been used to identify residue pairs and coevolutionary signatures within or between proteins that served as markers of physical interactions and/or functional relationships. Coevolution in OVerlapped sequences by Tree analysis (COVTree) is a web server providing the online analysis of coevolving amino-acid pairs in overlapping genes, where residues might be located inside or outside the overlapping region. COVTree is designed to handle protein families with various characteristics, including those that typically display a small number of highly conserved sequences. It is based on BIS2, a fast version of the coevolution analysis tool Blocks in Sequences (BIS). COVTree provides a rich and interactive graphical interface to ease biological interpretation of the results and it is openly accessible at http://www.lcqb.upmc.fr/COVTree/.


Subject(s)
Molecular Evolution , Overlapping Genes , Software , Viral Genes , Hepatitis B Surface Antigens/genetics , Hepatitis B virus/genetics , Sequence Alignment
11.
Mol Biol Evol ; 37(9): 2747-2762, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32384156

ABSTRACT

Gene order can be used as an informative character to reconstruct phylogenetic relationships between species independently from the local information present in gene/protein sequences. PhyChro is a reconstruction method based on chromosomal rearrangements, applicable to a wide range of eukaryotic genomes with different gene contents and levels of synteny conservation. For each synteny breakpoint issued from pairwise genome comparisons, the algorithm defines two disjoint sets of genomes, named partial splits, respectively, supporting the two block adjacencies defining the breakpoint. Considering all partial splits issued from all pairwise comparisons, a distance between two genomes is computed from the number of partial splits separating them. Tree reconstruction is achieved through a bottom-up approach by iteratively grouping sister genomes minimizing genome distances. PhyChro estimates branch lengths based on the number of synteny breakpoints and provides confidence scores for the branches. PhyChro performance is evaluated on two data sets of 13 vertebrates and 21 yeast genomes by using up to 130,000 and 179,000 breakpoints, respectively, a scale of genomic markers that has been out of reach until now. PhyChro reconstructs very accurate tree topologies even at known problematic branching positions. Its robustness has been benchmarked for different synteny block reconstruction methods. On simulated data PhyChro reconstructs phylogenies perfectly in almost all cases, and shows the highest accuracy compared with other existing tools. PhyChro is very fast, reconstructing the vertebrate and yeast phylogenies in <15 min.


Subject(s)
Genetic Techniques , Genetic Models , Phylogeny , Software , Synteny , Algorithms , Animals , Gene Order , Genome , Vertebrates/genetics , Yeasts/genetics
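The abstract above computes a distance between two genomes from the number of partial splits separating them, then groups sister genomes bottom-up. The sketch below is a deliberately simplified illustration of that counting step, not PhyChro's implementation. It assumes a partial split is a pair of disjoint genome sets and that two genomes are separated when they fall on opposite sides; the names `split_distances` and `closest_pair` and the toy data are hypothetical, and PhyChro's actual sister-selection criterion may be more involved.

```python
from itertools import combinations


def split_distances(genomes, partial_splits):
    """Count, for each genome pair, the partial splits that separate it.

    partial_splits: iterable of (side_a, side_b) pairs of disjoint sets,
    assumed to mention only genomes listed in `genomes`.
    """
    dist = {frozenset(pair): 0 for pair in combinations(genomes, 2)}
    for side_a, side_b in partial_splits:
        if set(side_a) & set(side_b):
            raise ValueError("the two sides of a partial split must be disjoint")
        for g in side_a:
            for h in side_b:
                dist[frozenset((g, h))] += 1
    return dist


def closest_pair(dist):
    """Candidate sister genomes: the pair with the smallest distance.

    Ties are broken alphabetically so that the result is deterministic.
    """
    return min(dist, key=lambda pair: (dist[pair], sorted(pair)))


# Hypothetical toy input: four genomes and three partial splits.
genomes = ["A", "B", "C", "D"]
splits = [({"A", "B"}, {"C", "D"}), ({"A", "B"}, {"C"}), ({"A"}, {"D"})]
dist = split_distances(genomes, splits)
sisters = closest_pair(dist)
```

In a bottom-up reconstruction, the selected pair would be merged into a single node and the procedure repeated on the reduced genome set until one tree remains.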
12.
Bioinformatics ; 36(13): 3975-3981, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32330240

ABSTRACT

MOTIVATION: The understanding of the ever-increasing number of metagenomic sequences accumulating in our databases demands approaches that rapidly 'explore' the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full assembly. RESULTS: S3A is a fast and accurate domain-targeted assembler designed for rapid functional profiling. It is based on a novel construction and a fast traversal of the Overlap-Layout-Consensus graph, designed to reconstruct coding regions from domain annotated metagenomic sequence reads. S3A relies on high-quality domain annotation to efficiently assemble metagenomic sequences and on the design of a new confidence measure for a fast evaluation of overlapping reads. Its implementation is highly generic and can be applied to any type of annotation. On simulated data, S3A achieves a level of accuracy similar to that of classical metagenomics assembly tools while permitting faster and sensitive profiling of domains of interest. When studying a few dozen functional domains (a typical scenario), S3A is up to an order of magnitude faster than general purpose metagenomic assemblers, thus enabling the analysis of a larger number of datasets in the same amount of time. S3A opens new avenues to the fast exploration of the rapidly increasing number of metagenomic datasets displaying an ever-increasing size. AVAILABILITY AND IMPLEMENTATION: S3A is available at http://www.lcqb.upmc.fr/S3A_ASSEMBLER/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Metagenomics , Metagenome , DNA Sequence Analysis , Software
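The abstract above builds on the Overlap-Layout-Consensus (OLC) graph. The sketch below illustrates only the generic "overlap" step of OLC assembly, using exact suffix-prefix matches between reads. It is not S3A's construction: S3A works on domain-annotated reads and uses its own confidence measure, neither of which is modelled here. The names `overlap_length`, `overlap_graph` and `min_len`, and the toy reads, are hypothetical.

```python
def overlap_length(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no exact overlap of at least `min_len` characters exists.
    `min_len` must be at least 1.
    """
    if min_len < 1:
        raise ValueError("min_len must be >= 1")
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def overlap_graph(reads, min_len=3):
    """Directed overlap graph: edge (i, j) -> overlap length of read i into read j."""
    edges = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = overlap_length(a, b, min_len)
                if k:
                    edges[(i, j)] = k
    return edges


# Hypothetical toy reads: the suffix "TAC" of read 0 is a prefix of read 1.
reads = ["ACGTAC", "TACGGA"]
graph = overlap_graph(reads)
```

This all-against-all comparison is quadratic in the number of reads, which is acceptable for a toy example; practical assemblers index reads to avoid it.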
13.
PLoS Comput Biol ; 16(2): e1007624, 2020 02.
Article in English | MEDLINE | ID: mdl-32012150

ABSTRACT

Interactions between proteins and nucleic acids are at the heart of many essential biological processes. Despite increasing structural information about how these interactions may take place, our understanding of the usage made of protein surfaces by nucleic acids is still very limited. This is in part due to the inherent complexity associated with protein surface deformability and evolution. In this work, we present a method that helps decipher such complexity by predicting protein-DNA interfaces and characterizing their properties. It relies on three biologically and physically meaningful descriptors, namely evolutionary conservation, physico-chemical properties and surface geometry. We carefully assessed its performance on several hundreds of protein structures and compared it to several machine-learning state-of-the-art methods. Our approach achieves a higher sensitivity compared to the other methods, with a similar precision. Importantly, we show that it is able to unravel 'hidden' binding sites by applying it to unbound protein structures and to proteins binding to DNA via multiple sites and in different conformations. It is also applicable to the detection of RNA-binding sites, without significant loss of performance. This confirms that DNA and RNA-binding sites share similar properties. Our method is implemented as a fully automated tool, JET2DNA, freely accessible at: http://www.lcqb.upmc.fr/JET2DNA. We also provide a new dataset of 187 protein-DNA complex structures, along with a subset of 82 associated unbound structures. The set represents the largest body of high-resolution crystallographic structures of protein-DNA complexes, uses biological protein assemblies as DNA-binding units, and covers all major types of protein-DNA interactions. It is available at: http://www.lcqb.upmc.fr/PDNAbenchmarks.


Subject(s)
Biological Evolution , DNA-Binding Proteins/metabolism , DNA/metabolism , Proteins/metabolism , Algorithms , Machine Learning
14.
BMC Bioinformatics ; 21(Suppl 19): 573, 2020 Dec 21.
Article in English | MEDLINE | ID: mdl-33349244

ABSTRACT

BACKGROUND: Coiled-coils are described as stable structural motifs, where two or more helices wind around each other. However, coiled-coils are associated with local mobility and intrinsic disorder. Intrinsically disordered regions in proteins are characterized by lack of stable secondary and tertiary structure under physiological conditions in vitro. They are increasingly recognized as important for protein function. However, characterizing their behaviour in solution and determining precisely the extent of disorder of a protein region remains challenging, both experimentally and computationally. RESULTS: In this work, we propose a computational framework to quantify the extent of disorder within a coiled-coil in solution and to help design substitutions modulating such disorder. Our method relies on the analysis of conformational ensembles generated by relatively short all-atom Molecular Dynamics (MD) simulations. We apply it to the phosphoprotein multimerisation domains (PMD) of Measles virus (MeV) and Nipah virus (NiV), both forming tetrameric left-handed coiled-coils. We show that our method can help quantify the extent of disorder of the C-terminus region of MeV and NiV PMDs from MD simulations of a few tens of nanoseconds, and without requiring an extensive exploration of the conformational space. Moreover, this study provided a conceptual framework for the rational design of substitutions aimed at modulating the stability of the coiled-coils. By assessing the impact of four substitutions known to destabilize coiled-coils, we derive a set of rules to control MeV PMD structural stability and cohesiveness. We therefore design two contrasting substitutions, one increasing the stability of the tetramer and the other increasing its flexibility. CONCLUSIONS: Our method can be considered as a platform to reason about how to design substitutions aimed at regulating flexibility and stability.


Subject(s)
Computational Biology/methods , Viral Proteins/chemistry , Amino Acid Sequence , Measles virus/metabolism , Molecular Dynamics Simulation , Nipah Virus/metabolism , Protein Domains , Protein Stability , Protein Secondary Structure , Viral Proteins/metabolism
15.
Mol Biol Evol ; 36(11): 2604-2619, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31406981

ABSTRACT

The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME) (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it overall performs similarly to or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, highly conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at www.lcqb.upmc.fr/GEMME/.

16.
PLoS Pathog ; 14(3): e1006908, 2018 03.
Article in English | MEDLINE | ID: mdl-29505618

ABSTRACT

Amino-acid coevolution refers to mutational compensatory patterns preserving the function of a protein. Viral envelope glycoproteins, which mediate entry of enveloped viruses into their host cells, are shaped by coevolution signals that confer to viruses the plasticity to evade neutralizing antibodies without altering viral entry mechanisms. The functions and structures of the two envelope glycoproteins of the Hepatitis C Virus (HCV), E1 and E2, are poorly described. In particular, how these two proteins mediate the HCV fusion process between the viral and the cell membrane remains elusive. Here, as a proof of concept, we aimed to take advantage of an original coevolution method recently developed to shed light on the HCV fusion mechanism. When first applied to the well-characterized Dengue Virus (DENV) envelope glycoproteins, coevolution analysis was able to predict important structural features and rearrangements of these viral protein complexes. When applied to HCV E1E2, computational coevolution analysis predicted that E1 and E2 refold interdependently during fusion through rearrangements of the E2 Back Layer (BL). Consistently, a soluble BL-derived polypeptide inhibited HCV infection of hepatoma cell lines, primary human hepatocytes and humanized liver mice. We showed that this polypeptide specifically inhibited HCV fusogenic rearrangements, hence supporting the critical role of this domain during HCV fusion. By combining coevolution analysis and in vitro assays, we also uncovered functionally significant coevolving signals between E1 and E2 BL/Stem regions that govern HCV fusion, demonstrating the accuracy of our coevolution predictions. Altogether, our work sheds light on important structural features of the HCV fusion mechanism and helps advance our functional understanding of this process. This study also provides an important proof of concept that coevolution can be employed to explore viral protein-mediated processes, and can guide the development of innovative translational strategies against challenging human-tropic viruses.


Subject(s)
Molecular Evolution , Hepacivirus/physiology , Viral Envelope Proteins/metabolism , Virus Internalization , Animals , Hepatocellular Carcinoma/metabolism , Hepatocellular Carcinoma/pathology , Hepatocellular Carcinoma/virology , Hepatitis C/metabolism , Hepatitis C/pathology , Hepatitis C/virology , Humans , Liver Neoplasms/metabolism , Liver Neoplasms/pathology , Liver Neoplasms/virology , Mice , Inbred C57BL Mice , Protein Binding , Cultured Tumor Cells , Viral Envelope Proteins/chemistry , Viral Envelope Proteins/genetics , Virus Replication
17.
Proteins ; 87(11): 952-965, 2019 11.
Article in English | MEDLINE | ID: mdl-31199528

ABSTRACT

The growing body of experimental and computational data describing how proteins interact with each other has emphasized the multiplicity of protein interactions and the complexity underlying protein surface usage and deformability. In this work, we propose new concepts and methods toward deciphering such complexity. We introduce the notion of interacting region to account for the multiple usage of a protein's surface residues by several partners and for the variability of protein interfaces coming from molecular flexibility. We predict interacting patches by crossing evolutionary, physicochemical and geometrical properties of the protein surface with information coming from complete cross-docking (CC-D) simulations. We show that our predictions match interacting regions well and that the different sources of information are complementary. We further propose an indicator of whether a protein has a few or many partners. Our prediction strategies are implemented in the dynJET2 algorithm and assessed on a new dataset of 262 proteins on which we performed CC-D. The code and the data are available at: http://www.lcqb.upmc.fr/dynJET2/.


Subject(s)
Proteins/metabolism , Algorithms , Animals , Binding Sites , Humans , Molecular Docking Simulation , Protein Binding , Protein Conformation , Protein Interaction Domains and Motifs , Protein Interaction Mapping/methods , Protein Interaction Maps , Proteins/chemistry , Software
18.
Proteins ; 87(12): 1200-1221, 2019 12.
Article in English | MEDLINE | ID: mdl-31612567

ABSTRACT

We present the results for CAPRI Round 46, the third joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo-oligomers and 6 heterocomplexes. Eight of the homo-oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and two higher-order assemblies. These were more difficult to model, as their prediction mainly involved "ab-initio" docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictor groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance "gap" was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template-based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.
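The performance measure named in this abstract, the fraction of models of acceptable quality or higher, can be illustrated with a minimal sketch. It assumes each submitted model already carries one of the four CAPRI quality categories (incorrect, acceptable, medium, high); the label strings and the function name `fraction_at_least` are illustrative choices, not part of the CAPRI assessment software.

```python
# CAPRI quality categories, ordered from worst to best.
# The exact label strings are an assumption made for this sketch.
QUALITY_RANK = {"incorrect": 0, "acceptable": 1, "medium": 2, "high": 3}


def fraction_at_least(labels, threshold="acceptable"):
    """Return the fraction of models whose quality is at or above `threshold`."""
    if not labels:
        return 0.0  # no submissions for this target
    cutoff = QUALITY_RANK[threshold]
    hits = sum(1 for label in labels if QUALITY_RANK[label] >= cutoff)
    return hits / len(labels)
```

Pooling the labels of all models submitted for one target and calling `fraction_at_least(labels)` gives a per-target value; passing `threshold="medium"` restricts the count to medium and high quality models.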


Subject(s)
Computational Biology , Protein Conformation , Proteins/ultrastructure , Software , Algorithms , Binding Sites/genetics , Protein Databases , Molecular Models , Protein Binding/genetics , Protein Interaction Mapping , Proteins/chemistry , Proteins/genetics , Protein Structural Homology
19.
Genome Res ; 26(7): 918-32, 2016 07.
Article in English | MEDLINE | ID: mdl-27247244

ABSTRACT

Reconstructing genome history is complex but necessary to reveal quantitative principles governing genome evolution. Such reconstruction requires recapitulating into a single evolutionary framework the evolution of genome architecture and gene repertoire. Here, we reconstructed the genome history of the genus Lachancea that appeared to cover a continuous evolutionary range from closely related to more diverged yeast species. Our approach integrated the generation of a high-quality genome data set; the development of AnChro, a new algorithm for reconstructing ancestral genome architecture; and a comprehensive analysis of gene repertoire evolution. We found that the ancestral genome of the genus Lachancea contained eight chromosomes and about 5173 protein-coding genes. Moreover, we characterized 24 horizontal gene transfers and 159 putative gene creation events that punctuated species diversification. We retraced all chromosomal rearrangements, including gene losses, gene duplications, chromosomal inversions and translocations at single gene resolution. Gene duplications outnumbered losses and balanced rearrangements with 1503, 929, and 423 events, respectively. Gene content variations between extant species are mainly driven by differential gene losses, while gene duplications remained globally constant in all lineages. Remarkably, we discovered that balanced chromosomal rearrangements could be responsible for up to 14% of all gene losses by disrupting genes at their breakpoints. Finally, we found that nonsynonymous substitutions reached fixation at a coordinated pace with chromosomal inversions, translocations, and duplications, but not deletions. Overall, we provide a granular view of genome evolution within an entire eukaryotic genus, linking gene content, chromosome rearrangements, and protein divergence into a single evolutionary framework.


Subject(s)
Ascomycota/genetics , Fungal Chromosomes/genetics , Molecular Evolution , Gene Rearrangement , Fungal Genome , Genetic Models , Phylogeny
20.
Bioinformatics ; 34(3): 459-468, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29028884

ABSTRACT

Motivation: Large-scale computational docking will be increasingly used in future years to discriminate protein-protein interactions at the residue resolution. Complete cross-docking experiments make in silico reconstruction of protein-protein interaction networks a feasible goal. They call for efficient and accurate screening of the millions of structural conformations produced by the calculations. Results: We propose CIPS (Combined Interface Propensity for decoy Scoring), a new pair potential combining interface composition with residue-residue contact preference. CIPS outperforms several other methods on screening docking solutions obtained either with all-atom or with coarse-grain rigid docking. Further testing on 28 CAPRI targets corroborates CIPS's predictive power over existing methods. By combining CIPS with atomic potentials, discrimination of correct conformations in all-atom structures reaches optimal accuracy. The drastic reduction of candidate solutions produced by thousands of proteins docked against each other makes large-scale docking accessible to analysis. Availability and implementation: CIPS source code is freely available at http://www.lcqb.upmc.fr/CIPS. Contact: alessandra.carbone@lip6.fr. Supplementary information: Supplementary data are available at Bioinformatics online.
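The general idea of a residue-residue pair potential for decoy scoring can be sketched as follows. This is a generic log-odds contact propensity, not the CIPS formula, which the abstract does not give; the function names, the pseudocount, and the choice of reference state (pair frequencies expected from residue composition in the observed contacts) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement


def contact_log_odds(contacts, pseudocount=1.0):
    """Log-odds propensity for each unordered residue-type pair.

    `contacts` is an iterable of (residue_a, residue_b) interface contacts
    taken from a set of reference complexes (hypothetical input format).
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)

    # Composition: how often each residue type takes part in a contact.
    residue_counts = Counter()
    for (a, b), n in pair_counts.items():
        residue_counts[a] += n
        residue_counts[b] += n

    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    types = sorted(residue_counts)
    n_pair_types = len(types) * (len(types) + 1) // 2

    scores = {}
    for a, b in combinations_with_replacement(types, 2):
        # Smoothed observed frequency of the pair.
        observed = (pair_counts[(a, b)] + pseudocount) / (
            total_pairs + pseudocount * n_pair_types
        )
        # Frequency expected if residues paired at random given composition.
        fa = residue_counts[a] / total_residues
        fb = residue_counts[b] / total_residues
        expected = fa * fb if a == b else 2 * fa * fb
        scores[(a, b)] = math.log(observed / expected)
    return scores


def score_decoy(decoy_contacts, scores):
    """Sum pair propensities over a decoy's contacts; unseen pairs score 0."""
    return sum(scores.get(tuple(sorted(pair)), 0.0) for pair in decoy_contacts)
```

Under this sketch a higher total marks a decoy whose interface contacts resemble those of the reference complexes, so candidate docking solutions can be ranked by `score_decoy` and the lowest-scoring ones discarded.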


Subject(s)
Molecular Docking Simulation , Protein Interaction Mapping/methods , Software , Computational Biology/methods , Sensitivity and Specificity