Results 1 - 11 of 11
1.
Cell ; 167(1): 158-170.e12, 2016 Sep 22.
Article in English | MEDLINE | ID: mdl-27662088

ABSTRACT

Protein flexibility ranges from simple hinge movements to functional disorder. Around half of all human proteins contain apparently disordered regions with little 3D or functional information, and many of these proteins are associated with disease. Building on the evolutionary couplings approach previously successful in predicting 3D states of ordered proteins and RNA, we developed a method to predict the potential for ordered states for all apparently disordered proteins with sufficiently rich evolutionary information. The approach is highly accurate (79%) for residue interactions as tested in more than 60 known disordered regions captured in a bound or specific condition. Assessing the potential for structure of more than 1,000 apparently disordered regions of human proteins reveals a continuum of structural order with at least 50% with clear propensity for three- or two-dimensional states. Co-evolutionary constraints reveal hitherto unseen structures of functional importance in apparently disordered proteins.


Subject(s)
Intrinsically Disordered Proteins/chemistry; Directed Molecular Evolution/methods; Genomics; Humans; Intrinsically Disordered Proteins/genetics; Protein Structure, Secondary; Protein Structure, Tertiary; Proteome/chemistry; Proteome/genetics
2.
Mol Biol Evol ; 37(4): 1179-1192, 2020 04 01.
Article in English | MEDLINE | ID: mdl-31670785

ABSTRACT

Protein structure is tightly intertwined with function according to the laws of evolution. Understanding how structure determines function has been the aim of structural biology for decades. Here, we have wondered instead whether it is possible to exploit the function for which a protein was evolutionarily selected to gain information on protein structure and on the landscape explored during the early stages of molecular and natural evolution. To answer this question, we developed a new methodology, which we named CAMELS (Coupling Analysis by Molecular Evolution Library Sequencing), that is able to obtain the in vitro evolution of a protein from an artificial selection based on function. We were able to observe with CAMELS many features of the TEM-1 beta-lactamase local fold exclusively by generating and sequencing large libraries of mutational variants. We demonstrated that we can, whenever a functional phenotypic selection of a protein is available, sketch the structural and evolutionary landscape of a protein without utilizing purified proteins, collecting physical measurements, or relying on the pool of natural protein variants.


Subject(s)
Directed Molecular Evolution/methods; Structure-Activity Relationship; beta-Lactamases/genetics; Protein Folding; Sequence Analysis, DNA
3.
Proteins ; 86(10): 1064-1074, 2018 10.
Article in English | MEDLINE | ID: mdl-30020551

ABSTRACT

Binding small ligands such as ions or macromolecules such as DNA, RNA, and other proteins is one important aspect of the molecular function of proteins. Many binding sites remain without experimental annotations. Predicting binding sites on a per-residue level is challenging, but if 3D structures are known, information about coevolving residue pairs (evolutionary couplings) can predict catalytic residues through mutual information. Here, we predicted protein binding sites from evolutionary couplings derived from a global statistical model using maximum entropy. Additionally, we included information from sequence variation. A simple method using a weighted sum over eight scores substantially outperformed random (F1 = 19.3% ± 0.7% vs F1 = 2% for random). Training a neural network on these eight scores (along with predicted solvent accessibility and conservation in protein families) improved performance substantially (F1 = 26.2% ± 0.8%). Although the machine learning was limited by the small data set and possibly wrong annotations of binding sites, the predicted binding sites formed spatial clusters in the protein. The source code of the binding site predictions is available through GitHub: https://github.com/Rostlab/bindPredict.
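
As a rough illustration of the evaluation described in this abstract, the sketch below combines per-residue scores with a weighted sum, thresholds the result, and computes F1 against annotations. It is not the bindPredict implementation: the function names, the toy scores (two per residue instead of eight), the weights, and the threshold are all hypothetical.

```python
from typing import List, Sequence


def weighted_sum(score_rows: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Combine each residue's score vector into a single value."""
    return [sum(w * s for w, s in zip(weights, row)) for row in score_rows]


def f1_score(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """F1 is the harmonic mean of precision and recall over residues."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): four residues, two scores each instead of eight.
toy_scores = [[0.9, 0.7], [0.2, 0.1], [0.6, 0.8], [0.1, 0.9]]
toy_weights = [0.5, 0.5]                # hypothetical weights
threshold = 0.6                         # hypothetical decision threshold
annotated = [True, False, False, True]  # hypothetical binding annotations

combined = weighted_sum(toy_scores, toy_weights)
predicted = [value > threshold for value in combined]
print(f1_score(predicted, annotated))
```

With one true positive, one false positive, and one false negative, precision and recall are both 0.5 in this toy case, so F1 is 0.5.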


Subject(s)
Evolution, Molecular; Proteins/chemistry; Binding Sites; Biological Evolution; DNA/metabolism; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Databases, Protein; Entropy; Genetic Variation; Humans; Machine Learning; Models, Biological; Models, Molecular; Neural Networks, Computer; Protein Binding; Proteins/genetics; Proteins/metabolism
4.
Adv Exp Med Biol ; 1105: 153-169, 2018.
Article in English | MEDLINE | ID: mdl-30617828

ABSTRACT

While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins is more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data that can be obtained for larger proteins with sequence co-variation data, called evolutionary couplings (ECs), obtained from multiple sequence alignments of protein families. This hybrid "EC-NMR" method can be used to accurately model larger (15-60 kDa) proteins, and more rapidly determine structures of smaller (5-15 kDa) proteins using only backbone NMR data. The resulting structures have accuracies relative to reference structures comparable to those obtained with full backbone and sidechain NMR resonance assignments. The requirement that evolutionary couplings (ECs) are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, potentially also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.


Subject(s)
Evolution, Molecular; Nuclear Magnetic Resonance, Biomolecular; Protein Conformation; Proteins/chemistry; Sequence Alignment
5.
Proc Natl Acad Sci U S A ; 112(17): 5413-8, 2015 Apr 28.
Article in English | MEDLINE | ID: mdl-25858953

ABSTRACT

Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand-strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases.
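
For readers unfamiliar with the TM score quoted in this abstract, the sketch below evaluates the standard TM-score expression for one fixed superposition, given per-residue distances between model and reference. It is a simplification: the real TM score is the maximum over superpositions, which is not searched here. The function names and the 0.5 Å floor on the distance scale for short chains are assumptions of this sketch, and nothing here is taken from the authors' pipeline.

```python
from typing import Sequence


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms)."""
    if target_length <= 15:
        return 0.5  # floor for very short chains (assumption of this sketch)
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances: Sequence[float],
                                 target_length: int) -> float:
    """TM-score term for ONE given superposition.

    `distances` holds the distance (angstroms) between each aligned
    residue pair; unaligned residues simply contribute nothing.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Hypothetical example: a 100-residue target with every residue 2 angstroms off.
print(tm_score_fixed_superposition([2.0] * 100, 100))
```

A perfect model (all distances zero) scores 1.0, and a model whose every residue sits exactly d0 away scores 0.5, which gives a feel for the 0.36 to 0.85 range reported above.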


Subject(s)
Artificial Intelligence; Escherichia coli Proteins/chemistry; Escherichia coli/chemistry; Protein Structure, Secondary; Receptors, Cell Surface/chemistry; Sequence Analysis, Protein/methods; Escherichia coli/genetics; Escherichia coli Proteins/genetics; Models, Molecular; Protein Structure, Tertiary; Receptors, Cell Surface/genetics
6.
Int J Mol Sci ; 19(11), 2018 Oct 25.
Article in English | MEDLINE | ID: mdl-30366362

ABSTRACT

Although improved strategies for the detection and analysis of evolutionary couplings (ECs) between protein residues already enable the prediction of protein structures and interactions, they are mostly restricted to conserved and well-folded proteins. Whereas intrinsically disordered proteins (IDPs) are central to cellular interaction networks, due to the lack of strict structural constraints, they undergo faster evolutionary changes than folded domains. This makes the reliable identification and alignment of IDP homologs difficult, which led to IDPs being omitted in most large-scale residue co-variation analyses. By performing a dedicated analysis of phylogenetically widespread bacterial IDP-partner interactions, here we demonstrate that partner binding imposes constraints on IDP sequences that manifest in detectable interprotein ECs. These ECs were not detected for interactions mediated by short motifs, but rather for those with larger IDP-partner interfaces. Most identified coupled residue pairs reside close (<10 Å) to each other on the interface, with a third of them forming multiple direct atomic contacts. EC-carrying interfaces of IDPs are enriched in negatively charged residues, and the EC residues of both IDPs and partners preferentially reside in helices. Our analysis brings hope that IDP-partner interactions that are difficult to study could soon be successfully dissected through residue co-variation analysis.


Subject(s)
Evolution, Molecular; Intrinsically Disordered Proteins/metabolism; Animals; Humans; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Protein Binding; Protein Folding
7.
Protein Sci ; 30(5): 1006-1021, 2021 05.
Article in English | MEDLINE | ID: mdl-33759266

ABSTRACT

Electron microscopy (EM) continues to provide near-atomic resolution structures for well-behaved proteins and protein complexes. Unfortunately, structures of some complexes are limited to low to medium resolution due to biochemical or conformational heterogeneity. Thus, the application of unbiased systematic methods for fitting individual structures into EM maps is important. A method that employs co-evolutionary information obtained solely from sequence data could prove invaluable for quick, confident localization of subunits within these structures. Here, we incorporate the co-evolution of intermolecular amino acids as a new type of distance restraint in the integrative modeling platform in order to build three-dimensional models of atomic structures into EM maps ranging from 10 to 14 Å in resolution. We validate this method using four complexes of known structure, where we highlight the conservation of intermolecular couplings despite dynamic conformational changes using the BAM complex. Finally, we use this method to assemble the subunits of the bacterial holo-translocon into a model that agrees with previous biochemical data. The use of evolutionary couplings in integrative modeling improves systematic, unbiased fitting of atomic models into medium- to low-resolution EM maps, providing additional information to integrative models lacking in spatial data.


Subject(s)
Cryoelectron Microscopy; Models, Molecular; Proteins; Proteins/chemistry; Proteins/ultrastructure
8.
Cell Syst ; 10(1): 15-24.e5, 2020 01 22.
Article in English | MEDLINE | ID: mdl-31838147

ABSTRACT

Natural evolution encodes rich information about the structure and function of biomolecules in the genetic record. Previously, statistical analysis of co-variation patterns in natural protein families has enabled the accurate computation of 3D structures. Here, we explored generating similar information by experimental evolution, starting from a single gene and performing multiple cycles of in vitro mutagenesis and functional selection in Escherichia coli. We evolved two antibiotic resistance proteins, β-lactamase PSE1 and acetyltransferase AAC6, and obtained hundreds of thousands of diverse functional sequences. Using evolutionary coupling analysis, we inferred residue interaction constraints that were in agreement with contacts in known 3D structures, confirming genetic encoding of structural constraints in the selected sequences. Computational protein folding with interaction constraints then yielded 3D structures with the same fold as natural relatives. This work lays the foundation for a new experimental method (3Dseq) for protein structure determination, combining evolution experiments with inference of residue interactions from sequence information. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
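
To make the idea of co-variation between alignment positions concrete, the sketch below computes mutual information between two columns of a toy alignment. Mutual information is a simple pairwise statistic used here only for illustration; it is not the global evolutionary coupling model the abstract refers to. The function name and the toy sequences are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def mutual_information(alignment: Sequence[str], i: int, j: int) -> float:
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), joint in count_ij.items():
        p_ab = joint / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (hypothetical): columns 0 and 1 change together,
# column 2 is fully conserved.
toy_alignment = ["AKL", "AKL", "GEL", "GEL"]
print(mutual_information(toy_alignment, 0, 1))  # 1.0 bit: perfectly coupled
print(mutual_information(toy_alignment, 0, 2))  # 0.0 bits: conserved column
```

A conserved column carries no co-variation signal at all, which is one reason why sequence diversity matters for this kind of analysis.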


Subject(s)
Evolution, Molecular; Proteins/chemistry; Humans; Protein Conformation
9.
PeerJ ; 7: e7280, 2019.
Article in English | MEDLINE | ID: mdl-31328041

ABSTRACT

Patterns of amino acid covariation in large protein sequence alignments can inform the prediction of de novo protein structures, binding interfaces, and mutational effects. While algorithms that detect these so-called evolutionary couplings between residues have proven useful for practical applications, less is known about how and why these methods perform so well, and what insights into biological processes can be gained from their application. Evolutionary coupling algorithms are commonly benchmarked by comparison to true structural contacts derived from solved protein structures. However, the methods used to determine true structural contacts are not standardized and different definitions of structural contacts may have important consequences for interpreting the results from evolutionary coupling analyses and understanding their overall utility. Here, we show that evolutionary coupling analyses are significantly more likely to identify structural contacts between side-chain atoms than between backbone atoms. We use both simulations and empirical analyses to highlight that purely backbone-based definitions of true residue-residue contacts (i.e., based on the distance between Cα atoms) may underestimate the accuracy of evolutionary coupling algorithms by as much as 40% and that a commonly used reference point (Cβ atoms) underestimates the accuracy by 10-15%. These findings show that co-evolutionary outcomes differ according to which atoms participate in residue-residue interactions and suggest that accounting for different interaction types may lead to further improvements to contact-prediction methods.
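
The effect of the contact definition can be sketched with a toy residue pair whose backbones are far apart while the side chains point toward each other. The code below is an illustration only, not the authors' analysis: the coordinates, the helper names, and the single 8 Å cutoff applied to all three definitions are hypothetical choices.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]
Residue = Dict[str, Coord]  # atom name -> coordinates in angstroms

BACKBONE = {"N", "CA", "C", "O"}


def ca_distance(res_a: Residue, res_b: Residue) -> float:
    """Distance between the two C-alpha atoms."""
    return math.dist(res_a["CA"], res_b["CA"])


def cb_distance(res_a: Residue, res_b: Residue) -> float:
    """Distance between C-beta atoms, falling back to C-alpha (glycine)."""
    return math.dist(res_a.get("CB", res_a["CA"]), res_b.get("CB", res_b["CA"]))


def side_chain_atoms(residue: Residue) -> List[Coord]:
    """All non-backbone atoms, or the C-alpha if there is no side chain."""
    atoms = [xyz for name, xyz in residue.items() if name not in BACKBONE]
    return atoms or [residue["CA"]]


def min_side_chain_distance(res_a: Residue, res_b: Residue) -> float:
    """Smallest distance between any pair of side-chain atoms."""
    return min(math.dist(a, b)
               for a in side_chain_atoms(res_a)
               for b in side_chain_atoms(res_b))


CUTOFF = 8.0  # hypothetical cutoff in angstroms, same for every definition

# Toy residues (hypothetical coordinates): side chains face each other.
res_a = {"CA": (0.0, 0.0, 0.0), "CB": (1.5, 0.0, 0.0), "CG": (3.0, 0.0, 0.0)}
res_b = {"CA": (10.0, 0.0, 0.0), "CB": (8.5, 0.0, 0.0), "CG": (7.0, 0.0, 0.0)}

for label, dist in [("C-alpha", ca_distance(res_a, res_b)),
                    ("C-beta", cb_distance(res_a, res_b)),
                    ("side chain", min_side_chain_distance(res_a, res_b))]:
    print(label, dist, dist < CUTOFF)
```

In this toy pair the Cα definition reports no contact (10 Å), whereas the Cβ (7 Å) and side-chain (4 Å) definitions do, so a correct coupling between these residues would be scored as a false positive under the backbone-only definition.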

10.
Methods Enzymol ; 614: 363-392, 2019.
Article in English | MEDLINE | ID: mdl-30611430

ABSTRACT

Accurate protein structure determination by solution-state NMR is challenging for proteins greater than about 20 kDa, for which extensive perdeuteration is generally required, providing experimental data that are incomplete (sparse) and ambiguous. However, the massive increase in evolutionary sequence information coupled with advances in methods for sequence covariance analysis can provide reliable residue-residue contact information for a protein from sequence data alone. These "evolutionary couplings (ECs)" can be combined with sparse NMR data to determine accurate 3D protein structures. This hybrid "EC-NMR" method has been developed using NMR data for several soluble proteins and validated by comparison with corresponding reference structures determined by X-ray crystallography and/or conventional NMR methods. For small proteins, only backbone resonance assignments are utilized, while for larger proteins both backbone and some sidechain methyl resonance assignments are generally required. ECs can be combined with sparse NMR data obtained on deuterated, selectively protonated protein samples to provide structures that are more accurate and complete than those obtained using such sparse NMR data alone. EC-NMR also has significant potential for analysis of protein structures from solid-state NMR data and for studies of integral membrane proteins. The requirement that ECs are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.


Subject(s)
Algorithms; Escherichia coli Proteins/chemistry; Escherichia coli/chemistry; Evolution, Molecular; Nuclear Magnetic Resonance, Biomolecular/methods; Periplasmic Binding Proteins/chemistry; Software; Analysis of Variance; Binding Sites; Crystallography, X-Ray; Databases, Protein; Deuterium/chemistry; Escherichia coli/metabolism; Escherichia coli Proteins/metabolism; Humans; Isotope Labeling; Models, Molecular; Periplasmic Binding Proteins/metabolism; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Interaction Domains and Motifs; Structural Homology, Protein; Thermodynamics
11.
Cell Syst ; 6(1): 65-74.e3, 2018 Jan 24.
Article in English | MEDLINE | ID: mdl-29275173

ABSTRACT

While genes are defined by sequence, in biological systems a protein's function is largely determined by its three-dimensional structure. Evolutionary information embedded within multiple sequence alignments provides a rich source of data for inferring structural constraints on macromolecules. Still, many proteins of interest lack sufficient numbers of related sequences, leading to noisy, error-prone residue-residue contact predictions. Here we introduce DeepContact, a convolutional neural network (CNN)-based approach that discovers co-evolutionary motifs and leverages these patterns to enable accurate inference of contact probabilities, particularly when few related sequences are available. DeepContact significantly improves performance over previous methods, including in the CASP12 blind contact prediction task where we achieved top performance with another CNN-based approach. Moreover, our tool converts hard-to-interpret coupling scores into probabilities, moving the field toward a consistent metric to assess contact prediction across diverse proteins. Through substantially improving the precision-recall behavior of contact prediction, DeepContact suggests we are near a paradigm shift in template-free modeling for protein structure prediction.


Subject(s)
Computational Biology/methods; Forecasting/methods; Neural Networks, Computer; Proteins/chemistry; Algorithms; Animals; Databases, Protein; Humans; Machine Learning; Models, Molecular; Probability; Protein Conformation; Protein Folding; Sequence Alignment/methods