Results 1 - 20 of 53
1.
Proc Natl Acad Sci U S A ; 121(21): e2318905121, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38739787

ABSTRACT

We propose that spontaneous folding and molecular evolution of biopolymers are two universal aspects that must concur for life to happen. These aspects are fundamentally related to the chemical composition of biopolymers and crucially depend on the solvent in which they are embedded. We show that molecular information theory and energy landscape theory allow us to explore the limits that solvents impose on biopolymer existence. We consider 54 solvents, including water, alcohols, hydrocarbons, halogenated solvents, aromatic solvents, and low molecular weight substances made up of elements abundant in the universe, which may potentially take part in alternative biochemistries. We find that along with water, there are many solvents for which the liquid regime is compatible with biopolymer folding and evolution. We present a ranking of the solvents in terms of biopolymer compatibility. Many of these solvents have been found in molecular clouds or may be expected to occur in extrasolar planets.


Subject(s)
Solvents, Biopolymers/chemistry, Solvents/chemistry, Extraterrestrial Environment/chemistry, Molecular Evolution, Water/chemistry
2.
Proc Natl Acad Sci U S A ; 121(28): e2400151121, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38954548

ABSTRACT

Protein folding and evolution are intimately linked phenomena. Here, we revisit the concept of exons as potential protein folding modules across a set of 38 abundant and conserved protein families. Taking advantage of genomic exon-intron organization and extensive protein sequence data, we explore exon boundary conservation and assess the foldon-like behavior of exons using energy landscape theoretic measurements. We found deviations in the exon size distribution from exponential decay indicating selection in evolution. We show that when taken together there is a pronounced tendency to independent foldability for segments corresponding to the more conserved exons, supporting the idea of exon-foldon correspondence. While 45% of the families follow this general trend when analyzed individually, there are some families for which other stronger functional determinants, such as preserving frustrated active sites, may be acting. We further develop a systematic partitioning of protein domains using exon boundary hotspots, showing that minimal common exons correspond with uninterrupted alpha and/or beta elements for the majority of the families but not for all of them.


Subject(s)
Exons, Protein Folding, Exons/genetics, Humans, Proteins/genetics, Proteins/chemistry, Molecular Evolution, Introns/genetics
3.
Nucleic Acids Res ; 52(W1): W233-W237, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38587198

ABSTRACT

According to the Principle of Minimal Frustration, folded proteins can only have a minimal number of strong energetic conflicts in their native states. However, not all interactions are energetically optimized for folding but some remain in energetic conflict, i.e. they are highly frustrated. This remaining local energetic frustration has been shown to be statistically correlated with distinct functional aspects such as protein-protein interaction sites, allosterism and catalysis. Fuelled by the recent breakthroughs in efficient protein structure prediction that have made available good quality models for most proteins, we have developed a strategy to calculate local energetic frustration within large protein families and quantify its conservation over evolutionary time. Based on this evolutionary information we can identify how stability and functional constraints have appeared at the common ancestor of the family and have been maintained over the course of evolution. Here, we present FrustraEvo, a web server tool to calculate and quantify the conservation of local energetic frustration in protein families.


Subject(s)
Internet, Protein Folding, Proteins, Software, Proteins/chemistry, Thermodynamics, Protein Conformation, Molecular Evolution, Molecular Models
4.
Proc Natl Acad Sci U S A ; 119(31): e2204131119, 2022 Aug 02.
Article in English | MEDLINE | ID: mdl-35905321

ABSTRACT

Repeat proteins are made with tandem copies of similar amino acid stretches that fold into elongated architectures. These proteins constitute excellent model systems to investigate how evolution relates to structure, folding, and function. Here, we propose a scheme to map evolutionary information at the sequence level to a coarse-grained model for repeat-protein folding and use it to investigate the folding of thousands of repeat proteins. We model the energetics by a combination of an inverse Potts-model scheme with an explicit mechanistic model of duplications and deletions of repeats to calculate the evolutionary parameters of the system at the single-residue level. These parameters are used to inform an Ising-like model that allows for the generation of folding curves, apparent domain emergence, and occupation of intermediate states that are highly compatible with experimental data in specific case studies. We analyzed the folding of thousands of natural Ankyrin repeat proteins and found that a multiplicity of folding mechanisms is possible. Fully cooperative all-or-none transitions are obtained for arrays with enough sequence-similar elements and strong interactions between them, while noncooperative element-by-element intermittent folding arises if the elements are dissimilar and the interactions between them are energetically weak. Additionally, we characterized nucleation-propagation and multidomain folding mechanisms. We show that the global stability and cooperativity of the repeating arrays can be predicted from simple sequence scores.


Subject(s)
Ankyrin Repeat, Protein Folding, Chemical Models
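
The Ising-like description of repeat-protein folding mentioned in the abstract of entry 4 can be illustrated with a small sketch. This is not the authors' model or their evolution-derived parameters: the function name `fraction_folded`, the per-repeat energies, the neighbor couplings, and the entropy cost are all hypothetical values chosen for illustration, with temperature in units where the Boltzmann constant is 1. Each repeat is either folded or unfolded, folded neighbors gain a coupling energy, and a folding curve is obtained by exact enumeration over all states.

```python
import itertools
import math


def fraction_folded(intrinsic, couplings, temperature, entropy_cost=1.0):
    """Average fraction of folded repeats in a 1D two-state (Ising-like) array.

    intrinsic[i]  : hypothetical energy of folding repeat i (negative = favorable)
    couplings[i]  : hypothetical energy gained when repeats i and i+1 are both folded
    entropy_cost  : entropy lost per folded repeat (assumed equal for all repeats)
    temperature   : in units where k_B = 1
    """
    n = len(intrinsic)
    if len(couplings) != n - 1:
        raise ValueError("need exactly one coupling per pair of neighboring repeats")

    partition = 0.0
    folded_sum = 0.0
    # Exact enumeration: fine for short arrays (2**n states).
    for state in itertools.product((0, 1), repeat=n):
        free_energy = sum(
            s * (e + temperature * entropy_cost) for s, e in zip(state, intrinsic)
        )
        free_energy += sum(
            j * a * b for j, a, b in zip(couplings, state, state[1:])
        )
        weight = math.exp(-free_energy / temperature)
        partition += weight
        folded_sum += weight * sum(state) / n
    return folded_sum / partition


if __name__ == "__main__":
    # Hypothetical four-repeat array with strong, uniform couplings.
    energies = [-1.0] * 4
    links = [-2.0] * 3
    for t in (0.5, 1.0, 2.0, 3.0, 5.0):
        print(f"T = {t:.1f}  fraction folded = {fraction_folded(energies, links, t):.3f}")
```

With strong couplings between similar repeats the curve tends toward an all-or-none transition, whereas weakening one coupling in `links` lets the array split into separately folding segments, which is the qualitative behavior the abstract describes.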
5.
Bioinformatics ; 37(18): 3038-3040, 2021 Sep 29.
Article in English | MEDLINE | ID: mdl-33720293

ABSTRACT

SUMMARY: Once folded, natural protein molecules have few energetic conflicts within their polypeptide chains. Many protein structures do however contain regions where energetic conflicts remain after folding, i.e. they are highly frustrated. These regions, kept in place over evolutionary and physiological timescales, are related to several functional aspects of natural proteins such as protein-protein interactions, small ligand recognition, catalytic sites and allostery. Here, we present FrustratometeR, an R package that easily computes local energetic frustration on a personal computer or a cluster. This package facilitates large scale analysis of local frustration, point mutants and molecular dynamics (MD) trajectories, allowing straightforward integration of local frustration analysis into pipelines for protein structural analysis. AVAILABILITY AND IMPLEMENTATION: https://github.com/proteinphysiologylab/frustratometeR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Molecular Dynamics Simulation, Proteins, Catalytic Domain, Software
6.
Acc Chem Res ; 54(5): 1251-1259, 2021 Mar 02.
Article in English | MEDLINE | ID: mdl-33550810

ABSTRACT

Are all protein interactions fully optimized? Do suboptimal interactions compromise specificity? What is the functional impact of frustration? Why does evolution not optimize some contacts? Proteins and their complexes are best described as ensembles of states populating an energy landscape. These ensembles vary in breadth from narrow ensembles clustered around a single average X-ray structure to broader ensembles encompassing a few different functional "taxonomic" states on to near continua of rapidly interconverting conformations, which are called "fuzzy" or even "intrinsically disordered". Here we aim to provide a comprehensive framework for confronting the structural and dynamical continuum of protein assemblies by combining the concepts of energetic frustration and interaction fuzziness. The diversity of the protein structural ensemble arises from the frustrated conflicts between the interactions that create the energy landscape. When frustration is minimal after folding, it results in a narrow ensemble, but residual frustrated interactions result in fuzzy ensembles, and this fuzziness allows a versatile repertoire of biological interactions. Here we discuss how fuzziness and frustration play off each other as proteins fold and assemble, viewing their significance from energetic, functional, and evolutionary perspectives. We demonstrate, in particular, that the common physical origin of both concepts is related to the ruggedness of the energy landscapes, intramolecular in the case of frustration and intermolecular in the case of fuzziness. Within this framework, we show that alternative sets of suboptimal contacts may encode specificity without achieving a single structural optimum. Thus, we demonstrate that structured complexes may not be optimized, and energetic frustration is realized via different sets of contacts leading to multiplicity of specific complexes.
Furthermore, we propose that these suboptimal, frustrated, or fuzzy interactions are under evolutionary selection and expand the biological repertoire by providing a multiplicity of biological activities. In accord, we show that non-native interactions in folding or interaction landscapes can cooperate to generate diverse functional states, which are essential to facilitate adaptation to different cellular conditions. Thus, we propose that not fully optimized structures may actually be beneficial for biological activities of proteins via an alternative set of suboptimal interactions. The importance of such variability has not been recognized across different areas of biology. This account provides a modern view on folding, function, and assembly across the protein universe. The physical framework presented here is applicable to the structure and dynamics continuum of proteins and opens up new perspectives for drug design involving not fully structured, highly dynamic protein assemblies.


Subject(s)
Proteins, X-Ray Crystallography, Molecular Models, Protein Conformation, Protein Folding, Proteins/chemistry, Proteins/metabolism
7.
Proc Natl Acad Sci U S A ; 116(10): 4037-4043, 2019 Mar 05.
Article in English | MEDLINE | ID: mdl-30765513

ABSTRACT

Conflicting biological goals often meet in the specification of protein sequences for structure and function. Overall, strong energetic conflicts are minimized in folded native states according to the principle of minimal frustration, so that a sequence can spontaneously fold, but local violations of this principle open up the possibility to encode the complex energy landscapes that are required for active biological functions. We survey the local energetic frustration patterns of all protein enzymes with known structures and experimentally annotated catalytic residues. In agreement with previous hypotheses, the catalytic sites themselves are often highly frustrated regardless of the protein oligomeric state, overall topology, and enzymatic class. At the same time a secondary shell of more weakly frustrated interactions surrounds the catalytic site itself. We evaluate the conservation of these energetic signatures in various family members of major enzyme classes, showing that local frustration is evolutionarily more conserved than the primary structure itself.


Subject(s)
Enzymes/chemistry, Molecular Models, Protein Folding, Catalytic Domain
8.
PLoS Comput Biol ; 15(8): e1007282, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31415557

ABSTRACT

The coding space of protein sequences is shaped by evolutionary constraints set by requirements of function and stability. We show that the coding space of a given protein family (the total number of sequences in that family) can be estimated using models of maximum entropy trained on multiple sequence alignments of naturally occurring amino acid sequences. We analyzed and calculated the size of three abundant repeat protein families, whose members are large proteins made of many repetitions of conserved portions of ∼30 amino acids. While amino acid conservation at each position of the alignment explains most of the reduction of diversity relative to completely random sequences, we found that correlations between amino acid usage at different positions significantly impact that diversity. We quantified the impact of different types of correlations, functional and evolutionary, on sequence diversity. Analysis of the detailed structure of the coding space of the families revealed a rugged landscape, with many local energy minima of varying sizes with a hierarchical structure, reminiscent of frustrated energy landscapes of spin glass in physics. This clustered structure indicates a multiplicity of subtypes within each family, and suggests new strategies for protein design.


Subject(s)
Proteins/chemistry, Proteins/genetics, Amino Acid Repetitive Sequences/genetics, Algorithms, Amino Acid Sequence, Computational Biology, Conserved Sequence, Entropy, Molecular Evolution, Molecular Models, Protein Conformation, Protein Folding, Sequence Alignment/statistics & numerical data, Amino Acid Sequence Homology, Thermodynamics
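
The idea in the abstract of entry 8, that a family's coding space can be sized from the entropy of a maximum-entropy model, can be sketched in its simplest form. The sketch below is an assumption-laden simplification, not the paper's method: it uses only per-position conservation (an independent-site model) and ignores the correlations between positions that the abstract reports as significant, so it can only give an upper bound on what a model with pairwise terms would yield. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Shannon entropy (in nats) of each column of a multiple sequence alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        entropies.append(
            -sum((c / total) * math.log(c / total) for c in counts.values())
        )
    return entropies


def effective_family_size(alignment):
    """exp(total entropy): effective number of sequences under an independent-site model."""
    return math.exp(sum(column_entropies(alignment)))


if __name__ == "__main__":
    # Toy alignment: first column fully conserved, second column split evenly.
    toy = ["AC", "AD", "AC", "AD"]
    print(column_entropies(toy))
    print(effective_family_size(toy))
```

For comparison, completely random sequences of length L over 20 amino acids would give 20^L, so the gap between that number and the value returned here reflects the reduction in diversity attributable to conservation alone.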
9.
PLoS Comput Biol ; 13(6): e1005584, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28617812

ABSTRACT

Natural protein sequences contain a record of their history. A common constraint in a given protein family is the ability to fold to specific structures, and it has been shown possible to infer the main native ensemble by analyzing covariations in extant sequences. Still, many natural proteins that fold into the same structural topology show different stabilization energies, and these are often related to their physiological behavior. We propose a description for the energetic variation given by sequence modifications in repeat proteins, systems for which the overall problem is simplified by their inherent symmetry. We explicitly account for single amino acid and pair-wise interactions and treat higher order correlations with a single term. We show that the resulting evolutionary field can be interpreted with structural detail. We trace the variations in the energetic scores of natural proteins and relate them to their experimental characterization. The resulting energetic evolutionary field allows the prediction of the folding free energy change for several mutants, and can be used to generate synthetic sequences that are statistically indistinguishable from the natural counterparts.


Subject(s)
Chemical Evolution, Molecular Models, Proteins/chemistry, Proteins/ultrastructure, Amino Acid Repetitive Sequences/genetics, Protein Sequence Analysis/methods, Energy Transfer, Chemical Models, Point Mutation/genetics, Protein Conformation, Protein Folding, Proteins/genetics, Structure-Activity Relationship
10.
Nucleic Acids Res ; 44(W1): W356-60, 2016 Jul 08.
Article in English | MEDLINE | ID: mdl-27131359

ABSTRACT

The protein frustratometer is an energy landscape theory-inspired algorithm that aims at localizing and quantifying the energetic frustration present in protein molecules. Frustration is a useful concept for analyzing proteins' biological behavior. It compares the energy distributions of the native state with respect to structural decoys. The network of minimally frustrated interactions encompasses the folding core of the molecule. Sites of high local frustration often correlate with functional regions such as binding sites and regions involved in allosteric transitions. We present here an upgraded version of a webserver that measures local frustration. The new implementation that allows the inclusion of electrostatic energy terms, important to the interactions with nucleic acids, is significantly faster than the previous version enabling the analysis of large macromolecular complexes within a user-friendly interface. The webserver is freely available at URL: http://frustratometer.qb.fcen.uba.ar.


Subject(s)
Algorithms, Nuclear Proteins/chemistry, Nucleic Acids/chemistry, Nucleosomes/chemistry, User-Computer Interface, Amino Acid Sequence, Computer Graphics, Humans, Internet, Molecular Dynamics Simulation, Nuclear Proteins/genetics, Nucleic Acids/genetics, Nucleosomes/genetics, Protein Folding, Protein Interaction Domains and Motifs, Protein Secondary Structure, Protein Sequence Analysis, Static Electricity, Thermodynamics
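
The comparison described in the abstract of entry 10, between the native-state energy and the energy distribution of structural decoys, can be sketched as a Z-score-like index. This is a minimal illustration, not the webserver's implementation: the energy values are made up, the function names are hypothetical, and the classification cutoffs (0.78 and -1.0) are assumed here from commonly quoted values in the local-frustration literature rather than taken from this abstract.

```python
import statistics


def frustration_index(native_energy, decoy_energies):
    """Z-score-like index comparing a native interaction with its decoys.

    Positive values mean the native energy is lower (more favorable)
    than the typical decoy; negative values mean it is less favorable.
    """
    mean_decoy = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (mean_decoy - native_energy) / spread


def classify(index, minimal_cutoff=0.78, high_cutoff=-1.0):
    """Label an interaction; the cutoffs are assumptions, not from the abstract."""
    if index > minimal_cutoff:
        return "minimally frustrated"
    if index < high_cutoff:
        return "highly frustrated"
    return "neutral"


if __name__ == "__main__":
    decoys = [-1.0, 0.0, 1.0]  # hypothetical decoy energies
    for native in (-3.0, 0.0, 2.0):
        idx = frustration_index(native, decoys)
        print(native, round(idx, 2), classify(idx))
```

In this picture, a contact whose native energy sits well below its decoy distribution belongs to the minimally frustrated network, while one that sits well above it is a candidate for the functional, highly frustrated sites the abstract mentions.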
11.
Q Rev Biophys ; 47(4): 285-363, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25225856

ABSTRACT

Biomolecules are the prime information processing elements of living matter. Most of these inanimate systems are polymers that compute their own structures and dynamics using as input seemingly random character strings of their sequence, following which they coalesce and perform integrated cellular functions. In large computational systems with finite interaction-codes, the appearance of conflicting goals is inevitable. Simple conflicting forces can lead to quite complex structures and behaviors, leading to the concept of frustration in condensed matter. We present here some basic ideas about frustration in biomolecules and how the frustration concept leads to a better appreciation of many aspects of the architecture of biomolecules, and especially how biomolecular structure connects to function by means of localized frustration. These ideas are simultaneously both seductively simple and perilously subtle to grasp completely. The energy landscape theory of protein folding provides a framework for quantifying frustration in large systems and has been implemented at many levels of description. We first review the notion of frustration from the areas of abstract logic and its uses in simple condensed matter systems. We discuss then how the frustration concept applies specifically to heteropolymers, testing folding landscape theory in computer simulations of protein models and in experimentally accessible systems. Studying the aspects of frustration averaged over many proteins provides ways to infer energy functions useful for reliable structure prediction. We discuss how frustration affects folding mechanisms. We review here how the biological functions of proteins are related to subtle local physical frustration effects and how frustration influences the appearance of metastable states, the nature of binding processes, catalysis and allosteric transitions. 
In this review, we also emphasize that frustration, far from being always a bad thing, is an essential feature of biomolecules that allows dynamics to be harnessed for function. In this way, we hope to illustrate how frustration is a fundamental concept in molecular biology.


Subject(s)
Biochemistry/methods, Biopolymers/metabolism, Biopolymers/chemistry, Humans, Magnetic Phenomena, Movement, Thermodynamics
12.
PLoS Comput Biol ; 11(12): e1004659, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26691182

ABSTRACT

Ankyrin repeat-containing proteins are one of the most abundant solenoid folds. Usually implicated in specific protein-protein interactions, these proteins are readily amenable for design, with promising biotechnological and biomedical applications. Studying repeat protein families presents technical challenges due to the high sequence divergence among the repeating units. We developed and applied a systematic method to consistently identify and annotate the structural repetitions over the members of the complete Ankyrin Repeat Protein Family, with increased sensitivity over previous studies. We statistically characterized the number of repeats, the folding of the repeat-arrays, their structural variations, insertions and deletions. An energetic analysis of the local frustration patterns reveals the basic features underlying fold stability and its relation to the functional binding regions. We found a strong linear correlation between the conservation of the energetic features in the repeat arrays and their sequence variations, and discuss new insights into the organization and function of these ubiquitous proteins.


Subject(s)
Ankyrin Repeat, Ankyrins/chemistry, Ankyrins/ultrastructure, Chemical Models, Molecular Models, Amino Acid Sequence, Computer Simulation, Energy Transfer, Molecular Sequence Data, Protein Sequence Analysis/methods
13.
BMC Bioinformatics ; 16: 207, 2015 Jul 02.
Article in English | MEDLINE | ID: mdl-26134293

ABSTRACT

BACKGROUND: The analysis of correlations of amino acid occurrences in globular domains has led to the development of statistical tools that can identify native contacts - portions of the chains that come to close distance in folded structural ensembles. Here we introduce a direct coupling analysis for repeat proteins - natural systems for which the identification of folding domains remains challenging. RESULTS: We show that the inherent translational symmetry of repeat protein sequences introduces a strong bias in the pair correlations at precisely the length scale of the repeat-unit. Equalizing for this bias in an objective way reveals true co-evolutionary signals from which local native contacts can be identified. Importantly, parameter values obtained for all other interactions are not significantly affected by the equalization. We quantify the robustness of the procedure and assign confidence levels to the interactions, identifying the minimum number of sequences needed to extract evolutionary information in several repeat protein families. CONCLUSIONS: The overall procedure can be used to reconstruct the interactions at distances larger than repeat-pairs, identifying the characteristics of the strongest couplings in each family, and can be applied to any system that appears translationally symmetric.


Subject(s)
Amino Acid Motifs, Amino Acids/chemistry, Molecular Evolution, Protein Multimerization, Proteins/chemistry, Humans, Molecular Models, Protein Folding
14.
Mol Biol Evol ; 31(11): 2905-12, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25086000

ABSTRACT

The 20 protein-coding amino acids are found in proteomes with different relative abundances. The most abundant amino acid, leucine, is nearly an order of magnitude more prevalent than the least abundant amino acid, cysteine. Amino acid metabolic costs differ similarly, constraining their incorporation into proteins. On the other hand, a diverse set of protein sequences is necessary to build functional proteomes. Here, we present a simple model for a cost-diversity trade-off postulating that natural proteomes minimize amino acid metabolic flux while maximizing sequence entropy. The model explains the relative abundances of amino acids across a diverse set of proteomes. We found that the data are remarkably well explained when the cost function accounts for amino acid chemical decay. More than 100 organisms reach comparable solutions to the trade-off by different combinations of proteome cost and sequence diversity. Quantifying the interplay between proteome size and entropy shows that proteomes can get optimally large and diverse.


Subject(s)
Amino Acids/metabolism, Genome, Biological Models, Protein Biosynthesis/genetics, Proteome/metabolism, Adenosine Triphosphate/metabolism, Amino Acid Sequence, Amino Acids/chemistry, Amino Acids/genetics, Entropy, Genomic Structural Variation, Least-Squares Analysis, Molecular Sequence Data, Proteome/chemistry, Proteome/genetics
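
The cost-diversity trade-off in the abstract of entry 14 can be made concrete with a generic sketch. Assuming the trade-off is written as maximizing sequence entropy minus a weight `beta` times the mean cost per residue (an assumed formulation; the abstract does not give the functional form), the optimum is a Boltzmann-like distribution in which abundance falls off exponentially with cost. The residue labels, cost values, and `beta` below are hypothetical and are not the metabolic costs used in the paper.

```python
import math


def optimal_abundances(costs, beta):
    """Abundances maximizing entropy - beta * mean cost: p_i proportional to exp(-beta * c_i)."""
    weights = {name: math.exp(-beta * cost) for name, cost in costs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def entropy(abundances):
    """Shannon entropy (nats) of an abundance distribution."""
    return -sum(p * math.log(p) for p in abundances.values() if p > 0)


def mean_cost(abundances, costs):
    """Average cost per residue under the given abundances."""
    return sum(abundances[name] * costs[name] for name in abundances)


if __name__ == "__main__":
    # Hypothetical residue types and costs, for illustration only.
    toy_costs = {"cheap": 1.0, "medium": 2.0, "expensive": 4.0}
    for beta in (0.0, 0.5, 1.0):
        p = optimal_abundances(toy_costs, beta)
        print(beta, {k: round(v, 3) for k, v in p.items()},
              round(entropy(p), 3), round(mean_cost(p, toy_costs), 3))
```

At `beta = 0` cost is ignored and all residue types are equally abundant (maximum diversity); as `beta` grows, expensive residues become rarer, lowering both the mean cost and the entropy.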
15.
Biochem Soc Trans ; 43(5): 844-9, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26517892

ABSTRACT

Structural domains are believed to be modules within proteins that can fold and function independently. Some proteins show tandem repetitions of apparent modular structure that do not fold independently, but rather co-operate in stabilizing structural forms that comprise several repeat-units. For many natural repeat-proteins, it has been shown that weak energetic links between repeats lead to the breakdown of co-operativity and the appearance of folding sub-domains within an apparently regular repeat array. The quasi-1D architecture of repeat-proteins is crucial in detailing how the local energetic balances can modulate the folding dynamics of these proteins, which can be related to the physiological behaviour of these ubiquitous biological systems.


Subject(s)
Molecular Models, Protein Conformation, Amino Acid Repetitive Sequences, Tandem Repeat Sequences, Animals, Energy Transfer, Molecular Evolution, Humans, Protein Folding, Protein Interaction Domains and Motifs, Protein Stability, Protein Secondary Structure, Protein Tertiary Structure
16.
Nucleic Acids Res ; 40(Web Server issue): W348-51, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22645321

ABSTRACT

The frustratometer is an energy landscape theory-inspired algorithm that aims at quantifying the location of frustration manifested in protein molecules. Frustration is a useful concept for gaining insight into proteins' biological behavior by analyzing how the energy is distributed in protein structures and how mutations or conformational changes shift the energetics. Sites of high local frustration often indicate biologically important regions involved in binding or allostery. In contrast, minimally frustrated linkages comprise a stable folding core of the molecule that is conserved in conformational changes. Here, we describe the implementation of these ideas in a webserver freely available at the National EMBNet node-Argentina, at URL: http://lfp.qb.fcen.uba.ar/embnet/.


Subject(s)
Protein Conformation, Software, Algorithms, Internet, Mutation, Protein Folding, Protein Tertiary Structure, Proteins/genetics, User-Computer Interface
17.
Proc Natl Acad Sci U S A ; 108(9): 3499-503, 2011 Mar 01.
Article in English | MEDLINE | ID: mdl-21273505

ABSTRACT

Natural protein domains must be sufficiently stable to fold but often need to be locally unstable to function. Overall, strong energetic conflicts are minimized in native states satisfying the principle of minimal frustration. Local violations of this principle open up possibilities to form the complex multifunnel energy landscapes needed for large-scale conformational changes. We survey the local frustration patterns of allosteric domains and show that the regions that reconfigure are often enriched in patches of highly frustrated interactions, consistent both with the idea that these locally frustrated regions may act as specific hinges or that proteins may "crack" in these locations. On the other hand, the symmetry of multimeric protein assemblies allows near degeneracy by reconfiguring while maintaining minimally frustrated interactions. We also anecdotally examine some specific examples of complex conformational changes and speculate on the role of frustration in the kinetics of allosteric change.


Subject(s)
Proteins/metabolism, Allosteric Regulation, Amino Acids/metabolism, Protein Databases, Molecular Models, Protein Secondary Structure, Protein Tertiary Structure, Proteins/chemistry, Thermodynamics
18.
Proc Natl Acad Sci U S A ; 107(17): 7751-6, 2010 Apr 27.
Article in English | MEDLINE | ID: mdl-20375284

ABSTRACT

Protein recognition of DNA sites is a primary event for gene function. Its ultimate mechanistic understanding requires an integrated structural, dynamic, kinetic, and thermodynamic dissection that is currently limited considering the hundreds of structures of protein-DNA complexes available. We describe a protein-DNA-binding pathway in which an initial, diffuse, transition state ensemble with some nonnative contacts is followed by formation of extensive nonnative interactions that drive the system into a kinetic trap. Finally, nonnative contacts are slowly rearranged into native-like interactions with the DNA backbone. Dissimilar protein-DNA interfaces that populate along the DNA-binding route are explained by a temporary degeneracy of protein-DNA interactions, centered on "dual-role" residues. The nonnative species slow down the reaction allowing for extended functionality.


Subject(s)
DNA-Binding Proteins/metabolism, DNA/metabolism, Molecular Models, Viral Oncogene Proteins/metabolism, Binding Sites/genetics, DNA-Binding Proteins/genetics, Kinetics, Molecular Imaging/methods, Mutation/genetics, Viral Oncogene Proteins/genetics, Protein Binding
19.
Nat Commun ; 14(1): 8379, 2023 Dec 16.
Article in English | MEDLINE | ID: mdl-38104123

ABSTRACT

Energetic local frustration offers a biophysical perspective to interpret the effects of sequence variability on protein families. Here we present a methodology to analyze local frustration patterns within protein families and superfamilies that allows us to uncover constraints related to stability and function, and identify differential frustration patterns in families with a common ancestry. We analyze these signals in very well studied protein families such as PDZ, SH3, α and β globins and RAS families. Recent advances in protein structure prediction make it possible to analyze a vast majority of the protein space. An automatic and unsupervised proteome-wide analysis on the SARS-CoV-2 virus demonstrates the potential of our approach to enhance our understanding of the natural phenotypic diversity of protein families beyond single protein instances. We apply our method to modify biophysical properties of natural proteins based on their family properties, as well as perform unsupervised analysis of large datasets to shed light on the physicochemical signatures of poorly characterized proteins such as the ones belonging to emergent pathogens.


Subject(s)
Proteins, Proteins/metabolism
20.
Protein Sci ; 31(6): e4337, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35634768

ABSTRACT

The NusG protein family is structurally and functionally conserved in all domains of life. Its members directly bind RNA polymerases and regulate transcription processivity and termination. RfaH, a divergent sub-family in its evolutionary history, is known for displaying features distinct from those of NusG proteins, which allow it to regulate the expression of virulence factors in enterobacteria in a DNA sequence-dependent manner. A striking feature is its structural interconversion between an active fold, which is the canonical NusG three-dimensional structure, and an autoinhibited fold, which is distinctively novel. How this novel fold is encoded within the RfaH sequence to yield a metamorphic protein remains elusive. In this work, we used publicly available genomic RfaH protein sequences to construct a complete multiple sequence alignment, which was further augmented with metagenomic sequences and curated by predicting their secondary structure propensities using JPred. Coevolving pairs of residues were calculated from these sequences using plmDCA and GREMLIN, which allowed us to detect the enrichment of key metamorphic contacts after sequence filtering. Finally, we combined our coevolutionary predictions with molecular dynamics to demonstrate that these interactions are sufficient to predict the structures of both native folds, where coevolutionary-derived non-native contacts may play a key role in achieving the compact RfaH novel fold. All in all, emergent coevolutionary signals found within RfaH sequences encode the autoinhibited and active folds of this protein, shedding light on the key interactions responsible for the action of this metamorphic protein.


Subject(s)
Escherichia coli Proteins, Transcription Factors, DNA-Directed RNA Polymerases/chemistry, Escherichia coli Proteins/chemistry, Peptide Elongation Factors/chemistry, Peptide Elongation Factors/genetics, Peptide Elongation Factors/metabolism, Trans-Activators/chemistry, Transcription Factors/chemistry