1.
Proc Natl Acad Sci U S A ; 121(21): e2318905121, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38739787

ABSTRACT

We propose that spontaneous folding and molecular evolution of biopolymers are two universal aspects that must concur for life to happen. These aspects are fundamentally related to the chemical composition of biopolymers and crucially depend on the solvent in which they are embedded. We show that molecular information theory and energy landscape theory allow us to explore the limits that solvents impose on biopolymer existence. We consider 54 solvents, including water, alcohols, hydrocarbons, halogenated solvents, aromatic solvents, and low molecular weight substances made up of elements abundant in the universe, which may potentially take part in alternative biochemistries. We find that along with water, there are many solvents for which the liquid regime is compatible with biopolymer folding and evolution. We present a ranking of the solvents in terms of biopolymer compatibility. Many of these solvents have been found in molecular clouds or may be expected to occur in extrasolar planets.


Subjects
Solvents , Biopolymers/chemistry , Solvents/chemistry , Extraterrestrial Environment/chemistry , Molecular Evolution , Water/chemistry
2.
Nucleic Acids Res ; 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38587198

ABSTRACT

According to the Principle of Minimal Frustration, folded proteins can only have a minimal number of strong energetic conflicts in their native states. However, not all interactions are energetically optimized for folding but some remain in energetic conflict, i.e. they are highly frustrated. This remaining local energetic frustration has been shown to be statistically correlated with distinct functional aspects such as protein-protein interaction sites, allosterism and catalysis. Fuelled by the recent breakthroughs in efficient protein structure prediction that have made available good quality models for most proteins, we have developed a strategy to calculate local energetic frustration within large protein families and quantify its conservation over evolutionary time. Based on this evolutionary information we can identify how stability and functional constraints have appeared at the common ancestor of the family and have been maintained over the course of evolution. Here, we present FrustraEvo, a web server tool to calculate and quantify the conservation of local energetic frustration in protein families.

3.
Nat Commun ; 14(1): 8379, 2023 Dec 16.
Article in English | MEDLINE | ID: mdl-38104123

ABSTRACT

Energetic local frustration offers a biophysical perspective to interpret the effects of sequence variability on protein families. Here we present a methodology to analyze local frustration patterns within protein families and superfamilies that allows us to uncover constraints related to stability and function, and identify differential frustration patterns in families with a common ancestry. We analyze these signals in very well studied protein families such as PDZ, SH3, α and β globins and RAS families. Recent advances in protein structure prediction make it possible to analyze a vast majority of the protein space. An automatic and unsupervised proteome-wide analysis on the SARS-CoV-2 virus demonstrates the potential of our approach to enhance our understanding of the natural phenotypic diversity of protein families beyond single protein instances. We apply our method to modify biophysical properties of natural proteins based on their family properties, as well as perform unsupervised analysis of large datasets to shed light on the physicochemical signatures of poorly characterized proteins such as the ones belonging to emergent pathogens.


Subjects
Proteins , Proteins/metabolism
4.
J Phys Chem B ; 126(43): 8655-8668, 2022 11 03.
Article in English | MEDLINE | ID: mdl-36282961

ABSTRACT

We propose an application of molecular information theory to analyze the folding of single domain proteins. We analyze results from various areas of protein science, such as sequence-based potentials, reduced amino acid alphabets, backbone configurational entropy, secondary structure content, residue burial layers, and mutational studies of protein stability changes. We found that the average information contained in the sequences of evolved proteins is very close to the average information needed to specify a fold, ∼2.2 ± 0.3 bits/(site·operation). The effective alphabet size in evolved proteins equals the effective number of conformations of a residue in the compact unfolded state, at around 5. We calculated an energy-to-information conversion efficiency upon folding of around 50%, lower than the theoretical limit of 70%, but much higher than human-built macroscopic machines. We propose a simple mapping between molecular information theory and energy landscape theory and explore the connections between sequence evolution, configurational entropy, and the energetics of protein folding.
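
As a rough illustration of the quantities named in this abstract, the sketch below computes the Shannon entropy of one alignment column, the effective alphabet size it implies, and the information gained relative to a uniform 20-letter alphabet. The definitions used here (effective alphabet = 2^H, information = log2(20) − H) are a common convention assumed for illustration, not necessarily the paper's own accounting, and the function names and toy column are hypothetical.

```python
import math
from collections import Counter

def column_entropy_bits(column):
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def effective_alphabet_size(entropy_bits):
    """Number of equally likely letters that would give the same entropy."""
    return 2.0 ** entropy_bits

def information_bits(entropy_bits, alphabet_size=20):
    """Information per site relative to a uniform alphabet (assumed convention)."""
    return math.log2(alphabet_size) - entropy_bits

# Hypothetical column: five residue types, each used equally often.
toy_column = "AAAALLLLVVVVIIIIGGGG"
h = column_entropy_bits(toy_column)
print(round(effective_alphabet_size(h), 2))
print(round(information_bits(h), 2))
```

Under these assumptions, an effective alphabet of 5 corresponds to log2(20) − log2(5) = 2 bits per site, which lies within the 2.2 ± 0.3 range quoted in the abstract.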


Subjects
Information Theory , Protein Folding , Humans , Secondary Protein Structure , Proteins/chemistry , Entropy , Protein Conformation
5.
Proc Natl Acad Sci U S A ; 119(31): e2204131119, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35905321

ABSTRACT

Repeat proteins are made with tandem copies of similar amino acid stretches that fold into elongated architectures. These proteins constitute excellent model systems to investigate how evolution relates to structure, folding, and function. Here, we propose a scheme to map evolutionary information at the sequence level to a coarse-grained model for repeat-protein folding and use it to investigate the folding of thousands of repeat proteins. We model the energetics by a combination of an inverse Potts-model scheme with an explicit mechanistic model of duplications and deletions of repeats to calculate the evolutionary parameters of the system at the single-residue level. These parameters are used to inform an Ising-like model that allows for the generation of folding curves, apparent domain emergence, and occupation of intermediate states that are highly compatible with experimental data in specific case studies. We analyzed the folding of thousands of natural Ankyrin repeat proteins and found that a multiplicity of folding mechanisms is possible. Fully cooperative all-or-none transitions are obtained for arrays with enough sequence-similar elements and strong interactions between them, while noncooperative element-by-element intermittent folding arises if the elements are dissimilar and the interactions between them are energetically weak. Additionally, we characterized nucleation-propagation and multidomain folding mechanisms. We show that the global stability and cooperativity of the repeating arrays can be predicted from simple sequence scores.
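
To make the idea of an Ising-like folding model concrete, here is a minimal toy sketch, not the model from the paper. It assumes each repeat is either folded or unfolded, with a hypothetical per-repeat energy `eps` and a nearest-neighbour `coupling` chosen by hand, whereas the paper derives its parameters from evolutionary data. The mean folded fraction is obtained by brute-force enumeration, which is only practical for short arrays.

```python
import itertools
import math

def fraction_folded(n_repeats, eps, coupling, kT):
    """Mean fraction of folded repeats in a toy 1D Ising-like chain.

    Each repeat is folded (1) or unfolded (0). A folded repeat contributes
    energy `eps`; two adjacent folded repeats add `coupling`
    (negative values are stabilizing). All values are illustrative.
    """
    partition = 0.0
    folded_sum = 0.0
    for state in itertools.product((0, 1), repeat=n_repeats):
        energy = eps * sum(state)
        energy += coupling * sum(a * b for a, b in zip(state, state[1:]))
        weight = math.exp(-energy / kT)  # Boltzmann weight of this state
        partition += weight
        folded_sum += weight * sum(state) / n_repeats
    return folded_sum / partition

# Scan temperature for a strongly coupled and a weakly coupled six-repeat array.
for kT in (0.5, 1.0, 2.0, 4.0):
    strong = fraction_folded(6, eps=1.0, coupling=-3.0, kT=kT)
    weak = fraction_folded(6, eps=-0.5, coupling=-0.1, kT=kT)
    print(kT, round(strong, 3), round(weak, 3))
```

In this toy setting, strong coupling between neighbours produces a sharper, more cooperative transition, while weak coupling lets repeats fold nearly independently, which mirrors the qualitative contrast described in the abstract.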


Subjects
Ankyrin Repeat , Protein Folding , Chemical Models
6.
Protein Sci ; 31(6): e4337, 2022 06.
Article in English | MEDLINE | ID: mdl-35634768

ABSTRACT

The NusG protein family is structurally and functionally conserved in all domains of life. Its members directly bind RNA polymerases and regulate transcription processivity and termination. RfaH, a divergent sub-family in its evolutionary history, displays features distinct from those of NusG proteins, which allow its members to regulate the expression of virulence factors in enterobacteria in a DNA sequence-dependent manner. A striking feature is its structural interconversion between an active fold, which is the canonical NusG three-dimensional structure, and an autoinhibited fold, which is distinctively novel. How this novel fold is encoded within the RfaH sequence to yield a metamorphic protein remains elusive. In this work, we used publicly available genomic RfaH protein sequences to construct a complete multiple sequence alignment, which was further augmented with metagenomic sequences and curated by predicting their secondary structure propensities using JPred. Coevolving pairs of residues were calculated from these sequences using plmDCA and GREMLIN, which allowed us to detect the enrichment of key metamorphic contacts after sequence filtering. Finally, we combined our coevolutionary predictions with molecular dynamics to demonstrate that these interactions are sufficient to predict the structures of both native folds, where coevolutionary-derived non-native contacts may play a key role in achieving the compact RfaH novel fold. All in all, emergent coevolutionary signals found within RfaH sequences encode the autoinhibited and active folds of this protein, shedding light on the key interactions responsible for the action of this metamorphic protein.


Subjects
Escherichia coli Proteins , Transcription Factors , DNA-Directed RNA Polymerases/chemistry , Escherichia coli Proteins/chemistry , Peptide Elongation Factors/chemistry , Peptide Elongation Factors/genetics , Peptide Elongation Factors/metabolism , Trans-Activators/chemistry , Transcription Factors/chemistry
7.
Sci Adv ; 8(2): eabj3984, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35030025

ABSTRACT

Biological redox reactions drive planetary biogeochemical cycles. Using a novel, structure-guided sequence analysis of proteins, we explored the patterns of evolution of enzymes responsible for these reactions. Our analysis reveals that the folds that bind transition metal-containing ligands have similar structural geometry and amino acid sequences across the full diversity of proteins. Similarity across folds reflects the availability of key transition metals over geological time and strongly suggests that transition metal-ligand binding had a small number of common peptide origins. We observe that structures central to our similarity network come primarily from oxidoreductases, suggesting that ancestral peptides may have also facilitated electron transfer reactions. Last, our results reveal that the earliest biologically functional peptides were likely available before the assembly of fully functional protein domains over 3.8 billion years ago.

Thus, life is a special, very complex form of motion of matter, but this form did not always exist, and it is not separated from inorganic nature by an impassable abyss; rather, it arose from inorganic nature as a new property in the process of evolution of the world. We must study the history of this evolution if we want to solve the problem of the origin of life. [A. I. Oparin (1)]

8.
Methods Mol Biol ; 2376: 387-398, 2022.
Article in English | MEDLINE | ID: mdl-34845622

ABSTRACT

We present a detailed heuristic method to quantify the degree of local energetic frustration manifested by protein molecules. Current applications are realized in computational experiments where a protein structure is visualized highlighting the energetic conflicts or the concordance of the local interactions in that structure. Minimally frustrated linkages highlight the stable folding core of the molecule. Sites of high local frustration, in contrast, often indicate functionally relevant regions such as binding, active, or allosteric sites.
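
A minimal sketch of how a local frustration index can be computed, assuming the Z-score form used in the wider frustration literature: the native energy of a contact is compared with the mean and spread of energies from decoy alternatives. The abstract does not state this formula or any cutoffs; the thresholds of 0.78 and −1 below are values commonly quoted elsewhere and are treated here as assumptions, and the function names and decoy energies are made up for illustration.

```python
import statistics

def frustration_index(native_energy, decoy_energies):
    """Z-score of the native energy against a set of decoy energies.

    Positive values mean the native interaction is more favorable
    (lower in energy) than typical alternatives.
    """
    mean_decoy = statistics.mean(decoy_energies)
    sd_decoy = statistics.pstdev(decoy_energies)
    return (mean_decoy - native_energy) / sd_decoy

def classify(index, minimal_cutoff=0.78, high_cutoff=-1.0):
    """Label an interaction; the cutoffs are assumed, not taken from the abstract."""
    if index > minimal_cutoff:
        return "minimally frustrated"
    if index < high_cutoff:
        return "highly frustrated"
    return "neutral"

# Hypothetical decoy energies in arbitrary units.
decoys = [-2.0, -1.0, 0.0, 1.0, 2.0]
for native in (-3.0, 0.5, 2.5):
    idx = frustration_index(native, decoys)
    print(native, round(idx, 2), classify(idx))
```

With these toy numbers, a native energy well below the decoy mean is labeled minimally frustrated, and one well above it is labeled highly frustrated.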


Subjects
Protein Conformation , Molecular Models , Protein Folding , Proteins , Thermodynamics
9.
QRB Discov ; 3: e7, 2022.
Article in English | MEDLINE | ID: mdl-37529289

ABSTRACT

Ankyrin (ANK) repeat proteins are coded by tandem occurrences of patterns with around 33 amino acids. They often mediate protein-protein interactions in a diversity of biological systems. These proteins have an elongated non-globular shape and often display complex folding mechanisms. This work investigates the energy landscape of representative proteins of this class made up of 3, 4 and 6 ANK repeats using the energy-landscape visualisation method (ELViM). By combining biased and unbiased coarse-grained molecular dynamics AWSEM simulations that sample conformations along the folding trajectories with the ELViM structure-based phase space, one finds a three-dimensional representation of the globally funnelled energy surface. In this representation, it is possible to delineate distinct folding pathways. We show that ELViMs can project, in a natural way, the intricacies of the highly dimensional energy landscapes encoded by the highly symmetric ankyrin repeat proteins into useful low-dimensional representations. These projections can discriminate between multiplicities of specific parallel folding mechanisms that otherwise can be hidden in oversimplified depictions.

10.
J Phys Chem B ; 125(10): 2513-2520, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33667107

ABSTRACT

Disordered proteins frequently serve as interaction hubs involving a constrained variety of partners. Complexes with different partners frequently exhibit distinct binding modes, involving regions that remain disordered in the bound state. While the conformational properties of disordered proteins are well-characterized in their free states, less is known about the molecular mechanisms by which specificity can be achieved not with one but with multiple partners. Using the energy landscape theory concept of protein frustration, we demonstrate that complexes of disordered proteins exhibit a high degree of local frustration, especially at the binding interface. These suboptimal interactions lead to the possibility of multiple bound substates, each displaying distinct frustration patterns, which are differently populated in complexes with different partners. These results explain how specificity of disordered proteins can be achieved without a single common bound conformation and how the conflict between different interactions can be used to control the binding to multiple partners.


Subjects
Intrinsically Disordered Proteins , Intrinsically Disordered Proteins/metabolism , Protein Binding , Protein Conformation , Protein Folding
11.
Bioinformatics ; 37(18): 3038-3040, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33720293

ABSTRACT

SUMMARY: Once folded, natural protein molecules have few energetic conflicts within their polypeptide chains. Many protein structures do however contain regions where energetic conflicts remain after folding, i.e. they are highly frustrated. These regions, kept in place over evolutionary and physiological timescales, are related to several functional aspects of natural proteins such as protein-protein interactions, small ligand recognition, catalytic sites and allostery. Here, we present FrustratometeR, an R package that easily computes local energetic frustration on a personal computer or a cluster. This package facilitates large scale analysis of local frustration, point mutants and molecular dynamics (MD) trajectories, allowing straightforward integration of local frustration analysis into pipelines for protein structural analysis. AVAILABILITY AND IMPLEMENTATION: https://github.com/proteinphysiologylab/frustratometeR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Molecular Dynamics Simulation , Proteins , Catalytic Domain , Software
12.
Acc Chem Res ; 54(5): 1251-1259, 2021 03 02.
Article in English | MEDLINE | ID: mdl-33550810

ABSTRACT

Are all protein interactions fully optimized? Do suboptimal interactions compromise specificity? What is the functional impact of frustration? Why does evolution not optimize some contacts? Proteins and their complexes are best described as ensembles of states populating an energy landscape. These ensembles vary in breadth from narrow ensembles clustered around a single average X-ray structure to broader ensembles encompassing a few different functional "taxonomic" states on to near continua of rapidly interconverting conformations, which are called "fuzzy" or even "intrinsically disordered". Here we aim to provide a comprehensive framework for confronting the structural and dynamical continuum of protein assemblies by combining the concepts of energetic frustration and interaction fuzziness. The diversity of the protein structural ensemble arises from the frustrated conflicts between the interactions that create the energy landscape. When frustration is minimal after folding, it results in a narrow ensemble, but residual frustrated interactions result in fuzzy ensembles, and this fuzziness allows a versatile repertoire of biological interactions. Here we discuss how fuzziness and frustration play off each other as proteins fold and assemble, viewing their significance from energetic, functional, and evolutionary perspectives. We demonstrate, in particular, that the common physical origin of both concepts is related to the ruggedness of the energy landscapes, intramolecular in the case of frustration and intermolecular in the case of fuzziness. Within this framework, we show that alternative sets of suboptimal contacts may encode specificity without achieving a single structural optimum. Thus, we demonstrate that structured complexes may not be optimized, and energetic frustration is realized via different sets of contacts leading to multiplicity of specific complexes.
Furthermore, we propose that these suboptimal, frustrated, or fuzzy interactions are under evolutionary selection and expand the biological repertoire by providing a multiplicity of biological activities. In accord, we show that non-native interactions in folding or interaction landscapes can cooperate to generate diverse functional states, which are essential to facilitate adaptation to different cellular conditions. Thus, we propose that not fully optimized structures may actually be beneficial for biological activities of proteins via an alternative set of suboptimal interactions. The importance of such variability has not been recognized across different areas of biology. This account provides a modern view on folding, function, and assembly across the protein universe. The physical framework presented here is applicable to the structure and dynamics continuum of proteins and opens up new perspectives for drug design involving not fully structured, highly dynamic protein assemblies.


Subjects
Proteins , X-Ray Crystallography , Molecular Models , Protein Conformation , Protein Folding , Proteins/chemistry , Proteins/metabolism
13.
Nat Commun ; 11(1): 5944, 2020 11 23.
Article in English | MEDLINE | ID: mdl-33230150

ABSTRACT

To function, biomolecules require sufficient specificity of interaction as well as stability to live in the cell while still being able to move. Thermodynamic stability of only a limited number of specific structures is important so as to prevent promiscuous interactions. The individual interactions in proteins, therefore, have evolved collectively to give funneled minimally frustrated landscapes, but some strategic parts of biomolecular sequences located at specific sites in the structure have been selected to be frustrated in order to allow both motion and interaction with partners. We describe a framework to efficiently quantify and localize biomolecular frustration at atomic resolution by examining the statistics of the energy changes that occur when the local environment of a site is changed. The location of patches of highly frustrated interactions correlates with key biological locations needed for physiological function. At atomic resolution, it becomes possible to extend frustration analysis to protein-ligand complexes. At this resolution one sees that drug specificity is correlated with there being a minimally frustrated binding pocket leading to a funneled binding landscape. Atomistic frustration analysis provides a route for screening for more specific compounds for drug discovery.


Subjects
Proteins/chemistry , Binding Sites , Catalytic Domain , Drug Discovery , Ligands , Molecular Models , Protein Binding , Protein Folding , Proteins/metabolism , Thermodynamics
14.
PLoS One ; 15(6): e0233865, 2020.
Article in English | MEDLINE | ID: mdl-32579546

ABSTRACT

Ankyrin-containing proteins are one of the most abundant repeat protein families present in all extant organisms. They are made with tandem copies of similar amino acid stretches that fold into elongated architectures. Here, we built and curated a dataset of 200 thousand proteins that contain 1.2 million Ankyrin regions and characterize the abundance, structure and energetics of the repetitive regions in natural proteins. We found that there is a continuous roughly exponential variety of array lengths with an exceptional frequency at 24 repeats. We described that individual repeats are seldom interrupted with long insertions and accept few deletions, in line with the known tertiary structures. We found that longer arrays are made up of repeats that are more similar to each other than shorter arrays, and display more favourable folding energy, hinting at their evolutionary origin. The array distributions show that there is a physical upper limit to the size of an array of repeats of about 120 copies, consistent with the limit found in nature. The identity patterns within the arrays suggest that they may have originated by sequential copies of more than one Ankyrin unit.


Subjects
Ankyrin Repeat , Ankyrins/chemistry , Molecular Models , Protein Conformation , Protein Folding
15.
PLoS Comput Biol ; 15(8): e1007282, 2019 08.
Article in English | MEDLINE | ID: mdl-31415557

ABSTRACT

The coding space of protein sequences is shaped by evolutionary constraints set by requirements of function and stability. We show that the coding space of a given protein family (the total number of sequences in that family) can be estimated using models of maximum entropy trained on multiple sequence alignments of naturally occurring amino acid sequences. We analyzed and calculated the size of three abundant repeat-protein families, whose members are large proteins made of many repetitions of conserved portions of ∼30 amino acids. While amino acid conservation at each position of the alignment explains most of the reduction of diversity relative to completely random sequences, we found that correlations between amino acid usage at different positions significantly impact that diversity. We quantified the impact of different types of correlations, functional and evolutionary, on sequence diversity. Analysis of the detailed structure of the coding space of the families revealed a rugged landscape, with many local energy minima of varying sizes with a hierarchical structure, reminiscent of frustrated energy landscapes of spin glasses in physics. This clustered structure indicates a multiplicity of subtypes within each family, and suggests new strategies for protein design.
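
The simplest version of the estimate described in this abstract can be sketched with a site-independent model: if each alignment column is treated separately, the family's entropy is the sum of the column entropies and the effective number of sequences is its exponential. This is only the conservation-only baseline, an assumption made for illustration. The abstract reports that correlations between positions change the result, and for fixed single-site frequencies such correlations can only lower this estimate. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def site_entropy_nats(column):
    """Shannon entropy (nats) of the residue frequencies in one column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

def log_family_size_independent(alignment):
    """Log of the effective number of sequences under a site-independent model."""
    columns = zip(*alignment)  # transpose rows (sequences) into columns (sites)
    return sum(site_entropy_nats(col) for col in columns)

# Hypothetical alignment: two fully conserved sites, two sites with two options each.
toy_alignment = ["ACDA", "ACEA", "AGDA", "AGEA"]
log_size = log_family_size_independent(toy_alignment)
print(round(math.exp(log_size), 2))            # effective family size
print(20 ** len(toy_alignment[0]))             # size of the unconstrained space
```

For this toy alignment the estimate is 4 sequences out of 160000 possible four-residue strings, which shows how conservation alone shrinks the coding space.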


Subjects
Proteins/chemistry , Proteins/genetics , Amino Acid Repetitive Sequences/genetics , Algorithms , Amino Acid Sequence , Computational Biology , Conserved Sequence , Entropy , Molecular Evolution , Molecular Models , Protein Conformation , Protein Folding , Sequence Alignment/statistics & numerical data , Amino Acid Sequence Homology , Thermodynamics
16.
Proc Natl Acad Sci U S A ; 116(10): 4037-4043, 2019 03 05.
Article in English | MEDLINE | ID: mdl-30765513

ABSTRACT

Conflicting biological goals often meet in the specification of protein sequences for structure and function. Overall, strong energetic conflicts are minimized in folded native states according to the principle of minimal frustration, so that a sequence can spontaneously fold, but local violations of this principle open up the possibility to encode the complex energy landscapes that are required for active biological functions. We survey the local energetic frustration patterns of all protein enzymes with known structures and experimentally annotated catalytic residues. In agreement with previous hypotheses, the catalytic sites themselves are often highly frustrated regardless of the protein oligomeric state, overall topology, and enzymatic class. At the same time a secondary shell of more weakly frustrated interactions surrounds the catalytic site itself. We evaluate the conservation of these energetic signatures in various family members of major enzyme classes, showing that local frustration is evolutionarily more conserved than the primary structure itself.


Subjects
Enzymes/chemistry , Molecular Models , Protein Folding , Catalytic Domain
17.
J Phys Chem B ; 122(49): 11295-11301, 2018 12 13.
Article in English | MEDLINE | ID: mdl-30239207

ABSTRACT

All known terrestrial proteins are coded as continuous strings of ≈20 amino acids. The patterns formed by the repetitions of elements in groups of finite sequences describe the natural architectures of protein families. We present a method to search for patterns and groupings of patterns in protein sequences using a mathematically precise definition for "repetition", an efficient algorithmic implementation and a robust scoring system with no adjustable parameters. We show that the sequence patterns can be well-separated into disjoint classes according to their recurrence in nested structures. The statistics of the occurrences of patterns indicate that short repetitions are sufficient to account for the differences between natural families and randomized groups of sequences by more than 10 standard deviations, while contiguous sequence patterns shorter than 5 residues are effectively random in their occurrences. A small subset of patterns is sufficient to account for a robust "familiarity" definition between arbitrary sets of sequences.


Subjects
Proteins/chemistry , Algorithms , Amino Acid Sequence , Protein Domains , Proteins/classification
18.
Curr Opin Struct Biol ; 48: 68-73, 2018 02.
Article in English | MEDLINE | ID: mdl-29101782

ABSTRACT

Natural protein molecules are exceptional polymers. Encoded in apparently random strings of amino acids, these objects perform clear physical tasks that are rare to find by simple chance. Accurate folding, specific binding, and powerful catalysis are examples of basic chemical activities that the great majority of polypeptides do not display, and are thought to be the outcome of the natural history of proteins. Function, a concept genuine to Biology, is at the core of evolution and often conflicts with the physical constraints. Locating the frustration between discrepant goals in a recurrent system leads to fundamental insights about the chances and necessities that shape the encoding of biological information.


Subjects
Amino Acids/chemistry , Molecular Dynamics Simulation , Proteins/chemistry , Amino Acid Sequence , Animals , Biocatalysis , Molecular Evolution , Humans , Kinetics , Protein Binding , Protein Folding , Proteins/physiology , Structure-Activity Relationship , Thermodynamics
19.
PLoS Comput Biol ; 13(6): e1005584, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28617812

ABSTRACT

Natural protein sequences contain a record of their history. A common constraint in a given protein family is the ability to fold to specific structures, and it has been shown possible to infer the main native ensemble by analyzing covariations in extant sequences. Still, many natural proteins that fold into the same structural topology show different stabilization energies, and these are often related to their physiological behavior. We propose a description for the energetic variation given by sequence modifications in repeat proteins, systems for which the overall problem is simplified by their inherent symmetry. We explicitly account for single amino acid and pair-wise interactions and treat higher order correlations with a single term. We show that the resulting evolutionary field can be interpreted with structural detail. We trace the variations in the energetic scores of natural proteins and relate them to their experimental characterization. The resulting energetic evolutionary field allows the prediction of the folding free energy change for several mutants, and can be used to generate synthetic sequences that are statistically indistinguishable from the natural counterparts.


Subjects
Chemical Evolution , Molecular Models , Proteins/chemistry , Proteins/ultrastructure , Amino Acid Repetitive Sequences/genetics , Protein Sequence Analysis/methods , Energy Transfer , Chemical Models , Point Mutation/genetics , Protein Conformation , Protein Folding , Proteins/genetics , Structure-Activity Relationship
20.
Biophys J ; 111(11): 2339-2341, 2016 12 06.
Article in English | MEDLINE | ID: mdl-27926834

Subjects
Pressure , Proteins