Results 1 - 20 of 20
1.
Res Sq ; 2024 May 03.
Article in English | MEDLINE | ID: mdl-38746330

ABSTRACT

Protein kinases are molecular machines with rich sequence variation that distinguishes the two main evolutionary branches - tyrosine kinases (TKs) from serine/threonine kinases (STKs). Using a sequence co-variation Potts statistical energy model we previously concluded that TK catalytic domains are more likely than STKs to adopt an inactive conformation with the activation loop in an autoinhibitory "folded" conformation, due to intrinsic sequence effects. Here we investigated the structural basis for this phenomenon by integrating the sequence-based model with structure-based molecular dynamics (MD) to determine the effects of mutations on the free energy difference between active and inactive conformations, using a novel thermodynamic cycle involving many (n=108) protein-mutation free energy perturbation (FEP) simulations in the active and inactive conformations. The sequence and structure-based results are consistent and support the hypothesis that the inactive conformation "DFG-out Activation Loop Folded", is a functional regulatory state that has been stabilized in TKs relative to STKs over the course of their evolution via the accumulation of residue substitutions in the activation loop and catalytic loop that facilitate distinct substrate binding modes in trans and additional modes of regulation in cis for TKs.

2.
Proc Natl Acad Sci U S A ; 121(15): e2316662121, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38557187

ABSTRACT

Drug resistance in HIV type 1 (HIV-1) is a pervasive problem that affects the lives of millions of people worldwide. Although records of drug-resistant mutations (DRMs) have been extensively tabulated within public repositories, our understanding of the evolutionary kinetics of DRMs and how they evolve together remains limited. Epistasis, the interaction between a DRM and other residues in HIV-1 protein sequences, is key to the temporal evolution of drug resistance. We use a Potts sequence-covariation statistical-energy model of HIV-1 protein fitness under drug selection pressure, which captures epistatic interactions between all positions, combined with kinetic Monte-Carlo simulations of sequence evolutionary trajectories, to explore the acquisition of DRMs as they arise in an ensemble of drug-naive patient protein sequences. We follow the time course of 52 DRMs in the enzymes protease, RT, and integrase, the primary targets of antiretroviral therapy. The rates at which DRMs emerge are highly correlated with their observed acquisition rates reported in the literature when drug pressure is applied. This result highlights the central role of epistasis in determining the kinetics governing DRM emergence. Whereas rapidly acquired DRMs begin to accumulate as soon as drug pressure is applied, slowly acquired DRMs are contingent on accessory mutations that appear only after prolonged drug pressure. We provide a foundation for using computational methods to determine the temporal evolution of drug resistance using Potts statistical potentials, which can be used to gain mechanistic insights into drug resistance pathways in HIV-1 and other infectious agents.


Subjects
Anti-HIV Agents , HIV Infections , HIV Seropositivity , HIV-1 , Humans , HIV-1/genetics , Drug Resistance, Viral/genetics , Genotype , HIV Infections/drug therapy , HIV Infections/genetics , Mutation , Anti-HIV Agents/pharmacology , Anti-HIV Agents/therapeutic use
3.
bioRxiv ; 2024 May 02.
Article in English | MEDLINE | ID: mdl-38559238

ABSTRACT

Protein kinases are molecular machines with rich sequence variation that distinguishes the two main evolutionary branches - tyrosine kinases (TKs) from serine/threonine kinases (STKs). Using a sequence co-variation Potts statistical energy model we previously concluded that TK catalytic domains are more likely than STKs to adopt an inactive conformation with the activation loop in an autoinhibitory "folded" conformation, due to intrinsic sequence effects. Here we investigated the structural basis for this phenomenon by integrating the sequence-based model with structure-based molecular dynamics (MD) to determine the effects of mutations on the free energy difference between active and inactive conformations, using a novel thermodynamic cycle involving many (n=108) protein-mutation free energy perturbation (FEP) simulations in the active and inactive conformations. The sequence and structure-based results are consistent and support the hypothesis that the inactive conformation "DFG-out Activation Loop Folded", is a functional regulatory state that has been stabilized in TKs relative to STKs over the course of their evolution via the accumulation of residue substitutions in the activation loop and catalytic loop that facilitate distinct substrate binding modes in trans and additional modes of regulation in cis for TKs.

4.
bioRxiv ; 2024 Jan 06.
Article in English | MEDLINE | ID: mdl-38260358

ABSTRACT

Polycystin-1 (PC1) is the membrane protein product of the PKD1 gene whose mutation is responsible for 85% of the cases of autosomal dominant polycystic kidney disease (ADPKD). ADPKD is primarily characterized by the formation of renal cysts and potential kidney failure. PC1 is an atypical G protein-coupled receptor (GPCR) consisting of 11 transmembrane helices and an autocatalytic GAIN domain that cleaves PC1 into extracellular N-terminal (NTF) and membrane-embedded C-terminal (CTF) fragments. Recently, signaling activation of the PC1 CTF was shown to be regulated by a stalk tethered agonist (TA), a distinct mechanism observed in the adhesion GPCR family. A novel allosteric activation pathway was elucidated for the PC1 CTF through a combination of Gaussian accelerated molecular dynamics (GaMD), mutagenesis and cellular signaling experiments. Here, we show that synthetic, soluble peptides with 7 to 21 residues derived from the stalk TA, in particular, peptides including the first 9 residues (p9), 17 residues (p17) and 21 residues (p21) exhibited the ability to re-activate signaling by a stalkless PC1 CTF mutant in cellular assays. To reveal molecular mechanisms of stalk peptide-mediated signaling activation, we have applied a novel Peptide GaMD (Pep-GaMD) algorithm to elucidate binding conformations of selected stalk peptide agonists p9, p17 and p21 to the stalkless PC1 CTF. The simulations revealed multiple specific binding regions of the stalk peptide agonists to the PC1 protein including an "intermediate" bound yet inactive state. Our Pep-GaMD simulation findings were consistent with the cellular assay experimental data. Binding of peptide agonists to the TOP domain of PC1 induced close TOP-putative pore loop interactions, a characteristic feature of the PC1 CTF signaling activation mechanism. 
Using sequence covariation analysis of PC1 homologs, we further showed that the peptide binding regions were consistent with covarying residue pairs identified between the TOP domain and the stalk TA. Therefore, structural dynamic insights into the mechanisms of PC1 activation by stalk-derived peptide agonists have enabled an in-depth understanding of PC1 signaling. They will form a foundation for development of PC1 as a therapeutic target for the treatment of ADPKD.

5.
Sci Adv ; 9(29): eadg5953, 2023 07 21.
Article in English | MEDLINE | ID: mdl-37478179

ABSTRACT

HIV-1 infection depends on the integration of viral DNA into host chromatin. Integration is mediated by the viral enzyme integrase and is blocked by integrase strand transfer inhibitors (INSTIs), first-line antiretroviral therapeutics widely used in the clinic. Resistance to even the best INSTIs is a problem, and the mechanisms of resistance are poorly understood. Here, we analyze combinations of the mutations E138K, G140A/S, and Q148H/K/R, which confer resistance to INSTIs. The investigational drug 4d more effectively inhibited the mutants compared with the approved drug Dolutegravir (DTG). We present 11 new cryo-EM structures of drug-resistant HIV-1 intasomes bound to DTG or 4d, with better than 3-Å resolution. These structures, complemented with free energy simulations, virology, and enzymology, explain the mechanisms of DTG resistance involving E138K + G140A/S + Q148H/K/R and show why 4d maintains potency better than DTG. These data establish a foundation for further development of INSTIs that potently inhibit resistant forms in integrase.


Subjects
HIV Integrase Inhibitors , HIV Integrase , HIV Integrase Inhibitors/pharmacology , HIV Integrase Inhibitors/chemistry , Oxazines/pharmacology , Mutation , HIV Integrase/genetics , HIV Integrase/chemistry , HIV Integrase/metabolism
6.
Elife ; 11, 2022 12 23.
Article in English | MEDLINE | ID: mdl-36562610

ABSTRACT

Inactive conformations of protein kinase catalytic domains where the DFG motif has a "DFG-out" orientation and the activation loop is folded present a druggable binding pocket that is targeted by FDA-approved 'type-II inhibitors' in the treatment of cancers. Tyrosine kinases (TKs) typically show strong binding affinity with a wide spectrum of type-II inhibitors while serine/threonine kinases (STKs) usually bind more weakly which we suggest here is due to differences in the folded to extended conformational equilibrium of the activation loop between TKs vs. STKs. To investigate this, we use sequence covariation analysis with a Potts Hamiltonian statistical energy model to guide absolute binding free-energy molecular dynamics simulations of 74 protein-ligand complexes. Using the calculated binding free energies together with experimental values, we estimated free-energy costs for the large-scale (~17-20 Å) conformational change of the activation loop by an indirect approach, circumventing the very challenging problem of simulating the conformational change directly. We also used the Potts statistical potential to thread large sequence ensembles over active and inactive kinase states. The structure-based and sequence-based analyses are consistent; together they suggest TKs evolved to have free-energy penalties for the classical 'folded activation loop' DFG-out conformation relative to the active conformation, that is, on average, 4-6 kcal/mol smaller than the corresponding values for STKs. Potts statistical energy analysis suggests a molecular basis for this observation, wherein the activation loops of TKs are more weakly 'anchored' against the catalytic loop motif in the active conformation and form more stable substrate-mimicking interactions in the inactive conformation. 
These results provide insights into the molecular basis for the divergent functional properties of TKs and STKs, and have pharmacological implications for the target selectivity of type-II inhibitors.


Subjects
Protein Serine-Threonine Kinases , Tyrosine , Protein Serine-Threonine Kinases/metabolism , Protein Kinase Inhibitors/pharmacology , Molecular Dynamics Simulation , Protein Conformation , Threonine , Serine
7.
J Phys Chem B ; 126(50): 10622-10636, 2022 12 22.
Article in English | MEDLINE | ID: mdl-36493468

ABSTRACT

The ability of HIV-1 to rapidly mutate leads to antiretroviral therapy (ART) failure among infected patients. Drug-resistance mutations (DRMs), which cause a fitness penalty to intrinsic viral fitness, are compensated by accessory mutations with favorable epistatic interactions which cause an evolutionary trapping effect, but the kinetics of this overall process has not been well characterized. Here, using a Potts Hamiltonian model describing epistasis combined with kinetic Monte Carlo simulations of evolutionary trajectories, we explore how epistasis modulates the evolutionary dynamics of HIV DRMs. We show how the occurrence of a drug-resistance mutation is contingent on favorable epistatic interactions with many other residues of the sequence background and that subsequent mutations entrench DRMs. We measure the time-autocorrelation of fluctuations in the likelihood of DRMs due to epistatic coupling with the sequence background, which reveals the presence of two evolutionary processes controlling DRM kinetics with two distinct time scales. Further analysis of waiting times for the evolutionary trapping effect to reverse reveals that the sequences which entrench (trap) a DRM are responsible for the slower time scale. We also quantify the overall strength of epistatic effects on the evolutionary kinetics for different mutations and show these are much larger for DRM positions than polymorphic positions, and we also show that trapping of a DRM is often caused by the collective effect of many accessory mutations, rather than a few strongly coupled ones, suggesting the importance of multiresidue sequence variations in HIV evolution. The analysis presented here provides a framework to explore the kinetic pathways through which viral proteins like HIV evolve under drug-selection pressure.


Subjects
HIV Infections , HIV-1 , Humans , Drug Resistance, Viral/genetics , Mutation , HIV Infections/drug therapy , HIV-1/genetics , Kinetics
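
The abstract above combines a Potts statistical energy with Monte Carlo simulation of sequence trajectories. The following is only a minimal sketch of that general idea, using a plain Metropolis acceptance rule rather than the authors' kinetic scheme. The sign convention (lower energy means more probable, P ∝ exp(-E)), the toy parameters `h` and `J`, and the function names are all assumptions made for illustration.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Statistical energy: site fields plus couplings over pairs i < j (assumed convention)."""
    length = len(seq)
    e = sum(h[i, seq[i]] for i in range(length))
    e += sum(J[i, j, seq[i], seq[j]]
             for i in range(length) for j in range(i + 1, length))
    return float(e)

def metropolis(seq, h, J, n_steps, rng):
    """Propose single-site substitutions and accept them with the Metropolis rule."""
    seq = seq.copy()
    length, q = h.shape
    e = potts_energy(seq, h, J)
    for _ in range(n_steps):
        i = rng.integers(length)      # random position
        a = rng.integers(q)           # random replacement letter
        trial = seq.copy()
        trial[i] = a
        e_new = potts_energy(trial, h, J)
        # Always accept downhill moves; accept uphill moves with probability exp(-dE).
        if e_new <= e or rng.random() < np.exp(e - e_new):
            seq, e = trial, e_new
    return seq

# Toy landscape (hypothetical): letter 0 is strongly favored at every site, no couplings.
length, q = 10, 4
h = np.zeros((length, q))
h[:, 0] = -5.0
J = np.zeros((length, length, q, q))

rng = np.random.default_rng(1)
start = np.full(length, 3)
final = metropolis(start, h, J, n_steps=2000, rng=rng)
n_favored = int((final == 0).sum())
```

On this toy landscape the chain drifts from the all-3 starting sequence toward sequences dominated by letter 0. A real application would use inferred couplings and incremental energy updates instead of recomputing the full energy at every step.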
8.
PLoS One ; 17(1): e0262314, 2022.
Article in English | MEDLINE | ID: mdl-35041711

ABSTRACT

The rapid evolution of HIV is constrained by interactions between mutations which affect viral fitness. In this work, we explore the role of epistasis in determining the mutational fitness landscape of HIV for multiple drug target proteins, including Protease, Reverse Transcriptase, and Integrase. Epistatic interactions between residues modulate the mutation patterns involved in drug resistance, with unambiguous signatures of epistasis best seen in the comparison of the Potts model predicted and experimental HIV sequence "prevalences" expressed as higher-order marginals (beyond triplets) of the sequence probability distribution. In contrast, experimental measures of fitness such as viral replicative capacities generally probe fitness effects of point mutations in a single background, providing weak evidence for epistasis in viral systems. The detectable effects of epistasis are obscured by higher evolutionary conservation at sites. While double mutant cycles, in principle, provide one of the best ways to probe epistatic interactions experimentally without reference to a particular background, we show that the analysis is complicated by the small dynamic range of measurements. Overall, we show that global pairwise interaction Potts models are necessary for predicting the mutational landscape of viral proteins.


Subjects
Epistasis, Genetic , Genetic Fitness , HIV Infections/virology , HIV Protease/genetics , HIV-1/genetics , Mutation , Viral Proteins/genetics , Evolution, Molecular , HIV Infections/genetics , Humans , Virus Replication
9.
Proteins ; 90(2): 601-614, 2022 02.
Article in English | MEDLINE | ID: mdl-34599827

ABSTRACT

G-protein-coupled receptors (GPCRs) are the largest family of human membrane proteins and represent the primary targets of about one third of currently marketed drugs. Despite the critical importance, experimental structures have been determined for only a limited portion of GPCRs and functional mechanisms of GPCRs remain poorly understood. Here, we have constructed novel sequence coevolutionary models of the A and B classes of GPCRs and compared them with residue contact frequency maps generated with available experimental structures. Significant portions of structural residue contacts were successfully detected in the sequence-based covariational models. "Exception" residue contacts predicted from sequence coevolutionary models but not available structures added missing links that were important for GPCR activation and allosteric modulation. Moreover, we identified distinct residue contacts involving different sets of functional motifs for GPCR activation, such as the Na+ pocket, CWxP, DRY, PIF, and NPxxY motifs in the class A and the HETx and PxxG motifs in the class B. Finally, we systematically uncovered critical residue contacts tuned by allosteric modulation in the two classes of GPCRs, including those from the activation motifs and particularly the extracellular and intracellular loops in class A GPCRs. These findings provide a promising framework for rational design of ligands to regulate GPCR activation and allosteric modulation.


Subjects
Receptors, G-Protein-Coupled , Humans , Ligands , Receptors, G-Protein-Coupled/chemistry
10.
Nat Commun ; 12(1): 6302, 2021 11 02.
Article in English | MEDLINE | ID: mdl-34728624

ABSTRACT

Potts models and variational autoencoders (VAEs) have recently gained popularity as generative protein sequence models (GPSMs) to explore fitness landscapes and predict mutation effects. Despite encouraging results, current model evaluation metrics leave unclear whether GPSMs faithfully reproduce the complex multi-residue mutational patterns observed in natural sequences due to epistasis. Here, we develop a set of sequence statistics to assess the "generative capacity" of three current GPSMs: the pairwise Potts Hamiltonian, the VAE, and the site-independent model. We show that the Potts model's generative capacity is largest, as the higher-order mutational statistics generated by the model agree with those observed for natural sequences, while the VAE's lies between the Potts and site-independent models. Importantly, our work provides a new framework for evaluating and interpreting GPSM accuracy which emphasizes the role of higher-order covariation and epistasis, with broader implications for probabilistic sequence models in general.


Subjects
Mutation , Proteins/chemistry , Sequence Alignment/methods , Algorithms , Amino Acid Sequence , Computer Simulation , Databases, Protein , Humans , Models, Statistical , Protein Structural Elements , Proteins/genetics , Structure-Activity Relationship
11.
Comput Phys Commun ; 260, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33716309

ABSTRACT

Inverse Ising inference is a method for inferring the coupling parameters of a Potts/Ising model based on observed site-covariation, which has found important applications in protein physics for detecting interactions between residues in protein families. We introduce Mi3-GPU ("mee-three", for MCMC Inverse Ising Inference) software for solving the inverse Ising problem for protein-sequence datasets with few analytic approximations, by parallel Markov-Chain Monte-Carlo sampling on GPUs. We also provide tools for analysis and preparation of protein-family Multiple Sequence Alignments (MSAs) to account for finite-sampling issues, which are a major source of error or bias in inverse Ising inference. Our method is "generative" in the sense that the inferred model can be used to generate synthetic MSAs whose mutational statistics (marginals) can be verified to match the dataset MSA statistics up to the limits imposed by the effects of finite sampling. Our GPU implementation enables the construction of models which reproduce the covariation patterns of the observed MSA with a precision that is not possible with more approximate methods. The main components of our method are a GPU-optimized algorithm to greatly accelerate MCMC sampling, combined with a multi-step Quasi-Newton parameter-update scheme using a "Zwanzig reweighting" technique. We demonstrate the ability of this software to produce generative models on typical protein family datasets for sequence lengths L ~ 300 with 21 residue types with tens of millions of inferred parameters in short running times.
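
The mutational statistics (marginals) mentioned in this abstract can be illustrated with a short NumPy sketch. It is not the Mi3-GPU code or its API. The integer encoding of the alignment, the toy MSA, and the function names are assumptions chosen for illustration, and no sequence weighting or pseudocounts are applied.

```python
import numpy as np

def site_marginals(msa, q):
    """Univariate marginals f_i(a): fraction of sequences with letter a in column i."""
    n_seq, length = msa.shape
    f = np.zeros((length, q))
    for i in range(length):
        f[i] = np.bincount(msa[:, i], minlength=q) / n_seq
    return f

def pair_marginals(msa, q):
    """Bivariate marginals f_ij(a, b) for every column pair, shape (L, L, q, q)."""
    n_seq = msa.shape[0]
    onehot = np.eye(q)[msa]                       # shape (N, L, q)
    return np.einsum("nia,njb->ijab", onehot, onehot) / n_seq

def covariation(msa, q):
    """Covariation C_ij(a, b) = f_ij(a, b) - f_i(a) f_j(b)."""
    f = site_marginals(msa, q)
    fij = pair_marginals(msa, q)
    return fij - np.einsum("ia,jb->ijab", f, f)

# Toy alignment (hypothetical): 4 sequences, 3 columns, alphabet size q = 3.
msa = np.array([[0, 1, 2],
                [0, 1, 0],
                [1, 0, 2],
                [1, 0, 0]])
f = site_marginals(msa, q=3)
C = covariation(msa, q=3)
```

In this toy alignment columns 0 and 1 covary perfectly, so `C[0, 1, 0, 1]` is nonzero, while column 2 varies independently of column 0 and `C[0, 2, 0, 2]` vanishes. A generative model in the sense of the abstract would be checked by comparing such marginals between the dataset MSA and a synthetic MSA.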

12.
Nature ; 585(7825): 357-362, 2020 09.
Article in English | MEDLINE | ID: mdl-32939066

ABSTRACT

Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves [1] and in the first imaging of a black hole [2]. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.


Subjects
Computational Biology/methods , Mathematics , Programming Languages , Software Design
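
A few of the array concepts this abstract refers to (vectorized reductions, broadcasting, and boolean-mask indexing) can be shown in a handful of lines. The array contents and variable names below are made up for illustration.

```python
import numpy as np

# A small 2 x 3 array: [[0, 1, 2], [3, 4, 5]].
x = np.arange(6).reshape(2, 3)

# Vectorized reduction: mean of each column, shape (3,).
col_means = x.mean(axis=0)

# Broadcasting: the (3,) vector is subtracted from every row of the (2, 3) array.
centered = x - col_means

# Boolean-mask indexing: select the entries that lie above their column mean.
mask = centered > 0
above_mean = x[mask]
```

No explicit loops are needed: the shapes of the operands determine how the operations are applied elementwise.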
13.
Elife ; 8, 2019 10 08.
Article in English | MEDLINE | ID: mdl-31591964

ABSTRACT

The development of drug resistance in HIV is the result of primary mutations whose effects on viral fitness depend on the entire genetic background, a phenomenon called 'epistasis'. Based on protein sequences derived from drug-experienced patients in the Stanford HIV database, we use a co-evolutionary (Potts) Hamiltonian model to provide direct confirmation of epistasis involving many simultaneous mutations. Building on earlier work, we show that primary mutations leading to drug resistance can become highly favored (or entrenched) by the complex mutation patterns arising in response to drug therapy despite being disfavored in the wild-type background, and provide the first confirmation of entrenchment for all three drug-target proteins: protease, reverse transcriptase, and integrase; a comparative analysis reveals that NNRTI-induced mutations behave differently from the others. We further show that the likelihood of resistance mutations can vary widely in patient populations, and from the population average compared to specific molecular clones.


Subjects
Anti-HIV Agents/pharmacology , Drug Resistance, Viral , Epistasis, Genetic , HIV-1/drug effects , HIV-1/genetics , Human Immunodeficiency Virus Proteins/genetics , Humans , Mutant Proteins/genetics , Mutation
14.
Phys Rev E ; 99(3-1): 032405, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30999494

ABSTRACT

Potts statistical models have become a popular and promising way to analyze mutational covariation in protein multiple sequence alignments (MSAs) in order to understand protein structure, function, and fitness. But the statistical limitations of these models, which can have millions of parameters and are fit to MSAs of only thousands or hundreds of effective sequences using a procedure known as inverse Ising inference, are incompletely understood. In this work we predict how model quality degrades as a function of the number of sequences N, sequence length L, amino-acid alphabet size q, and the degree of conservation of the MSA, in different applications of the Potts models: in "fitness" predictions of individual protein sequences, in predictions of the effects of single-point mutations, in "double mutant cycle" predictions of epistasis, and in 3D contact prediction in protein structure. We show how as MSA depth N decreases an "overfitting" effect occurs such that sequences in the training MSA have overestimated fitness, and we predict the magnitude of this effect and discuss how regularization can help correct for it, using a regularization procedure motivated by statistical analysis of the effects of finite sampling. We find that as N decreases the quality of point-mutation effect predictions degrade least, fitness and epistasis predictions degrade more rapidly, and contact predictions are most affected. However, overfitting becomes negligible for MSA depths of more than a few thousand effective sequences, as often used in practice, and regularization becomes less necessary. We discuss the implications of these results for users of Potts covariation analysis.


Subjects
Models, Molecular , Models, Statistical , Proteins/genetics , Proteins/metabolism , Algorithms , Amino Acid Sequence , Computer Simulation , Models, Genetic , Mutation , Protein Conformation , Proteins/chemistry , Sequence Alignment
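
The "double mutant cycle" prediction named in this abstract can be sketched for a pairwise Potts model as follows. The energy convention (fields plus couplings over pairs i < j), the random toy parameters, and the function names are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Statistical energy: site fields plus couplings over pairs i < j (assumed convention)."""
    length = len(seq)
    e = sum(h[i, seq[i]] for i in range(length))
    e += sum(J[i, j, seq[i], seq[j]]
             for i in range(length) for j in range(i + 1, length))
    return float(e)

def double_mutant_cycle(wt, i, a, j, b, h, J):
    """ddE = E(double) - E(single at i) - E(single at j) + E(wild type); requires i < j."""
    wt = np.asarray(wt)
    m_i = wt.copy()
    m_i[i] = a
    m_j = wt.copy()
    m_j[j] = b
    m_ij = m_i.copy()
    m_ij[j] = b
    return (potts_energy(m_ij, h, J) - potts_energy(m_i, h, J)
            - potts_energy(m_j, h, J) + potts_energy(wt, h, J))

# Random toy parameters (hypothetical): L = 5 positions, q = 4 letters.
rng = np.random.default_rng(0)
length, q = 5, 4
h = rng.normal(size=(length, q))
J = rng.normal(size=(length, length, q, q))

wt = np.array([0, 1, 2, 3, 0])
ddE = double_mutant_cycle(wt, 1, 3, 3, 0, h, J)

# In a pairwise model only the coupling between the two mutated positions survives.
expected = J[1, 3, 3, 0] - J[1, 3, 3, 3] - J[1, 3, 1, 0] + J[1, 3, 1, 3]
```

Because the fields and all couplings to the other positions cancel in the cycle, the result depends only on the coupling block between the two mutated positions, which is why this quantity is a direct probe of pairwise epistasis in such a model.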
15.
Biophys J ; 114(1): 21-31, 2018 01 09.
Article in English | MEDLINE | ID: mdl-29320688

ABSTRACT

The protein kinase catalytic domain is one of the most abundant domains across all branches of life. Although kinases share a common core function of phosphoryl-transfer, they also have wide functional diversity and play varied roles in cell signaling networks, and for this reason are implicated in a number of human diseases. This functional diversity is primarily achieved through sequence variation, and uncovering the sequence-function relationships for the kinase family is a major challenge. In this study we use a statistical inference technique inspired by statistical physics, which builds a coevolutionary "Potts" Hamiltonian model of sequence variation in a protein family. We show how this model has sufficient power to predict the probability of specific subsequences in the highly diverged kinase family, which we verify by comparing the model's predictions with experimental observations in the Uniprot database. We show that the pairwise (residue-residue) interaction terms of the statistical model are necessary and sufficient to capture higher-than-pairwise mutation patterns of natural kinase sequences. We observe that previously identified functional sets of residues have much stronger correlated interaction scores than are typical.


Subjects
Evolution, Molecular , Protein Kinases/chemistry , Protein Kinases/metabolism , Amino Acid Motifs , Monte Carlo Method , Mutation , Probability , Protein Kinases/genetics
16.
Mol Biol Evol ; 34(6): 1291-1306, 2017 06 01.
Article in English | MEDLINE | ID: mdl-28369521

ABSTRACT

Understanding the complex mutation patterns that give rise to drug resistant viral strains provides a foundation for developing more effective treatment strategies for HIV/AIDS. Multiple sequence alignments of drug-experienced HIV-1 protease sequences contain networks of many pair correlations which can be used to build a (Potts) Hamiltonian model of these mutation patterns. Using this Hamiltonian model, we translate HIV-1 protease sequence covariation data into quantitative predictions for the probability of observing specific mutation patterns which are in agreement with the observed sequence statistics. We find that the statistical energies of the Potts model are correlated with the fitness of individual proteins containing therapy-associated mutations as estimated by in vitro measurements of protein stability and viral infectivity. We show that the penalty for acquiring primary resistance mutations depends on the epistatic interactions with the sequence background. Primary mutations which lead to drug resistance can become highly advantageous (or entrenched) by the complex mutation patterns which arise in response to drug therapy despite being destabilizing in the wildtype background. Anticipating epistatic effects is important for the design of future protease inhibitor therapies.


Subjects
Drug Resistance, Viral/genetics , HIV Protease/genetics , Amino Acid Sequence , Computer Simulation , Epistasis, Genetic/genetics , HIV Infections , HIV Protease/metabolism , HIV-1/genetics , Humans , Models, Molecular , Mutation , Sequence Alignment
17.
Curr Opin Struct Biol ; 43: 55-62, 2017 04.
Article in English | MEDLINE | ID: mdl-27870991

ABSTRACT

Potts Hamiltonian models of protein sequence co-variation are statistical models constructed from the pair correlations observed in a multiple sequence alignment (MSA) of a protein family. These models are powerful because they capture higher order correlations induced by mutations evolving under constraints and help quantify the connections between protein sequence, structure, and function maintained through evolution. We review recent work with Potts models to predict protein structure and sequence-dependent conformational free energy landscapes, to survey protein fitness landscapes and to explore the effects of epistasis on fitness. We also comment on the numerical methods used to infer these models for each application.


Subjects
Evolution, Molecular , Models, Molecular , Proteins/genetics , Proteins/metabolism , Epistasis, Genetic , Proteins/chemistry , Thermodynamics
18.
Protein Sci ; 25(8): 1378-84, 2016 08.
Article in English | MEDLINE | ID: mdl-27241634

ABSTRACT

Understanding the conformational propensities of proteins is key to solving many problems in structural biology and biophysics. The co-variation of pairs of mutations contained in multiple sequence alignments of protein families can be used to build a Potts Hamiltonian model of the sequence patterns which accurately predicts structural contacts. This observation paves the way to develop deeper connections between evolutionary fitness landscapes of entire protein families and the corresponding free energy landscapes which determine the conformational propensities of individual proteins. Using statistical energies determined from the Potts model and an alignment of 2896 PDB structures, we predict the propensity for particular kinase family proteins to assume a "DFG-out" conformation implicated in the susceptibility of some kinases to type-II inhibitors, and validate the predictions by comparison with the observed structural propensities of the corresponding proteins and experimental binding affinity data. We decompose the statistical energies to investigate which interactions contribute the most to the conformational preference for particular sequences and the corresponding proteins. We find that interactions involving the activation loop and the C-helix and HRD motif are primarily responsible for stabilizing the DFG-in state. This work illustrates how structural free energy landscapes and fitness landscapes of proteins can be used in an integrated way, and in the context of kinase family proteins, can potentially impact therapeutic design strategies.


Subjects
Mitogen-Activated Protein Kinase 14/antagonists & inhibitors , Oncogene Proteins v-abl/antagonists & inhibitors , Protein Kinase Inhibitors/chemistry , Amino Acid Motifs , Databases, Protein , Humans , Kinetics , Ligands , Mitogen-Activated Protein Kinase 14/chemistry , Models, Molecular , Oncogene Proteins v-abl/chemistry , Protein Binding , Protein Domains , Protein Structure, Secondary , Structural Homology, Protein , Thermodynamics
19.
PLoS Comput Biol ; 10(7): e1003683, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25010228

ABSTRACT

Phenotypic states and evolutionary trajectories available to cell populations are ultimately dictated by complex interactions among DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by interactions between transcription factors (TFs) and their cognate DNA sites. Our study is informed by a comprehensive collection of genomic binding sites and high-throughput in vitro measurements of TF-DNA binding interactions. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding to show that the shape of the inferred fitness functions is in broad agreement with a simple functional form inspired by a thermodynamic model of two-state TF-DNA binding. However, the effective parameters of the model are not always consistent with physical values, indicating selection pressures beyond the biophysical constraints imposed by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, indicating that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience different selection pressures depending on their position in the genome. These findings support the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions.


Subjects
Binding Sites/physiology , DNA , Models, Biological , Transcription Factors , Computational Biology , DNA/chemistry , DNA/metabolism , Protein Binding , Saccharomyces cerevisiae Proteins , Transcription Factors/chemistry , Transcription Factors/metabolism
20.
Theor Popul Biol ; 82(1): 66-76, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22838027

ABSTRACT

Monomorphic loci evolve through a series of substitutions on a fitness landscape. Understanding how mutation, selection, and genetic drift drive this process, and uncovering the structure of the fitness landscape from genomic data are two major goals of evolutionary theory. Population genetics models of the substitution process have traditionally focused on the weak-selection regime, which is accurately described by diffusion theory. Predictions in this regime can be considered universal in the sense that many population models exhibit equivalent behavior in the diffusion limit. However, a growing number of experimental studies suggest that strong selection plays a key role in some systems, and thus there is a need to understand universal properties of models without a priori assumptions about selection strength. Here we study time reversibility in a general substitution model of a monomorphic haploid population. We show that for any time-reversible population model, such as the Moran process, substitution rates obey an exact scaling law. For several other irreversible models, such as the simple Wright-Fisher process and its extensions, the scaling law is accurate up to selection strengths that are well outside the diffusion regime. Time reversibility gives rise to a power-law expression for the steady-state distribution of populations on an arbitrary fitness landscape. The steady-state behavior is dominated by weak selection and is thus adequately described by the diffusion approximation, which guarantees universality of the steady-state formula and its applicability to the problem of reconstructing fitness landscapes from DNA or protein sequence data.


Subjects
Biological Evolution , Models, Theoretical , Population Dynamics
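
The link between time reversibility and a power-law steady state can be checked numerically for the Moran process. The sketch below uses the standard textbook fixation probability of a single mutant in a haploid Moran population, which is an assumption brought in for illustration and not quoted from the abstract. The parameter values and function name are hypothetical.

```python
import math

def moran_fixation_probability(r, n):
    """Fixation probability of one mutant with relative fitness r in a Moran population of size n."""
    if math.isclose(r, 1.0):
        return 1.0 / n                      # neutral limit
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))

n = 20        # population size (hypothetical)
r = 1.1       # fitness of the mutant relative to the resident (hypothetical)

forward = moran_fixation_probability(r, n)          # beneficial substitution
backward = moran_fixation_probability(1.0 / r, n)   # the reverse, deleterious substitution
ratio = forward / backward
```

The ratio of forward to backward fixation probabilities equals r to the power n - 1. With symmetric mutation rates, detailed balance then gives a stationary weight proportional to fitness raised to the power n - 1, which is one concrete instance of a power-law steady-state expression.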