ABSTRACT
Transcription factors are challenging to target with small-molecule inhibitors due to their structural plasticity and lack of catalytic sites. Notable exceptions include naturally ligand-regulated transcription factors, as in our prior work with the hypoxia-inducible factor (HIF)-2 transcription factor, which showed that small-molecule binding within an internal pocket of the HIF-2α Per-Aryl hydrocarbon Receptor Nuclear Translocator (ARNT)-Sim (PAS)-B domain can disrupt its interactions with its dimerization partner, ARNT. Here, we explore the feasibility of targeting small molecules to the analogous ARNT PAS-B domain itself, potentially opening a promising route to modulate several ARNT-mediated signaling pathways. Using solution NMR fragment screening, we previously identified several compounds that bind ARNT PAS-B and, in certain cases, antagonize ARNT association with the transforming acidic coiled-coil containing protein 3 transcriptional coactivator. However, these ligands have only modest binding affinities, complicating characterization of their binding sites. We address this challenge by combining NMR, molecular dynamics simulations, and ensemble docking to identify ligand-binding "hotspots" on and within the ARNT PAS-B domain. Our data indicate that the two ARNT/transforming acidic coiled-coil containing protein 3 inhibitors, KG-548 and KG-655, bind to a β-sheet surface implicated in both HIF-2 dimerization and coactivator recruitment. Furthermore, while KG-548 binds exclusively to the β-sheet surface, KG-655 can additionally bind within a water-accessible internal cavity in ARNT PAS-B. Finally, KG-279, while not a coactivator inhibitor, exemplifies ligands that preferentially bind only to the internal cavity. All three ligands promoted ARNT PAS-B homodimerization, albeit to varying degrees.
Taken together, our findings provide a comprehensive overview of ARNT PAS-B ligand-binding sites and may guide the development of more potent coactivator inhibitors for cellular and functional studies.
Subjects
Aryl Hydrocarbon Receptor Nuclear Translocator, Basic Helix-Loop-Helix Transcription Factors, Aryl Hydrocarbon Receptor Nuclear Translocator/metabolism, Aryl Hydrocarbon Receptor Nuclear Translocator/chemistry, Aryl Hydrocarbon Receptor Nuclear Translocator/antagonists & inhibitors, Humans, Ligands, Binding Sites, Basic Helix-Loop-Helix Transcription Factors/metabolism, Basic Helix-Loop-Helix Transcription Factors/chemistry, Basic Helix-Loop-Helix Transcription Factors/antagonists & inhibitors, Protein Domains, Protein Binding, Protein Multimerization, Small Molecule Libraries/pharmacology, Small Molecule Libraries/chemistry
ABSTRACT
An ongoing challenge to chemists is the analysis of pathways and kinetics for chemical reactions in solution, including transient structures between the reactants and products that are difficult to resolve using laboratory experiments. Here, we enabled direct molecular dynamics simulations of a textbook series of chemical reactions on the hundreds of ns to µs time scale using the weighted ensemble (WE) path sampling strategy with hybrid quantum mechanical/molecular mechanical (QM/MM) models. We focused on azide-clock reactions involving addition of an azide anion to each of three long-lived trityl cations in an acetonitrile-water solvent mixture. Results reveal a two-step mechanism: (1) diffusional collision of reactants to form an ion-pair intermediate; (2) "activation" or rearrangement of the intermediate to the product. Our simulations yield not only reaction rates that are within error of experiment but also rates for individual steps, indicating the activation step as rate-limiting for all three cations. Further, the trend in reaction rates is due to dynamical effects, i.e., differing extents of the azide anion "crawling" along the cation's phenyl-ring "propellers" during the activation step. Our study demonstrates the power of analyzing pathways and kinetics to gain insights on reaction mechanisms, underscoring the value of including WE and other related path sampling strategies in the modern toolbox for chemists.
ABSTRACT
Given the growing interest in path sampling methods for extending the time scales of molecular dynamics (MD) simulations, there has been great interest in software tools that streamline the generation of plots for monitoring the progress of large-scale simulations. Here, we present the WEDAP Python package for simplifying the analysis of data generated from either conventional MD simulations or the weighted ensemble (WE) path sampling method, as implemented in the widely used WESTPA software package. WEDAP (i) facilitates the parsing of WE simulation data stored in highly compressed, hierarchical HDF5 files and (ii) incorporates trajectory weights from WE simulations into all generated plots. Our Python package consists of multiple user-friendly interfaces: a command-line interface, a graphical user interface, and a Python application programming interface. We demonstrate the plotting features of WEDAP through a series of examples using data from WE and conventional MD simulations that focus on the HIV-1 capsid protein's C-terminal domain dimer as a showcase system. The source code for WEDAP is freely available on GitHub at https://github.com/chonglab-pitt/wedap.
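To illustrate what incorporating trajectory weights into a plot means in practice, the following is a minimal standard-library sketch of a weighted probability distribution and the corresponding free-energy-like profile, -ln(P/Pmax). It does not use or reproduce the WEDAP API; the function names `weighted_histogram` and `free_energy` and the sample data are hypothetical and chosen only for illustration.

```python
import math


def weighted_histogram(values, weights, lo, hi, n_bins):
    """Return per-bin probabilities in which each sample contributes its weight.

    Hypothetical helper for illustration; not part of WEDAP or WESTPA.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    width = (hi - lo) / n_bins
    hist = [0.0] * n_bins
    for x, w in zip(values, weights):
        if lo <= x <= hi:
            # Clamp so that x == hi falls into the last bin.
            idx = min(int((x - lo) / width), n_bins - 1)
            hist[idx] += w
    total = sum(hist)
    if total == 0.0:
        raise ValueError("no weight falls inside the requested range")
    return [h / total for h in hist]


def free_energy(probs):
    """Convert probabilities to -ln(P/Pmax), in units of kT; empty bins map to inf."""
    pmax = max(probs)
    return [-math.log(p / pmax) if p > 0.0 else math.inf for p in probs]


# Made-up progress-coordinate values and trajectory weights.
pcoord = [0.1, 0.2, 0.9]
weights = [0.5, 0.25, 0.25]
profile = free_energy(weighted_histogram(pcoord, weights, 0.0, 1.0, 2))
```

For conventional MD data, the same sketch applies with all weights set equal.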
Subjects
Molecular Dynamics Simulation, Software, User-Computer Interface, HIV-1, Capsid Proteins/chemistry
ABSTRACT
The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here, we present the LPATH Python tool, which implements a semiautomated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of the alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
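The three steps described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the LPATH implementation: the one-dimensional state boundaries, the condensing of consecutive repeated states, the use of `difflib.SequenceMatcher` as the similarity score, and the greedy threshold clustering are all stand-ins, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def to_state_string(path, boundaries, labels="ABCDEFGH"):
    """Steps 1 and 2: map each coordinate to a state letter and record visited states.

    Consecutive repeats are condensed (an assumption made for this sketch).
    """
    out = []
    for x in path:
        idx = sum(x >= b for b in boundaries)  # index of the state containing x
        letter = labels[idx]
        if not out or out[-1] != letter:
            out.append(letter)
    return "".join(out)


def similarity(a, b):
    """Step 3: text-string similarity score in [0, 1] (stand-in metric)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster(strings, threshold=0.75):
    """Greedy grouping: a pathway joins the first class whose first member is similar enough."""
    classes = []
    for s in strings:
        for members in classes:
            if similarity(members[0], s) >= threshold:
                members.append(s)
                break
        else:
            classes.append([s])
    return classes


# Made-up pathway along a single coordinate with state boundaries at 1.0 and 2.0.
state_string = to_state_string([0.1, 0.5, 1.2, 1.5, 2.3], [1.0, 2.0])
routes = cluster([state_string, "ABC", "ADC"])
```

In a weighted ensemble setting, the probability of each class would be obtained by summing the weights of its member pathways.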
Subjects
Dipeptides, Molecular Dynamics Simulation, Dipeptides/chemistry, Software, Molecular Conformation, Cluster Analysis
ABSTRACT
Passive permeability of a drug-like molecule is a critical property assayed early in a drug discovery campaign that informs a medicinal chemist how well a compound can traverse biological membranes, such as gastrointestinal epithelial or restrictive organ barriers, so it can perform a specific therapeutic function. However, the challenge that remains is the development of a method, experimental or computational, which can both determine the permeation rate and provide mechanistic insights into the transport process to help with the rational design of any given molecule. Typically, one of the following three methods is used to measure the membrane permeability: (1) experimental permeation assays acting on either artificial or natural membranes; (2) quantitative structure-permeability relationship models that rely on experimental values of permeability or related pharmacokinetic properties of a range of molecules to infer those for new molecules; and (3) estimation of permeability from the Smoluchowski equation, where free energy and diffusion profiles along the membrane normal are taken as input from large-scale molecular dynamics simulations. While all these methods provide estimates of permeation coefficients, they provide very little information for guiding rational drug design. In this study, we employ a highly parallelizable weighted ensemble (WE) path sampling strategy, empowered by cloud computing techniques, to generate unbiased permeation pathways and permeability coefficients for a set of drug-like molecules across a neat 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphatidylcholine membrane bilayer. Our WE method predicts permeability coefficients that compare well to experimental values from an MDCK-LE cell line and PAMPA assays for a set of drug-like amines of varying size, shape, and flexibility. Our method also yields a series of continuous permeation pathways weighted and ranked by their associated probabilities.
Taken together, the ensemble of reactive permeation pathways, along with the estimate of the permeability coefficient, provides a clearer picture of the microscopic underpinnings of small-molecule membrane permeation.
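For method (3), a commonly used form of the Smoluchowski-based estimate is the inhomogeneous solubility-diffusion expression, 1/P = ∫ exp(ΔG(z)/kT) / D(z) dz, integrated along the membrane normal z. The sketch below evaluates it with the trapezoid rule; it assumes that this is the form the abstract refers to, and the function name and input profiles are hypothetical.

```python
import math


def permeability(z, delta_g_kT, diffusion):
    """Estimate P = 1 / integral of exp(dG(z)/kT) / D(z) dz by the trapezoid rule.

    z          : positions along the membrane normal (e.g. cm)
    delta_g_kT : free energy at each position, in units of kT
    diffusion  : local diffusion coefficient at each position (e.g. cm^2/s)
    The result has units of diffusion / z (e.g. cm/s).
    """
    if not (len(z) == len(delta_g_kT) == len(diffusion)):
        raise ValueError("all profiles must have the same length")
    # Local resistance to permeation at each grid point.
    resist = [math.exp(g) / d for g, d in zip(delta_g_kT, diffusion)]
    integral = sum(
        0.5 * (resist[i] + resist[i + 1]) * (z[i + 1] - z[i])
        for i in range(len(z) - 1)
    )
    return 1.0 / integral


# Made-up profiles: a flat landscape versus one with a 2 kT barrier at the center.
flat = permeability([0.0, 1.0, 2.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
barrier = permeability([0.0, 1.0, 2.0], [0.0, 2.0, 0.0], [1.0, 1.0, 1.0])
```

The estimate collapses the whole process into a single number, which is why it offers little mechanistic detail compared with explicit permeation pathways.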
Subjects
Lipid Bilayers, Phosphatidylcholines, Cell Membrane Permeability, Diffusion, Molecular Dynamics Simulation, Permeability
ABSTRACT
We developed force field parameters for fluorinated, aromatic amino acids enabling molecular dynamics (MD) simulations of fluorinated proteins. These parameters are tailored to the AMBER ff15ipq protein force field and enable the modeling of 4-, 5-, 6-, and 7F-tryptophan, 3F- and 3,5F-tyrosine, and 4F- or 4-CF₃-phenylalanine. The parameters include 181 unique atomic charges derived using the implicitly polarized charge (IPolQ) scheme in the presence of SPC/Eb explicit water molecules and 9 unique bond, angle, or torsion terms. Our simulations of benchmark peptides and proteins maintain expected conformational propensities on the µs time scale. In addition, we have developed an open-source Python program to calculate fluorine relaxation rates from MD simulations. The extracted relaxation rates from protein simulations are in good agreement with experimental values determined by ¹⁹F NMR. Collectively, our results illustrate the power and robustness of the IPolQ lineage of force fields for modeling the structure and dynamics of fluorine-containing proteins at the atomic level.
Subjects
Fluorine, Proteins, Aromatic Amino Acids, Molecular Conformation, Molecular Dynamics Simulation, Proteins/chemistry
ABSTRACT
A promising approach for simulating rare events with rigorous kinetics is the weighted ensemble path sampling strategy. One challenge of this strategy is the division of configurational space into bins for sampling. Here we present a minimal adaptive binning (MAB) scheme for the automated, adaptive placement of bins along a progress coordinate within the framework of the weighted ensemble strategy. Results reveal that the MAB binning scheme, despite its simplicity, is more efficient than a manual, fixed binning scheme in generating transitions over large free energy barriers, generating a diversity of pathways, estimating rate constants, and sampling conformations. The scheme is general and extensible to any rare-events sampling strategy that employs progress coordinates.
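The idea of adaptive bin placement along a progress coordinate can be sketched as follows. This is a deliberately simplified illustration, assuming that bins are spaced evenly between the current trailing and leading trajectories at each resampling step; any further treatment of boundary or bottleneck trajectories in the published scheme is omitted, and the function names are hypothetical rather than taken from WESTPA.

```python
def adaptive_bin_edges(pcoords, n_bins):
    """Place n_bins evenly spaced bins between the current min and max progress coordinate."""
    lo, hi = min(pcoords), max(pcoords)
    if lo == hi:
        return [lo, hi]  # all trajectories coincide: a single degenerate bin
    width = (hi - lo) / n_bins
    # Append hi explicitly so the last edge is exact despite rounding.
    return [lo + i * width for i in range(n_bins)] + [hi]


def assign_bins(pcoords, edges):
    """Return the bin index of each trajectory for evenly spaced edges."""
    n = len(edges) - 1
    lo, hi = edges[0], edges[-1]
    span = hi - lo
    indices = []
    for x in pcoords:
        if span == 0:
            indices.append(0)
        else:
            # Clamp so that the leading trajectory lands in the last bin.
            indices.append(min(int((x - lo) / span * n), n - 1))
    return indices


# Made-up progress-coordinate values for three trajectories.
current = [2.0, 4.0, 10.0]
edges = adaptive_bin_edges(current, 4)
occupancy = assign_bins(current, edges)
```

Because the edges are recomputed from the current trajectories at every iteration, the bins follow the ensemble as it advances instead of requiring a manually chosen, fixed grid.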
ABSTRACT
We present the Rate from Event Durations (RED) scheme, a new scheme that more efficiently calculates rate constants using the weighted ensemble path sampling strategy. This scheme enables rate-constant estimation from shorter trajectories by incorporating the probability distribution of event durations, or barrier-crossing times, from a simulation. We have applied the RED scheme to weighted ensemble simulations of a variety of rare-event processes that range in complexity: residue-level simulations of protein conformational switching, atomistic simulations of Na⁺/Cl⁻ association in explicit solvent, and atomistic simulations of protein-protein association in explicit solvent. Rate constants were estimated with up to 50% greater efficiency than the original weighted ensemble scheme. Importantly, our scheme accounts for the systematic error that results from statistical bias toward the observation of events with short durations and reweights the event duration distribution accordingly. The RED scheme is relevant to any simulation strategy that involves unbiased trajectories of similar length to the most probable event duration, including weighted ensemble, milestoning, and standard simulations as well as the construction of Markov state models.
ABSTRACT
We develop a generalizable AI-driven workflow that leverages heterogeneous HPC resources to explore the time-dependent dynamics of molecular systems. We use this workflow to investigate the mechanisms of infectivity of the SARS-CoV-2 spike protein, the main viral infection machinery. Our workflow enables more efficient investigation of spike dynamics in a variety of complex environments, including within a complete SARS-CoV-2 viral envelope simulation, which contains 305 million atoms and shows strong scaling on ORNL Summit using NAMD. We present several novel scientific discoveries, including the elucidation of the spike's full glycan shield, the role of spike glycans in modulating the infectivity of the virus, and the characterization of the flexible interactions between the spike and the human ACE2 receptor. We also demonstrate how AI can accelerate conformational sampling across different systems and pave the way for the future application of such methods to additional studies in SARS-CoV-2 and other molecular systems.
ABSTRACT
We present a new force field, AMBER ff15ipq-m, for simulations of protein mimetics in applications from therapeutics to biomaterials. This force field is an expansion of the AMBER ff15ipq force field that was developed for canonical proteins and enables the modeling of four classes of artificial backbone units that are commonly used alongside natural α residues in blended or "heterogeneous" backbones: chirality-reversed D-α-residues, the Cα-methylated α-residue Aib, homologated β-residues (β3) bearing proteinogenic side chains, and two cyclic β residues (βcyc; APC and ACPC). The ff15ipq-m force field includes 472 unique atomic charges and 148 unique torsion terms. Consistent with the AMBER IPolQ lineage of force fields, the charges were derived using the Implicitly Polarized Charge (IPolQ) scheme in the presence of explicit solvent. To our knowledge, no general force field reported to date models the combination of artificial building blocks examined here. In addition, we have derived Karplus coefficients for the calculation of backbone amide J-coupling constants for β3Ala and ACPC β residues. The AMBER ff15ipq-m force field reproduces experimentally observed J-coupling constants in simple tetrapeptides and maintains the expected conformational propensities in reported structures of proteins/peptides containing the artificial building blocks of interest, all on the µs timescale. These encouraging results demonstrate the power and robustness of the IPolQ lineage of force fields in modeling the structure and dynamics of natural proteins as well as mimetics with protein-inspired artificial backbones in atomic detail.
ABSTRACT
Multidomain proteins with two or more independently folded functional domains are prevalent in nature. Whereas most multidomain proteins are linked linearly in sequence, roughly one-tenth possess domain insertions where a guest domain is implanted into a loop of a host domain, such that the two domains are connected by a pair of interdomain linkers. Here, we characterized the influence of the interdomain linkers on the structure and dynamics of a domain-insertion protein in which the guest LysM domain is inserted into a central loop of the host CVNH domain. Expanding upon our previous crystallographic and NMR studies, we applied SAXS in combination with NMR paramagnetic relaxation enhancement to construct a structural model of the overall two-domain system. Although the two domains have no fixed relative orientation, certain orientations were found to be preferred over others. We also assessed the accuracies of molecular mechanics force fields in modeling the structure and dynamics of tethered multidomain proteins by integrating our experimental results with microsecond-scale atomistic molecular dynamics simulations. In particular, our evaluation of two different combinations of the latest force fields and water models revealed that both combinations accurately reproduce certain structural and dynamical properties, but are inaccurate for others. Overall, our study illustrates the value of integrating experimental NMR and SAXS studies with long timescale atomistic simulations for characterizing structural ensembles of flexibly linked multidomain systems.
Subjects
Fungal Proteins/chemistry, Fungal Proteins/metabolism, Magnaporthe/metabolism, Multiprotein Complexes/chemistry, Multiprotein Complexes/metabolism, Biomolecular Nuclear Magnetic Resonance/methods, Small-Angle Scattering, Molecular Models, Molecular Dynamics Simulation, Protein Conformation, Protein Domains, X-Ray Diffraction
ABSTRACT
The ff15ipq protein force field is a fixed charge model built by automated tools based on the two charge sets of the implicitly polarized charge method: one set (appropriate for vacuum) for deriving bonded parameters and the other (appropriate for aqueous solution) for running simulations. The duality is intended to treat water-induced electronic polarization with an understanding that fitting data for bonded parameters will come from quantum mechanical calculations in the gas phase. In this study, we compare ff15ipq to two alternatives produced with the same fitting software and a further expanded data set but following more conventional methods for tailoring bonded parameters (harmonic angle terms and torsion potentials) to the charge model. First, ff15ipq-Qsolv derives bonded parameters in the context of the ff15ipq solution phase charge set. Second, ff15ipq-Vac takes ff15ipq's bonded parameters and runs simulations with the vacuum phase charge set used to derive those parameters. The IPolQ charge model and associated protocol for deriving bonded parameters are shown to be an incremental improvement over protocols that do not account for the material phases of each source of their fitting data. Both force fields incorporating the polarized charge set depict stable globular proteins and have varying degrees of success modeling the metastability of short (5-19 residues) peptides. In this particular case, ff15ipq-Qsolv increases stability in a number of α-helices, correctly obtaining 70% helical character in the K19 system at 275 K and showing appropriately diminishing content up to 325 K, but overestimating the helical fraction of AAQAA3 by 50% or more, forming long-lived α-helices in simulations of a β-hairpin, and increasing the likelihood that the disordered p53 N-terminal peptide will also form a helix.
This may indicate a systematic bias imparted by the ff15ipq-Qsolv parameter development strategy, which has the hallmarks of strategies used to develop other popular force fields, and may explain some of the need for manual corrections in these force fields' evolution. In contrast, ff15ipq-Vac incorrectly depicts globular protein unfolding in numerous systems tested, including Trp cage, villin, lysozyme, and GB3, and does not perform any better than ff15ipq or ff15ipq-Qsolv in tests on short peptides. We analyze the free energy surfaces of individual amino acid dipeptides and the electrostatic potential energy surfaces of each charge model to explain the differences.
Subjects
Oligopeptides/chemistry, Proteins/chemistry, Molecular Dynamics Simulation, Biomolecular Nuclear Magnetic Resonance, Protein Secondary Structure, Thermodynamics
ABSTRACT
Given the growing interest in path sampling methods for extending the timescales of molecular dynamics (MD) simulations, there has been great interest in software tools that streamline the generation of plots for monitoring the progress of large-scale simulations. Here, we present the WEDAP Python package for simplifying the analysis of data generated from either conventional MD simulations or the weighted ensemble (WE) path sampling method, as implemented in the widely used WESTPA software package. WEDAP (i) facilitates the parsing of WE simulation data stored in highly compressed, hierarchical HDF5 files and (ii) incorporates trajectory weights from WE simulations into all generated plots. Our Python package consists of multiple user-friendly interfaces: a command-line interface, a graphical user interface, and a Python application programming interface. We demonstrate the plotting features of WEDAP through a series of examples using data from WE and conventional MD simulations that focus on the HIV-1 capsid protein C-terminal domain dimer as a showcase system. The source code for WEDAP is freely available on GitHub at https://github.com/chonglab-pitt/wedap.
ABSTRACT
Despite the power of path sampling strategies in enabling simulations of rare events, such strategies have not reached their full potential. A common challenge that remains is the identification of a progress coordinate that captures the slow relevant motions of a rare event. Here we have developed a weighted ensemble (WE) path sampling strategy that exploits reinforcement learning to automatically identify an effective progress coordinate among a set of potential coordinates during a simulation. We apply our WE strategy with reinforcement learning to three benchmark systems: (i) an egg carton-shaped toy potential, (ii) an S-shaped toy potential, and (iii) a dimer of the HIV-1 capsid protein (C-terminal domain). To enable rapid testing of the latter system at the atomic level, we employed discrete-state synthetic molecular dynamics trajectories using a generative, fine-grained Markov state model that was based on extensive conventional simulations. Our results demonstrate that using concepts from reinforcement learning with a weighted ensemble of trajectories automatically identifies relevant progress coordinates among multiple candidates at a given time during a simulation. Due to the rigorous weighting of trajectories, the simulations maintain rigorous kinetics.
ABSTRACT
Sequence-encoded protein folding is a ubiquitous biological process that has been successfully engineered in a range of oligomeric molecules with artificial backbone chemical connectivity. A remarkable aspect of protein folding is the contrast between the rapid rates at which most sequences in nature fold and the vast number of conformational states possible in an unfolded chain with hundreds of rotatable bonds. Research efforts spanning several decades have sought to elucidate the fundamental chemical principles that dictate the speed and mechanism of natural protein folding. In contrast, little is known about how protein mimetic entities transition between an unfolded and folded state. Here, we report effects of altered backbone connectivity on the folding kinetics and mechanism of the B domain of Staphylococcal protein A (BdpA), an ultrafast-folding sequence. A combination of experimental biophysical analysis and atomistic molecular dynamics simulations performed on the prototype protein and several heterogeneous-backbone variants reveals the interplay among backbone flexibility, folding rates, and structural details of the transition state ensemble. Collectively, these findings suggest a significant degree of plasticity in the mechanisms that can give rise to ultrafast folding in the BdpA sequence and provide atomic level insights into how protein mimetic chains adopt an ordered folded state.
ABSTRACT
While transcription factors have been generally perceived as "undruggable," an exception is the HIF-2 hypoxia-inducible transcription factor, which contains an internal cavity that is sufficiently large to accommodate a range of small molecules, including the therapeutically used inhibitor belzutifan. Given the relatively long ligand residence times of these small molecules and the lack of any experimentally observed pathway connecting the cavity to solvent, there has been great interest in understanding how these drug ligands exit the buried receptor cavity. Here, we focus on the relevant PAS-B domain of hypoxia-inducible factor 2α (HIF-2α) and examine how one such small molecule (THS-017) exits from the buried cavity within this domain on the seconds timescale using atomistic simulations and ZZ-exchange NMR. To enable the simulations, we applied the weighted ensemble path sampling strategy, which generates continuous pathways for a rare-event process [e.g., ligand (un)binding] with rigorous kinetics in orders of magnitude less computing time compared to conventional simulations. Results reveal the formation of an encounter complex intermediate and two distinct classes of pathways for ligand exit. Based on these pathways, we identified two pairs of conformational gating residues in the receptor: one for the major class (N288 and S304) and another for the minor class (L272 and M309). ZZ-exchange NMR validated the kinetic importance of N288 for ligand unbinding. Our results provide an ideal simulation dataset for rational manipulation of ligand unbinding kinetics.
Subjects
Basic Helix-Loop-Helix Transcription Factors, Basic Helix-Loop-Helix Transcription Factors/chemistry, Basic Helix-Loop-Helix Transcription Factors/metabolism, Ligands, Kinetics, Humans, Molecular Dynamics Simulation, Protein Binding
ABSTRACT
The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here we present the LPATH Python tool, which implements a semi-automated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
ABSTRACT
Transcription factors are generally challenging to target with small-molecule inhibitors due to their structural plasticity and lack of catalytic sites. Notable exceptions to this include a number of transcription factors that are naturally ligand-regulated, a strategy we have successfully exploited with the heterodimeric HIF-2 transcription factor, showing that a ligand-binding internal pocket in the HIF-2α PAS-B domain could be utilized to disrupt its dimerization with its partner, ARNT. Here, we explore the feasibility of directly targeting small molecules to the structurally similar ARNT PAS-B domain, potentially opening a promising route to simultaneously modulate several ARNT-mediated signaling pathways. Using solution NMR screening of an in-house fragment library, we previously identified several compounds that bind ARNT PAS-B and, in certain cases, antagonize ARNT association with the TACC3 transcriptional coactivator. However, these ligands only have mid-micromolar binding affinities, complicating characterization of their binding sites. Here we combine NMR, MD simulations, and ensemble docking to identify ligand-binding 'hotspots' on and within the ARNT PAS-B domain. Our data indicate that the two ARNT/TACC3 inhibitors, KG-548 and KG-655, bind to a β-sheet surface implicated in both HIF-2 dimerization and coactivator recruitment. Furthermore, KG-548 binds exclusively to the β-sheet surface, while KG-655 binds to the same site but can also enter a water-accessible internal cavity in ARNT PAS-B. Finally, KG-279, while not a coactivator inhibitor, exemplifies ligands that preferentially bind only to the internal cavity. Taken together, our findings provide a comprehensive overview of ARNT PAS-B ligand-binding sites and may guide the development of more potent coactivator inhibitors for cellular and functional studies.
ABSTRACT
The weighted ensemble (WE) strategy has been demonstrated to be highly efficient in generating pathways and rate constants for rare events such as protein folding and protein binding using atomistic molecular dynamics simulations. Here we present two sets of tutorials instructing users in the best practices for preparing, carrying out, and analyzing WE simulations for various applications using the WESTPA software. The first set of more basic tutorials describes a range of simulation types, from a molecular association process in explicit solvent to more complex processes such as host-guest association, peptide conformational sampling, and protein folding. The second set encompasses six advanced tutorials instructing users in the best practices of using key new features and plugins/extensions of the WESTPA 2.0 software package, which consists of major upgrades for larger systems and/or slower processes. The advanced tutorials demonstrate the use of the following key features: (i) a generalized resampler module for the creation of "binless" schemes, (ii) a minimal adaptive binning scheme for more efficient surmounting of free energy barriers, (iii) streamlined handling of large simulation datasets using an HDF5 framework, (iv) two different schemes for more efficient rate-constant estimation, (v) a Python API for simplified analysis of WE simulations, and (vi) plugins/extensions for Markovian Weighted Ensemble Milestoning and WE rule-based modeling for systems biology models. Applications of the advanced tutorials include atomistic and non-spatial models, and consist of complex processes such as protein folding and the membrane permeability of a drug-like molecule. Users are expected to already have significant experience with running conventional molecular dynamics or systems biology simulations.
ABSTRACT
The dimerization domain of the yeast transcription factor GCN4, one of the first coiled-coil proteins to be structurally characterized at high resolution, has served as the basis for numerous fundamental studies on α-helical folding. Mutations in the GCN4 leucine zipper are known to change its preferred oligomerization state from dimeric to trimeric or tetrameric; however, the wild-type sequence has been assumed to encode a two-chain assembly exclusively. Here we demonstrate that the GCN4 coiled-coil domain can populate either a dimer or trimer fold, depending on environment. We report high-resolution crystal structures of the wild-type sequence in dimeric and trimeric assemblies. Biophysical measurements suggest populations of both oligomerization states under certain experimental conditions in solution. We use parallel tempering molecular dynamics simulations on the microsecond time scale to compare the stability of the dimer and trimer folded states in isolation. In total, our results suggest that the folding behavior of the well-studied GCN4 leucine-zipper domain is more complex than was previously appreciated. Our results have implications in ongoing efforts to establish predictive algorithms for coiled-coil folds and the selection of coiled-coil model systems for design and mutational studies where oligomerization state specificity is an important consideration.