ABSTRACT
High-throughput computational materials discovery has promised significant acceleration of the design and discovery of new materials for many years. Despite a surge in interest and activity, the constraints imposed by large-scale computational resources present a significant bottleneck. Furthermore, examples of very large-scale computational discovery carried out through experimental validation remain scarce, especially for materials with product applicability. Here, we demonstrate how this vision became reality by combining state-of-the-art machine learning (ML) models and traditional physics-based models on cloud high-performance computing (HPC) resources to quickly navigate through more than 32 million candidates and predict around half a million potentially stable materials. By focusing on solid-state electrolytes for battery applications, our discovery pipeline further identified 18 promising candidates with new compositions and rediscovered a decade's worth of collective knowledge in the field as a byproduct. We then synthesized and experimentally characterized the structures and conductivities of our top candidates, the Na(x)Li(3-x)YCl(6) (0 ≤ x ≤ 3) series, demonstrating the potential of these compounds to serve as solid electrolytes. Additional candidate materials that are currently under experimental investigation could offer more examples of the computational discovery of new phases of Li- and Na-conducting solid electrolytes. The showcased screening of millions of materials candidates highlights the transformative potential of advanced ML and HPC methodologies, propelling materials discovery into a new era of efficiency and innovation.
ABSTRACT
We present a data-driven approach to determine the memory kernel and random noise in generalized Langevin equations. To facilitate practical implementations, we parameterize the kernel function in the Laplace domain by a rational function, with coefficients directly linked to the equilibrium statistics of the coarse-grain variables. We show that such an approximation can be constructed to arbitrarily high order and the resulting generalized Langevin dynamics can be embedded in an extended stochastic model without explicit memory. We demonstrate how to introduce the stochastic noise so that the second fluctuation-dissipation theorem is exactly satisfied. Results from several numerical tests are presented to demonstrate the effectiveness of the proposed method.
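The embedding idea described above can be illustrated with the lowest-order case. The following is a minimal sketch, not the authors' implementation: it assumes unit mass, no mean force, and a first-order rational kernel in the Laplace domain, Theta(lambda) = c / (lambda + 1/tau), which corresponds to theta(t) = c exp(-t/tau). The parameter values, the function name, and the Euler-Maruyama integrator are all illustrative assumptions. The auxiliary variable z replaces the memory integral plus the random force, and its noise amplitude and initial variance are chosen so that the second fluctuation-dissipation theorem holds for this kernel.

```python
import numpy as np

def simulate_extended_gle(c=1.0, tau=1.0, kT=1.0, dt=0.01, n_steps=2000,
                          n_traj=4000, seed=0):
    """Euler-Maruyama integration of the memoryless extended system

        dv = z dt
        dz = (-c v - z / tau) dt + sqrt(2 kT c / tau) dW

    which is equivalent to a GLE with kernel theta(t) = c exp(-t / tau)
    and coloured noise R(t) obeying <R(t) R(0)> = kT theta(t).
    All parameter values here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    v = np.zeros(n_traj)                           # velocities start out of equilibrium
    z = rng.normal(0.0, np.sqrt(kT * c), n_traj)   # z(0) = R(0), variance kT * theta(0)
    sigma = np.sqrt(2.0 * kT * c / tau)            # noise amplitude fixed by the FDT
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), n_traj)
        v_new = v + z * dt
        z = z + (-c * v - z / tau) * dt + sigma * dW
        v = v_new
    return v, z

v, z = simulate_extended_gle()
# With the FDT satisfied, the velocity variance should relax towards kT.
print("var(v) =", v.var(), " var(z) =", z.var())
```

Higher-order rational approximations would add more auxiliary variables in the same way; only the first-order case is shown here.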
ABSTRACT
The challenge of quantifying uncertainty propagation in real-world systems is rooted in the high-dimensionality of the stochastic input and the frequent lack of explicit knowledge of its probability distribution. Traditional approaches show limitations for such problems, especially when the size of the training data is limited. To address these difficulties, we have developed a general framework of constructing surrogate models on spaces of stochastic input with arbitrary probability measure irrespective of the mutual dependencies between individual components of the random inputs and the analytical form. The present Data-driven Sparsity-enhancing Rotation for Arbitrary Randomness (DSRAR) framework includes a data-driven construction of multivariate polynomial basis for arbitrary mutually dependent probability measures and a sparsity enhancement rotation procedure. This sparsity-enhancing rotation method was initially proposed in our previous work [1] for Gaussian density distributions, which may not be feasible for non-Gaussian distributions due to the loss of orthogonality after the rotation. To remedy such difficulties, we developed a new data-driven approach to construct orthonormal polynomials for arbitrary mutually dependent randomness, ensuring the constructed basis maintains the orthogonality/near-orthogonality with respect to the density of the rotated random vector, where directly applying the regular polynomial chaos including arbitrary polynomial chaos (aPC) [2] shows limitations due to the assumption of the mutual independence between the components of the random inputs. The developed DSRAR framework leads to accurate recovery, with only limited training data, of a sparse representation of the target functions. 
The effectiveness of our method is demonstrated in challenging problems such as partial differential equations and realistic molecular systems within high-dimensional (O(10)) conformational spaces where the underlying density is implicitly represented by a large collection of sample data, as well as systems with explicitly given non-Gaussian probabilistic measures.
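One generic way to build polynomials that are orthonormal with respect to a density known only through samples is to orthogonalize a monomial basis against the empirical moment matrix. The sketch below illustrates that single step under stated assumptions; it is not the DSRAR algorithm itself and omits the rotation and sparse-recovery stages. The function names, the test distribution, and the use of a Cholesky factorization (equivalent to Gram-Schmidt in the chosen monomial ordering) are illustrative choices.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_features(X, degree):
    """All monomials of total degree <= `degree` in the columns of X."""
    n, d = X.shape
    cols = [np.ones(n)]
    for k in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), k):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

def orthonormal_basis(X, degree):
    """Return basis values Psi and the triangular map T with Psi = Phi @ T,
    such that Psi.T @ Psi / N is the identity on the given samples."""
    Phi = monomial_features(X, degree)
    G = Phi.T @ Phi / len(X)       # empirical Gram (moment) matrix
    L = np.linalg.cholesky(G)      # G = L L^T
    T = np.linalg.inv(L).T         # upper triangular change of basis
    return Phi @ T, T

# Hypothetical dependent, non-Gaussian input: x2 depends nonlinearly on x1.
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=(5000, 2))
X = np.column_stack([u[:, 0], 0.5 * u[:, 0] ** 2 + u[:, 1]])

Psi, T = orthonormal_basis(X, degree=2)
gram = Psi.T @ Psi / len(X)
print(np.round(gram, 6))           # close to the identity matrix
```

Because the basis is built from the samples directly, no independence assumption between the input components is needed, which is the property the abstract emphasizes.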
ABSTRACT
We present the open source distributed software package Poisson-Boltzmann Analytical Method (PB-AM), a fully analytical solution to the linearized PB equation, for molecules represented as non-overlapping spherical cavities. The PB-AM software package includes the generation of output files appropriate for visualization using visual molecular dynamics, a Brownian dynamics scheme that uses periodic boundary conditions to simulate dynamics, the ability to specify docking criteria, and two different kinetics schemes to evaluate biomolecular association rate constants. Given that PB-AM defines mutual polarization completely and accurately, it can be refactored as a many-body expansion to explore 2- and 3-body polarization. Additionally, the software has been integrated into the Adaptive Poisson-Boltzmann Solver (APBS) software package to make it more accessible to a larger group of scientists, educators, and students who are more familiar with the APBS framework. © 2016 Wiley Periodicals, Inc.
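For orientation, the simplest case of an analytical solution to the linearized PB equation is a single sphere with a central charge, which gives the classical Debye-Hückel screened potential. The sketch below evaluates that textbook formula; it does not use the PB-AM or APBS code or their APIs, and the function names, default dielectric constant, temperature, and example values are assumptions chosen for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K

def bjerrum_length_angstrom(eps_r=78.5, temperature=298.15):
    """Bjerrum length l_B = e^2 / (4 pi eps0 eps_r kT), in angstroms."""
    lb_m = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps_r * K_B * temperature)
    return lb_m * 1e10

def reduced_potential(r, charge, radius, kappa, eps_r=78.5, temperature=298.15):
    """Dimensionless potential e*phi/kT outside a sphere with a central charge.

    r, radius in angstroms; charge in units of e; kappa (inverse Debye
    length) in 1/angstrom. Valid only for r >= radius.
    """
    if r < radius:
        raise ValueError("formula applies outside the sphere only")
    lb = bjerrum_length_angstrom(eps_r, temperature)
    return charge * lb * math.exp(-kappa * (r - radius)) / ((1.0 + kappa * radius) * r)

lb = bjerrum_length_angstrom()
coulomb = reduced_potential(10.0, 1.0, 2.0, 0.0)    # no salt: plain Coulomb
screened = reduced_potential(10.0, 1.0, 2.0, 0.1)   # kappa roughly that of a 0.1 M 1:1 salt
print(lb, coulomb, screened)
```

For several interacting spheres the mutual polarization terms no longer vanish, which is the regime the package addresses.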
Subjects
Molecular Dynamics Simulation, Proteins/chemistry, Software, Algorithms, Kinetics, Static Electricity
ABSTRACT
The ionic atmospheres around nucleic acids play important roles in biological function. Large-scale explicit solvent simulations coupled to experimental assays such as anomalous small-angle x-ray scattering can provide important insights into the structure and energetics of such atmospheres but are time- and resource-intensive. In this article, we use classical density functional theory to explore the balance among ion-DNA, ion-water, and ion-ion interactions in ionic atmospheres of RbCl, SrCl2, and CoHexCl3 (cobalt hexamine chloride) around a B-form DNA molecule. The accuracy of the classical density functional theory calculations was assessed by comparison between simulated and experimental anomalous small-angle x-ray scattering curves, demonstrating that an accurate model should take into account ion-ion correlation and ion hydration forces, DNA topology, and the discrete distribution of charges on the DNA backbone. As expected, these calculations revealed significant differences among monovalent, divalent, and trivalent cation distributions around DNA. Approximately half of the DNA-bound Rb(+) ions penetrate into the minor groove of the DNA and half adsorb on the DNA backbone. The fraction of cations in the minor groove decreases for the larger Sr(2+) ions and becomes zero for CoHex(3+) ions, which all adsorb on the DNA backbone. The distribution of CoHex(3+) ions is mainly determined by Coulomb and steric interactions, while ion-correlation forces play a central role in the monovalent Rb(+) distribution and a combination of ion-correlation and hydration forces affects the Sr(2+) distribution around DNA. This does not imply that correlations in CoHex solutions are weaker or stronger than for other ions. Steric inaccessibility of the grooves to large CoHex ions leads to their binding at the DNA surface.
In this binding mode, first-order electrostatic interactions (Coulomb) dominate the overall binding energy as evidenced by low sensitivity of ionic distribution to the presence or absence of second-order electrostatic correlation interactions.
Subjects
Cobalt/chemistry, B-Form DNA/chemistry, Rubidium/chemistry, Strontium/chemistry, Static Electricity
ABSTRACT
We present a semi-quantitative model of condensation of short nucleic acid (NA) duplexes induced by trivalent cobalt(III) hexammine (CoHex) ions. The model is based on partitioning of the bound counterion distribution around a single NA duplex into "external" and "internal" ion binding shells distinguished by their proximity to the duplex helical axis. In the aggregated phase the shells overlap, which leads to significantly increased attraction of CoHex ions in these overlaps with the neighboring duplexes. The duplex aggregation free energy is decomposed into attractive and repulsive components in such a way that they can be represented by simple analytical expressions with parameters derived from molecular dynamics simulations and numerical solutions of the Poisson equation. The attractive term depends on the fractions of bound ions in the overlapping shells and the affinity of CoHex to the "external" shell of the nearly neutralized duplex. The repulsive components of the free energy are the duplex configurational entropy loss upon aggregation and the electrostatic repulsion of the duplexes that remains after neutralization by bound CoHex ions. The estimates of the aggregation free energy are consistent with the experimental range of NA duplex condensation propensities, including the unusually poor condensation of RNA structures and subtle sequence effects upon DNA condensation. The model predicts that, in contrast to DNA, RNA duplexes may condense into tighter packed aggregates with a higher degree of duplex neutralization. An appreciable CoHex-mediated RNA-RNA attraction requires closer inter-duplex separation to engage CoHex ions (bound mostly in the "internal" shell of RNA) into short-range attractive interactions. The model also predicts that longer NA fragments will condense more readily than shorter ones. The ability of this model to explain experimentally observed trends in NA condensation lends support to the proposed NA condensation picture based on the multivalent "ion binding shells."
Subjects
Cobalt/chemistry, DNA/chemistry, RNA/chemistry, Chemical Models, Molecular Dynamics Simulation
ABSTRACT
An understanding of molecular interactions is essential for insight into biological systems at the molecular scale. Among the various components of molecular interactions, electrostatics are of special importance because of their long-range nature and their influence on polar or charged molecules, including water, aqueous ions, proteins, nucleic acids, carbohydrates, and membrane lipids. In particular, robust models of electrostatic interactions are essential for understanding the solvation properties of biomolecules and the effects of solvation upon biomolecular folding, binding, enzyme catalysis, and dynamics. Electrostatics, therefore, are of central importance to understanding biomolecular structure and modeling interactions within and among biological molecules. This review discusses the solvation of biomolecules with a computational biophysics view toward describing the phenomenon. While our main focus lies on the computational aspect of the models, we provide an overview of the basic elements of biomolecular solvation (e.g. solvent structure, polarization, ion binding, and non-polar behavior) in order to provide a background to understand the different types of solvation models.
Subjects
Macromolecular Substances/chemistry, Solvents/chemistry, Static Electricity, Molecular Models, Quantum Theory, Water/chemistry
ABSTRACT
Side-chain oxysterols, such as 25-hydroxycholesterol (25-HC), are key regulators of cholesterol homeostasis. New evidence suggests that the alteration of membrane structure by 25-HC contributes to its regulatory effects. We have examined the role of oxysterol membrane effects on cholesterol accessibility within the membrane using perfringolysin O (PFO), a cholesterol-dependent cytolysin that selectively binds accessible cholesterol, as a sensor of membrane cholesterol accessibility. We show that 25-HC increases cholesterol accessibility in a manner dependent on the membrane lipid composition. Structural analysis of molecular dynamics simulations reveals that increased cholesterol accessibility is associated with membrane thinning, and that the effects of 25-HC on cholesterol accessibility are driven by these changes in membrane thickness. Further, we find that the 25-HC antagonist LY295427 (agisterol) abrogates the membrane effects of 25-HC in a nonenantioselective manner, suggesting that agisterol antagonizes the cholesterol-homeostatic effects of 25-HC indirectly through its membrane interactions. These studies demonstrate that oxysterols regulate cholesterol accessibility, and thus the availability of cholesterol to be sensed and transported throughout the cell, by modulating the membrane environment. This work provides new insights into how alterations in membrane structure can be used to relay cholesterol regulatory signals.
Subjects
Cell Membrane/drug effects, Cholesterol/chemistry, Bacterial Toxins/pharmacology, Cholestanols/pharmacology, Cholesterol/metabolism, Hemolysin Proteins/pharmacology, Homeostasis/drug effects, Hydroxycholesterols/pharmacology, Liposomes/metabolism, Membrane Lipids/chemistry, Molecular Dynamics Simulation, Structure-Activity Relationship
ABSTRACT
This article investigates an ensemble-based technique called Bayesian Model Averaging (BMA) to improve the performance of protein amino acid pKa predictions. Structure-based pKa calculations play an important role in the mechanistic interpretation of protein structure and are also used to determine a wide range of protein properties. A diverse set of methods currently exists for pKa prediction, ranging from empirical statistical models to ab initio quantum mechanical approaches. However, each of these methods is based on a set of conceptual assumptions that can affect a model's accuracy and generalizability for pKa prediction in complicated biomolecular systems. We use BMA to combine eleven diverse prediction methods that each estimate pKa values of amino acids in staphylococcal nuclease. These methods are based on work conducted for the pKa Cooperative and the pKa measurements are based on experimental work conducted by the García-Moreno lab. Our cross-validation study demonstrates that the aggregated estimate obtained from BMA outperforms all individual prediction methods with improvements ranging from 45 to 73% over other method classes. This study also compares BMA's predictive performance to other ensemble-based techniques and demonstrates that BMA can outperform these approaches with improvements ranging from 27 to 60%. This work illustrates a new possible mechanism for improving the accuracy of pKa prediction and lays the foundation for future work on aggregate models that balance computational cost with prediction accuracy.
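To make the averaging step concrete, the sketch below fits a simple ensemble BMA model: the predictive density is a weighted mixture of Gaussians centred on each method's prediction, with weights and a shared variance estimated by expectation-maximization. This is a generic formulation assumed for illustration, and the abstract does not state which BMA variant the study used. The data are synthetic (three hypothetical methods with different noise levels), not values from the pKa Cooperative, and the function name is invented.

```python
import numpy as np

def fit_bma(preds, y, n_iter=200):
    """EM fit of mixture weights w and shared variance sigma2 for the model
    p(y_i) = sum_k w_k * Normal(y_i; preds[i, k], sigma2)."""
    n, k = preds.shape
    w = np.full(k, 1.0 / k)
    sq_err = (preds - y[:, None]) ** 2
    sigma2 = sq_err.mean()
    for _ in range(n_iter):
        # E-step: responsibility of each method for each observation.
        # The Gaussian normalising constant is shared and cancels.
        log_dens = -0.5 * sq_err / sigma2
        resp = w * np.exp(log_dens - log_dens.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights and the shared variance.
        w = resp.mean(axis=0)
        sigma2 = (resp * sq_err).sum() / n
    return w, sigma2

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-in data: 80 "measured" values and three hypothetical methods.
rng = np.random.default_rng(1)
observed = rng.normal(6.0, 2.0, 80)
noise_sd = [0.3, 1.0, 2.0]
preds = np.column_stack([observed + rng.normal(0.0, s, 80) for s in noise_sd])

w, sigma2 = fit_bma(preds, observed)
bma_pred = preds @ w               # BMA point estimate: weighted mean prediction
print("weights:", np.round(w, 3))
print("RMSE per method:", [round(rmse(preds[:, j], observed), 3) for j in range(3)])
print("RMSE of BMA:", round(rmse(bma_pred, observed), 3))
```

In practice the weights would be fitted on training residues and evaluated on held-out ones, as in the cross-validation described above; this sketch fits and evaluates on the same synthetic data for brevity.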
Subjects
Bayes Theorem, Computational Biology/methods, Proteins/chemistry, Proteins/metabolism, Amino Acid Sequence, Statistical Models
ABSTRACT
Nanoparticles offer new options for medical diagnosis and therapeutics with their capacity to specifically target cells and tissues with imaging agents and/or drug payloads. The unique physical aspects of nanoparticles present new challenges for this promising technology. Studies indicate that nanoparticles often elicit moderate to severe complement activation. Using human in vitro assays that corroborated the mouse in vivo results, we previously presented mechanistic studies that define the pathway and key components involved in modulating complement interactions with several gadolinium-functionalized perfluorocarbon nanoparticles (PFOB). Here we employ a modified in vitro hemolysis-based assay developed in conjunction with the mouse in vivo model to broaden our analysis to include PFOBs of varying size, charge and surface chemistry and examine the variations in nanoparticle-mediated complement activity between individuals. This approach may provide the tools for an in-depth structure-activity relationship study that will guide the eventual development of biocompatible nanoparticles. FROM THE CLINICAL EDITOR: Unique physical aspects of nanoparticles may lead to moderate to severe complement activation in vivo, which represents a challenge to clinical applicability. In order to guide the eventual development of biocompatible nanoparticles, this team of authors reports a modified in vitro hemolysis-based assay developed in conjunction with their previously presented mouse model to enable in-depth structure-activity relationship studies.
Subjects
Complement Activation/drug effects, Fluorocarbons/immunology, Hemolysis/drug effects, Nanoparticles/metabolism, Animals, Fluorocarbons/chemistry, Humans, Mice, Inbred C57BL Mice, Nanoparticles/chemistry, Particle Size
ABSTRACT
Although the majority of free cellular cholesterol is present in the plasma membrane, cholesterol homeostasis is principally regulated through sterol-sensing proteins that reside in the cholesterol-poor endoplasmic reticulum (ER). In response to acute cholesterol loading or depletion, there is rapid equilibration between the ER and plasma membrane cholesterol pools, suggesting a biophysical model in which the availability of plasma membrane cholesterol for trafficking to internal membranes modulates ER membrane behavior. Previous studies have predominantly examined cholesterol availability in terms of binding to extramembrane acceptors, but have provided limited insight into the structural changes underlying cholesterol activation. In this study, we use both molecular dynamics simulations and experimental membrane systems to examine the behavior of cholesterol in membrane bilayers. We find that cholesterol depth within the bilayer provides a reasonable structural metric for cholesterol availability and that this is correlated with cholesterol-acceptor binding. Further, the distribution of cholesterol availability in our simulations is continuous rather than divided into distinct available and unavailable pools. These data provide support for a revised cholesterol activation model in which activation is driven not by saturation of membrane-cholesterol interactions but rather by bulk membrane remodeling that reduces membrane-cholesterol affinity.
Subjects
Cell Membrane/chemistry, Cholesterol/chemistry, Lipid Bilayers/chemistry, Molecular Dynamics Simulation, Phosphatidylcholines/chemistry
ABSTRACT
This review discusses the application of cellular biology, molecular biophysics, and computational simulation to understand membrane-mediated mechanisms by which oxysterols regulate cholesterol homeostasis. Side-chain oxysterols, which are produced enzymatically in vivo, are physiological regulators of cholesterol homeostasis and primarily serve as cellular signals for excess cholesterol. These oxysterols regulate cholesterol homeostasis through both transcriptional and non-transcriptional pathways; however, many molecular details of their interactions in these pathways are still not well understood. Cholesterol trafficking provides one mechanism for regulation. The current model of cholesterol trafficking regulation is based on the existence of two distinct cholesterol pools in the membrane: a low and a high availability/activity pool. It is proposed that the low availability/activity pool of cholesterol is integrated into tightly packing phospholipids and relatively inaccessible to water or cellular proteins, while the high availability cholesterol pool is more mobile in the membrane and is present in membranes where the phospholipids are not as compressed. Recent results suggest that oxysterols may promote cholesterol egress from membranes by shifting cholesterol from the low to the high activity pools. Furthermore, molecular simulations suggest a potential mechanism for oxysterol "activation" of cholesterol through its displacement in the membrane. This review discusses these results as well as several other important interactions between oxysterols and cholesterol in cellular and model lipid membranes. This article is part of a Special Issue entitled: Membrane protein structure and function.
Subjects
Cell Membrane/metabolism, Sterols/metabolism, Animals, Cell Membrane/chemistry, Cholesterol/metabolism, Homeostasis, Humans, Molecular Models, Sterols/chemistry
ABSTRACT
BACKGROUND AND MOTIVATION: The high-throughput genomics communities have been successfully using standardized spreadsheet-based formats to capture and share data within labs and among public repositories. The nanomedicine community has yet to adopt similar standards to share the diverse and multi-dimensional types of data (including metadata) pertaining to the description and characterization of nanomaterials. Owing to the lack of standardization in representing and sharing nanomaterial data, most of the data currently shared via publications and data resources are incomplete, poorly integrated, and not suitable for meaningful interpretation and re-use. Specifically, in their current state, the data cannot be effectively utilized for the development of predictive models that will inform the rational design of nanomaterials. RESULTS: We have developed a specification called ISA-TAB-Nano, which comprises four spreadsheet-based file formats for representing and integrating various types of nanomaterial data. Three file formats (Investigation, Study, and Assay files) have been adapted from the established ISA-TAB specification, while the Material file format was developed de novo to more readily describe the complexity of nanomaterials and associated small molecules. In this paper, we discuss the main features of each file format and how to use them for sharing nanomaterial descriptions and assay metadata. CONCLUSION: The ISA-TAB-Nano file formats provide a general and flexible framework to record and integrate nanomaterial descriptions, assay data (metadata and endpoint measurements) and protocol information. Like ISA-TAB, ISA-TAB-Nano supports the use of ontology terms to promote standardized descriptions and to facilitate search and integration of the data. The ISA-TAB-Nano specification has been submitted as an ASTM work item to obtain community feedback and to provide a nanotechnology data-sharing standard for public development and adoption.
Subjects
Information Storage and Retrieval, Nanostructures/chemistry, Information Dissemination, Research
ABSTRACT
Implicit solvent models are popular for their high computational efficiency and simplicity over explicit solvent models and are extensively used for computing molecular solvation properties. The accuracy of implicit solvent models depends on the geometric description of the solute-solvent interface and the solvent dielectric profile that is defined near the surface of the solute molecule. Typically, it is assumed that the dielectric profile is spatially homogeneous in the bulk solvent medium and varies sharply across the solute-solvent interface. However, the specific form of this profile is often described by ad hoc geometric models rather than physical solute-solvent interactions. Hence, it is of significant interest to improve the accuracy of these implicit solvent models by more realistically defining the solute-solvent boundary within a continuum setting. Recently, a differential geometry-based geometric flow solvation model was developed, in which the polar and nonpolar free energies are coupled through a characteristic function that describes a smooth dielectric interface profile across the solvent-solute boundary in a thermodynamically self-consistent fashion. The main parameters of the model are the solute/solvent dielectric coefficients, solvent pressure on the solute, microscopic surface tension, solvent density, and molecular force-field parameters. In this work, we investigate how changes in the pressure, surface tension, solute dielectric coefficient, and choice of different force-field charge and radii parameters affect the prediction accuracy for hydration free energies of 17 small organic molecules based on the geometric flow solvation model. The results of our study provide insights on the parameterization, accuracy, and predictive power of this new implicit solvent model.
Subjects
Chemical Models, Solvents/chemistry, Molecular Structure
ABSTRACT
Implicit solvent models are important tools for calculating solvation free energies for chemical and biophysical studies since they require fewer computational resources but can achieve accuracy comparable to that of explicit-solvent models. In past papers, geometric flow-based solvation models have been established for solvation analysis of small and large compounds. In the present work, the use of realistic experiment-based parameter choices for the geometric flow models is studied. We find that the experimental parameters of solvent internal pressure p = 172 MPa and surface tension γ = 72 mN/m produce solvation free energies within 1 RT of the global minimum root-mean-squared deviation from experimental data over the expanded set. Our results demonstrate that experimental values can be used for geometric flow solvent model parameters, thus eliminating the need for additional parameterization. We also examine the correlations between optimal values of p and γ, which are strongly anti-correlated. Geometric analysis of the small molecule test set shows that these results are inter-connected with an approximately linear relationship between area and volume in the range of molecular sizes spanned by the data set. In spite of this considerable degeneracy between the surface tension and pressure terms in the model, both terms are important for the broader applicability of the model.
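The experimental parameters quoted above become easier to interpret once converted to molecular units. The sketch below does that conversion and evaluates only the surface and volume contributions, γA + pV, for a hypothetical spherical solute; the dispersion term and the polar part of the model are deliberately left out, and the function names and the 3 Å radius are assumptions for illustration. It also shows why γ and p trade off: if area is approximately linear in volume, A ≈ aV + b, then γA + pV ≈ (γa + p)V + γb, so an increase in one parameter can be offset by a decrease in the other.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol

def gamma_kj_per_mol_A2(gamma_mN_per_m):
    """Surface tension: mN/m (= 1e-3 J/m^2) to kJ/(mol * angstrom^2)."""
    return gamma_mN_per_m * 1e-3 * 1e-20 * AVOGADRO / 1000.0

def pressure_kj_per_mol_A3(p_MPa):
    """Pressure: MPa (= 1e6 J/m^3) to kJ/(mol * angstrom^3)."""
    return p_MPa * 1e6 * 1e-30 * AVOGADRO / 1000.0

def surface_volume_energy(area_A2, volume_A3, gamma_mN_per_m=72.0, p_MPa=172.0):
    """gamma*A + p*V in kJ/mol; the dispersion and polar terms are omitted."""
    return (gamma_kj_per_mol_A2(gamma_mN_per_m) * area_A2
            + pressure_kj_per_mol_A3(p_MPa) * volume_A3)

gamma = gamma_kj_per_mol_A2(72.0)
pressure = pressure_kj_per_mol_A3(172.0)

# Hypothetical spherical solute of radius 3 angstroms.
radius = 3.0
area = 4.0 * math.pi * radius ** 2
volume = 4.0 / 3.0 * math.pi * radius ** 3
energy = surface_volume_energy(area, volume)
print(gamma, pressure, energy)
```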
Subjects
Solvents/chemistry, Thermodynamics, Computer Simulation, Chemical Models, Molecular Models, Surface Tension
ABSTRACT
Perfluorocarbon-based nanoemulsion particles have become promising platforms for the delivery of therapeutic and diagnostic agents to specific target cells in a non-invasive manner. A "contact-facilitated" delivery mechanism has been proposed wherein the emulsifying phospholipid monolayer on the nanoemulsion surface contacts and forms a lipid complex with the outer monolayer of the target cell plasma membrane, allowing cargo to diffuse to the surface of the target cell. While this mechanism is supported by experimental evidence, its molecular details are unknown. The present study develops a coarse-grained model of nanoemulsion particles that is compatible with the MARTINI force field. Simulations using this coarse-grained model have demonstrated multiple fusion events between the particles and a model vesicular lipid bilayer. The fusion proceeds in the following sequence: dehydration at the interface, close apposition of the particles, protrusion of hydrophobic molecules to the particle surface, transient lipid complex formation, and absorption of the nanoemulsion into the liposome. The initial monolayer disruption acts as a rate-limiting step and is strongly influenced by particle size as well as by the presence of phospholipids supporting negative spontaneous curvature. The core-forming perfluorocarbons play critical roles in initiating the fusion process by facilitating protrusion of hydrophobic moieties into the interface between the two particles. This study directly supports the hypothesized nanoemulsion delivery mechanism and provides the underlying molecular details that enable engineering of nanoemulsions for a variety of medical applications.
ABSTRACT
Solvation analysis is one of the most important tasks in chemical and biological modeling. Implicit solvent models are some of the most popular approaches. However, commonly used implicit solvent models rely on unphysical definitions of solvent-solute boundaries. Based on differential geometry, the present work defines the solvent-solute boundary via the variation of the nonpolar solvation free energy. The solvation free energy functional of the system is constructed based on a continuum description of the solvent and a discrete description of the solute, which are dynamically coupled by the solvent-solute boundaries via van der Waals interactions. The first variation of the energy functional gives rise to the governing Laplace-Beltrami equation. The present model predictions of the nonpolar solvation energies are in excellent agreement with experimental data, which supports the validity of the proposed nonpolar solvation model.
Subjects
Solvents/chemistry, Chemical Models, Surface Properties, Thermodynamics
ABSTRACT
Ion-channel function is determined by its gating movement. Yet, molecular dynamics and electrophysiological simulations were never combined to link molecular structure to function. We performed multiscale molecular dynamics and continuum electrostatics calculations to simulate a cardiac K(+) channel (I(Ks)) gating and its alteration by mutations that cause arrhythmias and sudden death. An all-atom model of the I(Ks) alpha-subunit KCNQ1, based on the recent Kv1.2 structure, is used to calculate electrostatic energies during gating. Simulations are compared with experiments where varying degrees of positive charge, added via point mutation, progressively reduce current. Whole-cell simulations show that mutations cause action potential and ECG QT interval prolongation, consistent with clinical phenotypes. This framework allows integration of multiscale observations to study the molecular basis of excitation and its alteration by disease.
Subjects
Action Potentials/physiology, Heart/physiology, KCNQ1 Potassium Channel/metabolism, Molecular Models, Static Electricity, Amino Acid Sequence, Electrocardiography, KCNQ1 Potassium Channel/chemistry, Kinetics, Cardiovascular Models, Molecular Sequence Data, Mutant Proteins/chemistry, Mutant Proteins/metabolism, Mutation/genetics, Protein Secondary Structure
ABSTRACT
Side-chain oxysterols are enzymatically generated oxidation products of cholesterol that serve a central role in mediating cholesterol homeostasis. Recent work has shown that side-chain oxysterols, such as 25-hydroxycholesterol (25-HC), alter membrane structure in very different ways from cholesterol, suggesting a possible mechanism for how these oxysterols regulate cholesterol homeostasis. Here we extend our previous work by using molecular-dynamics simulations of 25-HC and cholesterol mixtures in 1-palmitoyl-2-oleoyl-phosphatidylcholine bilayers to examine the combined effects of 25-HC and cholesterol in the same bilayer. 25-HC causes larger changes in membrane structure when added to cholesterol-containing membranes than when added to cholesterol-free membranes. We also find that the presence of 25-HC changes the position, orientation, and solvent accessibility of cholesterol, shifting it into the water interface and thus increasing its availability to external acceptors. This is consistent with experimental results showing that oxysterols can trigger cholesterol trafficking from the plasma membrane to the endoplasmic reticulum. These effects provide a potential mechanism for 25-HC-mediated regulation of cholesterol trafficking and homeostasis through modulation of cholesterol availability.
Subjects
Cholesterol/metabolism, Hydroxycholesterols/metabolism, Artificial Membranes, Phospholipids/metabolism, Biological Availability, Cholesterol/chemistry, Hydrogen Bonding, Hydroxycholesterols/chemistry, Lipid Bilayers/metabolism, Molecular Dynamics Simulation, Phosphatidylcholines/metabolism, Solvents/chemistry
ABSTRACT
Protein pK(a) calculation methods are developed partly to provide fast non-experimental estimates of the ionization constants of protein side chains. However, the most significant reason for developing such methods is that a good pK(a) calculation method is presumed to provide an accurate physical model of protein electrostatics, which can be applied in methods for drug design, protein design, and other structure-based energy calculation methods. We explore the validity of this presumption by simulating the development of a pK(a) calculation method using artificial experimental data derived from a human-defined physical reality. We examine the ability of an RMSD-guided development protocol to retrieve the correct (artificial) physical reality and find that a rugged optimization landscape and a huge parameter space prevent the identification of the correct physical reality. We examine the importance of the training set in developing pK(a) calculation methods and investigate the effect of experimental noise on our ability to identify the correct physical reality, and find that both effects have a significant and detrimental impact on the physical reality of the optimal model identified. Our findings are of relevance to all structure-based methods for protein energy calculations and simulation, and have large implications for all types of current pK(a) calculation methods. Our analysis furthermore suggests that careful and extensive validation on many types of experimental data can go some way in making current models more realistic.