ABSTRACT
Inspired by the successful application of the embedded cluster reference interaction site model (EC-RISM), a combination of quantum-mechanical calculations with three-dimensional RISM theory to predict Gibbs energies of species in solution, within the SAMPL6.1 (acidity constants, pKa) and SAMPL6.2 (octanol-water partition coefficients, log P) challenges, the methodology was applied to the recent SAMPL7 physical property challenge on aqueous pKa and octanol-water log P values. Although not part of the challenge, log D7.4 data were provided by the organizers, so we also computed distribution coefficients log D7.4 from the predicted pKa and log P data. While macroscopic pKa predictions compared very favorably with experimental data (root mean square error, RMSE, of 0.72 pK units), the performance of the log P model (RMSE 1.84) fell behind expectations from the SAMPL6.2 challenge, leading to reasonable log D7.4 predictions (RMSE 1.69) from combining the independent calculations. In the post-submission phase, conformations generated by a different methodology yielded results that did not significantly improve the original predictions. While overall satisfactory compared to previous log D challenges, the predicted data suggest that further effort is needed to optimize the robustness of the partition coefficient model within EC-RISM calculations and to improve the agreement between experimental conditions and the corresponding model description.
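As a rough illustration of how a distribution coefficient can be assembled from a predicted log P and pKa, the following sketch uses the common textbook relation for a monoprotic compound. It assumes that only the neutral species partitions into octanol, which is a simplification and not necessarily the scheme used by the authors; the function name `log_d` and its parameters are hypothetical.

```python
import math


def log_d(log_p: float, pka: float, ph: float = 7.4, kind: str = "acid") -> float:
    """Estimate log D for a monoprotic compound.

    Assumption (not from the abstract): only the neutral form partitions,
    so log D = log P - log10(1 + 10**x), where x = pH - pKa for an acid and
    x = pKa - pH for a base (pKa of the conjugate acid).
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    # Fraction of neutral species is 1 / (1 + 10**exponent).
    return log_p - math.log10(1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Hypothetical input values, chosen only to exercise the function.
    print(log_d(log_p=2.0, pka=7.4, ph=7.4, kind="acid"))
```

In this simplified picture, errors in log P carry over directly to log D, whereas pKa errors matter mainly when the pKa lies close to or on the ionized side of pH 7.4.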
Subjects
1-Octanol/chemistry, Computer Simulation, Chemical Models, Quantum Theory, Thermodynamics, Water/chemistry, Linear Models, Physical Phenomena, Solubility
ABSTRACT
Joint academic-industrial projects supporting drug discovery are frequently pursued to deploy and benchmark cutting-edge methodological developments from academia in a real-world industrial environment at different scales. The tasks range from small-molecule physicochemical property assessment through protein-ligand interactions to statistical analyses of biological data. In this way, method development and usability both benefit from insights gained at both ends: the predictiveness and readiness of novel approaches are confirmed, while pharmaceutical drug makers get early access to novel tools for the quality of drug products and the benefit of patients. Quantum-mechanical and simulation methods fall into this group in particular, as they require skill and expense in their development as well as significant resources in their application, and thus are only slowly trickling into industrial use. Nevertheless, these physics-based methods are becoming more and more useful. Starting with a general overview of these methods, in particular quantum-mechanical methods for drug discovery, we review a decade-long and ongoing collaboration between Sanofi and the Kast group focused on the application of the embedded cluster reference interaction site model (EC-RISM), a solvation model for quantum chemistry, to study small-molecule chemistry in the context of joint participation in several SAMPL (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenges. Starting with an early application to tautomer equilibria in water (SAMPL2), the methodology was further developed over the years to allow for challenge contributions related to predictions of distribution coefficients (SAMPL5) and acidity constants (SAMPL6). Particular emphasis is put on a frequently overlooked aspect of measuring the quality of models, namely the retrospective analysis of earlier datasets and predictions in light of more recent and advanced developments. 
We therefore demonstrate the performance of the current methodical state of the art as developed and optimized for the SAMPL6 pKa and octanol-water log P challenges when re-applied to the earlier SAMPL5 cyclohexane-water log D and SAMPL2 tautomer equilibria datasets. Systematic improvement is not consistently found throughout despite the similarity of the problem class, i.e. protonation reactions and phase distribution. Hence, it is possible to learn about hidden bias in model assessment, as results derived from more elaborate methods do not necessarily improve quantitative agreement. This indicates the role of chance or coincidence for model development on the one hand which allows for the identification of systematic error and opportunities toward improvement and reveals possible sources of experimental uncertainty on the other. These insights are particularly useful for further academia-industry collaborations, as both partners are then enabled to optimize both the computational and experimental settings for data generation.
Subjects
Drug Discovery, Pharmaceutical Preparations/chemistry, Quantum Theory, Computer Simulation, Cyclohexanes/chemistry, Ligands, Chemical Models, Solubility, Solvents/chemistry, Thermodynamics, Water/chemistry
ABSTRACT
The mode coupling theory of supercooled liquids is combined with advanced closures to the integral equation theory of liquids in order to estimate the glass transition line of Yukawa one-component plasmas from the unscreened Coulomb limit up to the strong screening regime. The present predictions constitute a major improvement over the current literature predictions. The calculations confirm the validity of an existing analytical parameterization of the glass transition line. It is verified that the glass transition line is an approximate isomorphic curve and the value of the corresponding reduced excess entropy is estimated. Capitalizing on the isomorphic nature of the glass transition line, two structural vitrification indicators are identified that allow a rough estimate of the glass transition point only through simple curve metrics of the static properties of supercooled liquids. The vitrification indicators are demonstrated to be quasi-universal by an investigation of hard sphere and inverse power law supercooled liquids. The straightforward extension of the present results to bi-Yukawa systems is also discussed.
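For orientation, the integral-equation ingredients referred to above can be written in standard textbook form. The notation below (coupling parameter $\Gamma$, screening parameter $\kappa$, Wigner-Seitz radius $d$, number density $n$, bridge function $B$) is generic and assumed here for illustration; the abstract does not state which closures the authors actually employ.

```latex
\begin{align}
  \beta u(r) &= \frac{\Gamma}{r/d}\,\exp\!\left(-\kappa\,\frac{r}{d}\right), \\
  h(r) &= c(r) + n \int c\!\left(|\mathbf{r}-\mathbf{r}'|\right) h(r')\,\mathrm{d}^3 r', \\
  g(r) &= \exp\!\left[-\beta u(r) + h(r) - c(r) + B(r)\right],
  \qquad h(r) = g(r) - 1 .
\end{align}
```

The first line is the reduced Yukawa pair potential, the second is the Ornstein-Zernike equation, and the third is the formally exact closure. Setting $B(r) = 0$ gives the hypernetted-chain approximation, and more elaborate closures differ in how they approximate $B(r)$. The resulting static structure is the input that mode coupling theory requires.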
Subjects
Glass/chemistry, Plasma/chemistry, Entropy, Phase Transition, Vitrification
ABSTRACT
Results are reported for octanol-water partition coefficients (log P) of the neutral states of drug-like molecules provided during the SAMPL6 (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenge from applying the "embedded cluster reference interaction site model" (EC-RISM) as a solvation model for quantum-chemical calculations. Following the strategy outlined during earlier SAMPL challenges, we first train 1- and 2-parameter water-free ("dry") and water-saturated ("wet") models for n-octanol solvation Gibbs energies with respect to experimental values from the "Minnesota Solvation Database" (MNSOL), yielding a root mean square error (RMSE) of 1.5 kcal mol⁻¹ for the best-performing 2-parameter wet model, while the optimal water model developed for the pKa part of the SAMPL6 challenge is kept unchanged (RMSE 1.6 kcal mol⁻¹ for neutral compounds from a model trained on both neutral and ionic species). Applying these models to the blind prediction set yields a log P RMSE of less than 0.5 for our best model (2-parameter, wet). Further analysis of our results reveals that a single compound, SM15, is responsible for most of the error; without it, the RMSE drops to 0.2. Since this is the only compound in the challenge dataset with a hydroxyl group, we investigate other alcohols for which Gibbs energy of solvation data for both water and n-octanol are available in the MNSOL database to demonstrate a systematic cause of error and to discuss strategies for improvement.
Subjects
1-Octanol/chemistry, Octanols/chemistry, Thermodynamics, Water/chemistry, Cyclohexanes/chemistry, Ligands, Chemical Models, Quantum Theory
ABSTRACT
A mean-field treatment of the solvent provides an efficient technique for investigating chemical processes in solution within the quantum mechanics/molecular mechanics (QM/MM) framework. In this algorithm, an iterative calculation is required to achieve self-consistency between the QM and MM regions, which is a time-consuming step. In the present study, we propose a noniterative approach by introducing a linear response approximation (LRA) into the solvation term in the one-electron part of the Fock matrix in a hybrid approach between molecular-orbital calculations and a three-dimensional (3D) integral equation theory for molecular liquids (multicenter molecular Ornstein-Zernike self-consistent field [MC-MOZ-SCF]; Kido et al., J. Chem. Phys. 2015, 143, 014103). To save computational time, we have also developed a fast method to generate the electrostatic potential map near the solute and the solvation term in the Fock matrix, using Fourier transformation (FT) and real spherical harmonics expansion (RSHE). To numerically validate the LRA and the FT-RSHE method, we applied the present approach to water, carbonic acid, and their ionic species in aqueous solution. Molecular properties of the solutes were evaluated by the present approach with four different types of initial wave functions and compared with those obtained by the original method (MC-MOZ-SCF). We found that an initial wave function that accounts for solvation effects is needed to reproduce the MC-MOZ-SCF properties appropriately. Furthermore, a benchmark test for 32 solute molecules was performed to evaluate the accuracy of the present approach for the solvation free energy (SFE) and to measure the speedup ratio relative to MC-MOZ-SCF. The SFE error relative to MC-MOZ-SCF does not correlate with the SFE but increases in proportion to the electronic reorganization energy. As for water and carbonic acid, an initial wave function with solvation effects is also important for keeping the error small. 
From the averaged speed up ratio, the present approach is 13.5 times faster than MC-MOZ-SCF. © 2019 Wiley Periodicals, Inc.
ABSTRACT
BACKGROUND: Living systems are characterized by the dynamic assembly and disassembly of biomolecules. The dynamical ordering mechanism of these biomolecules has been investigated both experimentally and theoretically. The main theoretical approaches include quantum mechanical (QM) calculation, all-atom (AA) modeling, and coarse-grained (CG) modeling. The selected approach depends on the size of the target system (which differs among electrons, atoms, molecules, and molecular assemblies). These hierarchal approaches can be combined with molecular dynamics (MD) simulation and/or integral equation theories for liquids, which cover all size hierarchies. SCOPE OF REVIEW: We review the framework of quantum mechanical/molecular mechanical (QM/MM) calculations, AA MD simulations, CG modeling, and integral equation theories. Applications of these methods to the dynamical ordering of biomolecular systems are also exemplified. MAJOR CONCLUSIONS: The QM/MM calculation enables the study of chemical reactions. The AA MD simulation, which omits the QM calculation, can follow longer time-scale phenomena. By reducing the number of degrees of freedom and the computational cost, CG modeling can follow much longer time-scale phenomena than AA modeling. Integral equation theories for liquids elucidate the liquid structure, for example, whether the liquid follows a radial distribution function. GENERAL SIGNIFICANCE: These theoretical approaches can analyze the dynamic behaviors of biomolecular systems. They also provide useful tools for exploring the dynamic ordering systems of biomolecules, such as self-assembly. This article is part of a Special Issue entitled "Biophysical Exploration of Dynamical Ordering of Biomolecular Systems" edited by Dr. Koichi Kato.
Subjects
Computational Biology, Macromolecular Substances/metabolism, Biological Models, Animals, Humans, Kinetics, Macromolecular Substances/chemistry, Chemical Models, Molecular Dynamics Simulation, Molecular Structure, Structure-Activity Relationship
ABSTRACT
The "embedded cluster reference interaction site model" (EC-RISM) integral equation theory is applied to the problem of predicting aqueous pKa values for drug-like molecules based on an ensemble of tautomers. EC-RISM is based on self-consistent calculations of a solute's electronic structure and the distribution function of surrounding water. Following-up on the workflow developed after the SAMPL5 challenge on cyclohexane-water distribution coefficients we extended and improved the methodology by taking into account exact electrostatic solute-solvent interactions taken from the wave function in solution. As before, the model is calibrated against Gibbs energies of hydration from the "Minnesota Solvation Database" and a public dataset of acidity constants of organic acids and bases by adjusting in total 4 parameters, among which only 3 are relevant for predicting pKa values. While the best-performing training model yields a root-mean-square error (RMSE) of 1 pK unit, the corresponding test set prediction on the full SAMPL6 dataset of macroscopic pKa values using the same level of theory exhibits slightly larger error (1.7 pK units) than the best test set model submitted (1.7 pK units for corresponding training set vs. test set performance of 1.6). Post-submission analysis revealed a number of physical optimization options regarding the numerical treatment of electrostatic interactions and conformational sampling. While the experimental test set data revealed after submission was not used for reparametrizing the methodology, the best physically optimized models consequentially result in RMSEs of 1.5 if only improved electrostatic interactions are considered and of 1.1 if, in addition, conformational sampling accounts for quantum-chemically derived rankings. We conclude that these numbers are probably near the ultimate accuracy achievable with the simple 3-parameter model using a single or the two best-ranking conformations per tautomer or microstate. 
Finally, relations of the present macrostate approach to microstate pKa results are discussed and some illustrative results for microstate populations are presented.
Subjects
Cyclic Hydrocarbons/chemistry, Chemical Models, Computer Simulation, Chemical Databases, Theoretical Models, Molecular Conformation, Solutions/chemistry, Solvents/chemistry, Static Electricity, Thermodynamics, Water/chemistry
ABSTRACT
In this paper we propose a model for a two-dimensional fluid with one site-site associating point. We studied its structural and thermodynamic properties by Monte Carlo computer simulations, the site-site integral equation theory (RISM), Wertheim's thermodynamic perturbation theory (TPT), and Wertheim's integral equation theory (WIET) for associative liquids. The associating point can be placed at an arbitrary distance from the center of the particle. All particles have a Lennard-Jones core, while interactions between associating points are modeled by a Gaussian-like potential that depends only on the distance between sites. The methods were used to study the thermodynamic and structural properties as a function of the position of the associating point, temperature, and density. The accuracy of the analytic theories was checked by comparing the theoretical results with the corresponding Monte Carlo ones. The theories are quite accurate when the associating point is on the surface and only dimers can be formed. In this case, the theories correctly predict the pair correlation functions of the model, the internal energy, the ratios of free and bonded particles, and the chemical potential. This is no longer true when the associating point is away from the surface of the particles and higher clusters are formed.
ABSTRACT
In this paper we applied analytical theories to a two-dimensional chain-forming fluid. Wertheim's thermodynamic perturbation theory (TPT) and integral equation theory (IET) for associative liquids were used to study the thermodynamic and structural properties of the chain-forming model. The model has polymerizing points at an arbitrary distance from the center of the particles. The analytical results were tested against corresponding results obtained by Monte Carlo computer simulations to check the accuracy of the theories. The theories are accurate for the different positions of patches of the model at all values of the temperature and density studied. The pair correlation functions from IET agree well with computer simulations. Both TPT and IET are in good agreement with the Monte Carlo values of the energy, the chemical potential, and the ratios of free, singly bonded, and doubly bonded particles.
ABSTRACT
In this paper we applied analytical theories to a two-dimensional dimerising fluid. We applied Wertheim's thermodynamic perturbation theory (TPT) and integral equation theory (IET) for associative liquids to the dimerising model with dimerising points at an arbitrary distance from the center of the particles. The theories were used to study thermodynamic and structural properties. To check their accuracy, we compared the theoretical results with corresponding results obtained by Monte Carlo computer simulations. The theories are accurate for the different positions of patches of the model at all values of the temperature and density studied. IET correctly predicts the pair correlation function of the model. Both TPT and IET are in good agreement with the Monte Carlo values of the energy, pressure, chemical potential, compressibility, and ratios of free and bonded particles.
ABSTRACT
We develop a new method for calculating the hydration free energy (HFE) of a protein with any net charge. The polar part of the energetic component in the HFE is expressed as a linear combination of four geometric measures (GMs) of the protein structure and the generalized Born (GB) energy plus a constant. The other constituents in the HFE are expressed as linear combinations of the four GMs. The coefficients (including the constant) in the linear combinations are determined using the three-dimensional reference interaction site model (3D-RISM) theory applied to sufficiently many protein structures. Once the coefficients are determined, the HFE and its constituents of any other protein structure are obtained simply by calculating the four GMs and GB energy. Our method and the 3D-RISM theory give perfectly correlated results. Nevertheless, the computation time required in our method is over four orders of magnitude shorter.
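The linear decomposition described above can be summarized schematically as follows. The symbols are assumed for illustration only: $M_i$ denotes the four geometric measures (in morphometric approaches these are commonly the excluded volume, the accessible surface area, and the integrated mean and Gaussian curvatures, although the abstract does not list them), $E_{\mathrm{GB}}$ is the generalized Born energy, and $a_i$, $b_i^{(X)}$ are the coefficients fitted to 3D-RISM results.

```latex
\begin{align}
  \varepsilon_{\mathrm{polar}} &\approx a_0 + \sum_{i=1}^{4} a_i\,M_i + a_5\,E_{\mathrm{GB}}, \\
  X &\approx \sum_{i=1}^{4} b_i^{(X)}\,M_i
  \quad \text{for each remaining constituent } X \text{ of the HFE}.
\end{align}
```

Once the coefficients have been fitted, evaluating the HFE of a new structure requires only the four geometric measures and the GB energy, which is consistent with the large reduction in computation time reported in the abstract.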
Subjects
Proteins/chemistry, Thermodynamics, Water/chemistry, Animals, Protein Databases, Humans, Protein Folding, Static Electricity, Ubiquitin/chemistry
ABSTRACT
We predict cyclohexane-water distribution coefficients (log D7.4) for drug-like molecules taken from the SAMPL5 blind prediction challenge by the "embedded cluster reference interaction site model" (EC-RISM) integral equation theory. This task involves the coupled problem of predicting both partition coefficients (log P) of neutral species between the solvents and aqueous acidity constants (pKa) in order to account for a change of protonation states. The first issue is addressed by calibrating an EC-RISM-based model for solvation free energies derived from the "Minnesota Solvation Database" (MNSOL) for both water and cyclohexane utilizing a correction based on the partial molar volume, yielding a root mean square error (RMSE) of 2.4 kcal mol⁻¹ for water and 0.8-0.9 kcal mol⁻¹ for cyclohexane depending on the parametrization. The second one is treated by employing on one hand an empirical pKa model (MoKa) and, on the other hand, an EC-RISM-derived regression of published acidity constants (RMSE of 1.5 for a single model covering acids and bases). In total, at most 8 adjustable parameters are necessary (2-3 for each solvent and two for the pKa) for training solvation and acidity models. Applying the final models to the log D7.4 dataset corresponds to evaluating an independent test set comprising other, composite observables, yielding, for different cyclohexane parametrizations, 2.0-2.1 for the RMSE with the first and 2.2-2.8 with the combined first and second SAMPL5 data set batches. Notably, a pure log P model (assuming neutral species only) performs statistically similarly for these particular compounds. The nature of the approximations and possible perspectives for future developments are discussed.
Subjects
Computer Simulation, Cyclohexanes/chemistry, Pharmaceutical Preparations/chemistry, Water/chemistry, Chemical Models, Molecular Structure, Quantum Theory, Solubility, Solvents/chemistry, Thermodynamics
ABSTRACT
We report a method to predict physicochemical properties of druglike molecules using a classical statistical mechanics based solvent model combined with machine learning. The RISM-MOL-INF method introduced here provides an accurate technique to characterize solvation and desolvation processes based on solute-solvent correlation functions computed by the 1D reference interaction site model of the integral equation theory of molecular liquids. These functions can be obtained in a matter of minutes for most small organic and druglike molecules using existing software (RISM-MOL) (Sergiievskyi, V. P.; Hackbusch, W.; Fedorov, M. V. J. Comput. Chem. 2011, 32, 1982-1992). Predictions of caco-2 cell permeability and hydration free energy obtained using the RISM-MOL-INF method are shown to be more accurate than the state-of-the-art tools for benchmark data sets. Due to the importance of solvation and desolvation effects in biological systems, it is anticipated that the RISM-MOL-INF approach will find many applications in biophysical and biomedical property prediction.
Subjects
Chemical Phenomena, Theoretical Models, Pharmaceutical Preparations/chemistry, Solvents/chemistry, Water/chemistry, Caco-2 Cells, Pharmaceutical Chemistry, Humans, Thermodynamics
ABSTRACT
The hydrophobicity of a protein is considered to be one of the major intrinsic factors dictating its aggregation propensity. Understanding how protein hydrophobicity is determined is, therefore, of central importance in preventing protein aggregation diseases and in the biotechnological production of human therapeutics. Traditionally, protein hydrophobicity is estimated based on hydrophobicity scales determined for individual free amino acids, assuming that those scales are unaltered when amino acids are embedded in a protein. Here, we investigate how the hydrophobicity of constituent amino acid residues depends on the protein context. To this end, we analyze the hydration free energy (the free energy change on hydration, which quantifies the hydrophobicity) of the wild type and 21 mutants of the amyloid-beta protein associated with Alzheimer's disease by performing molecular dynamics simulations and integral-equation calculations. From a detailed analysis of mutation effects on the protein hydrophobicity, we elucidate how global protein factors such as the total charge, as well as the underlying protein conformations, influence the hydrophobicity of amino acid residues. Our results provide a unique insight into protein hydrophobicity for rationalizing and predicting the protein aggregation propensity on mutation, and open a new avenue to designing aggregation-resistant proteins as biotherapeutics.
Subjects
Amino Acids/chemistry, Amyloid beta-Peptides/chemistry, Hydrophobic and Hydrophilic Interactions, Point Mutation, Humans, Molecular Dynamics Simulation, Protein Conformation, Thermodynamics
ABSTRACT
The thermodynamic and structural properties of the 2D hexagonal soft-sites fluid are examined by integral equation theory benchmarked against extensive Monte Carlo simulations. Hexamers are built of six equal Lennard-Jones segments. Site-site integral equation theory is used to compute site-site correlation functions, excess internal energies and isotherms over a wide range of conditions and compared with results obtained from Monte Carlo simulations. Various approaches for computing the pressure are discussed as well. Satisfactory qualitative agreement between theory and simulations is found with details depending on the applied closure relation.
ABSTRACT
Thermostabilization of membrane proteins, especially G-protein-coupled receptors (GPCRs), is often necessary for biochemical applications and pharmaceutical studies involving structure-based drug design. Here we review our theoretical, physics-based method for identifying thermostabilizing amino acid mutations. Its novel aspects are the following: the entropic effect originating from the translational displacement of hydrocarbon groups within the lipid bilayer is treated as a pivotal factor; a reliable measure of thermostability is introduced and a mutation which enlarges the measure to a significant extent is chosen; and all possible mutations can be examined with moderate computational effort. It was shown that mutating the residue at position NBW = 3.39 (NBW is the Ballesteros-Weinstein number) to Arg or Lys stabilizes a significant number of different class A GPCRs in the inactive state. To date, we have succeeded in stabilizing several GPCRs and in solving new three-dimensional structures of the muscarinic acetylcholine receptor 2 (M2R), prostaglandin E receptor 4 (EP4), and serotonin 2A receptor (5-HT2AR) using X-ray crystallography. The subjects to be pursued in future studies are also discussed.
ABSTRACT
Here we review a new method for calculating a hydration free energy (HFE) of a solute and discuss its physical implication for biomolecular functions in aqueous environments. The solute hydration is decomposed into processes 1 and 2. A cavity matching the geometric characteristics of the solute at the atomic level is created in process 1. Solute-water van der Waals and electrostatic interaction potentials are incorporated in process 2. The angle-dependent integral equation theory combined with our morphometric approach is applied to process 1, and the three-dimensional reference interaction site model theory is employed for process 2. Molecular models are adopted for water. The new method is characterized by the following. Solutes with various sizes including proteins can be treated in the same manner. It is almost as accurate as the molecular dynamics simulation despite its far smaller computational burden. It enables us to handle a solute possessing a significantly large total charge without difficulty. The HFE can be decomposed into a variety of physically insightful, energetic, and entropic components. It is best suited to the elucidation of mechanisms of protein folding, pressure and cold denaturation of a protein, and different types of molecular recognition.
ABSTRACT
Calculations of acidities of molecules with multiple tautomeric and/or conformational states require adequate treatment of the relative energetics of accessible states accompanied by a statistical-mechanical formulation of their contribution to the macroscopic pKa value. Here, we demonstrate rigorously the formal equivalence of two such approaches: a partition function treatment and statistics over transitions between molecular tautomeric and conformational states in the limit of a theory that does not require adjustment by empirical parameters correcting energetic values. However, for a frequently employed correction scheme, linear scaling of (free) energies and regression with respect to reference data taking an additive constant into account, this equivalence breaks down if more than one acid or base state is involved. The consequences of the resulting inconsistency are discussed on our datasets developed for aqueous pKa predictions during the recent SAMPL6 challenge, where molecular state energetics were computed based on the "embedded cluster reference interaction site model" (EC-RISM). This method couples integral equation theory as a solvation model to quantum-chemical calculations and yielded a test set root mean square error of 1.1 pK units from a partition function ansatz. For all practical purposes, the present results indicate that a state transition approach yields comparable accuracy despite the formal theoretical inconsistency, and that an additive regression intercept, which is strictly constant in the limit of large compound mass only, is a valid approximation. Graphical abstract Embedded cluster reference interaction site model-derived vs. experimental pKa for the test set calculated with either the partition function (blue) or the state transition approach (red), using m as a free parameter.
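The partition function treatment mentioned above can be sketched in a few lines. This is a minimal illustration without any empirical scaling or regression intercept, so it corresponds to the parameter-free limit in which the abstract states that both approaches are equivalent. The function names, the energy units (kcal/mol), and the `g_proton` argument (an effective free energy assigned to the released proton) are assumptions made for this example and are not taken from the paper.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def ensemble_free_energy(state_energies, temperature=298.15):
    """Free energy of an ensemble of tautomer/conformer states.

    Computes G = -RT ln sum_i exp(-G_i / RT), shifted by the minimum
    energy for numerical stability.
    """
    rt = R_KCAL * temperature
    g_min = min(state_energies)
    z = sum(math.exp(-(g - g_min) / rt) for g in state_energies)
    return g_min - rt * math.log(z)


def macroscopic_pka(acid_states, base_states, g_proton, temperature=298.15):
    """Macroscopic pKa from the ensemble free energies of acid and base forms.

    g_proton is a hypothetical effective proton free energy; in practice
    such a term is often absorbed into a fitted constant.
    """
    rt = R_KCAL * temperature
    delta_g = (ensemble_free_energy(base_states, temperature) + g_proton
               - ensemble_free_energy(acid_states, temperature))
    return delta_g / (rt * math.log(10.0))


if __name__ == "__main__":
    # Hypothetical state energies in kcal/mol, for demonstration only.
    print(macroscopic_pka([0.0, 1.2], [0.0, 0.4, 2.0], g_proton=6.0))
```

With a single acid state and several base states, this expression reduces to summing the microscopic acidity constants, which is the state transition view of the same quantity.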
ABSTRACT
Upon biological self-assembly, the number of accessible translational configurations of water in the system increases considerably, leading to a large gain in water entropy. It is important to calculate the solvation entropy of a biomolecule with a prescribed structure by accounting for the change in water-water correlations caused by solute insertion. Modeling water as a dielectric continuum cannot capture the physical essence of the water entropy effect. As a reliable tool, we propose a hybrid of the angle-dependent integral equation theory combined with a multipolar water model and a morphometric approach. Using our methods, in which the water entropy effect is treated as the key factor, we can elucidate a variety of processes such as protein folding; cold, pressure, and heat denaturation of a protein; molecular recognition; ordered association of proteins such as amyloid fibril formation; and the functioning of ATP-driven proteins.
ABSTRACT
We briefly review our theoretical study on the rotation scheme of F1-ATPase. In the scheme, the key factor is the water entropy which has been shown to drive a variety of self-assembly processes in biological systems. We decompose the crystal structure of F1-ATPase into three sub-complexes each of which is composed of the γ subunit, one of the β subunits, and two α subunits adjacent to them. The βE, βTP, and βDP subunits are involved in the sub-complexes I, II, and III, respectively. We calculate the hydration entropy of each sub-complex using a hybrid of the integral equation theory for molecular liquids and the morphometric approach. It is found that the absolute value of the hydration entropy follows the order, sub-complex I > sub-complex II > sub-complex III. Moreover, the differences are quite large, which manifests highly asymmetrical packing of F1-ATPase. In our picture, this asymmetrical packing plays crucially important roles in the rotation of the γ subunit. We discuss how the rotation is induced by the water-entropy effect coupled with such chemical processes as ATP binding, ATP hydrolysis, and release of the products.