Search | BVS Regional Portal

1.

Uncertain of uncertainties? A comparison of uncertainty quantification metrics for chemical data sets.

Rasmussen, Maria H; Duan, Chenru; Kulik, Heather J; Jensen, Jan H.

J Cheminform ; 15(1): 121, 2023 Dec 18.

Article in English | MEDLINE | ID: mdl-38111020

Abstract

With the increasingly important role of machine learning (ML) models in chemical research, the need for putting a level of confidence to the model predictions naturally arises. Several methods for obtaining uncertainty estimates have been proposed in recent years, but consensus on their evaluation has yet to be established, and different studies on uncertainties generally use different metrics to evaluate them. We compare three of the most popular validation metrics (Spearman's rank correlation coefficient, the negative log likelihood (NLL) and the miscalibration area) to the error-based calibration introduced by Levi et al. (Sensors 2022, 22, 5540). Importantly, metrics such as the NLL and Spearman's rank correlation coefficient bear little information in themselves. We therefore introduce reference values obtained through errors simulated directly from the uncertainty distribution. The different metrics target different properties and we show how to interpret them, but we generally find the best overall validation to be done based on the error-based calibration plot introduced by Levi et al. Finally, we illustrate the sensitivity of ranking-based methods (e.g. Spearman's rank correlation coefficient) towards test set design by using the same toy model on different test sets and obtaining vastly different metrics (0.05 vs. 0.65).
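The idea of reference values obtained from simulated errors can be illustrated with a minimal sketch. It is not the authors' code: it assumes Gaussian errors with zero mean and a standard deviation equal to each predicted uncertainty, and all function names (`spearman`, `mean_nll`, `simulated_reference`) are hypothetical.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks of the values, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_nll(errors, sigmas):
    """Average Gaussian negative log likelihood of errors given uncertainties."""
    total = sum(
        0.5 * math.log(2 * math.pi * s * s) + e * e / (2 * s * s)
        for e, s in zip(errors, sigmas)
    )
    return total / len(errors)


def simulated_reference(sigmas, n_repeats=200, seed=0):
    """Reference Spearman and NLL values for perfectly calibrated uncertainties.

    Assumption: each error is drawn from a zero-mean Gaussian whose standard
    deviation is the predicted uncertainty.
    """
    rng = random.Random(seed)
    rhos, nlls = [], []
    for _ in range(n_repeats):
        errors = [rng.gauss(0.0, s) for s in sigmas]
        rhos.append(spearman([abs(e) for e in errors], sigmas))
        nlls.append(mean_nll(errors, sigmas))
    return sum(rhos) / n_repeats, sum(nlls) / n_repeats
```

Comparing the Spearman coefficient and NLL observed on a real test set against the values returned by `simulated_reference` for the same uncertainties gives the raw numbers a scale; even ideal uncertainties yield a Spearman coefficient well below 1.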

2.

Toward De Novo Catalyst Discovery: Fast Identification of New Catalyst Candidates for Alcohol-Mediated Morita-Baylis-Hillman Reactions.

Rasmussen, Maria H; Seumer, Julius; Jensen, Jan H.

Angew Chem Int Ed Engl ; 62(49): e202310580, 2023 Dec 04.

Article in English | MEDLINE | ID: mdl-37830522

Abstract

Recently we have demonstrated how a genetic algorithm (GA) starting from random tertiary amines can be used to discover a new and efficient catalyst for the alcohol-mediated Morita-Baylis-Hillman (MBH) reaction. In particular, the discovered catalyst was shown experimentally to be eight times more active than DABCO, commonly used to catalyze the MBH reaction. This represents a breakthrough in using generative models for catalyst optimization. However, the GA procedure, and hence discovery, relied on two important pieces of information: 1) the knowledge that tertiary amines catalyze the reaction and 2) the mechanism and reaction profile for the catalyzed reaction, in particular the transition state structure of the rate-determining step. Thus, truly de novo catalyst discovery must include these steps. Here we present such a method for discovering catalyst candidates for a specific reaction while simultaneously proposing a mechanism for the catalyzed reaction. We show that tertiary amines and phosphines are potential catalysts for the MBH reaction by screening 11 molecular templates representing common functional groups. The method relies on an automated reaction discovery workflow using meta-dynamics calculations. Combining this method for catalyst candidate discovery with our GA-based catalyst optimization method results in an algorithm for truly de novo catalyst discovery.

3.

Computational Evolution Of New Catalysts For The Morita-Baylis-Hillman Reaction.

Seumer, Julius; Kirschner Solberg Hansen, Jonathan; Brøndsted Nielsen, Mogens; Jensen, Jan H.

Angew Chem Int Ed Engl ; 62(18): e202218565, 2023 Apr 24.

Article in English | MEDLINE | ID: mdl-36786212

Abstract

We present a de novo discovery of an efficient catalyst of the Morita-Baylis-Hillman (MBH) reaction by searching chemical space for molecules that lower the estimated barrier of the rate-determining step using a genetic algorithm (GA) starting from randomly selected tertiary amines. We identify 435 candidates, virtually all of which contain an azetidine N as the catalytically active site, which is discovered by the GA. Two molecules are selected for further study based on their predicted synthetic accessibility and have predicted rate-determining barriers that are lower than that of a known catalyst. Azetidines have not been used as catalysts for the MBH reaction. One suggested azetidine is successfully synthesized and shows an eightfold increase in activity over a commonly used catalyst. We believe this is the first experimentally verified de novo discovery of an efficient catalyst using a generative model.

4.

What the Heck?-Automated Regioselectivity Calculations of Palladium-Catalyzed Heck Reactions Using Quantum Chemistry.

Ree, Nicolai; Göller, Andreas H; Jensen, Jan H.

ACS Omega ; 7(49): 45617-45623, 2022 Dec 13.

Article in English | MEDLINE | ID: mdl-36530278

Abstract

We present a quantum chemistry (QM)-based method that computes the relative energies of intermediates in the Heck reaction that relate to the regioselective reaction outcome: branched (α), linear (β), or a mix of the two. The calculations are done for two different reaction pathways (neutral and cationic) and are based on r2SCAN-3c single-point calculations on GFN2-xTB geometries that, in turn, derive from a GFNFF-xTB conformational search. The method is completely automated and is sufficiently efficient to allow for the calculation of thousands of reaction outcomes. The method can mostly reproduce systematic experimental studies where the ratios of regioisomers are carefully determined. For a larger dataset extracted from Reaxys, the results are somewhat worse with accuracies of 63% for β-selectivity using the neutral pathway and 29% for α-selectivity using the cationic pathway. Our analysis of the dataset suggests that only the major or desired regioisomer is reported in the literature in many cases, which makes accurate comparisons difficult. The code is freely available on GitHub under the MIT open-source license: https://github.com/jensengroup/HeckQM.

5.

A Neural Network Approach for Property Determination of Molecular Solar Cell Candidates.

Christensen, Oliver; Schlosser, Rasmus Dalsgaard; Nielsen, Rasmus Buus; Johansen, Jes; Koerstz, Mads; Jensen, Jan H; Mikkelsen, Kurt V.

J Phys Chem A ; 126(10): 1681-1688, 2022 Mar 17.

Article in English | MEDLINE | ID: mdl-35245050

Abstract

The dihydroazulene/vinylheptafulvene (DHA/VHF) photocouple is a promising candidate for molecular solar heat batteries, storing and releasing energy in a closed cycle. Much work has been done on improving the energy storage capacity and the half-life of the high-energy isomer via substituent functionalization, but similarly important is keeping these improved properties in common polar solvents, along with being soluble in these, which is tied to the dipole properties. However, the number of possible derivatives makes an overview of this combinatorial space impossible both for experimental work and traditional computational chemistry. Due to the time-consuming nature of running many thousands of computations, we look to machine learning, which bears the advantage that once a model has been trained, it can be used to rapidly estimate approximate values for the given system. Applying a convolutional neural network, we show that it is possible to reach good agreement with traditional computations on a scale that allows us to rapidly screen tens of thousands of DHA/VHF photocouple derivatives, eliminating bad candidates and allowing computational resources to be directed toward meaningful compounds.

Subjects

Machine Learning; Neural Networks, Computer; Solar Energy; Isomerism

6.

Substituent Control of σ-Interference Effects in the Transmission of Saturated Molecules.

Garner, Marc H; Koerstz, Mads; Jensen, Jan H; Solomon, Gemma C.

ACS Phys Chem Au ; 2(4): 282-288, 2022 Jul 27.

Article in English | MEDLINE | ID: mdl-36855417

Abstract

The single-molecule conductance of saturated molecules can potentially be fully suppressed by destructive quantum interference in their σ-system. However, only a few molecules with σ-interference have been identified, and the structure-property relationship remains to be elucidated. Here, we explore the role of substituents in modulating the electronic transmission of saturated molecules. In functionalized bicyclo[2.2.2]octanes, the transmission is suppressed by σ-interference when fluorine substituents are applied. For bicyclo[2.2.2]octasilane and -octagermanes, the transmission is suppressed when carbon-based substituents are used, and such molecules are likely to be highly insulating. For the carbon-based substituents, we find a strong correlation between the appropriate Hammett constants and the transmission. The substituent effect enables systematic optimization of the insulating properties of saturated molecular cores.

7.

Virtual screening of norbornadiene-based molecular solar thermal energy storage systems using a genetic algorithm.

Ree, Nicolai; Koerstz, Mads; Mikkelsen, Kurt V; Jensen, Jan H.

J Chem Phys ; 155(18): 184105, 2021 Nov 14.

Article in English | MEDLINE | ID: mdl-34773961

Abstract

We present a computational methodology for the screening of a chemical space of 10^25 substituted norbornadiene molecules for promising kinetically stable molecular solar thermal (MOST) energy storage systems with high energy densities that absorb in the visible part of the solar spectrum. We use semiempirical tight-binding methods to construct a dataset of nearly 34 000 molecules and train graph convolutional networks to predict energy densities, kinetic stability, and absorption spectra and then use the models together with a genetic algorithm to search the chemical space for promising MOST energy storage systems. We identify 15 kinetically stable molecules, five of which have energy densities greater than 0.45 MJ/kg, and the main conclusion of this study is that the largest energy density that can be obtained for a single norbornadiene moiety with the substituents considered here, while maintaining a long half-life and absorption in the visible spectrum, is around 0.55 MJ/kg.

8.

RegioSQM20: improved prediction of the regioselectivity of electrophilic aromatic substitutions.

Ree, Nicolai; Göller, Andreas H; Jensen, Jan H.

J Cheminform ; 13(1): 10, 2021 Feb 12.

Article in English | MEDLINE | ID: mdl-33579374

Abstract

We present RegioSQM20, a new version of RegioSQM (Chem Sci 9:660, 2018), which predicts the regioselectivities of electrophilic aromatic substitution (EAS) reactions from the calculation of proton affinities. The following improvements have been made: The open source semiempirical tight binding program xtb is used instead of the closed source MOPAC program. Any low energy tautomeric forms of the input molecule are identified and regioselectivity predictions are made for each form. Finally, RegioSQM20 offers a qualitative prediction of the reactivity of each tautomer (low, medium, or high) based on the reaction center with the highest proton affinity. The inclusion of tautomers increases the success rate from 90.7 to 92.7%. RegioSQM20 is compared to two machine learning based models: one developed by Struble et al. (React Chem Eng 5:896, 2020) specifically for regioselectivity predictions of EAS reactions (WLN) and a more generally applicable reactivity predictor (IBM RXN) developed by Schwaller et al. (ACS Cent Sci 5:1572, 2019). RegioSQM20 and WLN offer roughly the same success rates for the entire data sets (without considering tautomers), while WLN is many orders of magnitude faster. The accuracy of the more general IBM RXN approach is somewhat lower: 76.3-85.0%, depending on the data set. The code is freely available under the MIT open source license and will be made available as a webservice (regiosqm.org) in the near future.

9.

A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space.

Jensen, Jan H.

Chem Sci ; 10(12): 3567-3572, 2019 Mar 28.

Article in English | MEDLINE | ID: mdl-30996948

Abstract

This paper presents a comparison of a graph-based genetic algorithm (GB-GA) and machine learning (ML) results for the optimization of log P values with a constraint for synthetic accessibility and shows that the GA is as good as or better than the ML approaches for this particular property. The molecules found by the GB-GA bear little resemblance to the molecules used to construct the initial mating pool, indicating that the GB-GA approach can traverse a relatively large distance in chemical space using relatively few (50) generations. The paper also introduces a new non-ML graph-based generative model (GB-GM) that can be parameterized using very small data sets and combined with a Monte Carlo tree search (MCTS) algorithm. The results are comparable to previously published results (Sci. Technol. Adv. Mater., 2017, 18, 972-976) using a recurrent neural network (RNN) generative model, and the GB-GM-based method is several orders of magnitude faster. The MCTS results seem more dependent on the composition of the training set than the GA approach for this particular property. Our results suggest that the performance of new ML-based generative models should be compared to that of more traditional, and often simpler, approaches such as a GA.

10.

The Bicyclo[2.2.2]octane Motif: A Class of Saturated Group 14 Quantum Interference Based Single-Molecule Insulators.

Garner, Marc H; Koerstz, Mads; Jensen, Jan H; Solomon, Gemma C.

J Phys Chem Lett ; 9(24): 6941-6947, 2018 Dec 20.

Article in English | MEDLINE | ID: mdl-30484655

Abstract

The electronic transmission through σ-conjugated molecules can be fully suppressed by destructive quantum interference, which makes them potential candidates for single-molecule insulators. The first molecule with clear suppression of the single-molecule conductance due to σ-interference was recently found in the form of a functionalized bicyclo[2.2.2]octasilane. Here we continue the search for potential single-molecule insulators based on saturated group 14 molecules. Using a high-throughput screening approach, we assess the electron transport properties of the bicyclo[2.2.2]octane class by systematically varying the constituent atoms between carbon, silicon, and germanium, thus exploring the full chemical space of 771 different molecules. The majority of the molecules in the bicyclo[2.2.2]octane class are found to be highly insulating molecules. Though the all-silicon molecule is a clear-cut case of σ-interference, it is not unique within its class and there are many potential molecules that we predict to be more insulating. The finding of this class of quantum interference based single-molecule insulators indicates that a broad range of highly insulating saturated group 14 molecules are likely to exist.

11.

Improving solvation energy predictions using the SMD solvation method and semiempirical electronic structure methods.

Kromann, Jimmy C; Steinmann, Casper; Jensen, Jan H.

J Chem Phys ; 149(10): 104102, 2018 Sep 14.

Article in English | MEDLINE | ID: mdl-30219007

Abstract

The PM6 implementation in the GAMESS program is extended to elements requiring d-integrals and interfaced with the conductor-like polarized continuum model of solvation, including gradients. The accuracy of aqueous solvation energies computed using AM1, PM3, PM6, and DFT tight binding (DFTB) and the Solvation Model Density (SMD) continuum solvation model is tested using the Minnesota Solvation Database data set. The errors in SMD solvation energies predicted using Neglect of Diatomic Differential Overlap (NDDO)-based methods are considerably larger than when using density functional theory (DFT) and HF, with root mean square error (RMSE) values of 3.4-5.9 (neutrals) and 6-15 kcal/mol (ions) compared to 2.4 and ~5 kcal/mol for HF/6-31G(d). For the NDDO-based methods, the errors are especially large for cations and considerably higher than the corresponding conductor-like screening model results, which suggests that the NDDO/SMD results can be improved by re-parameterizing the SMD parameters focusing on ions. We found that the best results are obtained by changing only the radii for hydrogen, carbon, oxygen, nitrogen, and sulfur, and this leads to RMSE values for PM3 (neutrals: 2.8/ions: ~5 kcal/mol), PM6 (4.7/~5 kcal/mol), and DFTB (3.9/~5 kcal/mol) that are more comparable to HF/6-31G(d) (2.4/~5 kcal/mol). Although the radii are optimized to reproduce aqueous solvation energies, they also lead to more accurate predictions for other polar solvents such as dimethyl sulfoxide, acetonitrile, and methanol, while the improvements for non-polar solvents are negligible.

12.

Fast and accurate prediction of the regioselectivity of electrophilic aromatic substitution reactions.

Kromann, Jimmy C; Jensen, Jan H; Kruszyk, Monika; Jessing, Mikkel; Jørgensen, Morten.

Chem Sci ; 9(3): 660-665, 2018 Jan 21.

Article in English | MEDLINE | ID: mdl-29629133

Abstract

While computational prediction of chemical reactivity is possible, it usually requires expert knowledge, and there are relatively few computational tools that can be used by a bench chemist to help guide synthesis. The RegioSQM method for predicting the regioselectivity of electrophilic aromatic substitution reactions of heteroaromatic systems is presented in this paper. RegioSQM protonates all aromatic C-H carbon atoms and identifies those with the lowest free energies in chloroform using the PM3 semiempirical method as the most nucleophilic centers. These positions are found to correlate qualitatively with the regiochemical outcome in a retrospective analysis of 96% of more than 525 literature examples of electrophilic aromatic halogenation reactions. The method is automated and requires only a SMILES string of the molecule of interest, which can easily be generated using chemical drawing programs such as ChemDraw. The computational cost is 1-10 minutes per molecule depending on size, using relatively modest computational resources, and the method is freely available via a web server at http://www.regiosqm.org. RegioSQM should therefore be of practical use in the planning of organic synthesis.
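The selection step described in this abstract (pick the protonation sites with the lowest free energies) might be sketched as follows. This is only an illustration, not the RegioSQM implementation: the function name, the input format, and the `cutoff` value are assumptions, and the free energies would come from separate PM3 calculations that are not shown.

```python
def most_nucleophilic_sites(protonation_free_energies, cutoff=1.0):
    """Return atom indices predicted as the most nucleophilic centers.

    protonation_free_energies: dict mapping the index of each aromatic C-H
    carbon to the free energy (kcal/mol) of the isomer protonated there.
    cutoff: hypothetical energy window (kcal/mol) above the lowest value
    within which a site is still reported.
    """
    if not protonation_free_energies:
        return []
    lowest = min(protonation_free_energies.values())
    return sorted(
        atom
        for atom, free_energy in protonation_free_energies.items()
        if free_energy - lowest <= cutoff
    )
```

A window rather than a single minimum is used here because several sites can be close in energy; how wide that window should be is a modelling choice this sketch leaves open.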

13.

Random versus Systematic Errors in Reaction Enthalpies Computed Using Semiempirical and Minimal Basis Set Methods.

Kromann, Jimmy C; Welford, Alexander; Christensen, Anders S; Jensen, Jan H.

ACS Omega ; 3(4): 4372-4377, 2018 Apr 30.

Article in English | MEDLINE | ID: mdl-31458662

Abstract

The connectivity-based hierarchy (CBH) protocol for computing accurate reaction enthalpies developed by Sengupta and Raghavachari is tested for fast ab initio methods (PBEh-3c, HF-3c, and HF/STO-3G), tight-binding density functional theory (DFT) methods (GFN-xTB, DFTB, and DFTB-D3), and neglect-of-diatomic-differential-overlap (NDDO)-based semiempirical methods (AM1, PM3, PM6, PM6-DH+, PM6-D2, PM6-D3H+, PM6-D3H4X, PM7, and OM2) using the same set of 25 reactions as in the original study. For the CBH-2 scheme, which reflects the change in the immediate chemical environment of all of the heavy atoms, the respective mean unsigned errors relative to G4 for PBEh-3c, HF-3c, HF/STO-3G, GFN-xTB, DFTB-D3, DFTB, PM3, AM1, PM6, PM6-DH+, PM6-D3, PM6-D3H+, PM6-D3H4X, PM7, and OM2 are 1.9, 2.4, 3.0, 3.9, 3.7, 4.5, 4.8, 5.5, 5.4, 5.3, 5.4, 6.5, 5.3, 5.2, and 5.9 kcal/mol, with a single outlier removed for HF-3c, PM6, PM6-DH+, PM6-D3, PM6-D3H4X, and PM7. The increase in accuracy for the NDDO-based methods is relatively modest due to the random errors in predicted heats of formation.

14.

Intermolecular interactions in the condensed phase: Evaluation of semi-empirical quantum mechanical methods.

Christensen, Anders S; Kromann, Jimmy C; Jensen, Jan H; Cui, Qiang.

J Chem Phys ; 147(16): 161704, 2017 Oct 28.

Article in English | MEDLINE | ID: mdl-29096452

Abstract

To facilitate further development of approximate quantum mechanical methods for condensed phase applications, we present a new benchmark dataset of intermolecular interaction energies in the solution phase for a set of 15 dimers, each containing one charged monomer. The reference interaction energy in solution is computed via a thermodynamic cycle that integrates dimer binding energy in the gas phase at the coupled cluster level and solute-solvent interaction with density functional theory; the estimated uncertainty of such calculated interaction energy is ±1.5 kcal/mol. The dataset is used to benchmark the performance of a set of semi-empirical quantum mechanical (SQM) methods that include DFTB3-D3, DFTB3/CPE-D3, OM2-D3, PM6-D3, PM6-D3H+, and PM7 as well as the HF-3c method. We find that while all tested SQM methods tend to underestimate binding energies in the gas phase with a root-mean-squared error (RMSE) of 2-5 kcal/mol, they overestimate binding energies in the solution phase with an RMSE of 3-4 kcal/mol, with the exception of DFTB3/CPE-D3 and OM2-D3, for which the systematic deviation is less pronounced. In addition, we find that HF-3c systematically overestimates binding energies in both gas and solution phases. As most approximate QM methods are parametrized and evaluated using data measured or calculated in the gas phase, the dataset represents an important first step toward calibrating QM based methods for application in the condensed phase where polarization and exchange repulsion need to be treated in a balanced fashion.

15.

pICalculax: Improved Prediction of Isoelectric Point for Modified Peptides.

Bjerrum, Esben J; Jensen, Jan H; Tolborg, Jakob L.

J Chem Inf Model ; 57(8): 1723-1727, 2017 Aug 28.

Article in English | MEDLINE | ID: mdl-28671456

Abstract

The isoelectric point of a peptide is a physicochemical property that can be accurately predicted from the sequence of the peptide when the peptide is built from natural amino acids. Peptides can however have chemical modifications, such as phosphorylations, amidations, and unnatural amino acids, which can result in erroneous predictions if not accounted for. Here we report on an open source program, pICalculax, which in an extensible way can handle pI calculations of modified peptides. Tests on a database of modified peptides and experimentally determined pI values show an improvement in pI predictions when taking the modifications into account. The correlation coefficient improves from 0.45 to 0.91, and the root-mean-square deviation likewise improves from 3.3 to 0.9. The program is available at https://github.com/EBjerrum/pICalculax.
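To make the role of modifications concrete, here is a minimal sketch of a pI calculation in which extra ionizable groups can be supplied. It is not pICalculax: the pKa values are rough illustrative assumptions, the function names are hypothetical, and the net charge is taken from the Henderson-Hasselbalch equation with the pI located by bisection.

```python
# Illustrative pKa values (assumed for this sketch; not pICalculax parameters).
ACIDIC_PKA = {"C_TERM": 3.6, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.5}
BASIC_PKA = {"N_TERM": 8.0, "H": 6.0, "K": 10.5, "R": 12.5}


def net_charge(sequence, ph, extra_acidic=(), extra_basic=()):
    """Net charge of a peptide at a given pH.

    extra_acidic / extra_basic: pKa values of ionizable groups introduced by
    modifications (for example the two pKa values of a phosphate group).
    """
    acidic = [ACIDIC_PKA["C_TERM"]]
    acidic += [ACIDIC_PKA[aa] for aa in sequence if aa in ACIDIC_PKA]
    acidic += list(extra_acidic)
    basic = [BASIC_PKA["N_TERM"]]
    basic += [BASIC_PKA[aa] for aa in sequence if aa in BASIC_PKA]
    basic += list(extra_basic)
    negative = sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in acidic)
    positive = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in basic)
    return positive - negative


def isoelectric_point(sequence, extra_acidic=(), extra_basic=(), tol=1e-4):
    """Find the pH of zero net charge by bisection on the interval [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(sequence, mid, extra_acidic, extra_basic) > 0:
            low = mid  # still positively charged: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2
```

For example, `isoelectric_point("GG", extra_acidic=(1.2, 6.5))` treats a phosphate-like modification as two additional acidic groups (both pKa values are assumed), which shifts the predicted pI downward relative to the unmodified sequence.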

Subjects

Computational Biology/methods; Peptides/chemistry; Software; Algorithms; Isoelectric Point; User-Computer Interface

16.

Protein structure refinement using a quantum mechanics-based chemical shielding predictor.

Bratholm, Lars A; Jensen, Jan H.

Chem Sci ; 8(3): 2061-2072, 2017 Mar 01.

Article in English | MEDLINE | ID: mdl-28451325

Abstract

The accurate prediction of protein chemical shifts using a quantum mechanics (QM)-based method has been the subject of intense research for more than 20 years, but so far empirical methods for chemical shift prediction have proven more accurate. In this paper we show that a QM-based predictor of protein backbone and CB chemical shifts (ProCS15, PeerJ, 2016, 3, e1344) is of comparable accuracy to empirical chemical shift predictors after chemical shift-based structural refinement that removes small structural errors. We present a method by which quantum chemistry based predictions of isotropic chemical shielding values (ProCS15) can be used to refine protein structures using Markov Chain Monte Carlo (MCMC) simulations, relating the chemical shielding values to the experimental chemical shifts probabilistically. Two kinds of MCMC structural refinement simulations were performed using force field geometry optimized X-ray structures as starting points: simulated annealing of the starting structure and constant temperature MCMC simulation followed by simulated annealing of a representative ensemble structure. Annealing of the CHARMM structure changes the CA-RMSD by an average of 0.4 Å but lowers the chemical shift RMSD by 1.0 and 0.7 ppm for CA and N. Conformational averaging has a relatively small effect (0.1-0.2 ppm) on the overall agreement with carbon chemical shifts but lowers the error for nitrogen chemical shifts by 0.4 ppm. If an amino acid specific offset is included the ProCS15 predicted chemical shifts have RMSD values relative to experiments that are comparable to popular empirical chemical shift predictors. The annealed representative ensemble structures differ in CA-RMSD relative to the initial structures by an average of 2.0 Å, with >2.0 Å difference for six proteins. In four of the cases, the largest structural differences arise in structurally flexible regions of the protein as determined by NMR, and in the remaining two cases, the large structural change may be due to force field deficiencies. The overall accuracy of the empirical methods is slightly improved by annealing the CHARMM structure with ProCS15, which may suggest that the minor structural changes introduced by ProCS15-based annealing improve the accuracy of the protein structures. Having established that QM-based chemical shift prediction can deliver the same accuracy as empirical shift predictors we hope this can help increase the accuracy of related approaches such as QM/MM or linear scaling approaches or interpreting protein structural dynamics from QM-derived chemical shift.

17.

Prediction of pK_a Values for Druglike Molecules Using Semiempirical Quantum Chemical Methods.

Jensen, Jan H; Swain, Christopher J; Olsen, Lars.

J Phys Chem A ; 121(3): 699-707, 2017 Jan 26.

Article in English | MEDLINE | ID: mdl-28054775

Abstract

Rapid yet accurate pKa prediction for druglike molecules is a key challenge in computational chemistry. This study uses PM6-DH+/COSMO, PM6/COSMO, PM7/COSMO, PM3/COSMO, AM1/COSMO, PM3/SMD, AM1/SMD, and DFTB3/SMD to predict the pKa values of 53 amine groups in 48 druglike compounds. The approach uses an isodesmic reaction where the pKa value is computed relative to a chemically related reference compound for which the pKa value has been measured experimentally or estimated using a standard empirical approach. The AM1- and PM3-based methods perform best with RMSE values of 1.4-1.6 pH units that have uncertainties of ±0.2-0.3 pH units, which make them statistically equivalent. However, for all but PM3/SMD and AM1/SMD the RMSEs are dominated by a single outlier, cefadroxil, caused by proton transfer in the zwitterionic protonation state. If this outlier is removed, the RMSE values for PM3/COSMO and AM1/COSMO drop to 1.0 ± 0.2 and 1.1 ± 0.3, whereas PM3/SMD and AM1/SMD remain at 1.5 ± 0.3 and 1.6 ± 0.3/0.4 pH units, making the COSMO-based predictions statistically better than the SMD-based predictions. For pKa calculations where a zwitterionic state is not involved or proton transfer in a zwitterionic state is not observed, PM3/COSMO or AM1/COSMO is the best pKa prediction method; otherwise PM3/SMD or AM1/SMD should be used. Thus, fast and relatively accurate pKa prediction for 100-1000s of druglike amines is feasible with the current setup and relatively modest computational resources.
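The isodesmic approach mentioned in this abstract can be written as a one-line formula. The sketch below is an illustration under stated assumptions rather than the authors' code: `delta_g` is taken to be the free energy change (kcal/mol) of the proton exchange BH+ + Ref -> B + RefH+, computed elsewhere, and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 0.0019872043


def isodesmic_pka(delta_g, ref_pka, temperature=298.15):
    """pKa of BH+ relative to a reference compound.

    delta_g: free energy change (kcal/mol) of BH+ + Ref -> B + RefH+
             (assumed to be supplied by a separate calculation).
    ref_pka: experimental or estimated pKa of the reference RefH+.
    """
    return ref_pka + delta_g / (R_KCAL * temperature * math.log(10))
```

At 298.15 K, about 1.36 kcal/mol of free energy corresponds to one pKa unit, so an RMSE of 1.4-1.6 pH units reflects errors of roughly 2 kcal/mol in the relative free energies.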

Subjects

Amines/chemistry; Quantum Theory; Hydrogen-Ion Concentration; Molecular Structure; Thermodynamics

18.

Prediction of pKa values using the PM6 semiempirical method.

Kromann, Jimmy C; Larsen, Frej; Moustafa, Hadeel; Jensen, Jan H.

PeerJ ; 4: e2335, 2016.

Article in English | MEDLINE | ID: mdl-27602298

Abstract

The PM6 semiempirical method and the dispersion and hydrogen bond-corrected PM6-D3H+ method are used together with the SMD and COSMO continuum solvation models to predict pKa values of pyridines, alcohols, phenols, benzoic acids, carboxylic acids, and amines using isodesmic reactions, and the results are compared to published ab initio results. The pKa values of pyridines, alcohols, phenols, and benzoic acids considered in this study can generally be predicted with PM6 and ab initio methods to within the same overall accuracy, with average mean absolute differences (MADs) of 0.6-0.7 pH units. For carboxylic acids, the accuracy (0.7-1.0 pH units) is also comparable to ab initio results if a single outlier is removed. For primary, secondary, and tertiary amines the accuracy is, respectively, similar (0.5-0.6), slightly worse (0.5-1.0), and worse (1.0-2.5), provided that di- and tri-ethylamine are used as reference molecules for secondary and tertiary amines. When applied to a drug-like molecule where an empirical pKa predictor exhibits a large (4.9 pH unit) error, we find that the errors for PM6-based predictions are roughly the same in magnitude but opposite in sign. As a result, most of the PM6-based methods predict the correct protonation state at physiological pH, while the empirical predictor does not. The computational cost is around 2-5 min per conformer per core processor, making PM6-based pKa prediction computationally efficient enough to be used for high-throughput screening using on the order of 100 core processors.

19.

Towards a barrier height benchmark set for biologically relevant systems.

Kromann, Jimmy C; Christensen, Anders S; Cui, Qiang; Jensen, Jan H.

PeerJ ; 4: e1994, 2016.

Article in English | MEDLINE | ID: mdl-27168993

Abstract

We have collected computed barrier heights and reaction energies (and associated model structures) for five enzymes from studies published by Himo and co-workers. Using this data, obtained at the B3LYP/6-311+G(2d,2p)[LANL2DZ]//B3LYP/6-31G(d,p) level of theory, we then benchmark PM6, PM7, PM7-TS, and DFTB3 and discuss the influence of system size, bulk solvation, and geometry re-optimization on the error. The mean absolute differences (MADs) observed for these five enzyme model systems are similar to those observed for PM6 and PM7 for smaller systems (10-15 kcal/mol), while DFTB results in a MAD that is significantly lower (6 kcal/mol). The MADs for PMx and DFTB3 are each dominated by large errors for a single system and if the system is disregarded the MADs fall to 4-5 kcal/mol. Overall, results for the condensed phase are neither more nor less accurate relative to B3LYP than those in the gas phase. With the exception of PM7-TS, the MADs for small and large structural models are very similar, with a maximum deviation of 3 kcal/mol for PM6. Geometry optimization with PM6 shows that for one system this method predicts a different mechanism compared to B3LYP/6-31G(d,p). For the remaining systems, geometry optimization of the large structural model increases the MAD relative to single points, by 2.5 and 1.8 kcal/mol for barriers and reaction energies. For the small structural model, the corresponding MADs decrease by 0.4 and 1.2 kcal/mol, respectively. However, despite these small changes, significant changes in the structures are observed for some systems, such as proton transfer and hydrogen bonding rearrangements. The paper represents the first step in the process of creating a benchmark set of barriers computed for systems that are relatively large and representative of enzymatic reactions, a considerable challenge for any one research group but possible through a concerted effort by the community. We end by outlining steps needed to expand and improve the data set and how other researchers can contribute to the process.

20.

ProCS15: a DFT-based chemical shift predictor for backbone and Cβ atoms in proteins.

Larsen, Anders S; Bratholm, Lars A; Christensen, Anders S; Channir, Maher; Jensen, Jan H.

PeerJ ; 3: e1344, 2015.

Article in English | MEDLINE | ID: mdl-26623185

Abstract

We present ProCS15: a program that computes the isotropic chemical shielding values of backbone and Cβ atoms given a protein structure in less than a second. ProCS15 is based on around 2.35 million OPBE/6-31G(d,p)//PM6 calculations on tripeptides and small structural models of hydrogen-bonding. The ProCS15-predicted chemical shielding values are compared to experimentally measured chemical shifts for Ubiquitin and the third IgG-binding domain of Protein G through linear regression and yield RMSD values of up to 2.2, 0.7, and 4.8 ppm for carbon, hydrogen, and nitrogen atoms. These RMSD values are very similar to corresponding RMSD values computed using OPBE/6-31G(d,p) for the entire structure for each protein. These maximum RMSD values can be reduced by using NMR-derived structural ensembles of Ubiquitin. For example, for the largest ensemble the largest RMSD values are 1.7, 0.5, and 3.5 ppm for carbon, hydrogen, and nitrogen. The corresponding RMSD values predicted by several empirical chemical shift predictors range between 0.7-1.1, 0.2-0.4, and 1.8-2.8 ppm for carbon, hydrogen, and nitrogen atoms, respectively.
