Search | BVS España Search Portal

1.

When Do Quantum Mechanical Descriptors Help Graph Neural Networks to Predict Chemical Properties?

Li, Shih-Cheng; Wu, Haoyang; Menon, Angiras; Spiekermann, Kevin A; Li, Yi-Pei; Green, William H.

J Am Chem Soc ; 146(33): 23103-23120, 2024 Aug 21.

Article in English | MEDLINE | ID: mdl-39106041

ABSTRACT

Deep graph neural networks are extensively utilized to predict chemical reactivity and molecular properties. However, because of the complexity of chemical space, such models often have difficulty extrapolating beyond the chemistry contained in the training set. Augmenting the model with quantum mechanical (QM) descriptors is anticipated to improve its generalizability. However, obtaining QM descriptors often requires CPU-intensive computational chemistry calculations. To identify when QM descriptors help graph neural networks predict chemical properties, we conduct a systematic investigation of the impact of atom, bond, and molecular QM descriptors on the performance of directed message passing neural networks (D-MPNNs) for predicting 16 molecular properties. The analysis surveys computational and experimental targets, as well as classification and regression tasks, and varied data set sizes from several hundred to hundreds of thousands of data points. Our results indicate that QM descriptors are mostly beneficial for D-MPNN performance on small data sets, provided that the descriptors correlate well with the targets and can be readily computed with high accuracy. Otherwise, using QM descriptors can add cost without benefit or even introduce unwanted noise that can degrade model performance. Strategic integration of QM descriptors with D-MPNN unlocks potential for physics-informed, data-efficient modeling with some interpretability that can streamline de novo drug and material designs. To facilitate the use of QM descriptors in machine learning workflows for chemistry, we provide a set of guidelines regarding when and how to best leverage QM descriptors, a high-throughput workflow to compute them, and an enhancement to Chemprop, a widely adopted open-source D-MPNN implementation for chemical property prediction.

2.

Tailoring parameters for QM/MM simulations: accurate modeling of adsorption and catalysis in zirconium-based metal-organic frameworks.

Kao, Yu-Chi; Wang, Yi-Ming; Yeh, Jyun-Yi; Li, Shih-Cheng; Wu, Kevin C-W; Lin, Li-Chiang; Li, Yi-Pei.

Phys Chem Chem Phys ; 26(30): 20388-20398, 2024 Jul 31.

Article in English | MEDLINE | ID: mdl-39015995

ABSTRACT

Quantum mechanics/molecular mechanics (QM/MM) simulations offer an efficient way to model reactions occurring in complex environments. This study introduces a specialized set of charge and Lennard-Jones parameters tailored for electrostatically embedded QM/MM calculations, aiming to accurately model both adsorption processes and catalytic reactions in zirconium-based metal-organic frameworks (Zr-MOFs). To validate our approach, we compare adsorption energies derived from QM/MM simulations against experimental results and Monte Carlo simulation outcomes. The developed parameters showcase the ability of QM/MM simulations to represent long-range electrostatic and van der Waals interactions faithfully. This capability is evidenced by the prediction of adsorption energies with a low root mean square error of 1.1 kcal mol⁻¹ across a wide range of adsorbates. The practical applicability of our QM/MM model is further illustrated through the study of glucose isomerization and epimerization reactions catalyzed by two structurally distinct Zr-MOF catalysts, UiO-66 and MOF-808. Our QM/MM calculations closely align with experimental activation energies. Importantly, the parameter set introduced here is compatible with the widely used universal force field (UFF). Moreover, we thoroughly explore how the size of the cluster model and the choice of density functional theory (DFT) methodologies influence the simulation outcomes. This work provides an accurate and computationally efficient framework for modeling complex catalytic reactions within Zr-MOFs, contributing valuable insights into their mechanistic behaviors and facilitating further advancements in this dynamic area of research.
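As a minimal sketch of the error metric quoted in this abstract, the root mean square error between computed and reference adsorption energies could be evaluated as below. The function name and the energy values are hypothetical placeholders for illustration; they are not data or code from the study.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical adsorption energies in kcal/mol (illustration only).
qmmm_energies = [-10.0, -12.0, -8.0]
reference_energies = [-11.0, -12.0, -9.0]
error = rmse(qmmm_energies, reference_energies)
```

Because the deviations are squared before averaging, a single badly predicted adsorbate raises this metric more than it would raise a mean absolute error.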

3.

Fast Water Transport in UTSA-280 via a Knock-Off Mechanism.

Hsu, Cheng-Hsun; Yu, Hsin-Yu; Lee, Ho Jun; Wu, Pei-Hao; Huang, Shing-Jong; Lee, Jong Suk; Yu, Tsyr-Yan; Li, Yi-Pei; Kang, Dun-Yen.

Angew Chem Int Ed Engl ; 62(39): e202309874, 2023 Sep 25.

Article in English | MEDLINE | ID: mdl-37574451

ABSTRACT

Water and other small molecules frequently coordinate within metal-organic frameworks (MOFs). These coordinated molecules may actively engage in mass transfer, moving together with the transport molecules, but this phenomenon has yet to be examined. In this study, we explore a unique water transfer mechanism in UTSA-280, where an incoming water molecule can displace a coordinated molecule for mass transfer. We refer to this process as the "knock-off" mechanism. Despite UTSA-280 possessing one-dimensional channels, the knock-off transport enables water movement along the other two axes, effectively simulating a pseudo-three-dimensional mass transfer. Even with a relatively narrow pore width, the knock-off mechanism enables a high water flux in the UTSA-280 membrane. The knock-off mechanism also gives UTSA-280 superior water/ethanol diffusion selectivity for pervaporation. To validate this unique mechanism, we conducted ¹H and ²H solid-state NMR on UTSA-280 after the adsorption of deuterated water. We also derived potential energy diagrams from density functional theory to gain atomic-level insight into the knock-off and the direct-hopping mechanisms. The simulation findings reveal that the energy barrier of the knock-off mechanism is marginally lower than that of the direct-hopping pathway, implying its potential role in enhancing water diffusion in UTSA-280.

4.

Deep Learning-Based Increment Theory for Formation Enthalpy Predictions.

Chen, Lung-Yi; Hsu, Ting-Wei; Hsiung, Tsai-Chen; Li, Yi-Pei.

J Phys Chem A ; 126(41): 7548-7556, 2022 Oct 20.

Article in English | MEDLINE | ID: mdl-36217924

ABSTRACT

Machine learning predictions of molecular thermochemistry, such as formation enthalpy, have been limited for large and complicated species because of the lack of available training data. Such predictions would be important in the prediction of reaction thermodynamics and the construction of kinetic models. Herein, we introduce a graph-based deep learning approach that can separately learn the enthalpy contribution of each atom in its local environment with the effect of the overall molecular structure taken into account. Because this approach follows the additivity scheme of increment theory, it can be generalized to larger and more complicated species not present in the training data. By training the model on molecules with up to 11 heavy atoms, it can predict the formation enthalpy of testing molecules with up to 42 heavy atoms with a mean absolute error of 2 kcal/mol, which is less than half of the error of the conventional increment theory. We expect that this approach will also enable rapid prediction of other extensive properties of large molecules that are difficult to derive from experiments or ab initio calculation.
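The additivity scheme mentioned in this abstract can be sketched in a few lines: the molecular enthalpy is the sum of per-atom contributions, so molecules larger than any training example need no new parameters. In the approach described, the contributions would come from a trained graph model; here the lookup table, its labels, and its numbers are hypothetical placeholders used only to illustrate the summation.

```python
# Hypothetical per-atom enthalpy contributions in kcal/mol, keyed by a
# label for the atom's local environment. These numbers are placeholders,
# not fitted or published values.
ATOM_CONTRIBUTIONS = {
    "C_methyl": -10.0,
    "C_methylene": -5.0,
}


def formation_enthalpy(atom_environments):
    """Sum per-atom contributions, following the additivity scheme."""
    return sum(ATOM_CONTRIBUTIONS[env] for env in atom_environments)


# The same two contributions cover chains of any length.
propane_like = ["C_methyl", "C_methylene", "C_methyl"]
hexane_like = ["C_methyl"] + ["C_methylene"] * 4 + ["C_methyl"]
small = formation_enthalpy(propane_like)
large = formation_enthalpy(hexane_like)
```

A fixed table ignores the wider molecular structure, which is the limitation the learned per-atom contributions in the abstract are meant to address.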

5.

Permeable reactive barrier of waste sludge from wine processing utilized to block a metallic mixture plume in a simulated aquifer.

Chien, Shui-Wen Chang; Li, Yi-Pei; Liu, Cheng-Chung.

Water Sci Technol ; 84(9): 2472-2485, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34810325

ABSTRACT

Heavy metal contamination in underground water commonly occurs in industrial areas in Taiwan. Wine-processing waste sludge (WPWS) can adsorb and remove several toxic metals from aqueous solutions. In this study, WPWS particles were used to construct a permeable reactive barrier (PRB) for the remediation of a contaminant plume comprising HCrO₄⁻, Cu²⁺, Zn²⁺, Ni²⁺, Cd²⁺, and AsO₃³⁻ in a simulated aquifer. This PRB effectively prevented the dispersal of Cu²⁺, Zn²⁺, and HCrO₄⁻, and their concentrations in the pore water behind the barrier declined below the control standard levels. However, the PRB failed to prevent the diffusion of Ni²⁺, Cd²⁺, and AsO₃³⁻, and their concentrations were occasionally higher than the control standard levels. Nevertheless, 18% to 45% of As, 84% to 93% of Cd, and 16% to 77% of Ni were removed by the barrier. Ni ions showed less adsorption on the fine sand layer because of the layer's ineffectiveness in multiple competitive adsorptions. Therefore, the ions infiltrated the barrier at a high concentration, which increased the loading for the barrier blocking. The blocking efficiency was related to the degree of adsorption of heavy metals in the sand layer and the results of their competitive adsorption.

Subject(s)

Groundwater; Metals, Heavy; Wine; Adsorption; Metals, Heavy/analysis; Sewage

6.

Diels-Alder Conversion of Acrylic Acid and 2,5-Dimethylfuran to para-Xylene Over Heterogeneous Bi-BTC Metal-Organic Framework Catalysts Under Mild Conditions.

Yeh, Jyun-Yi; Chen, Season S; Li, Shih-Cheng; Chen, Celine H; Shishido, Tetsuya; Tsang, Daniel C W; Yamauchi, Yusuke; Li, Yi-Pei; Wu, Kevin C-W.

Angew Chem Int Ed Engl ; 60(2): 624-629, 2021 Jan 11.

Article in English | MEDLINE | ID: mdl-33078542

ABSTRACT

The heterogeneous metal-organic framework Bi-BTC successfully catalyzed the synthesis of para-xylene from bio-based 2,5-dimethylfuran and acrylic acid in a promising yield (92%), under relatively mild conditions (160 °C, 10 bar), and with a low reaction-energy barrier (47.3 kJ mol⁻¹). The proposed reaction strategy also demonstrates a remarkable versatility for furan derivatives such as furan and 2-methylfuran.

7.

Evaluating Scalable Uncertainty Estimation Methods for Deep Learning-Based Molecular Property Prediction.

Scalia, Gabriele; Grambow, Colin A; Pernici, Barbara; Li, Yi-Pei; Green, William H.

J Chem Inf Model ; 60(6): 2697-2717, 2020 Jun 22.

Article in English | MEDLINE | ID: mdl-32243154

ABSTRACT

Advances in deep neural network (DNN)-based molecular property prediction have recently led to the development of models of remarkable accuracy and generalization ability, with graph convolutional neural networks (GCNNs) reporting state-of-the-art performance for this task. However, some challenges remain, and one of the most important that needs to be fully addressed concerns uncertainty quantification. DNN performance is affected by the volume and the quality of the training samples. Therefore, establishing when and to what extent a prediction can be considered reliable is just as important as outputting accurate predictions, especially when out-of-domain molecules are targeted. Recently, several methods to account for uncertainty in DNNs have been proposed, most of which are based on approximate Bayesian inference. Among these, only a few scale to the large data sets required in applications. Evaluating and comparing these methods has recently attracted great interest, but results are generally fragmented and absent for molecular property prediction. In this paper, we quantitatively compare scalable techniques for uncertainty estimation in GCNNs. We introduce a set of quantitative criteria to capture different uncertainty aspects and then use these criteria to compare MC-dropout, Deep Ensembles, and bootstrapping, both theoretically in a unified framework that separates aleatoric/epistemic uncertainty and experimentally on public data sets. Our experiments quantify the performance of the different uncertainty estimation methods and their impact on uncertainty-related error reduction. Our findings indicate that Deep Ensembles and bootstrapping consistently outperform MC-dropout, with different context-specific pros and cons. Our analysis leads to a better understanding of the role of aleatoric/epistemic uncertainty, also in relation to the target data set features, and highlights the challenge posed by out-of-domain uncertainty.
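One common way to separate the two kinds of uncertainty named in this abstract, when each ensemble member predicts both a mean and a noise variance, follows the law of total variance: the aleatoric part is the average predicted variance and the epistemic part is the spread of the predicted means. The sketch below is a generic illustration under that assumption, not necessarily the exact formulation used in the paper; the function name and numbers are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split the ensemble predictive variance for one molecule.

    means: predicted mean from each ensemble member
    variances: predicted noise variance from each ensemble member
    """
    aleatoric = fmean(variances)  # average predicted data noise
    epistemic = pvariance(means)  # disagreement between members
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical outputs of a three-member ensemble (illustration only).
aleatoric, epistemic, total = decompose_uncertainty(
    means=[1.0, 2.0, 3.0], variances=[0.5, 0.5, 0.5]
)
```

Under this decomposition, epistemic uncertainty shrinks as the members agree, whereas aleatoric uncertainty reflects noise that additional training data would not remove.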

Subject(s)

Deep Learning; Bayes Theorem; Neural Networks, Computer; Uncertainty

8.

Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry.

Li, Yi-Pei; Han, Kehang; Grambow, Colin A; Green, William H.

J Phys Chem A ; 123(10): 2142-2152, 2019 Mar 14.

Article in English | MEDLINE | ID: mdl-30758953

ABSTRACT

Because collecting precise and accurate chemistry data is often challenging, chemistry data sets usually only span a small region of chemical space, which limits the performance and the scope of applicability of data-driven models. To address this issue, we integrated an active learning machine with automatic ab initio calculations to form a self-evolving model that can continuously adapt to new species appointed by the users. In the present work, we demonstrate the self-evolving concept by modeling the formation enthalpies of stable closed-shell polycyclic species calculated at the B3LYP/6-31G(2df,p) level of theory. By combining a molecular graph convolutional neural network with a dropout training strategy, the model we developed can predict density functional theory (DFT) enthalpies for a broad range of polycyclic species and assess the quality of each predicted value. For the species which the current model is uncertain about, the automatic ab initio calculations provide additional training data to improve the performance of the model. For a test set composed of 2858 cyclic and polycyclic hydrocarbons and oxygenates, the enthalpies predicted by the model agree with the reference DFT values with a root-mean-square error of 2.62 kcal/mol. We found that a model originally trained on hydrocarbons and oxygenates can broaden its prediction coverage to nitrogen-containing species via an active learning process, suggesting that the continuous learning strategy is not only able to improve the model accuracy but is also capable of expanding the predictive capacity of a model to unseen species domains.
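The selection step of the active learning loop described in this abstract can be sketched as a simple filter: species whose predicted uncertainty exceeds a threshold are queued for additional ab initio calculations. The function name, threshold, species identifiers, and numbers below are all hypothetical; the abstract does not specify how the uncertainty cutoff is chosen.

```python
def select_for_calculation(predictions, threshold):
    """Return species whose predicted uncertainty exceeds the threshold.

    predictions: mapping of species identifier -> (value, uncertainty)
    """
    return sorted(
        species
        for species, (_, uncertainty) in predictions.items()
        if uncertainty > threshold
    )


# Hypothetical model outputs: enthalpy and uncertainty in kcal/mol.
predictions = {
    "species_A": (-20.1, 0.4),
    "species_B": (35.7, 3.2),
    "species_C": (12.3, 1.9),
}
queue = select_for_calculation(predictions, threshold=1.5)
```

In a full loop, the queued species would be computed, added to the training set, and the model retrained before the next round of predictions.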

9.

Accurate Thermochemistry with Small Data Sets: A Bond Additivity Correction and Transfer Learning Approach.

Grambow, Colin A; Li, Yi-Pei; Green, William H.

J Phys Chem A ; 123(27): 5826-5835, 2019 Jul 11.

Article in English | MEDLINE | ID: mdl-31246465

ABSTRACT

Machine learning provides promising new methods for accurate yet rapid prediction of molecular properties, including thermochemistry, which is an integral component of many computer simulations, particularly automated reaction mechanism generation. Often, very large data sets with tens of thousands of molecules are required for training the models, but most data sets of experimental or high-accuracy quantum mechanical quality are much smaller. To overcome these limitations, we calculate new high-level data sets and derive bond additivity corrections to significantly improve enthalpies of formation. We adopt a transfer learning technique to train neural network models that achieve good performance even with a relatively small set of high-accuracy data. The training data for the entropy model are carefully selected so that important conformational effects are captured. The resulting models are generally applicable thermochemistry predictors for organic compounds with oxygen and nitrogen heteroatoms that approach experimental and coupled cluster accuracy while only requiring molecular graph inputs. Due to their versatility and the ease of adding new training data, they are poised to replace conventional estimation methods for thermochemical parameters in reaction mechanism generation. Since high-accuracy data are often sparse, similar transfer learning approaches are expected to be useful for estimating many other molecular properties.

10.

Unimolecular Reaction Pathways of a γ-Ketohydroperoxide from Combined Application of Automated Reaction Discovery Methods.

Grambow, Colin A; Jamal, Adeel; Li, Yi-Pei; Green, William H; Zádor, Judit; Suleimanov, Yury V.

J Am Chem Soc ; 140(3): 1035-1048, 2018 Jan 24.

Article in English | MEDLINE | ID: mdl-29271202

ABSTRACT

Ketohydroperoxides are important in liquid-phase autoxidation and in gas-phase partial oxidation and pre-ignition chemistry, but because of their low concentration, instability, and various analytical chemistry limitations, it has been challenging to experimentally determine their reactivity, and only a few pathways are known. In the present work, 75 elementary-step unimolecular reactions of the simplest γ-ketohydroperoxide, 3-hydroperoxypropanal, were discovered by a combination of density functional theory with several automated transition-state search algorithms: the Berny algorithm coupled with the freezing string method, single- and double-ended growing string methods, the heuristic KinBot algorithm, and the single-component artificial force induced reaction method (SC-AFIR). The present joint approach significantly outperforms previous manual and automated transition-state searches: 68 of the reactions of γ-ketohydroperoxide discovered here were previously unknown and completely unexpected. All of the methods found the lowest-energy transition state, which corresponds to the first step of the Korcek mechanism, but each algorithm except for SC-AFIR detected several reactions not found by any of the other methods. We show that the low-barrier chemical reactions involve promising new chemistry that may be relevant in atmospheric and combustion systems. Our study highlights the complexity of chemical space exploration and the advantage of combined application of several approaches. Overall, the present work demonstrates both the power and the weaknesses of existing fully automated approaches for reaction discovery, which suggest possible directions for further method development and assessment in order to enable reliable discovery of all important reactions of any specified reactant(s).

11.

Dendritic cell nuclear protein-1 regulates melatonin biosynthesis by binding to BMAL1 and inhibiting the transcription of N-acetyltransferase in C6 cells.

Chen, Dong; Li, Yi-Pei; Yu, Yan-Xia; Zhou, Tian; Liu, Chao; Fei, Er-Kang; Gao, Feng; Mu, Chen-Chen; Ren, Hai-Gang; Wang, Guang-Hui.

Acta Pharmacol Sin ; 39(4): 597-606, 2018 Apr.

Article in English | MEDLINE | ID: mdl-29219947

ABSTRACT

Dendritic cell nuclear protein-1 (DCNP1) is a protein associated with major depression. In the brains of depression patients, DCNP1 is up-regulated. However, how DCNP1 participates in the pathogenesis of major depression remains unknown. In this study, we first transfected HEK293 cells with EGFP-DCNP1 and demonstrated that the full-length DCNP1 protein was localized in the nucleus, and RRK (the residues 117-119) composed its nuclear localization signal (NLS). An RRK-deletion form of DCNP1 (DCNP1ΔRRK) and a truncated form, DCNP1(1-116), each lacking the RRK residues, did not show the specific nuclear localization like full-length DCNP1 in the cells. A rat glioma cell line C6 can synthesize melatonin, a hormone that plays important roles in both sleep and depression. We then revealed that transfection of C6 cells with full-length DCNP1 but not DCNP1ΔRRK or DCNP1(1-116) significantly decreased the levels of melatonin. Furthermore, overexpression of full-length DCNP1, but not DCNP1ΔRRK or DCNP1(1-116), in C6 cells significantly decreased both the mRNA and protein levels of N-acetyltransferase (NAT), a key enzyme in melatonin synthesis. Full-length DCNP1 but not DCNP1ΔRRK or DCNP1(1-116) was detected to interact with the Nat promoter and inhibited its activity through its E-box motif. Furthermore, full-length DCNP1 but not the mutants interacted with and repressed the transcriptional activity of BMAL1, a transcription factor that transactivates Nat through the E-box motif. In conclusion, we have shown that RRK (the residues 117-119) are the NLS responsible for DCNP1 nuclear localization. Nuclear DCNP1 represses NAT expression and melatonin biosynthesis by interacting with BMAL1 and repressing its transcriptional activity. Our study reveals a connection between the major depression candidate protein DCNP1, circadian system and melatonin biosynthesis, which may contribute to the pathogenesis of depression.

Subject(s)

ARNTL Transcription Factors/metabolism; Acetyltransferases/antagonists & inhibitors; Melatonin/biosynthesis; Nuclear Proteins/metabolism; Repressor Proteins/metabolism; ARNTL Transcription Factors/genetics; Acetyltransferases/genetics; Amino Acid Sequence; Animals; Cell Line, Tumor; Cell Nucleus/metabolism; Gene Expression Regulation; Gene Knockdown Techniques; HEK293 Cells; Humans; Nuclear Localization Signals; Nuclear Proteins/genetics; Promoter Regions, Genetic; Protein Binding; RNA, Messenger/metabolism; Rats; Repressor Proteins/genetics; Sequence Deletion; Transcription, Genetic

12.

Vitamin K2 suppresses rotenone-induced microglial activation in vitro.

Yu, Yan-Xia; Li, Yi-Pei; Gao, Feng; Hu, Qing-Song; Zhang, Yan; Chen, Dong; Wang, Guang-Hui.

Acta Pharmacol Sin ; 37(9): 1178-89, 2016 Sep.

Article in English | MEDLINE | ID: mdl-27498777

ABSTRACT

AIM: Increasing evidence has shown that environmental factors such as rotenone and paraquat induce neuroinflammation, which contributes to the pathogenesis of Parkinson's disease (PD). In this study, we investigated the molecular mechanisms underlying the repression by menaquinone-4 (MK-4), a subtype of vitamin K2, of rotenone-induced microglial activation in vitro. METHODS: A microglial cell line (BV2) was exposed to rotenone (1 µmol/L) with or without MK-4 treatment. The levels of TNF-α or IL-1β in 100 µL of cultured media of BV2 cells were measured using ELISA kits. BV2 cells treated with rotenone with or without MK-4 were subjected to mitochondrial membrane potential, ROS production, immunofluorescence or immunoblot assays. The neuroblastoma SH-SY5Y cells were treated with conditioned media (CM) of BV2 cells that were exposed to rotenone with or without MK-4 treatment, and the cell viability was assessed using MTT assay. RESULTS: In rotenone-treated BV2 cells, MK-4 (0.5-20 µmol/L) dose-dependently suppressed the upregulation in the expression of iNOS and COX-2 in the cells, as well as the production of TNF-α and IL-1β in the cultured media. MK-4 (5-20 µmol/L) significantly inhibited rotenone-induced nuclear translocation of NF-κB in BV2 cells. MK-4 (5-20 µmol/L) significantly inhibited rotenone-induced p38 activation, ROS production, and caspase-1 activation in BV2 cells. MK-4 (5-20 µmol/L) also restored the mitochondrial membrane potential that had been damaged by rotenone. Exposure to CM from rotenone-treated BV2 cells markedly decreased the viability of SH-SY5Y cells. However, this rotenone-activated microglia-mediated death of SH-SY5Y cells was significantly attenuated when the BV2 cells were co-treated with MK-4 (5-20 µmol/L). CONCLUSION: Vitamin K2 can directly suppress rotenone-induced activation of microglial BV2 cells in vitro by repressing ROS production and p38 activation.

Subject(s)

Environmental Pollutants/toxicity; Membrane Potential, Mitochondrial/drug effects; Microglia/drug effects; Rotenone/toxicity; Vitamin K 2/analogs & derivatives; Animals; Cell Culture Techniques; Cell Line; Cell Survival/drug effects; Dose-Response Relationship, Drug; Interleukin-1beta/metabolism; Mice; Microglia/immunology; NF-kappa B/metabolism; Neuroimmunomodulation/drug effects; Reactive Oxygen Species/metabolism; Tumor Necrosis Factor-alpha/metabolism; Vitamin K 2/pharmacology

13.

Enhancing chemical synthesis: a two-stage deep neural network for predicting feasible reaction conditions.

Chen, Lung-Yi; Li, Yi-Pei.

J Cheminform ; 16(1): 11, 2024 Jan 24.

Article in English | MEDLINE | ID: mdl-38268009

ABSTRACT

In the field of chemical synthesis planning, the accurate recommendation of reaction conditions is essential for achieving successful outcomes. This work introduces an innovative deep learning approach designed to address the complex task of predicting appropriate reagents, solvents, and reaction temperatures for chemical reactions. Our proposed methodology combines a multi-label classification model with a ranking model to offer tailored reaction condition recommendations based on relevance scores derived from anticipated product yields. To tackle the challenge of limited data for unfavorable reaction contexts, we employed the technique of hard negative sampling to generate reaction conditions that might be mistakenly classified as suitable, forcing the model to refine its decision boundaries, especially in challenging cases. Our developed model excels in proposing conditions where an exact match to the recorded solvents and reagents is found within the top-10 predictions 73% of the time. It also predicts temperatures within ±20 °C of the recorded temperature in 89% of test cases. Notably, the model demonstrates its capacity to recommend multiple viable reaction conditions, with accuracy varying based on the availability of condition records associated with each reaction. What sets this model apart is its ability to suggest alternative reaction conditions beyond the constraints of the dataset. This underscores its potential to inspire innovative approaches in chemical research, presenting a compelling opportunity for advancing chemical synthesis planning and elevating the field of reaction engineering. Scientific contribution: The combination of multi-label classification and ranking models provides tailored recommendations for reaction conditions based on the reaction yields. A novel approach is presented to address the issue of data scarcity in negative reaction conditions through data augmentation.
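The two evaluation criteria quoted in this abstract, an exact match within the top-k ranked predictions and a temperature within a fixed tolerance, can be sketched as below. The function names, the set-based representation of a condition, and the example reagents are assumptions made for illustration; the abstract does not describe how the authors implemented the metrics.

```python
def top_k_exact_match(ranked_predictions, recorded, k=10):
    """True if the recorded condition set is among the first k predictions."""
    target = frozenset(recorded)
    return any(frozenset(p) == target for p in ranked_predictions[:k])


def within_tolerance(predicted_temp, recorded_temp, tolerance=20.0):
    """True if the predicted temperature is within +/- tolerance."""
    return abs(predicted_temp - recorded_temp) <= tolerance


# Hypothetical ranked predictions of solvent and reagent sets.
ranked = [{"THF", "NaH"}, {"DMF", "K2CO3"}, {"toluene", "Et3N"}]
hit = top_k_exact_match(ranked, recorded={"K2CO3", "DMF"}, k=2)
miss = top_k_exact_match(ranked, recorded={"toluene", "Et3N"}, k=2)
```

Comparing sets rather than ordered lists means a prediction counts as exact regardless of the order in which solvents and reagents are listed.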

14.

AutoTemplate: enhancing chemical reaction datasets for machine learning applications in organic chemistry.

Chen, Lung-Yi; Li, Yi-Pei.

J Cheminform ; 16(1): 74, 2024 Jun 27.

Article in English | MEDLINE | ID: mdl-38937840

ABSTRACT

This paper presents AutoTemplate, an innovative data preprocessing protocol, addressing the crucial need for high-quality chemical reaction datasets in the realm of machine learning applications in organic chemistry. Recent advances in artificial intelligence have expanded the application of machine learning in chemistry, particularly in yield prediction, retrosynthesis, and reaction condition prediction. However, the effectiveness of these models hinges on the integrity of chemical reaction datasets, which are often plagued by inconsistencies like missing reactants, incorrect atom mappings, and outright erroneous reactions. AutoTemplate introduces a two-stage approach to refine these datasets. The first stage involves extracting meaningful reaction transformation rules and formulating generic reaction templates using a simplified SMARTS representation. This simplification broadens the applicability of templates across various chemical reactions. The second stage is template-guided reaction curation, where these templates are systematically applied to validate and correct the reaction data. This process effectively amends missing reactant information, rectifies atom-mapping errors, and eliminates incorrect data entries. A standout feature of AutoTemplate is its capability to concurrently identify and correct false chemical reactions. It operates on the premise that most reactions in datasets are accurate, using these as templates to guide the correction of flawed entries. The protocol demonstrates its efficacy across a range of chemical reactions, significantly enhancing dataset quality. This advancement provides a more robust foundation for developing reliable machine learning models in chemistry, thereby improving the accuracy of forward and retrosynthetic predictions. 
AutoTemplate marks a significant progression in the preprocessing of chemical reaction datasets, bridging a vital gap and facilitating more precise and efficient machine learning applications in organic synthesis. SCIENTIFIC CONTRIBUTION: The proposed automated preprocessing tool for chemical reaction data aims to identify errors within chemical databases. Specifically, if the errors involve atom mapping or the absence of reactant types, corrections can be systematically applied using reaction templates, ultimately elevating the overall quality of the database.

15.

Unraveling Differences in the Effects of Ammonium/Amine-Based Additives on the Performance and Stability of Inverted Perovskite Solar Cells.

Kao, Shih-Feng; Yu, Ming-Hsuan; Chen, Jing-Chun; Yu, Hao-Wei; Yu, Hsin-Yu; Lin, Bi-Hsuan; Ni, I-Chih; Li, Yi-Pei; Chueh, Chu-Chen.

Small Methods ; e2400039, 2024 Aug 09.

Article in English | MEDLINE | ID: mdl-39118555

ABSTRACT

Additive engineering, with its excellent ability to passivate bulk or surface perovskite defects, has become a common strategy to improve the performance and stability of perovskite solar cells (PVSCs). Among the various additives reported so far, ammonium salts are considered an important branch. It is worth noting that although both ammonium-based additives (R-NH₃⁺) and amine-based additives (R-NH₂) are derivatives of ammonia (NH₃), the functions of the two can be easily confused due to their structural similarities. Moreover, there is no comprehensive comparative analysis of them in the literature. Here, the differences between phenethylammonium iodide (PEA⁺) and phenethylamine (PEA) additives are revealed experimentally and theoretically. The results clearly show that PEA outperforms PEA⁺ in terms of device performance and stability based on the following three factors: i) PEA's defect passivation capability is superior to that of PEA⁺; ii) PEA has better hydrophobicity to hinder water ingress; and iii) PEA comprehensively improves the stability of PVSCs by enhancing thermal stability and inhibiting iodide migration in perovskite more effectively than PEA⁺. As a result, the power conversion efficiency (PCE) of the inverted methylammonium lead triiodide (MAPbI₃) device using PEA increases by ≈15% to over 21%. More importantly, this device exhibits a greater ability to resist water ingress and thermally induced degradation and to inhibit iodide ion migration, resulting in better long-term stability.

16.

Integrating Chemical Information into Reinforcement Learning for Enhanced Molecular Geometry Optimization.

Chang, Yu-Cheng; Li, Yi-Pei.

J Chem Theory Comput ; 19(23): 8598-8609, 2023 Dec 12.

Article in English | MEDLINE | ID: mdl-38012608

ABSTRACT

Geometry optimization is a crucial step in computational chemistry, and the efficiency of optimization algorithms plays a pivotal role in reducing computational costs. In this study, we introduce a novel reinforcement-learning-based optimizer that surpasses traditional methods in terms of efficiency. What sets our model apart is its ability to incorporate chemical information into the optimization process. By exploring different state representations that integrate gradients, displacements, primitive type labels, and additional chemical information from the SchNet model, our reinforcement learning optimizer achieves exceptional results. It demonstrates an average reduction of about 50% or more in optimization steps compared to the conventional optimization algorithms that we examined when dealing with challenging initial geometries. Moreover, the reinforcement learning optimizer exhibits promising transferability across various levels of theory, emphasizing its versatility and potential for enhancing molecular geometry optimization. This research highlights the significance of leveraging reinforcement learning algorithms to harness chemical knowledge, paving the way for future advancements in computational chemistry.

17.

Explainable uncertainty quantifications for deep learning-based molecular property prediction.

Yang, Chu-I; Li, Yi-Pei.

J Cheminform ; 15(1): 13, 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-36737786

ABSTRACT

Quantifying uncertainty in machine learning is important in new research areas with scarce high-quality data. In this work, we develop an explainable uncertainty quantification method for deep learning-based molecular property prediction. This method can capture aleatoric and epistemic uncertainties separately and attribute the uncertainties to atoms present in the molecule. The atom-based uncertainty method provides an extra layer of chemical insight to the estimated uncertainties, i.e., one can analyze individual atomic uncertainty values to diagnose the chemical component that introduces uncertainty to the prediction. Our experiments suggest that atomic uncertainty can detect unseen chemical structures and identify chemical species whose data are potentially associated with significant noise. Furthermore, we propose a post-hoc calibration method to refine the uncertainty quantified by ensemble models for better confidence interval estimates. This work improves uncertainty calibration and provides a framework for assessing whether and why a prediction should be considered unreliable.
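The split between aleatoric and epistemic uncertainty can be illustrated with the decomposition commonly used for ensembles. The sketch below is not the atom-based method of the paper; it assumes each ensemble member returns a predicted mean and a predicted noise variance for one molecule, and the function name and input values are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split ensemble uncertainty for one molecule into two parts.

    means[k] and variances[k] are the predicted mean and predicted noise
    variance from ensemble member k (hypothetical inputs).
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one (mean, variance) pair per ensemble member")
    prediction = fmean(means)
    aleatoric = fmean(variances)   # average data noise predicted by the members
    epistemic = pvariance(means)   # disagreement between the members
    return prediction, aleatoric, epistemic, aleatoric + epistemic


# Three hypothetical ensemble members predicting the same property.
pred, alea, epi, total = decompose_uncertainty(
    [1.0, 1.2, 0.8], [0.04, 0.05, 0.03]
)
```

A large epistemic term relative to the aleatoric term suggests the members disagree because the molecule is unlike the training data, whereas a large aleatoric term points to noisy labels.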

18.

Mixed-linker strategy for suppressing structural flexibility of metal-organic framework membranes for gas separation.

Chang, Chung-Kai; Ko, Ting-Rong; Lin, Tsai-Yu; Lin, Yen-Chun; Yu, Hyun Jung; Lee, Jong Suk; Li, Yi-Pei; Wu, Heng-Liang; Kang, Dun-Yen.

Commun Chem ; 6(1): 118, 2023 Jun 10.

Article in English | MEDLINE | ID: mdl-37301865

ABSTRACT

Structural flexibility is a critical issue that limits the application of metal-organic framework (MOF) membranes for gas separation. Herein we propose a mixed-linker approach to suppress the structural flexibility of the CAU-10-based (CAU = Christian-Albrechts-University) membranes. Specifically, pure CAU-10-PDC membranes display high separation performance but at the same time are highly unstable for the separation of CO2/CH4. A partial substitution (30 mol.%) of the linker PDC with BDC significantly improves its stability. Such an approach also allows for decreasing the aperture size of MOFs. The optimized CAU-10-PDC-H (70/30) membrane possesses a high separation performance for CO2/CH4 (separation factor of 74.2 and CO2 permeability of 1,111.1 Barrer under 2 bar of feed pressure at 35°C). A combination of in situ characterization with X-ray diffraction (XRD) and diffuse reflectance infrared Fourier transform (DRIFT) spectroscopy, as well as periodic density functional theory (DFT) calculations, unveils the origin of the mixed-linker approach to enhancing the structural stability of the mixed-linker CAU-10-based membranes during the gas permeation tests.

19.

Comparative Analysis of Uncoupled Mode Approximations for Molecular Thermochemistry and Kinetics.

Li, Shih-Cheng; Lin, Yen-Chun; Li, Yi-Pei.

J Chem Theory Comput ; 18(11): 6866-6877, 2022 Nov 08.

Article in English | MEDLINE | ID: mdl-36269729

ABSTRACT

The accurate prediction of thermochemistry and kinetic parameters is an important task for reaction modeling. Unfortunately, the commonly used harmonic oscillator model is often not accurate enough due to the absence of anharmonic effects. In this work, we improve the representation of an anharmonic potential energy surface (PES) using uncoupled mode (UM) approximations, which model the full-dimensional PES as a sum of one-dimensional potentials of each mode. We systematically analyze different PES sampling schemes and coordinate systems for constructing the one-dimensional potentials, and benchmark the performance of UM methods on data sets of molecular thermochemistry and kinetic properties. The results show that the accuracy of the UM approach strongly depends on how the one-dimensional potentials are defined. If one-dimensional potentials are constructed by sampling along normal mode directions (UM-N) or along the directions that minimize intermode coupling (E- and E'-optimized), the accuracies of the predicted properties are not significantly improved compared to the harmonic oscillator model. However, significant improvements can be achieved by sampling the torsional modes separately from the vibrational modes (UM-T and UM-VT). In this work, three types of coordinate systems are examined, including redundant internal coordinates (RIC), hybrid internal coordinates (HIC), and translation-rotation-internal coordinates (TRIC). The HIC and TRIC coordinate systems can outperform RIC since transition state species may contain large-amplitude interfragmentary motions that regular internal coordinates cannot describe adequately. Among all the methods we examined, the activation energies and pre-exponential factors calculated using UM-VT with either TRIC or HIC best agree with the reference values. Since UM-VT requires only a number of additional single point energy calculations for each independent mode, the scaling of computational costs of UM-VT is the same as that of the standard harmonic oscillator model, making UM-VT an appealing way of calculating the thermochemistry and kinetic properties for large systems.
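When the potential is written as a sum of one-dimensional terms, the partition function factorizes into a product of per-mode factors. The sketch below shows that product using the harmonic oscillator expression for each factor, which is the baseline the abstract compares against; a UM method would instead derive each factor from its sampled one-dimensional potential. The wavenumbers and function names are hypothetical and not taken from the paper.

```python
import math

H = 6.62607015e-34     # Planck constant, J s
KB = 1.380649e-23      # Boltzmann constant, J/K
C_CM = 2.99792458e10   # speed of light, cm/s


def q_harmonic(wavenumber_cm, temperature_k):
    """Partition function of one harmonic mode.

    The energy zero is the ground vibrational level, so q >= 1.
    """
    x = H * C_CM * wavenumber_cm / (KB * temperature_k)
    return 1.0 / (1.0 - math.exp(-x))


def q_uncoupled(mode_partition_functions):
    """Total partition function of independent (uncoupled) modes."""
    return math.prod(mode_partition_functions)


# Hypothetical mode wavenumbers in cm^-1, evaluated at room temperature.
modes_cm = [100.0, 1000.0, 3000.0]
q_modes = [q_harmonic(w, 298.15) for w in modes_cm]
q_total = q_uncoupled(q_modes)
```

Low-frequency modes dominate the product, which is consistent with the abstract's emphasis on treating torsions separately from the stiffer vibrations.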

20.

Learning to Optimize Molecular Geometries Using Reinforcement Learning.

Ahuja, Kabir; Green, William H; Li, Yi-Pei.

J Chem Theory Comput ; 17(2): 818-825, 2021 Feb 09.

Article in English | MEDLINE | ID: mdl-33470813

ABSTRACT

Though quasi-Newton methods have been widely adopted in computational chemistry software for molecular geometry optimization, it is well known that these methods might not perform well for initial guess geometries far away from the local minima, where the quadratic approximation might be inaccurate. We propose a reinforcement learning approach to develop a model that produces a correction term for the quasi-Newton step calculated with the BFGS algorithm to improve the overall optimization performance. Our model is able to complete the optimization in about 30% fewer steps than pure BFGS for molecules starting from perturbed geometries. The new method has similar convergence to BFGS when complemented with a line search procedure, but it is much faster since it avoids the multiple gradient evaluations associated with line searches.
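The idea of adding a correction term to a BFGS quasi-Newton step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `correction` callable is a hypothetical stand-in for the learned model, no line search is used, and the test function is a simple two-dimensional quadratic rather than a molecular energy.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def bfgs_update(h_inv, s, y):
    """Standard BFGS update of the inverse Hessian approximation."""
    n = len(s)
    sy = sum(a * b for a, b in zip(s, y))
    if sy <= 1e-18:  # curvature condition fails: keep the old estimate
        return h_inv
    rho = 1.0 / sy
    hy = matvec(h_inv, y)
    yhy = sum(a * b for a, b in zip(y, hy))
    return [[h_inv[i][j]
             - rho * (hy[i] * s[j] + s[i] * hy[j])
             + (rho * rho * yhy + rho) * s[i] * s[j]
             for j in range(n)] for i in range(n)]


def optimize(grad, x0, correction=None, tol=1e-6, max_steps=100):
    """Quasi-Newton minimization with an optional additive step correction."""
    n = len(x0)
    x = list(x0)
    h_inv = [[float(i == j) for j in range(n)] for i in range(n)]
    g = grad(x)
    for step in range(max_steps):
        if max(abs(gi) for gi in g) < tol:
            return x, step
        s = [-d for d in matvec(h_inv, g)]  # plain quasi-Newton step
        if correction is not None:
            # Hypothetical hook where a learned model would adjust the step.
            s = [si + ci for si, ci in zip(s, correction(x, g, s))]
        x_new = [xi + si for xi, si in zip(x, s)]
        g_new = grad(x_new)
        y = [a - b for a, b in zip(g_new, g)]
        h_inv = bfgs_update(h_inv, s, y)
        x, g = x_new, g_new
    return x, max_steps


def grad(x):
    """Gradient of the toy function f(x) = 0.5 * (x0**2 + 10 * x1**2)."""
    return [x[0], 10.0 * x[1]]


x_min, n_steps = optimize(grad, [1.0, 1.0])
```

Passing a `correction` function that returns zeros reproduces plain BFGS, so the hook makes it easy to compare step counts with and without a correction model.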
