1.
Annu Rev Biophys ; 53(1): 109-125, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39013026

ABSTRACT

The relationship between genotype and phenotype, or the fitness landscape, is the foundation of genetic engineering and evolution. However, mapping fitness landscapes poses a major technical challenge due to the amount of quantifiable data that is required. Catalytic RNA is a special topic in the study of fitness landscapes due to its relatively small sequence space combined with its importance in synthetic biology. The combination of in vitro selection and high-throughput sequencing has recently provided empirical maps of both complete and local RNA fitness landscapes, but the astronomical size of sequence space limits purely experimental investigations. Next steps are likely to involve data-driven interpolation and extrapolation over sequence space using various machine learning techniques. We discuss recent progress in understanding RNA fitness landscapes, particularly with respect to protocells and machine representations of RNA. The confluence of technical advances may significantly impact synthetic biology in the near future.


Subject(s)
RNA, Catalytic; RNA, Catalytic/chemistry; RNA, Catalytic/genetics; RNA, Catalytic/metabolism; Evolution, Molecular; Genetic Fitness/genetics
2.
J Comput Chem ; 45(6): 352-361, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-37873926

ABSTRACT

Metalloenzymes catalyze a wide range of chemical transformations, with the active site residues playing a key role in modulating chemical reactivity and selectivity. Unlike smaller synthetic catalysts, a metalloenzyme active site is embedded in a larger protein, which makes interrogation of electronic properties and geometric features with quantum mechanical calculations challenging. Here we implement the ability to fetch crystallographic structures from the Protein Data Bank and analyze the metal binding sites in the program molSimplify. We show the usefulness of the newly created protein3D class to extract the local environment around non-heme iron enzymes containing a two histidine motif and prepare 372 structures for quantum mechanical calculations. Our implementation of protein3D serves to expand the range of systems molSimplify can be used to analyze and will enable high-throughput study of metal-containing active sites in proteins.


Subject(s)
Metalloproteins; Metalloproteins/chemistry; Catalysis; Catalytic Domain
3.
J Phys Chem B ; 127(49): 10592-10600, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38038675

ABSTRACT

The design of ion-selective materials with improved separation efficacy and efficiency is paramount, as current technologies fail to meet real-world deployment challenges. Selectivity in these materials can be informed by local ion binding in confined membrane ion channels. In this study, we utilize a data-driven approach to investigate design features in small molecular complexes coordinating ions as simplified models of ion channels. We curate a data set of 563 alkali metal coordinating molecular complexes (i.e., with Li+, Na+, or K+) from the Cambridge Structural Database and calculate differential ion binding energies using density functional theory. Using this information, we probe when and why structures favor exchange with alternate ions. Our analysis reveals that energetic preferences are related to ion size but are largely due to chemical interactions rather than structural reorganization. We identify unique trends in the selectivity for Li+ over other alkali ions, including the presence of N coordination atoms, planar coordination geometry, and small coordinating ring sizes. We use machine learning models to identify the key contributions of both geometric and electronic features in predicting selective ion binding. These physical insights offer preliminary guidance into the design of optimal membranes for ion selectivity.

4.
J Chem Phys ; 159(2), 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37431914

ABSTRACT

Spin crossover (SCO) complexes, which exhibit changes in spin state in response to external stimuli, have applications in molecular electronics and are challenging materials for computational design. We curate a dataset of 95 Fe(II) SCO complexes (SCO-95) from the Cambridge Structural Database that have available low- and high-temperature crystal structures and, in most cases, confirmed experimental spin transition temperatures (T1/2). We study these complexes using density functional theory (DFT) with 30 functionals spanning across multiple rungs of "Jacob's ladder" to understand the effect of exchange-correlation functional on electronic and Gibbs free energies associated with spin crossover. We specifically assess the effect of varying the Hartree-Fock exchange fraction (aHF) in structures and properties within the B3LYP family of functionals. We identify three best-performing functionals, a modified version of B3LYP (aHF = 0.10), M06-L, and TPSSh, that accurately predict SCO behavior for the majority of the complexes. While M06-L performs well, MN15-L, a more recently developed Minnesota functional, fails to predict SCO behavior for all complexes, which could be the result of differences in datasets used for parametrization of M06-L and MN15-L and also the increased number of parameters for MN15-L. Contrary to observations from prior studies, double-hybrids with higher aHF values are found to strongly stabilize high-spin states and therefore exhibit poor performance in predicting SCO behavior. Computationally predicted T1/2 values are consistent among the three functionals but show limited correlation to experimentally reported T1/2 values. These failures are attributed to the lack of crystal packing effects and counter-anions in the DFT calculations that would be needed to account for phenomena such as hysteresis and two-step SCO behavior. The SCO-95 set thus presents opportunities for method development, both in terms of increasing model complexity and method fidelity.

5.
J Phys Chem Lett ; 14(25): 5798-5804, 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37338110

ABSTRACT

We survey more than 240 000 crystallized mononuclear transition metal complexes (TMCs) to identify trends in preferred geometric structure and metal coordination. While we observe that an increased level of d filling correlates with a lower coordination number preference, we note exceptions, and we observe undersampling of 4d/5d transition metals and 3p-coordinating ligands. For the one-third of mononuclear TMCs that are octahedral, analysis of the 67 symmetry classes of their ligand environments reveals that complexes often contain monodentate ligands that may be removable, forming an open site amenable to catalysis. Due to their use in catalysis, we analyze trends in coordination by tetradentate ligands in terms of the capacity to support multiple metals and the variability of coordination geometry. We identify promising tetradentate ligands that co-occur in crystallized complexes with labile monodentate ligands that would lead to reactive sites. Literature mining suggests that these ligands are untapped as catalysts, motivating proposal of a promising octa-functionalized porphyrin.

6.
J Am Chem Soc ; 145(26): 14365-14378, 2023 Jul 05.
Article in English | MEDLINE | ID: mdl-37339429

ABSTRACT

The challenge of direct partial oxidation of methane to methanol has motivated the targeted search of metal-organic frameworks (MOFs) as a promising class of materials for this transformation because of their site-isolated metals with tunable ligand environments. Thousands of MOFs have been synthesized, yet relatively few have been screened for their promise in methane conversion. We developed a high-throughput virtual screening workflow that identifies MOFs from a diverse space of experimental MOFs that have not been studied for catalysis, yet are thermally stable, synthesizable, and have promising unsaturated metal sites for C-H activation via a terminal metal-oxo species. We carried out density functional theory calculations of the radical rebound mechanism for methane-to-methanol conversion on models of the secondary building units (SBUs) from 87 selected MOFs. While we showed that oxo formation favorability decreases with increasing 3d filling, consistent with prior work, previously observed scaling relations between oxo formation and hydrogen atom transfer (HAT) are disrupted by the greater diversity in our MOF set. Accordingly, we focused on Mn MOFs, which favor oxo intermediates without disfavoring HAT or leading to high methanol release energies, a key feature for methane hydroxylation activity. We identified three Mn MOFs comprising unsaturated Mn centers bound to weak-field carboxylate ligands in planar or bent geometries with promising methane-to-methanol kinetics and thermodynamics. The energetic spans of these MOFs are indicative of promising turnover frequencies for methane to methanol that warrant further experimental catalytic studies.

7.
JACS Au ; 3(2): 391-401, 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36873700

ABSTRACT

Transition-metal chromophores with earth-abundant transition metals are an important design target for their applications in lighting and nontoxic bioimaging, but their design is challenged by the scarcity of complexes that simultaneously have well-defined ground states and optimal target absorption energies in the visible region. Machine learning (ML) accelerated discovery could overcome such challenges by enabling the screening of a larger space but is limited by the fidelity of the data used in ML model training, which is typically from a single approximate density functional. To address this limitation, we search for consensus in predictions among 23 density functional approximations across multiple rungs of "Jacob's ladder". To accelerate the discovery of complexes with absorption energies in the visible region while minimizing the effect of low-lying excited states, we use two-dimensional (2D) efficient global optimization to sample candidate low-spin chromophores from multimillion complex spaces. Despite the scarcity (i.e., ∼0.01%) of potential chromophores in this large chemical space, we identify candidates with high likelihood (i.e., >10%) of computational validation as the ML models improve during active learning, representing a 1000-fold acceleration in discovery. Absorption spectra of promising chromophores from time-dependent density functional theory verify that 2/3 of candidates have the desired excited-state properties. The observation that constituent ligands from our leads have demonstrated interesting optical properties in the literature exemplifies the effectiveness of our construction of a realistic design space and active learning approach.

8.
Phys Chem Chem Phys ; 25(11): 8103-8116, 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36876903

ABSTRACT

Virtual high-throughput screening (VHTS) and machine learning (ML) with density functional theory (DFT) suffer from inaccuracies from the underlying density functional approximation (DFA). Many of these inaccuracies can be traced to the lack of derivative discontinuity that leads to a curvature in the energy with electron addition or removal. Over a dataset of nearly one thousand transition metal complexes typical of VHTS applications, we computed and analyzed the average curvature (i.e., deviation from piecewise linearity) for 23 density functional approximations spanning multiple rungs of "Jacob's ladder". While we observe the expected dependence of the curvatures on Hartree-Fock exchange, we note limited correlation of curvature values between different rungs of "Jacob's ladder". We train ML models (i.e., artificial neural networks or ANNs) to predict the curvature and the associated frontier orbital energies for each of these 23 functionals and then interpret differences in curvature among the different DFAs through analysis of the ML models. Notably, we observe spin to play a much more important role in determining the curvature of range-separated and double hybrids in comparison to semi-local functionals, explaining why curvature values are weakly correlated between these and other families of functionals. Over a space of 187.2k hypothetical compounds, we use our ANNs to pinpoint DFAs for which representative transition metal complexes have near-zero curvature with low uncertainty, demonstrating an approach to accelerate screening of complexes with targeted optical gaps.
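
As a rough illustration of what "deviation from piecewise linearity" means, the following sketch assumes a simple quadratic model of the energy between two integer electron counts. This is an illustrative simplification and not necessarily the procedure used in the study; the function names and the energies are hypothetical.

```python
def midpoint_deviation(e_n, e_half, e_n_plus_1):
    """Deviation of E(N + 1/2) from the straight line joining E(N) and E(N + 1)."""
    return e_half - 0.5 * (e_n + e_n_plus_1)


def quadratic_curvature(e_n, e_half, e_n_plus_1):
    """Second derivative c of the model E(N + x) = E(N) + b*x + (c/2)*x**2.

    For this quadratic model the midpoint deviation equals -c/8,
    so c = -8 * deviation.
    """
    return -8.0 * midpoint_deviation(e_n, e_half, e_n_plus_1)


# Hypothetical energies in eV, chosen only for illustration
e_n, e_half, e_n_plus_1 = 0.0, -2.5, -4.0
deviation = midpoint_deviation(e_n, e_half, e_n_plus_1)
curvature = quadratic_curvature(e_n, e_half, e_n_plus_1)
```

In this toy model an exactly piecewise-linear energy gives zero deviation and zero curvature, while a convex curve gives a negative midpoint deviation and a positive curvature.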

9.
Chem Sci ; 14(6): 1419-1433, 2023 Feb 08.
Article in English | MEDLINE | ID: mdl-36794185

ABSTRACT

Prediction of the excited state properties of photoactive iridium complexes challenges ab initio methods such as time-dependent density functional theory (TDDFT) both from the perspective of accuracy and of computational cost, complicating high-throughput virtual screening (HTVS). We instead leverage low-cost machine learning (ML) models and experimental data for 1380 iridium complexes to perform these prediction tasks. We find the best-performing and most transferable models to be those trained on electronic structure features from low-cost density functional tight binding calculations. Using artificial neural network (ANN) models, we predict the mean emission energy of phosphorescence, the excited state lifetime, and the emission spectral integral for iridium complexes with accuracy competitive with or superseding that of TDDFT. We conduct feature importance analysis to determine that high cyclometalating ligand ionization potential correlates to high mean emission energy, while high ancillary ligand ionization potential correlates to low lifetime and low spectral integral. As a demonstration of how our ML models can be used for HTVS and the acceleration of chemical discovery, we curate a set of novel hypothetical iridium complexes and use uncertainty-controlled predictions to identify promising ligands for the design of new phosphors while retaining confidence in the quality of the ANN predictions.

10.
J Chem Theory Comput ; 19(1): 190-197, 2023 Jan 10.
Article in English | MEDLINE | ID: mdl-36548116

ABSTRACT

When a many-body wave function of a system cannot be captured by a single determinant, high-level multireference (MR) methods are required to properly explain its electronic structure. MR diagnostics to estimate the magnitude of such static correlation have been primarily developed for molecular systems and range from low in computational cost to as costly as the full MR calculation itself. We report the first application of low-cost MR diagnostics based on the fractional occupation number calculated with finite-temperature DFT to solid-state systems. To compare the behavior of the diagnostics on solids and molecules, we select metal-organic frameworks (MOFs) as model materials because their reticular nature provides an intuitive way to identify molecular derivatives. On a series of closed-shell MOFs, we demonstrate that the DFT-based MR diagnostics are equally applicable to solids as to their molecular derivatives. The magnitude and spatial distribution of the MR character of a MOF are found to have a good correlation with those of its molecular derivatives, which can be calculated much more affordably in comparison to those of the full MOF. The additivity of MR character discussed here suggests the set of molecular derivatives to be a good representation of a MOF for both MR detection and ultimately for MR corrections, facilitating accurate and efficient high-throughput screening of MOFs and other porous solids.

11.
Nat Comput Sci ; 3(1): 38-47, 2023 Jan.
Article in English | MEDLINE | ID: mdl-38177951

ABSTRACT

Approximate density functional theory has become indispensable owing to its balanced cost-accuracy trade-off, including in large-scale screening. To date, however, no density functional approximation (DFA) with universal accuracy has been identified, leading to uncertainty in the quality of data generated from density functional theory. With electron density fitting and Δ-learning, we build a DFA recommender that selects the DFA with the lowest expected error with respect to the gold standard (but cost-prohibitive) coupled cluster theory in a system-specific manner. We demonstrate this recommender approach on the evaluation of vertical spin splitting energies of transition metal complexes. Our recommender predicts top-performing DFAs and yields excellent accuracy (about 2 kcal mol⁻¹) for chemical discovery, outperforming both individual Δ-learning models and the best conventional single-functional approach from a set of 48 DFAs. By demonstrating transferability to diverse synthesized compounds, our recommender potentially addresses the accuracy versus scope dilemma broadly encountered in computational chemistry.
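
The selection step of such a recommender can be sketched as follows, assuming that a separate model has already predicted an absolute error for each functional on each system. The function name, the complex labels, and the error values are hypothetical and are not taken from the study; only the final "pick the lowest expected error" step is shown.

```python
def recommend_dfa(predicted_abs_errors):
    """Return the functional name with the smallest predicted absolute error."""
    if not predicted_abs_errors:
        raise ValueError("no candidate functionals supplied")
    return min(predicted_abs_errors, key=predicted_abs_errors.get)


# Hypothetical per-complex predicted errors (kcal/mol) versus a reference method
predicted = {
    "complex_1": {"B3LYP": 4.1, "PBE0": 2.3, "TPSSh": 3.0},
    "complex_2": {"B3LYP": 1.2, "PBE0": 5.6, "TPSSh": 1.9},
}

# The recommendation is made separately for each system
choices = {name: recommend_dfa(errors) for name, errors in predicted.items()}
```

The point of the sketch is that the choice is system-specific: different complexes can receive different recommended functionals.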

12.
J Chem Phys ; 157(18): 184112, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36379790

ABSTRACT

To accelerate the exploration of chemical space, it is necessary to identify the compounds that will provide the most additional information or value. A large-scale analysis of mononuclear octahedral transition metal complexes deposited in an experimental database confirms an under-representation of lower-symmetry complexes. From a set of around 1000 previously studied Fe(II) complexes, we show that the theoretical space of synthetically accessible complexes formed from the relatively small number of unique ligands is significantly (∼816k) larger. For the properties of these complexes, we validate the concept of ligand additivity by inferring heteroleptic properties from a stoichiometric combination of homoleptic complexes. An improved interpolation scheme that incorporates information about cis and trans isomer effects predicts the adiabatic spin-splitting energy to around 2 kcal/mol and the HOMO level to less than 0.2 eV. We demonstrate a multi-stage strategy to discover leads from the 816k Fe(II) complexes within a targeted property region. We carry out a coarse interpolation from homoleptic complexes that we refine over a subspace of ligands based on the likelihood of generating complexes with targeted properties. We validate our approach on nine new binary and ternary complexes predicted to be in a targeted zone of discovery, suggesting opportunities for efficient transition metal complex discovery.
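
The basic idea of ligand additivity, inferring a heteroleptic property from a stoichiometric combination of homoleptic values, can be sketched as below. This is a minimal illustration only: it omits the cis/trans isomer corrections mentioned in the abstract, and the function name, ligand labels, and numerical values are hypothetical.

```python
from collections import Counter


def additive_estimate(ligands, homoleptic_values):
    """Stoichiometric average of homoleptic property values.

    ligands: list of ligand labels, one per coordination site.
    homoleptic_values: property of the homoleptic complex of each ligand.
    """
    if not ligands:
        raise ValueError("at least one ligand is required")
    counts = Counter(ligands)
    n_sites = len(ligands)
    return sum(homoleptic_values[lig] * count / n_sites
               for lig, count in counts.items())


# Hypothetical homoleptic spin-splitting energies (kcal/mol), illustration only
homoleptic = {"A": 12.0, "B": -6.0}

# Octahedral complex with four A ligands and two B ligands
estimate = additive_estimate(["A"] * 4 + ["B"] * 2, homoleptic)
```

With these made-up inputs the estimate is the site-weighted mean (4 × 12.0 + 2 × (−6.0)) / 6.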

13.
JACS Au ; 2(5): 1200-1213, 2022 May 23.
Article in English | MEDLINE | ID: mdl-35647589

ABSTRACT

Despite decades of effort, no earth-abundant homogeneous catalysts have been discovered that can selectively oxidize methane to methanol. We exploit active learning to simultaneously optimize methane activation and methanol release calculated with machine learning-accelerated density functional theory in a space of 16 M candidate catalysts including novel macrocycles. By constructing macrocycles from fragments inspired by synthesized compounds, we ensure synthetic realism in our computational search. Our large-scale search reveals that low-spin Fe(II) compounds paired with strong-field (e.g., P or S-coordinating) ligands have among the best energetic tradeoffs between hydrogen atom transfer (HAT) and methanol release. This observation contrasts with prior efforts that have focused on high-spin Fe(II) with weak-field ligands. By decoupling equatorial and axial ligand effects, we determine that negatively charged axial ligands are critical for more rapid release of methanol and that higher-valency metals [i.e., M(III) vs M(II)] are likely to be rate-limited by slow methanol release. With full characterization of barrier heights, we confirm that optimizing for HAT does not lead to large oxo formation barriers. Energetic span analysis reveals designs for an intermediate-spin Mn(II) catalyst and a low-spin Fe(II) catalyst that are predicted to have good turnover frequencies. Our active learning approach to optimize two distinct reaction energies with efficient global optimization is expected to be beneficial for the search of large catalyst spaces where no prior designs have been identified and where linear scaling relationships between reaction energies or barriers may be limited or unknown.

14.
Chem Sci ; 13(17): 4962-4971, 2022 May 04.
Article in English | MEDLINE | ID: mdl-35655882

ABSTRACT

Appropriately identifying and treating molecules and materials with significant multi-reference (MR) character is crucial for achieving high data fidelity in virtual high-throughput screening (VHTS). Despite development of numerous MR diagnostics, the extent to which a single value of such a diagnostic indicates the MR effect on a chemical property prediction is not well established. We evaluate MR diagnostics for over 10 000 transition-metal complexes (TMCs) and compare to those for organic molecules. We observe that only some MR diagnostics are transferable from one chemical space to another. By studying the influence of MR character on chemical properties (i.e., MR effect) that involve multiple potential energy surfaces (i.e., adiabatic spin splitting, ΔE H-L, and ionization potential, IP), we show that differences in MR character are more important than the cumulative degree of MR character in predicting the magnitude of an MR effect. Motivated by this observation, we build transfer learning models to predict CCSD(T)-level adiabatic ΔE H-L and IP from lower levels of theory. By combining these models with uncertainty quantification and multi-level modeling, we introduce a multi-pronged strategy that accelerates data acquisition by at least a factor of three while achieving coupled cluster accuracy (i.e., to within 1 kcal mol⁻¹ MAE) for robust VHTS.

15.
J Chem Theory Comput ; 18(7): 4282-4292, 2022 Jul 12.
Article in English | MEDLINE | ID: mdl-35737587

ABSTRACT

Virtual high-throughput screening (VHTS) and machine learning (ML) have greatly accelerated the design of single-site transition-metal catalysts. VHTS of catalysts, however, is often accompanied with a high calculation failure rate and wasted computational resources due to the difficulty of simultaneously converging all mechanistically relevant reactive intermediates to expected geometries and electronic states. We demonstrate a dynamic classifier approach, i.e., a convolutional neural network that monitors geometry optimizations on the fly, and exploit its good performance and transferability in identifying geometry optimization failures for catalyst design. We show that the dynamic classifier performs well on all reactive intermediates in the representative catalytic cycle of the radical rebound mechanism for the conversion of methane to methanol despite being trained on only one reactive intermediate. The dynamic classifier also generalizes to chemically distinct intermediates and metal centers absent from the training data without loss of accuracy or model confidence. We rationalize this superior model transferability as arising from the use of electronic structure and geometric information generated on-the-fly from density functional theory calculations and the convolutional layer in the dynamic classifier. When used in combination with uncertainty quantification, the dynamic classifier saves more than half of the computational resources that would have been wasted on unsuccessful calculations for all reactive intermediates being considered.


Subject(s)
Machine Learning; Neural Networks, Computer; Catalysis
16.
J Chem Phys ; 156(18): 184112, 2022 May 14.
Article in English | MEDLINE | ID: mdl-35568542

ABSTRACT

Low-cost, non-empirical corrections to semi-local density functional theory are essential for accurately modeling transition-metal chemistry. Here, we demonstrate the judiciously modified density functional theory (jmDFT) approach with non-empirical U and J parameters obtained directly from frontier orbital energetics on a series of transition-metal complexes. We curate a set of nine representative Ti(III) and V(IV) d1 transition-metal complexes and evaluate their flat-plane errors along the fractional spin and charge lines. We demonstrate that while jmDFT improves upon both DFT+U and semi-local DFT with the standard atomic orbital projectors (AOPs), it does so inefficiently. We rationalize these inefficiencies by quantifying hybridization in the relevant frontier orbitals. To overcome these limitations, we introduce a procedure for computing a molecular orbital projector (MOP) basis for use with jmDFT. We demonstrate this single set of d1 MOPs to be suitable for nearly eliminating all energetic delocalization and static correlation errors. In all cases, MOP jmDFT outperforms AOP jmDFT, and it eliminates most flat-plane errors at non-empirical values. Unlike DFT+U or hybrid functionals, jmDFT nearly eliminates energetic delocalization and static correlation errors within a non-empirical framework.

17.
J Phys Chem Lett ; 13(20): 4549-4555, 2022 May 26.
Article in English | MEDLINE | ID: mdl-35579948

ABSTRACT

The predictive accuracy of density functional theory (DFT) is hampered by delocalization errors, especially for correlated systems such as transition-metal complexes. Two complementary strategies have been developed to reduce delocalization error: eliminating the global curvature with change in charge, and applying a linear response Hubbard U as a measure of local curvature at a metal center at fixed charge in a DFT+U framework. We investigate the relationship between the two delocalization error measures as the ligand field strength is varied with the number of strong-field ligands in a series of heteroleptic complexes or by geometrically constraining the metal-ligand bond length in homoleptic octahedral complexes. We show that across these sets of complexes an inverse relationship generally exists between global and local curvatures. We find that effects of ligand substitution on both measures of delocalization are typically additive, but the quantities seldom coincide.

18.
Sci Data ; 9(1): 74, 2022 Mar 11.
Article in English | MEDLINE | ID: mdl-35277533

ABSTRACT

We report a workflow and the output of a natural language processing (NLP)-based procedure to mine the extant metal-organic framework (MOF) literature describing structurally characterized MOFs and their solvent removal and thermal stabilities. We obtain over 2,000 solvent removal stability measures from text mining and 3,000 thermal decomposition temperatures from thermogravimetric analysis data. We assess the validity of our NLP methods and the accuracy of our extracted data by comparing to a hand-labeled subset. Machine learning (ML, i.e. artificial neural network) models trained on this data using graph- and pore-geometry-based representations enable prediction of stability on new MOFs with quantified uncertainty. Our web interface, MOFSimplify, provides users access to our curated data and enables them to harness that data for predictions on new MOFs. MOFSimplify also encourages community feedback on existing data and on ML model predictions for community-based active learning for improved MOF stability models.

19.
Annu Rev Chem Biomol Eng ; 13: 405-429, 2022 Jun 10.
Article in English | MEDLINE | ID: mdl-35320698

ABSTRACT

Machine learning (ML) has become a part of the fabric of high-throughput screening and computational discovery of materials. Despite its increasingly central role, challenges remain in fully realizing the promise of ML. This is especially true for the practical acceleration of the engineering of robust materials and the development of design strategies that surpass trial and error or high-throughput screening alone. Depending on the quantity being predicted and the experimental data available, ML can either outperform physics-based models, be used to accelerate such models, or be integrated with them to improve their performance. We cover recent advances in algorithms and in their application that are starting to make inroads toward (a) the discovery of new materials through large-scale enumerative screening, (b) the design of materials through identification of rules and principles that govern materials properties, and (c) the engineering of practical materials by satisfying multiple objectives. We conclude with opportunities for further advancement to realize ML as a widespread tool for practical computational materials design.


Subject(s)
Algorithms; Machine Learning; High-Throughput Screening Assays
20.
J Chem Phys ; 156(7): 074101, 2022 Feb 21.
Article in English | MEDLINE | ID: mdl-35183086

ABSTRACT

Strategies for machine-learning (ML)-accelerated discovery that are general across material composition spaces are essential, but demonstrations of ML have been primarily limited to narrow composition variations. By addressing the scarcity of data in promising regions of chemical space for challenging targets such as open-shell transition-metal complexes, general representations and transferable ML models that leverage known relationships in existing data will accelerate discovery. Over a large set (∼1000) of isovalent transition-metal complexes, we quantify evident relationships for different properties (i.e., spin-splitting and ligand dissociation) between rows of the Periodic Table (i.e., 3d/4d metals and 2p/3p ligands). We demonstrate an extension to the graph-based revised autocorrelation (RAC) representation (i.e., eRAC) that incorporates the group number alongside the nuclear charge heuristic that otherwise overestimates dissimilarity of isovalent complexes. To address the common challenge of discovery in a new space where data are limited, we introduce a transfer learning approach in which we seed models trained on a large amount of data from one row of the Periodic Table with a small number of data points from the additional row. We demonstrate the synergistic value of the eRACs alongside this transfer learning strategy to consistently improve model performance. Analysis of these models highlights how the approach succeeds by reordering the distances between complexes to be more consistent with the Periodic Table, a property we expect to be broadly useful for other material domains.
