Results 1 - 18 of 18
1.
Molecules ; 29(10)2024 May 16.
Article in English | MEDLINE | ID: mdl-38792199

ABSTRACT

Two series of sugar esters with alkyl chain lengths varying from 5 to 12 carbon atoms, and with a head group consisting of glucose or galactose moieties, were synthesized. Equilibrium surface tension isotherms were measured, yielding critical micellar concentrations (CMC), surface tensions at CMC (γcmc), and minimum areas at the air-water interface (Amin). In addition, Krafft temperatures (Tks) were measured to characterize the ability of molecules to dissolve in water, which is essential in numerous applications. As a comparison to widely used commercial sugar-based surfactants, those measurements were also carried out for four octyl d-glycosides. Impacts of the linkages between polar and lipophilic moieties, alkyl chain lengths, and the nature of the sugar head group on the measured properties were highlighted. Higher Tk and, thus, lower dissolution ability, were found for methyl 6-O-acyl-d-glucopyranosides. CMC and γcmc decreased with the alkyl chain lengths in both cases, but Amin did not appear to be influenced. Both γcmc and Amin appeared independent of the ester group orientation. Notably, alkyl (methyl α-d-glucopyranosid)uronates were found to result in noticeably lower CMC, possibly due to a closer distance between the carbonyl function and the head group.
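As background on how a minimum area such as Amin is commonly derived from a surface tension isotherm, the following minimal sketch applies the Gibbs adsorption equation for a nonionic surfactant (prefactor n = 1). The function name, the slope value, and the temperature are illustrative assumptions and are not taken from the article.

```python
R = 8.314462618          # gas constant, J/(mol K)
N_A = 6.02214076e23      # Avogadro constant, 1/mol

def min_area_nm2(dgamma_dlnc, temperature_k=298.15, n=1):
    """Minimum area per molecule (nm^2) from the pre-CMC slope
    d(gamma)/d(ln C), given in N/m. Hypothetical helper."""
    # Gibbs adsorption equation: surface excess in mol/m^2
    gamma_max = -dgamma_dlnc / (n * R * temperature_k)
    # Area per molecule in m^2, converted to nm^2 (1 m^2 = 1e18 nm^2)
    return 1e18 / (N_A * gamma_max)

# Assumed slope: -10 mN/m per unit of ln(C), purely for illustration
a_min = min_area_nm2(-0.010)
print(f"Amin = {a_min:.3f} nm^2")
```

A steeper (more negative) slope corresponds to denser packing at the interface and therefore a smaller Amin.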

2.
J Chem Inf Model ; 63(14): 4266-4276, 2023 07 24.
Article in English | MEDLINE | ID: mdl-37390494

ABSTRACT

One of the biggest obstacles to successful polymer property prediction is an effective representation that accurately captures the sequence of repeat units in a polymer. Motivated by the success of data augmentation in computer vision and natural language processing, we explore augmenting polymer data by iteratively rearranging the molecular representation while preserving the correct connectivity, revealing additional substructural information that is not present in a single representation. We evaluate the effects of this technique on the performance of machine learning models trained on three polymer datasets and compare them to common molecular representations. Data augmentation does not yield significant improvements in machine learning property prediction performance compared to equivalent (non-augmented) representations. In datasets where the target property is primarily influenced by the polymer sequence rather than experimental parameters, this data augmentation technique provides molecular embedding with more information to improve property prediction accuracy.
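The idea of generating several equivalent representations of one polymer can be pictured with a toy example. The sketch below assumes a periodic chain described by a list of repeat-unit labels, for which every cyclic rotation describes the same connectivity. This is only an analogy: the function name and labels are hypothetical, and the article's actual procedure operates on molecular representations and is not reproduced here.

```python
def rotations(units):
    """Return all distinct cyclic rotations of a repeat-unit sequence,
    in order of increasing shift. Hypothetical helper."""
    seen, out = set(), []
    for i in range(len(units)):
        rotated = tuple(units[i:] + units[:i])  # shift start by i units
        if rotated not in seen:                 # skip duplicate rotations
            seen.add(rotated)
            out.append(rotated)
    return out

# A periodic chain ...ABCABC... can be written starting from any unit
augmented = rotations(["A", "B", "C"])
print(augmented)
```

Each rotation could then be featurized separately, so that a model sees the same chain from several starting points.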


Subjects
Machine Learning, Polymers, Natural Language Processing
3.
Molecules ; 28(19)2023 Sep 26.
Article in English | MEDLINE | ID: mdl-37836648

ABSTRACT

The refractive index (RI) of liquids is a key physical property of molecular compounds and materials. In addition to its ubiquitous role in physics, it is also exploited to impart specific optical properties (transparency, opacity, and gloss) to materials and various end-use products. Since few methods exist to accurately estimate this property, we have designed a graph machine model (GMM) capable of predicting the RI of liquid organic compounds containing up to 16 different types of atoms and effective in discriminating between stereoisomers. Using 8267 carefully checked RI values from the literature and the corresponding 2D organic structures, the GMM provides a training root mean square relative error of less than 0.5%, i.e., an RMSE of 0.004 for the estimation of the refractive index of the 8267 compounds. The GMM predictive ability is also compared to that obtained by several fragment-based approaches. Finally, a Docker-based tool is proposed to predict the RI of organic compounds solely from their SMILES code. The GMM developed is easy to apply, as shown by the video tutorials provided on YouTube.
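The two error measures quoted in this abstract (an absolute RMSE and a root mean square relative error) are related but not identical. The sketch below computes both on a few made-up refractive index values; the numbers and function names are assumptions for illustration only and do not come from the article's dataset.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error in the units of the property."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def rms_relative_error(y_true, y_pred):
    """Root mean square of (prediction - truth) / truth, as a fraction."""
    return math.sqrt(sum(((p - t) / t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical measured and predicted refractive indices
measured = [1.333, 1.361, 1.501]
predicted = [1.337, 1.357, 1.505]

print(f"RMSE = {rmse(measured, predicted):.4f}")
print(f"RMS relative error = {100 * rms_relative_error(measured, predicted):.2f}%")
```

Because refractive indices of liquids typically lie between about 1.3 and 1.7, an absolute error of 0.004 translates into a relative error of a few tenths of a percent.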

4.
Chimia (Aarau) ; 77(7-8): 484-488, 2023 Aug 09.
Article in English | MEDLINE | ID: mdl-38047789

ABSTRACT

The RXN for Chemistry project, initiated by IBM Research Europe - Zurich in 2017, aimed to develop a series of digital assets using machine learning techniques to promote the use of data-driven methodologies in synthetic organic chemistry. This research adopts an innovative concept by treating chemical reaction data as language records, treating the prediction of a synthetic organic chemistry reaction as a translation task between precursor and product languages. Over the years, the IBM Research team has successfully developed language models for various applications including forward reaction prediction, retrosynthesis, reaction classification, atom-mapping, procedure extraction from text, inference of experimental protocols and its use in programming commercial automation hardware to implement an autonomous chemical laboratory. Furthermore, the project has recently incorporated biochemical data in training models for greener and more sustainable chemical reactions. The remarkable ease of constructing prediction models and continually enhancing them through data augmentation with minimal human intervention has led to the widespread adoption of language model technologies, facilitating the digitalization of chemistry in diverse industrial sectors such as pharmaceuticals and chemical manufacturing. This manuscript provides a concise overview of the scientific components that contributed to the prestigious Sandmeyer Award in 2022.

5.
J Am Chem Soc ; 144(3): 1205-1217, 2022 01 26.
Article in English | MEDLINE | ID: mdl-35020383

ABSTRACT

The design of molecular catalysts typically involves reconciling multiple conflicting property requirements, largely relying on human intuition and local structural searches. However, the vast number of potential catalysts requires pruning of the candidate space by efficient property prediction with quantitative structure-property relationships. Data-driven workflows embedded in a library of potential catalysts can be used to build predictive models for catalyst performance and serve as a blueprint for novel catalyst designs. Herein we introduce kraken, a discovery platform covering monodentate organophosphorus(III) ligands providing comprehensive physicochemical descriptors based on representative conformer ensembles. Using quantum-mechanical methods, we calculated descriptors for 1558 ligands, including commercially available examples, and trained machine learning models to predict properties of over 300000 new ligands. We demonstrate the application of kraken to systematically explore the property space of organophosphorus ligands and how existing data sets in catalysis can be used to accelerate ligand selection during reaction optimization.

6.
Inorg Chem ; 60(13): 9552-9562, 2021 Jul 05.
Article in English | MEDLINE | ID: mdl-34161729

ABSTRACT

Due to its associated low CO2 emissions, nuclear energy production is rapidly growing. In this context, the treatment of high-level liquid waste (HLLW) of nuclear plants is of high concern to both scientific and industrial communities. Specifically, the separation of An(III) and Ln(III) cations when processing nuclear fuel is a vitally important, yet challenging, step within HLLW because An(III) and Ln(III) have similar chemical properties in solution. To guide the choice of relevant ligands, anions, and solvents for this separation step, in this work, we calculate and compare the free energy of formation of different Am(III) and Eu(III) complexes (which are typical and important An(III) and Ln(III) cation examples), involving two different ligands and three different counter ions in four different solvents. Based on our calculations, we predict that the chosen solvent is a key factor in the extraction of Am(III) and Eu(III) in treatment of HLLW. This study supports a systematic, computation-assisted screening of solvents and extractive ligands with counter anions as a proficient method to optimize the separation of Ln(III) and An(III).

7.
J Chem Phys ; 154(7): 074502, 2021 Feb 21.
Article in English | MEDLINE | ID: mdl-33607909

ABSTRACT

Viscosity of organic liquids is an important physical property in applications of printing, pharmaceuticals, oil extraction, engineering, and chemical processes. Experimental measurement is a direct but time-consuming process. Accurately predicting the viscosity with a broad range of chemical diversity is still a great challenge. In this work, a protocol named Variable Force Field (VaFF) was implemented to efficiently vary the force field parameters, especially λvdW, for the van der Waals term for the shear viscosity prediction of 75 organic liquid molecules with viscosity ranging from -9 to 0 in their natural logarithm and containing diverse chemical functional groups, such as alcoholic hydroxyl, carbonyl, and halogenated groups. Feature learning was applied for the viscosity prediction, and the selected features indicated that the hydrogen bonding interactions and the number of atoms and rings play important roles in the property of viscosity. The shear viscosity prediction of alcohols is very difficult owing to the existence of relatively strong intermolecular hydrogen bonding interactions, as reflected by density functional theory binding energies. From radial and spatial distribution functions of methanol, we found that the van der Waals related parameters λvdW are more crucial to the viscosity prediction than the rotation related parameters, λtor. With the variable λvdW-based all-atom optimized potentials for liquid simulations force field, a great improvement was observed in the viscosity prediction for alcohols. The simplicity and uniformity of VaFF make it an efficient tool for the prediction of viscosity and other related properties in the rational design of materials with specific properties.

8.
Phys Chem Chem Phys ; 21(27): 14846-14857, 2019 Jul 10.
Article in English | MEDLINE | ID: mdl-31232397

ABSTRACT

Microscopic polarization in liquids, which is challenging to account for intuitively and quantitatively, can impact the behavior of liquids in numerous ways and thus is ubiquitous in a broad range of domains and applications. To overcome this challenge, in this work, a molecular contact theory was proposed as a proxy to simulate microscopic polarization in liquids. In particular, molecular surfaces from implicit solvation models were used to predict both the dipole moment of individual molecules and mutual orientations arising from contacts between molecules. Then, the calculated dipole moments and orientations were combined in an analytical coupling, which allowed for the prediction of effective (polarized) dipole moments for all distinct species in the liquid. As a proof-of-concept, the model focused on predicting the dielectric constant and was tested on 420 pure liquids, 269 binary organic mixtures (3792 individual compositions) and 46 aqueous mixtures (704 individual compositions). The model proved to be flexible enough to reach an unprecedented satisfactory mean relative error of about 16-22% and a classification accuracy of 84-90% within four meaningful classes of weak, low average, high average and strong dielectric constants. The method also proved to be computationally very efficient, with calculation times ranging from a few seconds to about ten minutes on a personal computer with a single CPU. This success demonstrates that much of the microscopic polarization concept can be satisfactorily described based on a simple molecular contact theory. Moreover, the new model for dielectric constants provides a useful alternative to computationally expensive molecular dynamics simulations for large scale virtual screenings in chemical engineering and material sciences.

9.
Phys Chem Chem Phys ; 21(18): 9225-9238, 2019 May 08.
Article in English | MEDLINE | ID: mdl-30994133

ABSTRACT

During the past 20 years, the efficient combination of quantum chemical calculations with statistical thermodynamics by the COSMO-RS method has become an important alternative to force-field based simulations for the accurate prediction of free energies of molecules in liquid systems. While it was originally restricted to homogeneous liquids, it later has been extended to the prediction of the free energy of molecules in inhomogeneous systems such as micelles, biomembranes, or liquid interfaces, but these calculations were based on external input about the structure of the inhomogeneous system. Here we report the rigorous extension of COSMO-RS to a self-consistent prediction of the structure and the free energies of molecules in self-organizing inhomogeneous systems. This extends the application range to many new areas, such as the prediction of micellar structures and critical micelle concentrations, finite loading effects in micelles and biomembranes, the free energies and structure of liquid interfaces, microemulsions, and many more related topics, which often are of great practical importance.

10.
Science ; 384(6697): eadk9227, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38753786

ABSTRACT

Contemporary materials discovery requires intricate sequences of synthesis, formulation, and characterization that often span multiple locations with specialized expertise or instrumentation. To accelerate these workflows, we present a cloud-based strategy that enabled delocalized and asynchronous design-make-test-analyze cycles. We showcased this approach through the exploration of molecular gain materials for organic solid-state lasers as a frontier application in molecular optoelectronics. Distributed robotic synthesis and in-line property characterization, orchestrated by a cloud-based artificial intelligence experiment planner, resulted in the discovery of 21 new state-of-the-art materials. Gram-scale synthesis ultimately allowed for the verification of best-in-class stimulated emission in a thin-film device. Demonstrating the asynchronous integration of five laboratories across the globe, this workflow provides a blueprint for delocalizing, and democratizing, scientific discovery.

11.
J Colloid Interface Sci ; 608(Pt 1): 549-563, 2022 Feb 15.
Article in English | MEDLINE | ID: mdl-34628316

ABSTRACT

HYPOTHESIS: The salinity at which the dynamic phase inversion of the reference system C10E4/n-Octane/Water occurs in the presence of increasing amounts of a test surfactant S2 provides quantitative information on the hydrophilic/lipophilic ratio and on the sensitivity to aqueous NaCl of S2. EXPERIMENTS: The Salinities causing the Phase Inversion (SPI) of the reference system mixed with 12 ionic and 10 nonionic well-defined surfactants are determined in order to quantify the contributions of the nature of the polar head and of the alkyl chain length. FINDINGS: The SPI varies linearly upon the addition of S2. The slope of the straight variation with the molar fraction of S2 is called the "SPI-slope". It quantifies the hydrophilic/lipophilic ratio of S2 in saline environment and its salt-sensitivity with respect to the reference surfactant C10E4. The SPI-slopes of C12 surfactants bearing different polar heads are found to decrease in the following order: C12NMe3Br > C12E8 > C12E7 ≥ C12SO3Na ≈ C12COONa ≥ C12SO4Na > C12E6 > C12E5 > C12E3. This classification is different from that obtained when the phase inversion is caused by a change in temperature (PIT-slope method) because the addition of NaCl in significant amounts (3 to 10 wt%) partially screens the ionic heads and diminishes their apparent hydrophilicities. A simple model, valid for all types of nonionic surfactants, is developed on the basis of the HLDN equation (Normalized Hydrophilic-Lipophilic Deviation) to express the SPI-slope as a function of the hydrophilic/lipophilic ratio (PACN2) and the salinity coefficient (δ2) of S2. All studied surfactants are positioned on a 2D map according to the values of their SPI-slope and their PIT-slope to graphically highlight their hydrophilic/lipophilic ratio and their salt-sensitivity. Finally, a linear model connecting the PIT-slope and the SPI-slope is derived for nonionics, emphasizing that the thermal partitioning of C10E4 towards n-octane is much greater in the PIT-slope than in the SPI-slope experiments.
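Since the SPI is described as varying linearly with the molar fraction of S2, an SPI-slope can be obtained from an ordinary least-squares line. The sketch below shows one way to do this; the data points, units, and function name are invented for illustration and are not measurements from the article.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: molar fraction of S2 vs. salinity at phase inversion (wt% NaCl)
x_s2 = [0.00, 0.02, 0.04, 0.06]
spi = [5.0, 5.6, 6.2, 6.8]

spi_slope, spi_reference = linear_fit(x_s2, spi)
print(f"SPI-slope = {spi_slope:.1f} wt% per unit molar fraction")
print(f"SPI of the reference system alone = {spi_reference:.1f} wt%")
```

In this made-up case the positive slope would indicate that more salt is needed to invert the system as S2 is added, i.e. that S2 behaves as more hydrophilic than the reference surfactant under saline conditions.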


Subjects
Sodium Chloride, Surfactants, Hydrophobic and Hydrophilic Interactions, Salinity, Water
12.
ACS Cent Sci ; 8(1): 122-131, 2022 Jan 26.
Article in English | MEDLINE | ID: mdl-35106378

ABSTRACT

Self-driving laboratories, in the form of automated experimentation platforms guided by machine learning algorithms, have emerged as a potential solution to the need for accelerated science. While new tools for automated analysis and characterization are being developed at a steady rate, automated synthesis remains the bottleneck in the chemical space accessible to self-driving laboratories. Combining automated and manual synthesis efforts immediately significantly expands the explorable chemical space. To effectively direct the different capabilities of automated (higher throughput and less labor) and manual synthesis (greater chemical versatility), we describe a protocol, the RouteScore, that quantifies the cost of combined synthetic routes. In this work, the RouteScore is used to determine the most efficient synthetic route to a well-known pharmaceutical (structure-oriented optimization) and to simulate a self-driving laboratory that finds the most easily synthesizable organic laser molecule with specific photophysical properties from a space of ∼3500 possible molecules (property-oriented optimization). These two examples demonstrate the power and flexibility of our approach in mixed synthetic planning and optimization and especially in downselecting promising candidates from a large chemical space via an a priori estimation of the synthetic costs.

13.
Patterns (N Y) ; 3(10): 100588, 2022 Oct 14.
Article in English | MEDLINE | ID: mdl-36277819

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencing embedded string (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.

14.
J Phys Chem Lett ; 12(20): 4980-4986, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34015223

ABSTRACT

Optimally efficient organic solar cells require not only a careful choice of new donor (D) and/or acceptor (A) molecules but also the fine-tuning of experimental fabrication conditions for organic solar cells (OSCs). Herein, a new framework for simultaneously optimizing D/A molecule pairs and device specifications of OSCs is proposed, through a quantitative structure-property relationship (QSPR) model built by machine learning. Combining the device bulk properties with structural and electronic properties, the built QSPR model achieved unprecedentedly high accuracy and consistency. Additionally, a large chemical space of 1 942 785 D/A pairs is explored to find potential synergistic ones. Favorable device bulk properties such as the root-mean-square surface roughness for D/A blends and the D/A weight ratio are further screened by grid search methods. Overall, this study indicates that the simultaneous optimization of D/A molecule pairs and device specifications by theoretical calculations can accelerate the improvement of OSC efficiencies.

15.
ACS Cent Sci ; 5(9): 1572-1583, 2019 Sep 25.
Article in English | MEDLINE | ID: mdl-31572784

ABSTRACT

Organic synthesis is one of the key stumbling blocks in medicinal chemistry. A necessary yet unsolved step in planning synthesis is solving the forward problem: Given reactants and reagents, predict the products. Similar to other work, we treat reaction prediction as a machine translation problem between simplified molecular-input line-entry system (SMILES) strings (a text-based representation) of reactants, reagents, and the products. We show that a multihead attention Molecular Transformer model outperforms all algorithms in the literature, achieving a top-1 accuracy above 90% on a common benchmark data set. Molecular Transformer makes predictions by inferring the correlations between the presence and absence of chemical motifs in the reactant, reagent, and product present in the data set. Our model requires no handcrafted rules and accurately predicts subtle chemical transformations. Crucially, our model can accurately estimate its own uncertainty, with an uncertainty score that is 89% accurate in terms of classifying whether a prediction is correct. Furthermore, we show that the model is able to handle inputs without a reactant-reagent split and including stereochemistry, which makes our method universally applicable.
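The top-1 accuracy reported here is a standard ranking metric: a prediction counts as correct only when the highest-ranked candidate matches the recorded product. The sketch below computes top-k accuracy on made-up ranked predictions. The function name and example strings are assumptions, and plain string comparison is used for simplicity, whereas a real evaluation would normally compare canonicalized SMILES.

```python
def top_k_accuracy(ranked_predictions, targets, k=1):
    """Fraction of cases where the target appears among the first k candidates."""
    hits = sum(target in preds[:k] for preds, target in zip(ranked_predictions, targets))
    return hits / len(targets)

# Hypothetical ranked product predictions (best first) and recorded products
ranked = [
    ["CCO", "CCN"],
    ["CC=O", "CCO"],
    ["c1ccccc1", "C1CCCCC1"],
]
targets = ["CCO", "CCO", "c1ccccc1"]

top1 = top_k_accuracy(ranked, targets, k=1)
top2 = top_k_accuracy(ranked, targets, k=2)
print(f"top-1 = {top1:.2f}, top-2 = {top2:.2f}")
```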

16.
Adv Colloid Interface Sci ; 270: 87-100, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31200263

ABSTRACT

In this review, structure-property trends are systematically analyzed for four amphiphilic properties of sugar-based surfactants: critical micelle concentration (CMC), its associated surface tension (γCMC), efficiency (pC20) and Krafft temperature (TK). First, the impact on amphiphilic properties of the alkyl chain size and the presence of branching and/or unsaturation is investigated. Then, various polar head parameters are explored, such as the degree of polymerization of the sugar unit (mono- or oligosaccharides), the chemical nature of the linker and the sugar configuration. Some systematic comparisons between ethoxylated surfactants and sugar-based surfactants are also carried out. While some structural trends with the impact of alkyl chain length or the polar head size are now well understood, this analysis points out that systematic studies of more specific effects of alkyl chain (e.g. branching, unsaturation, presence of rings, position on the polar head) and polar head (e.g. linker, anomeric configuration, internal stereochemistry, cyclic vs. acyclic sugar residues) were scarcer or not available to date. This work encourages the use of these structural trends in the perspective of developing new bio-based surfactants and their consideration in predictive models. It also highlights the need of further experimental tests to fill remaining gaps notably to explore some specific structural features (such as the introduction of rings in the alkyl chain or the position of the alkyl chain on the polar head) and towards applicative properties (like foaming capacity or wettability).

17.
J Colloid Interface Sci ; 516: 162-171, 2018 Apr 15.
Article in English | MEDLINE | ID: mdl-29367067

ABSTRACT

HYPOTHESIS: Surface tension of aqueous solutions of surfactants at their critical micelle concentrations (γCMC), may be quantitatively linked to the surfactant structure using Quantitative Structure Property Relationships (QSPR), all other factors held equal (temperature, presence of additive or salts). Thus, QSPR models can allow improved understanding and quantification of structure-γCMC trends, direct γCMC predictions, and finally help to design renewable substitutes for petroleum-based surfactants. EXPERIMENTS AND METHODS: A dataset of 70 γCMC of single surfactants at ambient temperature has been gathered from several research papers. Then, descriptors of the whole structure, of polar heads and of alkyl chains of the 70 surfactants were calculated and introduced in multilinear regressions to evidence the most predictive and physically meaningful structure property relationships. FINDINGS: The best model, based on quantum chemical descriptors, achieved a standard error of 2.4 mN/m on an external validation. Simpler models were also achieved based solely on the count of H atoms of the polar head but with prediction error of 2.9 mN/m. Among all identified factors affecting γCMC of sugar-based surfactants (polar head size, alkyl chain length and branching), polar head size was found to exhibit the only effect clearly taken into account by all the models.

18.
Chem Sci ; 9(28): 6091-6098, 2018 Jul 28.
Article in English | MEDLINE | ID: mdl-30090297

ABSTRACT

There is an intuitive analogy of an organic chemist's understanding of a compound and a language speaker's understanding of a word. Based on this analogy, it is possible to introduce the basic concepts and analyze potential impacts of linguistic analysis to the world of organic chemistry. In this work, we cast the reaction prediction task as a translation problem by introducing a template-free sequence-to-sequence model, trained end-to-end and fully data-driven. We propose a tokenization, which is arbitrarily extensible with reaction information. Using an attention-based model borrowed from human language translation, we improve the state-of-the-art solutions in reaction prediction on the top-1 accuracy by achieving 80.3% without relying on auxiliary knowledge, such as reaction templates or explicit atomic features. Also, a top-1 accuracy of 65.4% is reached on a larger and noisier dataset.
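Treating reaction prediction as translation requires splitting molecular strings into tokens first. The sketch below shows a simple regular-expression tokenizer for SMILES strings. The pattern and function name are assumptions written for illustration; this is not the tokenization scheme proposed in the article, and it covers only common SMILES symbols.

```python
import re

# Bracket atoms, two-letter halogens, two-digit ring closures, then single symbols
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+()/\\.:~@*>]")

def tokenize(smiles):
    """Split a SMILES string into tokens; raise if any character is not covered."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"untokenizable characters in {smiles!r}")
    return tokens

acyl_chloride = tokenize("CC(=O)Cl")
nitrobenzene = tokenize("c1ccccc1[N+](=O)[O-]")
print(" ".join(acyl_chloride))
print(" ".join(nitrobenzene))
```

Keeping bracket atoms such as `[N+]` and two-letter elements such as `Cl` as single tokens avoids splitting one atom across several vocabulary entries.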
