J Chem Inf Model ; 60(3): 1194-1201, 2020 Mar 23.
Article in English | MEDLINE | ID: mdl-31909619


Leveraging new data sources is a key step in accelerating the pace of materials design and discovery. To complement the strides in synthesis planning driven by historical, experimental, and computed data, we present an automated, unsupervised method for connecting scientific literature to inorganic synthesis insights. Starting from the natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate syntheses for any inorganic materials of interest. We show the potential of this technique by predicting precursors for two perovskite materials, using only training data published over a decade prior to their first reported syntheses. We demonstrate that the model learns representations of materials corresponding to synthesis-related properties and that the model's behavior complements the existing thermodynamic knowledge. Finally, we apply the model to perform synthesizability screening for proposed novel perovskite compounds.

Nat Mater ; 18(11): 1177-1181, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31591531


Predicting and directing polymorphic transformations is a critical challenge in zeolite synthesis [1-3]. Interzeolite transformations enable selective crystallization [4-7], but are often too complex to be designed by comparing crystal structures. Here, computational and theoretical tools are combined to both exhaustively data mine polymorphic transformations reported in the literature and analyse and explain interzeolite relations. It was found that crystallographic building units are weak predictors of topology interconversion and insufficient to explain intergrowth. By introducing a supercell-invariant metric that compares crystal structures using graph theory, we show that diffusionless (topotactic and reconstructive) transformations occur only between graph-similar pairs. Furthermore, all the known instances of intergrowth occur between either structurally similar or graph-similar frameworks. We identify promising pairs to realize diffusionless transformations and intergrowth, with hundreds of low-distance pairs identified among known zeolites, and thousands of hypothetical frameworks connected to known zeolite counterparts. The theory may enable the understanding and control of zeolite polymorphism.

ACS Cent Sci ; 5(5): 892-899, 2019 May 22.
Article in English | MEDLINE | ID: mdl-31139725


Zeolites are porous, aluminosilicate materials with many industrial and "green" applications. Despite their industrial relevance, many aspects of zeolite synthesis remain poorly understood, requiring costly trial-and-error synthesis. In this paper, we create natural language processing techniques and text markup parsing tools to automatically extract synthesis information and trends from zeolite journal articles. We further engineer a data set of germanium-containing zeolites to test the accuracy of the extracted data and to discover potential opportunities for zeolites containing germanium. We also create a regression model for a zeolite's framework density from the synthesis conditions. This model has a cross-validated root mean squared error of 0.98 T/1000 Å³, and many of the model decision boundaries correspond to known synthesis heuristics in germanium-containing zeolites. We propose that this automatic data extraction can be applied to many different problems in zeolite synthesis and enable novel zeolite morphologies.
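
For readers unfamiliar with the error metric quoted above, the following is a minimal sketch of how a cross-validated root mean squared error can be computed. It is not the authors' model: the abstract's mention of decision boundaries suggests a tree-based regressor, whereas this sketch swaps in a single-feature least-squares line to stay dependency-free. The function names (`fit_line`, `cross_validated_rmse`), the fold assignment, and the data are all hypothetical and chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def cross_validated_rmse(xs, ys, k=5):
    """Pool squared errors over k held-out folds and return their RMSE.

    Fold membership is assigned by index modulo k (an illustrative choice;
    shuffled folds are more common in practice).
    """
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        a, b = fit_line([xs[i] for i in train_idx],
                        [ys[i] for i in train_idx])
        # Each point is predicted by a model that never saw it in training.
        squared_errors.extend((ys[i] - (a * xs[i] + b)) ** 2
                              for i in test_idx)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: one synthesis feature vs. framework density (T/1000 Å³).
xs = [i / 10 for i in range(20)]
ys = [18.0 - 2.0 * x + (0.3 if i % 2 == 0 else -0.3)
      for i, x in enumerate(xs)]

print(f"cross-validated RMSE: {cross_validated_rmse(xs, ys):.2f} T/1000 Å³")
```

The resulting value carries the units of the predicted quantity, which is why the abstract reports the error in T/1000 Å³ (tetrahedral atoms per 1000 cubic ångströms).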