Visualization of Solar Cell Library Space by Dimensionality Reduction Methods.
Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch.
Affiliation
  • Kaspi O; Department of Chemistry, Bar-Ilan University, Ramat-Gan 5290002, Israel.
  • Yosipof A; Department of Information Systems, College of Law & Business, Ramat-Gan, P.O. Box 852, Bnei Brak 5110801, Israel.
  • Senderowitz H; Department of Chemistry, Bar-Ilan University, Ramat-Gan 5290002, Israel.
J Chem Inf Model ; 58(12): 2428-2439, 2018 12 24.
Article in En | MEDLINE | ID: mdl-30485100
ABSTRACT
Visualizing high-dimensional data by projecting them into a two- or three-dimensional space is a popular approach in many scientific fields, including computer-aided drug design and cheminformatics. In contrast, dimensionality reduction techniques have been far less explored for materials informatics. Nevertheless, similar to their usefulness in analyzing the space of, e.g., drug-like molecules, such techniques could provide useful insights on materials space, including an intuitive grasp of the overall distribution of samples, the identification of interesting trends, including the formation of materials clusters and the presence of activity cliffs and outliers, and rational navigation through this space in the search for new materials. Here we present the first application of four dimensionality reduction techniques, namely, principal component analysis (PCA), kernel PCA, Isomap, and diffusion map, to visualize and analyze a part of the materials space populated by solar cells made of metal oxides. Solar cells in general and metal-oxide-based solar cells in particular hold the promise of contributing to the world's search for clean and affordable energy resources. With the exception of PCA, these methods have seldom been used to visualize chemistry space and almost never been used to visualize materials space. For this purpose, we integrated five metal-oxide-based solar cell libraries into a uniform database and subjected it to dimensionality reduction by all four methods, comparing their performances using various criteria such as maintaining the local environment of samples and the clustering structure in the low-dimensional space. We also looked at the number of outliers produced by each method and analyzed common outliers. 
We found that PCA performs best in terms of the ability to correctly maintain the local environment of samples, whereas Isomap does the best job of assigning class membership on the basis of the identities of nearest neighbors (i.e., it is the best classifier). We also found that many of the outliers identified by all of the methods could be rationalized. We suggest that the methods used in this work could be extended to study other types of solar cells, thereby setting the ground for further analysis of the photovoltaic (PV) space as well as other regions of materials space.
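The two comparison criteria named in the abstract (keeping each sample's local environment, and classifying a sample from its nearest neighbors in the low-dimensional space) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the helper names (`pca_project`, `neighbourhood_preservation`, `loo_knn_accuracy`) are hypothetical, the scores are simple stand-ins rather than the metrics used in the paper, and only PCA is shown (kernel PCA, Isomap, and diffusion map are omitted).

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row (self excluded)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def neighbourhood_preservation(X_high, X_low, k=5):
    """Mean fraction of each sample's k nearest neighbours kept after projection."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap))

def loo_knn_accuracy(X_low, labels, k=5):
    """Leave-one-out accuracy of majority-vote k-NN in the embedded space."""
    nn = knn_indices(X_low, k)
    predicted = [np.bincount(labels[row]).argmax() for row in nn]
    return float(np.mean(np.array(predicted) == labels))

# Synthetic stand-in data (NOT the paper's solar cell libraries):
# two well-separated groups of 40 samples in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 10)), rng.normal(6.0, 1.0, (40, 10))])
labels = np.array([0] * 40 + [1] * 40)

embedding = pca_project(X, n_components=2)
preservation = neighbourhood_preservation(X, embedding, k=5)
accuracy = loo_knn_accuracy(embedding, labels, k=5)
print(f"neighbourhood preservation: {preservation:.2f}")
print(f"leave-one-out k-NN accuracy: {accuracy:.2f}")
```

Swapping `pca_project` for another embedding function with the same input and output shapes would allow the same two scores to be compared across methods.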
Subjects

Full text: 1 Database: MEDLINE Main subject: Solar Energy / Small Molecule Libraries / Data Mining Language: En Publication year: 2018 Document type: Article
