Search | BVS - Ministry of Health

A comparison between 2D and 3D descriptors in QSAR modeling based on bio-active conformations.

Bahia, Malkeet Singh; Kaspi, Omer; Touitou, Meir; Binayev, Idan; Dhail, Seema; Spiegel, Jacob; Khazanov, Netaly; Yosipof, Abraham; Senderowitz, Hanoch.

Mol Inform ; 42(4): e2200186, 2023 04.

Article in English | MEDLINE | ID: mdl-36617991

ABSTRACT

QSAR models are widely and successfully used in many research areas. The success of such models depends heavily on molecular descriptors, typically classified as 1D, 2D, 3D, or 4D. While 3D information is likely important, e.g., for modeling ligand-protein binding, previous comparisons between the performances of 2D and 3D descriptors were inconclusive. Yet in such comparisons the modeled ligands were not necessarily represented by their bioactive conformations. With this in mind, we mined the PDB for sets of protein-ligand complexes sharing the same protein for which uniform activity data were reported. The results, totaling 461 structures spread across six series, were compiled into a carefully curated, first-of-its-kind dataset in which each ligand is represented by its bioactive conformation. Next, each set was characterized by 2D, 3D and 2D + 3D descriptors and modeled using three machine learning algorithms, namely, k-Nearest Neighbors, Random Forest and Lasso Regression. Models' performances were evaluated on external test sets derived from the parent datasets either randomly or in a rational manner. We found that many more significant models were obtained when combining 2D and 3D descriptors. We attribute these improvements to the ability of 2D and 3D descriptors to code for different, yet complementary molecular properties.
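The comparison described in this abstract (the same learner trained on 2D-only, 3D-only and combined descriptors, then scored on an external test set) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, column 0 merely stands in for a "2D" descriptor and column 1 for a "3D" descriptor, the test split is arbitrary, and only a plain k-Nearest Neighbors regressor is shown. It does not reproduce the authors' dataset, descriptors or results.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict activity as the mean of the k nearest training samples."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in ranked[:k]) / k

def mean_abs_error(train_x, train_y, test_x, test_y, k=3):
    """Mean absolute error of k-NN predictions on a held-out test set."""
    errors = [abs(knn_predict(train_x, train_y, x, k) - y)
              for x, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)

def select_columns(rows, columns):
    """Keep only the chosen descriptor columns of every sample."""
    return [[row[c] for c in columns] for row in rows]

# Synthetic samples (assumption): column 0 plays the role of a "2D"
# descriptor, column 1 the role of a "3D" descriptor.
samples = [[a, b] for a in range(6) for b in range(6)]
# Synthetic activity that depends on both descriptor types.
activity = [a + 2 * b for a, b in samples]

# Hold out every fifth sample as an external test set (arbitrary choice).
test_ids = sorted(range(0, len(samples), 5))
train_ids = [i for i in range(len(samples)) if i not in set(test_ids)]

def evaluate(columns):
    """Train and test k-NN using only the given descriptor columns."""
    x = select_columns(samples, columns)
    return mean_abs_error(
        [x[i] for i in train_ids], [activity[i] for i in train_ids],
        [x[i] for i in test_ids], [activity[i] for i in test_ids],
    )

mae_2d = evaluate([0])
mae_3d = evaluate([1])
mae_combined = evaluate([0, 1])
print(f"2D only: {mae_2d:.2f}  3D only: {mae_3d:.2f}  2D + 3D: {mae_combined:.2f}")
```

Because the toy activity was constructed to depend on both columns, the combined model has the lowest error here by design. With real descriptors, scaling and descriptor selection would also matter.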

Subjects

Proteins, Quantitative Structure-Activity Relationship, Ligands, Molecular Conformation, Algorithms

Toward Developing Techniques-Agnostic Machine Learning Classification Models for Forensically Relevant Glass Fragments.

Kaspi, Omer; Israelsohn-Azulay, Osnat; Yigal, Zidon; Rosengarten, Hila; Krmpotic, Matea; Gouasmia, Sabrina; Bogdanovic Radovic, Iva; Jalkanen, Pasi; Liski, Anna; Mizohata, Kenichiro; Räisänen, Jyrki; Kasztovszky, Zsolt; Harsányi, Ildikó; Acharya, Raghunath; Pujari, Pradeep K; Mihály, Molnár; Braun, Mihaly; Shabi, Nahum; Girshevitz, Olga; Senderowitz, Hanoch.

J Chem Inf Model ; 63(1): 87-100, 2023 01 09.

Article in English | MEDLINE | ID: mdl-36512692

ABSTRACT

Glass fragments found in crime scenes may constitute important forensic evidence when properly analyzed, for example, to determine their origin. This analysis could be greatly helped by having a large and diverse database of glass fragments and by using it for constructing reliable machine learning (ML)-based glass classification models. Ideally, the samples that make up this database should be analyzed by a single accurate and standardized analytical technique. However, due to differences in equipment across laboratories, this is not feasible. With this in mind, in this work, we investigated if and how measurements performed at different laboratories on the same set of glass fragments could be combined in the context of ML. First, we demonstrated that elemental analysis methods such as particle-induced X-ray emission (PIXE), laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS), scanning electron microscopy with energy-dispersive X-ray spectrometry (SEM-EDS), particle-induced Gamma-ray emission (PIGE), instrumental neutron activation analysis (INAA), and prompt Gamma-ray neutron activation analysis (PGAA) could each produce lab-specific ML-based classification models. Next, we determined rules for the successful combinations of data from different laboratories and techniques and demonstrated that when followed, they give rise to improved models, and conversely, poor combinations will lead to poor-performing models. Thus, the combination of PIXE and LA-ICP-MS improves performance by ~10-15%, while combining PGAA with other techniques provides poorer performance in comparison with the lab-specific models. Finally, we demonstrated that the poor performance of the SEM-EDS technique, still in use by law enforcement agencies, could be greatly improved by replacing SEM-EDS measurements for Fe and Ca by PIXE measurements for these elements.
These findings suggest a process whereby forensic laboratories using different elemental analysis techniques could upload their data into a unified database and get reliable classification based on lab-agnostic models. This in turn brings us closer to a more exhaustive extraction of information from glass fragment evidence and furthermore may form the basis for international collaboration between law enforcement agencies.

Subjects

Glass

Inter-laboratory workflow for forensic applications: Classification of car glass fragments.

Kaspi, Omer; Israelsohn-Azulay, Osnat; Zidon, Yigal; Rosengarten, Hila; Krmpotic, Matea; Gouasmia, Sabrina; Radovic, Iva Bogdanovic; Jalkanen, Pasi; Liski, Anna; Mizohata, Kenichiro; Räisänen, Jyrki; Girshevitz, Olga; Senderowitz, Hanoch.

Forensic Sci Int ; 333: 111216, 2022 Apr.

Article in English | MEDLINE | ID: mdl-35220157

ABSTRACT

The International Atomic Energy Agency (IAEA) has coordinated a research project titled "Enhancing Nuclear Analytical Techniques to Meet the Needs of Forensics Sciences" (CRP F11021) with the aim of empowering accelerator- and reactor-based techniques for applications in forensic sciences. One of the key topics of this project was the analysis and classification of forensic glass specimens using Ion Beam Analysis (IBA) techniques and, in particular, Particle Induced X-ray Emission (PIXE). To this end, glass fragments from car windows from different car models and manufacturers, provided by the Israeli police force, were subjected to PIXE measurements at three laboratories to determine their elemental compositions and possible glass corrosion. Major and trace elements were measured and given as input to machine learning (ML) algorithms in order to develop classification models to determine the origin of the glass samples. First, we developed ML models based on the results obtained at each lab. These models successfully classified glass fragments into different car models with an accuracy >80% on external test sets. Next, we demonstrated that following an appropriate pre-processing step, results from different labs could be combined into a single unified database for the derivation of a classification model. This model demonstrates good performance that matches or surpasses that of models derived from the individual labs. This finding paves the way towards establishing an international database that is composed of measurements from various PIXE labs. We believe that using this methodology of combining various sources of measurements will improve models' performance and generality and will make the models accessible to law enforcement agencies around the world.
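The abstract does not say which pre-processing step was applied before pooling the labs' results. As a hedged illustration only, the sketch below assumes per-lab z-scoring of each element column, one common way to remove lab-specific offsets and scale factors before merging. The function names, the two hypothetical labs and all numbers are invented for this example.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize every column (element) to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in rows]

def pool_labs(lab_tables):
    """Standardize each lab separately, then merge into one table.

    Assumption: per-lab z-scoring stands in for the unspecified
    pre-processing step mentioned in the abstract.
    """
    pooled = []
    for lab_name, rows in lab_tables.items():
        for row in zscore_columns(rows):
            pooled.append((lab_name, row))
    return pooled

# Hypothetical readings of the same three fragments (two elements each).
lab_a = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
# Hypothetical second lab: same fragments, different calibration.
lab_b = [[2.0 * value + 10.0 for value in row] for row in lab_a]

pooled = pool_labs({"lab_a": lab_a, "lab_b": lab_b})
for lab_name, row in pooled:
    print(lab_name, [round(value, 3) for value in row])
```

In this toy case the two labs differ only by a linear calibration, so after standardization the same fragment gets the same feature values from either lab. Real inter-lab differences are unlikely to be this clean.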

PIXE based, Machine-Learning (PIXEL) supported workflow for glass fragments classification.

Kaspi, Omer; Girshevitz, Olga; Senderowitz, Hanoch.

Talanta ; 234: 122608, 2021 Nov 01.

Article in English | MEDLINE | ID: mdl-34364421

ABSTRACT

This paper presents a structured workflow for glass fragment analysis based on a combination of Elemental Analysis using PIXE and Machine Learning tools, with the ultimate goal of standardizing and helping forensic efforts. The proposed workflow was implemented on glass fragments received from the Israeli DIFS (Israeli Police Force's Division of Identification and Forensic Sciences) that were collected from various vehicles, including glass fragments from different manufacturers and years of production. We demonstrate that this workflow can produce models with high (>80%) accuracy in identifying glass fragments' origins and provide a test case demonstrating how the model can be applied in real-life forensic events. We provide a standard, reproducible methodology that can be used in many forensic domains beyond glass fragments, for example, gunshot residue, flammable liquids, illegal substances, and more.

Subjects

Glass, Machine Learning, Forensic Sciences, Workflow

Visualization of Solar Cell Library Space by Dimensionality Reduction Methods.

Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch.

J Chem Inf Model ; 58(12): 2428-2439, 2018 12 24.

Article in English | MEDLINE | ID: mdl-30485100

ABSTRACT

Visualizing high-dimensional data by projecting them into a two- or three-dimensional space is a popular approach in many scientific fields, including computer-aided drug design and cheminformatics. In contrast, dimensionality reduction techniques have been far less explored for materials informatics. Nevertheless, similar to their usefulness in analyzing the space of, e.g., drug-like molecules, such techniques could provide useful insights on materials space, including an intuitive grasp of the overall distribution of samples, the identification of interesting trends, including the formation of materials clusters and the presence of activity cliffs and outliers, and rational navigation through this space in the search for new materials. Here we present the first application of four dimensionality reduction techniques, namely, principal component analysis (PCA), kernel PCA, Isomap, and diffusion map, to visualize and analyze a part of the materials space populated by solar cells made of metal oxides. Solar cells in general and metal-oxide-based solar cells in particular hold the promise of contributing to the world's search for clean and affordable energy resources. With the exception of PCA, these methods have seldom been used to visualize chemistry space and almost never been used to visualize materials space. For this purpose, we integrated five metal-oxide-based solar cell libraries into a uniform database and subjected it to dimensionality reduction by all four methods, comparing their performances using various criteria such as maintaining the local environment of samples and the clustering structure in the low-dimensional space. We also looked at the number of outliers produced by each method and analyzed common outliers. 
We found that PCA performs best in terms of the ability to correctly maintain the local environment of samples, whereas Isomap does the best job of assigning class membership on the basis of the identities of nearest neighbors (i.e., it is the best classifier). We also found that many of the outliers identified by all of the methods could be rationalized. We suggest that the methods used in this work could be extended to study other types of solar cells, thereby setting the ground for further analysis of the photovoltaic (PV) space as well as other regions of materials space.
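One of the criteria named above, how well a low-dimensional map maintains each sample's local environment, can be sketched as the average overlap between a sample's k nearest neighbors before and after projection. The abstract does not give the authors' exact definition, so this k-NN overlap score is an assumption. The two "embeddings" below are simple coordinate selections standing in for real methods such as PCA or Isomap, and the points are invented.

```python
import math

def k_nearest(points, index, k):
    """Return the indices of the k nearest neighbors of points[index]."""
    others = [j for j in range(len(points)) if j != index]
    others.sort(key=lambda j: math.dist(points[index], points[j]))
    return set(others[:k])

def neighborhood_preservation(high, low, k):
    """Average fraction of each sample's k neighbors kept after projection."""
    overlaps = [len(k_nearest(high, i, k) & k_nearest(low, i, k)) / k
                for i in range(len(high))]
    return sum(overlaps) / len(overlaps)

# Invented 3D samples whose third coordinate carries no information.
high_dim = [(0, 0, 0), (1, 0, 0), (0, 5, 0), (1, 5, 0), (4, 0, 0), (4, 5, 0)]

# Stand-in embeddings (assumption): keep two informative axes, or only one.
embedding_xy = [(x, y) for x, y, _ in high_dim]
embedding_x = [(x,) for x, _, _ in high_dim]

score_xy = neighborhood_preservation(high_dim, embedding_xy, k=1)
score_x = neighborhood_preservation(high_dim, embedding_x, k=1)
print(f"keep x and y: {score_xy:.2f}  keep x only: {score_x:.2f}")
```

A score of 1.0 means every sample keeps all of its k nearest neighbors. Collapsing the data onto a single axis merges samples that were far apart, which lowers the score.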

Subjects

Data Mining, Small Molecule Libraries, Solar Energy, Materials Science

PV Analyzer: A Decision Support System for Photovoltaic Solar Cells Libraries.

Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch.

Mol Inform ; 37(9-10): e1800067, 2018 09.

Article in English | MEDLINE | ID: mdl-30022619

ABSTRACT

This work describes the integration of several data mining and machine learning tools for researching photovoltaic (PV) solar cell libraries into a unified workflow embedded within a GUI-supported Decision Support System (DSS), named PV Analyzer. The analyzer's workflow is composed of several data analysis components, including basic statistical and visualization methods as well as an algorithm for building predictive machine learning models. The analyzer allows for the identification of interesting trends within the libraries that are not easily observable using simple bi-parametric correlations. This may lead to new insights into factors affecting solar cell performance, with the ultimate goal of designing better solar cells. The analyzer was developed using MATLAB version R2014a and consequently could be easily extended by adding additional tools and algorithms. Furthermore, while in our hands the analyzer has been primarily used in the area of PV cells, it is equally applicable to the analysis of any other dataset composed of activities as dependent variables and descriptors as independent variables.

Subjects

Machine Learning, Software, Solar Energy, Quantitative Structure-Activity Relationship

RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells.

Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch.

J Cheminform ; 9(1): 34, 2017 Jun 06.

Article in English | MEDLINE | ID: mdl-29086047

ABSTRACT

An important aspect of chemoinformatics and material-informatics is the usage of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for removing noise from datasets. RANSAC could be used as a "one stop shop" algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development and predictions for test set samples using applicability domain. For "future" predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
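The core RANSAC idea referred to above (fit a model to a small random subset, count the samples that agree with it, keep the largest consensus set, then refit on that set) can be sketched for a one-descriptor linear model. This is a generic textbook-style RANSAC, not the authors' QSAR workflow: it omits descriptor selection and the applicability domain, and the data, the residual threshold and the iteration count are assumptions chosen for the example.

```python
import random

def line_through(p, q):
    """Slope and intercept of the line through two points."""
    slope = (q[1] - p[1]) / (q[0] - p[0])
    return slope, p[1] - slope * p[0]

def least_squares(points):
    """Ordinary least-squares line through a set of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ransac_line(points, threshold, iterations=200, seed=0):
    """Return the refitted line and the largest consensus (inlier) set."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        if p[0] == q[0]:
            continue  # a vertical pair cannot define y = slope * x + intercept
        slope, intercept = line_through(p, q)
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return least_squares(best_inliers), best_inliers

# Invented data: ten samples on y = 2x + 1 plus two gross outliers.
clean = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(2.0, 30.0), (7.0, -20.0)]
data = clean + outliers

(slope, intercept), inliers = ransac_line(data, threshold=1.0)
print(f"slope={slope:.3f} intercept={intercept:.3f} inliers={len(inliers)}")
```

The final least-squares refit uses only the consensus set, so the two outliers do not pull the fitted line away from the underlying trend.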

Visualization Based Data Mining for Comparison Between Two Solar Cell Libraries.

Yosipof, Abraham; Kaspi, Omer; Majhi, Koushik; Senderowitz, Hanoch.

Mol Inform ; 35(11-12): 622-628, 2016 12.

Article in English | MEDLINE | ID: mdl-27870244

ABSTRACT

Material informatics may provide meaningful insights and powerful predictions for the development of new and efficient Metal Oxide (MO) based solar cells. The main objective of this paper is to establish the usefulness of data reduction and visualization methods for analyzing data sets emerging from multiple all-MO solar cell libraries. For this purpose, two libraries, TiO2|Co3O4 and TiO2|Co3O4|MoO3, differing only by the presence of a MoO3 layer in the latter, were analyzed with Principal Component Analysis and Self-Organizing Maps. Both analyses suggest that the addition of the MoO3 layer to the TiO2|Co3O4 library has affected the overall photovoltaic (PV) activity profile of the solar cells, making the two libraries clearly distinguishable from one another. Furthermore, while MoO3 had an overall favorable effect on PV parameters, a sub-population of cells was identified that was either indifferent to its presence or even demonstrated a reduction in several parameters.

Subjects

Data Mining/methods, Cobalt/chemistry, Molybdenum/chemistry, Oxides/chemistry, Solar Energy, Titanium/chemistry
