Study of Data Set Modelability: Modelability, Rivality, and Weighted Modelability Indexes.
Luque Ruiz, Irene; Gómez-Nieto, Miguel Ángel.
Affiliation
  • Luque Ruiz I; Department of Computing and Numerical Analysis, University of Córdoba, Campus de Rabanales, Albert Einstein Building, E-14071 Córdoba, Spain.
  • Gómez-Nieto MÁ; Department of Computing and Numerical Analysis, University of Córdoba, Campus de Rabanales, Albert Einstein Building, E-14071 Córdoba, Spain.
J Chem Inf Model; 58(9): 1798-1814, 2018 09 24.
Article in En | MEDLINE | ID: mdl-30149700
ABSTRACT
Knowing, in the early stages of building quantitative structure-activity relationship (QSAR) prediction models, whether a data set can be modeled is important because it can reduce the effort and time needed to select or reject data sets and to refine their composition. The modelability index (MODI) is based on counting the first nearest neighbors of the molecules in the data set and is a standard measurement accepted in the QSAR community. In this paper, we revisit the calculation of the modelability index and propose a more formal formulation that extends the calculation to the first nearest neighbors belonging to each class present in the data set. This new formulation also allows the calculation of the rivality index, a measurement of the presence of correctly classifiable molecules and activity cliffs. When the rivality index is weighted by the cardinality of the neighborhood of each molecule in the data set, the resulting weighted modelability index is highly correlated with the correct classification rate (QSAR_CCR) obtained when building QSAR models with different classification algorithms. The results obtained with the weighted modelability index show correlations with r² higher than 0.9, slopes close to 1, and bias close to zero for different algorithms.
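As an illustration of the nearest-neighbor counting that the abstract refers to, the following is a minimal sketch of the original modelability index as it is commonly stated: for each class, the fraction of molecules whose first nearest neighbor belongs to the same class, averaged over classes. It does not implement the paper's extended formulation, the rivality index, or the weighted index, none of which are specified in the abstract. The function name `modi`, the use of Euclidean distance, and the toy two-dimensional descriptors are assumptions made for this example only.

```python
from collections import defaultdict
from math import dist


def modi(points, labels):
    """Class-averaged fraction of items whose first nearest neighbor shares their class.

    Assumption: Euclidean distance on numeric descriptor vectors; the abstract
    does not state which descriptors or distance measure the authors use.
    """
    same = defaultdict(int)   # per class: items whose nearest neighbor has the same class
    total = defaultdict(int)  # per class: number of items
    for i, (p, y) in enumerate(zip(points, labels)):
        # First nearest neighbor of item i, excluding the item itself.
        j = min((k for k in range(len(points)) if k != i),
                key=lambda k: dist(p, points[k]))
        total[y] += 1
        if labels[j] == y:
            same[y] += 1
    # Average the per-class fractions so that large classes do not dominate.
    return sum(same[c] / total[c] for c in total) / len(total)


# Hypothetical toy data: the last point is labeled 1 but sits next to class 0.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.25, 0.0)]
labels = [0, 0, 1, 1, 1]
score = modi(points, labels)
print(round(score, 4))
```

In this toy set the last point lies closest to a molecule of the other class, which is the kind of situation the abstract associates with activity cliffs, and it lowers the per-class fraction for class 1.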
Subjects

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Computer Simulation / Drug Discovery | Type of study: Prognostic_studies | Language: En | Publication year: 2018 | Document type: Article
