Dental data mining: potential pitfalls and practical issues.
Adv Dent Res; 17: 109-14, 2003 Dec.
Article in En | MEDLINE | ID: mdl-15126220
ABSTRACT
Knowledge Discovery and Data Mining (KDD) have become popular buzzwords. But what exactly is data mining? What are its strengths and limitations? Classic regression, artificial neural network (ANN), and classification and regression tree (CART) models are common KDD tools. Some recent reports (e.g., Kattan et al., 1998) show that ANN and CART models can perform better than classic regression models: CART models excel at covariate interactions, while ANN models excel at nonlinear covariates. Model prediction performance is examined with the use of validation procedures and by evaluating concordance, sensitivity, specificity, and likelihood ratio. To aid interpretation, various plots of predicted probabilities are utilized, such as lift charts, receiver operating characteristic curves, and cumulative captured-response plots. A dental caries study is used as an illustrative example. This paper compares the performance of logistic regression with the KDD methods of CART and ANN in analyzing data from the Rochester caries study. With careful analysis, such as validation with sufficient sample size and the use of proper competitors, the problems of naïve KDD analyses (Schwarzer et al., 2000) can be avoided.
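The performance measures named in the abstract (sensitivity, specificity, likelihood ratio, and concordance) can be illustrated with a minimal sketch. It is not the authors' code and does not use the Rochester caries data: the function names, the 0.5 threshold, and the eight outcome/probability pairs are all hypothetical, chosen only to show how each measure is derived from predicted probabilities on a validation set.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, and positive likelihood ratio at a cutoff.

    y_true: observed outcomes (1 = event, 0 = no event)
    y_prob: model-predicted probabilities of the event
    threshold: hypothetical cutoff; predictions >= threshold count as positive
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # share of true events that were flagged
    specificity = tn / (tn + fp)  # share of non-events that were cleared
    # Positive likelihood ratio: sensitivity / (1 - specificity)
    if specificity < 1:
        lr_positive = sensitivity / (1 - specificity)
    else:
        lr_positive = float("inf")
    return sensitivity, specificity, lr_positive


def concordance(y_true, y_prob):
    """Concordance (c-index): the proportion of event/non-event pairs in which
    the event received the higher predicted probability; ties count as 0.5.
    For a binary outcome this equals the area under the ROC curve."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical validation-set outcomes and predicted probabilities
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

sens, spec, lr_pos = classification_metrics(y_true, y_prob)
c_index = concordance(y_true, y_prob)
print(sens, spec, lr_pos, c_index)
```

Sensitivity, specificity, and likelihood ratio depend on the chosen cutoff, whereas concordance summarizes discrimination across all cutoffs, which is one reason to report both kinds of measure when comparing models.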
Collection: 01-internacional
Database: MEDLINE
Main subject: Models, Statistical / Databases, Factual / Information Storage and Retrieval / Neural Networks, Computer / Dental Caries
Type of study: Incidence_studies / Prognostic_studies / Risk_factors_studies
Limits: Child / Humans
Language: En
Journal: Adv Dent Res
Journal subject: Dentistry
Year: 2003
Document type: Article
Affiliation country: United States