ABSTRACT
Hospital information systems (HISs) and picture archiving and communication systems (PACSs) archive large amounts of data ("big data") that remain unused. Many ongoing research projects therefore aim to use "big data" to develop early diagnosis, prediction of disease onset, and personalized therapies. In this study, we propose a new image data mining method that identifies regularities and abnormalities in large image data sets. We used 70 archived magnetic resonance (MR) images acquired with three-dimensional magnetization-prepared rapid acquisition with gradient echo (3D MP-RAGE). These images were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. For anatomical standardization of the data, we used the Statistical Parametric Mapping (SPM) software. Using a hierarchical clustering technique and a similarity matrix based on cross-correlation coefficients (CCs) calculated from an anatomical region, we classified all the abnormal cases into five groups. The Z-score map identified the differences between a standard normal brain and each brain in the Alzheimer's groups. In addition, a scatter plot obtained from two similarity matrices visualized the regularities and abnormalities in the image data sets. Image features identified using our method could be useful for understanding image findings associated with Alzheimer's disease.
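The analysis steps named above (CC-based similarity matrix, hierarchical clustering, Z-score map) can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes each standardized image region has been flattened to a list of voxel intensities, uses the Pearson form of the cross-correlation coefficient, takes 1 - CC as the distance, uses average linkage, and computes Z as (image - normal mean) / normal SD with the population SD. All function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def cross_correlation(a, b):
    """Pearson-type cross-correlation coefficient of two equal-length voxel lists."""
    ma, mb = mean(a), mean(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0


def similarity_matrix(images):
    """Pairwise CC between all images (each image is a flat list of voxels)."""
    n = len(images)
    return [[cross_correlation(images[i], images[j]) for j in range(n)]
            for i in range(n)]


def hierarchical_clusters(sim, n_groups):
    """Agglomerative clustering; average linkage and distance 1 - CC are assumptions."""
    clusters = [[i] for i in range(len(sim))]

    def linkage(c1, c2):
        return mean(1.0 - sim[i][j] for i in c1 for j in c2)

    while len(clusters) > n_groups:
        # Find the closest pair of clusters and merge it.
        _, p, q = min((linkage(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


def z_score_map(image, normals):
    """Voxel-wise Z = (image - normal mean) / normal SD (sign convention assumed)."""
    z = []
    for v, x in enumerate(image):
        vals = [n[v] for n in normals]
        sd = pstdev(vals)
        z.append((x - mean(vals)) / sd if sd else 0.0)
    return z


# Toy data only: four "images" of four voxels each, not ADNI data.
images = [
    [1, 2, 3, 4],
    [2, 4, 6, 8.5],
    [4, 3, 2, 1],
    [8, 6, 4, 2.5],
]
sim = similarity_matrix(images)
groups = hierarchical_clusters(sim, n_groups=2)
```

With real data, `n_groups` would be set to the number of groups sought (five in this study), and the voxel lists would come from the anatomically standardized region rather than from toy values.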