ABSTRACT
PURPOSE: Studies have shown an association between tongue color and disease. To help clinicians make faster, more objective, and more accurate decisions, we apply unsupervised learning to cluster tongue color in two dimensions (2D). METHODS: A total of 595 typical tongue images were analyzed. The three-dimensional (3D) color information extracted from each image was reduced to 2D by principal component analysis (PCA). K-Means was then applied to cluster the images into four diagnostic groups. The results were evaluated by clustering accuracy (CA), Jaccard similarity coefficient (JSC), and adjusted Rand index (ARI). RESULTS: The 2D representation retained 89.63% of the original information in the L*a*b* color space. The method grouped the tongue images into four clusters, with CA, ARI, and JSC of 89.04%, 0.721, and 0.890, respectively. CONCLUSIONS: The 2D representation of tongue color can be used for clustering and improves visualization. K-Means combined with PCA could be used for tongue color classification and diagnosis. The methods in this paper might serve as a reference for other research based on image diagnosis technology.
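The three evaluation measures named above can be illustrated with a minimal pure-Python sketch. It assumes reference labels are available for each image, that CA is computed under the best one-to-one mapping of clusters to classes, and that JSC is the pair-counting Jaccard coefficient; the abstract does not state the exact definitions used, so these are assumptions, and all function names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import comb


def clustering_accuracy(truth, pred):
    """Accuracy under the best one-to-one mapping of cluster ids to classes.

    Brute force over permutations; fine for four groups. Assumes there are
    no more clusters than classes.
    """
    classes = sorted(set(truth))
    clusters = sorted(set(pred))
    best = 0
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[p] == t for t, p in zip(truth, pred))
        best = max(best, hits)
    return best / len(truth)


def _pair_sums(truth, pred):
    """Pairs grouped together in both labelings, in truth, and in pred."""
    both = sum(comb(n, 2) for n in Counter(zip(truth, pred)).values())
    same_truth = sum(comb(n, 2) for n in Counter(truth).values())
    same_pred = sum(comb(n, 2) for n in Counter(pred).values())
    return both, same_truth, same_pred


def jaccard_pairs(truth, pred):
    """Pair-counting Jaccard: agreeing pairs over pairs joined in either labeling."""
    both, same_truth, same_pred = _pair_sums(truth, pred)
    return both / (same_truth + same_pred - both)


def adjusted_rand_index(truth, pred):
    """Rand index corrected for chance agreement.

    Undefined (division by zero) when both labelings are trivial.
    """
    both, same_truth, same_pred = _pair_sums(truth, pred)
    expected = same_truth * same_pred / comb(len(truth), 2)
    max_index = (same_truth + same_pred) / 2
    return (both - expected) / (max_index - expected)


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]   # toy reference labels, not study data
    pred = [1, 1, 0, 0, 0, 0]    # toy cluster assignments
    print(clustering_accuracy(truth, pred))
    print(jaccard_pairs(truth, pred))
    print(adjusted_rand_index(truth, pred))
```

All three measures are invariant to how cluster ids are numbered, which matters because K-Means assigns ids arbitrarily.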
Subjects
Color , Tongue , Cluster Analysis , Humans , Principal Component Analysis
ABSTRACT
OBJECTIVE: This study used machine learning to classify and predict the pulse waves of a hypertensive group and a healthy group, to assess the risk of hypertension by observing dynamic changes in the pulse wave, and to provide an objective reference for the clinical application of pulse diagnosis in traditional Chinese medicine (TCM). METHOD: Basic information from 450 hypertensive cases and 479 healthy cases was collected with self-developed H20 questionnaires, and pulse wave information was acquired with a self-developed pulse diagnostic instrument (PDA-1). The H20 questionnaire data and pulse wave information were used as input variables to build different machine learning classification models of hypertension. The analysis examined the influence of the pulse wave on the accuracy and stability of the machine learning models, as well as the feature contributions of the hypertension model after noise was removed by K-means. RESULT: Compared with the classification results before noise removal, both accuracy and the area under the curve (AUC) improved. The accuracy rates of AdaBoost, Gradient Boosting, and Random Forest (RF) were 86.41%, 86.41%, and 85.33%, respectively, and the AUCs were 0.86, 0.86, and 0.85, respectively. The maximum accuracy of SVM increased from 79.57% to 83.15%, and its AUC increased from 0.79 to 0.83. In addition, the important features identified by traditional statistics and by machine learning were consistent. After noise removal, the features with large changes among the top 10 in AdaBoost and Gradient Boosting were h1/t1, w1/t, t, w2, h2, t1, and t5. The variables common to machine learning and traditional statistics were h1/t1, h5, t, Ad, BMI, and t2. CONCLUSION: The pulse wave-based diagnostic method for hypertension has significant reference value. Given the feasibility of digital pulse wave diagnosis and of dynamically evaluating hypertension, it provides a research direction and foundation for Chinese medicine in the dynamic evaluation of modern disease diagnosis and curative effect.
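For readers less familiar with the two reported metrics, the following is a minimal sketch of how accuracy and AUC are computed from a binary classifier's outputs. It assumes labels coded 1 for hypertensive and 0 for healthy and a numeric score per case; the function names and toy values are hypothetical and are not taken from the study.

```python
def accuracy(labels, predictions):
    """Fraction of cases whose predicted class matches the true class."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case, with ties counted as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]                      # toy data: 1 = hypertensive
    scores = [0.9, 0.4, 0.6, 0.1]              # toy classifier scores
    predictions = [int(s >= 0.5) for s in scores]
    print(accuracy(labels, predictions))
    print(roc_auc(labels, scores))
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.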