ABSTRACT
The International Classification of Diseases (ICD) is a widely used standard for disease classification, health monitoring, and medical data analysis. Deep learning-based automated ICD coding has gained attention because manual coding is time-consuming and costly. The main challenges of automated ICD coding are imbalanced label distribution, code hierarchy, and noisy text. Recent works have used the code hierarchy or code descriptions to obtain better label representations and thereby mitigate the imbalanced label distribution. However, these methods remain ineffective and redundant because they interact only with a constant label representation. In this work, we introduce a novel Hyperbolic Graph Convolutional Network with Contrastive Learning (HGCN-CL) to address these challenges and the shortcomings of previous methods. We apply a hyperbolic graph convolutional network to ICD coding to capture the hierarchical structure of the codes, which addresses the large distortion that arises when a hierarchical structure is embedded with a conventional graph convolutional network. In addition, we introduce contrastive learning for automatic ICD coding: code features are injected into the text encoder to generate hierarchy-aware positive samples, which addresses the problem of interacting with constant code features. We conduct experiments on the public MIMIC-III and MIMIC-II datasets. The results on MIMIC-III show that HGCN-CL outperforms previous state-of-the-art methods for automatic ICD coding, achieving improvements of 2.7% and 3.6%, respectively, over the previous best results (Hypercore). We also provide ablation experiments and a hierarchy visualization to verify the effectiveness of the components of our model.
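The claim that hyperbolic space embeds hierarchies with less distortion can be illustrated with a small numerical sketch. It is not the HGCN-CL implementation. It assumes the Poincaré ball model of hyperbolic space, which the abstract does not specify, and the function name `poincare_distance` and the toy "root" and "leaf" points are hypothetical. In a tree, the distance between two sibling leaves equals the length of the path through their parent, so the ratio computed below would be 1 for a distortion-free embedding.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball
    (Poincare ball model, curvature -1). Assumed model, for illustration only."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Toy hierarchy: a parent code at the origin and two child codes near the boundary.
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

# Ratio of the direct leaf-to-leaf distance to the path through the root.
# A value of 1.0 would reproduce the tree metric exactly.
hyperbolic_ratio = poincare_distance(leaf_a, leaf_b) / (
    poincare_distance(leaf_a, root) + poincare_distance(root, leaf_b)
)
euclidean_ratio = math.dist(leaf_a, leaf_b) / (
    math.dist(leaf_a, root) + math.dist(root, leaf_b)
)

print(f"hyperbolic ratio: {hyperbolic_ratio:.3f}")
print(f"euclidean ratio:  {euclidean_ratio:.3f}")
```

For the same coordinates, the hyperbolic ratio is closer to 1 than the Euclidean one, which is the intuition behind using a hyperbolic geometry for tree-like code hierarchies.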