ABSTRACT
BACKGROUND: Lung adenocarcinoma is a common cause of cancer-related death worldwide, and accurate EGFR genotyping is crucial for optimal treatment outcomes. Conventional methods for identifying the EGFR genotype have several limitations. We therefore propose a deep learning model that uses non-invasive CT images to predict EGFR mutation status, with an emphasis on robustness and generalizability. METHODS: A total of 525 patients enrolled at the local hospital formed the internal data set for model training and validation. In addition, a cohort of 30 patients from the publicly available Cancer Imaging Archive data set was selected for external testing. All patients underwent plain chest CT, and EGFR mutation status was labeled as either mutant or wild type. The CT images were analyzed with a self-attention-based ViT-B/16 model to predict EGFR mutation status, and the model's performance was evaluated. Grad-CAM was used to produce attention maps highlighting the image regions suggestive of EGFR mutation. RESULTS: On the validation cohort, the ViT model achieved an accuracy of 0.848, an AUC of 0.868, a sensitivity of 0.924, and a specificity of 0.718. On the external test cohort, it achieved comparable performance, with an accuracy of 0.833, an AUC of 0.885, a sensitivity of 0.900, and a specificity of 0.800. CONCLUSIONS: The ViT model predicted the EGFR mutation status of lung adenocarcinoma patients with high accuracy on both the internal validation cohort and the external test cohort. Moreover, with the aid of attention maps, the model can assist clinicians in making informed clinical decisions.
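As a reading aid, the sketch below shows how accuracy, sensitivity, and specificity relate to a confusion matrix, using the external test figures (30 patients, accuracy 0.833, sensitivity 0.900, specificity 0.800). It assumes that mutant is the positive class and that the reported values are rounded to three decimals; neither is stated in the abstract. The function name `consistent_confusion_matrices` is hypothetical, and the counts it returns are an inference from the rounded metrics, not data reported by the study.

```python
from itertools import product


def consistent_confusion_matrices(n, accuracy, sensitivity, specificity, tol=0.0005):
    """List every (tp, fn, tn, fp) summing to n whose metrics match the
    reported values to within half a unit in the third decimal place.

    Assumption: the positive class is "mutant" and the negative class is
    "wild type".
    """
    matches = []
    for positives in range(1, n):          # at least one patient in each class
        negatives = n - positives
        for tp, tn in product(range(positives + 1), range(negatives + 1)):
            sens = tp / positives           # true positive rate
            spec = tn / negatives           # true negative rate
            acc = (tp + tn) / n             # overall fraction correct
            if (abs(sens - sensitivity) < tol
                    and abs(spec - specificity) < tol
                    and abs(acc - accuracy) < tol):
                matches.append((tp, positives - tp, tn, negatives - tn))
    return matches


if __name__ == "__main__":
    for tp, fn, tn, fp in consistent_confusion_matrices(30, 0.833, 0.900, 0.800):
        print(f"TP={tp} FN={fn} TN={tn} FP={fp}")
```

Under these assumptions the search finds a single match: 9 true positives, 1 false negative, 16 true negatives, and 4 false positives, which would correspond to 10 mutant and 20 wild-type cases in the external cohort. The AUC cannot be checked this way because it depends on the model's continuous output scores rather than on thresholded counts.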