Deep learning prediction of sex on chest radiographs: a potential contributor to biased algorithms.
Li, David; Lin, Cheng Ting; Sulam, Jeremias; Yi, Paul H.
  • Li D; Faculty of Medicine, University of Ottawa, Roger Guindon Hall, 451 Smyth Rd #2044, Ottawa, ON, K1H 8M5, Canada.
  • Lin CT; University of Maryland Medical Intelligent Imaging (UM2II) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, Room 1172, Baltimore, MD, 21201, USA.
  • Sulam J; Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N Caroline St, Baltimore, MD, 21231, USA.
  • Yi PH; Department of Biomedical Engineering, Johns Hopkins University, Clark 320B, 3400 N Charles St, Baltimore, MD, 21218, USA.
Emerg Radiol ; 29(2): 365-370, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35006495
ABSTRACT

BACKGROUND:

Deep convolutional neural networks (DCNNs) for diagnosis of disease on chest radiographs (CXR) have been shown to be biased against males or females if the datasets used to train them have unbalanced sex representation. Prior work has suggested that DCNNs can predict sex on CXR, which could aid forensic evaluations, but also be a source of bias.

OBJECTIVE:

To (1) evaluate the performance of DCNNs for predicting sex across different datasets and architectures and (2) evaluate visual biomarkers used by DCNNs to predict sex on CXRs.

MATERIALS AND METHODS:

Chest radiographs were obtained from the Stanford CheXpert and NIH ChestX-ray14 datasets, which comprise 224,316 and 112,120 CXRs, respectively. To control for dataset size and class imbalance, random undersampling was used to reduce each dataset to 97,560 images balanced for sex. Each dataset was randomly split into training (70%), validation (10%), and test (20%) sets. Four DCNN architectures pre-trained on ImageNet were used for transfer learning. DCNNs were externally validated on the test set from the other dataset. Performance was evaluated using the area under the receiver operating characteristic curve (AUC). Class activation mapping (CAM) was used to generate heatmaps visualizing the regions contributing to each DCNN's prediction.
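The abstract does not include code; a minimal, dependency-free sketch of the balanced random undersampling and 70/10/20 split described above might look like the following (the record format with a "sex" field is a hypothetical assumption, not from the paper):

```python
import random

def balance_and_split(records, seed=0, fractions=(0.7, 0.1, 0.2)):
    """Undersample the majority sex so classes are balanced, then
    shuffle and split into train/validation/test partitions."""
    rng = random.Random(seed)
    males = [r for r in records if r["sex"] == "M"]
    females = [r for r in records if r["sex"] == "F"]
    n = min(len(males), len(females))  # undersample to the minority count
    balanced = rng.sample(males, n) + rng.sample(females, n)
    rng.shuffle(balanced)
    n_train = int(fractions[0] * len(balanced))
    n_val = int(fractions[1] * len(balanced))
    train = balanced[:n_train]
    val = balanced[n_train:n_train + n_val]
    test = balanced[n_train + n_val:]
    return train, val, test
```

Undersampling the majority class, rather than oversampling the minority, trades dataset size for a guarantee that neither sex dominates the training signal, which is the confound the study controls for.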

RESULTS:

On the internal test sets, DCNNs achieved AUCs ranging from 0.98 to 0.99. On external validation, performance peaked at an AUC of 0.94 for the VGG19-Stanford model and 0.95 for the InceptionV3-NIH model. Heatmaps highlighted similar regions of attention across model architectures and datasets, localizing to the mediastinal and upper rib regions, as well as to the lower chest/diaphragmatic regions.
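The AUC values reported above have a simple rank-based interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A dependency-free sketch of that computation (a pedagogical O(n²) version, not the implementation used in the study):

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney U identity:
    AUC = P(score_pos > score_neg), with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.98 therefore means the model ranks a male radiograph above a female one (or vice versa, depending on the positive label) 98% of the time.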

CONCLUSION:

DCNNs trained on two large CXR datasets accurately predicted sex on internal and external test data, with similar heatmap localizations across DCNN architectures and datasets. These findings support the notion that DCNNs can leverage imaging biomarkers to predict sex, which could confound the accurate prediction of disease on CXRs and contribute to biased models. On the other hand, these DCNNs could aid emergency radiologists in forensic evaluations by determining the sex of patients whose identities are unknown, such as in acute trauma.

Full text: 1 Database: MEDLINE Main subject: Deep Learning Study type: Diagnostic_studies / Prognostic_studies / Risk_factors_studies Limits: Female / Humans / Male Language: En Year: 2022 Document type: Article
