ABSTRACT
OBJECTIVES: To investigate the potential and limitations of transformer-based report annotation for on-site development of image-based diagnostic decision support systems (DDSS).

METHODS: The study included 88,353 chest X-rays from 19,581 intensive care unit (ICU) patients. To label the presence of six typical findings in 17,041 images, medical research assistants assessed the corresponding free-text reports of the attending radiologists ("gold labels"). Transformer models trained on the gold labels then extracted automatically generated "silver" labels for all reports. To investigate the benefit of such silver labels, the image-based models were trained using three approaches: with gold labels only (MG), with silver labels first and then with gold labels (MS/G), and with silver and gold labels together (MS+G). To investigate the influence of invested annotation effort, the experiments were repeated with different numbers (N) of gold-annotated reports for training the transformer and image-based models, and tested on 2099 gold-annotated images. Differences in macro-averaged area under the receiver operating characteristic curve (AUC) were considered significant when the 95% confidence intervals did not overlap.

RESULTS: Training with transformer-based silver labels yielded significantly higher macro-averaged AUC than training solely with gold labels (N = 1000: MG 67.8 [66.0-69.6], MS/G 77.9 [76.2-79.6]; N = 14,580: MG 74.5 [72.8-76.2], MS/G 80.9 [79.4-82.4]). Training with silver and gold labels together was beneficial when only 500 gold labels were used (MS+G 76.4 [74.7-78.0], MS/G 75.3 [73.5-77.0]).

CONCLUSIONS: Transformer-based annotation has potential for unlocking free-text report databases for the development of image-based DDSS. However, on-site development of image-based DDSS could benefit from more sophisticated annotation pipelines that draw on more information than a single radiological report.
CLINICAL RELEVANCE STATEMENT: Leveraging clinical databases for on-site development of artificial intelligence (AI)-based diagnostic decision support systems by text-based transformers could promote the application of AI in clinical practice by circumventing highly regulated data exchanges with third parties.

KEY POINTS:
• The amount of data from a database that can be used to develop AI-assisted diagnostic decision systems is often limited by the need for time-consuming identification of pathologies by radiologists.
• The transformer-based structuring of free-text radiological reports shows potential to unlock corresponding image databases for on-site development of image-based diagnostic decision support systems.
• However, the quality of image annotations generated solely on the content of a single radiology report may be limited by potential inaccuracies and incompleteness of this report.
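
The significance criterion stated in the Methods (non-overlapping 95% confidence intervals of the macro-averaged AUC) can be illustrated with a minimal Python sketch. The interval bounds are copied from the Results; the function and variable names are hypothetical and chosen for illustration only, and the sketch does not reproduce how the authors computed the intervals.

```python
from typing import Tuple

# A confidence interval as (lower bound, upper bound), in AUC percentage points.
Interval = Tuple[float, float]


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """Return True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def significantly_different(a: Interval, b: Interval) -> bool:
    """Apply the criterion from the Methods: significant if the CIs do not overlap."""
    return not intervals_overlap(a, b)


# 95% confidence intervals of the macro-averaged AUC as reported in the Results.
# Variable names are illustrative (hypothetical), not taken from the study.
mg_n1000: Interval = (66.0, 69.6)       # MG,   N = 1000
ms_g_n1000: Interval = (76.2, 79.6)     # MS/G, N = 1000
mg_n14580: Interval = (72.8, 76.2)      # MG,   N = 14,580
ms_g_n14580: Interval = (79.4, 82.4)    # MS/G, N = 14,580

if __name__ == "__main__":
    print("N = 1000, MG vs MS/G:", significantly_different(mg_n1000, ms_g_n1000))
    print("N = 14,580, MG vs MS/G:", significantly_different(mg_n14580, ms_g_n14580))
```

For both values of N, the MG and MS/G intervals are disjoint, which is consistent with the significant difference reported in the Results.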