ABSTRACT
Brain age estimated from MRI data has emerged as a promising biomarker of neurological health. However, the absence of large, diverse, and clinically representative training datasets, along with the complexity of managing heterogeneous MRI data, presents significant barriers to the development of accurate and generalisable models appropriate for clinical use. Here, we present a deep learning framework trained on routine clinical data (N up to 18,890, age range 18-96 years). We trained five separate models for accurate brain age prediction (all with mean absolute error ≤4.0 years, R² ≥ 0.86) across five different MRI sequences (T2-weighted, T2-FLAIR, T1-weighted, diffusion-weighted, and gradient-recalled echo T2*-weighted). Our trained models offer dual functionality. First, they can be applied directly to clinical data. Second, they can be used as foundation models for further refinement to accommodate a range of other MRI sequences (and therefore a range of clinical scenarios which employ such sequences). This adaptation process, enabled by transfer learning, proved effective in our study across a range of MRI sequences and scan orientations, including those which differed considerably from the original training datasets. Crucially, our findings suggest that this approach remains viable even with limited data availability (as low as N = 25 for fine-tuning), thus broadening the application of brain age estimation to more diverse clinical contexts and patient populations. By making these models publicly available, we aim to provide the scientific community with a versatile toolkit, promoting further research in brain age prediction and related areas.
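For readers who want a concrete picture of the fine-tuning step, the sketch below shows one way such transfer learning could be set up in PyTorch: a pretrained brain-age regressor has its convolutional features frozen and only the regression head is re-fitted on a small dataset (N = 25, mirroring the smallest fine-tuning set above). The tiny 3D CNN, the commented-out weight file, and the random volumes are placeholders, not the published models or data.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for a pretrained backbone: a small 3D CNN with a single regression head.
class BrainAgeCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(16, 1)  # predicted age in years

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = BrainAgeCNN()
# model.load_state_dict(torch.load("pretrained_t2_brain_age.pt"))  # hypothetical weight file

# Transfer learning: freeze the feature extractor, retrain only the regression head.
for p in model.features.parameters():
    p.requires_grad = False

# Dummy data standing in for 25 fine-tuning scans (1-channel 32^3 volumes) with known ages.
x = torch.randn(25, 1, 32, 32, 32)
y = torch.rand(25, 1) * 78 + 18  # ages spanning 18-96 years

loader = DataLoader(TensorDataset(x, y), batch_size=5, shuffle=True)
optimiser = torch.optim.Adam(model.head.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()  # optimises mean absolute error directly

for epoch in range(10):
    for xb, yb in loader:
        optimiser.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimiser.step()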
Subjects
Brain; Mental Recall; Humans; Adolescent; Young Adult; Adult; Middle Aged; Aged; Aged, 80 and over; Child, Preschool; Brain/diagnostic imaging; Diffusion; Neuroimaging; Machine Learning
ABSTRACT
Unlocking the vast potential of deep learning-based computer vision classification systems necessitates large datasets for model training. Natural Language Processing (NLP), which enables automated dataset labelling, represents a potential avenue to achieve this. However, many aspects of NLP for dataset labelling remain unvalidated. Expert radiologists manually labelled over 5,000 MRI head reports in order to develop a deep learning-based neuroradiology NLP report classifier. Our results demonstrate that binary labels (normal vs. abnormal) were assigned with high accuracy, even when only two MRI sequences (T2-weighted and those based on diffusion-weighted imaging) were employed as opposed to all sequences in an examination. Meanwhile, the accuracy of more specific labelling for multiple disease categories was variable and dependent on the category. Finally, model performance was shown to depend on the expertise of the original labeller, with worse performance for non-expert than for expert labellers.
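As an illustration of the report-labelling idea (not the deep learning classifier used in the study), the toy pipeline below assigns binary normal/abnormal labels to free-text reports with TF-IDF features and logistic regression; the example reports and labels are invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy report snippets (hypothetical, not drawn from the study dataset).
reports = [
    "No acute intracranial abnormality. Normal appearances for age.",
    "Restricted diffusion in the left MCA territory consistent with acute infarct.",
    "Ventricles and sulci within normal limits.",
    "Large right frontal mass with surrounding vasogenic oedema.",
]
labels = [0, 1, 0, 1]  # 0 = normal, 1 = abnormal

classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
classifier.fit(reports, labels)

# Label a new (toy) report; at scale, such labels would feed a computer vision training set.
print(classifier.predict(["Diffusion-weighted imaging shows an acute infarct."]))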
ABSTRACT
The growing demand for head magnetic resonance imaging (MRI) examinations, along with a global shortage of radiologists, has led to an increase in the time taken to report head MRI scans in recent years. For many neurological conditions, this delay can result in poorer patient outcomes and inflated healthcare costs. Potentially, computer vision models could help reduce reporting times for abnormal examinations by flagging abnormalities at the time of imaging, allowing radiology departments to prioritise their limited reporting resources towards these scans. To date, however, the difficulty of obtaining large, clinically-representative labelled datasets has been a bottleneck to model development. In this work, we present a deep learning framework, based on convolutional neural networks, for detecting clinically-relevant abnormalities in minimally processed, hospital-grade axial T2-weighted and axial diffusion-weighted head MRI scans. The models were trained at scale using a Transformer-based neuroradiology report classifier to generate a labelled dataset of 70,206 examinations from two large UK hospital networks, and demonstrate fast (< 5 s), accurate (area under the receiver operating characteristic curve (AUC) > 0.9), and interpretable classification, with good generalisability between hospitals (ΔAUC ≤ 0.02). Through a simulation study, we show that our best model would reduce the mean reporting time for abnormal examinations from 28 days to 14 days and from 9 days to 5 days at the two hospital networks, demonstrating feasibility for use in a clinical triage environment.
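The toy simulation below (not the study's simulation code) conveys why triage of flagged scans shortens reporting delays for abnormal examinations: flagged examinations are simply moved to the front of the reporting queue, and the mean queue position of truly abnormal scans is compared with first-come-first-served reporting. The prevalence and flagging accuracy are invented numbers.

import random

random.seed(0)
exams = [{"abnormal": random.random() < 0.3} for _ in range(1000)]
for e in exams:
    # Imperfect model: ~90% sensitivity and ~10% false-positive rate (illustrative values).
    e["flagged"] = (random.random() < 0.9) if e["abnormal"] else (random.random() < 0.1)

def mean_abnormal_position(queue):
    positions = [i for i, e in enumerate(queue) if e["abnormal"]]
    return sum(positions) / len(positions)

fifo = exams                                             # report in arrival order
triaged = sorted(exams, key=lambda e: not e["flagged"])  # flagged examinations first

print("FIFO   :", round(mean_abnormal_position(fifo), 1))
print("Triaged:", round(mean_abnormal_position(triaged), 1))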
Subjects
Deep Learning; Diffusion Magnetic Resonance Imaging; Hospitals; Humans; Magnetic Resonance Imaging/methods; Triage/methods
ABSTRACT
Convolutional neural networks (CNN) can accurately predict chronological age in healthy individuals from structural MRI brain scans. Potentially, these models could be applied during routine clinical examinations to detect deviations from healthy ageing, including early-stage neurodegeneration. This could have important implications for patient care, drug development, and optimising MRI data collection. However, existing brain-age models are typically optimised for scans which are not part of routine examinations (e.g., volumetric T1-weighted scans), generalise poorly (e.g., to data from different scanner vendors and hospitals), or rely on computationally expensive pre-processing steps which limit real-time clinical utility. Here, we sought to develop a brain-age framework suitable for use during routine clinical head MRI examinations. Using a deep learning-based neuroradiology report classifier, we generated a dataset of 23,302 'radiologically normal for age' head MRI examinations from two large UK hospitals for model training and testing (age range = 18-95 years), and demonstrate fast (< 5 s), accurate (mean absolute error [MAE] < 4 years) age prediction from clinical-grade, minimally processed axial T2-weighted and axial diffusion-weighted scans, with generalisability between hospitals and scanner vendors (ΔMAE < 1 year). The clinical relevance of these brain-age predictions was tested using 228 patients whose MRIs were reported independently by neuroradiologists as showing atrophy 'excessive for age'. These patients had systematically higher brain-predicted age than chronological age (mean predicted age difference = +5.89 years, 'radiologically normal for age' mean predicted age difference = +0.05 years, p < 0.0001). Our brain-age framework demonstrates feasibility for use as a screening tool during routine hospital examinations to automatically detect older-appearing brains in real-time, with relevance for clinical decision-making and optimising patient pathways.
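As a numerical sketch of the quantities above (mean absolute error and the brain-predicted age difference between groups), the snippet below uses synthetic ages whose group sizes and offsets merely echo the figures quoted in the abstract; a Mann-Whitney U test is used purely for illustration and is not necessarily the test used in the study.

import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Synthetic 'radiologically normal for age' group: predictions scatter around chronological age.
chron_normal = rng.uniform(18, 95, 500)
pred_normal = chron_normal + rng.normal(0.05, 4.0, 500)

# Synthetic 'atrophy excessive for age' group: predictions systematically older.
chron_atrophy = rng.uniform(40, 95, 228)
pred_atrophy = chron_atrophy + rng.normal(5.9, 4.0, 228)

mae = np.mean(np.abs(pred_normal - chron_normal))
pad_normal = pred_normal - chron_normal      # brain-predicted age difference
pad_atrophy = pred_atrophy - chron_atrophy

stat, p = mannwhitneyu(pad_atrophy, pad_normal, alternative="greater")
print(f"MAE (normal group): {mae:.2f} years")
print(f"Mean predicted age difference: normal {pad_normal.mean():+.2f} y, atrophy {pad_atrophy.mean():+.2f} y, p = {p:.2e}")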
Subjects
Aging; Brain/diagnostic imaging; Human Development; Magnetic Resonance Imaging; Neuroimaging; Adolescent; Adult; Age Factors; Aged; Aged, 80 and over; Aging/pathology; Aging/physiology; Deep Learning; Human Development/physiology; Humans; Magnetic Resonance Imaging/methods; Magnetic Resonance Imaging/standards; Middle Aged; Neuroimaging/methods; Neuroimaging/standards; Young Adult
ABSTRACT
OBJECTIVES: The purpose of this study was to build a deep learning model to derive labels from neuroradiology reports and assign these to the corresponding examinations, overcoming a bottleneck to computer vision model development. METHODS: Reference-standard labels were generated by a team of neuroradiologists for model training and evaluation. Three thousand examinations were labelled for the presence or absence of any abnormality by manually scrutinising the corresponding radiology reports ('reference-standard report labels'); a subset of these examinations (n = 250) was assigned 'reference-standard image labels' by interrogating the actual images. Separately, 2,000 reports were labelled for the presence or absence of 7 specialised categories of abnormality (acute stroke, mass, atrophy, vascular abnormality, small vessel disease, white matter inflammation, encephalomalacia), with a subset of these examinations (n = 700) also assigned reference-standard image labels. A deep learning model was trained using labelled reports and validated in two ways: comparing predicted labels to (i) reference-standard report labels and (ii) reference-standard image labels. The area under the receiver operating characteristic curve (AUC-ROC) was used to quantify model performance. Accuracy, sensitivity, specificity, and F1 score were also calculated. RESULTS: Accurate classification (AUC-ROC > 0.95) was achieved for all categories when tested against reference-standard report labels. A drop in performance (ΔAUC-ROC > 0.02) was seen for three categories (atrophy, encephalomalacia, vascular) when tested against reference-standard image labels, highlighting discrepancies in the original reports. Once trained, the model assigned labels to 121,556 examinations in under 30 min. CONCLUSIONS: Our model accurately classifies head MRI examinations, enabling automated dataset labelling for downstream computer vision applications. KEY POINTS: • Deep learning is poised to revolutionise image recognition tasks in radiology; however, a barrier to clinical adoption is the difficulty of obtaining large labelled datasets for model training. • We demonstrate a deep learning model which can derive labels from neuroradiology reports and assign these to the corresponding examinations at scale, facilitating the development of downstream computer vision models. • We rigorously tested our model by comparing labels predicted on the basis of neuroradiology reports with two sets of reference-standard labels: (1) labels derived by manually scrutinising each radiology report and (2) labels derived by interrogating the actual images.
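For completeness, the snippet below shows how the validation metrics named above (AUC-ROC, accuracy, sensitivity, specificity, F1 score) can be computed with scikit-learn; the labels and model scores are synthetic stand-ins for the reference-standard data.

import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_auc_score

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 300)                                    # reference-standard labels
scores = np.clip(y_true * 0.7 + rng.normal(0.15, 0.2, 300), 0, 1)   # model-predicted probabilities
y_pred = (scores >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"AUC-ROC     : {roc_auc_score(y_true, scores):.3f}")
print(f"Accuracy    : {accuracy_score(y_true, y_pred):.3f}")
print(f"Sensitivity : {tp / (tp + fn):.3f}")
print(f"Specificity : {tn / (tn + fp):.3f}")
print(f"F1 score    : {f1_score(y_true, y_pred):.3f}")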