ABSTRACT
Medical imaging research is often limited by data scarcity and restricted data availability. Governance, privacy concerns and the cost of acquisition all restrict access to medical imaging data, which, compounded by the data-hungry nature of deep learning algorithms, limits progress in the field of healthcare AI. Generative models have recently been used to synthesize photorealistic natural images, presenting a potential solution to the data scarcity problem. But are current generative models synthesizing morphologically correct samples? In this work we present a three-dimensional generative model of the human brain that is trained at the necessary scale to generate diverse, realistic-looking, high-resolution and morphology-preserving samples, conditioned on patient characteristics (for example, age and pathology). We show that the synthetic samples generated by the model preserve biological and disease phenotypes and are realistic enough to permit use downstream in well-established image analysis tools. While the proposed model has broad future applicability, such as anomaly detection and learning under limited data, its generative capabilities can be used to directly mitigate data scarcity and limited data availability and to improve algorithmic fairness.
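As a rough, illustrative sketch of the conditioning idea described above (not the paper's actual architecture), the toy PyTorch module below shows how patient covariates such as age and pathology status can be embedded and injected into a 3D generator; every layer size and name here is a hypothetical placeholder.

import torch
import torch.nn as nn

class ToyConditionalGenerator(nn.Module):
    # Toy stand-in for a large-scale conditional 3D generative model.
    def __init__(self, latent_dim=64, cond_dim=2, vol=16):
        super().__init__()
        self.embed_cond = nn.Linear(cond_dim, latent_dim)  # embed covariates
        self.decode = nn.Sequential(
            nn.Linear(latent_dim * 2, 256),
            nn.ReLU(),
            nn.Linear(256, vol ** 3),  # flattened toy 3D volume
        )
        self.vol = vol

    def forward(self, z, cond):
        # Concatenate noise with the covariate embedding, then decode.
        c = self.embed_cond(cond)
        x = self.decode(torch.cat([z, c], dim=-1))
        return x.view(-1, 1, self.vol, self.vol, self.vol)

g = ToyConditionalGenerator()
z = torch.randn(4, 64)
cond = torch.tensor([[0.65, 0.0]] * 4)  # e.g. normalised age, pathology flag
volumes = g(z, cond)                    # -> (4, 1, 16, 16, 16)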
ABSTRACT
The last few years have seen a boom in using generative models to augment real datasets, as synthetic data can effectively model real data distributions and provide privacy-preserving, shareable datasets that can be used to train deep learning models. However, most of these methods are 2D and provide synthetic datasets that come, at most, with categorical annotations. The generation of paired images and segmentation samples that can be used in downstream, supervised segmentation tasks remains fairly uncharted territory. This work proposes a two-stage generative model capable of producing 2D and 3D semantic label maps and corresponding multi-modal images. We use a latent diffusion model for label synthesis and a VAE-GAN for semantic image synthesis. Synthetic datasets provided by this model are shown to work in a wide variety of segmentation tasks, supporting small, real datasets or fully replacing them while maintaining good performance. We also demonstrate its ability to improve downstream performance on out-of-distribution data.
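For intuition, a minimal sketch of the two-stage sampling pipeline follows, assuming trained label_diffusion and image_vae_gan models with the hypothetical interfaces shown; the real interfaces will differ.

import torch

@torch.no_grad()
def sample_pair(label_diffusion, image_vae_gan, num_modalities=2):
    # Stage 1: the latent diffusion model synthesises a semantic label map.
    label_map = label_diffusion.sample()  # e.g. (1, n_classes, H, W)
    # Stage 2: the VAE-GAN translates the label map into paired images,
    # one per requested modality (e.g. T1- and T2-weighted MRI).
    images = [image_vae_gan.generate(label_map, modality=m)
              for m in range(num_modalities)]
    return label_map, images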
Subject(s)
Magnetic Resonance Imaging, Humans, Magnetic Resonance Imaging/methods, Brain/diagnostic imaging, Deep Learning, Multimodal Imaging/methods, Algorithms, Three-Dimensional Imaging/methods, Computer-Assisted Image Interpretation/methods, Computer-Assisted Image Processing/methods
ABSTRACT
Any clinically deployed image-processing pipeline must be robust to the full range of inputs it may be presented with. One popular approach to this challenge is to develop predictive models that can provide a measure of their uncertainty. Another approach is to use generative modelling to quantify the likelihood of inputs. Inputs with a sufficiently low likelihood are deemed to be out-of-distribution and are not presented to the downstream predictive model. In this work, we evaluate several approaches to segmentation with uncertainty for the task of segmenting bleeds in 3D CT of the head. We show that these models can fail catastrophically when operating in the far out-of-distribution domain, often providing predictions that are both highly confident and wrong. We propose instead to perform out-of-distribution detection using the Latent Transformer Model: a VQ-GAN is used to provide a highly compressed latent representation of the input volume, and a transformer is then used to estimate the likelihood of this compressed representation. We demonstrate that this approach can identify images that are both far- and near-out-of-distribution, as well as provide spatial maps that highlight the regions considered to be out-of-distribution. Furthermore, we find a strong relationship between an image's likelihood and the quality of a model's segmentation on it, demonstrating that this approach is viable for filtering out unsuitable images.
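To make the likelihood test concrete, here is a minimal sketch assuming a trained VQ-GAN tokenizer and an autoregressive transformer over its codebook indices; the transformer interface is a hypothetical stand-in. Low total log-likelihood flags an input as out-of-distribution, and the per-token scores can be unflattened into a spatial OOD map.

import torch
import torch.nn.functional as F

@torch.no_grad()
def sequence_log_likelihood(transformer, tokens):
    # tokens: (B, L) integer codebook indices from the VQ-GAN encoder.
    logits = transformer(tokens[:, :-1])  # next-token logits, (B, L-1, V)
    logp = F.log_softmax(logits, dim=-1)
    tok_logp = logp.gather(-1, tokens[:, 1:].unsqueeze(-1)).squeeze(-1)
    return tok_logp.sum(dim=-1), tok_logp  # image-level and token-level

def is_out_of_distribution(transformer, tokens, threshold):
    total, per_token = sequence_log_likelihood(transformer, tokens)
    return total < threshold, per_token  # flag, plus scores for a spatial map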
Subject(s)
Computer-Assisted Image Processing, Humans, Probability, Uncertainty
ABSTRACT
BACKGROUND: The prediction of long-term mortality following acute illness can be unreliable for older patients, inhibiting the delivery of targeted clinical interventions. The difficulty plausibly arises from the complex, multifactorial nature of the underlying biology in this population, which flexible, multimodal models based on machine learning may overcome. Here, we test this hypothesis by quantifying the comparative predictive fidelity of such models in a large consecutive sample of older patients acutely admitted to hospital, and characterise their biological support. METHODS: A set of 804 admission episodes involving 616 unique patients with a mean age of 84.5 years consecutively admitted to the Acute Geriatric service at University College Hospital was identified, in whom clinical diagnoses, blood tests, cognitive status, computed tomography of the head, and mortality within 600 days after admission were available. We trained, and evaluated out of sample, an array of predictive models based on extreme gradient-boosted trees, incorporating incrementally greater numbers of investigational modalities and modelled features. Both linear and non-linear associations with investigational features were quantified. RESULTS: Predictive models of mortality showed progressively increasing fidelity with greater numbers of modelled modalities and dimensions. The area under the receiver operating characteristic curve rose from 0.67 (sd = 0.078) for age and sex to 0.874 (sd = 0.046) for the most comprehensive model. Extracranial bone and soft-tissue features contributed more than intracranial features towards long-term mortality prediction. The anterior cingulate and angular gyri, and serum albumin, were the greatest intracranial and biochemical model contributors respectively. CONCLUSIONS: High-dimensional, multimodal predictive models of mortality based on routine clinical data offer higher predictive fidelity than simpler models, facilitating individual-level prognostication and interventional targeting. The joint contributions of both extracranial and intracranial features highlight the potential importance of optimising somatic as well as neural functions in healthy ageing. Our findings suggest a promising path towards a high-fidelity, multimodal index of frailty.
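A toy sketch of the modelling recipe (gradient-boosted trees over incrementally richer feature sets, scored by out-of-sample AUC) is given below; the synthetic features and labels are purely illustrative, not the study's data.

import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 804
X_demo = rng.normal(size=(n, 2))                         # age, sex only
X_full = np.hstack([X_demo, rng.normal(size=(n, 50))])   # + bloods, imaging
y = rng.integers(0, 2, size=n)                           # 600-day mortality

for name, X in [("age+sex", X_demo), ("comprehensive", X_full)]:
    model = XGBClassifier(n_estimators=200, eval_metric="logloss")
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC {auc.mean():.3f} (sd {auc.std():.3f})")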
Subject(s)
Frailty, Hospitalization, Humans, Aged, Aged 80 and Over, ROC Curve, Frailty/diagnosis, Retrospective Studies, Hospital Mortality
ABSTRACT
An important goal of medical imaging is to be able to precisely detect patterns of disease specific to individual scans; however, this is challenged in brain imaging by the degree of heterogeneity of shape and appearance. Traditional methods, based on image registration, historically fail to detect variable features of disease, as they rely on population-based analyses suited primarily to studying group-average effects. In this paper we therefore take advantage of recent developments in generative deep learning to develop a method for simultaneous classification, or regression, and feature attribution (FA). Specifically, we explore the use of a VAE-GAN (variational autoencoder-generative adversarial network) translation framework called ICAM to explicitly disentangle class-relevant features from background confounds, for improved interpretability and regression of neurological phenotypes. We validate our method on the tasks of Mini-Mental State Examination (MMSE) cognitive test score prediction for the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort, as well as brain age prediction, for both neurodevelopment and neurodegeneration, using the developing Human Connectome Project (dHCP) and UK Biobank datasets. We show that the generated FA maps can be used to explain outlier predictions and demonstrate that the inclusion of a regression module improves the disentanglement of the latent space. Our code is freely available on GitHub: https://github.com/CherBass/ICAM.
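One plausible reading of how a translation-based FA map is produced is sketched below; the encode/decode interface is a hypothetical simplification of ICAM (see the GitHub repository above for the actual implementation).

import torch

@torch.no_grad()
def feature_attribution_map(model, image, target_class):
    # Encode into class-irrelevant content (background confounds) ...
    content, _ = model.encode(image)
    # ... decode with a swapped class code, then take the voxel-wise
    # difference as the feature attribution map.
    translated = model.decode(content, target_class)
    return (translated - image).abs()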
Subject(s)
Connectome, Neuroimaging, Humans, Neuroimaging/methods, Brain/diagnostic imaging, Radionuclide Imaging
ABSTRACT
Anomaly detection and segmentation pose important tasks across sectors ranging from medical image analysis to industrial quality control. However, current unsupervised approaches require training data that contain no anomalies, a requirement that can be especially challenging in many medical imaging scenarios. In this paper, we propose Iterative Latent Token Masking, a self-supervised framework derived from a robust-statistics point of view, translating iterative model fitting with M-estimators to the task of anomaly detection. This allows unsupervised methods to be trained on datasets heavily contaminated with anomalous images. Our method builds on prior work using Transformers, combined with a Vector Quantized-Variational Autoencoder, for anomaly detection, an approach with state-of-the-art performance when trained on normal (non-anomalous) data. More importantly, we exploit the token-masking capabilities of Transformers to filter suspected anomalous tokens out of each sample's sequence in the training set in an iterative self-supervised process, thus overcoming the difficulties of highly anomalous training data. Our work also highlights shortfalls in current state-of-the-art self-supervised, self-trained and unsupervised models when faced with small proportions of anomalous training data. We evaluate our method on whole-body PET data, in addition to showing its wider applicability to more common computer vision tasks such as the industrial MVTec Dataset. Across varying levels of anomalous training data, our method showcases superior performance over several state-of-the-art models, drawing attention to the potential of this approach.
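Schematically, the iterative filtering loop could look like the sketch below, where train_transformer and token_log_likelihoods are hypothetical helpers with the signatures shown: tokens whose likelihood falls in the lowest quantile are masked out of the training sequences before the next refit.

import torch

def iterative_latent_token_masking(sequences, n_iters=3, quantile=0.05):
    # sequences: (B, L) VQ-VAE token indices; mask entries True = keep.
    mask = torch.ones_like(sequences, dtype=torch.bool)
    model = None
    for _ in range(n_iters):
        model = train_transformer(sequences, mask)      # fit on kept tokens
        logp = token_log_likelihoods(model, sequences)  # (B, L) scores
        thresh = torch.quantile(logp[mask], quantile)
        mask = logp >= thresh   # drop suspected anomalous tokens, refit
    return model, mask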
ABSTRACT
Cancer is a highly heterogeneous condition best visualised in positron emission tomography. Due to this heterogeneity, a general-purpose cancer detection model can be built using unsupervised anomaly detection models. While prior work in this field has showcased the efficacy of abnormality detection methods (e.g. Transformer-based approaches), these have shown significant vulnerabilities to differences in data geometry. Changes in image resolution or observed field of view can result in inaccurate predictions, even with significant data pre-processing and augmentation. We propose a new spatial conditioning mechanism that enables models to adapt to and learn from varying data geometries, and apply it to a state-of-the-art Vector-Quantized Variational Autoencoder + Transformer abnormality detection model. We show that this spatial conditioning mechanism yields statistically significant improvements in model performance on whole-body data compared to the same model without conditioning, while allowing the model to perform inference at varying data geometries.
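The abstract does not spell out the mechanism, but one simple way to realise spatial conditioning is sketched below: each token's physical coordinates (derived from resolution and field of view) are embedded and added to the token embeddings before the transformer. All details here are assumptions.

import torch
import torch.nn as nn

class SpatialConditioning(nn.Module):
    # Hypothetical additive conditioning on physical token positions.
    def __init__(self, d_model=256):
        super().__init__()
        self.proj = nn.Linear(3, d_model)  # (x, y, z) in scanner space

    def forward(self, token_emb, coords):
        # token_emb: (B, L, d_model); coords: (B, L, 3) physical positions.
        return token_emb + self.proj(coords)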
ABSTRACT
Pathological brain appearances may be so heterogeneous as to be intelligible only as anomalies, defined by their deviation from normality rather than any specific set of pathological features. Amongst the hardest tasks in medical imaging, detecting such anomalies requires models of the normal brain that combine compactness with the expressivity of the complex, long-range interactions that characterise its structural organisation. These are requirements transformers have arguably greater potential to satisfy than other current candidate architectures, but their application has been inhibited by their demands on data and computational resources. Here we combine the latent representation of vector quantised variational autoencoders with an ensemble of autoregressive transformers to enable unsupervised anomaly detection and segmentation defined by deviation from healthy brain imaging data, achievable at low computational cost, within relatively modest data regimes. We compare our method to current state-of-the-art approaches across a series of experiments with 2D and 3D data involving synthetic and real pathological lesions. On real lesions, we train our models on 15,000 radiologically normal participants from UK Biobank and evaluate performance on four different brain MR datasets with small vessel disease, demyelinating lesions, and tumours. We demonstrate superior anomaly detection performance both image-wise and pixel/voxel-wise, achievable without post-processing. These results draw attention to the potential of transformers in this most challenging of imaging tasks.
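A condensed sketch of the detection pipeline follows, assuming a trained vqvae and an ensemble of autoregressive transformers with the hypothetical interfaces shown: low-likelihood tokens are replaced with likely ones ("healed"), and the residual between the input and its healed reconstruction yields the anomaly map.

import torch

@torch.no_grad()
def anomaly_map(vqvae, transformers, image, logp_threshold):
    tokens = vqvae.encode_to_indices(image)       # (B, L) codebook indices
    maps = []
    for tr in transformers:                       # ensemble members
        logp = tr.token_log_likelihoods(tokens)   # (B, L) per-token scores
        healed = tokens.clone()
        low = logp < logp_threshold               # suspected anomalous tokens
        healed[low] = tr.resample(tokens, low)    # replace with likely tokens
        recon = vqvae.decode_from_indices(healed)
        maps.append((image - recon).abs())        # residual anomaly map
    return torch.stack(maps).mean(dim=0)          # ensemble average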
Subject(s)
Brain Diseases, Brain, Brain/diagnostic imaging, Humans, Neuroimaging
ABSTRACT
Cancers can have highly heterogeneous uptake patterns best visualised in positron emission tomography. Detecting these patterns is essential to diagnose, stage and predict the evolution of cancer. Due to this heterogeneity, a general-purpose cancer detection model can be built using unsupervised anomaly detection models; these models learn a healthy representation of tissue and detect cancer by predicting deviations from healthy appearances. This task alone requires models capable of accurately learning long-range interactions between organs, imaging patterns and other abstract features with high levels of expressivity. Transformers suitably satisfy these requirements and have been shown to generate state-of-the-art results in unsupervised anomaly detection when trained on healthy data. This work expands upon such approaches by introducing multi-modal conditioning of the transformer via cross-attention, i.e. supplying anatomical reference information from paired CT images to aid the PET anomaly detection task. Using 83 whole-body PET/CT samples containing various cancer types, we show that our anomaly detection method is robust and capable of achieving accurate cancer localisation results even in cases where healthy training data is unavailable. Furthermore, the proposed model uncertainty, in conjunction with a kernel density estimation approach, is shown to provide a statistically robust alternative to residual-based anomaly maps. Overall, superior performance is demonstrated against leading alternatives, drawing attention to the potential of these approaches.
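As a minimal sketch of the cross-attention conditioning (layer sizes illustrative, not the paper's configuration), PET token embeddings attend to paired CT token embeddings, injecting anatomical context into the PET model.

import torch
import torch.nn as nn

class CrossModalBlock(nn.Module):
    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, pet_tokens, ct_tokens):
        # Queries come from PET; keys/values from the CT anatomical reference.
        ctx, _ = self.attn(pet_tokens, ct_tokens, ct_tokens)
        return self.norm(pet_tokens + ctx)

block = CrossModalBlock()
pet = torch.randn(2, 128, 256)   # (batch, PET tokens, embedding dim)
ct = torch.randn(2, 128, 256)    # (batch, CT tokens, embedding dim)
out = block(pet, ct)             # -> (2, 128, 256)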