Pesquisa | BVS CLAP/SMR-OPAS/OMS

Assessing the capacity of a denoising diffusion probabilistic model to reproduce spatial context.

Deshpande, Rucha; Ozbey, Muzaffer; Li, Hua; Anastasio, Mark A; Brooks, Frank J.

IEEE Trans Med Imaging ; PP2024 Jun 14.

Artigo em Inglês | MEDLINE | ID: mdl-38875086

RESUMO

Diffusion models have emerged as a popular family of deep generative models (DGMs). In the literature, it has been claimed that one class of diffusion models-denoising diffusion probabilistic models (DDPMs)-demonstrate superior image synthesis performance as compared to generative adversarial networks (GANs). To date, these claims have been evaluated using either ensemble-based methods designed for natural images, or conventional measures of image quality such as structural similarity. However, there remains an important need to understand the extent to which DDPMs can reliably learn medical imaging domain-relevant information, which is referred to as 'spatial context' in this work. To address this, a systematic assessment of the ability of DDPMs to learn spatial context relevant to medical imaging applications is reported for the first time. A key aspect of the studies is the use of stochastic context models (SCMs) to produce training data. In this way, the ability of the DDPMs to reliably reproduce spatial context can be quantitatively assessed by use of post-hoc image analyses. Error-rates in DDPM-generated ensembles are reported, and compared to those corresponding to other modern DGMs. The studies reveal new and important insights regarding the capacity of DDPMs to learn spatial context. Notably, the results demonstrate that DDPMs hold significant capacity for generating contextually correct images that are 'interpolated' between training samples, which may benefit data-augmentation tasks in ways that GANs cannot.

Report on the AAPM Grand Challenge on deep generative modeling for learning medical image statistics.

Deshpande, Rucha; Kelkar, Varun A; Gotsis, Dimitrios; Kc, Prabhat; Zeng, Rongping; Myers, Kyle J; Brooks, Frank J; Anastasio, Mark A.

ArXiv ; 2024 May 03.

Artigo em Inglês | MEDLINE | ID: mdl-38745699

RESUMO

Background: The findings of the 2023 AAPM Grand Challenge on Deep Generative Modeling for Learning Medical Image Statistics are reported in this Special Report. Purpose: The goal of this challenge was to promote the development of deep generative models for medical imaging and to emphasize the need for their domain-relevant assessments via the analysis of relevant image statistics. Methods: As part of this Grand Challenge, a common training dataset and an evaluation procedure was developed for benchmarking deep generative models for medical image synthesis. To create the training dataset, an established 3D virtual breast phantom was adapted. The resulting dataset comprised about 108,000 images of size 512×512. For the evaluation of submissions to the Challenge, an ensemble of 10,000 DGM-generated images from each submission was employed. The evaluation procedure consisted of two stages. In the first stage, a preliminary check for memorization and image quality (via the Fréchet Inception Distance (FID)) was performed. Submissions that passed the first stage were then evaluated for the reproducibility of image statistics corresponding to several feature families including texture, morphology, image moments, fractal statistics and skeleton statistics. A summary measure in this feature space was employed to rank the submissions. Additional analyses of submissions was performed to assess DGM performance specific to individual feature families, the four classes in the training data, and also to identify various artifacts. Results: Fifty-eight submissions from 12 unique users were received for this Challenge. Out of these 12 submissions, 9 submissions passed the first stage of evaluation and were eligible for ranking. The top-ranked submission employed a conditional latent diffusion model, whereas the joint runners-up employed a generative adversarial network, followed by another network for image superresolution. In general, we observed that the overall ranking of the top 9 submissions according to our evaluation method (i) did not match the FID-based ranking, and (ii) differed with respect to individual feature families. Another important finding from our additional analyses was that different DGMs demonstrated similar kinds of artifacts. Conclusions: This Grand Challenge highlighted the need for domain-specific evaluation to further DGM design as well as deployment. It also demonstrated that the specification of a DGM may differ depending on its intended use.

RESUMO

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA