Results 1 - 20 of 2,253
1.
Front Oncol ; 14: 1400341, 2024.
Article in English | MEDLINE | ID: mdl-39091923

ABSTRACT

Brain tumors occur due to the expansion of abnormal cell tissue and can be malignant (cancerous) or benign (non-cancerous). Numerous factors, such as position, size, and progression rate, are considered when detecting and diagnosing brain tumors. Detecting brain tumors in their initial phases is vital for diagnosis, and MRI (magnetic resonance imaging) scans play an important role here. Over the years, deep learning models have been extensively used for medical image processing. The current study primarily investigates novel Fine-Tuned Vision Transformer models (FTVTs), namely FTVT-b16, FTVT-b32, FTVT-l16, and FTVT-l32, for brain tumor classification, while also comparing them with established deep learning models such as ResNet-50, MobileNet-V2, and EfficientNet-B0. A dataset of 7,023 MRI scans categorized into four classes, namely glioma, meningioma, pituitary, and no tumor, is used for classification. Further, the study presents a comparative analysis of these models, covering accuracy and other evaluation metrics, including recall, precision, and F1-score, for each class. ResNet-50, EfficientNet-B0, and MobileNet-V2 obtained accuracies of 96.5%, 95.1%, and 94.9%, respectively. Among the FTVT models, FTVT-l16 achieved a remarkable accuracy of 98.70%, while FTVT-b16, FTVT-b32, and FTVT-l32 achieved 98.09%, 96.87%, and 98.62%, respectively, demonstrating the efficacy and robustness of FTVTs in medical image processing.
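As a concrete illustration of the fine-tuning recipe this abstract describes, here is a minimal sketch that adapts an ImageNet-pretrained ViT-B/16 from torchvision to the four-class brain tumor task. It is a stand-in for the FTVT models, not the authors' code; the optimizer, learning rate, and head replacement are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# ImageNet-pretrained ViT-B/16 backbone as a stand-in for FTVT-b16.
vit = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)

# Replace the classification head for the four dataset classes:
# glioma, meningioma, pituitary, and no tumor.
num_classes = 4
vit.heads.head = nn.Linear(vit.heads.head.in_features, num_classes)

# Fine-tune end to end (hyperparameters are illustrative, not the paper's).
optimizer = torch.optim.AdamW(vit.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch of 224x224 RGB MRI slices."""
    vit.train()
    optimizer.zero_grad()
    loss = criterion(vit(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```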

2.
Comput Biol Med ; 180: 108947, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39094324

ABSTRACT

Recently, ViTs and CNNs based on the encoder-decoder architecture have become the dominant models in medical image segmentation. However, each has deficiencies: (1) CNNs struggle to capture interactions between two distant locations. (2) ViTs cannot acquire local context information and carry high computational complexity. To address these deficiencies, we propose FCSU-Net, a new network for medical image segmentation. FCSU-Net uses the proposed collaborative fusion of multi-scale feature blocks, enabling the network to obtain richer and more accurate features. In addition, FCSU-Net fuses full-scale feature information through a Full-scale Feature Fusion (FFF) structure instead of simple skip connections, and establishes long-range dependencies across multiple dimensions through a Cross-dimension Self-attention (CS) mechanism, in which the dimensions complement one another. The CS mechanism also retains convolutions' advantage of capturing local contextual weights. Finally, FCSU-Net is validated on several datasets; the results show that it not only has relatively few parameters but also delivers leading segmentation performance.

3.
Comput Biol Med ; 180: 108944, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39096609

ABSTRACT

BACKGROUND: A single learning algorithm can produce deep learning-based image segmentation models that vary in performance purely due to random effects during training. This study assessed the effect of these random performance fluctuations on the reliability of standard methods of comparing segmentation models. METHODS: The influence of random effects during training was assessed by running a single learning algorithm (nnU-Net) with 50 different random seeds for three multiclass 3D medical image segmentation problems: brain tumour, hippocampus, and cardiac segmentation. Recent literature was sampled to find the most common methods for estimating and comparing the performance of deep learning segmentation models. Based on this, segmentation performance was assessed using both hold-out validation and 5-fold cross-validation, and the statistical significance of performance differences was measured using the paired t-test and the Wilcoxon signed-rank test on Dice scores. RESULTS: For the different segmentation problems, the seed producing the highest mean Dice score statistically significantly outperformed between 0% and 76% of the remaining seeds when estimating performance using hold-out validation, and between 10% and 38% when using 5-fold cross-validation. CONCLUSION: Random effects during training can cause high rates of statistically significant performance differences between segmentation models from the same learning algorithm. Whilst statistical testing is widely used in contemporary literature, our results indicate that a statistically significant difference in segmentation performance is a weak and unreliable indicator of a true performance difference between two learning algorithms.
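The significance testing described in the methods is easy to reproduce with SciPy. The sketch below compares per-case Dice scores of two training runs (seeds) with a paired t-test and a Wilcoxon signed-rank test; the toy data, sample size, and alpha level are illustrative assumptions, not the study's setup.

```python
import numpy as np
from scipy import stats

def compare_seeds(dice_a, dice_b, alpha=0.05):
    """Compare per-case Dice scores of two runs of the same algorithm."""
    dice_a, dice_b = np.asarray(dice_a), np.asarray(dice_b)
    t_stat, p_t = stats.ttest_rel(dice_a, dice_b)   # paired t-test
    w_stat, p_w = stats.wilcoxon(dice_a, dice_b)    # Wilcoxon signed-rank
    return {"paired_t_p": p_t, "wilcoxon_p": p_w,
            "significant_t": p_t < alpha, "significant_w": p_w < alpha}

# Toy example: two seeds whose difference is pure training noise.
rng = np.random.default_rng(0)
base = rng.uniform(0.7, 0.95, size=50)              # 50 hold-out cases
print(compare_seeds(base + rng.normal(0, 0.01, 50),
                    base + rng.normal(0, 0.01, 50)))
```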

4.
Comput Biol Med ; 180: 108933, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39096612

ABSTRACT

Medical image segmentation demands precise accuracy and the capability to assess segmentation uncertainty for informed clinical decision-making. Denoising Diffusion Probabilistic Models (DDPMs), with their advancements in image generation, can treat segmentation as a conditional generation task, providing accurate segmentation and uncertainty estimation. However, current DDPMs used in medical image segmentation suffer from low inference efficiency and prediction errors caused by excessive noise at the end of the forward process. To address this issue, we propose an accelerated denoising diffusion probabilistic model via truncated inverse processes (ADDPM) that is specifically designed for medical image segmentation. The inverse process of ADDPM starts from a non-Gaussian distribution and terminates early once a prediction with relatively low noise is obtained after multiple iterations of denoising. We employ a separate powerful segmentation network to obtain a pre-segmentation and construct the non-Gaussian distribution of the segmentation based on the forward diffusion rule. By further adopting a separate denoising network, the final segmentation can be obtained with just one denoising step from the low-noise predictions. ADDPM greatly reduces the number of denoising steps to approximately one-tenth of that in vanilla DDPMs. Our experiments on four segmentation tasks demonstrate that ADDPM outperforms both vanilla DDPMs and existing representative accelerated DDPM methods. Moreover, ADDPM can be easily integrated with existing advanced segmentation models to improve segmentation performance and provide uncertainty estimation. Implementation code: https://github.com/Guoxt/ADDPM.
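The key trick, starting the reverse process from a noised pre-segmentation rather than pure Gaussian noise, follows the standard forward diffusion rule. Below is a minimal sketch under that reading; the beta schedule, truncation step, and tensor shapes are illustrative assumptions, not ADDPM's actual configuration (see the linked repository for that).

```python
import torch

def diffuse_preseg(pre_seg, t, alphas_cumprod):
    """Noise a pre-segmentation to step t with the standard forward rule
    x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps, giving the
    non-Gaussian start of a truncated reverse process (a sketch)."""
    a_bar = alphas_cumprod[t]
    eps = torch.randn_like(pre_seg)
    return a_bar.sqrt() * pre_seg + (1 - a_bar).sqrt() * eps

# Linear beta schedule over T=1000 steps; truncate the reverse process
# by starting at a moderately noised step instead of pure noise.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

pre_seg = torch.rand(1, 1, 256, 256)   # output of a pre-segmentation net
x_start = diffuse_preseg(pre_seg, t=100, alphas_cumprod=alphas_cumprod)
# A separate denoising network would now map x_start to the final mask
# in a single (or very few) denoising step(s).
```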

5.
Med Image Anal ; 97: 103280, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39096845

ABSTRACT

Medical image segmentation is crucial for healthcare, yet convolution-based methods like U-Net face limitations in modeling long-range dependencies. To address this, Transformers designed for sequence-to-sequence prediction have been integrated into medical image segmentation. However, a comprehensive understanding of Transformers' self-attention within U-Net components is lacking. TransUNet, first introduced in 2021, is widely recognized as one of the first models to integrate Transformers into medical image analysis. In this study, we present the versatile TransUNet framework, which encapsulates Transformers' self-attention into two key modules: (1) a Transformer encoder tokenizing image patches from a convolutional neural network (CNN) feature map, facilitating global context extraction, and (2) a Transformer decoder refining candidate regions through cross-attention between proposals and U-Net features. These modules can be flexibly inserted into the U-Net backbone, resulting in three configurations: Encoder-only, Decoder-only, and Encoder+Decoder. TransUNet provides a library encompassing both 2D and 3D implementations, enabling users to easily tailor the chosen architecture. Our findings highlight the encoder's efficacy in modeling interactions among multiple abdominal organs and the decoder's strength in handling small targets like tumors. TransUNet excels in diverse medical applications, such as multi-organ segmentation, pancreatic tumor segmentation, and hepatic vessel segmentation. Notably, it achieves significant average Dice improvements of 1.06% and 4.30% for multi-organ segmentation and pancreatic tumor segmentation, respectively, compared to the highly competitive nnU-Net, and surpasses the top-1 solution in the BraTS2021 challenge. 2D and 3D code and models are available at https://github.com/Beckschen/TransUNet and https://github.com/Beckschen/TransUNet-3D, respectively.
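To make the encoder module concrete, the sketch below tokenizes a CNN feature map into patch embeddings and runs global self-attention over them, which is the gist of configuration (1). Layer sizes, depth, and the 1x1-conv projection are illustrative assumptions rather than the released TransUNet configuration.

```python
import torch
import torch.nn as nn

class CNNTokenTransformer(nn.Module):
    """Minimal sketch of the encoder idea: tokenize a CNN feature map
    into embeddings and apply a Transformer encoder (sizes illustrative)."""
    def __init__(self, in_ch=512, embed_dim=256, depth=4, heads=8):
        super().__init__()
        # 1x1 conv projects each spatial position into a token embedding.
        self.proj = nn.Conv2d(in_ch, embed_dim, kernel_size=1)
        layer = nn.TransformerEncoderLayer(embed_dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, feat):                     # feat: (B, C, H, W), square
        tokens = self.proj(feat).flatten(2).transpose(1, 2)  # (B, H*W, D)
        tokens = self.encoder(tokens)            # global self-attention
        B, N, D = tokens.shape
        side = int(N ** 0.5)
        return tokens.transpose(1, 2).reshape(B, D, side, side)  # map again

feat = torch.randn(2, 512, 14, 14)               # e.g., ResNet stage output
out = CNNTokenTransformer()(feat)                # (2, 256, 14, 14)
```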

6.
Neural Netw ; 178: 106546, 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39053196

ABSTRACT

Current state-of-the-art medical image segmentation techniques predominantly employ the encoder-decoder architecture. Despite its widespread use, this U-shaped framework exhibits limitations in effectively capturing multi-scale features through simple skip connections. In this study, we thoroughly analyze the potential weaknesses of skip connections across various segmentation tasks and identify two key semantic gaps that must be considered: the gap among multi-scale features in different encoding stages and the gap between the encoder and the decoder. To bridge these semantic gaps, we introduce a novel segmentation framework that incorporates a Dual Attention Transformer (DAT) module for capturing channel-wise and spatial-wise relationships and a Decoder-guided Recalibration Attention module for fusing DAT tokens and decoder features. These modules establish a principle of learnable connections that resolves the semantic gaps, leading to a high-performance segmentation model for medical images. Furthermore, the framework provides a new paradigm for effectively incorporating the attention mechanism into the traditional convolution-based architecture. Comprehensive experimental results demonstrate that our model achieves consistent, significant gains and outperforms state-of-the-art methods with relatively fewer parameters. This study contributes to the advancement of medical image segmentation by offering a more effective and efficient framework for addressing the limitations of current encoder-decoder architectures. Code: https://github.com/McGregorWwww/UDTransNet.

7.
Med Image Anal ; 97: 103270, 2024 Jul 14.
Article in English | MEDLINE | ID: mdl-39059241

ABSTRACT

Recently, federated learning has attracted increasing interest in the medical image analysis field due to its ability to aggregate multi-center data with privacy-preserving properties. A large number of federated training schemes have been published, which we categorize into global (one final model), personalized (one model per institution), or hybrid (one model per cluster of institutions) methods. However, their applicability to the recently published Federated Brain Tumor Segmentation 2022 dataset has not yet been explored. We propose an extensive benchmark of federated learning algorithms from all three classes on this task. While standard FedAvg already performs very well, we show that some methods from each category can bring a slight performance improvement and potentially limit the bias of the final model(s) toward the predominant data distribution of the federation. Moreover, we provide a deeper understanding of the behavior of federated learning on this task through alternative ways of distributing the pooled dataset among institutions, namely an independent and identically distributed (IID) setup and a limited-data setup. Our code is available at https://github.com/MatthisManthe/Benchmark_FeTS2022.
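For readers unfamiliar with the FedAvg baseline the benchmark starts from, here is a minimal sketch of its aggregation step, weighting each institution's parameters by its local sample count. The helper name and the float cast are illustrative choices, not code from the linked repository.

```python
import copy
import torch

def fedavg(state_dicts, num_samples):
    """Standard FedAvg aggregation: average each parameter tensor,
    weighted by the institution's local sample count. (Sketch: integer
    buffers are cast to float for simplicity.)"""
    total = float(sum(num_samples))
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = sum(sd[key].float() * (n / total)
                       for sd, n in zip(state_dicts, num_samples))
    return avg

# Usage sketch: after each round, every site trains locally and sends
# back its weights; the server aggregates and redistributes.
# global_model.load_state_dict(
#     fedavg([m.state_dict() for m in local_models], samples_per_site))
```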

8.
Technol Cancer Res Treat ; 23: 15330338241266205, 2024.
Article in English | MEDLINE | ID: mdl-39051534

ABSTRACT

Recently, large language models such as ChatGPT have made huge strides in understanding and generating human-like text and have demonstrated considerable success in natural language processing. These foundation models also perform well in computer vision. However, there is a growing need to apply these technologies to specific medical tasks, especially identifying cancer in images. This paper examines how foundation models, such as the Segment Anything Model, could be used for cancer segmentation, discussing the potential benefits and challenges of applying large foundation models to assist cancer diagnosis.


Subject(s)
Neoplasms , Humans , Neoplasms/diagnostic imaging , Neoplasms/pathology , Natural Language Processing , Algorithms , Image Processing, Computer-Assisted/methods
9.
J Imaging Inform Med ; 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39020154

ABSTRACT

This paper presents an innovative automatic fusion imaging system that combines 3D CT/MR images with real-time ultrasound (US) acquisition. The system eliminates the need for external physical markers and complex training, making image fusion feasible for physicians with different experience levels. The integrated system comprises a portable 3D camera for patient-specific surface acquisition, an electromagnetic tracking system, and US components. The fusion algorithm comprises two main parts, skin segmentation and rigid co-registration, both integrated into the US machine. The co-registration aligns the surface extracted from CT/MR images with the 3D surface acquired by the camera, facilitating rapid and effective fusion. Experimental tests in different settings validate the system's accuracy, computational efficiency, noise robustness, and operator independence.

10.
J Imaging Inform Med ; 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39020158

ABSTRACT

Wound management requires measurement of wound parameters such as shape and area. However, computerized wound analysis suffers from inexact segmentation of wound images due to limited or inaccurate labels. A common scenario is that the source domain provides abundant labeled data while the target domain provides only limited labels. To overcome this, we propose a novel approach that combines self-training and mixup augmentation. The neural network is trained on the source domain and generates weak labels on the target domain via self-training. In the second stage, the generated labels are mixed up with labels from the source domain to retrain the neural network and enhance generalization across diverse datasets. The efficacy of our approach was evaluated using the DFUC 2022, FUSeg, and RMIT datasets, demonstrating substantial improvements in segmentation accuracy and robustness across different data distributions. Specifically, in single-domain experiments, segmentation on the DFUC 2022 dataset achieved a Dice score of 0.711, while the FUSeg dataset reached 0.859. For domain adaptation, when these datasets were used as target datasets, the Dice scores were 0.714 for DFUC 2022 and 0.561 for FUSeg.
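A minimal sketch of the two stages described above: pseudo-labeling the target domain by self-training, then mixing source and generated target labels for retraining. The threshold, the Beta-distribution mixing coefficient, and the sigmoid-output assumption are illustrative, not the paper's exact settings.

```python
import torch

def pseudo_label(model, target_images, threshold=0.5):
    """Stage 1: generate weak labels on the unlabeled target domain."""
    model.eval()
    with torch.no_grad():
        probs = torch.sigmoid(model(target_images))
    return (probs > threshold).float()

def mixup_batch(src_imgs, src_masks, tgt_imgs, tgt_masks, alpha=0.4):
    """Stage 2: mix source labels with generated target labels so the
    retrained network generalizes across domains (alpha illustrative)."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    imgs = lam * src_imgs + (1 - lam) * tgt_imgs
    masks = lam * src_masks + (1 - lam) * tgt_masks   # soft targets
    return imgs, masks
```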

11.
Sensors (Basel) ; 24(13)2024 Jun 21.
Article in English | MEDLINE | ID: mdl-39000834

ABSTRACT

The fusion of multi-modal medical images has great significance for comprehensive diagnosis and treatment. However, the large differences between the various modalities of medical images make multi-modal medical image fusion a great challenge. This paper proposes a novel multi-scale fusion network based on multi-dimensional dynamic convolution and a residual hybrid transformer, which has better capability for feature extraction and context modeling and improves fusion performance. Specifically, the proposed network exploits multi-dimensional dynamic convolution, which introduces four attention mechanisms corresponding to four different dimensions of the convolutional kernel to extract more detailed information. Meanwhile, a residual hybrid transformer is designed that activates more pixels to participate in the fusion process through channel attention, window attention, and overlapping cross-attention, thereby strengthening the long-range dependencies between modalities and enhancing the connection of global context information. A loss function combining perceptual loss and structural similarity loss is designed: the former enhances the visual realism and perceptual details of the fused image, while the latter enables the model to learn structural textures. The whole network adopts a multi-scale architecture and uses an unsupervised end-to-end method to realize multi-modal image fusion. Finally, our method is tested qualitatively and quantitatively on mainstream datasets. The fusion results indicate that our method achieves high scores on most quantitative indicators and satisfactory performance in visual qualitative analysis.
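The two-part objective can be sketched as a VGG-feature perceptual term plus a structural similarity term. The VGG layer cut, the single-scale SSIM approximation via average pooling, and the loss weight are illustrative assumptions; the paper's exact formulation may differ. Inputs are assumed to be 3-channel images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class FusionLoss(nn.Module):
    """Sketch of a perceptual + structural-similarity objective."""
    def __init__(self, ssim_weight=1.0):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = vgg.features[:16].eval()     # up to relu3_3
        for p in self.features.parameters():
            p.requires_grad = False
        self.ssim_weight = ssim_weight

    def ssim(self, x, y, C1=0.01**2, C2=0.03**2):
        # Simplified single-scale SSIM with 11x11 average-pooling windows.
        mu_x = F.avg_pool2d(x, 11, 1, 5)
        mu_y = F.avg_pool2d(y, 11, 1, 5)
        var_x = F.avg_pool2d(x * x, 11, 1, 5) - mu_x ** 2
        var_y = F.avg_pool2d(y * y, 11, 1, 5) - mu_y ** 2
        cov = F.avg_pool2d(x * y, 11, 1, 5) - mu_x * mu_y
        s = ((2 * mu_x * mu_y + C1) * (2 * cov + C2)) / (
            (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2))
        return s.mean()

    def forward(self, fused, reference):
        # Perceptual term: L1 distance in frozen VGG feature space.
        perceptual = F.l1_loss(self.features(fused), self.features(reference))
        structural = 1.0 - self.ssim(fused, reference)
        return perceptual + self.ssim_weight * structural
```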

12.
Sensors (Basel) ; 24(13)2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39001046

ABSTRACT

Retinal vessel segmentation is crucial for diagnosing and monitoring eye diseases such as diabetic retinopathy, glaucoma, and hypertension. In this study, we examine how sharpness-aware minimization (SAM) can improve the generalization performance of RF-UNet, a novel model for retinal vessel segmentation. We focused our experiments on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset, a benchmark for retinal vessel segmentation, and our test results show that adding SAM to the training procedure leads to notable improvements. Compared to the non-SAM model (training loss of 0.45709 and validation loss of 0.40266), the SAM-trained RF-UNet achieved a significant reduction in both training loss (0.094225) and validation loss (0.08053). Furthermore, compared to the non-SAM model (training accuracy of 0.90169 and validation accuracy of 0.93999), the SAM-trained model demonstrated higher training accuracy (0.96225) and validation accuracy (0.96821). The model also performed better in terms of sensitivity, specificity, AUC, and F1 score, indicating improved generalization to unseen data. Our results corroborate the notion that SAM facilitates the learning of flatter minima, thereby improving generalization, and are consistent with other research highlighting the advantages of advanced optimization methods. With wider implications for other medical imaging tasks, these results imply that SAM can successfully reduce overfitting and enhance the robustness of retinal vessel segmentation models. Prospective research avenues include validating the model on larger and more diverse datasets and investigating its practical implementation in real-world clinical settings.
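Sharpness-aware minimization itself is a two-pass update: ascend to the worst-case weights within a small rho-ball, take the gradient there, then step from the original weights. The sketch below is a generic SAM step, not the RF-UNet training code; rho=0.05 is a commonly used default, and the model/loss interfaces are assumptions.

```python
import torch

def sam_step(model, loss_fn, batch, base_optimizer, rho=0.05):
    """One sharpness-aware minimization (SAM) update."""
    images, targets = batch
    model.zero_grad()

    # First pass: gradient at the current weights.
    loss = loss_fn(model(images), targets)
    loss.backward()
    grad_norm = torch.norm(torch.stack([
        p.grad.norm() for p in model.parameters() if p.grad is not None]))

    # Ascend to the worst-case point w + e(w) within the rho-ball.
    eps = {}
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            eps[p] = rho * p.grad / (grad_norm + 1e-12)
            p.add_(eps[p])
    model.zero_grad()

    # Second pass: gradient at the perturbed weights.
    loss_fn(model(images), targets).backward()

    # Restore the original weights, then step with the sharpness-aware grad.
    with torch.no_grad():
        for p, e in eps.items():
            p.sub_(e)
    base_optimizer.step()
    model.zero_grad()
    return loss.item()
```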


Subject(s)
Algorithms , Retinal Vessels , Humans , Retinal Vessels/diagnostic imaging , Image Processing, Computer-Assisted/methods , Diabetic Retinopathy/diagnostic imaging
13.
Sensors (Basel) ; 24(13)2024 Jul 02.
Article in English | MEDLINE | ID: mdl-39001081

ABSTRACT

In clinical settings limited by equipment, lightweight skin lesion segmentation is pivotal, as it facilitates the integration of the model into diverse medical devices, thereby enhancing operational efficiency. However, a lightweight design may suffer accuracy degradation, especially on complex images such as skin lesion images with irregular regions, blurred boundaries, and oversized boundaries. To address these challenges, we propose an efficient lightweight attention network (ELANet) for the skin lesion segmentation task. In ELANet, two different attention mechanisms in the bilateral residual module (BRM) provide complementary information, enhancing sensitivity to features in the spatial and channel dimensions, respectively; multiple BRMs are then stacked for efficient feature extraction from the input. In addition, the network acquires global information and improves segmentation accuracy by passing feature maps of different scales through multi-scale attention fusion (MAF) operations. Finally, we evaluate the performance of ELANet on three publicly available datasets, ISIC2016, ISIC2017, and ISIC2018; the experimental results show that our algorithm achieves mIoU scores of 89.87%, 81.85%, and 82.87% on the three datasets with only 0.459 M parameters, an excellent balance between accuracy and model size that is superior to many existing segmentation methods.


Subject(s)
Algorithms , Neural Networks, Computer , Humans , Image Processing, Computer-Assisted/methods , Skin/diagnostic imaging , Skin/pathology
14.
Sensors (Basel) ; 24(13)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-39001109

ABSTRACT

Elbow computed tomography (CT) scans have been widely applied to describe elbow morphology. To enhance the objectivity and efficiency of clinical diagnosis, this study proposes an automatic method to recognize, segment, and reconstruct elbow joint bones. The method involves three steps: first, the humerus, ulna, and radius are automatically recognized based on the anatomical features of the elbow joint, and prompt boxes are generated. Next, an elbow MedSAM is obtained through transfer learning, which accurately segments the CT images by integrating the prompt boxes; hole-filling and object reclassification steps are then executed to refine the mask. Finally, three-dimensional (3D) reconstruction is conducted using the marching cubes algorithm. To validate the reliability and accuracy of the method, the segmentation results were compared to masks labeled by senior surgeons. Quantitative evaluation revealed median intersection over union (IoU) values of 0.963, 0.959, and 0.950 for the humerus, ulna, and radius, respectively, and the reconstructed surface errors were 1.127, 1.523, and 2.062 mm, respectively. Consequently, the automatic elbow reconstruction method demonstrates promising capabilities for clinical diagnosis, preoperative planning, and intraoperative navigation for elbow joint diseases.
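The final reconstruction step maps directly onto scikit-image's marching cubes implementation. The sketch below extracts a triangulated surface from a binary mask; the toy spherical volume and voxel spacing are illustrative stand-ins for a segmented bone mask.

```python
import numpy as np
from skimage import measure

def reconstruct_bone(mask, spacing=(1.0, 1.0, 1.0)):
    """Extract a 3D surface mesh from a binary segmentation mask with
    the marching cubes algorithm (final step of the pipeline above)."""
    verts, faces, normals, values = measure.marching_cubes(
        mask.astype(np.float32), level=0.5, spacing=spacing)
    return verts, faces

# Toy volume: a sphere standing in for a segmented humerus mask.
z, y, x = np.mgrid[-32:32, -32:32, -32:32]
mask = (x**2 + y**2 + z**2) < 20**2
verts, faces = reconstruct_bone(mask, spacing=(0.5, 0.5, 0.5))
print(verts.shape, faces.shape)
```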


Subject(s)
Algorithms , Elbow Joint , Imaging, Three-Dimensional , Tomography, X-Ray Computed , Humans , Elbow Joint/diagnostic imaging , Tomography, X-Ray Computed/methods , Imaging, Three-Dimensional/methods , Image Processing, Computer-Assisted/methods , Radius/diagnostic imaging , Ulna/diagnostic imaging , Humerus/diagnostic imaging
15.
J Med Signals Sens ; 14: 5, 2024.
Article in English | MEDLINE | ID: mdl-38993207

ABSTRACT

Background: Digital devices can easily forge medical images. Copy-move forgery in medical images has led to abuses in areas where access to advanced medical devices is unavailable, and such forgery directly affects the doctor's decision. The method discussed here is an optimal method for copy-move forgery detection (CMFD) in medical images. Methods: The proposed method is based on an evolutionary algorithm that detects forged blocks well. In the first stage, the image is brought to the signal level with a discrete cosine transform (DCT). It is then prepared for segmentation by applying a discrete wavelet transform (DWT). The low-low band of the DWT, which retains most of the image properties, is divided into blocks. Each block is searched using the equilibrium optimization algorithm, the blocks most likely to be forged are selected, and the final image is generated. Results: The proposed method was evaluated on three criteria, precision, recall, and F1, and obtained 90.07%, 92.34%, and 91.56%, respectively, outperforming the methods studied on medical images. Conclusions: We conclude that our method is more accurate for CMFD in medical images.
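The pre-processing pipeline (DCT to the signal level, DWT, then blocking of the low-low band) can be sketched as follows; the block size and wavelet choice are illustrative assumptions, and the equilibrium optimization search over blocks is omitted.

```python
import numpy as np
import pywt
from scipy.fft import dctn

def candidate_blocks(image, block=8):
    """Sketch of the pre-processing stages described above: DCT the
    image, take the DWT's low-low band (which keeps most image
    properties), and split it into blocks for the search stage."""
    signal = dctn(image, norm="ortho")            # image -> signal level
    ll, (lh, hl, hh) = pywt.dwt2(signal, "haar")  # keep the LL band
    h, w = ll.shape
    blocks = [ll[i:i + block, j:j + block]
              for i in range(0, h - block + 1, block)
              for j in range(0, w - block + 1, block)]
    return blocks

blocks = candidate_blocks(np.random.rand(256, 256))
print(len(blocks))   # each block is a candidate for the copy-move search
```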

16.
Front Bioeng Biotechnol ; 12: 1414605, 2024.
Article in English | MEDLINE | ID: mdl-38994123

ABSTRACT

In recent years, deep convolutional neural network-based segmentation methods have achieved state-of-the-art performance on many medical analysis tasks. However, most of these approaches rely on optimizing the U-Net structure or adding new functional modules, overlooking the complementarity and fusion of coarse-grained and fine-grained semantic information. To address these issues, we propose a 2D medical image segmentation framework called the Progressive Learning Network (PL-Net), which comprises Internal Progressive Learning (IPL) and External Progressive Learning (EPL). PL-Net offers the following advantages: (1) IPL divides feature extraction into two steps, allowing receptive fields of different sizes to be mixed and semantic information to be captured from coarse to fine granularity without introducing additional parameters; (2) EPL divides the training process into two stages to optimize parameters, fusing coarse-grained information in the first stage and fine-grained information in the second. We conducted comprehensive evaluations of the proposed method on five medical image segmentation datasets, and the experimental results demonstrate that PL-Net achieves competitive segmentation performance. Notably, PL-Net does not introduce any additional learnable parameters compared to other U-Net variants.

17.
Cureus ; 16(6): e62264, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39011227

ABSTRACT

INTRODUCTION: Oral tumors necessitate a dependable computer-assisted pathological diagnosis system, considering their rarity and diversity. Content-based image retrieval (CBIR) systems using deep neural networks have been successfully devised for digital pathology, but no CBIR system for oral pathology has been investigated because of the lack of an extensive image database and feature extractors tailored to oral pathology. MATERIALS AND METHODS: This study uses a large CBIR database constructed from 30 categories of oral tumors to compare deep learning methods as feature extractors. RESULTS: The highest average area under the receiver operating characteristic curve (AUC) was achieved by models trained on database images using self-supervised learning (SSL) methods (0.900 with SimCLR and 0.897 with TiCo). The generalizability of the models was validated using query images from the same cases taken with smartphones; when smartphone images were tested as queries, both models again yielded the highest mean AUC (0.871 with SimCLR and 0.857 with TiCo). To ensure that retrieved results are readily interpretable, we evaluated the top-10 mean accuracy, checking for the exact diagnostic category and its differential diagnostic categories. CONCLUSION: Training deep learning models with SSL methods on image data specific to the target site is beneficial for CBIR tasks in oral tumor histology, yielding histologically meaningful results and high performance. This result provides insight into the effective development of a CBIR system to help improve the accuracy and speed of histopathology diagnosis and to advance oral tumor research.
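The retrieval step of such a CBIR system reduces to nearest-neighbor search over embeddings. Below is a hedged sketch that ranks database images by cosine similarity to a query embedding (as would come from an SSL-trained extractor like SimCLR) and returns the top-k categories; the feature dimension and toy data are illustrative.

```python
import torch
import torch.nn.functional as F

def retrieve(query_feat, db_feats, db_labels, k=10):
    """Rank database images by cosine similarity to the query embedding
    and return the top-k diagnostic categories (CBIR retrieval sketch)."""
    query = F.normalize(query_feat, dim=-1)
    db = F.normalize(db_feats, dim=-1)
    sims = db @ query                        # cosine similarities
    top = sims.topk(k).indices
    return [db_labels[i] for i in top]

# Toy usage with random 512-d embeddings and 30 tumor categories.
db_feats = torch.randn(1000, 512)
db_labels = torch.randint(0, 30, (1000,)).tolist()
print(retrieve(torch.randn(512), db_feats, db_labels))
```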

18.
Front Oncol ; 14: 1247396, 2024.
Article in English | MEDLINE | ID: mdl-39011486

ABSTRACT

Introduction: Soft tissue sarcomas, similar in incidence to cervical and esophageal cancers, arise from various soft tissues like smooth muscle, fat, and fibrous tissue. Effective segmentation of sarcomas in imaging is crucial for accurate diagnosis. Methods: This study collected multi-modal MRI images from 45 patients with thigh soft tissue sarcoma, totaling 8,640 images. These images were annotated by clinicians to delineate the sarcoma regions, creating a comprehensive dataset. We developed a novel segmentation model based on the UNet framework, enhanced with residual networks and attention mechanisms for improved modality-specific information extraction. Additionally, self-supervised learning strategies were employed to optimize feature extraction capabilities of the encoders. Results: The new model demonstrated superior segmentation performance when using multi-modal MRI images compared to single-modal inputs. The effectiveness of the model in utilizing the created dataset was validated through various experimental setups, confirming the enhanced ability to characterize tumor regions across different modalities. Discussion: The integration of multi-modal MRI images and advanced machine learning techniques in our model significantly improves the segmentation of soft tissue sarcomas in thigh imaging. This advancement aids clinicians in better diagnosing and understanding the patient's condition, leveraging the strengths of different imaging modalities. Further studies could explore the application of these techniques to other types of soft tissue sarcomas and additional anatomical sites.

19.
Med Phys ; 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39042373

ABSTRACT

BACKGROUND: Deep learning technology has made remarkable progress in pancreatic image segmentation tasks. However, annotating 3D medical images is time-consuming and requires expertise, and existing semi-supervised segmentation methods perform poorly on organs with blurred edges in contrast-enhanced CT, such as the pancreas. PURPOSE: To address the challenges of limited labeled data and indistinct boundaries of regions of interest (ROI). METHODS: We propose Edge-Biased Consistency Regularization (EBC-Net). 3D edge detection is employed to construct edge perturbations and integrate edge prior information into the limited data, aiding the network in learning from unlabeled data. Additionally, because a single perturbation space is one-sided, we expand to a dual-level perturbation space over both images and features to focus the model's attention more efficiently on the edges of the ROI. Finally, inspired by the clinical habits of doctors, we propose a 3D Anatomical Invariance Extraction Module and Anatomical Attention to capture anatomy-invariant features. RESULTS: Extensive experiments demonstrate that our method outperforms state-of-the-art methods in semi-supervised pancreas image segmentation. Moreover, it better preserves the morphology of the pancreas and excels in accuracy at edge regions. CONCLUSIONS: Incorporating edge prior knowledge, our method mixes perturbations in a dual-perturbation space, shifting the network's attention to the fuzzy edge regions using a few labeled samples. These ideas have been verified on the pancreas segmentation dataset.
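The abstract does not spell out how the edge perturbations are built, so the sketch below is one plausible reading, an assumption rather than EBC-Net's actual mechanism: detect 3D edges with a Sobel gradient magnitude and concentrate input noise there.

```python
import numpy as np
from scipy import ndimage

def edge_perturb(volume, strength=0.1):
    """One plausible edge-biased perturbation: compute a 3D Sobel
    gradient magnitude and add noise concentrated at edges, nudging
    consistency training toward fuzzy ROI boundaries (strength is
    illustrative)."""
    grads = [ndimage.sobel(volume, axis=a) for a in range(3)]
    edge_mag = np.sqrt(sum(g ** 2 for g in grads))
    edge_mag /= edge_mag.max() + 1e-8            # normalize to [0, 1]
    noise = np.random.randn(*volume.shape) * strength
    return volume + noise * edge_mag             # perturb edges most

vol = np.random.rand(32, 64, 64)                 # toy CT sub-volume
perturbed = edge_perturb(vol)
```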

20.
Med Image Anal ; 97: 103275, 2024 Jul 14.
Article in English | MEDLINE | ID: mdl-39032395

ABSTRACT

Recent unsupervised domain adaptation (UDA) methods in medical image segmentation commonly utilize Generative Adversarial Networks (GANs) for domain translation. However, the translated images often exhibit a distribution deviation from the ideal due to the inherent instability of GANs, leading to challenges such as visual inconsistency and incorrect style and consequently causing the segmentation model to fall into fixed erroneous patterns. To address this problem, we propose a novel UDA framework known as Dual Domain Distribution Disruption with Semantics Preservation (DDSP). Departing from the idea of generating images conforming to the target-domain distribution in GAN-based UDA methods, we make the model domain-agnostic and focus on anatomical structural information by leveraging semantic information as constraints, guiding the model to adapt to images with disrupted distributions in both the source and target domains. Furthermore, we introduce inter-channel similarity feature alignment based on domain-invariant structural prior information, which helps the shared pixel-wise classifier achieve robust performance on target-domain features by aligning the source- and target-domain features across channels. Our method significantly outperforms existing state-of-the-art UDA methods on three public datasets (heart, brain, and prostate). The code is available at https://github.com/MIXAILAB/DDSPSeg.
