ABSTRACT
BACKGROUND: Deep learning has shown great potential for accurate MR image segmentation when enough labeled data are provided for network optimization. However, manually annotating three-dimensional (3D) MR images is tedious and time-consuming, requiring experts with rich domain knowledge and experience. PURPOSE: To build a deep learning method that exploits sparse annotations, namely a single two-dimensional slice label for each 3D training MR image. STUDY TYPE: Retrospective. POPULATION: Three-dimensional MR images of 150 subjects from two publicly available datasets were included. Among them, 50 (1377 image slices) were used for prostate segmentation and the other 100 (8800 image slices) for left atrium segmentation. Five-fold cross-validation experiments were carried out on the first dataset; for the second dataset, 80 subjects were used for training and 20 for testing. FIELD STRENGTH/SEQUENCE: 1.5 T and 3.0 T; axial T2-weighted and late gadolinium-enhanced, 3D respiratory-navigated, inversion recovery-prepared gradient echo pulse sequences. ASSESSMENT: A collaborative learning method integrating the strengths of semi-supervised and self-supervised learning schemes was developed. The method was trained using labeled central slices and unlabeled noncentral slices. Segmentation performance on the testing set was reported quantitatively and qualitatively. STATISTICAL TESTS: Quantitative evaluation metrics, including boundary intersection-over-union (B-IoU), Dice similarity coefficient, average symmetric surface distance, and relative absolute volume difference, were calculated. Paired t tests were performed, and P < 0.05 was considered statistically significant.
RESULTS: Compared to fully supervised training with only the labeled central slice, mean teacher, uncertainty-aware mean teacher, deep co-training, interpolation consistency training (ICT), and ambiguity-consensus mean teacher, the proposed method achieved a substantial improvement in segmentation accuracy, increasing the mean B-IoU significantly by more than 10.0% for prostate segmentation (proposed method B-IoU: 70.3% ± 7.6% vs. ICT B-IoU: 60.3% ± 11.2%) and by more than 6.0% for left atrium segmentation (proposed method B-IoU: 66.1% ± 6.8% vs. ICT B-IoU: 60.1% ± 7.1%). DATA CONCLUSIONS: A collaborative learning method trained using sparse annotations can segment prostate and left atrium with high accuracy. LEVEL OF EVIDENCE: 0 TECHNICAL EFFICACY: Stage 1.
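The Dice similarity coefficient and relative absolute volume difference reported above are standard overlap and volume metrics. As a minimal illustrative sketch (not the authors' evaluation code; function names are assumptions), both can be computed from binary 3D masks as follows:

```python
import numpy as np

def dice_coefficient(pred, gt):
    """Dice similarity coefficient between two binary masks (in [0, 1])."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0

def relative_absolute_volume_difference(pred, gt):
    """RAVD: |V_pred - V_gt| / V_gt, with volumes as voxel counts."""
    v_pred = pred.astype(bool).sum()
    v_gt = gt.astype(bool).sum()
    return abs(float(v_pred) - float(v_gt)) / float(v_gt)

# Toy 3D example: an 8-voxel cube vs. a shifted copy (4 voxels overlap).
pred = np.zeros((4, 4, 4), dtype=np.uint8)
gt = np.zeros((4, 4, 4), dtype=np.uint8)
pred[0:2, 0:2, 0:2] = 1  # 8 voxels
gt[0:2, 0:2, 1:3] = 1    # 8 voxels, 4 shared with pred
print(dice_coefficient(pred, gt))                     # 0.5
print(relative_absolute_volume_difference(pred, gt))  # 0.0
```

Boundary IoU additionally restricts the comparison to a thin band around each mask's surface, which makes it more sensitive to contour errors than volumetric Dice.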
ABSTRACT
Objective. Training neural networks for pixel-wise or voxel-wise image segmentation is a challenging task that requires a considerable amount of training samples with highly accurate and densely delineated ground-truth maps. This challenge is especially prominent in the medical imaging domain, where obtaining reliable annotations for training samples is a difficult, time-consuming, and expert-dependent process. Developing models that perform well with limited annotated training data is therefore desirable. Approach. In this study, we propose an innovative framework, the extremely sparse annotation neural network (ESA-Net), which learns 3D volumetric segmentation from only a single labeled central slice per volume by exploring both intra-slice pixel dependencies and inter-slice image correlations with uncertainty estimation. Specifically, ESA-Net consists of four specially designed components: (1) an intra-slice pixel dependency-guided pseudo-label generation module that exploits uncertainty in network predictions while generating pseudo-labels for unlabeled slices with temporal ensembling; (2) an inter-slice image correlation-constrained pseudo-label propagation module that propagates labels from the labeled central slice to unlabeled slices by self-supervised registration with rotation ensembling; (3) a pseudo-label fusion module that fuses the two sets of generated pseudo-labels with voxel-wise uncertainty guidance; and (4) a final segmentation network optimization module that makes final predictions with scoring-based label quantification. Main results. Extensive experimental validation was performed on two popular yet challenging magnetic resonance image segmentation tasks, with comparisons against five state-of-the-art methods. Significance. Results demonstrate that the proposed ESA-Net consistently achieves better segmentation performance even under the extremely sparse annotation setting, highlighting its effectiveness in exploiting information from unlabeled data.
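Component (3) above fuses two pseudo-label volumes under voxel-wise uncertainty guidance. One simple reading of such a scheme, sketched here as an assumption rather than the authors' implementation (all names are illustrative), is to keep, at each voxel, the pseudo-label from whichever source has the lower predictive uncertainty:

```python
import numpy as np

def fuse_pseudo_labels(labels_a, unc_a, labels_b, unc_b):
    """Voxel-wise uncertainty-guided fusion of two pseudo-label volumes:
    at each voxel, keep the label from the source with lower uncertainty."""
    take_a = unc_a <= unc_b
    return np.where(take_a, labels_a, labels_b)

# Toy example: two 2x2x1 pseudo-label maps with per-voxel uncertainties.
labels_a = np.array([[[1], [0]], [[1], [1]]])
labels_b = np.array([[[0], [0]], [[0], [1]]])
unc_a = np.array([[[0.1], [0.9]], [[0.8], [0.2]]])
unc_b = np.array([[[0.5], [0.3]], [[0.2], [0.4]]])
fused = fuse_pseudo_labels(labels_a, unc_a, labels_b, unc_b)
print(fused.ravel())  # [1 0 0 1]
```

In practice the uncertainty maps could come, for example, from the variance of temporally or rotationally ensembled predictions, consistent with the ensembling strategies the abstract describes.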
Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Uncertainty; Rotation
ABSTRACT
The medical imaging literature has witnessed remarkable progress in high-performing segmentation models based on convolutional neural networks. Despite these new performance highs, recent advanced segmentation models still require large, representative, and high-quality annotated datasets. However, rarely do we have a perfect training dataset, particularly in the field of medical imaging, where data and annotations are both expensive to acquire. Recently, a large body of research has studied the problem of medical image segmentation with imperfect datasets, tackling two major dataset limitations: scarce annotations, where only limited annotated data are available for training, and weak annotations, where the training data has only sparse annotations, noisy annotations, or image-level annotations. In this article, we provide a detailed review of solutions to these problems, summarizing both the technical novelties and empirical results. We further compare the benefits and requirements of the surveyed methodologies and provide our recommended solutions. We hope this survey article increases the community's awareness of the techniques available for handling imperfect medical image segmentation datasets.