ABSTRACT
Self-supervised learning (SSL) has achieved remarkable progress in medical image segmentation. SSL is typically applied in two stages: label-free representation learning on unlabeled data, followed by fine-tuning of the pre-trained model on downstream tasks. One issue with this paradigm is that the SSL stage is unaware of the downstream task, which may lead to sub-optimal feature representations for the target task. In this paper, we propose a hybrid pre-training paradigm driven by both self-supervised and supervised objectives. To this end, we incorporate a supervised reference task into self-supervised learning to improve representation quality. Specifically, we employ an off-the-shelf medical image segmentation task as the reference and encourage learning a representation that (1) incurs low prediction loss on both the SSL and reference tasks and (2) yields similar gradients when the feature extractor is updated from either task. In this way, the reference task steers SSL in a direction beneficial for downstream segmentation. We realize this with a simple but effective gradient matching method that optimizes the model towards a consistent direction, thus improving the compatibility of the SSL and supervised reference tasks. We call this hybrid pre-training paradigm reference-guided self-supervised learning (ReFs) and apply it to a large-scale unlabeled dataset and an additional reference dataset. Experimental results demonstrate its effectiveness on seven downstream medical image segmentation benchmarks.
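One way to read the two criteria above, low loss on both tasks and similar gradients on the shared feature extractor, is as a joint objective with a gradient-alignment penalty. The following is only an illustrative sketch under stated assumptions: the symbols $\theta$, $g_{\mathrm{ssl}}$, $g_{\mathrm{ref}}$, and $\lambda$, and the choice of cosine similarity as the matching measure, are hypothetical and not taken from the abstract, which does not specify the exact form of the gradient matching term.

```latex
% Hypothetical notation (assumptions, not from the source):
%   \theta                     : parameters of the shared feature extractor
%   \mathcal{L}_{\mathrm{ssl}} : self-supervised loss on unlabeled data
%   \mathcal{L}_{\mathrm{ref}} : supervised loss on the reference segmentation task
%   \lambda \ge 0              : assumed weight of the gradient matching term
\begin{align}
g_{\mathrm{ssl}} &= \nabla_{\theta}\,\mathcal{L}_{\mathrm{ssl}}(\theta), \qquad
g_{\mathrm{ref}} = \nabla_{\theta}\,\mathcal{L}_{\mathrm{ref}}(\theta), \\
\mathcal{L}_{\mathrm{total}}(\theta) &= \mathcal{L}_{\mathrm{ssl}}(\theta)
  + \mathcal{L}_{\mathrm{ref}}(\theta)
  + \lambda \left( 1 - \frac{\langle g_{\mathrm{ssl}},\, g_{\mathrm{ref}} \rangle}
                            {\lVert g_{\mathrm{ssl}} \rVert_2 \,\lVert g_{\mathrm{ref}} \rVert_2} \right).
\end{align}
```

Under this reading, the first two terms correspond to criterion (1), and the last term vanishes exactly when the two task gradients point in the same direction, which corresponds to criterion (2).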