DRNet: Double Recalibration Network for Few-Shot Semantic Segmentation.

Gao, Guangyu; Fang, Zhiyuan; Han, Cen; Wei, Yunchao; Liu, Chi Harold; Yan, Shuicheng

ABSTRACT

Few-shot segmentation aims at learning to segment query images guided by only a few annotated images from the support set. Previous methods rely on mining the feature embedding similarity between the query and the support images to achieve successful segmentation. However, these models tend to perform poorly when the query instances differ greatly from the support ones. To enhance model robustness against such intra-class variance, we propose a Double Recalibration Network (DRNet) with two recalibration modules, i.e., the Self-adapted Recalibration (SR) module and the Cross-attended Recalibration (CR) module. In particular, beyond learning a robust feature embedding for pixel-wise comparison between support and query as in conventional methods, DRNet further exploits semantic-aware knowledge embedded in the query image to help segment the query image itself, which we call 'self-adapted recalibration'. More specifically, DRNet first employs guidance from the support set to roughly predict an incomplete but correct initial object region for the query image, and then, in turn, uses the feature embedding extracted from this incomplete object region to segment the query image. In addition, we devise a CR module to refine the feature representation of the query image by propagating the underlying knowledge embedded in the support image's foreground to the query. Instead of foreground global pooling, we refine the response at each pixel in the query feature map by attending to all foreground pixels in the support feature map and taking their average weighted by similarity; meanwhile, the feature maps of the query image are added back to the weighted feature maps as a residual connection. With these two recalibration modules, DRNet can effectively address intra-class variance under the few-shot setting and mine more accurate target regions for query images. We conduct extensive experiments on the popular benchmarks PASCAL-5i and COCO-20i. DRNet with the best configuration achieves an mIoU of 63.6% and 64.9% on PASCAL-5i and 44.7% and 49.6% on COCO-20i for the 1-shot and 5-shot settings, respectively, significantly outperforming state-of-the-art methods without any bells and whistles. Code is available at https://github.com/fangzy97/drnet.
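
The cross-attended recalibration described above can be illustrated with a minimal sketch. It is not the released DRNet implementation: the function name `cross_attended_recalibration`, the use of cosine similarity, the softmax normalisation of the weights, and the `temperature` parameter are all assumptions made for illustration, since the abstract only specifies a similarity-weighted average over support foreground pixels plus a residual connection. Feature maps are flattened to lists of per-pixel feature vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def softmax(scores):
    """Numerically stable softmax (assumed way of normalising the weights)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attended_recalibration(query, support, support_mask, temperature=1.0):
    """Refine each query pixel using the support foreground pixels.

    query, support: lists of per-pixel feature vectors (flattened feature maps).
    support_mask:   list of 0/1 flags, 1 marking a support foreground pixel.
    temperature:    hypothetical scaling of the similarities before softmax.
    """
    # Keep only the support pixels that belong to the annotated foreground.
    foreground = [f for f, m in zip(support, support_mask) if m]
    if not foreground:
        # Nothing to attend to: return the query features unchanged.
        return [list(q) for q in query]

    refined = []
    for q in query:
        # Similarity of this query pixel to every support foreground pixel.
        weights = softmax([cosine(q, f) / temperature for f in foreground])
        # Similarity-weighted average of the support foreground features.
        attended = [
            sum(w * f[c] for w, f in zip(weights, foreground))
            for c in range(len(q))
        ]
        # Residual connection: add the original query feature back.
        refined.append([a + b for a, b in zip(attended, q)])
    return refined
```

Unlike foreground global pooling, which yields one prototype shared by all query pixels, each query pixel here receives its own mixture of support foreground features, so query regions that resemble only part of the support object can still be matched.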