MECOM: A Meta-Completion Network for Fine-Grained Recognition With Incomplete Multi-Modalities.

Wei, Xiu-Shen; Yu, Hong-Tao; Xu, Anqi; Zhang, Faen; Peng, Yuxin.

IEEE Trans Image Process ; 33: 3456-3469, 2024.

Article in English | MEDLINE | ID: mdl-38787666

ABSTRACT

This work tackles fine-grained recognition with incomplete multi-modal data, a problem overlooked by previous work in the literature. Such a practical problem requires not only capturing fine-grained patterns of objects but also alleviating the challenges of missing modalities. In this paper, we propose to leverage a meta-learning strategy to learn both fast modal adaptation and, more importantly, missing-modality completion across a variety of incomplete multi-modality learning tasks. Based on that, we develop a meta-completion method, termed MECOM, that performs multimodal fusion and explicit missing-modality completion through our proposed cross-modal attention and decoupling reconstruction. To further improve fine-grained recognition accuracy, we design an additional partial stream (a counterpart of MECOM's main, holistic stream) and a selection of part-level features (corresponding to the parts of fine-grained objects), both tailored to the fine-grained nature of the task to capture discriminative but subtle part-level patterns. Comprehensive quantitative and qualitative experiments, together with various ablation studies, on two fine-grained multimodal datasets and one generic multimodal dataset show the superiority of our method over competing methods. Our code is open-source and available at https://github.com/SEU-VIPGroup/MECOM.
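
For readers unfamiliar with the term, the general idea of cross-modal attention mentioned in the abstract can be illustrated with a minimal sketch of standard scaled dot-product attention, in which feature vectors from one modality attend over those of another. This is not the authors' implementation (see the linked repository for that): the function names `cross_modal_attention` and `fuse`, the image/text pairing, the residual-sum fusion, and the fallback used when a modality is absent are all assumptions made for illustration. In particular, the sketch simply skips a missing modality, whereas MECOM is described as explicitly completing it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_modal_attention(queries, keys, values):
    """Generic scaled dot-product attention.

    Each query vector (from modality A) is replaced by a weighted
    average of the value vectors (from modality B), with weights given
    by the softmax of the scaled query-key similarities.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return attended


def fuse(image_feats, text_feats):
    """Hypothetical fusion step (an assumption, not the paper's design).

    If the text modality is missing (None), return the image features
    unchanged; otherwise add the text-attended features to the image
    features as a residual.
    """
    if text_feats is None:
        return image_feats
    attended = cross_modal_attention(image_feats, text_feats, text_feats)
    return [[a + b for a, b in zip(x, y)] for x, y in zip(image_feats, attended)]


if __name__ == "__main__":
    image = [[1.0, 0.0], [0.0, 1.0]]
    text = [[1.0, 0.0], [1.0, 0.0]]
    print(fuse(image, text))  # both modalities present
    print(fuse(image, None))  # text modality missing
```

The residual sum keeps the image features intact when the attended text contributes little, which is one common design choice; concatenation or gating would be equally plausible alternatives.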
