Disambiguity and Alignment: An Effective Multi-Modal Alignment Method for Cross-Modal Recipe Retrieval.

Zou, Zhuoyang; Zhu, Xinghui; Zhu, Qinying; Zhang, Hongyan; Zhu, Lei

Affiliation

Zou Z; College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China.
Zhu X; College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China.
Zhu Q; College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China.
Zhang H; College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China.
Zhu L; College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China.

Foods; 13(11), 2024 May 23.

Article in En | MEDLINE | ID: mdl-38890857

ABSTRACT

As a prominent topic in food computing, cross-modal recipe retrieval has garnered substantial attention. However, existing solutions lack intra-modal alignment, which limits further improvement of the semantic alignment between food images and recipes. In addition, they overlook a critical issue, food image ambiguity, which disrupts model convergence. To address both problems, we propose a novel Multi-Modal Alignment Method for Cross-Modal Recipe Retrieval (MMACMR). To consider inter-modal and intra-modal alignment together, the method measures the similarity between ambiguous food images under the guidance of their corresponding recipes. We also enhance recipe semantic representation learning with a cross-attention module between ingredients and instructions, which is effective in supporting food image similarity measurement. We conduct experiments on the challenging public dataset Recipe1M, where our method outperforms several state-of-the-art methods on commonly used evaluation criteria.
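
The abstract gives no details of the cross-attention module, so the following is only a minimal sketch of generic scaled dot-product cross-attention between two sets of embeddings, not the MMACMR implementation. The function names, the toy vectors, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Let each query vector attend over all vectors in keys_values.

    Assumption: keys and values are the same vectors and there are no
    learned projections, unlike a trained attention layer.
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    attended = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        attended.append([
            sum(w * v[j] for w, v in zip(weights, keys_values))
            for j in range(dim)
        ])
    return attended


# Hypothetical 2-d embeddings: two ingredients, three instruction steps.
ingredients = [[1.0, 0.0], [0.0, 1.0]]
instructions = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

# Ingredients enriched with instruction context, and the reverse direction.
ingredients_ctx = cross_attention(ingredients, instructions)
instructions_ctx = cross_attention(instructions, ingredients)
```

In this sketch each output row is a convex combination of the vectors from the other recipe component, so an ingredient embedding is pulled toward the instruction steps it matches most strongly.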

Keywords

cross-modal recipe retrieval; deep learning; food image ambiguity; multi-modal alignment

Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: En | Journal: Foods | Publication year: 2024 | Document type: Article | Affiliation country: China
