Interactive dual-stream contrastive learning for radiology report generation.
Zhang, Ziqi; Jiang, Ailian.
Affiliation
  • Zhang Z; College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030600, China.
  • Jiang A; College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030600, China. Electronic address: ailianjiang@126.com.
J Biomed Inform ; 157: 104718, 2024 Sep.
Article in En | MEDLINE | ID: mdl-39209086
ABSTRACT
Radiology report generation automates diagnostic narrative synthesis from medical imaging data. Current report generation methods primarily employ knowledge graphs for image enhancement, neglecting the interpretability and guiding function of the knowledge graphs themselves. Additionally, few approaches leverage the stable modal alignment information from multimodal pre-trained models to facilitate the generation of radiology reports. We propose Terms-Guided Radiology Report Generation (TGR), a simple and practical model for generating reports guided primarily by anatomical terms. Specifically, we utilize a dual-stream visual feature extraction module, composed of a detail extraction module and a frozen multimodal pre-trained model, to separately extract visual detail features and semantic features. Furthermore, a Visual Enhancement Module (VEM) is proposed to further enrich the visual features, thereby facilitating the generation of a list of anatomical terms. We integrate anatomical terms with image features and perform contrastive learning against frozen text embeddings, exploiting the stable feature space of these embeddings to further strengthen modal alignment. Our model also accepts manual input, enabling it to generate a list of organs for specifically focused abnormal areas or to produce more accurate single-sentence descriptions based on selected anatomical terms. Comprehensive experiments demonstrate the effectiveness of our method on report generation tasks: our TGR-S model reduces training parameters by 38.9% while performing comparably to current state-of-the-art models, and our TGR-B model exceeds the best baseline models across multiple metrics.
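The sketch below illustrates, under stated assumptions, the dual-stream encoding and contrastive alignment idea described in the abstract: a trainable detail stream, a frozen pre-trained semantic stream, a fusion step standing in for the Visual Enhancement Module, and a symmetric InfoNCE loss against frozen text embeddings. All module names, dimensions, and the placeholder frozen encoder are hypothetical and are not the authors' implementation.

```python
# Hedged sketch of a dual-stream visual encoder with contrastive alignment.
# Module names, dimensions, and the frozen encoder are assumptions for
# illustration only; they do not reproduce the TGR model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetailExtractor(nn.Module):
    """Trainable stream: extracts fine-grained visual detail features."""
    def __init__(self, dim=512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, dim),
        )

    def forward(self, x):
        return self.backbone(x)

class DualStreamEncoder(nn.Module):
    """Fuses the trainable detail stream with a frozen pre-trained semantic
    stream, then enriches the fused features (a stand-in for the VEM)."""
    def __init__(self, frozen_encoder, dim=512):
        super().__init__()
        self.detail = DetailExtractor(dim)
        self.frozen = frozen_encoder
        for p in self.frozen.parameters():   # keep the pre-trained stream fixed
            p.requires_grad = False
        self.enhance = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                     nn.Linear(dim, dim))

    def forward(self, images):
        detail = self.detail(images)
        with torch.no_grad():                # frozen semantic features
            semantic = self.frozen(images)
        return self.enhance(torch.cat([detail, semantic], dim=-1))

def contrastive_loss(img_feats, txt_feats, temperature=0.07):
    """Symmetric InfoNCE loss between image features and frozen text embeddings."""
    img = F.normalize(img_feats, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    logits = img @ txt.t() / temperature
    targets = torch.arange(len(img), device=img.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

if __name__ == "__main__":
    # Placeholder frozen "pre-trained" encoder; in the paper this role is
    # played by a multimodal pre-trained model.
    frozen = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                           nn.Linear(3, 512))
    model = DualStreamEncoder(frozen)
    images = torch.randn(4, 3, 224, 224)
    frozen_text = torch.randn(4, 512)        # stands in for frozen text embeddings
    loss = contrastive_loss(model(images), frozen_text)
    print(loss.item())
```

The frozen stream is wrapped in `torch.no_grad()` and has `requires_grad` disabled, so only the detail extractor and the fusion layers are updated, mirroring the abstract's use of a stable, frozen feature space for alignment.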
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Natural Language Processing Limit: Humans Language: En Journal: J Biomed Inform Journal subject: Medical Informatics Year of publication: 2024 Document type: Article Country of affiliation: China