Sequence tagging for biomedical extractive question answering.

Yoon, Wonjin; Jackson, Richard; Lagerberg, Aron; Kang, Jaewoo

Affiliation

Yoon W; Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea.
Jackson R; AstraZeneca UK, Cambridge CB2 0AA, UK.
Lagerberg A; AstraZeneca SE, 43150 Mölndal, Sweden.
Kang J; Department of Computer Science and Engineering, Korea University, Seoul 02841, South Korea.

Bioinformatics; 38(15): 3794-3801, 2022 08 02.

Article in English | MEDLINE | ID: mdl-35713500

ABSTRACT

MOTIVATION: Current studies in extractive question answering (EQA) have modeled the single-span extraction setting, where the label to predict for a given question-passage pair is a single answer span. This setting is natural for general-domain EQA, as the majority of general-domain questions can be answered with a single span. Following general-domain EQA models, current biomedical EQA (BioEQA) models use the single-span extraction setting with post-processing steps.

RESULTS: In this article, we investigate the question distribution across the general and biomedical domains and find that biomedical questions are more likely to require list-type answers (multiple answers) than factoid-type answers (a single answer). This calls for models capable of producing multiple answers for a question. Based on this preliminary study, we propose a sequence tagging approach for BioEQA, which is a multi-span extraction setting. Our approach directly handles questions whose answer consists of a variable number of phrases and can learn from training data to decide how many answers a question has. In our experiments on the BioASQ 7b and 8b list-type questions, our approach outperformed the best-performing existing models without requiring post-processing steps.

AVAILABILITY AND IMPLEMENTATION: Source code and resources are freely available for download at https://github.com/dmis-lab/SeqTagQA.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
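As an illustration of the multi-span setting described in the abstract, the sketch below shows how per-token tags could be decoded into a variable number of answer phrases. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the tagging scheme, so a BIO scheme (`B` begins an answer, `I` continues it, `O` is outside any answer) is assumed, and the function name `decode_answer_spans` and the example sentence are hypothetical.

```python
from typing import List


def decode_answer_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect every contiguous B/I-tagged run of tokens as one answer phrase.

    Assumes a BIO tagging scheme; this scheme is an assumption made for
    illustration and is not stated in the abstract.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    answers: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new answer starts; close the previous one if it is still open.
            if current:
                answers.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the answer that is currently open.
            current.append(token)
        else:
            # "O", or a stray "I" with no open answer: close any open answer.
            if current:
                answers.append(" ".join(current))
            current = []
    if current:
        answers.append(" ".join(current))
    return answers


# Made-up passage and tags, for illustration only.
example_tokens = "Symptoms include fever , dry cough and fatigue .".split()
example_tags = ["O", "O", "B", "O", "B", "I", "O", "B", "O"]
example_answers = decode_answer_spans(example_tokens, example_tags)
print(example_answers)
```

Because the number of answers is simply the number of tagged runs, a passage can yield zero, one, or many phrases without a separate step that fixes the answer count in advance, which is the property the abstract contrasts with single-span extraction plus post-processing.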

Subjects

Computational Biology; Software

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Software / Computational Biology | Study type: Prognostic_studies | Language: En | Publication year: 2022 | Document type: Article