From zero to hero: Harnessing transformers for biomedical named entity recognition in zero- and few-shot contexts.

Kosprdic, Milos; Prodanovic, Nikola; Ljajic, Adela; Basaragin, Bojana; Milosevic, Nikola

Affiliation

Kosprdic M; Institute for Artificial Intelligence Research and Development of Serbia, Fruskogorska 1, Novi Sad, 21000, Serbia.
Prodanovic N; Institute for Artificial Intelligence Research and Development of Serbia, Fruskogorska 1, Novi Sad, 21000, Serbia.
Ljajic A; Institute for Artificial Intelligence Research and Development of Serbia, Fruskogorska 1, Novi Sad, 21000, Serbia.
Basaragin B; Institute for Artificial Intelligence Research and Development of Serbia, Fruskogorska 1, Novi Sad, 21000, Serbia.
Milosevic N; Institute for Artificial Intelligence Research and Development of Serbia, Fruskogorska 1, Novi Sad, 21000, Serbia; Bayer A.G., Research and Development, Mullerstrasse 173, Berlin, 13342, Germany. Electronic address: nikola.milosevic@bayer.com.

Artif Intell Med; 156: 102970, 2024 Oct.

Article in En | MEDLINE | ID: mdl-39197375

ABSTRACT

Supervised named entity recognition (NER) in the biomedical domain depends on large sets of texts annotated with the given named entities. Creating such datasets can be time-consuming and expensive, and extracting new entities requires additional annotation and retraining of the model. This paper proposes a method for zero- and few-shot NER in the biomedical domain to address these challenges. The method is based on transforming the task of multi-class token classification into binary token classification and pre-training on a large number of datasets and biomedical entities, which allows the model to learn semantic relations between the given and potentially novel named entity labels. We have achieved average F1 scores of 35.44% for zero-shot NER, 50.10% for one-shot NER, 69.94% for 10-shot NER, and 79.51% for 100-shot NER on 9 diverse evaluated biomedical entities with a fine-tuned PubMedBERT-based model. The results demonstrate the effectiveness of the proposed method for recognizing new biomedical entities with no examples or only a limited number of them: it outperforms previous transformer-based methods and is comparable to GPT3-based models while using models with over 1000 times fewer parameters. We make the models and developed code publicly available.
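
The reformulation from multi-class to binary token classification mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `to_binary_tasks`, the flat per-token tag scheme, and the example sentence are all assumptions made for illustration. The idea shown is that one multi-class annotated sentence becomes one binary labelling task per entity label, so a model conditioned on the label name could in principle be asked about labels it never saw during training.

```python
from typing import Dict, List


def to_binary_tasks(tags: List[str], labels: List[str]) -> Dict[str, List[int]]:
    """Turn one multi-class tag sequence into one binary sequence per label.

    Hypothetical helper for illustration only: a token gets 1 if its tag
    equals the queried label and 0 otherwise.
    """
    return {label: [1 if tag == label else 0 for tag in tags] for label in labels}


# Illustrative sentence with assumed flat (non-BIO) tags; "O" means no entity.
tokens = ["Aspirin", "reduces", "fever", "."]
tags = ["Drug", "O", "Disease", "O"]

# "Gene" never occurs in this sentence, so its binary task is all zeros.
tasks = to_binary_tasks(tags, ["Drug", "Disease", "Gene"])
for label, binary in tasks.items():
    print(label, list(zip(tokens, binary)))
```

Each (label, sentence) pair then forms a separate binary example; how the label is actually presented to the model is not specified in the abstract and is left open here.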

Subjects

Semantics; Natural Language Processing; Humans; Data Mining/methods; Algorithms

Keywords

Biomedical named entity recognition; Deep learning; Machine learning; Natural language processing; Zero-shot learning

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Semantics Limits: Humans Language: En Journal: Artif Intell Med Publication year: 2024 Document type: Article