HunFlair2 in a cross-corpus evaluation of biomedical named entity recognition and normalization tools.

Sänger, Mario; Garda, Samuele; Wang, Xing David; Weber-Genzel, Leon; Droop, Pia; Fuchs, Benedikt; Akbik, Alan; Leser, Ulf

Affiliation

Sänger M; Department of Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany.
Garda S; Department of Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany.
Wang XD; Department of Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany.
Weber-Genzel L; Center for Information and Language Processing (CIS), Ludwig Maximilian University Munich, München 80539, Germany.
Droop P; Department of Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany.
Fuchs B; Research Industrial Systems Engineering (RISE) Forschungs-, Entwicklungs- und Großprojektberatung GmbH, Schwechat 2320, Austria.
Akbik A; Department of Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany.
Leser U; Department of Computer Science, Humboldt-Universität zu Berlin, Berlin 10099, Germany.

Bioinformatics; 40(10), 2024 Oct 01.

Article in English | MEDLINE | ID: mdl-39302686

ABSTRACT

MOTIVATION:

With the exponential growth of the life sciences literature, biomedical text mining (BTM) has become an essential technology for accelerating the extraction of insights from publications. The identification of entities in texts, such as diseases or genes, and their normalization, i.e. grounding them in a knowledge base, are crucial steps in any BTM pipeline to enable information aggregation from multiple documents. However, tools for these two steps are rarely applied in the same context in which they were developed. Instead, they are applied "in the wild," i.e. on application-dependent text collections that range from moderately to extremely different from those used for training, varying, e.g. in focus, genre or text type. This raises the question of whether the reported performance, usually obtained by training and evaluating on different partitions of the same corpus, can be trusted for downstream applications.

RESULTS:

Here, we report on the results of a carefully designed cross-corpus benchmark for entity recognition and normalization, where tools were applied systematically to corpora not used during their training. Based on a survey of 28 published systems, we selected five, using predefined criteria such as feature richness and availability, for an in-depth analysis on three publicly available corpora covering four entity types. Our results present a mixed picture and show that cross-corpus performance is significantly lower than in-corpus performance. HunFlair2, the redesigned and extended successor of the HunFlair tool, showed the best performance on average, closely followed by PubTator Central. Our results indicate that users of BTM tools should expect lower performance than originally published when applying tools "in the wild" and show that further research is necessary for more robust BTM tools.

AVAILABILITY AND IMPLEMENTATION:

All our models are integrated into the Natural Language Processing (NLP) framework flair: https://github.com/flairNLP/flair. Code to reproduce our results is available at https://github.com/hu-ner/hunflair2-experiments.
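To make concrete what a performance score for entity recognition measures, the following is a minimal sketch of span-level precision, recall and F1 under exact matching of document, character offsets and entity type. It is not the authors' evaluation code. The `Mention` type, the `span_f1` function and the toy annotations are hypothetical and invented for illustration, and the matching criteria used in the benchmark itself may differ.

```python
from typing import NamedTuple, Set, Tuple


class Mention(NamedTuple):
    """Hypothetical entity mention: document id, character span, entity type."""
    doc_id: str
    start: int
    end: int
    entity_type: str


def span_f1(gold: Set[Mention], predicted: Set[Mention]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span-and-type matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy annotations, invented for illustration (not data from the paper).
gold = {
    Mention("d1", 0, 5, "Gene"),
    Mention("d1", 20, 35, "Disease"),
    Mention("d2", 3, 9, "Gene"),
    Mention("d2", 14, 19, "Disease"),
}
predicted = {
    Mention("d1", 0, 5, "Gene"),
    Mention("d1", 20, 30, "Disease"),  # boundary error: counts as a miss
    Mention("d2", 3, 9, "Gene"),
}

precision, recall, f1 = span_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In a cross-corpus setting, the same function would be applied to predictions from a tool on a corpus it was not trained on. Only the source of `gold` and `predicted` changes, not the metric.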

Subjects

Data Mining; Data Mining/methods; Natural Language Processing; Software; Computational Biology/methods; Humans

Full text: 1 | Database: MEDLINE | Main subject: Data Mining | Limit: Humans | Language: En | Year of publication: 2024 | Document type: Article
