Benchmarking table recognition performance on biomedical literature on neurological disorders.

Adams, Tim; Namysl, Marcin; Kodamullil, Alpha Tom; Behnke, Sven; Jacobs, Marc

Affiliation

Adams T; Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin 53757, Germany.
Namysl M; Fraunhofer Institute for Intelligent Analysis and Information Systems, Schloss Birlinghoven, Sankt Augustin 53757, Germany.
Kodamullil AT; Autonomous Intelligent Systems, Computer Science Institute VI, University of Bonn, Bonn 53115, Germany.
Behnke S; Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Sankt Augustin 53757, Germany.
Jacobs M; Fraunhofer Institute for Intelligent Analysis and Information Systems, Schloss Birlinghoven, Sankt Augustin 53757, Germany.

Bioinformatics; 38(6): 1624-1630, 2022 03 04.

Article in English | MEDLINE | ID: mdl-34935870

ABSTRACT

MOTIVATION: Table recognition systems are widely used to extract and structure quantitative information from the vast number of documents that are increasingly available from open sources. While many systems already perform well on tables with a simple layout, tables in the biomedical domain are often much more complex. Benchmark and training data for such tables are, however, very limited.

RESULTS: To address this issue, we present a novel, highly curated benchmark dataset based on a hand-curated literature corpus on neurological disorders, which can be used to tune and evaluate table extraction applications for this challenging domain. We evaluate several state-of-the-art table extraction systems on our proposed benchmark and discuss challenges that emerged during benchmark creation, as well as factors that can affect the performance of recognition methods. For the evaluation procedure, we propose a new metric as well as several improvements that result in a better performance evaluation.

AVAILABILITY AND IMPLEMENTATION: The resulting benchmark dataset (https://zenodo.org/record/5549977) and the source code for our novel evaluation approach are openly accessible.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Benchmarking; Nervous System Diseases; Humans; Software

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Benchmarking / Nervous System Diseases Limits: Humans Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year: 2022 Document type: Article Affiliation country: Germany Country of publication: United Kingdom