Extracting experimental parameter entities from scientific articles.
Farnsworth, Steele; Gurdin, Gabrielle; Vargas, Jorge; Mulyar, Andriy; Lewinski, Nastassja; McInnes, Bridget T.
  • Farnsworth S; Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA. Electronic address: farnsworthsw@alumni.vcu.edu.
  • Gurdin G; Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA.
  • Vargas J; Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA.
  • Mulyar A; Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA.
  • Lewinski N; Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA.
  • McInnes BT; Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA. Electronic address: btmcinnes@vcu.edu.
J Biomed Inform; 126: 103970, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34920128
ABSTRACT
Systematic reviews are labor-intensive efforts to combine all knowledge about a given topic into a coherent summary. Despite the high labor investment, they are necessary to create an exhaustive overview of the current evidence relevant to a research question. In this work, we evaluate three state-of-the-art supervised multi-label sequence classification systems that automatically identify 24 different experimental design factors in the categories Animal, Dose, Exposure, and Endpoint from journal articles describing experiments on the toxicity and health effects of environmental agents. We then present an in-depth analysis of the results that evaluates the lexical diversity of the design parameters with respect to model performance, the impact of tokenization and non-contiguous mentions, and the dependencies between entities within each category. We demonstrate that, in general, algorithms that use embedded representations of the sequences outperform statistical algorithms, but that even these algorithms struggle with lexically diverse entities.

Full text: 1 Database: MEDLINE Main subject: Algorithms / Natural Language Processing Study type: Prognostic_studies / Systematic_reviews Language: English Year: 2022 Document type: Article
