Search | VHL Regional Portal

Towards cross-platform interoperability for machine-assisted text annotation.

Eckart de Castilho, Richard; Ide, Nancy; Kim, Jin-Dong; Klie, Jan-Christoph; Suderman, Keith.

Genomics Inform ; 17(2): e19, 2019 Jun.

Article in English | MEDLINE | ID: mdl-31307134

ABSTRACT

In this paper we investigate cross-platform interoperability for natural language processing (NLP) and, in particular, annotation of textual resources, with an eye toward identifying the design elements of annotation models and processes that are particularly problematic for, or amenable to, enabling seamless communication across different platforms. The study is conducted in the context of a specific annotation methodology, namely machine-assisted interactive annotation (also known as human-in-the-loop annotation). This methodology requires the ability to freely combine resources from different document repositories, access a wide array of NLP tools that automatically annotate corpora for various linguistic phenomena, and use a sophisticated annotation editor that enables interactive manual annotation coupled with on-the-fly machine learning. We consider three independently developed platforms, each of which utilizes a different model for representing annotations over text, and each of which performs a different role in the process.

Three Dimensions of Reproducibility in Natural Language Processing.

Cohen, K Bretonnel; Xia, Jingbo; Zweigenbaum, Pierre; Callahan, Tiffany J; Hargraves, Orin; Goss, Foster; Ide, Nancy; Névéol, Aurélie; Grouin, Cyril; Hunter, Lawrence E.

LREC Int Conf Lang Resour Eval ; 2018: 156-165, 2018 May.

Article in English | MEDLINE | ID: mdl-29911205

ABSTRACT

Despite considerable recent attention to problems with reproducibility of scientific research, there is a striking lack of agreement about the definition of the term. That is a problem, because the lack of a consensus definition makes it difficult to compare studies of reproducibility, and thus to have even a broad overview of the state of the issue in natural language processing. This paper proposes an ontology of reproducibility in that field. Its goal is to enhance both future research and communication about the topic, and retrospective meta-analyses. We show that three dimensions of reproducibility, corresponding to three kinds of claims in natural language processing papers, can account for a variety of types of research reports. These dimensions are reproducibility of a conclusion, of a finding, and of a value. Three biomedical natural language processing papers by the authors of this paper are analyzed with respect to these dimensions.

ABSTRACT

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL