Active Learning-based corpus annotation--the PathoJen experience.
Hahn, Udo; Beisswanger, Elena; Buyko, Ekaterina; Faessler, Erik.
Affiliation
  • Hahn U; Jena University Language & Information Engineering (JULIE) Lab, Friedrich-Schiller-Universität Jena, Fürstengraben 30, D-07743 Jena, Germany. udo.hahn@uni-jena.de
AMIA Annu Symp Proc; 2012: 301-10, 2012.
Article in English | MEDLINE | ID: mdl-23304300
We report on basic design decisions and novel annotation procedures underlying the development of PathoJen, a corpus of Medline abstracts annotated for pathological phenomena, including diseases as a proper subclass. This named entity type is known to be hard to delineate and to capture in annotation guidelines. We propose a two-category encoding schema that distinguishes short from long mention spans: the former cover standardized terminology (e.g. diseases), while the latter account for less structured descriptive statements about norm-deviant states, as well as criteria and observations that might signal pathologies. The second design decision concerns how annotation instances are sampled. Here we adopt an Active Learning-based approach, which deliberately biases the sample and is known to save annotation costs without sacrificing annotation quality. By design, Active Learning selects instances that are 'hard' to annotate for human annotators, whereas 'easier' ones are left to the automatic classifier, whose models already incorporate previous annotation experience and gradually improve with it.
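The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which query strategy PathoJen used, so the least-confidence criterion below is an assumption, and the names `least_confidence`, `select_for_annotation`, and `predict_proba` are hypothetical, introduced only for illustration.

```python
from typing import Callable, List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely label."""
    return 1.0 - max(probs)


def select_for_annotation(
    pool: Sequence[str],
    predict_proba: Callable[[str], Sequence[float]],
    batch_size: int,
) -> List[int]:
    """Return indices of the pool items the current model is least sure about.

    Assumption: 'hard' instances are approximated by low model confidence.
    Ties are broken by pool order so the result is deterministic.
    """
    scored = [
        (least_confidence(predict_proba(item)), idx)
        for idx, item in enumerate(pool)
    ]
    # Highest uncertainty first, then lowest index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:batch_size]]


# Toy stand-in for a trained classifier (hypothetical label probabilities).
TOY_PROBS = {
    "a": [0.90, 0.10],
    "b": [0.55, 0.45],
    "c": [0.50, 0.50],
    "d": [0.99, 0.01],
}


def toy_predict_proba(item: str) -> Sequence[float]:
    return TOY_PROBS[item]


if __name__ == "__main__":
    picked = select_for_annotation(["a", "b", "c", "d"], toy_predict_proba, 2)
    print(picked)
```

In a full loop, the selected items would go to human annotators, the model would be retrained on the enlarged labelled set, and the selection would be repeated; the remaining items would be left to the classifier.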

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Pathology / Algorithms / Artificial Intelligence / Problem-Based Learning
Type of study: Prognostic studies
Limits: Humans
Language: English
Journal: AMIA Annu Symp Proc
Journal subject: Medical Informatics
Year: 2012
Document type: Article
Affiliation country: Germany
Country of publication: United States
