Aggregating and Predicting Sequence Labels from Crowd Annotations.

Nguyen, An T; Wallace, Byron C; Li, Junyi Jessy; Nenkova, Ani; Lease, Matthew

Affiliation

Nguyen AT; University of Texas at Austin.
Wallace BC; Northeastern University.
Li JJ; University of Pennsylvania.
Nenkova A; University of Pennsylvania.
Lease M; University of Texas at Austin.

Proc Conf Assoc Comput Linguist Meet; 2017: 299-309.

Article in English | MEDLINE | ID: mdl-29093611

ABSTRACT

Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short-Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online.

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Study type: Prognostic_studies / Risk_factors_studies | Language: En | Journal: Proc Conf Assoc Comput Linguist Meet | Year: 2017 | Document type: Article
