Effectiveness of lexico-syntactic pattern matching for ontology enrichment with clinical documents.
Liu, K; Chapman, W W; Savova, G; Chute, C G; Sioutos, N; Crowley, R S.
Affiliation
  • Liu K; Department of Biomedical Informatics, UPMC Cancer Pavilion, Suite 301, 5150 Centre Avenue, Pittsburgh, PA 15232, USA. kaihong@pitt.edu
Methods Inf Med ; 50(5): 397-407, 2011.
Article in En | MEDLINE | ID: mdl-21057720
ABSTRACT

OBJECTIVE:

To evaluate the effectiveness of a lexico-syntactic pattern (LSP) matching method for ontology enrichment using clinical documents.

METHODS:

Two domains were studied separately using the same methodology. We used radiology documents to enrich RadLex and pathology documents to enrich the National Cancer Institute Thesaurus (NCIT). Several known LSPs were used for semantic knowledge extraction. We first retrieved all sentences that contained LSPs across two large clinical repositories and examined the frequency of the LSPs. From this set, we randomly sampled LSP instances, which were examined by human judges. We used a two-step method to determine the utility of these patterns for enrichment. In the first step, domain experts annotated medically meaningful terms (MMTs) from each sentence within the LSP. In the second step, RadLex and NCIT curators evaluated how many of these MMTs could be added to the resource. To quantify the utility of this LSP method, we defined two evaluation metrics: suggestion rate (SR) and acceptance rate (AR). We used these measures to estimate the yield of concepts and relationships for each of the two domains.
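
As a rough illustration of what matching an LSP against a sentence can look like, the sketch below handles one widely known pattern, "X such as A, B and C". The abstract does not list the patterns or tooling actually used, so the regular expressions, the function name `match_such_as`, the `head_words` parameter, and the example sentence are all assumptions made for illustration; keeping the last few words before the pattern is only a crude stand-in for proper noun-phrase detection.

```python
import re

# Illustrative LSP (an assumption, not taken from the study):
#   "<general term>, such as <item>, <item> and <item>"
SUCH_AS = re.compile(r",?\s+such as\s+", re.IGNORECASE)
# Separators between listed items: commas, optionally followed by and/or,
# or a bare "and"/"or" surrounded by whitespace.
ITEM_SPLIT = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def match_such_as(sentence, head_words=2):
    """Return (general_term, [specific_terms]), or None if the LSP is absent."""
    parts = SUCH_AS.split(sentence.rstrip(". "), maxsplit=1)
    if len(parts) != 2:
        return None
    left, right = parts
    # Crude stand-in for noun-phrase chunking: keep the last few words.
    general = " ".join(left.split()[-head_words:])
    specifics = [t.strip() for t in ITEM_SPLIT.split(right) if t.strip()]
    return general, specifics


# Hypothetical sentence, written for this example only.
example = "The scan showed focal lesions such as cysts, hemangiomas and abscesses."
result = match_such_as(example)
```

Each returned pair is only a candidate: in the workflow described above, domain experts would still decide which terms are medically meaningful, and curators would decide whether they belong in the resource.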

RESULTS:

For NCIT, the concept SR was 24%, and the relationship SR was 65%. The concept AR was 21%, and the relationship AR was 14%. For RadLex, the concept SR was 37%, and the relationship SR was 55%. The concept AR was 11%, and the relationship AR was 44%.

CONCLUSION:

The LSP matching method is an effective method for concept and concept relationship discovery in biomedical domains.
Subject(s)

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Semantics / Medical Informatics / Artificial Intelligence / Learning / Terminology as Topic Type of study: Evaluation studies Limits: Humans Country/Region as subject: North America Language: En Journal: Methods Inf Med Year: 2011 Document type: Article Affiliation country: United States
