Towards quality improvement of vaccine concept mappings in the OMOP vocabulary with a semi-automated method.
Abeysinghe, Rashmie; Black, Adam; Kaduk, Denys; Li, Yupeng; Reich, Christian; Davydov, Alexander; Yao, Lixia; Cui, Licong.
Affiliation
  • Abeysinghe R; Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Black A; Odysseus Data Services, Cambridge, MA, USA.
  • Kaduk D; Odysseus Data Services, Cambridge, MA, USA.
  • Li Y; Merck & Co., Inc., Rahway, NJ, USA.
  • Reich C; IQVIA, Cambridge, MA, USA; Observational Health Data Sciences and Informatics (OHDSI), New York, NY, USA.
  • Davydov A; Odysseus Data Services, Cambridge, MA, USA.
  • Yao L; Merck & Co., Inc., Rahway, NJ, USA. Electronic address: lixia.yao@merck.com.
  • Cui L; School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA. Electronic address: licong.cui@uth.tmc.edu.
J Biomed Inform; 134: 104162, 2022 Oct.
Article in En | MEDLINE | ID: mdl-36029954
The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) provides a unified model to integrate disparate real-world data (RWD) sources. An integral part of the OMOP CDM is the Standardized Vocabularies (henceforth referred to as the OMOP vocabulary), which enables organization and standardization of medical concepts across various clinical domains of the OMOP CDM. For concepts with the same meaning from different source vocabularies, one is designated as the standard concept, while the others are specified as non-standard or source concepts and mapped to the standard one. However, due to the heterogeneity of source vocabularies, the OMOP vocabulary may contain mapping issues such as erroneous mappings and missing mappings, which could affect the results of downstream analyses with RWD. In this paper, we focus on quality assurance of vaccine concept mappings in the OMOP vocabulary, which is necessary to accurately harness the power of RWD on vaccines. We introduce a semi-automated lexical approach to audit vaccine mappings in the OMOP vocabulary. We generated two types of vaccine-pairs: mapped and unmapped, where mapped vaccine-pairs are pairs of vaccine concepts with a "Maps to" relationship, while unmapped vaccine-pairs are those without a "Maps to" relationship. We represented each vaccine concept name as a set of words, and derived term-difference pairs (i.e., name differences) for mapped and unmapped vaccine-pairs. If the same term-difference pair can be obtained from both mapped and unmapped vaccine-pairs, then this is considered a potential mapping inconsistency. Applying this approach to the vaccine mappings in OMOP, a total of 2087 potential mapping inconsistencies were obtained. A random sample of 200 was evaluated by domain experts to identify, validate, and categorize the inconsistencies. Experts identified 95 cases revealing valid mapping issues.
The remaining 105 cases were found to be invalid because the mappings relied on external and/or contextual information that was not reflected in the concept names of the vaccines. This indicates that our semi-automated approach shows promise in identifying mapping inconsistencies among vaccine concepts in the OMOP vocabulary.
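The lexical idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the choice to treat a term-difference pair as unordered, and the example concept names are all assumptions made for illustration. The abstract only states that names are represented as sets of words and that a term-difference pair shared by mapped and unmapped vaccine-pairs signals a potential inconsistency.

```python
from collections import defaultdict


def words(name):
    """Represent a concept name as a set of lowercase words (assumed tokenization)."""
    return frozenset(name.lower().split())


def term_difference(name_a, name_b):
    """Return the pair of word sets unique to each name.

    The pair is stored as an unordered frozenset; this is an assumption,
    since the abstract does not say whether order matters.
    """
    a, b = words(name_a), words(name_b)
    return frozenset({a - b, b - a})


def group_by_difference(pairs):
    """Group vaccine-pairs by their term-difference pair."""
    groups = defaultdict(list)
    for name_a, name_b in pairs:
        groups[term_difference(name_a, name_b)].append((name_a, name_b))
    return groups


def find_potential_inconsistencies(mapped_pairs, unmapped_pairs):
    """Return term-difference pairs produced by both mapped and unmapped pairs."""
    mapped = group_by_difference(mapped_pairs)
    unmapped = group_by_difference(unmapped_pairs)
    return {
        diff: (mapped[diff], unmapped[diff])
        for diff in mapped.keys() & unmapped.keys()
    }


# Hypothetical concept names, invented for this example (not taken from OMOP).
mapped_pairs = [("hepatitis b vaccine injection", "hepatitis b vaccine")]
unmapped_pairs = [
    ("influenza vaccine injection", "influenza vaccine"),
    ("measles vaccine", "mumps vaccine"),
]

flagged = find_potential_inconsistencies(mapped_pairs, unmapped_pairs)
for diff, (mapped_examples, unmapped_examples) in flagged.items():
    print(sorted(sorted(s) for s in diff), mapped_examples, unmapped_examples)
```

In this toy input, the difference "injection" versus nothing arises from one mapped and one unmapped pair, so it is flagged for review. As the abstract notes, a flagged case still needs expert judgment, because a mapping may rest on information that does not appear in the concept names.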

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Vocabulary / Vaccines Language: En Journal: J Biomed Inform Journal subject: INFORMATICA MEDICA Year: 2022 Document type: Article Affiliation country: United States Country of publication: United States