Bias of Inaccurate Disease Mentions in Electronic Health Record-based Phenotyping.

Kagawa, Rina; Shinohara, Emiko; Imai, Takeshi; Kawazoe, Yoshimasa; Ohe, Kazuhiko

Affiliation

Kagawa R; Department of Medical Informatics, Strategic Planning, and Management, University of Tsukuba Hospital, Japan; Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Japan. Electronic address: kagawa-r@md.tsukuba.ac.jp.
Shinohara E; Department of Artificial Intelligence in Healthcare, Graduate School of Medicine, The University of Tokyo, Japan.
Imai T; Center for Disease Biology and Integrative Medicine, Graduate School of Medicine, The University of Tokyo, Japan.
Kawazoe Y; Department of Artificial Intelligence in Healthcare, Graduate School of Medicine, The University of Tokyo, Japan.
Ohe K; Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Japan.

Int J Med Inform; 124: 90-96, 2019 Apr.

Article in En | MEDLINE | ID: mdl-30784432

ABSTRACT

OBJECTIVES: Electronic health record (EHR)-based phenotyping is an automated technique for identifying patients diagnosed with a particular disease using EHR data. However, EHR-based phenotyping struggles to achieve satisfactorily high performance because clinical notes include disease mentions that ultimately signify something other than the patient's diagnosis (such as a differential diagnosis or screening). Our objective is to quantify the influence of such disease mentions on EHR-based phenotyping performance.

METHODS: Physicians manually reviewed whether the disease mentions indicated the patients' diseases in 487,300 clinical notes of 4,430 patients. Particular focus was placed on disease mentions that did not signify the patient's diagnosis even though they had no syntactic modifier or indicator in the same sentence. Patients were then classified according to whether their clinical notes included such disease mentions.

RESULTS: Among the patients whose clinical notes included disease mentions without any modifier or indicator, the proportion whose disease mentions signified the patient's diagnosis was 78.1% on average. This value can be interpreted as the bias that disease mentions not signifying the patient's diagnosis introduce into the precision of EHR-based phenotyping that relies on extracting disease mentions from clinical notes.

CONCLUSION: This study quantified, for four dataset types, the bias in the precision of EHR-based phenotyping that arises from disease mentions that incorrectly signify a patient's diagnosis. The results will help researchers in diverse research environments with different available data types.
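The proportion reported in the results is a precision-style ratio: of the patients flagged by an unmodified disease mention, how many were confirmed by physician review. The sketch below illustrates that calculation only. The `Patient` record, its field names, the `mention_precision` helper, and the toy data are all hypothetical and are not taken from the study; the abstract reports only the aggregate figure of 78.1%.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Patient:
    # Hypothetical record layout, assumed for illustration only.
    patient_id: str
    has_unmodified_mention: bool  # notes contain a disease mention with no modifier or indicator
    physician_confirmed: bool     # physician review says the mention reflects the diagnosis


def mention_precision(patients: Iterable[Patient]) -> Optional[float]:
    """Share of mention-flagged patients whose diagnosis was confirmed.

    Returns None when no patient has an unmodified disease mention,
    because the ratio is undefined in that case.
    """
    flagged = [p for p in patients if p.has_unmodified_mention]
    if not flagged:
        return None
    confirmed = sum(1 for p in flagged if p.physician_confirmed)
    return confirmed / len(flagged)


# Toy data (invented, not study data): four flagged patients, three confirmed.
toy_patients = [
    Patient("p1", True, True),
    Patient("p2", True, True),
    Patient("p3", True, False),
    Patient("p4", True, True),
    Patient("p5", False, False),  # not flagged, so excluded from the ratio
]

precision = mention_precision(toy_patients)
print(precision)
```

Under this reading, the gap between the computed ratio and 1.0 is the share of flagged patients whose mention did not reflect an actual diagnosis.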

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Diagnosis / Electronic Health Records
Type of study: Diagnostic_studies / Prognostic_studies / Sysrev_observational_studies
Limits: Humans
Language: En
Journal: Int J Med Inform
Journal subject: Medical Informatics
Year: 2019
Document type: Article
