Generalizability of a Machine Learning Model for Improving Utilization of Parathyroid Hormone-Related Peptide Testing across Multiple Clinical Centers.
Yang, He S; Pan, Weishen; Wang, Yingheng; Zaydman, Mark A; Spies, Nicholas C; Zhao, Zhen; Guise, Theresa A; Meng, Qing H; Wang, Fei.
Affiliation
  • Yang HS; Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, United States.
  • Pan W; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States.
  • Wang Y; Department of Computer Science, Cornell University, Ithaca, NY, United States.
  • Zaydman MA; Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, United States.
  • Spies NC; Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, United States.
  • Zhao Z; Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, United States.
  • Guise TA; Department of Endocrine Neoplasia and Hormonal Disorders, Division of Internal Medicine, The University of Texas, MD Anderson, Houston, TX, United States.
  • Meng QH; Department of Laboratory Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, United States.
  • Wang F; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States.
Clin Chem; 69(11): 1260-1269, 2023 Nov 02.
Article in En | MEDLINE | ID: mdl-37738611
ABSTRACT

BACKGROUND:

Measuring parathyroid hormone-related peptide (PTHrP) helps diagnose humoral hypercalcemia of malignancy, but the test is often ordered for patients with low pretest probability, resulting in poor test utilization. Manually reviewing results to identify inappropriate PTHrP orders is cumbersome.

METHODS:

Using a dataset of 1330 patients from a single institution, we developed a machine learning (ML) model to predict abnormal PTHrP results. We then evaluated the model's performance on two external datasets. Different strategies (model transporting, retraining, rebuilding, and fine-tuning) were investigated to improve model generalizability. Maximum mean discrepancy (MMD) was adopted to quantify the shift in data distribution between datasets.
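As an illustration of how MMD can quantify distribution shift between two cohorts, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian (RBF) kernel and the biased estimator of squared MMD; the abstract does not state which kernel or estimator was used, and the function names and the `gamma` bandwidth parameter are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Squared Euclidean distance between every row of a and every row of b.
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def mmd_squared(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x and y.

    x and y are 2-D arrays (rows = patients, columns = features).
    The kernel choice and gamma are assumptions made for this sketch.
    """
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Synthetic data only: a reference cohort and two shifted cohorts.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 5))
small_shift = rng.normal(0.2, 1.0, size=(200, 5))
large_shift = rng.normal(2.0, 1.0, size=(200, 5))

mmd_small = mmd_squared(reference, small_shift, gamma=0.1)
mmd_large = mmd_squared(reference, large_shift, gamma=0.1)
```

In this synthetic setup the cohort with the larger mean shift yields the larger MMD, which mirrors how a larger MMD is read as a greater departure from the development dataset.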

RESULTS:

The model achieved an area under the receiver operating characteristic curve (AUROC) of 0.936 and a specificity of 0.842 at 0.900 sensitivity in the development cohort. Directly transporting this model to two external datasets reduced the AUROC to 0.838 and 0.737, respectively; the second dataset had the larger MMD, indicating a greater shift from the original data distribution. Model rebuilding using site-specific data improved AUROC to 0.891 and 0.837 on the two sites, respectively. When external data were insufficient for retraining, a fine-tuning strategy also improved model utility.
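The "specificity at 0.900 sensitivity" figure above can be read as follows: choose the highest score threshold at which the model still flags at least 90% of truly abnormal results, then report specificity at that threshold. A minimal sketch under that assumption is shown below; the function name and the toy data are hypothetical and are not taken from the study.

```python
import numpy as np

def specificity_at_sensitivity(y_true, scores, target_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) at the highest threshold
    whose sensitivity is at least target_sensitivity.

    y_true: 1 for an abnormal result, 0 for a normal result.
    scores: predicted probability (or score) of an abnormal result.
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    # Scan thresholds from highest to lowest; a case is predicted
    # abnormal when its score is >= the threshold.
    for threshold in np.sort(np.unique(scores))[::-1]:
        predicted = scores >= threshold
        sensitivity = (predicted & y_true).sum() / n_pos
        if sensitivity >= target_sensitivity:
            specificity = (~predicted & ~y_true).sum() / n_neg
            return float(threshold), float(sensitivity), float(specificity)
    raise ValueError("target sensitivity could not be reached")

# Toy data: 4 normal and 10 abnormal results with made-up scores.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
preds = [0.05, 0.15, 0.2, 0.32, 0.9, 0.8, 0.7, 0.6, 0.5, 0.45, 0.4, 0.35, 0.3, 0.1]
thr, sens, spec = specificity_at_sensitivity(labels, preds, 0.90)
```

Because specificity can only fall as the threshold is lowered, taking the first threshold that meets the sensitivity target gives the best specificity available at that sensitivity.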

CONCLUSIONS:

ML offers promise to improve PTHrP test utilization while relieving the burden of manual review. Transporting a ready-made model to external datasets may lead to performance deterioration due to data distribution shift. Model retraining or rebuilding could improve generalizability when enough data are available, and model fine-tuning may be preferable when site-specific data are limited.
Subject(s)

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Hypercalcemia / Neoplasms Type of study: Guideline / Prognostic_studies Limits: Humans Language: En Journal: Clin Chem Journal subject: Clinical Chemistry Year: 2023 Document type: Article Affiliation country: United States