Development and validation of interpretable machine learning models to predict glomerular filtration rate in chronic kidney disease Colombian patients.

Rojas, Luis H; Pereira-Morales, Angela J; Amador, William; Montenegro, Albert; Buelvas, Walberto; de la Espriella, Víctor

Affiliation

Rojas LH; Science for Life - S4L SAS, Bogotá, Colombia.
Pereira-Morales AJ; Science for Life - S4L SAS, Bogotá, Colombia.
Amador W; Science for Life - S4L SAS, Bogotá, Colombia.
Montenegro A; Science for Life - S4L SAS, Bogotá, Colombia.
Buelvas W; Medisinú IPS, Monteria, Colombia.
de la Espriella V; Medisinú IPS, Monteria, Colombia.

Ann Clin Biochem ; : 45632241285528, 2024 Sep 21.

Article in En | MEDLINE | ID: mdl-39242084

ABSTRACT

BACKGROUND:

Machine learning (ML) predictive models have shown that they can improve risk prediction and assist medical decision-making. Nevertheless, accurate systems for the early identification of future rapid CKD progressors are lacking in Colombia and in South America more broadly.

OBJECTIVE:

The purpose of this study was to develop a series of interpretable machine learning models that predict GFR at 6, 9, and 12 months.

STUDY DESIGN AND SETTING:

Over 29,000 patients with CKD stages 1 to 3b (estimated GFR, <60 mL/min/1.73 m²) and an average of 3 years of follow-up data were included. We used extreme gradient boosting (XGBoost), a machine learning algorithm, to build three models to predict the next eGFR value. The models were internally and externally validated. In addition, we used SHapley Additive exPlanations (SHAP) values to make the models' predictions interpretable at both the global and local level.
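To illustrate what a local SHAP-style explanation provides, the following is a minimal sketch that computes exact Shapley values for a single prediction. It is not the authors' pipeline: the predictor `toy_predict`, its coefficients, the feature names, and the baseline values are all hypothetical, and absent features are simply replaced by a fixed baseline value, which is a simplification of how SHAP estimates contributions for tree models such as XGBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition take their baseline value (a simplifying
    assumption; SHAP itself averages over background data).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_predict(features):
    """Hypothetical stand-in for a trained model (made-up coefficients)."""
    prev_egfr, creatinine, age = features
    return (0.9 * prev_egfr - 5.0 * creatinine - 0.05 * age
            + 0.01 * prev_egfr * creatinine)


# Made-up patient and reference values: [previous eGFR, creatinine, age]
x = [55.0, 1.4, 67.0]
baseline = [60.0, 1.2, 60.0]

phi = shapley_values(toy_predict, x, baseline)
for name, value in zip(["previous eGFR", "creatinine", "age"], phi):
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the additive property that lets each feature be read as pushing the predicted eGFR up or down.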

RESULTS:

All models performed well in development and external validation. The 6-month XGBoost prediction model performed best in both internal validation (average MAE = 6.07; RMSE = 78.87) and external validation (average MAE = 6.45; RMSE = 18.94). The three most influential features pushing the predicted eGFR toward lower values were the interpolated values for eGFR and creatinine, and eGFR at baseline.
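For readers unfamiliar with the two error metrics reported above, the sketch below shows how mean absolute error (MAE) and root mean squared error (RMSE) are computed from observed and predicted values. The eGFR numbers are made up for illustration and are not taken from the study.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up eGFR values in mL/min/1.73 m² (illustrative only)
observed = [52.0, 47.0, 61.0, 38.0]
predicted = [50.0, 50.0, 58.0, 42.0]

print(f"MAE  = {mae(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.2f}")
```

Both metrics are expressed in the same units as eGFR, and RMSE is never smaller than MAE for the same set of predictions.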

CONCLUSION:

In this study, we developed and validated machine learning models to predict the next eGFR value at different intervals. Furthermore, we sought to address the need for prediction explanation by offering transparent predictions.

Key words

Machine learning; chronic kidney disease; extreme gradient boosting; risk prediction


Full text: 1 Collection: 01-internacional Database: MEDLINE Country/Region as subject: South America / Colombia Language: En Journal: Ann Clin Biochem Year: 2024 Document type: Article
