Machine learning methods to predict attrition in a population-based cohort of very preterm infants.
Teixeira, Raquel; Rodrigues, Carina; Moreira, Carla; Barros, Henrique; Camacho, Rui.
Affiliation
  • Teixeira R; EPIUnit - Instituto de Saúde Pública, Universidade do Porto, Rua das Taipas, nº 135, 4050-600, Porto, Portugal. raquel.teixeira@ispup.up.pt.
  • Rodrigues C; Laboratório para a Investigação Integrativa e Translacional em Saúde Populacional (ITR), Porto, Portugal. raquel.teixeira@ispup.up.pt.
  • Moreira C; EPIUnit - Instituto de Saúde Pública, Universidade do Porto, Rua das Taipas, nº 135, 4050-600, Porto, Portugal.
  • Barros H; Laboratório para a Investigação Integrativa e Translacional em Saúde Populacional (ITR), Porto, Portugal.
  • Camacho R; EPIUnit - Instituto de Saúde Pública, Universidade do Porto, Rua das Taipas, nº 135, 4050-600, Porto, Portugal.
Sci Rep; 12(1): 10587, 2022 06 22.
Article in English | MEDLINE | ID: mdl-35732850
ABSTRACT
The timely identification of cohort participants at higher risk of attrition is important for earlier intervention and efficient use of research resources. Machine learning may have advantages over conventional approaches in improving discrimination by analysing complex interactions among predictors. We developed predictive models of attrition by applying a conventional regression model and different machine learning methods. A total of 542 very preterm (< 32 gestational weeks) infants born in Portugal as part of the European Effective Perinatal Intensive Care in Europe (EPICE) cohort were included. We tested a model with a fixed number of predictors (Baseline) and a second with a dynamic number of variables added from each follow-up (Incremental). Eight classification methods were applied: AdaBoost, Artificial Neural Networks, Functional Trees, J48, J48Consolidated, K-Nearest Neighbours, Random Forest and Logistic Regression. Performance was compared using AUC-PR (Area Under the Precision-Recall Curve), Accuracy, Sensitivity and F-measure. Attrition at the four follow-ups was 16%, 25%, 13% and 17%, respectively. Both models demonstrated good predictive performance, with AUC-PR ranging from 69 to 94.1 in the Baseline model and from 72.5 to 97.1 in the Incremental model. Of the whole set of methods, Random Forest presented the best performance at all follow-ups [AUC-PR1 94.1 (2.0); AUC-PR2 91.2 (1.2); AUC-PR3 97.1 (1.0); AUC-PR4 96.5 (1.7)]. Logistic Regression performed well below Random Forest. The top-ranked predictors were common to both models at all follow-ups: birthweight, gestational age, maternal age, and length of hospital stay. Random Forest presented the highest capacity for prediction and provided interpretable predictors. Researchers involved in cohorts can benefit from our robust models to prepare for and prevent loss to follow-up by directing efforts toward individuals at higher risk.
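As an illustration of the four performance measures named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the function names, the toy labels and the toy scores are all hypothetical, and AUC-PR is approximated here by average precision (the mean of the precision values at each rank where a true dropout is retrieved), which may differ from the estimator used in the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # dropouts correctly flagged
    fp: int  # retained participants wrongly flagged
    fn: int  # dropouts missed
    tn: int  # retained participants correctly not flagged


def confusion(y_true, y_pred):
    """Count outcomes for binary labels (1 = lost to follow-up)."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
    )


def accuracy(c):
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def sensitivity(c):
    # Also called recall: share of true dropouts that were flagged.
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def precision(c):
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def f_measure(c):
    # Harmonic mean of precision and sensitivity (F1).
    p, r = precision(c), sensitivity(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def average_precision(y_true, scores):
    """Step-wise approximation of the area under the precision-recall curve.

    Assumption: tied scores are ranked in input order.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / hits if hits else 0.0


# Hypothetical toy data, not taken from the EPICE cohort.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
c = confusion(y_true, y_pred)
print(accuracy(c), sensitivity(c), f_measure(c))
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

With attrition rates between 13% and 25%, the classes are imbalanced, which is one common reason to prefer a precision-recall summary over accuracy alone when comparing classifiers.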
Subject(s)

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Infant, Premature / Machine Learning Study type: Etiology_studies / Incidence_studies / Observational_studies / Prognostic_studies / Risk_factors_studies Limits: Female / Humans / Newborn Language: En Journal: Sci Rep Year: 2022 Document type: Article Affiliation country: Portugal
