Search | Nursing VHL Search Portal

1.

Does Reinforcement Learning Improve Outcomes for Critically Ill Patients? A Systematic Review and Level-of-Readiness Assessment.

Otten, Martijn; Jagesar, Ameet R; Dam, Tariq A; Biesheuvel, Laurens A; den Hengst, Floris; Ziesemer, Kirsten A; Thoral, Patrick J; de Grooth, Harm-Jan; Girbes, Armand R J; François-Lavet, Vincent; Hoogendoorn, Mark; Elbers, Paul W G.

Crit Care Med ; 52(2): e79-e88, 2024 02 01.

Article in English | MEDLINE | ID: mdl-37938042

ABSTRACT

OBJECTIVE: Reinforcement learning (RL) is a machine learning technique uniquely effective at sequential decision-making, which makes it potentially relevant to ICU treatment challenges. We set out to systematically review, assess level-of-readiness and meta-analyze the effect of RL on outcomes for critically ill patients. DATA SOURCES: A systematic search was performed in PubMed, Embase.com, Clarivate Analytics/Web of Science Core Collection, Elsevier/SCOPUS and the Institute of Electrical and Electronics Engineers Xplore Digital Library from inception to March 25, 2022, with subsequent citation tracking. DATA EXTRACTION: Journal articles that used an RL technique in an ICU population and reported on patient health-related outcomes were included for full analysis. Conference papers were included for level-of-readiness assessment only. Descriptive statistics, characteristics of the models, outcome compared with clinician's policy and level-of-readiness were collected. RL-health risk of bias and applicability assessment was performed. DATA SYNTHESIS: A total of 1,033 articles were screened, of which 18 journal articles and 18 conference papers were included. Thirty of those were prototyping or modeling articles and six were validation articles. All articles reported RL algorithms to outperform clinical decision-making by ICU professionals, but only in retrospective data. The modeling techniques for the state-space, action-space, reward function, RL model training, and evaluation varied widely. The risk of bias was high in all articles, mainly due to the evaluation procedure. CONCLUSION: In this first systematic review on the application of RL in intensive care medicine, we found no studies that demonstrated improved patient outcomes from RL-based technologies. All studies reported that RL-agent policies outperformed clinician policies, but such assessments were all based on retrospective off-policy evaluation.

Subject(s)

Critical Care , Critical Illness , Humans , Critical Illness/therapy , Retrospective Studies

2.

Evolution of Clinical Phenotypes of COVID-19 Patients During Intensive Care Treatment: An Unsupervised Machine Learning Analysis.

Siepel, Sander; Dam, Tariq A; Fleuren, Lucas M; Girbes, Armand R J; Hoogendoorn, Mark; Thoral, Patrick J; Elbers, Paul W G; Bennis, Frank C.

J Intensive Care Med ; 38(7): 612-629, 2023 Jul.

Article in English | MEDLINE | ID: mdl-36744415

ABSTRACT

BACKGROUND: Identification of clinical phenotypes in critically ill COVID-19 patients could improve understanding of the disease heterogeneity and enable prognostic and predictive enrichment. However, previous attempts did not take into account temporal dynamics with high granularity. By including the dimension of time, we aim to gain further insights into the heterogeneity of COVID-19. METHODS: We used granular data from 3202 adult COVID-19 patients in the Dutch Data Warehouse who were admitted to one of 25 Dutch ICUs between February 2020 and March 2021. Parameters including demographics, clinical observations, medications, laboratory values, vital signs, and data from life support devices were selected. Twenty-one datasets were created that each covered 24 h of ICU data for each day of ICU treatment. Clinical phenotypes in each dataset were identified by performing cluster analyses. Both evolution of the clinical phenotypes over time and patient allocation to these clusters over time were tracked. RESULTS: The final patient cohort consisted of 2438 COVID-19 patients with an ICU mortality outcome. Forty-one parameters were chosen for cluster analysis. On admission, both a mild and a severe clinical phenotype were found. After day 4, the severe phenotype split into an intermediate and a severe phenotype for 11 consecutive days. Heterogeneity between phenotypes appears to be driven by inflammation and dead space ventilation. During the 21-day period, only 8.2% and 4.6% of patients in the initial mild and severe clusters remained assigned to the same phenotype respectively. The clinical phenotype half-life was between 5 and 6 days for the mild and severe phenotypes, and about 3 days for the intermediate phenotype. CONCLUSIONS: Patients typically do not remain in the same cluster throughout intensive care treatment. This may have important implications for prognostic or predictive enrichment.
Prominent dissimilarities between clinical phenotypes are predominantly driven by inflammation and dead space ventilation.

Subject(s)

COVID-19 , Humans , COVID-19/therapy , SARS-CoV-2 , Unsupervised Machine Learning , Critical Care , Intensive Care Units , Inflammation , Phenotype , Critical Illness/therapy

3.

A Framework for Applying Natural Language Processing in Digital Health Interventions.

Funk, Burkhardt; Sadeh-Sharvit, Shiri; Fitzsimmons-Craft, Ellen E; Trockel, Mickey Todd; Monterubio, Grace E; Goel, Neha J; Balantekin, Katherine N; Eichen, Dawn M; Flatt, Rachael E; Firebaugh, Marie-Laure; Jacobi, Corinna; Graham, Andrea K; Hoogendoorn, Mark; Wilfley, Denise E; Taylor, C Barr.

J Med Internet Res ; 22(2): e13855, 2020 02 19.

Article in English | MEDLINE | ID: mdl-32130118

ABSTRACT

BACKGROUND: Digital health interventions (DHIs) are poised to reduce target symptoms in a scalable, affordable, and empirically supported way. DHIs that involve coaching or clinical support often collect text data from 2 sources: (1) open correspondence between users and the trained practitioners supporting them through a messaging system and (2) text data recorded during the intervention by users, such as diary entries. Natural language processing (NLP) offers methods for analyzing text, augmenting the understanding of intervention effects, and informing therapeutic decision making. OBJECTIVE: This study aimed to present a technical framework that supports the automated analysis of both types of text data often present in DHIs. This framework generates text features and helps to build statistical models to predict target variables, including user engagement, symptom change, and therapeutic outcomes. METHODS: We first discussed various NLP techniques and demonstrated how they are implemented in the presented framework. We then applied the framework in a case study of the Healthy Body Image Program, a Web-based intervention trial for eating disorders (EDs). A total of 372 participants who screened positive for an ED received a DHI aimed at reducing ED psychopathology (including binge eating and purging behaviors) and improving body image. These users generated 37,228 intervention text snippets and exchanged 4285 user-coach messages, which were analyzed using the proposed model. RESULTS: We applied the framework to predict binge eating behavior, resulting in an area under the curve between 0.57 (when applied to new users) and 0.72 (when applied to new symptom reports of known users). In addition, initial evidence indicated that specific text features predicted the therapeutic outcome of reducing ED symptoms. CONCLUSIONS: The case study demonstrates the usefulness of a structured approach to text data analytics. 
NLP techniques improve the prediction of symptom changes in DHIs. We present a technical framework that can be easily applied in other clinical trials and clinical presentations and encourage other groups to apply the framework in similar contexts.

Subject(s)

Health Promotion/methods , Natural Language Processing , Telemedicine/methods , Female , Humans , Male

4.

Predicting Therapy Success and Costs for Personalized Treatment Recommendations Using Baseline Characteristics: Data-Driven Analysis.

Bremer, Vincent; Becker, Dennis; Kolovos, Spyros; Funk, Burkhardt; van Breda, Ward; Hoogendoorn, Mark; Riper, Heleen.

J Med Internet Res ; 20(8): e10275, 2018 08 21.

Article in English | MEDLINE | ID: mdl-30131318

ABSTRACT

BACKGROUND: Different treatment alternatives exist for psychological disorders. Both clinical and cost effectiveness of treatment are crucial aspects for policy makers, therapists, and patients and thus play major roles for healthcare decision-making. At the start of an intervention, it is often not clear which specific individuals benefit most from a particular intervention alternative or how costs will be distributed on an individual patient level. OBJECTIVE: This study aimed at predicting the individual outcome and costs for patients before the start of an internet-based intervention. Based on these predictions, individualized treatment recommendations can be provided. Thus, we expand the discussion of personalized treatment recommendation. METHODS: Outcomes and costs were predicted based on baseline data of 350 patients from a two-arm randomized controlled trial that compared treatment as usual and blended therapy for depressive disorders. For this purpose, we evaluated various machine learning techniques, compared the predictive accuracy of these techniques, and revealed features that contributed most to the prediction performance. We then combined these predictions and utilized an incremental cost-effectiveness ratio in order to derive individual treatment recommendations before the start of treatment. RESULTS: Predicting clinical outcomes and costs is a challenging task that comes with high uncertainty when only utilizing baseline information. However, we were able to generate predictions that were more accurate than a predefined reference measure in the shape of mean outcome and cost values. Questionnaires that include anxiety or depression items and questions regarding the mobility of individuals and their energy levels contributed to the prediction performance. We then described how patients can be individually allocated to the most appropriate treatment type. 
For an incremental cost-effectiveness threshold of €25,000/quality-adjusted life year, we demonstrated that our recommendations would have led to slightly worse outcomes (1.98%), but with decreased cost (5.42%). CONCLUSIONS: Our results indicate that it was feasible to provide personalized treatment recommendations at baseline and thus allocate patients to the most beneficial treatment type. This could potentially lead to improved decision-making, better outcomes for individuals, and reduced health care costs.
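As a rough illustration of how predicted outcomes and costs can be combined with a cost-effectiveness threshold, the sketch below computes an incremental cost-effectiveness ratio (ICER) and applies a net-monetary-benefit decision rule. The names `Prediction`, `icer`, and `recommend`, the example figures, and the choice of the net-monetary-benefit rule are all assumptions made for illustration; the abstract does not specify the exact allocation rule used in the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prediction:
    """Hypothetical per-patient prediction for one treatment arm."""
    effect: float  # predicted health outcome (e.g. QALYs); higher is better
    cost: float    # predicted cost, in the same currency as the threshold


def icer(alternative: Prediction, usual: Prediction) -> Optional[float]:
    """Incremental cost per unit of incremental effect.

    Returns None when the two arms have the same predicted effect,
    because the ratio is undefined in that case.
    """
    delta_effect = alternative.effect - usual.effect
    delta_cost = alternative.cost - usual.cost
    if delta_effect == 0:
        return None
    return delta_cost / delta_effect


def recommend(alternative: Prediction, usual: Prediction, threshold: float) -> str:
    """Pick an arm using incremental net monetary benefit (an assumed rule).

    The alternative is recommended when threshold * delta_effect - delta_cost
    is positive, which avoids sign ambiguities of the raw ratio.
    """
    delta_effect = alternative.effect - usual.effect
    delta_cost = alternative.cost - usual.cost
    net_benefit = threshold * delta_effect - delta_cost
    return "alternative" if net_benefit > 0 else "usual"


# Made-up numbers for one patient, not taken from the study.
usual_care = Prediction(effect=0.5, cost=4000.0)
blended = Prediction(effect=0.75, cost=9000.0)
print(icer(blended, usual_care))                # incremental cost per unit effect
print(recommend(blended, usual_care, 25000.0))  # decision at a 25,000 threshold
```

Working with net monetary benefit rather than comparing the ratio to the threshold directly is a design choice: the ratio alone cannot distinguish "cheaper and better" from "costlier and worse", whereas the benefit's sign can.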

Subject(s)

Cost-Benefit Analysis/methods , Health Care Costs/trends , Machine Learning/trends , Female , Humans , Male , Surveys and Questionnaires , Treatment Outcome

5.

The added value of temporal data and the best way to handle it: A use-case for atrial fibrillation using general practitioner data.

Bennis, Frank C; Aussems, Claire; Korevaar, Joke C; Hoogendoorn, Mark.

Comput Biol Med ; 171: 108097, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38412689

ABSTRACT

INTRODUCTION: Temporal data has numerous challenges for deep learning such as irregularity of sampling. New algorithms are being developed that can handle these temporal challenges better. However, it is unclear how the performance ranges from classical non-temporal models to newly developed algorithms. Therefore, this study compares different non-temporal and temporal algorithms for a relevant use case, the prediction of atrial fibrillation (AF) using general practitioner (GP) data. METHODS: Three datasets with a 365-day observation window and prediction windows of 14, 180 and 360 days were used. Data consisted of medication, lab, symptom, and chronic diseases codings registered by the GP. The benchmark discarded temporality and used logistic regression, XGBoost models and neural networks on the presence of codings over the whole year. The pattern approach extracted common patterns of GP codings, which were tested using the same algorithms. LSTM and CKConv models were trained as models incorporating temporality. RESULTS: Algorithms which incorporated temporality (LSTM and CKConv; max AUC 0.734 at the 360-day prediction window) outperformed both benchmark and pattern algorithms (max AUC 0.723), with a significant improvement using the 360-day prediction window (p = 0.04). The difference between the benchmark and the LSTM or CKConv algorithm decreased with smaller prediction windows, indicating temporal importance for longer prediction windows. The CKConv and LSTM algorithms performed similarly, possibly due to limited sequence length. CONCLUSION: Temporal models outperformed non-temporal models for the prediction of AF. For temporal models, CKConv is a promising algorithm to handle temporal data using GP data as it can handle irregular data.

Subject(s)

Atrial Fibrillation , General Practitioners , Humans , Atrial Fibrillation/diagnosis , Neural Networks, Computer , Algorithms , Logistic Models

6.

Guideline-informed reinforcement learning for mechanical ventilation in critical care.

den Hengst, Floris; Otten, Martijn; Elbers, Paul; van Harmelen, Frank; François-Lavet, Vincent; Hoogendoorn, Mark.

Artif Intell Med ; 147: 102742, 2024 01.

Article in English | MEDLINE | ID: mdl-38184349

ABSTRACT

Reinforcement Learning (RL) has recently found many applications in the healthcare domain thanks to its natural fit to clinical decision-making and ability to learn optimal decisions from observational data. A key challenge in adopting RL-based solutions in clinical practice, however, is the inclusion of existing knowledge in learning a suitable solution. Existing knowledge from e.g. medical guidelines may improve the safety of solutions, produce a better balance between short- and long-term outcomes for patients and increase trust and adoption by clinicians. We present a framework for including knowledge available from medical guidelines in RL. The framework includes components for enforcing safety constraints and an approach that alters the learning signal to better balance short- and long-term outcomes based on these guidelines. We evaluate the framework by extending an existing RL-based mechanical ventilation (MV) approach with clinically established ventilation guidelines. Results from off-policy policy evaluation indicate that our approach has the potential to decrease 90-day mortality while ensuring lung protective ventilation. This framework provides an important stepping stone towards implementations of RL in clinical practice and opens up several avenues for further research.

Subject(s)

Learning , Respiration, Artificial , Humans , Reinforcement, Psychology , Critical Care , Clinical Decision-Making

7.

Reinforcement learning for intensive care medicine: actionable clinical insights from novel approaches to reward shaping and off-policy model evaluation.

Roggeveen, Luca F; Hassouni, Ali El; de Grooth, Harm-Jan; Girbes, Armand R J; Hoogendoorn, Mark; Elbers, Paul W G.

Intensive Care Med Exp ; 12(1): 32, 2024 Mar 25.

Article in English | MEDLINE | ID: mdl-38526681

ABSTRACT

BACKGROUND: Reinforcement learning (RL) holds great promise for intensive care medicine given the abundant availability of data and frequent sequential decision-making. But despite the emergence of promising algorithms, RL-driven bedside clinical decision support is still far from reality. Major challenges include trust and safety. To help address these issues, we introduce cross off-policy evaluation and policy restriction and show how detailed policy analysis may increase clinical interpretability. As an example, we apply these in the setting of RL to optimise ventilator settings in intubated COVID-19 patients. METHODS: With data from the Dutch ICU Data Warehouse and using an exhaustive hyperparameter grid search, we identified an optimal set of Dueling Double-Deep Q Network RL models. The state space comprised ventilator, medication, and clinical data. The action space focused on positive end-expiratory pressure (PEEP) and fraction of inspired oxygen (FiO2) concentration. We used gas exchange indices as interim rewards, and mortality and state duration as final rewards. We designed a novel evaluation method called cross off-policy evaluation (OPE) to assess the efficacy of models under varying weightings between the interim and terminal reward components. In addition, we implemented policy restriction to prevent potentially hazardous model actions. We introduce delta-Q to compare physician versus policy action quality and in-depth policy inspection using visualisations. RESULTS: We created trajectories for 1118 intensive care unit (ICU) admissions and trained 69,120 models using 8 model architectures with 128 hyperparameter combinations. For each model, policy restrictions were applied. In the first evaluation step, 17,182/138,240 policies had good performance, but cross-OPE revealed suboptimal performance for 44% of those by varying the reward function used for evaluation.
Clinical policy inspection facilitated assessment of action decisions for individual patients, including identification of action space regions that may benefit most from optimisation. CONCLUSION: Cross-OPE can serve as a robust evaluation framework for safe RL model implementation by identifying policies with good generalisability. Policy restriction helps prevent potentially unsafe model recommendations. Finally, the novel delta-Q metric can be used to operationalise RL models in clinical practice. Our findings offer a promising pathway towards application of RL in intensive care medicine and beyond.
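The idea of policy restriction and of comparing policy and physician actions by their Q-values can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the (PEEP, FiO2) action encoding, the Q-values, and the placeholder restriction rule are all hypothetical, and the definition of delta-Q as a simple Q-value difference is an assumption, since the abstract does not define it. The restriction rule has no clinical meaning.

```python
from typing import Callable, Dict, Tuple

Action = Tuple[int, int]  # hypothetical encoding: (PEEP level, FiO2 percent)


def restricted_greedy_action(
    q_values: Dict[Action, float],
    is_allowed: Callable[[Action], bool],
) -> Action:
    """Return the highest-valued action among those the restriction permits."""
    allowed = {action: q for action, q in q_values.items() if is_allowed(action)}
    if not allowed:
        raise ValueError("the restriction rules out every available action")
    return max(allowed, key=allowed.get)


def delta_q(
    q_values: Dict[Action, float],
    policy_action: Action,
    physician_action: Action,
) -> float:
    """Assumed definition: Q(state, policy action) - Q(state, physician action)."""
    return q_values[policy_action] - q_values[physician_action]


# Made-up Q-values for a single patient state.
q_for_state = {(5, 40): 0.25, (10, 50): 0.5, (20, 100): 1.0}


def placeholder_rule(action: Action) -> bool:
    """Illustrative restriction only: cap the first action component at 15."""
    return action[0] <= 15


chosen = restricted_greedy_action(q_for_state, placeholder_rule)
print(chosen)                                # best action that passes the rule
print(delta_q(q_for_state, chosen, (5, 40)))  # gap to the physician's action
```

Note that the unrestricted greedy choice here would be the highest-valued action overall; the restriction filters the candidate set before the maximum is taken, so a high Q-value alone never makes a disallowed action eligible.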

8.

The prevalence of non-pharmacological interventions in older homecare recipients: an overview from six European countries.

Kooijmans, Eline C M; Hoogendijk, Emiel O; Pokladníková, Jitka; Smalbil, Louk; Szczerbinska, Katarzyna; Baranska, Ilona; Ziuziakowska, Adrianna; Fialová, Daniela; Onder, Graziano; Declercq, Anja; Finne-Soveri, Harriet; Hoogendoorn, Mark; van Hout, Hein P J; Joling, Karlijn J.

Eur Geriatr Med ; 15(1): 243-252, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37792242

ABSTRACT

PURPOSE: Non-pharmacological interventions (NPIs) play an important role in the management of older people receiving homecare. However, little is known about how often specific NPIs are being used and to what extent usage varies between countries. The aim of the current study was to investigate the prevalence of NPIs in older homecare recipients in six European countries. METHODS: This is a cross-sectional study of older homecare recipients (65+) using baseline data from the longitudinal cohort study 'Identifying best practices for care-dependent elderly by Benchmarking Costs and outcomes of community care' (IBenC). The analyzed NPIs are based on the interRAI Home Care instrument, a comprehensive geriatric assessment instrument. The prevalence of 24 NPIs was analyzed in Belgium, Germany, Finland, Iceland, Italy and the Netherlands. NPIs from seven groups were considered: psychosocial interventions, physical activity, regular care interventions, special therapies, preventive measures, special aids and environmental interventions. RESULTS: A total of 2884 homecare recipients were included. The mean age at baseline was 82.9 years and of all participants, 66.9% were female. The intervention with the highest prevalence in the study sample was 'emergency assistance available' (74%). Two other highly prevalent interventions were 'physical activity' (69%) and 'home nurse' (62%). Large differences between countries in the use of NPIs were observed and included, for example, 'going outside' (range 7-82%), 'home health aids' (range 12-93%), and 'physician visit' (range 24-94%). CONCLUSIONS: The use of NPIs varied considerably between homecare users in different European countries. It is important to better understand the barriers and facilitators of use of these potentially beneficial interventions in order to design successful uptake strategies.

Subject(s)

Longitudinal Studies , Humans , Female , Aged , Male , Prevalence , Cross-Sectional Studies , Europe/epidemiology , Cohort Studies

9.

Use of Machine-Learning Algorithms Based on Text, Audio and Video Data in the Prediction of Anxiety and Post-Traumatic Stress in General and Clinical Populations: A Systematic Review.

Ciharova, Marketa; Amarti, Khadicha; van Breda, Ward; Peng, Xianhua; Lorente-Català, Rosa; Funk, Burkhardt; Hoogendoorn, Mark; Koutsouleris, Nikolaos; Fusar-Poli, Paolo; Karyotaki, Eirini; Cuijpers, Pim; Riper, Heleen.

Biol Psychiatry ; 2024 Jun 10.

Article in English | MEDLINE | ID: mdl-38866173

ABSTRACT

Research in machine-learning (ML) algorithms using natural behavior (i.e., text, audio, and video data) suggests that these techniques could contribute to personalization in psychology and psychiatry. However, a systematic review of the current state-of-the-art is missing. Moreover, individual studies often target ML experts, and may overlook potential clinical implications of their findings. In a narrative accessible to mental health professionals, we present a systematic review, conducted in 5 psychology and 2 computer-science databases. We included 128 studies assessing the predictive power of ML algorithms using text, audio, and/or video data in the prediction of anxiety and post-traumatic stress (PTSD). Most studies (n = 87) aimed at predicting anxiety, the remainder (n = 41) focused on PTSD. They were mostly published since 2019, in computer science journals, and tested algorithms using text (n = 72), as opposed to audio or video. They focused mainly on general populations (n = 92), less on laboratory experiments (n = 23) or clinical populations (n = 13). Methodological quality varied, as did reported metrics of the predictive power, hampering comparison across studies. Two thirds of studies, focusing on both disorders, reported acceptable to very good predictive power (including high-quality studies only). Results of 33 studies were uninterpretable, mainly due to missing information. Research into ML algorithms using natural behavior is in its infancy, but shows potential to contribute to diagnostics of mental disorders, such as anxiety and PTSD, in the future, if standardization of methods, reporting of results, and research in clinical populations are improved.

10.

What is the future of artificial intelligence in obstetrics? A qualitative study among healthcare professionals.

Fischer, Anne; Rietveld, Anna; Teunissen, Pim; Hoogendoorn, Mark; Bakker, Petra.

BMJ Open ; 13(10): e076017, 2023 10 24.

Article in English | MEDLINE | ID: mdl-37879682

ABSTRACT

OBJECTIVE: This work explores the perceptions of obstetrical clinicians about artificial intelligence (AI) in order to bridge the gap in uptake of AI between research and medical practice. Identifying potential areas where AI can contribute to clinical practice enables AI research to align with the needs of clinicians and ultimately patients. DESIGN: Qualitative interview study. SETTING: A national study conducted in the Netherlands between November 2022 and February 2023. PARTICIPANTS: Dutch clinicians working in obstetrics with varying relevant work experience, gender and age. ANALYSIS: Thematic analysis of qualitative interview transcripts. RESULTS: Thirteen gynaecologists were interviewed about hypothetical scenarios of an implemented AI model. Thematic analysis identified two major themes: perceived usefulness and trust. Usefulness involved AI extending human brain capacity in complex pattern recognition and information processing, reducing contextual influence and saving time. Trust required validation, explainability and successful personal experience. This result shows two paradoxes: first, AI is expected to provide added value by surpassing human capabilities, yet participants also expressed a need to understand the parameters and their influence on predictions as a condition for trust and adoption. Second, participants recognised the value of incorporating numerous parameters into a model, but they also believed that certain contextual factors should only be considered by humans, as it would be undesirable for AI models to use that information. CONCLUSIONS: Obstetricians' opinions on the potential value of AI highlight the need for clinician-AI researcher collaboration. Trust can be built through conventional means like randomised controlled trials and guidelines. Holistic impact metrics, such as changes in workflow, not just clinical outcomes, should guide AI model development. Further research is needed for evaluating evolving AI systems beyond traditional validation methods.

Subject(s)

Artificial Intelligence , Obstetrics , Female , Pregnancy , Humans , Health Personnel , Obstetricians , Delivery of Health Care

11.

Machine learning to improve false-positive results in the Dutch newborn screening for congenital hypothyroidism.

Stroek, Kevin; Visser, Allerdien; van der Ploeg, Catharina P B; Zwaveling-Soonawala, Nitash; Heijboer, Annemieke C; Bosch, Annet M; van Trotsenburg, A S Paul; Boelen, Anita; Hoogendoorn, Mark; de Jonge, Robert.

Clin Biochem ; 116: 7-10, 2023 Jun.

Article in English | MEDLINE | ID: mdl-36878346

ABSTRACT

OBJECTIVE: The Dutch Congenital hypothyroidism (CH) Newborn Screening (NBS) algorithm for thyroidal and central congenital hypothyroidism (CH-T and CH-C, respectively) is primarily based on determination of thyroxine (T4) concentrations in dried blood spots, followed by thyroid-stimulating hormone (TSH) and thyroxine-binding globulin (TBG) measurements enabling detection of both CH-T and CH-C, with a positive predictive value (PPV) of 21%. A calculated T4/TBG ratio serves as an indirect measure for free T4. The aim of this study is to investigate whether machine learning techniques can help to improve the PPV of the algorithm without missing the positive cases that should have been detected with the current algorithm. DESIGN & METHODS: NBS data and parameters of CH patients and false-positive referrals in the period 2007-2017 and of a healthy reference population were included in the study. A random forest model was trained and tested using a stratified split and improved using synthetic minority oversampling technique (SMOTE). NBS data of 4668 newborns were included, containing 458 CH-T and 82 CH-C patients, 2332 false-positive referrals and 1670 healthy newborns. RESULTS: Variables determining identification of CH were (in order of importance) TSH, T4/TBG ratio, gestational age, TBG, T4 and age at NBS sampling. In a Receiver-Operating Characteristic (ROC) analysis on the test set, current sensitivity could be maintained, while increasing the PPV to 26%. CONCLUSIONS: Machine learning techniques have the potential to improve the PPV of the Dutch CH NBS. However, improved detection of currently missed cases is only possible with new, better predictors of especially CH-C and a better registration and inclusion of these cases in future models.
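Because the abstract turns on raising the positive predictive value (PPV) while keeping sensitivity unchanged, a short worked example may help. The sketch below uses the standard definitions PPV = TP / (TP + FP) and sensitivity = TP / (TP + FN); the function names and the referral counts are made up, chosen only so that the resulting PPVs land near the 21% and 26% mentioned above, and they are not the study's data.

```python
def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: share of positive referrals that are true cases."""
    referred = true_positives + false_positives
    if referred == 0:
        raise ValueError("no positive referrals")
    return true_positives / referred


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of true cases that the screening detects."""
    cases = true_positives + false_negatives
    if cases == 0:
        raise ValueError("no true cases")
    return true_positives / cases


# Hypothetical counts per 100 referrals, for illustration only.
baseline = ppv(true_positives=21, false_positives=79)
# Same detected cases, fewer false-positive referrals.
improved = ppv(true_positives=21, false_positives=60)

print(round(baseline, 3))
print(round(improved, 3))
# Sensitivity does not depend on false positives, so it is unchanged
# as long as the same true cases are still detected.
print(sensitivity(true_positives=21, false_negatives=0))
```

The point of the example is that PPV can rise purely by removing false-positive referrals, which is why an improved PPV at maintained sensitivity says nothing about cases the current algorithm already misses.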

Subject(s)

Congenital Hypothyroidism , Machine Learning , Neonatal Screening , Random Forest , Humans , Congenital Hypothyroidism/diagnosis , Thyroxine/analysis , Glycoprotein Hormones, alpha Subunit/analysis , Thyroxine-Binding Globulin/analysis , False Positive Reactions , Algorithms , Gestational Age , Infant, Newborn

12.

Optimizing the Dutch newborn screening for congenital hypothyroidism by incorporating amino acids and acylcarnitines in a machine learning-based model.

Jansen, Heleen I; van Haeringen, Marije; Bouva, Marelle J; den Elzen, Wendy P J; Bruinstroop, Eveline; van der Ploeg, Catharina P B; van Trotsenburg, A S Paul; Zwaveling-Soonawala, Nitash; Heijboer, Annemieke C; Bosch, Annet M; de Jonge, Robert; Hoogendoorn, Mark; Boelen, Anita.

Eur Thyroid J ; 12(6), 2023 12 01.

Article in English | MEDLINE | ID: mdl-37855424

ABSTRACT

Objective: Congenital hypothyroidism (CH) is an inborn thyroid hormone (TH) deficiency mostly caused by thyroidal (primary CH) or hypothalamic/pituitary (central CH) disturbances. Most CH newborn screening (NBS) programs are thyroid-stimulating-hormone (TSH) based, thereby only detecting primary CH. The Dutch NBS is based on measuring total thyroxine (T4) from dried blood spots, aiming to detect primary and central CH at the cost of more false-positive referrals (FPRs) (positive predictive value (PPV) of 21% in 2007-2017). A previously described machine learning-based model, applied to an adjusted dataset based on the Dutch CH NBS, yielded an artificial PPV of 26%. Recently, amino acids (AAs) and acylcarnitines (ACs) have been shown to be associated with TH concentration. We therefore aimed to investigate whether AAs and ACs measured during NBS can contribute to better performance of the CH screening in the Netherlands by using a revised machine learning-based model. Methods: Dutch NBS data between 2007 and 2017 (CH screening results, AAs and ACs) from 1079 FPRs, 515 newborns with primary (431) and central CH (84) and data from 1842 healthy controls were used. A random forest model including these data was developed. Results: The random forest model with an artificial sensitivity of 100% yielded a PPV of 48% and AUROC of 0.99. Besides T4 and TSH, tyrosine and succinylacetone were the main parameters contributing to the model's performance. Conclusions: The PPV improved significantly (from 26% to 48%) by adding several AAs and ACs to our machine learning-based model, suggesting that adding these parameters benefits the current algorithm.

Subject(s)

Congenital Hypothyroidism , Infant, Newborn , Humans , Congenital Hypothyroidism/diagnosis , Neonatal Screening/methods , Amino Acids , Thyrotropin

13.

Augmented intelligence facilitates concept mapping across different electronic health records.

Dam, Tariq A; Fleuren, Lucas M; Roggeveen, Luca F; Otten, Martijn; Biesheuvel, Laurens; Jagesar, Ameet R; Lalisang, Robbert C A; Kullberg, Robert F J; Hendriks, Tom; Girbes, Armand R J; Hoogendoorn, Mark; Thoral, Patrick J; Elbers, Paul W G.

Int J Med Inform ; 179: 105233, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37748329

ABSTRACT

INTRODUCTION: With the advent of artificial intelligence, the secondary use of routinely collected medical data from electronic healthcare records (EHR) has become increasingly popular. However, different EHR systems typically use different names for the same medical concepts. This obviously hampers scalable model development and subsequent clinical implementation for decision support. Therefore, converting original parameter names to a so-called ontology, a standardized set of predefined concepts, is necessary but time-consuming and labor-intensive. We therefore propose an augmented intelligence approach to facilitate ontology alignment by predicting correct concepts based on parameter names from raw electronic health record data exports. METHODS: We used the manually mapped parameter names from the multicenter "Dutch ICU data warehouse against COVID-19" sourced from three types of EHR systems to train machine learning models for concept mapping. Data from 29 intensive care units on 38,824 parameters mapped to 1,679 relevant and unique concepts and 38,069 parameters labeled as irrelevant were used for model development and validation. We used the Natural Language Toolkit (NLTK) to preprocess the parameter names based on WordNet cognitive synonyms transformed by term-frequency inverse document frequency (TF-IDF), yielding numeric features. We then trained linear classifiers using stochastic gradient descent for multi-class prediction. Finally, we fine-tuned these predictions using information on distributions of the data associated with each parameter name through similarity score and skewness comparisons. RESULTS: The initial model, trained using data from one hospital organization for each of three EHR systems, scored an overall top 1 precision of 0.744, recall of 0.771, and F1-score of 0.737 on a total of 58,804 parameters. 
Leave-one-hospital-out analysis returned an average top 1 recall of 0.680 for relevant parameters, which increased to 0.905 for the top 5 predictions. When reducing the training dataset to only include relevant parameters, top 1 recall was 0.811 and top 5 recall was 0.914 for relevant parameters. Performance improvement based on similarity score or skewness comparisons affected at most 5.23% of numeric parameters. CONCLUSION: Augmented intelligence is a promising method to improve concept mapping of parameter names from raw electronic health record data exports. We propose a robust method for mapping data across various domains, facilitating the integration of diverse data sources. However, recall is not perfect, and therefore manual validation of mapping remains essential.
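
One plausible reading of the top 1 and top 5 recall figures above: a parameter counts as recalled at k when its manually mapped concept appears among the model's k highest-ranked predictions. The sketch below illustrates that reading only. The function name, the concept labels, and the ranked predictions are hypothetical and do not come from the study, which may have averaged recall differently (for example, per concept).

```python
def top_k_recall(true_concepts, ranked_predictions, k):
    """Fraction of parameters whose true concept is among the first k predictions."""
    if len(true_concepts) != len(ranked_predictions):
        raise ValueError("inputs must have the same length")
    if not true_concepts:
        raise ValueError("at least one parameter is required")
    hits = sum(
        1
        for truth, ranked in zip(true_concepts, ranked_predictions)
        if truth in ranked[:k]
    )
    return hits / len(true_concepts)


# Hypothetical data: three parameter names, each with ranked concept predictions.
truth = ["heart_rate", "lactate", "peep"]
ranked = [
    ["heart_rate", "pulse", "resp_rate"],         # correct at rank 1
    ["glucose", "lactate", "ph"],                 # correct at rank 2
    ["fio2", "tidal_volume", "plateau_pressure"]  # never correct
]
top1 = top_k_recall(truth, ranked, k=1)
top3 = top_k_recall(truth, ranked, k=3)
```

Widening k can only keep or raise the score, which is consistent with the gap between the reported top 1 and top 5 recall.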

14.

Optimising the care for older persons with complex chronic conditions in home care and nursing homes: design and protocol of I-CARE4OLD, an observational study using real-world data.

Hoogendijk, Emiel O; Onder, Graziano; Smalbil, Louk; Vetrano, Davide L; Hirdes, John P; Howard, Elizabeth P; Morris, John N; Fialová, Daniela; Szczerbinska, Katarzyna; Kooijmans, Eline Cm; Hoogendoorn, Mark; Declercq, Anja; De Almeida Mello, Johanna; Leskelä, Riikka-Leena; Häsä, Jokke; Edgren, Johanna; Ruppe, Georg; Liperoti, Rosa; Joling, Karlijn J; van Hout, Hein Pj.

BMJ Open ; 13(6): e072399, 2023 06 29.

Article in English | MEDLINE | ID: mdl-37385750

ABSTRACT

INTRODUCTION: In ageing societies, the number of older adults with complex chronic conditions (CCCs) is rapidly increasing. Care for older persons with CCCs is challenging, due to interactions between multiple conditions and their treatments. In home care and nursing homes, where most older persons with CCCs receive care, professionals often lack appropriate decision support suitable and sufficient to address the medical and functional complexity of persons with CCCs. This EU-funded project aims to develop decision support systems using high-quality, internationally standardised, routine care data to support better prognostication of health trajectories and treatment impact among older persons with CCCs. METHODS AND ANALYSIS: Real-world data from older persons aged ≥60 years in home care and nursing homes, based on routinely performed comprehensive geriatric assessments using interRAI systems collected in the past 20 years, will be linked with administrative repositories on mortality and care use. These include potentially up to 51 million care recipients from eight countries: Italy, the Netherlands, Finland, Belgium, Canada, USA, Hong Kong and New Zealand. Prognostic algorithms will be developed and validated to better predict various health outcomes. In addition, the modifying impact of pharmacological and non-pharmacological interventions will be examined. A variety of analytical methods will be used, including techniques from the field of artificial intelligence such as machine learning. Based on the results, decision support tools will be developed and pilot tested among health professionals working in home care and nursing homes. ETHICS AND DISSEMINATION: The study was approved by authorised medical ethical committees in each of the participating countries, and will comply with both local and EU legislation. 
Study findings will be shared with relevant stakeholders, including publications in peer-reviewed journals and presentations at national and international meetings.

Subject(s)

Artificial Intelligence , Home Care Services , Humans , Aged , Aged, 80 and over , Aging , Algorithms , Chronic Disease , Observational Studies as Topic

15.

Prediction of heart failure 1 year before diagnosis in general practitioner patients using machine learning algorithms: a retrospective case-control study.

Bennis, Frank C; Hoogendoorn, Mark; Aussems, Claire; Korevaar, Joke C.

BMJ Open ; 12(8): e060458, 2022 08 30.

Article in English | MEDLINE | ID: mdl-36041765

ABSTRACT

OBJECTIVES: Heart failure (HF) is a commonly occurring health problem with high mortality and morbidity. If potential cases could be detected earlier, it may be possible to intervene earlier, which may slow progression in some patients. Preferably, already measured data, such as general practitioner (GP) data, would be reused to screen all persons in an age group. Furthermore, it is essential to evaluate the number of people needed to screen to find one patient using true incidence rates, as this indicates the generalisability in the true population. Therefore, we aim to create a machine learning model for the prediction of HF using GP data and evaluate the number needed to screen with true incidence rates. DESIGN, SETTINGS AND PARTICIPANTS: GP data from 8543 patients (-2 to -1 year before diagnosis) and controls aged 70+ years were obtained retrospectively from 01 January 2012 to 31 December 2019 from the Nivel Primary Care Database. Codes about chronic illness, complaints, diagnostics and medication were obtained. Data were split into a train/test set. Datasets describing demographics, the presence of codes (non-sequential) and codes following one another (sequential) were created. Logistic regression, random forest and XGBoost models were trained. Predicted outcome was the presence of HF after 1 year. The ratio case:control in the test set matched true incidence rates (1:45). RESULTS: Demographics alone performed moderately (area under the curve (AUC) 0.692, CI 0.677 to 0.706). Adding non-sequential information combined with a logistic regression model performed best and significantly improved performance (AUC 0.772, CI 0.759 to 0.785, p<0.001). Further adding sequential information did not alter performance significantly (AUC 0.767, CI 0.754 to 0.780, p=0.07). The number needed to screen dropped from 14.11 to 5.99 false positives per true positive. 
CONCLUSION: This study created a model able to identify patients with pending HF a year before diagnosis.
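
The abstract expresses the number needed to screen as false positives per true positive at a case:control ratio of 1:45. The arithmetic behind such a figure can be sketched as follows. The function name and the sensitivity and specificity values are hypothetical, chosen only to illustrate the calculation, and are not results from the study.

```python
def false_positives_per_true_positive(sensitivity, specificity, controls_per_case):
    """Expected false positives for every true positive at a given case:control ratio."""
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")
    if not 0.0 <= specificity <= 1.0:
        raise ValueError("specificity must be in [0, 1]")
    true_positives = sensitivity  # expected per single case
    false_positives = (1.0 - specificity) * controls_per_case
    return false_positives / true_positives


# Hypothetical operating point evaluated at the 1:45 ratio named in the abstract.
ratio = false_positives_per_true_positive(
    sensitivity=0.5, specificity=0.9, controls_per_case=45
)
# (1 - 0.9) * 45 = 4.5 false positives against 0.5 true positives, about 9 per true positive
```

Because controls outnumber cases 45 to 1, even a small loss of specificity adds many false positives, which is why the ratio is evaluated at the true incidence rate rather than in a balanced sample.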

Subject(s)

General Practitioners , Heart Failure , Algorithms , Case-Control Studies , Heart Failure/diagnosis , Heart Failure/epidemiology , Humans , Machine Learning , Retrospective Studies

16.

Application of SHAP values for inferring the optimal functional form of covariates in pharmacokinetic modeling.

Janssen, Alexander; Hoogendoorn, Mark; Cnossen, Marjon H; Mathôt, Ron A A.

CPT Pharmacometrics Syst Pharmacol ; 11(8): 1100-1110, 2022 08.

Article in English | MEDLINE | ID: mdl-38100100

ABSTRACT

In population pharmacokinetic (PK) models, interindividual variability is explained by implementation of covariates in the model. The widely used forward stepwise selection method is sensitive to bias, which may lead to an incorrect inclusion of covariates. Alternatives, such as the full fixed effects model, reduce this bias but are dependent on the chosen implementation of each covariate. As the correct functional forms are unknown, this may still lead to an inaccurate selection of covariates. Machine learning (ML) techniques can potentially be used to learn the optimal functional forms for implementing covariates directly from data. A recent study suggested that using ML resulted in an improved selection of influential covariates. However, how do we select the appropriate functional form for including these covariates? In this work, we use SHapley Additive exPlanations (SHAP) to infer the relationship between covariates and PK parameters from ML models. As a case-study, we use data from 119 patients with hemophilia A receiving clotting factor VIII concentrate peri-operatively. We fit both a random forest and a XGBoost model to predict empirical Bayes estimated clearance and central volume from a base nonlinear mixed effects model. Next, we show that SHAP reveals covariate relationships which match previous findings. In addition, we can reveal subtle effects arising from combinations of covariates difficult to obtain using other methods of covariate analysis. We conclude that the proposed method can be used to extend ML-based covariate selection, and holds potential as a complete full model alternative to classical covariate analyses.

Subject(s)

Factor VIII , Hemophilia A , Humans , Bayes Theorem , Hemophilia A/drug therapy , Kinetics , Machine Learning

17.

Translating promise into practice: a review of machine learning in suicide research and prevention.

Kirtley, Olivia J; van Mens, Kasper; Hoogendoorn, Mark; Kapur, Navneet; de Beurs, Derek.

Lancet Psychiatry ; 9(3): 243-252, 2022 03.

Article in English | MEDLINE | ID: mdl-35183281

ABSTRACT

In ever more pressured health-care systems, technological solutions offering scalability of care and better resource targeting are appealing. Research on machine learning as a technique for identifying individuals at risk of suicidal ideation, suicide attempts, and death has grown rapidly. This research often places great emphasis on the promise of machine learning for preventing suicide, but overlooks the practical, clinical implementation issues that might preclude delivering on such a promise. In this Review, we synthesise the broad empirical and review literature on electronic health record-based machine learning in suicide research, and focus on matters of crucial importance for implementation of machine learning in clinical practice. The challenge of preventing statistically rare outcomes is well known; progress requires tackling data quality, transparency, and ethical issues. In the future, machine learning models might be explored as methods to enable targeting of interventions to specific individuals depending upon their level of need (ie, for precision medicine). Primarily, however, the promise of machine learning for suicide prevention is limited by the scarcity of high-quality scalable interventions available to individuals identified by machine learning as being at risk of suicide.

Subject(s)

Machine Learning , Suicide, Attempted/prevention & control , Decision Support Techniques , Humans , Research Design , Suicidal Ideation

18.

Machine Learning Prediction Models for Neurodevelopmental Outcome After Preterm Birth: A Scoping Review and New Machine Learning Evaluation Framework.

van Boven, Menne R; Henke, Celina E; Leemhuis, Aleid G; Hoogendoorn, Mark; van Kaam, Anton H; Königs, Marsh; Oosterlaan, Jaap.

Pediatrics ; 150(1)2022 07 01.

Article in English | MEDLINE | ID: mdl-35670123

ABSTRACT

BACKGROUND AND OBJECTIVES: Outcome prediction of preterm birth is important for neonatal care, yet prediction performance using conventional statistical models remains insufficient. Machine learning has a high potential for complex outcome prediction. In this scoping review, we provide an overview of the current applications of machine learning models in the prediction of neurodevelopmental outcomes in preterm infants, assess the quality of the developed models, and provide guidance for future application of machine learning models to predict neurodevelopmental outcomes of preterm infants. METHODS: A systematic search was performed using PubMed. Studies were included if they reported on neurodevelopmental outcome prediction in preterm infants using predictors from the neonatal period and applying machine learning techniques. Data extraction and quality assessment were independently performed by 2 reviewers. RESULTS: Fourteen studies were included, focusing mainly on very or extreme preterm infants, predicting neurodevelopmental outcome before age 3 years, and mostly assessing outcomes using the Bayley Scales of Infant Development. Predictors were most often based on MRI. The most prevalent machine learning techniques included linear regression and neural networks. None of the studies met all newly developed quality assessment criteria. Studies least prone to inflated performance showed promising results, with areas under the curve up to 0.86 for classification and R² values up to 91% in continuous prediction. A limitation was that only 1 data source was used for the literature search. CONCLUSIONS: Studies least prone to inflated prediction results are the most promising. The provided evaluation framework may contribute to improved quality of future machine learning models.

Subject(s)

Infant, Premature , Premature Birth , Child , Child, Preschool , Female , Humans , Infant , Infant, Newborn , Machine Learning , Magnetic Resonance Imaging

19.

Predicting responders to prone positioning in mechanically ventilated patients with COVID-19 using machine learning.

Dam, Tariq A; Roggeveen, Luca F; van Diggelen, Fuda; Fleuren, Lucas M; Jagesar, Ameet R; Otten, Martijn; de Vries, Heder J; Gommers, Diederik; Cremer, Olaf L; Bosman, Rob J; Rigter, Sander; Wils, Evert-Jan; Frenzel, Tim; Dongelmans, Dave A; de Jong, Remko; Peters, Marco A A; Kamps, Marlijn J A; Ramnarain, Dharmanand; Nowitzky, Ralph; Nooteboom, Fleur G C A; de Ruijter, Wouter; Urlings-Strop, Louise C; Smit, Ellen G M; Mehagnoul-Schipper, D Jannet; Dormans, Tom; de Jager, Cornelis P C; Hendriks, Stefaan H A; Achterberg, Sefanja; Oostdijk, Evelien; Reidinga, Auke C; Festen-Spanjer, Barbara; Brunnekreef, Gert B; Cornet, Alexander D; van den Tempel, Walter; Boelens, Age D; Koetsier, Peter; Lens, Judith; Faber, Harald J; Karakus, A; Entjes, Robert; de Jong, Paul; Rettig, Thijs C D; Arbous, Sesmu; Vonk, Sebastiaan J J; Machado, Tomas; Herter, Willem E; de Grooth, Harm-Jan; Thoral, Patrick J; Girbes, Armand R J; Hoogendoorn, Mark.

Ann Intensive Care ; 12(1): 99, 2022 Oct 20.

Article in English | MEDLINE | ID: mdl-36264358

ABSTRACT

BACKGROUND: For mechanically ventilated critically ill COVID-19 patients, prone positioning has quickly become an important treatment strategy; however, prone positioning is labor intensive and comes with potential adverse effects. Therefore, identifying which critically ill intubated COVID-19 patients will benefit may help allocate labor resources. METHODS: From the multi-center Dutch Data Warehouse of COVID-19 ICU patients from 25 hospitals, we selected all 3619 episodes of prone positioning in 1142 invasively mechanically ventilated patients. We excluded episodes longer than 24 h. Berlin ARDS criteria were not formally documented. We used supervised machine learning algorithms Logistic Regression, Random Forest, Naive Bayes, K-Nearest Neighbors, Support Vector Machine and Extreme Gradient Boosting on readily available and clinically relevant features to predict success of prone positioning after 4 h (window of 1 to 7 h) based on various possible outcomes. These outcomes were defined as improvements of at least 10% in PaO2/FiO2 ratio, ventilatory ratio, respiratory system compliance, or mechanical power. Separate models were created for each of these outcomes. Re-supination within 4 h after pronation was labeled as failure. We also developed models using a 20 mmHg improvement cut-off for PaO2/FiO2 ratio and using a combined outcome parameter. For all models, we evaluated feature importance expressed as contribution to predictive performance based on their relative ranking. RESULTS: The median duration of prone episodes was 17 h (11-20, median and IQR, N = 2632). Despite extensive modeling using a plethora of machine learning techniques and a large number of potentially clinically relevant features, discrimination between responders and non-responders remained poor with an area under the receiver operating characteristic curve of 0.62 for PaO2/FiO2 ratio using Logistic Regression, Random Forest and XGBoost. 
Feature importance was inconsistent between models for different outcomes. Notably, not even being a previous responder to prone positioning, or PEEP-levels before prone positioning, provided any meaningful contribution to predicting a successful next proning episode. CONCLUSIONS: In mechanically ventilated COVID-19 patients, predicting the success of prone positioning using clinically relevant and readily available parameters from electronic health records is currently not feasible. Given the current evidence base, a liberal approach to proning in all patients with severe COVID-19 ARDS is therefore justified and in particular regardless of previous results of proning.
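
The outcome definition for the PaO2/FiO2 ratio described above (an improvement of at least 10%, with re-supination within 4 h labeled as failure) can be sketched as a labeling rule. This is an illustration under stated assumptions, not the study's code: the function and parameter names are hypothetical, and only the PaO2/FiO2 outcome is shown because the abstract does not spell out how improvement was computed for the other outcomes.

```python
def is_pf_responder(pf_before, pf_after, hours_prone,
                    relative_threshold=0.10, min_prone_hours=4.0):
    """Label one prone episode as success (True) or failure (False).

    pf_before / pf_after: PaO2/FiO2 ratio in mmHg before and after pronation.
    hours_prone: time until the patient was turned supine again.
    """
    if pf_before <= 0:
        raise ValueError("pf_before must be positive")
    # Re-supination within 4 h counts as failure regardless of oxygenation.
    if hours_prone < min_prone_hours:
        return False
    relative_change = (pf_after - pf_before) / pf_before
    return relative_change >= relative_threshold


# Hypothetical episodes.
improved = is_pf_responder(pf_before=100.0, pf_after=115.0, hours_prone=17.0)
unchanged = is_pf_responder(pf_before=100.0, pf_after=105.0, hours_prone=17.0)
turned_back_early = is_pf_responder(pf_before=100.0, pf_after=150.0, hours_prone=2.0)
```

The alternative 20 mmHg cut-off mentioned in the abstract would replace the relative comparison with an absolute difference between the two measurements.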

20.

Transatlantic transferability of a new reinforcement learning model for optimizing haemodynamic treatment for critically ill patients with sepsis.

Roggeveen, Luca; El Hassouni, Ali; Ahrendt, Jonas; Guo, Tingjie; Fleuren, Lucas; Thoral, Patrick; Girbes, Armand Rj; Hoogendoorn, Mark; Elbers, Paul Wg.

Artif Intell Med ; 112: 102003, 2021 02.

Article in English | MEDLINE | ID: mdl-33581824

ABSTRACT

INTRODUCTION: In recent years, reinforcement learning (RL) has gained traction in the healthcare domain. In particular, RL methods have been explored for haemodynamic optimization of septic patients in the Intensive Care Unit. Most hospitals, however, lack the data and expertise for model development, necessitating transfer of models developed using external datasets. This approach assumes model generalizability across different patient populations, the validity of which has not previously been tested. In addition, there is limited knowledge on safety and reliability. These challenges need to be addressed to further facilitate implementation of RL models in clinical practice. METHOD: We developed and validated a new reinforcement learning model for hemodynamic optimization in sepsis on the MIMIC intensive care database from the USA using a dueling double deep Q network. We then transferred this model to the European AmsterdamUMCdb intensive care database. T-Distributed Stochastic Neighbor Embedding and Sequential Organ Failure Assessment scores were used to explore the differences between the patient populations. We apply off-policy policy evaluation methods to quantify model performance. In addition, we introduce and apply a novel deep policy inspection to analyse how the optimal policy relates to the different phases of sepsis and sepsis treatment to provide interpretable insight in order to assess model safety and reliability. RESULTS: The off-policy evaluation revealed that the optimal policy outperformed the physician policy on both datasets despite marked differences between the two patient populations and physicians' policies. Our novel deep policy inspection method showed insightful results and unveiled that the model could initiate therapy adequately and adjust therapy intensity to illness severity and disease progression, which indicated safe and reliable model behaviour. 
Compared to current physician behavior, the developed policy prefers a more liberal use of vasopressors with a more restrained use of fluid therapy in line with previous work. CONCLUSION: We created a reinforcement learning model for optimal bedside hemodynamic management and demonstrated model transferability between populations from the USA and Europe for the first time. We proposed new methods for deep policy inspection integrating expert domain knowledge. This is expected to facilitate progression to bedside clinical decision support for the treatment of critically ill patients.
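
The abstract names a dueling double deep Q network but does not describe its architecture. For orientation, the two standard ingredients of that family of methods can be sketched in plain Python: the dueling aggregation of a state value and per-action advantages, and the double Q-learning target, in which the online network selects the next action and the target network evaluates it. All function names and numbers below are hypothetical, and the study's actual network, reward, and action space are not reproduced here.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    if not advantages:
        raise ValueError("at least one action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(reward, gamma, next_q_online, next_q_target, done):
    """Bootstrapped target: online network picks the action, target network scores it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Hypothetical values for a state with three actions.
q_values = dueling_q_values(state_value=1.0, advantages=[0.0, 1.0, 2.0])
target = double_dqn_target(
    reward=1.0, gamma=0.5,
    next_q_online=[0.1, 0.9], next_q_target=[2.0, 4.0], done=False,
)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, and decoupling action selection from evaluation is the usual way to reduce the overestimation of Q values.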

Subject(s)

Critical Illness , Sepsis , Hemodynamics , Humans , Reinforcement, Psychology , Reproducibility of Results , Sepsis/therapy
