Results 1 - 8 of 8
1.
Comput Biol Med ; 163: 107188, 2023 09.
Article in English | MEDLINE | ID: mdl-37393785

ABSTRACT

The missing data mechanism is a relevant problem in the Machine Learning (ML) and biomedical informatics communities. Real-world Electronic Health Record (EHR) datasets contain many missing values, resulting in a high level of spatiotemporal sparsity in the predictor matrix. Several state-of-the-art approaches have tried to address this problem by proposing data imputation strategies that (i) are often unrelated to the ML model, (ii) are not conceived for EHR data, where laboratory exams are not prescribed uniformly over time and the percentage of missing values is high, and (iii) exploit only univariate and linear information on the observed features. Our paper proposes a data imputation strategy based on a clinical conditional Generative Adversarial Network (ccGAN) capable of imputing missing values by exploiting non-linear and multivariate information across patients. Unlike other GAN-based data imputation approaches, our method deals explicitly with the high level of missingness of routine EHR data by conditioning the imputation strategy on the observable values and on those that are fully annotated. We demonstrated statistically significant gains of the ccGAN over other state-of-the-art approaches in terms of imputation (a gain of around 19.79% over the best competitor) and predictive performance (a gain of up to 1.60% over the best competitor) on a real dataset from multiple diabetes centers. We also demonstrated its robustness across different missingness rates (a gain of up to 1.61% over the best competitor at the highest missingness rate) on an additional benchmark EHR dataset.
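
As a rough illustration of how imputation quality can be scored at different missingness rates, the sketch below hides a fraction of known entries, fills them with an imputer, and measures the error only on the hidden entries. It is not the ccGAN: the column-mean imputer is a deliberately simple stand-in, and the function names, the `None` encoding of missing values, and the toy data are all assumptions made for illustration. The abstract does not describe the authors' evaluation protocol in this detail.

```python
import math
import random


def mean_impute(rows):
    """Stand-in imputer (NOT the ccGAN): fill None with the column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


def masked_rmse(rows, missing_rate, imputer, seed=0):
    """Hide entries at `missing_rate`, impute, and score on hidden entries only."""
    rng = random.Random(seed)
    hidden = [[rng.random() < missing_rate for _ in r] for r in rows]
    observed = [[None if h else v for v, h in zip(r, hr)]
                for r, hr in zip(rows, hidden)]
    imputed = imputer(observed)
    sq_errors = [(imputed[i][j] - rows[i][j]) ** 2
                 for i in range(len(rows))
                 for j in range(len(rows[i]))
                 if hidden[i][j]]
    if not sq_errors:
        raise ValueError("no entries were hidden; increase missing_rate")
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Hypothetical toy data: 50 patients, 4 lab features.
data_rng = random.Random(42)
toy = [[data_rng.gauss(100.0, 15.0) for _ in range(4)] for _ in range(50)]
for rate in (0.1, 0.3, 0.5):
    print(rate, masked_rmse(toy, rate, mean_impute))
```

Swapping `mean_impute` for any other function with the same input and output shape allows competing imputers to be compared under identical masks, since the seed fixes which entries are hidden.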


Subject(s)
Electronic Health Records , Machine Learning , Humans , Data Interpretation, Statistical
2.
Diabetes Res Clin Pract ; 190: 110013, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35870573

ABSTRACT

AIM: To construct predictive models of diabetes complications (DCs) by big data machine learning, based on electronic medical records. METHODS: Six groups of DCs were considered: eye complications, cardiovascular, cerebrovascular, and peripheral vascular disease, nephropathy, and diabetic neuropathy. A supervised, tree-based learning approach (XGBoost) was used to predict the onset of each complication within 5 years (task 1). Furthermore, a separate prediction for early (within 2 years) and late (3-5 years) onset of complication (task 2) was performed. A dataset of 147,664 patients seen over 15 years at 23 centers was used. External validation was performed in five additional centers. Models were evaluated by considering accuracy, sensitivity, specificity, and area under the ROC curve (AUC). RESULTS: For all DCs considered, the predictive models in task 1 showed an accuracy > 70 %, and AUC largely exceeded 0.80, reaching 0.97 for nephropathy. For task 2, all predictive models showed an accuracy > 70 % and an AUC > 0.85. Sensitivity in predicting the early occurrence of the complication ranged between 83.2 % (peripheral vascular disease) and 88.5 % (nephropathy). CONCLUSIONS: The machine learning approach offers the opportunity to identify patients at greater risk of complications. This can help overcome clinical inertia and improve the quality of diabetes care.


Subject(s)
Diabetes Mellitus, Type 2 , Peripheral Vascular Diseases , Diabetes Mellitus, Type 2/complications , Electronic Health Records , Humans , Machine Learning
3.
IEEE J Biomed Health Inform ; 25(10): 3983-3994, 2021 10.
Article in English | MEDLINE | ID: mdl-33877990

ABSTRACT

Kidney Disease (KD) may hide complex causes and is associated with a tremendous socio-economic impact. Timely identification and management from the first level of medical care represent the most effective strategy to address the growing global burden sustainably. Clinical practice guidelines suggest using the estimated Glomerular Filtration Rate (eGFR) for routine evaluation for screening purposes. Accordingly, the analysis of Electronic Health Records (EHRs) using Machine Learning techniques offers great opportunities to monitor and predict the eGFR trend over time. This paper proposes a novel Semi-Supervised Multi-Task Learning (SS-MTL) approach for predicting short-term KD evolution on multiple General Practitioners' EHR data. We demonstrated that the SS-MTL approach can (i) capture the eGFR temporal evolution by imposing a temporal relatedness between consecutive time windows and (ii) exploit useful information from unlabeled patients when labeled patients are less numerous, with a gain of up to 4.1% in terms of Recall. This situation reflects the real-case scenario, where available labeled samples are limited but unlabeled ones are much more abundant. The SS-MTL approach, also given its high level of interpretability, might be the ideal candidate in general practice for integration within a decision support system for KD screening purposes.


Subject(s)
Algorithms , Kidney Diseases , Glomerular Filtration Rate , Humans , Machine Learning , Supervised Machine Learning
4.
J Intensive Med ; 1(2): 110-116, 2021 Oct.
Article in English | MEDLINE | ID: mdl-36785563

ABSTRACT

Background: Accurate risk stratification of critically ill patients with coronavirus disease 2019 (COVID-19) is essential for optimizing resource allocation, delivering targeted interventions, and maximizing patient survival probability. Machine learning (ML) techniques are attracting increased interest for the development of prediction models as they excel in the analysis of complex signals in data-rich environments such as critical care. Methods: We retrieved data on patients with COVID-19 admitted to an intensive care unit (ICU) between March and October 2020 from the RIsk Stratification in COVID-19 patients in the Intensive Care Unit (RISC-19-ICU) registry. We applied the Extreme Gradient Boosting (XGBoost) algorithm to the data to predict as a binary outcome the increase or decrease in patients' Sequential Organ Failure Assessment (SOFA) score on day 5 after ICU admission. The model was iteratively cross-validated in different subsets of the study cohort. Results: The final study population consisted of 675 patients. The XGBoost model correctly predicted a decrease in SOFA score in 320/385 (83%) critically ill COVID-19 patients, and an increase in the score in 210/290 (72%) patients. The area under the mean receiver operating characteristic curve for XGBoost was significantly higher than that for the logistic regression model (0.86 vs. 0.69, P < 0.01 [paired t-test with 95% confidence interval]). Conclusions: The XGBoost model predicted the change in SOFA score in critically ill COVID-19 patients admitted to the ICU and can guide clinical decision support systems (CDSSs) aimed at optimizing available resources.
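
The percentages in the Results can be re-derived from the reported counts. The short check below does only that arithmetic; the overall proportion of correct predictions is not stated in the abstract and is computed here purely from the two reported fractions, assuming the two groups together make up the full cohort of 675 patients.

```python
# Counts as reported in the abstract.
decrease_correct, decrease_total = 320, 385
increase_correct, increase_total = 210, 290

decrease_rate = decrease_correct / decrease_total
increase_rate = increase_correct / increase_total

# The two groups add up to the stated study population.
total = decrease_total + increase_total

# Derived quantity, not reported in the abstract.
overall_correct = (decrease_correct + increase_correct) / total

print(f"patients: {total}")                           # 675
print(f"decrease predicted: {decrease_rate:.1%}")     # 83.1%
print(f"increase predicted: {increase_rate:.1%}")     # 72.4%
print(f"overall correct:    {overall_correct:.1%}")   # 78.5%
```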

5.
IEEE J Transl Eng Health Med ; 8: 3000112, 2020.
Article in English | MEDLINE | ID: mdl-33150095

ABSTRACT

Objective: Decision support systems (DSS) have been developed and promoted for their potential to improve quality of health care. However, there is a lack of common clinical strategy, poor management of clinical resources, and erroneous implementation of preventive medicine. Methods: To overcome this problem, this work proposed an integrated system that relies on the creation and sharing of a database extracted from GPs' Electronic Health Records (EHRs) within the Netmedica Italian (NMI) cloud infrastructure. Although the proposed system is a pilot application specifically tailored for improving chronic Type 2 Diabetes (T2D) care, it could be easily adapted to effectively manage different chronic diseases. The proposed DSS is based on the EHR structure used by GPs in their daily activities, following the most updated guidelines in data protection and sharing. The DSS is equipped with a Machine Learning (ML) method for analyzing the shared EHRs and thus tackling the high variability of EHRs. A novel set of T2D care-quality indicators is used specifically to determine the economic incentives, and the T2D features are presented as predictors of the proposed ML approach. Results: The EHRs from 41237 T2D patients were analyzed. No additional data collection, with respect to the standard clinical practice, was required. The DSS exhibited competitive performance (up to an overall accuracy of 98%±2% and macro-recall of 96%±1%) for classifying chronic care quality across the different follow-up phases. The chronic care quality model led to a significant increase (up to 12%) in T2D patients without complications. For GPs who agreed to use the proposed system, there was an economic incentive. A further bonus was assigned when performance targets were achieved. Conclusions: The quality care evaluation in a clinical use-case scenario demonstrated how the empowerment of the GPs through the use of the platform (integrating the proposed DSS), along with the economic incentives, may speed up the improvement of care.

6.
Artif Intell Med ; 105: 101847, 2020 05.
Article in English | MEDLINE | ID: mdl-32505428

ABSTRACT

Early prediction of target patients at high risk of developing Type 2 diabetes (T2D) plays a significant role in preventing the onset of overt disease and its associated comorbidities. Although fundamental in the early phases of T2D natural history, insulin resistance is not usually quantified by General Practitioners (GPs). The triglyceride-glucose (TyG) index has proven useful in clinical studies for quantifying insulin resistance and for the early identification of individuals at T2D risk, but it is still not applied by GPs for diagnostic purposes. The aim of this study is to propose a multiple instance learning boosting algorithm (MIL-Boost) for creating a predictive model capable of early prediction of worsening insulin resistance (low vs high T2D risk) in terms of TyG index. The MIL-Boost is applied to past electronic health record (EHR) patients' information stored by a single GP. The proposed MIL-Boost algorithm proved to be effective in dealing with this task, performing better than the other state-of-the-art ML competitors (Recall from 0.70 and up to 0.83). The proposed MIL-based approach is able to extract hidden patterns from past EHR temporal data, even without directly exploiting triglycerides and glucose measurements. The major advantages of our method can be found in its ability to model the temporal evolution of longitudinal EHR data while dealing with small sample size and variability in the observations (e.g., a small variable number of prescriptions for non-hospitalized patients). The proposed algorithm may represent the main core of a clinical decision support system.
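
For reference, the TyG index is commonly defined in the literature as the natural logarithm of the product of fasting triglycerides and fasting glucose (both in mg/dL) divided by two. The abstract does not restate the formula, so the sketch below assumes this common definition; the function name and the example values are illustrative only, and no risk cutoff is implied.

```python
import math


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """TyG index under the commonly cited definition (assumed here):
    ln(fasting triglycerides [mg/dL] * fasting glucose [mg/dL] / 2).
    """
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("both measurements must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


# Hypothetical measurements, for illustration only.
print(round(tyg_index(150.0, 100.0), 3))  # ln(7500), about 8.923
```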


Subject(s)
Decision Support Systems, Clinical , Diabetes Mellitus, Type 2 , General Practitioners , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/epidemiology , Electronic Health Records , Humans , Triglycerides
7.
IEEE J Biomed Health Inform ; 24(1): 235-246, 2020 01.
Article in English | MEDLINE | ID: mdl-30762572

ABSTRACT

The diagnosis of type 2 diabetes (T2D) at an early stage plays a key role in an adequate T2D integrated management system and in patient follow-up. Recent years have witnessed an increasing amount of available electronic health record (EHR) data, and machine learning (ML) techniques have evolved considerably. However, managing and modeling this amount of information may lead to several challenges, such as overfitting, model interpretability, and computational cost. Starting from these motivations, we introduced an ML method called sparse balanced support vector machine (SB-SVM) for discovering T2D in a newly collected EHR dataset (named Federazione Italiana Medici di Medicina Generale dataset). In particular, among all the EHR features related to exemptions, examination, and drug prescriptions, we selected only those collected before T2D diagnosis from a uniform age group of subjects. We demonstrated the reliability of the introduced approach with respect to other ML and deep learning approaches widely employed in the state of the art for solving this task. Results show that the SB-SVM outperforms the other state-of-the-art competitors, providing the best compromise between predictive performance and computation time. Additionally, the induced sparsity increases model interpretability while implicitly managing high-dimensional data and the usual unbalanced class distribution.


Subject(s)
Decision Support Systems, Clinical , Diabetes Mellitus, Type 2/diagnosis , Electronic Health Records , Support Vector Machine , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Reproducibility of Results
8.
Comput Biol Med ; 112: 103358, 2019 09.
Article in English | MEDLINE | ID: mdl-31336327

ABSTRACT

BACKGROUND: Insulin resistance is an early-stage deterioration of Type 2 diabetes. Identification and quantification of insulin resistance requires specific blood tests; however, the triglyceride-glucose (TyG) index can provide a surrogate assessment from routine Electronic Health Record (EHR) data. Since insulin resistance is a multi-factorial condition, to improve its characterisation, this study aims to discover non-trivial clinical factors in EHR data to determine where the insulin-resistance condition is encoded. METHODS: We proposed a highly interpretable Machine Learning approach (i.e., ensemble Regression Forest combined with data imputation strategies), named TyG-er. We applied three different experimental procedures to test TyG-er reliability on the Italian Federation of General Practitioners dataset, named FIMMG_obs dataset, which is publicly available and reflects the clinical use-case (i.e., not all laboratory exams are prescribed on a regular basis over time). RESULTS: Results detected non-conventional clinical factors (i.e., uricemia, leukocytes, gamma-glutamyltransferase and protein profile) and provided novel insight into the best combination of clinical factors for detecting early glucose tolerance deterioration. The robustness of these extracted clinical factors was confirmed by the high agreement (Lin's correlation coefficient (rc) from 0.664 to 0.911) of the TyG-er approach among different experimental procedures. Moreover, the results of the three experimental procedures outlined the predictive power of the TyG-er approach (up to a mean absolute error of 5.68% and rc=0.666, p<.05). CONCLUSIONS: The TyG-er approach is able to carry information about the identification of the TyG index, strictly correlated with the insulin-resistance condition, while extracting the most relevant non-glycemic features from routine data.
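
The agreement measure rc reported above is Lin's concordance correlation coefficient. A minimal sketch of its standard definition follows, using population (1/n) variances and covariance; the function name and the toy values are illustrative assumptions, and the abstract does not say which estimator variant the authors used. Unlike Pearson's r, the coefficient penalizes a systematic offset between the two series.

```python
def lins_ccc(x, y):
    """Lin's concordance correlation coefficient:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    with 1/n (population) variance and covariance.
    Undefined (division by zero) if both series are constant and equal.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical values: a constant offset of 1 lowers the agreement
# even though the two series are perfectly linearly related.
reference = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
print(lins_ccc(reference, reference))  # 1.0
print(lins_ccc(reference, shifted))    # 2.5 / 3.5, about 0.714
```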


Subject(s)
Blood Glucose/metabolism , Diabetes Mellitus, Type 2 , Electronic Health Records , Insulin Resistance , Machine Learning , Triglycerides/blood , Aged , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/diagnosis , Female , Humans , Male , Middle Aged