Results 1 - 20 of 144
1.
Front Public Health ; 12: 1366496, 2024.
Article in English | MEDLINE | ID: mdl-39157521

ABSTRACT

Background: Diarrheal disease, characterized by high morbidity and mortality rates, continues to be a serious public health concern, especially in developing nations such as Ethiopia. The significant burden it imposes on these countries underscores the importance of identifying predictors of diarrhea. The use of machine learning techniques to identify significant predictors of diarrhea in children under the age of 5 in Ethiopia's Amhara Region is not well documented. Therefore, this study aimed to clarify these issues. Methods: This study's data were extracted from the Ethiopian Population and Health Survey. We applied machine learning classifier models, namely random forest, logistic regression, K-nearest neighbors, decision tree, support vector machine, gradient boosting, and naive Bayes, to identify the determinants of diarrhea in children under the age of 5 in Ethiopia. Finally, Shapley Additive exPlanations (SHAP) value analysis was performed to explain the contribution of each predictor to diarrhea risk. Result: Among the seven models used, the random forest algorithm showed the highest accuracy in predicting diarrheal disease, with an accuracy rate of 81.03% and an area under the curve of 86.50%. The following factors showed significant associations with diarrheal disease: families with the richest wealth status (log odds of -0.04), children without a history of acute respiratory infections (ARIs) (log odds of -0.08), mothers without a job (log odds of -0.04), children aged between 23 and 36 months (log odds of -0.03), mothers with higher education (log odds of -0.03), urban dwellers (log odds of -0.01), families using electricity for cooking (log odds of -0.12), children under 5 years of age living in the Amhara region who did not show signs of wasting, and children under 5 years of age who had not taken medications for intestinal parasites. 
Conclusion: We recommend implementing programs to reduce the incidence of diarrhea in children under the age of 5 in the Amhara region. These programs should focus on removing socioeconomic barriers that impede mothers' access to wealth, a favorable work environment, cooking fuel, education, and healthcare for their children.
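The evaluation reported above, accuracy plus area under the ROC curve for a random forest classifier, can be sketched roughly as follows. The data set, feature count, and split are synthetic placeholders, not the Ethiopian survey data:

```python
# Illustrative sketch only: random forest evaluated by accuracy and AUC,
# on synthetic data standing in for the survey features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, rf.predict(X_te))
auc = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
print(f"accuracy={acc:.3f}, AUC={auc:.3f}")
```

A SHAP analysis, as in the study, would then attribute each prediction to individual features; that step is omitted here.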


Subject(s)
Diarrhea , Machine Learning , Socioeconomic Factors , Humans , Ethiopia/epidemiology , Diarrhea/epidemiology , Child, Preschool , Infant , Female , Male , Health Surveys , Logistic Models , Risk Factors , Infant, Newborn , Adult
2.
JMIR Ment Health ; 11: e52045, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38963925

ABSTRACT

BACKGROUND: Identifying individuals with depressive symptomatology (DS) promptly and effectively is of paramount importance for providing timely treatment. Machine learning models have shown promise in this area; however, studies often fall short in demonstrating the practical benefits of using these models and fail to provide tangible real-world applications. OBJECTIVE: This study aims to establish a novel methodology for identifying individuals likely to exhibit DS, identify the most influential features in a more explainable way via probabilistic measures, and propose tools that can be used in real-world applications. METHODS: The study used 3 data sets: PROACTIVE, the Brazilian National Health Survey (Pesquisa Nacional de Saúde [PNS]) 2013, and PNS 2019, comprising sociodemographic and health-related features. A Bayesian network was used for feature selection. Selected features were then used to train machine learning models to predict DS, operationalized as a score of ≥10 on the 9-item Patient Health Questionnaire. The study also analyzed the impact of varying sensitivity rates on the reduction of screening interviews compared to a random approach. RESULTS: The methodology allows the users to make an informed trade-off among sensitivity, specificity, and a reduction in the number of interviews. At the thresholds of 0.444, 0.412, and 0.472, determined by maximizing the Youden index, the models achieved sensitivities of 0.717, 0.741, and 0.718, and specificities of 0.644, 0.737, and 0.766 for PROACTIVE, PNS 2013, and PNS 2019, respectively. The area under the receiver operating characteristic curve was 0.736, 0.801, and 0.809 for these 3 data sets, respectively. For the PROACTIVE data set, the most influential features identified were postural balance, shortness of breath, and how old people feel they are. In the PNS 2013 data set, the features were the ability to do usual activities, chest pain, sleep problems, and chronic back problems. 
The PNS 2019 data set shared 3 of the most influential features with the PNS 2013 data set. However, the difference was the replacement of chronic back problems with verbal abuse. It is important to note that the features contained in the PNS data sets differ from those found in the PROACTIVE data set. An empirical analysis demonstrated that using the proposed model led to a potential reduction in screening interviews of up to 52% while maintaining a sensitivity of 0.80. CONCLUSIONS: This study developed a novel methodology for identifying individuals with DS, demonstrating the utility of using Bayesian networks to identify the most significant features. Moreover, this approach has the potential to substantially reduce the number of screening interviews while maintaining high sensitivity, thereby facilitating improved early identification and intervention strategies for individuals experiencing DS.
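The threshold-selection step the authors describe, maximizing the Youden index (J = sensitivity + specificity - 1) along the ROC curve, can be sketched as follows on synthetic scores rather than the PROACTIVE or PNS data:

```python
# Sketch of Youden-index threshold selection on synthetic scores.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
scores = y_true * 0.5 + rng.normal(0, 1, 500)  # scores weakly related to the label

fpr, tpr, thresholds = roc_curve(y_true, scores)
j = tpr - fpr                    # Youden's J at each candidate threshold
best = np.argmax(j)
sens, spec = tpr[best], 1 - fpr[best]
print(f"threshold={thresholds[best]:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sliding the chosen threshold up or down is what lets users trade sensitivity against specificity and, in turn, against the number of screening interviews.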


Subject(s)
Algorithms , Bayes Theorem , Depression , Humans , Depression/diagnosis , Adult , Female , Male , Brazil/epidemiology , Middle Aged , Machine Learning , Mass Screening/methods , Sensitivity and Specificity , Health Surveys
3.
Int J Legal Med ; 2024 Jul 13.
Article in English | MEDLINE | ID: mdl-38997516

ABSTRACT

Despite the improvements in forensic DNA quantification methods that allow for the early detection of low template/challenged DNA samples, complicating stochastic effects are not revealed until the final stage of the DNA analysis workflow. An assay that would provide genotyping information at the earlier stage of quantification would allow examiners to make critical adjustments prior to STR amplification allowing for potentially exclusionary information to be immediately reported. Specifically, qPCR instruments often have dissociation curve and/or high-resolution melt curve (HRM) capabilities; this, coupled with statistical prediction analysis, could provide additional information regarding STR genotypes present. Thus, this study aimed to evaluate Qiagen's principal component analysis (PCA)-based ScreenClust® HRM® software and a linear discriminant analysis (LDA)-based technique for their abilities to accurately predict genotypes and similar groups of genotypes from HRM data. Melt curves from single source samples were generated from STR D5S818 and D18S51 amplicons using a Rotor-Gene® Q qPCR instrument and EvaGreen® intercalating dye. When used to predict D5S818 genotypes for unknown samples, LDA analysis outperformed the PCA-based method whether predictions were for individual genotypes (58.92% accuracy) or for geno-groups (81.00% accuracy). However, when a locus with increased heterogeneity was tested (D18S51), PCA-based prediction accuracy rates improved to rates similar to those obtained using LDA (45.10% and 63.46%, respectively). This study provides foundational data documenting the performance of prediction modeling for STR genotyping based on qPCR-HRM data. In order to expand the forensic applicability of this HRM assay, the method could be tested with a more commonly utilized qPCR platform.
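A rough sketch of the two prediction strategies compared above, LDA versus a PCA-based approach, applied to simulated melt-curve feature vectors. This is a toy stand-in, far simpler than the ScreenClust® workflow, with three hypothetical "genotype" classes:

```python
# Toy comparison: LDA vs. PCA + nearest-centroid on simulated melt-curve features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n_per, n_feat = 40, 30
# three synthetic "genotype" classes with shifted mean curves
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(n_per, n_feat))
               for m in (0.0, 0.6, 1.2)])
y = np.repeat([0, 1, 2], n_per)

lda_acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
pca_acc = cross_val_score(make_pipeline(PCA(n_components=2), NearestCentroid()),
                          X, y, cv=5).mean()
print(f"LDA accuracy={lda_acc:.2f}, PCA-based accuracy={pca_acc:.2f}")
```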

4.
Biol Psychiatry ; 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38852896

ABSTRACT

BACKGROUND: Automatic transdiagnostic risk calculators can improve the detection of individuals at risk of psychosis. However, they rely on assessment at a single point in time and can be refined with dynamic modeling techniques that account for changes in risk over time. METHODS: We included 158,139 patients (5007 events) who received a first index diagnosis of a nonorganic and nonpsychotic mental disorder within electronic health records from the South London and Maudsley National Health Service Foundation Trust between January 1, 2008, and October 8, 2021. A dynamic Cox landmark model was developed to estimate the 2-year risk of developing psychosis according to the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) statement. The dynamic model included 24 predictors extracted at 9 landmark points (baseline, 0, 6, 12, 24, 30, 36, 42, and 48 months): 3 demographic, 1 clinical, and 20 natural language processing-based symptom and substance use predictors. Performance was compared with a static Cox regression model with all predictors assessed at baseline only and indexed via discrimination (C-index), calibration (calibration plots), and potential clinical utility (decision curves) in internal-external validation. RESULTS: The dynamic model improved discrimination performance from baseline compared with the static model (dynamic: C-index = 0.9; static: C-index = 0.87) and the final landmark point (dynamic: C-index = 0.79; static: C-index = 0.76). The dynamic model was also significantly better calibrated (calibration slope = 0.97-1.1) than the static model at later landmark points (≥24 months). Net benefit was higher for the dynamic than for the static model at later landmark points (≥24 months). CONCLUSIONS: These findings suggest that dynamic prediction models can improve the detection of individuals at risk for psychosis in secondary mental health care settings.
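A minimal sketch of the landmarking idea behind such a dynamic model: at each landmark time, patients still at risk are selected and follow-up is reset so the 2-year prediction window starts at the landmark. The tiny table and column names below are hypothetical, not the Maudsley data, and the Cox fit itself is omitted:

```python
# Sketch of landmark dataset construction for a dynamic survival model.
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3],
    "event_time": [10.0, 30.0, 50.0],  # months from index diagnosis
    "event": [1, 0, 1],
})

landmarks = [0, 6, 12, 24]
frames = []
for lm in landmarks:
    at_risk = df[df["event_time"] > lm].copy()   # only patients still event-free
    at_risk["landmark"] = lm
    # reset the clock and administratively censor at the 24-month horizon
    at_risk["t"] = (at_risk["event_time"] - lm).clip(upper=24)
    at_risk["e"] = ((at_risk["event"] == 1)
                    & (at_risk["event_time"] - lm <= 24)).astype(int)
    frames.append(at_risk)

stacked = pd.concat(frames, ignore_index=True)   # one Cox model fit on this stack
print(stacked[["id", "landmark", "t", "e"]])
```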

5.
Genes (Basel) ; 15(6)2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38927704

ABSTRACT

Although guidelines exist for identifying mixtures, these measures often occur at the end-point of analysis and are protracted. To facilitate early mixture detection, we integrated a high-resolution melt (HRM) mixture screening assay into the qPCR step of the forensic workflow, producing the integrated Quantifiler™ Trio-HRM assay. The assay, when coupled with a prediction tool, allowed for 75.0% accurate identification of the contributor status of a sample (single source vs. mixture). To elucidate the limitations of the developed qPCR-HRM assay, developmental validation studies were conducted assessing the reproducibility and samples with varying DNA ratios, contributors, and quality. From this work, it was determined that the integrated Quantifiler™ Trio-HRM assay is capable of accurately identifying mixtures with up to five contributors and mixtures at ratios up to 1:100. Further, the optimal performance concentration range was found to be between 0.025 and 0.5 ng/µL. With these results, evidentiary-like DNA samples were then analyzed, resulting in 100.0% of the mixture samples being accurately identified; furthermore, every sample predicted to be single source truly was, giving confidence to any single-source calls. Overall, the integrated Quantifiler™ Trio-HRM assay has exhibited an enhanced ability to discern mixture samples from single-source samples at the qPCR stage under commonly observed conditions regardless of the contributor's sex.


Subject(s)
Forensic Genetics , Humans , Forensic Genetics/methods , Real-Time Polymerase Chain Reaction/methods , Real-Time Polymerase Chain Reaction/standards , DNA/genetics , DNA Fingerprinting/methods , Reproducibility of Results , Microsatellite Repeats/genetics
6.
J Surg Res ; 300: 514-525, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38875950

ABSTRACT

INTRODUCTION: Veterans Affairs Surgical Quality Improvement Program (VASQIP) benchmarking algorithms helped the Veterans Health Administration (VHA) reduce postoperative mortality. Despite calls to consider social risk factors, these algorithms do not adjust for social determinants of health (SDoH) or account for services fragmented between the VHA and the private sector. This investigation examines how the addition of SDoH changes model performance and quantifies associations between SDoH and 30-d postoperative mortality. METHODS: A VASQIP (2013-2019) cohort study of patients ≥65 y old with 2- to 30-d inpatient stays. VASQIP was linked to other VHA and Medicare/Medicaid data. 30-d postoperative mortality was examined using multivariable logistic regression models, adjusting first for clinical variables, then adding SDoH. RESULTS: In adjusted analyses of 93,644 inpatient cases (97.7% male, 79.7% non-Hispanic White), higher proportions of non-Veterans Affairs care (adjusted odds ratio [aOR] = 1.02, 95% CI = 1.01-1.04) and living in highly deprived areas (aOR = 1.15, 95% CI = 1.02-1.29) were associated with increased postoperative mortality. Black race (aOR = 0.77, CI = 0.68-0.88) and rurality (aOR = 0.87, CI = 0.79-0.96) were associated with lower postoperative mortality. Adding SDoH to models with only clinical variables did not improve discrimination (c = 0.836 versus c = 0.835). CONCLUSIONS: Postoperative mortality is worse among Veterans receiving more health care outside the VA and living in highly deprived neighborhoods. However, adjusting for SDoH is unlikely to improve existing mortality-benchmarking models. Reduction efforts for postoperative mortality could focus on alleviating care fragmentation and designing care pathways that consider area deprivation. The adjusted survival advantage for rural and Black Veterans may be of interest to private sector hospitals as they attempt to alleviate enduring health-care disparities.


Subject(s)
Social Determinants of Health , Veterans , Humans , Aged , Male , Female , United States/epidemiology , Aged, 80 and over , Veterans/statistics & numerical data , United States Department of Veterans Affairs/statistics & numerical data , United States Department of Veterans Affairs/organization & administration , Risk Factors , Quality Improvement , Postoperative Complications/mortality , Postoperative Complications/epidemiology
7.
J Multidiscip Healthc ; 17: 2021-2030, 2024.
Article in English | MEDLINE | ID: mdl-38716371

ABSTRACT

Objective: The objective of this study was to investigate the risk factors associated with cesarean scar pregnancy (CSP) and to develop a model for predicting intraoperative bleeding risk. Methods: We retrospectively analyzed the clinical data of 208 patients with CSP who were admitted to the People's Hospital of Leshan between January 2018 and December 2022. Based on whether intraoperative bleeding was ≥ 200 mL, we categorized the patients into two groups for comparative analysis: the excessive bleeding group (n = 27) and the control group (n = 181). After identifying the relevant factors, we constructed a prediction model and created a nomogram. Results: We observed significant differences between the two groups in several parameters. These included the time of menstrual cessation (P = 0.002), maximum diameter of the gestational sac (P < 0.001), thickness of the myometrium at the uterine scar (P = 0.001), pre-treatment blood HCG levels (P = 0.016), and the grade of blood flow signals (P < 0.001). We consolidated the above data and constructed a clinical prediction model. The model exhibited favorable results in terms of predictive efficacy, discriminative ability (C-index = 0.894, specificity = 0.834, sensitivity = 0.852), calibration precision (mean absolute error = 0.018), and clinical decision-making utility, indicating its effectiveness. Conclusion: The clinical prediction model for hemorrhage risk developed in this study can assist in the development of appropriate interventions and effectively improve patient prognosis.

8.
Clin Perinatol ; 51(2): 411-424, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38705649

ABSTRACT

Preterm birth (PTB) is a leading cause of morbidity and mortality in children aged under 5 years globally, especially in low-resource settings. It remains a challenge in many low-income and middle-income countries to accurately measure the true burden of PTB due to limited availability of accurate measures of gestational age (GA), first trimester ultrasound dating being the gold standard. Metabolomics biomarkers are a promising area of research that could provide tools for both early identification of high-risk pregnancies and for the estimation of GA and preterm status of newborns postnatally.


Subject(s)
Biomarkers , Gestational Age , Metabolomics , Premature Birth , Humans , Premature Birth/metabolism , Biomarkers/metabolism , Female , Pregnancy , Infant, Newborn
9.
Clin Epidemiol ; 16: 267-279, 2024.
Article in English | MEDLINE | ID: mdl-38645475

ABSTRACT

Background: High risk of intracranial hemorrhage (ICH) is a leading reason for withholding anticoagulation in patients with atrial fibrillation (AF). We aimed to develop a claims-based ICH risk prediction model in older adults with AF initiating oral anticoagulation (OAC). Methods: We used US Medicare claims data to identify new users of OAC aged ≥65 years with AF in 2010-2017. We used regularized Cox regression to select predictors of ICH. We compared our AF ICH risk score with the HAS-BLED bleed risk and Homer fall risk scores by area under the receiver operating characteristic curve (AUC) and assessed net reclassification improvement (NRI) when predicting 1-year risk of ICH. Results: Our study cohort comprised 840,020 patients (mean [SD] age 77.5 [7.4] years; 52.2% female) split geographically into training (3963 ICH events [0.6%] in 629,804 patients) and validation (1397 ICH events [0.7%] in 210,216 patients) sets. Our AF ICH risk score, including 50 predictors, had AUCs of 0.653 and 0.650 in the training and validation sets, superior to the HAS-BLED score's 0.580 and 0.567 (p<0.001) and the Homer score's 0.624 and 0.623 (p<0.001). In the validation set, our AF ICH risk score reclassified 57.8%, 42.5%, and 43.9% of low-, intermediate-, and high-risk patients, respectively, by the HAS-BLED score (NRI: 15.3%, p<0.001). Similarly, it reclassified 0.0%, 44.1%, and 19.4% of low-, intermediate-, and high-risk patients, respectively, by the Homer score (NRI: 21.9%, p<0.001). Conclusion: Our novel claims-based ICH risk prediction model outperformed the standard HAS-BLED score and can inform OAC prescribing decisions.
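Categorical net reclassification improvement, as reported above, can be computed along these lines: NRI = (P(up|event) - P(down|event)) + (P(down|nonevent) - P(up|nonevent)). The risk categories and outcomes below are toy values, not the Medicare cohort:

```python
# Minimal sketch of categorical NRI between two risk scores, on toy data.
import numpy as np

old_cat = np.array([0, 0, 1, 1, 2, 2, 0, 1])  # risk category under the old score
new_cat = np.array([1, 0, 2, 0, 2, 1, 0, 2])  # risk category under the new score
event   = np.array([1, 0, 1, 0, 1, 0, 0, 1])  # 1 = ICH within one year

def nri(old, new, y):
    up, down = new > old, new < old
    ev, ne = y == 1, y == 0
    # events should move up in risk; nonevents should move down
    return (up[ev].mean() - down[ev].mean()) + (down[ne].mean() - up[ne].mean())

print(f"NRI = {nri(old_cat, new_cat, event):.3f}")
```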

10.
Environ Sci Pollut Res Int ; 31(21): 30370-30398, 2024 May.
Article in English | MEDLINE | ID: mdl-38641692

ABSTRACT

Water resources are constantly threatened by pollution from potentially toxic elements (PTEs). In efforts to monitor and mitigate PTE pollution in water resources, machine learning (ML) algorithms have been utilized to predict them. However, review studies have not paid attention to the suitability of input variables utilized for PTE prediction. Therefore, the present review analyzed studies that employed three ML algorithms: MLP-NN (multilayer perceptron neural network), RBF-NN (radial basis function neural network), and ANFIS (adaptive neuro-fuzzy inference system) to predict PTEs in water. A total of 139 models were analyzed to ascertain the input variables utilized, the suitability of the input variables, the trends of the ML model applications, and the comparison of their performances. The present study identified seven groups of input variables commonly used to predict PTEs in water. Group 1 comprised physical parameters (P), chemical parameters (C), and metals (M). Group 2 contains only P and C; Group 3 contains only P and M; Group 4 contains only C and M; Group 5 contains only P; Group 6 contains only C; and Group 7 contains only M. Studies that employed the three algorithms proved that Groups 1, 2, 3, 5, and 7 parameters are suitable input variables for forecasting PTEs in water. The parameters of Groups 4 and 6 also proved to be suitable for the MLP-NN algorithm. However, their suitability with respect to the RBF-NN and ANFIS algorithms could not be ascertained. The most commonly predicted PTEs using the MLP-NN algorithm were Fe, Zn, and As. For the RBF-NN algorithm, they were NO3, Zn, and Pb, and for the ANFIS, they were NO3, Fe, and Mn. Based on correlation and determination coefficients (R, R2), the overall order of performance of the three ML algorithms was ANFIS > RBF-NN > MLP-NN, even though MLP-NN was the most commonly used algorithm.


Subject(s)
Algorithms , Machine Learning , Neural Networks, Computer , Water Pollutants, Chemical , Water Resources , Water Pollutants, Chemical/analysis , Environmental Monitoring/methods , Fuzzy Logic
11.
BMC Med Res Methodol ; 24(1): 77, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38539074

ABSTRACT

BACKGROUND: SARS-CoV-2 vaccines are effective in reducing hospitalization, COVID-19 symptoms, and COVID-19 mortality for nursing home (NH) residents. We sought to compare the accuracy of various machine learning models, examine changes to model performance, and identify resident characteristics that have the strongest associations with 30-day COVID-19 mortality, before and after vaccine availability. METHODS: We conducted a population-based retrospective cohort study analyzing data from all NH facilities across Ontario, Canada. We included all residents diagnosed with SARS-CoV-2 and living in NHs between March 2020 and July 2021. We employed five machine learning algorithms to predict COVID-19 mortality, including logistic regression, LASSO regression, classification and regression trees (CART), random forests, and gradient boosted trees. The discriminative performance of the models was evaluated using the area under the receiver operating characteristic curve (AUC) for each model using 10-fold cross-validation. Model calibration was determined through evaluation of calibration slopes. Variable importance was calculated by repeatedly and randomly permuting the values of each predictor in the dataset and re-evaluating the model's performance. RESULTS: A total of 14,977 NH residents and 20 resident characteristics were included in the model. The cross-validated AUCs were similar across algorithms and ranged from 0.64 to 0.67. Gradient boosted trees and logistic regression had an AUC of 0.67 pre- and post-vaccine availability. CART had the lowest discrimination ability with an AUC of 0.64 pre-vaccine availability, and 0.65 post-vaccine availability. The most influential resident characteristics, irrespective of vaccine availability, included advanced age (≥ 75 years), health instability, functional and cognitive status, sex (male), and polypharmacy. 
CONCLUSIONS: The predictive accuracy and discrimination exhibited by all five examined machine learning algorithms were similar. Both logistic regression and gradient boosted trees exhibit comparable performance and display slight superiority over other machine learning algorithms. We observed consistent model performance both before and after vaccine availability. The influence of resident characteristics on COVID-19 mortality remained consistent across time periods, suggesting that changes to pre-vaccination screening practices for high-risk individuals are effective in the post-vaccination era.
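The permutation-based variable-importance procedure described in the methods can be sketched as follows, using a synthetic data set and a logistic model as stand-ins for the NH cohort and the five algorithms:

```python
# Sketch of permutation importance: shuffle one predictor at a time and
# record the drop in test-set AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=5, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
base = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

rng = np.random.default_rng(0)
importances = []
for j in range(X_te.shape[1]):
    Xp = X_te.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])      # break the feature-outcome link
    perm = roc_auc_score(y_te, model.predict_proba(Xp)[:, 1])
    importances.append(base - perm)           # AUC lost when feature j is permuted
print(importances)
```

In practice the shuffling is repeated several times per predictor and the drops averaged, as the abstract describes.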


Subject(s)
COVID-19 , Aged , Humans , COVID-19/prevention & control , COVID-19 Vaccines , Nursing Homes , Ontario/epidemiology , Retrospective Studies , SARS-CoV-2 , Male , Female
12.
Crit Care Explor ; 6(4): e1067, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38549688

ABSTRACT

OBJECTIVES: To externally validate clinical prediction models that aim to predict progression to invasive ventilation or death on the ICU in patients admitted with confirmed COVID-19 pneumonitis. DESIGN: Single-center retrospective external validation study. DATA SOURCES: Routinely collected healthcare data in the ICU electronic patient record. Curated data recorded for each ICU admission for the purposes of the U.K. Intensive Care National Audit and Research Centre (ICNARC). SETTING: The ICU at Manchester Royal Infirmary, Manchester, United Kingdom. PATIENTS: Three hundred forty-nine patients admitted to ICU with confirmed COVID-19 pneumonitis, older than 18 years, from March 1, 2020, to February 28, 2022. Three hundred two met the inclusion criteria for at least one model. Fifty-five of the 349 patients were admitted before the widespread adoption of dexamethasone for the treatment of severe COVID-19 (pre-dexamethasone patients). OUTCOMES: Ability to be externally validated, discriminate, and calibrate. METHODS: Articles meeting the inclusion criteria were identified, and those that gave sufficient details on predictors used and methods to generate predictions were tested in our cohort of patients, which matched the original publications' inclusion/exclusion criteria and endpoint. RESULTS: Thirteen clinical prediction articles were identified. There was insufficient information available to validate models in five of the articles; a further three contained predictors that were not routinely measured in our ICU cohort and were not validated; three had performance that was substantially lower than previously published (range C-statistic = 0.483-0.605 in pre-dexamethasone patients and C = 0.494-0.564 among all patients). 
One model retained its discriminative ability in our cohort compared with previously published results (C = 0.672 and 0.686), and one retained performance among pre-dexamethasone patients but was poor in all patients (C = 0.793 and 0.596). One model could be calibrated but with poor performance. CONCLUSIONS: Our findings, albeit from a single center, suggest that the published performance of COVID-19 prediction models may not be replicated when translated to other institutions. In light of this, we would encourage bedside intensivists to reflect on the role of clinical prediction models in their own clinical decision-making.

13.
Alzheimers Dement (Amst) ; 16(1): e12572, 2024.
Article in English | MEDLINE | ID: mdl-38545542

ABSTRACT

INTRODUCTION: Identifying mild cognitive impairment (MCI) patients at risk for dementia could facilitate early interventions. Using electronic health records (EHRs), we developed a model to predict MCI to all-cause dementia (ACD) conversion at 5 years. METHODS: Cox proportional hazards model was used to identify predictors of ACD conversion from EHR data in veterans with MCI. Model performance (area under the receiver operating characteristic curve [AUC] and Brier score) was evaluated on a held-out data subset. RESULTS: Of 59,782 MCI patients, 15,420 (25.8%) converted to ACD. The model had good discriminative performance (AUC 0.73 [95% confidence interval (CI) 0.72-0.74]), and calibration (Brier score 0.18 [95% CI 0.17-0.18]). Age, stroke, cerebrovascular disease, myocardial infarction, hypertension, and diabetes were risk factors, while body mass index, alcohol abuse, and sleep apnea were protective factors. DISCUSSION: EHR-based prediction model had good performance in identifying 5-year MCI to ACD conversion and has potential to assist triaging of at-risk patients. Highlights: Of 59,782 veterans with mild cognitive impairment (MCI), 15,420 (25.8%) converted to all-cause dementia within 5 years. Electronic health record prediction models demonstrated good performance (area under the receiver operating characteristic curve 0.73; Brier 0.18). Age and vascular-related morbidities were predictors of dementia conversion. Synthetic data was comparable to real data in modeling MCI to dementia conversion. 
Key Points: An electronic health record-based model using demographic and co-morbidity data had good performance in identifying veterans who convert from mild cognitive impairment (MCI) to all-cause dementia (ACD) within 5 years. Increased age, stroke, cerebrovascular disease, myocardial infarction, hypertension, and diabetes were risk factors for 5-year conversion from MCI to ACD. High body mass index, alcohol abuse, and sleep apnea were protective factors for 5-year conversion from MCI to ACD. Models using synthetic data, analogs of real patient data that retain the distribution, density, and covariance between variables of real patient data but are not attributable to any specific patient, performed just as well as models using real patient data. This could have significant implications in facilitating widely distributed computing of health-care data with minimized patient privacy concern that could accelerate scientific discoveries.
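The two performance measures highlighted above, the AUC for discrimination and the Brier score for calibration-related accuracy, can be computed as follows on synthetic predictions (not the VA cohort):

```python
# Sketch: AUC and Brier score on synthetic, well-calibrated predictions.
import numpy as np
from sklearn.metrics import brier_score_loss, roc_auc_score

rng = np.random.default_rng(0)
p = rng.uniform(0, 1, 1000)   # predicted 5-year conversion risks
y = rng.binomial(1, p)        # outcomes drawn from those risks

auc = roc_auc_score(y, p)
brier = brier_score_loss(y, p)
print(f"AUC={auc:.2f}, Brier={brier:.2f}")
```

The Brier score is the mean squared difference between predicted risk and observed outcome, so lower is better; 0.18, as reported, is in the range typical of a usable but imperfect risk model.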

14.
Psychol Med ; 54(8): 1500-1509, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38497091

ABSTRACT

Precision psychiatry is an emerging field that aims to provide individualized approaches to mental health care. An important strategy to achieve this precision is to reduce uncertainty about prognosis and treatment response. Multivariate analysis and machine learning are used to create outcome prediction models based on clinical data such as demographics, symptom assessments, genetic information, and brain imaging. While much emphasis has been placed on technical innovation, the complex and varied nature of mental health presents significant challenges to the successful implementation of these models. From this perspective, I review ten challenges in the field of precision psychiatry, including the need for studies on real-world populations and realistic clinical outcome definitions, and consideration of treatment-related factors such as placebo effects and non-adherence to prescriptions. Fairness, prospective validation in comparison to current practice and implementation studies of prediction models are other key issues that are currently understudied. A shift is proposed from retrospective studies based on linear and static concepts of disease towards prospective research that considers the importance of contextual factors and the dynamic and complex nature of mental health.


Subject(s)
Mental Disorders , Precision Medicine , Psychiatry , Humans , Precision Medicine/methods , Psychiatry/methods , Mental Disorders/drug therapy , Machine Learning , Prognosis
15.
Biol Psychiatry ; 2024 Feb 24.
Article in English | MEDLINE | ID: mdl-38408535

ABSTRACT

The use of clinical prediction models to produce individualized risk estimates can facilitate the implementation of precision psychiatry. As a source of data from large, clinically representative patient samples, electronic health records (EHRs) provide a platform to develop and validate clinical prediction models, as well as potentially implement them in routine clinical care. The current review describes promising use cases for the application of precision psychiatry to EHR data and considers their performance in terms of discrimination (ability to separate individuals with and without the outcome) and calibration (extent to which predicted risk estimates correspond to observed outcomes), as well as their potential clinical utility (weighing benefits and costs associated with the model compared to different approaches across different assumptions of the number needed to test). We review 4 externally validated clinical prediction models designed to predict psychosis onset, psychotic relapse, cardiometabolic morbidity, and suicide risk. We then discuss the prospects for clinically implementing these models and the potential added value of integrating data from evidence syntheses, standardized psychometric assessments, and biological data into EHRs. Clinical prediction models can utilize routinely collected EHR data in an innovative way, representing a unique opportunity to inform real-world clinical decision making. Combining data from other sources (e.g., meta-analyses) or enhancing EHR data with information from research studies (clinical and biomarker data) may enhance our abilities to improve the performance of clinical prediction models.

16.
Comput Biol Med ; 170: 108014, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38301515

ABSTRACT

BACKGROUND: Across medicine, prognostic models are used to estimate patient risk of certain future health outcomes (e.g., cardiovascular or mortality risk). To develop (or train) prognostic models, historic patient-level training data is needed containing both the predictive factors (i.e., features) and the relevant health outcomes (i.e., labels). Sometimes, when the health outcomes are not recorded in structured data, these are first extracted from textual notes using text mining techniques. Because there exist many studies utilizing text mining to obtain outcome data for prognostic model development, our aim is to study the impact of the text mining quality on downstream prognostic model performance. METHODS: We conducted a simulation study charting the relationship between text mining quality and prognostic model performance using an illustrative case study about in-hospital mortality prediction in intensive care unit patients. We repeatedly developed and evaluated a prognostic model for in-hospital mortality, using outcome data extracted by multiple text mining models of varying quality. RESULTS: Interestingly, we found in our case study that a relatively low-quality text mining model (F1 score ≈ 0.50) could already be used to train a prognostic model with quite good discrimination (area under the receiver operating characteristic curve of around 0.80). The calibration of the risks estimated by the prognostic model seemed unreliable across the majority of settings, even when text mining models were of relatively high quality (F1 ≈ 0.80). DISCUSSION: Developing prognostic models on text-extracted outcomes using imperfect text mining models seems promising. However, it is likely that prognostic models developed using this approach may not produce well-calibrated risk estimates, and require recalibration in (possibly a smaller amount of) manually extracted outcome data.
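The study's core simulation idea can be reproduced in miniature: corrupt gold-standard labels to a chosen quality level, train on the noisy labels, and evaluate against the truth. This toy sketch (synthetic data, invented flip rate; not the authors' code) shows the effect the abstract reports:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score

rng = np.random.default_rng(1)
n = 8000
X = rng.normal(size=(n, 5))
lin = X @ np.array([1.0, -0.8, 0.5, 0.3, 0.0]) - 1.5
y_true = rng.binomial(1, 1 / (1 + np.exp(-lin)))   # gold-standard outcome

# Stand-in for an imperfect text mining model: flip 25% of the labels.
flip = rng.random(n) < 0.25
y_mined = np.where(flip, 1 - y_true, y_true)
f1 = f1_score(y_true, y_mined)                     # text mining "quality"

# Train the prognostic model on noisy labels, evaluate against the truth.
half = n // 2
model = LogisticRegression().fit(X[:half], y_mined[:half])
p_hat = model.predict_proba(X[half:])[:, 1]
auc = roc_auc_score(y_true[half:], p_hat)
print(f"label F1 = {f1:.2f}, AUC vs true outcome = {auc:.2f}")
```

Roughly symmetric label noise attenuates coefficients but largely preserves the ranking of patients, which is consistent with the paper's finding that discrimination survives mediocre text mining while calibration suffers.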


Asunto(s)
Cuidados Críticos , Minería de Datos , Humanos , Pronóstico , Simulación por Computador , Evaluación de Resultado en la Atención de Salud
17.
Bioengineering (Basel) ; 11(1)2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38247937

ABSTRACT

The field of the human microbiome is rapidly growing due to the recent advances in high-throughput sequencing technologies. Meanwhile, there have also been many new analytic pipelines, methods and/or tools developed for microbiome data preprocessing and analytics. They are usually focused on microbiome data with continuous (e.g., body mass index) or binary responses (e.g., diseased vs. healthy), yet multi-categorical responses that have more than two categories are also common in reality. In this paper, we introduce a new unified cloud platform, named MiMultiCat, for the analysis of microbiome data with multi-categorical responses. The two main distinguishing features of MiMultiCat are as follows: First, MiMultiCat streamlines a long sequence of microbiome data preprocessing and analytic procedures on user-friendly web interfaces; as such, it is easy to use for many people in various disciplines (e.g., biology, medicine, public health). Second, MiMultiCat performs both association testing and prediction modeling extensively. For association testing, MiMultiCat handles both ecological (e.g., alpha and beta diversity) and taxonomical (e.g., phylum, class, order, family, genus, species) contexts through covariate-adjusted or unadjusted analysis. For prediction modeling, MiMultiCat employs the random forest and gradient boosting algorithms that are well suited to microbiome data while providing nice visual interpretations. We demonstrate its use through the reanalysis of gut microbiome data on obesity with body mass index categories. MiMultiCat is freely available on our web server.
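MiMultiCat itself is a web platform, so the following is only a hypothetical sketch of the kind of multi-categorical prediction it automates: random forest classification of simulated compositional (relative-abundance) features into three outcome categories. All numbers here are invented:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n, n_taxa = 300, 40
counts = rng.poisson(5, size=(n, n_taxa)).astype(float)
rel_abund = counts / counts.sum(axis=1, keepdims=True)  # compositional features

# Three hypothetical BMI categories driven by two taxa plus noise.
score = 8 * rel_abund[:, 0] - 8 * rel_abund[:, 1] + rng.normal(0, 0.05, n)
y = np.digitize(score, np.quantile(score, [1 / 3, 2 / 3]))  # classes 0, 1, 2

rf = RandomForestClassifier(n_estimators=200, random_state=0)
acc = cross_val_score(rf, rel_abund, y, cv=5).mean()
print(f"5-fold accuracy: {acc:.2f}")
```

Tree ensembles are a natural fit here because relative abundances are sparse, bounded, and interact nonlinearly, and because feature importances give the kind of visual interpretation the platform advertises.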

18.
Stat Med ; 43(7): 1384-1396, 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38297411

ABSTRACT

Clinical prediction models are estimated using a sample of limited size from the target population, leading to uncertainty in predictions, even when the model is correctly specified. Generally, not all patient profiles are observed uniformly in model development. As a result, sampling uncertainty varies between individual patients' predictions. We aimed to develop an intuitive measure of individual prediction uncertainty. The variance of a patient's prediction can be equated to the variance of the sample mean outcome in n* hypothetical patients with the same predictor values. This hypothetical sample size n* can be interpreted as the number of similar patients n_eff that the prediction is effectively based on, given that the model is correct. For generalized linear models, we derived analytical expressions for the effective sample size. In addition, we illustrated the concept in patients with acute myocardial infarction. In model development, n_eff can be used to balance accuracy versus uncertainty of predictions. In a validation sample, the distribution of n_eff indicates which patients were more and less represented in the development data, and whether predictions might be too uncertain for some to be practically meaningful. In a clinical setting, the effective sample size may facilitate communication of uncertainty about predictions. We propose the effective sample size as a clinically interpretable measure of uncertainty in individual predictions. Its implications should be explored further for the development, validation and clinical implementation of prediction models.


Subject(s)
Uncertainty, Humans, Linear Models, Sample Size
19.
Blood Rev ; 65: 101170, 2024 May.
Article in English | MEDLINE | ID: mdl-38290895

ABSTRACT

Hodgkin lymphoma is a rare but highly curable form of cancer, primarily afflicting adolescents and young adults. Despite multiple seminal trials over the past twenty years, there is no single consensus-based treatment approach beyond the use of multi-agent chemotherapy with curative intent. The use of radiation continues to be debated in early-stage disease, as part of combined-modality treatment, as well as in salvage, as an important form of consolidation. While short-term disease outcomes have varied little across these different approaches in both early- and advanced-stage disease, the potential for severe, longer-term toxicity has varied considerably. Over the past decade, novel therapeutics have been employed in the retrieval setting, both in preparation for and as consolidation after autologous stem cell transplant. More recently, these novel therapeutics have moved to the frontline setting, initially compared to standard-of-care treatment and later in a direct head-to-head comparison combined with multi-agent chemotherapy. In 2018, we established the HoLISTIC Consortium, bringing together disease and methods experts to develop clinical decision models based on individual patient data to guide providers, patients, and caregivers in decision-making. In this review, we detail the steps we followed to create the master database of individual patient data from patients treated over the past 20 years, using principles of data science. We then describe different methodological approaches we are taking to clinical decision making, beginning with clinical prediction tools at the time of diagnosis and moving to multi-state models that incorporate treatments and their response. Finally, we describe how simulation modeling can be used to estimate risks of late effects, based on cumulative exposure from frontline and salvage treatment. The resultant database and tools are dynamic, with the expectation that they will be updated as better and more complete information becomes available.
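To make the multi-state and simulation ideas concrete: not the consortium's models, but a deterministic toy Markov cohort with entirely invented annual transition probabilities, showing how state-occupancy simulation yields long-term outcome estimates:

```python
import numpy as np

# States: 0 = remission, 1 = relapse, 2 = late toxicity, 3 = dead.
P = np.array([                 # annual transition matrix (invented numbers)
    [0.93, 0.04, 0.02, 0.01],
    [0.30, 0.55, 0.05, 0.10],
    [0.00, 0.00, 0.95, 0.05],
    [0.00, 0.00, 0.00, 1.00],
])

# Cohort-level simulation: track state-occupancy probabilities over time.
occupancy = np.array([1.0, 0.0, 0.0, 0.0])   # everyone starts in remission
for year in range(20):
    occupancy = occupancy @ P

survival_20y = 1.0 - occupancy[3]
print(f"simulated 20-year survival: {survival_20y:.2f}")
```

In a real model the transition probabilities would themselves be functions of treatment exposure and patient covariates estimated from the individual patient database, which is what lets the simulation attribute late-effect risk to cumulative frontline and salvage exposure.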


Subject(s)
Hodgkin Disease, Adolescent, Young Adult, Humans, Hodgkin Disease/diagnosis, Hodgkin Disease/therapy, Neoplasm Recurrence, Local/drug therapy, Combined Modality Therapy, Stem Cell Transplantation/methods, Disease Progression, Antineoplastic Combined Chemotherapy Protocols/adverse effects
20.
Am J Epidemiol ; 193(1): 203-213, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-37650647

ABSTRACT

We developed and validated a claims-based algorithm that classifies patients into obesity categories. Using Medicare (2007-2017) and Medicaid (2000-2014) claims data linked to 2 electronic health record (EHR) systems in Boston, Massachusetts, we identified a cohort of patients with an EHR-based body mass index (BMI) measurement (calculated as weight (kg)/height (m)²). We used regularized regression to select from 137 variables and built generalized linear models to classify patients with BMIs of ≥25, ≥30, and ≥40. We developed the prediction model using EHR system 1 (training set) and validated it in EHR system 2 (validation set). The cohort contained 123,432 patients in the Medicare population and 40,736 patients in the Medicaid population. The model comprised 97 variables in the Medicare set and 95 in the Medicaid set, including BMI-related diagnosis codes, cardiovascular and antidiabetic drugs, and obesity-related comorbidities. The areas under the receiver-operating-characteristic curve in the validation set were 0.72, 0.75, and 0.83 (Medicare) and 0.66, 0.66, and 0.70 (Medicaid) for BMIs of ≥25, ≥30, and ≥40, respectively. The positive predictive values were 81.5%, 80.6%, and 64.7% (Medicare) and 81.6%, 77.5%, and 62.5% (Medicaid), for BMIs of ≥25, ≥30, and ≥40, respectively. The proposed model can identify obesity categories in claims databases when BMI measurements are missing and can be used for confounding adjustment, defining subgroups, or probabilistic bias analysis.
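The two-step pipeline described (regularized selection, then a GLM, validated on held-out data) can be sketched roughly as follows. The count of 137 candidate variables comes from the abstract; all data and every other detail here are invented stand-ins:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.metrics import precision_score, roc_auc_score

rng = np.random.default_rng(4)
n, n_vars = 4000, 137                    # 137 candidate claims variables
X = rng.binomial(1, 0.2, size=(n, n_vars)).astype(float)
beta = np.zeros(n_vars)
beta[:10] = 1.2                          # only 10 codes truly predictive
p = 1 / (1 + np.exp(-(X @ beta - 3.0)))
y = rng.binomial(1, p)                   # stand-in for EHR-measured BMI >= 30

# Step 1: L1-regularized regression selects variables on the training half.
sel = LogisticRegressionCV(penalty="l1", solver="liblinear", cv=5)
sel.fit(X[:2000], y[:2000])
keep = np.flatnonzero(sel.coef_[0])

# Step 2: plain GLM on the selected variables, checked on held-out data.
glm = LogisticRegression().fit(X[:2000][:, keep], y[:2000])
p_val = glm.predict_proba(X[2000:][:, keep])[:, 1]
auc = roc_auc_score(y[2000:], p_val)
ppv = precision_score(y[2000:], p_val >= 0.5)
print(f"AUC = {auc:.2f}, PPV = {ppv:.2f}")
```

Reporting PPV alongside AUC, as the paper does, matters because a classifier used to impute obesity status in claims data is judged by how often its positive calls are correct, not only by its ranking ability.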


Subject(s)
Medicare, Obesity, Aged, Humans, United States/epidemiology, Obesity/epidemiology, Body Mass Index, Comorbidity, Hypoglycemic Agents, Electronic Health Records