Results 1 - 18 of 18
1.
J Biomed Inform ; 113: 103621, 2021 01.
Article in English | MEDLINE | ID: mdl-33220494

ABSTRACT

The use of machine learning to guide clinical decision making has the potential to worsen existing health disparities. Several recent works frame the problem as that of algorithmic fairness, a framework that has attracted considerable attention and criticism. However, the appropriateness of this framework is unclear due to both ethical as well as technical considerations, the latter of which include trade-offs between measures of fairness and model performance that are not well-understood for predictive models of clinical outcomes. To inform the ongoing debate, we conduct an empirical study to characterize the impact of penalizing group fairness violations on an array of measures of model performance and group fairness. We repeat the analysis across multiple observational healthcare databases, clinical outcomes, and sensitive attributes. We find that procedures that penalize differences between the distributions of predictions across groups induce nearly-universal degradation of multiple performance metrics within groups. On examining the secondary impact of these procedures, we observe heterogeneity of the effect of these procedures on measures of fairness in calibration and ranking across experimental conditions. Beyond the reported trade-offs, we emphasize that analyses of algorithmic fairness in healthcare lack the contextual grounding and causal awareness necessary to reason about the mechanisms that lead to health disparities, as well as about the potential of algorithmic fairness methods to counteract those mechanisms. In light of these limitations, we encourage researchers building predictive models for clinical use to step outside the algorithmic fairness frame and engage critically with the broader sociotechnical context surrounding the use of machine learning in healthcare.
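The idea of penalizing group fairness violations during training can be illustrated with a small sketch. It assumes the simplest possible disparity term: the absolute gap between each group's mean predicted probability and the overall mean. This is a stand-in for "differences between the distributions of predictions across groups" and is not necessarily the penalty used in the study. The names `penalized_objective`, `mean_prediction_gap`, and the weight `lam` are hypothetical.

```python
import math


def log_loss(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy, with probabilities clipped for stability."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def mean_prediction_gap(y_prob, groups):
    """Sum over groups of |group mean prediction - overall mean prediction|.

    Assumption: an illustrative disparity term, not the study's penalty.
    """
    overall = sum(y_prob) / len(y_prob)
    by_group = {}
    for p, g in zip(y_prob, groups):
        by_group.setdefault(g, []).append(p)
    return sum(abs(sum(v) / len(v) - overall) for v in by_group.values())


def penalized_objective(y_true, y_prob, groups, lam):
    """Prediction loss plus lam times the group disparity term."""
    return log_loss(y_true, y_prob) + lam * mean_prediction_gap(y_prob, groups)
```

Setting `lam` to zero recovers the unpenalized loss. Larger values trade predictive fit for smaller gaps between groups, which is the kind of trade-off the abstract reports on.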


Subjects
Delivery of Health Care , Machine Learning , Empirical Research
2.
J Biomed Inform ; 113: 103637, 2021 01.
Article in English | MEDLINE | ID: mdl-33290879

ABSTRACT

Widespread adoption of electronic health records (EHRs) has fueled the use of machine learning to build prediction models for various clinical outcomes. However, this process is often constrained by having a relatively small number of patient records for training the model. We demonstrate that using patient representation schemes inspired by techniques in natural language processing can increase the accuracy of clinical prediction models by transferring information learned from the entire patient population to the task of training a specific model, where only a subset of the population is relevant. Such patient representation schemes enable a 3.5% mean improvement in AUROC on five prediction tasks compared to standard baselines, with the average improvement rising to 19% when only a small number of patient records are available for training the clinical prediction model.


Subjects
Electronic Health Records , Models, Statistical , Humans , Machine Learning , Natural Language Processing , Prognosis
3.
EClinicalMedicine ; 70: 102479, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38685924

ABSTRACT

Background: Artificial intelligence (AI) has repeatedly been shown to encode historical inequities in healthcare. We aimed to develop a framework to quantitatively assess the performance equity of health AI technologies and to illustrate its utility via a case study. Methods: Here, we propose a methodology to assess whether health AI technologies prioritise performance for patient populations experiencing worse outcomes, that is complementary to existing fairness metrics. We developed the Health Equity Assessment of machine Learning performance (HEAL) framework designed to quantitatively assess the performance equity of health AI technologies via a four-step interdisciplinary process to understand and quantify domain-specific criteria, and the resulting HEAL metric. As an illustrative case study (analysis conducted between October 2022 and January 2023), we applied the HEAL framework to a dermatology AI model. A set of 5420 teledermatology cases (store-and-forward cases from patients of 20 years or older, submitted from primary care providers in the USA and skin cancer clinics in Australia), enriched for diversity in age, sex and race/ethnicity, was used to retrospectively evaluate the AI model's HEAL metric, defined as the likelihood that the AI model performs better for subpopulations with worse average health outcomes as compared to others. The likelihood that AI performance was anticorrelated to pre-existing health outcomes was estimated using bootstrap methods as the probability that the negated Spearman's rank correlation coefficient (i.e., "R") was greater than zero. Positive values of R suggest that subpopulations with poorer health outcomes have better AI model performance. Thus, the HEAL metric, defined as p (R >0), measures how likely the AI technology is to prioritise performance for subpopulations with worse average health outcomes as compared to others (presented as a percentage below). 
Health outcomes were quantified as disability-adjusted life years (DALYs) when grouping by sex and age, and years of life lost (YLLs) when grouping by race/ethnicity. AI performance was measured as top-3 agreement with the reference diagnosis from a panel of 3 dermatologists per case. Findings: Across all dermatologic conditions, the HEAL metric was 80.5% for prioritizing AI performance of racial/ethnic subpopulations based on YLLs, and 92.1% and 0.0% respectively for prioritizing AI performance of sex and age subpopulations based on DALYs. Certain dermatologic conditions were significantly associated with greater AI model performance compared to a reference category of less common conditions. For skin cancer conditions, the HEAL metric was 73.8% for prioritizing AI performance of age subpopulations based on DALYs. Interpretation: Analysis using the proposed HEAL framework showed that the dermatology AI model prioritised performance for race/ethnicity, sex (all conditions) and age (cancer conditions) subpopulations with respect to pre-existing health disparities. More work is needed to investigate ways of promoting equitable AI performance across age for non-cancer conditions and to better understand how AI models can contribute towards improving equity in health outcomes. Funding: Google LLC.
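The HEAL metric described above, p(R > 0) estimated by bootstrap, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Cases are resampled with replacement, per-subgroup accuracy stands in for "AI performance", and the health score is assumed to be oriented so that larger values mean better health (a burden measure such as DALYs would need to be negated first). The names `heal_metric`, `cases`, and `health_score` are hypothetical.

```python
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for constant input; treat as 0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def heal_metric(cases, health_score, n_boot=1000, seed=0):
    """Bootstrap estimate of p(R > 0), where R = -spearman(health, performance).

    cases: list of (subgroup, correct) pairs, correct in {0, 1}.
    health_score: dict subgroup -> score, larger meaning better health
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    groups = sorted(health_score)
    positive = 0
    used = 0
    for _ in range(n_boot):
        sample = [cases[rng.randrange(len(cases))] for _ in cases]
        hits = {g: 0 for g in groups}
        counts = {g: 0 for g in groups}
        for g, correct in sample:
            hits[g] += correct
            counts[g] += 1
        if any(counts[g] == 0 for g in groups):
            continue  # skip resamples in which a subgroup is missing
        perf = [hits[g] / counts[g] for g in groups]
        health = [health_score[g] for g in groups]
        used += 1
        if -spearman(health, perf) > 0:
            positive += 1
    return positive / used if used else float("nan")
```

A value near 1 indicates that, across resamples, the model tends to perform better for subgroups with worse health outcomes. A value near 0 indicates the opposite ordering.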

4.
Sci Rep ; 13(1): 3767, 2023 03 07.
Article in English | MEDLINE | ID: mdl-36882576

ABSTRACT

Temporal distribution shift negatively impacts the performance of clinical prediction models over time. Pretraining foundation models using self-supervised learning on electronic health records (EHR) may be effective in acquiring informative global patterns that can improve the robustness of task-specific models. The objective was to evaluate the utility of EHR foundation models in improving the in-distribution (ID) and out-of-distribution (OOD) performance of clinical prediction models. Transformer- and gated recurrent unit-based foundation models were pretrained on EHR of up to 1.8 M patients (382 M coded events) collected within pre-determined year groups (e.g., 2009-2012) and were subsequently used to construct patient representations for patients admitted to inpatient units. These representations were used to train logistic regression models to predict hospital mortality, long length of stay, 30-day readmission, and ICU admission. We compared our EHR foundation models with baseline logistic regression models learned on count-based representations (count-LR) in ID and OOD year groups. Performance was measured using area-under-the-receiver-operating-characteristic curve (AUROC), area-under-the-precision-recall curve, and absolute calibration error. Both transformer and recurrent-based foundation models generally showed better ID and OOD discrimination relative to count-LR and often exhibited less decay in tasks where there is observable degradation of discrimination performance (average AUROC decay of 3% for transformer-based foundation model vs. 7% for count-LR after 5-9 years). In addition, the performance and robustness of transformer-based foundation models continued to improve as pretraining set size increased. These results suggest that pretraining EHR foundation models at scale is a useful approach for developing clinical prediction models that perform well in the presence of temporal distribution shift.
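To make the notion of AUROC decay under temporal shift concrete, the sketch below computes AUROC from its pairwise definition and compares an in-distribution (ID) value with an out-of-distribution (OOD) value. Treating decay as a relative drop is an assumption of this sketch; the abstract does not say whether the reported percentages are relative or absolute. The function names are hypothetical.

```python
def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (O(n^2) pairwise definition)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_decay(auroc_id, auroc_ood):
    """Fractional drop in AUROC from the ID to the OOD year group.

    Assumption: decay is expressed relative to the ID value.
    """
    return (auroc_id - auroc_ood) / auroc_id
```

In practice one would compute `auroc` separately on the ID and OOD year groups for each model and compare the resulting decay between representations.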


Subjects
Electric Power Supplies , Electronic Health Records , Humans , Hospital Mortality , Hospitalization
5.
Methods Inf Med ; 62(1-02): 60-70, 2023 05.
Article in English | MEDLINE | ID: mdl-36812932

ABSTRACT

BACKGROUND: Temporal dataset shift can cause degradation in model performance as discrepancies between training and deployment data grow over time. The primary objective was to determine whether parsimonious models produced by specific feature selection methods are more robust to temporal dataset shift as measured by out-of-distribution (OOD) performance, while maintaining in-distribution (ID) performance. METHODS: Our dataset consisted of intensive care unit patients from MIMIC-IV categorized by year groups (2008-2010, 2011-2013, 2014-2016, and 2017-2019). We trained baseline models using L2-regularized logistic regression on 2008-2010 to predict in-hospital mortality, long length of stay (LOS), sepsis, and invasive ventilation in all year groups. We evaluated three feature selection methods: L1-regularized logistic regression (L1), Remove and Retrain (ROAR), and causal feature selection. We assessed whether a feature selection method could maintain ID performance (2008-2010) and improve OOD performance (2017-2019). We also assessed whether parsimonious models retrained on OOD data performed as well as oracle models trained on all features in the OOD year group. RESULTS: The baseline model showed significantly worse OOD performance with the long LOS and sepsis tasks when compared with the ID performance. L1 and ROAR retained 3.7 to 12.6% of all features, whereas causal feature selection generally retained fewer features. Models produced by L1 and ROAR exhibited similar ID and OOD performance as the baseline models. The retraining of these models on 2017-2019 data using features selected from training on 2008-2010 data generally reached parity with oracle models trained directly on 2017-2019 data using all available features. Causal feature selection led to heterogeneous results with the superset maintaining ID performance while improving OOD calibration only on the long LOS task. 
CONCLUSIONS: While model retraining can mitigate the impact of temporal dataset shift on parsimonious models produced by L1 and ROAR, new methods are required to proactively improve temporal robustness.


Subjects
Clinical Medicine , Sepsis , Female , Pregnancy , Humans , Hospital Mortality , Length of Stay , Machine Learning
6.
J Am Med Inform Assoc ; 30(12): 2004-2011, 2023 11 17.
Article in English | MEDLINE | ID: mdl-37639620

ABSTRACT

OBJECTIVE: Development of electronic health records (EHR)-based machine learning models for pediatric inpatients is challenged by limited training data. Self-supervised learning using adult data may be a promising approach to creating robust pediatric prediction models. The primary objective was to determine whether a self-supervised model trained in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients, for pediatric inpatient clinical prediction tasks. MATERIALS AND METHODS: This retrospective cohort study used EHR data and included patients with at least one admission to an inpatient unit. One admission per patient was randomly selected. Adult inpatients were 18 years or older while pediatric inpatients were more than 28 days and less than 18 years. Admissions were temporally split into training (January 1, 2008 to December 31, 2019), validation (January 1, 2020 to December 31, 2020), and test (January 1, 2021 to August 1, 2022) sets. Primary comparison was a self-supervised model trained in adult inpatients versus count-based logistic regression models trained in pediatric inpatients. Primary outcome was mean area-under-the-receiver-operating-characteristic-curve (AUROC) for 11 distinct clinical outcomes. Models were evaluated in pediatric inpatients. RESULTS: When evaluated in pediatric inpatients, mean AUROC of self-supervised model trained in adult inpatients (0.902) was noninferior to count-based logistic regression models trained in pediatric inpatients (0.868) (mean difference = 0.034, 95% CI=0.014-0.057; P < .001 for noninferiority and P = .006 for superiority). CONCLUSIONS: Self-supervised learning in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients. This finding suggests transferability of self-supervised models trained in adult patients to pediatric patients, without requiring costly model retraining.
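The noninferiority and superiority conclusions follow from where the confidence interval of the mean AUROC difference sits relative to zero and to a noninferiority margin. The sketch below shows that decision rule. The margin is not stated in the abstract, so the value used in the example is purely hypothetical, as is the function name.

```python
def noninferiority_decision(ci_lower, ci_upper, margin):
    """Classify a difference (new minus reference) from its confidence interval.

    margin: positive noninferiority margin (hypothetical in this sketch).
    Superiority implies noninferiority, so it is checked first.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        return "superior"
    if ci_lower > -margin:
        return "noninferior"
    return "inconclusive"


# Reported 95% CI of the mean AUROC difference: 0.014 to 0.057.
# The margin of 0.05 is an illustrative assumption, not from the abstract.
decision = noninferiority_decision(0.014, 0.057, margin=0.05)
```

Because the lower bound of the reported interval is above zero, the rule returns "superior" for any positive margin, which is consistent with the significant superiority test in the abstract.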


Subjects
Inpatients , Machine Learning , Humans , Adult , Child , Retrospective Studies , Supervised Machine Learning , Electronic Health Records
7.
Nat Med ; 29(11): 2929-2938, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37884627

ABSTRACT

Artificial intelligence as a medical device is increasingly being applied to healthcare for diagnosis, risk stratification and resource allocation. However, a growing body of evidence has highlighted the risk of algorithmic bias, which may perpetuate existing health inequity. This problem arises in part because of systemic inequalities in dataset curation, unequal opportunity to participate in research and inequalities of access. This study aims to explore existing standards, frameworks and best practices for ensuring adequate data diversity in health datasets. Exploring the body of existing literature and expert views is an important step towards the development of consensus-based guidelines. The study comprises two parts: a systematic review of existing standards, frameworks and best practices for healthcare datasets; and a survey and thematic analysis of stakeholder views of bias, health equity and best practices for artificial intelligence as a medical device. We found that the need for dataset diversity was well described in literature, and experts generally favored the development of a robust set of guidelines, but there were mixed views about how these could be implemented practically. The outputs of this study will be used to inform the development of standards for transparency of data diversity in health datasets (the STANDING Together initiative).


Subjects
Artificial Intelligence , Delivery of Health Care , Humans , Consensus , Systematic Reviews as Topic
8.
BMJ Health Care Inform ; 29(1)2022 Apr.
Article in English | MEDLINE | ID: mdl-35396247

ABSTRACT

OBJECTIVES: The American College of Cardiology and the American Heart Association guidelines on primary prevention of atherosclerotic cardiovascular disease (ASCVD) recommend using 10-year ASCVD risk estimation models to initiate statin treatment. For guideline-concordant decision-making, risk estimates need to be calibrated. However, existing models are often miscalibrated for race, ethnicity and sex based subgroups. This study evaluates two algorithmic fairness approaches to adjust the risk estimators (group recalibration and equalised odds) for their compatibility with the assumptions underpinning the guidelines' decision rules. METHODS: Using an updated pooled cohorts data set, we derive unconstrained, group-recalibrated and equalised odds-constrained versions of the 10-year ASCVD risk estimators, and compare their calibration at guideline-concordant decision thresholds. RESULTS: We find that, compared with the unconstrained model, group-recalibration improves calibration at one of the relevant thresholds for each group, but exacerbates differences in false positive and false negative rates between groups. An equalised odds constraint, meant to equalise error rates across groups, does so by miscalibrating the model overall and at relevant decision thresholds. DISCUSSION: Hence, because of induced miscalibration, decisions guided by risk estimators learned with an equalised odds fairness constraint are not concordant with existing guidelines. Conversely, recalibrating the model separately for each group can increase guideline compatibility, while increasing intergroup differences in error rates. As such, comparisons of error rates across groups can be misleading when guidelines recommend treating at fixed decision thresholds. CONCLUSION: The illustrated tradeoffs between satisfying a fairness criterion and retaining guideline compatibility underscore the need to evaluate models in the context of downstream interventions.
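The error-rate comparison at a fixed decision threshold can be sketched as below: each patient is flagged for treatment when the estimated risk meets the threshold, and false positive and false negative rates are tallied per group. This is an illustrative sketch rather than the study's code. The function name and the threshold of 0.075 used in the example are assumptions chosen for illustration and are not taken from the abstract.

```python
def error_rates_by_group(y_true, y_risk, groups, threshold):
    """False positive and false negative rates per group at one threshold."""
    counts = {}
    for y, r, g in zip(y_true, y_risk, groups):
        c = counts.setdefault(g, {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
        treat = r >= threshold
        if y == 1:
            c["tp" if treat else "fn"] += 1
        else:
            c["fp" if treat else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        neg = c["fp"] + c["tn"]
        pos = c["fn"] + c["tp"]
        rates[g] = {
            "fpr": c["fp"] / neg if neg else float("nan"),
            "fnr": c["fn"] / pos if pos else float("nan"),
        }
    return rates


# Toy data; the 0.075 threshold is an illustrative assumption.
rates = error_rates_by_group(
    y_true=[1, 0, 1, 0],
    y_risk=[0.10, 0.10, 0.05, 0.02],
    groups=["a", "a", "b", "b"],
    threshold=0.075,
)
```

Equalised odds asks these rates to match across groups. The abstract's point is that forcing them to match can distort the risk estimates at exactly the thresholds that guidelines rely on.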


Subjects
Atherosclerosis , Cardiology , Cardiovascular Diseases , Hydroxymethylglutaryl-CoA Reductase Inhibitors , American Heart Association , Atherosclerosis/drug therapy , Atherosclerosis/prevention & control , Cardiovascular Diseases/prevention & control , Humans , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , United States
9.
Sci Rep ; 12(1): 3254, 2022 02 28.
Article in English | MEDLINE | ID: mdl-35228563

ABSTRACT

Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this problem without introducing additional harms. We conduct a large-scale empirical study of DRO and several variations of standard learning procedures to identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations compared to standard approaches for learning predictive models from electronic health records data. In the course of our evaluation, we introduce an extension to DRO approaches that allows for specification of the metric used to assess worst-case performance. We conduct the analysis for models that predict in-hospital mortality, prolonged length of stay, and 30-day readmission for inpatient admissions, and predict in-hospital mortality using intensive care data. We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures using the entire training dataset. These results imply that when it is of interest to improve model performance for patient subpopulations beyond what can be achieved with standard practices, it may be necessary to do so via data collection techniques that increase the effective sample size or reduce the level of noise in the prediction problem.


Subjects
Hospitalization , Patient Readmission , Electronic Health Records , Hospital Mortality , Humans
10.
Sci Rep ; 12(1): 2726, 2022 02 17.
Article in English | MEDLINE | ID: mdl-35177653

ABSTRACT

Temporal dataset shift associated with changes in healthcare over time is a barrier to deploying machine learning-based clinical decision support systems. Algorithms that learn robust models by estimating invariant properties across time periods for domain generalization (DG) and unsupervised domain adaptation (UDA) might be suitable to proactively mitigate dataset shift. The objective was to characterize the impact of temporal dataset shift on clinical prediction models and benchmark DG and UDA algorithms on improving model robustness. In this cohort study, intensive care unit patients from the MIMIC-IV database were categorized by year groups (2008-2010, 2011-2013, 2014-2016 and 2017-2019). Tasks were predicting mortality, long length of stay, sepsis and invasive ventilation. Feedforward neural networks were used as prediction models. The baseline experiment trained models using empirical risk minimization (ERM) on 2008-2010 (ERM[08-10]) and evaluated them on subsequent year groups. DG experiment trained models using algorithms that estimated invariant properties using 2008-2016 and evaluated them on 2017-2019. UDA experiment leveraged unlabelled samples from 2017 to 2019 for unsupervised distribution matching. DG and UDA models were compared to ERM[08-16] models trained using 2008-2016. Main performance measures were area-under-the-receiver-operating-characteristic curve (AUROC), area-under-the-precision-recall curve and absolute calibration error. Threshold-based metrics including false-positives and false-negatives were used to assess the clinical impact of temporal dataset shift and its mitigation strategies. In the baseline experiments, dataset shift was most evident for sepsis prediction (maximum AUROC drop, 0.090; 95% confidence interval (CI), 0.080-0.101). 
Considering a scenario of 100 consecutively admitted patients showed that ERM[08-10] applied to 2017-2019 was associated with one additional false-negative among 11 patients with sepsis, when compared to the model applied to 2008-2010. When compared with ERM[08-16], DG and UDA experiments failed to produce more robust models (range of AUROC difference, -0.003 to 0.050). In conclusion, DG and UDA failed to produce more robust models compared to ERM in the setting of temporal dataset shift. Alternate approaches are required to preserve model performance over time in clinical medicine.


Subjects
Databases, Factual , Intensive Care Units , Length of Stay , Models, Biological , Neural Networks, Computer , Sepsis , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Sepsis/mortality , Sepsis/therapy
11.
Nat Commun ; 13(1): 1678, 2022 03 30.
Article in English | MEDLINE | ID: mdl-35354802

ABSTRACT

Linear mixed models are commonly used in healthcare-based association analyses for analyzing multi-site data with heterogeneous site-specific random effects. Due to regulations for protecting patients' privacy, sensitive individual patient data (IPD) typically cannot be shared across sites. We propose an algorithm for fitting distributed linear mixed models (DLMMs) without sharing IPD across sites. This algorithm achieves results identical to those achieved using pooled IPD from multiple sites (i.e., the same effect size and standard error estimates), hence demonstrating the lossless property. The algorithm requires each site to contribute minimal aggregated data in only one round of communication. We demonstrate the lossless property of the proposed DLMM algorithm by investigating the associations between demographic and clinical characteristics and length of hospital stay in COVID-19 patients using administrative claims from the UnitedHealth Group Clinical Discovery Database. We extend this association study by incorporating 120,609 COVID-19 patients from 11 collaborative data sources worldwide.
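The "lossless, one round of communication" property can be illustrated on a much simpler model than the one in the paper. The sketch below is a deliberate simplification: ordinary least squares with a single covariate and no random effects, which is not the DLMM algorithm itself. Each site shares only five aggregated numbers, and the combined fit equals the fit on pooled patient-level data. All function names are hypothetical.

```python
def site_summary(x, y):
    """Aggregated statistics a site shares instead of patient-level data."""
    return {
        "n": len(x),
        "sx": sum(x),
        "sy": sum(y),
        "sxx": sum(a * a for a in x),
        "sxy": sum(a * b for a, b in zip(x, y)),
    }


def fit_from_summaries(summaries):
    """Ordinary least squares intercept and slope from summed site summaries."""
    n = sum(s["n"] for s in summaries)
    sx = sum(s["sx"] for s in summaries)
    sy = sum(s["sy"] for s in summaries)
    sxx = sum(s["sxx"] for s in summaries)
    sxy = sum(s["sxy"] for s in summaries)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("covariate has no variation across the pooled data")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Two sites, each sending one summary in a single round of communication.
site_a = site_summary([0.0, 1.0], [1.0, 3.0])
site_b = site_summary([2.0, 3.0], [5.0, 7.0])
distributed_fit = fit_from_summaries([site_a, site_b])

# Reference: the same model fitted on pooled patient-level data.
pooled_fit = fit_from_summaries(
    [site_summary([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])]
)
```

The two fits agree because the estimator depends on the data only through sums that add across sites. The abstract describes an algorithm that achieves the same kind of equivalence for linear mixed models with site-specific random effects.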


Subjects
COVID-19 , Algorithms , COVID-19/epidemiology , Confidentiality , Databases, Factual , Humans , Linear Models
12.
J Am Med Inform Assoc ; 28(10): 2258-2264, 2021 09 18.
Article in English | MEDLINE | ID: mdl-34350942

ABSTRACT

Using a risk stratification model to guide clinical practice often requires the choice of a cutoff, called the decision threshold, on the model's output to trigger a subsequent action such as an electronic alert. Choosing this cutoff is not always straightforward. We propose a flexible approach that leverages the collective information in treatment decisions made in real life to learn reference decision thresholds from physician practice. Using the example of prescribing a statin for primary prevention of cardiovascular disease based on 10-year risk calculated by the 2013 pooled cohort equations, we demonstrate the feasibility of using real-world data to learn the implicit decision threshold that reflects existing physician behavior. Learning a decision threshold in this manner allows for evaluation of a proposed operating point against the threshold reflective of the community standard of care. Furthermore, this approach can be used to monitor and audit model-guided clinical decision making following model deployment.
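One simple way to picture learning an implicit threshold from observed treatment decisions is a one-dimensional search: try each observed risk value as a cutoff and keep the one whose "treat if risk is at or above the cutoff" rule agrees most often with what physicians did. This is an assumed, simplified procedure for illustration and is not necessarily the estimator used in the paper. The function name and the toy data are hypothetical.

```python
def learn_implicit_threshold(risks, treated):
    """Return (threshold, agreement) for the cutoff that best reproduces the
    observed decisions under the rule 'treat if risk >= threshold'.

    Ties are broken in favour of the lowest candidate threshold.
    """
    if not risks or len(risks) != len(treated):
        raise ValueError("risks and treated must be non-empty and equal length")
    best_threshold, best_agree = None, -1
    for t in sorted(set(risks)):
        agree = sum((r >= t) == bool(d) for r, d in zip(risks, treated))
        if agree > best_agree:
            best_threshold, best_agree = t, agree
    return best_threshold, best_agree / len(risks)


# Toy example: estimated 10-year risk and whether a statin was prescribed.
threshold, agreement = learn_implicit_threshold(
    risks=[0.02, 0.05, 0.08, 0.12, 0.20],
    treated=[0, 0, 1, 1, 1],
)
```

The learned cutoff can then be compared with a proposed operating point, or tracked over time after deployment, as the abstract suggests.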


Subjects
Cardiovascular Diseases , Clinical Decision-Making , Humans , Risk Assessment
13.
Appl Clin Inform ; 12(4): 808-815, 2021 08.
Article in English | MEDLINE | ID: mdl-34470057

ABSTRACT

OBJECTIVE: The change in performance of machine learning models over time as a result of temporal dataset shift is a barrier to machine learning-derived models facilitating decision-making in clinical practice. Our aim was to describe technical procedures used to preserve the performance of machine learning models in the presence of temporal dataset shifts. METHODS: Studies were included if they were fully published articles that used machine learning and implemented a procedure to mitigate the effects of temporal dataset shift in a clinical setting. We described how dataset shift was measured, the procedures used to preserve model performance, and their effects. RESULTS: Of 4,457 potentially relevant publications identified, 15 were included. The impact of temporal dataset shift was primarily quantified using changes, usually deterioration, in calibration or discrimination. Calibration deterioration was more common (n = 11) than discrimination deterioration (n = 3). Mitigation strategies were categorized as model level or feature level. Model-level approaches (n = 15) were more common than feature-level approaches (n = 2), with the most common approaches being model refitting (n = 12), probability calibration (n = 7), model updating (n = 6), and model selection (n = 6). In general, all mitigation strategies were successful at preserving calibration but not uniformly successful in preserving discrimination. CONCLUSION: There was limited research in preserving the performance of machine learning models in the presence of temporal dataset shift in clinical medicine. Future research could focus on the impact of dataset shift on clinical decision making, benchmark the mitigation strategies on a wider range of datasets and tasks, and identify optimal strategies for specific settings.


Subjects
Clinical Medicine , Machine Learning , Clinical Decision-Making , Cognition
14.
Transl Psychiatry ; 11(1): 642, 2021 12 20.
Article in English | MEDLINE | ID: mdl-34930903

ABSTRACT

Many patients with bipolar disorder (BD) are initially misdiagnosed with major depressive disorder (MDD) and are treated with antidepressants, whose potential iatrogenic effects are widely discussed. It is unknown whether MDD is a comorbidity of BD or its earlier stage, and no consensus exists on individual conversion predictors, delaying BD's timely recognition and treatment. We aimed to build a predictive model of MDD to BD conversion and to validate it across a multi-national network of patient databases using the standardization afforded by the Observational Medical Outcomes Partnership (OMOP) common data model. Five "training" US databases were retrospectively analyzed: IBM MarketScan CCAE, MDCR, MDCD, Optum EHR, and Optum Claims. Cyclops regularized logistic regression models were developed on one-year MDD-BD conversion with all standard covariates from the HADES PatientLevelPrediction package. Time-to-conversion Kaplan-Meier analysis was performed up to a decade after MDD, stratified by model-estimated risk. External validation of the final prediction model was performed across 9 patient record databases within the Observational Health Data Sciences and Informatics (OHDSI) network internationally. The model's area under the curve (AUC) varied 0.633-0.745 (µ = 0.689) across the five US training databases. Nine variables predicted one-year MDD-BD transition. Factors that increased risk were: younger age, severe depression, psychosis, anxiety, substance misuse, self-harm thoughts/actions, and prior mental disorder. AUCs of the validation datasets ranged 0.570-0.785 (µ = 0.664). An assessment algorithm was built for MDD to BD conversion that allows distinguishing as much as 100-fold risk differences among patients and validates well across multiple international data sources.


Subjects
Bipolar Disorder , Depressive Disorder, Major , Psychotic Disorders , Antidepressive Agents , Bipolar Disorder/complications , Bipolar Disorder/diagnosis , Bipolar Disorder/epidemiology , Depressive Disorder, Major/complications , Depressive Disorder, Major/diagnosis , Depressive Disorder, Major/epidemiology , Humans , Retrospective Studies
15.
PLoS One ; 15(1): e0226718, 2020.
Article in English | MEDLINE | ID: mdl-31910437

ABSTRACT

BACKGROUND AND PURPOSE: Hemorrhagic transformation (HT) after cerebral infarction is a complex and multifactorial phenomenon in the acute stage of ischemic stroke, and often results in a poor prognosis. Thus, identifying risk factors and making an early prediction of HT in acute cerebral infarction contributes not only to the selection of a therapeutic regimen but also, more importantly, to the improvement of prognosis of acute cerebral infarction. The purpose of this study was to develop and validate a model to predict a patient's risk of HT within 30 days of initial ischemic stroke. METHODS: We utilized a retrospective multicenter observational cohort study design to develop a Lasso Logistic Regression prediction model with a large US Electronic Health Record dataset structured according to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). To examine clinical transportability, the model was externally validated across 10 additional real-world healthcare datasets that include EHR records for patients from America, Europe and Asia. RESULTS: In the database in which the model was developed, the target population cohort contained 621,178 patients with ischemic stroke, of which 5,624 patients had HT within 30 days following initial ischemic stroke. 612 risk predictors, including the distance a patient travels in an ambulance to get to care for a HT, were identified. An area under the receiver operating characteristic curve (AUC) of 0.75 was achieved in the internal validation of the risk model. External validation was performed across 10 databases totaling 5,515,508 patients with ischemic stroke, of which 86,401 patients had HT within 30 days following initial ischemic stroke. The mean external AUC was 0.71 and ranged from 0.60 to 0.78. CONCLUSIONS: A HT prognostic prediction model was developed with Lasso Logistic Regression based on routinely collected EMR data. 
This model can identify patients who have a higher risk of HT than the population average with an AUC of 0.78. It shows the OMOP CDM is an appropriate data standard for EMR secondary use in clinical multicenter research for prognostic prediction model development and validation. In the future, combining this model with clinical information systems will assist clinicians to make the right therapy decision for patients with acute ischemic stroke.


Subjects
Brain Ischemia/complications , Cerebral Hemorrhage/diagnosis , Models, Statistical , Risk Assessment/methods , Stroke/complications , Cerebral Hemorrhage/etiology , Female , Follow-Up Studies , Humans , Male , Middle Aged , Prognosis , ROC Curve , Retrospective Studies , Risk Factors
16.
Front Neuroinform ; 12: 36, 2018.
Article in English | MEDLINE | ID: mdl-29962944

ABSTRACT

Objective: The heterogeneity of amyotrophic lateral sclerosis (ALS) survival duration, which varies from <1 year to >10 years, challenges clinical decisions and trials. Utilizing data from 801 deceased ALS patients, we: (1) assess the underlying complex relationships among common clinical ALS metrics; (2) identify which clinical ALS metrics are the "best" survival predictors and how their predictive ability changes as a function of disease progression. Methods: Analyses included examination of relationships within the raw data as well as the construction of interactive survival regression and classification models (generalized linear model and random forests model). Dimensionality reduction and feature clustering enabled decomposition of clinical variable contributions. Thirty-eight metrics were utilized, including Medical Research Council (MRC) muscle scores; respiratory function, including forced vital capacity (FVC) and FVC % predicted, oxygen saturation, negative inspiratory force (NIF); the Revised ALS Functional Rating Scale (ALSFRS-R) and its activities of daily living (ADL) and respiratory sub-scores; body weight; onset type, onset age, gender, and height. Prognostic random forest models confirm the dominance of patient age-related parameters decline in classifying survival at thresholds of 30, 60, 90, and 180 days and 1, 2, 3, 4, and 5 years. Results: Collective prognostic insight derived from the overall investigation includes: multi-dimensionality of ALSFRS-R scores suggests cautious usage for survival forecasting; upper and lower extremities independently degenerate and are autonomous from respiratory decline, with the latter associating with nearer-to-death classifications; height and weight-based metrics are auxiliary predictors for farther-from-death classifications; sex and onset site (limb, bulbar) are not independent survival predictors due to age co-correlation. 
Conclusion: The dimensionality and fluctuating predictors of ALS survival must be considered when developing predictive models for clinical trial development or in-clinic usage. Additional independent metrics and possible revisions to current metrics, like the ALSFRS-R, are needed to capture the underlying complexity needed for population and personalized forecasting of survival.

18.
J Neuromuscul Dis ; 2(2): 137-150, 2015.
Article in English | MEDLINE | ID: mdl-26594635

ABSTRACT

BACKGROUND: The SOD1 G93A mouse model of amyotrophic lateral sclerosis (ALS) is the most frequently used model to examine ALS pathophysiology. There is a lack of homogeneity in usage of the SOD1 G93A mouse, including differences in genetic background and sex, which could confound the field's results. OBJECTIVE: In an analysis of 97 studies, we characterized ALS progression for the high transgene copy control SOD1 G93A mouse on the basis of disease onset, overall lifespan, and disease duration for male and female mice on the B6SJL and C57BL/6J genetic backgrounds, and quantified the magnitudes of differences between groups. METHODS: Mean age at onset, onset assessment measure, disease duration, and overall lifespan data from each study were extracted and statistically modeled as the response in a linear regression with sex and genetic background as predictors. Additional examination was performed on differing experimental onset and endpoint assessment measures. RESULTS: C57BL/6 background mice show delayed onset of symptoms, increased lifespan, and an extended disease duration compared to their sex-matched B6SJL counterparts. Female B6SJL mice generally show extended lifespan and delayed onset compared to their male counterparts, while female mice on the C57BL/6 background show delayed onset but no difference in survival compared to their male counterparts. Finally, different experimental protocols (tremor, rotarod, etc.) for onset determination result in notably different onset means. CONCLUSIONS: Overall, the observed effect of sex on disease endpoints was smaller than that which can be attributed to the genetic background. The often-reported increase in lifespan for female mice was observed only for mice on the B6SJL background, implicating a strain-dependent effect of sex on disease progression that manifests despite identical mutant SOD1 expression.
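
The regression described under METHODS can be written out as a rough sketch. The abstract does not give the exact specification, so the symbols, the dummy coding, and the optional interaction term below are assumptions made for illustration, not the authors' stated model.

```latex
% Assumed notation: y_i is the mean outcome (age at onset, lifespan,
% or disease duration) extracted for study group i.
% Female_i and C57BL6_i are 0/1 indicator variables (hypothetical coding).
y_i = \beta_0
    + \beta_1\,\mathrm{Female}_i
    + \beta_2\,\mathrm{C57BL6}_i
    + \beta_3\,(\mathrm{Female}_i \times \mathrm{C57BL6}_i)
    + \varepsilon_i
```

Under this assumed coding, male B6SJL mice are the reference group, \beta_1 is the female-versus-male difference on the B6SJL background, and \beta_2 is the C57BL/6-versus-B6SJL difference in males. The interaction coefficient \beta_3, if included, lets the sex effect differ by background, which is the kind of term that would capture a strain-dependent effect of sex such as the one reported under CONCLUSIONS.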
