Results 1 - 20 of 59
1.
Eur J Clin Invest ; 54(6): e14183, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38381530

ABSTRACT

Large language models (LLMs) are a type of machine learning model that learn statistical patterns over text, such as predicting the next words in a sequence of text. Both general purpose and task-specific LLMs have demonstrated potential across diverse applications. Science and medicine have many data types that are highly suitable for LLMs, such as scientific texts (publications, patents and textbooks), electronic medical records, large databases of DNA and protein sequences and chemical compounds. Carefully validated systems that can understand and reason across all these modalities may maximize benefits. Despite the inevitable limitations and caveats of any new technology and some uncertainties specific to LLMs, LLMs have the potential to be transformative in science and medicine.


Subjects
Machine Learning , Humans , Electronic Health Records , Natural Language Processing , Medicine , Science , Patents as Topic
2.
Crit Care ; 28(1): 113, 2024 04 08.
Article in English | MEDLINE | ID: mdl-38589940

ABSTRACT

BACKGROUND: Perhaps nowhere else in the healthcare system than in the intensive care unit environment are the challenges to create useful models with direct time-critical clinical applications more relevant and the obstacles to achieving those goals more massive. Machine learning-based artificial intelligence (AI) techniques to define states and predict future events are commonplace activities of modern life. However, their penetration into acute care medicine has been slow, stuttering and uneven. Major obstacles to widespread effective application of AI approaches to the real-time care of the critically ill patient exist and need to be addressed. MAIN BODY: Clinical decision support systems (CDSSs) in acute and critical care environments support clinicians rather than replace them at the bedside. As will be discussed in this review, the reasons are many and include the immaturity of AI-based systems to have situational awareness; the fundamental bias in many large databases that do not reflect the target population of patients being treated, making fairness an important issue to address; and technical barriers to the timely access to valid data and its display in a fashion useful for clinical workflow. The inherent "black-box" nature of many predictive algorithms and CDSSs makes trustworthiness and acceptance by the medical community difficult. Logistically, collating and curating in real time the multidimensional data streams from various sources needed to inform the algorithms, and ultimately displaying relevant clinical decision support in a format that adapts to individual patient responses and signatures, represent the efferent limb of these systems and are often ignored during initial validation efforts. Similarly, legal and commercial barriers to the access to many existing clinical databases limit studies to address fairness and generalizability of predictive models and management tools. CONCLUSIONS: AI-based CDSSs are evolving and are here to stay. It is our obligation to be good shepherds of their use and further development.


Subjects
Algorithms , Artificial Intelligence , Humans , Critical Care , Intensive Care Units , Delivery of Health Care
3.
J Biomed Inform ; 156: 104683, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38925281

ABSTRACT

OBJECTIVE: Despite increased availability of methodologies to identify algorithmic bias, the operationalization of bias evaluation for healthcare predictive models is still limited. Therefore, this study proposes a process for bias evaluation through an empirical assessment of common hospital readmission models. The process includes selecting bias measures, interpretation, determining disparity impact and potential mitigations. METHODS: This retrospective analysis evaluated racial bias of four common models predicting 30-day unplanned readmission (i.e., LACE Index, HOSPITAL Score, and the CMS readmission measure applied as is and retrained). The models were assessed using 2.4 million adult inpatient discharges in Maryland from 2016 to 2019. Fairness metrics that are model-agnostic, easy to compute, and interpretable were implemented and appraised to select the most appropriate bias measures. The impact of changing the models' risk thresholds on these measures was further assessed to guide the selection of optimal thresholds to control and mitigate bias. RESULTS: Four bias measures were selected for the predictive task: zero-one-loss difference, false negative rate (FNR) parity, false positive rate (FPR) parity, and generalized entropy index. Based on these measures, the HOSPITAL score and the retrained CMS measure demonstrated the lowest racial bias. White patients showed a higher FNR while Black patients resulted in a higher FPR and zero-one-loss. As the models' risk threshold changed, trade-offs between models' fairness and overall performance were observed, and the assessment showed all models' default thresholds were reasonable for balancing accuracy and bias. CONCLUSIONS: This study proposes an Applied Framework to Assess Fairness of Predictive Models (AFAFPM) and demonstrates the process using 30-day hospital readmission models as the example. It suggests the feasibility of applying algorithmic bias assessment to determine optimized risk thresholds so that predictive models can be used more equitably and accurately. It is evident that a combination of qualitative and quantitative methods and a multidisciplinary team are necessary to identify, understand and respond to algorithmic bias in real-world healthcare settings. Users should also apply multiple bias measures to ensure a more comprehensive, tailored, and balanced view. The results of bias measures, however, must be interpreted with caution and consider the larger operational, clinical, and policy context.
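
As a rough illustration of the parity measures named in the abstract above, the sketch below computes per-group false negative rate, false positive rate, and zero-one loss, then takes the difference between two groups. It is not the study's code: the function names, the binary 0/1 label encoding, and the toy data are assumptions made purely for illustration.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: {"fnr", "fpr", "zero_one"}} for binary 0/1 labels."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        # "t"/"f" = prediction correct or not; "p"/"n" = predicted class.
        key = ("t" if truth == pred else "f") + ("p" if pred == 1 else "n")
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]   # truly readmitted
        negatives = c["tn"] + c["fp"]   # truly not readmitted
        rates[group] = {
            "fnr": c["fn"] / positives if positives else float("nan"),
            "fpr": c["fp"] / negatives if negatives else float("nan"),
            "zero_one": (c["fp"] + c["fn"]) / (positives + negatives),
        }
    return rates


def parity_gaps(rates, group_a, group_b):
    """Difference (group_a minus group_b) for each error measure."""
    return {m: rates[group_a][m] - rates[group_b][m]
            for m in ("fnr", "fpr", "zero_one")}


# Hypothetical toy data: 1 = readmitted within 30 days, 0 = not.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_error_rates(y_true, y_pred, groups)
gaps = parity_gaps(rates, "A", "B")
```

A gap near zero on every measure would indicate parity on this toy sample. Sweeping the risk threshold that produces `y_pred` and recomputing the gaps is one way to explore the fairness and performance trade-off the abstract describes.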

4.
J Med Internet Res ; 26: e47125, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38422347

ABSTRACT

BACKGROUND: The adoption of predictive algorithms in health care comes with the potential for algorithmic bias, which could exacerbate existing disparities. Fairness metrics have been proposed to measure algorithmic bias, but their application to real-world tasks is limited. OBJECTIVE: This study aims to evaluate the algorithmic bias associated with the application of common 30-day hospital readmission models and assess the usefulness and interpretability of selected fairness metrics. METHODS: We used 10.6 million adult inpatient discharges from Maryland and Florida from 2016 to 2019 in this retrospective study. Models predicting 30-day hospital readmissions were evaluated: LACE Index, modified HOSPITAL score, and modified Centers for Medicare & Medicaid Services (CMS) readmission measure, which were applied as-is (using existing coefficients) and retrained (recalibrated with 50% of the data). Predictive performances and bias measures were evaluated for all, between Black and White populations, and between low- and other-income groups. Bias measures included the parity of false negative rate (FNR), false positive rate (FPR), 0-1 loss, and generalized entropy index. Racial bias represented by FNR and FPR differences was stratified to explore shifts in algorithmic bias in different populations. RESULTS: The retrained CMS model demonstrated the best predictive performance (area under the curve: 0.74 in Maryland and 0.68-0.70 in Florida), and the modified HOSPITAL score demonstrated the best calibration (Brier score: 0.16-0.19 in Maryland and 0.19-0.21 in Florida). Calibration was better in White (compared to Black) populations and other-income (compared to low-income) groups, and the area under the curve was higher or similar in the Black (compared to White) populations. The retrained CMS and modified HOSPITAL score had the lowest racial and income bias in Maryland. In Florida, both of these models overall had the lowest income bias and the modified HOSPITAL score showed the lowest racial bias. In both states, the White and higher-income populations showed a higher FNR, while the Black and low-income populations resulted in a higher FPR and a higher 0-1 loss. When stratified by hospital and population composition, these models demonstrated heterogeneous algorithmic bias in different contexts and populations. CONCLUSIONS: Caution must be taken when interpreting fairness measures' face value. A higher FNR or FPR could potentially reflect missed opportunities or wasted resources, but these measures could also reflect health care use patterns and gaps in care. Simply relying on the statistical notions of bias could obscure or underplay the causes of health disparity. The imperfect health data, analytic frameworks, and the underlying health systems must be carefully considered. Fairness measures can serve as a useful routine assessment to detect disparate model performances but are insufficient to inform mechanisms or policy changes. However, such an assessment is an important first step toward data-driven improvement to address existing health disparities.


Subjects
Medicare , Patient Readmission , Aged , Adult , Humans , United States , Retrospective Studies , Hospitals , Florida/epidemiology
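
The abstract above reports calibration as a Brier score and lists the generalized entropy index among its bias measures. The sketch below shows one common way to compute both. The study's exact formulation is not given in the abstract, so this assumes the frequently used definition with benefit b_i = prediction_i - outcome_i + 1 and alpha = 2; the function names and toy data are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def brier_by_group(y_true, y_prob, groups):
    """Brier score computed separately within each group label."""
    by_group = {}
    for y, p, g in zip(y_true, y_prob, groups):
        ys, ps = by_group.setdefault(g, ([], []))
        ys.append(y)
        ps.append(p)
    return {g: brier_score(ys, ps) for g, (ys, ps) in by_group.items()}


def generalized_entropy_index(y_true, y_pred, alpha=2):
    """Inequality of per-person 'benefit' (assumed: b = pred - true + 1).

    Returns 0 when every individual receives the same benefit. Undefined
    when the mean benefit is 0 (i.e., every prediction is a false negative).
    """
    benefits = [p - y + 1 for y, p in zip(y_true, y_pred)]
    mean_benefit = sum(benefits) / len(benefits)
    total = sum((b / mean_benefit) ** alpha - 1 for b in benefits)
    return total / (len(benefits) * alpha * (alpha - 1))


# Hypothetical toy data for two groups.
scores = brier_by_group(
    y_true=[1, 0, 1, 0],
    y_prob=[0.9, 0.1, 0.5, 0.5],
    groups=["W", "W", "B", "B"],
)
gei = generalized_entropy_index(y_true=[1, 0, 1, 0], y_pred=[1, 0, 0, 1])
```

Lower Brier scores indicate better calibration, so comparing the per-group values is one simple way to see the kind of calibration gap the abstract reports between populations.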
5.
JAMA ; 331(3): 245-249, 2024 01 16.
Article in English | MEDLINE | ID: mdl-38117493

ABSTRACT

Importance: Given the importance of rigorous development and evaluation standards needed of artificial intelligence (AI) models used in health care, nationwide accepted procedures to provide assurance that the use of AI is fair, appropriate, valid, effective, and safe are urgently needed. Observations: While there are several efforts to develop standards and best practices to evaluate AI, there is a gap between having such guidance and the application of such guidance to both existing and new AI models being developed. As of now, there is no publicly available, nationwide mechanism that enables objective evaluation and ongoing assessment of the consequences of using health AI models in clinical care settings. Conclusion and Relevance: The need to create a public-private partnership to support a nationwide health AI assurance labs network is outlined here. In this network, community best practices could be applied for testing health AI models to produce reports on their performance that can be widely shared for managing the lifecycle of AI models over time and across populations and sites where these models are deployed.


Subjects
Artificial Intelligence , Delivery of Health Care , Laboratories , Quality Assurance, Health Care , Quality of Health Care , Artificial Intelligence/standards , Health Facilities/standards , Laboratories/standards , Public-Private Sector Partnerships , Quality Assurance, Health Care/standards , Delivery of Health Care/standards , Quality of Health Care/standards , United States
6.
Lupus ; 30(1): 15-24, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33115373

ABSTRACT

OBJECTIVE: To characterize the longitudinal trajectory of estimated glomerular filtration rate (eGFR) in patients with systemic lupus erythematosus (SLE) and identify predictors of the change in eGFR trajectory. METHODS: The longitudinal eGFR levels of patients in the Hopkins Lupus Cohort were modelled by piecewise linear regression to evaluate the slope of different line segments. The slopes were classified into declining (≤-4 mL/min/1.73 m2 per year), stable (-4 to 4 mL/min/1.73 m2 per year), and increasing (≥4 mL/min/1.73 m2 per year) states. The transition rate between states and the impact of clinical parameters were estimated by a Markov model. RESULTS: The analysis was based on 494 SLE patients. At a mean follow-up of 8.8 years, 347 (70.2%), 107 (21.7%), 33 (6.7%), and 7 (1.4%) patients had zero, one, two, and three state transitions, respectively. In patients with no transition, 37 (10.7%), 308 (88.8%), and 2 (0.6%) were in declining, stable, and increasing state, respectively. In patients with one transition, 43 (40.2%) changed from declining to stable state while 29 (27.1%) changed from stable to declining state. When patients were in a non-declining GFR state, those who were younger and African Americans were more likely to transition to a declining GFR state. In adjusted analyses, high blood pressure, C4 and low hematocrit were associated with change from non-declining to declining state. High urine protein-to-creatinine ratio also tended to be associated with change from non-declining to declining state. African American patients were less likely to move from declining to non-declining state. Use of prednisone was associated with change from declining to non-declining state. CONCLUSIONS: Patients with high blood pressure, low complement C4, low haematocrit, and high urine protein-to-creatinine ratio are more likely to have a declining eGFR trajectory, while the use of prednisone stabilizes the declining eGFR trajectory.


Subjects
Glomerular Filtration Rate/drug effects , Kidney/physiopathology , Lupus Erythematosus, Systemic/drug therapy , Lupus Erythematosus, Systemic/physiopathology , Prednisone/therapeutic use , Adult , Black or African American/statistics & numerical data , Complement C4/metabolism , Creatinine/urine , Disease Progression , Female , Hematocrit , Humans , Hypertension/complications , Kidney/drug effects , Linear Models , Longitudinal Studies , Lupus Erythematosus, Systemic/blood , Lupus Erythematosus, Systemic/urine , Male , Middle Aged , Multivariate Analysis , Predictive Value of Tests , Proteinuria/complications , Retrospective Studies
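
The slope thresholds in the abstract above (declining at or below -4, increasing at or above 4 mL/min/1.73 m2 per year, stable in between) translate directly into a small classifier. The sketch below is only an illustration under stated assumptions: it fits an ordinary least-squares slope to one already-identified segment and does not reproduce the study's piecewise regression or Markov model. All names and numbers are hypothetical.

```python
def segment_slope(years, egfr):
    """Ordinary least-squares slope of eGFR over time for one segment."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(egfr) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, egfr))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


def classify_slope(slope):
    """Map a slope (mL/min/1.73 m2 per year) to the abstract's three states."""
    if slope <= -4:
        return "declining"
    if slope >= 4:
        return "increasing"
    return "stable"


def count_transitions(slopes):
    """Number of state changes across consecutive segments."""
    states = [classify_slope(s) for s in slopes]
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# Hypothetical patient: eGFR falls 5 units per year over one segment.
slope = segment_slope([0, 1, 2], [90, 85, 80])
state = classify_slope(slope)
# Four hypothetical segment slopes: declining, stable, stable, increasing.
n_transitions = count_transitions([-5, -1, 2, 5])
```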
7.
Ann Intern Med ; 172(11 Suppl): S137-S144, 2020 06 02.
Article in English | MEDLINE | ID: mdl-32479180

ABSTRACT

Increasingly, interventions aimed at improving care are likely to use such technologies as machine learning and artificial intelligence. However, health care has been relatively late to adopt them. This article provides clinical examples in which machine learning and artificial intelligence are already in use in health care and appear to deliver benefit. Three key bottlenecks toward increasing the pace of diffusion and adoption are methodological issues in evaluation of artificial intelligence-based interventions, reporting standards to enable assessment of model performance, and issues that need to be addressed for an institution to adopt these interventions. Methodological best practices will include external validation, ideally at a different site; use of proactive learning algorithms to correct for site-specific biases and increase robustness as algorithms are deployed across multiple sites; addressing subgroup performance; and communicating to providers the uncertainty of predictions. Regarding reporting, especially important issues are the extent to which implementing standardized approaches for introducing clinical decision support has been followed, describing the data sources, reporting on data assumptions, and addressing biases. Although most health care organizations in the United States have adopted electronic health records, they may be ill prepared to adopt machine learning and artificial intelligence. Several steps can enable this: preparing data, developing tools to get suggestions to clinicians in useful ways, and getting clinicians engaged in the process. Open challenges and the role of regulation in this area are briefly discussed. Although these techniques have enormous potential to improve care and personalize recommendations for individuals, the hype regarding them is tremendous. Organizations will need to approach this domain carefully with knowledgeable partners to obtain the hoped-for benefits and avoid failures.


Subjects
Algorithms , Artificial Intelligence , Decision Support Systems, Clinical/organization & administration , Delivery of Health Care/standards , Machine Learning , Humans
8.
Crit Care Med ; 48(6): 808-814, 2020 06.
Article in English | MEDLINE | ID: mdl-32271185

ABSTRACT

OBJECTIVES: To evaluate associations between a readily available composite measurement of neighborhood socioeconomic disadvantage (the area deprivation index) and 30-day readmissions for patients who were previously hospitalized with sepsis. DESIGN: A retrospective study. SETTING: An urban, academic medical institution. PATIENTS: The authors conducted a manual audit for adult patients (18 yr old or older) discharged with an International Classification of Diseases, 10th edition code of sepsis during the 2017 fiscal year to confirm that they met SEP-3 criteria. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: The area deprivation index is a publicly available composite score constructed from socioeconomic components (e.g., income, poverty, education, housing characteristics) based on census block level, where higher scores are associated with more disadvantaged areas (range, 1-100). Using discharge data from the hospital population health database, residential addresses were geocoded and linked to their respective area deprivation index. Patient characteristics, contextual-level variables, and readmissions were compared by t tests for continuous variables and Fisher exact test for categorical variables. The associations between readmissions and area deprivation index were explored using logistic regression models. A total of 647 patients had an International Classification of Diseases, 10th edition diagnosis code of sepsis. Of these 647, 116 (17.9%) either died in hospital or were discharged to hospice and were excluded from our analysis. Of the remaining 531 patients, the mean age was 61.0 years (± 17.6 yr), 281 were females (52.9%), and 164 (30.9%) were active smokers. The mean length of stay was 6.9 days (± 5.6 d) with the mean Sequential Organ Failure Assessment score 4.9 (± 2.5). The mean area deprivation index was 54.2 (± 23.8). The mean area deprivation index of patients who were readmitted was 62.5 (± 27.4), which was significantly larger than the area deprivation index of patients not readmitted (51.8 [± 22.2]) (p < 0.001). In adjusted logistic regression models, a greater area deprivation index was significantly associated with readmissions (β, 0.03; p < 0.001). CONCLUSIONS: Patients who reside in more disadvantaged neighborhoods have a significantly higher risk for 30-day readmission following a hospitalization for sepsis. The insight provided by neighborhood disadvantage scores, such as the area deprivation index, may help to better understand how contextual-level socioeconomic status affects the burden of sepsis-related morbidity.


Subjects
Patient Readmission/statistics & numerical data , Residence Characteristics/statistics & numerical data , Sepsis/epidemiology , Academic Medical Centers , Adult , Age Factors , Aged , Aged, 80 and over , Comorbidity , Female , Hospitals, Urban , Humans , Length of Stay/statistics & numerical data , Logistic Models , Male , Middle Aged , Organ Dysfunction Scores , Patient Discharge/statistics & numerical data , Retrospective Studies , Sex Factors , Smoking/epidemiology , Socioeconomic Factors
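
To give the logistic regression coefficient in the abstract above a more intuitive scale, the sketch below converts a log-odds coefficient into an odds ratio. It assumes the reported coefficient of 0.03 is per one-point increase in the area deprivation index, which the abstract does not state explicitly. The function name is hypothetical.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Odds ratio implied by a logistic coefficient for a change of `delta`."""
    return math.exp(beta * delta)


# Assumption: beta = 0.03 applies per one-point increase in the index.
or_per_point = odds_ratio(0.03)            # roughly 1.03
or_per_ten_points = odds_ratio(0.03, 10)   # roughly 1.35
```

Under that assumption, a ten-point higher index corresponds to roughly 35% higher odds of readmission, with the other covariates in the adjusted model held fixed.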
10.
Crit Care Med ; 47(9): 1232-1234, 2019 09.
Article in English | MEDLINE | ID: mdl-31162207

ABSTRACT

OBJECTIVES: To compare noninvasive mobility sensor patient motion signature to direct observations by physicians and nurses. DESIGN: Prospective, observational study. SETTING: Academic hospital surgical ICU. PATIENTS AND MEASUREMENTS: A total of 2,426 1-minute clips from six ICU patients (development dataset) and 4,824 1-minute clips from five patients (test dataset). INTERVENTIONS: None. MAIN RESULTS: Noninvasive mobility sensor achieved a minute-level accuracy of 94.2% (2,138/2,272) and an hour-level accuracy of 81.4% (70/86). CONCLUSIONS: The automated noninvasive mobility sensor system represents a significant departure from current manual measurement and reporting used in clinical care, lowering the burden of measurement and documentation on caregivers.


Subjects
Early Ambulation/instrumentation , Intensive Care Units/organization & administration , Remote Sensing Technology/instrumentation , Academic Medical Centers , Aged , Aged, 80 and over , Algorithms , Female , Humans , Male , Prospective Studies
11.
PLoS Med ; 15(12): e1002721, 2018 12.
Article in English | MEDLINE | ID: mdl-30596635

ABSTRACT

Machine Learning Special Issue Guest Editors Suchi Saria, Atul Butte, and Aziz Sheikh cut through the hyperbole with an accessible and accurate portrayal of the forefront of machine learning in clinical translation.


Subjects
Machine Learning/trends , Medicine/trends , Artificial Intelligence/standards , Artificial Intelligence/trends , Diagnosis, Computer-Assisted/standards , Diagnosis, Computer-Assisted/trends , Humans , Machine Learning/standards , Medicine/standards
12.
Crit Care Med ; 45(4): 630-636, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28291092

ABSTRACT

OBJECTIVES: To develop and validate a noninvasive mobility sensor to automatically and continuously detect and measure patient mobility in the ICU. DESIGN: Prospective, observational study. SETTING: Surgical ICU at an academic hospital. PATIENTS: Three hundred sixty-two hours of sensor color and depth image data were recorded and curated into 109 segments, each containing 1,000 images, from eight patients. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Three Microsoft Kinect sensors (Microsoft, Beijing, China) were deployed in one ICU room to collect continuous patient mobility data. We developed software that automatically analyzes the sensor data to measure mobility and assign the highest level within a time period. To characterize the highest mobility level, a validated 11-point mobility scale was collapsed into four categories: nothing in bed, in-bed activity, out-of-bed activity, and walking. Of the 109 sensor segments, the noninvasive mobility sensor was developed using 26 of these from three ICU patients and validated on 83 remaining segments from five different patients. Three physicians annotated each segment for the highest mobility level. The weighted Kappa (κ) statistic for agreement between automated noninvasive mobility sensor output versus manual physician annotation was 0.86 (95% CI, 0.72-1.00). Disagreement primarily occurred in the "nothing in bed" versus "in-bed activity" categories because "the sensor assessed movement continuously," which was significantly more sensitive to motion than physician annotations using a discrete manual scale. CONCLUSIONS: Noninvasive mobility sensor is a novel and feasible method for automating evaluation of ICU patient mobility.


Subjects
Intensive Care Units , Monitoring, Physiologic/methods , Movement , Aged , Algorithms , Female , Humans , Male , Middle Aged , Monitoring, Physiologic/instrumentation , Prospective Studies , Video Recording/instrumentation , Walking
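
The agreement statistic in the abstract above is a weighted kappa over four ordered mobility categories. The sketch below computes a linearly weighted Cohen's kappa between two label sequences. The abstract does not say whether linear or quadratic weights were used, nor how the three physicians' annotations were combined, so both choices here (linear weights, a single reference label per segment) are assumptions, and the toy labels are invented.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for ordinal labels 0..n_categories-1.

    Undefined (division by zero) if both raters use only one category.
    """
    k = n_categories
    n = len(rater_a)

    # Observed joint proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 at maximum distance.
        return abs(i - j) / (k - 1)

    observed_disagreement = sum(
        weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    expected_disagreement = sum(
        weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical labels: 0 = nothing in bed, 1 = in-bed activity,
# 2 = out-of-bed activity, 3 = walking.
sensor = [0, 1, 2, 3]
physician = [0, 1, 2, 2]
kappa = weighted_kappa(sensor, physician, n_categories=4)
```

Because the weights grow with the distance between categories, confusing adjacent levels (such as "nothing in bed" versus "in-bed activity") is penalized less than confusing distant ones.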
15.
Am J Obstet Gynecol ; 211(1): 41.e1-8, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24657795

ABSTRACT

OBJECTIVE: Our objective was to identify perinatal risk factors that are available within 1 hour of birth that are associated with severe brain injury after hypothermia treatment for suspected hypoxic-ischemic encephalopathy. STUDY DESIGN: One hundred nine neonates at ≥35 weeks' gestation who were admitted from January 2007 to September 2012 with suspected hypoxic-ischemic encephalopathy were treated with whole-body hypothermia; 98 of them (90%) underwent brain magnetic resonance imaging (MRI) at 7-10 days of life. Eight neonates died before brain imaging. Neonates who had severe brain injury, which was defined as death or abnormal MRI results (cases), were compared with surviving neonates with normal MRI (control subjects). Logistic regression models were used to identify risk factors that were predictive of severe injury. RESULTS: Cases and control subjects did not differ with regard to gestational age, birthweight, mode of delivery, or diagnosis of nonreassuring fetal heart rate before delivery. Cases were significantly (P < .05) more likely to have had an abruption, a cord and neonatal arterial gas level that showed metabolic acidosis, lower platelet counts, lower glucose level, longer time to spontaneous respirations, intubation, chest compressions in the delivery room, and seizures. In multivariable logistic regression, lower initial neonatal arterial pH (P = .004), spontaneous respiration at >30 minutes of life (P = .002), and absence of exposure to oxytocin (P = .033) were associated independently with severe injury with 74.3% sensitivity and 74.4% specificity. CONCLUSION: Worsening metabolic acidosis at birth, longer time to spontaneous respirations, and lack of exposure to oxytocin correlated with severe brain injury in neonates who were treated with whole-body hypothermia. These risk factors may help quickly identify neonatal candidates for time-sensitive investigational therapies for brain neuroprotection.


Subjects
Brain Damage, Chronic/etiology , Decision Support Techniques , Hypothermia, Induced , Hypoxia-Ischemia, Brain/therapy , Severity of Illness Index , Brain Damage, Chronic/diagnosis , Brain Damage, Chronic/prevention & control , Case-Control Studies , Female , Humans , Hypoxia-Ischemia, Brain/complications , Infant, Newborn , Logistic Models , Magnetic Resonance Imaging , Male , Multivariate Analysis , Prognosis , ROC Curve , Risk Assessment , Risk Factors , Treatment Outcome
16.
BMJ Open ; 14(4): e082540, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38594078

ABSTRACT

OBJECTIVE: To predict the risk of hospital-acquired pressure injury using machine learning compared with standard care. DESIGN: We obtained electronic health records (EHRs) to structure a multilevel cohort of hospitalised patients at risk for pressure injury and then calibrate a machine learning model to predict future pressure injury risk. Optimisation methods combined with multilevel logistic regression were used to develop a predictive algorithm of patient-specific shifts in risk over time. Machine learning methods were tested, including random forests, to identify predictive features for the algorithm. We reported the results of the regression approach as well as the area under the receiver operating characteristics (ROC) curve for predictive models. SETTING: Hospitalised inpatients. PARTICIPANTS: EHRs of 35 001 hospitalisations over 5 years across 2 academic hospitals. MAIN OUTCOME MEASURE: Longitudinal shifts in pressure injury risk. RESULTS: The predictive algorithm with features generated by machine learning achieved significantly improved prediction of pressure injury risk (p<0.001) with an area under the ROC curve of 0.72; whereas standard care only achieved an area under the ROC curve of 0.52. At a specificity of 0.50, the predictive algorithm achieved a sensitivity of 0.75. CONCLUSIONS: These data could help hospitals conserve resources within a critical period of patient vulnerability of hospital-acquired pressure injury which is not reimbursed by US Medicare; thus, conserving between 30 000 and 90 000 labour-hours per year in an average 500-bed hospital. Hospitals can use this predictive algorithm to initiate a quality improvement programme for pressure injury prevention and further customise the algorithm to patient-specific variation by facility.


Subjects
Pressure Ulcer , Humans , Aged , United States/epidemiology , Cohort Studies , Pressure Ulcer/epidemiology , Pressure Ulcer/prevention & control , Electronic Health Records , Medicare , Machine Learning , Retrospective Studies , ROC Curve
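
The abstract above summarizes model performance as an area under the ROC curve and as a sensitivity at a fixed specificity. The sketch below shows how both quantities can be computed from risk scores. It uses the pairwise (Mann-Whitney) definition of the AUC rather than whatever software the authors used, and the function names and toy scores are assumptions for illustration.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(y_true, scores, threshold):
    """Sensitivity and specificity when flagging scores >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = developed a pressure injury.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold=0.5)
```

An AUC of 0.5 corresponds to chance-level ranking, which is why the 0.52 reported for standard care reads as close to uninformative next to the algorithm's 0.72.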
17.
J Am Med Inform Assoc ; 29(8): 1323-1333, 2022 07 12.
Article in English | MEDLINE | ID: mdl-35579328

ABSTRACT

OBJECTIVE: Health care providers increasingly rely upon predictive algorithms when making important treatment decisions; however, evidence indicates that these tools can lead to inequitable outcomes across racial and socio-economic groups. In this study, we introduce a bias evaluation checklist that gives model developers and health care providers a means to systematically appraise a model's potential to introduce bias. MATERIALS AND METHODS: Our methods include developing a bias evaluation checklist, a scoping literature review to identify 30-day hospital readmission prediction models, and assessing the selected models using the checklist. RESULTS: We selected 4 models for evaluation: LACE, HOSPITAL, Johns Hopkins ACG, and HATRIX. Our assessment identified critical ways in which these algorithms can perpetuate health care inequalities. We found that LACE and HOSPITAL have the greatest potential for introducing bias, Johns Hopkins ACG has the most areas of uncertainty, and HATRIX has the fewest causes for concern. DISCUSSION: Our approach gives model developers and health care providers a practical and systematic method for evaluating bias in predictive models. Traditional bias identification methods do not elucidate sources of bias and are thus insufficient for mitigation efforts. With our checklist, bias can be addressed and eliminated before a model is fully developed or deployed. CONCLUSION: The potential for algorithms to perpetuate biased outcomes is not isolated to readmission prediction models; rather, we believe our results have implications for predictive models across health care. We offer a systematic method for evaluating potential bias with sufficient flexibility to be utilized across models and applications.


Subjects
Checklist , Patient Readmission , Bias , Healthcare Disparities , Hospitals , Humans
18.
NPJ Digit Med ; 5(1): 97, 2022 Jul 21.
Article in English | MEDLINE | ID: mdl-35864312

ABSTRACT

While a growing number of machine learning (ML) systems have been deployed in clinical settings with the promise of improving patient care, many have struggled to gain adoption and realize this promise. Based on a qualitative analysis of coded interviews with clinicians who use an ML-based system for sepsis, we found that, rather than viewing the system as a surrogate for their clinical judgment, clinicians perceived themselves as partnering with the technology. Our findings suggest that, even without a deep understanding of machine learning, clinicians can build trust with an ML system through experience, expert endorsement and validation, and systems designed to accommodate clinicians' autonomy and support them across their entire workflow.

19.
Precis Nutr ; 1(3): e00017, 2022 Dec.
Article in English | MEDLINE | ID: mdl-37744083

ABSTRACT

Background: Most studies on the association of in utero exposure to cigarette smoking and childhood overweight or obesity (OWO) were based on maternal self-reported smoking status, and few were based on objective biomarkers. The concordance of self-report smoking, and maternal and cord blood biomarkers of cigarette smoking as well as their effects on children's long-term risk of overweight and obesity are unclear. Methods: In this study, we analyzed data from 2351 mother-child pairs in the Boston Birth Cohort, a sample of US predominantly Black, indigenous, and people of color (BIPOC) that enrolled children at birth and followed prospectively up to age 18 years. In utero smoking exposure was measured by maternal self-report and by maternal and cord plasma biomarkers of smoking: cotinine and hydroxycotinine. We assessed the individual and joint associations of each smoking exposure measure and maternal OWO with childhood OWO using multinomial logistic regressions. We used nested logistic regressions to investigate the childhood OWO prediction performance when adding maternal and cord plasma biomarkers as input covariates on top of self-reported data. Results: Our results demonstrated that in utero cigarette smoking exposure defined by self-report and by maternal or cord metabolites was consistently associated with increased risk of long-term child OWO. Children with cord hydroxycotinine in the fourth quartile (vs. first quartile) had 1.66 (95% confidence interval [CI] 1.03-2.66) times the odds for overweight and 1.57 (95% CI 1.05-2.36) times the odds for obesity. The combined effect of maternal OWO and smoking on offspring risk of obesity is 3.66 (95% CI 2.37-5.67) if using self-reported smoking. Adding maternal and cord plasma biomarker information to self-reported data improved the prediction accuracy of long-term child OWO risk. Conclusions: This longitudinal birth cohort study of US BIPOC underscored the role of maternal smoking as an obesogen for offspring OWO risk. Our findings call for public health intervention strategies to focus on maternal smoking - as a highly modifiable target, including smoking cessation and countermeasures (such as optimal nutrition) that may alleviate the increasing obesity burden in the United States and globally.

20.
Cancer Inform ; 21: 11769351221136081, 2022.
Article in English | MEDLINE | ID: mdl-36439024

ABSTRACT

Tumor mutational burden (TMB), a surrogate for tumor neoepitope burden, is used as a pan-tumor biomarker to identify patients who may benefit from anti-programmed cell death 1 (PD1) immunotherapy, but it is an imperfect biomarker. Multiple additional genomic characteristics are associated with anti-PD1 responses, but the combined predictive value of these features and the added informativeness of each respective feature remain unknown. We evaluated whether machine learning (ML) approaches using proposed determinants of anti-PD1 response derived from whole exome sequencing (WES) could improve prediction of anti-PD1 responders over TMB alone. Random forest classifiers were trained on publicly available anti-PD1 data (n = 104), and subsequently tested on an independent anti-PD1 cohort (n = 69). Both the training and test datasets included a range of cancer types such as non-small cell lung cancer (NSCLC), head and neck squamous cell carcinoma (HNSCC), melanoma, and smaller numbers of patients from other tumor types. Features used include summaries such as TMB and number of frameshift mutations, as well as more gene-level features such as counts of mutations associated with immune checkpoint response and resistance. Both ML algorithms demonstrated area under the receiver-operator curves (AUC) that exceeded TMB alone (AUC 0.63 "human-guided," 0.64 "cluster," and 0.58 TMB alone). Mutations within oncogenes disproportionately modulate anti-PD1 responses relative to their overall contribution to tumor neoepitope burden. The use of a ML algorithm evaluating multiple proposed genomic determinants of anti-PD1 responses modestly improves performance over TMB alone, highlighting the need to integrate other biomarkers to further improve model performance.
