1.
Antimicrob Agents Chemother ; 65(7): e0006321, 2021 06 17.
Article in English | MEDLINE | ID: mdl-33972243

ABSTRACT

Infection caused by carbapenem-resistant (CR) organisms is a rising problem in the United States. While the risk factors for antibiotic resistance are well known, there remains a large need for the early identification of antibiotic-resistant infections. Using machine learning (ML), we sought to develop a prediction model for carbapenem resistance. All patients >18 years of age admitted to a tertiary-care academic medical center between 1 January 2012 and 10 October 2017 with ≥1 bacterial culture were eligible for inclusion. All demographic, medication, vital sign, procedure, laboratory, and culture/sensitivity data were extracted from the electronic health record. Organisms were considered CR if a single isolate was reported as intermediate or resistant. Patients with CR and non-CR organisms were temporally matched to maintain the positive/negative case ratio. Extreme gradient boosting was used for model development. In total, 68,472 patients met inclusion criteria, with 1,088 patients identified as having CR organisms. Sixty-seven features were used for predictive modeling. The most important features were number of prior antibiotic days, recent central venous catheter placement, and inpatient surgery. After model training, the area under the receiver operating characteristic curve was 0.846. The sensitivity of the model was 30%, with a positive predictive value (PPV) of 30% and a negative predictive value of 99%. Using readily available clinical data, we were able to create an ML model capable of predicting CR infections at the time of culture collection with a high PPV.
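
The reported sensitivity, PPV, and negative predictive value can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes the reported sensitivity and PPV apply to the full cohort of 68,472 patients with 1,088 CR cases, which the abstract does not state (the evaluation set may differ), and the function name is hypothetical.

```python
def npv_from_sensitivity_and_ppv(n_total, n_positive, sensitivity, ppv):
    """Derive the negative predictive value implied by sensitivity and PPV.

    Assumption (not from the abstract): the rates are applied to the
    whole cohort rather than to a held-out evaluation set.
    """
    tp = sensitivity * n_positive          # true positives
    fn = n_positive - tp                   # missed CR cases
    fp = tp * (1 - ppv) / ppv              # false alarms implied by the PPV
    tn = (n_total - n_positive) - fp       # remaining non-CR patients
    return tn / (tn + fn)


npv = npv_from_sensitivity_and_ppv(68_472, 1_088, sensitivity=0.30, ppv=0.30)
print(f"implied NPV: {npv:.3f}")
```

Under that assumption the implied negative predictive value rounds to 99%, in line with the figure quoted in the abstract.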


Subject(s)
Carbapenems; Machine Learning; Carbapenems/pharmacology; Humans; Predictive Value of Tests; Retrospective Studies; Risk Assessment
2.
Crit Care Med ; 49(4): e433-e443, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33591014

ABSTRACT

OBJECTIVES: Assess the impact of heterogeneity among established sepsis criteria (Sepsis-1, Sepsis-3, Centers for Disease Control and Prevention Adult Sepsis Event, and Centers for Medicare and Medicaid severe sepsis core measure 1) through the comparison of corresponding sepsis cohorts. DESIGN: Retrospective analysis of data extracted from electronic health record. SETTING: Single, tertiary-care center in St. Louis, MO. PATIENTS: Adult, nonsurgical inpatients admitted between January 1, 2012, and January 6, 2018. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: In the electronic health record data, 286,759 encounters met inclusion criteria across the study period. Application of established sepsis criteria yielded cohorts varying in prevalence: Centers for Disease Control and Prevention Adult Sepsis Event (4.4%), Centers for Medicare and Medicaid severe sepsis core measure 1 (4.8%), International Classification of Disease code (7.2%), Sepsis-3 (7.5%), and Sepsis-1 (11.3%). Between the two modern established criteria, Sepsis-3 (n = 21,550) and Centers for Disease Control and Prevention Adult Sepsis Event (n = 12,494), the size of the overlap was 7,763. The sepsis cohorts also varied in time from admission to sepsis onset (hr): Sepsis-1 (2.9), Sepsis-3 (4.1), Centers for Disease Control and Prevention Adult Sepsis Event (4.6), and Centers for Medicare and Medicaid severe sepsis core measure 1 (7.6); sepsis discharge International Classification of Disease code rate: Sepsis-1 (37.4%), Sepsis-3 (40.1%), Centers for Medicare and Medicaid severe sepsis core measure 1 (48.5%), and Centers for Disease Control and Prevention Adult Sepsis Event (54.5%); and inhospital mortality rate: Sepsis-1 (13.6%), Sepsis-3 (18.8%), International Classification of Disease code (20.4%), Centers for Medicare and Medicaid severe sepsis core measure 1 (22.5%), and Centers for Disease Control and Prevention Adult Sepsis Event (24.1%). 
CONCLUSIONS: The application of commonly used sepsis definitions on a single population produced sepsis cohorts with low agreement, significantly different baseline demographics, and clinical outcomes.
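
The "low agreement" between cohorts can be made concrete using the two cohort sizes and the overlap reported above. The following is a minimal sketch, not the authors' analysis; the function name and the choice of the Jaccard index as an agreement measure are assumptions made for illustration.

```python
def overlap_summary(size_a, size_b, size_both):
    """Summarize the agreement between two cohorts from their sizes and overlap."""
    union = size_a + size_b - size_both
    return {
        "jaccard": size_both / union,        # overlap relative to the union
        "share_of_a": size_both / size_a,    # fraction of cohort A in the overlap
        "share_of_b": size_both / size_b,    # fraction of cohort B in the overlap
    }


# Sepsis-3 (n = 21,550) versus CDC Adult Sepsis Event (n = 12,494), overlap 7,763
summary = overlap_summary(21_550, 12_494, 7_763)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these numbers, roughly 36% of Sepsis-3 encounters and 62% of Adult Sepsis Event encounters fall in the overlap, giving a Jaccard index of about 0.30.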


Subject(s)
Databases, Factual/statistics & numerical data; Sepsis/classification; Sepsis/diagnosis; Severity of Illness Index; Humans; International Classification of Diseases; Outcome Assessment, Health Care; Retrospective Studies; Sepsis/epidemiology; Shock, Septic/classification; Shock, Septic/diagnosis; United States
3.
MMWR Morb Mortal Wkly Rep ; 70(12): 449-455, 2021 Mar 26.
Article in English | MEDLINE | ID: mdl-33764961

ABSTRACT

Many kindergarten through grade 12 (K-12) schools offering in-person learning have adopted strategies to limit the spread of SARS-CoV-2, the virus that causes COVID-19 (1). These measures include mandating use of face masks, physical distancing in classrooms, increasing ventilation with outdoor air, identification of close contacts,* and following CDC isolation and quarantine guidance† (2). A 2-week pilot investigation was conducted to investigate occurrences of SARS-CoV-2 secondary transmission in K-12 schools in the city of Springfield, Missouri, and in St. Louis County, Missouri, during December 7-18, 2020. Schools in both locations implemented COVID-19 mitigation strategies; however, Springfield implemented a modified quarantine policy permitting student close contacts aged ≤18 years who had school-associated contact with a person with COVID-19 and met masking requirements during their exposure to continue in-person learning.§ Participating students, teachers, and staff members with COVID-19 (37) from 22 schools and their school-based close contacts (contacts) (156) were interviewed, and contacts were offered SARS-CoV-2 testing. Among 102 school-based contacts who received testing, two (2%) had positive test results indicating probable school-based SARS-CoV-2 secondary transmission. Both contacts were in Springfield and did not meet criteria to participate in the modified quarantine. In Springfield, 42 student contacts were permitted to continue in-person learning under the modified quarantine; among the 30 who were interviewed, 21 were tested, and none received a positive test result. Despite high community transmission, SARS-CoV-2 transmission in schools implementing COVID-19 mitigation strategies was lower than that in the community. 
Until additional data are available, K-12 schools should continue implementing CDC-recommended mitigation measures (2) and follow CDC isolation and quarantine guidance to minimize secondary transmission in schools offering in-person learning.


Subject(s)
COVID-19/prevention & control; COVID-19/transmission; Schools/organization & administration; Schools/statistics & numerical data; Adolescent; Adult; COVID-19/epidemiology; COVID-19 Nucleic Acid Testing; Child; Child, Preschool; Contact Tracing; Female; Humans; Male; Masks/statistics & numerical data; Middle Aged; Missouri/epidemiology; Physical Distancing; Pilot Projects; Quarantine; SARS-CoV-2/isolation & purification; Ventilation/statistics & numerical data
4.
J Biomed Inform ; 60: 95-103, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26828957

ABSTRACT

BACKGROUND: Community-level factors have been clearly linked to health outcomes, but are challenging to incorporate into medical practice. Increasing use of electronic health records (EHRs) makes patient-level data available for researchers in a systematic and accessible way, but these data remain siloed from community-level data relevant to health. PURPOSE: This study sought to link community and EHR data from an older female patient cohort participating in an ongoing intervention at the Ohio State University Wexner Medical Center to associate community-level data with patient-level cardiovascular health (CVH) as well as to assess the utility of this EHR integration methodology. MATERIALS AND METHODS: CVH was characterized among patients using available EHR data collected May through July of 2013. EHR data for 153 patients were linked to United States census-tract level data to explore feasibility and insights gained from combining these disparate data sources. Analyses were conducted in 2014. RESULTS: Using the linked data, weekly per capita expenditure on fruits and vegetables was found to be significantly associated with CVH at the p<0.05 level and three other community-level attributes (median income, average household size, and unemployment rate) were associated with CVH at the p<0.10 level. CONCLUSIONS: This work paves the way for future integration of community and EHR-based data into patient care as a novel methodology to gain insight into multi-level factors that affect CVH and other health outcomes. Further, our findings demonstrate the specific architectural and functional challenges associated with integrating decision support technologies and geographic information to support tailored and patient-centered decision making therein.


Subject(s)
Cardiovascular System; Delivery of Health Care; Electronic Health Records; Health Status; Information Storage and Retrieval; Aged; Cohort Studies; Female; Geographic Information Systems; Humans; Ohio; Residence Characteristics; Socioeconomic Factors
5.
J Biomed Inform ; 58 Suppl: S103-S110, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26375493

ABSTRACT

The second track of the 2014 i2b2 challenge asked participants to automatically identify risk factors for heart disease among diabetic patients using natural language processing techniques for clinical notes. This paper describes a rule-based system developed using a combination of regular expressions, concepts from the Unified Medical Language System (UMLS), and freely-available resources from the community. With a performance (F1=90.7) that is significantly higher than the median (F1=87.20) and close to the top performing system (F1=92.8), it was the best rule-based system of all the submissions in the challenge. We also used this system to evaluate the utility of different terminologies in the UMLS towards the challenge task. Of the 155 terminologies in the UMLS, 129 (76.78%) have no representation in the corpus. The Consumer Health Vocabulary had very good coverage of relevant concepts and was the most useful terminology for the challenge task. While segmenting notes into sections and lists has a significant impact on the performance, identifying negations and experiencer of the medical event results in negligible gain.


Subject(s)
Data Mining/methods; Diabetes Complications/epidemiology; Electronic Health Records/organization & administration; Narration; Natural Language Processing; Unified Medical Language System/organization & administration; Aged; Cohort Studies; Comorbidity; Computer Security; Confidentiality; Coronary Artery Disease/diagnosis; Coronary Artery Disease/epidemiology; Diabetes Complications/diagnosis; Female; Humans; Incidence; Longitudinal Studies; Male; Middle Aged; Ohio/epidemiology; Pattern Recognition, Automated/methods; Risk Assessment/methods; Terminology as Topic; Vocabulary, Controlled
6.
J Biomed Inform ; 58 Suppl: S211-S218, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26376462

ABSTRACT

Clinical trials are essential for determining whether new interventions are effective. In order to determine the eligibility of patients to enroll into these trials, clinical trial coordinators often perform a manual review of clinical notes in the electronic health record of patients. This is a very time-consuming and exhausting task. Efforts in this process can be expedited if these coordinators are directed toward specific parts of the text that are relevant for eligibility determination. In this study, we describe the creation of a dataset that can be used to evaluate automated methods capable of identifying sentences in a note that are relevant for screening a patient's eligibility in clinical trials. Using this dataset, we also present results for four simple methods in natural language processing that can be used to automate this task. We found that this is a challenging task (maximum F-score=26.25), but it is a promising direction for further research.


Subject(s)
Clinical Trials as Topic/methods; Data Mining/methods; Electronic Health Records/organization & administration; Eligibility Determination/methods; Natural Language Processing; Patient Selection; Humans; Pattern Recognition, Automated/methods; Vocabulary, Controlled
7.
bioRxiv ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-37808763

ABSTRACT

Objective: Accurately identifying clinical phenotypes from Electronic Health Records (EHRs) provides additional insights into patients' health, especially when such information is unavailable in structured data. This study evaluates the application of OpenAI's Generative Pre-trained Transformer (GPT)-4 model to identify clinical phenotypes from EHR text in non-small cell lung cancer (NSCLC) patients. The goal was to identify disease stages, treatments and progression utilizing GPT-4, and compare its performance against GPT-3.5-turbo, Flan-T5-xl, Flan-T5-xxl, and two rule-based and machine learning-based methods, namely, scispaCy and medspaCy. Materials and Methods: Phenotypes such as initial cancer stage, initial treatment, evidence of cancer recurrence, and affected organs during recurrence were identified from 13,646 records for 63 NSCLC patients from Washington University in St. Louis, Missouri. The performance of the GPT-4 model is evaluated against GPT-3.5-turbo, Flan-T5-xxl, Flan-T5-xl, medspaCy and scispaCy by comparing precision, recall, and micro-F1 scores. Results: GPT-4 achieved higher F1 score, precision, and recall compared to Flan-T5-xl, Flan-T5-xxl, medspaCy and scispaCy's models. GPT-3.5-turbo performed similarly to GPT-4. GPT and Flan-T5 models were not constrained by explicit rule requirements for contextual pattern recognition. spaCy models relied on predefined patterns, leading to their suboptimal performance. Discussion and Conclusion: GPT-4 improves clinical phenotype identification due to its robust pre-training and remarkable pattern recognition capability on the embedded tokens. It demonstrates data-driven effectiveness even with limited context in the input. While rule-based models remain useful for some tasks, GPT models offer improved contextual understanding of the text, and robust clinical phenotype extraction.

8.
JAMIA Open ; 7(3): ooae060, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38962662

ABSTRACT

Objective: Accurately identifying clinical phenotypes from Electronic Health Records (EHRs) provides additional insights into patients' health, especially when such information is unavailable in structured data. This study evaluates the application of OpenAI's Generative Pre-trained Transformer (GPT)-4 model to identify clinical phenotypes from EHR text in non-small cell lung cancer (NSCLC) patients. The goal was to identify disease stages, treatments and progression utilizing GPT-4, and compare its performance against GPT-3.5-turbo, Flan-T5-xl, Flan-T5-xxl, Llama-3-8B, and 2 rule-based and machine learning-based methods, namely, scispaCy and medspaCy. Materials and Methods: Phenotypes such as initial cancer stage, initial treatment, evidence of cancer recurrence, and affected organs during recurrence were identified from 13 646 clinical notes for 63 NSCLC patients from Washington University in St. Louis, Missouri. The performance of the GPT-4 model is evaluated against GPT-3.5-turbo, Flan-T5-xxl, Flan-T5-xl, Llama-3-8B, medspaCy, and scispaCy by comparing precision, recall, and micro-F1 scores. Results: GPT-4 achieved higher F1 score, precision, and recall compared to Flan-T5-xl, Flan-T5-xxl, Llama-3-8B, medspaCy, and scispaCy's models. GPT-3.5-turbo performed similarly to GPT-4. GPT, Flan-T5, and Llama models were not constrained by explicit rule requirements for contextual pattern recognition. spaCy models relied on predefined patterns, leading to their suboptimal performance. Discussion and Conclusion: GPT-4 improves clinical phenotype identification due to its robust pre-training and remarkable pattern recognition capability on the embedded tokens. It demonstrates data-driven effectiveness even with limited context in the input. While rule-based models remain useful for some tasks, GPT models offer improved contextual understanding of the text, and robust clinical phenotype extraction.
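
For readers unfamiliar with the micro-F1 score used in this comparison, the sketch below shows how it pools counts across phenotype categories before computing precision and recall. The per-phenotype counts are hypothetical placeholders chosen for illustration; they are not results from the study, and the function name is likewise an assumption.

```python
def micro_f1(counts):
    """Micro-averaged precision, recall, and F1.

    `counts` maps each phenotype to (true positives, false positives,
    false negatives); the counts are summed over phenotypes first.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only
example_counts = {
    "initial_stage": (8, 2, 1),
    "initial_treatment": (5, 1, 3),
}
precision, recall, f1 = micro_f1(example_counts)
print(f"precision={precision:.3f} recall={recall:.3f} micro-F1={f1:.3f}")
```

Because the counts are pooled, categories with more mentions weigh more heavily in the micro-average than rare ones.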

9.
JAMA Netw Open ; 7(6): e2417977, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38904961

ABSTRACT

Importance: It is unclear whether cannabis use is associated with adverse health outcomes in patients with COVID-19 when accounting for known risk factors, including tobacco use. Objective: To examine whether cannabis and tobacco use are associated with adverse health outcomes from COVID-19 in the context of other known risk factors. Design, Setting, and Participants: This retrospective cohort study used electronic health record data from February 1, 2020, to January 31, 2022. This study included patients who were identified as having COVID-19 during at least 1 medical visit at a large academic medical center in the Midwest US. Exposures: Current cannabis use and tobacco smoking, as documented in the medical encounter. Main Outcomes and Measures: Health outcomes of hospitalization, intensive care unit (ICU) admission, and all-cause mortality following COVID-19 infection. The association between substance use (cannabis and tobacco) and these COVID-19 outcomes was assessed using multivariable modeling. Results: A total of 72 501 patients with COVID-19 were included (mean [SD] age, 48.9 [19.3] years; 43 315 [59.7%] female; 9710 [13.4%] had current smoking; 17 654 [24.4%] had former smoking; and 7060 [9.7%] had current use of cannabis). Current tobacco smoking was significantly associated with increased risk of hospitalization (odds ratio [OR], 1.72; 95% CI, 1.62-1.82; P < .001), ICU admission (OR, 1.22; 95% CI, 1.10-1.34; P < .001), and all-cause mortality (OR, 1.37, 95% CI, 1.20-1.57; P < .001) after adjusting for other factors. Cannabis use was significantly associated with increased risk of hospitalization (OR, 1.80; 95% CI, 1.68-1.93; P < .001) and ICU admission (OR, 1.27; 95% CI, 1.14-1.41; P < .001) but not with all-cause mortality (OR, 0.97; 95% CI, 0.82-1.14, P = .69) after adjusting for tobacco smoking, vaccination, comorbidity, diagnosis date, and demographic factors. 
Conclusions and Relevance: The findings of this cohort study suggest that cannabis use may be an independent risk factor for COVID-19-related complications, even after considering cigarette smoking, vaccination status, comorbidities, and other risk factors.


Subject(s)
COVID-19; Hospitalization; Intensive Care Units; SARS-CoV-2; Humans; COVID-19/mortality; COVID-19/epidemiology; Female; Male; Middle Aged; Retrospective Studies; Hospitalization/statistics & numerical data; Adult; Risk Factors; Intensive Care Units/statistics & numerical data; Aged; Tobacco Use/adverse effects; Tobacco Use/epidemiology; Tobacco Smoking/adverse effects; Tobacco Smoking/epidemiology; Marijuana Smoking/epidemiology; Marijuana Smoking/adverse effects
10.
J Am Med Inform Assoc ; 30(10): 1730-1740, 2023 09 25.
Article in English | MEDLINE | ID: mdl-37390812

ABSTRACT

OBJECTIVE: We extended a 2013 literature review on electronic health record (EHR) data quality assessment approaches and tools to determine recent improvements or changes in EHR data quality assessment methodologies. MATERIALS AND METHODS: We completed a systematic review of PubMed articles from 2013 to April 2023 that discussed the quality assessment of EHR data. We screened and reviewed papers for the dimensions and methods defined in the original 2013 manuscript. We categorized papers as data quality outcomes of interest, tools, or opinion pieces. We abstracted and defined additional themes and methods through an iterative review process. RESULTS: We included 103 papers in the review, of which 73 were data quality outcomes of interest papers, 22 were tools, and 8 were opinion pieces. The most common dimension of data quality assessed was completeness, followed by correctness, concordance, plausibility, and currency. We abstracted conformance and bias as 2 additional dimensions of data quality and structural agreement as an additional methodology. DISCUSSION: There has been an increase in EHR data quality assessment publications since the original 2013 review. Consistent dimensions of EHR data quality continue to be assessed across applications. Despite consistent patterns of assessment, there still does not exist a standard approach for assessing EHR data quality. CONCLUSION: Guidelines are needed for EHR data quality assessment to improve the efficiency, transparency, comparability, and interoperability of data quality assessment. These guidelines must be both scalable and flexible. Automation could be helpful in generalizing this process.


Subject(s)
Data Accuracy; Electronic Health Records
11.
JAMIA Open ; 6(1): ooad014, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36844369

ABSTRACT

Objectives: There is much interest in utilizing clinical data for developing prediction models for Alzheimer's disease (AD) risk, progression, and outcomes. Existing studies have mostly utilized curated research registries, image analysis, and structured electronic health record (EHR) data. However, much critical information resides in relatively inaccessible unstructured clinical notes within the EHR. Materials and Methods: We developed a natural language processing (NLP)-based pipeline to extract AD-related clinical phenotypes, documenting strategies for success and assessing the utility of mining unstructured clinical notes. We evaluated the pipeline against gold-standard manual annotations performed by 2 clinical dementia experts for AD-related clinical phenotypes including medical comorbidities, biomarkers, neurobehavioral test scores, behavioral indicators of cognitive decline, family history, and neuroimaging findings. Results: Documentation rates for each phenotype varied in the structured versus unstructured EHR. Interannotator agreement was high (Cohen's kappa = 0.72-1) and positively correlated with the NLP-based phenotype extraction pipeline's performance (average F1-score = 0.65-0.99) for each phenotype. Discussion: We developed an automated NLP-based pipeline to extract informative phenotypes that may improve the performance of eventual machine learning predictive models for AD. In the process, we examined documentation practices for each phenotype relevant to the care of AD patients and identified factors for success. Conclusion: Success of our NLP-based phenotype extraction pipeline depended on domain-specific knowledge and focus on a specific clinical domain instead of maximizing generalizability.

12.
Neurology ; 101(14): e1424-e1433, 2023 10 03.
Article in English | MEDLINE | ID: mdl-37532510

ABSTRACT

BACKGROUND AND OBJECTIVES: The capacity of specialty memory clinics in the United States is very limited. If lower socioeconomic status or minoritized racial group is associated with reduced use of memory clinics, this could exacerbate health care disparities, especially if more effective treatments of Alzheimer disease become available. We aimed to understand how use of a memory clinic is associated with neighborhood-level measures of socioeconomic factors and the intersectionality of race. METHODS: We conducted an observational cross-sectional study using electronic health record data to compare the neighborhood advantage of patients seen at the Washington University Memory Diagnostic Center with the catchment area using a geographical information system. Furthermore, we compared the severity of dementia at the initial visit between patients who self-identified as Black or White. We used a multinomial logistic regression model to assess the Clinical Dementia Rating at the initial visit and t tests to compare neighborhood characteristics, including Area Deprivation Index, with those of the catchment area. RESULTS: A total of 4,824 patients seen at the memory clinic between 2008 and 2018 were included in this study (mean age 72.7 [SD 11.0] years, 2,712 [56%] female, 543 [11%] Black). Most of the memory clinic patients lived in more advantaged neighborhoods within the overall catchment area. The percentage of patients self-identifying as Black (11%) was lower than the average percentage of Black individuals by census tract in the catchment area (16%) (p < 0.001). Black patients lived in less advantaged neighborhoods, and Black patients were more likely than White patients to have moderate or severe dementia at their initial visit (odds ratio 1.59, 95% CI 1.11-2.25). DISCUSSION: This study demonstrates that patients living in less affluent neighborhoods were less likely to be seen in one large memory clinic. 
Black patients were under-represented in the clinic, and Black patients had more severe dementia at their initial visit. These findings suggest that patients with a lower socioeconomic status and who identify as Black are less likely to be seen in memory clinics, which are likely to be a major point of access for any new Alzheimer disease treatments that may become available.


Subject(s)
Alzheimer Disease; Aged; Female; Humans; Male; Alzheimer Disease/complications; Alzheimer Disease/diagnosis; Alzheimer Disease/epidemiology; Alzheimer Disease/ethnology; Alzheimer Disease/therapy; Black People; Cross-Sectional Studies; Racial Groups; Socioeconomic Factors; United States; Memory Disorders/epidemiology; Memory Disorders/ethnology; Memory Disorders/etiology; White People; Neighborhood Characteristics; Middle Aged; Aged, 80 and over
13.
JAMIA Open ; 5(4): ooac105, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36570030

ABSTRACT

EHR-based sepsis research often uses heterogeneous definitions of sepsis, leading to poor generalizability and difficulty in comparing studies to each other. We have developed OpenSep, an open-source pipeline for sepsis phenotyping according to the Sepsis-3 definition, as well as determination of time of sepsis onset and SOFA scores. The Minimal Sepsis Data Model was developed alongside the pipeline to enable the execution of the pipeline on diverse sources of electronic health record data. The pipeline's accuracy was validated by applying it to the MIMIC-IV version 1.0 data and comparing sepsis onset and SOFA scores to those produced by the pipeline developed by the curators of MIMIC. We demonstrated high reliability for both the sepsis onsets and SOFA scores; however, the use of the Minimal Sepsis Data Model developed for this work allows our pipeline to be applied more broadly to data sources beyond MIMIC.

14.
J Am Med Inform Assoc ; 29(5): 813-821, 2022 04 13.
Article in English | MEDLINE | ID: mdl-35092276

ABSTRACT

OBJECTIVE: Respiratory support status is critical in understanding patient status, but electronic health record data are often scattered, incomplete, and contradictory. Further, there has been limited work on standardizing representations for respiratory support. The objective of this work was to (1) propose a practical terminology system for respiratory support methods; (2) develop (meta-)heuristics for constructing respiratory support episodes; and (3) evaluate the utility of respiratory support information for mortality prediction. MATERIALS AND METHODS: All analyses were performed using electronic health record data of COVID-19-tested, emergency department-admit, adult patients at a large, Midwestern healthcare system between March 1, 2020 and April 1, 2021. Logistic regression and XGBoost models were trained with and without respiratory support information, and performance metrics were compared. Importance of respiratory-support-based features was explored using absolute coefficient values for logistic regression and SHapley Additive exPlanations values for the XGBoost model. RESULTS: The proposed terminology system for respiratory support methods is as follows: Low-Flow Oxygen Therapy (LFOT), High-Flow Oxygen Therapy (HFOT), Non-Invasive Mechanical Ventilation (NIMV), Invasive Mechanical Ventilation (IMV), and ExtraCorporeal Membrane Oxygenation (ECMO). The addition of respiratory support information significantly improved mortality prediction (logistic regression area under receiver operating characteristic curve, median [IQR] from 0.855 [0.852-0.855] to 0.881 [0.876-0.884]; area under precision recall curve from 0.262 [0.245-0.268] to 0.319 [0.313-0.325], both P < 0.01). The proposed generalizable, interpretable, and episodic representation had commensurate performance compared to alternate representations despite loss of granularity. Respiratory support features were among the most important in both models. 
CONCLUSION: Respiratory support information is critical in understanding patient status and can facilitate downstream analyses.


Subject(s)
COVID-19; Heuristics; Adult; Humans; Machine Learning; Oxygen; Retrospective Studies
15.
Front Digit Health ; 4: 848599, 2022.
Article in English | MEDLINE | ID: mdl-35350226

ABSTRACT

Objective: To develop and evaluate a sepsis prediction model for the general ward setting and extend the evaluation through a novel pseudo-prospective trial design. Design: Retrospective analysis of data extracted from electronic health records (EHR). Setting: Single, tertiary-care academic medical center in St. Louis, MO, USA. Patients: Adult, non-surgical inpatients admitted between January 1, 2012 and June 1, 2019. Interventions: None. Measurements and Main Results: Of the 70,034 included patient encounters, 3.1% were septic based on the Sepsis-3 criteria. Features were generated from the EHR data and were used to develop a machine learning model to predict sepsis 6-h ahead of onset. The best performing model had an Area Under the Receiver Operating Characteristic curve (AUROC or c-statistic) of 0.862 ± 0.011 and Area Under the Precision-Recall Curve (AUPRC) of 0.294 ± 0.021 compared to that of Logistic Regression (0.857 ± 0.008 and 0.256 ± 0.024) and NEWS 2 (0.699 ± 0.012 and 0.092 ± 0.009). In the pseudo-prospective trial, 388 (69.7%) septic patients were alerted on with a specificity of 81.4%. Within 24 h of crossing the alert threshold, 20.9% had a sepsis-related event occur. Conclusions: A machine learning model capable of predicting sepsis in the general ward setting was developed using the EHR data. The pseudo-prospective trial provided a more realistic estimation of implemented performance and demonstrated a 29.1% Positive Predictive Value (PPV) for sepsis-related intervention or outcome within 48 h.

16.
PLoS One ; 17(10): e0266292, 2022.
Article in English | MEDLINE | ID: mdl-36264919

ABSTRACT

OBJECTIVE: To determine whether modified K-12 student quarantine policies that allow some students to continue in-person education during their quarantine period increase schoolwide SARS-CoV-2 transmission risk following the increase in cases in winter 2020-2021. METHODS: We conducted a prospective cohort study of COVID-19 cases and close contacts among students and staff (n = 65,621) in 103 Missouri public schools. Participants were offered free, saliva-based RT-PCR testing. The projected number of school-based transmission events among untested close contacts was extrapolated from the percentage of events detected among tested asymptomatic close contacts and summed with the number of detected events for a projected total. An adjusted Cox regression model compared hazard rates of school-based SARS-CoV-2 infections between schools with a modified versus standard quarantine policy. RESULTS: From January-March 2021, a projected 23 (1%) school-based transmission events occurred among 1,636 school close contacts. There was no difference in the adjusted hazard rates of school-based SARS-CoV-2 infections between schools with a modified versus standard quarantine policy (hazard ratio = 1.00; 95% confidence interval: 0.97-1.03). DISCUSSION: School-based SARS-CoV-2 transmission was rare in 103 K-12 schools implementing multiple COVID-19 prevention strategies. Modified student quarantine policies were not associated with increased school incidence of COVID-19. Modifications to student quarantine policies may be a useful strategy for K-12 schools to safely reduce disruptions to in-person education during times of increased COVID-19 community incidence.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Quarantine , COVID-19/epidemiology , COVID-19/prevention & control , Prospective Studies , Students , Policy
17.
Stat Med ; 30(16): 1989-2004, 2011 Jul 20.
Article in English | MEDLINE | ID: mdl-21520454

ABSTRACT

Stochastic curtailment is a sequential method to terminate a study when continuing to the end would be unlikely to change the outcome. This method has been researched most commonly in the context of clinical trials. The current paper explores its use in a different setting: the administration of a health questionnaire to patients via computer. A classification procedure augmenting logistic regression with stochastic curtailment is introduced to avoid burdening the patients with unnecessary questions. In a real-data simulation using responses from the Medicare Health Outcomes Survey, the new procedure substantially reduced the average number of questions administered with a minimal loss of classification accuracy.
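
The idea of stopping a questionnaire early can be sketched as below. This is an illustration under strong assumptions, not the paper's procedure: items are binary, the classifier is a linear log-odds score, unanswered items are treated as independent with known "yes" probabilities, and the names `prob_final_positive`, `administer`, and `gamma` are hypothetical.

```python
from itertools import product


def prob_final_positive(partial_score, weights_left, p_yes_left, threshold):
    """Probability that the final score reaches the threshold, enumerating
    all answer patterns for the remaining binary items."""
    total = 0.0
    for answers in product((0, 1), repeat=len(weights_left)):
        prob, score = 1.0, partial_score
        for a, w, p in zip(answers, weights_left, p_yes_left):
            prob *= p if a else (1.0 - p)
            score += a * w
        if score >= threshold:
            total += prob
    return total


def administer(responses, weights, p_yes, intercept, threshold=0.0, gamma=0.95):
    """Ask items in order; stop once the eventual classification is
    predicted with probability at least gamma. Returns (label, items_asked)."""
    score = intercept
    for i, answer in enumerate(responses):
        score += weights[i] * answer
        asked = i + 1
        p_pos = prob_final_positive(score, weights[asked:], p_yes[asked:], threshold)
        if p_pos >= gamma:
            return True, asked
        if p_pos <= 1.0 - gamma:
            return False, asked
    return score >= threshold, len(responses)


# Hypothetical three-item questionnaire.
weights, p_yes, intercept = [2.0, 1.0, 1.0], [0.5, 0.5, 0.5], -1.5
print(administer([1, 0, 0], weights, p_yes, intercept))  # (True, 1)
print(administer([0, 0, 1], weights, p_yes, intercept))  # (False, 2)
```

Setting `gamma` to 1.0 reduces this to deterministic curtailment, where the questionnaire stops only when the remaining answers cannot change the classification. Exhaustive enumeration is exponential in the number of remaining items, so it suits only short questionnaires.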


Subject(s)
Biostatistics/methods , Stochastic Processes , Surveys and Questionnaires , Aged , Clinical Trials as Topic/statistics & numerical data , Data Interpretation, Statistical , Female , Health Surveys/statistics & numerical data , Humans , Male , Medicare , United States
18.
JAMIA Open ; 4(3): ooab052, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34350389

ABSTRACT

OBJECTIVE: Alzheimer disease (AD) is the most common cause of dementia, a syndrome characterized by cognitive impairment severe enough to interfere with activities of daily life. We aimed to conduct a systematic literature review (SLR) of studies that applied machine learning (ML) methods to clinical data derived from electronic health records in order to model risk for progression of AD dementia. MATERIALS AND METHODS: We searched for articles published between January 1, 2010, and May 31, 2020, in PubMed, Scopus, ScienceDirect, IEEE Xplore Digital Library, Association for Computing Machinery Digital Library, and arXiv. We used predefined criteria to select relevant articles and summarized them according to key components of ML analysis such as data characteristics, computational algorithms, and research focus. RESULTS: There has been a considerable rise over the past 5 years in the number of research papers using ML-based analysis for AD dementia modeling. We reviewed 64 relevant articles in our SLR. The results suggest that the majority of existing research has focused on predicting progression of AD dementia using publicly available datasets containing both neuroimaging and clinical data (neurobehavioral status exam scores, patient demographics, neuroimaging data, and laboratory test values). DISCUSSION: Identifying individuals at risk for progression of AD dementia could potentially help to personalize disease management and plan future care. Clinical data consisting of both structured data tables and clinical notes can be effectively used in ML-based approaches to model risk for AD dementia progression. Data sharing and reproducibility of results can enhance the impact, adaptation, and generalizability of this research.

19.
JAMIA Open ; 4(3): ooab062, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34820600

ABSTRACT

The objective of this study was to directly compare the ability of commonly used early warning scores (EWS) for early identification and prediction of sepsis in the general ward setting. For general ward patients at a large, academic medical center between early-2012 and mid-2018, common EWS and patient acuity scoring systems were calculated from electronic health records (EHR) data for patients who both met and did not meet Sepsis-3 criteria. For identification of sepsis at index time, National Early Warning Score 2 (NEWS 2) had the highest performance (area under the receiver operating characteristic curve: 0.803 [95% confidence interval [CI]: 0.795-0.811], area under the precision-recall curve: 0.130 [95% CI: 0.121-0.140]), followed by NEWS, Modified Early Warning Score, and quick Sequential Organ Failure Assessment (qSOFA). Using validated thresholds, NEWS 2 also had the highest recall (0.758 [95% CI: 0.736-0.778]) but qSOFA had the highest specificity (0.950 [95% CI: 0.948-0.952]), positive predictive value (0.184 [95% CI: 0.169-0.198]), and F1 score (0.236 [95% CI: 0.220-0.253]). While NEWS 2 outperformed all other compared EWS and patient acuity scores, due to the low prevalence of sepsis, all scoring systems were prone to false positives (low positive predictive value without drastic sacrifices in sensitivity), thus leaving room for more computationally advanced approaches.
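
The threshold-based metrics compared in this abstract, and the effect of low prevalence on positive predictive value, can be illustrated with a short sketch. The function name `threshold_metrics` and the confusion-matrix counts are hypothetical and chosen only to mimic a low-prevalence setting; they are not the study's data.

```python
def threshold_metrics(tp, fp, tn, fn):
    """Recall, specificity, PPV, and F1 from confusion-matrix counts.

    Assumes every denominator is nonzero (at least one actual positive,
    one actual negative, and one predicted positive).
    """
    return {
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical cohort: 10,000 patients, 300 septic (3% prevalence),
# scored with 75% recall and 80% specificity.
m = threshold_metrics(tp=225, fp=1940, tn=7760, fn=75)
print({name: round(value, 3) for name, value in m.items()})
# {'recall': 0.75, 'specificity': 0.8, 'ppv': 0.104, 'f1': 0.183}
```

Even with fairly good recall and specificity, roughly nine in ten alerts in this toy cohort are false positives, because non-septic patients vastly outnumber septic ones.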

20.
Learn Health Syst ; 5(1): e10235, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32838037

ABSTRACT

Problem: The current coronavirus disease 2019 (COVID-19) pandemic underscores the need for building and sustaining public health data infrastructure to support a rapid local, regional, national, and international response. Despite a historical context of public health crises, data sharing agreements and transactional standards do not uniformly exist between institutions, which hampers a foundational infrastructure to meet data sharing and integration needs for the advancement of public health. Approach: There is a growing need to apply population health knowledge with technological solutions to data transfer, integration, and reasoning, to improve health in a broader learning health system ecosystem. To achieve this, data must be combined from healthcare provider organizations, public health departments, and other settings. Public health entities are in a unique position to consume these data; however, most do not yet have the infrastructure required to integrate data sources and apply computable knowledge to combat this pandemic. Outcomes: Herein, we describe lessons learned and a framework to address these needs, which focus on: (a) identifying and filling technology "gaps"; (b) pursuing collaborative design of data sharing requirements and transmission mechanisms; (c) facilitating cross-domain discussions involving legal and research compliance; and (d) establishing or participating in multi-institutional convening or coordinating activities. Next steps: While by no means a comprehensive evaluation of such issues, we envision that many of our experiences are universal. We hope those elucidated can serve as the catalyst for a robust community-wide dialogue on what steps can and should be taken to ensure that our regional and national health care systems can truly learn, in a rapid manner, so as to respond to this and future emergent public health crises.
