Results 1 - 20 of 90
1.
J Biomed Inform ; 149: 104551, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38000765

ABSTRACT

The development and deployment of machine learning (ML) models for biomedical research and healthcare currently lacks standard methodologies. Although tools for model replication are numerous, without a unifying blueprint it remains difficult to scientifically reproduce predictive ML models for any number of reasons (e.g., assumptions regarding data distributions and preprocessing, unclear test metrics, etc.) and ultimately, questions around generalizability and transportability are not readily answered. To facilitate scientific reproducibility, we built upon the Predictive Model Markup Language (PMML) to capture essential information. As a key component of the PREdictive Model Index and Exchange REpository (PREMIERE) platform, we present the Automated Metadata Pipeline (AMP) for conversion of a given predictive ML model into an extended PMML file that autocompletes an ML-based checklist, assessing model elements for interoperability and reproducibility. We demonstrate this pipeline on multiple test cases with three different ML algorithms and health-related datasets, providing a foundation for future predictive model reproducibility, sharing, and comparison.


Subject(s)
Biomedical Research , Reproducibility of Results , Algorithms , Records , Metadata
2.
J Biomed Inform ; 134: 104168, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35987449

ABSTRACT

Early detection of heart failure (HF) can provide patients with the opportunity for more timely intervention and better disease management, as well as efficient use of healthcare resources. Recent machine learning (ML) methods have shown promising performance on diagnostic prediction using temporal sequences from electronic health records (EHRs). In practice, however, these models may not generalize to other populations due to dataset shift. Shifts in datasets can be attributed to a range of factors such as variations in demographics, data management methods, and healthcare delivery patterns. In this paper, we use unsupervised adversarial domain adaptation methods to adaptively reduce the impact of dataset shift on cross-institutional transfer performance. The proposed framework is validated on a next-visit HF onset prediction task using a BERT-style Transformer-based language model pre-trained with a masked language modeling (MLM) task. Our model empirically demonstrates superior prediction performance relative to non-adversarial baselines in both transfer directions on two different clinical event sequence data sources.


Subject(s)
Heart Failure , Neural Networks, Computer , Electronic Health Records , Heart Failure/diagnosis , Humans , Information Storage and Retrieval , Language , Machine Learning
3.
J Biomed Inform ; 135: 104214, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36220544

ABSTRACT

To better understand the challenges of generally implementing and adapting computational phenotyping approaches, the performance of a Phenotype KnowledgeBase (PheKB) algorithm for rheumatoid arthritis (RA) was evaluated on a University of California, Los Angeles (UCLA) patient population, focusing on examining its performance on ambiguous cases. The algorithm was evaluated on a cohort of 4,766 patients, along with a chart review of 300 patients by rheumatologists against accepted diagnostic guidelines. The performance revealed low sensitivity towards specific subtypes of positive RA cases, which suggests revisions in features used for phenotyping. A close examination of select cases also indicated a significant portion of patients with missing data, drawing attention to the need to consider data integrity as an integral part of phenotyping pipelines, as well as issues around the usability of various codes for distinguishing cases. We use patterns in the PheKB algorithm's errors to further demonstrate important considerations when designing a phenotyping algorithm.


Subject(s)
Arthritis, Rheumatoid , Electronic Health Records , Humans , Algorithms , Knowledge Bases , Phenotype , Arthritis, Rheumatoid/diagnosis , Arthritis, Rheumatoid/epidemiology
4.
J Asthma ; 59(7): 1305-1318, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33926348

ABSTRACT

OBJECTIVE: The heterogeneity of asthma has inspired widespread application of statistical clustering algorithms to a variety of datasets for identification of potentially clinically meaningful phenotypes. There has not been a standardized data analysis approach for asthma clustering, which can affect reproducibility and clinical translation of results. Our objective was to identify common and effective data analysis practices in the asthma clustering literature and apply them to data from a Southern California population-based cohort of schoolchildren with asthma. METHODS: As of January 1, 2020, we reviewed key statistical elements of 77 asthma clustering studies. Guided by the literature, we used 12 input variables and three clustering methods (hierarchical clustering, k-medoids, and latent class analysis) to identify clusters in 598 schoolchildren with asthma from the Southern California Children's Health Study (CHS). RESULTS: Clusters of children identified by latent class analysis were characterized by exhaled nitric oxide, FEV1/FVC, FEV1 percent predicted, asthma control, and allergy score, and were predictive of control at two-year follow-up. Clusters from the other two methods were less clinically remarkable, primarily differentiated by sex and race/ethnicity and less predictive of asthma control over time. CONCLUSION: Upon review of the asthma phenotyping literature, common approaches to data clustering emerged. When applying these elements to the Children's Health Study data, latent class analysis clusters, represented by exhaled nitric oxide and spirometry measures, had clinical relevance over time.


Subject(s)
Asthma , Asthma/epidemiology , Asthma/genetics , Child , Child Health , Cluster Analysis , Humans , Nitric Oxide , Reproducibility of Results
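
To illustrate one of the three clustering methods named in the abstract above (k-medoids), here is a minimal sketch in pure Python. It uses the simple alternating "assign, then re-pick the medoid" variant rather than the full PAM swap procedure. The toy points, the Euclidean distance, and the function name `k_medoids` are assumptions made for illustration; the abstract does not specify the study's distance measure, software, or initialization.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def k_medoids(points, initial_medoids, max_iter=100):
    """Alternating k-medoids; medoids are indices into `points`."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: euclidean(p, points[m]))
            clusters[nearest].append(i)
        # Update step: within each cluster, pick the member with the
        # smallest total distance to all other members.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # keep an empty cluster's medoid unchanged
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(euclidean(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # converged: medoids no longer change
        medoids = new_medoids
    labels = [min(medoids, key=lambda m: euclidean(p, points[m])) for p in points]
    return medoids, labels


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_medoids, toy_labels = k_medoids(toy_points, initial_medoids=[0, 1])
print(toy_medoids, toy_labels)
```

Unlike k-means, each cluster center is an actual observation, which can make clusters easier to describe clinically. The result depends on the starting medoids, so in practice several initializations would typically be compared.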
5.
Am J Perinatol ; 2022 Dec 29.
Article in English | MEDLINE | ID: mdl-35752169

ABSTRACT

OBJECTIVE: This study aimed to develop and validate a machine learning (ML) model to predict the probability of a vaginal delivery (Partometer) using data iteratively obtained during labor from the electronic health record. STUDY DESIGN: A retrospective cohort study was conducted of deliveries at an academic, tertiary care hospital from 2013 to 2019 in which the patient had at least two cervical examinations. The population was divided into those delivered by physicians with nulliparous term singleton vertex (NTSV) cesarean delivery rates <23.9% (Partometer cohort) and the remainder (control cohort). The cesarean rate among this population of lower risk patients is a standard metric by which to compare provider rates; <23.9% was the Healthy People 2020 goal. A supervised automated ML approach was applied to generate a model for each population. The primary outcome was accuracy of the model developed on the Partometer cohort at 4 hours from admission to labor and delivery. Secondary outcomes included discrimination ability (receiver operating characteristics-area under the curve [ROC-AUC]), precision-recall AUC, and calibration of the Partometer. To assess generalizability, we compared the performance and clinical predictors identified by the Partometer to the control model. RESULTS: There were 37,932 deliveries during the study period; after exclusions, 9,385 deliveries were included in the Partometer cohort and 19,683 in the control cohort. Accuracy of predicting vaginal delivery at 4 hours was 87.1% for the Partometer (ROC-AUC: 0.82). Clinical predictors of greatest importance in the stacked Intrapartum Partometer Model included the Admission Model prediction and ongoing measures of dilatation and station, which mirrored those found in the control population. CONCLUSION: Using automated ML and intrapartum factors improved the accuracy of prediction of probability of a vaginal delivery over both previously published models based on logistic regression. Harnessing real-time data and ML could represent the bridge to generating a truly prescriptive tool to augment clinical decision-making, predict labor outcomes, and reduce maternal and neonatal morbidity. KEY POINTS: · Our ML-based model yielded accurate predictions of mode of delivery early in labor. · Predictors for models created on populations with high and low cesarean rates were the same. · An ML-based model may provide meaningful guidance to clinicians managing labor.

6.
BMC Infect Dis ; 19(1): 918, 2019 Nov 08.
Article in English | MEDLINE | ID: mdl-31699053

ABSTRACT

BACKGROUND: In recent years, the number of infective endocarditis (IE) cases associated with injection drug use has increased. Clinical guidelines suggest deferring surgery for IE in people who inject drugs (PWID) due to a concern for worse outcomes in comparison to non-injectors (non-PWID). We performed a systematic review and meta-analysis of long-term outcomes in PWID who underwent cardiac surgery and compared these outcomes to non-PWID. METHODS: We systematically searched for studies reported between 1965 and 2018. We used an algorithm to estimate individual patient data (eIPD) from Kaplan-Meier (KM) curves and combined it with published individual patient data (IPD) to analyze long-term outcomes after cardiac surgery for IE in PWID. Our primary outcome was survival. Secondary outcomes were reoperation and mortality at 30-days, one-, five-, and 10-years. Random effects Cox regression was used for estimating survival. RESULTS: We included 27 studies in the systematic review and 19 provided data (KM or IPD) for the meta-analysis. PWID were younger and more likely to have S. aureus than non-PWID. Survival at 30-days, one-, five-, and 10-years was 94.3, 81.0, 62.1, and 56.6% in PWID, respectively; and 96.4, 85.0, 70.3, and 63.4% in non-PWID. PWID had 47% greater hazard of death (HR 1.47, 95% CI, 1.05-2.05) and more than twice the hazard of reoperation (HR 2.37, 95% CI, 1.25-4.50) than non-PWID. CONCLUSION: PWID had shorter survival than non-PWID. Implementing evidence-based interventions and testing new modalities are urgently needed to improve outcomes in PWID after cardiac surgery.


Subject(s)
Endocarditis/diagnosis , Substance Abuse, Intravenous/complications , Cardiac Surgical Procedures , Endocarditis/etiology , Endocarditis/mortality , Humans , Kaplan-Meier Estimate , Proportional Hazards Models , Staphylococcal Infections/diagnosis , Treatment Outcome
7.
BMC Nephrol ; 20(1): 416, 2019 Nov 20.
Article in English | MEDLINE | ID: mdl-31747918

ABSTRACT

BACKGROUND: Chronic kidney disease (CKD) is a global public health problem, exhibiting sharp increases in incidence, prevalence, and attributable morbidity and mortality. There is a critical need to better understand the demographics, clinical characteristics, and key risk factors for CKD; and to develop platforms for testing novel interventions to improve modifiable risk factors, particularly for the CKD patients with a rapid decline in kidney function. METHODS: We describe a novel collaboration between two large healthcare systems (Providence St. Joseph Health and University of California, Los Angeles Health) supported by leadership from both institutions, which was created to develop harmonized cohorts of patients with CKD or those at increased risk for CKD (hypertension/HTN, diabetes/DM, pre-diabetes) from electronic health record data. RESULTS: The combined repository of candidate records included more than 3.3 million patients with at least a single qualifying measure for CKD and/or at-risk for CKD. The CURE-CKD registry includes over 2.6 million patients with and/or at-risk for CKD identified by stricter guideline-based criteria using a combination of administrative encounter codes, physical examinations, laboratory values and medication use. Notably, data on race/ethnicity and geography will, in part, enable robust analyses to study traditionally disadvantaged or marginalized patients not typically included in clinical trials. DISCUSSION: The CURE-CKD project is a unique multidisciplinary collaboration between nephrologists, endocrinologists, primary care physicians with health services research skills, health economists, and those with expertise in statistics, bioinformatics and machine learning. The CURE-CKD registry uses curated observations from real-world settings across two large healthcare systems and has great potential to provide important contributions for healthcare and for improving clinical outcomes in patients with and at-risk for CKD.


Subject(s)
Comprehensive Health Care , Electronic Health Records , Medical Record Linkage/methods , Renal Insufficiency, Chronic , Adult , Comprehensive Health Care/organization & administration , Comprehensive Health Care/standards , Diabetes Mellitus/epidemiology , Disease Progression , Electronic Health Records/organization & administration , Electronic Health Records/statistics & numerical data , Female , Humans , Hypertension/epidemiology , Male , Prevalence , Prognosis , Quality Improvement , Registries , Renal Insufficiency, Chronic/diagnosis , Renal Insufficiency, Chronic/epidemiology , Risk Assessment , Risk Factors , United States/epidemiology
8.
Expert Syst Appl ; 128: 84-95, 2019 Aug 15.
Article in English | MEDLINE | ID: mdl-31296975

ABSTRACT

While deep learning methods have demonstrated performance comparable to human readers in tasks such as computer-aided diagnosis, these models are difficult to interpret, do not incorporate prior domain knowledge, and are often considered as a "black-box." The lack of model interpretability hinders them from being fully understood by end users such as radiologists. In this paper, we present a novel interpretable deep hierarchical semantic convolutional neural network (HSCNN) to predict whether a given pulmonary nodule observed on a computed tomography (CT) scan is malignant. Our network provides two levels of output: 1) low-level semantic features; and 2) a high-level prediction of nodule malignancy. The low-level outputs reflect diagnostic features often reported by radiologists and serve to explain how the model interprets the images in an expert-interpretable manner. The information from these low-level outputs, along with the representations learned by the convolutional layers, are then combined and used to infer the high-level output. This unified architecture is trained by optimizing a global loss function including both low- and high-level tasks, thereby learning all the parameters within a joint framework. Our experimental results using the Lung Image Database Consortium (LIDC) show that the proposed method not only produces interpretable lung cancer predictions but also achieves significantly better results compared to using a 3D CNN alone.

9.
J Biomed Inform ; 69: 115-117, 2017 May.
Article in English | MEDLINE | ID: mdl-28366789

ABSTRACT

Through the increasing availability of more efficient data collection procedures, biomedical scientists are now confronting ever larger sets of data, often finding themselves struggling to process and interpret what they have gathered, even as still more data continue to accumulate. This torrent of biomedical information necessitates creative thinking about how the data are being generated, how they might be best managed and analyzed, and eventually how they can be transformed into further scientific understanding for improving patient care. Recognizing this as a major challenge, the National Institutes of Health (NIH) has spearheaded the "Big Data to Knowledge" (BD2K) program - the agency's most ambitious biomedical informatics effort undertaken to date. In this commentary, we describe how the NIH has taken on "big data" science head-on, how a consortium of leading research centers is developing the means for handling large-scale data, and how such activities are being marshalled for the training of a new generation of biomedical data scientists. All in all, the NIH BD2K program seeks to position data science at the heart of 21st Century biomedical research.


Subject(s)
Biomedical Research , Data Collection , National Institutes of Health (U.S.) , Humans , United States
10.
Sensors (Basel) ; 17(8), 2017 Aug 03.
Article in English | MEDLINE | ID: mdl-28771168

ABSTRACT

To address the need for asthma self-management in pediatrics, the authors present the feasibility of a mobile health (mHealth) platform built on their prior work in an asthmatic adult and child. Real-time asthma attack risk was assessed through physiological and environmental sensors. Data were sent to a cloud via a smartwatch application (app) using Health Insurance Portability and Accountability Act (HIPAA)-compliant cryptography and combined with online source data. A risk level (high, medium or low) was determined using a random forest classifier and then sent to the app to be visualized as animated dragon graphics for easy interpretation by children. The feasibility of the system was first tested on an adult with moderate asthma, then usability was examined on a child with mild asthma over several weeks. It was found during feasibility testing that the system is able to assess asthma risk with 80.10 ± 14.13% accuracy. During usability testing, it was able to continuously collect sensor data, and the child was able to wear, easily understand and enjoy the use of the system. If tested in more individuals, this system may lead to an effective self-management program that can reduce hospitalization in those who suffer from asthma.


Subject(s)
Asthma , Child , Humans , Self-Management , Telemedicine , User-Computer Interface , Wireless Technology
11.
Pervasive Mob Comput ; 28: 69-80, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27293387

ABSTRACT

Time series subsequence matching has importance in a variety of areas in healthcare informatics. These include case-based diagnosis and treatment as well as discovery of trends among patients. However, few medical systems employ subsequence matching due to high computational and memory complexities. This manuscript proposes a randomized Monte Carlo sampling method to broaden search criteria with minimal increases in computational and memory complexities over R-NN indexing. Information gain improves while producing result sets that approximate the theoretical result space, query results increase by several orders of magnitude, and recall is improved with no significant degradation to precision over R-NN matching.
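
For readers unfamiliar with subsequence matching, the sketch below shows only the brute-force baseline: slide the query over the series and keep every window within a distance threshold. It does not reproduce the Monte Carlo sampling method the abstract proposes. Reading "R-NN" as "neighbors within radius R", the Euclidean distance, the toy signal, and the name `range_subsequence_matches` are all assumptions for illustration.

```python
def range_subsequence_matches(series, query, radius):
    """Return (start_index, distance) for every window of `series`
    whose Euclidean distance to `query` is at most `radius`."""
    m = len(query)
    matches = []
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = sum((w - q) ** 2 for w, q in zip(window, query)) ** 0.5
        if dist <= radius:
            matches.append((start, dist))
    return matches


# Hypothetical toy signal containing two near-copies of the query shape.
signal = [0.0, 1.0, 2.0, 1.0, 0.0, 1.1, 2.1, 0.9, 0.0]
pattern = [1.0, 2.0, 1.0]
hits = range_subsequence_matches(signal, pattern, radius=0.5)
print(hits)
```

This scan costs time proportional to the series length times the query length for every query, which is the kind of computational burden the abstract cites as a barrier to adoption.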

12.
J Biomed Inform ; 55: 132-42, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25817919

ABSTRACT

The electronic health record (EHR) contains a diverse set of clinical observations that are captured as part of routine care, but the incomplete, inconsistent, and sometimes incorrect nature of clinical data poses significant impediments for its secondary use in retrospective studies or comparative effectiveness research. In this work, we describe an ontology-driven approach for extracting and analyzing data from the patient record in a longitudinal and continuous manner. We demonstrate how the ontology helps enforce consistent data representation, integrates phenotypes generated through analyses of available clinical data sources, and facilitates subsequent studies to identify clinical predictors for an outcome of interest. Development and evaluation of our approach are described in the context of studying factors that influence intracranial aneurysm (ICA) growth and rupture. We report our experiences in capturing information on 78 individuals with a total of 120 aneurysms. Two example applications related to assessing the relationship between aneurysm size, growth, gene expression modules, and rupture are described. Our work highlights the challenges with respect to data quality, workflow, and analysis of data and its implications toward a learning health system paradigm.


Subject(s)
Aneurysm, Ruptured/classification , Data Mining/methods , Databases, Factual , Electronic Health Records/organization & administration , Intracranial Aneurysm/classification , Vocabulary, Controlled , Biomedical Research/methods , Biomedical Research/organization & administration , Data Accuracy , Database Management Systems , Humans , Meaningful Use , Natural Language Processing , Systems Integration , User-Computer Interface
13.
Nat Commun ; 15(1): 5440, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38937447

ABSTRACT

Continuous renal replacement therapy (CRRT) is a form of dialysis prescribed to severely ill patients who cannot tolerate regular hemodialysis. However, as the patients are typically very ill to begin with, there is always uncertainty whether they will survive during or after CRRT treatment. Because of outcome uncertainty, a large percentage of patients treated with CRRT do not survive, utilizing scarce resources and raising false hope in patients and their families. To address these issues, we present a machine learning-based algorithm to predict short-term survival in patients being initiated on CRRT. We use information extracted from electronic health records from patients who were placed on CRRT at multiple institutions to train a model that predicts CRRT survival outcome; on a held-out test set, the model achieves an area under the receiver operating curve of 0.848 (CI = 0.822-0.870). Feature importance, error, and subgroup analyses provide insight into bias and relevant features for model prediction. Overall, we demonstrate the potential for predictive machine learning models to assist clinicians in alleviating the uncertainty of CRRT patient survival outcomes, with opportunities for future improvement through further data collection and advanced modeling.


Subject(s)
Algorithms , Continuous Renal Replacement Therapy , Machine Learning , Humans , Continuous Renal Replacement Therapy/methods , Male , Female , Middle Aged , Electronic Health Records , Aged , ROC Curve , Renal Replacement Therapy/methods , Renal Replacement Therapy/mortality
14.
Open Forum Infect Dis ; 11(2): ofae030, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38379573

ABSTRACT

Introduction: Initiation of medications for opioid use disorder (MOUD) within the hospital setting may improve outcomes for people who inject drugs (PWID) hospitalized because of an infection. Many studies used International Classification of Diseases (ICD) codes to identify PWID, although these may be misclassified and thus, inaccurate. We hypothesized that bias from misclassification of PWID using ICD codes may impact analyses of MOUD outcomes. Methods: We analyzed a cohort of 36 868 cases of patients diagnosed with Staphylococcus aureus bacteremia at 124 US Veterans Health Administration hospitals between 2003 and 2014. To identify PWID, we implemented an ICD code-based algorithm and a natural language processing (NLP) algorithm for classification of admission notes. We analyzed outcomes of prescribing MOUD as an inpatient using both approaches. Our primary outcome was 365-day all-cause mortality. We fit mixed-effects Cox regression models with receipt or not of MOUD during the index hospitalization as the primary predictor and 365-day mortality as the outcome. Results: NLP identified 2389 cases as PWID, whereas ICD codes identified 6804 cases as PWID. In the cohort identified by NLP, receipt of inpatient MOUD was associated with a protective effect on 365-day survival (adjusted hazard ratio, 0.48; 95% confidence interval, .29-.81; P < .01) compared with those not receiving MOUD. There was no significant effect of MOUD receipt in the cohort identified by ICD codes (adjusted hazard ratio, 1.00; 95% confidence interval, .77-1.30; P = .99). Conclusions: MOUD was protective of all-cause mortality when NLP was used to identify PWID, but not significant when ICD codes were used to identify the analytic subjects.

15.
JAMIA Open ; 7(1): ooae015, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38414534

ABSTRACT

Objectives: In the United States, end-stage kidney disease (ESKD) is responsible for high mortality and significant healthcare costs, with the number of cases sharply increasing in the past 2 decades. In this study, we aimed to reduce these impacts by developing an ESKD model for predicting its occurrence in a 2-year period. Materials and Methods: We developed a machine learning (ML) pipeline to test different models for the prediction of ESKD. The electronic health record was used to capture several kidney disease-related variables. Various imputation methods, feature selection, and sampling approaches were tested. We compared the performance of multiple ML models using area under the ROC curve (AUCROC), area under the Precision-Recall curve (PR-AUC), and Brier scores for discrimination, precision, and calibration, respectively. Explainability methods were applied to the final model. Results: Our best model was a gradient-boosting machine with feature selection and imputation methods as additional components. The model exhibited an AUCROC of 0.97, a PR-AUC of 0.33, and a Brier score of 0.002 on a holdout test set. A chart review analysis by expert physicians indicated clinical utility. Discussion and Conclusion: An ESKD prediction model can identify individuals at risk for ESKD and has been successfully deployed within our health system.
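
The three evaluation metrics named in the abstract above can be computed by hand on a small example. The sketch below is pure Python and uses made-up labels and predicted probabilities; the function names are hypothetical, and average precision is used as one common estimator of the area under the precision-recall curve, which may differ from the estimator the authors used.

```python
def auroc(y_true, y_score):
    """Probability that a random positive is scored above a random
    negative (ties count one half); equals the area under the ROC curve."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Mean of the precision values at the rank of each positive case.
    Tied scores are ordered arbitrarily in this simple version."""
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy predictions: 1 = event occurred, 0 = it did not.
labels = [0, 0, 1, 0, 1]
probs = [0.1, 0.4, 0.35, 0.2, 0.8]
print(auroc(labels, probs), average_precision(labels, probs), brier_score(labels, probs))
```

A high AUROC alongside a much lower PR-AUC, as reported in the abstract, is commonly seen when the outcome is rare, because precision is sensitive to the ratio of true positives to the many negatives while AUROC is not.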

16.
Res Sq ; 2023 Nov 14.
Article in English | MEDLINE | ID: mdl-38014280

ABSTRACT

Continuous renal replacement therapy (CRRT) is a form of dialysis prescribed to severely ill patients who cannot tolerate regular hemodialysis. However, as the patients are typically very ill to begin with, there is always uncertainty as to whether they will survive during or after CRRT treatment. Because of outcome uncertainty, a large percentage of patients treated with CRRT do not survive, utilizing scarce resources and raising false hope in patients and their families. To address these issues, we present a machine-learning-based algorithm to predict if patients will survive after being treated with CRRT. We use information extracted from electronic health records from patients who were placed on CRRT at multiple institutions to train a model that predicts CRRT survival outcome; on a held-out test set, the model achieved an area under the receiver operating curve of 0.929 (CI=0.917-0.942). Feature importance, error, and subgroup analyses consistently identified mean corpuscular volume as a driving feature for model predictions. Overall, we demonstrate the potential for predictive machine-learning models to assist clinicians in alleviating the uncertainty of CRRT patient survival outcomes, with opportunities for future improvement through further data collection and advanced modeling.

17.
AMIA Annu Symp Proc ; 2022: 709-718, 2022.
Article in English | MEDLINE | ID: mdl-37128415

ABSTRACT

Determining factors influencing patient participation in and adherence to cancer screening recommendations is key to successful cancer screening programs. However, the collection of variables necessary to anticipate patient behavior in cancer screening has not been systematically examined. Using lung cancer screening as a representative example, we conducted an exploratory analysis to characterize the current representations of 18 demographic, health-related, and psychosocial variables collected as part of a conceptual model to understand factors for lung cancer screening participation and adherence. Our analysis revealed a lack of standardization in controlled terminologies and common data elements for these variables. For example, only eight (44%) demographic and health-related variables were recorded consistently in the electronic health record. Multiple survey instruments could collect the remaining variables but were highly inconsistent in how variables were represented. This analysis suggests opportunities to establish standardized data formats for psychological, cognitive, social, and environmental variables to improve data collection.


Subject(s)
Early Detection of Cancer , Lung Neoplasms , Humans , Data Collection , Patient Participation , Demography
18.
JAMA Netw Open ; 5(3): e222037, 2022 Mar 01.
Article in English | MEDLINE | ID: mdl-35285922

ABSTRACT

Importance: Living alone, a key proxy of social isolation, is a risk factor for cardiovascular disease. In addition, Black race is associated with less optimal blood pressure (BP) control than in other racial or ethnic groups. However, it is not clear whether living arrangement status modifies the beneficial effects of intensive BP control on reduction in cardiovascular events among Black individuals. Objective: To examine whether the association of intensive BP control with cardiovascular events differs by living arrangement among Black individuals and non-Black individuals (eg, individuals who identified as Alaskan Native, American Indian, Asian, Native Hawaiian, Pacific Islander, White, or other) in the Systolic Blood Pressure Intervention Trial (SPRINT). Design, Setting, and Participants: This secondary analysis incorporated data from SPRINT, a multicenter study of individuals with increased risk for cardiovascular disease and free of diabetes, enrolled at 102 clinical sites in the United States between November 2010 and March 2013. Race and living arrangement (ie, living alone or living with others) were self-reported. Data were collected between November 2010 and March 2013 and analyzed from January 2021 to October 2021. Exposures: The SPRINT participants were randomized to a systolic BP target of either less than 120 mm Hg (intensive treatment group) or less than 140 mm Hg (standard treatment group). Antihypertensive medications were adjusted to achieve the targets in each group. Main Outcomes and Measures: Cox proportional hazards model was used to investigate the association of intensive treatment with the incident composite cardiovascular outcome (by August 20, 2015) according to living arrangement among Black individuals and other individuals. Transportability formula was applied to generalize the SPRINT findings to hypothetical external populations by varying the proportion of Black race and living arrangement status. 
Results: Among the 9342 total participants, the mean (SD) age was 67.9 (9.4) years; 2793 participants (30%) were Black, 2714 (29%) lived alone, and 3320 participants (35.5%) were female. Over a median (IQR) follow-up of 3.22 (2.74-3.76) years, the primary composite cardiovascular outcome was observed in 67 of 1001 Black individuals living alone (6.7%), 76 of 1792 Black individuals living with others (4.2%), 108 of 1713 non-Black individuals living alone (6.3%), and 311 of 4836 non-Black individuals living with others (6.4%). The intensive treatment group showed a significantly lower rate of the composite cardiovascular outcome than the standard treatment group among Black individuals living with others (hazard ratio [HR], 0.53 [95% CI, 0.33-0.85]) but not among those living alone (HR, 1.07 [95% CI, 0.66-1.73]; P for interaction = .04). The association was observed among individuals who were not Black regardless of living arrangement status. Using transportability, we found a smaller or null association between intensive control and cardiovascular outcomes among hypothetical populations with 60% or more Black individuals and 60% or more individuals living alone. Conclusions and Relevance: Intensive BP control was associated with a lower rate of cardiovascular events among Black individuals living with others and individuals who were not Black but not among Black individuals living alone. Trial Registration: ClinicalTrials.gov Identifier: NCT01206062.


Subject(s)
Cardiovascular Diseases , Hypertension , Aged , Antihypertensive Agents/pharmacology , Blood Pressure/physiology , Blood Pressure Determination , Cardiovascular Diseases/chemically induced , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/prevention & control , Female , Humans , Hypertension/diagnosis , Male
19.
JAMA Netw Open ; 5(8): e2225593, 2022 Aug 1.
Article in English | MEDLINE | ID: mdl-35939303

ABSTRACT

Importance: Overdose is one of the leading causes of death in the US; however, surveillance data lag considerably from medical examiner determination of the death to reporting in national surveillance reports. Objective: To automate the classification of deaths related to substances in medical examiner data using natural language processing (NLP) and machine learning (ML). Design, Setting, and Participants: Diagnostic study comparing different NLP and ML algorithms to identify substances related to overdose in 10 health jurisdictions in the US from January 1, 2020, to December 31, 2020. Unstructured text from 35 433 medical examiners' and coroners' death records was examined. Exposures: Text from each case was manually classified to a substance that was related to the death. Three feature representation methods were used and compared: term frequency-inverse document frequency (TF-IDF), global vectors for word representations (GloVe), and concept unique identifier (CUI) embeddings. Several ML algorithms were trained, and the best models were selected based on F-scores. The best models were tested on a hold-out test set and results were reported with 95% CIs. Main Outcomes and Measures: Text data from death certificates were classified as any opioid, fentanyl, alcohol, cocaine, methamphetamine, heroin, prescription opioid, and an aggregate of other substances. Diagnostic metrics and 95% CIs were calculated for each combination of feature extraction method and machine learning classifier. Results: Of 35 433 death records analyzed (decedent median age, 58 years [IQR, 41-72 years]; 24 449 [69%] were male), the most common substances related to deaths included any opioid (5739 [16%]), fentanyl (4758 [13%]), alcohol (2866 [8%]), cocaine (2247 [6%]), methamphetamine (1876 [5%]), heroin (1613 [5%]), prescription opioids (1197 [3%]), and any benzodiazepine (1076 [3%]).
The CUI embeddings had similar or better diagnostic metrics compared with word embeddings and TF-IDF for all substances except alcohol. ML classifiers had perfect or near-perfect performance in classifying deaths related to any opioids, heroin, fentanyl, prescription opioids, methamphetamine, cocaine, and alcohol. Classification of benzodiazepines was suboptimal using all 3 feature extraction methods. Conclusions and Relevance: In this diagnostic study, NLP/ML algorithms demonstrated excellent diagnostic performance at classifying substances related to overdoses. These algorithms should be integrated into workflows to decrease the lag time in reporting overdose surveillance data.
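
Of the three feature representations named in the abstract, TF-IDF is the simplest to show in a few lines. The sketch below uses one common weighting (relative term frequency times the natural log of N over document frequency); the abstract does not say which variant or tokenizer the study used, so the function name, the whitespace tokenization, and the example strings are all assumptions. The example strings are invented and are not real death records.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * ln(N / document frequency).
    This is one common TF-IDF variant, assumed here for illustration.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


if __name__ == "__main__":
    # INVENTED example strings, not taken from any death record.
    examples = ["acute fentanyl toxicity", "acute alcohol toxicity"]
    for text, vector in zip(examples, tf_idf(examples)):
        print(text, vector)
```

Terms shared by every document (here "acute" and "toxicity") receive a weight of zero, so the substance name is what distinguishes the two vectors.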


Subject(s)
Cocaine , Drug Overdose , Methamphetamine , Analgesics, Opioid , Benzodiazepines , Drug Overdose/epidemiology , Female , Fentanyl , Heroin , Humans , Male , Middle Aged , Natural Language Processing
20.
Open Forum Infect Dis ; 9(9): ofac471, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36168546

ABSTRACT

Background: Improving the identification of people who inject drugs (PWID) in electronic medical records can improve clinical decision making, risk assessment and mitigation, and health services research. Identification of PWID currently relies on heterogeneous, nonspecific International Classification of Diseases (ICD) codes as proxies. Natural language processing (NLP) and machine learning (ML) methods may have better diagnostic metrics than nonspecific ICD codes for identifying PWID. Methods: We manually reviewed 1000 records of patients diagnosed with Staphylococcus aureus bacteremia admitted to Veterans Health Administration hospitals from 2003 through 2014. The manual review was the reference standard. We developed and trained NLP/ML algorithms with and without regular expression filters for negation (NegEx) and compared these with 11 proxy combinations of ICD codes to identify PWID. Data were split 70% for training and 30% for testing. We calculated diagnostic metrics and estimated 95% confidence intervals (CIs) by bootstrapping the hold-out test set. Best models were determined by best F-score, a summary of sensitivity and positive predictive value. Results: Random forest models with and without NegEx were the best-performing NLP/ML algorithms in the training set. Random forest with NegEx outperformed all ICD-based algorithms. The F-score was 0.905 (95% CI, 0.786-0.967) for the best NLP/ML algorithm and 0.592 (95% CI, 0.550-0.632) for the best ICD-based algorithm. The NLP/ML algorithm had a sensitivity of 92.6% and specificity of 95.4%. Conclusions: NLP/ML outperformed ICD-based coding algorithms at identifying PWID in electronic health records. NLP/ML models should be considered in identifying cohorts of PWID to improve clinical decision making, health services research, and administrative surveillance.
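
The abstract describes the F-score as a summary of sensitivity and positive predictive value, with 95% CIs obtained by bootstrapping the hold-out test set. The sketch below shows one way this could be computed: the F1 form (harmonic mean) and a percentile bootstrap. Both choices, along with the function names, the number of resamples, and the toy labels, are assumptions for illustration; the abstract does not specify the bootstrap variant.

```python
import random


def f_score(y_true, y_pred):
    """F1: harmonic mean of sensitivity (recall) and positive predictive value."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # Covers undefined sensitivity or PPV.
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return 2 * sensitivity * ppv / (sensitivity + ppv)


def bootstrap_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the F-score (assumed variant)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample test-set cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f_score([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


if __name__ == "__main__":
    # TOY labels for illustration, not study data.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print("F-score:", f_score(y_true, y_pred))
    print("95% CI:", bootstrap_ci(y_true, y_pred))
```

Because only the hold-out cases are resampled, the interval reflects test-set sampling variability and not variability from retraining the model.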
