1.
J Am Soc Nephrol ; 32(1): 151-160, 2021 01.
Article in English | MEDLINE | ID: mdl-32883700

ABSTRACT

BACKGROUND: Early reports indicate that AKI is common among patients with coronavirus disease 2019 (COVID-19) and associated with worse outcomes. However, AKI among hospitalized patients with COVID-19 in the United States is not well described. METHODS: This retrospective, observational study involved a review of data from electronic health records of patients aged ≥18 years with laboratory-confirmed COVID-19 admitted to the Mount Sinai Health System from February 27 to May 30, 2020. We describe the frequency of AKI and dialysis requirement, AKI recovery, and adjusted odds ratios (aORs) with mortality. RESULTS: Of 3993 hospitalized patients with COVID-19, AKI occurred in 1835 (46%) patients; 347 (19%) of the patients with AKI required dialysis. The proportions with stages 1, 2, or 3 AKI were 39%, 19%, and 42%, respectively. A total of 976 (24%) patients were admitted to intensive care, and 745 (76%) experienced AKI. Of the 435 patients with AKI and urine studies, 84% had proteinuria, 81% had hematuria, and 60% had leukocyturia. Independent predictors of severe AKI were CKD, male sex, and higher serum potassium at admission. In-hospital mortality was 50% among patients with AKI versus 8% among those without AKI (aOR, 9.2; 95% confidence interval, 7.5 to 11.3). Of survivors with AKI who were discharged, 35% had not recovered to baseline kidney function by the time of discharge. An additional 28 of 77 (36%) patients who had not recovered kidney function at discharge did so on posthospital follow-up. CONCLUSIONS: AKI is common among patients hospitalized with COVID-19 and is associated with high mortality. Of all patients with AKI, only 30% survived with recovery of kidney function by the time of discharge.


Subjects
Acute Kidney Injury/etiology , COVID-19/complications , SARS-CoV-2 , Acute Kidney Injury/epidemiology , Acute Kidney Injury/therapy , Acute Kidney Injury/urine , Aged , Aged, 80 and over , COVID-19/mortality , Female , Hematuria/etiology , Hospital Mortality , Hospitals, Private/statistics & numerical data , Hospitals, Urban/statistics & numerical data , Humans , Incidence , Inpatients , Leukocytes , Male , Middle Aged , New York City/epidemiology , Proteinuria/etiology , Renal Dialysis , Retrospective Studies , Treatment Outcome , Urine/cytology
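
The proportions in the abstract above can be rechecked with simple arithmetic. The sketch below recomputes two of the quoted percentages and a crude (unadjusted) odds ratio from the rounded mortality figures of 50% and 8%. The function names are illustrative, and the crude value is not expected to match the reported aOR of 9.2, which was adjusted for covariates not listed in the abstract.

```python
def odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def crude_odds_ratio(p_exposed, p_unexposed):
    """Unadjusted odds ratio between two groups, given their event probabilities."""
    return odds(p_exposed) / odds(p_unexposed)


# Counts quoted in the abstract.
aki_fraction = 1835 / 3993        # patients with AKI among all hospitalized patients
dialysis_fraction = 347 / 1835    # patients requiring dialysis among those with AKI

# Rounded in-hospital mortality: 50% with AKI versus 8% without AKI.
crude_or = crude_odds_ratio(0.50, 0.08)

print(f"AKI: {aki_fraction:.0%}, dialysis: {dialysis_fraction:.0%}, crude OR: {crude_or:.1f}")
```

Because the inputs are rounded percentages, the crude odds ratio is only a rough consistency check against the adjusted estimate.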
2.
Europace ; 23(8): 1179-1191, 2021 08 06.
Article in English | MEDLINE | ID: mdl-33564873

ABSTRACT

In the recent decade, deep learning, a subset of artificial intelligence and machine learning, has been used to identify patterns in big healthcare datasets for disease phenotyping, event predictions, and complex decision making. Public datasets for electrocardiograms (ECGs) have existed since the 1980s and have been used for very specific tasks in cardiology, such as arrhythmia, ischemia, and cardiomyopathy detection. Recently, private institutions have begun curating large ECG databases that are orders of magnitude larger than the public databases for ingestion by deep learning models. These efforts have demonstrated not only improved performance and generalizability in these aforementioned tasks but also application to novel clinical scenarios. This review focuses on orienting the clinician towards fundamental tenets of deep learning, state-of-the-art prior to its use for ECG analysis, and current applications of deep learning on ECGs, as well as their limitations and future areas of improvement.


Subjects
Cardiology , Deep Learning , Artificial Intelligence , Electrocardiography , Humans , Machine Learning
3.
J Med Internet Res ; 23(2): e26107, 2021 02 22.
Article in English | MEDLINE | ID: mdl-33529156

ABSTRACT

BACKGROUND: Changes in autonomic nervous system function, characterized by heart rate variability (HRV), have been associated with infection and observed prior to its clinical identification. OBJECTIVE: We performed an evaluation of HRV collected by a wearable device to identify and predict COVID-19 and its related symptoms. METHODS: Health care workers in the Mount Sinai Health System were prospectively followed in an ongoing observational study using the custom Warrior Watch Study app, which was downloaded to their smartphones. Participants wore an Apple Watch for the duration of the study, measuring HRV throughout the follow-up period. Surveys assessing infection and symptom-related questions were obtained daily. RESULTS: Using a mixed-effect cosinor model, the mean amplitude of the circadian pattern of the standard deviation of the interbeat interval of normal sinus beats (SDNN), an HRV metric, differed between subjects with and without COVID-19 (P=.006). The mean amplitude of this circadian pattern differed between individuals during the 7 days before and the 7 days after a COVID-19 diagnosis compared to this metric during uninfected time periods (P=.01). Significant changes in the mean and amplitude of the circadian pattern of the SDNN were observed between the first day of reporting a COVID-19-related symptom compared to all other symptom-free days (P=.01). CONCLUSIONS: Longitudinally collected HRV metrics from a commonly worn commercial wearable device (Apple Watch) can predict the diagnosis of COVID-19 and identify COVID-19-related symptoms. Prior to the diagnosis of COVID-19 by nasal swab polymerase chain reaction testing, significant changes in HRV were observed, demonstrating the predictive ability of this metric to identify COVID-19 infection.


Subjects
COVID-19 Testing/methods , COVID-19/diagnosis , COVID-19/physiopathology , Heart Rate/physiology , Wearable Electronic Devices , Adult , COVID-19/virology , Circadian Rhythm/physiology , Female , Health Personnel , Humans , Male , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification
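
The cosinor model named in the abstract above describes a circadian signal as a mean level (mesor) plus a cosine with a fixed period, y(t) = M + A cos(2πt/T + φ). The sketch below fits a single-component cosinor to one series. It is a simplification and not the study's mixed-effect model: it has no random effects, and its closed-form estimates assume evenly spaced samples covering a whole number of periods. The function name and the synthetic SDNN values are hypothetical.

```python
import math


def fit_cosinor(times_h, values, period_h=24.0):
    """Single-component cosinor fit: y = mesor + amplitude * cos(omega * t + acrophase).

    Assumes evenly spaced samples spanning a whole number of periods, so the
    cosine and sine regressors are orthogonal and the least-squares solution
    reduces to simple averages.
    """
    n = len(values)
    omega = 2.0 * math.pi / period_h
    mesor = sum(values) / n
    beta = 2.0 / n * sum(v * math.cos(omega * t) for t, v in zip(times_h, values))
    gamma = 2.0 / n * sum(v * math.sin(omega * t) for t, v in zip(times_h, values))
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(-gamma, beta)
    return mesor, amplitude, acrophase


# Synthetic hourly SDNN-like values over two days (illustrative, not study data).
hours = list(range(48))
sdnn = [50.0 + 10.0 * math.cos(2.0 * math.pi * t / 24.0 - 1.0) for t in hours]
mesor, amplitude, acrophase = fit_cosinor(hours, sdnn)
```

Comparing the fitted amplitude between groups or time windows is the kind of contrast the abstract reports. Irregularly sampled data would need a general least-squares or mixed-model fit instead of these closed-form averages.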
4.
Clin Infect Dis ; 71(11): 2933-2938, 2020 12 31.
Article in English | MEDLINE | ID: mdl-32594164

ABSTRACT

BACKGROUND: There are limited data regarding the clinical impact of coronavirus disease 2019 (COVID-19) on people living with human immunodeficiency virus (PLWH). In this study, we compared outcomes for PLWH with COVID-19 to a matched comparison group. METHODS: We identified 88 PLWH hospitalized with laboratory-confirmed COVID-19 in our hospital system in New York City between 12 March and 23 April 2020. We collected data on baseline clinical characteristics, laboratory values, HIV status, treatment, and outcomes from this group and matched comparators (1 PLWH to up to 5 patients by age, sex, race/ethnicity, and calendar week of infection). We compared clinical characteristics and outcomes (death, mechanical ventilation, hospital discharge) for these groups, as well as cumulative incidence of death by HIV status. RESULTS: Patients did not differ significantly by HIV status by age, sex, or race/ethnicity due to the matching algorithm. PLWH hospitalized with COVID-19 had high proportions of HIV virologic control on antiretroviral therapy. PLWH had greater proportions of smoking (P < .001) and comorbid illness than uninfected comparators. There was no difference in COVID-19 severity on admission by HIV status (P = .15). Poor outcomes for hospitalized PLWH were frequent but similar to proportions in comparators; 18% required mechanical ventilation and 21% died during follow-up (compared with 23% and 20%, respectively). There was similar cumulative incidence of death over time by HIV status (P = .94). CONCLUSIONS: We found no differences in adverse outcomes associated with HIV infection for hospitalized COVID-19 patients compared with a demographically similar patient group.


Subjects
COVID-19 , Coronavirus , HIV Infections , COVID-19/mortality , COVID-19/therapy , HIV , HIV Infections/complications , HIV Infections/drug therapy , HIV Infections/epidemiology , Humans , New York City/epidemiology , Patient Discharge , Respiration, Artificial , SARS-CoV-2 , Treatment Outcome
5.
Brief Bioinform ; 19(6): 1236-1246, 2018 11 27.
Article in English | MEDLINE | ID: mdl-28481991

ABSTRACT

Gaining knowledge and actionable insights from complex, high-dimensional and heterogeneous biomedical data remains a key challenge in transforming health care. Various types of data have been emerging in modern biomedical research, including electronic health records, imaging, -omics, sensor data and text, which are complex, heterogeneous, poorly annotated and generally unstructured. Traditional data mining and statistical learning approaches typically need to first perform feature engineering to obtain effective and more robust features from those data, and then build prediction or clustering models on top of them. Both steps pose many challenges when the data are complicated and sufficient domain knowledge is lacking. The latest advances in deep learning technologies provide new effective paradigms to obtain end-to-end learning models from complex data. In this article, we review the recent literature on applying deep learning technologies to advance the health care domain. Based on the analyzed work, we suggest that deep learning approaches could be the vehicle for translating big biomedical data into improved human health. However, we also note limitations and needs for improved methods development and applications, especially in terms of ease-of-understanding for domain experts and citizen scientists. We discuss such challenges and suggest developing holistic and meaningful interpretable architectures to bridge deep learning models and human interpretability.


Subjects
Deep Learning , Delivery of Health Care/organization & administration , Computational Biology , Data Mining , Diagnostic Imaging , Electronic Health Records , Genomics , Humans , Telemedicine
6.
Brief Bioinform ; 19(4): 656-678, 2018 07 20.
Article in English | MEDLINE | ID: mdl-28200013

ABSTRACT

Increase in global population and growing disease burden due to the emergence of infectious diseases (Zika virus), multidrug-resistant pathogens, drug-resistant cancers (cisplatin-resistant ovarian cancer) and chronic diseases (arterial hypertension) necessitate effective therapies to improve health outcomes. However, the rapid increase in drug development cost demands innovative and sustainable drug discovery approaches. Drug repositioning, the discovery of new or improved therapies by reevaluation of approved or investigational compounds, addresses a significant gap in the public health setting and improves the productivity of drug development. As the number of drug repurposing investigations increases, a new opportunity has emerged to understand factors driving drug repositioning through systematic analyses of drugs, drug targets and associated disease indications. However, such analyses have so far been hampered by the lack of a centralized knowledgebase, benchmarking data sets and reporting standards. To address these knowledge and clinical needs, here, we present RepurposeDB, a collection of repurposed drugs, drug targets and diseases, which was assembled, indexed and annotated from public data. RepurposeDB combines information on 253 drugs [small molecules (74.30%) and protein drugs (25.29%)] and 1125 diseases. Using RepurposeDB data, we identified pharmacological (chemical descriptors, physicochemical features and absorption, distribution, metabolism, excretion and toxicity properties), biological (protein domains, functional process, molecular mechanisms and pathway cross talks) and epidemiological (shared genetic architectures, disease comorbidities and clinical phenotype similarities) factors mediating drug repositioning. Collectively, RepurposeDB is developed as the reference database for drug repositioning investigations. The pharmacological, biological and epidemiological principles of drug repositioning identified from the meta-analyses could augment therapeutic development.


Subjects
Computational Biology/methods , Databases, Factual , Disease , Drug Discovery , Drug Repositioning , Proteins/metabolism , Humans , Molecular Epidemiology , Proteins/genetics
7.
Bioinformatics ; 35(21): 4515-4518, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31214700

ABSTRACT

MOTIVATION: Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge. RESULTS: We present PatientExploreR, an extensible application built on the R/Shiny framework that interfaces with a relational database of EHR data in the Observational Medical Outcomes Partnership CDM format. PatientExploreR produces patient-level interactive and dynamic reports and facilitates visualization of clinical data without any programming required. It allows researchers to easily construct and export patient cohorts from the EHR for analysis with other software. This application could enable easier exploration of patient-level data for physicians and researchers. PatientExploreR can incorporate EHR data from any institution that employs the CDM for users with approved access. The software code is free and open source under the MIT license, enabling institutions to install and users to expand and modify the application for their own purposes. AVAILABILITY AND IMPLEMENTATION: PatientExploreR can be freely obtained from GitHub: https://github.com/BenGlicksberg/PatientExploreR. We provide instructions for how researchers with approved access to their institutional EHR can use this package. We also release an open sandbox server of synthesized patient data for users without EHR access to explore: http://patientexplorer.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Electronic Health Records , Software , Computers , Databases, Factual , Humans , Observational Studies as Topic
8.
J Med Internet Res ; 22(11): e24018, 2020 11 06.
Article in English | MEDLINE | ID: mdl-33027032

ABSTRACT

BACKGROUND: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking. OBJECTIVE: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points. METHODS: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19-positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions. RESULTS: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction. CONCLUSIONS: We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.


Subjects
Coronavirus Infections/diagnosis , Coronavirus Infections/mortality , Machine Learning/standards , Pneumonia, Viral/diagnosis , Pneumonia, Viral/mortality , Acute Kidney Injury/epidemiology , Adolescent , Adult , Aged , Aged, 80 and over , Betacoronavirus , COVID-19 , Cohort Studies , Electronic Health Records , Female , Hospital Mortality , Hospitalization/statistics & numerical data , Hospitals , Humans , Male , Middle Aged , New York City/epidemiology , Pandemics , Prognosis , ROC Curve , Risk Assessment/methods , Risk Assessment/standards , SARS-CoV-2 , Young Adult
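
The AUC-ROC values reported in the abstract above have a direct interpretation: the probability that a randomly chosen patient who had the event receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The sketch below computes that quantity from labels and scores. It is a generic illustration with made-up toy data and does not reproduce the study's XGBoost models.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 outcomes; scores: predicted risks (higher = riskier).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    # Count a win when the positive case outranks the negative one; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = died within the time window, 0 = survived (illustrative only).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc_roc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of patients; for cohorts of thousands, a rank-based implementation from a statistics library would be the practical choice.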
9.
Sensors (Basel) ; 20(5)2020 Mar 03.
Article in English | MEDLINE | ID: mdl-32138289

ABSTRACT

Sleep quality has been directly linked to cognitive function, quality of life, and a variety of serious diseases across many clinical domains. Standard methods for assessing sleep involve overnight studies in hospital settings, which are uncomfortable, expensive, not representative of real sleep, and difficult to conduct on a large scale. Recently, numerous commercial digital devices have been developed that record physiological data, such as movement, heart rate, and respiratory rate, which can act as a proxy for sleep quality in lieu of standard electroencephalogram recording equipment. The sleep-related output metrics from these devices include sleep staging and total sleep duration and are derived via proprietary algorithms that utilize a variety of these physiological recordings. Each device company makes different claims of accuracy and measures different features of sleep quality, and it is still unknown how well these devices correlate with one another and perform in a research setting. In this pilot study of 21 participants, we investigated whether sleep metric outputs from self-reported sleep metrics (SRSMs) and four sensors, specifically Fitbit Surge (a smart watch), Withings Aura (a sensor pad that is placed under a mattress), Hexoskin (a smart shirt), and Oura Ring (a smart ring), were related to known cognitive and psychological metrics, including the n-back test and Pittsburgh Sleep Quality Index (PSQI). We analyzed correlation between multiple device-related sleep metrics. Furthermore, we investigated relationships between these sleep metrics and cognitive scores across different timepoints and SRSM through univariate linear regressions. We found that correlations for sleep metrics between the devices across the sleep cycle were almost uniformly low, but still significant (P < 0.05). For cognitive scores, we found the Withings latency was statistically significant for afternoon and evening timepoints at P = 0.016 and P = 0.013. We did not find any significant associations between SRSMs and PSQI or cognitive scores. Additionally, Oura Ring's total sleep duration and efficiency in relation to the PSQI measure was statistically significant at P = 0.004 and P = 0.033, respectively. These findings can hopefully be used to guide future sensor-based sleep research.


Subjects
Environment , Sleep/physiology , Adult , Cognition , Female , Humans , Male , Pilot Projects , Self Report , Sleep Stages/physiology , Young Adult
10.
Brief Bioinform ; 18(1): 105-124, 2017 01.
Article in English | MEDLINE | ID: mdl-26876889

ABSTRACT

Monitoring and modeling biomedical, health care and wellness data from individuals and converging data on a population scale have tremendous potential to improve understanding of the transition from the healthy state of human physiology to the disease setting. Wellness monitoring devices and companion software applications capable of generating alerts and sharing data with health care providers or social networks are now available. The accessibility and clinical utility of such data for disease or wellness research are currently limited. Designing methods for streaming data capture, real-time data aggregation, machine learning, predictive analytics and visualization solutions to integrate wellness or health monitoring data elements with the electronic medical records (EMRs) maintained by health care providers permits better utilization. Integration of population-scale biomedical, health care and wellness data would help to stratify patients for active health management and to understand clinically asymptomatic patients and underlying illness trajectories. In this article, we discuss various health-monitoring devices, their ability to capture the unique state of health represented in a patient and their application in individualized diagnostics, prognosis, clinical or wellness intervention. We also discuss examples of translational bioinformatics approaches to integrating patient-generated data with existing EMRs, personal health records, patient portals and clinical data repositories. Briefly, translational bioinformatics methods, tools and resources are at the center of these advances in implementing real-time biomedical and health care analytics in the clinical setting. Furthermore, these advances are poised to play a significant role in clinical decision-making and implementation of data-driven medicine and wellness care.


Subjects
Computational Biology , Data Collection , Humans , Software
11.
J Biomed Inform ; 46(6): 1145-51, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24036004

ABSTRACT

Clinical text, such as clinical trial eligibility criteria, is largely underused in state-of-the-art medical search engines due to difficulties of accurate parsing. This paper proposes a novel methodology to derive a semantic index for clinical eligibility documents based on a controlled vocabulary of frequent tags, which are automatically mined from the text. We applied this method to eligibility criteria on ClinicalTrials.gov and report that frequent tags (1) define an effective and efficient index of clinical trials and (2) are unlikely to grow radically when the repository increases. We proposed to apply the semantic index to filter clinical trial search results and we concluded that frequent tags reduce the result space more efficiently than an uncontrolled set of UMLS concepts. Overall, unsupervised mining of frequent tags from clinical text leads to an effective semantic index for the clinical eligibility documents and promotes their computational reuse.


Subjects
Clinical Trials as Topic , Data Mining , Eligibility Determination , Humans
12.
J Biomed Inform ; 46(1): 33-9, 2013 Feb.
Article in English | MEDLINE | ID: mdl-22846169

ABSTRACT

OBJECTIVE: To identify Common Data Elements (CDEs) in eligibility criteria of multiple clinical trials studying the same disease using a human-computer collaborative approach. DESIGN: A set of free-text eligibility criteria from clinical trials on two representative diseases, breast cancer and cardiovascular diseases, was sampled to identify disease-specific eligibility criteria CDEs. In this proposed approach, a semantic annotator is used to recognize Unified Medical Language System (UMLS) terms within the eligibility criteria text. The Apriori algorithm is applied to mine frequent disease-specific UMLS terms, which are then filtered by a list of preferred UMLS semantic types, grouped by similarity based on the Dice coefficient, and, finally, manually reviewed. MEASUREMENTS: Standard precision, recall, and F-score of the CDEs recommended by the proposed approach were measured with respect to manually identified CDEs. RESULTS: Average precision and recall of the recommended CDEs for the two diseases were 0.823 and 0.797, respectively, leading to an average F-score of 0.810. In addition, the machine-powered CDEs covered 80% of the cardiovascular CDEs published by The American Heart Association and assigned by human experts. CONCLUSION: It is feasible and effort saving to use a human-computer collaborative approach to augment domain experts for identifying disease-specific CDEs from free-text clinical trial eligibility criteria.


Subjects
Clinical Trials as Topic , Cooperative Behavior , Man-Machine Systems , Patient Selection , Algorithms , Humans , Information Storage and Retrieval , Unified Medical Language System
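
Two formulas from the abstract above are easy to make concrete: the F-score as the harmonic mean of precision and recall, and the Dice coefficient used to group similar terms. The sketch below checks that the quoted precision (0.823) and recall (0.797) are consistent with the reported F-score of 0.810. Treating each term as a set of word tokens for the Dice computation is an assumption made here for illustration; the abstract does not say which units were compared.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def dice_coefficient(a, b):
    """Dice similarity between two collections, compared as sets: 2|A∩B| / (|A| + |B|)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty sets are identical
    return 2.0 * len(set_a & set_b) / (len(set_a) + len(set_b))


# Values quoted in the abstract.
reported_f = f_score(0.823, 0.797)

# Hypothetical term pair, tokenized by whitespace (an assumption for this sketch).
similarity = dice_coefficient("breast cancer".split(), "breast neoplasm".split())
```

Here `reported_f` rounds to 0.810, matching the abstract, and the two example terms share one of two tokens each, giving a Dice coefficient of 0.5.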
13.
J Biomed Inform ; 46(6): 1060-7, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23916863

ABSTRACT

OBJECTIVE: Information overload is a significant problem facing online clinical trial searchers. We present eTACTS, a novel interactive retrieval framework using common eligibility tags to dynamically filter clinical trial search results. MATERIALS AND METHODS: eTACTS mines frequent eligibility tags from free-text clinical trial eligibility criteria and uses these tags for trial indexing. After an initial search, eTACTS presents to the user a tag cloud representing the current results. When the user selects a tag, eTACTS retains only those trials containing that tag in their eligibility criteria and generates a new cloud based on tag frequency and co-occurrences in the remaining trials. The user can then select a new tag or unselect a previous tag. The process iterates until a manageable number of trials is returned. We evaluated eTACTS in terms of filtering efficiency, diversity of the search results, and user eligibility to the filtered trials using both qualitative and quantitative methods. RESULTS: eTACTS (1) rapidly reduced search results from over a thousand trials to ten; (2) highlighted trials that are generally not top-ranked by conventional search engines; and (3) retrieved a greater number of suitable trials than existing search engines. DISCUSSION: eTACTS enables intuitive clinical trial searches by indexing eligibility criteria with effective tags. User evaluation was limited to one case study and a small group of evaluators due to the long duration of the experiment. Although a larger-scale evaluation could be conducted, this feasibility study demonstrated significant advantages of eTACTS over existing clinical trial search engines. CONCLUSION: A dynamic eligibility tag cloud can potentially enhance state-of-the-art clinical trial search engines by allowing intuitive and efficient filtering of the search result space.


Subjects
Clinical Trials as Topic , Humans , Treatment Outcome
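
The interactive loop described in the abstract above (select a tag, keep only the trials containing it, rebuild the tag cloud from what remains) can be sketched in a few lines. This is a minimal illustration and not the eTACTS implementation: the trial identifiers and tags are invented, and the cloud is weighted by tag frequency only, whereas the abstract says eTACTS also uses co-occurrences.

```python
from collections import Counter


def filter_trials(trials, selected_tags):
    """Keep only trials whose eligibility tags include every selected tag."""
    return {tid: tags for tid, tags in trials.items() if selected_tags <= tags}


def tag_cloud(trials, selected_tags):
    """Count how often each not-yet-selected tag occurs in the given trials."""
    counts = Counter()
    for tags in trials.values():
        counts.update(tags - selected_tags)
    return counts


# Hypothetical index: trial identifier -> set of eligibility tags.
trials = {
    "trial-1": {"adult", "diabetes"},
    "trial-2": {"adult", "pregnant"},
    "trial-3": {"diabetes", "insulin"},
}

selected = {"diabetes"}                      # the user clicks one tag
remaining = filter_trials(trials, selected)  # trials still matching
cloud = tag_cloud(remaining, selected)       # tags offered for the next step
```

Repeating the two calls with a growing `selected` set narrows the results step by step; removing a tag from `selected` and recomputing from the full index undoes a selection.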
14.
JAMIA Open ; 5(4): ooac097, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36448021

ABSTRACT

Objective: Hypertension has long been recognized as one of the most important predisposing factors for cardiovascular diseases and mortality. In recent years, machine learning methods have shown potential in diagnostic and predictive approaches in chronic diseases. Electronic health records (EHRs) have emerged as a reliable source of longitudinal data. The aim of this study is to predict the onset of hypertension using modern deep learning (DL) architectures, specifically long short-term memory (LSTM) networks, and longitudinal EHRs. Materials and Methods: We compare this approach to the best performing models reported from previous works, particularly XGBoost, applied to aggregated features. Our work is based on data from 233 895 adult patients from a large health system in the United States. We divided our population into 2 distinct longitudinal datasets based on the diagnosis date. To ensure generalization to unseen data, we trained our models on the first dataset (dataset A "train and validation") using cross-validation, and then applied the models to a second dataset (dataset B "test") to assess their performance. We also experimented with 2 different time-windows before the onset of hypertension and evaluated the impact on model performance. Results: With the LSTM network, we were able to achieve an area under the receiver operating characteristic curve value of 0.98 in the "train and validation" dataset A and 0.94 in the "test" dataset B for a prediction time window of 1 year. Lipid disorders, type 2 diabetes, and renal disorders are found to be associated with incident hypertension. Conclusion: These findings show that DL models based on temporal EHR data can improve the identification of patients at high risk of hypertension and corresponding driving factors. In the long term, this work may support identifying individuals who are at high risk for developing hypertension and facilitate earlier intervention to prevent the future development of hypertension.

15.
Cardiovasc Digit Health J ; 3(5): 220-231, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36310683

ABSTRACT

Background: Electrocardiogram (ECG) deep learning (DL) has promise to improve the outcomes of patients with cardiovascular abnormalities. In ECG DL, researchers often use convolutional neural networks (CNNs) and traditionally use the full duration of raw ECG waveforms that create redundancies in feature learning and result in inaccurate predictions with large uncertainties. Objective: For enhancing these predictions, we introduced a sub-waveform representation that leverages the rhythmic pattern of ECG waveforms (data-centric approach) rather than changing the CNN architecture (model-centric approach). Results: We applied the proposed representation to a population with 92,446 patients to identify left ventricular dysfunction. We found that the sub-waveform representation increases the performance metrics compared to the full-waveform representation. We observed a 2% increase for area under the receiver operating characteristic curve and 10% increase for area under the precision-recall curve. We also carefully examined three reliability components of explainability, interpretability, and fairness. We provided an explanation for enhancements obtained by heartbeat alignment mechanism. By developing a new scoring system, we interpreted the clinical relevance of ECG features and showed that sub-waveform representation further pushes the scores towards clinical predictions. Finally, we showed that the new representation significantly reduces prediction uncertainties within subgroups that contributes to individual fairness. Conclusion: We expect that this added control over the granularity of ECG data will improve the DL modeling for new artificial intelligence technologies in the cardiovascular space.

16.
IEEE Trans Big Data ; 7(1): 38-44, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33768136

ABSTRACT

Traditional Machine Learning (ML) models have had limited success in predicting coronavirus disease 2019 (COVID-19) outcomes using Electronic Health Record (EHR) data, partially due to not effectively capturing the inter-connectivity patterns between various data modalities. In this work, we propose a novel framework that utilizes relational learning based on a heterogeneous graph model (HGM) for predicting mortality at different time windows in COVID-19 patients within the intensive care unit (ICU). We utilize the EHRs of one of the largest and most diverse patient populations across five hospitals in a major health system in New York City. In our model, we use an LSTM for processing time varying patient data and apply our proposed relational learning strategy in the final output layer along with other static features. Here, we replace the traditional softmax layer with a Skip-Gram relational learning strategy to compare the similarity between a patient and outcome embedding representation. We demonstrate that the construction of a HGM can robustly learn the patterns classifying patient representations of outcomes through leveraging patterns within the embeddings of similar patients. Our experimental results show that our relational learning-based HGM model achieves higher area under the receiver operating characteristic curve (auROC) than both comparator models in all prediction time windows, with dramatic improvements to recall.

17.
Patterns (N Y) ; 2(9): 100337, 2021 Sep 10.
Article in English | MEDLINE | ID: mdl-34553174

ABSTRACT

Robust phenotyping of patients from electronic health records (EHRs) at scale is a challenge in clinical informatics. Here, we introduce Phe2vec, an automated framework for disease phenotyping from EHRs based on unsupervised learning, and assess its effectiveness against standard rule-based algorithms from Phenotype KnowledgeBase (PheKB). Phe2vec is based on pre-computing embeddings of medical concepts and patients' clinical history. Disease phenotypes are then derived from a seed concept and its neighbors in the embedding space. Patients are linked to a disease if their embedded representation is close to the disease phenotype. Comparing Phe2vec and PheKB cohorts head-to-head using chart review, Phe2vec performed on par or better in nine out of ten diseases. Unlike other approaches, it can scale to any condition and was validated against widely adopted expert-based standards. Phe2vec aims to optimize clinical informatics research by augmenting current frameworks to characterize patients by condition and derive reliable disease cohorts.
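
The seed-and-neighbors idea described in this abstract can be illustrated with a minimal Python sketch. It is not the Phe2vec implementation: the toy concept vectors, the use of cosine similarity, the centroid as the phenotype representation, and the 0.8 linking threshold are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_phenotype(seed, embeddings, k):
    """Return the seed plus its k nearest concepts, and their centroid.

    Assumption: the phenotype is summarized by the mean vector of its
    member concepts; the abstract does not specify this detail.
    """
    seed_vec = embeddings[seed]
    others = [c for c in embeddings if c != seed]
    ranked = sorted(others, key=lambda c: cosine(seed_vec, embeddings[c]), reverse=True)
    members = [seed] + ranked[:k]
    dim = len(seed_vec)
    centroid = [sum(embeddings[c][i] for c in members) / len(members) for i in range(dim)]
    return members, centroid


def is_linked(patient_vec, centroid, threshold=0.8):
    """Link a patient to the disease if close enough (hypothetical threshold)."""
    return cosine(patient_vec, centroid) >= threshold


# Hypothetical 2-D concept embeddings, for illustration only.
concept_vecs = {
    "t2dm": [1.0, 0.0],
    "metformin": [0.9, 0.1],
    "hba1c": [0.8, 0.2],
    "fracture": [0.0, 1.0],
}
members, centroid = build_phenotype("t2dm", concept_vecs, k=2)
```

In this toy setup, a patient vector near the diabetes-related concepts would be linked to the phenotype, while one near the unrelated concept would not.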

18.
ArXiv ; 2021 Jan 11.
Article in English | MEDLINE | ID: mdl-33442560

ABSTRACT

Machine Learning (ML) models typically require large-scale, balanced training data to be robust, generalizable, and effective in the context of healthcare. This has been a major issue for developing ML models for the coronavirus disease 2019 (COVID-19) pandemic, where data are highly imbalanced, particularly within electronic health records (EHR) research. Conventional approaches in ML use cross-entropy loss (CEL), which often suffers from poor margin classification. For the first time, we show that contrastive loss (CL) improves the performance of CEL, especially for imbalanced EHR data and the related COVID-19 analyses. This study has been approved by the Institutional Review Board at the Icahn School of Medicine at Mount Sinai. We use EHR data from five hospitals within the Mount Sinai Health System (MSHS) to predict mortality, intubation, and intensive care unit (ICU) transfer in hospitalized COVID-19 patients over 24- and 48-hour time windows. We train two sequential architectures (RNN and RETAIN) using two loss functions (CEL and CL). Models are tested on a full sample data set, which contains all available data, and a restricted data set that emulates higher class imbalance. CL models consistently outperform CEL models with the restricted data set on these tasks, with differences ranging from 0.04 to 0.15 for AUPRC and 0.05 to 0.1 for AUROC. For the restricted sample, only the CL model maintains proper clustering and is able to identify important features, such as pulse oximetry. CL outperforms CEL in instances of severe class imbalance on three EHR outcomes with respect to three performance metrics: predictive power, clustering, and feature importance. We believe that the developed CL framework can be expanded and used for EHR ML work in general.

19.
Patterns (N Y) ; 2(12): 100389, 2021 Dec 10.
Article in English | MEDLINE | ID: mdl-34723227

ABSTRACT

Deep learning (DL) models typically require large-scale, balanced training data to be robust, generalizable, and effective in the context of healthcare. This has been a major issue for developing DL models for the coronavirus disease 2019 (COVID-19) pandemic, where data are highly class imbalanced. Conventional approaches in DL use cross-entropy loss (CEL), which often suffers from poor margin classification. We show that contrastive loss (CL) improves the performance of CEL, especially in imbalanced electronic health records (EHR) data for COVID-19 analyses. We use a diverse EHR dataset to predict three outcomes: mortality, intubation, and intensive care unit (ICU) transfer in hospitalized COVID-19 patients over multiple time windows. To compare the performance of CEL and CL, models are tested on the full dataset and a restricted dataset. CL models consistently outperform CEL models, with differences ranging from 0.04 to 0.15 for area under the precision and recall curve (AUPRC) and 0.05 to 0.1 for area under the receiver-operating characteristic curve (AUROC).
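
To make the two loss families concrete, the Python sketch below contrasts binary cross-entropy with a generic margin-based pairwise contrastive loss. The abstract does not give the exact CL formulation used, so the pairwise form here (squared distance for same-class pairs, squared hinge on the margin for different-class pairs) and the default margin are assumptions, not the authors' loss.

```python
import math


def binary_cross_entropy(p, y):
    """Cross-entropy for one predicted probability p and a label y in {0, 1}."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def pairwise_contrastive_loss(z1, z2, same_class, margin=1.0):
    """Generic margin-based contrastive loss on a pair of embeddings.

    Same-class pairs are pulled together (penalty grows with distance);
    different-class pairs are pushed apart until they are at least
    `margin` away. This is an assumed textbook form, for illustration.
    """
    d = math.dist(z1, z2)  # Euclidean distance (Python 3.8+)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Cross-entropy scores each sample against its label...
ce = binary_cross_entropy(0.5, 1)
# ...while the contrastive loss scores the geometry of embedding pairs.
pull = pairwise_contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True)
push = pairwise_contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False)
```

The design point is that the contrastive term acts on distances between embeddings, which is one way a margin between classes can be encouraged directly.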

20.
JMIR Med Inform ; 9(1): e24207, 2021 Jan 27.
Article in English | MEDLINE | ID: mdl-33400679

ABSTRACT

BACKGROUND: Machine learning models require large datasets that may be siloed across different health care institutions. Machine learning studies that focus on COVID-19 have been limited to single-hospital data, which limits model generalizability. OBJECTIVE: We aimed to use federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days. METHODS: Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System. Logistic regression with L1 regularization/least absolute shrinkage and selection operator (LASSO) and multilayer perceptron (MLP) models were trained by using local data at each site. We developed a pooled model with combined data from all 5 sites, and a federated model that only shared parameters with a central aggregator. RESULTS: The LASSO_federated model outperformed the LASSO_local model at 3 hospitals, and the MLP_federated model performed better than the MLP_local model at all 5 hospitals, as determined by the area under the receiver operating characteristic curve. The LASSO_pooled model outperformed the LASSO_federated model at all hospitals, and the MLP_federated model outperformed the MLP_pooled model at 2 hospitals. CONCLUSIONS: The federated learning of COVID-19 electronic health record data shows promise in developing robust predictive models without compromising patient privacy.
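
The parameter-sharing step described in the methods can be sketched as follows. The abstract does not state the aggregation rule, so this Python example assumes a sample-size-weighted average of site parameters (in the style of federated averaging); the function name and the toy numbers are hypothetical.

```python
def federated_average(site_updates):
    """Combine per-site parameter vectors at a central aggregator.

    site_updates: list of (parameters, n_samples) tuples, one per site.
    Only parameters and counts are exchanged; raw patient data stay
    at each site. Weighting by sample count is an assumption.
    """
    total = sum(n for _, n in site_updates)
    dim = len(site_updates[0][0])
    return [
        sum(params[i] * n for params, n in site_updates) / total
        for i in range(dim)
    ]


# Two hypothetical sites with different cohort sizes.
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
global_params = federated_average(updates)
```

In a full training loop, the aggregated parameters would be sent back to the sites for another round of local training; that loop is omitted here.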
