Results 1 - 20 of 46
1.
Eur Heart J ; 45(22): 2002-2012, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38503537

ABSTRACT

BACKGROUND AND AIMS: Early identification of cardiac structural abnormalities indicative of heart failure is crucial to improving patient outcomes. Chest X-rays (CXRs) are routinely conducted on a broad population of patients, presenting an opportunity to build scalable screening tools for structural abnormalities indicative of Stage B or worse heart failure with deep learning methods. In this study, a model was developed to identify severe left ventricular hypertrophy (SLVH) and dilated left ventricle (DLV) using CXRs. METHODS: A total of 71 589 unique CXRs from 24 689 different patients completed within 1 year of echocardiograms were identified. Labels for SLVH, DLV, and a composite label indicating the presence of either were extracted from echocardiograms. A deep learning model was developed and evaluated using area under the receiver operating characteristic curve (AUROC). Performance was additionally validated on 8003 CXRs from an external site and compared against visual assessment by 15 board-certified radiologists. RESULTS: The model yielded an AUROC of 0.79 (0.76-0.81) for SLVH, 0.80 (0.77-0.84) for DLV, and 0.80 (0.78-0.83) for the composite label, with similar performance on an external data set. The model outperformed all 15 individual radiologists for predicting the composite label and achieved a sensitivity of 71% vs. 66% against the consensus vote across all radiologists at a fixed specificity of 73%. CONCLUSIONS: Deep learning analysis of CXRs can accurately detect the presence of certain structural abnormalities and may be useful in early identification of patients with LV hypertrophy and dilation. As a resource to promote further innovation, 71 589 CXRs with adjoining echocardiographic labels have been made publicly available.


Subjects
Deep Learning; Hypertrophy, Left Ventricular; Radiography, Thoracic; Humans; Hypertrophy, Left Ventricular/diagnostic imaging; Radiography, Thoracic/methods; Female; Male; Middle Aged; Echocardiography/methods; Aged; Heart Failure/diagnostic imaging; Heart Ventricles/diagnostic imaging; ROC Curve
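The study above reports model discrimination as AUROC. As a quick illustration (toy data, not from the study), AUROC can be computed with the rank-based Mann-Whitney formulation in plain Python:

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored higher than a randomly chosen
    negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: scores mostly rank positives above negatives.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auroc(labels, scores))  # 8 of 9 positive/negative pairs correctly ordered
```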
2.
Pediatr Crit Care Med ; 25(1): 54-61, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-37966346

ABSTRACT

OBJECTIVES: Patient vital sign data charted in the electronic health record (EHR) are used for time-sensitive decisions, yet little is known about when these data become nominally available compared with when the vital sign was actually measured. The objective of this study was to determine the magnitude of any delay between when a vital sign was actually measured in a patient and when it nominally appears in the EHR. DESIGN: We performed a single-center retrospective cohort study. SETTING: Tertiary academic children's hospital. PATIENTS: A total of 5,458 patients were admitted to a PICU from January 2014 to December 2018. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: We analyzed entry and display times of all vital signs entered in the EHR. The primary outcome measurement was the time between vital sign occurrence and the nominal timing of the vital sign in the EHR. An additional outcome measurement was the frequency of batch charting. A total of 9,818,901 vital sign recordings occurred during the study period. Across the entire cohort, the median difference between time of occurrence and nominal time in the EHR was 00:41:58 (hours:minutes:seconds; interquartile range [IQR], 00:13:42-01:44:10). Lag in the first 24 hours of PICU admission was 00:47:34 (IQR 00:15:23-02:19:00); lag in the last 24 hours was 00:38:49 (IQR 00:13:09-01:29:22; p < 0.001). There were 1,892,143 occurrences of batch charting. CONCLUSIONS: This retrospective study shows a lag between vital sign occurrence and its appearance in the EHR, as well as a frequent practice of batch charting. The magnitude of the delay (median ~40 minutes) suggests that vital signs available in the EHR for clinical review and incorporation into clinical alerts may be outdated by the time they are available.


Subjects
Electronic Health Records; Vital Signs; Child; Humans; Retrospective Studies; Time Factors; Intensive Care Units, Pediatric
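The primary outcome above is a median (IQR) lag between measurement time and charting time. A minimal stdlib sketch of that computation on hypothetical timestamp pairs (note that Python's `statistics.quantiles` uses exclusive quartiles by default, so IQR bounds can differ slightly from other conventions):

```python
from datetime import datetime
from statistics import quantiles

# Hypothetical (measured_time, charted_time) pairs; the study compared
# when a vital sign occurred vs. when it nominally appeared in the EHR.
pairs = [
    ("2018-03-01 08:00", "2018-03-01 08:35"),
    ("2018-03-01 09:00", "2018-03-01 09:12"),
    ("2018-03-01 10:00", "2018-03-01 11:50"),
    ("2018-03-01 11:00", "2018-03-01 11:41"),
]
fmt = "%Y-%m-%d %H:%M"
lags_min = [
    (datetime.strptime(c, fmt) - datetime.strptime(m, fmt)).total_seconds() / 60
    for m, c in pairs
]
q1, med, q3 = quantiles(lags_min, n=4)  # quartiles of the lag distribution
print(f"median lag {med:.0f} min (IQR {q1:.2f}-{q3:.2f})")
```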
3.
Am J Transplant ; 22(5): 1372-1381, 2022 05.
Article in English | MEDLINE | ID: mdl-35000284

ABSTRACT

Deceased donor kidney allocation follows a ranked match-run of potential recipients. Organ procurement organizations (OPOs) are permitted to deviate from the mandated match-run in exceptional circumstances. Using match-run data for all deceased donor kidney transplants (Ktx) in the US between 2015 and 2019, we identified 1544 kidneys transplanted from 933 donors with an OPO-initiated allocation exception. Most OPOs (55/58) used this process at least once, but 3 OPOs performed 64% of the exceptions, and just 2 transplant centers received 25% of allocation exception Ktx. At 2 of the 3 outlier OPOs, these transplants increased 136% and 141% between 2015 and 2019, compared to only a 35% increase in all Ktx. Allocation exception donors had less favorable characteristics (median KDPI 70, 41% with history of hypertension), but only 29% had KDPI ≥ 85%, and the majority did not meet the traditional threshold for marginal kidneys. Allocation exception kidneys went to larger centers with higher offer acceptance ratios and to recipients with 2 fewer priority points, equivalent to 2 fewer years of waiting time. OPO-initiated exceptions for kidney allocation are growing increasingly frequent and more concentrated at a few outlier centers. Increasing pressure to improve organ utilization risks increasing out-of-sequence allocations, potentially exacerbating disparities in access to transplantation.


Subjects
Kidney Transplantation; Tissue and Organ Procurement; Transplants; Humans; Kidney; Tissue Donors
4.
Transpl Int ; 34(7): 1239-1250, 2021 07.
Article in English | MEDLINE | ID: mdl-33964036

ABSTRACT

Unfavourable procurement biopsy findings are the most common reason for deceased donor kidney discard in the United States. We sought to assess the association between biopsy findings and post-transplant outcomes when donor characteristics are accounted for. We used registry data to identify 1566 deceased donors of 3132 transplanted kidneys (2015-2020) with discordant right/left procurement biopsy classification and performed time-to-event analyses to determine the association between optimal histology and hazard of death-censored graft failure or death. We then repeated all analyses using a local cohort of 147 donors of kidney pairs with detailed procurement histology data available (2006-2016). Among transplanted kidney pairs in the national cohort, there were no significant differences in incidence of delayed graft function or primary nonfunction. Time to death-censored graft failure was not significantly different between recipients of optimal versus suboptimal kidneys. Results were similar in analyses using the local cohort. Regarding recipient survival, analysis of the national, but not local, cohort showed optimal kidneys were associated with a lower hazard of death (adjusted HR 0.68, 95% CI 0.52-0.90, P = 0.006). In conclusion, in a large national cohort of deceased donor kidney pairs with discordant right/left procurement biopsy findings, we found no association between histology and death-censored graft survival.


Subjects
Kidney Transplantation; Tissue and Organ Procurement; Biopsy; Donor Selection; Graft Survival; Humans; Kidney; Tissue Donors; Treatment Outcome; United States
5.
J Biomed Inform ; 121: 103870, 2021 09.
Article in English | MEDLINE | ID: mdl-34302957

ABSTRACT

Evidence-Based Medicine (EBM) encourages clinicians to seek the most reputable evidence. The quality of evidence is organized in a hierarchy in which randomized controlled trials (RCTs) are regarded as least biased. However, RCTs are plagued by poor generalizability, impeding the translation of clinical research to practice. Though the presence of poor external validity is known, the factors that contribute to poor generalizability have not been summarized and placed in a framework. We propose a new population-oriented conceptual framework to facilitate consistent and comprehensive evaluation of generalizability, replicability, and assessment of RCT study quality.


Subjects
Evidence-Based Medicine; Randomized Controlled Trials as Topic; Research Design
6.
J Biomed Inform ; 109: 103515, 2020 09.
Article in English | MEDLINE | ID: mdl-32771540

ABSTRACT

Causal inference often relies on the counterfactual framework, which requires that treatment assignment is independent of the outcome, known as strong ignorability. Approaches to enforcing strong ignorability in causal analyses of observational data include weighting and matching methods. Effect estimates, such as the average treatment effect (ATE), are then estimated as expectations under the re-weighted or matched distribution, P. The choice of P is important and can impact the interpretation of the effect estimate and the variance of effect estimates. In this work, instead of specifying P, we learn a distribution that simultaneously maximizes coverage and minimizes variance of ATE estimates. In order to learn this distribution, this research proposes a generative adversarial network (GAN)-based model called the Counterfactual χ-GAN (cGAN), which also learns feature-balancing weights and supports unbiased causal estimation in the absence of unobserved confounding. Our model minimizes the Pearson χ2-divergence, which we show simultaneously maximizes coverage and minimizes the variance of importance sampling estimates. To our knowledge, this is the first such application of the Pearson χ2-divergence. We demonstrate the effectiveness of cGAN in achieving feature balance relative to established weighting methods in simulation and with real-world medical data.


Subjects
Causality; Computer Simulation; Humans
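The abstract above states that minimizing the Pearson χ2-divergence simultaneously minimizes the variance of importance sampling estimates. For discrete distributions this identity is easy to check numerically: χ2(P‖Q) equals the variance under Q of the importance weights w = p/q, because E_Q[w] = 1. A stdlib sketch with illustrative distributions:

```python
def chi2_divergence(p, q):
    """Pearson chi-squared divergence between two discrete distributions:
    sum_i (p_i - q_i)^2 / q_i."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

def weight_variance(p, q):
    """Variance under q of the importance weights w_i = p_i / q_i."""
    w = [pi / qi for pi, qi in zip(p, q)]
    mean = sum(wi * qi for wi, qi in zip(w, q))  # E_q[w] = 1 exactly
    return sum((wi - mean) ** 2 * qi for wi, qi in zip(w, q))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
# The chi-squared divergence equals the variance of the importance
# weights, which is why minimizing it stabilizes ATE estimates.
print(chi2_divergence(p, q), weight_variance(p, q))
```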
7.
Proc Natl Acad Sci U S A ; 113(27): 7329-36, 2016 07 05.
Article in English | MEDLINE | ID: mdl-27274072

ABSTRACT

Observational research promises to complement experimental research by providing large, diverse populations that would be infeasible for an experiment. Observational research can test its own clinical hypotheses, and observational studies also can contribute to the design of experiments and inform the generalizability of experimental research. Understanding the diversity of populations and the variance in care is one component. In this study, the Observational Health Data Sciences and Informatics (OHDSI) collaboration created an international data network with 11 data sources from four countries, including electronic health records and administrative claims data on 250 million patients. All data were mapped to common data standards, patient privacy was maintained by using a distributed model, and results were aggregated centrally. Treatment pathways were elucidated for type 2 diabetes mellitus, hypertension, and depression. The pathways revealed that the world is moving toward more consistent therapy over time across diseases and across locations, but significant heterogeneity remains among sources, pointing to challenges in generalizing clinical trial results. Diabetes favored a single first-line medication, metformin, to a much greater extent than hypertension or depression. About 10% of diabetes and depression patients and almost 25% of hypertension patients followed a treatment pathway that was unique within the cohort. Aside from factors such as sample size and underlying population (academic medical center versus general population), electronic health records data and administrative claims data revealed similar results. Large-scale international observational research is feasible.


Subjects
Practice Patterns, Physicians'/statistics & numerical data; Antidepressive Agents/therapeutic use; Antihypertensive Agents/therapeutic use; Databases, Factual; Depression/drug therapy; Diabetes Mellitus, Type 2/drug therapy; Electronic Health Records; Humans; Hypertension/drug therapy; Hypoglycemic Agents/therapeutic use; Internationality; Medical Informatics
8.
J Med Internet Res ; 18(8): e205, 2016 08 02.
Article in English | MEDLINE | ID: mdl-27485315

ABSTRACT

BACKGROUND: Social media platforms are increasingly being used to support individuals in behavior change attempts, including smoking cessation. Examining the interactions of participants in health-related social media groups can help inform our understanding of how these groups can best be leveraged to facilitate behavior change. OBJECTIVE: The aim of this study was to analyze patterns of participation, self-reported smoking cessation length, and interactions within the National Cancer Institute's Facebook community for smoking cessation support. METHODS: Our sample consisted of approximately 4243 individuals who interacted (eg, posted, commented) on the public Smokefree Women Facebook page during the time of data collection. In Phase 1, social network visualizations and centrality measures were used to evaluate network structure and engagement. In Phase 2, an inductive, thematic qualitative content analysis was conducted with a subsample of 500 individuals, and correlational analysis was used to determine how participant engagement was associated with self-reported cessation length. RESULTS: Between February 2013 and March 2014, there were 875 posts and 4088 comments from approximately 4243 participants. Social network visualizations revealed the moderator's role in keeping the community together and distributing the most active participants. Correlation analyses suggest that engagement in the network was significantly inversely associated with cessation status (Spearman correlation coefficient = -0.14, P=.03, N=243). The content analysis of 1698 posts from 500 randomly selected participants identified the most frequent interactions in the community as providing support (43%, n=721) and announcing number of days smoke free (41%, n=689). CONCLUSIONS: These findings highlight the importance of the moderator for network engagement and provide helpful insights into the patterns and types of interactions participants are engaging in. This study adds knowledge of how the social network of a smoking cessation community behaves within the confines of a Facebook group.


Subjects
Smoking Cessation/methods; Social Behavior; Social Media/statistics & numerical data; Social Networking; Social Support; Adult; Data Collection; Female; Humans; Smoking Cessation/statistics & numerical data
9.
J Biomed Inform ; 58: 156-165, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26464024

ABSTRACT

We present the Unsupervised Phenome Model (UPhenome), a probabilistic graphical model for large-scale discovery of computational models of disease, or phenotypes. We tackle this challenge through the joint modeling of a large set of diseases and a large set of clinical observations. The observations are drawn directly from heterogeneous patient record data (notes, laboratory tests, medications, and diagnosis codes), and the diseases are modeled in an unsupervised fashion. We apply UPhenome to two qualitatively different mixtures of patients and diseases: records of extremely sick patients in the intensive care unit with constant monitoring, and records of outpatients regularly followed by care providers over multiple years. We demonstrate that the UPhenome model can learn from these different care settings, without any additional adaptation. Our experiments show that (i) the learned phenotypes combine the heterogeneous data types more coherently than baseline LDA-based phenotypes; (ii) they each represent single diseases rather than a mix of diseases more often than the baseline ones; and (iii) when applied to unseen patient records, they are correlated with the patients' ground-truth disorders. Code for training, inference, and quantitative evaluation is made available to the research community.


Subjects
Electronic Health Records; Learning; Probability; Humans; Phenotype
10.
Ann Neurol ; 74(1): 53-64, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23813945

ABSTRACT

OBJECTIVE: Seizures have been implicated as a cause of secondary brain injury, but the systemic and cerebral physiologic effects of seizures after acute brain injury are poorly understood. METHODS: We analyzed intracortical electroencephalographic (EEG) and multimodality physiological recordings in 48 comatose subarachnoid hemorrhage patients to better characterize the physiological response to seizures after acute brain injury. RESULTS: Intracortical seizures were seen in 38% of patients, and 8% had surface seizures. Intracortical seizures were accompanied by elevated heart rate (p = 0.001), blood pressure (p < 0.001), and respiratory rate (p < 0.001). There were trends for rising cerebral perfusion pressure (p = 0.03) and intracranial pressure (p = 0.06) after seizure onset. Intracortical seizure-associated increases in global brain metabolism, partial brain tissue oxygenation, and regional cerebral blood flow (rCBF) did not reach significance, but a trend for a pronounced delayed rCBF rise was seen for surface seizures (p = 0.08). Functional outcome was very poor for patients with severe background attenuation without seizures and best for those without severe attenuation or seizures (77% vs 0% dead or severely disabled, respectively). Outcome was intermediate for those with seizures independent of the background EEG, and worse for those with intracortical-only seizures than for those with both intracortical and scalp seizures (50% and 25% death or severe disability, respectively). INTERPRETATION: We replicated in humans complex physiologic processes associated with seizures after acute brain injury previously described in laboratory experiments and illustrated differences such as the delayed increase in rCBF. These real-world physiologic observations may permit more successful translation of laboratory research to the bedside.


Subjects
Epilepsy, Generalized/diagnosis; Epilepsy, Generalized/etiology; Subarachnoid Hemorrhage/complications; Aged; Electroencephalography; Female; Humans; Male; Middle Aged; Outcome Assessment, Health Care; Retrospective Studies
11.
Lancet Digit Health ; 6(1): e70-e78, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38065778

ABSTRACT

BACKGROUND: Preoperative risk assessments used in clinical practice are insufficient in their ability to identify risk for postoperative mortality. Deep-learning analysis of electrocardiography can identify hidden risk markers that can help to prognosticate postoperative mortality. We aimed to develop a prognostic model that accurately predicts postoperative mortality in patients undergoing medical procedures and who had received preoperative electrocardiographic diagnostic testing. METHODS: In a derivation cohort of preoperative patients with available electrocardiograms (ECGs) from Cedars-Sinai Medical Center (Los Angeles, CA, USA) between Jan 1, 2015 and Dec 31, 2019, a deep-learning algorithm was developed to leverage waveform signals to discriminate postoperative mortality. We randomly split patients (8:1:1) into subsets for training, internal validation, and final algorithm test analyses. Model performance was assessed using area under the receiver operating characteristic curve (AUC) values in the hold-out test dataset and in two external hospital cohorts and compared with the established Revised Cardiac Risk Index (RCRI) score. The primary outcome was post-procedural mortality across three health-care systems. FINDINGS: 45 969 patients had a complete ECG waveform image available for at least one 12-lead ECG performed within the 30 days before the procedure date (59 975 inpatient procedures and 112 794 ECGs): 36 839 patients in the training dataset, 4549 in the internal validation dataset, and 4581 in the internal test dataset. In the held-out internal test cohort, the algorithm discriminated mortality with an AUC value of 0.83 (95% CI 0.79-0.87), surpassing the discrimination of the RCRI score with an AUC of 0.67 (0.61-0.72). The algorithm similarly discriminated risk for mortality in two independent US health-care systems, with AUCs of 0.79 (0.75-0.83) and 0.75 (0.74-0.76), respectively. Patients determined to be high risk by the deep-learning model had an unadjusted odds ratio (OR) of 8.83 (5.57-13.20) for postoperative mortality, compared with an unadjusted OR of 2.08 (0.77-3.50) for RCRI scores of more than 2. The deep-learning algorithm performed similarly for patients undergoing cardiac surgery (AUC 0.85 [0.77-0.92]), non-cardiac surgery (AUC 0.83 [0.79-0.88]), and catheterisation or endoscopy suite procedures (AUC 0.76 [0.72-0.81]). INTERPRETATION: A deep-learning algorithm interpreting preoperative ECGs can improve discrimination of postoperative mortality. The deep-learning algorithm worked equally well for risk stratification of cardiac surgeries, non-cardiac surgeries, and catheterisation laboratory procedures, and was validated in three independent health-care systems. This algorithm can provide additional information to clinicians making the decision to perform medical procedures and stratify the risk of future complications. FUNDING: National Heart, Lung, and Blood Institute.


Subjects
Deep Learning; Humans; Risk Assessment/methods; Algorithms; Prognosis; Electrocardiography
12.
J Am Med Inform Assoc ; 30(6): 1022-1031, 2023 05 19.
Article in English | MEDLINE | ID: mdl-36921288

ABSTRACT

OBJECTIVE: To develop a computable representation for medical evidence and to contribute a gold standard dataset of annotated randomized controlled trial (RCT) abstracts, along with a natural language processing (NLP) pipeline for transforming free-text RCT evidence in PubMed into the structured representation. MATERIALS AND METHODS: Our representation, EvidenceMap, consists of 3 levels of abstraction: Medical Evidence Entity, Proposition, and Map, to represent the hierarchical structure of medical evidence composition. Randomly selected RCT abstracts were annotated following EvidenceMap based on the consensus of 2 independent annotators to train an NLP pipeline. Via a user study, we measured how EvidenceMap improved evidence comprehension and analyzed its representative capacity by comparing the evidence annotation with EvidenceMap representation and without following any specific guidelines. RESULTS: Two corpora including 229 disease-agnostic and 80 COVID-19 RCT abstracts were annotated, yielding 12 725 entities and 1602 propositions. EvidenceMap saves users 51.9% of the time compared to reading raw-text abstracts. Most evidence elements identified during the freeform annotation were successfully represented by EvidenceMap, and users gave the enrollment, study design, and study results sections mean 5-point Likert ratings of 4.85, 4.70, and 4.20, respectively. The end-to-end evaluations of the pipeline show that the evidence proposition formulation achieves F1 scores of 0.84 and 0.86 in the adjusted Rand index. CONCLUSIONS: EvidenceMap extends the participant, intervention, comparator, and outcome framework into 3 levels of abstraction for transforming free-text evidence from the clinical literature into a computable structure. It can be used as an interoperable format for better evidence retrieval and synthesis and an interpretable representation to efficiently comprehend RCT findings.


Subjects
COVID-19; Comprehension; Humans; Natural Language Processing; PubMed
13.
AMIA Annu Symp Proc ; 2023: 289-298, 2023.
Article in English | MEDLINE | ID: mdl-38222422

ABSTRACT

Complete and accurate race and ethnicity (RE) patient information is important for many areas of biomedical informatics research, such as defining and characterizing cohorts, performing quality assessments, and identifying health inequities. Patient-level RE data is often inaccurate or missing in structured sources, but can be supplemented through clinical notes and natural language processing (NLP). While NLP has made many improvements in recent years with large language models, bias remains an often-unaddressed concern, with research showing that harmful and negative language is more often used for certain racial/ethnic groups than others. We present an approach to audit the learned associations of models trained to identify RE information in clinical text by measuring the concordance between model-derived salient features and manually identified RE-related spans of text. We show that while models perform well on the surface, there exist concerning learned associations and potential for future harms from RE-identification models if left unaddressed.


Subjects
Deep Learning; Ethnicity; Humans; Language; Natural Language Processing
14.
J Am Coll Cardiol ; 80(6): 613-626, 2022 08 09.
Article in English | MEDLINE | ID: mdl-35926935

ABSTRACT

BACKGROUND: Valvular heart disease is an important contributor to cardiovascular morbidity and mortality and remains underdiagnosed. Deep learning analysis of electrocardiography (ECG) may be useful in detecting aortic stenosis (AS), aortic regurgitation (AR), and mitral regurgitation (MR). OBJECTIVES: This study aimed to develop ECG deep learning algorithms to identify moderate or severe AS, AR, and MR alone and in combination. METHODS: A total of 77,163 patients undergoing ECG within 1 year before echocardiography from 2005-2021 were identified and split into train (n = 43,165), validation (n = 12,950), and test sets (n = 21,048; 7.8% with any of AS, AR, or MR). Model performance was assessed using area under the receiver-operating characteristic (AU-ROC) and precision-recall curves. Outside validation was conducted on an independent data set. Test accuracy was modeled using different disease prevalence levels to simulate screening efficacy using the deep learning model. RESULTS: The deep learning algorithm model accuracy was as follows: AS (AU-ROC: 0.88), AR (AU-ROC: 0.77), MR (AU-ROC: 0.83), and any of AS, AR, or MR (AU-ROC: 0.84; sensitivity 78%, specificity 73%) with similar accuracy in external validation. In screening program modeling, test characteristics were dependent on underlying prevalence and selected sensitivity levels. At a prevalence of 7.8%, the positive and negative predictive values were 20% and 97.6%, respectively. CONCLUSIONS: Deep learning analysis of the ECG can accurately detect AS, AR, and MR in this multicenter cohort and may serve as the basis for the development of a valvular heart disease screening program.


Subjects
Aortic Valve Insufficiency; Aortic Valve Stenosis; Deep Learning; Heart Valve Diseases; Mitral Valve Insufficiency; Aortic Valve Insufficiency/diagnosis; Aortic Valve Stenosis/diagnosis; Electrocardiography; Heart Valve Diseases/diagnosis; Heart Valve Diseases/epidemiology; Humans; Mitral Valve Insufficiency/diagnosis; Mitral Valve Insufficiency/epidemiology
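The screening-program modeling in the abstract above derives predictive values from sensitivity, specificity, and prevalence via Bayes' rule. A quick sketch at the reported operating point (sensitivity 78%, specificity 73%, prevalence 7.8%) reproduces the stated PPV and NPV up to rounding:

```python
def ppv_npv(sens, spec, prev):
    """Positive and negative predictive values via Bayes' rule."""
    tp = sens * prev            # true-positive mass
    fp = (1 - spec) * (1 - prev)  # false-positive mass
    tn = spec * (1 - prev)      # true-negative mass
    fn = (1 - sens) * prev      # false-negative mass
    return tp / (tp + fp), tn / (tn + fn)

# Operating point reported for the composite label at 7.8% prevalence.
ppv, npv = ppv_npv(0.78, 0.73, 0.078)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")  # close to the reported 20% and 97.6%
```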
15.
J Am Med Inform Assoc ; 28(4): 812-823, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33367705

ABSTRACT

OBJECTIVE: The study sought to develop and evaluate a knowledge-based data augmentation method to improve the performance of deep learning models for biomedical natural language processing by overcoming training data scarcity. MATERIALS AND METHODS: We extended the easy data augmentation (EDA) method for biomedical named entity recognition (NER) by incorporating Unified Medical Language System (UMLS) knowledge, and called this method UMLS-EDA. We designed experiments to systematically evaluate the effect of UMLS-EDA on popular deep learning architectures for both NER and classification. We also compared UMLS-EDA to BERT. RESULTS: UMLS-EDA enables substantial improvement for NER tasks over the original long short-term memory conditional random fields (LSTM-CRF) model (micro-F1 score: +5%, +17%, and +15%), helps the LSTM-CRF model (micro-F1 score: 0.66) outperform LSTM-CRF with transfer learning by BERT (0.63), and improves the performance of the state-of-the-art sentence classification model. The largest gain in micro-F1 score is 9%, from 0.75 to 0.84, better than classifiers with BERT pretraining (0.82). CONCLUSIONS: This study presents a UMLS-based data augmentation method, UMLS-EDA. It is effective at improving deep learning models for both NER and sentence classification, and contributes original insights for designing new, superior deep learning approaches for low-resource biomedical domains.


Subjects
Biomedical Research; Information Storage and Retrieval/methods; Natural Language Processing; Unified Medical Language System; Data Management
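UMLS-EDA extends easy data augmentation with UMLS-derived synonyms. A toy sketch of the synonym-replacement step, with a small hypothetical synonym table standing in for UMLS concept lookups (the entries below are illustrative, not drawn from UMLS):

```python
import random

# Hypothetical synonym table standing in for UMLS synonym lookup.
SYNONYMS = {
    "pain": ["discomfort", "ache"],
    "severe": ["acute", "intense"],
    "headache": ["cephalalgia"],
}

def synonym_replace(tokens, n_swaps, rng):
    """EDA-style augmentation: replace up to n_swaps eligible tokens
    with a randomly chosen synonym."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in SYNONYMS]
    rng.shuffle(candidates)          # pick replacement positions at random
    for i in candidates[:n_swaps]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

rng = random.Random(0)
print(synonym_replace("patient reports severe headache".split(), 2, rng))
```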
16.
J Am Med Inform Assoc ; 28(8): 1703-1711, 2021 07 30.
Article in English | MEDLINE | ID: mdl-33956981

ABSTRACT

OBJECTIVE: We introduce Medical evidence Dependency (MD)-informed attention, a novel neuro-symbolic model for understanding free-text clinical trial publications with generalizability and interpretability. MATERIALS AND METHODS: We trained one head in the multi-head self-attention model to attend to Medical evidence Dependency (MD) and to pass linguistic and domain knowledge on to later layers (MD informed). This MD-informed attention model was integrated into BioBERT and tested on 2 public machine reading comprehension benchmarks for clinical trial publications: Evidence Inference 2.0 and PubMedQA. We also curated a small set of recently published articles reporting randomized controlled trials on COVID-19 (coronavirus disease 2019) following the Evidence Inference 2.0 guidelines to evaluate the model's robustness to unseen data. RESULTS: The integration of the MD-informed attention head improves BioBERT substantially in both benchmark tasks, with gains as large as +30% in F1 score, and achieves the new state-of-the-art performance on Evidence Inference 2.0. It achieves 84% and 82% in overall accuracy and F1 score, respectively, on the unseen COVID-19 data. CONCLUSIONS: MD-informed attention empowers neural reading comprehension models with interpretability and generalizability via reusable domain knowledge. Its compositionality can benefit any transformer-based architecture for machine reading comprehension of free-text medical evidence.


Subjects
Artificial Intelligence; Clinical Trials as Topic; Information Storage and Retrieval/methods; Models, Neurological; Natural Language Processing; COVID-19; Computer Simulation; Data Mining; Humans; Software
17.
Adv Neural Inf Process Syst ; 34: 2160-2172, 2021 Dec.
Article in English | MEDLINE | ID: mdl-35859987

ABSTRACT

Deep models trained through maximum likelihood have achieved state-of-the-art results for survival analysis. Despite this training scheme, practitioners evaluate models under other criteria, such as binary classification losses at a chosen set of time horizons, e.g. Brier score (BS) and Bernoulli log likelihood (BLL). Models trained with maximum likelihood may have poor BS or BLL since maximum likelihood does not directly optimize these criteria. Directly optimizing criteria like BS requires inverse-weighting by the censoring distribution. However, estimating the censoring model under these metrics requires inverse-weighting by the failure distribution. The objective for each model requires the other, but neither are known. To resolve this dilemma, we introduce Inverse-Weighted Survival Games. In these games, objectives for each model are built from re-weighted estimates featuring the other model, where the latter is held fixed during training. When the loss is proper, we show that the games always have the true failure and censoring distributions as a stationary point. This means models in the game do not leave the correct distributions once reached. We construct one case where this stationary point is unique. We show that these games optimize BS on simulations and then apply these principles on real world cancer and critically-ill patient data.
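The Brier score at a fixed horizon with inverse-probability-of-censoring weighting, as discussed in the abstract above, can be sketched as follows. This is a simplification that assumes the censoring survival function G is known; the paper's point is precisely that in practice G must itself be estimated, creating the circular dependency the games resolve:

```python
def ipcw_brier(times, events, surv_probs, horizon, G):
    """Inverse-probability-of-censoring-weighted Brier score at a fixed
    horizon. surv_probs[i] is the model's predicted P(T_i > horizon);
    G(t) is the censoring survival function (assumed known here)."""
    total = 0.0
    for t, d, s in zip(times, events, surv_probs):
        if t <= horizon and d == 1:   # observed failure before the horizon
            total += (0.0 - s) ** 2 / G(t)
        elif t > horizon:             # still at risk at the horizon
            total += (1.0 - s) ** 2 / G(horizon)
        # censored before the horizon: contributes 0; the weights
        # compensate for these dropped observations
    return total / len(times)

# Toy data with no censoring, so G is identically 1 and the score
# reduces to the ordinary Brier score.
times = [2, 5, 8, 10]
events = [1, 1, 1, 1]
preds = [0.1, 0.3, 0.8, 0.9]  # predicted P(survive past t=6)
print(ipcw_brier(times, events, preds, horizon=6, G=lambda t: 1.0))
```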

18.
J Am Med Inform Assoc ; 28(9): 1970-1976, 2021 08 13.
Article in English | MEDLINE | ID: mdl-34151966

ABSTRACT

Clinical notes present a wealth of information for applications in the clinical domain, but heterogeneity across clinical institutions and settings presents challenges for their processing. The clinical natural language processing field has made strides in overcoming domain heterogeneity, and pretrained deep learning models present opportunities to transfer knowledge from one task to another. Pretrained models have performed well when transferred to new tasks; however, it is not well understood whether these models generalize across differences in institutions and settings within the clinical domain. We explore whether institution- or setting-specific pretraining is necessary for pretrained models to perform well when transferred to new tasks. We find no significant performance difference between models pretrained across institutions and settings, indicating that clinically pretrained models transfer well across such boundaries. Given a clinically pretrained model, clinical natural language processing researchers may forgo the time-consuming pretraining step without a significant performance drop.


Subjects
Deep Learning , Humans , Natural Language Processing , Researchers
19.
J Am Med Inform Assoc ; 28(9): 1955-1963, 2021 08 13.
Article in English | MEDLINE | ID: mdl-34270710

ABSTRACT

OBJECTIVE: To propose an algorithm that utilizes only the timestamps of longitudinal electronic health record data to classify clinical deterioration events. MATERIALS AND METHODS: This retrospective study explores the efficacy of machine learning algorithms in classifying clinical deterioration events among patients in intensive care units using sequences of timestamps of vital sign measurements, flowsheet comments, order entries, and nursing notes. We design a data pipeline to partition events into discrete, regular time bins that we refer to as timesteps. Logistic regressions, random forest classifiers, and recurrent neural networks are trained on datasets with different numbers of timesteps against a composite outcome of death, cardiac arrest, and Rapid Response Team calls. These models are then validated on a holdout dataset. RESULTS: A total of 6720 intensive care unit encounters meet the criteria, and the final dataset includes 830 578 timestamps. The gated recurrent unit model utilizes timestamps of vital signs, order entries, flowsheet comments, and nursing notes to achieve the best performance on the time-to-outcome dataset, with an area under the precision-recall curve of 0.101 (0.06, 0.137), a sensitivity of 0.443, and a positive predictive value of 0.092 at a threshold of 0.6. DISCUSSION AND CONCLUSION: This study demonstrates that recurrent neural network models using only the timestamps of longitudinal electronic health record data, which reflect healthcare processes, achieve good discriminative performance.
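The timestamp-only featurization described above can be sketched as per-channel event counts over fixed-width time bins; the values of the measurements themselves are discarded. A rough illustration only; the bin width, channel names, and data are assumptions, not the study's actual pipeline.

```python
# Illustrative sketch of a timestamp-only featurization: each event
# stream (vitals, orders, notes) is reduced to counts per fixed-width
# time bin ("timestep"). Bin width and channel names are assumptions.
import numpy as np

def bin_timestamps(streams, start, end, bin_hours=1.0):
    """streams: dict channel -> list of event times (hours).
    Returns an (n_bins, n_channels) count matrix and the channel order."""
    edges = np.arange(start, end + bin_hours, bin_hours)
    channels = sorted(streams)
    mat = np.stack(
        [np.histogram(np.asarray(streams[c]), bins=edges)[0] for c in channels],
        axis=1,
    )
    return mat, channels

streams = {
    "vitals": [0.2, 0.5, 1.1, 2.9],
    "orders": [1.4],
    "notes": [0.1, 2.2, 2.8],
}
X, channels = bin_timestamps(streams, start=0.0, end=3.0)
```

A sequence model such as a GRU would then consume the rows of `X` (one row per timestep) to predict the composite outcome.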


Subjects
Clinical Deterioration , Electronic Health Records , Humans , Machine Learning , Retrospective Studies , Vital Signs
20.
J Am Med Inform Assoc ; 28(7): 1480-1488, 2021 07 14.
Article in English | MEDLINE | ID: mdl-33706377

ABSTRACT

OBJECTIVE: Coronavirus disease 2019 (COVID-19) patients are at risk for resource-intensive outcomes including mechanical ventilation (MV), renal replacement therapy (RRT), and readmission. Accurate outcome prognostication could facilitate hospital resource allocation. We develop and validate predictive models for each outcome using retrospective electronic health record data for COVID-19 patients treated between March 2 and May 6, 2020. MATERIALS AND METHODS: For each outcome, we trained 3 classes of prediction models using clinical data for a cohort of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2)-positive patients (n = 2256). Cross-validation was used to select the best-performing models per the areas under the receiver-operating characteristic and precision-recall curves. Models were validated using a held-out cohort (n = 855). We measured each model's calibration and evaluated feature importances to interpret model output. RESULTS: The predictive performance for our selected models on the held-out cohort was as follows: area under the receiver-operating characteristic curve: MV 0.743 (95% CI, 0.682-0.812), RRT 0.847 (95% CI, 0.772-0.936), readmission 0.871 (95% CI, 0.830-0.917); area under the precision-recall curve: MV 0.137 (95% CI, 0.047-0.175), RRT 0.325 (95% CI, 0.117-0.497), readmission 0.504 (95% CI, 0.388-0.604). Predictions were well calibrated, and the most important features within each model were consistent with clinical intuition. DISCUSSION: Our models produce performant, well-calibrated, and interpretable predictions for COVID-19 patients at risk for the target outcomes. They demonstrate the potential to accurately estimate outcome prognosis in resource-constrained care sites managing COVID-19 patients. CONCLUSIONS: We develop and validate prognostic models targeting MV, RRT, and readmission for hospitalized COVID-19 patients that produce accurate, interpretable predictions. Additional external validation studies are needed to further verify the generalizability of our results.
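The evaluation protocol above, ranking candidate models by area under the ROC curve before reporting on a held-out cohort, rests on a simple rank statistic: the probability that a random positive case outscores a random negative one. A dependency-free sketch with synthetic data; the scores and labels are illustrative assumptions, not the study's models.

```python
# Rank-based AUROC: the probability that a randomly chosen positive
# receives a higher score than a randomly chosen negative (ties count
# half). Data here are synthetic, purely for illustration.
import numpy as np

def auroc(y_true, scores):
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=200)
good = y + rng.normal(scale=0.8, size=200)  # informative candidate model
bad = rng.normal(size=200)                   # uninformative candidate model
a_good, a_bad = auroc(y, good), auroc(y, bad)
```

Model selection then amounts to preferring the candidate with the higher cross-validated statistic (here, `a_good` over `a_bad`); the same computation applied to precision and recall yields the precision-recall curve used alongside it.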


Subjects
COVID-19/therapy , Statistical Models , Patient Readmission , Renal Replacement Therapy , Artificial Respiration , Adolescent , Adult , Aged , Aged, 80 and over , Area Under Curve , COVID-19/complications , Electronic Health Records , Female , Humans , Logistic Models , Male , Middle Aged , Prognosis , ROC Curve , Retrospective Studies , Nonparametric Statistics , Young Adult