ABSTRACT
Testing multiple treatments for heterogeneous (varying) effectiveness with respect to many underlying risk factors requires many pairwise tests; we would like to instead automatically discover and visualize patient archetypes and predictors of treatment effectiveness using multitask machine learning. In this paper, we present a method to estimate these heterogeneous treatment effects with an interpretable hierarchical framework that uses additive models to visualize expected treatment benefits as a function of patient factors (identifying personalized treatment benefits) and concurrent treatments (identifying combinatorial treatment benefits). This method achieves state-of-the-art predictive power for COVID-19 in-hospital mortality and interpretable identification of heterogeneous treatment benefits. We first validate this method on the large public MIMIC-IV dataset of ICU patients to test recovery of heterogeneous treatment effects. Next we apply this method to a proprietary dataset of over 3000 patients hospitalized for COVID-19, and find evidence of heterogeneous treatment effectiveness predicted largely by indicators of inflammation and thrombosis risk: patients with few indicators of thrombosis risk benefit most from treatments against inflammation, while patients with few indicators of inflammation risk benefit most from treatments against thrombosis. This approach provides an automated methodology to discover heterogeneous and individualized effectiveness of treatments.
Subjects
COVID-19 , Humans , Inflammation , Machine Learning , Risk Factors , Treatment Outcome
ABSTRACT
BACKGROUND: Although clinical decision support (CDS) alerts are effective reminders of best practices, their effectiveness is blunted by clinicians who fail to respond to an overabundance of inappropriate alerts. An electronic health record (EHR)-integrated machine learning (ML) algorithm is a potentially powerful tool to increase the signal-to-noise ratio of CDS alerts and positively impact the clinician's interaction with these alerts in general. OBJECTIVE: This study aimed to describe the development and implementation of an ML-based signal-to-noise optimization system (SmartCDS) to increase the signal of alerts by decreasing the volume of low-value herpes zoster (shingles) vaccination alerts. METHODS: We built and deployed SmartCDS, which builds personalized user activity profiles to suppress shingles vaccination alerts unlikely to yield a clinician's interaction. We extracted all records of shingles alerts from January 2017 to March 2019 from our EHR system, including 327,737 encounters, 780 providers, and 144,438 patients. RESULTS: During the 6 weeks of pilot deployment, the SmartCDS system suppressed an average of 43.67% (15,425/35,315) potential shingles alerts (appointments) and maintained stable counts of weekly shingles vaccination orders (326.3 with system active vs 331.3 in the control group; P=.38) and weekly user-alert interactions (1118.3 with system active vs 1166.3 in the control group; P=.20). CONCLUSIONS: All key statistics remained stable while the system was turned on. Although the results are promising, the characteristics of the system can be subject to future data shifts, which require automated logging and monitoring. We demonstrated that an automated, ML-based method and data architecture to suppress alerts are feasible without detriment to overall order rates. This work is the first alert suppression ML-based model deployed in practice and serves as foundational work in encounter-level customization of alert display to maximize effectiveness.
Subjects
Decision Support Systems, Clinical/standards , Herpes Zoster/drug therapy , Machine Learning/standards , Precision Medicine/methods , Vaccination/methods , Algorithms , Humans , Pilot Projects
Subjects
Coronavirus Infections/complications , Pneumonia, Viral/complications , Thrombosis/etiology , Adolescent , Adult , Aged , Betacoronavirus , COVID-19 , Coronavirus Infections/blood , Coronavirus Infections/mortality , Female , Fibrin Fibrinogen Degradation Products/analysis , Hospitalization , Humans , Male , Middle Aged , New York City , Pandemics , Pneumonia, Viral/blood , Pneumonia, Viral/mortality , Risk Factors , SARS-CoV-2 , Thromboembolism/epidemiology , Thromboembolism/etiology , Thrombosis/epidemiology , Young Adult
ABSTRACT
Predictive models may be particularly beneficial to clinicians when they face uncertainty and seek to develop a mental model of disease progression, but we know little about the post-implementation effects of predictive models on clinicians' experience of their work. Combining survey and interview methods, we found that providers using a predictive algorithm reported being significantly less uncertain and better able to anticipate, plan and prepare for patient discharge than non-users. The tool helped hospitalists form and develop confidence in their mental models of a novel disease (Covid-19). Yet providers' attention to the predictive tool declined as their confidence in their own mental models grew. Predictive algorithms that not only offer data but also provide feedback on decisions, thus supporting providers' motivation for continuous learning, hold promise for more sustained provider attention and cognition augmentation.
Subjects
COVID-19 , Humans , Algorithms , Prognosis , Machine Learning , Patient Discharge
ABSTRACT
Treatment protocols, treatment availability, disease understanding, and viral characteristics have changed over the course of the Covid-19 pandemic; as a result, the risks associated with patient comorbidities and biomarkers have also changed. We add to the ongoing conversation regarding inflammation, hemostasis, and vascular function in Covid-19 by performing a time-varying observational analysis of over 4000 patients hospitalized for Covid-19 in a New York City hospital system from March 2020 to August 2021 to elucidate the changing impact of thrombosis, inflammation, and other risk factors on in-hospital mortality. We find that the predictive power of biomarkers of thrombosis risk has increased over time, suggesting an opportunity for improved care by identifying and targeting therapies for patients with elevated thrombophilic propensity.
ABSTRACT
OBJECTIVE: Heightened inflammation, dysregulated immunity, and thrombotic events are characteristic of hospitalized COVID-19 patients. Given that platelets are key regulators of thrombosis, inflammation, and immunity, they represent prime candidates as mediators of COVID-19-associated pathogenesis. The objective of this study was to understand the contribution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to the platelet phenotype via phenotypic (activation, aggregation) and transcriptomic characterization. APPROACH AND RESULTS: In a cohort of 3915 hospitalized COVID-19 patients, we analyzed blood platelet indices collected at hospital admission. Following adjustment for demographics, clinical risk factors, medication, and biomarkers of inflammation and thrombosis, we find that platelet count, size, and immaturity are associated with increased critical illness and all-cause mortality. Bone marrow, lung tissue, and blood from COVID-19 patients revealed the presence of SARS-CoV-2 virions in megakaryocytes and platelets. Characterization of COVID-19 platelets found them to be hyperreactive (increased aggregation, and expression of P-selectin and CD40) and to have a distinct transcriptomic profile characteristic of prothrombotic large and immature platelets. In vitro mechanistic studies highlight that the interaction of SARS-CoV-2 with megakaryocytes alters the platelet transcriptome, and its effects are distinct from those of the coronavirus responsible for the common cold (CoV-OC43). CONCLUSIONS: Platelet count, size, and maturity are associated with increased critical illness and all-cause mortality among hospitalized COVID-19 patients. Profiling tissues and blood from COVID-19 patients revealed that SARS-CoV-2 virions enter megakaryocytes and platelets and are associated with alterations to the platelet transcriptome and activation profile.
Subjects
COVID-19 , Thrombosis , Blood Platelets , Humans , SARS-CoV-2 , Severity of Illness Index
ABSTRACT
The nature of the internet as a non-peer-reviewed (and largely unregulated) publication medium has allowed widespread promotion of inaccurate and unproven medical claims on an unprecedented scale. Patients with conditions that are not currently fully treatable are particularly susceptible to unproven and dangerous promises about miracle treatments. In extreme cases, fatal adverse outcomes have been documented. Most commonly, the cost is financial and psychological, along with delayed application of imperfect but proven scientific modalities. To help protect patients, who may be desperately ill and thus prone to exploitation, we explored the use of machine learning techniques to identify web pages that make unproven claims. This feasibility study shows that the resulting models can identify web pages that make unproven claims in a fully automatic manner, and substantially better than previous web tools and state-of-the-art search engine technology.
Subjects
Artificial Intelligence , Internet , Neoplasms/therapy , Quackery , Feasibility Studies , Humans , Information Services/standards , Information Storage and Retrieval , ROC Curve
ABSTRACT
Complex medical data sometimes requires significant preprocessing before analysis. The complexity can lead non-domain experts to apply simple filters to the available data or to not use the data at all. The preprocessing choices can also have serious effects on the results of the study if incorrect decisions or missteps are made. In this work, we present open-source data filters for an analysis motivated by understanding mortality in the context of sepsis-associated cardiomyopathy in the ICU. We report specific ICU filters and validations through chart review and graphs. These published filters reduce the complexity of using data in analysis by (1) encapsulating the domain expertise and feature engineering applied to the filter, (2) providing debugged, ready-to-use code, and (3) providing sensible validations. We intend these filters to evolve through pull requests and forks and serve as common starting points for specific analyses.
Subjects
Cardiomyopathies/etiology , Databases, Factual , Information Storage and Retrieval/methods , Intensive Care Units/organization & administration , Sepsis/complications , Software , Adult , Aged , Aged, 80 and over , Cardiomyopathies/mortality , Cardiomyopathies/therapy , Echocardiography , Female , Hospital Mortality , Humans , Logistic Models , Male , Medical Records Systems, Computerized , Middle Aged , Organizational Case Studies
ABSTRACT
Rapid increases in e-cigarette use and potential exposure to harmful byproducts have shifted public health focus to e-cigarettes as a possible drug of abuse. Effective surveillance of use and prevalence would allow appropriate regulatory responses. An ideal surveillance system would collect usage data in real time, focus on populations of interest, include populations unable to take the survey, allow a breadth of questions to be answered, and enable geo-location analysis. Social media streams may provide this ideal system. To realize this use case, a foundational question is whether we can detect e-cigarette use at all. This work reports two pilot tasks using text classification to automatically identify tweets that indicate e-cigarette use and/or e-cigarette use for smoking cessation. We build and define both datasets and compare the performance of four state-of-the-art classifiers and a keyword search for each task. Our results demonstrate excellent classifier performance, with area under the curve of up to 0.90 and 0.94 in the two categories. These promising initial results form the foundation for further studies to realize the ideal surveillance solution.
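As an illustration of the evaluation metric named above, the following is a minimal Python sketch of how area under the ROC curve can be computed for a keyword-search baseline. The tweets, labels, and keyword list are invented for illustration and are not the study's data; the function names `roc_auc` and `keyword_score` are likewise hypothetical, and the study's own classifiers are not reproduced here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive example scores higher than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def keyword_score(text: str, keywords: Sequence[str]) -> float:
    """Keyword-search baseline: 1.0 if any keyword occurs in the text, else 0.0."""
    lowered = text.lower()
    return 1.0 if any(k in lowered for k in keywords) else 0.0


# Hypothetical labeled tweets (1 = indicates e-cigarette use, 0 = does not).
TWEETS = [
    ("just got a new vape pen, loving it", 1),
    ("switched to my e-cig and haven't smoked in a month", 1),
    ("puffing on my mod all day", 1),
    ("new law would ban vape sales near schools", 0),
    ("great coffee this morning", 0),
]
KEYWORDS = ("vape", "e-cig")  # assumed keyword list, not the study's

labels = [y for _, y in TWEETS]
baseline_scores = [keyword_score(t, KEYWORDS) for t, _ in TWEETS]
baseline_auc = roc_auc(labels, baseline_scores)
print(f"keyword baseline AUC: {baseline_auc:.3f}")
```

A trained classifier would replace `keyword_score` with a real-valued score per tweet; the same `roc_auc` function then allows a like-for-like comparison with the baseline.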
Subjects
Electronic Nicotine Delivery Systems/statistics & numerical data , Smoking Cessation/methods , Social Media/statistics & numerical data , Algorithms , Bayes Theorem , Computational Biology/methods , Computational Biology/statistics & numerical data , Feasibility Studies , Humans , Logistic Models , Pilot Projects , Smoking Cessation/statistics & numerical data , Support Vector Machine
ABSTRACT
Prior research has shown that Support Vector Machine models can identify high-quality, content-specific articles in the domain of internal medicine. These models, though powerful, cannot be used in Boolean search engines, nor can their content be verified via human inspection. In this paper, we use decision trees combined with several feature selection methods to generate Boolean query filters for the same domain and task. The resulting trees are generated automatically and exhibit high performance. The trees are understandable, manageable, and able to be validated by humans. The resulting Boolean queries are sensible and can be readily used as filters by Boolean search engines.
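The idea of turning a decision tree into a Boolean query can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' procedure: the tree is hand-built rather than learned, each split tests only whether a term is present, and the terms, the `Node`/`Leaf` classes, and the function names are all hypothetical. Each root-to-leaf path ending in a "relevant" leaf becomes an AND clause, and the clauses are joined with OR.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    relevant: bool  # True if documents reaching this leaf are predicted relevant


@dataclass
class Node:
    term: str        # split: does the document contain this term?
    present: "Tree"  # subtree followed when the term is present
    absent: "Tree"   # subtree followed when the term is absent


Tree = Union[Leaf, Node]


def positive_paths(tree: Tree, path: Tuple[str, ...] = ()) -> List[List[str]]:
    """Collect the conditions along every path that ends in a relevant leaf."""
    if isinstance(tree, Leaf):
        return [list(path)] if tree.relevant else []
    return (positive_paths(tree.present, path + (tree.term,))
            + positive_paths(tree.absent, path + (f"NOT {tree.term}",)))


def to_boolean_query(tree: Tree) -> str:
    """Join each positive path with AND and the paths with OR."""
    clauses = ["(" + " AND ".join(p) + ")" for p in positive_paths(tree) if p]
    return " OR ".join(clauses)


# Hypothetical tree; the terms are illustrative, not taken from the paper.
example_tree = Node(
    term="randomized",
    present=Leaf(True),
    absent=Node(
        term="cohort",
        present=Node(term="editorial", present=Leaf(False), absent=Leaf(True)),
        absent=Leaf(False),
    ),
)

query = to_boolean_query(example_tree)
print(query)
```

Because every clause is a plain conjunction of terms and negated terms, a person can read the query and check it against domain knowledge, which is the interpretability property the abstract emphasizes.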
Subjects
Decision Trees , Information Storage and Retrieval/methods , Algorithms , Artificial Intelligence , Information Storage and Retrieval/standards , Medical Subject Headings , Periodicals as Topic , PubMed
ABSTRACT
Building machine learning models that identify unproven cancer treatments on the Health Web is a promising approach for dealing with the dissemination of false and dangerous information to vulnerable health consumers. Aside from the obvious requirement of accuracy, two issues are of practical importance in deploying these models in real-world applications. (a) Generalizability: The models must generalize to all treatments (not just the ones used in the training of the models). (b) Scalability: The models must be efficiently applicable to billions of documents on the Health Web. First, we provide methods and related empirical data demonstrating strong accuracy and generalizability. Second, by combining the MapReduce distributed architecture and high dimensionality compression via Markov Boundary feature selection, we show how to scale the application of the models to WWW-scale corpora. The present work provides evidence that (a) a very small subset of unproven cancer treatments is sufficient to build a model to identify unproven treatments on the web; (b) unproven treatments use distinct language to market their claims and this language is learnable; (c) through distributed parallelization and state-of-the-art feature selection, it is possible to prepare the corpora and build and apply models with large scalability.
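The scalability argument above can be illustrated with a small single-process Python sketch of the map and reduce steps. Everything in it is an assumption made for illustration: the weights, bias, and documents are invented, the compact feature set stands in for the output of a feature selection step that is not implemented here, and the built-in `reduce` stands in for a distributed MapReduce runtime. The point it shows is that a model over a handful of selected features can score each document independently, which is what makes the map step easy to parallelize.

```python
from collections import Counter
from functools import reduce
from typing import Dict, Iterable, List, Tuple

# Hypothetical compact linear model: weights exist only for a small set of
# selected features (assumed here to come from prior feature selection).
WEIGHTS: Dict[str, float] = {"miracle": 1.5, "cure": 1.0, "trial": -1.2}
BIAS = -1.0


def map_score(doc: Tuple[str, str]) -> List[Tuple[str, str]]:
    """Map step: score one document independently and emit (label, doc_id)."""
    doc_id, text = doc
    counts = Counter(text.lower().split())
    score = BIAS + sum(w * counts[t] for t, w in WEIGHTS.items())
    label = "flagged" if score > 0 else "clear"
    return [(label, doc_id)]


def reduce_groups(acc: Dict[str, List[str]],
                  pair: Tuple[str, str]) -> Dict[str, List[str]]:
    """Reduce step: group document ids by emitted label."""
    label, doc_id = pair
    acc.setdefault(label, []).append(doc_id)
    return acc


def run(docs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Single-process stand-in for a distributed map/shuffle/reduce job."""
    mapped = (pair for doc in docs for pair in map_score(doc))
    return reduce(reduce_groups, mapped, {})


# Invented example documents.
DOCS = [
    ("a", "miracle cure for cancer"),
    ("b", "results of a randomized trial"),
    ("c", "herbal tea recipe"),
]

result = run(DOCS)
print(result)
```

In a real deployment the map function would run on many workers over shards of the corpus; because it only needs the small weight table, that table can be shipped to every worker cheaply.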