Search | BVS Infectious and Parasitic Diseases

1.

Dynamics of the most common pathogenic mtDNA variant m.3243A > G demonstrate frequency-dependency in blood and positive selection in the germline.

Franco, Melissa; Pickett, Sarah J; Fleischmann, Zoe; Khrapko, Mark; Cote-L'Heureux, Auden; Aidlen, Dylan; Stein, David; Markuzon, Natasha; Popadin, Konstantin; Braverman, Maxim; Woods, Dori C; Tilly, Jonathan L; Turnbull, Doug M; Khrapko, Konstantin.

Hum Mol Genet ; 31(23): 4075-4086, 2022 11 28.

Article in English | MEDLINE | ID: mdl-35849052

ABSTRACT

The A-to-G point mutation at position 3243 in the human mitochondrial genome (m.3243A > G) is the most common pathogenic mtDNA variant responsible for disease in humans. It is widely accepted that m.3243A > G levels decrease in blood with age, and an age correction representing ~ 2% annual decline is often applied to account for this change in mutation level. Here we report that recent data indicate that the dynamics of m.3243A > G are more complex and depend on the mutation level in blood in a bi-phasic way. Consequently, the traditional 2% correction, which is adequate 'on average', creates opposite predictive biases at high and low mutation levels. Unbiased age correction is needed to circumvent these drawbacks of the standard model. We propose to eliminate both biases by using an approach where age correction depends on mutation level in a biphasic way to account for the dynamics of m.3243A > G in blood. The utility of this approach was further tested in estimating germline selection of m.3243A > G. The biphasic approach permitted us to uncover patterns consistent with the possibility of positive selection for m.3243A > G. Germline selection of m.3243A > G shows an 'arching' profile by which selection is positive at intermediate mutant fractions and declines at high and low mutant fractions. We conclude that use of this biphasic approach will greatly improve the accuracy of modelling changes in mtDNA mutation frequencies in the germline and in somatic cells during aging.

Subjects

Mitochondrial DNA; Mitochondrial Diseases; Humans; Mitochondrial DNA/genetics; Mitochondria/genetics; Mutation; Point Mutation; Germ Cells; Mitochondrial Diseases/genetics
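
As a rough illustration of the traditional correction mentioned in the abstract above, the sketch below back-projects a measured blood level to age zero under a constant ~2% annual decline. The compounding form, the function name `age_corrected_level`, and the cap at 100% are assumptions made for illustration; the abstract does not give the formula, and it argues that a level-independent correction of this kind is biased at high and low mutation levels.

```python
def age_corrected_level(blood_level_pct: float, age_years: float,
                        annual_decline: float = 0.02) -> float:
    """Back-project a blood m.3243A>G level (in %) to age 0.

    Assumes a constant fractional decline per year that does not depend on
    the mutation level (the 'traditional' model; an illustrative assumption).
    """
    if not 0.0 <= blood_level_pct <= 100.0:
        raise ValueError("blood_level_pct must be between 0 and 100")
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    # Undo the compounded yearly decline accumulated over age_years.
    corrected = blood_level_pct / (1.0 - annual_decline) ** age_years
    # A mutation level cannot exceed 100%, so cap the back-projection.
    return min(corrected, 100.0)


if __name__ == "__main__":
    print(round(age_corrected_level(20.0, 35), 1))
```

The cap shows one symptom of the problem the abstract describes: for high measured levels in older individuals, a uniform correction quickly saturates at 100%.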

2.

Integrating knowledge graphs into machine learning models for survival prediction and biomarker discovery in patients with non-small-cell lung cancer.

Fang, Chao; Arango Argoty, Gustavo Alonso; Kagiampakis, Ioannis; Khalid, Mohammad Hassan; Jacob, Etai; Bulusu, Krishna C; Markuzon, Natasha.

J Transl Med ; 22(1): 726, 2024 Aug 05.

Article in English | MEDLINE | ID: mdl-39103897

ABSTRACT

Accurate survival prediction for Non-Small Cell Lung Cancer (NSCLC) patients remains a significant challenge for the scientific and clinical community despite decades of advanced analytics. Addressing this challenge not only helps inform the critical aspects of clinical study design and biomarker discovery but also ensures that the 'right patient' receives the 'right treatment'. However, survival prediction is a highly complex task, given the large number of 'omics and clinical features, as well as the high degrees of freedom that drive patient survival. Prior knowledge could play a critical role in uncovering the complexity of a disease and understanding the driving factors affecting a patient's survival. We introduce a methodology for incorporating prior knowledge into machine learning-based models for prediction of patient survival through Knowledge Graphs, demonstrating the advantage of such an approach for NSCLC patients. Using data from patients treated with immuno-oncologic therapies in the POPLAR (NCT01903993) and OAK (NCT02008227) clinical trials, we found that the use of knowledge graphs yielded significantly improved hazard ratios, including in the POPLAR cohort, for models based on biomarker tumor mutation burden compared with those based on knowledge graphs. Use of a model-defined mutational 10-gene signature led to significant overall survival differentiation for both trials. We provide parameterized code for incorporating knowledge graphs into survival analyses for use by the wider scientific community.

Subjects

Biomarkers, Tumor; Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Machine Learning; Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Non-Small-Cell Lung/mortality; Carcinoma, Non-Small-Cell Lung/pathology; Humans; Lung Neoplasms/genetics; Lung Neoplasms/mortality; Lung Neoplasms/pathology; Survival Analysis; Prognosis; Proportional Hazards Models; Computer Graphics; Mutation/genetics; Knowledge

3.

Systematic review of time to subsequent therapy as a candidate surrogate endpoint in advanced solid tumors.

Agapow, Paul; Mulla, Rob; Markuzon, Natasha; Ottesen, Lone H; Meulendijks, Didier.

Future Oncol ; 19(23): 1627-1639, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37589145

ABSTRACT

Aim: Time to subsequent therapy (TST) is an end point that may complement progression-free survival (PFS) and overall survival (OS) in determining the treatment effect of anticancer drugs and may be a potential surrogate for PFS and OS. We systematically reviewed the correlation between TST and both PFS and OS in published phase 2/3 studies in advanced solid tumors. Materials & methods: Trial-level correlational analyses were performed for TST versus PFS (by investigator and/or central review) and TST versus OS. Results: Of 21 included studies, nine (43%) used 'time to first subsequent therapy or death' (TFST) as the TST end point; 12 (57%) used different definitions ('other TST end points'). There was a strong correlation between TFST and PFS by investigator (medians: R² = 0.88; hazard ratio [HR]: R² = 0.91) and TFST versus PFS by central review (medians: R² = 0.86; HRs: R² = 0.84). For TFST versus OS there was medium/poor correlation for medians (R² = 0.64) and HRs (R² = 0.02). Conclusion: TFST strongly correlates with PFS, but not with OS.

In a recent study, researchers investigated how we can measure the effectiveness of cancer drugs. They focused on a specific measure called 'time to next therapy', which is the duration between two treatments patients receive. By analyzing the relationship between time to next therapy and disease progression, they discovered a strong correlation. This suggests that in the future, time to next therapy could potentially help to measure how well a cancer treatment works. However, when it comes to predicting patient survival, the relationship was not as strong. This implies that time to next therapy is not a reliable indicator of patient survival. To fully understand whether time to next therapy can effectively measure the effectiveness of anticancer drugs, further research is necessary.

Subjects

Neoplasms; Humans; Neoplasms/therapy; Progression-Free Survival; Research Personnel
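
To make the trial-level R² values in the abstract above concrete, here is a minimal sketch of how a coefficient of determination between two sets of per-trial hazard ratios could be computed. The unweighted Pearson form and the log transform of the hazard ratios are assumptions for illustration; the abstract does not state whether the review weighted trials or transformed the estimates, and the function names are hypothetical.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


def log_hr_r_squared(hr_endpoint_a, hr_endpoint_b):
    """R^2 between per-trial hazard ratios, compared on the log scale.

    The log scale is an assumed (though common) choice for ratio measures.
    """
    log_a = [math.log(h) for h in hr_endpoint_a]
    log_b = [math.log(h) for h in hr_endpoint_b]
    return r_squared(log_a, log_b)
```

Each list element stands for one trial, for example the TFST hazard ratio in one list and the PFS hazard ratio for the same trial at the same position in the other.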

4.

Natural Language Processing for Automated Classification of Qualitative Data From Interviews of Patients With Cancer.

Fang, Chao; Markuzon, Natasha; Patel, Nikunj; Rueda, Juan-David.

Value Health ; 25(12): 1995-2002, 2022 12.

Article in English | MEDLINE | ID: mdl-35840523

ABSTRACT

OBJECTIVES: This study sought to explore the use of novel natural language processing (NLP) methods for classifying unstructured, qualitative textual data from interviews of patients with cancer to identify patient-reported symptoms and impacts on quality of life. METHODS: We tested the ability of 4 NLP models to accurately classify text from interview transcripts as "symptom," "quality of life impact," and "other." Interview data sets from patients with hepatocellular carcinoma (HCC) (n = 25), biliary tract cancer (BTC) (n = 23), and gastric cancer (n = 24) were used. Models were cross-validated with transcript subsets designated for training, validation, and testing. Multiclass classification performance of the 4 models was evaluated at paragraph and sentence level using the HCC testing data set and analyzed by the one-versus-rest technique quantified by the receiver operating characteristic area under the curve (ROC AUC) score. RESULTS: NLP models accurately classified multiclass text from patient interviews. The Bidirectional Encoder Representations from Transformers model generally outperformed all other models at paragraph and sentence level. The highest predictive performance of the Bidirectional Encoder Representations from Transformers model was observed using the HCC data set to train and BTC data set to test (mean ROC AUC, 0.940 [SD 0.028]), with similarly high predictive performance using balanced and imbalanced training data sets from BTC and gastric cancer populations. CONCLUSIONS: NLP models were accurate in predicting multiclass classification of text from interviews of patients with cancer, with most surpassing 0.9 ROC AUC at paragraph level. NLP may be a useful tool for scaling up processing of patient interviews in clinical studies and, thus, could serve to facilitate patient input into drug development and improving patient care.

Subjects

Carcinoma, Hepatocellular; Liver Neoplasms; Stomach Neoplasms; Humans; Natural Language Processing; Quality of Life
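
The one-versus-rest ROC AUC evaluation named in the abstract above can be sketched in a few lines. This is a generic, dependency-free illustration rather than the study's code: AUC is computed as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, and the class names and scores in the usage note are made up.

```python
def binary_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives.

    Equals the probability that a random positive outranks a random
    negative, counting ties as one half.
    """
    pos = [s for is_pos, s in zip(labels, scores) if is_pos]
    neg = [s for is_pos, s in zip(labels, scores) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true, class_scores):
    """Per-class AUC: each class in turn is 'positive', all others 'rest'.

    y_true: list of class names, one per text segment.
    class_scores: dict mapping class name -> list of model scores.
    """
    return {cls: binary_auc([y == cls for y in y_true], scores)
            for cls, scores in class_scores.items()}
```

For a three-class task such as "symptom", "quality of life impact", and "other", `one_vs_rest_auc` returns one AUC per class, which can then be averaged to a single summary score.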

5.

The association between autoimmune disease and 30-day mortality among sepsis ICU patients: a cohort study.

Sheth, Mallory; Benedum, Corey M; Celi, Leo Anthony; Mark, Roger G; Markuzon, Natasha.

Crit Care ; 23(1): 93, 2019 Mar 18.

Article in English | MEDLINE | ID: mdl-30885252

ABSTRACT

INTRODUCTION: Sepsis results from a dysregulated host response to an infection that is associated with an imbalance between pro- and anti-inflammatory cytokines. This imbalance is hypothesized to be a driver of patient mortality. Certain autoimmune diseases modulate the expression of cytokines involved in the pathophysiology of sepsis. However, the outcomes of patients with autoimmune disease who develop sepsis have not been studied in detail. The objective of this study is to determine whether patients with autoimmune diseases have different sepsis outcomes than patients without these comorbidities. METHODS: Using the Multiparameter Intelligent Monitoring in Intensive Care III database (v. 1.4) which contains retrospective clinical data for over 50,000 adult ICU stays, we compared 30-day mortality risk for sepsis patients with and without autoimmune disease. We used logistic regression models to control for known confounders, including demographics, disease severity, and immunomodulation medications. We used mediation analysis to evaluate how the chronic use of immunomodulation medications affects the relationship between autoimmune disease and 30-day mortality. RESULTS: Our study found a statistically significant 27.00% reduction in the 30-day mortality risk associated with autoimmune disease presence. This association was found to be the strongest (OR 0.71, 95% CI 0.54-0.93, P = 0.014) among patients with septic shock. The autoimmune disease-30-day mortality association was not mediated through the chronic use of immunomodulation medications (indirect effect OR 1.07, 95% CI 1.01-1.13, P = 0.020). CONCLUSIONS: We demonstrated that autoimmune diseases are associated with a lower 30-day mortality risk in sepsis. Our findings suggest that autoimmune diseases affect 30-day mortality through a mechanism unrelated to the chronic use of immunomodulation medications. Since this study was conducted within a single study center, research using data from other medical centers will provide further validation.

Subjects

Autoimmune Diseases/complications; Mortality/trends; Protective Factors; Sepsis/mortality; Aged; Aged, 80 and over; Autoimmune Diseases/mortality; Autoimmune Diseases/physiopathology; Cohort Studies; Female; Humans; Intensive Care Units/organization & administration; Intensive Care Units/statistics & numerical data; Male; Middle Aged; Retrospective Studies; Risk Factors; Sepsis/complications; Sepsis/physiopathology

6.

Insights Into the Patient Experience of Hormone Therapy for Early Breast Cancer Treatment Using Patient Forum Discussions and Natural Language Processing.

Sreenivasan, Sameet; Fang, Chao; Flood, Emuella M; Markuzon, Natasha; Sze, Jasmine Y Y.

JCO Clin Cancer Inform ; 8: e2400038, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39102642

ABSTRACT

PURPOSE: Understanding the real-world experience of patients with early breast cancer (eBC) is imperative for optimizing outcomes and evolving patient care. However, there is a lack of patient-level data, hindering clinical development. This social listening study was performed to understand patient insights into symptoms and impacts of hormone therapy (HT) for eBC using posts from patient forums on breastcancer.org to inform future clinical research. METHODS: Natural language processing (NLP) and machine learning techniques were used to identify themes related to eBC from a sample of 500,000 posts. After relevant data selection, 362,074 eBC posts were retained for further analysis of symptoms and impacts related to HT, as well as insights into symptom severity, pain locations, and symptom management using exercise and yoga. RESULTS: Overall, 32 symptoms and nine impacts had significant associations with ≥one HT. Hot flush (relative risk [RR], 6.70 [95% CI, 3.36 to 13.36]), arthralgia (RR, 6.67 [95% CI, 3.53 to 12.59]), weight increased (RR, 4.83 [95% CI, 3.20 to 7.28]), mood swings (RR, 7.36 [95% CI, 5.75 to 9.42]), insomnia (RR, 4.76 [95% CI, 3.14 to 7.22]), and depression (RR, 3.05 [95% CI, 1.71 to 5.44]) demonstrated the strongest associations. Severe headache, dizziness, back pain, and muscle spasms showed significant associations with ≥one HT despite their low overall prevalence in eBC posts. CONCLUSION: The social listening approach allowed the identification of real-world insights from posts specific to eBC HT from a large-scale online breast cancer forum that captured experiences from a uniquely diverse group of patients. Using NLP has a potential to scale analysis of patient feedback and reveal actionable insights into patient experiences of treatment that can inform the development of future therapies and improve the care of patients with eBC.

Subjects

Breast Neoplasms; Natural Language Processing; Humans; Female; Breast Neoplasms/drug therapy; Breast Neoplasms/psychology; Breast Neoplasms/therapy; Machine Learning; Middle Aged
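
The relative risks with 95% confidence intervals reported in the abstract above can be illustrated with the standard log-scale (Katz) interval for a ratio of two proportions. This is a sketch under assumptions: the abstract does not say which interval method the authors used, and the counts in the usage note are invented, not taken from the study.

```python
import math


def relative_risk(exposed_events, exposed_total,
                  unexposed_events, unexposed_total, z=1.96):
    """Relative risk and an approximate 95% CI (log method).

    'Exposed' could be posts mentioning a given hormone therapy and
    'events' posts that also mention a given symptom (hypothetical framing).
    """
    a, n1, c, n0 = exposed_events, exposed_total, unexposed_events, unexposed_total
    if a <= 0 or c <= 0 or a > n1 or c > n0:
        raise ValueError("event counts must be positive and not exceed totals")
    rr = (a / n1) / (c / n0)
    # Standard error of ln(RR) for two independent binomial proportions.
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper
```

For example, `relative_risk(30, 100, 10, 100)` compares a 30% event rate with a 10% event rate; an interval that excludes 1 corresponds to a significant association at the chosen level.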

7.

Autoencoder-based multimodal prediction of non-small cell lung cancer survival.

Ellen, Jacob G; Jacob, Etai; Nikolaou, Nikos; Markuzon, Natasha.

Sci Rep ; 13(1): 15761, 2023 09 22.

Article in English | MEDLINE | ID: mdl-37737469

ABSTRACT

The ability to accurately predict non-small cell lung cancer (NSCLC) patient survival is crucial for informing physician decision-making, and the increasing availability of multi-omics data offers the promise of enhancing prognosis predictions. We present a multimodal integration approach that leverages microRNA, mRNA, DNA methylation, long non-coding RNA (lncRNA) and clinical data to predict NSCLC survival and identify patient subtypes, utilizing denoising autoencoders for data compression and integration. Survival performance for patients with lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) was compared across modality combinations and data integration methods. Using The Cancer Genome Atlas data, our results demonstrate that survival prediction models combining multiple modalities outperform single modality models. The highest performance was achieved with a combination of only two modalities, lncRNA and clinical, at concordance indices (C-indices) of 0.69 ± 0.03 for LUAD and 0.62 ± 0.03 for LUSC. Models utilizing all five modalities achieved mean C-indices of 0.67 ± 0.04 and 0.63 ± 0.02 for LUAD and LUSC, respectively, while the best individual modality performance reached C-indices of 0.64 ± 0.03 for LUAD and 0.59 ± 0.03 for LUSC. Analysis of biological differences revealed two distinct survival subtypes with over 900 differentially expressed transcripts.

Subjects

Carcinoma, Non-Small-Cell Lung; Carcinoma, Squamous Cell; Lung Neoplasms; MicroRNAs; RNA, Long Noncoding; Humans; Carcinoma, Non-Small-Cell Lung/genetics; RNA, Long Noncoding/genetics; Lung Neoplasms/genetics; MicroRNAs/genetics; Carcinoma, Squamous Cell/genetics
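
The concordance index (C-index) used in the abstract above to compare survival models can be sketched as follows. This is a plain implementation of Harrell's C under simplifying assumptions (pairs with tied survival times are skipped); it is not the paper's evaluation code, and the function name is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: observed follow-up times.
    events: truthy if the event (death) was observed, falsy if censored.
    risk_scores: model output where a higher score means higher risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values between roughly 0.59 and 0.69.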

8.

Application of Bayesian networks to generate synthetic health data.

Kaur, Dhamanpreet; Sobiesk, Matthew; Patil, Shubham; Liu, Jin; Bhagat, Puran; Gupta, Amar; Markuzon, Natasha.

J Am Med Inform Assoc ; 28(4): 801-811, 2021 03 18.

Article in English | MEDLINE | ID: mdl-33367620

ABSTRACT

OBJECTIVE: This study seeks to develop a fully automated method of generating synthetic data from a real dataset that could be employed by medical organizations to distribute health data to researchers, reducing the need for access to real data. We hypothesize the application of Bayesian networks will improve upon the predominant existing method, medBGAN, in handling the complexity and dimensionality of healthcare data. MATERIALS AND METHODS: We employed Bayesian networks to learn probabilistic graphical structures and simulated synthetic patient records from the learned structure. We used the University of California Irvine (UCI) heart disease and diabetes datasets as well as the MIMIC-III diagnoses database. We evaluated our method through statistical tests, machine learning tasks, preservation of rare events, disclosure risk, and the ability of a machine learning classifier to discriminate between the real and synthetic data. RESULTS: Our Bayesian network model outperformed or equaled medBGAN in all key metrics. Notable improvement was achieved in capturing rare variables and preserving association rules. DISCUSSION: Bayesian networks generated data sufficiently similar to the original data with minimal risk of disclosure, while offering additional transparency, computational efficiency, and capacity to handle more data types in comparison to existing methods. We hope this method will allow healthcare organizations to efficiently disseminate synthetic health data to researchers, enabling them to generate hypotheses and develop analytical tools. CONCLUSION: We conclude the application of Bayesian networks is a promising option for generating realistic synthetic health data that preserves the features of the original data without compromising data privacy.

Subjects

Bayes Theorem; Data Anonymization; Data Management; Machine Learning; Neural Networks, Computer; Confidentiality; Datasets as Topic; Disclosure; Humans; Information Dissemination
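
The second step described in the abstract above, simulating synthetic patient records from a Bayesian network, can be illustrated with ancestral sampling. The sketch below skips structure learning entirely: the three-variable network, its variable names, and all probabilities are hypothetical values chosen for illustration, not figures from the paper or its datasets.

```python
import random

# Hypothetical toy network, listed in topological order (parents first):
#   age_over_60 -> diabetes -> heart_disease, and age_over_60 -> heart_disease.
# Each entry: (variable, parents, P(variable is True | parent values)).
NETWORK = [
    ("age_over_60", (), {(): 0.30}),
    ("diabetes", ("age_over_60",), {(False,): 0.05, (True,): 0.20}),
    ("heart_disease", ("age_over_60", "diabetes"), {
        (False, False): 0.02, (False, True): 0.10,
        (True, False): 0.10, (True, True): 0.30,
    }),
]


def sample_record(rng):
    """Draw one synthetic record by sampling each node given its parents."""
    record = {}
    for name, parents, cpt in NETWORK:
        p_true = cpt[tuple(record[p] for p in parents)]
        record[name] = rng.random() < p_true
    return record


def sample_records(n, seed=0):
    """Draw n synthetic records reproducibly from the toy network."""
    rng = random.Random(seed)
    return [sample_record(rng) for _ in range(n)]
```

Because every value is drawn from the network's conditional distributions rather than copied from a real patient, the synthetic records preserve associations between variables without reproducing individual rows.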

9.

Progression of aortic stenosis and echocardiographic criteria for its severity.

Kebed, Kalie; Sun, Deyu; Addetia, Karima; Mor-Avi, Victor; Markuzon, Natasha; Lang, Roberto M.

Eur Heart J Cardiovasc Imaging ; 21(7): 737-743, 2020 07 01.

Article in English | MEDLINE | ID: mdl-32335667

ABSTRACT

AIMS: Guidelines-recommended criteria for identifying severe aortic stenosis (AS) are based on small, homogeneous cohorts of patients, leading to potentially inconsistent or missed diagnosis. We used a large cohort of patients with varying degrees of AS to (i) characterize its progression; (ii) evaluate the influence of demographic and echocardiographic variables; and (iii) derive haemodynamically consistent cut-off values. METHODS AND RESULTS: We identified 916 patients with mild to severe AS who had undergone >1 echocardiographic study (N = 2547). For each study, aortic valve area (AVA), peak transaortic velocity (Vmax), and mean pressure gradient (ΔP) were extracted. Annual rates of AVA change were determined by a linear mixed-effects model. To determine the prevalence of inconsistent diagnosis of severe AS, AVA was plotted against ΔP and Vmax, with quadrants defined using guidelines-recommended cut-offs. The rate of AVA change was -0.070 ± 0.003 cm²/year and was more rapid in men than women and in Whites than African Americans. AVA = 1 cm² corresponded to ΔP = 32 mmHg and Vmax = 3.7 m/s, causing discrepancies in defining severe AS in 480 (19%) and 458 (18%) studies, respectively. Conversely, ΔP = 40 mmHg corresponded to AVA = 0.89 cm² and Vmax = 4.0 m/s corresponded to AVA = 0.92 cm², confirming the inconsistency of the guidelines. Notably, discrepancy rate was higher in 206 patients with low flow (SVi < 35 mL/m²): 40% vs. 16% in the remaining patients. CONCLUSION: Our findings demonstrated gender- and race-related differences in AS progression and underscored the need to refine the multiparametric criteria for diagnosis of severe AS to minimize internal inconsistencies, which are high with the current cut-offs and amplified in patients with low stroke volumes.

Subjects

Aortic Valve Stenosis; Aortic Valve; Aortic Valve/diagnostic imaging; Aortic Valve Stenosis/diagnostic imaging; Echocardiography; Echocardiography, Doppler; Female; Humans; Male; Severity of Illness Index; Stroke Volume

10.

Measurement errors in serial echocardiographic assessments of aortic valve stenosis severity.

Kebed, Kalie; Sun, Deyu; Addetia, Karima; Mor-Avi, Victor; Markuzon, Natasha; Lang, Roberto M.

Int J Cardiovasc Imaging ; 36(3): 471-479, 2020 Mar.

Article in English | MEDLINE | ID: mdl-31865497

ABSTRACT

Transthoracic echocardiography (TTE) evaluation of aortic stenosis (AS) is routinely performed using the continuity equation. Inaccurate measurements of the left ventricular (LV) outflow tract (LVOT) diameter are considered the most common source of error in AS grading. We hypothesized that inconsistency in LVOT velocity time integral (VTI) is an under-recognized cause of AS assessment error. We sought to determine which parameters contribute most towards inconsistencies in AS grading by studying the prevalence of different errors in a historic cohort. We identified patients with mild to severe AS with multiple studies from our database from 1994 to 2018 (n = 988 patients, 2859 studies). Errors were defined when: (1) LVOT diameter changed by > 2 mm, (2) LVOT VTI changed by > 15% without change in LV function from the initial TTE, (3) aortic valve (AV) maximum velocity (Vmax), mean pressure gradient (ΔP) or AV VTI decreased by > 15% without change in LV function from prior study. The most common error was the LVOT VTI measurement with 22% prevalence. LVOT diameter, AV VTI, AV Vmax and AV ΔP measurement caused errors in < 7% studies. Patients with normal LV function and more severe AS were more likely to have LVOT VTI errors (P < 0.05). LVOT VTI is a frequent, under-recognized source of error in assessing AS. Greater attention should be directed toward the proper positioning of the pulsed Doppler sample volume, particularly in patients with higher grades of AS and normal systolic function, to ensure accurate and reproducible assessment of AS.

Subjects

Aortic Valve Stenosis/diagnostic imaging; Aortic Valve/diagnostic imaging; Echocardiography, Doppler, Color; Echocardiography, Doppler, Pulsed; Heart Ventricles/diagnostic imaging; Aortic Valve/physiopathology; Aortic Valve Stenosis/physiopathology; Databases, Factual; Heart Ventricles/physiopathology; Hemodynamics; Humans; Predictive Value of Tests; Prognosis; Reproducibility of Results; Retrospective Studies; Severity of Illness Index; Time Factors; Ventricular Function, Left
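
The continuity equation mentioned in the abstract above estimates aortic valve area from the LVOT diameter and the two velocity time integrals. The sketch below uses the standard textbook form, with the LVOT cross-section treated as a circle; the function name and the example values are illustrative assumptions, not data from the study.

```python
import math


def aortic_valve_area(lvot_diameter_cm, lvot_vti_cm, av_vti_cm):
    """Aortic valve area (cm^2) from the continuity equation.

    AVA = LVOT area * LVOT VTI / AV VTI, with the LVOT cross-section
    assumed circular: area = pi * (diameter / 2) ** 2.
    """
    if min(lvot_diameter_cm, lvot_vti_cm, av_vti_cm) <= 0:
        raise ValueError("all measurements must be positive")
    lvot_area_cm2 = math.pi * (lvot_diameter_cm / 2.0) ** 2
    return lvot_area_cm2 * lvot_vti_cm / av_vti_cm


if __name__ == "__main__":
    baseline = aortic_valve_area(2.0, 20.0, 80.0)
    shifted = aortic_valve_area(2.0, 23.0, 80.0)  # LVOT VTI 15% higher
    print(round(baseline, 2), round(shifted, 2))
```

Because AVA is directly proportional to LVOT VTI, a 15% inconsistency in that measurement carries through as a 15% change in the computed valve area, which is why the abstract treats it as a meaningful source of grading error.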

11.

Weekly dengue forecasts in Iquitos, Peru; San Juan, Puerto Rico; and Singapore.

Benedum, Corey M; Shea, Kimberly M; Jenkins, Helen E; Kim, Louis Y; Markuzon, Natasha.

PLoS Negl Trop Dis ; 14(10): e0008710, 2020 10.

Article in English | MEDLINE | ID: mdl-33064770

ABSTRACT

BACKGROUND: Predictive models can serve as early warning systems and can be used to forecast future risk of various infectious diseases. Conventionally, regression and time series models are used to forecast dengue incidence, using dengue surveillance (e.g., case counts) and weather data. However, these models may be limited in terms of model assumptions and the number of predictors that can be included. Machine learning (ML) methods are designed to work with a large number of predictors and thus offer an appealing alternative. Here, we compared the performance of ML algorithms with that of regression models in predicting dengue cases and outbreaks from 4 to 12 weeks in advance. Because many countries lack sufficient health surveillance infrastructure, we also evaluated the contribution of dengue surveillance and weather data to the predictive power of these models. METHODS: We developed ML, regression, and time series models to forecast weekly dengue case counts and outbreaks in Iquitos, Peru; San Juan, Puerto Rico; and Singapore from 1990-2016. Forecasts were generated using available weekly dengue surveillance and weather data. We evaluated the agreement between model forecasts and actual dengue observations using mean absolute error and the Matthews correlation coefficient (MCC). RESULTS: For near-term predictions of weekly case counts and when using surveillance data, ML models had 21% and 33% less error than regression and time series models, respectively. However, using weather data only, ML models did not demonstrate a practical advantage. When forecasting weekly dengue outbreaks 12 weeks in advance, ML models achieved a maximum MCC of 0.61. CONCLUSIONS: Our results identified 2 scenarios in which ML models are advantageous over regression models: 1) predicting weekly dengue case counts 4 weeks ahead when dengue surveillance data are available and 2) predicting weekly dengue outbreaks 12 weeks ahead when dengue surveillance data are unavailable. Given the advantages of ML models, dengue early warning systems may be improved by the inclusion of these models.

Subjects

Dengue/epidemiology; Disease Outbreaks; Forecasting; Humans; Models, Biological; Peru/epidemiology; Population Surveillance; Puerto Rico/epidemiology; Singapore/epidemiology; Time Factors; Weather
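
The two evaluation metrics named in the abstract above can be written out directly. The sketch below gives generic definitions of mean absolute error (for weekly case-count forecasts) and the Matthews correlation coefficient (for binary outbreak forecasts); returning 0.0 when the MCC denominator is zero is a common convention adopted here as an assumption.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and forecast counts."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary outbreak labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0.0 when a row or column of the table is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

MCC ranges from -1 to 1 and stays informative when outbreak weeks are rare, which is one reason it suits outbreak forecasting better than raw accuracy.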

12.

Role of persistent cascades in diffusion.

Morse, Steven; González, Marta C; Markuzon, Natasha.

Phys Rev E ; 99(1-1): 012323, 2019 Jan.

Article in English | MEDLINE | ID: mdl-30780226

ABSTRACT

We define a structural property of real-world large-scale communication networks consisting of the recurring patterns of communication among individuals, which we term persistent cascades. Using methods of inexact tree matching and agglomerative clustering, we group these patterns into classes which we claim represent some underlying way in which individuals tend to disseminate information. We extend methods from epidemic modeling to offer a way to analytically model this recurring structure in a random network, and comparing to the data, we find that the real cascading structure is significantly larger and more recurrent than the random model. We find that the cascades reveal a habitual hierarchy of spreading, alternative roles in weekday vs weekend spreading, and the existence of hidden spreaders. Finally, using simulation on the real order of communication events, we show that cascade membership increases the likelihood of receiving information spreading through the network.

13.

Incorporating media data into a model of infectious disease transmission.

Kim, Louis; Fast, Shannon M; Markuzon, Natasha.

PLoS One ; 14(2): e0197646, 2019.

Article in English | MEDLINE | ID: mdl-30716139

ABSTRACT

Understanding the effect of media on disease spread can help improve epidemic forecasting and uncover preventive measures to slow the spread of disease. Most previously introduced models have approximated media effect through disease incidence, making media influence dependent on the size of the epidemic. We propose an alternative approach, which relies on real data about disease coverage in the news, allowing us to model low incidence/high interest diseases, such as SARS, Ebola or H1N1. We introduce a network-based model, in which disease is transmitted through local interactions between individuals and the probability of transmission is affected by media coverage. We assume that media attention increases self-protection (e.g. hand washing and compliance with social distancing), which, in turn, decreases the probability of transmission. We apply the model to the case of H1N1 transmission in Mexico City in 2009 and show how media influence, measured by the time series of the weekly count of news articles published on the outbreak, helps to explain the observed transmission dynamics. We show that incorporating media attention based on the observed media coverage of the outbreak estimates the disease dynamics better than media functions that approximate the media impact using the number of cases and rate of spread. Finally, we apply the model to a typical influenza season in Washington, DC and estimate how the transmission pattern would have changed given different levels of media coverage.

Subjects

Communicable Disease Control/methods; Disease Outbreaks/prevention & control; Mass Media/trends; Communicable Diseases; Communications Media/trends; Epidemics/prevention & control; Forecasting; Hemorrhagic Fever, Ebola/epidemiology; Humans; Incidence; Influenza, Human/epidemiology; Mexico; Probability

14.

The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling.

Sheth, Mallory; Gerovitch, Albert; Welsch, Roy; Markuzon, Natasha.

PLoS One ; 14(10): e0223161, 2019.

Article in English | MEDLINE | ID: mdl-31603902

ABSTRACT

In many data classification problems, a number of methods will give similar accuracy. However, when working with people who are not experts in data science, such as doctors, lawyers, and judges, finding interpretable algorithms can be a critical success factor. Practitioners have a deep understanding of the individual input variables but far less insight into how they interact with each other. For example, there may be ranges of an input variable for which the observed outcome is significantly more or less likely. This paper describes an algorithm for automatic detection of such thresholds, called the Univariate Flagging Algorithm (UFA). The algorithm searches for a separation that optimizes the difference between separated areas while obtaining a high level of support. We evaluate its performance using six sample datasets and demonstrate that thresholds identified by the algorithm align well with published results and known physiological boundaries. We also introduce two classification approaches that use UFA and show that the performance attained on unseen test data is comparable to or better than traditional classifiers when confidence intervals are considered. We identify conditions under which UFA performs well, including applications with large amounts of missing or noisy data, applications with a large number of inputs relative to observations, and applications where incidence of the target is low. We argue that ease of explanation of the results, robustness to missing data and noise, and detection of low incidence adverse outcomes are desirable features for clinical applications that can be achieved with a relatively simple classifier such as UFA.

Subjects

Algorithms; Breast Neoplasms/diagnosis; Diabetes Mellitus/diagnosis; Leukemia, Myeloid, Acute/diagnosis; Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis; Sepsis/diagnosis; Body Temperature; Breast Neoplasms/mortality; Breast Neoplasms/pathology; Datasets as Topic; Diabetes Mellitus/mortality; Diabetes Mellitus/pathology; Female; Humans; Leukemia, Myeloid, Acute/mortality; Leukemia, Myeloid, Acute/pathology; Male; Models, Statistical; Precursor Cell Lymphoblastic Leukemia-Lymphoma/mortality; Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology; Sepsis/mortality; Sepsis/pathology; Survival Analysis

15.

Statistical modeling of the effect of rainfall flushing on dengue transmission in Singapore.

Benedum, Corey M; Seidahmed, Osama M E; Eltahir, Elfatih A B; Markuzon, Natasha.

PLoS Negl Trop Dis ; 12(12): e0006935, 2018 12.

Article in English | MEDLINE | ID: mdl-30521523

ABSTRACT

BACKGROUND: Rainfall patterns are one of the main drivers of dengue transmission as mosquitoes require standing water to reproduce. However, excess rainfall can be disruptive to the Aedes reproductive cycle by "flushing out" aquatic stages from breeding sites. We developed models to predict the occurrence of such "flushing" events from rainfall data and to evaluate the effect of flushing on dengue outbreak risk in Singapore between 2000 and 2016. METHODS: We used machine learning and regression models to predict days with "flushing" in the dataset based on entomological and corresponding rainfall observations collected in Singapore. We used a distributed lag nonlinear logistic regression model to estimate the association between the number of flushing events per week and the risk of a dengue outbreak. RESULTS: Days with flushing were identified through the developed logistic regression model based on entomological data (test set accuracy = 92%). Predictions were based upon the aggregate number of thresholds indicating unusually rainy conditions over multiple weeks. We observed a statistically significant reduction in dengue outbreak risk one to six weeks after flushing events occurred. For weeks with five or more flushing events, compared with weeks with no flushing events, the risk of a dengue outbreak in the subsequent weeks was reduced by 16% to 70%. CONCLUSIONS: We have developed a high accuracy predictive model associating temporal rainfall patterns with flushing conditions. Using predicted flushing events, we have demonstrated a statistically significant reduction in dengue outbreak risk following flushing, with the time lag well aligned with time of mosquito development from larvae and infection transmission. Vector control programs should consider the effects of hydrological conditions in endemic areas on dengue transmission.

Subjects

Aedes/physiology; Dengue/epidemiology; Disease Outbreaks; Models, Statistical; Mosquito Control; Mosquito Vectors/physiology; Animals; Dengue/transmission; Entomology; Humans; Rain; Singapore/epidemiology

16.

Predicting social response to infectious disease outbreaks from internet-based news streams.

Fast, Shannon M; Kim, Louis; Cohn, Emily L; Mekaru, Sumiko R; Brownstein, John S; Markuzon, Natasha.

Ann Oper Res ; 263(1): 551-564, 2018.

Article in English | MEDLINE | ID: mdl-32214588

ABSTRACT

Infectious disease outbreaks often have consequences beyond human health, including concern among the population, economic instability, and sometimes violence. A warning system capable of anticipating social disruptions resulting from disease outbreaks is urgently needed to help decision makers prepare appropriately. We designed a system that operates in near real-time to identify and predict social response. Over 150,000 Internet-based news articles related to outbreaks of 16 diseases in 72 countries and territories were provided by HealthMap. These articles were automatically tagged with indicators of the disease activity and population reaction. An anomaly detection algorithm was implemented on the population reaction indicators to identify periods of unusually severe social response. Then a model was developed to predict the probability of these periods of unusually severe social response occurring in the coming 1, 2, and 3 weeks. This model exhibited remarkably strong performance for diseases with substantial media coverage. For country-disease pairs with a median of 20 or more articles per year, the onset of social response in the next week was correctly predicted over 60% of the time, and 87% of weeks were correctly predicted. Performance was weaker for diseases with little media coverage, and, for these diseases, the main utility of our system is in identifying social response when it occurs, rather than predicting when it will happen in the future. Overall, the developed near real-time prediction approach is a promising step toward developing predictive models to inform responders of the likely social consequences of disease spread.
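
The abstract above mentions an anomaly detection step applied to population reaction indicators but does not say which algorithm was used. As one possible illustration only, the sketch below flags weeks whose indicator count is far above a trailing-window baseline using a z-score. The function name `flag_anomalous_weeks`, the window length, the cutoff, and the weekly counts are assumptions, not details from the paper.

```python
from statistics import mean, stdev
from typing import List


def flag_anomalous_weeks(
    counts: List[float], window: int = 4, z_cutoff: float = 3.0
) -> List[int]:
    """Return indices of weeks whose count is unusually high (hypothetical sketch).

    Each week is compared with the mean and sample standard deviation of the
    preceding `window` weeks; weeks without a full history are never flagged.
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A flat baseline gives no scale for a z-score; skip this week.
            continue
        if (counts[i] - mu) / sigma > z_cutoff:
            flagged.append(i)
    return flagged


# Made-up weekly counts of articles tagged with a population-reaction indicator.
weekly_counts = [2, 3, 2, 3, 2, 3, 20, 3, 2]
print(flag_anomalous_weeks(weekly_counts))
```

Note that a spike inflates the baseline for the following weeks, so a production system would likely use a more robust baseline (for example a median-based one); that refinement is left out here.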

17.

Cost-Effective Control of Infectious Disease Outbreaks Accounting for Societal Reaction.

Fast, Shannon M; González, Marta C; Markuzon, Natasha.

PLoS One ; 10(8): e0136059, 2015.

Article in English | MEDLINE | ID: mdl-26288274

ABSTRACT

BACKGROUND: Studies of cost-effective disease prevention have typically focused on the tradeoff between the cost of disease transmission and the cost of applying control measures. We present a novel approach that also accounts for the cost of social disruptions resulting from the spread of disease. These disruptions, which we call social response, can include heightened anxiety, strain on healthcare infrastructure, economic losses, or violence. METHODOLOGY: The spread of disease and social response are simulated under several different intervention strategies. The modeled social response depends upon the perceived risk of the disease, the extent of disease spread, and the media involvement. Using Monte Carlo simulation, we estimate the total number of infections and total social response for each strategy. We then identify the strategy that minimizes the expected total cost of the disease, which includes the cost of the disease itself, the cost of control measures, and the cost of social response. CONCLUSIONS: The model-based simulations suggest that the least-cost disease control strategy depends upon the perceived risk of the disease, as well as media intervention. The most cost-effective solution for diseases with low perceived risk was to implement moderate control measures. For diseases with higher perceived severity, such as SARS or Ebola, the most cost-effective strategy shifted toward intervening earlier in the outbreak, with greater resources. When intervention elicited increased media involvement, it remained important to control high severity diseases quickly. For moderate severity diseases, however, it became most cost-effective to implement no intervention and allow the disease to run its course. Our simulation results imply that, when diseases are perceived as severe, the costs of social response have a significant influence on selecting the most cost-effective strategy.

Subjects

Communicable Disease Control/economics; Cost Control/methods; Disease Outbreaks/economics; Disease Outbreaks/prevention & control; Primary Prevention/economics; Computer Simulation; Cost of Illness; Cost-Benefit Analysis; Humans; Models, Theoretical
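
The abstract above describes estimating, by Monte Carlo simulation, the expected total cost of each intervention strategy (disease cost plus control cost plus social response cost) and then choosing the cheapest. The sketch below shows only that bookkeeping. It replaces the paper's disease and social response simulation with a simple random draw, and every number, field name, and function name in it is a hypothetical placeholder.

```python
import random
from typing import Dict, Tuple

# Hypothetical unit costs (arbitrary currency units).
COST_PER_INFECTION = 100.0
COST_PER_RESPONSE_UNIT = 50.0

# Hypothetical strategies: fixed control cost and a crude outbreak-size model.
STRATEGIES: Dict[str, Dict[str, float]] = {
    "none": {"control_cost": 0.0, "mean_infections": 1000.0, "sd_infections": 100.0},
    "moderate": {"control_cost": 20000.0, "mean_infections": 400.0, "sd_infections": 60.0},
    "aggressive": {"control_cost": 80000.0, "mean_infections": 100.0, "sd_infections": 30.0},
}


def expected_total_cost(strategy: Dict[str, float], response_per_infection: float,
                        n_runs: int = 2000, seed: int = 0) -> float:
    """Average total cost over n_runs simulated outbreaks (stand-in simulator)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        # Stand-in for a full transmission model: draw an outbreak size.
        infections = max(rng.gauss(strategy["mean_infections"],
                                   strategy["sd_infections"]), 0.0)
        # Stand-in for the social response model: proportional to spread.
        social_response = response_per_infection * infections
        total += (strategy["control_cost"]
                  + COST_PER_INFECTION * infections
                  + COST_PER_RESPONSE_UNIT * social_response)
    return total / n_runs


def pick_least_cost(response_per_infection: float = 0.5) -> Tuple[str, Dict[str, float]]:
    """Return the cheapest strategy name and the estimated cost of each strategy."""
    costs = {name: expected_total_cost(s, response_per_infection)
             for name, s in STRATEGIES.items()}
    return min(costs, key=costs.get), costs


best, costs = pick_least_cost()
print(best, {name: round(c) for name, c in costs.items()})
```

In the paper the social response also depends on perceived risk and media involvement; here those are collapsed into the single assumed parameter `response_per_infection`.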

18.

Modelling the propagation of social response during a disease outbreak.

Fast, Shannon M; González, Marta C; Wilson, James M; Markuzon, Natasha.

J R Soc Interface ; 12(104): 20141105, 2015 Mar 06.

Article in English | MEDLINE | ID: mdl-25589575

ABSTRACT

Epidemic trajectories and associated social responses vary widely between populations, with severe reactions sometimes observed. When confronted with fatal or novel pathogens, people exhibit a variety of behaviours from anxiety to hoarding of medical supplies, overwhelming medical infrastructure and rioting. We developed a coupled network approach to understanding and predicting social response. We couple the disease spread and panic spread processes and model them through local interactions between agents. The social contagion process depends on the prevalence of the disease, its perceived risk and a global media signal. We verify the model by analysing the spread of disease and social response during the 2009 H1N1 outbreak in Mexico City and 2003 severe acute respiratory syndrome and 2009 H1N1 outbreaks in Hong Kong, accurately predicting population-level behaviour. This kind of empirically validated model is critical to exploring strategies for public health intervention, increasing our ability to anticipate the response to infectious disease outbreaks.

Subjects

Disease Outbreaks; Influenza, Human/epidemiology; Severe Acute Respiratory Syndrome/epidemiology; Social Behavior; Communication; Disaster Planning; Disease Progression; Epidemics; Geography; Hong Kong; Humans; Influenza A Virus, H1N1 Subtype; Influenza, Human/transmission; Mexico; Models, Theoretical; Public Health; Risk; Severe Acute Respiratory Syndrome/transmission; Social Media

19.

The Role of Social Mobilization in Controlling Ebola Virus in Lofa County, Liberia.

Fast, Shannon M; Mekaru, Sumiko; Brownstein, John S; Postlethwaite, Timothy A; Markuzon, Natasha.

PLoS Curr ; 7, 2015 May 15.

Article in English | MEDLINE | ID: mdl-26075140

ABSTRACT

The West Africa Ebola virus epidemic now appears to be coming to an end. In the proposed model, we simulate changes in population behavior that help to explain the observed transmission dynamics. We introduce an EVD transmission model accompanied by a model of social mobilization. The model was fit to Lofa County, Liberia through October 2014, using weekly counts of new cases reported by the US CDC. In simulation studies, we analyze the dynamics of the disease transmission with and without population behavior change, given the availability of beds in Ebola treatment units (ETUs) estimated from observed data. Only the model scenario that included individuals' behavioral change achieved a good fit to the observed case counts. Although the capacity of the Lofa County ETUs greatly increased in mid-August, our simulations show that the expansion alone was insufficient to control the outbreak. Modeling the entire outbreak without considering behavior change fit the data poorly, and extrapolating from early data without taking behavioral changes into account led to a prediction of exponential outbreak growth, contrary to the observed decline. Education and awareness-induced behavior change in the population was instrumental in curtailing the Ebola outbreak in Lofa County and is likely playing an important role in stopping the West Africa epidemic altogether.
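
The abstract above contrasts simulations with and without population behavior change. The toy model below illustrates that contrast only qualitatively: it is a discrete-time SIR-style recursion in which behavior change is represented as an exponential decay of the transmission rate. It is not the authors' EVD model (it has no ETU beds and no social mobilization process), and the function name and all parameter values are assumptions chosen for illustration.

```python
import math
from typing import List


def simulate_weekly_cases(weeks: int, beta0: float, gamma: float,
                          population: float, behavior_decay: float = 0.0,
                          initial_infected: float = 1.0) -> List[float]:
    """Return new cases per week from a toy SIR-style recursion (hypothetical).

    beta0          : initial weekly transmission rate
    gamma          : weekly removal rate (assumed <= 1)
    behavior_decay : rate at which transmission falls as behavior changes;
                     0.0 means no behavior change
    """
    susceptible = population - initial_infected
    infected = initial_infected
    weekly_new = []
    for week in range(weeks):
        # Behavior change modeled as a steadily falling transmission rate.
        beta = beta0 * math.exp(-behavior_decay * week)
        new_cases = min(beta * susceptible * infected / population, susceptible)
        removed = gamma * infected
        susceptible -= new_cases
        infected += new_cases - removed
        weekly_new.append(new_cases)
    return weekly_new


no_change = simulate_weekly_cases(30, beta0=1.5, gamma=0.5, population=100_000)
with_change = simulate_weekly_cases(30, beta0=1.5, gamma=0.5, population=100_000,
                                    behavior_decay=0.1)
print(round(sum(no_change)), round(sum(with_change)))
```

With these made-up parameters, the run without behavior change grows until susceptibles are depleted, while the run with a decaying transmission rate peaks early and declines, which mirrors the qualitative pattern the abstract reports.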
