Results 1 - 20 of 63
1.
J Biomed Inform ; 125: 103971, 2021 Dec 14.
Article in English | MEDLINE | ID: mdl-34920127

ABSTRACT

OBJECTIVE: Quantify tradeoffs in performance, reproducibility, and resource demands across several strategies for developing clinically relevant word embeddings. MATERIALS AND METHODS: We trained separate embeddings on all full-text manuscripts in the PubMed Central (PMC) Open Access subset, case reports therein, the English Wikipedia corpus, the Medical Information Mart for Intensive Care (MIMIC) III dataset, and all notes in the University of Pennsylvania Health System (UPHS) electronic health record. We tested embeddings in six clinically relevant tasks including mortality prediction and de-identification, and assessed performance using the scaled Brier score (SBS) and the proportion of notes successfully de-identified, respectively. RESULTS: Embeddings from UPHS notes best predicted mortality (SBS 0.30, 95% CI 0.15 to 0.45) while Wikipedia embeddings performed worst (SBS 0.12, 95% CI -0.05 to 0.28). Wikipedia embeddings most consistently (78% of notes) and the full PMC corpus embeddings least consistently (48%) de-identified notes. Across all six tasks, the full PMC corpus demonstrated the most consistent performance, and the Wikipedia corpus the least. Corpus size ranged from 49 million tokens (PMC case reports) to 10 billion (UPHS). DISCUSSION: Embeddings trained on published case reports performed at least as well as embeddings trained on other corpora in most tasks, and clinical corpora consistently outperformed non-clinical corpora. No single corpus produced a strictly dominant set of embeddings across all tasks and so the optimal training corpus depends on intended use. CONCLUSION: Embeddings trained on published case reports performed comparably on most clinical tasks to embeddings trained on larger corpora. Open access corpora allow training of clinically relevant, effective, and reproducible embeddings.

2.
J Affect Disord ; 2021 Dec 25.
Article in English | MEDLINE | ID: mdl-34963643

ABSTRACT

BACKGROUND: Personal sensing has shown promise for detecting behavioral correlates of depression, but there is little work examining personal sensing of cognitive and affective states. Digital language, particularly through personal text messages, is one source that can measure these markers. METHODS: We correlated privacy-preserving sentiment analysis of text messages with self-reported depression symptom severity. We enrolled 219 U.S. adults in a 16 week longitudinal observational study. Participants installed a personal sensing app on their phones, which administered self-report PHQ-8 assessments of their depression severity, collected phone sensor data, and computed anonymized language sentiment scores from their text messages. We also trained machine learning models for predicting end-of-study self-reported depression status using blocks of phone sensor and text features. RESULTS: In correlation analyses, we find that degrees of depression, emotional, and personal pronoun language categories correlate most strongly with self-reported depression, validating prior literature. Our classification models which predict binary depression status achieve a leave-one-out AUC of 0.72 when only considering text features and 0.76 when combining text with other networked smartphone sensors. LIMITATIONS: Participants were recruited from a panel that over-represented women, Caucasians, and individuals with self-reported depression at baseline. As language use differs across demographic factors, generalizability beyond this population may be limited. The study period also coincided with the initial COVID-19 outbreak in the United States, which may have affected smartphone sensor data quality. CONCLUSIONS: Effective depression prediction through text message sentiment, especially when combined with other personal sensors, could enable comprehensive mental health monitoring and intervention.

3.
Article in English | MEDLINE | ID: mdl-34791302

ABSTRACT

OBJECTIVE: Frailty is a prevalent risk factor for adverse outcomes among patients with chronic lung disease. However, identifying frail patients who may benefit from interventions is challenging using standard data sources. We therefore sought to identify phrases in clinical notes in the electronic health record (EHR) that describe actionable frailty syndromes. MATERIALS AND METHODS: We used an active learning strategy to select notes from the EHR and annotated each sentence for 4 actionable aspects of frailty: respiratory impairment, musculoskeletal problems, fall risk, and nutritional deficiencies. We compared the performance of regression, tree-based, and neural network models to predict the labels for each sentence. We evaluated performance with the scaled Brier score (SBS), where 1 is perfect and 0 is uninformative, and the positive predictive value (PPV). RESULTS: We manually annotated 155 952 sentences from 326 patients. Elastic net regression had the best performance across all 4 frailty aspects (SBS 0.52, 95% confidence interval [CI] 0.49-0.54) followed by random forests (SBS 0.49, 95% CI 0.47-0.51), and multi-task neural networks (SBS 0.39, 95% CI 0.37-0.42). For the elastic net model, the PPV for identifying the presence of respiratory impairment was 54.8% (95% CI 53.3%-56.6%) at a sensitivity of 80%. DISCUSSION: Classification models using EHR notes can effectively identify actionable aspects of frailty among patients living with chronic lung disease. Regression performed better than random forest and neural network models. CONCLUSIONS: NLP-based models offer promising support to population health management programs that seek to identify and refer community-dwelling patients with frailty for evidence-based interventions.
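The scaled Brier score mentioned above (1 is perfect, 0 is uninformative) can be illustrated with a minimal Python sketch. It assumes the common definition SBS = 1 - Brier / Brier_ref, where the reference model predicts the observed prevalence for every case; the abstract does not state the exact formula the authors used, and the function names here are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """Scaled Brier score under the assumed definition 1 - Brier / Brier_ref.

    Brier_ref is the Brier score of a model that predicts the observed
    prevalence for every case, which equals prevalence * (1 - prevalence).
    """
    prevalence = sum(y_true) / len(y_true)
    reference = prevalence * (1 - prevalence)
    if reference == 0:
        raise ValueError("undefined when all outcomes are identical")
    return 1 - brier_score(y_true, y_prob) / reference
```

Under this definition, predicting the prevalence for everyone scores exactly 0, perfect predictions score 1, and a model worse than the prevalence baseline scores below 0, which is consistent with confidence intervals that extend below zero.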

4.
Psychol Methods ; 26(4): 398-427, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34726465

ABSTRACT

Technology now makes it possible to understand efficiently and at large scale how people use language to reveal their everyday thoughts, behaviors, and emotions. Written text has been analyzed through both theory-based, closed-vocabulary methods from the social sciences as well as data-driven, open-vocabulary methods from computer science, but these approaches have not been comprehensively compared. To provide guidance on best practices for automatically analyzing written text, this narrative review and quantitative synthesis compares five predominant closed- and open-vocabulary methods: Linguistic Inquiry and Word Count (LIWC), the General Inquirer, DICTION, Latent Dirichlet Allocation, and Differential Language Analysis. We compare the linguistic features associated with gender, age, and personality across the five methods using an existing dataset of Facebook status updates and self-reported survey data from 65,896 users. Results are fairly consistent across methods. The closed-vocabulary approaches efficiently summarize concepts and are helpful for understanding how people think, with LIWC2015 yielding the strongest, most parsimonious results. Open-vocabulary approaches reveal more specific and concrete patterns across a broad range of content domains, better address ambiguous word senses, and are less prone to misinterpretation, suggesting that they are well-suited for capturing the nuances of everyday psychological processes. We detail several errors that can occur in closed-vocabulary analyses, the impact of sample size, number of words per user and number of topics included in open-vocabulary analyses, and implications of different analytical decisions. We conclude with recommendations for researchers, advocating for a complementary approach that combines closed- and open-vocabulary methods. (PsycInfo Database Record (c) 2021 APA, all rights reserved).


Subjects
Linguistics , Vocabulary , Emotions , Humans , Language , Personality
5.
J Am Heart Assoc ; 10(19): e020596, 2021 10 05.
Article in English | MEDLINE | ID: mdl-34558301

ABSTRACT

Background Online platforms are used to manage aspects of our lives including health outside clinical settings. Little is known about the effectiveness of using online platforms to manage hypertension. We assessed effects of tweeting/retweeting cardiovascular health content by individuals with poorly controlled hypertension on systolic blood pressure (SBP) and patient activation. Methods and Results We conducted this 2-arm randomized controlled trial. Eligibility included diagnosis of hypertension; SBP >140 mm Hg; and an existing Twitter account or willingness to create one to follow the study Twitter account. The intervention arm was asked to tweet/retweet health content 2×/week using a specific hashtag for the study duration (6 months). The main measures include primary outcome change in SBP; secondary outcome point change in Patient Activation Measure (PAM). We remotely recruited and enrolled 611 participants, mean age 52 (SD, 11.7). Mean baseline SBP for the intervention group was 155.8 and for control was 155.6. At 6 months, mean SBP for the intervention group was 137.6 and for control was 135.7. Mean change in SBP from baseline to 6 months for the intervention group was -18.5 and for control was -19.8 (P=0.48). Mean PAM at baseline for the intervention group was 70.3 and for control was 72.7. At 6 months, mean PAM scores were 71.1 (intervention) and 75.6 (control). Mean change in PAM score for the intervention group was 0.0 and for control was 3.3 (P=0.12). Conclusions Recruiting and engaging patients and collecting outcome measures remotely are feasible using Twitter. Encouraging patients with poorly controlled hypertension to tweet or retweet health content on Twitter did not improve SBP or PAM score at 6 months. Registration URL: https://www.clinicaltrials.gov. Unique identifier: NCT02622256.

6.
J Pers ; 2021 Sep 18.
Article in English | MEDLINE | ID: mdl-34536229

ABSTRACT

OBJECTIVE: We explore the personality of counties as assessed through linguistic patterns on social media. Such studies were previously limited by the cost and feasibility of large-scale surveys; however, language-based computational models applied to large social media datasets now allow for large-scale personality assessment. METHOD: We applied a language-based assessment of the five factor model of personality to 6,064,267 U.S. Twitter users. We aggregated the Twitter-based personality scores to 2,041 counties and compared to political, economic, social, and health outcomes measured through surveys and by government agencies. RESULTS: There was significant personality variation across counties. Openness to experience was higher on the coasts, conscientiousness was uniformly spread, extraversion was higher in southern states, agreeableness was higher in western states, and emotional stability was highest in the south. Across 13 outcomes, language-based personality estimates replicated patterns that have been observed in individual-level and geographic studies. This includes higher Republican vote share in less agreeable counties and increased life satisfaction in more conscientious counties. CONCLUSIONS: Results suggest that regions vary in their personality and that these differences can be studied through computational linguistic analysis of social media. Furthermore, these methods may be used to explore other psychological constructs across geographies.

7.
J Med Internet Res ; 23(9): e22844, 2021 09 03.
Article in English | MEDLINE | ID: mdl-34477562

ABSTRACT

BACKGROUND: The assessment of behaviors related to mental health typically relies on self-report data. Networked sensors embedded in smartphones can measure some behaviors objectively and continuously, with no ongoing effort. OBJECTIVE: This study aims to evaluate whether changes in phone sensor-derived behavioral features were associated with subsequent changes in mental health symptoms. METHODS: This longitudinal cohort study examined continuously collected phone sensor data and symptom severity data, collected every 3 weeks, over 16 weeks. The participants were recruited through national research registries. Primary outcomes included depression (8-item Patient Health Questionnaire), generalized anxiety (Generalized Anxiety Disorder 7-item scale), and social anxiety (Social Phobia Inventory) severity. Participants were adults who owned Android smartphones. Participants clustered into 4 groups: multiple comorbidities, depression and generalized anxiety, depression and social anxiety, and minimal symptoms. RESULTS: A total of 282 participants were aged 19-69 years (mean 38.9, SD 11.9 years), and the majority were female (223/282, 79.1%) and White participants (226/282, 80.1%). Among the multiple comorbidities group, depression changes were preceded by changes in GPS features (Time: r=-0.23, P=.02; Locations: r=-0.36, P<.001), exercise duration (r=0.39; P=.03) and use of active apps (r=-0.31; P<.001). Among the depression and anxiety groups, changes in depression were preceded by changes in GPS features for Locations (r=-0.20; P=.03) and Transitions (r=-0.21; P=.03). Depression changes were not related to subsequent sensor-derived features. The minimal symptoms group showed no significant relationships. There were no associations between sensor-based features and anxiety and minimal associations between sensor-based features and social anxiety. 
CONCLUSIONS: Changes in sensor-derived behavioral features are associated with subsequent depression changes, but not vice versa, suggesting a directional relationship in which changes in sensed behaviors are associated with subsequent changes in symptoms.


Subjects
Depression , Smartphone , Adult , Anxiety/diagnosis , Anxiety/epidemiology , Anxiety Disorders , Depression/diagnosis , Depression/epidemiology , Female , Humans , Longitudinal Studies , Male
8.
Proc Natl Acad Sci U S A ; 118(39)2021 09 28.
Article in English | MEDLINE | ID: mdl-34544875

ABSTRACT

On May 25, 2020, George Floyd, an unarmed Black American male, was killed by a White police officer. Footage of the murder was widely shared. We examined the psychological impact of Floyd's death using two population surveys that collected data before and after his death; one from Gallup (117,568 responses from n = 47,355) and one from the US Census (409,652 responses from n = 319,471). According to the Gallup data, in the week following Floyd's death, anger and sadness increased to unprecedented levels in the US population. During this period, more than a third of the US population reported these emotions. These increases were more pronounced for Black Americans, nearly half of whom reported these emotions. According to the US Census Household Pulse data, in the week following Floyd's death, depression and anxiety severity increased among Black Americans at significantly higher rates than that of White Americans. Our estimates suggest that this increase corresponds to an additional 900,000 Black Americans who would have screened positive for depression, associated with a burden of roughly 2.7 million to 6.3 million mentally unhealthy days.


Subjects
Anxiety/epidemiology , Depression/epidemiology , Emotions/physiology , Homicide/psychology , Mental Health/ethnology , Police/statistics & numerical data , Racism/psychology , Adolescent , Adult , African Americans/psychology , Anger/physiology , Anxiety/psychology , Depression/psychology , Female , Humans , Male , Middle Aged , United States/epidemiology , Young Adult
9.
Crit Care Med ; 49(8): 1312-1321, 2021 08 01.
Article in English | MEDLINE | ID: mdl-33711001

ABSTRACT

OBJECTIVES: The National Early Warning Score, Modified Early Warning Score, and quick Sepsis-related Organ Failure Assessment can predict clinical deterioration. These scores exhibit only moderate performance and are often evaluated using aggregated measures over time. A simulated prospective validation strategy that assesses multiple predictions per patient-day would provide the best pragmatic evaluation. We developed a deep recurrent neural network deterioration model and conducted a simulated prospective evaluation. DESIGN: Retrospective cohort study. SETTING: Four hospitals in Pennsylvania. PATIENTS: Inpatient adults discharged between July 1, 2017, and June 30, 2019. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: We trained a deep recurrent neural network and a logistic regression model using data from electronic health records to predict hourly the 24-hour composite outcome of transfer to ICU or death. We analyzed 146,446 hospitalizations with 16.75 million patient-hours. The hourly event rate was 1.6% (12,842 transfers or deaths, corresponding to 260,295 patient-hours within the predictive horizon). On a hold-out dataset, the deep recurrent neural network achieved an area under the precision-recall curve of 0.042 (95% CI, 0.04-0.043), comparable with the logistic regression model (0.043; 95% CI, 0.041-0.045), and outperformed the National Early Warning Score (0.034; 95% CI, 0.032-0.035), Modified Early Warning Score (0.028; 95% CI, 0.027-0.03), and quick Sepsis-related Organ Failure Assessment (0.021; 95% CI, 0.021-0.022). For a fixed sensitivity of 50%, the deep recurrent neural network achieved a positive predictive value of 3.4% (95% CI, 3.4-3.5) and outperformed the logistic regression model (3.1%; 95% CI, 3.1-3.2), National Early Warning Score (2.0%; 95% CI, 2.0-2.0), Modified Early Warning Score (1.5%; 95% CI, 1.5-1.5), and quick Sepsis-related Organ Failure Assessment (1.5%; 95% CI, 1.5-1.5).
CONCLUSIONS: Commonly used early warning scores for clinical decompensation, along with a logistic regression model and a deep recurrent neural network model, show very poor performance characteristics when assessed using a simulated prospective validation. None of these models may be suitable for real-time deployment.
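Reporting the positive predictive value "for a fixed sensitivity of 50%" amounts to choosing the score threshold at which half of the true events are flagged and then asking what fraction of the flagged cases are true events. The Python sketch below shows one way to compute this from labels and risk scores. It is an illustration only, not the authors' evaluation pipeline, and the function name and tie-handling rule are assumptions.

```python
import math


def ppv_at_sensitivity(y_true, scores, target_sensitivity):
    """Return (ppv, threshold) at the highest threshold reaching the target sensitivity.

    Cases are flagged when score >= threshold. Cases tied at the same score
    are admitted together, so the achieved sensitivity may exceed the target.
    """
    if not 0 < target_sensitivity <= 1:
        raise ValueError("target_sensitivity must be in (0, 1]")
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    # Number of true positives required; the small tolerance guards against
    # floating-point noise in the product.
    needed = max(1, math.ceil(target_sensitivity * n_pos - 1e-9))
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold one distinct score at a time.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if tp >= needed:
            return tp / (tp + fp), threshold
    raise RuntimeError("unreachable: all positives are admitted by the end")
```

Because the PPV depends on the event rate as well as on the model, a low hourly event rate such as the 1.6% reported above caps the achievable PPV unless the model ranks events far ahead of non-events.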


Subjects
Clinical Deterioration , Critical Care/standards , Deep Learning/standards , Organ Dysfunction Scores , Sepsis/therapy , Adult , Humans , Male , Middle Aged , Pennsylvania , Retrospective Studies , Risk Assessment
10.
NPJ Digit Med ; 4(1): 55, 2021 Mar 25.
Article in English | MEDLINE | ID: mdl-33767336

ABSTRACT

An understanding of healthcare super-utilizers' online behaviors could better identify experiences to inform interventions. In this retrospective case-control study, we analyzed patients' social media posts to better understand their day-to-day behaviors and emotions expressed online. Patients included those receiving care in an urban academic emergency department who consented to share access to their historical Facebook posts and electronic health records. Super-utilizers were defined as patients with more than six visits to the Emergency Department (ED) in a year. We compared posts by super-utilizers with a matched group using propensity scoring based on age, gender and Charlson comorbidity index. Super-utilizers were more likely to post about confusion and negativity (D = .65, 95% CI [.38, .95]), self-reflection (D = .63 [.35, .91]), avoidance (D = .62 [.34, .90]), swearing (D = .52 [.24, .79]), sleep (D = .60 [.32, .88]), seeking help and attention (D = .61 [.33, .89]), psychosomatic symptoms (D = .49 [.22, .77]), self-agency (D = .56 [.29, .85]), anger (D = .51 [.24, .79]), stress (D = .46 [.19, .73]), and lonely expressions (D = .44 [.17, .71]). Insights from this study can potentially supplement offline community care services with online social support interventions considering the high engagement of super-utilizers on social media.

11.
JMIR Cardio ; 5(1): e24473, 2021 Feb 19.
Article in English | MEDLINE | ID: mdl-33605888

ABSTRACT

BACKGROUND: Current atherosclerotic cardiovascular disease (ASCVD) predictive models have limitations; thus, efforts are underway to improve the discriminatory power of ASCVD models. OBJECTIVE: We sought to evaluate the discriminatory power of social media posts to predict the 10-year risk for ASCVD as compared to that of pooled cohort risk equations (PCEs). METHODS: We consented patients receiving care in an urban academic emergency department to share access to their Facebook posts and electronic medical records (EMRs). We retrieved Facebook status updates up to 5 years prior to study enrollment for all consenting patients. We identified patients (N=181) without a prior history of coronary heart disease, an ASCVD score in their EMR, and more than 200 words in their Facebook posts. Using Facebook posts from these patients, we applied a machine-learning model to predict 10-year ASCVD risk scores. Using a machine-learning model and a psycholinguistic dictionary, Linguistic Inquiry and Word Count, we evaluated if language from posts alone could predict differences in risk scores and the association of certain words with risk categories, respectively. RESULTS: The machine-learning model predicted the 10-year ASCVD risk scores for the categories <5%, 5%-7.4%, 7.5%-9.9%, and ≥10% with area under the curve (AUC) values of 0.78, 0.57, 0.72, and 0.61, respectively. The machine-learning model distinguished between low risk (<10%) and high risk (>10%) with an AUC of 0.69. Additionally, the machine-learning model predicted the ASCVD risk score with Pearson r=0.26. Using Linguistic Inquiry and Word Count, patients with higher ASCVD scores were more likely to use words associated with sadness (r=0.32). CONCLUSIONS: Language used on social media can provide insights about an individual's ASCVD risk and inform approaches to risk modification.

12.
Ann Surg ; 273(5): 900-908, 2021 05 01.
Article in English | MEDLINE | ID: mdl-33074901

ABSTRACT

OBJECTIVE: The aim of this study was to systematically assess the application and potential benefits of natural language processing (NLP) in surgical outcomes research. SUMMARY BACKGROUND DATA: Widespread implementation of electronic health records (EHRs) has generated a massive patient data source. Traditional methods of data capture, such as billing codes and/or manual review of free-text narratives in EHRs, are highly labor-intensive, costly, subjective, and potentially prone to bias. METHODS: A literature search of PubMed, MEDLINE, Web of Science, and Embase identified all articles published starting in 2000 that used NLP models to assess perioperative surgical outcomes. Evaluation metrics of NLP systems were assessed by means of pooled analysis and meta-analysis. Qualitative synthesis was carried out to assess the results and risk of bias on outcomes. RESULTS: The present study included 29 articles, with over half (n = 15) published after 2018. The most common outcome identified using NLP was postoperative complications (n = 14). Compared to traditional non-NLP models, NLP models identified postoperative complications with higher sensitivity [0.92 (0.87-0.95) vs 0.58 (0.33-0.79), P < 0.001]. The specificities were comparable at 0.99 (0.96-1.00) and 0.98 (0.95-0.99), respectively. Using summary of likelihood ratio matrices, traditional non-NLP models have clinical utility for confirming documentation of outcomes/diagnoses, whereas NLP models may be reliably utilized for both confirming and ruling out documentation of outcomes/diagnoses. CONCLUSIONS: NLP usage to extract a range of surgical outcomes, particularly postoperative complications, is accelerating across disciplines and areas of clinical outcomes research. NLP and traditional non-NLP approaches demonstrate similar performance measures, but NLP is superior in ruling out documentation of surgical outcomes.
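The confirming versus ruling-out distinction drawn above can be made concrete with likelihood ratios computed from the pooled sensitivity and specificity point estimates. The Python sketch below uses the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. It is a simplified illustration: the study's likelihood ratio matrices come from its meta-analysis and may not equal these direct calculations, and the function name is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from a sensitivity and specificity, both strictly between 0 and 1."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must be strictly between 0 and 1")
    # LR+: how much more often a positive result occurs when the outcome is present.
    lr_positive = sensitivity / (1 - specificity)
    # LR-: how much more often a negative result occurs when the outcome is present.
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Pooled point estimates quoted in the abstract above.
nlp = likelihood_ratios(0.92, 0.99)
non_nlp = likelihood_ratios(0.58, 0.98)
```

With these inputs, the NLP models give an LR+ of about 92 and an LR- of about 0.08, while the non-NLP models give an LR+ of about 29 and an LR- of about 0.43. By the commonly cited rule of thumb that an LR+ above 10 supports confirming and an LR- below 0.1 supports ruling out, both approaches confirm well but only the NLP models rule out well.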


Subjects
Algorithms , Electronic Health Records/statistics & numerical data , Narration , Natural Language Processing , Surgical Procedures, Operative , Humans
13.
J Pers Soc Psychol ; 120(2): 370-383, 2021 Feb.
Article in English | MEDLINE | ID: mdl-32700960

ABSTRACT

A rapidly growing literature has attempted to explain Donald Trump's success in the 2016 U.S. presidential election as a result of a wide variety of differences in individual characteristics, attitudes, and social processes. We propose that the economic and psychological processes previously established have in common that they generated or electorally capitalized on unhappiness in the electorate, which emerges as a powerful high-level predictor of the 2016 electoral outcome. Drawing on a large dataset covering over 2 million individual surveys, which we aggregated to the county level, we find that low levels of evaluative, experienced, and eudaemonic subjective well-being (SWB) are strongly predictive of Trump's victory, accounting for an extensive list of demographic, ideological, and socioeconomic covariates and robustness checks. County-level future life evaluation alone correlates with the Trump vote share over Republican baselines at r = -.78 in the raw data, a magnitude rarely seen in the social sciences. We show similar findings when examining the association between individual-level life satisfaction and Trump voting. Low levels of SWB also predict anti-incumbent voting at the 2012 election, both at the county and individual level. The findings suggest that SWB is a powerful high-level marker of (dis)content and that SWB should be routinely considered alongside economic explanations of electoral choice. (PsycInfo Database Record (c) 2021 APA, all rights reserved).


Subjects
Happiness , Politics , Attitude , Humans , Mental Health , Surveys and Questionnaires , United States
14.
Womens Health (Lond) ; 16: 1745506520949392, 2020.
Article in English | MEDLINE | ID: mdl-33028170

ABSTRACT

We sought to evaluate whether there was variability in language used on social media across different time points of pregnancy (before, during, and after pregnancy, as well as by trimester and parity). Consenting patients shared access to their individual Facebook posts and electronic medical records. Random forest models trained on Facebook posts could differentiate first trimester of pregnancy from 3 months before pregnancy (F1 score = .63) and from a random 3-month time period (F1 score = .64). Posts during pregnancy were more likely to include themes about family (β = .22), food craving (β = .14), and date/times (β = .13), while posts 3 months prior to pregnancy included themes about social life (β = .30), sleep (β = .31), and curse words (β = .27), and 3 months post-pregnancy included themes of gratitude (β = .17), health appointments (β = .21), and religiosity (β = .18). Users who were pregnant for the first time were more likely to post about lack of sleep (β = .15), activities of daily living (β = .09), and communication (β = .08) compared with those who were pregnant after having a child, who posted about others' birthdays (β = .16) and life events (β = .12). A better understanding about social media timelines can provide insight into lifestyle choices that are specific to pregnancy.


Subjects
Language , Medical Records , Parity , Pregnancy Trimesters , Social Media , Activities of Daily Living , Adult , Female , Humans , Life Style , Pregnancy , Young Adult
16.
Sci Rep ; 10(1): 11456, 2020 Jul 07.
Article in English | MEDLINE | ID: mdl-32632209

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

17.
medRxiv ; 2020 May 18.
Article in English | MEDLINE | ID: mdl-32511551

ABSTRACT

The COVID-19 outbreak has clear clinical and economic impacts, but also affects behaviors e.g. through social distancing, and may increase stress and anxiety. However, while case numbers are tracked daily, we know little about the psychological effects of the outbreak on individuals in the moment. Here we examine the psychological and behavioral shifts over the initial stages of the outbreak in the United States in an observational longitudinal study. Through GPS phone data we find that homestay is increasing, while being at work dropped precipitously. Using regular real-time experiential surveys we observe an overall increase in stress and mood levels which is similar in size to the weekend vs. weekday differences. As there is a significant difference between weekday and weekend mood and stress levels, this is an important decrease in wellbeing. For some, especially those affected by job loss, the mental health impact is severe.

20.
Clin Transl Radiat Oncol ; 22: 69-75, 2020 May.
Article in English | MEDLINE | ID: mdl-32274426

ABSTRACT

Background and Purpose: Radiation esophagitis is a clinically important toxicity seen with treatment for locally-advanced non-small cell lung cancer. There is considerable disagreement among prior studies in identifying predictors of radiation esophagitis. We apply machine learning algorithms to identify factors contributing to the development of radiation esophagitis to uncover previously unidentified criteria and more robust dosimetric factors. Materials and Methods: We used machine learning approaches to identify predictors of grade ≥ 3 radiation esophagitis in a cohort of 202 consecutive locally-advanced non-small cell lung cancer patients treated with definitive chemoradiation from 2008 to 2016. We evaluated 35 clinical features per patient grouped into risk factors, comorbidities, imaging, stage, histology, radiotherapy, chemotherapy and dosimetry. Univariate and multivariate analyses were performed using a panel of 11 machine learning algorithms combined with predictive power assessments. Results: All patients were treated to a median dose of 66.6 Gy at 1.8 Gy per fraction using photon (89.6%) and proton (10.4%) beam therapy, most often with concurrent chemotherapy (86.6%). 11.4% of patients developed grade ≥ 3 radiation esophagitis. On univariate analysis, no individual feature was found to predict radiation esophagitis (AUC range 0.45-0.55, p ≥ 0.07). In multivariate analysis, all machine learning algorithms exhibited poor predictive performance (AUC range 0.46-0.56, p ≥ 0.07). Conclusions: Contemporary machine learning algorithms applied to our modern, relatively large institutional cohort could not identify any reliable predictors of grade ≥ 3 radiation esophagitis. Additional patients are needed, and novel patient-specific and treatment characteristics should be investigated to develop clinically meaningful methods to mitigate this survival altering toxicity.
