1.
PLoS One ; 19(5): e0300917, 2024.
Article En | MEDLINE | ID: mdl-38743759

Suicide-related media content can have preventive or harmful effects depending on the specific content. Proactive media screening for suicide prevention is hampered by the scarcity of machine learning approaches for detecting specific characteristics in news reports. This study applied machine learning to label large quantities of broadcast (TV and radio) media data according to media recommendations for reporting on suicide. We manually labeled 2519 English transcripts from 44 broadcast sources in Oregon and Washington, USA, published between April 2019 and March 2020. We conducted a content analysis of the media reports' content characteristics. We trained a benchmark of machine learning models comprising a majority classifier, an approach based on word frequency (TF-IDF with a linear SVM), and a deep learning model (BERT). We applied these models first to simpler tasks (e.g., identifying a focus on a suicide death) and subsequently to putatively more complex ones (e.g., determining the main focus of a text from among 14 categories). TF-IDF with SVM and BERT clearly outperformed the naive majority classifier on all characteristics. In a test dataset not used during model training, F1-scores (i.e., the harmonic mean of precision and recall) ranged from 0.90 for celebrity suicide down to 0.58 for identifying the main focus of the media item. Model performance depended strongly on the number of available training samples and much less on the assumed difficulty of the classification task. This study demonstrates that machine learning models can achieve very satisfactory results in classifying suicide-related broadcast media content, including multi-class characteristics, as long as enough training samples are available. The developed models enable future large-scale screening and investigation of broadcast media.
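The word-frequency baseline described above pairs TF-IDF features with a linear SVM. A minimal sketch of such a pipeline in scikit-learn is shown below; the texts and category labels are invented toy examples, not the study's transcripts or annotation scheme.

```python
# Hypothetical TF-IDF + linear SVM text-classification pipeline, in the spirit
# of the benchmark described above. The four documents and two labels below
# are illustrative placeholders only.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

texts = [
    "report on a celebrity suicide death",
    "hotline numbers and prevention resources shared",
    "coverage of a local suicide case",
    "awareness campaign for suicide prevention week",
]
labels = ["case", "prevention", "case", "prevention"]

# TF-IDF turns each document into a weighted word-frequency vector;
# the linear SVM then learns a separating hyperplane between categories.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)
pred = clf.predict(["prevention resources and hotline information"])
```

In practice, the labeled transcripts would be split into training and held-out test sets, with F1 reported per characteristic on the test set.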


Machine Learning, Mass Media, Humans, Suicide, Suicide Prevention, Oregon, Washington, Deep Learning
2.
Aust N Z J Psychiatry ; 57(7): 994-1003, 2023 Jul.
Article En | MEDLINE | ID: mdl-36239594

OBJECTIVE: The aim of this study was to assess associations of various content areas of Twitter posts with help-seeking from the US National Suicide Prevention Lifeline (Lifeline) and with suicides. METHODS: We retrieved 7,150,610 suicide-related tweets geolocated to the United States and posted between 1 January 2016 and 31 December 2018. Using a specially devised machine-learning approach, we categorized posts into content about prevention, suicide awareness, personal suicidal ideation without coping, personal coping and recovery, suicide cases, and other. We then applied seasonal autoregressive integrated moving average (SARIMA) analyses to assess associations of tweet categories with daily Lifeline calls and suicides on the same day. We hypothesized that coping-related and prevention-related tweets are associated with greater help-seeking and potentially fewer suicides. RESULTS: The percentage of posts per category was 15.4% (standard deviation: 7.6%) for awareness, 13.8% (standard deviation: 9.4%) for prevention, 12.3% (standard deviation: 9.1%) for suicide cases, 2.4% (standard deviation: 2.1%) for suicidal ideation without coping, and 0.8% (standard deviation: 1.7%) for coping posts. Tweets about prevention were positively associated with Lifeline calls (B = 1.94, standard error = 0.73, p = 0.008) and negatively associated with suicides (B = -0.11, standard error = 0.05, p = 0.038). The total number of tweets was negatively associated with calls (B = -0.01, standard error = 0.0003, p = 0.007) and positively associated with suicides (B = 6.4 × 10⁻⁵, standard error = 2.6 × 10⁻⁵, p = 0.015). CONCLUSION: This is the first large-scale study to suggest that the daily volume of specific suicide-prevention-related social media content on Twitter corresponds to higher daily levels of help-seeking behaviour and lower daily numbers of suicide deaths. PREREGISTRATION: AsPredicted #66922, 26 May 2021.


Social Media, Suicide, Humans, United States/epidemiology, Suicide Prevention, Suicidal Ideation, Data Collection
3.
J Med Internet Res ; 24(8): e34705, 2022 08 17.
Article En | MEDLINE | ID: mdl-35976193

BACKGROUND: Research has repeatedly shown that exposure to suicide-related news media content is associated with suicide rates, with some content characteristics likely having harmful and others potentially protective effects. Although good evidence exists for a few selected characteristics, systematic and large-scale investigations are lacking. Moreover, the growing importance of social media, particularly among young adults, calls for studies on the effects of the content posted on these platforms. OBJECTIVE: This study applies natural language processing and machine learning methods to classify large quantities of social media data according to characteristics identified as potentially harmful or beneficial in media effects research on suicide and prevention. METHODS: We manually labeled 3202 English tweets using a novel annotation scheme that classifies suicide-related tweets into 12 categories. Based on these categories, we trained a benchmark of machine learning models for a multiclass and a binary classification task. As models, we included a majority classifier, an approach based on word frequency (term frequency-inverse document frequency with a linear support vector machine), and 2 state-of-the-art deep learning models (Bidirectional Encoder Representations from Transformers [BERT] and XLNet). The first task classified posts into 6 main content categories, which are particularly relevant for suicide prevention based on previous evidence. These included personal stories of either suicidal ideation and attempts or coping and recovery, calls for action intending to spread either problem awareness or prevention-related information, reporting of suicide cases, and a residual category for tweets irrelevant to the other 5. The second classification task was binary and separated posts in the 11 categories referring to actual suicide from posts in the off-topic category, which use suicide-related terms in another meaning or context.
RESULTS: In both tasks, the performance of the 2 deep learning models was very similar and better than that of the majority or the word frequency classifier. BERT and XLNet reached accuracy scores above 73% on average across the 6 main categories in the test set and F1-scores between 0.69 and 0.85 for all but the suicidal ideation and attempts category (F1=0.55). In the binary classification task, they correctly labeled around 88% of the tweets as about suicide versus off-topic, with BERT achieving F1-scores of 0.93 (about suicide) and 0.74 (off-topic). These classification performances were similar to human performance in most cases and were comparable with state-of-the-art models on similar tasks. CONCLUSIONS: The achieved performance scores highlight machine learning as a useful tool for media effects research on suicide. The clear advantage of BERT and XLNet suggests that there is crucial information about meaning in the context of words beyond mere word frequencies in tweets about suicide. By making data labeling more efficient, this work has enabled large-scale investigations on harmful and protective associations of social media content with suicide rates and help-seeking behavior.
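The results above report both overall accuracy and per-class F1-scores, which can diverge (e.g., 88% accuracy overall but F1 of 0.93 vs. 0.74 for the two classes). The toy example below illustrates how these metrics are computed; the labels are invented, not the study's data.

```python
# Toy illustration of the evaluation metrics reported above: overall accuracy
# (fraction of correct predictions) and per-class F1, the harmonic mean of
# precision and recall. The label vectors are fabricated for demonstration.
from sklearn.metrics import accuracy_score, f1_score

y_true = ["suicide", "suicide", "off-topic", "off-topic", "suicide", "off-topic"]
y_pred = ["suicide", "off-topic", "off-topic", "off-topic", "suicide", "off-topic"]

acc = accuracy_score(y_true, y_pred)  # 5 of 6 correct
# average=None returns one F1 per class, in the order given by `labels`.
f1_per_class = f1_score(y_true, y_pred, average=None,
                        labels=["suicide", "off-topic"])
```

Here the "suicide" class has perfect precision but imperfect recall (one true positive was missed), while "off-topic" has perfect recall but one false positive, so the two F1-scores differ even though accuracy is a single number.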


Social Media, Suicide Prevention, Humans, Machine Learning, Natural Language Processing, Suicidal Ideation, Young Adult