ABSTRACT
Deceased public figures are often said to live on in collective memory. We quantify this phenomenon by tracking mentions of 2,362 public figures in English-language online news and social media (Twitter) 1 y before and after death. We measure the sharp spike and rapid decay of attention following death and model collective memory as a composition of communicative and cultural memory. Clustering reveals four patterns of postmortem memory, and regression analysis shows that boosts in media attention are largest for premortem popular anglophones who died a young, unnatural death; that long-term boosts are smallest for leaders and largest for artists; and that, while both the news and Twitter are triggered by young and unnatural deaths, the news additionally curates collective memory when old persons or leaders die. Overall, we illuminate the age-old question of who is remembered by society, and the distinct roles of news and social media in collective memory formation.
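One way to picture a composition of communicative and cultural memory is as the sum of a fast-decaying and a slow-decaying component. The sketch below assumes a simple two-exponential form; the function name, the functional form, and all parameter values are illustrative assumptions, not the model fitted in the study.

```python
import math


def attention(t_days, fast_weight=0.9, fast_rate=0.5, slow_rate=0.01):
    """Hypothetical post-death attention curve, normalized to 1.0 at t = 0.

    The fast term stands in for communicative memory and the slow term
    for cultural memory. All defaults are made-up illustration values.
    """
    slow_weight = 1.0 - fast_weight
    communicative = fast_weight * math.exp(-fast_rate * t_days)
    cultural = slow_weight * math.exp(-slow_rate * t_days)
    return communicative + cultural


# Attention on each day of the year following death (day 0 = day of death).
curve = [attention(t) for t in range(0, 366)]
```

With these toy parameters the curve drops steeply in the first days and then flattens into a long, slowly decaying tail.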
Subjects
Mass Media/trends , Social Identification , Social Media/trends , Communication , Humans , Mass Gatherings , Memory , Sociological Factors
ABSTRACT
BACKGROUND: e-Cigarette (electronic cigarette) use has been a public health issue in the United States. On June 23, 2022, the US Food and Drug Administration (FDA) issued marketing denial orders (MDOs) to Juul Labs Inc for all their products currently marketed in the United States. However, one day later, on June 24, 2022, a federal appeals court granted a temporary reprieve to Juul Labs that allowed it to keep its e-cigarettes on the market. As the conversation around Juul continues to evolve, it is crucial to gain insights into the sentiments and opinions expressed by individuals on social media. OBJECTIVE: This study aims to conduct a comprehensive analysis of tweets before and after the ban on Juul, aiming to shed light on public perceptions and sentiments surrounding this contentious topic and to better understand the life cycle of public health-related policy on social media. METHODS: Natural language processing (NLP) techniques were used, including state-of-the-art BERTopic topic modeling and sentiment analysis. A total of 6023 tweets and 22,288 replies or retweets were collected from Twitter (rebranded as X in 2023) between June 2022 and October 2022. The encoded topics were used in time-trend analysis to depict the boom-and-bust cycle. Content analyses of retweets were also performed to better understand public perceptions and sentiments about this contentious topic. RESULTS: The attention surrounding the FDA's ban on Juul lasted no longer than a week on Twitter. Not only the news (ie, tweets with a YouTube link that directs to the news site) related to the announcement itself, but the surrounding discussions (eg, potential consequences of this ban or block and concerns toward kids or youth health) diminished shortly after June 23, 2022, the date when the ban was officially announced. Although a short rebound was observed on July 4, 2022, which was contributed by the suspension on the following day, discussions dried out in 2 days. 
Among the top 50 most retweeted tweets, apart from neutral posts (23/45, 51%) that broadcast the announcement, posters mostly responded negatively (19/45, 42%) to the FDA's ban. CONCLUSIONS: We observed a short life cycle for this news announcement, with a preponderance of negative sentiment toward the FDA's ban on Juul. Policy makers could use tactics such as issuing ongoing updates and reminders about the ban, highlighting its impact on public health, and actively engaging with influential social media users who can help maintain the conversation.
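A time-trend analysis of this kind can be sketched by counting posts per day and measuring how long volume stays near its peak. The example below is a minimal illustration using invented dates and counts; the helper `days_above` and the 10% threshold are assumptions, not the study's procedure.

```python
from collections import Counter
from datetime import date


def days_above(counts, fraction=0.1):
    """Number of days whose volume is at least `fraction` of the peak day."""
    peak = max(counts.values())
    return sum(1 for n in counts.values() if n >= fraction * peak)


# Toy posting dates (invented): a burst followed by a small rebound.
posts = (
    [date(2022, 1, 10)] * 50
    + [date(2022, 1, 11)] * 30
    + [date(2022, 1, 12)] * 8
    + [date(2022, 1, 17)] * 2
    + [date(2022, 1, 22)] * 6
)

counts = Counter(posts)                 # posts per calendar day
peak_day = max(counts, key=counts.get)  # day with the highest volume
active_days = days_above(counts)        # days at >= 10% of peak volume
```

A small `active_days` value relative to the collection window is one simple way to express a short attention life cycle.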
Subjects
Electronic Nicotine Delivery Systems , Natural Language Processing , Social Media , United States Food and Drug Administration , Social Media/statistics & numerical data , United States , Humans , Public Opinion , Government Regulation , Public Health/legislation & jurisprudence
ABSTRACT
BACKGROUND: The rapid adoption and sustained use of social media globally has provided researchers with access to unprecedented quantities of low-latency data at minimal costs. This may be of particular interest to nutrition research because food is frequently posted about and discussed on social media platforms. This scoping review investigates the ways in which social media is being used to understand population food consumption, attitudes and behaviours. METHODS: The peer-reviewed literature was searched from 2003 to 2021 using four electronic databases. RESULTS: The review identified 71 eligible studies from 25 countries. Two-thirds (n = 47) were published within the last 5 years. The USA had the highest research output (31%, n = 22) and Twitter was the most used platform (41%, n = 29). A diverse range of dataset sizes were used, with some studies relying on manual techniques to collect and analyse data, whereas others required the use of advanced software technology. Most studies were conducted by disciplines outside health, with only two studies (3%) being conducted by nutritionists. CONCLUSIONS: It appears the development of methodological and ethical frameworks as well as partnerships between experts in nutrition and information technology may be required to advance the field in nutrition research. Moving beyond traditional methods of dietary data collection may prove social media as a useful adjunct to inform recommended dietary practices and food policies.
Subjects
Social Media , Humans , Diet , Health Behavior , Social Behavior , Technology
ABSTRACT
BACKGROUND: To contain and curb the spread of COVID-19, the governments of countries around the world have used different strategies (lockdown, mandatory vaccination, immunity passports, voluntary social distancing, etc). OBJECTIVE: This study aims to examine the reactions produced by the public announcement of a binding political decision presented by the president of the French Republic, Emmanuel Macron, on July 12, 2021, which imposed vaccination on caregivers and an immunity passport on all French people to access restaurants, cinemas, bars, and so forth. METHODS: To measure these announcement reactions, 901,908 unique tweets posted on Twitter (Twitter Inc) between July 12 and August 11, 2021, were extracted. A neural network was constructed to examine the arguments of the tweets and to identify the types of arguments used by Twitter users. RESULTS: This study shows that in the debate about mandatory vaccination and immunity passports, mostly "con" arguments (399,803/847,725, 47%; χ²₆=952.8; P<.001) and "scientific" arguments (317,156/803,583, 39%; χ²₆=5006.8; P<.001) were used. CONCLUSIONS: This study shows that during July and August 2021, social events permeating the public sphere and discussions about mandatory vaccination and immunity passports collided on Twitter. Moreover, a political decision based on scientific arguments led citizens to challenge it using pseudoscientific arguments contesting the effectiveness of vaccination and the validity of these political decisions.
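The chi-square statistics reported above compare observed category counts with the counts expected under independence. The abstract does not specify how the tables were constructed, so the following is only a generic sketch of Pearson's chi-square statistic on an invented 2x2 table; the function name and the data are assumptions.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of equal-length rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Toy table (invented): argument type (rows) by time period (columns).
toy_table = [[30, 10], [20, 40]]
stat, dof = chi_square(toy_table)
```

A table with 6 degrees of freedom, as in the reported results, would arise for example from 7 categories crossed with 2 groups, or 4 crossed with 3.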
Subjects
COVID-19 , Social Media , Humans , COVID-19/prevention & control , Natural Language Processing , Communicable Disease Control , Neural Networks, Computer
ABSTRACT
Social media analysis provides an alternate approach to monitoring and understanding risk perceptions regarding COVID-19 over time. Our current understandings of risk perceptions regarding COVID-19 do not disentangle the three dimensions of risk perceptions (perceived susceptibility, perceived severity, and negative emotion) as the pandemic has evolved. Data are also limited regarding the impact of social determinants of health (SDOH) on COVID-19-related risk perceptions over time. To address these knowledge gaps, we extracted tweets regarding COVID-19-related risk perceptions and developed indicators for the three dimensions of risk perceptions based on over 502 million geotagged tweets posted by over 4.9 million Twitter users from January 2020 to December 2021 in the United States. We examined correlations between risk perception indicator scores and county-level SDOH. The three dimensions of risk perceptions demonstrate different trajectories. Perceived severity maintained a high level throughout the study period. Perceived susceptibility and negative emotion peaked on March 11, 2020 (COVID-19 declared global pandemic by WHO) and then declined and remained stable at lower levels until increasing once again with the Omicron period. Relative frequency of tweet posts on risk perceptions did not closely follow epidemic trends of COVID-19 (cases, deaths). Users from socioeconomically vulnerable counties showed lower attention to perceived severity and susceptibility of COVID-19 than those from wealthier counties. Examining trends in tweets regarding the multiple dimensions of risk perceptions throughout the COVID-19 pandemic can help policymakers frame in-time, tailored, and appropriate responses to prevent viral spread and encourage preventive behavior uptake in the United States.
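Correlating a risk perception indicator with a county-level social determinant can be illustrated with a plain Pearson coefficient. The sketch below uses invented values for five hypothetical counties; the variable names and the choice of Pearson's r are assumptions, since the abstract does not name the correlation measure used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy data (invented): median income in $1000s and a perceived-severity
# indicator score for five hypothetical counties.
median_income = [40, 50, 60, 70, 80]
severity_score = [0.20, 0.24, 0.31, 0.33, 0.41]

r = pearson(median_income, severity_score)
```

A positive `r` in this toy setup mirrors the direction described above, where users from wealthier counties show more attention to perceived severity.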
Subjects
COVID-19 , Social Media , Humans , United States/epidemiology , COVID-19/epidemiology , COVID-19/psychology , Pandemics , Surveys and Questionnaires , Socioeconomic Factors
ABSTRACT
In many countries, mental health issues are among the most serious public health concerns. National mental health statistics are frequently collected from reported patient cases or government-sponsored surveys, which have restricted coverage, frequency, and timeliness. Many domains of study, including public healthcare and biomedical informatics, have recently adopted social media data as a feasible real-time alternative to traditional methods of gathering representative information at the population level in a variety of contexts. However, because of the limits of fundamental natural language processing tools and labeled corpora in countries with limited natural language resources, such as Thailand, implementing social media systems to monitor mental health signals could be challenging. This paper presents LAPoMM, a novel framework for monitoring real-time mental health indicators from social media data without using labeled datasets in low-resource languages. Specifically, we use cross-lingual methods to train language-agnostic models and validate our framework by examining cross-correlations between the aggregate predicted mental signals and real-world administrative data from Thailand's Department of Mental Health, which includes monthly depression patients and reported cases of suicidal attempts. A combination of a language-agnostic representation and a deep learning classification model outperforms all other cross-lingual techniques for recognizing various mental signals in tweets, such as emotions, sentiments, and suicidal tendencies. The correlation analyses discover a strong positive relationship between actual depression cases and the predicted negative sentiment signals as well as suicide attempts and negative signals (e.g., fear, sadness, and disgust) and suicidal tendency. These findings establish the effectiveness of our proposed framework and its potential applications in monitoring population-level mental health using large-scale social media data. 
Furthermore, because the language-agnostic model utilized in the methodology is capable of supporting a wide range of languages, the proposed LAPoMM framework can be easily generalized for analogous applications in other countries with limited language resources.
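The cross-correlation check described above can be sketched as a Pearson correlation computed at several time lags between a predicted monthly signal and an administrative series. The data, function names, and lag range below are invented for illustration and do not reproduce the LAPoMM evaluation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def cross_correlation(signal, reference, max_lag):
    """Pearson r between signal[t] and reference[t + lag], lag = 0..max_lag."""
    results = {}
    for lag in range(max_lag + 1):
        aligned_signal = signal[: len(signal) - lag]
        aligned_reference = reference[lag:]
        results[lag] = pearson(aligned_signal, aligned_reference)
    return results


# Toy monthly series (invented): reported cases follow the predicted
# negative-sentiment signal with a one-month delay.
negative_signal = [1, 3, 2, 5, 4, 6, 5, 8, 7, 9, 8, 10]
reported_cases = [5] + [10 * s for s in negative_signal[:-1]]

results = cross_correlation(negative_signal, reported_cases, max_lag=3)
best_lag = max(results, key=results.get)
```

The lag with the highest correlation indicates how far the social media signal leads the administrative data in this toy setup.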
Subjects
Deep Learning , Social Media , Humans , Mental Health , Natural Language Processing , Social Networking
ABSTRACT
BACKGROUND: Vaccination against COVID-19 has been available in Germany since December 2020. However, about 30% of the population report not wanting to be vaccinated. In order to increase the willingness of the population to get vaccinated, data on the acceptance of vaccination and its influencing factors are necessary. Little is known about why individuals refuse the COVID-19 vaccination. The aim of this study was to investigate the reasons leading to rejecting vaccination, based on posts from three social media sites. METHODS: The German-language versions of Instagram, Twitter and YouTube were searched regarding negative attitudes towards COVID-19 vaccination. Data was extracted until a saturation effect could be observed. The data included posts created from January 20, 2020 to May 2, 2021. This time frame roughly covers the period from the first reports of the spread of SARS-CoV-2 up to the general availability of vaccines against COVID-19 in Germany. We used an interpretive thematic approach to analyze the data and to inductively generate codes, subcategories and categories. RESULTS: Based on 333 posts written by 323 contributing users, we identified six main categories of reasons for refusing a COVID-19 vaccination: Low perceived benefit of vaccination, low perceived risk of contracting COVID-19, health concerns, lack of information, systemic mistrust and spiritual or religious reasons. The analysis reveals a lack of information among users and the spread of misinformation with regard to COVID-19 and vaccination. Users feel inadequately informed about vaccination or do not understand the information available. These information gaps may be related to information not being sufficiently sensitive to the needs of the target group. In addition to limited information for the general population, misinformation on the internet can also be an important reason for refusing vaccination. 
CONCLUSIONS: The study emphasizes the relevance of providing trustworthy and quality-assured information on COVID-19 and COVID-19 vaccination to all population groups. In addition, vaccinations should be easily accessible in order to promote the population's willingness to be vaccinated.
Subjects
COVID-19 , Social Media , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19 Vaccines , Germany/epidemiology , Humans , SARS-CoV-2
ABSTRACT
BACKGROUND: Crowdsourcing services, such as Amazon Mechanical Turk (AMT), allow researchers to use the collective intelligence of a wide range of web users for labor-intensive tasks. As the manual verification of the quality of the collected results is difficult because of the large volume of data and the quick turnaround time of the process, many questions remain to be explored regarding the reliability of these resources for developing digital public health systems. OBJECTIVE: This study aims to explore and evaluate the application of crowdsourcing, generally, and AMT, specifically, for developing digital public health surveillance systems. METHODS: We collected 296,166 crowd-generated labels for 98,722 tweets, labeled by 610 AMT workers, to develop machine learning (ML) models for detecting behaviors related to physical activity, sedentary behavior, and sleep quality among Twitter users. To infer the ground truth labels and explore the quality of these labels, we studied 4 statistical consensus methods that are agnostic of task features and only focus on worker labeling behavior. Moreover, to model the meta-information associated with each labeling task and leverage the potential of context-sensitive data in the truth inference process, we developed 7 ML models, including traditional classifiers (offline and active), a deep learning-based classification model, and a hybrid convolutional neural network model. RESULTS: Although most crowdsourcing-based studies in public health have often equated majority vote with quality, the results of our study using a truth set of 9000 manually labeled tweets showed that consensus-based inference models mask underlying uncertainty in data and overlook the importance of task meta-information. 
Our evaluations across 3 physical activity, sedentary behavior, and sleep quality data sets showed that truth inference is a context-sensitive process, and none of the methods studied in this paper were consistently superior to others in predicting the truth label. We also found that the performance of the ML models trained on crowd-labeled data was sensitive to the quality of these labels, and poor-quality labels led to incorrect assessment of these models. Finally, we have provided a set of practical recommendations to improve the quality and reliability of crowdsourced data. CONCLUSIONS: Our findings indicate the importance of the quality of crowd-generated labels in developing ML models designed for decision-making purposes, such as public health surveillance decisions. A combination of inference models outlined and analyzed in this study could be used to quantitatively measure and improve the quality of crowd-generated labels for training ML models.
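Majority vote, the baseline that the results above caution against treating as a proxy for quality, can be sketched in a few lines. The example assumes three labels per tweet, which is consistent with the totals reported above (296,166 labels for 98,722 tweets); the tweet IDs, worker IDs, and label names are invented.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Aggregate (item_id, worker_id, label) triples.

    Returns {item_id: (winning_label, agreement)}, where agreement is the
    share of workers who chose the winning label. Ties go to the label
    that was seen first for the item.
    """
    votes_by_item = defaultdict(list)
    for item_id, _worker_id, label in labels:
        votes_by_item[item_id].append(label)
    consensus = {}
    for item_id, votes in votes_by_item.items():
        label, count = Counter(votes).most_common(1)[0]
        consensus[item_id] = (label, count / len(votes))
    return consensus


# Toy crowd labels (invented).
crowd_labels = [
    ("tweet1", "w1", "physical_activity"),
    ("tweet1", "w2", "physical_activity"),
    ("tweet1", "w3", "none"),
    ("tweet2", "w1", "sleep"),
    ("tweet2", "w4", "sleep"),
    ("tweet2", "w5", "sleep"),
]

consensus = majority_vote(crowd_labels)
```

Keeping the agreement score alongside the winning label exposes the uncertainty that a bare majority label would hide.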
Subjects
Crowdsourcing , Humans , Machine Learning , Public Health Surveillance , Reproducibility of Results , Sleep Quality
ABSTRACT
Hashtags have been an integral element of social media platforms over the years and are widely used to promote and organize content and to connect users. Despite the intensive use of hashtags, nothing enforces the use of congruous tags, so hashtag searches surface much unrelated content. The presence of mismatched content under a hashtag creates many problems for individuals and brands. Although several methods have been presented to solve the problem by recommending hashtags based on the users' interest, the detection and analysis of the characteristics of these repetitive contents with irrelevant hashtags have rarely been addressed. To this end, we propose a novel hybrid deep learning method for hashtag incongruity detection that fuses the visual and textual modalities. We fine-tune BERT and ResNet50 pre-trained models to encode textual and visual information simultaneously. We further attempt to show the capability of logo detection and face recognition in discriminating images. To extract faces, we introduce a pipeline that ranks faces based on the number of times they appear on Instagram accounts using face clustering. Moreover, we conduct our analysis and experiments on a dataset of Instagram posts that we collect from hashtags related to brands and celebrities. Unlike the existing works, we analyze these contents from both content and user perspectives and show a significant difference between data. In light of our results, we show that our multimodal model outperforms other models and demonstrate the effectiveness of object detection in detecting mismatched information.
Subjects
Deep Learning , Social Media , Humans
ABSTRACT
The onset of the COVID-19 pandemic aroused extensive discussion on social media platforms such as Twitter, followed by many social media analyses. Despite such an abundance of studies, however, little work has been done on reactions from the public and officials on social networks and their associations, especially during the early outbreak stage. In this paper, a total of 9,259,861 COVID-19-related English tweets published from 31 December 2019 to 11 March 2020 are collected to explore the participatory dynamics of public attention and news coverage during the early stage of the pandemic. An easy numeric data augmentation (ENDA) technique is proposed for generating new samples while preserving label validity. It attains better performance on text classification tasks with deep models (BERT) than an easier data augmentation method. To demonstrate the efficacy of ENDA further, experiments and ablation studies have also been implemented on other benchmark datasets. The classification results of COVID-19 tweets show tweet peaks triggered by momentous events and a strong positive correlation between the daily number of personal narratives and news reports. We argue that there were three periods, divided by turning points on January 20 and February 23, and that the low level of news coverage suggests missed windows for government response in early January and February. Our study not only contributes to a deeper understanding of the dynamic patterns and relationships of public attention and news coverage on social media during the pandemic but also sheds light on early emergency management and government response on social media during global health crises.
ABSTRACT
The outbreak of COVID-19 has led to a global health crisis and caused huge emotional swings. However, the positive emotional expressions, like self-confidence, optimism, and praise, that appear in Chinese social networks are rarely explored by researchers. This study aims to analyze the characteristics of netizens' positive energy expressions and the impact of node events on public emotional expression during the COVID-19 pandemic. First, a total of 6,525,249 Chinese texts posted by Sina Weibo users were randomly selected through textual data cleaning and word segmentation for corpus construction. A fine-grained sentiment lexicon that contained POSITIVE ENERGY was built using Word2Vec technology; this lexicon was later used to conduct sentiment category analysis on original posts. Next, through manual labeling and multi-classification machine learning model construction, four mainstream machine learning algorithms were selected to train the emotional intensity model. Finally, the lexicon and optimized emotional intensity model were used to analyze the emotional expressions of Chinese netizens. The results show that POSITIVE ENERGY expression accounted for 40.97% during the COVID-19 pandemic. Over the course of time, POSITIVE ENERGY emotions were displayed at the highest levels and SURPRISES the lowest. The analysis results of the node events showed after the outbreak was confirmed officially, the expressions of POSITIVE ENERGY and FEAR increased simultaneously. After the initial victory in pandemic prevention and control, the expression of POSITIVE ENERGY and SAD reached a peak, while the increase of SAD was the most prominent. The fine-grained sentiment lexicon, which includes a POSITIVE ENERGY category, demonstrated reliable algorithm performance and can be used for sentiment classification of Chinese Internet context. 
We also found many POSITIVE ENERGY expressions on Chinese online social platforms, which proved to be significantly affected by node events of different natures.
ABSTRACT
It has not been long since a new disease, COVID-19, hit the international community. The unknown nature of the virus, evidence of its adaptability and survival in new conditions, its widespread prevalence, and its lengthy recovery period, along with daily notifications of new infection and fatality statistics, have created a wave of fear and anxiety among the public and authorities. These factors have led to extreme changes in social discourse in a rather short period of time. Analyzing this discourse is important for reconciling society and restoring ordinary conditions of mental peace and health. Although much research has been done on the disease since it became a pandemic, sociological analysis of this recent public phenomenon, especially in developing countries, still needs attention. We propose a framework for analyzing social media data and news stories about COVID-19. Our research is based on an extensive Persian data set gathered from different social media networks and news agencies in the period of January 21-April 29, 2020. We use the Latent Dirichlet Allocation (LDA) model and dynamic topic modeling to understand and capture changes in discourse in terms of temporal subjects. We scrutinize the reasons for subject changes by exploring the related events and the adopted practices and policies. Social discourse can strongly affect community morale and polarization. Therefore, we further analyze polarization in online social media posts and detect points of concept drift in the stream. Based on the analyzed content, effective guidelines are extracted to shift polarization toward the positive. The results show that the proposed framework provides an effective, practical approach for cause-and-effect analysis of social discourse.
Subjects
COVID-19 , Social Media , Humans , Iran/epidemiology , Pandemics , SARS-CoV-2
ABSTRACT
BACKGROUND: COVID-19 has continued to spread in the United States and globally. Closely monitoring public engagement and perceptions of COVID-19 and preventive measures using social media data could provide important information for understanding the progress of current interventions and planning future programs. OBJECTIVE: The aim of this study is to measure the public's behaviors and perceptions regarding COVID-19 and its effects on daily life during 5 months of the pandemic. METHODS: Natural language processing (NLP) algorithms were used to identify COVID-19-related and unrelated topics in over 300 million online data sources from June 15 to November 15, 2020. Posts in the sample were geotagged by NetBase, a third-party data provider, and sensitivity and positive predictive value were both calculated to validate the classification of posts. Each post may have included discussion of multiple topics. The prevalence of discussion regarding these topics was measured over this time period and compared to daily case rates in the United States. RESULTS: The final sample size included 9,065,733 posts, 70% of which were sourced from the United States. In October and November, discussion including mentions of COVID-19 and related health behaviors did not increase as it had from June to September, despite an increase in COVID-19 daily cases in the United States beginning in October. Additionally, discussion was more focused on daily life topics (n=6,210,255, 69%), compared with COVID-19 in general (n=3,390,139, 37%) and COVID-19 public health measures (n=1,836,200, 20%). CONCLUSIONS: There was a decline in COVID-19-related social media discussion sourced mainly from the United States, even as COVID-19 cases in the United States increased to the highest rate since the beginning of the pandemic. Targeted public health messaging may be needed to ensure engagement in public health prevention measures as global vaccination efforts continue.
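Sensitivity and positive predictive value, the two validation measures named above, can be computed from a hand-labeled sample as follows. This is a minimal sketch with invented labels; the function name and data are assumptions and do not reflect the study's validation set.

```python
def sensitivity_ppv(predicted, actual):
    """Sensitivity and positive predictive value for boolean labels.

    predicted[i] is the classifier's decision for post i and actual[i]
    is the human (reference) label for the same post.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    sensitivity = true_pos / (true_pos + false_neg)  # share of real positives found
    ppv = true_pos / (true_pos + false_pos)          # share of flagged posts that are right
    return sensitivity, ppv


# Toy labels (invented): is each post about COVID-19?
predicted = [True, True, True, False, False, True, False, False]
actual = [True, True, False, True, False, True, False, False]

sensitivity, ppv = sensitivity_ppv(predicted, actual)
```

Because each post may cover several topics, a check like this would be run once per topic rather than once for the whole sample.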
Subjects
COVID-19/epidemiology , Public Health/statistics & numerical data , Social Media/statistics & numerical data , Cross-Sectional Studies , Humans , Natural Language Processing , Pandemics , SARS-CoV-2 , United States/epidemiology , Vaccination
ABSTRACT
With more than half the world population online, social media plays a very important role in the lives of individuals and businesses alike. Social media enables businesses to advertise their products, build brand value, and reach out to their customers. To leverage these social media platforms, it is important for businesses to process customer feedback in the form of posts and tweets. Sentiment analysis is the process of identifying the emotion, either positive, negative or neutral, associated with these social media texts. The presence of sarcasm in texts is the main hindrance in the performance of sentiment analysis. Sarcasm is a linguistic expression often used to communicate the opposite of what is said, usually something that is very unpleasant, with an intention to insult or ridicule. Inherent ambiguity in sarcastic expressions makes sarcasm detection very difficult. In this work, we focus on detecting sarcasm in textual conversations from various social networking platforms and online media. To this end, we develop an interpretable deep learning model using multi-head self-attention and gated recurrent units. The multi-head self-attention module aids in identifying crucial sarcastic cue-words from the input, and the recurrent units learn long-range dependencies between these cue-words to better classify the input text. We show the effectiveness of our approach by achieving state-of-the-art results on multiple datasets from social networking platforms and online media. Models trained using our proposed approach are easily interpretable and enable identifying sarcastic cues in the input text which contribute to the final classification score. We visualize the learned attention weights on a few sample input texts to showcase the effectiveness and interpretability of our model.
ABSTRACT
Social data has shown an important role in tracking, monitoring, and risk management of disasters. Indeed, several works have focused on the benefits of social data analysis for healthcare practice and treatment. Similarly, these data are now exploited for tracking the COVID-19 pandemic, but the majority of works use Twitter as the source. In this paper, we choose to exploit Facebook, which is rarely used, for tracking the evolution of COVID-19-related trends. A multilingual dataset covering 7 languages (English (EN), Arabic (AR), Spanish (ES), Italian (IT), German (DE), French (FR) and Japanese (JP)) is extracted from Facebook public posts. The proposal is an analytics process including a data gathering step, pre-processing, LDA-based topic modeling, and a presentation module using a graph structure. The analysis covers January 1, 2020 to May 15, 2020, divided cumulatively into three periods: January-February, March-April, and through May 15. The results showed that the extracted topics correspond to the chronological development of what circulated about the pandemic and the measures taken, across the various languages under discussion, which represent several countries.
ABSTRACT
Road traffic pollution is one of the key factors affecting urban air quality. There is a consensus in the community that the efficient use of public transport is the most effective solution. In that sense, much effort has been made in the data mining discipline to come up with solutions able to anticipate taxi demands in a city. This helps to optimize the trips made by such an important urban means of transport. However, most of the existing solutions in the literature define the taxi demand prediction as a regression problem based on historical taxi records. This causes serious limitations with respect to the required data to operate and the interpretability of the prediction outcome. In this paper, we introduce QUADRIVEN (QUalitative tAxi Demand pRediction based on tIme-Variant onlinE social Network data analysis), a novel approach to deal with the taxi demand prediction problem based on human-generated data widely available on online social networks. The result of the prediction is defined on the basis of categorical labels that allow obtaining a semantically-enriched output. Finally, this proposal was tested with different models in a large urban area, showing quite promising results with an F1 score above 0.8.
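When predictions are categorical labels rather than numeric demand values, an F1 score can be computed per label and then averaged. The sketch below uses hypothetical demand categories and macro averaging; both are assumptions, since the abstract does not state the label set or the averaging scheme behind the reported score.

```python
def macro_f1(actual, predicted):
    """Unweighted mean of per-class F1 scores over the labels in `actual`."""
    scores = []
    for label in sorted(set(actual)):
        pairs = list(zip(actual, predicted))
        true_pos = sum(1 for a, p in pairs if a == label and p == label)
        false_pos = sum(1 for a, p in pairs if a != label and p == label)
        false_neg = sum(1 for a, p in pairs if a == label and p != label)
        denominator = 2 * true_pos + false_pos + false_neg
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * true_pos / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Toy demand categories per time slot (invented).
actual_demand = ["low", "low", "medium", "medium", "high", "high"]
predicted_demand = ["low", "low", "medium", "high", "high", "high"]

score = macro_f1(actual_demand, predicted_demand)
```

Macro averaging weights each demand category equally, which matters when some categories are much rarer than others.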
ABSTRACT
BACKGROUND: Public and internet-based social media such as online healthcare-oriented chat groups provide a convenient channel for patients and people concerned about health to communicate and share information with each other. The chat logs of an online healthcare-oriented chat group can potentially be used to extract latent topics, to encourage participation, and to recommend relevant healthcare information to users. OBJECTIVE: This paper addresses the use of online healthcare chat logs to automatically discover both underlying topics and user interests. METHOD: We present a new probabilistic model that exploits healthcare chat logs to find hidden topics and changes in these topics over time. The proposed model uses separate but associated hidden variables to explore both topics and individual interests such that it can provide useful insights to the participants of online healthcare chat groups about their interests in terms of weighted topics or vice versa. RESULTS: We evaluate the proposed model on a real-world chat log by comparing its performance to benchmark topic models, i.e., latent Dirichlet allocation (LDA) and Author Topic Model (ATM), on the topic extraction task. The chat log is obtained from an online chat group of pregnant women, which consists of 233,452 chat word tokens contributed by 118 users. Both detected individual interests and underlying topics with their progressive information over time are demonstrated. The results show that the performance of the proposed model exceeds that of the benchmark models. CONCLUSION: The experimental results illustrate that the proposed model is a promising method for extracting healthcare knowledge from social media data.
Subjects
Data Mining , Delivery of Health Care , Social Media , Female , Humans , Statistical Models , Professional-Patient Relations
ABSTRACT
BACKGROUND: The early detection of mental health crises is crucial for timely interventions and improved outcomes. This study explores the potential of artificial intelligence (AI) in analyzing social media data to identify early signs of mental health crises. METHODS: We developed a multimodal deep learning model integrating natural language processing and temporal analysis techniques. The model was trained on a diverse dataset of 996,452 social media posts in multiple languages (English, Spanish, Mandarin, and Arabic) collected from Twitter, Reddit, and Facebook over 12 months. Its performance was evaluated using standard metrics and validated against expert psychiatric assessments. RESULTS: The AI model demonstrated a high level of accuracy (89.3%) in detecting early signs of mental health crises, with an average lead time of 7.2 days before human expert identification. Performance was consistent across languages (F1 scores: 0.827-0.872) and platforms (F1 scores: 0.839-0.863). Key digital markers included linguistic patterns, behavioral changes, and temporal trends. The model showed varying levels of accuracy for different crisis types: depressive episodes (91.2%), manic episodes (88.7%), suicidal ideation (93.5%), and anxiety crises (87.3%). CONCLUSIONS: AI-powered analysis of social media data shows promise for the early detection of mental health crises across diverse linguistic and cultural contexts. However, ethical challenges, including privacy concerns, potential stigmatization, and cultural biases, need careful consideration. Future research should focus on longitudinal outcome studies, ethical integration of the method with existing mental health services, and developing personalized, culturally sensitive models.
ABSTRACT
BACKGROUND: Cesarean section (CS) rates in Indonesia are rapidly increasing for both sociocultural and medical reasons. However, there is limited understanding of the role that social media plays in influencing preferences regarding mode of birth (vaginal or CS). Social media provides a platform for users to seek and exchange information, including information on the mode of birth, which may help unpack social influences on health behavior. OBJECTIVE: This study aims to explore how CS is portrayed on Instagram in Indonesia. METHODS: We downloaded public Instagram posts from Indonesia containing CS hashtags and extracted their attributes (image, caption, hashtags, and objects and texts within images). Posts were divided into 2 periods (before COVID-19 and during COVID-19) to examine changes in CS portrayal during the pandemic. We used a mixed methods approach to analysis using text mining, descriptive statistics, and qualitative content analysis. RESULTS: A total of 9978 posts were analyzed quantitatively, and 720 (7.22%) posts were sampled and analyzed qualitatively. The use of text (527/5913, 8.91% vs 242/4065, 5.95%; P<.001) and advertisement materials (411/5913, 6.95% vs 83/4065, 2.04%; P<.001) increased during the COVID-19 pandemic compared to before the pandemic, indicating growth of information sharing on CS over time. Posts with CS hashtags primarily promoted herbal medicine for faster recovery and services for choosing auspicious childbirth dates, encouraging elective CS. Some private health facilities offered discounts on CS for special events such as Mother's Day and promoted techniques such as enhanced recovery after CS for comfortable, painless birth, and faster recovery after CS.
Hashtags related to comfortable or painless birth (2358/5913, 39.88% vs 278/4065, 6.84%; P<.001), enhanced recovery after CS (124/5913, 2.1% vs 0%; P<.001), feng shui services (110/5913, 1.86% vs 56/4065, 1.38%; P=.03), names of health care providers (2974/5913, 50.3% vs 304/4065, 7.48%; P<.001), and names of hospitals (1460/5913, 24.69% vs 917/4065, 22.56%; P=.007) were more prominent during the pandemic than before it. CONCLUSIONS: This study highlights the necessity of enforcing advertisement regulations regarding birth-related medical services in the commercial and private sectors. Enhanced health promotion efforts are crucial to ensure that women receive accurate, balanced, and appropriate information about birth options. Continuous and proactive health information dissemination from government organizations is essential to counteract biases favoring CS over vaginal birth.
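The during-versus-before comparisons reported above are differences between two proportions. The sketch below shows how one such comparison (use of text, 527/5913 vs 242/4065) could be checked with a Pearson chi-square test on a 2x2 table. The choice of test and the absence of a continuity correction are assumptions; the abstract does not name the statistical test the authors used.

```python
import math


def two_proportion_chi2(x1, n1, x2, n2):
    """Pearson chi-square (1 df, no continuity correction) for two proportions.

    For a 2x2 table this statistic equals the squared pooled z statistic.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    chi2 = z * z
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the abstract: posts using text, during vs before COVID-19.
share_during = 527 / 5913
share_before = 242 / 4065
chi2, p_value = two_proportion_chi2(527, 5913, 242, 4065)
```

Under these assumptions the shares reproduce the reported 8.91% and 5.95%, and the resulting P value falls below the reported .001 threshold.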
Subjects
COVID-19 , Cesarean Section , Social Media , Humans , Cesarean Section/statistics & numerical data , COVID-19/epidemiology , COVID-19/prevention & control , Female , Indonesia/epidemiology , Pregnancy , Pandemics , SARS-CoV-2
ABSTRACT
The Netherlands police are looking for ways to examine sentiment on social media related to protest demonstrations. While models exist to detect more subtle expressions of sentiment within tweets, models trained on the Dutch language are scarce. Being able to predict how sentiment develops during protests is relevant for parties such as the Dutch government and the police, as it gives more insight into when and where law enforcement may be needed for public order and safety. Therefore, to analyse sentiment before, during, and after protest demonstrations, tweets related to a Black Lives Matter protest that took place in Amsterdam during the COVID-19 pandemic were collected. All tweets were manually labelled by a dedicated open-source intelligence (OSINT) team within the Netherlands police following an established protocol. Both the data and the protocol are available and are of interest to researchers in natural language processing, topic detection, sentiment analysis, and protest analysis. The labelling tool developed for this process is publicly available.