Results 1 - 15 of 15
1.
Proc Natl Acad Sci U S A ; 112(8): 2389-94, 2015 Feb 24.
Article in English | MEDLINE | ID: mdl-25675475

ABSTRACT

Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (i) the words of natural human language possess a universal positivity bias, (ii) the estimated emotional content of words is consistent between languages under translation, and (iii) this positivity bias is strongly independent of frequency of word use. Alongside these general regularities, we describe interlanguage variations in the emotional spectrum of languages that allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.


Subjects
Bias, Emotions, Language, Humans, Time Factors
3.
PLoS One ; 16(12): e0260592, 2021.
Article in English | MEDLINE | ID: mdl-34879105

ABSTRACT

Measuring the specific kind, temporal ordering, diversity, and turnover rate of stories surrounding any given subject is essential to developing a complete reckoning of that subject's historical impact. Here, we use Twitter as a distributed news and opinion aggregation source to identify and track the dynamics of the dominant day-scale stories around Donald Trump, the 45th President of the United States. Working with a data set comprising around 20 billion 1-grams, we first compare each day's 1-gram and 2-gram usage frequencies to those of a year before, to create day- and week-scale timelines for Trump stories for 2016-2021. We measure Trump's narrative control, the extent to which stories have been about Trump or put forward by Trump. We then quantify story turbulence and collective chronopathy: the rate at which a population's stories for a subject seem to change over time. We show that 2017 was the most turbulent overall year for Trump. In 2020, story generation slowed dramatically during the first two major waves of the COVID-19 pandemic, with rapid turnover returning first with the Black Lives Matter protests following George Floyd's murder and then later by events leading up to and following the 2020 US presidential election, including the storming of the US Capitol six days into 2021. Trump story turnover for 2 months during the COVID-19 pandemic was on par with that of 3 days in September 2017. Our methods may be applied to any well-discussed phenomenon, and have potential to enable the computational aspects of journalism, history, and biography.


Subjects
Politics, COVID-19/epidemiology, COVID-19/pathology, COVID-19/virology, Humans, SARS-CoV-2/isolation & purification, United States
4.
PLoS One ; 16(1): e0244476, 2021.
Article in English | MEDLINE | ID: mdl-33406101

ABSTRACT

In confronting the global spread of the coronavirus disease (COVID-19) pandemic, we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testing for both the presence of the disease and confirmed recovery through serological tests for antibodies, and we need to track major socioeconomic indices. But we also need auxiliary data of all kinds, including data related to how populations are talking about the unfolding pandemic through news and stories. To help in part on the social media side, we curate a set of 2000 day-scale time series of 1- and 2-grams across 24 languages on Twitter that are most 'important' for April 2020 with respect to April 2019. We determine importance through our allotaxonometric instrument, rank-turbulence divergence. We make some basic observations about some of the time series, including a comparison to numbers of confirmed deaths due to COVID-19 over time. We broadly observe across all languages a peak for the language-specific word for 'virus' in January 2020 followed by a decline through February and then a surge through March and April. The world's collective attention dropped away while the virus spread out from China. We host the time series on Gitlab, updating them on a daily basis while relevant. Our main intent is for other researchers to use these time series to enhance whatever analyses may be of use during the pandemic as well as for retrospective investigations.


Subjects
COVID-19/psychology, Pandemics/statistics & numerical data, Social Media/trends, Attention, COVID-19/etiology, Coronavirus Infections/etiology, Coronavirus Infections/psychology, Humans, Language, Retrospective Studies, SARS-CoV-2/pathogenicity
5.
Sci Adv ; 7(29), 2021 Jul.
Article in English | MEDLINE | ID: mdl-34272243

ABSTRACT

In real time, Twitter strongly imprints world events, popular culture, and the day-to-day, recording an ever-growing compendium of language change. Vitally, and absent from many standard corpora such as books and news archives, Twitter also encodes popularity and spreading through retweets. Here, we describe Storywrangler, an ongoing curation of over 100 billion tweets containing 1 trillion 1-grams from 2008 to 2021. For each day, we break tweets into 1-, 2-, and 3-grams across 100+ languages, generating frequencies for words, hashtags, handles, numerals, symbols, and emojis. We make the dataset available through an interactive time series viewer and as downloadable time series and daily distributions. Although Storywrangler leverages Twitter data, our method of tracking dynamic changes in n-grams can be extended to any temporally evolving corpus. Illustrating the instrument's potential, we present example use cases including social amplification, the sociotechnical dynamics of famous individuals, box office success, and social unrest.
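
The daily n-gram counting described in the abstract above can be illustrated with a minimal sketch. It assumes a naive lowercase, whitespace-based tokenizer, which is almost certainly simpler than the parsing Storywrangler applies to hashtags, handles, and emojis; the function names `ngrams` and `daily_ngram_frequencies` are hypothetical and not taken from the project.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams (as tuples) in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def daily_ngram_frequencies(tweets, n):
    """Return relative frequencies of n-grams over one day's tweets.

    Assumption: tokens are lowercase whitespace-separated strings.
    N-grams never span two tweets because each tweet is split separately.
    """
    counts = Counter()
    for text in tweets:
        tokens = text.lower().split()
        counts.update(ngrams(tokens, n))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}
```

Running the function once per day and per value of n (1, 2, 3) would give one frequency distribution per day, which is the kind of object a time series viewer could then plot for a chosen n-gram.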

6.
PLoS One ; 14(2): e0210484, 2019.
Article in English | MEDLINE | ID: mdl-30759111

ABSTRACT

Natural hazards are becoming increasingly expensive as climate change and development are exposing communities to greater risks. Preparation and recovery are critical for climate change resilience, and social media are being used more and more to communicate before, during, and after disasters. While there is a growing body of research aimed at understanding how people use social media surrounding disaster events, most existing work has focused on a single disaster case study. In the present study, we analyze five of the costliest disasters in the last decade in the United States (Hurricanes Irene and Sandy, two sets of tornado outbreaks, and flooding in Louisiana) through the lens of Twitter. In particular, we explore the frequency of both generic and specific food-security related terms, and quantify the relationship between network size and Twitter activity during disasters. We find differences in tweet volume for keywords depending on disaster type, with people using Twitter more frequently in preparation for Hurricanes, and for real-time or recovery information for tornado and flooding events. Further, we find that people share a host of general disaster and specific preparation and recovery terms during these events. Finally, we find that among all account types, individuals with "average" sized networks are most likely to share information during these disasters, and in most cases, do so more frequently than normal. This suggests that around disasters, an ideal form of social contagion is being engaged in which average people rather than outsized influentials are key to communication. These results provide important context for the type of disaster information and target audiences that may be most useful for disaster communication during varying extreme events.


Subjects
Disasters, Online Social Networking, Social Media, Climate Change, Cyclonic Storms/economics, Disasters/economics, Floods/economics, Humans, Louisiana, Tornadoes/economics, United States
7.
PLoS One ; 13(12): e0209651, 2018.
Article in English | MEDLINE | ID: mdl-30592735

ABSTRACT

The English language has evolved dramatically throughout its lifespan, to the extent that a modern speaker of Old English would be incomprehensible without translation. One concrete indicator of this process is the movement from irregular to regular (-ed) forms for the past tense of verbs. In this study we quantify the extent of verb regularization using two vastly disparate datasets: (1) Six years of published books scanned by Google (2003-2008), and (2) A decade of social media messages posted to Twitter (2008-2017). We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in English Fiction books. Regularization is also greater for tweets geotagged in the United States relative to American English books, but the opposite is true for tweets geotagged in the United Kingdom relative to British English books. We also find interesting regional variations in regularization across counties in the United States. However, once differences in population are accounted for, we do not identify strong correlations with socio-demographic variables such as education or income.


Subjects
Books, Language, Social Media, Algorithms, Humans, Theoretical Models, United Kingdom, United States
8.
PLoS One ; 13(4): e0195644, 2018.
Article in English | MEDLINE | ID: mdl-29668754

ABSTRACT

Since the shooting of Black teenager Michael Brown by White police officer Darren Wilson in Ferguson, Missouri, the protest hashtag #BlackLivesMatter has amplified critiques of extrajudicial killings of Black Americans. In response to #BlackLivesMatter, other Twitter users have adopted #AllLivesMatter, a counter-protest hashtag whose content argues that equal attention should be given to all lives regardless of race. Through a multi-level analysis of over 860,000 tweets, we study how these protests and counter-protests diverge by quantifying aspects of their discourse. We find that #AllLivesMatter facilitates opposition between #BlackLivesMatter and hashtags such as #PoliceLivesMatter and #BlueLivesMatter in such a way that historically echoes the tension between Black protesters and law enforcement. In addition, we show that a significant portion of #AllLivesMatter use stems from hijacking by #BlackLivesMatter advocates. Beyond simply injecting #AllLivesMatter with #BlackLivesMatter content, these hijackers use the hashtag to directly confront the counter-protest notion of "All lives matter." Our findings suggest that the Black Lives Matter movement was able to grow, exhibit diverse conversations, and avoid derailment on social media by making discussion of counter-protest opinions a central topic of #AllLivesMatter, rather than the movement itself.


Subjects
Black or African American, Law Enforcement, Social Media, Dissent and Disputes, Humans, Police, Social Media/statistics & numerical data
9.
PLoS One ; 12(2): e0168893, 2017.
Article in English | MEDLINE | ID: mdl-28187216

ABSTRACT

We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the "caloric content" of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food and activity related phrases, and respectively assigning them with sourced estimates of caloric intake and expenditure. We show that for Twitter, our naive measures of "caloric input", "caloric output", and the ratio of these measures are all strong correlates with health and well-being measures for the contiguous United States. Our caloric balance measure in many cases outperforms both its constituent quantities; is tunable to specific health and well-being measures such as diabetes rates; has the capability of providing a real-time signal reflecting a population's health; and has the potential to be used alongside traditional survey data in the development of public policy and collective self-awareness. Because our Lexicocalorimeter is a linear superposition of principled phrase scores, we also show we can move beyond correlations to explore what people talk about in collective detail, and assist in the understanding and explanation of how population-scale conditions vary, a capacity unavailable to black-box type methods.


Subjects
Energy Intake, Public Health/methods, Social Media/statistics & numerical data, Epidemiological Monitoring, Food/statistics & numerical data, Humans, Public Health/statistics & numerical data, United States
10.
Sci Rep ; 7(1): 13006, 2017 Oct 11.
Article in English | MEDLINE | ID: mdl-29021528

ABSTRACT

We developed computational models to predict the emergence of depression and Post-Traumatic Stress Disorder in Twitter users. Twitter data and details of depression history were collected from 204 individuals (105 depressed, 99 healthy). We extracted predictive features measuring affect, linguistic style, and context from participant tweets (N = 279,951) and built models using these features with supervised learning algorithms. Resulting models successfully discriminated between depressed and healthy content, and compared favorably to general practitioners' average success rates in diagnosing depression, albeit in a separate population. Results held even when the analysis was restricted to content posted before first depression diagnosis. State-space temporal analysis suggests that onset of depression may be detectable from Twitter data several months prior to diagnosis. Predictive results were replicated with a separate sample of individuals diagnosed with PTSD (N_users = 174, N_tweets = 243,775). A state-space time series model revealed indicators of PTSD almost immediately post-trauma, often many months prior to clinical diagnosis. These methods suggest a data-driven, predictive approach for early screening and detection of mental illness.


Subjects
Depression/diagnosis, Disease Progression, Social Media, Humans, Post-Traumatic Stress Disorders/diagnosis
11.
Phys Rev E ; 95(5-1): 052301, 2017 May.
Article in English | MEDLINE | ID: mdl-28618612

ABSTRACT

Herbert Simon's classic rich-get-richer model is one of the simplest empirically supported mechanisms capable of generating heavy-tail size distributions for complex systems. Simon argued analytically that a population of flavored elements growing by either adding a novel element or randomly replicating an existing one would afford a distribution of group sizes with a power-law tail. Here, we show that, in fact, Simon's model does not produce a simple power-law size distribution as the initial element has a dominant first-mover advantage, and will be overrepresented by a factor proportional to the inverse of the innovation probability. The first group's size discrepancy cannot be explained away as a transient of the model, and may therefore be many orders of magnitude greater than expected. We demonstrate how Simon's analysis was correct but incomplete, and expand our alternate analysis to quantify the variability of long term rankings for all groups. We find that the expected time for a first replication is infinite, and show how an incipient group must break the mechanism to improve their odds of success. We present an example of citation counts for a specific field that demonstrates a first-mover advantage consistent with our revised view of the rich-get-richer mechanism. Our findings call for a reexamination of preceding work invoking Simon's model and provide an expanded understanding going forward.
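
The growth mechanism summarized in the abstract above (add a novel element with some innovation probability, otherwise replicate a randomly chosen existing element) can be simulated in a few lines. This is a minimal sketch under stated assumptions: the system starts from a single element in a single group, and the names `simulate_simon`, `steps`, and `rho` are hypothetical choices made here, not the paper's code.

```python
import random


def simulate_simon(steps, rho, seed=0):
    """Simulate a rich-get-richer process and return group sizes.

    Groups are listed in order of creation, so index 0 is the initial group.
    Assumption: the system starts with one element belonging to group 0.
    """
    rng = random.Random(seed)
    elements = [0]  # group label of every element created so far
    sizes = [1]     # sizes[g] is the current size of group g
    for _ in range(steps):
        if rng.random() < rho:
            # Innovation: the new element founds a new group of size 1.
            sizes.append(1)
            elements.append(len(sizes) - 1)
        else:
            # Replication: copy the group of a uniformly chosen element,
            # so a group is picked with probability proportional to its size.
            group = rng.choice(elements)
            sizes[group] += 1
            elements.append(group)
    return sizes
```

Comparing `sizes[0]` with the sizes of later groups over many seeds is one simple way to look for the first-mover advantage the abstract describes; the sketch itself makes no claim about the magnitude of that effect.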

12.
PLoS One ; 11(5): e0154184, 2016.
Article in English | MEDLINE | ID: mdl-27167740

ABSTRACT

Instabilities and long term shifts in seasons, whether induced by natural drivers or human activities, pose great disruptive threats to ecological, agricultural, and social systems. Here, we propose, measure, and explore two fundamental markers of location-sensitive seasonal variations: the Summer and Winter Teletherms, the on-average annual dates of the hottest and coldest days of the year. We analyse daily temperature extremes recorded at 1218 stations across the contiguous United States from 1853 to 2012, and observe large regional variation with the Summer Teletherm falling up to 90 days after the Summer Solstice, and 50 days for the Winter Teletherm after the Winter Solstice. We show that Teletherm temporal dynamics are substantive with clear and in some cases dramatic shifts reflective of system bifurcations. We also compare recorded daily temperature extremes with output from two regional climate models finding considerable though relatively unbiased error. Our work demonstrates that Teletherms are an intuitive, powerful, and statistically sound measure of local climate change, and that they pose detailed, stringent challenges for future theoretical and computational models.


Subjects
Climate Change/statistics & numerical data, Cold Temperature, Hot Temperature, Statistical Models, Climate, Climate Change/history, 19th Century History, 20th Century History, 21st Century History, Humans, Rain, Spatio-Temporal Analysis, United States
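
One plausible way to estimate an on-average hottest or coldest date for a single station, in the spirit of the Teletherm record above, is sketched below. The abstract does not state the smoothing procedure, so the circular moving average, the 15-day default window, and the name `teletherm_day` are all assumptions made for illustration.

```python
def teletherm_day(yearly_series, window=15, hottest=True):
    """Return the 0-based day of year of the smoothed average extreme.

    yearly_series: list of years, each a list of daily temperatures of
    equal length (for example 365 values; leap days assumed removed).
    Assumption: a circular moving average stands in for whatever
    smoothing a real analysis would use.
    """
    n_days = len(yearly_series[0])
    n_years = len(yearly_series)
    # Average each calendar day across all years.
    mean = [sum(year[d] for year in yearly_series) / n_years
            for d in range(n_days)]
    # Smooth with a centered window that wraps around the year boundary.
    half = window // 2
    smoothed = [
        sum(mean[(d + k) % n_days] for k in range(-half, half + 1))
        / (2 * half + 1)
        for d in range(n_days)
    ]
    pick = max if hottest else min
    return pick(range(n_days), key=lambda d: smoothed[d])
```

Calling the function on a sliding window of years (say, successive 25-year blocks) would give a time series of Teletherm dates, which is one way to look for the shifts over time that the abstract mentions.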
13.
PLoS One ; 11(2): e0148134, 2016.
Article in English | MEDLINE | ID: mdl-26849061

ABSTRACT

A thermal convection loop is an annular chamber filled with water, heated on the bottom half and cooled on the top half. With sufficiently large forcing of heat, the direction of fluid flow in the loop oscillates chaotically, dynamics analogous to the Earth's weather. As is the case for state-of-the-art weather models, we only observe the statistics over a small region of state space, making prediction difficult. To overcome this challenge, data assimilation (DA) methods, and specifically ensemble methods, use the computational model itself to estimate the uncertainty of the model to optimally combine these observations into an initial condition for predicting the future state. Here, we build and verify four distinct DA methods, and then, we perform a twin model experiment with the computational fluid dynamics simulation of the loop using the Ensemble Transform Kalman Filter (ETKF) to assimilate observations and predict flow reversals. We show that using adaptively shaped localized covariance outperforms static localized covariance with the ETKF, and allows for the use of fewer observations in predicting flow reversals. We also show that a Dynamic Mode Decomposition (DMD) of the temperature and velocity fields recovers the low dimensional system underlying reversals, finding specific modes which together are predictive of reversal direction.


Subjects
Computer Simulation, Convection, Hydrodynamics, Statistics as Topic, Weather, Analysis of Variance, Nonlinear Dynamics
14.
Phys Rev E ; 93(5): 052314, 2016 May.
Article in English | MEDLINE | ID: mdl-27300917

ABSTRACT

Sports are spontaneous generators of stories. Through skill and chance, the script of each game is dynamically written in real time by players acting out possible trajectories allowed by a sport's rules. By properly characterizing a given sport's ecology of "game stories," we are able to capture the sport's capacity for unfolding interesting narratives, in part by contrasting them with random walks. Here we explore the game story space afforded by a data set of 1310 Australian Football League (AFL) score lines. We find that AFL games exhibit a continuous spectrum of stories rather than distinct clusters. We show how coarse graining reveals identifiable motifs ranging from last-minute comeback wins to one-sided blowouts. Through an extensive comparison with biased random walks, we show that real AFL games deliver a broader array of motifs than null models, and we provide consequent insights into the narrative appeal of real games.


Subjects
Theoretical Models, Football, Australia, Humans
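
The comparison with biased random walks mentioned in the record above can be pictured with a very small null model of a game's lead over time. This sketch assumes every scoring event moves the lead by exactly one unit, which ignores the different point values of real AFL scores; `biased_walk`, `n_events`, and `p` are hypothetical names, and the paper's actual null models may differ.

```python
import random


def biased_walk(n_events, p, seed=None):
    """Return the lead of team A after each scoring event.

    At every event team A scores with probability p, otherwise team B
    does. Assumption: each score changes the lead by exactly 1.
    The returned list starts at 0 and has n_events + 1 entries.
    """
    rng = random.Random(seed)
    lead = 0
    path = [lead]
    for _ in range(n_events):
        lead += 1 if rng.random() < p else -1
        path.append(lead)
    return path
```

Generating many such paths with `p` matched to an estimate of team strength, then comparing their shapes with real score lines, is the general idea behind testing whether real games show more varied "stories" than chance alone.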
15.
PLoS One ; 10(8): e0136092, 2015.
Article in English | MEDLINE | ID: mdl-26291877

ABSTRACT

The consequences of anthropogenic climate change are extensively debated through scientific papers, newspaper articles, and blogs. Newspaper articles may lack accuracy, while the severity of findings in scientific papers may be too opaque for the public to understand. Social media, however, is a forum where individuals of diverse backgrounds can share their thoughts and opinions. As consumption shifts from old media to new, Twitter has become a valuable resource for analyzing current events and headline news. In this research, we analyze tweets containing the word "climate" collected between September 2008 and July 2014. Through use of a previously developed sentiment measurement tool called the Hedonometer, we determine how collective sentiment varies in response to climate change news, events, and natural disasters. We find that natural disasters, climate bills, and oil-drilling can contribute to a decrease in happiness while climate rallies, a book release, and a green ideas contest can contribute to an increase in happiness. Words uncovered by our analysis suggest that responses to climate change news are predominantly from climate change activists rather than climate change deniers, indicating that Twitter is a valuable resource for the spread of climate change awareness.


Subjects
Climate Change, Public Opinion, Social Media/statistics & numerical data, Disasters, Global Warming, Happiness, Humans
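
A word-score sentiment measure of the kind the record above refers to can be sketched as a frequency-weighted average of per-word happiness ratings. The sketch assumes a 1 to 9 rating scale and the exclusion of near-neutral words between 4 and 6; both are assumptions here rather than details given in the abstract, and the ratings in the usage note are invented for illustration, not real lexicon values.

```python
from collections import Counter


def average_happiness(text, scores, neutral_band=(4.0, 6.0)):
    """Return the frequency-weighted mean score of the rated words in text.

    scores: dict mapping a word to a happiness rating (assumed 1 to 9).
    Words without a rating, or with a rating strictly inside neutral_band,
    are ignored. Returns None if no word contributes.
    """
    counts = Counter(text.lower().split())  # naive tokenizer (assumption)
    low, high = neutral_band
    total = 0
    weighted = 0.0
    for word, n in counts.items():
        h = scores.get(word)
        if h is None or low < h < high:
            continue
        total += n
        weighted += n * h
    return weighted / total if total else None
```

With the hypothetical ratings `{"happy": 8.0, "disaster": 2.0, "the": 5.0}`, the text "the happy happy disaster" ignores "the" and averages 8, 8, and 2. Because the result is a linear combination of word scores, a change between two texts can be traced back to the individual words that moved it.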