Sweet tweets! Evaluating a new approach for probability-based sampling of Twitter.
Buskirk, Trent D; Blakely, Brian P; Eck, Adam; McGrath, Richard; Singh, Ravinder; Yu, Youzhi.
Affiliation
  • Buskirk TD; Bowling Green State University, Bowling Green, USA.
  • Blakely BP; Bowling Green State University, Bowling Green, USA.
  • Eck A; Bowling Green State University, Bowling Green, USA.
  • McGrath R; Bowling Green State University, Bowling Green, USA.
  • Singh R; Bowling Green State University, Bowling Green, USA.
  • Yu Y; Bowling Green State University, Bowling Green, USA.
EPJ Data Sci ; 11(1): 9, 2022.
Article in English | MEDLINE | ID: mdl-35223365
As survey costs continue to rise and response rates decline, researchers are seeking more cost-effective ways to collect, analyze and process social and public opinion data. These trends have created both an opportunity and an interest in expanding the fit-for-purpose paradigm to include alternate sources such as passively collected sensor data and social media data. However, methods for accessing, sourcing and sampling social media data are only now being developed. A small but growing body of literature compares different Twitter data access methods, either through the elaborate firehose or through the free Twitter search or streaming APIs. Missing from the literature is a good understanding of how to randomly sample Tweets to produce datasets that are representative of the daily discourse, especially within geographical regions of interest, without requiring a census of all Tweets. This understanding is necessary for producing quality estimates of public opinion from social media sources such as Twitter. To address this gap, we propose and test the Velocity-Based Estimation for Sampling Tweets (VBEST) algorithm for selecting a probability-based sample of Tweets. We compare VBEST sample estimates with those from other methods of accessing Twitter through the Search API on the distribution of total Tweets as well as COVID-19 keyword incidence and frequency. We find that the VBEST samples produce consistent and relatively low levels of overall bias compared to common methods of access through the Search API across many experimental conditions.

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: EPJ Data Sci Publication year: 2022 Document type: Article Affiliation country: United States Publication country: Germany
