Results 1 - 4 of 4
1.
PLoS One ; 19(5): e0304201, 2024.
Article in English | MEDLINE | ID: mdl-38820351

ABSTRACT

TripAdvisor reviews and comparable data sources play an important role in many tasks in Natural Language Processing (NLP), providing a data basis for the identification and classification of subjective judgments, such as hotel or restaurant reviews, into positive or negative polarities. This study explores three important factors influencing variation in crowdsourced polarity judgments, focusing on TripAdvisor reviews in Spanish. Three hypotheses are tested: the role of Part Of Speech (POS), the impact of sentiment words such as "tasty", and the influence of neutral words like "ok" on judgment variation. The study's methodology employs one-word titles, demonstrating their efficacy in studying polarity variation of words. Statistical tests on mean equality are performed on the word groups of interest. The results of this study reveal that adjectives in one-word titles tend to result in lower judgment variation compared to other word types or POS. Sentiment words contribute to lower judgment variation as well, emphasizing the significance of sentiment words in research on polarity judgments, and neutral words are associated with higher judgment variation, as expected. However, these effects cannot always be reproduced in longer titles, which suggests that longer titles are not the best data source for testing the ambiguity of single words, because other words in longer titles, such as negations, influence word polarity. This empirical investigation contributes valuable insights into the factors influencing polarity variation of words, providing a foundation for NLP practitioners who aim to capture and predict polarity judgments in Spanish and for researchers who aim to understand factors influencing judgment variation.


Subject(s)
Judgment , Natural Language Processing , Humans , Language
2.
Front Psychol ; 14: 1137038, 2023.
Article in English | MEDLINE | ID: mdl-37205084

ABSTRACT

This paper investigates the influence of the relative size of speech communities on language use in multilingual regions and cities. Due to people's everyday mobility inside a city, it is still unclear whether the size of a population matters for language use on a sub-city scale. By testing the correlation between the size of a population and language use on various spatial scales, this study will contribute to a better understanding of the extent to which sociodemographic factors influence language use. The present study investigates two particular phenomena that are common to multilingual speakers, namely language mixing or Code-Switching and using multiple languages without mixing. Demographic information from a Canadian census will make predictions about the intensity of Code-Switching and language use by multilinguals in cities of Quebec and neighborhoods of Montreal. Geolocated tweets will be used to identify where these linguistic phenomena occur the most and the least. My results show that the intensity of Code-Switching and the use of English by bilinguals are influenced by the size of anglophone and francophone populations on various spatial scales such as the city level, land use level (city center vs. periphery of Montreal), and large urban zones on the sub-city level, namely the western and eastern urban zones of Montreal. However, the correlation between population figures and language use is difficult to measure and evaluate on a much smaller sub-urban scale such as the city block scale due to factors such as population figures missing from the census and people's mobility. A qualitative evaluation of language use on a small spatial scale seems to suggest that other social influences such as the location context or topic of discussion are much more important predictors for language use than population figures. Methods will be suggested for testing this hypothesis in future research. I conclude that geographic space can provide us information about the relation between language use in multilingual cities and sociodemographic factors such as a speech community's size and that social media is a valuable alternative data source for sociolinguistic research that offers new insights into the mechanisms of language use such as Code-Switching.

3.
PLoS One ; 17(9): e0274114, 2022.
Article in English | MEDLINE | ID: mdl-36084118

ABSTRACT

Analysis of language geography is increasingly being used for studying spatial patterns of social dynamics. This trend is fueled by social media platforms such as Twitter which provide access to large amounts of natural language data combined with geolocation and user metadata enabling reconstruction of detailed spatial patterns of language use. Most studies are performed on large spatial scales associated with countries and regions, where language dynamics are often dominated by the effects of geographic and administrative borders. Extending to smaller, urban scales, however, allows visualization of spatial patterns of language use determined by social dynamics within the city, providing valuable information for a range of social topics from demographic studies to urban planning. So far, few studies have been made in this domain, due, in part, to the challenges in developing algorithms that accurately classify linguistic features. Here we extend urban-scale geographical analysis of language use beyond lexical meaning to include other sociolinguistic markers that identify language style, dialect and social groups. Some features, which have not been explored with social-media data on the urban scale, can be used to target a range of social phenomena. Our study focuses on Twitter use in Buenos Aires and our approach classifies tweets based on contrasting sets of tokens manually selected to target precise linguistic features. We perform statistical analyses of eleven categories of language use to quantify the presence of spatial patterns and the extent to which they are socially driven. We then perform the first comparative analysis assessing how the patterns and strength of social drivers vary with category. Finally, we derive plausible explanations for the patterns by comparing them with independently generated maps of geosocial context. Identifying these connections is a key aspect of the social-dynamics analysis which has so far received insufficient attention.


Subject(s)
Social Media , Cities , Data Collection , Humans , Linguistics , Metadata
4.
Front Sociol ; 7: 805716, 2022.
Article in English | MEDLINE | ID: mdl-35372565

ABSTRACT

In this article, I explore Twitter data to analyze Gender Neutral Language (GNL) in (Greater) Buenos Aires, (Greater) La Plata, and Córdoba. The goal is to characterize the social context behind GNL. Social context analysis of social media data is challenging given that this data type does not contain the social characteristics of its users and the circumstances under which the tweets were written. In order to fill this gap, I will derive the social context information from textual and temporal features by analyzing the names of locations, companies, and people used in the text and relating these entities to the message of the tweet. The analysis of temporal features will give us insights into the correlation between language use and social events. My results show that the general characterization of the social context behind GNL is associated with socio-economically rich areas in city centers. Users of GNL in the investigated areas address certain groups of people with words that express familiarity and close social relationships, such as those meaning "friends" and "neighbors", and give them information about a political, cultural, or social event or about commercial products/services. The temporal analysis by month supports this characterization by showing that certain political and social events induce a higher frequency of GNL. This paper contributes to previous research on GNL in Argentina by testing existing hypotheses quantitatively. The new discovery presented here is that political activism is not the only language context in which GNL is used in social media and that GNL is not exclusively used in big cities of Argentina but also in smaller cities.
