Search | BVS Search Portal

A comparison of homonym meaning frequency estimates derived from movie and television subtitles, free association, and explicit ratings.

Rice, Caitlin A; Beekhuizen, Barend; Dubrovsky, Vladimir; Stevenson, Suzanne; Armstrong, Blair C.

Behav Res Methods; 51(3): 1399-1425, 2019 Jun.

Article in English | MEDLINE | ID: mdl-30203161

ABSTRACT

Most words are ambiguous, with interpretation dependent on context. Advancing theories of ambiguity resolution is important for any general theory of language processing, and for resolving inconsistencies in observed ambiguity effects across experimental tasks. Focusing on homonyms (words such as bank with unrelated meanings EDGE OF A RIVER vs. FINANCIAL INSTITUTION), the present work advances theories and methods for estimating the relative frequency of their meanings, a factor that shapes observed ambiguity effects. We develop a new method for estimating meaning frequency based on the meaning of a homonym evoked in lines of movie and television subtitles according to human raters. We also replicate and extend a measure of meaning frequency derived from the classification of free associates. We evaluate the internal consistency of these measures, compare them to published estimates based on explicit ratings of each meaning's frequency, and compare each set of norms in predicting performance in lexical and semantic decision mega-studies. All measures have high internal consistency and show agreement, but each is also associated with unique variance, which may be explained by integrating cognitive theories of memory with the demands of different experimental methodologies. To derive frequency estimates, we collected manual classifications of 533 homonyms over 50,000 lines of subtitles, and of 357 homonyms across over 5000 homonym-associate pairs. This database (publicly available at www.blairarmstrong.net/homonymnorms/) constitutes a novel resource for computational cognitive modeling and computational linguistics, and we offer suggestions around good practices for its use in training and testing models on labeled data.
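As a rough illustration of what a relative meaning frequency estimate can look like, the sketch below turns a list of rater classifications for one homonym into proportions. The function name `meaning_frequencies` and the four example labels are hypothetical and invented for illustration; the abstract does not describe the authors' actual scoring procedure, so this is only one plausible reading under the assumption that each subtitle line receives a single meaning label.

```python
from collections import Counter


def meaning_frequencies(labels):
    """Return the proportion of classified lines assigned to each meaning.

    `labels` is a list with one meaning label per classified line
    (an assumption made for this sketch, not a detail from the paper).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {meaning: n / total for meaning, n in counts.items()}


# Made-up classifications for four subtitle lines containing "bank".
labels = [
    "FINANCIAL INSTITUTION",
    "FINANCIAL INSTITUTION",
    "FINANCIAL INSTITUTION",
    "EDGE OF A RIVER",
]
freqs = meaning_frequencies(labels)

# The dominant meaning is the one with the largest proportion.
dominant = max(freqs, key=freqs.get)
```

With this toy input, three of the four lines carry the financial sense, so that meaning receives the larger share; real estimates would of course depend on the labeled data in the published database.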

Subjects

Free Association, Adolescent, Female, Humans, Linguistics, Male, Motion Pictures, Semantics, Television, Young Adult

Three design principles of language: the search for parsimony in redundancy.

Beekhuizen, Barend; Bod, Rens; Zuidema, Willem.

Lang Speech; 56(Pt 3): 265-90, 2013 Sep.

Article in English | MEDLINE | ID: mdl-24416957

ABSTRACT

In this paper we present three design principles of language (experience, heterogeneity and redundancy) and present recent developments in a family of models incorporating them, namely Data-Oriented Parsing/Unsupervised Data-Oriented Parsing. Although the idea of some form of redundant storage has become part and parcel of parsing technologies and usage-based linguistic approaches alike, the question of how much of it is cognitively realistic and/or computationally optimally efficient is an open one. We argue that a segmentation-based approach (Bayesian Model Merging) combined with an all-subtrees approach reduces the number of rules needed to achieve an optimal performance, thus making the parser more efficient. At the same time, starting from unsegmented wholes comes closer to the acquisitional situation of a language learner, and thus adds to the cognitive plausibility of the model.

Subjects

Language, Bayes Theorem, Humans

Probing Lexical Ambiguity: Word Vectors Encode Number and Relatedness of Senses.

Beekhuizen, Barend; Armstrong, Blair C; Stevenson, Suzanne.

Cogn Sci; 45(5): e12943, 2021 May.

Article in English | MEDLINE | ID: mdl-34018227

ABSTRACT

Lexical ambiguity (the phenomenon of a single word having multiple, distinguishable senses) is pervasive in language. Both the degree of ambiguity of a word (roughly, its number of senses) and the relatedness of those senses have been found to have widespread effects on language acquisition and processing. Recently, distributional approaches to semantics, in which a word's meaning is determined by its contexts, have led to successful research quantifying the degree of ambiguity, but these measures have not distinguished between the ambiguity of words with multiple related senses versus multiple unrelated meanings. In this work, we present the first assessment of whether distributional meaning representations can capture the ambiguity structure of a word, including both the number and relatedness of senses. On a very large sample of English words, we find that some, but not all, distributional semantic representations that we test exhibit detectable differences between sets of monosemes (unambiguous words; N = 964), polysemes (with multiple related senses; N = 4,096), and homonyms (with multiple unrelated senses; N = 355). Our findings begin to answer open questions from earlier work regarding whether distributional semantic representations of words, which successfully capture various semantic relationships, also reflect fine-grained aspects of meaning structure that influence human behavior. Our findings emphasize the importance of measuring whether proposed lexical representations capture such distinctions: In addition to standard benchmarks that test the similarity structure of distributional semantic models, we need to also consider whether they have cognitively plausible ambiguity structure.

Subjects

Psycholinguistics, Semantics, Humans, Language

More Than the Eye Can See: A Computational Model of Color Term Acquisition and Color Discrimination.

Beekhuizen, Barend; Stevenson, Suzanne.

Cogn Sci; 42(8): 2699-2734, 2018 Nov.

Article in English | MEDLINE | ID: mdl-30079497

ABSTRACT

We explore the following two cognitive questions regarding crosslinguistic variation in lexical semantic systems: Why are some linguistic categories (that is, the associations between a term and a portion of the semantic space) harder to learn than others? How does learning a language-specific set of lexical categories affect processing in that semantic domain? Using a computational word-learner, and the domain of color as a testbed, we investigate these questions by modeling both child acquisition of color terms and adult behavior on a non-verbal color discrimination task. A further goal is to test an approach to lexical semantic representation based on the principle that the more languages label any two situations with the same word, the more conceptually similar those two situations are. We compare such a crosslinguistically based semantic space to one based on perceptual similarity. Our computational model suggests a mechanistic explanation for the interplay between term frequency and the semantic closeness of learned categories in developmental error patterns for color terms. Our model also indicates how linguistic relativity effects could arise from an acquisition mechanism that yields language-specific topologies for the same semantic domain. Moreover, we find that the crosslinguistically inspired semantic space supports these results at least as well as, and in some aspects better than, the purely perceptual one, thus confirming our approach as a practical and principled method for lexical semantic representation in cognitive modeling.
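The stated principle (the more languages label two situations with the same word, the more similar the situations are) can be sketched as a simple proportion. Everything below is an assumption made for illustration: the function name `crosslinguistic_similarity`, the three placeholder languages, the chip names, and the terms are invented, and the abstract does not say how the authors actually construct their semantic space.

```python
def crosslinguistic_similarity(terms_by_language, a, b):
    """Share of languages that use the same term for situations a and b.

    `terms_by_language` maps a language name to a dict from situation
    to the term that language uses for it (hypothetical data layout).
    """
    shared = sum(
        1 for terms in terms_by_language.values() if terms[a] == terms[b]
    )
    return shared / len(terms_by_language)


# Invented naming data for three color chips in three placeholder languages.
terms_by_language = {
    "language_1": {"chip_a": "t1", "chip_b": "t1", "chip_c": "t2"},
    "language_2": {"chip_a": "u1", "chip_b": "u1", "chip_c": "u1"},
    "language_3": {"chip_a": "v1", "chip_b": "v2", "chip_c": "v2"},
}

sim_ab = crosslinguistic_similarity(terms_by_language, "chip_a", "chip_b")
sim_ac = crosslinguistic_similarity(terms_by_language, "chip_a", "chip_c")
```

In this toy data two of the three languages group chip_a with chip_b, but only one groups chip_a with chip_c, so the first pair comes out as more similar under the principle.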

Subjects

Color Perception/physiology, Psychological Discrimination/physiology, Language Development, Neurological Models, Verbal Learning/physiology, Child, Humans, Language
