1.
Top Cogn Sci ; 16(1): 38-53, 2024 01.
Article En | MEDLINE | ID: mdl-38145974

I present a computational-level model of language production in terms of a combination of information theory and control theory in which words are chosen incrementally in order to maximize communicative value subject to an information-theoretic capacity constraint. The theory generally predicts a tradeoff between ease of production and communicative accuracy. I apply the theory to two cases of apparent availability effects in language production, in which words are selected on the basis of their accessibility to a speaker who has not yet perfectly planned the rest of the utterance. Using corpus data on English relative clause complementizer dropping and experimental data on Mandarin noun classifier choice, I show that the theory reproduces the observed phenomena, providing an alternative account to Uniform Information Density and a promising general model of language production which is tightly linked to emerging theories in computational neuroscience.
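As a rough illustration of the capacity-constrained choice rule this abstract describes, the sketch below shows how an information-limited policy interpolates between accessibility-driven and accuracy-driven word choice; the numbers and the value function are invented for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical toy sketch (not the paper's implementation): incremental word
# choice as a capacity-limited policy. Under a mutual-information capacity
# constraint, the optimal choice rule takes a softmax-like form,
#   P(word | goal) ∝ accessibility(word) * exp(beta * value(word, goal)),
# where beta reflects how much control capacity the speaker can spend.

accessibility = np.array([0.6, 0.4])  # assumed availability: "that" vs. complementizer omission
value = np.array([0.7, 1.0])          # assumed communicative value of each option for this goal

def capacity_limited_choice(prior, value, beta):
    """Trade off ease (prior accessibility) against communicative accuracy (value)."""
    unnorm = prior * np.exp(beta * value)
    return unnorm / unnorm.sum()

for beta in [0.0, 2.0, 10.0]:
    p = capacity_limited_choice(accessibility, value, beta)
    print(f"beta={beta:5.1f}  P(that)={p[0]:.3f}  P(omit)={p[1]:.3f}")
# At beta = 0 the speaker simply follows accessibility; as capacity grows, the
# choice increasingly tracks communicative value, which is the predicted
# trade-off between ease of production and communicative accuracy.
```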


Neurosciences , Psycholinguistics , Humans , Language , Communication
2.
Cognition ; 241: 105543, 2023 Dec.
Article En | MEDLINE | ID: mdl-37713956

Grammatical cues are sometimes redundant with word meanings in natural language. For instance, English word order rules constrain the word order of a sentence like "The dog chewed the bone" even though the status of "dog" as subject and "bone" as object can be inferred from world knowledge and plausibility. Quantifying how often this redundancy occurs, and how the level of redundancy varies across typologically diverse languages, can shed light on the function and evolution of grammar. To that end, we performed a behavioral experiment in English and Russian and a cross-linguistic computational analysis measuring the redundancy of grammatical cues in transitive clauses extracted from corpus text. English and Russian speakers (n = 484) were presented with subjects, verbs, and objects (in random order and with morphological markings removed) extracted from naturally occurring sentences and were asked to identify which noun is the subject of the action. Accuracy was high in both languages (∼89% in English, ∼87% in Russian). Next, we trained a neural-network classifier on a similar task: predicting which nominal in a subject-verb-object triad is the subject. Across 30 languages from eight language families, performance was consistently high: a median accuracy of 87%, comparable to the accuracy observed in the human experiments. The conclusion is that grammatical cues such as word order are necessary to convey subjecthood and objecthood in only a minority of naturally occurring transitive clauses; nevertheless, they (a) provide an important source of redundancy and (b) are crucial for conveying intended meanings that cannot be inferred from the words alone, including descriptions of human interactions, where roles are often reversible (e.g., Ray helped Lu / Lu helped Ray), and expressions of non-prototypical meanings (e.g., "The bone chewed the dog.").
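A minimal stand-in for the classifier component might look like the sketch below; the paper's actual model, features, and training data differ, and the triads, features, and use of logistic regression here are simplifying assumptions.

```python
# Toy stand-in for the subjecthood classifier described above (hypothetical data):
# given a (noun, verb, noun) triad with order randomized, predict which noun is
# the subject from word identity alone.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Miniature invented training set: label 0 = first noun is the subject.
triads = [("dog", "chewed", "bone", 0), ("bone", "chewed", "dog", 1),
          ("girl", "read", "book", 0), ("book", "read", "girl", 1),
          ("cat", "caught", "mouse", 0), ("mouse", "caught", "cat", 1)]

def feats(n1, v, n2):
    return {f"n1={n1}": 1, f"v={v}": 1, f"n2={n2}": 1}

vec = DictVectorizer()
X = vec.fit_transform([feats(n1, v, n2) for n1, v, n2, _ in triads])
y = [label for *_, label in triads]
clf = LogisticRegression().fit(X, y)

test = ("bone", "chewed", "dog")
pred = clf.predict(vec.transform([feats(*test)]))[0]
print(f"predicted subject of {test}: {test[2] if pred else test[0]}")
```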

3.
Proc Natl Acad Sci U S A ; 120(39): e2220593120, 2023 09 26.
Article En | MEDLINE | ID: mdl-37725652

I apply a recently emerging perspective on the complexity of action selection, the rate-distortion theory of control, to provide a computational-level model of errors and difficulties in human language production, which is grounded in information theory and control theory. Language production is cast as the sequential selection of actions to achieve a communicative goal subject to a capacity constraint on cognitive control. In a series of calculations, simulations, corpus analyses, and comparisons to experimental data, I show that the model directly predicts some of the major known qualitative and quantitative phenomena in language production, including semantic interference and predictability effects in word choice; accessibility-based ("easy-first") production preferences in word order alternations; and the existence and distribution of disfluencies including filled pauses, corrections, and false starts. I connect the rate-distortion view to existing models of human language production, to probabilistic models of semantics and pragmatics, and to proposals for controlled language generation in the machine learning and reinforcement learning literature.
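The sketch below illustrates the general shape of a rate-distortion control policy of the sort the abstract invokes; the utility matrix, state distribution, and value of beta are invented toy numbers, and the iteration is a standard Blahut-Arimoto-style alternation rather than the paper's actual simulations.

```python
import numpy as np

# Toy rate-distortion control policy (illustrative only). Rows are communicative
# goal states, columns are candidate words/actions; U[s, a] is an assumed utility.
U = np.array([[2.0, 0.5, 0.0],
              [0.0, 2.0, 0.5],
              [0.5, 0.0, 2.0]])
p_state = np.ones(3) / 3
beta = 2.0  # larger beta = more cognitive-control capacity

# Blahut-Arimoto-style alternation for the capacity-limited policy:
#   pi(a|s) ∝ p0(a) * exp(beta * U[s, a]),   with p0(a) = sum_s p(s) * pi(a|s)
p0 = np.ones(U.shape[1]) / U.shape[1]
for _ in range(200):
    pi = p0 * np.exp(beta * U)
    pi /= pi.sum(axis=1, keepdims=True)
    p0 = p_state @ pi

rate = float(np.sum(p_state[:, None] * pi * np.log2(pi / p0)))  # I(state; action) in bits
print(np.round(pi, 3))
print(f"control cost = {rate:.2f} bits per choice")
# Low beta yields a cheap but goal-insensitive policy (more slips and interference);
# high beta yields an accurate but information-costly one.
```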


Language , Semantics , Humans , Communication , Information Theory , Machine Learning
4.
Cognition ; 240: 105505, 2023 11.
Article En | MEDLINE | ID: mdl-37598582

We explore systems of spatial deictic words (such as 'here' and 'there') from the perspective of communicative efficiency using typological data from over 200 languages (Nintemann et al., 2020). We argue from an information-theoretic perspective that spatial deictic systems balance informativity and complexity in the sense of the Information Bottleneck (Zaslavsky et al., 2018). We find that under an appropriate choice of cost function and need probability over meanings, among all the 21,146 theoretically possible spatial deictic systems, those adopted by real languages lie near an efficient frontier of informativity and complexity. Moreover, we find that the conditions the need probability and the cost function must satisfy for this result are consistent with the cognitive science literature on spatial cognition, especially regarding the source-goal asymmetry. We further show that the typological data are better explained by introducing a notion of consistency into the Information Bottleneck framework, which is jointly optimized along with informativity and complexity.
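For reference, the trade-off the abstract appeals to is the Information Bottleneck objective of Zaslavsky et al. (2018); the formulation below follows that paper's general setup rather than anything specific to the deictic-systems analysis.

```latex
% Information Bottleneck objective (Zaslavsky et al., 2018): a naming system
% q(w | m) maps speaker meanings m to words w, and efficient systems minimize
\[
  \mathcal{F}_{\beta}\bigl[q(w \mid m)\bigr]
    \;=\; \underbrace{I_q(M;W)}_{\text{complexity}}
    \;-\; \beta \, \underbrace{I_q(W;U)}_{\text{informativity}},
\]
% where U is the target the listener must reconstruct, and the need probability
% over meanings determines the distributions over which the mutual-information
% terms are computed.
```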


Cognition , Cognitive Science , Humans , Communication , Language , Probability
5.
Proc Natl Acad Sci U S A ; 119(43): e2122602119, 2022 10 25.
Article En | MEDLINE | ID: mdl-36260742

A major goal of psycholinguistic theory is to account for the cognitive constraints limiting the speed and ease of language comprehension and production. Wide-ranging evidence demonstrates a key role for linguistic expectations: A word's predictability, as measured by the information-theoretic quantity of surprisal, is a major determinant of processing difficulty. But surprisal, under standard theories, fails to predict the difficulty profile of an important class of linguistic patterns: the nested hierarchical structures made possible by recursion in human language. These nested structures are better accounted for by psycholinguistic theories of constrained working memory capacity. However, progress toward a theory unifying expectation-based and memory-based accounts has been limited. Here we present a unified theory of a rational trade-off between the precision of memory representations and the ease of prediction, a scaled-up computational implementation using contemporary machine learning methods, and experimental evidence in support of the theory's distinctive predictions. We show that the theory makes nuanced and distinctive predictions for difficulty patterns in nested recursive structures that are made by neither expectation-based nor memory-based theories alone. These predictions are confirmed (1) in two language-comprehension experiments in English and (2) in sentence completions in English, Spanish, and German. More generally, our framework offers a computationally explicit theory and methods for understanding how memory constraints and prediction interact in human language comprehension and production.


Comprehension , Linguistics , Humans , Language , Psycholinguistics , Memory, Short-Term
6.
Cognition ; 222: 104902, 2022 05.
Article En | MEDLINE | ID: mdl-34583835

Going back to Ross (1967) and Chomsky (1973), researchers have sought to understand what conditions permit long-distance dependencies in language, such as between the wh-word what and the verb bought in the sentence 'What did John think that Mary bought?'. In the present work, we attempt to understand why changing the main verb in wh-questions affects the acceptability of long-distance dependencies out of embedded clauses. In particular, it has been claimed that factive and manner-of-speaking verbs block such dependencies (e.g., 'What did John know/whisper that Mary bought?'), whereas verbs like think and believe allow them. Here we present three acceptability-judgment experiments on filler-gap constructions across embedded clauses to evaluate four types of accounts, based on (1) discourse; (2) syntax; (3) semantics; and (4) our proposal related to verb-frame frequency. The patterns of acceptability are most simply explained by two factors: verb-frame frequency, such that dependencies with verbs that rarely take embedded clauses are less acceptable; and construction type, such that wh-questions and clefts are less acceptable than declaratives. We conclude that the low acceptability of filler-gap constructions formed with certain sentence-complement verbs is due to infrequent linguistic exposure.


Language , Semantics , Humans , Judgment , Language Tests , Linguistics
7.
Lang Resour Eval ; 55(1): 63-77, 2021.
Article En | MEDLINE | ID: mdl-34720781

It is now a common practice to compare models of human language processing by comparing how well they predict behavioral and neural measures of processing difficulty, such as reading times, on corpora of rich naturalistic linguistic materials. However, many of these corpora, which are based on naturally occurring text, contain few of the low-frequency syntactic constructions that are often required to distinguish between processing theories. Here we describe a new corpus consisting of English texts edited to contain many low-frequency syntactic constructions while still sounding fluent to native speakers. The corpus is annotated with hand-corrected Penn Treebank-style parse trees and includes self-paced reading time data and aligned audio recordings. We give an overview of the content of the corpus, review recent work using the corpus, and release the data.

8.
Front Psychol ; 12: 672408, 2021.
Article En | MEDLINE | ID: mdl-34135832

I present a computational-level model of semantic interference effects in online word production within a rate-distortion framework. I consider a bounded-rational agent trying to produce words. The agent's action policy is determined by maximizing accuracy in production subject to computational constraints. These computational constraints are formalized using mutual information. I show that semantic similarity-based interference among words falls out naturally from this setup, and I present a series of simulations showing that the model captures some of the key empirical patterns observed in Stroop and Picture-Word Interference paradigms, including comparisons to human data from previous experiments.

9.
Cereb Cortex ; 31(9): 4006-4023, 2021 07 29.
Article En | MEDLINE | ID: mdl-33895807

What role do domain-general executive functions play in human language comprehension? To address this question, we examine the relationship between behavioral measures of comprehension and neural activity in the domain-general "multiple demand" (MD) network, which has been linked to constructs like attention, working memory, inhibitory control, and selection, and implicated in diverse goal-directed behaviors. Specifically, functional magnetic resonance imaging data collected during naturalistic story listening are compared with theory-neutral measures of online comprehension difficulty and incremental processing load (reading times and eye-fixation durations). Critically, to ensure that variance in these measures is driven by features of the linguistic stimulus rather than reflecting participant- or trial-level variability, the neuroimaging and behavioral datasets were collected in nonoverlapping samples. We find no behavioral-neural link in functionally localized MD regions; instead, this link is found in the domain-specific, fronto-temporal "core language network," in both left-hemisphere areas and their right-hemisphere homotopic counterparts. These results argue against strong involvement of domain-general executive circuits in language comprehension.


Comprehension/physiology , Language , Nerve Net/physiology , Adult , Attention/physiology , Brain/diagnostic imaging , Executive Function/physiology , Female , Fixation, Ocular , Functional Laterality , Humans , Language Tests , Magnetic Resonance Imaging , Male , Memory, Short-Term/physiology , Psycholinguistics , Psychomotor Performance/physiology , Reading , Young Adult
10.
Psychol Rev ; 128(4): 726-756, 2021 07.
Article En | MEDLINE | ID: mdl-33793259

Memory limitations are known to constrain language comprehension and production, and have been argued to account for crosslinguistic word order regularities. However, a systematic assessment of the role of memory limitations in language structure has proven elusive, in part because it is hard to extract precise large-scale quantitative generalizations about language from existing mechanistic models of memory use in sentence processing. We provide an architecture-independent information-theoretic formalization of memory limitations which enables a simple calculation of the memory efficiency of languages. Our notion of memory efficiency is based on the idea of a memory-surprisal trade-off: a certain level of average surprisal per word can only be achieved at the cost of storing some amount of information about the past context. Based on this notion of memory usage, we advance the Efficient Trade-off Hypothesis: the order of elements in natural language is under pressure to enable favorable memory-surprisal trade-offs. We derive that languages enable more efficient trade-offs when they exhibit information locality: when predictive information about an element is concentrated in its recent past. We provide empirical evidence from three test domains in support of the Efficient Trade-off Hypothesis: a reanalysis of a miniature artificial language learning experiment, a large-scale study of word order in corpora of 54 languages, and an analysis of morpheme order in two agglutinative languages. These results suggest that principles of order in natural language can be explained via highly generic, cognitively motivated principles and lend support to efficiency-based models of the structure of human language.
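A toy numerical illustration of the memory-surprisal trade-off is sketched below, with an invented symbol stream standing in for a corpus and a truncated n-gram context standing in for memory.

```python
from collections import Counter
from math import log2

# Toy illustration of the memory-surprisal trade-off (invented data, not the
# paper's corpora): the more of the past a comprehender stores, the lower the
# achievable average surprisal per symbol. Here "memory" is simply the last k
# symbols of context.

seq = list("abcabcabdabcabcabd" * 20)  # hypothetical symbol stream

def avg_surprisal(seq, k):
    """Average -log2 P(symbol | previous k symbols), estimated by counting."""
    ctx, joint = Counter(), Counter()
    for i in range(k, len(seq)):
        c = tuple(seq[i - k:i])
        ctx[c] += 1
        joint[(c, seq[i])] += 1
    total = sum(joint.values())
    return -sum(n / total * log2(n / ctx[c]) for (c, _), n in joint.items())

for k in range(4):
    print(f"context length {k}: average surprisal = {avg_surprisal(seq, k):.3f} bits")
# Surprisal falls as k grows; the Efficient Trade-off Hypothesis says word orders
# are shaped so that most of this drop comes from the recent past (information
# locality), keeping memory demands low for a given level of surprisal.
```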


Comprehension , Language , Humans , Language Development
11.
Cognition ; 209: 104491, 2021 04.
Article En | MEDLINE | ID: mdl-33545512

Language is used as a channel by which speakers convey, among other things, newsworthy and informative messages, i.e., content that is otherwise unpredictable to the comprehender. We might therefore expect comprehenders to show a preference for such messages. However, comprehension studies tend to emphasize the opposite: processing ease for situation-predictable content (e.g., chopping carrots with a knife). Comprehenders are known to deploy knowledge about situation plausibility during processing in fine-grained, context-sensitive ways. Using self-paced reading, we test whether comprehenders can also deploy this knowledge in favor of newsworthy content to yield informativity-driven effects alongside, or instead of, plausibility-driven effects. We manipulate semantic context (unusual protagonists), syntactic construction (wh-clefts), and the communicative environment (text messages). Reading times (primarily sentence-finally) show facilitation for sentences containing newsworthy content (e.g., chopping carrots with a shovel), where the content is both unpredictable at the situation level because of its atypicality and also unpredictable at the word level because of the large number of atypical elements a speaker could potentially mention. Our studies are the first to show that informativity-driven effects are observable at all, and the results highlight the need for models that distinguish between comprehenders' estimate of content plausibility and their estimate of a speaker's decision to talk about that content.


Comprehension , Language , Attention , Humans , Semantics
12.
Cogn Sci ; 44(3): e12814, 2020 03.
Article En | MEDLINE | ID: mdl-32100918

A key component of research on human sentence processing is to characterize the processing difficulty associated with the comprehension of words in context. Models that explain and predict this difficulty can be broadly divided into two kinds, expectation-based and memory-based. In this work, we present a new model of incremental sentence processing difficulty that unifies and extends key features of both kinds of models. Our model, lossy-context surprisal, holds that the processing difficulty at a word in context is proportional to the surprisal of the word given a lossy memory representation of the context; that is, a memory representation that does not contain complete information about previous words. We show that this model provides an intuitive explanation for an outstanding puzzle involving interactions of memory and expectations: language-dependent structural forgetting, where the effects of memory on sentence processing appear to be moderated by language statistics. Furthermore, we demonstrate that dependency locality effects, a signature prediction of memory-based theories, can be derived from lossy-context surprisal as a special case of a novel, more general principle called information locality.
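Schematically, and in notation of my own rather than the paper's, the model's core quantity can be written as an expected surprisal under a lossy memory encoding:

```latex
% Lossy-context surprisal, stated schematically: difficulty at word w_t is the
% surprisal of w_t given a lossy memory representation m_t of the preceding
% context, averaged over possible memory states.
\[
  \mathrm{difficulty}(w_t) \;\propto\;
  \mathbb{E}_{m_t \sim p(m_t \mid w_1, \dots, w_{t-1})}
  \bigl[ -\log p(w_t \mid m_t) \bigr]
\]
% Ordinary surprisal is the special case in which memory is lossless, i.e.
% m_t preserves w_1, ..., w_{t-1} exactly.
```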


Comprehension , Language , Humans , Memory , Models, Psychological
13.
Proc Natl Acad Sci U S A ; 117(5): 2347-2353, 2020 02 04.
Article En | MEDLINE | ID: mdl-31964811

The universal properties of human languages have been the subject of intense study across the language sciences. We report computational and corpus evidence for the hypothesis that a prominent subset of these universal properties, namely those related to word order, results from a process of optimization for efficient communication among humans, trading off the need to reduce complexity with the need to reduce ambiguity. We formalize these two pressures with information-theoretic and neural-network models of complexity and ambiguity and simulate grammars with optimized word-order parameters on large-scale data from 51 languages. Evolution of grammars toward efficiency results in word-order patterns that predict a large subset of the major word-order correlations across languages.


Generalization, Psychological/physiology , Language , Cognition , Communication , Humans , Language Development , Linguistics/standards , Neural Networks, Computer
14.
Neurobiol Lang (Camb) ; 1(1): 104-134, 2020.
Article En | MEDLINE | ID: mdl-36794007

The frontotemporal language network responds robustly and selectively to sentences. But the features of linguistic input that drive this response and the computations that these language areas support remain debated. Two key features of sentences are typically confounded in natural linguistic input: words in sentences (a) are semantically and syntactically combinable into phrase- and clause-level meanings, and (b) occur in an order licensed by the language's grammar. Inspired by recent psycholinguistic work establishing that language processing is robust to word order violations, we hypothesized that the core linguistic computation is composition, and, thus, can take place even when the word order violates the grammatical constraints of the language. This hypothesis predicts that a linguistic string should elicit a sentence-level response in the language network provided that the words in that string can enter into dependency relationships as in typical sentences. We tested this prediction across two fMRI experiments (total N = 47) by introducing a varying number of local word swaps into naturalistic sentences, leading to progressively less syntactically well-formed strings. Critically, local dependency relationships were preserved because combinable words remained close to each other. As predicted, word order degradation did not decrease the magnitude of the blood oxygen level-dependent response in the language network, except when combinable words were so far apart that composition among nearby words was highly unlikely. This finding demonstrates that composition is robust to word order violations, and that the language regions respond to such degraded strings as strongly as they do to naturalistic linguistic input, provided that composition can take place.

15.
Cognition ; 195: 104086, 2020 02.
Article En | MEDLINE | ID: mdl-31731116

Languages vary in their number of color terms. A widely accepted theory proposes that languages evolve, acquiring color terms in a stereotyped sequence. This theory, by Berlin and Kay (BK), is supported by analyzing best exemplars ("focal colors") of basic color terms in the World Color Survey (WCS) of 110 languages. But the instructions of the WCS were complex and the color chips confounded hue and saturation, which likely impacted focal-color selection. In addition, it is now known that even so-called early-stage languages have a complete representation of color distributed across the population. These facts undermine the BK theory. Here we revisit the evolution of color terms using original color-naming data obtained with simple instructions in Tsimane', an Amazonian culture that has limited contact with industrialized society. We also collected data from Bolivian-Spanish speakers and English speakers. We discovered that an information-theoretic analysis of color-naming data was not influenced by color-chip saturation, which motivated a new analysis of the WCS data. Embedded within a universal pattern in which warm colors (reds, oranges) are always communicated more efficiently than cool colors (blues, greens), some colors undergo greater increases in communication efficiency than others as languages increase in overall communicative efficiency about color. Communication efficiency increases first for yellow, then brown, then purple. The present analyses and results provide a new framework for understanding the evolution of color terms: what varies among cultures is not whether colors are seen differently, but the extent to which color is useful.
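The sketch below shows one way an expected-surprisal measure of communicative efficiency about color can be computed from naming data; the counts are invented, and the measure used in the paper may differ in detail.

```python
from math import log2

# Toy efficiency calculation (hypothetical counts, not the Tsimane'/WCS data):
# for each color chip, the communicative cost is the expected surprisal a
# listener incurs in recovering the chip from the word a speaker uses for it.

# naming[chip][word] = how often speakers used `word` for `chip`
naming = {
    "warm1": {"red": 9, "orange": 1},
    "warm2": {"red": 2, "orange": 8},
    "cool1": {"green": 5, "blue": 5},
    "cool2": {"green": 4, "blue": 6},
}
chips = list(naming)
p_chip = {c: 1 / len(chips) for c in chips}  # assume uniform need probability

def p_word_given_chip(c, w):
    return naming[c].get(w, 0) / sum(naming[c].values())

words = {w for c in chips for w in naming[c]}
p_word = {w: sum(p_chip[c] * p_word_given_chip(c, w) for c in chips) for w in words}

def expected_surprisal(c):
    # E_{w|c}[ -log2 P(c|w) ], with P(c|w) obtained by Bayes' rule
    cost = 0.0
    for w in naming[c]:
        pw_c = p_word_given_chip(c, w)
        pc_w = p_chip[c] * pw_c / p_word[w]
        cost += pw_c * -log2(pc_w)
    return cost

for c in chips:
    print(f"{c}: expected surprisal = {expected_surprisal(c):.2f} bits")
# With these toy counts, warm chips come out cheaper to communicate than cool
# chips, mirroring the universal warm/cool asymmetry described above.
```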


Color Perception , Color , Communication , Cross-Cultural Comparison , Adolescent , Adult , Aged , Bolivia , Female , Humans , Indians, South American , Information Theory , Male , Middle Aged , Psycholinguistics , United States , Young Adult
17.
Trends Cogn Sci ; 23(5): 389-407, 2019 05.
Article En | MEDLINE | ID: mdl-31006626

Cognitive science applies diverse tools and perspectives to study human language. Recently, an exciting body of work has examined linguistic phenomena through the lens of efficiency in usage: what otherwise puzzling features of language find explanation in formal accounts of how language might be optimized for communication and learning? Here, we review studies that deploy formal tools from probability and information theory to understand how and why language works the way that it does, focusing on phenomena ranging from the lexicon through syntax. These studies show how a pervasive pressure for efficiency guides the forms of natural language and indicate that a rich future for language research lies in connecting linguistics to cognitive psychology and mathematical theories of communication and inference.


Efficiency , Language , Communication , Humans , Learning , Linguistics
18.
Entropy (Basel) ; 21(7)2019 Jun 28.
Article En | MEDLINE | ID: mdl-33267354

The Predictive Rate-Distortion curve quantifies the trade-off between compressing information about the past of a stochastic process and predicting its future accurately. Existing estimation methods for this curve work by clustering finite sequences of observations or by utilizing analytically known causal states. Neither type of approach scales to processes such as natural languages, which have large alphabets and long dependencies, and where the causal states are not known analytically. We describe Neural Predictive Rate-Distortion (NPRD), an estimation method that scales to such processes, leveraging the universal approximation capabilities of neural networks. Taking only time series data as input, the method computes a variational bound on the Predictive Rate-Distortion curve. We validate the method on processes where Predictive Rate-Distortion is analytically known. As an application, we provide bounds on the Predictive Rate-Distortion of natural language, improving on bounds provided by clustering sequences. Based on the results, we argue that the Predictive Rate-Distortion curve is more useful than the usual notion of statistical complexity for characterizing highly complex processes such as natural language.
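In one common formulation (notation mine, following the general predictive rate-distortion literature rather than this paper's exact definitions), the quantity being estimated can be written as a Lagrangian that trades off memory rate against predictive value:

```latex
% Predictive Rate-Distortion, schematically: a code Z summarizes the past of a
% stochastic process; its rate is how much it stores about the past, and its
% value is how much it predicts about the future. Sweeping lambda traces the curve
\[
  \min_{p(z \mid x_{\text{past}})}
    \Bigl[ I(Z; X_{\text{past}}) \;-\; \lambda\, I(Z; X_{\text{future}}) \Bigr],
\]
% which NPRD bounds variationally with neural encoders and decoders instead of
% clustering observed sequences or assuming analytically known causal states.
```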

19.
Cognition ; 181: 141-150, 2018 12.
Article En | MEDLINE | ID: mdl-30195136

In everyday communication, speakers make errors and produce language in a noisy environment. Recent work suggests that comprehenders possess cognitive mechanisms for dealing with noise in the linguistic signal: a noisy-channel model. A key parameter of these models is the noise model: the comprehender's implicit model of how noise affects utterances before they are perceived. Here we examine this noise model in detail, asking whether comprehension behavior reflects a noise model that is adapted to context. We asked readers to correct sentences if they noticed errors, and manipulated context by including exposure sentences containing obvious deletions (A bystander was rescued by the fireman in the nick time.), insertions, exchanges, mixed errors, or no errors. On test sentences (The bat swung the player.), participants' corrections differed depending on the exposure condition. The results demonstrate that participants model specific types of errors and make inferences about the intentions of the speaker accordingly.
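A toy version of the inference the abstract describes is sketched below; the candidate sentences, prior plausibilities, and the one-parameter exchange-noise model are all invented for illustration.

```python
# Toy noisy-channel inference (hypothetical numbers): the comprehender weighs the
# prior plausibility of an intended sentence against the probability that noise
# turned it into the sentence actually perceived.

perceived = "The bat swung the player."

# Assumed prior plausibility of candidate intended sentences (world knowledge).
prior = {
    "The bat swung the player.": 0.05,  # implausible but literal
    "The player swung the bat.": 0.95,  # plausible, but requires an exchange error
}

def likelihood(intended, perceived, p_exchange):
    """P(perceived | intended) under a one-parameter noun-exchange noise model."""
    if intended == perceived:
        return 1.0 - p_exchange
    return p_exchange  # assume a single noun exchange maps one sentence onto the other

for p_exchange in [0.01, 0.20]:  # low vs. high, as after exposure to exchange errors
    post = {s: prior[s] * likelihood(s, perceived, p_exchange) for s in prior}
    z = sum(post.values())
    best = max(post, key=post.get)
    print(f"p_exchange={p_exchange:.2f}: P(literal reading)={post[perceived] / z:.2f}, "
          f"preferred interpretation: {best!r}")
# Raising the inferred exchange rate shifts the posterior toward the plausible,
# non-literal interpretation, which is the context-adaptation pattern reported above.
```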


Comprehension , Linguistics , Bayes Theorem , Humans , Reading
20.
Top Cogn Sci ; 10(1): 209-224, 2018 01.
Article En | MEDLINE | ID: mdl-29218788

A central goal of typological research is to characterize linguistic features in terms of both their functional role and their fit to social and cognitive systems. One long-standing puzzle concerns why certain languages employ grammatical gender. In an information-theoretic analysis of German noun classification, Dye, Milin, Futrell, and Ramscar (2017) enumerated a number of important processing advantages gender confers. Yet this raises a further puzzle: if gender systems are so beneficial to processing, what does this mean for languages that make do without them? Here, we compare the communicative function of gender marking in German (a deterministic system) to that of prenominal adjectives in English (a probabilistic one), finding that despite their differences, both systems act to efficiently smooth information over discourse, making nouns more equally predictable in context. We examine why evolutionary pressures may favor one system over another and discuss the implications for compositional accounts of meaning and Gricean principles of communication.
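As a toy illustration of the smoothing claim, the sketch below computes how much a gender-marked article reduces uncertainty about an upcoming noun; the miniature lexicon and frequencies are invented.

```python
from math import log2

# Toy illustration (invented counts): a gender-marked article carries information
# about the upcoming noun in advance, lowering the noun's surprisal and spreading
# information more evenly over the discourse.

# German-like miniature lexicon: (article, noun, frequency)
corpus = [("der", "Hund", 30), ("der", "Tisch", 20),
          ("die", "Katze", 30), ("die", "Lampe", 10),
          ("das", "Haus", 10)]
total = sum(f for _, _, f in corpus)

def entropy(dist):
    return -sum(p * log2(p) for p in dist if p > 0)

# H(noun): uncertainty about the noun with no article cue
h_noun = entropy([f / total for _, _, f in corpus])

# H(noun | article): expected uncertainty once the gendered article is heard
h_noun_given_art = 0.0
for art in {a for a, _, _ in corpus}:
    rows = [f for a, _, f in corpus if a == art]
    p_art = sum(rows) / total
    h_noun_given_art += p_art * entropy([f / sum(rows) for f in rows])

print(f"H(noun) = {h_noun:.2f} bits, H(noun | article) = {h_noun_given_art:.2f} bits")
# The difference is information the article conveys before the noun arrives,
# making nouns more equally predictable in context.
```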


Communication , Psycholinguistics , Humans
...