ABSTRACT
Adults struggle to learn non-native speech categories in many experimental settings (Goto, Neuropsychologia, 9(3), 317-323 1971), but learn efficiently in a video game paradigm where non-native speech sounds have functional significance (Lim & Holt, Cognitive Science, 35(7), 1390-1405 2011). Behavioral and neural evidence from this and other paradigms points toward the involvement of reinforcement learning mechanisms in speech category learning (Harmon, Idemaru, & Kapatsinski, Cognition, 189, 76-88 2019; Lim, Fiez, & Holt, Proceedings of the National Academy of Sciences, 116, 201811992 2019). We formalize this hypothesis computationally and implement a deep reinforcement learning network to map between environmental input and actions. Comparing to a supervised model of learning, we show that the reinforcement network closely matches aspects of human behavior in two experiments: learning of synthesized auditory noise tokens and improvement in speech sound discrimination. Both models perform comparably and the similarity in the output of each model leads us to believe that there is little inherent computational benefit to a reward-based learning mechanism. We suggest that the specific neural circuitry engaged by the paradigm and links between striatum and superior temporal areas play a critical role in effective learning.
ABSTRACT
As children gradually master grammatical rules, they often go through a period of producing form-meaning associations that were not observed in the input. For example, 2- to 3-year-old English-learning children use the bare form of verbs in settings that require obligatory past tense meaning while already starting to produce the grammatical -ed inflection. While many studies have focused on overgeneralization errors, fewer studies have attempted to explain the root of this earlier stage of rule acquisition. In this work, we use computational modeling to replicate children's production behavior prior to the generalization of past tense production in English. We illustrate how seemingly erroneous productions emerge in a model, without being licensed in the grammar and despite the model aiming at conforming to grammatical forms. Our results show that bare form productions stem from a tension between two factors: (1) trying to produce a less frequent meaning (the past tense) and (2) being unable to restrict the production of frequent forms (the bare form) as learning progresses. Like children, our model goes through a stage of bare form production and then converges on adult-like production of the regular past tense, showing that these different stages can be accounted for through a single learning mechanism.
Subject(s)
Psychological Generalization, Learning, Adult, Humans, Child, Preschool Child, Computer Simulation, Linguistics
ABSTRACT
In the first year of life, infants' speech perception becomes attuned to the sounds of their native language. This process of early phonetic learning has traditionally been framed as phonetic category acquisition. However, recent studies have hypothesized that the attunement may instead reflect a perceptual space learning process that does not involve categories. In this article, we explore the idea of perceptual space learning by implementing five different perceptual space learning models and testing them on three phonetic contrasts that have been tested in the infant speech perception literature. We reproduce and extend previous results showing that a perceptual space learning model that uses only distributional information about the acoustics of short time slices of speech can account for at least some crosslinguistic differences in infant perception. Moreover, we find that a second perceptual space learning model, which benefits from word-level guidance, performs equally well in capturing crosslinguistic differences in infant speech perception. These results provide support for the general idea of perceptual space learning as a theory of early phonetic learning but suggest that more fine-grained data are needed to distinguish between different formal accounts. Finally, we provide testable empirical predictions of the two most promising models and show that these are not identical, making it possible to independently evaluate each model in experiments with infants in future research.
Subject(s)
Language Development, Speech Perception, Humans, Infant, Phonetics, Language, Spatial Learning, Computer Simulation
ABSTRACT
Children with developmental language disorder (DLD) regularly use the bare form of verbs (e.g., dance) instead of inflected forms (e.g., danced). We propose an account of this behavior in which processing difficulties of children with DLD disproportionally affect processing novel inflected verbs in their input. Limited experience with inflection in novel contexts leads the inflection to face stronger competition from alternatives. Competition is resolved through a compensatory behavior that involves producing a more accessible alternative: in English, the bare form. We formalize this hypothesis within a probabilistic model that trades off context-dependent versus independent processing. Results show an over-reliance on preceding stem contexts when retrieving the inflection in a model that has difficulty with processing novel inflected forms. We further show that following the introduction of a bias to store and retrieve forms with preceding contexts, generalization in the typically developing (TD) models remains more or less stable, while the same bias in the DLD models exaggerates difficulties with generalization. Together, the results suggest that inconsistent use of inflectional morphemes by children with DLD could stem from inferences they make on the basis of data containing fewer novel inflected forms. Our account extends these findings to suggest that problems with detecting a form in novel contexts combined with a bias to rely on familiar contexts when retrieving a form could explain sequential planning difficulties in children with DLD. RESEARCH HIGHLIGHTS: Generalization difficulties with inflectional morphemes in children with Developmental Language Disorder arise from these children's limited experience with novel inflected forms. Limited experience with a form in novel contexts could lead to a storage bias where retrieving a form often requires relying on familiar preceding stems. 
While generalization in typically developing models remains stable across a range of model parameters, certain parameter values in the impaired models exaggerate difficulties with generalization. Children with DLD compensate for these retrieval difficulties through accessibility-driven language production: they produce the most accessible form among the alternatives.
Subject(s)
Language Development Disorders, Child, Humans, Language, Language Tests
ABSTRACT
At birth, infants discriminate most of the sounds of the world's languages, but by age 1, infants become language-specific listeners. This has generally been taken as evidence that infants have learned which acoustic dimensions are contrastive, or useful for distinguishing among the sounds of their language(s), and have begun focusing primarily on those dimensions when perceiving speech. However, speech is highly variable, with different sounds overlapping substantially in their acoustics, and after decades of research, we still do not know what aspects of the speech signal allow infants to differentiate contrastive from noncontrastive dimensions. Here we show that infants could learn which acoustic dimensions of their language are contrastive, despite the high acoustic variability. Our account is based on the cross-linguistic fact that even sounds that overlap in their acoustics differ in the contexts they occur in. We predict that this should leave a signal that infants can pick up on and show that acoustic distributions indeed vary more by context along contrastive dimensions compared with noncontrastive dimensions. By establishing this difference, we provide a potential answer to how infants learn about sound contrasts, a question whose answer in natural learning environments has remained elusive.
Subject(s)
Language Development, Speech Perception, Speech, Humans, Infant, Learning
ABSTRACT
Iterated learning models of language evolution have typically been used to study the emergence of language, rather than historical language change. We use iterated learning models to investigate historical change in the accent classes of two Korean dialects. Simulations reveal that many of the patterns of historical change can be explained as resulting from successive generations of phonotactic learning. Comparisons between different iterated learning models also suggest that Korean learners' phonotactic generalizations are guided by storage of entire syllable-sized units, and provide evidence that perceptual confusions between different forms substantially impacted historical change. This suggests that in addition to accounting for the evolution of broad general characteristics of language, iterated learning models can also provide insight into more detailed patterns of historical language change.
Subject(s)
Language, Learning, Psychological Generalization, Humans, Language Development
ABSTRACT
Learning in any domain depends on how the data for learning are represented. In the domain of language acquisition, children's representations of the speech they hear determine what generalizations they can draw about their target grammar. But these input representations change over development as a function of children's developing linguistic knowledge, and may be incomplete or inaccurate when children lack the knowledge to parse their input veridically. How does learning succeed in the face of potentially misleading data? We address this issue using the case study of "non-basic" clauses in verb learning. A young infant hearing What did Amy fix? might not recognize that what stands in for the direct object of fix, and might think that fix is occurring without a direct object. We follow a previous proposal that children might filter nonbasic clauses out of the data for learning verb argument structure, but offer a new approach. Instead of assuming that children identify the data to filter in advance, we demonstrate computationally that it is possible for learners to infer a filter on their input without knowing which clauses are nonbasic. We instantiate a learner that considers the possibility that it misparses some of the sentences it hears, and learns to filter out those parsing errors in order to correctly infer transitivity for the majority of 50 frequent verbs in child-directed speech. Our learner offers a novel solution to the problem of learning from immature input representations: Learners may be able to avoid drawing faulty inferences from misleading data by identifying a filter on their input, without knowing in advance what needs to be filtered.
Subject(s)
Language Development, Speech, Humans, Infant, Language, Linguistics, Verbal Learning
ABSTRACT
We incorporate social reasoning about groups of informants into a model of word learning, and show that the model accounts for infant looking behavior in tasks of both word learning and recognition. Simulation 1 models an experiment where 16-month-old infants saw familiar objects labeled either correctly or incorrectly, by either adults or audio talkers. Simulation 2 reinterprets puzzling data from the Switch task, an audiovisual habituation procedure wherein infants are tested on familiarized associations between novel objects and labels. Eight-month-olds outperform 14-month-olds on the Switch task when required to distinguish labels that are minimal pairs (e.g., "buk" and "puk"), but 14-month-olds' performance is improved by habituation stimuli featuring multiple talkers. Our modeling results support the hypothesis that beliefs about knowledgeability and group membership guide infant looking behavior in both tasks. These results show that social and linguistic development interact in non-trivial ways, and that social categorization findings in developmental psychology could have substantial implications for understanding linguistic development in realistic settings where talkers vary according to observable features correlated with social groupings, including linguistic, ethnic, and gendered groups.
ABSTRACT
Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than nonnative ones. For example, between 6 to 8 mo and 10 to 12 mo, infants learning American English get better at distinguishing English [ɹ] and [l], as in "rock" vs. "lock," relative to infants learning Japanese. Influential accounts of this early phonetic learning phenomenon initially proposed that infants group sounds into native vowel- and consonant-like phonetic categories (like [ɹ] and [l] in English) through a statistical clustering mechanism dubbed "distributional learning." The feasibility of this mechanism for learning phonetic categories has been challenged, however. Here, we demonstrate that a distributional learning algorithm operating on naturalistic speech can predict early phonetic learning, as observed in Japanese and American English infants, suggesting that infants might learn through distributional learning after all. We further show, however, that, contrary to the original distributional learning proposal, our model learns units too brief and too fine-grained acoustically to correspond to phonetic categories. This challenges the influential idea that what infants learn are phonetic categories. More broadly, our work introduces a mechanism-driven approach to the study of early phonetic learning, together with a quantitative modeling framework that can handle realistic input. This allows accounts of early phonetic learning to be linked to concrete, systematic predictions regarding infants' attunement.
Subject(s)
Language Development, Neurological Models, Natural Language Processing, Phonetics, Humans, Speech Perception, Speech Recognition Software
ABSTRACT
Early changes in infants' ability to perceive native and nonnative speech sound contrasts are typically attributed to their developing knowledge of phonetic categories. We critically examine this hypothesis and argue that there is little direct evidence of category knowledge in infancy. We then propose an alternative account in which infants' perception changes because they are learning a perceptual space that is appropriate to represent speech, without yet carving up that space into phonetic categories. If correct, this new account has substantial implications for understanding early language development.
ABSTRACT
Infants learn about the sounds of their language and adults process the sounds they hear, even though sound categories often overlap in their acoustics. Researchers have suggested that listeners rely on context for these tasks, and have proposed two main ways that context could be helpful: top-down information accounts, which argue that listeners use context to predict which sound will be produced, and normalization accounts, which argue that listeners compensate for the fact that the same sound is produced differently in different contexts by factoring out this systematic context-dependent variability from the acoustics. These ideas have been somewhat conflated in past research, and have rarely been tested on naturalistic speech. We implement top-down and normalization accounts separately and evaluate their relative efficacy on spontaneous speech, using the test case of Japanese vowels. We find that top-down information strategies are effective even on spontaneous speech. Surprisingly, we find that at least one common implementation of normalization is ineffective on spontaneous speech, in contrast to what has been found on lab speech. We provide analyses showing that when there are systematic regularities in which contexts different sounds occur in (which are common in naturalistic speech, but generally controlled for in lab speech), normalization can actually increase category overlap rather than decrease it. This work calls into question the usefulness of normalization in naturalistic listening tasks, and highlights the importance of applying ideas from carefully controlled lab speech to naturalistic, spontaneous speech.
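The claim that normalization can increase category overlap when sounds and contexts are correlated can be illustrated with a toy sketch. It assumes normalization means subtracting each context's mean acoustic value, which is only one possible implementation and not necessarily the one evaluated in the study. All token values, the category labels "a" and "b", the context labels "X" and "Y", and both function names are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

def normalize_by_context(tokens):
    """Subtract each context's mean acoustic value from the tokens in that context.

    tokens: list of (category, context, value) triples.
    """
    values_by_context = defaultdict(list)
    for _, context, value in tokens:
        values_by_context[context].append(value)
    context_mean = {c: mean(v) for c, v in values_by_context.items()}
    return [(cat, ctx, val - context_mean[ctx]) for cat, ctx, val in tokens]

def category_separation(tokens, cat_a, cat_b):
    """Absolute distance between the mean values of two categories."""
    a = [v for c, _, v in tokens if c == cat_a]
    b = [v for c, _, v in tokens if c == cat_b]
    return abs(mean(a) - mean(b))

# Invented toy data: category "a" occurs only in context "X", "b" only in "Y",
# so context and category are perfectly confounded.
confounded = [
    ("a", "X", -0.5), ("a", "X", 0.0), ("a", "X", 0.5),
    ("b", "Y", 1.5), ("b", "Y", 2.0), ("b", "Y", 2.5),
]

# Invented toy data: both categories occur equally often in both contexts,
# and context "Y" shifts every value upward by 1.
balanced = [
    ("a", "X", 0.0), ("a", "Y", 1.0),
    ("b", "X", 2.0), ("b", "Y", 3.0),
]

for name, data in [("confounded", confounded), ("balanced", balanced)]:
    before = category_separation(data, "a", "b")
    after = category_separation(normalize_by_context(data), "a", "b")
    print(f"{name}: separation before={before}, after={after}")
```

In the balanced toy set the category means stay equally far apart after normalization while the context shift is removed. In the confounded set the context means absorb the category difference, so the two categories collapse onto each other.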
Subject(s)
Language, Learning, Speech Acoustics, Speech Perception, Humans, Phonetics, Psychological Theory, Speech
ABSTRACT
It is generally accepted that infants initially discriminate native and non-native contrasts and that perceptual reorganization within the first year of life results in decreased discrimination of non-native contrasts, and improved discrimination of native contrasts. However, recent findings from Narayan, Werker, and Beddor (2010) surprisingly suggested that some acoustically subtle native-language contrasts might not be discriminated until the end of the first year of life. We first provide countervailing evidence that young English-learning infants can discriminate the Filipino contrast tested by Narayan et al. when tested in a more sensitive paradigm. Next, we show that young infants learning either English or French can also discriminate comparably subtle non-native contrasts from Tamil. These findings show that Narayan et al.'s null findings were due to methodological choices and indicate that young infants are sensitive to even subtle acoustic contrasts that cue phonetic distinctions cross-linguistically. Based on experimental results and acoustic analyses, we argue that instead of specific acoustic metrics, infant discrimination results themselves are the most informative about the salience of phonetic distinctions.
Subject(s)
Discrimination Learning, Phonetics, Speech Perception, Acoustic Stimulation, Female, Humans, Infant, Language Development, Male, Speech Acoustics
ABSTRACT
Children acquiring languages with noun classes (grammatical gender) have ample statistical information available that characterizes the distribution of nouns into these classes, but their use of this information to classify novel nouns differs from the predictions made by an optimal Bayesian classifier. We use rational analysis to investigate the hypothesis that children are classifying nouns optimally with respect to a distribution that does not match the surface distribution of statistical features in their input. We propose three ways in which children's apparent statistical insensitivity might arise, and find that all three provide ways to account for the difference between children's behavior and the optimal classifier. A fourth model combines two of these proposals and finds that children's insensitivity is best modeled as a bias to ignore certain features during classification, rather than an inability to encode those features during learning. These results provide insight into children's developing knowledge of noun classes and highlight the complex ways in which statistical information from the input interacts with children's learning processes.
Subject(s)
Language Development, Language, Theoretical Models, Bayes Theorem, Child, Female, Humans, Male, Probability Learning, Vocabulary
ABSTRACT
To attain native-like competence, second language (L2) learners must establish mappings between familiar speech sounds and new phoneme categories. For example, Spanish learners of English must learn that [d] and [ð], which are allophones of the same phoneme in Spanish, can distinguish meaning in English (i.e., /deɪ/ "day" and /ðeɪ/ "they"). Because adult listeners are less sensitive to allophonic than phonemic contrasts in their native language (L1), novel target language contrasts between L1 allophones may pose special difficulty for L2 learners. We investigate whether advanced Spanish late-learners of English overcome native language mappings to establish new phonological relations between familiar phones. We report behavioral and magnetoencephalographic (MEG) evidence from two experiments that measured the sensitivity and pre-attentive processing of three listener groups (L1 English, L1 Spanish, and advanced Spanish late-learners of English) to differences between three nonword stimulus pairs ([idi]-[iði], [idi]-[iɾi], and [iði]-[iɾi]) which differ in phones that play a different functional role in Spanish and English. Spanish and English listeners demonstrated greater sensitivity (larger d' scores) for nonword pairs distinguished by phonemic than by allophonic contrasts, mirroring previous findings. Spanish late-learners demonstrated sensitivity (large d' scores and MMN responses) to all three contrasts, suggesting that these L2 learners may have established a novel [d]-[ð] contrast despite the phonological relatedness of these sounds in the L1. Our results suggest that phonological relatedness influences perceived similarity, as evidenced by the results of the native speaker groups, but may not cause persistent difficulty for advanced L2 learners. Instead, L2 learners are able to use cues that are present in their input to establish new mappings between familiar phones.
ABSTRACT
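For readers unfamiliar with the d' scores mentioned above, here is a minimal sketch of the textbook signal detection definition, d' = z(hit rate) - z(false-alarm rate). The abstract does not say how d' was computed for the discrimination task in the study, so the function name `d_prime` and the example rates are illustrative assumptions only.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at exactly 0 or 1.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical rates: equal hit and false-alarm rates give no sensitivity,
# while a high hit rate with a low false-alarm rate gives a large d'.
print(d_prime(0.5, 0.5))
print(d_prime(0.9, 0.1))
```

A larger d' means the listener separates "different" from "same" trials more reliably, independent of any overall bias toward one response.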
Categorical effects are found across speech sound categories, with the degree of these effects ranging from extremely strong categorical perception in consonants to nearly continuous perception in vowels. We show that both strong and weak categorical effects can be captured by a unified model. We treat speech perception as a statistical inference problem, assuming that listeners use their knowledge of categories as well as the acoustics of the signal to infer the intended productions of the speaker. Simulations show that the model provides close fits to empirical data, unifying past findings of categorical effects in consonants and vowels and capturing differences in the degree of categorical effects through a single parameter.
Subject(s)
Phonetics, Psycholinguistics, Speech Perception/physiology, Humans
ABSTRACT
Infant-directed speech (IDS) has distinctive properties that differ from adult-directed speech (ADS). Why it has these properties, and whether they are intended to facilitate language learning, is a matter of contention. We argue that much of this disagreement stems from lack of a formal, guiding theory of how phonetic categories should best be taught to infantlike learners. In the absence of such a theory, researchers have relied on intuitions about learning to guide the argument. We use a formal theory of teaching, validated through experiments in other domains, as the basis for a detailed analysis of whether IDS is well designed for teaching phonetic categories. Using the theory, we generate ideal data for teaching phonetic categories in English. We qualitatively compare the simulated teaching data with human IDS, finding that the teaching data exhibit many features of IDS, including some that have been taken as evidence IDS is not for teaching. The simulated data reveal potential pitfalls for experimentalists exploring the role of IDS in language learning. Focusing on different formants and phoneme sets leads to different conclusions, and the benefit of the teaching data to learners is not apparent until a sufficient number of examples have been provided. Finally, we investigate transfer of IDS to learning ADS. The teaching data improve classification of ADS data but only for the learner they were generated to teach, not universally across all classes of learners. This research offers a theoretically grounded framework that empowers experimentalists to systematically evaluate whether IDS is for teaching.
Subject(s)
Child Development, Language Development, Learning, Speech, Teaching, Adult, Humans, Infant, Phonetics
ABSTRACT
Infants segment words from fluent speech during the same period when they are learning phonetic categories, yet accounts of phonetic category acquisition typically ignore information about the words in which sounds appear. We use a Bayesian model to illustrate how feedback from segmented words might constrain phonetic category learning by providing information about which sounds occur together in words. Simulations demonstrate that word-level information can successfully disambiguate overlapping English vowel categories. Learning patterns in the model are shown to parallel human behavior from artificial language learning tasks. These findings point to a central role for the developing lexicon in phonetic category acquisition and provide a framework for incorporating top-down constraints into models of category learning.
Subject(s)
Computer Simulation, Concept Formation/physiology, Language Development, Learning/physiology, Bayes Theorem, Humans, Phonetics, Vocabulary
ABSTRACT
Infants begin to segment words from fluent speech during the same time period that they learn phonetic categories. Segmented words can provide a potentially useful cue for phonetic learning, yet accounts of phonetic category acquisition typically ignore the contexts in which sounds appear. We present two experiments to show that, contrary to the assumption that phonetic learning occurs in isolation, learners are sensitive to the words in which sounds appear and can use this information to constrain their interpretation of phonetic variability. Experiment 1 shows that adults use word-level information in a phonetic category learning task, assigning acoustically similar vowels to different categories more often when those sounds consistently appear in different words. Experiment 2 demonstrates that 8-month-old infants similarly pay attention to word-level information and that this information affects how they treat phonetic contrasts. These findings suggest that phonetic category learning is a rich, interactive process that takes advantage of many different types of cues that are present in the input.
Subject(s)
Learning/physiology, Phonetics, Acoustic Stimulation, Adult, Cues, Psychological Discrimination, Female, Humans, Infant, Language Development, Male, Vocabulary
ABSTRACT
Probabilistic models have recently received much attention as accounts of human cognition. However, most research in which probabilistic models have been used has been focused on formulating the abstract problems behind cognitive tasks and their optimal solutions, rather than on mechanisms that could implement these solutions. Exemplar models are a successful class of psychological process models in which an inventory of stored examples is used to solve problems such as identification, categorization, and function learning. We show that exemplar models can be used to perform a sophisticated form of Monte Carlo approximation known as importance sampling and thus provide a way to perform approximate Bayesian inference. Simulations of Bayesian inference in speech perception, generalization along a single dimension, making predictions about everyday events, concept learning, and reconstruction from memory show that exemplar models can often account for human performance with only a few exemplars, for both simple and relatively complex prior distributions. These results suggest that exemplar models provide a possible mechanism for implementing at least some forms of Bayesian inference.
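The link between exemplar models and importance sampling can be sketched as follows. Stored exemplars are treated as samples from the prior, and each is weighted by how well it explains a noisy observation. The Gaussian prior, the Gaussian noise model, the parameter values, and the function name `exemplar_posterior_mean` are assumptions made for this illustration and are not taken from the paper's simulations.

```python
import math
import random

def exemplar_posterior_mean(exemplars, observation, noise_sd):
    """Approximate E[target | observation] with likelihood-weighted exemplars.

    Each stored exemplar is treated as a sample from the prior and weighted
    by the Gaussian likelihood of the observation given that exemplar
    (self-normalized importance sampling).
    """
    weights = [
        math.exp(-0.5 * ((observation - x) / noise_sd) ** 2) for x in exemplars
    ]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, exemplars)) / total

# Assumed setup: standard normal prior over targets, unit-variance noise.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
exemplars = [rng.gauss(0.0, 1.0) for _ in range(5000)]

estimate = exemplar_posterior_mean(exemplars, observation=1.0, noise_sd=1.0)
# For this Gaussian setup the exact posterior mean is 0.5, so the printed
# estimate should land close to that value.
print(estimate)
```

Because the weights are similarity-like functions of the distance between the observation and each exemplar, the computation has the same form as a standard exemplar-model response rule.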
Subject(s)
Bayes Theorem, Cognition, Statistical Models, Perception, Attention, Concept Formation, Forecasting, Psychological Generalization, Humans, Memory, Monte Carlo Method, Normal Distribution, Visual Pattern Recognition, Psychological Recognition, Speech Perception
ABSTRACT
A variety of studies have demonstrated that organizing stimuli into categories can affect the way the stimuli are perceived. We explore the influence of categories on perception through one such phenomenon, the perceptual magnet effect, in which discriminability between vowels is reduced near prototypical vowel sounds. We present a Bayesian model to explain why this reduced discriminability might occur: It arises as a consequence of optimally solving the statistical problem of perception in noise. In the optimal solution to this problem, listeners' perception is biased toward phonetic category means because they use knowledge of these categories to guide their inferences about speakers' target productions. Simulations show that model predictions closely correspond to previously published human data, and novel experimental results provide evidence for the predicted link between perceptual warping and noise. The model unifies several previous accounts of the perceptual magnet effect and provides a framework for exploring categorical effects in other domains.
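The bias toward category means can be sketched for the simplest case. The sketch assumes a single Gaussian phonetic category and Gaussian speech-signal noise, in which case the optimal estimate of the speaker's target is a variance-weighted average of the stimulus and the category mean. The abstract does not give the model's equations, so this single-category form, the function name `perceived_target`, and the numeric values are illustrative assumptions.

```python
def perceived_target(stimulus, category_mean, category_var, noise_var):
    """Posterior mean of the speaker's target under one Gaussian category.

    The estimate is a weighted average: the stimulus is weighted by the
    category variance and the category mean by the noise variance, so more
    noise pulls perception more strongly toward the category mean.
    """
    return (category_var * stimulus + noise_var * category_mean) / (
        category_var + noise_var
    )

# Hypothetical values on an arbitrary acoustic scale.
for noise_var in (0.0, 1.0, 3.0):
    print(noise_var, perceived_target(2.0, 0.0, 1.0, noise_var))
```

Under these assumptions, two stimuli separated by a distance d are perceived as separated by d * category_var / (category_var + noise_var). That is one way to see why discriminability shrinks near the category mean as noise grows.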