Results 1 - 20 of 75
1.
Proc Natl Acad Sci U S A ; 118(12), 2021 03 23.
Article in English | MEDLINE | ID: mdl-33723064

ABSTRACT

Words categorize the semantic fields they refer to in ways that maximize communication accuracy while minimizing complexity. Focusing on the well-studied color domain, we show that artificial neural networks trained with deep-learning techniques to play a discrimination game develop communication systems whose distribution on the accuracy/complexity plane closely matches that of human languages. The observed variation among emergent color-naming systems is explained by different degrees of discriminative need, of the sort that might also characterize different human communities. Like human languages, emergent systems show a preference for relatively low-complexity solutions, even at the cost of imperfect communication. We demonstrate next that the nature of the emergent systems crucially depends on communication being discrete (as is human word usage). When continuous message passing is allowed, emergent systems become more complex and eventually less efficient. Our study suggests that efficient semantic categorization is a general property of discrete communication systems, not limited to human language. It suggests moreover that it is exactly the discrete nature of such systems that, acting as a bottleneck, pushes them toward low complexity and optimal efficiency.

2.
Proc Natl Acad Sci U S A ; 118(7), 2021 02 09.
Article in English | MEDLINE | ID: mdl-33510040

ABSTRACT

Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than nonnative ones. For example, between 6 to 8 mo and 10 to 12 mo, infants learning American English get better at distinguishing English [ɹ] and [l], as in "rock" vs. "lock," relative to infants learning Japanese. Influential accounts of this early phonetic learning phenomenon initially proposed that infants group sounds into native vowel- and consonant-like phonetic categories, like [ɹ] and [l] in English, through a statistical clustering mechanism dubbed "distributional learning." The feasibility of this mechanism for learning phonetic categories has been challenged, however. Here, we demonstrate that a distributional learning algorithm operating on naturalistic speech can predict early phonetic learning, as observed in Japanese and American English infants, suggesting that infants might learn through distributional learning after all. We further show, however, that, contrary to the original distributional learning proposal, our model learns units too brief and too fine-grained acoustically to correspond to phonetic categories. This challenges the influential idea that what infants learn are phonetic categories. More broadly, our work introduces a mechanism-driven approach to the study of early phonetic learning, together with a quantitative modeling framework that can handle realistic input. This allows accounts of early phonetic learning to be linked to concrete, systematic predictions regarding infants' attunement.


Subjects
Language Development , Neurological Models , Natural Language Processing , Phonetics , Humans , Speech Perception , Speech Recognition Software
3.
J Child Lang ; 50(6): 1294-1317, 2023 11.
Article in English | MEDLINE | ID: mdl-37246513

ABSTRACT

There is a current 'theory crisis' in language acquisition research, resulting from fragmentation both at the level of the approaches and the linguistic level studied. We identify a need for integrative approaches that go beyond these limitations, and propose to analyse the strengths and weaknesses of current theoretical approaches to language acquisition. In particular, we advocate that language learning simulations, if they integrate realistic input and multiple levels of language, have the potential to contribute significantly to our understanding of language acquisition. We then review recent results obtained through such language learning simulations. Finally, we propose some guidelines for the community to build better simulations.


Subjects
Language Development , Learning , Humans , Language , Linguistics
4.
Behav Res Methods ; 55(8): 4489-4501, 2023 12.
Article in English | MEDLINE | ID: mdl-36750521

ABSTRACT

We introduce Shennong, a Python toolbox and command-line utility for audio speech features extraction. It implements a wide range of well-established state-of-the-art algorithms: spectro-temporal filters such as Mel-Frequency Cepstral Filterbank or Predictive Linear Filters, pre-trained neural networks, pitch estimators, speaker normalization methods, and post-processing algorithms. Shennong is an open source, reliable and extensible framework built on top of the popular Kaldi speech processing library. The Python implementation makes it easy to use by non-technical users and integrates with third-party speech modeling and machine learning tools from the Python ecosystem. This paper describes the Shennong software architecture, its core components, and implemented algorithms. Then, three applications illustrate its use. We first present a benchmark of speech features extraction algorithms available in Shennong on a phone discrimination task. We then analyze the performances of a speaker normalization model as a function of the speech duration used for training. We finally compare pitch estimation algorithms on speech under various noise conditions.


Subjects
Ecosystem , Speech , Humans , Algorithms , Software , Computer Neural Networks
5.
J Acoust Soc Am ; 150(1): 353, 2021 07.
Article in English | MEDLINE | ID: mdl-34340514

ABSTRACT

Deep learning models have become potential candidates for auditory neuroscience research, thanks to their recent successes in a variety of auditory tasks, yet these models often lack the interpretability needed to fully understand the exact computations they perform. Here, we proposed a parametrized neural network layer, which computes specific spectro-temporal modulations based on Gabor filters [learnable spectro-temporal filters (STRFs)] and is fully interpretable. We evaluated this layer on speech activity detection, speaker verification, urban sound classification, and zebra finch call type classification. We found that models based on learnable STRFs are on par with the state of the art for all tasks and obtain the best performance for speech activity detection. As this layer remains a Gabor filter, it is fully interpretable. Thus, we used quantitative measures to describe the distribution of the learned spectro-temporal modulations. Filters adapted to each task and focused mostly on low temporal and spectral modulations. The analyses show that the filters learned on human speech have spectro-temporal parameters similar to those measured directly in the human auditory cortex. Finally, we observed that the tasks were organized in a meaningful way: the human vocalization tasks were closer to each other, and bird vocalizations were far from human vocalizations and urban sound tasks.


Subjects
Auditory Cortex , Speech Perception , Acoustic Stimulation , Auditory Perception , Computer Neural Networks
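
Illustrative note on entry 5: the abstract describes a layer built from Gabor filters over time and frequency. As a rough sketch only, a real-valued two-dimensional Gabor kernel can be written as a Gaussian envelope multiplied by a sinusoidal carrier. The function name `gabor_strf`, the parameter names, and their units are assumptions made for this example; they are not taken from the paper and do not reproduce its parametrization.

```python
import math


def gabor_strf(n_time, n_freq, sigma_t, sigma_f, rate, scale):
    """Return an n_freq x n_time real Gabor kernel as nested lists.

    rate  : temporal modulation, in cycles per time step (assumed unit)
    scale : spectral modulation, in cycles per frequency bin (assumed unit)
    """
    t0 = (n_time - 1) / 2.0  # centre of the time axis
    f0 = (n_freq - 1) / 2.0  # centre of the frequency axis
    kernel = []
    for fi in range(n_freq):
        row = []
        for ti in range(n_time):
            t = ti - t0
            f = fi - f0
            # Gaussian envelope localises the filter in time and frequency.
            envelope = math.exp(-(t * t) / (2 * sigma_t ** 2)
                                - (f * f) / (2 * sigma_f ** 2))
            # Sinusoidal carrier selects one spectro-temporal modulation.
            carrier = math.cos(2 * math.pi * (rate * t + scale * f))
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Hypothetical parameter values, chosen only to show the call.
kernel = gabor_strf(n_time=9, n_freq=5, sigma_t=2.0, sigma_f=1.0,
                    rate=0.1, scale=0.0)
```

Because every value of such a kernel follows from a handful of parameters, the parameters themselves can be inspected after training, which is the sense of interpretability the abstract refers to.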
6.
Behav Res Methods ; 52(1): 264-278, 2020 02.
Article in English | MEDLINE | ID: mdl-30937845

ABSTRACT

A basic task in first language acquisition likely involves discovering the boundaries between words or morphemes in input where these basic units are not overtly segmented. A number of unsupervised learning algorithms have been proposed in the last 20 years for these purposes, some of which have been implemented computationally, but whose results remain difficult to compare across papers. We created a tool that is open source, enables reproducible results, and encourages cumulative science in this domain. WordSeg has a modular architecture: It combines a set of corpora description routines, multiple algorithms varying in complexity and cognitive assumptions (including several that were not publicly available, or insufficiently documented), and a rich evaluation package. In the paper, we illustrate the use of this package by analyzing a corpus of child-directed speech in various ways, which further allows us to make recommendations for experimental design of follow-up work. Supplementary materials allow readers to reproduce every result in this paper, and detailed online instructions further enable them to go beyond what we have done. Moreover, the system can be installed within container software that ensures a stable and reliable environment. Finally, by virtue of its modular architecture and transparency, WordSeg can work as an open-source platform, to which other researchers can add their own segmentation algorithms.


Subjects
Speech , Algorithms , Humans , Language Development , Software
7.
Child Dev ; 90(3): 759-773, 2019 05.
Article in English | MEDLINE | ID: mdl-29094348

ABSTRACT

This article provides an estimation of how frequently, and from whom, children aged 0-11 years (Ns between 9 and 24) receive one-on-one verbal input among Tsimane forager-horticulturalists of lowland Bolivia. Analyses of systematic daytime behavioral observations reveal < 1 min per daylight hour is spent talking to children younger than 4 years of age, which is 4 times less than estimates for others present at the same time and place. Adults provide a majority of the input at 0-3 years of age but not afterward. When integrated with previous work, these results reveal large cross-cultural variation in the linguistic experiences provided to young children. Consideration of more diverse human populations is necessary to build generalizable theories of language acquisition.


Subjects
Farmers , South American Indians/ethnology , Interpersonal Relations , Verbal Behavior , Adult , Bolivia/ethnology , Child , Preschool Child , Female , Humans , Infant , Male
8.
J Acoust Soc Am ; 143(5): EL372, 2018 05.
Article in English | MEDLINE | ID: mdl-29857692

ABSTRACT

Theories of cross-linguistic phonetic category perception posit that listeners perceive foreign sounds by mapping them onto their native phonetic categories, but, until now, no way to effectively implement this mapping has been proposed. In this paper, Automatic Speech Recognition systems trained on continuous speech corpora are used to provide a fully specified mapping between foreign sounds and native categories. The authors show how the machine ABX evaluation method can be used to compare predictions from the resulting quantitative models with empirically attested effects in human cross-linguistic phonetic category perception.


Subjects
Language , Computer Neural Networks , Phonetics , Speech Perception , Speech Recognition Software/classification , Humans , Speech Perception/physiology
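
Illustrative note on entry 8: the abstract mentions the machine ABX evaluation method. The general idea of an ABX discrimination score can be sketched as follows: given tokens from two categories, count how often a token X is closer to another token A of its own category than to a token B of the other category. This is a simplified sketch under stated assumptions. It uses fixed-length vectors and Euclidean distance, whereas real speech tokens are variable-length sequences and need a sequence-level distance, and the names `euclidean` and `abx_score` are hypothetical.

```python
import math
from itertools import product


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def abx_score(cat_a, cat_b, distance=euclidean):
    """Fraction of (A, B, X) triplets in which X is closer to A than to B.

    A and X are distinct tokens of cat_a, B is a token of cat_b.
    Ties count as half correct. Assumes cat_a has at least two tokens
    and cat_b at least one.
    """
    correct = 0.0
    total = 0
    for (i, a), (j, x) in product(enumerate(cat_a), repeat=2):
        if i == j:
            continue  # A and X must be different tokens
        for b in cat_b:
            d_ax = distance(a, x)
            d_bx = distance(b, x)
            if d_ax < d_bx:
                correct += 1.0
            elif d_ax == d_bx:
                correct += 0.5
            total += 1
    return correct / total


# Toy data: two well-separated categories, then two interleaved ones.
well_separated = abx_score([(0.0, 0.0), (0.1, 0.0)], [(1.0, 1.0)])
interleaved = abx_score([(0.0, 0.0), (2.0, 0.0)], [(1.0, 0.0)])
```

A score near 1 means the representation keeps the two categories apart, and a score near 0.5 means it does not distinguish them. The same score can be computed for a model and compared against human discrimination data.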
9.
Dev Psychobiol ; 59(5): 603-612, 2017 07.
Article in English | MEDLINE | ID: mdl-28561883

ABSTRACT

A central assumption in the perceptual attunement literature holds that exposure to a speech sound contrast leads to improvement in native speech sound processing. However, whether the amount of exposure matters for this process has not been put to a direct test. We elucidated indicators of frequency-dependent perceptual attunement by comparing 5-8-month-old Dutch infants' discrimination of tokens containing a highly frequent [hɪt-he:t] and a highly infrequent [hʏt-hø:t] native vowel contrast as well as a non-native [hɛt-haet] vowel contrast in a behavioral visual habituation paradigm (Experiment 1). Infants discriminated both native contrasts similarly well, but did not discriminate the non-native contrast. We sought further evidence for subtle differences in the processing of the two native contrasts using near-infrared spectroscopy and a within-participant design (Experiment 2). The neuroimaging data did not provide additional evidence that responses to native contrasts are modulated by frequency of exposure. These results suggest that even large differences in exposure to a native contrast may not directly translate to behavioral and neural indicators of perceptual attunement, raising the possibility that frequency of exposure does not influence improvements in discriminating native contrasts.


Subjects
Brain/physiology , Psychological Discrimination/physiology , Language Development , Speech Perception/physiology , Brain/diagnostic imaging , Female , Functional Neuroimaging , Humans , Infant , Male , Near-Infrared Spectroscopy
10.
J Acoust Soc Am ; 142(2): EL211, 2017 08.
Article in English | MEDLINE | ID: mdl-28863560

ABSTRACT

This study aims to quantify the relative contributions of phonetic categories and acoustic detail on phonotactically induced perceptual vowel epenthesis in Japanese listeners. A vowel identification task tested whether a vowel was perceived within illegal consonant clusters and, if so, which vowel was heard. Cross-spliced stimuli were used in which vowel coarticulation present in the cluster did not match the quality of the flanking vowel. Two clusters were used, /hp/ and /kp/, the former containing larger amounts of resonances of the preceding vowel. While both flanking vowel and coarticulation influenced vowel quality, the influence of coarticulation was larger, especially for /hp/.


Subjects
Phonetics , Speech Acoustics , Speech Perception , Voice Quality , Acoustic Stimulation , Adult , Female , Humans , Male , Psychological Recognition , Young Adult
11.
Cogn Neuropsychol ; 33(5-6): 343-51, 2016.
Article in English | MEDLINE | ID: mdl-27593456

ABSTRACT

Pointing is a communicative gesture that allows individuals to share information about surrounding objects with other humans. Patients with heterotopagnosia are specifically impaired in pointing to other humans' body parts but not in pointing to themselves or to objects. Here, we describe a female patient with heterotopagnosia who was more accurate in pointing to men's body parts than to women's body parts. We replicated this gender effect in healthy participants with faster reaction times for pointing to men's body parts than to women's body parts. We discuss the role of gender stereotypes in explaining why it is more difficult to point to women than to men.


Subjects
Primary Progressive Aphasia/physiopathology , Communication , Fingers/physiology , Sex , Aged 80 and over , Female , Humans , Male , Reaction Time , Sex Factors
12.
J Acoust Soc Am ; 140(1): EL1, 2016 07.
Article in English | MEDLINE | ID: mdl-27475196

ABSTRACT

This study aims to quantify the role of prosodic boundaries in early language acquisition using a computational modeling approach. A spoken term discovery system that models early word learning was used with and without a prosodic component on speech corpora of English, Spanish, and Japanese. The results showed that prosodic information induces a consistent improvement both in the alignment of the terms to actual word boundaries and in the phonemic homogeneity of the discovered clusters of terms. This benefit was found also when automatically discovered prosodic boundaries were used, boundaries which did not perfectly match the linguistically defined ones.


Subjects
Child Language , Computer Simulation , Verbal Learning , Algorithms , Child , Humans , Language , Speech , Speech Perception
13.
J Acoust Soc Am ; 140(2): 1239, 2016 08.
Article in English | MEDLINE | ID: mdl-27586752

ABSTRACT

This study explores the long-standing hypothesis that the acoustic cues to prosodic boundaries in infant-directed speech (IDS) make those boundaries easier to learn than those in adult-directed speech (ADS). Three cues (pause duration, nucleus duration, and pitch change) were investigated, by means of a systematic review of the literature, statistical analyses of a corpus of Japanese, and machine learning experiments. The review of previous work revealed that the effect of register on boundary cues is less well established than previously thought, and that results often vary across studies for certain cues. Statistical analyses run on a large database of mother-child and mother-interviewer interactions showed that the duration of a pause and the duration of the syllable nucleus preceding the boundary are two cues which are enhanced in IDS, while f0 change is actually degraded in IDS. Supervised and unsupervised machine learning techniques applied to these acoustic cues revealed that IDS boundaries were consistently better classified than ADS ones, regardless of the learning method used. The role of the cues examined in this study and the importance of these findings in the more general context of early linguistic structure acquisition are discussed.


Subjects
Child Language , Cues , Age Factors , Female , Humans , Infant , Mothers , Speech , Speech Acoustics , Speech Perception , Supervised Machine Learning , Unsupervised Machine Learning
14.
Psychol Sci ; 26(3): 341-7, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25630443

ABSTRACT

Infants learn language at an incredible speed, and one of the first steps in this voyage is learning the basic sound units of their native languages. It is widely thought that caregivers facilitate this task by hyperarticulating when speaking to their infants. Using state-of-the-art speech technology, we addressed this key theoretical question: Are sound categories clearer in infant-directed speech than in adult-directed speech? A comprehensive examination of sound contrasts in a large corpus of recorded, spontaneous Japanese speech demonstrates that there is a small but significant tendency for contrasts in infant-directed speech to be less clear than those in adult-directed speech. This finding runs contrary to the idea that caregivers actively enhance phonetic categories in infant-directed speech. These results suggest that to be plausible, theories of infants' language acquisition must posit an ability to learn from noisy data.


Subjects
Mother-Child Relations , Speech Perception , Female , Humans , Infant , Japan , Language Development , Mothers , Phonetics , Speech Acoustics
15.
Dev Sci ; 17(4): 628-35, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24628942

ABSTRACT

The present study investigated the neural correlates of infant discrimination of very similar linguistic varieties (Quebecois and Parisian French) using functional Near InfraRed Spectroscopy. In line with previous behavioral and electrophysiological data, there was no evidence that 3-month-olds discriminated the two regional accents, whereas 5-month-olds did, with the locus of discrimination in left anterior perisylvian regions. These neuroimaging results suggest that a developing language network relying crucially on left perisylvian cortices sustains infants' discrimination of similar linguistic varieties within this early period of infancy.


Subjects
Acoustic Stimulation/methods , Language , Pitch Perception/physiology , Near-Infrared Spectroscopy/methods , Speech Perception/physiology , Brain Mapping , Cerebral Cortex/physiology , Electrophysiology , France , Humans , Infant , Infant Behavior , Language Development , Quebec
16.
Cognition ; 245: 105734, 2024 04.
Article in English | MEDLINE | ID: mdl-38335906

ABSTRACT

Infants learn their native language(s) at an amazing speed. Before they even talk, their perception adapts to the language(s) they hear. However, the mechanisms responsible for this perceptual attunement and the circumstances in which it takes place remain unclear. This paper presents the first attempt to study perceptual attunement using ecological child-centered audio data. We show that a simple prediction algorithm exhibits perceptual attunement when applied on unrealistic clean audio-book data, but fails to do so when applied on ecologically-valid child-centered data. In the latter scenario, perceptual attunement only emerges when the prediction mechanism is supplemented with inductive biases that force the algorithm to focus exclusively on speech segments while learning speaker-, pitch-, and room-invariant representations. We argue these biases are plausible given previous research on infants and non-human animals. More generally, we show that what our model learns and how it develops through exposure to speech depends exquisitely on the details of the input signal. By doing so, we illustrate the importance of considering ecologically valid input data when modeling language acquisition.


Subjects
Phonetics , Speech Perception , Humans , Infant , Language Development , Learning , Language
17.
Dev Sci ; 16(1): 24-34, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23278924

ABSTRACT

Previous research with artificial language learning paradigms has shown that infants are sensitive to statistical cues to word boundaries (Saffran, Aslin & Newport, 1996) and that they can use these cues to extract word-like units (Saffran, 2001). However, it is unknown whether infants use statistical information to construct a receptive lexicon when acquiring their native language. In order to investigate this issue, we rely on the fact that besides real words a statistical algorithm extracts sound sequences that are highly frequent in infant-directed speech but constitute nonwords. In three experiments, we use a preferential listening paradigm to test French-learning 11-month-old infants' recognition of highly frequent disyllabic sequences from their native language. In Experiments 1 and 2, we use nonword stimuli and find that infants listen longer to high-frequency than to low-frequency sequences. In Experiment 3, we compare high-frequency nonwords to real words in the same frequency range, and find that infants show no preference. Thus, at 11 months, French-learning infants recognize highly frequent sound sequences from their native language and fail to differentiate between words and nonwords among these sequences. These results are evidence that they have used statistical information to extract word candidates from their input and stored them in a 'protolexicon', containing both words and nonwords.


Subjects
Cues , Language Development , Psychological Recognition/physiology , Vocabulary , Acoustic Stimulation , France , Humans , Infant , Biological Models
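
Illustrative note on entry 17: the abstract refers to statistical cues to word boundaries. One common way to operationalize such cues is the forward transitional probability between adjacent syllables, with a boundary placed wherever that probability is low. The sketch below assumes this operationalization; it is not necessarily the statistical algorithm used in the study. The function names, the artificial three-word corpus, and the threshold value are all hypothetical.

```python
from collections import Counter


def transitional_probabilities(utterances):
    """Forward TP(b | a) = count(a followed by b) / count(a followed by anything)."""
    pair_counts = Counter()
    left_counts = Counter()
    for syllables in utterances:
        for a, b in zip(syllables, syllables[1:]):
            pair_counts[(a, b)] += 1
            left_counts[a] += 1
    return {pair: n / left_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, tp, threshold):
    """Split a non-empty syllable list wherever the TP falls below threshold."""
    words = []
    current = [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp.get((a, b), 0.0) < threshold:
            words.append("".join(current))  # close the current word
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Artificial corpus: every ordered pair of three made-up trisyllabic words.
A, B, C = ["pa", "bi", "ku"], ["ti", "bu", "do"], ["go", "la", "tu"]
corpus = [A + B, C + A, B + C, A + C, B + A, C + B]

tp = transitional_probabilities(corpus)
words = segment(A + B, tp, threshold=0.75)
```

In this toy corpus, syllable pairs inside a word always co-occur while pairs spanning a word boundary do so only half the time, so a threshold between the two values recovers the words. On real infant-directed speech the same procedure would also return frequent nonword sequences, which is the property the study exploits.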
18.
Behav Brain Sci ; 36(4): 416-7, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23883745

ABSTRACT

Second person social cognition cannot be restricted to dyadic interactions between two persons (the "I" and the "you"). Many instances of social communication are triadic, and involve a third person (the "him/her/it"), which is the object of the interaction. We discuss neuropsychological and brain imaging data showing that triadic interactions involve dedicated brain networks distinct from those of dyadic interactions.


Subjects
Cognition/physiology , Interpersonal Relations , Mirror Neurons/physiology , Social Perception , Theory of Mind/physiology , Humans
19.
Cereb Cortex ; 21(2): 254-61, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20497946

ABSTRACT

This study uses near-infrared spectroscopy in young infants in order to elucidate the nature of functional cerebral processing for speech. Previous imaging studies of infants' speech perception revealed left-lateralized responses to native language. However, it is unclear if these activations were due to language per se rather than to some low-level acoustic correlate of spoken language. Here we compare native (L1) and non-native (L2) languages with 3 different nonspeech conditions including emotional voices, monkey calls, and phase scrambled sounds that provide more stringent controls. Hemodynamic responses to these stimuli were measured in the temporal areas of Japanese 4 month-olds. The results show clear left-lateralized responses to speech, prominently to L1, as opposed to various activation patterns in the nonspeech conditions. Furthermore, implementing a new analysis method designed for infants, we discovered a slower hemodynamic time course in awake infants. Our results are largely explained by signal-driven auditory processing. However, stronger activations to L1 than to L2 indicate a language-specific neural factor that modulates these responses. This study is the first to discover a significantly higher sensitivity to L1 in 4 month-olds and reveals a neural precursor of the functional specialization for the higher cognitive network.


Subjects
Brain Mapping , Child Development/physiology , Language , Speech Perception/physiology , Temporal Lobe/physiology , Acoustic Stimulation/methods , Emotions/physiology , Female , Functional Laterality/physiology , Hemoglobins/metabolism , Humans , Infant , Male , Computer-Assisted Numerical Analysis , Reaction Time , Near-Infrared Spectroscopy/methods , Time Factors
20.
Cognition ; 219: 104961, 2022 02.
Article in English | MEDLINE | ID: mdl-34856424

ABSTRACT

Infants come to learn several hundreds of word forms by two years of age, and it is possible this involves carving these forms out from continuous speech. It has been proposed that the task is facilitated by the presence of prosodic boundaries. We revisit this claim by running computational models of word segmentation, with and without prosodic information, on a corpus of infant-directed speech. We use five cognitively-based algorithms, which vary in whether they employ a sub-lexical or a lexical segmentation strategy and whether they are simple heuristics or embody an ideal learner. Results show that providing expert-annotated prosodic breaks does not uniformly help all segmentation models. The sub-lexical algorithms, which perform more poorly, benefit most, while the lexical ones show a very small gain. Moreover, when prosodic information is derived automatically from the acoustic cues infants are known to be sensitive to, errors in the detection of the boundaries lead to smaller positive effects, and even negative ones for some algorithms. This shows that even though infants could potentially use prosodic breaks, it does not necessarily follow that they should incorporate prosody into their segmentation strategies, when confronted with realistic signals.


Subjects
Speech Perception , Speech , Computer Simulation , Cues , Humans , Infant , Learning , Speech Acoustics