ABSTRACT
The present study investigated the role of syntactic processing in driving bilingual language selection. In two experiments, 120 English-dominant Spanish-English bilinguals read aloud 18 paragraphs with language switches. In Experiment 1a, each paragraph included eight switch words on function-word targets (four of which repeated in every paragraph); Experiment 1b was a replication with eight additional switches on content words in each paragraph. Both experiments had three conditions: (a) normal, (b) noun-swapped (in which nouns within consecutive sentences were swapped), and (c) random (in which the words in each sentence were reordered randomly). In both experiments bilinguals produced intrusion errors, mistakenly translating language switch words, especially on function words (e.g., saying the day and stay awake instead of the day y stay awake). Intrusion rates did not vary across experiments even though the switch rate was doubled in Experiment 1b relative to Experiment 1a. Bilinguals produced the most intrusions in normal paragraphs, slightly but significantly fewer in noun-swapped paragraphs, and dramatically fewer in the random condition, even though the random condition elicited the most within-language errors. Bilinguals also demonstrated a common signature of inhibitory control in the form of reversed language dominance effects, which did not vary significantly across paragraph types. Finally, intrusions increased with switch word predictability (surprisal), but significant differences between conditions remained when controlling for predictability. These results demonstrate that bilingual language selection is driven by syntactic processing, which operates independently of other language control mechanisms, such as inhibition.
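Surprisal, the predictability measure mentioned above, is the negative log probability of a word given its preceding context. Below is a minimal sketch of how it can be estimated with a causal language model; the use of GPT-2 is an illustrative assumption (a monolingual model is a simplification for mixed-language text), not the study's actual method.

```python
# Minimal sketch: estimate the surprisal of a switch word given its
# preceding context with GPT-2. Illustrative only; not the study's pipeline.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def surprisal(context, word):
    ids = tok(context + " " + word, return_tensors="pt").input_ids
    n_ctx = tok(context, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logprobs = torch.log_softmax(lm(ids).logits, dim=-1)
    # Logits at position i-1 predict token i; sum over the word's subtokens.
    return -sum(logprobs[0, i - 1, ids[0, i]].item()
                for i in range(n_ctx, ids.shape[1]))

print(surprisal("He stayed up all", "night"))  # low surprisal = predictable
```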
ABSTRACT
Theories of bilingual language production predict that bilinguals with Alzheimer's disease (AD) should exhibit one of two decline patterns: either parallel decline of both languages (if decline reflects damage to semantic representations that are accessed by both languages), or asymmetrical decline, with greater decline of the nondominant language (if decline reflects reduced ability to resolve competition from the dominant language as the disease progresses). Only two previous studies have examined decline longitudinally, one showing parallel and the other asymmetrical decline. We examined decline over 2-7 years (3.9 on average) in Spanish-English bilinguals (N = 23). Logistic regression revealed a parallel decline pattern at one year from baseline but an asymmetrical decline pattern over the longer decline period, with greater decline of the nondominant language (when calculating predicted probabilities of a correct response). The asymmetrical decline pattern was significantly greater for the nondominant language only when item difficulty was included in the model. Exploratory analyses across dominance groups, examining proportional decline relative to initial naming accuracy, further suggested that decline of the nondominant language may be more precipitous if that language was acquired later in life, but the critical interaction needed to support this possibility was not statistically significant in a logistic regression analysis. These results suggest that accessibility of the nondominant language may initially be more resilient in early relative to more advanced AD, and that AD affects shared semantic representations before executive control declines to a point where the ability to name pictures in single-language testing blocks is disrupted. Additional work is needed to determine whether asymmetrical decline patterns are magnified by late acquisition of the nondominant language, and whether more subtle impairments to executive control underlie the impairments to language switching that occur in the earliest stages of AD (even preclinically).
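A minimal sketch of the kind of logistic regression described above, modeling trial-level naming accuracy as a function of test language, years from baseline, and item difficulty; the data file and column names are hypothetical, not the study's actual variables.

```python
# Sketch: logistic regression of naming accuracy on language, years from
# baseline, and item difficulty. File and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# columns: correct (0/1), language (dominant/nondominant), years, item_difficulty
df = pd.read_csv("naming_data.csv")
model = smf.logit("correct ~ language * years + item_difficulty", data=df).fit()
print(model.summary())
# A reliable language:years interaction (steeper slope for the nondominant
# language) is the signature of asymmetrical decline.
```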
Subject(s)
Alzheimer Disease, Multilingualism, Humans, Alzheimer Disease/physiopathology, Male, Female, Aged, Longitudinal Studies, Aged, 80 and over, Disease Progression, Neuropsychological Tests, Language Tests, Language Disorders/etiology, Language Disorders/physiopathology, Language Disorders/diagnosis
ABSTRACT
Speech recognition by both humans and machines frequently fails in non-optimal yet common situations. For example, word recognition error rates for second-language (L2) speech can be high, especially under conditions involving background noise. At the same time, both human and machine speech recognition sometimes shows remarkable robustness against signal- and noise-related degradation. Which acoustic features of speech explain this substantial variation in intelligibility? Current approaches align speech to text to extract a small set of predefined spectro-temporal properties from specific sounds in particular words. However, variation in these properties leaves much cross-talker variation in intelligibility unexplained. We examine an alternative approach that uses a perceptual similarity space acquired through self-supervised learning. This approach encodes distinctions between speech samples without requiring predefined acoustic features or speech-to-text alignment. We show that L2 English speech samples are less tightly clustered in the space than L1 samples, reflecting variability in English proficiency among L2 talkers. Critically, distances in this similarity space are perceptually meaningful: L1 English listeners have lower recognition accuracy for L2 speakers whose speech is more distant in the space from L1 speech. These results indicate that perceptual similarity may form the basis for an entirely new approach to speech and language analysis.
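A minimal sketch of the general idea, assuming wav2vec 2.0 embeddings as the self-supervised representation (an illustrative choice, not necessarily the paper's model): embed each talker's speech, then measure each L2 talker's distance from the centroid of the L1 samples.

```python
# Sketch: embed speech with a self-supervised model and measure each L2
# talker's distance from the L1 cluster. Model choice and file names are
# illustrative assumptions.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()

def embed(path):
    wav, sr = torchaudio.load(path)
    wav = torchaudio.functional.resample(wav, sr, 16_000).mean(dim=0)
    inputs = extractor(wav.numpy(), sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, frames, dims)
    return hidden.mean(dim=1).squeeze(0)             # mean-pool over time

l1_paths = ["l1_talker01.wav", "l1_talker02.wav"]    # placeholder file names
l2_paths = ["l2_talker01.wav", "l2_talker02.wav"]
l1_centroid = torch.stack([embed(p) for p in l1_paths]).mean(dim=0)
# Larger distance from the L1 centroid should predict lower recognition
# accuracy by L1 listeners.
l2_distance = {p: torch.dist(embed(p), l1_centroid).item() for p in l2_paths}
```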
Subject(s)
Speech Acoustics, Speech Intelligibility, Speech Perception, Humans, Male, Female, Adult, Young Adult, Multilingualism, Recognition, Psychology, Noise
ABSTRACT
This study investigates heritage bilingual speakers' perception of naturalistic code-switched sentences (i.e., sentences that use both languages). Studies of single-word perception suggest that code-switching is more difficult to perceive than single-language speech. However, such difficulties may not extend to more naturalistic sentences, where predictability and other cues may serve to ameliorate them. Fifty-four Mexican-American Spanish heritage bilinguals transcribed sentences in noise in English, Spanish, and code-switched blocks. Participants were better at perceiving speech in single-language blocks than in code-switched blocks. The results indicate that increased language co-activation when perceiving code-switching results in significant processing costs.
Subject(s)
Cues (Psychology), Speech, Humans, Language, Mexican Americans, Perception
ABSTRACT
Measuring how well human listeners recognize speech under varying environmental conditions (speech intelligibility) is a challenge for theoretical, technological, and clinical approaches to speech communication. The current gold standard, human transcription, is time- and resource-intensive. Recent advances in automatic speech recognition (ASR) systems raise the possibility of automating intelligibility measurement. This study tested four state-of-the-art ASR systems with second-language speech-in-noise and found that one, Whisper, performed at or above human listener accuracy. However, the content of Whisper's responses diverged substantially from human responses, especially at lower signal-to-noise ratios, suggesting both opportunities and limitations for ASR-based modeling of speech intelligibility.
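A minimal sketch of how ASR output can be scored against a human reference transcript using word error rate (WER); the audio path and reference text are placeholders, and the openai-whisper and jiwer packages are illustrative tooling choices, not necessarily the study's.

```python
# Sketch: score ASR output against a human reference transcript with word
# error rate (WER). Paths and reference text are placeholders.
import whisper          # openai-whisper package
from jiwer import wer

model = whisper.load_model("base")

def intelligibility_score(audio_path, reference):
    hypothesis = model.transcribe(audio_path)["text"]
    return 1.0 - wer(reference, hypothesis)  # higher = more intelligible

print(intelligibility_score("talker01_snr0.wav", "the boy ran down the hill"))
```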
Subject(s)
Speech Perception, Humans, Speech Perception/physiology, Noise/adverse effects, Speech Intelligibility/physiology, Speech Recognition Software, Recognition, Psychology
ABSTRACT
Peer review is a core component of scientific practice. Although peer review ideally improves research and promotes rigor, it also has consequences for what types of research are published and cited and who wants to (and is able to) advance in research-focused careers. Despite these consequences, few reviewers or editors receive training or oversight to ensure their feedback is helpful, professional, and culturally sensitive. Here, we critically examine the peer-review system in psychology and neuroscience at multiple levels, from ideas to institutions, interactions, and individuals. We highlight initiatives that aim to change the normative negativity of peer review and provide authors with constructive, actionable feedback that is sensitive to diverse identities, methods, topics, and environments. We conclude with a call to action for how individuals, groups, and organizations can improve the culture of peer review. We provide examples of how changes in the peer-review system can be made with an eye to diversity (increasing the range of identities and experiences constituting the field), equity (fair processes and outcomes across groups), and inclusion (experiences that promote belonging across groups). These changes can improve scientists' experience of peer review, promote diverse perspectives and identities, and enhance the quality and impact of science.
Subject(s)
Peer Review, Psychology
ABSTRACT
BACKGROUND AND HYPOTHESIS: Motor abnormalities are predictive of psychosis onset in individuals at clinical high risk (CHR) for psychosis and are tied to its progression. We hypothesize that these motor abnormalities also disrupt speech production (a highly complex motor behavior) and predict that CHR individuals will produce more variable speech than healthy controls, and that this variability will relate to symptom severity, motor measures, and psychosis-risk-calculator risk scores. STUDY DESIGN: We measure variability in speech production (variability in consonants, vowels, speech rate, and pausing/timing) in N = 58 CHR participants and N = 67 healthy controls. Three different tasks are used to elicit speech: diadochokinetic speech (rapidly repeated syllables, e.g., papapa, pataka), read speech, and spontaneously generated speech. STUDY RESULTS: Individuals in the CHR group produced more variable consonants and exhibited greater speech rate variability than healthy controls in two of the three speech tasks (diadochokinetic and read speech). While there were no significant correlations between speech measures and remotely obtained motor measures, symptom severity, or conversion risk scores, these comparisons may be underpowered (in part due to the challenges of remote data collection during the COVID-19 pandemic). CONCLUSION: This study provides a thorough and theory-driven first look at how speech production is affected in this at-risk population and speaks to the promise and challenges facing this approach moving forward.
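As a concrete example of one such measure, here is a minimal sketch of speech-rate variability for the diadochokinetic task, computed as the coefficient of variation of intervals between detected syllable onsets; librosa onset detection is an illustrative choice, not the study's pipeline.

```python
# Sketch: speech-rate variability as the coefficient of variation (CV) of
# inter-syllable intervals in a diadochokinetic recording ("papapa...").
# Onset detection via librosa is an illustrative choice.
import librosa
import numpy as np

def rate_variability(audio_path):
    y, sr = librosa.load(audio_path, sr=None)
    onsets = librosa.onset.onset_detect(y=y, sr=sr, units="time")
    intervals = np.diff(onsets)                 # seconds between onsets
    return intervals.std() / intervals.mean()   # higher CV = more variable
```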
ABSTRACT
Theories of speech production have proposed that in contexts where multiple languages are produced, bilinguals inhibit the dominant language with the goal of making both languages equally accessible. This process often overshoots that goal, leading to a surprising pattern: better performance in the nondominant than the dominant language, or reversed language dominance effects. However, the reliability of this effect in single-word production studies with cued language switches has been challenged by a recent meta-analysis. Correcting for errors in that analysis, we find that dominance effects are reliably reduced and reversed during language mixing. Reversed dominance has also consistently been reported in the production of connected speech elicited by reading aloud mixed-language paragraphs. When switching, bilinguals produced translation-equivalent intrusion errors (e.g., saying pero instead of but) more often when intending to produce words in the dominant language. We show that this dominant-language vulnerability is not exclusive to switching out of the nondominant language and extends to non-switch words, linking connected-speech results to patterns first reported in single-word studies. Reversed language dominance is a robust phenomenon that reflects the tip of the iceberg of inhibitory control of the dominant language in bilingual language production.
ABSTRACT
Subject-verb agreement errors are common in sentence production. Many studies have used experimental paradigms targeting the production of subject-verb agreement from a sentence preamble (The key to the cabinets) and eliciting verb errors (*were shiny). Through reanalysis of previous data (50 experiments; 102,369 observations), we show that this paradigm also results in many errors in preamble repetition, particularly of local noun number (The key to the *cabinet). We explore the mechanisms of both errors in Parallelism in Producing Syntax (PIPS), a model in the Gradient Symbolic Computation framework. PIPS models sentence production using a continuous-state stochastic dynamical system that optimizes grammatical constraints (shaped by previous experience) over vector representations of symbolic structures. At intermediate stages in the computation, grammatical constraints allow multiple competing parses to be partially activated, resulting in stable but transient conjunctive blend states. In the context of the preamble completion task, memory constraints reduce the strength of the target structure, allowing for co-activation of non-target parses in which the local noun controls the verb (notional agreement and locally agreeing relative clauses) and non-target parses that include structural constituents with contrasting number specifications (e.g., a plural instead of a singular local noun). Simulations of the preamble completion task reveal that these partially activated non-target parses, as well as the need to balance accurate encoding of lexical and syntactic aspects of the prompt, result in errors. In other words: because sentence processing is embedded in a processor with finite memory and prior experience with production, interference from non-target production plans causes errors.
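To make the dynamics concrete, here is a toy sketch (an illustrative reduction, not the actual PIPS model) of a continuous state evolving by noisy gradient ascent on a harmony function with two competing parse attractors: a blend state is partially attracted to both, and noise plus a weakened target constraint can tip production toward the competitor, i.e., an error.

```python
# Toy sketch of a continuous-state stochastic dynamical system with two
# competing parse attractors. Purely illustrative; not the PIPS model.
import numpy as np

rng = np.random.default_rng(0)
target = np.array([1.0, 0.0])       # target parse (singular agreement)
competitor = np.array([0.0, 1.0])   # local-noun agreement parse
w_target, w_comp = 1.0, 0.8         # constraint strengths; memory load lowers w_target

def harmony_grad(x):
    # Each parse attracts the state in proportion to its constraint strength.
    return w_target * (target - x) + w_comp * (competitor - x)

x = np.array([0.5, 0.5])            # blend state: both parses partially active
for _ in range(1000):
    x = x + 0.01 * harmony_grad(x) + 0.05 * rng.normal(size=2)  # noise -> errors

print("target" if x[0] > x[1] else "competitor (agreement error)")
```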
Subject(s)
Language, Semantics, Humans
ABSTRACT
Speakers learning a second language show systematic differences from native speakers in the retrieval, planning, and articulation of speech. A key challenge in examining the interrelationship between these differences at various stages of production is the need for manual annotation of fine-grained properties of speech. We introduce a new method for automatically analyzing voice onset time (VOT), a key phonetic feature indexing differences in sound systems cross-linguistically. In contrast to previous approaches, our method allows reliable measurement of prevoicing, a dimension of VOT variation used by many languages. Analysis of VOTs, word durations, and reaction times from German-speaking learners of Spanish (Baus et al., 2013) suggests that while there are links between the factors affecting planning and articulation, these two processes also exhibit some degree of independence. We discuss the implications of these findings for theories of speech production and future research in bilingual language processing.
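For readers unfamiliar with the measure: VOT is the lag from the stop burst to the onset of voicing, and prevoicing means voicing begins before the burst (a negative VOT). The toy sketch below locates both landmarks with simple energy and periodicity heuristics; it is meant only to make the quantity concrete and is unrelated to the trained method the paper introduces.

```python
# Toy sketch: estimate VOT as (voicing onset - burst onset); negative
# values indicate prevoicing. Heuristic landmarks only, for illustration.
import librosa
import numpy as np

def measure_vot_ms(path):
    y, sr = librosa.load(path, sr=16000)
    hop = 64
    # Burst: largest frame-to-frame jump in short-time energy.
    rms = librosa.feature.rms(y=y, frame_length=256, hop_length=hop)[0]
    burst_t = np.argmax(np.diff(rms)) * hop / sr
    # Voicing onset: first frame that pyin judges to be periodic (voiced).
    _, voiced, _ = librosa.pyin(y, fmin=75, fmax=400, sr=sr,
                                frame_length=1024, hop_length=hop)
    voice_t = np.flatnonzero(voiced)[0] * hop / sr
    return (voice_t - burst_t) * 1000.0
```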
ABSTRACT
In two speech production experiments, we investigated the link between phonetic variation and the scope of advance planning at the word-form encoding stage. We examined cases where a word has, in addition to its pronunciation in isolation, a context-specific pronunciation variant that appears only when the following word includes specific sounds. To the extent that the speaker uses the variant specific to the following context, we can infer that the phonological content of the upcoming word is included in the current planning scope. We hypothesize that the time alignment between selection of the phonetic variant in the word currently being encoded and retrieval of segmental details of the upcoming word varies from moment to moment depending on current task demands and the dynamics of lexical access for each word involved. The results showed that English speakers' use of a context-sensitive phonetic variant of /t/ ("flapping") reliably increased under conditions that favor advance planning. Our hypothesis was supported by evidence compatible with its three key predictions: an increase in flapping in phrases with a higher-frequency following word, more flapping in a procedure with a response delay relative to a speeded response, and an attenuation of the following-word frequency effect with delayed responses. This reveals that within speakers, the degree of advance planning varies continuously from moment to moment, reflecting (in part) the accessibility of form properties of individual words in the utterance.
Subject(s)
Phonetics, Speech, Humans, Language, Speech Production Measurement
ABSTRACT
Speaking involves both retrieving the sounds of a word (phonological planning) and realizing the selected sounds in fluid speech (articulation). Recent phonetic research on speech errors has argued that multiple candidate sounds in phonological planning can influence articulation, because the pronunciation of mis-selected error sounds is slightly skewed toward the unselected target sounds. Yet research to date has only examined these phonetic distortions in experimentally elicited errors, leaving doubt as to whether they reflect tendencies in spontaneous speech. Here, we analyzed the pronunciation of speech errors made by English-speaking adults in natural conversations relative to matched correct words from the same speakers, and found the conjectured phonetic distortions. Comparison of these data with a larger set of experimentally elicited errors failed to reveal significant differences between the two types of errors. These findings provide ecologically valid data supporting models that allow information about multiple planning representations to simultaneously influence speech articulation.
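A minimal sketch of the kind of comparison involved, using hypothetical VOT measurements: if a /k/ produced in error for an intended /g/ carries a trace of the target, its VOT should be shifted toward /g/ (shorter) relative to matched correct /k/ tokens. The data file and column names are assumptions for illustration.

```python
# Sketch: test whether error tokens are phonetically skewed toward the
# intended target. File and column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("vot_tokens.csv")   # columns: speaker, token_type, vot_ms
errors = df.loc[df.token_type == "k_for_g_error", "vot_ms"]
correct = df.loc[df.token_type == "k_correct", "vot_ms"]
# A trace of intended /g/ predicts shorter VOT in the error tokens.
print(stats.ttest_ind(errors, correct, equal_var=False))
```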
Subject(s)
Phonetics, Speech, Adult, Communication, Humans, Language, Speech Production Measurement
ABSTRACT
The language and speech of individuals with psychosis reflect their impairments in cognition and motor processes. These language disturbances can be used to identify individuals with, and at high risk for, psychosis, as well as to help track and predict symptom progression, allowing for early intervention and improved outcomes. However, current methods of language assessment (manual annotations and/or clinical rating scales) are time-intensive, expensive, subject to bias, and difficult to administer on a wide scale, limiting this area from reaching its full potential. Computational methods that can automatically perform linguistic analysis have started to be applied to this problem and could drastically improve our ability to use linguistic information clinically. In this article, we first review how these automated, computational methods work and how they have been applied to the field of psychosis. We show that across domains, these methods have captured differences between individuals with psychosis and healthy controls and can classify individuals with high accuracy, demonstrating the promise of these methods. We then consider the obstacles that need to be overcome before these methods can play a significant role in the clinical process and provide suggestions for how the field should address them. In particular, while much of the work thus far has focused on demonstrating the successes of these methods, we argue that a better understanding of when and why these models fail will be crucial to ensuring these methods reach their potential in the field of psychosis.
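One widely used automated measure in this literature is semantic coherence, the similarity between consecutive sentences of a transcript, which tends to be lower in disorganized speech. A minimal sketch follows; the embedding model is an illustrative assumption.

```python
# Sketch: semantic coherence as the mean cosine similarity between
# consecutive sentence embeddings. Model choice is illustrative.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("all-MiniLM-L6-v2")

def coherence(sentences):
    embs = model.encode(sentences)
    sims = [float(cos_sim(embs[i], embs[i + 1]))
            for i in range(len(embs) - 1)]
    return sum(sims) / len(sims)    # lower values ~ more tangential speech
```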
Subject(s)
Cognitive Dysfunction/physiopathology, Language Disorders/physiopathology, Psycholinguistics, Psychotic Disorders/physiopathology, Schizophrenia/physiopathology, Thinking/physiology, Adult, Biomarkers, Cognitive Dysfunction/etiology, Humans, Language Disorders/etiology, Psychotic Disorders/complications, Schizophrenia/complications
ABSTRACT
Neural interfaces that directly produce intelligible speech from brain activity would allow people with severe impairment from neurological disorders to communicate more naturally. Here, we record neural population activity in motor, premotor and inferior frontal cortices during speech production using electrocorticography (ECoG) and show that ECoG signals alone can be used to generate intelligible speech output that can preserve conversational cues. To produce speech directly from neural data, we adapted a method from the field of speech synthesis called unit selection, in which units of speech are concatenated to form audible output. In our approach, which we call Brain-To-Speech, we chose subsequent units of speech based on the measured ECoG activity to generate audio waveforms directly from the neural recordings. Brain-To-Speech employed the user's own voice to generate speech that sounded very natural and included features such as prosody and accentuation. By investigating the brain areas involved in speech production separately, we found that speech motor cortex provided more information for the reconstruction process than the other cortical areas.
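To make the unit-selection idea concrete, here is a toy sketch (with illustrative placeholder arrays, not the paper's implementation): each incoming neural feature frame is matched to the stored unit whose associated neural features are nearest, and the chosen audio units are concatenated.

```python
# Toy sketch of unit selection for Brain-To-Speech: match each neural
# feature frame to its nearest stored unit, then concatenate the audio.
# Array contents are illustrative placeholders.
import numpy as np

def brain_to_speech(ecog_frames, unit_feats, unit_audio):
    """ecog_frames: (T, d) neural features for the utterance to decode.
    unit_feats: (N, d) neural features recorded with each stored unit.
    unit_audio: (N, L) waveform snippets for the stored units."""
    out = []
    for frame in ecog_frames:
        nearest = np.argmin(np.linalg.norm(unit_feats - frame, axis=1))
        out.append(unit_audio[nearest])
    # Real systems smooth unit boundaries; plain concatenation is crude.
    return np.concatenate(out)
```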
ABSTRACT
Interactive models of language production predict that it should be possible to observe long-distance interactions: effects that arise at one level of processing influence multiple subsequent stages of representation and processing. We examine the hypothesis that disruptions arising at non-form-based levels of planning, specifically lexical selection, should modulate articulatory processing. A novel automatic phonetic analysis method was used to examine productions in a paradigm yielding both general disruptions to formulation processes and, more specifically, overt errors during lexical selection. This analysis method allowed us to examine articulatory disruptions at multiple levels of analysis, from whole words to individual segments. Baseline performance by young adults was contrasted with young speakers' performance under time pressure (which previous work has argued increases interaction between planning and articulation) and with performance by older adults (who may have difficulty inhibiting nontarget representations, leading to heightened interactive effects). The results revealed the presence of interactive effects. Our new analysis techniques revealed that these effects were strongest in the initial portions of responses, suggesting that speech is initiated as soon as the first segment has been planned. Interactive effects did not increase under response pressure, suggesting that interaction between planning and articulation is relatively fixed. Unexpectedly, lexical selection disruptions appeared to yield some degree of facilitation in articulatory processing (possibly reflecting semantic facilitation of target retrieval), and older adults showed weaker, not stronger, interactive effects (possibly reflecting weakened connections between lexical and form-level representations).
Subject(s)
Phonetics, Psycholinguistics, Speech, Adolescent, Aged, Aging/psychology, Association, Female, Humans, Inhibition, Psychological, Male, Middle Aged, Neural Networks, Computer, Pattern Recognition, Visual, Reading, Young Adult
ABSTRACT
The current study investigated how aging affects the production and self-correction of errors in connected speech elicited via a read-aloud task. Thirty-five cognitively healthy older adults and 56 younger adults read aloud six paragraphs in each of three conditions of increasing difficulty: (a) normal, (b) nouns-swapped (in which nouns were shuffled across pairs of sentences in each paragraph), and (c) exchange (in which adjacent words in every two sentences were reversed in order). Reading times and errors increased with task difficulty, but self-correction rates were lowest in the nouns-swapped condition. Older participants read aloud more slowly and, after controlling for aging-related advantages in vocabulary knowledge, produced more speech errors (especially in the normal condition) and self-corrected errors less often than younger participants. Exploratory analysis of error types revealed that aging increased the rate of function word substitution errors (saying the instead of a), whereas younger participants omitted content words more often than older participants did. This pattern of aging deficits reveals powerful effects of vocabulary knowledge on speech production and suggests that aging speakers can compensate for aging-related decline in control over speech production with their greater vocabulary knowledge and careful attention to speech planning in more difficult speaking conditions. These results suggest a model of speech production in which planning of speech is relatively automatic, whereas monitoring and self-correction are more attention-demanding, in turn leaving speech production relatively intact in aging.
Subject(s)
Aging/physiology, Aging/psychology, Reading, Speech/physiology, Vocabulary, Adult, Aged, Cohort Studies, Female, Humans, Male, Time Factors
ABSTRACT
Speech is a critical form of human communication and is central to our daily lives. Yet, despite decades of study, an understanding of the fundamental neural control of speech production remains incomplete. Current theories model speech production as a hierarchy from sentences and phrases down to words, syllables, speech sounds (phonemes), and the actions of vocal tract articulators used to produce speech sounds (articulatory gestures). Here, we investigate the cortical representation of articulatory gestures and phonemes in ventral precentral and inferior frontal gyri in men and women. Our results indicate that ventral precentral cortex represents gestures to a greater extent than phonemes, while inferior frontal cortex represents both gestures and phonemes. These findings suggest that speech production shares a common cortical representation with that of other types of movement, such as arm and hand movements. This has important implications both for our understanding of speech production and for the design of brain-machine interfaces to restore communication to people who cannot speak.
SIGNIFICANCE STATEMENT: Despite being studied for decades, the production of speech by the brain is not fully understood. In particular, the most elemental parts of speech, speech sounds (phonemes) and the movements of vocal tract articulators used to produce these sounds (articulatory gestures), have both been hypothesized to be encoded in motor cortex. Using direct cortical recordings, we found evidence that primary motor and premotor cortices represent gestures to a greater extent than phonemes. Inferior frontal cortex (part of Broca's area) appears to represent both gestures and phonemes. These findings suggest that speech production shares a similar cortical organizational structure with the movement of other body parts.
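The comparison described above can be framed as a decoding analysis: from the same neural features, how accurately can gesture labels versus phoneme labels be classified? A minimal sketch follows, with randomly generated placeholder arrays standing in for the neural data and labels.

```python
# Sketch: compare gesture vs. phoneme decoding accuracy from the same
# neural features. X and the label arrays are random placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))             # placeholder neural features
gesture_labels = rng.integers(0, 4, 200)   # placeholder gesture classes
phoneme_labels = rng.integers(0, 10, 200)  # placeholder phoneme classes

def decoding_accuracy(X, labels):
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, labels, cv=5).mean()

# A region "represents gestures more than phonemes" to the extent that
# gesture decoding is reliably more accurate from its activity.
print(decoding_accuracy(X, gesture_labels), decoding_accuracy(X, phoneme_labels))
```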