ABSTRACT
When speech is too fast, the tracking of the acoustic signal along the auditory pathway deteriorates, leading to suboptimal speech segmentation and decoding of speech information. Thus, speech comprehension is limited by the temporal constraints of the auditory system. Here we ask whether individual differences in auditory-motor coupling strength in part shape these temporal constraints. In two behavioural experiments, we characterize individual differences in the comprehension of naturalistic speech as a function of the individual synchronization between the auditory and motor systems and the preferred frequencies of the systems. As expected, speech comprehension declined at higher speech rates. Importantly, however, both higher auditory-motor synchronization and higher spontaneous speech motor production rates were predictive of better speech-comprehension performance. Furthermore, performance increased with higher working memory capacity (digit span) and higher linguistic, model-based sentence predictability, particularly so at higher speech rates and for individuals with high auditory-motor synchronization. The data provide evidence for a model of speech comprehension in which individual flexibility of not only the motor system but also auditory-motor synchronization may play a modulatory role.
Subject(s)
Comprehension , Speech , Humans , Acoustics , Extremities , Linguistics
ABSTRACT
This study examined the effect of modality onset asynchrony and response processing time for the recognition of text-supplemented speech. Speech and text were periodically interrupted by noise or black bars, respectively, to preserve 50% of the sentence and presented in unimodal and multimodal conditions. Sentence recognition and response errors were assessed for responses made simultaneous with the stimulus or after its presentation. Increased processing time allowed for the cognitive repair of initial response errors in working memory. Text-supplemented speech was best recognized with minimal temporal asynchrony. Overall, text supplementation facilitated the recognition of degraded speech when provided sufficient processing time.
Subject(s)
Dietary Supplements , Speech , Short-Term Memory , Reaction Time , Recognition (Psychology)
ABSTRACT
Inputs delivered to different sensory organs provide us with complementary speech information about the environment. The goal of this study was to establish which multisensory characteristics can facilitate speech recognition in noise. The major finding is that the tracking of temporal cues of visual/tactile speech synced with auditory speech can play a key role in speech-in-noise performance. This suggests that multisensory interactions are fundamentally important for speech recognition ability in noisy environments, and they require salient temporal cues. The amplitude envelope, serving as a reliable temporal cue source, can be applied through different sensory modalities when speech recognition is compromised.
Subject(s)
Cues (Psychology) , Speech Perception , Speech
ABSTRACT
Listeners parse the speech signal effortlessly into words and phrases, but many questions remain about how. One classic idea is that rhythm-related auditory principles play a role, in particular, that a psycho-acoustic "iambic-trochaic law" (ITL) ensures that alternating sounds varying in intensity are perceived as recurrent binary groups with initial prominence (trochees), while alternating sounds varying in duration are perceived as binary groups with final prominence (iambs). We test the hypothesis that the ITL is in fact an indirect consequence of the parsing of speech along two in-principle orthogonal dimensions: prominence and grouping. Results from several perception experiments show that the two dimensions, prominence and grouping, are each reliably cued by both intensity and duration, while foot type is not associated with consistent cues. The ITL emerges only when one manipulates either intensity or duration in an extreme way. Overall, the results suggest that foot perception is derivative of the cognitively more basic decisions of grouping and prominence, and the notions of trochee and iamb may not play any direct role in speech parsing. A task manipulation furthermore gives new insight into how these decisions mutually inform each other.
Subject(s)
Acoustics , Speech , Cues (Psychology) , Social Group , Sound
ABSTRACT
This study investigates the impact of wearing a face mask on the production and perception of coarticulatory vowel nasalization. Speakers produced monosyllabic American English words with oral and nasal codas (i.e., CVC and CVN) in face-masked and un-face-masked conditions to a real human interlocutor. The vowel was either tense or lax. Acoustic analyses indicate that speakers produced greater coarticulatory vowel nasality in CVN items when wearing a face mask, particularly, when the vowel is lax, suggesting targeted enhancement of the oral-nasalized contrast in this condition. This enhancement is not observed for tense vowels. In a perception study, participants heard CV syllables excised from the recorded words and performed coda identifications. For lax vowels, listeners were more accurate at identifying the coda in the face-masked condition, indicating that they benefited from the speakers' production adjustments. Overall, the results indicate that speakers adapt their speech in specific contexts when wearing a face mask, and these speaker adjustments have an influence on listeners' abilities to identify words in the speech signal.
Subject(s)
Masks , Speech , Humans , Personal Protective Equipment , Acoustics , Perception
ABSTRACT
Social media platforms are now heavily moderated to prevent the spread of online hate speech, which is usually rich in toxic words and is directed toward an individual or a community. Owing to such heavy moderation, newer and more subtle techniques are being deployed. One of the most striking among these is fear speech. Fear speech, as the name suggests, attempts to incite fear about a target community. Although subtle, it might be highly effective, often pushing communities toward a physical conflict. Therefore, understanding its prevalence in social media is of paramount importance. This article presents a large-scale study to understand the prevalence of 400K fear speech and over 700K hate speech posts collected from Gab.com. Remarkably, users posting a large number of fear speech posts accrue more followers and occupy more central positions in social networks than users posting a large number of hate speech posts. They can also reach out to benign users more effectively than hate speech users through replies, reposts, and mentions. This connects to the fact that, unlike hate speech, fear speech has almost zero toxic content, making it look plausible. Moreover, while fear speech topics mostly portray a community as a perpetrator using a (fake) chain of argumentation, hate speech topics hurl direct multitarget insults, thus pointing to why general users could be more gullible to fear speech. Our findings extend to other platforms (Twitter and Facebook) and thus necessitate using sophisticated moderation policies and mass awareness to combat fear speech.
Subject(s)
Social Media , Humans , Speech , Fear , Fertility , Hate
ABSTRACT
Going to a therapist is the outcome of a contingency of life: a painful reality, sometimes barely noticeable, that becomes repetitive or unbearable. The therapist relies on this nascent adventure to reveal the object nestled in the patient's speech. To orient this work, transference, the symptom, and the share of jouissance are brought together. The adventure of speech thus takes the risk of moving towards the intimate, which is present in what causes suffering. A psychoanalytic point of view is valuable for shedding light on what is at stake in the relational field.
Subject(s)
Psychoanalysis , Speech , Humans
ABSTRACT
The spread and influence of misinformation have become a matter of concern in society as misinformation can negatively impact individuals' beliefs, opinions and, consequently, decisions. Research has shown that individuals persevere in their biased beliefs and opinions even after the retraction of misinformation. This phenomenon is known as the belief perseverance bias. However, research on mitigating the belief perseverance bias after the retraction of misinformation has been limited. Only a few debiasing techniques with limited practical applicability have been proposed, and research comparing various techniques in terms of their effectiveness has been scarce. This paper contributes to research on mitigating the belief perseverance bias after the retraction of misinformation by proposing counter-speech (CS) and awareness-training (AT) techniques and comparing them in terms of effectiveness to the existing counter-explanation (CE) technique in an experiment with N = 251 participants. To determine changes in opinions, the extent of the belief perseverance bias and the effectiveness of the debiasing techniques in mitigating the belief perseverance bias, we measure participants' opinions four times in the experiment by using Likert items and phi-coefficient measures. The effectiveness of the debiasing techniques is assessed by measuring the difference between the baseline opinions before exposure to misinformation and the opinions after exposure to a debiasing technique. Further, we discuss the efforts of the providers and recipients of debiasing and the practical applicability of the debiasing techniques. The CS technique, with a very large effect size, is the most effective among the three techniques. The CE and AT techniques, with medium effect sizes, are close to being equivalent in terms of their effectiveness. The CS and AT techniques are associated with less cognitive and time effort of the recipients of debiasing than the CE technique, while the AT and CE techniques require less effort from the providers of debiasing than the CS technique.
Subject(s)
Communication , Speech , Humans , Attitude , Bias
ABSTRACT
PURPOSE: Acquired central dysgraphia is a heterogeneous neurological disorder that usually co-occurs with other language disorders. Written language training is relevant to improve everyday skills and as a compensatory strategy to support limited oral communication. A systematic evaluation of existing writing treatments is thus needed. METHOD: We performed a systematic review of speech and language therapies for acquired dysgraphia in studies of neurological diseases (PROSPERO: CRD42018084221), following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist with a search on several databases for articles written in English and published until August 31, 2021. Only methodologically well-designed studies were included. Further assessment of methodological quality was conducted by means of a modified version of the Downs and Black checklist. RESULTS: Eleven studies of 43 patients in total were included. For each study, we collected data on type of population, type of impairment, experimental design, type of treatment, and measured outcomes. The studies had a medium level of assessed methodological quality. An informative description of treatments and linkages to deficits is reported. CONCLUSIONS: Although there is a need for further experimental evidence, most treatments showed good applicability and improvement of written skills in patients with dysgraphia. Lexical treatments appear to be more frequently adopted and more flexible in improving dysgraphia and communication, especially when a multimodal approach is used. Finally, the reported description of treatment modalities for dysgraphia in relation to patients' deficits may be important for providing tailored therapies in clinical management.
Subject(s)
Agraphia , Language Disorders , Humans , Agraphia/diagnosis , Agraphia/etiology , Agraphia/therapy , Speech , Language Therapy , Language Disorders/diagnosis , Language Disorders/etiology , Language Disorders/therapy , Language
ABSTRACT
This article reports the first group-based intervention study in the UK of using speech-to-text (STT) technology to improve the writing of children with special educational needs and disabilities (SEND). Over a period of five years, thirty children in total took part, from three settings: a mainstream school, a special school and a special unit of a different mainstream school. All children had Education, Health and Care Plans because of their difficulties in spoken and written communication. Children were trained to use the Dragon STT system, and used it on set tasks for 16-18 weeks. Handwritten text and self-esteem were assessed before and after the intervention, and screen-written text at the end. The results showed that this approach had boosted the quantity and quality of handwritten text, with post-test screen-written text significantly better than handwritten text at post-test. The self-esteem instrument also showed positive and statistically significant results. The findings support the feasibility of using STT to support children with writing difficulties. All the data were gathered before the Covid-19 pandemic; the implications of this, and of the innovative research design, are discussed.
Subject(s)
COVID-19 , Persons with Disabilities , Child , Humans , Speech , Pandemics , Writing
ABSTRACT
PURPOSE: To compare the vowel emission and number counting tasks in perceptual-auditory differentiation among children with and without laryngeal lesions. METHODS: Observational, analytical, and cross-sectional methods were used. Medical records of 44 children were selected from a database of an otorhinolaryngology service at a University Hospital and they were divided into groups: without laryngeal lesion (WOLL), and with laryngeal lesion (WLL), with 33 and 11 children. For the auditory-perceptual evaluation, the vocal samples were separated according to the type of task. They were analyzed separately by a judge who analyzed the general degree of vocal deviation and assessed whether the child would pass or fail in the face of a screening situation. RESULTS: There was a difference between the WOLL and WLL groups in terms of the overall degree of vocal deviation for the task of number counting, with a predominance of mild deviations in WOLL and moderate in WLL. In the screening, there was a difference between the groups during the number counting task, with more failures in the WLL. The groups were similar in the sustained vowel task, both in terms of the overall degree of vocal deviation and the vocal screening. Most children in the WLL failed in both tasks during vocal screening compared to the children in the WOLL who, in general, failed in only one task. CONCLUSION: The task of number counting contributes to the auditory differentiation in children with and without laryngeal lesion, by identifying deviations of greater intensity in children with laryngeal lesion.
OBJECTIVE (translated from the Portuguese version of the abstract): To compare the vowel emission and number counting tasks in the auditory-perceptual differentiation of children with and without laryngeal lesions. METHOD: Observational, analytical, cross-sectional study. A database from a doctoral research project was used, containing the results of laryngological evaluations and recordings of vocal samples from 44 children, who were divided into a group without laryngeal lesion (WOLL), with 33 children, and a group with laryngeal lesion (WLL), with 11 children. For the auditory-perceptual evaluation, the vocal samples were separated by task type and analyzed separately by a judge, who rated the overall degree of vocal deviation and indicated whether the child would pass or fail in a screening situation. RESULTS: The WOLL and WLL groups differed in the overall degree of vocal deviation for the number task, with a predominance of mild deviations in WOLL and moderate deviations in WLL. In the screening, the groups differed in the counting task, with more failures in WLL. The groups were similar in the vowel task, both in the intensity of the deviation and in the screening result. Most children in WLL failed both tasks in the vocal screening situation, unlike the children in WOLL, who in general failed only one task. CONCLUSION: The number counting task contributes to the auditory differentiation of children with and without laryngeal lesions by identifying deviations of greater intensity in children with a lesion.
Subject(s)
Voice Disorders , Voice , Humans , Child , Speech , Voice Quality , Cross-Sectional Studies , Voice Disorders/diagnosis , Speech Acoustics
ABSTRACT
BACKGROUND: For the professions of audiology and speech-language therapy (A/SLT), there continues to be a dire need for more equitable services. There is therefore a need to develop emerging practices that have a specific focus on equity as a driving force in shifting practices. This scoping review aimed to synthesise the characteristics of emerging practices in A/SLT clinical practice in relation to equity, with an emphasis on communication professions. METHODS: This scoping review followed the Joanna Briggs Institute guidelines and aimed to map the emerging practices in A/SLT to identify the ways in which the professions are developing equitable practices. Papers were included if they addressed equity, focused on clinical practice and were situated within A/SLT literature. There were no time or language restrictions. The review included all sources of evidence across PubMed, Scopus, EbscoHost, The Cochrane Library and Dissertation Abstracts International, Education Resource Information Centre from their inception. The review uses PRISMA Extension for scoping reviews and PRISMA-Equity Extension reporting guidelines. RESULTS: The 20 included studies ranged from 1997-2020, spanning over 20 years. There were a variety of papers including empirical studies, commentaries, reviews and research. The results demonstrated that the professions were increasingly considering addressing equity through their practice. However, there was a prominent focus on culturally and linguistically diverse populations, with limited engagement with other intersections of marginalisation. The results also showed that, while the majority of contributions to theorising equity are from the Global North, a small cluster from the Global South offers critical contributions considering social categories such as race and class. Collectively, the contributions from the Global South remain a very small minority of the professional discourse that focuses on equity. CONCLUSION: Over the last eight years, the A/SLT professions have increasingly been developing emerging practices to advance equity by engaging with marginalised communities. However, the professions have a long way to go to achieve equitable practice. The decolonial lens acknowledges the impact and influence of colonisation and coloniality in shaping inequity. Using this lens, we argue for the need to consider communication as a key aspect of health necessary to achieve health equity.
Subject(s)
Audiology , Language Therapy , Humans , Language Therapy/education , Speech , Speech Therapy/education , Professional Practice
ABSTRACT
Voice-based depression detection methods have been studied worldwide as an objective and easy way to detect depression. Conventional studies estimate the presence or severity of depression. However, estimating symptoms is necessary not only to treat depression but also to relieve patients' distress. Hence, we studied a method that clusters symptoms from the HAM-D (Hamilton Depression Rating Scale) scores of depressed patients and assigns patients to the resulting symptom groups based on acoustic features of their speech. We could separate different symptom groups with an accuracy of 79%. The results suggest that the voice can be used to estimate the symptoms associated with depression.
Subject(s)
Major Depressive Disorder , Voice , Humans , Depression , Major Depressive Disorder/diagnosis , Speech , Acoustics
ABSTRACT
BACKGROUND: The aim of this study was to investigate real-life speech levels of health professionals during communication with older inpatients in small group settings. METHODS: This is a prospective observational study assessing group interactions between geriatric inpatients and health professionals in a geriatric rehabilitation unit of a tertiary university hospital (Bern, Switzerland). We measured speech levels of health professionals during three typical group interactions (discharge planning meeting (n = 21), chair exercise group (n = 5), and memory training group (n = 5)) with older inpatients. Speech levels were measured using the CESVA LF010 (CESVA instruments s.l.u., Barcelona, Spain). A threshold of <60 dBA was defined as a potentially inadequate speech level. RESULTS: Overall, mean talk time of recorded sessions was 23.2 (standard deviation 8.3) minutes. The mean proportion of talk time with potentially inadequate speech levels was 61.6% (sd 32.0%). The mean proportion of talk time with potentially inadequate speech levels was significantly higher in chair exercise groups (95.1% (sd 4.6%)) compared to discharge planning meetings (54.8% (sd 32.5%), p = 0.01) and memory training groups (56.3% (sd 25.4%), p = 0.01). CONCLUSIONS: Our data show that real-life speech level differs between various types of group settings and suggest potentially inadequate speech levels by healthcare professionals requiring further study.
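The outcome measure described above, the proportion of talk time spent below the 60 dBA threshold, can be illustrated with a minimal Python sketch. It assumes the sound level meter yields equally spaced A-weighted samples taken only while a health professional is talking; the function name `inadequate_fraction` and the sample values are hypothetical and are not taken from the study.

```python
def inadequate_fraction(levels_dba, threshold_dba=60.0):
    """Return the fraction of samples whose level is strictly below threshold_dba.

    Assumption: samples are equally spaced in time, so the fraction of
    samples equals the fraction of talk time.
    """
    if not levels_dba:
        raise ValueError("at least one level sample is required")
    below = sum(1 for level in levels_dba if level < threshold_dba)
    return below / len(levels_dba)


# Hypothetical A-weighted levels (dBA) for one short recording.
samples = [55.0, 58.5, 61.0, 63.2, 59.9, 60.0, 57.1, 62.4]
fraction = inadequate_fraction(samples)
print(f"{fraction:.1%} of talk time below 60 dBA")
```

A sample at exactly 60 dBA is not counted as inadequate here, which matches the "<60 dBA" wording of the abstract.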
Subject(s)
Inpatients , Speech , Humans , Aged , Health Personnel , Communication , Delivery of Health Care
ABSTRACT
Acoustic cues of voice gender influence not only how people perceive the speaker's gender (e.g., whether that person is a man, woman, or non-binary) but also how they perceive certain phonemes produced by that person. One such sociophonetic cue is the [s]/[ʃ] distinction in English; which phoneme is perceived depends on the perceived gender of the speaker. Recent research has shown that gender expansive people differ from cisgender people in their perception of voice gender and thus, this could be reflected in their categorization of sibilants. Despite this, there has been no research to date on how gender expansive people categorize sibilants. Furthermore, while voice gender expression is often discussed within a biological context (e.g., vocal folds), voice extends to those who use other communication methods. The current study fills this gap by explicitly recruiting people of all genders and asking them to perform a sibilant categorization task using synthetic voices. The results show that cisgender and gender expansive people perceive synthetic sibilants differently, especially from a "nonbinary" synthetic voice. These results have implications for developing more inclusive speech technology for gender expansive individuals, in particular for nonbinary people who use speech-generating devices.
Subject(s)
Speech , Voice , Humans , Female , Male , Cues (Psychology) , Language , Perception
ABSTRACT
The ubiquity of hate speech affects people's attitudes and behavior. Exposure to hate speech can lead to prejudice, dehumanization, and lack of empathy towards members of outgroups. However, the impact of exposure to hate speech on empathy and propensity to attribute mental states to others has never been directly tested empirically. In this fMRI study, we examine the effects of exposure to hate speech on neural mechanisms of empathy towards ingroup (Poles) versus outgroup members (Arabs). Thirty healthy young adults were randomly assigned to 2 groups: hateful and neutral. During the fMRI study, they were initially exposed to hateful or neutral comments and subsequently to narratives depicting Poles and Arabs in pain. Using whole-brain and region of interest analysis, we showed that exposure to derogatory language about migrants attenuates the brain response to someone else's pain in the right temporoparietal junction (rTPJ), irrespective of group membership (Poles or Arabs). Given that rTPJ is associated with processes relevant to perspective-taking, its reduced activity might be related to a decreased propensity to take the psychological perspective of others. This finding suggests that hate speech affects human functioning beyond intergroup relations.
Subject(s)
Hate , Speech , Young Adult , Humans , Empathy , Brain/diagnostic imaging , Brain/physiology , Pain/psychology
ABSTRACT
In the current study, we asked whether delays in the earliest stages of picture naming elicit disfluency. To address this question, we used a network task, where participants describe the route taken by a marker through visually presented networks of objects. Additionally, given that disfluencies are arguably multifactorial, we combined this task with eye tracking, to be able to disentangle disfluency related to word preparation from other factors (e.g., stalling strategy). We used visual blurring, which hinders visual identification of the items and thereby slows down selection of a lexical concept. We tested the effect of this manipulation on disfluency production and visual attention. Blurriness did not lead to more disfluency on average and viewing times decreased with blurred pictures. However, multivariate pattern analyses revealed that a classifier could predict above chance, from the pattern of disfluency, whether each participant was about to name blurred or control pictures. Impeding the conceptual generation of a message therefore affected the pattern of disfluencies of each participant individually, but this pattern was not consistent from one participant to another. Additionally, some of the disfluency and eye-movement variables correlated with individual cognitive differences, in particular with inhibition.
Subject(s)
Eye Movements , Speech , Humans , Speech Production Measurement , Multivariate Analysis , Psychological Inhibition
ABSTRACT
Speech enhancement for audio with a low signal-to-noise ratio (SNR) is challenging. Existing speech enhancement methods are mainly designed for high-SNR audio, and they usually use RNNs to model audio sequence features, which leaves the model unable to learn long-distance dependencies and thus limits its performance in low-SNR speech enhancement tasks. We design a complex transformer module with sparse attention to overcome this problem. Unlike the traditional transformer, this model is extended to effectively model complex-domain sequences: a sparse attention mask balances the model's attention between long-distance and nearby relations, a pre-layer positional embedding module enhances the model's perception of position information, and a channel attention module enables the model to dynamically adjust the weight distribution between channels according to the input audio. The experimental results show that, in low-SNR speech enhancement tests, our models achieve noticeable improvements in both speech quality and intelligibility.
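The abstract does not specify the sparsity pattern, so the following Python sketch only illustrates the general idea of a sparse attention mask that mixes nearby and long-distance relations. It assumes one common pattern, a local window combined with strided positions; the function name `sparse_attention_mask` and the parameters `window` and `stride` are hypothetical and should not be read as the authors' design.

```python
def sparse_attention_mask(seq_len, window=2, stride=4):
    """Build a boolean mask where mask[i][j] is True if frame i may attend to frame j.

    Assumed pattern (not taken from the paper):
      - local:   |i - j| <= window       (nearby relations)
      - strided: (i - j) % stride == 0   (long-distance relations)
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            local = abs(i - j) <= window
            strided = (i - j) % stride == 0
            row.append(local or strided)
        mask.append(row)
    return mask


# Render a small mask: '#' marks an allowed query-key pair, '.' a masked one.
for row in sparse_attention_mask(8, window=1, stride=4):
    print("".join("#" if allowed else "." for allowed in row))
```

In a transformer, masked positions would typically have their attention scores set to a large negative value before the softmax, so each frame attends to a fixed small set of positions instead of the whole sequence.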