Results 1 - 20 of 5,214
1.
J Acoust Soc Am ; 152(4): 2013, 2022 10.
Article in English | MEDLINE | ID: mdl-36319233

ABSTRACT

The purpose of this investigation was to determine if a group of listeners having thresholds at 4 kHz exceeding 7.5 dB HL, and no more than "slight" hearing loss, would exhibit degradations in performance when "target" stimuli were masked tokens of speech. Intelligibility thresholds and detection thresholds were measured separately for speech masked by flat-spectrum noise or speech-shaped noise. Both NoSo and NoSπ configurations were employed. Consistent with findings of earlier investigations, when maskers and speech tokens were broadband, NoSo and NoSπ detection thresholds were substantially lower than intelligibility thresholds. More importantly, for the small cohorts tested, mean thresholds obtained from the ≤7.5 dB and >7.5 dB groups were equivalent. When maskers and speech targets were high-pass filtered at 500 Hz and above, the mean intelligibility thresholds obtained from the >7.5 dB group were about 4 dB higher than those obtained from the ≤7.5 dB group, independent of masker type and interaural configuration of the stimuli. In real-world listening situations, such deficits may manifest themselves as substantially reduced speech intelligibility and, perhaps, increased "listening effort" for listeners whose thresholds at 4 kHz exceed 7.5 dB HL and who have no more than "slight" hearing loss.


Subjects
Deafness, Hearing Loss, Speech Perception, Humans, Speech, Auditory Threshold, Noise, Perceptual Masking, Speech Intelligibility
2.
Am J Speech Lang Pathol ; 31(6): 2789-2805, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36327495

ABSTRACT

PURPOSE: This study investigated the effects of three clear speech variants on sentence intelligibility and speaking effort for speakers with Parkinson's disease (PD) and age- and sex-matched neurologically healthy controls. METHOD: Fourteen speakers with PD and 14 neurologically healthy speakers participated. Each speaker was recorded reading 18 sentences from the Speech Intelligibility Test in their habitual speaking style and for three clear speech variants: clear (SC; given instructions to speak clearly), hearing impaired (HI; given instructions to speak with someone with a hearing impairment), and overenunciate (OE; given instructions to overenunciate each word). Speakers rated the amount of physical and mental effort exerted during each speaking condition using visual analog scales (averaged to yield a metric of overall speaking effort). Sentence productions were orthographically transcribed by 50 naive listeners. Linear mixed-effects models were used to compare intelligibility and speaking effort across the clear speech variants. RESULTS: Intelligibility was reduced for the PD group in comparison to the control group only in the habitual condition. All clear speech variants significantly improved intelligibility above habitual levels for the PD group, with OE maximizing intelligibility, followed by the SC and HI conditions. Both groups rated speaking effort to be significantly higher for both the OE and HI conditions versus the SC and habitual conditions. DISCUSSION: For speakers with PD, all clear speech variants increased intelligibility to a level comparable to that of healthy controls. All clear speech variants were also associated with higher levels of speaking effort than habitual speech for the speakers with PD. Clinically, findings suggest that clear speech training programs consider using the instruction "overenunciate" for maximizing intelligibility. Future research is needed to identify if high levels of speaking effort elicited by the clear speech variants affect long-term sustainability of the intelligibility benefit.
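
The statistical approach described above (linear mixed-effects models of intelligibility across speaking conditions and groups) can be sketched as follows; this is a minimal illustration with a hypothetical data file and column names, not the authors' analysis code.

```python
# Minimal sketch (not the authors' code): linear mixed-effects model of
# intelligibility by speaking condition and group, with a random intercept per
# speaker. The CSV file and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("intelligibility_scores.csv")  # columns: speaker, group, condition, intelligibility

model = smf.mixedlm(
    "intelligibility ~ C(condition, Treatment('habitual')) * C(group, Treatment('control'))",
    data=df,
    groups=df["speaker"],  # random intercept for each speaker
)
result = model.fit()
print(result.summary())
```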


Subjects
Hearing Loss, Parkinson Disease, Humans, Speech Acoustics, Parkinson Disease/complications, Parkinson Disease/diagnosis, Naphazoline, Speech Intelligibility, Speech Production Measurement, Hearing Loss/complications, Dysarthria/etiology, Dysarthria/complications
3.
Trends Hear ; 26: 23312165221134378, 2022.
Article in English | MEDLINE | ID: mdl-36437739

ABSTRACT

Unhindered auditory and visual signals are essential for sufficient speech understanding by cochlear implant (CI) users. Face masks are an important hygiene measure against COVID-19 but disrupt these signals. This study determined the extent and the mechanisms of the alteration in speech intelligibility caused by different face masks in CI users. The audiovisual German matrix sentence test was used to determine speech reception thresholds (SRT) in noise in different conditions (audiovisual, audio-only, speechreading, and masked audiovisual using two different face masks). Thirty-seven CI users and ten normal-hearing listeners (NH) were included. CI users showed a deterioration in SRT of 5.0 dB due to the surgical mask and 6.5 dB due to the FFP2 mask compared with the audiovisual condition without a mask. The greater part of the SRT deterioration caused by the masks could be accounted for by the loss of the visual signal (up to 4.5 dB). The effect of each mask was significantly larger in CI users who hear exclusively with their CI (surgical: 7.8 dB, p = 0.005; FFP2: 8.7 dB, p = 0.01) than in NH listeners (surgical: 3.8 dB; FFP2: 5.1 dB). This study confirms that CI users who rely exclusively on their CI for hearing are particularly susceptible. Therefore, visual signals should be made accessible for communication whenever possible, especially when communicating with CI users.


Subjects
COVID-19, Cochlear Implants, Speech Perception, Humans, Masks/adverse effects, Pandemics, Speech Intelligibility
4.
Trends Hear ; 26: 23312165221134003, 2022.
Article in English | MEDLINE | ID: mdl-36426573

ABSTRACT

Pupillometry data are commonly reported relative to a baseline value recorded in a controlled pre-task condition. In this study, the influence of the experimental design and the preparatory processing related to task difficulty on the baseline pupil size was investigated during a speech intelligibility in noise paradigm. Furthermore, the relationship between the baseline pupil size and the temporal dynamics of the pupil response was assessed. The analysis revealed strong effects of block presentation order, within-block sentence order and task difficulty on the baseline values. An interaction between signal-to-noise ratio and block order was found, indicating that baseline values reflect listener expectations arising from the order in which the different blocks were presented. Furthermore, the baseline pupil size was found to affect the slope, delay and curvature of the pupillary response as well as the peak pupil dilation. This suggests that baseline correction might be sufficient when reporting pupillometry results in terms of mean pupil dilation only, but not when a more complex characterization of the temporal dynamics of the response is considered. By clarifying which factors affect baseline pupil size and how baseline values interact with the task-evoked response, the results from the present study can contribute to a better interpretation of the pupillary response as a marker of cognitive processing.
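
Baseline correction of pupil traces, as discussed in this abstract, is typically done by referencing each trial to a pre-stimulus window. A minimal sketch, assuming evenly sampled 1-D pupil traces and a hypothetical onset time, follows.

```python
# Minimal sketch (assumptions: trace is an evenly sampled 1-D pupil-size array,
# fs is the sampling rate in Hz, and the baseline is the 1 s preceding onset).
import numpy as np

def baseline_correct(trace: np.ndarray, fs: float, onset_s: float,
                     baseline_s: float = 1.0, mode: str = "subtractive") -> np.ndarray:
    """Return the trace corrected against its pre-stimulus baseline."""
    onset = int(round(onset_s * fs))
    start = max(0, onset - int(round(baseline_s * fs)))
    baseline = np.nanmean(trace[start:onset])     # mean pupil size in the baseline window
    if mode == "subtractive":
        return trace - baseline                   # absolute dilation relative to baseline
    if mode == "divisive":
        return (trace - baseline) / baseline      # relative (proportional) dilation
    raise ValueError("mode must be 'subtractive' or 'divisive'")
```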


Subjects
Noise, Pupil, Humans, Pupil/physiology, Noise/adverse effects, Speech Intelligibility/physiology, Signal-To-Noise Ratio
5.
Behav Neurol ; 2022: 1224680, 2022.
Article in English | MEDLINE | ID: mdl-36225387

ABSTRACT

The purpose of the study was to investigate how much of the variance in the speech intelligibility of individuals with Parkinson's disease (PD) could be predicted by seven speech fluency indicators (i.e., repetition, omission, distortion, correction, unfilled pauses, filled pauses, and speaking rate). Speech data were retrieved from a database containing a reading task produced by a group of 16 English-speaking individuals with PD (Jaeger, Trivedi & Stadtchnitzer, 2019). The results of a multiple regression indicated that an additional 54% of the variance in speech intelligibility scores among individuals with PD could be accounted for after the speakers' PD severity level, measured using Hoehn and Yahr's (1967) disease stages, was included as a covariate. In addition, omission and correction were the two fluency indicators that contributed to the general intelligibility score in a statistically significant way. Specifically, for every one-unit increase in the number of corrections and omissions, speech intelligibility scores declined by 0.687 and 0.131 points (on a 7-point scale), respectively. The current study hence supports Magee, Copland, and Vogel's (2019) view that language production abilities and quantified dysarthria measures among individuals with PD should be explored together. Clinical implications of the current findings are also discussed.
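
The hierarchical regression reported above (fluency indicators entered after a disease-severity covariate, with the change in explained variance as the quantity of interest) could be reproduced in outline as follows; the data file and column names are hypothetical.

```python
# Minimal sketch (not the study's code): hierarchical multiple regression that
# adds fluency indicators on top of a disease-severity covariate and reports
# the change in R^2. DataFrame columns are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pd_intelligibility.csv")

base = smf.ols("intelligibility ~ hoehn_yahr_stage", data=df).fit()
full = smf.ols(
    "intelligibility ~ hoehn_yahr_stage + repetition + omission + distortion"
    " + correction + unfilled_pauses + filled_pauses + speaking_rate",
    data=df,
).fit()

print(f"R^2 covariate only: {base.rsquared:.3f}")
print(f"R^2 full model:     {full.rsquared:.3f}")
print(f"Delta R^2:          {full.rsquared - base.rsquared:.3f}")
print(full.params[["correction", "omission"]])  # per-unit change in intelligibility
```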


Subjects
Parkinson Disease, Speech Intelligibility, Dysarthria, Humans, Language, Speech Production Measurement/methods
6.
J Med Internet Res ; 24(10): e40567, 2022 10 20.
Article in English | MEDLINE | ID: mdl-36264608

ABSTRACT

BACKGROUND: Most individuals with Parkinson disease (PD) experience a degradation in their speech intelligibility. Research on the use of automatic speech recognition (ASR) to assess intelligibility is still sparse, especially when trying to replicate communication challenges in real-life conditions (ie, noisy backgrounds). Developing technologies to automatically measure intelligibility in noise can ultimately assist patients in self-managing their voice changes due to the disease. OBJECTIVE: The goal of this study was to pilot-test and validate the use of a customized web-based app to assess speech intelligibility in noise in individuals with dysarthria associated with PD. METHODS: In total, 20 individuals with dysarthria associated with PD and 20 healthy controls (HCs) recorded a set of sentences using their phones. The Google Cloud ASR API was used to automatically transcribe the speakers' sentences. An algorithm was created to embed speakers' sentences in +6-dB signal-to-noise multitalker babble. Results from ASR performance were compared to those from 30 listeners who orthographically transcribed the same set of sentences. Data were reduced into a single event, defined as a success if the artificial intelligence (AI) system transcribed a random speaker or sentence as well or better than the average of 3 randomly chosen human listeners. These data were further analyzed by logistic regression to assess whether AI success differed by speaker group (HCs or speakers with dysarthria) or was affected by sentence length. A discriminant analysis was conducted on the human listener data and AI transcriber data independently to compare the ability of each data set to discriminate between HCs and speakers with dysarthria. RESULTS: The data analysis indicated a 0.8 probability (95% CI 0.65-0.91) that AI performance would be as good or better than the average human listener. AI transcriber success probability was not found to be dependent on speaker group. AI transcriber success was found to decrease with sentence length, losing an estimated 0.03 probability of transcribing as well as the average human listener for each word increase in sentence length. The AI transcriber data were found to offer the same discrimination of speakers into categories (HCs and speakers with dysarthria) as the human listener data. CONCLUSIONS: ASR has the potential to assess intelligibility in noise in speakers with dysarthria associated with PD. Our results hold promise for the use of AI with this clinical population, although a full range of speech severity needs to be evaluated in future work, as well as the effect of different speaking tasks on ASR.
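
The abstract describes embedding speakers' sentences in +6-dB signal-to-noise multitalker babble. A common way to mix speech and babble at a target SNR is RMS-based scaling, sketched below; this is an assumption about the general technique, not the study's exact algorithm, and the file names are placeholders.

```python
# Minimal sketch: mix a speech signal with multitalker babble at a target SNR.
# RMS-based scaling and file names are assumptions, not the study's algorithm.
import numpy as np
import soundfile as sf

def mix_at_snr(speech: np.ndarray, babble: np.ndarray, snr_db: float) -> np.ndarray:
    babble = babble[: len(speech)]                       # trim babble to speech length
    rms_s = np.sqrt(np.mean(speech ** 2))
    rms_b = np.sqrt(np.mean(babble ** 2))
    gain = rms_s / (rms_b * 10 ** (snr_db / 20))         # scale babble for the desired SNR
    mixture = speech + gain * babble
    return mixture / max(1.0, np.max(np.abs(mixture)))   # avoid clipping

speech, fs = sf.read("sentence.wav")
babble, _ = sf.read("babble.wav")
sf.write("sentence_plus6dB.wav", mix_at_snr(speech, babble, snr_db=6.0), fs)
```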


Subjects
Parkinson Disease, Speech Perception, Humans, Dysarthria/etiology, Dysarthria/complications, Parkinson Disease/complications, Artificial Intelligence, Speech Intelligibility
7.
Am J Speech Lang Pathol ; 31(6): 2688-2706, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36301994

ABSTRACT

PURPOSE: This pilot research project sought to determine if an intensive accent modification training program that included See the Sound-Visual Phonics and prosodic gestures improved articulation, prosody, and intelligibility measures in refugees from Burma. PARTICIPANTS: Four individuals (two men, two women) aged 20-67 participated in this study, and they were recruited from a state organization supporting refugees who have resettled in the United States. METHOD: All participants completed the Proficiency in Oral English Communication (POEC) and Assessment of Intelligibility of Dysarthric Speech (AIDS) to measure pre- and posttraining changes. The duration of this study was 6 weeks and consisted of 1 week of pretesting, 4 weeks of accent modification training, and 1 week of posttesting. Participants attended a total of twelve 50-min accent modification training sessions, including eight individual sessions (twice per week) and four group sessions (once per week), which provided a functional way to practice newly acquired skills in a scripted conversational-type format. Trained and untrained articulation and prosody probes were used to establish baselines and measure change. RESULTS: All four participants showed gains across articulation and prosody (in untrained and trained items). On pre- and posttest measures, three of the four participants also made gains on the broad measures of the AIDS and the POEC. CONCLUSION: Findings support that a brief and intensive multimodality accent modification program can be beneficial.


Subjects
Acquired Immunodeficiency Syndrome, Refugees, Male, Female, Humans, Speech Intelligibility, Myanmar, Hearing Tests
8.
J Speech Lang Hear Res ; 65(11): 4060-4070, 2022 Nov 17.
Article in English | MEDLINE | ID: mdl-36198057

ABSTRACT

PURPOSE: This study investigated whether listener processing of dysarthric speech requires the recruitment of more cognitive resources (i.e., higher levels of listening effort) than neurotypical speech. We also explored relationships between behavioral listening effort, perceived listening effort, and objective measures of word transcription accuracy. METHOD: A word recall paradigm was used to index behavioral listening effort. The primary task involved word transcription, whereas a memory task involved recalling words from previous sentences. Nineteen listeners completed the paradigm twice, once while transcribing dysarthric speech and once while transcribing neurotypical speech. Perceived listening effort was rated using a visual analog scale. RESULTS: Results revealed significant effects of dysarthria on the likelihood of correct word recall, indicating that the transcription of dysarthric speech required higher levels of behavioral listening effort relative to neurotypical speech. There was also a significant relationship between transcription accuracy and measures of behavioral listening effort, such that listeners who were more accurate in understanding dysarthric speech exhibited smaller changes in word recall when listening to dysarthria. The subjective measure of perceived listening effort did not have a statistically significant correlation with measures of behavioral listening effort or transcription accuracy. CONCLUSIONS: Results suggest that cognitive resources, particularly listeners' working memory capacity, are more taxed when deciphering dysarthric versus neurotypical speech. An increased demand on these resources may affect a listener's ability to remember aspects of their conversations with people with dysarthria, even when the speaker is fully intelligible.


Subjects
Speech Intelligibility, Speech Perception, Humans, Dysarthria/psychology, Speech Perception/physiology, Listening Effort, Auditory Perception
9.
J Acoust Soc Am ; 152(3): 1528, 2022 09.
Article in English | MEDLINE | ID: mdl-36182271

ABSTRACT

Speech production while wearing hearing protectors poses significant challenges due to their occlusion effect and disruption of the Lombard effect. An experiment was conducted with 24 individuals as they read a list of 12 sentences with open ears and while wearing an earmuff, in quiet and in four different noises [pink, International Female Fluctuating Masker (IFFM), speech-spectrum noise (SSnoise), and helicopter] at two levels (70 and 85 dBA). An acoustic manikin, fitted or not with an identical protector, served as the target listener. In noise, speech levels decreased when the talkers wore the earmuff but increased when the target listener was fitted with the earmuff. When the earmuff was used by both the talkers and the target listener, speech levels were lower by 3-6 dB at the higher noise level compared with when both had open ears. Speech levels were typically lower, but extended speech intelligibility index estimates were consistently higher, in fluctuating noises (IFFM, helicopter) than in continuous noises (pink, SSnoise). Talkers' pitch frequency and voice spectrum measurements followed the changes in speech levels very closely, showing no evidence of compensatory voice modifications. Implications of the lower talker speech levels when wearing hearing protectors are discussed in terms of protector selection, training, and individuals with hearing loss.


Subjects
Noise, Speech Perception, Ear Protective Devices, Female, Hearing, Humans, Noise/adverse effects, Perceptual Masking, Speech Intelligibility
10.
J Acoust Soc Am ; 152(3): 1573, 2022 09.
Article in English | MEDLINE | ID: mdl-36182275

ABSTRACT

Natural, conversational speech signals contain sources of symbolic and iconic information, both of which are necessary for the full understanding of speech. But speech intelligibility tests, which are generally derived from written language, present only symbolic information sources, including lexical semantics and syntactic structures. Speech intelligibility tests exclude almost all sources of information about talkers, including their communicative intentions and their cognitive states and processes. There is no reason to suspect that either hearing impairment or noise selectively affect perception of only symbolic information. We must therefore conclude that diagnosis of good or poor speech intelligibility on the basis of standard speech tests is based on measurement of only a fraction of the task of speech perception. This paper presents a descriptive comparison of information sources present in three widely used speech intelligibility tests and spontaneous, conversational speech elicited using a referential communication task. The aim of this comparison is to draw attention to the differences in not just the signals, but the tasks of listeners perceiving these different speech signals and to highlight the implications of these differences for the interpretation and generalizability of speech intelligibility test results.


Subjects
Speech Intelligibility, Speech Perception, Cognition, Language, Noise/adverse effects
11.
PLoS One ; 17(10): e0275779, 2022.
Article in English | MEDLINE | ID: mdl-36227836

ABSTRACT

PURPOSE: The current study investigated the therapeutic potential of transcranial direct current stimulation (tDCS) on speech intelligibility and on speech-related physiological and vocal functions among post-stroke dysarthric patients. METHOD: Nine chronic post-stroke dysarthric patients were randomly assigned to the stimulation or sham group. The stimulation group received 2 mA of anodal tDCS over the left inferior primary motor cortex for 15 minutes, while the sham group received 30 s of stimulation under the same settings. All participants received 10 daily 15-minute sessions of individualized speech therapy targeting their dominant phonological process or the phonemes with the greatest difficulty. The outcome measures included (1) perceptual analysis of single words, passage reading, and diadochokinetic rate, (2) acoustic analysis of a sustained vowel, and (3) kinematic analysis of rapid syllable repetitions and syllable production in sentences, conducted before and after the treatment. RESULTS: The results revealed that both the stimulation and sham groups had improved perceptual speech intelligibility at the word level, reduced short rushes of speech during passage reading, improved rate during the alternating motion rate task AMR-kha1, and improved articulatory kinematics in AMR-tha1 and in production of the syllables /tha1/ and /kha1/ in sentences. Compared with the sham group, the stimulation group showed significant improvement in articulatory kinematics in AMR-kha1 and in production of the syllable /kha1/ in sentences. The findings also showed that anodal stimulation led to a reduced shimmer value in sustained vowel /a/ phonation and positive changes in articulatory kinematics in AMR-tha1 and in production of the syllables /pha1/ and /kha1/ in sentences at the post-treatment measurement. In addition to positive effects on articulatory control, the reduced perturbation of voice amplitude documented in the stimulation group post treatment suggests possible tDCS effects on vocal function. CONCLUSIONS: The current study documented the beneficial effects of anodal tDCS over the primary motor cortex on speech production and suggested that combined tDCS and speech therapy may promote recovery from post-stroke dysarthria.


Subjects
Motor Cortex, Stroke, Transcranial Direct Current Stimulation, Humans, Pilot Projects, Speech Intelligibility, Stroke/complications, Stroke/therapy, Transcranial Direct Current Stimulation/methods
12.
Turk J Med Sci ; 52(2): 436-444, 2022 Apr.
Article in English | MEDLINE | ID: mdl-36161630

ABSTRACT

BACKGROUND: In our study, we aimed to evaluate hearing aid benefit and speech intelligibility with hearing aids, using objective and subjective measurements, according to the type of hearing loss in elderly individuals who used different types of hearing aids. METHODS: The objective and subjective findings from a total of 47 elderly individuals between the ages of 60 and 84, who had used hearing aids regularly for at least six months and who were diagnosed with different types and degrees of hearing loss, were evaluated retrospectively. RESULTS: The Adaptive Turkish matrix sentence test (ATMST) was carried out with binaural headphones, and a statistically significant difference was observed between the ATMST scores of individuals with symmetrical hearing loss. A significant difference was also found between the mean ATMST scores for individuals with symmetrical hearing loss (S0 N90 and S0 N270) and asymmetric hearing loss (S0 N0 and S0 N270) in the free field. A significant difference was found between the Abbreviated Profile of Hearing Aid Benefit satisfaction questionnaires administered before and after hearing aid use in all groups. DISCUSSION: The Turkish matrix sentence test (TMST) in noise can be used routinely in clinics to evaluate possible hearing difficulties in everyday environments and hearing aid effectiveness.


Subjects
Hearing Aids, Sensorineural Hearing Loss, Hearing Loss, Speech Perception, Aged, Child, Preschool Child, Sensorineural Hearing Loss/rehabilitation, Humans, Retrospective Studies, Speech Intelligibility
13.
JASA Express Lett ; 2(4): 045204, 2022 04.
Article in English | MEDLINE | ID: mdl-36154231

ABSTRACT

This study examined how speaking style and guise influence the intelligibility of text-to-speech (TTS) and naturally produced human voices. Results showed that TTS voices were less intelligible overall. Although using a clear speech style improved intelligibility for both human and TTS voices (using "newscaster" neural TTS), the clear speech effect was stronger for TTS voices. Finally, a visual device guise decreased intelligibility, regardless of voice type. The results suggest that both speaking style and visual guise affect intelligibility of human and TTS voices. Findings are discussed in terms of theories about the role of social information in speech perception.


Subjects
Speech Perception, Text Messaging, Voice, Cognition, Humans, Speech Intelligibility
14.
JASA Express Lett ; 2(2): 025201, 2022 02.
Article in English | MEDLINE | ID: mdl-36154263

ABSTRACT

Listeners can understand speech in noise by "glimpsing" some of the speech regions less affected by noise. This study investigates the contributions of those spectro-temporal regions, known as glimpses, at different energy levels to speech intelligibility in noise. Two listening experiments were conducted to examine the intelligibility of speech in different glimpse compositions in two types of noise. The results suggest that glimpsed spectro-temporal regions with energy above the mean noise level are the primary cue for speech perception in noise, and that listeners can use less-robust cues until at least 15 dB below the glimpsing threshold.
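
A glimpse analysis of the kind described above can be sketched by comparing the local power of separately available speech and noise signals in the time-frequency plane; the STFT settings and the 0-dB local-SNR criterion below are assumptions, not the study's exact parameters.

```python
# Minimal sketch: derive a "glimpse" mask from separately available speech and
# noise signals, keeping spectro-temporal cells whose local SNR exceeds a
# threshold (STFT settings and the 0 dB default are assumptions).
import numpy as np
from scipy.signal import stft

def glimpse_mask(speech: np.ndarray, noise: np.ndarray, fs: int,
                 threshold_db: float = 0.0) -> np.ndarray:
    """Boolean time-frequency mask: True where speech power exceeds noise power + threshold."""
    _, _, S = stft(speech, fs=fs, nperseg=512, noverlap=384)
    _, _, N = stft(noise, fs=fs, nperseg=512, noverlap=384)
    local_snr_db = 10 * np.log10((np.abs(S) ** 2 + 1e-12) / (np.abs(N) ** 2 + 1e-12))
    return local_snr_db > threshold_db

# The fraction of glimpsed cells is sometimes used as a rough intelligibility predictor:
# mask = glimpse_mask(speech, noise, fs); print(mask.mean())
```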


Subjects
Noise, Speech Perception, Auditory Perception, Cues, Noise/adverse effects, Speech Intelligibility
15.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 1972-1976, 2022 07.
Article in English | MEDLINE | ID: mdl-36086160

ABSTRACT

Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and these envelopes carry important intelligibility information for human speech communication. This study aimed to investigate whether a deep learning-based model with features of temporal envelope information could synthesize intelligible speech, and to study the effect of reducing the number of temporal envelopes (from 8 to 2 in this work) on the intelligibility of the synthesized speech. The objective evaluation metric of short-time objective intelligibility (STOI) showed that, on average, the speech synthesized by the proposed approach provided high STOI scores (i.e., 0.8) in each test condition, and the human listening test showed that the average word correct rate of eight listeners was higher than 97.5%. These findings indicate that the proposed deep learning-based system is a potential approach for synthesizing highly intelligible speech from limited envelope information in the future.
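
The STOI metric used for objective evaluation in this study can be computed with the open-source pystoi package; a minimal usage sketch, with placeholder file names, follows.

```python
# Minimal sketch: compute the short-time objective intelligibility (STOI) score
# between a clean reference and a synthesized/processed signal with pystoi
# (pip install pystoi). File names are placeholders.
import soundfile as sf
from pystoi import stoi

clean, fs = sf.read("clean.wav")
synth, fs2 = sf.read("synthesized.wav")
assert fs == fs2, "both signals must share one sampling rate"

n = min(len(clean), len(synth))                          # STOI expects equal-length signals
score = stoi(clean[:n], synth[:n], fs, extended=False)   # roughly 0 (unintelligible) to 1
print(f"STOI = {score:.2f}")
```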


Subjects
Deep Learning, Speech Perception, Auditory Perception, Humans, Speech Intelligibility, Time Factors
16.
PLoS One ; 17(9): e0272127, 2022.
Article in English | MEDLINE | ID: mdl-36107945

ABSTRACT

PURPOSE: It is well known that speech uses both the auditory and visual modalities to convey information. In cases of congenital sensory deprivation, the feedback language learners have access to for mapping visible and invisible orofacial articulation is impoverished. Although the effects of blindness on the movements of the lips, jaw, and tongue have been documented in francophone adults, not much is known about their consequences for speech intelligibility. The objective of this study is to investigate the effects of congenital visual deprivation on vowel intelligibility in adult speakers of Canadian French. METHOD: Twenty adult listeners performed two perceptual identification tasks in which vowels produced by congenitally blind adults and sighted adults were used as stimuli. The vowels were presented in the auditory, visual, and audiovisual modalities (experiment 1) and at different signal-to-noise ratios in the audiovisual modality (experiment 2). Correct identification scores were calculated. Sequential information analyses were also conducted to assess the amount of information transmitted to the listeners along the three vowel features of height, place of articulation, and rounding. RESULTS: The results showed that, although blind speakers did not differ from their sighted peers in the auditory modality, they had lower scores in the audiovisual and visual modalities. Some vowels produced by blind speakers are also less robust in noise than those produced by sighted speakers. CONCLUSION: Together, the results suggest that adult blind speakers have learned to adapt to their sensory loss so that they can successfully achieve intelligible vowel targets in non-noisy conditions but that they produce less intelligible speech in noisy conditions. Thus, the trade-off between visible (lips) and invisible (tongue) articulatory cues observed between vowels produced by blind and sighted speakers is not equivalent in terms of perceptual efficiency.


Subjects
Speech Acoustics, Speech Perception, Blindness/congenital, Canada, Humans, Speech Intelligibility, Speech Production Measurement
17.
J Acoust Soc Am ; 152(2): 970, 2022 08.
Article in English | MEDLINE | ID: mdl-36050149

ABSTRACT

The intelligibility of interrupted speech stimuli has been known to be almost perfect when segment duration is shorter than 80 ms, which means that the interrupted segments are perceptually organized into a coherent stream under this condition. However, why listeners can successfully group the interrupted segments into a coherent stream has been largely unknown. Here, we show that the intelligibility for mosaic speech in which original speech was segmented in frequency and time and noise-vocoded with the average power in each unit was largely reduced by periodical interruption. At the same time, the intelligibility could be recovered by promoting auditory grouping of the interrupted segments by stretching the segments up to 40 ms and reducing the gaps, provided that the number of frequency bands was enough ( ≥ 4) and the original segment duration was equal to or less than 40 ms. The interruption was devastating for mosaic speech stimuli, very likely because the deprivation of periodicity and temporal fine structure with mosaicking prevented successful auditory grouping for the interrupted segments.
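
The "mosaicking" step described above (averaging power within frequency-by-time units) can be sketched on a magnitude spectrogram as follows; the STFT settings and block sizes are assumptions, and resynthesis through a noise vocoder is omitted.

```python
# Simplified sketch of the "mosaicking" step: average spectrogram power inside
# coarse frequency-by-time blocks. STFT settings and block sizes are assumptions;
# resynthesis through a noise vocoder is not shown.
import numpy as np
from scipy.signal import stft

def mosaic_power(x: np.ndarray, fs: int, n_freq_bands: int = 4,
                 seg_ms: float = 40.0) -> np.ndarray:
    _, _, X = stft(x, fs=fs, nperseg=256, noverlap=128)
    power = np.abs(X) ** 2
    n_f, n_t = power.shape
    frames_per_seg = max(1, int(round(seg_ms / 1000 * fs / 128)))  # 128-sample hop
    out = np.empty_like(power)
    f_edges = np.linspace(0, n_f, n_freq_bands + 1, dtype=int)
    for fi in range(n_freq_bands):
        for t0 in range(0, n_t, frames_per_seg):
            block = (slice(f_edges[fi], f_edges[fi + 1]), slice(t0, t0 + frames_per_seg))
            out[block] = power[block].mean()        # one average value per mosaic cell
    return out
```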


Subjects
Speech Intelligibility, Speech Perception, Acoustic Stimulation, Noise
18.
Comput Intell Neurosci ; 2022: 4473952, 2022.
Article in English | MEDLINE | ID: mdl-36059405

ABSTRACT

A bone-conduction microphone (BCM) senses vibrations from the bones of the skull during speech and converts them into an electrical audio signal. When transmitting speech signals, BCMs capture speech based on the vibrations of the speaker's skull and have better noise-resistance capabilities than standard air-conduction microphones (ACMs). BCMs have a different frequency response than ACMs because they capture only the low-frequency portion of speech signals. Replacing an ACM with a BCM may give satisfactory noise suppression, but speech quality and intelligibility may suffer due to the nature of solid vibration. Mismatched BCM and ACM characteristics can also affect automatic speech recognition (ASR) performance, and it is impossible to recreate a new ASR system using voice data from BCMs. The speech intelligibility of a BCM-conducted speech signal depends on the location of the bone used to acquire the signal and on how accurately the phonemes of words are modeled. Deep learning techniques such as neural networks have traditionally been used for speech recognition; however, neural networks have a high computational cost and are unable to model phonemes in these signals. In this paper, the intelligibility of BCM speech was evaluated for different bone locations, namely the right ramus, larynx, and right mastoid. Listeners and deep learning architectures such as CapsuleNet, UNet, and S-Net were used to evaluate the speech intelligibility of BCM signals acquired for Tamil words. As validated by both the listeners and the deep learning architectures, the larynx location yielded the best speech intelligibility.


Subjects
Deep Learning, Speech Perception, India, Language, Speech Intelligibility/physiology
19.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 2581-2584, 2022 07.
Article in English | MEDLINE | ID: mdl-36085897

ABSTRACT

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals. Despite improving speech quality, such approaches do not deliver the required levels of speech intelligibility in everyday noisy environments. Intelligibility-oriented (I-O) loss functions have recently been developed to train DL approaches for robust speech enhancement. Here, we formulate, for the first time, a novel canonical correlation based I-O loss function to more effectively train DL algorithms. Specifically, we present a canonical-correlation based short-time objective intelligibility (CC-STOI) cost function to train a fully convolutional neural network (FCN) model. We carry out comparative simulation experiments to show that our CC-STOI based speech enhancement framework outperforms state-of-the-art DL models trained with conventional distance-based and STOI-based loss functions, using objective and subjective evaluation measures for both unseen speakers and unseen noises. Ongoing work is evaluating the proposed approach for the design of robust hearing-assistive technology.
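
The paper's CC-STOI loss is not reproduced here, but the general idea of an intelligibility-oriented, correlation-based training loss can be sketched as a simplified PyTorch function that maximizes the correlation between clean and enhanced short-time magnitude envelopes; all settings below are assumptions, and this is not the authors' CC-STOI formulation.

```python
# Simplified sketch of an intelligibility-oriented training loss: negative mean
# correlation between clean and enhanced short-time magnitude envelopes.
# NOT the paper's CC-STOI; STFT settings are assumptions.
import torch

def correlation_intelligibility_loss(clean: torch.Tensor, enhanced: torch.Tensor,
                                     n_fft: int = 512, hop: int = 256) -> torch.Tensor:
    win = torch.hann_window(n_fft, device=clean.device)
    C = torch.stft(clean, n_fft, hop_length=hop, window=win, return_complex=True).abs()
    E = torch.stft(enhanced, n_fft, hop_length=hop, window=win, return_complex=True).abs()
    C = C - C.mean(dim=-1, keepdim=True)           # zero-mean each frequency-band envelope
    E = E - E.mean(dim=-1, keepdim=True)
    num = (C * E).sum(dim=-1)
    den = C.norm(dim=-1) * E.norm(dim=-1) + 1e-8
    return -(num / den).mean()                     # maximizing correlation minimizes the loss
```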


Subjects
Deep Learning, Speech Intelligibility, Algorithms, Canonical Correlation Analysis, Hearing
20.
J Speech Lang Hear Res ; 65(11): 4498-4506, 2022 Nov 17.
Article in English | MEDLINE | ID: mdl-36179216

ABSTRACT

PURPOSE: Down syndrome occurs in one in 700 births, and high rates of hearing loss are reported in this population. This puts children with Down syndrome at risk for communication, learning, and social development difficulties, compounding known language and cognitive vulnerabilities in this population. The purpose of this study was to comprehensively characterize audiological profiles in children with Down syndrome, including the use of extended high-frequency sensitivity and speech intelligibility index assessment. METHOD: Participants were 18 children with Down syndrome between 5 and 17 years of age. Audiological profiles were characterized using behavioral audiometry, tympanometry, and wideband acoustic immittance (WAI). Audibility was characterized using the speech intelligibility index. RESULTS: Of the participants successfully completing behavioral audiometry, hearing loss of a moderate or greater degree was observed in one or both ears for 46% of the participants at conventional audiometric test frequencies and for 85% of the participants at frequencies above 8 kHz. Seven children met criteria for amplification based on the speech intelligibility index, but only two wore hearing aids. Abnormal middle ear function was found in approximately 50% of the participants for whom WAI or tympanometry was successfully measured. CONCLUSIONS: Consistent with prior research, high rates of hearing loss and middle ear dysfunction were observed. The high prevalence of hearing loss above 8 kHz suggests the importance of including extended high-frequency assessment in the audiologic characterization of children with Down syndrome. Few children meeting audibility-based guidelines for amplification wore hearing aids, putting them at additional risk for speech/language and educational difficulties. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.21200422.
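
The speech intelligibility index used to characterize audibility above is, in general form, a band-importance-weighted sum of band audibility. A simplified worked sketch follows; the band weights and the audibility mapping are illustrative only, not the ANSI S3.5 tabulated procedure.

```python
# Simplified sketch of the general SII form: SII = sum_i I_i * A_i, where I_i are
# band-importance weights (summing to 1) and A_i is band audibility in [0, 1].
# The weights and the 30-dB audibility mapping are illustrative, not ANSI S3.5 values.
import numpy as np

def sii(speech_band_db: np.ndarray, threshold_band_db: np.ndarray,
        importance: np.ndarray) -> float:
    importance = importance / importance.sum()                # normalize weights
    audibility = (speech_band_db - threshold_band_db) / 30.0  # map sensation level to [0, 1]
    audibility = np.clip(audibility, 0.0, 1.0)
    return float(np.sum(importance * audibility))

# Example with four illustrative bands
speech = np.array([55.0, 50.0, 45.0, 35.0])     # speech level per band (dB)
thresh = np.array([20.0, 25.0, 40.0, 55.0])     # listener threshold per band (dB)
weight = np.array([0.2, 0.3, 0.3, 0.2])         # illustrative importance weights
print(f"SII = {sii(speech, thresh, weight):.2f}")
```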


Subjects
Down Syndrome, Hearing Aids, Hearing Loss, Child, Humans, Down Syndrome/complications, Speech Intelligibility, Hearing Loss/diagnosis, Hearing Loss/epidemiology, Audiometry