
Linguistic disparities in cross-language automatic speech recognition transfer from Arabic to Tashlhiyt.

Zellou, Georgia; Lahrouchi, Mohamed.

Sci Rep ; 14(1): 313, 2024 01 03.

Article in English | MEDLINE | ID: mdl-38172277

ABSTRACT

Tashlhiyt is a low-resource language with respect to acoustic databases, language corpora, and speech technology tools, such as Automatic Speech Recognition (ASR) systems. This study investigates whether cross-language re-use of an existing, commercially available ASR system built for Arabic is viable for Tashlhiyt. The source and target languages in this case have similar phonological inventories, but Tashlhiyt permits typologically rare phonological patterns, including vowelless words, while Arabic does not. We find systematic disparities in ASR transfer performance (measured as word error rate (WER) and Levenshtein distance) for Tashlhiyt across word forms and speaking style variation. Overall, performance was worse for casual speaking modes across the board. In clear speech, performance was lower for vowelless than for voweled words. These results highlight systematic speaking-mode and phonotactic disparities in cross-language ASR transfer. They also indicate that linguistically informed approaches to ASR re-use can provide more effective ways to adapt existing speech technology tools for low-resource languages, especially when those languages contain typologically rare structures. The study also speaks to issues of linguistic disparities in ASR and speech technology more broadly. It can also contribute to understanding the extent to which machines are similar to, or different from, humans in mapping the acoustic signal to discrete linguistic representations.
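
The two metrics named in the abstract, word error rate (WER) and Levenshtein distance, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `levenshtein` and `word_error_rate` are hypothetical, WER is assumed here to be the word-level edit distance divided by the number of reference words, and the example strings are toy inputs rather than data from the study.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn `ref` into `hyp` (strings or lists of tokens)."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance normalised by reference length
    (assumed definition; whitespace tokenisation is a simplification)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # Toy character-level comparison: one deleted segment.
    print(levenshtein("fsin", "sin"))
    # Toy word-level comparison: two of three reference words are wrong.
    print(word_error_rate("sin ssin fsin", "sin sin sin"))
```

Character-level Levenshtein distance gives partial credit when an ASR output differs from the target by a single segment, whereas WER counts the whole word as an error, which is one reason a study might report both.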

Subject(s)

Speech Perception, Humans, Language, Linguistics, Speech, Speech Recognition Software

Clear speech in Tashlhiyt Berber: The perception of typologically uncommon word-initial contrasts by native and naive listeners.

Zellou, Georgia; Lahrouchi, Mohamed; Bensoukas, Karim.

J Acoust Soc Am ; 152(6): 3429, 2022 12.

Article in English | MEDLINE | ID: mdl-36586870

ABSTRACT

Tashlhiyt Berber is known for having typologically unusual word-initial phonological contrasts, specifically, word-initial singleton-geminate minimal pairs (e.g., sin vs ssin) and sequences of consonants that violate the sonority sequencing principle (e.g., non-rising sonority sequences: fsin). The current study investigates the role of a listener-oriented speaking style on the perceptual enhancement of these rarer phonological contrasts. It examines the perception of word-initial singleton, geminate, and complex onsets in Tashlhiyt Berber across clear and casual speaking styles by native and naive listeners. While clear speech boosts the discriminability of pairs containing singleton-initial words for both listener groups, only native listeners performed better in discriminating between initial singleton-geminate contrasts in clear speech. Clear speech did not improve perception for lexical contrasts containing a non-rising-sonority consonant cluster for either listener group. These results are discussed in terms of how clear speech can inform phonological typology and the role of phonetic enhancement in language-universal vs language-specific speech perception.

Subject(s)

Speech Perception, Speech, Language, Phonetics, Contrast Media
