
Multilingual end-to-end ASR for low-resource Turkic languages with common alphabets.

Bekarystankyzy, Akbayan; Mamyrbayev, Orken; Mendes, Mateus; Fazylzhanova, Anar; Assam, Muhammad.

Sci Rep ; 14(1): 13835, 2024 06 15.

Article in English | MEDLINE | ID: mdl-38879705

ABSTRACT

A reliable and accurate automatic speech recognition (ASR) model requires a sufficient amount of transcribed audio data for training. Many of the world's languages, especially the agglutinative languages of the Turkic family, lack this type of data. Numerous studies have sought better models for low-resource languages using different approaches, the most popular of which are multilingual training and transfer learning. In this study, we combined five agglutinative languages from the Turkic family (Kazakh, Bashkir, Kyrgyz, Sakha, and Tatar) for multilingual training using connectionist temporal classification and an attention mechanism with a language model, because these languages share cognate words, sentence formation rules, and an alphabet (Cyrillic). Data from the open-source Common Voice database were used to make the experiments reproducible. The experiments showed that multilingual training improved ASR performance for all languages included except Bashkir. The most dramatic result was achieved for Kyrgyz: the word error rate decreased to nearly one-fifth and the character error rate to one-fourth, which shows that this approach can be helpful for critically low-resource languages.
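The two metrics reported above, word error rate (WER) and character error rate (CER), are conventionally computed as the edit distance between a reference transcript and the recognizer output, divided by the reference length. The following is a minimal sketch of that conventional definition, not the authors' evaluation code; the function names and the example strings are hypothetical, and it assumes whitespace tokenization for words and counts spaces as characters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.

    Assumption: spaces are counted as characters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example: one substituted word out of three.
example_wer = wer("the cat sat", "the cat sit")
example_cer = cer("the cat sat", "the cat sit")
```

Because both functions operate on Python strings, the same sketch applies unchanged to Cyrillic transcripts. Evaluation toolkits may normalize case and punctuation before scoring, which this sketch does not do.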

Subject(s)

Language, Multilingualism, Humans, Machine Learning, Speech Recognition Software
