Phone duration modeling for speaker age estimation in children.
Shivakumar, Prashanth Gurunath; Bishop, Somer; Lord, Catherine; Narayanan, Shrikanth.
Affiliation
  • Shivakumar PG; Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, California 90089, USA.
  • Bishop S; Department of Psychiatry, University of California, San Francisco, California 94143, USA.
  • Lord C; Semel Institute of Neuroscience and Human Behavior, University of California, Los Angeles, California 90095, USA.
  • Narayanan S; Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, California 90089, USA.
J Acoust Soc Am; 152(5): 3000, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36456280
ABSTRACT
Automatic inference of paralinguistic information from speech, such as age, is an important area of research with many technological applications. Speaker age estimation can help with age-appropriate curation of information content and personalized interactive experiences. However, automatic speaker age estimation in children is challenging due to the paucity of speech data representing the developmental spectrum and the large signal variability, even within a given age group. Most prior approaches to child speaker age estimation adopt methods drawn directly from research on adult speech. In this paper, we propose a novel technique that exploits the temporal variability present in children's speech to estimate children's age. We focus on phone durations as a biomarker of children's age. Phone duration distributions are derived by forced-aligning children's speech with transcripts. Regression models are trained to predict speaker age among children from kindergarten through grade 10. Experiments on two children's speech datasets demonstrate the robustness and portability of the proposed features across multiple domains with varying signal conditions. The phonemes contributing most to child speaker age estimation are analyzed and presented. Experimental results suggest that phone durations contain important development-related information about children. The proposed features are also suited to low-data scenarios.
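As an illustration only, the pipeline described in the abstract (phone durations from forced alignment, then a regression on age) might be sketched as follows. This is not the authors' implementation: the alignment format, the function names `phone_duration_stats` and `fit_line`, the single-feature least-squares model, and all toy numbers are assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean, pstdev


def phone_duration_stats(alignment):
    """Map each phone label to (mean, std) of its durations in seconds.

    `alignment` is a list of (phone, start_sec, end_sec) tuples, an assumed
    stand-in for the output of a forced aligner.
    """
    durations = defaultdict(list)
    for phone, start, end in alignment:
        durations[phone].append(end - start)
    return {p: (mean(d), pstdev(d)) for p, d in durations.items()}


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical toy alignments keyed by speaker age in years (made-up values,
# not data from the paper).
toy_alignments = {
    6: [("AA", 0.00, 0.20), ("S", 0.20, 0.38), ("AA", 0.38, 0.60)],
    10: [("AA", 0.00, 0.15), ("S", 0.15, 0.27), ("AA", 0.27, 0.43)],
    14: [("AA", 0.00, 0.10), ("S", 0.10, 0.19), ("AA", 0.19, 0.31)],
}

ages = sorted(toy_alignments)
# Feature: mean duration of the phone "AA" for each speaker.
aa_means = [phone_duration_stats(toy_alignments[a])["AA"][0] for a in ages]

slope, intercept = fit_line(aa_means, ages)


def predict_age(mean_aa_duration):
    """Predict age from the mean "AA" duration using the fitted line."""
    return slope * mean_aa_duration + intercept
```

A fuller version would presumably use duration statistics for many phones and a multivariate regressor; the single-phone line fit here only shows the shape of the idea.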
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Academic Institutions / Telephone Study type: Prognostic_studies Limit: Adult / Child / Humans Language: En Journal: J Acoust Soc Am Year: 2022 Document type: Article Affiliation country: United States
