Robust fundamental frequency estimation in sustained vowels: detailed algorithmic comparisons and information fusion with adaptive Kalman filtering.
Tsanas, Athanasios; Zañartu, Matías; Little, Max A; Fox, Cynthia; Ramig, Lorraine O; Clifford, Gari D.
Affiliation
  • Tsanas A; Institute of Biomedical Engineering, Department of Engineering Science, Old Road Campus Research Building, University of Oxford, Headington, Oxford OX3 7DQ, United Kingdom.
  • Zañartu M; Department of Electronic Engineering at Universidad Técnica Federico Santa María, Av. España 1680, Casilla 110-V, Valparaiso 2390123, Chile.
  • Little MA; MIT Media Lab, 77 Massachusetts Avenue, E14/E15, Cambridge, Massachusetts 02139-4307.
  • Fox C; National Center for Voice and Speech, 136 South Main Street, Suite 320, Salt Lake City, Utah 84101-1623.
  • Ramig LO; Speech, Language, and Hearing Sciences, 2501 Kittredge Loop Road, 409 UCB, University of Colorado, Boulder, Colorado 80309-0409.
  • Clifford GD; Institute of Biomedical Engineering, Department of Engineering Science, Old Road Campus Research Building, University of Oxford, Headington, Oxford OX3 7DQ, United Kingdom.
J Acoust Soc Am; 135(5): 2885-2901, 2014 May.
Article in English | MEDLINE | ID: mdl-24815269
ABSTRACT
There has been consistent interest among speech signal processing researchers in the accurate estimation of the fundamental frequency (F0) of speech signals. This study examines ten F0 estimation algorithms (some well-established and some proposed more recently) to determine which of these algorithms is, on average, better able to estimate F0 in the sustained vowel /a/. Moreover, a robust method for adaptively weighting the estimates of individual F0 estimation algorithms based on quality and performance measures is proposed, using an adaptive Kalman filter (KF) framework. The accuracy of the algorithms is validated using (a) a database of 117 synthetic realistic phonations obtained using a sophisticated physiological model of speech production and (b) a database of 65 recordings of human phonations where the glottal cycles are calculated from electroglottograph signals. On average, the sawtooth waveform inspired pitch estimator and the nearly defect-free algorithms provided the best individual F0 estimates, and the proposed KF approach resulted in a ∼16% improvement in accuracy over the best single F0 estimation algorithm. These findings may be useful in speech signal processing applications where sustained vowels are used to assess vocal quality, when very accurate F0 estimation is required.
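The general idea of fusing several F0 tracks with a Kalman filter can be illustrated with a minimal sketch. It is not the adaptive scheme proposed in the article: it assumes a scalar random-walk model for the true F0 and a fixed, user-supplied noise variance per algorithm, whereas the article adapts the weighting using quality and performance measures. The function name `fuse_f0_kalman` and the parameters `meas_var` and `process_var` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fuse_f0_kalman(estimates, meas_var, process_var=1.0):
    """Fuse per-frame F0 estimates from several algorithms (illustrative only).

    estimates   : array of shape (n_frames, n_algorithms), F0 in Hz
    meas_var    : array of shape (n_algorithms,), assumed noise variance
                  of each algorithm in Hz^2 (fixed here, not adapted)
    process_var : assumed frame-to-frame random-walk variance of the
                  true F0 in Hz^2
    """
    estimates = np.asarray(estimates, dtype=float)
    meas_var = np.asarray(meas_var, dtype=float)
    n_frames, n_alg = estimates.shape

    # Initialise with the inverse-variance weighted mean of the first frame.
    weights = 1.0 / meas_var
    x = np.sum(weights * estimates[0]) / np.sum(weights)
    p = 1.0 / np.sum(weights)

    fused = np.empty(n_frames)
    fused[0] = x
    for t in range(1, n_frames):
        # Predict: the random-walk model keeps the state and inflates its variance.
        p = p + process_var
        # Update: fold in each algorithm's estimate one at a time.
        for j in range(n_alg):
            gain = p / (p + meas_var[j])
            x = x + gain * (estimates[t, j] - x)
            p = (1.0 - gain) * p
        fused[t] = x
    return fused
```

In this sketch an algorithm with a large assumed variance receives a small Kalman gain, so its estimates contribute little to the fused track. An adaptive variant would update `meas_var` frame by frame instead of keeping it fixed.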
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Phonation / Phonetics / Algorithms Type of study: Diagnostic_studies / Prognostic_studies Limits: Humans Language: En Journal: J Acoust Soc Am Year: 2014 Document type: Article Country of affiliation: United Kingdom
