Automatic vocal tract landmark localization from midsagittal MRI data.

Eslami, Mohammad; Neuschaefer-Rube, Christiane; Serrurier, Antoine

Affiliation

Eslami M; Clinic for Phoniatrics, Pedaudiology & Communication Disorders, University Hospital and Medical Faculty, RWTH Aachen University, Aachen, Germany. meslami@ukaachen.de.
Neuschaefer-Rube C; Clinic for Phoniatrics, Pedaudiology & Communication Disorders, University Hospital and Medical Faculty, RWTH Aachen University, Aachen, Germany.
Serrurier A; Clinic for Phoniatrics, Pedaudiology & Communication Disorders, University Hospital and Medical Faculty, RWTH Aachen University, Aachen, Germany. aserrurier@ukaachen.de.

Sci Rep; 10(1): 1468, 2020 Jan 30.

Article in English | MEDLINE | ID: mdl-32001739

ABSTRACT

The various speech sounds of a language are obtained by varying the shape and position of the articulators surrounding the vocal tract. Analyzing their variations is crucial for understanding speech production, diagnosing speech disorders and planning therapy. Identifying key anatomical landmarks of these structures on medical images is a prerequisite for any quantitative analysis, and the rising amount of data generated in the field calls for an automatic solution. The challenge lies in the high inter- and intra-speaker variability, the mutual interaction between the articulators and the moderate quality of the images. This study addresses this issue for the first time and tackles it by means of Deep Learning. It proposes a dedicated network architecture named Flat-net, whose performance is evaluated and compared with eleven state-of-the-art methods from the literature. The dataset contains midsagittal anatomical Magnetic Resonance Images for 9 speakers sustaining 62 articulations, with 21 annotated anatomical landmarks per image. Results show that the Flat-net approach outperforms the former methods, leading to an overall Root Mean Square Error of 3.6 pixels/0.36 cm obtained in a leave-one-out procedure over the speakers. The implementation code is also shared publicly on GitHub.
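The evaluation described in the abstract (a Root Mean Square Error over landmarks, computed in a leave-one-out procedure over the speakers) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `landmark_rmse` and `leave_one_speaker_out` are hypothetical, the error is assumed to be the Euclidean distance between predicted and annotated landmark positions, and the pixel size of 0.1 cm is only inferred from the reported equivalence of 3.6 pixels and 0.36 cm.

```python
import math

# Assumption: 0.1 cm per pixel, inferred from "3.6 pixels/0.36 cm" in the abstract.
PIXEL_SIZE_CM = 0.1


def landmark_rmse(pred, truth):
    """RMSE in pixels over all images and landmarks.

    pred and truth are lists of images; each image is a list of (x, y)
    landmark coordinates. The per-landmark error is assumed to be the
    Euclidean distance between prediction and annotation.
    """
    squared = [
        (px - tx) ** 2 + (py - ty) ** 2
        for pred_img, truth_img in zip(pred, truth)
        for (px, py), (tx, ty) in zip(pred_img, truth_img)
    ]
    return math.sqrt(sum(squared) / len(squared))


def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_indices, test_indices) per speaker."""
    for held_out in sorted(set(speaker_ids)):
        train = [i for i, s in enumerate(speaker_ids) if s != held_out]
        test = [i for i, s in enumerate(speaker_ids) if s == held_out]
        yield held_out, train, test


# Toy data (not from the paper): 4 images, 21 landmarks each,
# every prediction offset by (3, 4) pixels, i.e. 5 pixels away.
truth = [[(0.0, 0.0)] * 21 for _ in range(4)]
pred = [[(3.0, 4.0)] * 21 for _ in range(4)]

rmse_px = landmark_rmse(pred, truth)
rmse_cm = rmse_px * PIXEL_SIZE_CM

# Toy speaker labels: one fold per distinct speaker.
folds = list(leave_one_speaker_out(["s1", "s1", "s2", "s3"]))
```

In an actual experiment, a model would be trained on the `train` indices of each fold and `landmark_rmse` would be computed on the held-out speaker's images, so that no speaker appears in both training and test data.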

Subject(s)

Anatomic Landmarks/diagnostic imaging; Magnetic Resonance Imaging; Speech; Anatomic Landmarks/anatomy & histology; Automation; Deep Learning; Epiglottis/anatomy & histology; Epiglottis/diagnostic imaging; Female; Glottis/anatomy & histology; Glottis/diagnostic imaging; Humans; Lip/anatomy & histology; Lip/diagnostic imaging; Male; Mouth/anatomy & histology; Mouth/diagnostic imaging; Nasopharynx/anatomy & histology; Nasopharynx/diagnostic imaging; Nose/anatomy & histology; Nose/diagnostic imaging; Tongue/anatomy & histology; Tongue/diagnostic imaging; Vocal Cords/anatomy & histology; Vocal Cords/diagnostic imaging; Voice

Full text: 1; Collection: 01-internacional; Database: MEDLINE; Main subject: Speech / Magnetic Resonance Imaging / Anatomic Landmarks; Limits: Female / Humans / Male; Language: English; Journal: Sci Rep; Year: 2020; Document type: Article; Country of affiliation: Germany
