Automated Speech Recognition in Adult Stroke Survivors: Comparing Human and Computer Transcriptions.

Jacks, Adam; Haley, Katarina L; Bishop, Gary; Harmon, Tyson G

Affiliation

Jacks A; Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA, adam_jacks@med.unc.edu.
Haley KL; Division of Speech and Hearing Sciences, Department of Allied Health Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
Bishop G; Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
Harmon TG; Department of Communication Disorders, Brigham Young University, Provo, Utah, USA.

Folia Phoniatr Logop ; 71(5-6): 286-296, 2019.

Article in En | MEDLINE | ID: mdl-31117105

ABSTRACT

OBJECTIVE:

Speech sound errors are common in people with a variety of communication disorders and can result in impaired message transmission to listeners. Valid and reliable metrics exist to quantify this problem, but they are rarely used in clinical settings due to the time-intensive nature of speech transcription by humans. Automated speech recognition (ASR) technologies have advanced substantially in recent years, enabling them to serve as realistic proxies for human listeners. This study aimed to determine how closely transcription scores from human listeners correspond to scores from an ASR system.

PATIENTS AND METHODS:

Sentence recordings from 10 stroke survivors with aphasia and apraxia of speech were transcribed orthographically by 3 listeners and a web-based ASR service. Adjusted transcription scores were calculated for all samples based on accuracy of transcribed content words.

RESULTS:

As expected, transcription scores were significantly higher for the humans than for ASR. However, intraclass correlations revealed excellent agreement among the humans and ASR systems, and the systematically lower scores for computer speech recognition were effectively equalized simply by adding the regression intercept.
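
The intercept adjustment mentioned above can be illustrated with a minimal sketch. It assumes an ordinary least squares fit of human scores on ASR scores; the abstract does not specify the regression procedure, and the function names and score values below are hypothetical, not data from the study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def add_intercept(asr_scores, intercept):
    """Shift each ASR score upward by the regression intercept."""
    return [s + intercept for s in asr_scores]


# Hypothetical transcription scores (percent of content words correct),
# invented for illustration only.
asr = [20.0, 40.0, 60.0, 80.0]
human = [35.0, 55.0, 75.0, 95.0]

intercept, slope = fit_line(asr, human)
adjusted = add_intercept(asr, intercept)
print(intercept, slope, adjusted)
```

Adding only the intercept brings the two sets of scores into line when the fitted slope is close to 1, that is, when the ASR scores are lower than the human scores by a roughly constant amount.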

CONCLUSIONS:

The results suggest the clinical feasibility of supplementing or substituting human transcriptions with computer-generated scores, though extension to other speech disorders requires further research.

Subject(s)

Aphasia/rehabilitation; Apraxias/rehabilitation; Speech Recognition Software; Stroke Rehabilitation/methods; Survivors; Adult; Aged; Female; Humans; Male; Middle Aged; Speech Intelligibility

Key words

Aphasia; Assessment; Automated speech recognition; Intelligibility; Speech transcription; Stroke

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Aphasia / Apraxias / Survivors / Speech Recognition Software / Stroke Rehabilitation | Limits: Adult / Aged / Female / Humans / Male / Middle Aged | Language: En | Journal: Folia Phoniatr Logop | Journal subject: Speech and Language Pathology | Year: 2019 | Document type: Article