
Confused or not Confused?: Disentangling Brain Activity from EEG Data Using Bidirectional LSTM Recurrent Neural Networks.

Ni, Zhaoheng; Yuksel, Ahmet Cem; Ni, Xiuyan; Mandel, Michael I; Xie, Lei.

ACM BCB ; 2017: 241-246, 2017 Aug.

Article in English | MEDLINE | ID: mdl-28966996

ABSTRACT

Brain fog, also known as confusion, is one of the main reasons for low performance in learning or in any daily task that requires thinking. Detecting confusion in a person's mind in real time is a challenging and important task with applications such as online education and driver fatigue detection. In this paper, we apply Bidirectional LSTM Recurrent Neural Networks to EEG data to classify whether students are confused while watching online course videos. The results show that the Bidirectional LSTM model achieves state-of-the-art performance compared with other machine learning approaches and shows strong robustness as evaluated by cross-validation. We can predict whether or not a student is confused with an accuracy of 73.3%. Furthermore, we find that the most important feature for detecting confusion is the gamma 1 wave of the EEG signal. Our results suggest that machine learning is a potentially powerful tool for modeling and understanding brain activity.
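To illustrate the kind of model the abstract names, here is a minimal NumPy sketch of a bidirectional LSTM forward pass for binary classification. It is not the authors' implementation: the weights are random and untrained, and the sequence length, feature count, hidden size, and all function names are hypothetical choices made for this example. It shows only the core idea, that one LSTM reads the sequence forward, another reads it backward, and their final states are concatenated before a logistic output.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(n_in, n_hidden, rng):
    # One stacked weight matrix for the four gates (input, forget, cell, output).
    return {
        "W": rng.normal(0.0, 0.1, size=(4 * n_hidden, n_in + n_hidden)),
        "b": np.zeros(4 * n_hidden),
    }

def lstm_final_state(x, params):
    """Run an LSTM over x (shape: time x features); return the last hidden state."""
    n_hidden = params["b"].size // 4
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    for x_t in x:
        z = params["W"] @ np.concatenate([x_t, h]) + params["b"]
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # update hidden state
    return h

def bilstm_confusion_probability(x, fwd, bwd, w_out, b_out):
    """Concatenate forward and backward final states, then apply a logistic output."""
    h_fwd = lstm_final_state(x, fwd)
    h_bwd = lstm_final_state(x[::-1], bwd)  # same sequence, reversed in time
    features = np.concatenate([h_fwd, h_bwd])
    return float(sigmoid(w_out @ features + b_out))

# Hypothetical sizes and random stand-in data (assumptions, not from the paper).
rng = np.random.default_rng(0)
n_steps, n_features, n_hidden = 60, 11, 8
x = rng.normal(size=(n_steps, n_features))
fwd = init_lstm(n_features, n_hidden, rng)
bwd = init_lstm(n_features, n_hidden, rng)
w_out = rng.normal(0.0, 0.1, size=2 * n_hidden)
p = bilstm_confusion_probability(x, fwd, bwd, w_out, 0.0)
print(f"P(confused) = {p:.3f}")
```

In practice the weights would be learned from labeled EEG recordings, and accuracy would be estimated by cross-validation as the abstract describes; a deep learning library would normally replace this hand-written loop.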

Measuring time-frequency importance functions of speech with bubble noise.

Mandel, Michael I; Yoho, Sarah E; Healy, Eric W.

J Acoust Soc Am ; 140(4): 2542, 2016 Oct.

Article in English | MEDLINE | ID: mdl-27794278

ABSTRACT

Listeners can reliably perceive speech in noisy conditions, but it is not well understood what specific features of speech they use to do this. This paper introduces a data-driven framework to identify the time-frequency locations of these features. Using the same speech utterance mixed with many different noise instances, the framework is able to compute the importance of each time-frequency point in the utterance to its intelligibility. The mixtures have approximately the same global signal-to-noise ratio at each frequency, but very different recognition rates. The difference between these intelligible vs unintelligible mixtures is the alignment between the speech and spectro-temporally modulated noise, providing different combinations of "glimpses" of speech in each mixture. The current results reveal the locations of these important noise-robust phonetic features in a restricted set of syllables. Classification models trained to predict whether individual mixtures are intelligible based on the location of these glimpses can generalize to new conditions, successfully predicting the intelligibility of novel mixtures. They are able to generalize to novel noise instances, novel productions of the same word by the same talker, novel utterances of the same word spoken by different talkers, and, to some extent, novel consonants.

Subject(s)

Speech, Comprehension, Noise, Phonetics, Speech Perception
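The idea of scoring each time-frequency point by how much its audibility matters for intelligibility can be sketched as follows. The abstract does not state which statistic the framework uses, so this example assumes a simple per-point correlation between a binary "speech was glimpsed here" mask and a binary "mixture was identified correctly" label. The function name, array shapes, and synthetic data are all hypothetical.

```python
import numpy as np

def importance_map(visible, correct):
    """Correlate per-point audibility with per-mixture intelligibility.

    visible: (n_mixtures, n_freq, n_time) array, true where speech was glimpsed
    correct: (n_mixtures,) array, true if the mixture was identified correctly
    Returns an (n_freq, n_time) array of correlation coefficients.
    """
    v = visible.reshape(len(visible), -1).astype(float)
    y = np.asarray(correct, dtype=float)
    v_c = v - v.mean(axis=0)
    y_c = y - y.mean()
    cov = v_c.T @ y_c
    denom = np.sqrt((v_c ** 2).sum(axis=0) * (y_c ** 2).sum())
    # Points with zero variance get an importance of zero.
    corr = np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)
    return corr.reshape(visible.shape[1:])

# Synthetic demo (assumption): one planted point drives intelligibility.
rng = np.random.default_rng(0)
n_mix, n_freq, n_time = 2000, 8, 10
visible = rng.random((n_mix, n_freq, n_time)) < 0.5
key = (3, 4)  # hypothetical "important" time-frequency point
p_correct = np.where(visible[:, key[0], key[1]], 0.9, 0.2)
correct = rng.random(n_mix) < p_correct

imp = importance_map(visible, correct)
peak = np.unravel_index(np.argmax(imp), imp.shape)
print("most important point:", tuple(int(i) for i in peak))
```

With real data, `visible` would come from comparing the speech and noise energy at each time-frequency point of each mixture, and `correct` from listener responses.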
