ABSTRACT
Research on non-invasive Brain-Computer Interfaces (BCIs) has led to significant improvements for potential end users in recent years. However, the user experience and the BCI illiteracy problem remain challenging areas to address for obtaining robust and resilient clinical applications. In this study, we address the choice of the time segment for steady-state visual evoked potential (SSVEP) detection. This problem has been more widely addressed for the detection of event-related potentials than for SSVEP-based BCIs. The choice of this parameter is typically fixed and has a direct influence on both the detection accuracy and the information transfer rate (ITR). We propose to shift the problem from the choice of the time segment to the choice of the threshold for determining whether a response has been properly detected. We consider two open datasets for benchmarking the rationale of the approach. The results support the conclusion that an adaptive time segment for each trial, based on the selection of a threshold, can lead to a substantially higher ITR (86.92 bits/min) compared to a time segment chosen at the user (79.56 bits/min) or group level (73.78 bits/min). Finally, the results suggest that the threshold could be determined automatically in relation to the number of classes. Such an approach can help address the illiteracy problem of SSVEP-based BCIs.
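The ITR figures above follow the standard Wolpaw formula, which combines the number of classes, the detection accuracy, and the duration of the time segment used per selection. A minimal sketch in Python; the numbers in the example are illustrative, not the ones reported above:

```python
import math

def itr_bits_per_min(n_classes: int, accuracy: float, trial_dur_s: float) -> float:
    """Wolpaw information transfer rate in bits/min.

    n_classes:   number of selectable targets (e.g. SSVEP frequencies)
    accuracy:    probability of a correct selection, in (0, 1]
    trial_dur_s: time segment length per selection, in seconds
    """
    if accuracy >= 1.0:
        bits = math.log2(n_classes)
    else:
        bits = (math.log2(n_classes)
                + accuracy * math.log2(accuracy)
                + (1 - accuracy) * math.log2((1 - accuracy) / (n_classes - 1)))
    return bits * 60.0 / trial_dur_s

# A shorter time segment raises the rate, provided accuracy holds up.
print(round(itr_bits_per_min(40, 0.90, 2.0), 2))  # -> 129.73 (illustrative values)
```

This makes the trade-off discussed above explicit: shortening the time segment divides the trial duration but also tends to lower the accuracy term.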
Subject(s)
Brain-Computer Interfaces, Visual Evoked Potentials, Electroencephalography, Humans, Neurologic Examination, Photic Stimulation
ABSTRACT
The recent development of inexpensive and accurate eye-trackers allows the creation of gaze-based virtual keyboards that can be used by a large population of disabled people in developing countries. Thanks to eye-tracking technology, gaze-based virtual keyboards can be designed to account for constraints related to gaze detection accuracy and the target display device. In this paper, we propose a new multimodal, multiscript gaze-based virtual keyboard in which the layout of the graphical user interface can be changed in relation to the script. Traditionally, virtual keyboards are assessed for a single language (e.g. English). We propose a multiscript gaze-based virtual keyboard that can be used by people who communicate with the Latin, Bangla, and/or Devanagari scripts. We evaluate the performance of the virtual keyboard with two main groups of participants: 28 people who can communicate in both Bangla and English, and 24 people who can communicate in both Devanagari and English. The performance is assessed in relation to the information transfer rate when participants had to spell a sentence using their gaze for pointing at a command and a dedicated mouth switch for command selection. The results support the conclusion that the system is efficient, with no difference in information transfer rate between Bangla and Devanagari. However, the performance is higher with English, despite the fact that it was the secondary language of the participants.
Subject(s)
User-Computer Interface, Disabled Persons, Humans, Language
ABSTRACT
Deep learning techniques have recently been successful in the classification of brain evoked responses for multiple applications, including brain-machine interfaces. Single-trial detection of brain evoked responses, such as event-related potentials (ERPs), in the electroencephalogram (EEG) requires multiple processing stages, in the spatial and temporal domains, to extract high-level features. Convolutional neural networks, as a type of deep learning method, have been used for EEG signal detection because the underlying structure of the EEG signal can be encoded in such systems, facilitating the learning step. The EEG signal is typically decomposed into 2 main dimensions: space and time. However, the spatial dimension can itself be decomposed into 2 dimensions that better represent the relationships between the sensors that are involved in the classification. We propose to analyze the performance of 2D and 3D convolutional neural networks for the classification of ERPs with a dataset based on 64 EEG channels. We propose and compare 6 convolutional neural network architectures: 4 using 3D convolutions, which vary in the number of layers and feature maps, and 2 using 2D convolutions. The results support the conclusion that 3D convolutions provide better performance than 2D convolutions for the binary classification of ERPs.
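To illustrate the idea of splitting the spatial dimension in two, a 3D convolutional network can treat the electrode montage as a 2D grid plus time. A minimal sketch in PyTorch, assuming a hypothetical 8 x 8 electrode grid; the layer sizes are illustrative and are not the architectures compared in the study:

```python
import torch
import torch.nn as nn

class ERP3DNet(nn.Module):
    """Minimal 3D-convolution classifier for single-trial ERP detection.

    Assumes the 64 electrodes have been rearranged onto an 8 x 8 scalp
    grid, so a trial has shape (1, 8, 8, n_samples); the grid mapping
    and the layer sizes are assumptions for illustration.
    """
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(3, 3, 7), padding=(1, 1, 3)),
            nn.ReLU(),
            nn.MaxPool3d((2, 2, 4)),       # downsample space and time
            nn.Conv3d(8, 16, kernel_size=(3, 3, 5), padding=(1, 1, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),       # global pooling over space and time
        )
        self.classifier = nn.Linear(16, 2)  # target vs non-target

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

x = torch.randn(4, 1, 8, 8, 128)            # batch of 4 toy trials
logits = ERP3DNet()(x)
print(logits.shape)                         # (4, 2)
```

A 2D variant would instead flatten the channels into a single axis, losing the neighborhood relations that the 8 x 8 grid preserves.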
Subject(s)
Brain-Computer Interfaces, Electroencephalography, Neural Networks (Computer), Brain, Deep Learning, Evoked Potentials, Humans
ABSTRACT
Target detection during serial visual presentation tasks is an active research topic in the brain-computer interface (BCI) community, as this type of paradigm makes it possible to take advantage of event-related potentials (ERPs) through electroencephalography (EEG) recordings to enhance the accuracy of target detection. The detection of brain evoked responses at the single-trial level remains a challenging task and can be exploited in various applications. Typical non-invasive BCIs based on event-related brain responses use EEG. In clinical settings, brain signals recorded with magnetoencephalography (MEG) can be advantageously used thanks to their high spatial and temporal resolution. In this study, we address the problem of the relationships between behavioral performance and single-trial detection by considering a task with different levels of difficulty. We consider images of faces with six different facial expressions (anger, disgust, fear, neutrality, sadness, and happiness). We consider MEG signals recorded from ten healthy participants in six sessions, where the target was one of the six types of facial expressions in each session. The results support the conclusion that a high performance can be obtained at the single-trial level (AUC = 0.903 ± 0.045), and that this performance is correlated with the behavioral performance (reaction time and hit rate).
Subject(s)
Brain-Computer Interfaces, Brain, Evoked Potentials, Facial Expression, Magnetoencephalography, Adult, Brain/physiology, Electroencephalography/methods, Evoked Potentials/physiology, Fear, Female, Happiness, Humans, Magnetoencephalography/methods, Male, Reaction Time
ABSTRACT
A large number of people with disabilities rely on assistive technologies to communicate with their families, to use social media, and to have a social life. Despite a significant increase in novel assistive technologies, robust, non-invasive, and inexpensive solutions should be proposed and optimized in relation to the physical abilities of the users. The reliable and robust identification of intentional visual commands is an important issue in the development of eye-movement-based user interfaces. The detection of a command with an eye-tracking system can be achieved with a dwell time. Yet, a large number of people can use simple hand gestures as a switch to select a command. We propose a new virtual keyboard based on the detection of ten commands. The keyboard includes all the letters of the Latin script (upper and lower case), punctuation marks, digits, and a delete button. To select a command on the keyboard, the user points at the desired item with the gaze and selects it with a hand gesture. The system has been evaluated on eight healthy subjects with five predefined hand gestures, and a button for the selection. The results support the conclusion that the performance of a subject, in terms of speed and information transfer rate (ITR), depends on the choice of the hand gesture. The best gesture for each subject provides a mean performance of 8.77 ± 2.90 letters per minute, which corresponds to an ITR of 57.04 ± 14.55 bits per minute. The results highlight that the hand gesture assigned for the selection of an item is inter-subject dependent.
Subject(s)
Gestures, Self-Help Devices, Eye Movements, Hand, Healthy Volunteers, Humans
ABSTRACT
Portable eye-trackers provide an efficient way to access the point of gaze of a user on a computer screen. Thanks to eye-tracking, gaze-based virtual keyboards can be developed by taking into account constraints related to the gaze detection accuracy. In this paper, we propose a new gaze-based virtual keyboard where all the letters can be accessed directly through a single command. In addition, we propose a USB mouth switch that is connected through a computer mouse, with the mouth switch replacing the left-click button. This approach is designed to tackle the Midas touch problem with eye-tracking for people who are severely disabled. The performance is evaluated on 10 participants by comparing three conditions: gaze detection with the mouth switch, gaze detection with a dwell time by considering the distance to the closest command, and gaze detection within the surface of the command box. Finally, a workload assessment using the NASA-TLX test was conducted for the different conditions. The results revealed that the proposed approach with the mouth switch provides a better performance in terms of typing speed (36.6 ± 8.4 letters/minute) compared to the other conditions, and a high acceptance as an input device.
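The two gaze-only conditions above differ in how a gaze point is mapped to a command: strictly inside a command's box, or always to the closest command. A minimal sketch with a hypothetical key layout (the coordinates and box size are made up for illustration):

```python
import math

# Hypothetical key layout: command name -> (x, y) centre of its box, in pixels.
KEYS = {"A": (100, 100), "B": (220, 100), "DEL": (340, 100)}
BOX_HALF = 40  # half-width/height of a command box (illustrative)

def key_in_box(gx, gy):
    """'Within the surface' rule: gaze must fall inside a command's box."""
    for name, (cx, cy) in KEYS.items():
        if abs(gx - cx) <= BOX_HALF and abs(gy - cy) <= BOX_HALF:
            return name
    return None  # gaze between boxes: no dwell timer runs

def key_closest(gx, gy):
    """'Closest command' rule: always map the gaze to the nearest key centre."""
    return min(KEYS, key=lambda n: math.hypot(gx - KEYS[n][0], gy - KEYS[n][1]))

print(key_in_box(150, 105))   # gaze between boxes -> None
print(key_closest(150, 105))  # nearest centre -> 'A'
```

Under the first rule, gaze samples that land between boxes accumulate no dwell time, which is why the distance-based rule can behave differently when gaze accuracy is limited.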
Subject(s)
Disabled Persons, Mouth, Computers, Humans, Touch, User-Computer Interface
ABSTRACT
The detection of brain responses at the single-trial level in the electroencephalogram (EEG), such as event-related potentials (ERPs), is a difficult problem that requires different processing steps to extract relevant discriminant features. While most signal processing and classification techniques for the detection of brain responses are based on linear algebra, pattern recognition techniques such as the convolutional neural network (CNN), a type of deep learning technique, have attracted interest as they are able to process the signal after limited pre-processing. In this study, we propose to investigate the performance of CNNs in relation to their architecture and to how they are evaluated: a single system for each subject, or a single system for all the subjects. In particular, we want to address the change in performance that can be observed between tailoring a neural network to a single subject and considering a neural network for a group of subjects, taking advantage of a larger number of trials from different subjects. The results support the conclusion that a convolutional neural network trained on different subjects can lead to an AUC above 0.9 by using an appropriate architecture with spatial filtering and shift-invariant layers.
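A network for a group of subjects is usually evaluated so that the test subject contributes no training trials; a leave-one-subject-out split is one common scheme for this (the exact protocol used in the study may differ):

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (subject, train_idx, test_idx) triples where the test set is
    one held-out subject and training uses all trials from the others."""
    subject_ids = np.asarray(subject_ids)
    for s in np.unique(subject_ids):
        test = np.where(subject_ids == s)[0]
        train = np.where(subject_ids != s)[0]
        yield s, train, test

# Three toy subjects with two trials each.
for s, tr, te in leave_one_subject_out([1, 1, 2, 2, 3, 3]):
    print(s, tr.tolist(), te.tolist())
```

This contrasts with the subject-specific setting, where each network sees only (and is tested only on) trials from one subject.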
Subject(s)
Evoked Potentials, Brain, Electroencephalography, Machine Learning, Neural Networks (Computer)
ABSTRACT
The recognition of brain evoked responses at the single-trial level is a challenging task. Typical non-invasive brain-computer interfaces based on event-related brain responses use electroencephalography. In this study, we consider brain signals recorded with magnetoencephalography (MEG), and we expect to take advantage of its high spatial and temporal resolution for the detection of targets in a series of images. This study was used for the data analysis competition held at the 20th International Conference on Biomagnetism (Biomag) 2016, wherein the goal was to provide a method for the single-trial detection of event-related fields corresponding to the presentation of happy faces during the rapid presentation of images of faces with six different facial expressions (anger, disgust, fear, neutrality, sadness, and happiness). The datasets correspond to 204 gradiometer signals obtained from four participants. The best method combined several approaches, mainly based on Riemannian geometry, and provided an area under the ROC curve of 0.956 ± 0.043. The results show that a high recognition rate of facial expressions can be obtained at the single-trial level using advanced signal processing and machine learning methodologies.
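The Riemannian-geometry approach typically represents each trial by its sensor covariance matrix and projects it onto a tangent space, where ordinary linear classifiers apply. A simplified sketch of that idea (omitting, for instance, the usual rescaling of off-diagonal terms; this is not the competition pipeline itself):

```python
import numpy as np
from scipy.linalg import logm, sqrtm, inv

def tangent_space_features(covs, ref):
    """Project SPD covariance matrices onto the tangent space at `ref`.

    covs: (n_trials, c, c) trial covariance matrices
    ref:  (c, c) reference point (e.g. the mean covariance)
    Returns one flattened log-map vector per trial, usable by any
    linear classifier; a simplified sketch of the Riemannian pipeline.
    """
    w = inv(sqrtm(ref))                    # whitening by the reference point
    feats = []
    for cov in covs:
        s = logm(w @ cov @ w)              # log-map yields a symmetric matrix
        feats.append(s[np.triu_indices_from(s)])
    return np.real(np.array(feats))

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4, 50))        # 5 toy trials, 4 channels
covs = np.array([t @ t.T / 50 for t in x])
feats = tangent_space_features(covs, covs.mean(axis=0))
print(feats.shape)                         # (5, 10): upper triangle of 4x4
```

The appeal of this representation is that distances between projected trials approximate the Riemannian distances between the original covariance matrices.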
Subject(s)
Magnetoencephalography, Emotions, Facial Expression, Fear, Happiness, Humans
ABSTRACT
Acute bouts of aerobic physical exercise can modulate subsequent cognitive task performance and oscillatory brain activity measured with electroencephalography (EEG). Here, we investigated the sequencing of these modulations of perceptual and cognitive processes using scalp-recorded EEG acquired during exercise. Twelve participants viewed pseudo-random sequences of frequent non-target stimuli (cars), infrequent distractors (obliquely oriented faces), and infrequent targets that required a simple detection response (obliquely oriented faces, where the angle was different from that of the infrequent distractors). The sequences were presented while participants were seated on a stationary bike under three conditions during which scalp-recorded EEG was also acquired: rest, low-intensity exercise, and high-intensity exercise. Behavioral target detection was faster during high-intensity exercise compared to both rest and low-intensity exercise. An event-related potential (ERP) analysis of the EEG data revealed that the mean amplitude of the visual P1 component evoked by frequent non-targets measured at parietal-occipital electrodes was larger during low-intensity exercise compared to rest. The P1 component evoked by infrequent targets also peaked earlier during low-intensity exercise compared to rest and high-intensity exercise. The P3a ERP component evoked by infrequent distractors measured at parietal electrodes peaked significantly earlier during both low- and high-intensity exercise when compared to rest. The modulation of the visual P1 and the later P3a components is consistent with the conclusion that exercise modulates multiple stages of neural information processing, ranging from early-stage sensory processing (P1) to post-perceptual target categorization (P3a).
Subject(s)
Bicycling/physiology, Mental Processes/physiology, Analysis of Variance, Brain Mapping, Electroencephalography, Evoked Potentials/physiology, Female, Fourier Analysis, Heart Rate/physiology, Humans, Male, Photic Stimulation, Reaction Time, Young Adult
ABSTRACT
To propose a reliable and robust Brain-Computer Interface (BCI), efficient machine learning and signal processing methods have to be used. However, it is often necessary to have a sufficient number of labeled brain responses to create a model. A large database representing all of the possible variabilities of the signal is not always possible to obtain, because calibration sessions have to be short. In the case of BCIs based on the detection of event-related potentials (ERPs), we propose to tackle this problem by including additional deformed patterns in the training database to increase the number of labeled brain responses. The creation of the additional deformed patterns is based on two approaches: (i) smooth deformation fields, and (ii) right- and left-shifted signals. The evaluation is performed with data from 10 healthy subjects participating in a P300 speller experiment. The results show that small shifts of the signal allow a better estimation of both the spatial filters and the linear classifier. The best performance, AUC = 0.828 ± 0.061, is obtained by combining the smooth deformation fields and the shifts, after spatial filtering, compared to AUC = 0.543 ± 0.025 without additional deformed patterns. The results support the conclusion that adding signals with small deformations can significantly improve the performance of single-trial detection when the amount of training data is limited.
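The second augmentation approach, shifting the signals left or right, can be sketched as follows; the shift sizes are illustrative, and smooth deformation fields are not shown:

```python
import numpy as np

def augment_with_shifts(trials, labels, shifts=(-2, -1, 1, 2)):
    """Create extra labelled ERP trials by shifting each signal a few
    samples left or right (zero-padded at the edges).

    trials: (n_trials, n_channels, n_samples) array
    shifts: sample offsets; the values here are illustrative
    """
    extra_x, extra_y = [], []
    for x, y in zip(trials, labels):
        for s in shifts:
            shifted = np.roll(x, s, axis=-1)
            if s > 0:
                shifted[..., :s] = 0.0   # zero the samples that wrapped around
            else:
                shifted[..., s:] = 0.0
            extra_x.append(shifted)
            extra_y.append(y)
    return (np.concatenate([trials, np.array(extra_x)]),
            np.concatenate([labels, np.array(extra_y)]))

x = np.random.randn(10, 8, 128)          # 10 toy trials, 8 channels
y = np.random.randint(0, 2, 10)
x_aug, y_aug = augment_with_shifts(x, y)
print(x_aug.shape)                       # (50, 8, 128): 10 original + 10 * 4 shifted
```

Each shifted copy keeps the original label, so the training set grows while the class balance is preserved.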
Subject(s)
Electroencephalography, Event-Related Potentials, P300/physiology, Adult, Area Under Curve, Bayes Theorem, Brain/physiology, Brain-Computer Interfaces, Discriminant Analysis, Female, Humans, Male, ROC Curve, Signal Processing, Computer-Assisted, Signal-To-Noise Ratio
ABSTRACT
Rapid serial visual presentation (RSVP) tasks, in which participants are presented with a continuous sequence of images in one location, have been used in combination with electroencephalography (EEG) in a variety of Brain-Machine Interface (BMI) applications. The RSVP task is advantageous because it can be performed at a high temporal rate. The rate of the RSVP sequence is controlled by the stimulus onset asynchrony (SOA) between subsequent stimuli. When used within the context of a BMI, an RSVP task with a short SOA could increase the information throughput of the system while also allowing for stimulus repetitions. However, reducing the SOA also increases the perceptual degradation caused by presenting two stimuli in close succession, and it decreases the target-to-target interval (TTI), which can increase the cognitive demands of the task. These negative consequences of decreasing the SOA could affect the EEG signal measured in the task and degrade the performance of the BMI. Here we systematically investigate the effects of the SOA and the number of stimulus repetitions (r) on single-trial target detection in an RSVP task. Ten healthy volunteers participated in an RSVP task in four conditions that varied in SOA and repetitions (SOA=500 ms, r=1; SOA=250 ms, r=2; SOA=166 ms, r=3; and SOA=100 ms, r=5) while the processing time across conditions was controlled. There were two key results. First, when controlling for the number of repetitions, single-trial performance increases when the SOA decreases. Second, when the repetitions were combined, the best performance (AUC=0.967) was obtained with the shortest SOA (100 ms). These results suggest that shortening the SOA in an RSVP task has the benefit of increasing the performance relative to longer SOAs, while also allowing a higher number of repetitions of the stimuli in a limited amount of time.
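Combining the r repetitions of a stimulus is commonly done by averaging the single-trial classifier scores before deciding; a minimal sketch (the combination rule used in the study may differ):

```python
import numpy as np

def combine_repetitions(scores, r):
    """Average single-trial classifier scores over the r repetitions of
    each stimulus before thresholding.

    scores: flat sequence ordered so that the r repetitions of each
            stimulus are consecutive; returns one score per stimulus.
    """
    return np.asarray(scores).reshape(-1, r).mean(axis=1)

# Toy example: two stimuli, three repetitions each; the noisy scores of
# the target stimulus average out above those of the non-target.
scores = [0.4, 0.9, 0.8,   # stimulus 1 (target)
          0.3, 0.5, 0.1]   # stimulus 2 (non-target)
print(combine_repetitions(scores, 3))   # -> approximately [0.7, 0.3]
```

This is why a short SOA helps: with the same total presentation time, more repetitions are available to average over, which stabilizes the combined score.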
Subject(s)
Brain/physiology, Evoked Potentials/physiology, Photic Stimulation, Area Under Curve, Female, Humans, Male, Young Adult
ABSTRACT
A challenge in designing a Brain-Computer Interface (BCI) is the choice of the channels, i.e., the most relevant sensors. Although a setup with many sensors can be more efficient for the detection of Event-Related Potentials (ERPs) such as the P300, it is relevant to consider only a low number of sensors for a commercial or clinical BCI application. Indeed, a reduced number of sensors can naturally increase the user's comfort by reducing the time required for the installation of the EEG (electroencephalogram) cap, and can decrease the price of the device. In this study, the influence of spatial filtering during the process of sensor selection is addressed by comparing three approaches: two of them maximize the Signal to Signal-plus-Noise Ratio (SSNR) for the different sensor subsets, while the third one maximizes the differences between the averaged P300 waveform and the non-P300 waveform. We show that the locations of the most relevant sensor subsets for the detection of the P300 are highly dependent on the use of spatial filtering. Applied to data from 20 healthy subjects, this study shows that subsets obtained by removing sensors in relation to their individual SSNR are less efficient than subsets obtained by removing sensors in relation to their contribution once the different selected sensors are combined to enhance the signal. In other words, it highlights the difference between estimating the P300 projection on the scalp and evaluating the most efficient sensor subsets for a P300-BCI. Finally, this study explores the issue of channel commonality across subjects. The results support the conclusion that using spatial filters during the sensor selection procedure allows selecting better sensors for a visual P300 Brain-Computer Interface.
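Sensor selection of this kind is typically done by backward elimination: repeatedly remove the sensor whose removal least degrades a subset-level cost. A generic sketch, with a toy cost function standing in for the SSNR-based criterion:

```python
def backward_elimination(sensors, cost, n_keep):
    """Greedy backward elimination: at each step, drop the sensor whose
    removal best preserves the subset cost.

    cost: maps a tuple of sensor names to a score to maximise
          (e.g. SSNR of the subset after spatial filtering).
    """
    subset = list(sensors)
    while len(subset) > n_keep:
        # Score every candidate subset with one sensor removed.
        candidates = [[s for s in subset if s != drop] for drop in subset]
        subset = max(candidates, key=lambda c: cost(tuple(c)))
    return subset

# Toy cost: prefer subsets containing the hypothetical 'useful' sensors.
useful = {"Cz", "Pz"}
cost = lambda c: len(useful & set(c)) + 0.01 * len(c)
print(backward_elimination(["Fz", "Cz", "Pz", "Oz"], cost, 2))  # -> ['Cz', 'Pz']
```

The point made above is that the cost should score the *combined* subset after spatial filtering, not rank sensors by their individual SSNR in isolation.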
Subject(s)
Brain Mapping, Brain/physiology, Event-Related Potentials, P300/physiology, Signal Detection, Psychological, User-Computer Interface, Adult, Brain Waves/physiology, Electroencephalography, Female, Humans, Male, Photic Stimulation/methods, Signal Processing, Computer-Assisted, Young Adult
ABSTRACT
A brain-computer interface (BCI) is a specific type of human-computer interface that enables direct communication between a human and a computer through the decoding of brain activity. As such, event-related potentials like the P300 can be obtained with an oddball paradigm whose targets are selected by the user. This paper deals with methods to reduce the needed set of EEG sensors in the P300 speller application. A reduced number of sensors yields more comfort for the user, decreases installation time, may substantially reduce the financial cost of the BCI setup, and may reduce the power consumption of wireless EEG caps. Our new approach to selecting relevant sensors is based on backward elimination using a cost function based on the signal to signal-plus-noise ratio, after spatial filtering. We show that this cost function selects sensor subsets that provide a better accuracy in the speller recognition rate during the test sessions than subsets selected based on classification accuracy. We validate our selection strategy on data from 20 healthy subjects.
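Spatial filters maximising the signal to signal-plus-noise ratio can be obtained from a generalised eigenvalue problem. A sketch under the assumption of pre-computed covariance estimates of the averaged ERP response and of the raw EEG (the exact estimators used in the paper may differ):

```python
import numpy as np
from scipy.linalg import eigh

def ssnr_spatial_filters(signal_cov, noise_cov, n_filters=4):
    """Spatial filters maximising the signal to signal-plus-noise ratio,
    from the generalised eigenvalue problem
        signal_cov @ w = lambda * (signal_cov + noise_cov) @ w.
    signal_cov would come from the averaged P300 response and noise_cov
    from the ongoing EEG; both are assumptions for this sketch.
    """
    vals, vecs = eigh(signal_cov, signal_cov + noise_cov)
    order = np.argsort(vals)[::-1]          # largest SSNR first
    return vecs[:, order[:n_filters]]

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 100))
b = rng.standard_normal((8, 500))
sig, noise = a @ a.T / 100, b @ b.T / 500   # toy covariance estimates
w = ssnr_spatial_filters(sig, noise)
print(w.shape)                              # (8, 4): one column per filter
```

Each column projects the multichannel EEG onto one filtered component; the SSNR of a sensor subset can then be evaluated on the combined, filtered signal rather than sensor by sensor.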