Silent EEG-Speech Recognition Using Convolutional and Recurrent Neural Network with 85% Accuracy of 9 Words Classification

Sensors ◽  
2021 ◽  
Vol 21 (20) ◽  
pp. 6744
Author(s):  
Darya Vorontsova ◽  
Ivan Menshikov ◽  
Aleksandr Zubov ◽  
Kirill Orlov ◽  
Peter Rikunov ◽  
...  

In this work, we focus on silent speech recognition in electroencephalography (EEG) data of healthy individuals to advance brain–computer interface (BCI) development toward including people with neurodegeneration and movement and communication difficulties in society. Our dataset was recorded from 270 healthy subjects during silent speech of eight different Russian words (commands): 'forward', 'backward', 'up', 'down', 'help', 'take', 'stop', and 'release', and one pseudoword. We began by demonstrating that silent word distributions can be statistically very close and that words describing directed movements share similar patterns of brain activity. However, after training on one individual, we achieved 85% accuracy on nine-word classification (including the pseudoword) and 88% accuracy on binary classification on average. We show that a smaller dataset collected from one participant allows building a more accurate classifier for that subject than a larger dataset collected from a group of people. At the same time, we show that the learning outcomes on a limited sample of EEG data are transferable to the general population. Thus, we demonstrate the possibility of using selected command words to create an EEG-based input device for people on whom the neural network classifier has not been trained, which is particularly important for people with disabilities.
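As an illustration of the kind of check described above (comparing how close silent-word distributions are), here is a minimal numpy sketch. The random feature vectors and all names are hypothetical stand-ins, not the paper's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
words = ["forward", "backward", "up", "down"]

# Hypothetical per-trial EEG feature vectors (e.g. band powers per channel):
# 50 trials x 16 features per word, with only slightly shifted means.
features = {w: rng.normal(loc=i * 0.1, scale=1.0, size=(50, 16))
            for i, w in enumerate(words)}

def class_distance(a, b):
    """Euclidean distance between the mean feature vectors of two words."""
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))

# Small pairwise distances indicate statistically close word distributions,
# as reported for some direction words.
for i, w1 in enumerate(words):
    for w2 in words[i + 1:]:
        print(f"{w1} vs {w2}: {class_distance(features[w1], features[w2]):.3f}")
```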

Author(s):  
Lukas Hecker ◽  
Rebekka Rupprecht ◽  
Ludger Tebartz van Elst ◽  
Juergen Kornmeier

EEG and MEG are well-established non-invasive methods in neuroscientific research and clinical diagnostics. Both methods provide high temporal but low spatial resolution of brain activity. To gain insight into the spatial dynamics underlying the M/EEG, one has to solve the inverse problem, which is ill-posed: more than one configuration of neural sources can evoke one and the same distribution of EEG activity on the scalp. Artificial neural networks have previously been used successfully to find either one or two dipole sources; these approaches, however, have never solved the inverse problem in a distributed dipole model with more than two dipole sources. We present ConvDip, a novel convolutional neural network (CNN) architecture that solves the EEG inverse problem in a distributed dipole model based on simulated EEG data. We show that ConvDip (1) learns to produce inverse solutions from a single time point of EEG data, (2) outperforms state-of-the-art methods (eLORETA and LCMV beamforming) on all evaluated performance measures, (3) is more flexible when dealing with a varying number of sources, producing fewer ghost sources and missing fewer real sources than the comparison methods, and (4) produces plausible inverse solutions for real-world EEG recordings while needing less than 40 ms for a single forward pass. These results qualify ConvDip as an efficient and easy-to-apply novel method for source localization in EEG and MEG data, with high relevance for clinical applications, e.g., in epileptology, and for real-time applications.
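The simulation-based training that ConvDip relies on boils down to generating scalp data from known source configurations through a leadfield matrix. A minimal numpy sketch with assumed dimensions; the random leadfield stands in for one computed from a real forward head model:

```python
import numpy as np

rng = np.random.default_rng(42)
n_electrodes, n_dipoles = 32, 1000  # assumed sizes

# Stand-in leadfield: maps dipole amplitudes to scalp potentials.
# In practice this comes from a forward model of the head.
L = rng.normal(size=(n_electrodes, n_dipoles))

def simulate_sample(n_active=3, snr=10.0):
    """One (scalp EEG, source map) training pair with a few active dipoles."""
    s = np.zeros(n_dipoles)
    active = rng.choice(n_dipoles, size=n_active, replace=False)
    s[active] = rng.normal(size=n_active)
    x = L @ s                                  # noiseless scalp potentials
    noise = rng.normal(size=n_electrodes)
    x += noise * (np.linalg.norm(x) / (snr * np.linalg.norm(noise)))
    return x, s  # network input and regression target

x, s = simulate_sample()
```

A network trained on many such pairs learns the mapping from scalp topography back to the distributed source space, which is the core idea of the approach.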


Author(s):  
Soha Abd Mohamed El-Moamen ◽  
Marghany Hassan Mohamed ◽  
Mohammed F. Farghally

The need for real-time tracking and evaluation of patients has increased interest in recognizing people's actions to enhance care facilities. Deep learning is well suited both to rapidly ingesting big healthcare data and to making accurate predictions for early lung cancer detection. In this paper, we propose a constructive deep neural network with Apache Spark to classify lung cancer images and stages. We developed a binary classification model using a threshold technique that classifies nodules as benign or malignant. In the proposed framework, neural network training, defined using the Keras API, is performed with BigDL on a distributed Spark cluster. The proposed algorithm achieves an AUC of 0.9810 and a misclassification rate showing that our suggested classifiers perform better than other classifiers.
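The two evaluation ingredients mentioned above — threshold-based binary classification of nodule scores and the AUC metric — can be sketched in a few lines of numpy. This is a generic illustration, not the paper's BigDL/Spark implementation:

```python
import numpy as np

def threshold_classify(scores, threshold=0.5):
    """Map model scores to benign (0) / malignant (1) via a fixed threshold."""
    return (np.asarray(scores) >= threshold).astype(int)

def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formulation: the probability that
    a random positive case scores higher than a random negative case."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    pos, neg = scores[labels == 1], scores[labels == 0]
    wins = ((pos[:, None] > neg[None, :]).sum()
            + 0.5 * (pos[:, None] == neg[None, :]).sum())
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(threshold_classify(scores))  # [0 0 0 1]
print(auc(labels, scores))         # 0.75
```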


Author(s):  
Lenka Lhotská ◽  
Vladimír Krajca ◽  
Jitka Mohylová ◽  
Svojmil Petránek ◽  
Václav Gerla

This chapter deals with the application of principal component analysis (PCA) to data mining in electroencephalogram (EEG) processing. The principal components are estimated from the signal by eigendecomposition of the covariance estimate of the input. Alternatively, they can be estimated by a neural network (NN) configured for extracting the first principal components: instead of performing computationally complex operations for eigenvector estimation, the network can be trained to produce the ordered first principal components directly. Possible applications include separation of different signal components for feature extraction in EEG signal processing, adaptive segmentation, epileptic spike detection, and evaluation of long-term EEG monitoring of patients in a coma.
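Both estimation routes in the paragraph above can be shown side by side: classical PCA via the covariance eigendecomposition, and a single-neuron network trained with Oja's rule, a standard neural rule for extracting the first principal component (used here as a generic example of the NN alternative, not necessarily the chapter's exact network):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-channel "EEG" segment where channel 1 is mostly channel 0 scaled.
n = 500
c0 = rng.normal(size=n)
X = np.column_stack([c0, 2.0 * c0 + 0.1 * rng.normal(size=n)])
X -= X.mean(axis=0)

# (a) Classical PCA: eigendecomposition of the covariance estimate.
cov = X.T @ X / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
pc1 = eigvecs[:, -1]                    # first principal component

# (b) Neural alternative: Oja's rule converges to the same direction
# without an explicit eigendecomposition.
w = rng.normal(size=2)
w /= np.linalg.norm(w)
lr = 0.01
for x in X:
    y = w @ x
    w += lr * y * (x - y * w)
w /= np.linalg.norm(w)

# Both estimates point along the dominant direction (up to sign).
print(abs(pc1 @ w))  # close to 1.0
```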


2018 ◽  
Vol 119 (4) ◽  
pp. 1251-1253 ◽  
Author(s):  
Randolph F. Helfrich

Our continuous perception of the world could be the result of discrete sampling, where individual snapshots are seamlessly fused into a coherent stream. It has been argued that endogenous oscillatory brain activity could provide the functional substrate of cortical rhythmic sampling. A new study demonstrates that cortical rhythmic sampling is tightly linked to the oculomotor system, thus providing a novel perspective on the neural network underlying top-down guided visual perception.


2021 ◽  
Vol 12 ◽  
Author(s):  
Caroline Pinte ◽  
Mathis Fleury ◽  
Pierre Maurel

The simultaneous acquisition of electroencephalographic (EEG) signals and functional magnetic resonance images (fMRI) aims to measure brain activity with good spatial and temporal resolution. This bimodal neuroimaging can bring complementary and very relevant information in many cases, in particular for epilepsy: it has been shown to facilitate the localization of epileptic networks. Regarding the EEG, source localization requires the resolution of a complex inverse problem that depends on several parameters, one of the most important of which is the position of the EEG electrodes on the scalp. These positions are often roughly estimated using fiducial points. In simultaneous EEG-fMRI acquisitions, specific MRI sequences can provide valuable spatial information. In this work, we propose a new, fully automatic method based on neural networks to segment an ultra-short echo-time MR volume in order to retrieve the coordinates and labels of the EEG electrodes. It consists of two steps: segmentation of the images by a neural network, followed by registration of an EEG template to the obtained detections. We trained the neural network using 37 MR volumes and then tested our method on 23 new volumes. The results show an average detection accuracy of 99.7% with an average position error of 2.24 mm, as well as 100% accuracy in labeling.
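The second step — registering an electrode template to the detected positions — is classically done with a rigid least-squares alignment (the Kabsch algorithm). A minimal numpy sketch assuming known point correspondences; the paper's actual registration procedure may differ:

```python
import numpy as np

def rigid_register(template, detections):
    """Least-squares rotation R and translation t mapping the template
    points onto the detections (Kabsch algorithm)."""
    tc, dc = template.mean(axis=0), detections.mean(axis=0)
    H = (template - tc).T @ (detections - dc)
    U, _, Vt = np.linalg.svd(H)
    # Correction term guards against an improper rotation (reflection).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dc - R @ tc
    return R, t

rng = np.random.default_rng(1)
template = rng.normal(size=(10, 3))          # hypothetical electrode template
R_true = np.linalg.qr(rng.normal(size=(3, 3)))[0]
if np.linalg.det(R_true) < 0:
    R_true[:, 0] *= -1                       # ensure a proper rotation
detections = template @ R_true.T + np.array([1.0, -2.0, 0.5])

R, t = rigid_register(template, detections)
print(np.abs(template @ R.T + t - detections).max())  # near zero
```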


2019 ◽  
Vol 9 (16) ◽  
pp. 3355 ◽  
Author(s):  
Min Seop Lee ◽  
Yun Kyu Lee ◽  
Dong Sung Pae ◽  
Myo Taeg Lim ◽  
Dong Won Kim ◽  
...  

Physiological signals contain considerable information regarding emotions. This paper investigated the ability of photoplethysmogram (PPG) signals to recognize emotion, adopting a two-dimensional emotion model based on valence and arousal to represent human feelings. The main purpose was to recognize short-term emotion using a single PPG signal pulse. We used a one-dimensional convolutional neural network (1D CNN) to extract PPG signal features to classify valence and arousal. We split the PPG signal into single 1.1 s pulses and normalized each for input to the neural network based on the person's maximum and minimum values. We chose the Database for Emotion Analysis using Physiological Signals (DEAP) for the experiment and tested the 1D CNN as a binary classifier (high or low valence and arousal), achieving short-term (1.1 s) emotion recognition with 75.3% valence and 76.2% arousal accuracy on the DEAP data.
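The per-person min-max normalization of a single 1.1 s pulse can be sketched as follows. The 128 Hz rate is an assumption here (DEAP's preprocessed signals are commonly resampled to 128 Hz), and the raw signal is synthetic:

```python
import numpy as np

FS = 128                    # assumed sampling rate, Hz
PULSE_LEN = int(1.1 * FS)   # single-pulse window of 1.1 s -> 140 samples

def normalize_pulse(pulse, personal_min, personal_max):
    """Scale one PPG pulse to [0, 1] using the participant's own extremes,
    as described for the network input."""
    return (pulse - personal_min) / (personal_max - personal_min)

rng = np.random.default_rng(0)
signal = rng.uniform(200, 800, size=10 * FS)   # hypothetical raw PPG trace
pmin, pmax = signal.min(), signal.max()        # personal extremes

pulse = signal[:PULSE_LEN]                     # one pulse-length window
x = normalize_pulse(pulse, pmin, pmax)         # ready as 1D CNN input
print(x.shape, float(x.min()), float(x.max()))
```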


2019 ◽  
Vol 1 (1) ◽  
pp. 87-95
Author(s):  
Bishon Lamichanne ◽  
Hari K.C.

Speech is one of the most natural ways for people to communicate and plays an important role in our daily lives. Making machines able to talk with people is a challenging but very useful task, and a crucial step is enabling machines to recognize and understand what people are saying. Hence, speech recognition has become a key technique providing an interface for communication between machines and humans. Speech recognition has a long research history. Neural networks are known for their ability to classify nonlinear problems, and today much speech recognition research builds on them. Even though positive results have been obtained from continuous study, research on minimizing the error rate is still attracting a lot of attention. The English language offers a number of challenges for speech recognition. This paper implements an RNN to analyze and recognize speech from a set of spoken words.
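To make the RNN idea concrete, here is a minimal Elman-style recurrent cell classifying a sequence of acoustic frames into word classes. All sizes (13 MFCC coefficients, 5 words) and the random weights are hypothetical; the paper's actual architecture is not specified beyond "RNN":

```python
import numpy as np

rng = np.random.default_rng(0)

class ElmanRNN:
    """Minimal recurrent cell: the hidden state carries context across
    frames, which is what lets an RNN model the temporal structure of speech."""

    def __init__(self, n_in, n_hidden, n_classes):
        s = 0.1
        self.Wx = rng.normal(scale=s, size=(n_hidden, n_in))
        self.Wh = rng.normal(scale=s, size=(n_hidden, n_hidden))
        self.Wo = rng.normal(scale=s, size=(n_classes, n_hidden))

    def forward(self, frames):
        h = np.zeros(self.Wh.shape[0])
        for x in frames:                    # one acoustic frame at a time
            h = np.tanh(self.Wx @ x + self.Wh @ h)
        logits = self.Wo @ h
        e = np.exp(logits - logits.max())   # softmax over word classes
        return e / e.sum()

# Hypothetical utterance: 30 frames of 13 MFCC coefficients, 5 candidate words.
rnn = ElmanRNN(n_in=13, n_hidden=32, n_classes=5)
probs = rnn.forward(rng.normal(size=(30, 13)))
print(probs.shape, float(probs.sum()))
```

The network here is untrained; in practice the weights would be fit to labeled utterances before the output probabilities mean anything.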


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3930 ◽  
Author(s):  
Sunah Jung ◽  
Haesang Yang ◽  
Kiheum Park ◽  
Yutaek Seo ◽  
Woojae Seong

The use of accelerometer signals for early recognition of severe slugging is investigated in a pipeline-riser system conveying an air–water two-phase flow, where six accelerometers are installed from the bottom to the top of the riser. Twelve different environmental conditions are produced by changing water and gas superficial velocities, of which three conditions are stable states and the other conditions are related to severe slugging. For online recognition, simple parameters using statistics and linear prediction coefficients are employed to extract useful features. Binary classification to recognize stable flow and severe slugging is performed using a support vector machine and a neural network. In multiclass classification, the neural network is adopted to identify four flow patterns of stable state, two types of severe slugging, and an irregular transition state between severe slugging and dual-frequency severe slugging. The performance is compared and analyzed according to the signal length for three cases of sensor location: six accelerometers, one accelerometer at the riser base, and one accelerometer at the top of the riser.
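The feature extraction described above — simple statistics plus linear prediction coefficients per accelerometer window — can be sketched generically in numpy (via the autocorrelation/Yule-Walker route; the specific statistics and LPC order the authors used are assumptions here):

```python
import numpy as np

def lpc(signal, order=4):
    """Linear prediction coefficients via the autocorrelation method:
    solve the Yule-Walker equations R a = r."""
    x = np.asarray(signal, dtype=float)
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

def stat_features(x):
    """Simple statistical descriptors of one accelerometer window."""
    x = np.asarray(x, dtype=float)
    return np.array([x.mean(), x.std(), x.min(), x.max()])

rng = np.random.default_rng(0)
# Hypothetical accelerometer window: an oscillation plus noise.
window = np.sin(np.linspace(0, 20 * np.pi, 512)) + 0.05 * rng.normal(size=512)
features = np.concatenate([stat_features(window), lpc(window)])
print(features.shape)  # (8,)
```

A feature vector like this, computed per sensor and window, is what the SVM or neural network classifier would consume.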


Author(s):  
Yedilkhan Amirgaliyev ◽  
Kuanyshbay Kuanyshbay ◽  
Aisultan Shoiynbek

This paper evaluates and compares the performance of three well-known optimization algorithms (Adagrad, Adam, Momentum) for faster training of the neural network in a CTC-based speech recognition system. For the CTC model, a recurrent neural network was used, specifically Long Short-Term Memory (LSTM), an effective and widely used architecture. Data were downloaded from the VCTK corpus of the University of Edinburgh. The optimization algorithms were evaluated by label error rate and CTC loss.
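The update rules of the three compared optimizers can be shown on a toy quadratic loss standing in for the real CTC loss. This is a generic illustration of the algorithms themselves, not the paper's LSTM training setup:

```python
import numpy as np

def optimize(update, steps=500, lr=0.1):
    """Minimize the toy quadratic f(w) = 0.5*(w0^2 + 10*w1^2) and
    return the final loss reached by the given update rule."""
    w = np.array([3.0, 3.0])
    state = {}
    for _ in range(steps):
        grad = np.array([1.0, 10.0]) * w
        w = update(w, grad, lr, state)
    return float(0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2))

def momentum(w, g, lr, st):
    v = st.get("v", np.zeros_like(w))
    st["v"] = 0.9 * v - lr * g            # velocity accumulates past gradients
    return w + st["v"]

def adagrad(w, g, lr, st):
    st["s"] = st.get("s", np.zeros_like(w)) + g ** 2
    return w - lr * g / (np.sqrt(st["s"]) + 1e-8)  # per-parameter decay

def adam(w, g, lr, st, b1=0.9, b2=0.999):
    t = st["t"] = st.get("t", 0) + 1
    st["m"] = b1 * st.get("m", np.zeros_like(w)) + (1 - b1) * g
    st["v"] = b2 * st.get("v", np.zeros_like(w)) + (1 - b2) * g ** 2
    mhat, vhat = st["m"] / (1 - b1 ** t), st["v"] / (1 - b2 ** t)
    return w - lr * mhat / (np.sqrt(vhat) + 1e-8)

for name, upd in [("momentum", momentum), ("adagrad", adagrad), ("adam", adam)]:
    print(f"{name}: final loss {optimize(upd):.2e}")
```

Swapping the quadratic for a real loss (and `w` for the LSTM's parameters) gives the comparison setup the paper describes, with convergence judged by CTC loss instead of this toy objective.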

