NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals

Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2248 ◽  
Author(s):  
Debadatta Dash ◽  
Paul Ferrari ◽  
Satwik Dutta ◽  
Jun Wang

Neural speech decoding-driven brain-computer interface (BCI) or speech-BCI is a novel paradigm for exploring communication restoration for locked-in (fully paralyzed but aware) patients. Speech-BCIs aim to map a direct transformation from neural signals to text or speech, which has the potential for a higher communication rate than the current BCIs. Although recent progress has demonstrated the potential of speech-BCIs from either invasive or non-invasive neural signals, the majority of the systems developed so far still assume knowing the onset and offset of the speech utterances within the continuous neural recordings. This lack of real-time voice/speech activity detection (VAD) is a current obstacle for future applications of neural speech decoding wherein BCI users can have a continuous conversation with other speakers. To address this issue, in this study, we attempted to automatically detect the voice/speech activity directly from the neural signals recorded using magnetoencephalography (MEG). First, we classified the whole segments of pre-speech, speech, and post-speech in the neural signals using a support vector machine (SVM). Second, for continuous prediction, we used a long short-term memory-recurrent neural network (LSTM-RNN) to efficiently decode the voice activity at each time point via its sequential pattern-learning mechanism. Experimental results demonstrated the possibility of real-time VAD directly from the non-invasive neural signals with about 88% accuracy.
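The two evaluation settings described above (whole-segment classes and per-time-point decisions) both rely on a label for every sample of the recording. The following is a minimal sketch of how such labels and a point-wise accuracy could be computed. It is not the authors' pipeline: the function names, the integer class codes, and the convention that the onset is inclusive and the offset exclusive are all assumptions made for illustration.

```python
def segment_labels(n_samples, onset, offset):
    """Label every sample as 0 (pre-speech), 1 (speech) or 2 (post-speech).

    onset is the first speech sample (inclusive) and offset is the first
    post-speech sample (exclusive). Both conventions are assumptions.
    """
    if not 0 <= onset <= offset <= n_samples:
        raise ValueError("expected 0 <= onset <= offset <= n_samples")
    return [0] * onset + [1] * (offset - onset) + [2] * (n_samples - offset)


def pointwise_accuracy(true_labels, predicted_labels):
    """Fraction of time points whose predicted label matches the true one."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and equally long")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


if __name__ == "__main__":
    truth = segment_labels(6, onset=2, offset=4)
    predicted = [0, 1, 1, 1, 2, 2]  # hypothetical decoder output
    print(truth, pointwise_accuracy(truth, predicted))
```

Any classifier that emits one label per time point, such as the LSTM-RNN mentioned in the abstract, could be scored this way.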

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Shilpa Sharma ◽  
Punam Rattan ◽  
Anurag Sharma ◽  
Mohammad Shabaz

Purpose: This paper introduces an unsupervised algorithm for voice activity detection based on maximum-margin data clustering, i.e. a support vector machine. The K-means clustering algorithm was applied earlier to speech activity detection problems, but that application did not permit the identification of voice activity, which is critical for speech recognition. Design/methodology/approach: The authors present a voice activity detector that builds on the output of a K-means algorithm and permits sliding-window detection of voice and noise. It first requires an initial pause for detection. The system initialized by the algorithm is intended to work on health-care infrastructure and to provide a platform for health-care professionals to detect the clear voice of patients. Findings: Evaluation on many recordings from NOISEX-92 shows that the average non-speech and the average signal-to-noise ratios reach levels higher than those of modern voice activity detection methods. Originality/value: The research work is original.
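As an illustration of the clustering idea only, the sketch below separates the frames of one analysis window into a noise cluster and a voice cluster by running two-cluster K-means on a single feature per frame. The choice of feature (frame energy), the initialization at the minimum and maximum values, and all function names are assumptions; the maximum-margin step described in the abstract is not shown.

```python
def two_means(values, iterations=20):
    """One-dimensional K-means with k = 2; returns (low_center, high_center)."""
    lo, hi = min(values), max(values)  # assumed initialization
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:  # all values coincide, nothing to split
            break
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:  # converged
            break
        lo, hi = new_lo, new_hi
    return lo, hi


def label_frames(values):
    """Return 1 for frames closer to the high-energy center, else 0."""
    lo, hi = two_means(values)
    return [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]


if __name__ == "__main__":
    window_energies = [0.1, 0.2, 0.15, 5.0, 6.0, 0.1]  # made-up values
    print(label_frames(window_energies))
```

In a sliding-window setting, `label_frames` would be called again each time the window advances. A window containing only noise still gets split into two clusters unless all values are identical, which is one reason an initial pause or a separate noise reference is useful.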


ETRI Journal ◽  
2011 ◽  
Vol 33 (1) ◽  
pp. 99-109 ◽  
Author(s):  
Mohammad Hossein Moattar ◽  
Mohammad Mehdi Homayounpour

Author(s):  
Xincheng Gao ◽  
Houbin Cao ◽  
Jianfeng Zhang ◽  
Jinping Bai ◽  
Tianhang Zhang ◽  
...  

Entropy ◽  
2016 ◽  
Vol 18 (8) ◽  
pp. 298 ◽  
Author(s):  
R. Johny Elton ◽  
P. Vasuki ◽  
J. Mohanalin

2010 ◽  
Vol 24 (3) ◽  
pp. 531-543 ◽  
Author(s):  
Shi-Huang Chen ◽  
Rodrigo Capobianco Guido ◽  
Trieu-Kien Truong ◽  
Yaotsu Chang

2021 ◽  
Vol 14 ◽  
Author(s):  
Jukka Ranta ◽  
Manu Airaksinen ◽  
Turkka Kirjavainen ◽  
Sampsa Vanhatalo ◽  
Nathan J. Stevenson

Objective: To develop a non-invasive and clinically practical method for long-term monitoring of infant sleep cycling in the intensive care unit. Methods: Forty-three infant polysomnography recordings were performed at 1–18 weeks of age, including a piezo element bed mattress sensor to record respiratory and gross-body movements. The hypnogram scored from polysomnography signals was used as the ground truth in training sleep classifiers based on 20,022 epochs of movement and/or electrocardiography signals. Three classifier designs were evaluated in the detection of deep sleep (N3 state): support vector machine (SVM), Long Short-Term Memory neural network, and convolutional neural network (CNN). Results: Deep sleep was accurately identified from other states with all classifier variants. The SVM classifier based on a combination of movement and electrocardiography features had the highest performance (AUC 97.6%). An SVM classifier based on only movement features had comparable accuracy (AUC 95.0%). The feature-independent CNN resulted in roughly comparable accuracy (AUC 93.3%). Conclusion: Automated non-invasive tracking of sleep state cycling is technically feasible using measurements from a piezo element situated under a bed mattress. Significance: An open source infant deep sleep detector of this kind allows quantitative, continuous bedside assessment of an infant's sleep cycling.
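The AUC figures quoted above summarize how well a classifier's scores rank deep-sleep epochs above all other epochs. A minimal sketch of that metric follows, computed as the fraction of (positive, negative) pairs ranked correctly, with ties counted as half. The function name and the example scores are hypothetical, and this is not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the target class (e.g. deep sleep), 0 otherwise.
    scores: classifier outputs, where higher means "more likely target".
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    epoch_labels = [1, 1, 0, 0]           # made-up ground truth
    epoch_scores = [0.9, 0.4, 0.5, 0.1]   # made-up classifier outputs
    print(roc_auc(epoch_labels, epoch_scores))
```

The pairwise form is quadratic in the number of epochs. For data sets of the size mentioned in the abstract, a rank-based implementation from an established library would be the more practical choice.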


Author(s):  
Charaf Eddine Chelloug ◽  
Atef Farrouki ◽  

In speech compression systems, Voice Activity Detection (VAD) is frequently used to distinguish active voice from other noisy sounds. In this paper, a robust approach to VAD is presented to deal with non-stationary noisy environments. The proposed algorithm exploits an adaptive thresholding technique to maintain a desired False Acceptance (FA) rate. Iterative hypothesis tests, using signal energy, are implemented to discard or accept successive audio frames as active voice. Based on the stationarity property of speech, we provide a smoothing method to obtain the final VAD decisions. The main contribution of the proposed algorithm is its ability to automatically adjust the energy threshold according to the local noise estimator. We analyzed the proposed approach by comparing it with G.729-B on the NOIZEUS database. The VAD architecture is implemented on a microcontroller-based system (MCU). Several tests have been conducted by performing real-time acquisition via the Input/Output ports of the MCU system.
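The general idea of an energy threshold that follows a local noise estimate, followed by smoothing of the frame decisions, can be sketched as below. This is a simplified illustration, not the algorithm of the paper: it uses a fixed multiplicative margin rather than a threshold derived from a target False Acceptance rate, a single comparison per frame rather than iterative hypothesis tests, and a simple hangover counter as the smoothing step. The parameters `margin`, `alpha` and `hangover` are hypothetical tuning values.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def adaptive_energy_vad(energies, margin=3.0, alpha=0.9, hangover=2):
    """Return one 0/1 decision per frame (1 = active voice)."""
    # Assumption: the first frame contains noise only.
    noise = max(energies[0], 1e-12)
    decisions = []
    hold = 0  # remaining hangover frames
    for energy in energies:
        threshold = margin * noise  # threshold tracks the noise estimate
        if energy > threshold:
            decisions.append(1)
            hold = hangover
        elif hold > 0:
            decisions.append(1)  # smoothing: bridge short energy dips
            hold -= 1
        else:
            decisions.append(0)
            # Update the noise estimate only on frames judged to be noise.
            noise = alpha * noise + (1 - alpha) * energy
    return decisions


if __name__ == "__main__":
    energies = [1.0, 1.0, 1.0, 10.0, 10.0, 1.0, 1.0, 1.0, 1.0]
    print(adaptive_energy_vad(energies))
```

Because the noise estimate is frozen while voice is detected, a sudden and lasting rise in background noise can keep the detector active. Handling that case is where a more careful design, such as the one described in the abstract, would differ from this sketch.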


Author(s):  
Saurav Dubey ◽  
Arash Mahnan ◽  
Jürgen Konczak

Speech analysis using microphones can be problematic for Voice Activity Detection (VAD) in the presence of background noise. This study explored the use of wearable accelerometers instead of microphones. We assessed whether accelerometers placed on the neck can be part of a VAD system embedded in a wearable collar-like device that delivers vibro-tactile stimulation (VTS) to the larynx during speech as a therapy for patients with the voice disorder spasmodic dysphonia. Specifically, we aimed to a) find the ideal location for placing accelerometers on the neck, and b) develop a VAD algorithm that detects the onset and offset of speech. Six healthy adult participants (M/F = 3/3, age = 26 (5.1)) vocalized 20 sample sentences with and without VTS at three neck locations: 1) thyroid cartilage, 2) sternocleidomastoid, and 3) posterior neck above C7. Based on time-synchronized acceleration and audio signals, the VAD algorithm identified the Number of Onsets of Speech and the Total Time Voiced. The thyroid cartilage attachment location had over 90% accuracy in detecting speech on both measures. The average accuracies of the sternocleidomastoid and C7 locations were below 75% and 15%, respectively. VAD accuracy decreased in trials with VTS at all locations. We conclude that accelerometer signals due to tissue motion at the thyroid cartilage are most suitable for real-time VAD. These findings support the feasibility of accelerometer-based voice detection for use in medical devices that target speech and voice disorders.
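The two outcome measures named above, Number of Onsets of Speech and Total Time Voiced, can both be derived from a sequence of frame-level voice decisions. The sketch below shows one way to do so. It assumes a binary decision per frame and a fixed frame duration, and the function names are hypothetical. How the study obtained its frame decisions from the acceleration signal is not shown.

```python
def speech_segments(decisions, frame_s):
    """Return (onset_s, offset_s) pairs from frame-level 0/1 decisions.

    frame_s is the duration of one frame in seconds (an assumed constant).
    """
    segments = []
    start = None  # index of the frame where the current segment began
    for i, voiced in enumerate(decisions):
        if voiced and start is None:
            start = i  # onset: transition from silence to voice
        elif not voiced and start is not None:
            segments.append((start * frame_s, i * frame_s))  # offset
            start = None
    if start is not None:  # recording ended while still voiced
        segments.append((start * frame_s, len(decisions) * frame_s))
    return segments


def summarize(decisions, frame_s):
    """Return (number of onsets, total time voiced in seconds)."""
    segments = speech_segments(decisions, frame_s)
    total_voiced = sum(offset - onset for onset, offset in segments)
    return len(segments), total_voiced


if __name__ == "__main__":
    frames = [0, 1, 1, 0, 0, 1, 1, 1]  # made-up decisions, 0.5 s per frame
    print(speech_segments(frames, 0.5))
    print(summarize(frames, 0.5))
```

Comparing these two numbers between an accelerometer-based detector and a reference audio track is one straightforward way to obtain per-measure accuracies of the kind reported in the abstract.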

