Combining augmented statistical noise suppression and framewise speech/non-speech classification for robust voice activity detection

APSIPA Transactions on Signal and Information Processing ◽

10.1017/atsip.2017.8 ◽

2017 ◽

Vol 6 ◽

Author(s):

Yasunari Obuchi

Keyword(s):

Noise Suppression ◽

Frequency Component ◽

Voice Activity Detection ◽

Training Data ◽

Activity Detection ◽

Noisy Environments ◽

Model Based ◽

Speech Classification ◽

Voice Activity ◽

Statistical Noise

This paper proposes a new voice activity detection (VAD) algorithm based on statistical noise suppression and framewise speech/non-speech classification. Many VAD algorithms have been developed that are robust in noisy environments, and the most successful ones are related to statistical noise suppression in some way. Accordingly, we formulate our VAD algorithm as a combination of noise suppression and subsequent framewise classification. The noise suppression part is improved by introducing the idea that any unreliable frequency component should be removed, so that the decision is made from the remaining signal. This augmentation can be realized using a few additional parameters embedded in the gain-estimation process. The framewise classification part can be either model-less or model-based. A model-less classifier has the advantage that it can be applied to any situation, even if no training data are available. In contrast, a model-based classifier (e.g., a neural network-based classifier) requires training data but tends to be more accurate. The accuracy of the proposed algorithm is evaluated using the CENSREC-1-C public framework and confirmed to be superior to that of many existing algorithms.
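The two-stage structure described in this abstract (noise suppression followed by a framewise decision) can be illustrated with a minimal sketch. This is not the algorithm from the paper: the Wiener-like gain, the `gain_floor` parameter, the fixed `energy_threshold`, and the function names are all assumptions chosen for illustration. The sketch only shows the general idea of discarding unreliable frequency bins and deciding from what remains, using a model-less classifier.

```python
# Illustrative sketch only; the gain rule and all parameter names are assumptions,
# not the method published in the paper.

def suppress_frame(power, noise_power, gain_floor=0.1):
    """Apply a Wiener-like gain per frequency bin and drop unreliable bins.

    power and noise_power are per-bin power values; noise_power entries
    are assumed to be positive.
    """
    cleaned = []
    for p, n in zip(power, noise_power):
        snr = max(p - n, 0.0) / n          # crude SNR estimate for this bin
        gain = snr / (1.0 + snr)           # Wiener-like gain in [0, 1)
        # Assumption: a bin whose gain is below gain_floor is treated as
        # unreliable and removed entirely instead of being attenuated.
        cleaned.append(gain * gain * p if gain >= gain_floor else 0.0)
    return cleaned


def classify_frames(frames, noise_power, energy_threshold, gain_floor=0.1):
    """Model-less framewise decision: speech if the residual energy is large."""
    decisions = []
    for power in frames:
        residual = sum(suppress_frame(power, noise_power, gain_floor))
        decisions.append(residual > energy_threshold)
    return decisions


noise = [1.0, 1.0, 1.0, 1.0]
frames = [
    [1.0, 1.1, 0.9, 1.0],    # noise-like frame
    [10.0, 12.0, 1.0, 1.0],  # frame with strong low-frequency energy
]
decisions = classify_frames(frames, noise, energy_threshold=1.0)
```

A model-based variant would replace the threshold comparison in `classify_frames` with a trained classifier applied to the suppressed frame, at the cost of needing training data.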

Framewise speech-nonspeech classification by neural networks for voice activity detection with statistical noise suppression

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2016.7472772 ◽

2016 ◽

Cited By ~ 5

Author(s):

Yasunari Obuchi

Keyword(s):

Neural Networks ◽

Noise Suppression ◽

Voice Activity Detection ◽

Activity Detection ◽

Voice Activity ◽

Statistical Noise

Integration of Statistical-Model-Based Voice Activity Detection and Noise Suppression for Noise Robust Speech Recogni

Recent Advances in Robust Speech Recognition Technology ◽

10.2174/978160805172411101010001 ◽

2012 ◽

pp. 1-12

Keyword(s):

Statistical Model ◽

Noise Suppression ◽

Voice Activity Detection ◽

Activity Detection ◽

Model Based ◽

Noise Robust ◽

Voice Activity

Study of integration of statistical model-based voice activity detection and noise suppression

10.21437/interspeech.2008-320 ◽

2008 ◽

Author(s):

Masakiyo Fujimoto ◽

Kentaro Ishizuka ◽

Tomohiro Nakatani

Keyword(s):

Statistical Model ◽

Noise Suppression ◽

Voice Activity Detection ◽

Activity Detection ◽

Model Based ◽

Voice Activity

A statistical model-based voice activity detection employing minimum classification error technique

10.21437/interspeech.2008-23 ◽

2008 ◽

Author(s):

Sang-Ick Kang ◽

Ji-Hyun Song ◽

Kye-Hwan Lee ◽

Yun-Sik Park ◽

Joon-Hyuk Chang

Keyword(s):

Statistical Model ◽

Voice Activity Detection ◽

Classification Error ◽

Activity Detection ◽

Model Based ◽

Minimum Classification Error ◽

Voice Activity

Noise robust model-based voice activity detection

10.21437/interspeech.2006-536 ◽

2006 ◽

Author(s):

Ángel de la Torre ◽

Javier Ramírez ◽

Carmen Benítez ◽

José C. Segura ◽

L. García ◽

...

Keyword(s):

Voice Activity Detection ◽

Activity Detection ◽

Model Based ◽

Robust Model ◽

Noise Robust ◽

Voice Activity

Discriminative Weight Training for a Statistical Model-Based Voice Activity Detection

IEEE Signal Processing Letters ◽

10.1109/lsp.2007.913595 ◽

2008 ◽

Vol 15 ◽

pp. 170-173 ◽

Cited By ~ 22

Author(s):

Sang Ick Kang ◽

Q.H. Jo ◽

Joon Hyuk Chang

Keyword(s):

Statistical Model ◽

Weight Training ◽

Voice Activity Detection ◽

Activity Detection ◽

Model Based ◽

Voice Activity

A wavelet-based voice activity detection algorithm in noisy environments

9th International Conference on Electronics, Circuits and Systems ◽

10.1109/icecs.2002.1046417 ◽

2003 ◽

Cited By ~ 8

Author(s):

Shi-Huang Chen ◽

Jhing-Fa Wang

Keyword(s):

Detection Algorithm ◽

Voice Activity Detection ◽

Activity Detection ◽

Noisy Environments ◽

Voice Activity

Combining speech energy and edge information for fast and efficient voice activity detection in noisy environments

2008 19th International Conference on Pattern Recognition ◽

10.1109/icpr.2008.4761906 ◽

2008 ◽

Author(s):

Xiaokun Li ◽

Yunbin Deng

Keyword(s):

Voice Activity Detection ◽

Activity Detection ◽

Noisy Environments ◽

Edge Information ◽

Voice Activity

Real Time Implementation of Voice Activity Detection based on False Acceptance Regulation

International Journal on Electrical Engineering and Informatics ◽

10.15676/ijeei.2020.12.3.13 ◽

2020 ◽

Vol 12 (3) ◽

pp. 654-666

Author(s):

Charaf Eddine Chelloug ◽

Atef Farrouki

Keyword(s):

Real Time ◽

Voice Activity Detection ◽

Activity Detection ◽

Speech Compression ◽

Noisy Environments ◽

Signal Energy ◽

Active Voice ◽

Voice Activity ◽

Time Acquisition ◽

False Acceptance

In speech compression systems, Voice Activity Detection (VAD) is frequently used to distinguish active voice from noise. In this paper, a robust VAD approach is presented to deal with non-stationary noisy environments. The proposed algorithm uses an adaptive thresholding technique to maintain a desired False Acceptance (FA) rate. Iterative hypothesis tests on signal energy are used to accept or discard successive audio frames as active voice. Relying on the stationarity property of speech, we provide a smoothing method to obtain the final VAD decisions. The main contribution of the proposed algorithm is its ability to automatically adjust the energy threshold according to the local noise estimator. We evaluated the proposed approach by comparing it with G.729-B on the NOIZEUS database. The VAD architecture is implemented on a microcontroller-based (MCU) system. Several tests were conducted with real-time acquisition via the input/output ports of the MCU system.
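The combination of an energy threshold that follows a local noise estimate and a smoothing step on the frame decisions can be sketched as follows. This is a simplified illustration under stated assumptions, not the published method: the iterative hypothesis tests and the explicit FA-rate control are replaced here by a fixed scale factor `alpha`, the noise estimate is a simple recursive average, and the smoothing is a basic hangover counter. All names and default values are hypothetical.

```python
# Illustrative sketch only; alpha, smoothing, and hangover are assumed
# parameters and do not reproduce the FA-rate regulation of the paper.

def adaptive_energy_vad(energies, alpha=3.0, smoothing=0.9, hangover=2):
    """Return one boolean decision per frame energy (True = active voice)."""
    if not energies:
        return []
    noise = energies[0]      # assumption: the first frame contains noise only
    remaining = 0            # hangover frames still to be marked as voice
    decisions = []
    for e in energies:
        threshold = alpha * noise            # threshold tracks the noise level
        if e > threshold:
            remaining = hangover
            decisions.append(True)
        else:
            # Update the local noise estimate only on sub-threshold frames.
            noise = smoothing * noise + (1.0 - smoothing) * e
            if remaining > 0:
                remaining -= 1
                decisions.append(True)       # smoothing: extend the voice segment
            else:
                decisions.append(False)
    return decisions


energies = [1.0, 1.1, 10.0, 9.0, 1.0, 1.0, 1.0, 1.0]
decisions = adaptive_energy_vad(energies)
```

In this sketch a larger `alpha` lowers the chance of accepting a noise frame as voice, which is the role the desired FA rate plays in the abstract, and the hangover keeps short dips in energy from splitting a voice segment.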

DNN-Based Voice Activity Detection Using Auxiliary Speech Models in Noisy Environments

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2018.8461551 ◽

2018 ◽

Cited By ~ 3

Author(s):

Yuuki Tachioka

Keyword(s):

Voice Activity Detection ◽

Activity Detection ◽

Noisy Environments ◽

Voice Activity
