Polynomial dynamic time warping kernel support vector machines for dysarthric speech recognition with sparse training data

Author(s):  
Vincent Wan ◽  
James Carmichael
2020 ◽  
Vol 12 (6) ◽  
pp. 2403 ◽  
Author(s):  
Ahmed Ismail ◽  
Samir Abdlerazek ◽  
Ibrahim M. El-Henawy

This paper presents a speech recognition-based solution that gives elderly people, patients, and disabled people an easy-to-use control system. The goal is to build a low-cost system that uses speech recognition to access Internet of Things (IoT) devices installed in smart homes and hospitals without relying on a centralized supervisory system. The proposed system uses a Raspberry Pi board to control home appliances wirelessly from smartphones. Its main purpose is to facilitate interaction between the user and home appliances through IoT communications driven by speech commands. The main contribution of the proposed framework is a hybrid of a Support Vector Machine (SVM) and the Dynamic Time Warping (DTW) algorithm that enhances the speech recognition process. The result is a machine learning-based system for controlling smart devices through speech commands with an accuracy of 97%. The results helped patients and elderly people to access and control compatible IoT devices using speech recognition. The proposed speech recognition system is flexible and scalable in adapting to existing smart IoT devices, and it provides privacy in managing patient devices. The research provides an effective method for integrating such systems across medical institutions to help elderly people and patients.
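The abstract does not say how the SVM and DTW are combined. One common way to pair them, shown here only as a minimal sketch, is to compute DTW alignment costs between utterances and turn them into similarity values that an SVM can consume as a precomputed kernel. The function names, the Euclidean frame distance, and the `gamma` value are all assumptions for illustration, not details taken from the paper.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic DTW alignment cost between two sequences of frames."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # repeat a frame of seq_b
                cost[i][j - 1],      # repeat a frame of seq_a
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def dtw_gram_matrix(utterances, gamma=0.1):
    """Similarity matrix exp(-gamma * DTW cost); gamma is an assumed value."""
    return [
        [math.exp(-gamma * dtw_distance(u, v)) for v in utterances]
        for u in utterances
    ]
```

A matrix built this way could be passed to an SVM implementation that accepts precomputed kernels. Note that a DTW-derived similarity is not guaranteed to be positive semi-definite, so this is a heuristic rather than a true kernel in the strict sense.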


2014 ◽  
Vol 490-491 ◽  
pp. 1347-1355
Author(s):  
Xiang Lilan Zhang ◽  
Ji Ping Sun ◽  
Xu Hui Huang ◽  
Zhi Gang Luo

Lightweight speaker-dependent (SD) automatic speech recognition (ASR) is a promising solution to two problems: the risk of disclosing personal information, and the difficulty of obtaining training material for many seldom-used English words and (often non-English) names. The dynamic time warping (DTW) algorithm is the state-of-the-art algorithm for small-footprint SD ASR applications, which have limited storage space and a small vocabulary. In our previous work, we developed two fast and accurate DTW variations for clean speech data. However, speech recognition in adverse conditions remains a major challenge. To improve recognition accuracy in noisy and poor recording conditions, such as a recording volume that is too high or too low, we introduce a novel weighted DTW method. This method defines a feature index for each time frame of the training data and then applies it in the core DTW process to tune the final alignment score. In extensive experiments on one representative SD dataset of three speakers' recordings, our method achieves better accuracy than DTW, with a 0.5% relative reduction of error rate (RRER) on clean speech data and a 7.5% RRER on noisy and poorly recorded speech data. To the best of our knowledge, our new weighted DTW is the first weighted DTW method specially designed for speech data in noisy and poor recording conditions.
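The idea of a per-frame index that tunes the alignment score can be illustrated with a small sketch. Here each frame of the training template carries a weight that scales its local cost inside the DTW recursion, so unreliable frames contribute less to the final score. How the paper actually derives its feature index is not stated in the abstract; the `weights` argument, the city-block frame distance, and the function name below are assumptions.

```python
def weighted_dtw(template, weights, query):
    """DTW cost where template frame i contributes with weight weights[i].

    template, query: sequences of equal-dimension feature frames.
    weights: one non-negative number per template frame (hypothetical
    stand-in for the per-frame feature index described in the abstract).
    """
    if len(weights) != len(template):
        raise ValueError("need exactly one weight per template frame")
    n, m = len(template), len(query)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # City-block distance between frames, scaled by the frame weight.
            local = weights[i - 1] * sum(
                abs(x - y) for x, y in zip(template[i - 1], query[j - 1])
            )
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

With all weights set to 1 this reduces to ordinary DTW, which makes it easy to compare the two scores on the same pair of recordings.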


Speech recognition using a support vector machine assisted by the dynamic time warping (DTW) method is proposed. The input training data are collected from 40 speakers for five unique words. All of the data were gathered in a highly acoustic, noise-proof environment. Mel frequency cepstral coefficients (MFCCs) represent the static properties of the signal. The first and second derivatives of the MFCCs are used for the dynamic properties. After the feature vectors are determined, a modified DTW technique is proposed for feature matching. A Support Vector Machine (SVM) together with a Radial Basis Function (RBF) is used for classification. The model is tested on multiple speakers and a good detection rate is obtained.
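The first and second derivatives of the MFCCs are usually called delta and delta-delta features. The sketch below uses the standard regression formula over a window of neighbouring frames, with edge frames repeated at the boundaries. The window size of 2 and the function names are assumptions, and the MFCC frames themselves are taken as given rather than computed here.

```python
def delta(frames, window=2):
    """Regression-based time derivative of a sequence of feature frames.

    d_t = sum_k k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2), k = 1..window,
    with indices clamped to the valid range (edge frames are repeated).
    """
    n_frames = len(frames)
    if n_frames == 0:
        return []
    dim = len(frames[0])
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n_frames):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n_frames - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames, window=2):
    """Append delta and delta-delta coefficients to each static frame."""
    d1 = delta(frames, window)
    d2 = delta(d1, window)
    return [list(c) + a + b for c, a, b in zip(frames, d1, d2)]
```

Each output frame is three times as long as the input frame: the static coefficients, followed by their first and second derivatives.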


2014 ◽  
Vol 29 (6) ◽  
pp. 1072-1082 ◽  
Author(s):  
Xiang-Lilan Zhang ◽  
Zhi-Gang Luo ◽  
Ming Li
