AUDD: Audio Urdu Digits Dataset for Automatic Audio Urdu Digit Recognition

2021 ◽  
Vol 11 (19) ◽  
pp. 8842
Author(s):  
Aisha Aiman ◽  
Yao Shen ◽  
Malika Bendechache ◽  
Irum Inayat ◽  
Teerath Kumar

The ongoing development of audio datasets for numerous languages has spurred research into designing smart speech recognition systems. A typical speech recognition system can be applied in many emerging applications, such as smartphone dialing, airline reservations, and automatic wheelchairs, among others. Urdu is a national language of Pakistan and is also widely spoken in many other South Asian countries (e.g., India, Afghanistan). Therefore, we present a comprehensive dataset of spoken Urdu digits ranging from 0 to 9. Our dataset has 25,518 sound samples collected from 740 participants. To test the proposed dataset, we apply several existing classification algorithms to it, including Support Vector Machine (SVM), Multilayer Perceptron (MLP), and variants of EfficientNet. These algorithms serve as baselines. Furthermore, we propose a convolutional neural network (CNN) for audio digit classification. We conduct experiments using these networks, and the results show that the proposed CNN is efficient and outperforms the baseline algorithms in terms of classification accuracy.

1988 ◽  
Vol 16 (3) ◽  
pp. 283-293 ◽  
Author(s):  
L. E. Hiner

Speech recognition systems promise to facilitate access to computers for users with disabilities. This study examined the usefulness of the Texas Instruments Speech Recognition System in completing word processing tasks under the experimental conditions of 1) keyboard input only, 2) speech recognition only, and 3) a combination of keyboard input and speech recognition. Five subjects with some degree of upper-body disability were tested; the results indicate that performance was 1) greatest under the keyboard-only condition, 2) lowest under the speech-only condition, and 3) somewhat lower under the combined condition than under the keyboard-only condition. Based on the findings, suggestions for further research were made.


2011 ◽  
Vol 467-469 ◽  
pp. 1905-1910
Author(s):  
Jun Feng Zhao ◽  
Ye Ping Zhu

This paper introduces the characteristics and requirements of speech recognition technology on embedded platforms. It also describes the basic theory and related properties of the Support Vector Machine (SVM). The advantages and disadvantages of multiclass SVM algorithms are analyzed, providing the algorithmic principles for SVM training and recognition in an embedded speech recognition system. Finally, we propose a design strategy based on a multiclass SVM decision tree classifier, combined with the features of embedded speech recognition.
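As a rough illustration of the decision-tree idea, the sketch below arranges binary classifiers in a chain so that K classes need only K-1 binary decisions. It is not the authors' design: the split order, the toy 2-D data, and all names are assumptions, and a perceptron-trained linear classifier stands in for each binary SVM to keep the example dependency-free.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


class BinaryLinear:
    """Stand-in for a binary SVM: a linear classifier trained with the perceptron rule."""

    def __init__(self, dim: int) -> None:
        self.w = [0.0] * dim
        self.b = 0.0

    def score(self, x: Vector) -> float:
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def fit(self, xs: List[Vector], ys: List[int], max_epochs: int = 1000) -> None:
        # ys holds +1 / -1; stop as soon as one full pass makes no mistakes.
        for _ in range(max_epochs):
            mistakes = 0
            for x, y in zip(xs, ys):
                if y * self.score(x) <= 0:
                    self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
                    self.b += y
                    mistakes += 1
            if mistakes == 0:
                break


Tree = Tuple[List[Tuple[int, BinaryLinear]], int]


def train_tree(xs: List[Vector], labels: List[int], order: List[int]) -> Tree:
    """Node k separates class order[k] from the classes that come after it."""
    nodes = []
    for k, cls in enumerate(order[:-1]):
        remaining = set(order[k:])
        subset = [(x, 1 if lab == cls else -1)
                  for x, lab in zip(xs, labels) if lab in remaining]
        clf = BinaryLinear(len(xs[0]))
        clf.fit([x for x, _ in subset], [y for _, y in subset])
        nodes.append((cls, clf))
    return nodes, order[-1]


def predict(tree: Tree, x: Vector) -> int:
    """Walk the nodes in order; the first positive score decides the class."""
    nodes, fallback = tree
    for cls, clf in nodes:
        if clf.score(x) > 0:
            return cls
    return fallback


# Hypothetical, linearly separable toy data: three clusters in the plane.
POINTS = [(0, 0), (1, 0), (0, 1), (1, 1),
          (6, 0), (7, 0), (6, 1), (7, 1),
          (0, 6), (1, 6), (0, 7), (1, 7)]
LABELS = [0] * 4 + [1] * 4 + [2] * 4

tree = train_tree(POINTS, LABELS, order=[0, 1, 2])
print([predict(tree, p) for p in POINTS])
```

One appeal of this arrangement on constrained hardware is that a prediction evaluates at most K-1 classifiers, fewer than a one-vs-one scheme would need.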


Speech recognition systems are used to enable fast communication between humans and machines. A number of speech processing systems have been developed by various researchers, for example for speech recognition, speaker verification, and speaker recognition. The basic stages of a speech recognition system are pre-processing, feature extraction, feature selection, and classification. Numerous works have aimed to improve each of these stages to obtain more accurate results. This paper focuses on the addition of machine learning to speech recognition systems. It first covers the architecture of automatic speech recognition (ASR), which gives an idea of the basic stages of a speech recognition system, and then turns to the use of machine learning in ASR. The work done by various researchers using support vector machines (SVM) and artificial neural networks (ANN) is covered in one section of the paper, along with a review of work done using SVM, ELM, ANN, Naive Bayes, and kNN classifiers. The simulation results show that the best accuracy is achieved using the ELM classifier. The last section of the paper covers the results obtained with the proposed approaches, which use SVM, ANN with the cuckoo search algorithm, and ANN with back propagation as classifiers. The improvement of the pre-processing and feature extraction processes is also addressed.
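The stages named above (pre-processing, feature extraction, classification) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the pipeline reviewed in the paper: synthetic tones stand in for speech, short-time energy and zero-crossing rate stand in for real acoustic features, a nearest-centroid rule stands in for the SVM/ELM/ANN classifiers, and feature selection is skipped. All function names and parameters are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def tone(freq_hz: float, n: int = 1600, rate: int = 8000) -> List[float]:
    """Synthetic test signal (assumption: a pure sine replaces recorded speech)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def pre_emphasis(signal: List[float], alpha: float = 0.97) -> List[float]:
    """Pre-processing: boost high frequencies with a first-order filter."""
    return [signal[0]] + [signal[i] - alpha * signal[i - 1]
                          for i in range(1, len(signal))]


def extract_features(signal: List[float], size: int = 200,
                     hop: int = 100) -> Tuple[float, float]:
    """Feature extraction: (log mean frame energy, mean zero-crossing rate)."""
    emphasized = pre_emphasis(signal)
    frames = [emphasized[s:s + size]
              for s in range(0, len(emphasized) - size + 1, hop)]
    energies = [sum(v * v for v in f) / size for f in frames]
    zcrs = [sum(1 for a, b in zip(f, f[1:]) if (a < 0) != (b < 0)) / (size - 1)
            for f in frames]
    n = len(frames)
    return math.log(sum(energies) / n + 1e-12), sum(zcrs) / n


def train(examples: Dict[str, List[List[float]]]) -> Dict[str, Tuple[float, ...]]:
    """Classification, training step: one feature centroid per label."""
    centroids = {}
    for label, signals in examples.items():
        feats = [extract_features(s) for s in signals]
        centroids[label] = tuple(sum(col) / len(feats) for col in zip(*feats))
    return centroids


def classify(model: Dict[str, Tuple[float, ...]], signal: List[float]) -> str:
    """Classification, prediction step: pick the nearest centroid."""
    feats = extract_features(signal)
    return min(model, key=lambda label: math.dist(feats, model[label]))


model = train({"low": [tone(200), tone(250)], "high": [tone(2000), tone(2200)]})
print(classify(model, tone(300)), classify(model, tone(1800)))
```

Swapping the centroid rule for a stronger classifier changes only `train` and `classify`; the earlier stages stay the same, which is why the stages are usually studied separately.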


Author(s):  
Yildiz Aydin ◽  
Funda Akar

Among the many applications of computer vision, face recognition is a subject that has been studied extensively for a long time. In general, the success of face recognition systems, which consist of feature extraction and classification steps, depends not only on the classifier but also on the features used. In a face recognition system, the aim of feature selection is to obtain distinctive features for recognizing the different facial images of interest. For this purpose, SIFT, SURF, and SIFT + SURF features, which are invariant to scaling and affine transformations, are used in this study. In addition, for comparison with these local features, the HOG feature, which is a global feature, is also included in the study. Classification was performed using a support vector machine. Experimental results show that the local features are more successful than the global HOG feature.


2021 ◽  
Vol 8 (1) ◽  
pp. 164-170
Author(s):  
Mohammad Husam Alhumsi ◽  
Saleh Belhassen

Phonetic dictionaries are regarded as pivotal components of speech recognition systems. The aim of speech recognition research is to build a machine that can accurately identify and distinguish normal human speech from any speaker. The literature affirms that Arabic phonetics is one of the major problems in Arabic speech recognition. Therefore, this paper reviews previous studies tackling the challenges faced in creating an Arabic phonetic dictionary for Arabic speech recognition. It was found that speech recognition systems have investigated areas of difference concerning Arabic phonetics. In addition, an Arabic phonetic dictionary should be created in which the phonemes of the Arabic vowels are considered a component of the phonemes of the consonants. Thus, the incorporation of developed machine translation systems may enhance the quality of the system. The paper concludes with the existing challenges faced by Arabic phonetic dictionaries.


Author(s):  
Nitin Sharma ◽  
Pawan Kumar Dahiya ◽  
Baldev Raj Marwah

Automatic licence plate recognition systems are used for various applications such as traffic monitoring, toll collection, car parking, and law enforcement. In this paper, an automatic licence plate recognition system based on a convolutional neural network and a support vector machine is proposed. First, the characters are extracted from the input image of the vehicle. Then the characters are segmented and their features are extracted. The extracted features are classified using the convolutional neural network and the support vector machine for the final recognition of the licence plate. The recognition rate obtained by the hybridization of the convolutional neural network and the support vector machine is 96.5%. The recognition rate obtained for the proposed hybrid system is compared with those of three other automatic licence plate recognition systems based on a neural network, a support vector machine, and a convolutional neural network. The proposed system performs better than the neural network, support vector machine, and convolutional neural network based systems.

