Lip Movement Feature Detection and Classification Methods

Author(s):  
Siddharth Mujumdar ◽  
Monika Borse ◽  
Jignasa Shah ◽  
Gunjan Soni ◽  
Dr. Sheshang Degadwala

Computerized lip reading has been one of the most actively researched areas of computer vision in the recent past because of its crime-fighting potential and its invariance to the acoustic environment. However, several factors, such as fast speech, poor pronunciation, poor illumination, face movement, moustaches, and beards, make lip reading difficult. In the present work, we propose a solution for automatic lip contour tracking and for recognizing the English numbers 1-10 spoken by speakers, using only the information available from lip movements. In this method, the face is detected first, and the lips are then detected within that region of interest (ROI). Lip movements serve as the only input; no speech recognition system runs in parallel. The approach used in this work is found to serve the purpose of lip reading well when the database is small. This paper also compares LBP and HOG features with Support Vector Machine classification. The system can be used in modern password-based sign-in systems.
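The paper compares LBP and HOG descriptors; as a rough, generic illustration (not the authors' implementation, whose parameters are not given here), a minimal 3×3 Local Binary Pattern in NumPy looks like this:

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 Local Binary Pattern: each pixel's 8 neighbours are
    thresholded against the centre and packed into an 8-bit code."""
    g = np.asarray(gray, dtype=np.int32)
    h, w = g.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Clockwise neighbour offsets (dy, dx) starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = g[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= ((neigh >= centre).astype(np.uint8) << bit)
    return codes

def lbp_histogram(gray, bins=256):
    """Normalised histogram of LBP codes: the texture feature vector."""
    codes = lbp_image(gray)
    hist, _ = np.histogram(codes, bins=bins, range=(0, 256))
    return hist / hist.sum()

patch = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])
print(lbp_image(patch))   # single LBP code for this 3x3 patch
```

The normalised histogram of codes is the kind of texture feature vector that would be fed to the SVM; HOG features would be computed analogously from gradient orientations.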

Author(s):  
Mohammed Jawad Al Dujaili ◽  
Abbas Ebrahimi-Moghadam ◽  
Ahmed Fatlawi

Recognizing emotion in speech is one of the most active research topics in speech processing and human-computer interaction. Despite a wide range of studies in this area, there is still a long gap between the natural feelings of humans and the perception of the computer. In general, an emotion recognition system for speech can be divided into three main sections: feature extraction, feature selection, and classification. In this paper, features of fundamental frequency (F0), energy (E), zero-crossing rate (ZCR), and Fourier parameters (FP), as well as various combinations of them, are extracted from the data vector. Then, the principal component analysis (PCA) algorithm is used to reduce the number of features. To evaluate system performance, classification of each emotional state is performed using support vector machine (SVM) and K-nearest neighbor (KNN) classifiers. For comparison, similar experiments have been performed on emotional speech in the German and English languages, and significant results were obtained from these comparisons.
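Two of the listed features, short-time energy and zero-crossing rate, are simple enough to sketch directly; the frame below (a pure tone) and its sample rate are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def short_time_energy(frame):
    """Energy of one analysis frame: the sum of squared samples."""
    frame = np.asarray(frame, dtype=float)
    return float(np.sum(frame ** 2))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    frame = np.asarray(frame, dtype=float)
    return float(np.mean(np.signbit(frame[1:]) != np.signbit(frame[:-1])))

# A 50 Hz tone sampled at 800 Hz over one second:
# 2 zero crossings per period, so roughly 100 crossings in total.
t = np.arange(800) / 800.0
tone = np.sin(2 * np.pi * 50 * t)
print(short_time_energy(tone), zero_crossing_rate(tone))
```

In a full system these two scalars would be computed per frame and stacked with F0 and Fourier parameters before PCA.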


2020 ◽  
pp. 1-11
Author(s):  
Qian Hou ◽  
Cuijuan Li ◽  
Min Kang ◽  
Xin Zhao

English feature recognition has a certain influence on the development of English intelligent technology. In particular, speech recognition technology faces accuracy problems when performing English feature recognition. To improve the English feature recognition effect, this study takes an intelligent learning algorithm as the system algorithm, combines it with support vector machines to construct an English feature recognition system, and uses linear and nonlinear classifiers to complete the related recognition work. Moreover, spectral subtraction is introduced at the front end of feature extraction: the spectral amplitude of the noise is subtracted from the spectral amplitude of the noisy signal to obtain the spectral amplitude of the clean signal. Taking advantage of the insensitivity of speech to phase, the phase angle information from before spectral subtraction is used directly to reconstruct the signal after spectral subtraction, yielding the denoised speech. In addition, this study uses a nonlinear power function that simulates the hearing characteristics of the human ear to extract features from the denoised speech signal and combines English features to extend recognition. Finally, the performance of the proposed algorithm is analyzed through comparative experiments. The results show that the algorithm is effective.
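The spectral-subtraction front end described above can be sketched in a few lines of NumPy; the floor factor and the idealised noise estimate below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def spectral_subtract(noisy, noise_mag, floor=0.01):
    """Magnitude spectral subtraction for one frame.
    noisy: time-domain frame; noise_mag: estimated noise magnitude
    spectrum (e.g. averaged over speech-free frames)."""
    spec = np.fft.rfft(noisy)
    mag = np.abs(spec)
    phase = np.angle(spec)               # keep the noisy phase unchanged
    # Subtract the noise magnitude; floor the result to avoid negatives.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    clean_spec = clean_mag * np.exp(1j * phase)
    return np.fft.irfft(clean_spec, n=len(noisy))

rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
signal = np.sin(2 * np.pi * 8 * t / n)
noise = 0.3 * rng.standard_normal(n)
noisy = signal + noise
# Idealised noise estimate: the true noise magnitude spectrum.
denoised = spectral_subtract(noisy, np.abs(np.fft.rfft(noise)))
err_before = np.mean((noisy - signal) ** 2)
err_after = np.mean((denoised - signal) ** 2)
print(err_before, err_after)
```

Reusing the noisy phase, as the abstract notes, is what makes the reconstruction cheap: only magnitudes are modified.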


2020 ◽  
Vol 5 (2) ◽  
pp. 504
Author(s):  
Matthias Omotayo Oladele ◽  
Temilola Morufat Adepoju ◽  
Olaide Abiodun Olatoke ◽  
Oluwaseun Adewale Ojo

Yorùbá is one of the three main languages spoken in Nigeria. It is a tonal language that carries accents on the vowel alphabets. There are twenty-five (25) alphabets in the Yorùbá language, one of which is a digraph (GB). Due to the difficulty of typing handwritten Yorùbá documents, there is a need to develop a handwriting recognition system that can convert handwritten text to digital format. This study discusses an offline Yorùbá handwritten word recognition system (OYHWR) that recognizes Yorùbá uppercase alphabets. Handwritten characters and words were obtained from different writers using the Paint application and M708 graphics tablets. The characters were used for training and the words for testing. The images were pre-processed, and their geometric features were extracted using zoning and gradient-based feature extraction. Geometric features are the different line types that form a particular character, such as vertical, horizontal, and diagonal lines. The geometric features used are the number of horizontal lines, the number of vertical lines, the number of right diagonal lines, the number of left diagonal lines, the total length of all horizontal lines, the total length of all vertical lines, the total length of all right-slanting lines, the total length of all left-slanting lines, and the area of the skeleton. Each character is divided into 9 zones, and gradient feature extraction is used to extract the horizontal and vertical components and the geometric features in each zone. The words were fed into a support vector machine classifier, and performance was evaluated based on recognition accuracy. Since the support vector machine is a two-class classifier, a multiclass SVM variant, the least squares support vector machine (LSSVM), was used for word recognition. The one-vs-one strategy and RBF kernel were used, and the recognition accuracy obtained on the tested words ranged from 66.7% to 100% (66.7%, 83.3%, 85.7%, 87.5%, and 100%). The low recognition rate for some of the words could be a result of similarity in the extracted features.
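The zoning step described above can be sketched as follows; the 3×3 grid matches the paper's 9 zones, while the pixel-density feature per zone is a simplified stand-in for the full gradient and geometric features:

```python
import numpy as np

def zoning_features(binary_img, grid=3):
    """Split a binary character image into grid x grid zones and return
    the fraction of foreground pixels in each zone (9 values for 3x3)."""
    img = np.asarray(binary_img, dtype=float)
    h, w = img.shape
    ys = np.linspace(0, h, grid + 1, dtype=int)
    xs = np.linspace(0, w, grid + 1, dtype=int)
    feats = []
    for i in range(grid):
        for j in range(grid):
            zone = img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            feats.append(zone.mean() if zone.size else 0.0)
    return np.array(feats)

# A 6x6 "L" shape: a vertical stroke plus a horizontal base.
ch = np.zeros((6, 6), dtype=int)
ch[:, 0] = 1       # vertical line
ch[5, :] = 1       # horizontal base
print(zoning_features(ch).round(2))
```

In the full system each zone would additionally contribute line counts, line lengths, and gradient components before the vector reaches the LSSVM.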


2020 ◽  
Vol 5 (2) ◽  
pp. 609
Author(s):  
Segun Aina ◽  
Kofoworola V. Sholesi ◽  
Aderonke R. Lawal ◽  
Samuel D. Okegbile ◽  
Adeniran I. Oluwaranti

This paper presents the application of Gaussian blur filters and Support Vector Machine (SVM) techniques for greeting recognition among the Yoruba tribe of Nigeria. Existing efforts have considered various recognition gestures; however, tribal greeting postures and gestures in the Nigerian geographical space have not been studied before. Some cultural gestures are not correctly identified by people of the same tribe, let alone people from other tribes, posing a challenge of misinterpretation. Also, some cultural gestures are unknown to most people outside a tribe, which can hinder human interaction; hence, there is a need to automate the recognition of Nigerian tribal greeting gestures. This work therefore develops a Gaussian blur-SVM based system capable of recognizing the Yoruba greeting postures for men and women. Videos of individuals performing various greeting gestures were collected and processed into image frames. The images were resized, and a Gaussian blur filter was used to remove noise. A moment-based feature extraction algorithm was used to extract shape features, which were passed as input to the SVM; the SVM was trained to recognize two Nigerian tribal greeting postures. To confirm the robustness of the system, 20%, 25%, and 30% of the dataset acquired from the preprocessed images were used to test the system. A recognition rate of 94% was achieved with the SVM, which shows that the proposed method is efficient.
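The moment-based shape features mentioned above can be illustrated with raw and second-order central image moments; the exact moment set used in the paper is not specified, so this is a generic sketch:

```python
import numpy as np

def raw_moment(img, p, q):
    """Raw image moment m_pq = sum over pixels of x^p * y^q * I(y, x)."""
    img = np.asarray(img, dtype=float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    return float(np.sum((x ** p) * (y ** q) * img))

def shape_features(binary_img):
    """Area, centroid, and second-order central moments of a silhouette."""
    img = np.asarray(binary_img, dtype=float)
    m00 = raw_moment(img, 0, 0)            # area
    cx = raw_moment(img, 1, 0) / m00       # centroid x
    cy = raw_moment(img, 0, 1) / m00       # centroid y
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    mu20 = float(np.sum((x - cx) ** 2 * img))   # spread along x
    mu02 = float(np.sum((y - cy) ** 2 * img))   # spread along y
    mu11 = float(np.sum((x - cx) * (y - cy) * img))
    return np.array([m00, cx, cy, mu20, mu02, mu11])

square = np.zeros((8, 8), dtype=int)
square[2:6, 2:6] = 1          # a centred 4x4 blob
print(shape_features(square))
```

Central moments of this kind are translation-invariant, which is why they make convenient SVM inputs for posture silhouettes.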


2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i857-i865
Author(s):  
Derrick Blakely ◽  
Eamon Collins ◽  
Ritambhara Singh ◽  
Andrew Norton ◽  
Jack Lanchantin ◽  
...  

Abstract
Motivation: Gapped k-mer kernels with support vector machines (gkm-SVMs) have achieved strong predictive performance on regulatory DNA sequences with modestly sized training sets. However, existing gkm-SVM algorithms suffer from slow kernel computation time, as they depend exponentially on the sub-sequence feature length, number of mismatch positions, and the task's alphabet size.
Results: In this work, we introduce a fast and scalable algorithm for calculating gapped k-mer string kernels. Our method, named FastSK, uses a simplified kernel formulation that decomposes the kernel calculation into a set of independent counting operations over the possible mismatch positions. This simplified decomposition allows us to devise a fast Monte Carlo approximation that rapidly converges. FastSK can scale to much greater feature lengths, allows us to consider more mismatches, and is performant on a variety of sequence analysis tasks. On multiple DNA transcription factor binding site prediction datasets, FastSK consistently matches or outperforms the state-of-the-art gkmSVM-2.0 algorithms in area under the ROC curve, while achieving average speedups in kernel computation of ∼100× and speedups of ∼800× for large feature lengths. We further show that FastSK outperforms character-level recurrent and convolutional neural networks while achieving low variance. We then extend FastSK to 7 English-language medical named entity recognition datasets and 10 protein remote homology detection datasets; FastSK consistently matches or outperforms these baselines.
Availability and implementation: Our algorithm is available as a Python package and as C++ source code at https://github.com/QData/FastSK.
Supplementary information: Supplementary data are available at Bioinformatics online.
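The counting decomposition at the heart of FastSK can be caricatured as follows; this toy version enumerates the fixed-position subsets exhaustively rather than using FastSK's Monte Carlo sampling, and the function name and scoring are illustrative, not the library's API:

```python
from itertools import combinations

def gapped_kmer_kernel(s, t, k=4, m=1):
    """Toy gapped k-mer kernel: for every choice of k - m fixed
    positions inside a k-mer window, count matching 'gapped' patterns
    between the two sequences, then sum over all position choices
    (a simplified form of the independent counting operations that
    FastSK decomposes the kernel into)."""
    total = 0
    for fixed in combinations(range(k), k - m):
        def patterns(seq):
            # Histogram of the symbols seen at the fixed positions
            # of every length-k window in seq.
            counts = {}
            for i in range(len(seq) - k + 1):
                key = tuple(seq[i + j] for j in fixed)
                counts[key] = counts.get(key, 0) + 1
            return counts
        ps, pt = patterns(s), patterns(t)
        # Dot product of the two pattern histograms.
        total += sum(c * pt.get(key, 0) for key, c in ps.items())
    return total

print(gapped_kmer_kernel("ACGTACGT", "ACGTTCGA", k=4, m=1))
```

Because each position subset contributes an independent count, the subsets can be processed in parallel or subsampled, which is what makes the Monte Carlo approximation natural.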


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Hasan Mahmud ◽  
Md. Kamrul Hasan ◽  
Abdullah-Al-Tariq ◽  
Md. Hasanul Kabir ◽  
M. A. Mottalib

Symbolic gestures are hand postures with conventionalized meanings. They are static gestures that can be performed without voice, even in very complex environments with variations in rotation and scale. The gestures may be produced under different illumination conditions or against occluding backgrounds. Any hand gesture recognition system should find sufficiently discriminative features, such as hand-finger contextual information. However, existing approaches use the depth information of hand fingers, which represents finger shape, only in a limited capacity when extracting discriminative finger features. Nevertheless, if finger-bending information (i.e., a finger that overlaps the palm) is extracted from the depth map and used as local features, static gestures that vary only slightly become distinguishable. Our work corroborates this idea: we generated depth silhouettes with contrast variation to obtain more discriminative keypoints, which improved recognition accuracy to 96.84%. We applied the Scale-Invariant Feature Transform (SIFT) algorithm, which takes the generated depth silhouettes as input and produces robust feature descriptors as output. These features (after conversion into unified-dimensional feature vectors) are fed into a multiclass Support Vector Machine (SVM) classifier to measure accuracy. We tested our results on a standard dataset containing 10 symbolic gestures representing the 10 numeric symbols (0-9). We then verified and compared our results across depth images, binary images, and images consisting of hand-finger edge information generated from the same dataset. Our results show higher accuracy when SIFT features are applied to depth images. Accurately recognizing numeric symbols performed through hand gestures has a large impact on Human-Computer Interaction (HCI) applications, including augmented reality, virtual reality, and other fields.
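Generating a contrast-stretched depth silhouette, the preprocessing step described above, can be sketched as follows; the depth range and linear rescaling are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def depth_silhouette(depth, near, far, levels=255):
    """Convert a raw depth map into a contrast-stretched silhouette:
    pixels outside [near, far] become background (0); pixels inside are
    linearly rescaled so closer surfaces (e.g. bent fingers in front of
    the palm) appear brighter than the palm behind them."""
    depth = np.asarray(depth, dtype=float)
    mask = (depth >= near) & (depth <= far)
    sil = np.zeros_like(depth)
    # Invert so smaller depth (closer to the camera) maps brighter.
    sil[mask] = (far - depth[mask]) / (far - near) * levels
    return sil.astype(np.uint8), mask

depth = np.array([[900, 600, 580],
                  [950, 610, 560],
                  [1200, 640, 555]])   # depths in mm; background > 700
sil, mask = depth_silhouette(depth, near=500, far=700)
print(sil)
```

The intensity gradient this creates between overlapping fingers and the palm is what gives SIFT extra keypoints compared with a flat binary silhouette.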


2020 ◽  
Author(s):  
Zongchen Li ◽  
Wenzhuo Zhang ◽  
Guoxiong Zhou

Abstract Aiming at the difficult problem of extracting tree images from complex backgrounds, we took tree species as the research object and proposed a fast tree-image recognition system based on the Caffe platform and deep learning. In the research on deep learning algorithms based on the Caffe framework, an improved Dual-Task CNN model (DCNN) is applied to train the image extractor and classifier, accomplishing the dual tasks of image cleaning and tree classification. When compared with traditional classification methods represented by the Support Vector Machine (SVM) and a single-task CNN model, the Dual-Task CNN model demonstrates superior classification performance. Then, to further improve recognition accuracy for similar species, Gabor kernels were introduced to extract frequency-domain features from images at different scales and orientations, so as to enhance the texture features of leaf images and improve recognition. The improved model was tested on datasets of similar species. As demonstrated by the results, the improved deep Gabor convolutional neural network (GCNN) is advantageous in tree recognition and similar-tree classification compared with the Dual-Task CNN classification method. Finally, the recognition results can also be displayed in the application's graphical interface. The graphical interface, designed on the Ubuntu system, supports functions such as quick reading of and searching for picture files, snapshot, one-key recognition, one-key e
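The Gabor kernels introduced above for texture enhancement can be generated directly; the filter size, wavelength, and orientations below are illustrative choices, not the paper's tuned values:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a 2-D Gabor filter: a Gaussian envelope modulating
    a cosine plane wave at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates into the filter's orientation.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

# A small bank over 4 orientations, as one might use to accentuate
# leaf texture at several scales and directions before the CNN.
bank = [gabor_kernel(15, wavelength=6.0, theta=t, sigma=3.0)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
print(bank[0].shape)
```

Convolving an input leaf image with such a bank yields oriented-texture response maps; in a GCNN these responses feed (or initialise) early convolutional layers.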


2020 ◽  
Vol 8 (5) ◽  
pp. 3978-3983

Identification of a person’s speech from lip movement is a challenging task. Even though many software tools are available for speech-to-text recognition and vice versa, some of the words uttered may not be recognized accurately as spoken and may vary from person to person because of pronunciation. In addition, in a noisy environment, uttered speech may not be perceived effectively, and hence the lip movement for a given speech segment varies. Lip reading has added advantages when it augments speech recognition, increasing the information perceived. In this paper, the video of an individual person is converted to frames, and only the lip contour for vowels is extracted by calculating its area and other geometric properties. As part of testing, this is compared with the lip contours of three to four people for vowels over the first 20 frames. Parameters such as the mean and centroid remain approximately the same for all people irrespective of their lip movement, but the major and minor axes change, and hence the area changes considerably. In the audio domain, vowel detection is carried out by extracting unique features of English vowel utterances using Mel Frequency Cepstrum Coefficients (MFCC); the feature vectors are orthonormalized, the normalized vectors are compared with a standard database, and approximate results are obtained.
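The orthonormalization of the MFCC feature vectors mentioned above can be sketched with classical Gram-Schmidt; the toy 3-dimensional vectors stand in for real MFCC frames:

```python
import numpy as np

def orthonormalize(vectors, tol=1e-10):
    """Classical Gram-Schmidt: turn a list of feature vectors into an
    orthonormal set spanning the same subspace (linearly dependent
    vectors are dropped)."""
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=float).copy()
        for b in basis:
            w -= np.dot(w, b) * b      # remove the component along b
        norm = np.linalg.norm(w)
        if norm > tol:
            basis.append(w / norm)
    return np.array(basis)

feats = [[3.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0]]
Q = orthonormalize(feats)
print(Q @ Q.T)   # approximately the identity matrix
```

After this step, comparing a test vector against the database reduces to dot products in a well-conditioned basis.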


2021 ◽  
Vol 7 ◽  
pp. e596
Author(s):  
Rodney Pino ◽  
Renier Mendoza ◽  
Rachelle Sambayan

Baybayin is a pre-Hispanic Philippine writing system used on Luzon island. As part of the effort to reintroduce the script, in 2018 the Committee on Basic Education and Culture of the Philippine Congress approved House Bill 1022, the "National Writing System Act," which declares the Baybayin script the Philippines' national writing system. Since then, Baybayin OCR has become a field of research interest. Numerous works have proposed different techniques for recognizing Baybayin script. However, all of those studies focused on classification and recognition at the character level. In this work, we propose an algorithm that provides the Latin transliteration of a Baybayin word in an image. The proposed system relies on a Baybayin character classifier generated using a Support Vector Machine (SVM). The method involves isolating each Baybayin character, classifying each character according to its equivalent syllable in Latin script, and finally concatenating the results to form the transliterated word. The system was tested on a novel dataset of Baybayin word images and achieved a competitive 97.9% recognition accuracy. Based on our review of the literature, this is the first work that recognizes Baybayin script at the word level. The proposed system can be used for automated transliteration of Baybayin texts transcribed in old books, tattoos, signage, graphic designs, and documents, among others.
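The word-level pipeline (isolate, classify, concatenate) can be sketched as below; `classify_syllable` is a hypothetical stand-in for the paper's trained SVM, and the glyph labels are invented for the demo:

```python
# Sketch of the word-level pipeline: segment a word image into character
# cells, classify each cell into a Latin syllable, then concatenate.

def classify_syllable(cell):
    # Placeholder: a real system would extract features from the cell
    # image and run the trained SVM; here a lookup table stands in.
    demo_labels = {"glyph_ba": "ba", "glyph_ya": "ya", "glyph_ni": "ni"}
    return demo_labels[cell]

def transliterate_word(character_cells):
    """Classify each isolated Baybayin character and concatenate the
    Latin syllables into the transliterated word."""
    return "".join(classify_syllable(cell) for cell in character_cells)

print(transliterate_word(["glyph_ba", "glyph_ya", "glyph_ni"]))  # bayani
```

Word-level accuracy then depends multiplicatively on per-character accuracy, which is why a strong character classifier is the prerequisite the paper builds on.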

