Classification Algorithm for Person Identification and Gesture Recognition Based on Hand Gestures with Small Training Sets

Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7279
Author(s):  
Krzysztof Rzecki

Classification algorithms require training data labelled by class to build a model that can then classify new data. The amount and diversity of training data affect classification quality, and usually the larger the training set, the better the classification accuracy. In many applications, only small amounts of training data are available. This article presents a new time series classification algorithm for problems with small training sets. The algorithm was tested on hand gesture recordings in tasks of person identification and gesture recognition. The algorithm provides significantly better classification accuracy than other machine learning algorithms. For 22 different hand gestures performed by 10 people and a training set size of 5 gesture execution records per class, the error rate of the newly proposed algorithm is 37% to 75% lower than that of the other compared algorithms. When the training set consists of only one sample per class, the new algorithm achieves an error rate that is 45% to 95% lower. The experiments indicate that the algorithm outperforms state-of-the-art methods in classification accuracy for person identification and gesture recognition.

2004 ◽  
Vol 43 (02) ◽  
pp. 192-201 ◽  
Author(s):  
R. E. Abdel-Aal

Summary Objectives: To introduce abductive network classifier committees as an ensemble method for improving classification accuracy in medical diagnosis. While neural networks allow many ways to introduce enough diversity among member models to improve performance when forming a committee, the self-organizing, automatic-stopping nature, and learning approach used by abductive networks are not very conducive for this purpose. We explore ways of overcoming this limitation and demonstrate improved classification on three standard medical datasets. Methods: Two standard 2-class medical datasets (Pima Indians Diabetes and Heart Disease) and a 6-class dataset (Dermatology) were used to investigate ways of training abductive networks with adequate independence, as well as methods of combining their outputs to form a network that improves performance beyond that of single models. Results: Two- or three-member committees of models trained on completely or partially different subsets of training data and using simple output combination methods achieve improvements between 2 and 5 percentage points in the classification accuracy over the best single model developed using the full training set. Conclusions: Varying model complexity alone gives abductive network models that are too correlated to ensure enough diversity for forming a useful committee. Diversity achieved through training member networks on independent subsets of the training data outweighs limitations of the smaller training set for each, resulting in net gain in committee performance. As such models train faster and can be trained in parallel, this can also speed up classifier development.
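
The abstract above does not say which "simple output combination methods" were used. As a minimal sketch, assuming two common choices (averaging member class probabilities and majority voting over member labels), a committee's outputs could be combined as follows. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
from collections import Counter


def average_probabilities(member_probs):
    """Average the class-probability vectors produced by each committee member."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


def committee_predict(member_probs):
    """Return the index of the class with the highest averaged probability."""
    avg = average_probabilities(member_probs)
    return max(range(len(avg)), key=avg.__getitem__)


def majority_vote(member_labels):
    """Return the most frequent label; ties go to the label encountered first."""
    return Counter(member_labels).most_common(1)[0][0]


# Hypothetical outputs of a three-member committee on one 2-class case.
probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
print(committee_predict(probs))   # averaged decision
print(majority_vote([0, 1, 1]))   # voted decision
```

Averaging uses each member's confidence, while voting uses only the hard decisions, so the two can disagree when one member is very confident and the others are not.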


2013 ◽  
Vol 303-306 ◽  
pp. 1609-1612
Author(s):  
Huai Lin Dong ◽  
Xiao Dan Zhu ◽  
Qing Feng Wu ◽  
Juan Juan Huang

The Naïve Bayes classification algorithm based on validity (NBCABV) optimizes the training data by using validity to eliminate noise samples, which improves classification, but it ignores the associations between properties. To take these associations into account, an improved method, the classification algorithm for Naïve Bayes based on validity and correlation (CANBBVC), is proposed; it uses both validity and correlation to delete more noise samples, resulting in better classification performance. Experimental results show that this model has higher classification accuracy than the one based solely on validity.


Energies ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 1945
Author(s):  
Icksung Kim ◽  
Woohyun Kim

Fault detection and diagnosis (FDD) systems enable substantial cost and energy savings that could have economic and environmental impact. This study aims to develop and validate a data-driven FDD system for a chiller. The system uses historical operation data to capture quantitative correlations among system variables. This study evaluated the effectiveness and robustness of eight FDD classification methods based on the experimental data of the chiller (the ASHRAE 1043-RP project). The training data used for the FDD system is classified into four cases. Moreover, true and false positive rates are used to characterize the performance of the classification methods. The results show that local faults are not significantly sensitive to the training data and show high classification accuracy in all cases. For system faults, the amount of data and the severity levels have a significant effect on the classification accuracy.
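
The true and false positive rates mentioned above follow from the standard confusion-matrix counts. The sketch below shows that computation for one class treated as "positive"; the function name, the labels, and the sample predictions are hypothetical illustrations, not data or code from the study.

```python
def tpr_fpr(y_true, y_pred, positive):
    """Compute the true positive rate and false positive rate for one class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # fault present and flagged
            else:
                fp += 1  # no fault, but flagged
        else:
            if truth == positive:
                fn += 1  # fault present, but missed
            else:
                tn += 1  # no fault and not flagged
    # Guard against classes that never occur in y_true.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical labels for six chiller operating records.
y_true = ["fault", "fault", "normal", "normal", "normal", "fault"]
y_pred = ["fault", "normal", "normal", "fault", "normal", "fault"]
print(tpr_fpr(y_true, y_pred, positive="fault"))
```

Here two of the three faults are detected (true positive rate 2/3) and one of the three normal records is flagged in error (false positive rate 1/3).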


2020 ◽  
Vol 34 (05) ◽  
pp. 8261-8268
Author(s):  
Xinjian Li ◽  
Siddharth Dalmia ◽  
David Mortensen ◽  
Juncheng Li ◽  
Alan Black ◽  
...  

Automatic phonemic transcription tools are useful for low-resource language documentation. However, due to the lack of training sets, only a tiny fraction of languages have phonemic transcription tools. Fortunately, multilingual acoustic modeling provides a solution given limited audio training data. A more challenging problem is to build phonemic transcribers for languages with zero training data. The difficulty of this task is that phoneme inventories often differ between the training languages and the target language, making it infeasible to recognize unseen phonemes. In this work, we address this problem by adopting the idea of zero-shot learning. Our model is able to recognize unseen phonemes in the target language without any training data. In our model, we decompose phonemes into corresponding articulatory attributes such as vowel and consonant. Instead of predicting phonemes directly, we first predict distributions over articulatory attributes, and then compute phoneme distributions with a customized acoustic model. We evaluate our model by training it using 13 languages and testing it using 7 unseen languages. We find that it achieves 7.7% better phoneme error rate on average over a standard multilingual model.


Author(s):  
Andri Wijaya ◽  
Abba Suganda Girsang

This article discusses the analysis of customer loyalty using three data mining methods (the C4.5, Naive Bayes, and Nearest Neighbor algorithms) and real-world empirical data. The data contain ten attributes related to customer loyalty and were obtained from a national multimedia company in Indonesia. The dataset contains 2269 records. The study also evaluates the effect of the size of the training data on classification accuracy. The results suggest that the C4.5 algorithm produces the highest classification accuracy, on the order of 81%, followed by Naive Bayes at 76% and Nearest Neighbor at 55%. In addition, the numerical evaluation suggests that a proportion of 80% is optimal for the training set.


2018 ◽  
Author(s):  
Costa D. Christopoulos ◽  
Sarvesh Garimella ◽  
Maria A. Zawadowicz ◽  
Ottmar Möhler ◽  
Daniel J. Cziczo

Abstract. Compositional analysis of atmospheric and laboratory aerosols is often conducted via single-particle mass spectrometry (SPMS), an in situ and real-time analytical technique that produces mass spectra on a single particle basis. In this study, machine learning classifiers are created using a dataset of SPMS spectra to automatically differentiate particles on the basis of chemistry and size. Machine learning algorithms build a predictive model from a training set for which the aerosol type associated with each mass spectrum is known a priori. Classification models were also created to differentiate aerosol within four broad categories: fertile soils, mineral/metallic particles, biological, and all other aerosols. Differentiation was accomplished using ~ 40 positive and negative spectral features. For the broad categorization, machine learning resulted in a classification accuracy of ~ 93 %. Classification of aerosols by specific type resulted in a classification accuracy of ~ 87 %. The ‘trained’ model was then applied to a ‘blind’ mixture of aerosols which was known to be a subset of the training set. Model agreement was found on the presence of secondary organic aerosol, coated and uncoated mineral dust and fertile soil.


2018 ◽  
Vol 11 (10) ◽  
pp. 5687-5699 ◽  
Author(s):  
Costa D. Christopoulos ◽  
Sarvesh Garimella ◽  
Maria A. Zawadowicz ◽  
Ottmar Möhler ◽  
Daniel J. Cziczo

Abstract. Compositional analysis of atmospheric and laboratory aerosols is often conducted via single-particle mass spectrometry (SPMS), an in situ and real-time analytical technique that produces mass spectra on a single-particle basis. In this study, classifiers are created using a data set of SPMS spectra to automatically differentiate particles on the basis of chemistry and size. Machine learning algorithms build a predictive model from a training set for which the aerosol type associated with each mass spectrum is known a priori. Our primary focus surrounds the growing of random forests using feature selection to reduce dimensionality and the evaluation of trained models with confusion matrices. In addition to classifying ∼20 unique, but chemically similar, aerosol types, models were also created to differentiate aerosol within four broader categories: fertile soils, mineral/metallic particles, biological particles, and all other aerosols. Differentiation was accomplished using ∼40 positive and negative spectral features. For the broad categorization, machine learning resulted in a classification accuracy of ∼93 %. Classification of aerosols by specific type resulted in a classification accuracy of ∼87 %. The “trained” model was then applied to a “blind” mixture of aerosols which was known to be a subset of the training set. Model agreement was found on the presence of secondary organic aerosol, coated and uncoated mineral dust, and fertile soil.


2021 ◽  
Author(s):  
Alexander Derry ◽  
Kristy A. Carpenter ◽  
Russ B. Altman

The three-dimensional structures of proteins are crucial for understanding their molecular mechanisms and interactions. Machine learning algorithms that are able to learn accurate representations of protein structures are therefore poised to play a key role in protein engineering and drug development. The accuracy of such models in deployment is directly influenced by training data quality. The use of different experimental methods for protein structure determination may introduce bias into the training data. In this work, we evaluate the magnitude of this effect across three distinct tasks: estimation of model accuracy, protein sequence design, and catalytic residue prediction. Most protein structures are derived from X-ray crystallography, nuclear magnetic resonance (NMR), or cryo-electron microscopy (cryo-EM); we trained each model on datasets consisting of either all three structure types or of only X-ray data. We find that across these tasks, models consistently perform worse on test sets derived from NMR and cryo-EM than they do on test sets of structures derived from X-ray crystallography, but that the difference can be mitigated when NMR and cryo-EM structures are included in the training set. Importantly, we show that including all three types of structures in the training set does not degrade test performance on X-ray structures, and in some cases even increases it. Finally, we examine the relationship between model performance and the biophysical properties of each method, and recommend that the biochemistry of the task of interest should be considered when composing training sets.


2016 ◽  
Vol 28 (4) ◽  
Author(s):  
Georg Ruppert ◽  
Mushtaq Hussain ◽  
Heimo Müller

The paper presents a method of predicting the classification accuracy of remote sensing data by means of training set analysis. Various sampling plans were applied to a satellite image and its complete ground truth to derive different training sets. The quality of these training sets was determined by quantifying the similarity of the training set distributions to those of the entire satellite image. Each training set was then used to train a classifier. The paper shows how the accuracy of the classifications carried out with these classifiers depends on the quality of the corresponding training sets.
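
The abstract does not name the similarity measure it uses. As a rough sketch, assuming the distributions compared are class-label frequencies and that similarity is measured by histogram intersection (both are assumptions made here for illustration), the quality of a training set relative to the full ground truth could be scored like this. The class names and counts are hypothetical.

```python
from collections import Counter


def class_distribution(labels):
    """Return the relative frequency of each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def histogram_intersection(p, q):
    """Similarity in [0, 1]: sum over classes of the smaller of the two frequencies."""
    classes = set(p) | set(q)
    return sum(min(p.get(c, 0.0), q.get(c, 0.0)) for c in classes)


# Hypothetical ground truth for a whole image and one sampled training set.
full_image = ["water"] * 50 + ["forest"] * 30 + ["urban"] * 20
training_set = ["water"] * 5 + ["forest"] * 5

score = histogram_intersection(
    class_distribution(training_set), class_distribution(full_image)
)
print(round(score, 3))
```

A training set that misses a class entirely, as this one misses "urban", scores below 1, while a sample with the same class proportions as the full image scores 1.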


2018 ◽  
Vol 7 (3.32) ◽  
pp. 103
Author(s):  
Mostafa Alghamdi ◽  
Tami Alwajeeh ◽  
Fahad Aljabeer ◽  
Setiawan Assegaff ◽  
Rahmat Budiarto

Traditionally, humans interact with a computer by using a keyboard and mouse. People with impairments from the wrist to the fingertips, or with amputated wrists or fingertips, need an alternative, such as voice or hand gestures. This work focuses on hand-gesture image recognition. Two main issues should be considered: low interactivity in static hand gesture recognition and low accuracy in dynamic hand gesture recognition. This paper attempts to improve the accuracy of hand-gesture image recognition by experimenting with a simple deep learning neural network (DLNN). As this work uses a simple DLNN, the relation between the hidden layers is not considered. The number of hidden layers in the proposed DLNN architecture varies from one to five across the experiments. To understand the effect of the number of neurons in the hidden layers, the DLNN is tested with different numbers of hidden neurons. Six different types of hand gestures are considered. 800 videos of hand gestures taken from the Vision for Intelligent Vehicles and Applications (VIVA) portal are used in the experiment. The data is divided into two parts: one for training and the other for testing. The best result is achieved when the DLNN uses two hidden layers, with 250 neurons in the first hidden layer and 100 neurons in the second. The average accuracy achieved is 77.56%. Experimental results also show that a larger number of hidden layers causes over-fitting and does not improve recognition. It is also observed that increasing the number of hidden layers and hidden neurons only affects recognition accuracy on the training dataset and does not improve recognition on data not used in training. This result arises because the interrelation among the hidden layers is not considered.

