scholarly journals Javanese Document Image Recognition Using Multiclass Support Vector Machine

Author(s):  
Yuna Sugianela ◽  
Nanik Suciati

Some ancient documents in Indonesia are written in the Javanese script. Those documents contain the knowledge of history and culture of Indonesia, especially about Java. However, only a few people understand the Javanese script. Thus, the automation system is needed to translate the document written in the Javanese script. In this study, the researchers use the classification method to recognize the Javanese script written in the document. The method used is the Multiclass Support Vector Machine (SVM) using One Against One (OAO) strategy. The researchers use seven variations of Javanese script from the different document for this study. There are 31 classes and 182 data for training and testing data. The result shows good performance in the evaluation. The recognition system successfully resolves the problem of color variation from the dataset. The accuracy of the study is 81.3%.

2011 ◽  
Vol 467-469 ◽  
pp. 1905-1910
Author(s):  
Jun Feng Zhao ◽  
Ye Ping Zhu

This paper introduces the characteristics and requirements of speech recognition technology based on embedded platform. It also describes the basic theory and related properties of Support Vector Machine. The advantages and disadvantages of the Multiclass SVM algorithms are analyzed, providing the algorithms principles for training and recognition of SVM application in the embedded speech recognition system. Finally, we proposed a design strategy based on multiclass SVM decision tree classifier, combined with the features of the embedded speech recognition.


2020 ◽  
Vol 5 (2) ◽  
pp. 504
Author(s):  
Matthias Omotayo Oladele ◽  
Temilola Morufat Adepoju ◽  
Olaide ` Abiodun Olatoke ◽  
Oluwaseun Adewale Ojo

Yorùbá language is one of the three main languages that is been spoken in Nigeria. It is a tonal language that carries an accent on the vowel alphabets. There are twenty-five (25) alphabets in Yorùbá language with one of the alphabets a digraph (GB). Due to the difficulty in typing handwritten Yorùbá documents, there is a need to develop a handwritten recognition system that can convert the handwritten texts to digital format. This study discusses the offline Yorùbá handwritten word recognition system (OYHWR) that recognizes Yorùbá uppercase alphabets. Handwritten characters and words were obtained from different writers using the paint application and M708 graphics tablets. The characters were used for training and the words were used for testing. Pre-processing was done on the images and the geometric features of the images were extracted using zoning and gradient-based feature extraction. Geometric features are the different line types that form a particular character such as the vertical, horizontal, and diagonal lines. The geometric features used are the number of horizontal lines, number of vertical lines, number of right diagonal lines, number of left diagonal lines, total length of all horizontal lines, total length of all vertical lines, total length of all right slanting lines, total length of all left-slanting lines and the area of the skeleton. The characters are divided into 9 zones and gradient feature extraction was used to extract the horizontal and vertical components and geometric features in each zone. The words were fed into the support vector machine classifier and the performance was evaluated based on recognition accuracy. Support vector machine is a two-class classifier, hence a multiclass SVM classifier least square support vector machine (LSSVM) was used for word recognition. The one vs one strategy and RBF kernel were used and the recognition accuracy obtained from the tested words ranges between 66.7%, 83.3%, 85.7%, 87.5%, and 100%. The low recognition rate for some of the words could be as a result of the similarity in the extracted features.


2020 ◽  
Vol 5 (2) ◽  
pp. 609
Author(s):  
Segun Aina ◽  
Kofoworola V. Sholesi ◽  
Aderonke R. Lawal ◽  
Samuel D. Okegbile ◽  
Adeniran I. Oluwaranti

This paper presents the application of Gaussian blur filters and Support Vector Machine (SVM) techniques for greeting recognition among the Yoruba tribe of Nigeria. Existing efforts have considered different recognition gestures. However, tribal greeting postures or gestures recognition for the Nigerian geographical space has not been studied before. Some cultural gestures are not correctly identified by people of the same tribe, not to mention other people from different tribes, thereby posing a challenge of misinterpretation of meaning. Also, some cultural gestures are unknown to most people outside a tribe, which could also hinder human interaction; hence there is a need to automate the recognition of Nigerian tribal greeting gestures. This work hence develops a Gaussian Blur – SVM based system capable of recognizing the Yoruba tribe greeting postures for men and women. Videos of individuals performing various greeting gestures were collected and processed into image frames. The images were resized and a Gaussian blur filter was used to remove noise from them. This research used a moment-based feature extraction algorithm to extract shape features that were passed as input to SVM. SVM is exploited and trained to perform the greeting gesture recognition task to recognize two Nigerian tribe greeting postures. To confirm the robustness of the system, 20%, 25% and 30% of the dataset acquired from the preprocessed images were used to test the system. A recognition rate of 94% could be achieved when SVM is used, as shown by the result which invariably proves that the proposed method is efficient.


2021 ◽  
Author(s):  
A. F. M. Amidon ◽  
N. Z. Mahabob ◽  
M. H. Haron ◽  
N. Ismail ◽  
Z. M. Yusoff ◽  
...  

Author(s):  
YAN ZHANG ◽  
BIN YU ◽  
HAI-MING GU

Document image segmentation is an important research area of document image analysis which classifies the contents of a document image into a set of text and non-text classes. Previous existing methods are often designed to classify text and halftone therefore they perform poorly in classifying graphics, tables and circuit, etc. In this paper, we present a robust multi-level classification method using multi-layer perceptron (MLP) and support vector machine (SVM) to segment the texts from non-texts and thereafter classify them as tables, graphics and halftones. This method outperforms previously existing methods by overcoming various issues associated with the complexity of document images. Experimental results prove the effectiveness of our proposed method. By virtue of our multi-level classification approach, the text components, halftone components, graphic components and table components are accurately classified respectively which would highly improve OCR accuracy to reduce garbage symbols as well as increase compression ratio thereafter simultaneously.


Sign in / Sign up

Export Citation Format

Share Document