A robust video text extraction method for character recognition

2005 ◽  
Vol 36 (9) ◽  
pp. 87-96 ◽  
Author(s):  
Osamu Hori ◽  
Takeshi Mita
Author(s):  
Htwe Pa Pa Win ◽  
Phyo Thu Thu Khine ◽  
Khin Nwe Ni Tun

This paper proposes a new feature extraction method for off-line recognition of printed Myanmar documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the choice of feature extraction method. Existing OCR systems use various feature extraction methods because of the diversity of the scripts' natures. One major contribution of this work is the design of logically rigorous coding-based features. To show the effectiveness of the proposed method, the paper assumes that the documents have been successfully segmented into characters and extracts features from these isolated Myanmar characters. The features are extracted using structural analysis of the Myanmar script. The experiments were carried out using a Support Vector Machine (SVM) classifier, and the results are compared with those of the previously proposed feature extraction method.


2015 ◽  
Vol 15 (01) ◽  
pp. 1550002
Author(s):  
Brij Mohan Singh ◽  
Rahul Sharma ◽  
Debashis Ghosh ◽  
Ankush Mittal

Many documents, such as maps, engineering drawings, and artistic documents, contain printed as well as handwritten material in which text regions and text-lines are not parallel to each other or are curved, and which mix various kinds of text: different font sizes, text and non-text areas lying close to each other, and non-straight, skewed, and warped text-lines. Commercially available optical character recognition (OCR) systems, such as ABBYY FineReader and Free OCR, cannot handle the range of stylistic document images containing curved, multi-oriented, and stylish-font text-lines. Extracting individual text-lines and words from these documents is generally not straightforward. Most reported segmentation work addresses simple documents, and it remains highly challenging to implement an OCR system that works under all possible conditions and gives highly accurate results, especially for stylistic documents. This paper presents an approach based on the morphological operations of dilation and flood fill that extracts multi-oriented text-lines and words from complex-layout or stylistic document images in subsequent stages. The segmentation results obtained with our method prove superior to those of the standard profiling-based method.
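As a rough illustration of how dilation and flood fill can work together for segmentation, the sketch below merges neighbouring ink pixels with a rectangular dilation and then groups the merged blobs by flood fill. It is a generic, dependency-free example and not the authors' pipeline: the function names, the rectangular structuring element, and the radii `ry` and `rx` are all assumptions made for illustration.

```python
from collections import deque


def dilate(img, ry, rx):
    """Binary dilation with a (2*ry+1) x (2*rx+1) rectangular element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for yy in range(max(0, y - ry), min(h, y + ry + 1)):
                    for xx in range(max(0, x - rx), min(w, x + rx + 1)):
                        out[yy][xx] = 1
    return out


def flood_fill_components(img):
    """Group 8-connected foreground pixels by breadth-first flood fill.

    Returns one bounding box (top, left, bottom, right) per component,
    in raster-scan order of each component's first pixel.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def segment_words(img, ry=0, rx=1):
    """Hypothetical helper: dilate first so nearby strokes merge, then label.

    The boxes describe the dilated blobs, so they can be slightly larger
    than the original ink.
    """
    return flood_fill_components(dilate(img, ry, rx))
```

A wider horizontal radius tends to merge characters into words or lines, while a small radius keeps individual characters apart. Handling curved or multi-oriented lines would need more than a fixed axis-aligned element, which this sketch does not attempt.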


Author(s):  
I Komang Arya Ganda Wiguna ◽  
Agus Muliantara

Handwriting identification is one of many research topics that have been studied. Handwriting can also be written in real time by the user with a mouse (online character recognition). Studies on handwriting recognition for traditional characters continue to be developed, among them the recognition of Balinese characters. Balinese characters are unique compared with the scripts of other regions. The shapes of some characters are quite similar to one another, and some characters can be distinguished only by a small stroke or doodle. This study uses an Artificial Neural Network with the Backpropagation algorithm to recognize Balinese characters, with zoning as the feature extraction method. The feature extraction variants used are Image Centroid and Zone (ICZ), Zone Centroid and Zone (ZCZ), and normalization of features. The study determines which of the three methods is best for Balinese character recognition. In the tests of the extraction methods, the combination of ICZ, ZCZ, and feature normalization was the most effective for recognizing Balinese characters. The accuracy obtained was 71.28% in online testing and 72.31% in offline testing, with Backpropagation parameters of a learning rate of 0.03, a momentum of 0.5, and 130 neurons in the hidden layer.
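The two zoning features named above can be sketched as follows, using the way ICZ and ZCZ are commonly defined: the image is split into a grid of zones, and for each zone the mean distance of its ink pixels to the image centroid (ICZ) or to the zone's own centroid (ZCZ) is recorded. The abstract does not give the exact variant, grid size, or normalization used, so the function name, the grid parameters, and the choice of 0.0 for empty zones are assumptions. The normalization step is left out because it is not specified.

```python
import math


def centroid(pixels):
    """Mean (row, column) position of a non-empty list of pixel coordinates."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


def zoning_features(img, zones_y, zones_x):
    """Return (icz, zcz) feature lists, one value per zone in row-major order.

    img is a binary image given as a list of rows of 0/1 values.
    Empty zones contribute 0.0 to both lists (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    n_zones = zones_y * zones_x
    ink = [(y, x) for y in range(h) for x in range(w) if img[y][x]]
    if not ink:
        return [0.0] * n_zones, [0.0] * n_zones

    icy, icx = centroid(ink)               # centroid of the whole character
    zh, zw = max(1, h // zones_y), max(1, w // zones_x)

    # Assign each ink pixel to a zone; leftover rows and columns are
    # clamped into the last zone when the size is not evenly divisible.
    zones = [[] for _ in range(n_zones)]
    for y, x in ink:
        zy = min(y // zh, zones_y - 1)
        zx = min(x // zw, zones_x - 1)
        zones[zy * zones_x + zx].append((y, x))

    icz, zcz = [], []
    for zone in zones:
        if not zone:
            icz.append(0.0)
            zcz.append(0.0)
            continue
        zcy, zcx = centroid(zone)          # centroid of this zone's ink only
        icz.append(sum(math.hypot(y - icy, x - icx) for y, x in zone) / len(zone))
        zcz.append(sum(math.hypot(y - zcy, x - zcx) for y, x in zone) / len(zone))
    return icz, zcz
```

Concatenating the two lists gives one feature vector per character, which could then be fed to a classifier such as the backpropagation network described above.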

