E-Braille Documents: Novel Method For Error Free Generation

2014 ◽  
Vol 19 (4) ◽  
pp. 21-26
Author(s):  
Mohd Wajid ◽  
Vinay Kumar

Abstract: The present manuscript proposes a technique for estimating the angle of rotation of a Braille document image, which in turn aids its automatic character recognition. The technique is based on maximizing the number of null projections of the derived image vector. Results show that any amount of rotation-induced distortion can be nullified, leading to correct reading of the imprinted Braille character pattern. The proposed method has been successfully tested on manually written as well as computer-generated Braille with rotation distortion.
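The projection-maximization idea can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it sweeps candidate angles, rotates a binary dot image, and picks the angle that yields the most null (all-zero) rows in the horizontal projection, which is when the Braille dot rows are aligned with the image axes. The angle range and step size are assumed values.

```python
import numpy as np
from scipy import ndimage

def estimate_rotation(binary_img, angle_range=(-15, 15), step=0.5):
    """Estimate the skew of a binary Braille image by maximizing the
    number of null (all-zero) rows in the horizontal projection."""
    best_angle, best_nulls = 0.0, -1
    for angle in np.arange(angle_range[0], angle_range[1] + step, step):
        rotated = ndimage.rotate(binary_img, angle, reshape=False, order=0)
        projection = rotated.sum(axis=1)       # row-wise pixel counts
        nulls = int(np.sum(projection == 0))   # blank rows between dot lines
        if nulls > best_nulls:
            best_angle, best_nulls = angle, nulls
    return best_angle
```

Applying `ndimage.rotate` with the returned angle then deskews the document before dot detection.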

2015 ◽  
Vol 4 (2) ◽  
pp. 74-94
Author(s):  
Pawan Kumar Singh ◽  
Ram Sarkar ◽  
Mita Nasipuri

Script identification has been an appealing research topic in the field of document image analysis over the last few decades. Accurate recognition of the script is paramount to many post-processing steps, such as automated document sorting, machine translation, and searching for text written in a particular script in a multilingual environment. For automatic processing of such documents through Optical Character Recognition (OCR) software, it is necessary to identify the script of each word in a document before feeding it to the OCR engine for that script. In this paper, a robust word-level handwritten script identification technique is proposed that uses texture-based features to identify words written in any of seven popular scripts: Bangla, Devanagari, Gurumukhi, Malayalam, Oriya, Telugu, and Roman. The texture-based features comprise a combination of Histograms of Oriented Gradients (HOG) and moment invariants. The technique has been tested on 7000 handwritten text words, with each script contributing 1000 words. Based on the identification accuracies and statistical significance testing of seven well-known classifiers, the Multi-Layer Perceptron (MLP) was chosen as the final classifier and then tested comprehensively with different folds and epoch sizes. The overall accuracy of the system is 94.7% under a 5-fold cross-validation scheme, which is quite impressive considering the complexities and shape variations of the said scripts. This is an extended version of the paper described in (Singh et al., 2014).
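The moment-invariant half of the feature set can be illustrated with Hu's seven moment invariants, a standard choice for "moment invariants" (the paper does not specify which family here, so this is an assumption); the HOG half is typically computed with a library routine such as `skimage.feature.hog`. A self-contained NumPy sketch:

```python
import numpy as np

def hu_moments(image):
    """Compute Hu's seven moment invariants of a 2-D grayscale/binary image.
    They are invariant to translation, scale, and rotation of the shape."""
    image = np.asarray(image, dtype=float)
    y, x = np.mgrid[:image.shape[0], :image.shape[1]]
    m00 = image.sum()
    xc, yc = (x * image).sum() / m00, (y * image).sum() / m00

    def mu(p, q):                     # central moments
        return ((x - xc) ** p * (y - yc) ** q * image).sum()

    def eta(p, q):                    # scale-normalized central moments
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
        (n30 - 3 * n12) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
            + (3 * n21 - n03) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2),
        (n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
            + 4 * n11 * (n30 + n12) * (n21 + n03),
        (3 * n21 - n03) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
            - (n30 - 3 * n12) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2),
    ])
```

Concatenating this 7-vector with a HOG descriptor of the word image gives a combined texture feature of the kind the paper feeds to its classifiers.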


2021 ◽  
Author(s):  
Ushasi Chaudhuri

Rough set theory is a well-studied subject with a sound theoretical foundation and many applications; however, its usage in image processing has been very sparse. Most of the well-known algorithms for document image processing related to character recognition, character spotting, and logo retrieval resort to supervised classification, which slows the system down as document diversity increases and requires a large training dataset. Hence, with the aim of avoiding the tediousness and pitfalls of training without compromising efficiency, we introduce a rough-set-theoretic model. It is designed to perform unsupervised classification of optical characters and logos with a small subset of attributes, called the semi-reduct. The semi-reduct attributes are mostly geometric and topological in nature, each having a small range of discrete values estimated from different combinatorial characteristics of rough-set approximations. This eventually leads to quick and easy discernibility of almost all the characters and logos. In this thesis, we first explain the basics of rough set theory. Subsequently, we propose various attributes that can be easily computed from the binary representation of the images. In subsequent chapters we show how one can select an appropriate subset of such attributes, the semi-reduct, to perform a document processing task. We demonstrate that using these attributes one can design a character recognition system that is both computationally and storage efficient. Using a different semi-reduct, we show that one can also solve the very delicate task of character spotting in ancient inscriptions. Additionally, we propose appropriate preprocessing steps to binarize old and dilapidated inscriptions. Finally, we propose a novel technique for logo retrieval using a suitably prepared semi-reduct. Comparison with other existing techniques substantiates our claim that rough-set attributes are indeed good candidates for document image processing.
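The core mechanism, partitioning objects into indiscernibility classes and checking whether a small attribute subset separates everything, can be sketched in a few lines. The attribute names and values below (hole count, aspect-ratio bin, endpoint count) are hypothetical illustrations, not the thesis's actual semi-reduct:

```python
from collections import defaultdict

def indiscernibility_classes(table, attributes):
    """Partition objects into classes that are indiscernible on the given
    attribute subset. `table` maps object -> dict of attribute values."""
    classes = defaultdict(list)
    for obj, attrs in table.items():
        classes[tuple(attrs[a] for a in attributes)].append(obj)
    return list(classes.values())

def discerns_all(table, attributes):
    """True if the attribute subset separates every object, i.e. it is a
    candidate semi-reduct for this decision table."""
    return all(len(c) == 1 for c in indiscernibility_classes(table, attributes))

# Hypothetical discrete attributes for a few characters (illustrative only).
chars = {
    "O": {"holes": 1, "aspect": "tall", "endpoints": 0},
    "B": {"holes": 2, "aspect": "tall", "endpoints": 0},
    "L": {"holes": 0, "aspect": "tall", "endpoints": 2},
    "T": {"holes": 0, "aspect": "wide", "endpoints": 3},
}
```

Here `["holes", "endpoints"]` already discerns all four characters, whereas `["aspect"]` alone does not; a semi-reduct search keeps dropping attributes while discernibility is preserved.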


Author(s):  
Jane Courtney

For visually impaired people (VIPs), the ability to convert text to sound can mean a new level of independence or the simple joy of a good book. With significant advances in Optical Character Recognition (OCR) in recent years, a number of reading aids are appearing on the market. These reading aids convert images captured by a camera to text, which can then be read aloud. However, all of these reading aids suffer from a key issue: the user must be able to visually target the text and capture an image of sufficient quality for the OCR algorithm to function, which is no small task for VIPs. In this work, a Sound-Emitting Document Image Quality Assessment metric (SEDIQA) is proposed that allows the user to hear the quality of the text image and automatically captures the best image for OCR accuracy. This work also includes testing of OCR performance against image degradations to identify the most significant contributors to accuracy reduction. The proposed No-Reference Image Quality Assessor (NR-IQA) is validated alongside established NR-IQAs, and this work includes insights into the performance of these NR-IQAs on document images.
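The abstract does not specify SEDIQA's internal metric, but a common no-reference sharpness proxy for document images, shown here only as an illustrative stand-in, is the variance of the Laplacian response: blurred text flattens the second derivative, so the score drops. Mapping the score to an audio pitch would complete the "sound-emitting" feedback loop.

```python
import numpy as np

def sharpness_score(gray):
    """No-reference sharpness proxy: variance of a discrete Laplacian.
    Higher scores indicate crisper text edges."""
    g = np.asarray(gray, dtype=float)
    lap = (-4 * g
           + np.roll(g, 1, axis=0) + np.roll(g, -1, axis=0)
           + np.roll(g, 1, axis=1) + np.roll(g, -1, axis=1))
    return float(lap.var())

def box_blur(gray, k=5):
    """Simple separable box blur, used to simulate a poorly focused capture."""
    g = np.asarray(gray, dtype=float)
    for axis in (0, 1):
        g = sum(np.roll(g, s, axis=axis) for s in range(-(k // 2), k // 2 + 1)) / k
    return g
```

A capture loop would keep the frame with the highest score and hand only that frame to the OCR engine.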


2019 ◽  
Vol 9 (21) ◽  
pp. 4529
Author(s):  
Tao Liu ◽  
Hao Liu ◽  
Yingying Wu ◽  
Bo Yin ◽  
Zhiqiang Wei

Capturing document images with digital cameras in uneven lighting conditions is challenging; poorly captured images hinder the processing that follows, such as Optical Character Recognition (OCR). In this paper, we propose the use of exposure bracketing techniques to solve this problem. Instead of capturing one image, we capture several images with different exposure settings and use exposure bracketing to generate a high-quality image that incorporates useful information from each exposure. We found that this technique enhances image quality and provides an effective way of improving OCR accuracy. Our contributions are two-fold: (1) a preprocessing chain that uses exposure bracketing techniques for document images is discussed, and an automatic registration method is proposed to find the geometric disparity between multiple document images, which lays the foundation for exposure bracketing; (2) several representative exposure bracketing algorithms are incorporated in the processing chain, and their performances are evaluated and compared.
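A minimal, single-scale sketch of the fusion step, assuming the stack is already registered (the paper's registration method and its specific fusion algorithms are not reproduced here): each pixel is weighted by its "well-exposedness", i.e. closeness to mid-gray, in the spirit of Mertens-style exposure fusion.

```python
import numpy as np

def exposure_fusion(images, sigma=0.2):
    """Fuse a registered exposure stack by per-pixel well-exposedness weighting
    (a simplified, single-scale variant of exposure fusion).
    `images`: list of aligned grayscale images scaled to [0, 1]."""
    stack = np.stack([np.asarray(im, dtype=float) for im in images])
    # Pixels near mid-gray (0.5) are considered well exposed.
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))
    weights /= weights.sum(axis=0, keepdims=True) + 1e-12
    return (weights * stack).sum(axis=0)
```

Under- and over-exposed regions thus contribute little, while each region of the fused result is dominated by whichever exposure rendered it best.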


Author(s):  
Priti P. Rege ◽  
Shaheera Akhter

Text separation in document image analysis is an important preprocessing step before executing an optical character recognition (OCR) task and is necessary for improving the accuracy of an OCR system. Traditionally, separating text from a document has relied on feature extraction processes that require handcrafted features. Deep-learning-based methods, however, are excellent feature extractors that learn features from the training data automatically. Deep learning gives state-of-the-art results on various computer vision tasks, including image classification, segmentation, image captioning, object detection, and recognition. This chapter compares various traditional as well as deep-learning techniques and uses a semantic segmentation method for separating text from Devanagari document images using U-Net and ResU-Net models. These models are further fine-tuned via transfer learning to obtain more precise results. The final results show that deep-learning methods give more accurate results than conventional image processing methods for Devanagari text extraction.
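The defining structural feature of U-Net (and ResU-Net) is the skip connection: encoder feature maps are concatenated with the decoder's upsampled maps so fine text strokes survive the downsampling path. The shape-only sketch below illustrates that wiring in plain NumPy, with no learned convolutions; a real model would interleave trained convolution blocks at each level.

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling over a (C, H, W) feature map."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample2(x):
    """2x nearest-neighbour upsampling over a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def unet_skip_pass(x):
    """Shape-only walk through one U-Net level: encode (pool), decode
    (upsample), then concatenate the saved encoder map along channels
    -- the skip connection. Illustrative wiring only, no learned weights."""
    skip = x                       # encoder features saved for the skip path
    down = max_pool2(x)            # spatial resolution halves
    up = upsample2(down)           # decoder restores the resolution
    return np.concatenate([skip, up], axis=0)   # channels double: C -> 2C
```

A ResU-Net additionally adds each block's input to its output (a residual connection), which eases training of the deeper fine-tuned variants the chapter evaluates.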

