A deep learning-based character recognition system for multimedia documents

Author(s):  
Usha Yadav ◽  
Satya Verma ◽  
Deepak Kumar Xaxa ◽  
Chandrakant Mahobiya
2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. The paper makes three main contributions: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step includes pruning extra white space and de-skewing the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflections. Data augmentation combined with the deep learning approach yields a promising improvement, raising the Character Recognition (CR) rate from the 75.08% baseline to 80.02%.
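To illustrate the decoding side of a CTC-trained network like the one described, the sketch below shows best-path (greedy) CTC decoding: take the most probable label at each timestep, merge repeats, and drop blanks. This is a minimal illustration, not the paper's implementation; the blank index and toy probabilities are assumptions.

```python
BLANK = 0  # CTC blank label (index 0 is a common convention, assumed here)

def ctc_greedy_decode(timestep_probs, blank=BLANK):
    """Pick the argmax label per timestep, merge repeats, remove blanks."""
    best_path = [max(range(len(p)), key=p.__getitem__) for p in timestep_probs]
    decoded, prev = [], None
    for label in best_path:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded

# Toy per-timestep distributions over {blank=0, 'a'=1, 'b'=2}
probs = [
    [0.1, 0.8, 0.1],    # 'a'
    [0.1, 0.8, 0.1],    # 'a' again -> merged with the previous timestep
    [0.9, 0.05, 0.05],  # blank -> ends the run of 'a'
    [0.1, 0.1, 0.8],    # 'b'
]
print(ctc_greedy_decode(probs))  # → [1, 2]
```

Full CTC training additionally marginalizes over all alignments; greedy decoding is only the cheapest inference strategy.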


Author(s):  
Oyeniran Oluwashina Akinloye ◽  
Oyebode Ebenezer Olukunle

Numerous works have been proposed and implemented for the computerization of various human languages; nevertheless, only minuscule effort has been made to put Yorùbá handwritten characters on the map of Optical Character Recognition. This study presents a novel technique for developing a Yorùbá alphabet recognition system using deep learning. The developed model was implemented in the Matlab R2018a environment using the developed framework, with 10,500 dataset samples used for training and 2,100 samples used for testing. Training was conducted over 30 epochs at 164 iterations per epoch, giving a total of 4,920 iterations, and the training period was estimated at 11,296 minutes 41 seconds. The model yielded a network accuracy of 100%, while the accuracy on the test set was 97.97%, with an F1 score of 0.9800, a precision of 0.9803 and a recall of 0.9797.
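The reported F1 score is consistent with the stated precision and recall, since F1 is their harmonic mean. A quick check (the function name is ours; the values are from the abstract):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract
print(round(f1_score(0.9803, 0.9797), 4))  # → 0.98
```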


Optical Character Recognition (OCR) is a computer vision technique that recognizes text present in any form of image, such as scanned documents and photos. In recent years, OCR has improved significantly in the precise recognition of text from images. Though many applications already exist, we explore the domain of deep learning and build an optical character recognition system using deep learning architectures. In a later stage, this OCR system is developed into a web application that exposes its functionality. The approach is to implement a hybrid model containing three components: the Convolutional Neural Network (CNN) component, the Recurrent Neural Network (RNN) component, and the transcription component, which decodes the output of the RNN into the corresponding label sequence. Solving text recognition problems requires the CNN to extract feature maps from images. These sequences of feature vectors undergo sequence modeling in the RNN component, which predicts label distributions that are then translated using the Connectionist Temporal Classification technique in the transcription layer. The implemented model acts as the backend of a web application developed using the Flask web framework. The complete application is then containerized into an image using Docker, which eases deployment of the application, together with its environment, on any system.
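The containerization step described above could be captured in a Dockerfile along these lines. This is a generic sketch, not the authors' configuration: the file names, port, and Python version are all assumptions.

```dockerfile
# Hypothetical Dockerfile for containerizing a Flask OCR backend.
# app.py, requirements.txt, and port 5000 are illustrative assumptions.
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 5000
CMD ["python", "app.py"]
```

Building the image (`docker build -t ocr-app .`) bundles the model, the Flask server, and their dependencies into one portable artifact.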


Author(s):  
Kedar R ◽  
Kaviraj A ◽  
Manish R ◽  
Niteesh B ◽  
Suthir S

Technology is growing in our day-to-day lives to satisfy human needs, and the system we propose makes this particular human job easier. Here the YOLO algorithm, a deep learning object-detection architecture, is used to detect the number plate of a vehicle. After detecting the number plate, the system converts the vehicle number to text format and then checks it against a database to see whether the vehicle is authorized to enter the premises. This system can be implemented in highly restricted areas such as military areas, government organizations, Parliament, etc. The proposed system has six stages: image capture, search for black pixels, image filtering, plate-region extraction, character extraction, and OCR for character recognition. The alphanumeric characters are identified using the OCR algorithm; the result obtained from the YOLO pipeline is then compared with the database to check whether the vehicle is allowed to enter the premises. The proposed system is simulated and implemented in Python, and it was also tested on real-time images to evaluate performance.
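The final authorization step, comparing the OCR-recognized plate string against a database of permitted vehicles, can be sketched as follows. This is an illustrative sketch using Python's built-in sqlite3 module; the table, column, and plate strings are invented for the example, not taken from the paper.

```python
import sqlite3

# In-memory allowlist of authorized plates (illustrative data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authorized (plate TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO authorized VALUES (?)",
                 [("KA01AB1234",), ("MH12XY9876",)])

def is_authorized(plate):
    """Return True if the OCR-recognized plate may enter the premises."""
    row = conn.execute(
        "SELECT 1 FROM authorized WHERE plate = ?", (plate,)
    ).fetchone()
    return row is not None

print(is_authorized("KA01AB1234"))  # → True
print(is_authorized("DL05ZZ0000"))  # → False
```

In a deployed system the in-memory table would be replaced by a persistent database, and the plate string would come from the YOLO + OCR stages.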


Author(s):  
Qiaokang Liang ◽  
◽  
Qiao Ge ◽  
Wei Sun ◽  
Dan Zhang ◽  
...  

In the food and beverage industry, existing recognition of code characters on the surface of complex packaging usually suffers from low accuracy and low speed. This work presents an efficient and accurate inkjet code recognition system based on a combination of deep learning and traditional image processing methods. The proposed system consists of three sequential modules: character-region extraction by a modified YOLOv3-tiny network; character processing by traditional image processing methods such as binarization and a modified character projection segmentation; and character recognition by a convolutional recurrent neural network (CRNN) model based on a modified version of MobileNetV3. In this system, only a small amount of labeled data was prepared, and an effective character data generator is designed to randomly generate different experimental data for CRNN model training. To the best of our knowledge, this report is the first to describe deep learning applied to the recognition of codes on complex backgrounds in a real-life industrial application. Experimental results verify the accuracy and effectiveness of the proposed model, demonstrating a recognition accuracy of 0.986 and a processing speed of 100 ms per bottle in the end-to-end character recognition system.
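A character data generator of the kind described might randomly compose inkjet-style code strings, which would then be rendered as training images. The sketch below generates only the text component; the date-plus-batch format is our assumption, not the paper's actual code layout.

```python
import random
import string

def random_code(rng=random):
    """Generate a synthetic inkjet-style code string, e.g. '2021/03/09 B4821'."""
    date = "%04d/%02d/%02d" % (rng.randint(2018, 2025),
                               rng.randint(1, 12),
                               rng.randint(1, 28))
    batch = "B" + "".join(rng.choice(string.digits) for _ in range(4))
    return date + " " + batch

sample = random_code()
print(sample)  # e.g. '2021/03/09 B4821'
```

Pairing each generated string with a rendered image (random font, blur, and background) is what lets a CRNN train with little hand-labeled data.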


Author(s):  
Manish M. Kayasth ◽  
Bharat C. Patel

The entire character recognition system is logically divided into different sections such as scanning, pre-processing, classification, processing, and post-processing. In the targeted system, the scanned image is first passed through pre-processing modules, then feature extraction and classification, in order to achieve a high recognition rate. This paper focuses mainly on feature extraction and classification techniques, the methodologies that play an important role in identifying offline handwritten characters, specifically in the Gujarati language. Feature extraction provides methods with which characters can be identified uniquely and with a high degree of accuracy; it helps to find the shape contained in the pattern. Several techniques are available for feature extraction and classification; however, the selection of an appropriate technique based on its input decides the degree of recognition accuracy.
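One widely used feature-extraction technique of the kind the abstract discusses is zoning: splitting the binary character image into a grid of zones and using the per-zone foreground-pixel density as the feature vector. The sketch below is a generic illustration (the 4x4 image and 2x2 grid are toy assumptions), not the paper's specific method.

```python
def zoning_features(image, zones=2):
    """Return per-zone foreground-pixel densities for a binary image."""
    h, w = len(image), len(image[0])
    zh, zw = h // zones, w // zones  # zone height and width
    features = []
    for zi in range(zones):
        for zj in range(zones):
            total = sum(image[r][c]
                        for r in range(zi * zh, (zi + 1) * zh)
                        for c in range(zj * zw, (zj + 1) * zw))
            features.append(total / (zh * zw))
    return features

# Toy 4x4 binary character image (1 = ink, 0 = background)
img = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
print(zoning_features(img))  # → [1.0, 0.0, 0.0, 0.5]
```

The resulting fixed-length vector can be fed to any classifier; the choice of grid size trades spatial detail against robustness to writing variation.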

