Chinese Character Boxes: Single Shot Detector Network for Chinese Character Detection

Junhwan Ryu; Sungho Kim

doi:10.3390/app9020315

Applied Sciences, 2019, Vol 9 (2), pp. 315
doi:10.3390/app9020315
Cited By ~ 3

Author(s): Junhwan Ryu, Sungho Kim

Keyword(s): Deep Learning, Character Recognition, Chinese Character, Layer Structure, Single Step, Single Shot, Data Set, Pyramid Structure, Feature Pyramid, Translation Systems

This paper proposes a deep learning-based Chinese character detection network, an important component of character recognition and translation, since both depend on detecting the correct character area. Previous studies have focused on projection-based methods that rely on image pre-processing, recognition methods based on segmentation, and methods that use hand-crafted features. Unfortunately, their results are vulnerable to noise. More recently, recognition and translation systems based on deep learning have treated the pipeline from detection to translation as a single step, but they fail to consider the inaccurate localization problem that arises in detectors. This paper proposes a Chinese character boxes (CCB) network, called CCB-SSD, that detects the character area more accurately using the single-shot multibox detector (SSD) as the baseline. The proposed CCB-SSD network has a single prediction layer structure in which unnecessary layers are removed from the feature-pyramid structure. An augmentation method for training is introduced, and the problem caused by the use of default boxes is solved with the proposed non-maximum suppression (NMS). On the character data-set and caoshu data-set used in this paper, the experiments showed a 96.1% detection rate and 0.89 false positives per character (FPPC), the false positive index proposed here. This is better than the conventional SSD, which achieved 69.4% and 6.57 FPPC.
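
For readers unfamiliar with the two ideas named in this abstract, the following is a minimal Python sketch of conventional greedy non-maximum suppression and of a per-character false positive ratio. It is not the NMS variant proposed in the paper, whose details the abstract does not give. The function names, the `(x_min, y_min, x_max, y_max)` box layout, the 0.5 IoU threshold, and the reading of FPPC as false positives divided by the number of ground-truth characters are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Standard greedy NMS: visit boxes by descending score and keep a box
    only if it does not overlap an already kept box by more than the
    threshold. Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def false_positives_per_character(num_false_positives: int,
                                  num_characters: int) -> float:
    """Assumed reading of FPPC: false positives / ground-truth characters."""
    if num_characters <= 0:
        raise ValueError("num_characters must be positive")
    return num_false_positives / num_characters


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box overlaps the first heavily
```

In this example the second box has an IoU of about 0.68 with the first and is suppressed, so only the first and third boxes survive.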

A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT

The International Arab Journal of Information Technology, 2020, Vol 17 (3), pp. 299-305
doi:10.34028/iajit/17/3/3
Cited By ~ 1

Author(s): Riaz Ahmad, Saeeda Naz, Muhammad Afzal, Sheikh Rashid, Marcus Liwicki, ...

Keyword(s): Deep Learning, Character Recognition, Data Augmentation, Short Term Memory, Recognition System, Learning Approach, Arabic Text, Data Set, Processing Step, Handwritten Arabic

This paper presents a deep learning benchmark on a complex data-set known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT data-set consists of complex patterns of handwritten Arabic text-lines. The paper contributes in three main aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data-augmentation. The pre-processing step prunes extra white space and de-skews the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. Data-augmentation combined with the deep learning approach improves the results, reaching 80.02% Character Recognition (CR) compared with a 75.08% baseline.
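
As background on how a CTC-trained network's per-frame outputs become a label sequence, here is a minimal Python sketch of standard best-path (greedy) CTC decoding. The abstract does not say which decoding scheme the authors used, so this is a generic illustration only. The function name, the nested-list input layout, and the choice of index 0 as the blank label are assumptions.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = 0) -> List[int]:
    """Best-path CTC decoding: take the highest-scoring label in each frame,
    merge consecutive repeats, then drop the blank label."""
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    decoded: List[int] = []
    previous = None
    for label in best_path:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical scores for 7 frames over 3 labels (label 0 is the blank).
frames = [
    [0.1, 0.8, 0.1],    # label 1
    [0.2, 0.7, 0.1],    # label 1 again, merged with the previous frame
    [0.9, 0.05, 0.05],  # blank, separates the two occurrences of label 1
    [0.1, 0.8, 0.1],    # label 1
    [0.1, 0.2, 0.7],    # label 2
    [0.1, 0.1, 0.8],    # label 2 again, merged
    [0.9, 0.05, 0.05],  # blank
]
labels = ctc_greedy_decode(frames)
```

The blank frame between the two runs of label 1 is what lets the decoder output the same character twice in a row.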

The Assisted Positioning Technology for High Speed Train Based on Deep Learning

Applied Sciences, 2020, Vol 10 (23), pp. 8625
doi:10.3390/app10238625

Author(s): Yali Song, Yinghong Wen

Keyword(s): Deep Learning, High Speed, Detection Method, Generative Adversarial Networks, Single Shot, High Speed Train, Data Set, Detection Model, Model Based, Cumulative Error

In the positioning process of a high-speed train, cumulative error may reduce the positioning accuracy. Assisted positioning technology based on kilometer posts is an effective way to correct this cumulative error. However, the traditional method of detecting kilometer posts is time-consuming and complex, which greatly affects the correction efficiency. Therefore, this paper proposes a kilometer post detection model based on deep learning. First, the Deep Convolutional Generative Adversarial Networks (DCGAN) algorithm is introduced to construct an effective kilometer post data set. This greatly reduces the cost of real data acquisition and provides a prerequisite for the construction of the detection model. Then, using existing optimizations as a reference and further simplifying the design of the Single Shot multibox Detector (SSD) model for the specific application scenario of this paper, a kilometer post detection model based on an improved SSD algorithm is established. Finally, the experimental results show that the detection model ensures both detection accuracy and efficiency: the accuracy reached 98.92%, while the detection time was only 35.43 ms. Thus, the model realizes rapid and accurate detection of kilometer posts and improves the assisted positioning technology based on kilometer posts by optimizing the detection method.

End-to-end Optical Chinese Character Recognition Based on Deep Learning

Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence, 2019
doi:10.1145/3377713.3377809

Author(s): Binglun Li

Keyword(s): Deep Learning, Character Recognition, Chinese Character, Chinese Character Recognition, End To End

Comparative Analysis of Deep Neural Networks for the Detection and Decoding of Data Matrix Landmarks in Cluttered Indoor Environments

Journal of Intelligent & Robotic Systems, 2021, Vol 103 (1)
doi:10.1007/s10846-021-01442-x

Author(s): Tiago Almeida, Vitor Santos, Oscar Martinez Mozos, Bernardo Lourenço

Keyword(s): Neural Networks, Deep Learning, Deep Neural Networks, Input Image, Automated Guided Vehicles, Data Matrix, Single Shot, Indoor Environments, Data Set, Valid Solution

Data Matrix patterns imprinted as passive visual landmarks have been shown to be a valid solution for the self-localization of Automated Guided Vehicles (AGVs) on shop floors. However, existing Data Matrix decoding applications take a long time to detect and segment the markers in the input image. Therefore, this paper proposes a pipeline where the detector is based on a real-time Deep Learning network and the decoder is a conventional method, i.e. the implementation in libdmtx. To do so, several types of Deep Neural Networks (DNNs) for object detection were studied, trained, compared, and assessed. The architectures range from region proposals (Faster R-CNN) to single-shot methods (SSD and YOLO). This study focused on performance and processing time to select the best Deep Learning (DL) model to carry out the detection of the visual markers. Additionally, a specific data set was created to evaluate those networks. This test set includes demanding situations, such as high illumination gradients in the same scene and Data Matrix markers positioned in skewed planes. The proposed approach outperformed the best known and most used Data Matrix decoder available in libraries like libdmtx.

Online Handwritten Chinese Character Recognition: From a Bayesian Approach to Deep Learning

Advances in Chinese Document and Text Processing, 2017, pp. 79-126
doi:10.1142/9789813143685_0004

Author(s): Lianwen Jin, Weixin Yang, Ziyong Feng, Zecheng Xie

Keyword(s): Deep Learning, Bayesian Approach, Character Recognition, Chinese Character, Chinese Character Recognition, Handwritten Chinese Character Recognition

One-dimensional deep learning inversion of electromagnetic induction data using convolutional neural network

Geophysical Journal International, 2020, Vol 222 (1), pp. 247-259
doi:10.1093/gji/ggaa161
Cited By ~ 2

Author(s): Davood Moghadas

Keyword(s): Neural Network, Deep Learning, Convolutional Neural Network, Electromagnetic Induction, Computational Cost, Single Step, Accurate Estimation, Well Performance, Convolutional Network, Data Set

Conventional geophysical inversion techniques suffer from several limitations including computational cost, nonlinearity, non-uniqueness and dimensionality of the inverse problem. Successful inversion of geophysical data has been a major challenge for decades. Here, a novel approach based on deep learning (DL) inversion via convolutional neural network (CNN) is proposed to instantaneously estimate subsurface electrical conductivity (σ) layering from electromagnetic induction (EMI) data. To this end, a fully convolutional network was trained on a large synthetic data set generated with a 1-D EMI forward model. The accuracy of the proposed approach was examined using several synthetic scenarios. Moreover, the trained network was used to find subsurface electromagnetic conductivity images (EMCIs) from EMI data measured along two transects from Chicken Creek catchment (Brandenburg, Germany). Dipole–dipole electrical resistivity tomography data were measured as well to obtain reference subsurface σ distributions down to a 6 m depth. The inversely estimated models were compared with their counterparts obtained from a spatially constrained deterministic algorithm as a standard code. Theoretical simulations demonstrated good performance of the algorithm even in the presence of noise in the data. Moreover, application of the DL inversion to subsurface imaging from Chicken Creek catchment demonstrated the accuracy and robustness of the proposed approach for EMI inversion. This approach returns the subsurface σ distribution directly from EMI data in a single step without any iterations. The proposed strategy considerably simplifies EMI inversion and allows for rapid and accurate estimation of subsurface EMCI from multiconfiguration EMI data.

Automated identification of cephalometric landmarks: Part 1—Comparisons between the latest deep-learning methods YOLOV3 and SSD

The Angle Orthodontist, 2019, Vol 89 (6), pp. 903-909
doi:10.2319/022019-127.1
Cited By ~ 9

Author(s): Ji-Hoon Park, Hye-Won Hwang, Jun-Ho Moon, Youngsung Yu, Hansuk Kim, ...

Keyword(s): Deep Learning, Computational Time, Automatic Identification, Identification System, Single Shot, Learning Methods, Radiographic Images, Data Set, Landmark Identification, Significant Difference

Objective: To compare the accuracy and computational efficiency of two of the latest deep-learning algorithms for automatic identification of cephalometric landmarks. Materials and Methods: A total of 1028 cephalometric radiographic images were selected as training data for the You-Only-Look-Once version 3 (YOLOv3) and Single Shot Multibox Detector (SSD) methods. A total of 80 landmarks were labeled as targets. After the deep-learning process, the algorithms were tested using a new test data set composed of 283 images. Accuracy was determined by measuring the point-to-point error and success detection rate and was visualized by drawing scattergrams. The computational time of both algorithms was also recorded. Results: The YOLOv3 algorithm outperformed SSD in accuracy for 38 of 80 landmarks. The other 42 of 80 landmarks did not show a statistically significant difference between YOLOv3 and SSD. Error plots of YOLOv3 showed not only a smaller error range but also a more isotropic tendency. The mean computational time spent per image was 0.05 seconds and 2.89 seconds for YOLOv3 and SSD, respectively. YOLOv3 showed approximately 5% higher accuracy compared with the top benchmarks in the literature. Conclusions: Of the two latest deep-learning methods applied, YOLOv3 seemed to be more promising as a fully automated cephalometric landmark identification system for use in clinical practice.
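
The two accuracy measures named in this abstract can be illustrated with a short Python sketch. It assumes that point-to-point error means the Euclidean distance between a predicted landmark and its reference position, and that the success detection rate is the fraction of landmarks whose error falls within a chosen threshold. The function names, the 2-D coordinate layout, and the threshold value are hypothetical; the abstract does not state the thresholds or units used in the study.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_to_point_errors(predicted: Sequence[Point],
                          reference: Sequence[Point]) -> List[float]:
    """Euclidean distance between each predicted landmark and its reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return [math.dist(p, r) for p, r in zip(predicted, reference)]


def success_detection_rate(errors: Sequence[float], threshold: float) -> float:
    """Fraction of landmarks whose error is within the threshold
    (same units as the coordinates)."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(e <= threshold for e in errors) / len(errors)


# Hypothetical coordinates for three landmarks.
predicted = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
reference = [(0.0, 0.0), (0.0, 0.0), (10.0, 13.0)]
errors = point_to_point_errors(predicted, reference)
rate = success_detection_rate(errors, threshold=4.0)
```

With these made-up points the errors are 0, 5 and 3, so two of the three landmarks fall within the threshold of 4.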

An Improved Deep Learning Network Structure for Multitask Text Implication Translation Character Recognition

Complexity, 2021, Vol 2021, pp. 1-11
doi:10.1155/2021/6617799

Author(s): Xiaoli Ma, Hongyan Xu, Xiaoqian Zhang, Haoyong Wang

Keyword(s): Deep Learning, Character Recognition, Good Description, Rapid Development, Extreme Value, Text Detection, Data Set, Learning Network, Coarse Filter, Deep Learning Network

With the rapid development of artificial intelligence technology, multitask textual translation has attracted more and more attention. Since the application of deep learning in particular, the performance of multitask translation text detection and recognition has greatly improved. However, because of the interference affecting the translated text in multitask settings, there is still a large gap between recognition performance and the requirements of real applications. Aiming at multitask and translation text detection, this paper proposes a text localization method based on multichannel, multiscale detection of maximally stable extremal regions (MSER) and cascade filtering. The method selects the appropriate color channel and scale to extract maximally stable extremal regions as character candidate regions and designs a cascaded filter, from coarse to fine, to remove false detections. The coarse filter is based on some simple morphological features and stroke width features, and the fine filter is trained by a two-recognition convolutional neural network. The remaining character candidate regions are merged into horizontal or multidirectional character strings through a graph model. The experimental results on the text data set demonstrate the effectiveness of the improved deep learning network character model and the feasibility of the textual implication translation analysis method based on this model. In particular, the translation character recognition results show that the model has good descriptive ability. Because of the characteristics of the model, the method is not sensitive to the scale of the sliding window, so it performs better than existing typical methods in retrieval tasks.

DeepAD: A Deep Learning Based Approach to Stroke-Level Abnormality Detection in Handwritten Chinese Character Recognition

2018 IEEE International Conference on Data Mining (ICDM), 2018
doi:10.1109/icdm.2018.00176

Author(s): Tie-Qiang Wang, Cheng-Lin Liu

Keyword(s): Deep Learning, Character Recognition, Chinese Character, Abnormality Detection, Chinese Character Recognition, Handwritten Chinese Character Recognition

Research on Offline Handwritten Chinese Character Recognition Based on Deep Learning

2019 9th International Conference on Information Science and Technology (ICIST), 2019
doi:10.1109/icist.2019.8836833

Author(s): Qiuyun Hao, Xiaoming Wu, Sen Zhang, Peng Zhang, Xiaofeng Ma, ...

Keyword(s): Deep Learning, Character Recognition, Chinese Character, Chinese Character Recognition, Handwritten Chinese Character Recognition
