An Improved Deep Learning Network Structure for Multitask Text Implication Translation Character Recognition

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Xiaoli Ma ◽  
Hongyan Xu ◽  
Xiaoqian Zhang ◽  
Haoyong Wang

With the rapid development of artificial intelligence technology, multitask text implication translation has attracted increasing attention. In particular, the application of deep learning has greatly improved the performance of text detection and recognition for multitask translation. However, because of the interference problems inherent in multitask translated text, a large gap remains between recognition performance and the requirements of practical applications. For multitask translation text detection, this paper proposes a text localization method based on multichannel, multiscale detection of maximally stable extremal regions and cascade filtering. Appropriate color channels and scales are selected to extract maximally stable extremal regions as character candidate regions, and a coarse-to-fine cascaded filter is designed to remove false detections. The coarse filter relies on simple morphological features and stroke-width features, while the fine filter is a trained binary-classification convolutional neural network. The remaining character candidate regions are merged into horizontal or multioriented character strings via a graph model. Experimental results on the text data set demonstrate the effectiveness of the improved deep learning character model and the feasibility of the text implication translation analysis method based on it. The character recognition results show that the model has good descriptive ability. Because the model is not sensitive to the scale of the sliding window, it performs better than existing typical methods in retrieval tasks.
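The coarse filtering stage described in the abstract can be sketched as follows. This is an illustrative assumption of how morphological and stroke-width features might reject non-text candidate regions; the region fields and thresholds are hypothetical, not the authors' actual parameters.

```python
# Hypothetical sketch of a coarse text-candidate filter: reject candidate
# regions (e.g. MSERs) using simple morphological features (aspect ratio,
# fill ratio) and stroke-width variation. All thresholds are illustrative.

def coarse_filter(regions,
                  max_aspect=8.0,      # elongated regions are unlikely to be characters
                  min_fill=0.1,        # too-sparse regions are probably noise
                  max_stroke_cv=0.8):  # text tends to have near-constant stroke width
    """Keep only regions whose shape/stroke statistics look text-like."""
    kept = []
    for r in regions:
        w, h = r["width"], r["height"]
        aspect = max(w, h) / max(1, min(w, h))
        fill = r["pixel_count"] / float(w * h)
        # coefficient of variation of the stroke width inside the region
        stroke_cv = r["stroke_std"] / max(1e-6, r["stroke_mean"])
        if aspect <= max_aspect and fill >= min_fill and stroke_cv <= max_stroke_cv:
            kept.append(r)
    return kept
```

Regions surviving this cheap test would then be passed to the fine (CNN-based) filter.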

Algorithms ◽  
2020 ◽  
Vol 13 (12) ◽  
pp. 331
Author(s):  
Joseph Gesnouin ◽  
Steve Pechberti ◽  
Guillaume Bresson ◽  
Bogdan Stanciulescu ◽  
Fabien Moutarde

Understanding the behaviors and intentions of humans is still one of the main challenges for vehicle autonomy. More specifically, inferring the intentions and actions of vulnerable actors, namely pedestrians, in complex situations such as urban traffic scenes remains a difficult task and a blocking point towards more automated vehicles. Answering the question “Is the pedestrian going to cross?” is a good starting point in the quest for the fifth level of autonomous driving. In this paper, we address the problem of real-time discrete intention prediction of pedestrians in urban traffic environments by linking the dynamics of a pedestrian’s skeleton to an intention. Hence, we propose SPI-Net (Skeleton-based Pedestrian Intention network): a representation-focused multi-branch network combining features from 2D pedestrian body poses to predict pedestrians’ discrete intentions. Experimental results show that SPI-Net achieved 94.4% accuracy in pedestrian crossing prediction on the JAAD data set while remaining efficient for real-time scenarios: SPI-Net can perform around one inference every 0.25 ms on one GPU (an RTX 2080 Ti), or every 0.67 ms on one CPU (an Intel Core i7-8700K).
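A minimal sketch of the input side of such a skeleton-based approach (not SPI-Net itself): a sequence of 2D body poses is turned into a fixed-size feature vector that a multi-branch intention network could consume. The keypoint layout and the normalization by the first frame's center are assumptions for illustration.

```python
# Illustrative pose-sequence featurization: center every keypoint on the
# mean position of the first frame, so the features encode relative motion
# of the skeleton rather than absolute image coordinates.

def pose_sequence_features(frames):
    """frames: list of frames, each a list of (x, y) keypoints.
    Returns a flat feature vector of centered coordinates."""
    xs = [p[0] for p in frames[0]]
    ys = [p[1] for p in frames[0]]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    feats = []
    for frame in frames:
        for (x, y) in frame:
            feats.extend([x - cx, y - cy])
    return feats
```

Centering on the first frame removes the pedestrian's absolute image position, which is one common way to make pose features translation-invariant.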


Author(s):  
A. Kala ◽  
S. Ganesh Vaidyanathan

Rainfall forecasting is a critical and challenging task because rainfall depends on many climatic and weather parameters. Hence, robust and accurate rainfall forecasting models need to be created by applying machine learning and deep learning approaches. Several automatic weather-prediction systems have been created, but their performance depends on the weather pattern, season and location, which increases processing time. Therefore, in this work, an artificial-algae-optimized long short-term memory (LSTM) deep learning network is introduced to forecast monthly rainfall. The Homogeneous Indian Monthly Rainfall Data Set (1871–2016) is used as the source of rainfall information. The gathered data are processed with an LSTM, which can model time series and effectively capture dependencies within the data. The most challenging phase of LSTM training is finding optimal network parameters such as weights and biases. To obtain these parameters, a bio-inspired metaheuristic, the Artificial Algae Algorithm (AAA), is used. The rainfall forecast for the testing dataset is compared with existing models, and the results show the superiority of our model over state-of-the-art models for forecasting Indian monsoon rainfall. The LSTM model combined with AAA accurately predicts the June–September monsoon.
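The parameter search described above can be sketched with a generic population-based metaheuristic. This is a toy stand-in, not the Artificial Algae Algorithm itself (AAA's helical movement, adaptation and reproduction steps are not modeled); it only illustrates the idea of evolving candidate parameter vectors to minimize a loss.

```python
import random

# Toy population-based metaheuristic: candidates are perturbed randomly
# and pulled slightly toward the current best; a trial replaces its
# parent only if it lowers the loss (greedy acceptance).

def metaheuristic_search(loss, dim, pop_size=20, iters=200, step=0.5, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=loss)
    for _ in range(iters):
        for i, cand in enumerate(pop):
            trial = [c + step * rng.uniform(-1, 1) + 0.1 * (b - c)
                     for c, b in zip(cand, best)]
            if loss(trial) < loss(cand):
                pop[i] = trial
        best = min(pop, key=loss)
    return best
```

In the paper's setting, the loss would be the LSTM's forecast error on the training rainfall series, and the candidate vectors would encode network weights and biases.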


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text lines. The paper makes three main contributions: (1) pre-processing, (2) a deep-learning-based approach, and (3) data augmentation. The pre-processing step includes pruning extra white space and de-skewing skewed text lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflections. Combining data augmentation with the deep learning approach yields a promising improvement, raising the Character Recognition (CR) rate from a baseline of 75.08% to 80.02%.


2021 ◽  
Vol 11 (1) ◽  
pp. 339-348
Author(s):  
Piotr Bojarczak ◽  
Piotr Lesiak

Abstract The article uses images from Unmanned Aerial Vehicles (UAVs) for rail diagnostics. The main advantage of such a solution compared to traditional surveys performed with measuring vehicles is that train traffic need not be reduced. The study is limited to the diagnosis of hazardous split defects in rails. An algorithm is proposed that detects them with an efficiency of about 81% for defects no smaller than 6.9% of the rail head width. It uses the FCN-8 deep learning network, implemented in the TensorFlow environment, to extract the rail head by image segmentation. Using this type of network for segmentation increases the robustness of the algorithm to changes in the brightness of the recorded rail image, which is of fundamental importance under the variable conditions of image recording by UAVs. The detection of these defects in the rail head is performed by an algorithm written in Python using the OpenCV library. To locate a defect, it uses the contour of the extracted rail head together with a rectangle circumscribed around it. The use of UAVs together with artificial intelligence to detect split defects is an important element of novelty in this work.
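The localization step above relies on a rectangle circumscribed around the segmented rail head, analogous to applying OpenCV's `cv2.boundingRect` to the extracted contour. A dependency-free sketch of that geometric operation, assuming the segmentation mask is a grid of 0/1 values:

```python
# Axis-aligned bounding rectangle around all nonzero pixels of a binary
# mask (rows of 0/1), mimicking what cv2.boundingRect computes for a
# contour of the segmented rail head.

def bounding_rect(mask):
    """Return (x, y, width, height) of the tightest rectangle around
    all nonzero pixels, or None for an empty mask."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
```

Defect positions can then be expressed relative to this rectangle, which is what makes the method tolerant of where the rail sits in the UAV frame.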


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ryoya Shiode ◽  
Mototaka Kabashima ◽  
Yuta Hiasa ◽  
Kunihiro Oka ◽  
Tsuyoshi Murase ◽  
...  

Abstract The purpose of the study was to develop a deep learning network for estimating and constructing highly accurate 3D bone models directly from actual X-ray images and to verify its accuracy. The data used were 173 computed tomography (CT) scans and 105 actual X-ray images of healthy wrist joints. To compensate for the small size of the dataset, digitally reconstructed radiography (DRR) images generated from CT were used as training data instead of actual X-ray images. At test time, DRR-like images were generated from the actual X-ray images and fed to the network, enabling high-accuracy estimation of a 3D bone model from a small data set. The 3D shapes of the radius and ulna were estimated from actual X-ray images with accuracies of 1.05 ± 0.36 and 1.45 ± 0.41 mm, respectively.
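A digitally reconstructed radiograph, used above as training data, is essentially a simulated X-ray: attenuation is accumulated along rays through the CT volume. A minimal parallel-ray sketch that simply sums voxel intensities along one axis (real DRR generation models the source geometry and attenuation physics far more carefully):

```python
# Toy parallel-projection DRR: sum the CT volume along the z axis to get
# a 2D radiograph-like image. volume is a nested list indexed [z][y][x].

def simple_drr(volume):
    """Return a 2D projection [y][x] summing attenuation along z."""
    depth = len(volume)
    rows, cols = len(volume[0]), len(volume[0][0])
    return [[sum(volume[z][y][x] for z in range(depth))
             for x in range(cols)]
            for y in range(rows)]
```

The appeal of this trick is that every CT scan yields arbitrarily many synthetic "X-rays" with known 3D ground truth, which is how the authors compensated for having few real radiographs.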


2021 ◽  
Vol 11 (13) ◽  
pp. 5880
Author(s):  
Paloma Tirado-Martin ◽  
Raul Sanchez-Reillo

Nowadays, deep learning tools are widely applied in biometrics, and electrocardiogram (ECG) biometrics is no exception. However, algorithm performance relies heavily on a representative training dataset. Because ECGs undergo constant temporal variation, it is especially important to collect databases that represent these conditions; nonetheless, restrictions on database publication obstruct further research on this topic. This work was developed with the help of a database that represents realistic biometric-recognition scenarios, as data were acquired on different days and in different physical activities and positions. Classification was implemented with a deep learning network, BioECG, avoiding complex and time-consuming signal transformations. Exhaustive tuning was performed, including variations in enrollment length, improving ECG verification under more complex and realistic biometric conditions. Finally, this work studied one-day and two-day enrollments and their effects. Two-day enrollments yielded large general improvements, even when verification was performed with more unstable signals. The EER improved by 63% when a change of position was included, by up to almost 99% when visits occurred on a different day, and by up to 91% when the user's heart rate increased after exercise.
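The Equal Error Rate (EER) reported above is the standard verification metric: the operating point where the false accept rate (FAR) equals the false reject rate (FRR). A simple threshold sweep, assuming scores are similarities (higher means more likely genuine):

```python
# Estimate the EER from genuine and impostor similarity scores by
# sweeping candidate thresholds and finding where FAR and FRR are closest.

def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) at the point where |FAR - FRR| is smallest."""
    best_gap, eer, best_t = 2.0, 1.0, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuines rejected
        if abs(far - frr) < best_gap:
            best_gap, eer, best_t = abs(far - frr), (far + frr) / 2, t
    return eer, best_t
```

A relative EER improvement like the 63% quoted above would then be computed as `(eer_before - eer_after) / eer_before`.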

