Phonovisual Biases in Language: is the Lexicon Tied to the Visual World?

Towards Open-Set Text Recognition via Label-to-Prototype Learning

10.36227/techrxiv.16910062 ◽

2021 ◽

Author(s):

Chang Liu ◽

Chun Yang ◽

Hai-bo Qin ◽

Xiaobin Zhu ◽

Xu-Cheng Yin

Keyword(s):

Text Recognition ◽

Visual Features ◽

Training Set ◽

Learning Framework ◽

Small Impact ◽

Learning Module ◽

Open Set ◽

Scene Text ◽

Prototype Learning ◽

Scene Text Recognition

Scene text recognition is a popular topic and can benefit various tasks. Although many methods have been proposed for close-set text recognition challenges, they cannot be directly applied to open-set scenarios, where the evaluation set contains novel characters that do not appear in the training set. Conventional methods require collecting new data and retraining the model to handle these novel characters, which is an expensive and tedious process. In this paper, we propose a label-to-prototype learning framework to handle novel characters without retraining the model. In the proposed framework, novel characters are mapped to their corresponding prototypes with a label-to-prototype learning module. This module is trained on characters with seen labels and can be easily generalized to novel characters. Additionally, feature-level rectification is conducted via a topology-preserving transformation, resulting in better alignment between visual features and constructed prototypes while having a reasonably small impact on model speed. Extensive experiments show that our method achieves promising performance on a variety of zero-shot, close-set, and open-set text recognition datasets.
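
The matching step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the learned label-to-prototype module is replaced here by hand-written prototype vectors, and the names `cosine`, `classify`, and `prototypes`, as well as the choice of cosine similarity, are assumptions made for illustration. The sketch only shows why a prototype-based recognizer can accept a novel character without retraining: supporting it means adding one more prototype.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify(feature, prototypes):
    """Return the label whose prototype is most similar to the visual feature."""
    return max(prototypes, key=lambda label: cosine(feature, prototypes[label]))

# Hypothetical prototypes for characters seen during training.
prototypes = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]}

# A novel character is supported by adding its prototype; no weights change.
prototypes["c"] = [0.0, 0.0, 1.0]

# Hypothetical visual feature extracted from an image of the novel character.
label = classify([0.1, 0.2, 0.9], prototypes)
```

In the framework the abstract describes, the prototype for a novel label would come from the trained module rather than being written by hand.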

A Deep Graph-Embedded LSTM Neural Network Approach for Airport Delay Prediction

Journal of Advanced Transportation ◽

10.1155/2021/6638130 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Weili Zeng ◽

Juan Li ◽

Zhibin Quan ◽

Xiaobo Lu

Keyword(s):

Neural Network ◽

The United States ◽

Convolution Kernel ◽

Training Set ◽

Neural Network Approach ◽

Delay Propagation ◽

Learning Framework ◽

Spherical Distance ◽

Model Training ◽

Delay Prediction

Due to the strong propagation causality of delays between airports, this paper proposes a delay prediction model based on a deep graph neural network to study delay prediction from the perspective of an airport network. We regard airports as nodes of a graph and use a directed graph to model the relationships between airports. For adjacent airports, edge weights are measured by the spherical distance between them, while for airports connected by flights, the number of flight pairs between them is used. On this basis, a diffusion convolution kernel is constructed to capture the characteristics of delay propagation between airports, and it is further integrated into a sequence-to-sequence LSTM neural network to establish a deep learning framework for delay prediction. We name this model the deep graph-embedded LSTM (DGLSTM). To verify the model’s effectiveness and superiority, we use historical delay data from 325 airports in the United States from 2015 to 2018 as the training and test sets. The experimental results suggest that the proposed method is superior to existing mainstream methods in terms of accuracy and robustness.
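
The spherical-distance edge measure mentioned in this abstract can be sketched as follows. The abstract does not say which formula is used, so the haversine formula below is an assumption, as are the names `spherical_distance_km` and `distance_edges`, the Earth radius constant, and the two placeholder airports. How a distance is turned into an edge weight is also not specified in the abstract, so the sketch stops at the distances.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def spherical_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def distance_edges(airports):
    """Map every ordered airport pair (src, dst) to its spherical distance."""
    return {
        (src, dst): spherical_distance_km(*airports[src], *airports[dst])
        for src in airports
        for dst in airports
        if src != dst
    }

# Hypothetical airports given as (latitude, longitude) in degrees.
airports = {"A": (0.0, 0.0), "B": (0.0, 90.0)}
edges = distance_edges(airports)
```

The two placeholder points lie a quarter of the equator apart, so the computed distance should be close to one quarter of the Earth's circumference under the assumed radius.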

Geometric property-based convolutional neural network for indoor object detection

International Journal of Advanced Robotic Systems ◽

10.1177/1729881421993323 ◽

2021 ◽

Vol 18 (1) ◽

pp. 172988142199332

Author(s):

Xintao Ding ◽

Boquan Li ◽

Jinbao Wang

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Geometric Property ◽

Ground Truth ◽

Geometric Constraints ◽

Depth Information ◽

Training Set ◽

Object Knowledge ◽

The Mean

Indoor object detection is a demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detectors and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties into Faster R-CNN to improve detection performance. In detail, we first use mesh grids that are the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we use 2D geometric constraints to refine the RPN-RoIs, where the 2D constraint of each class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes in the training set. Comparison experiments are conducted on two indoor datasets, SUN2012 and NYUv2. Since depth information is available in NYUv2, we add depth constraints to GP-Faster and propose a 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster improve the mean average precision.
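
The 2D constraint described in this abstract, a convex hull over the (width, height) pairs of ground-truth boxes, can be sketched as below. This is an illustration under stated assumptions, not the GP-Faster code: Andrew's monotone chain is one standard convex hull algorithm chosen here for brevity, and the function names and the sample box sizes are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(point, hull):
    """True if the point lies inside or on a counter-clockwise hull of 3+ vertices."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], point) >= 0 for i in range(n))

# Hypothetical (width, height) pairs of ground-truth boxes for one class.
gt_sizes = [(40, 60), (80, 60), (80, 120), (40, 120), (60, 90)]
hull = convex_hull(gt_sizes)

# A candidate RoI is kept only if its (width, height) falls in the hull.
keep_plausible = inside_hull((50, 100), hull)
keep_implausible = inside_hull((200, 30), hull)
```

A candidate whose size resembles the training boxes passes the check, while a very wide and flat candidate is rejected.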

Nodule Detection with Convolutional Neural Network Using Apache Spark and GPU Frameworks

Applied Sciences ◽

10.3390/app11062838 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2838

Author(s):

Nikitha Johnsirani Venkatesan ◽

Dong Ryeol Shin ◽

Choon Sung Nam

Keyword(s):

Neural Network ◽

Radiation Dose ◽

Convolutional Neural Network ◽

Model Performance ◽

Performance Comparison ◽

Apache Spark ◽

Training Time ◽

Learning Framework ◽

Proposed Model

In the pharmaceutical field, early detection of lung nodules is indispensable for increasing patient survival. The quality of medical images can be enhanced by increasing the radiation dose, but a high radiation dose can provoke cancer, which forces experts to use limited radiation. Using abrupt radiation generates noise in CT scans. We propose an optimal convolutional neural network model in which Gaussian noise is removed for better classification and increased training accuracy. Experiments on the 160 GB LUNA16 dataset show that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measure, and area under the ROC curve (AUC) are used as evaluation metrics. We compared the performance of our proposed model on several platforms, namely Apache Spark, GPU, and CPU, to reduce the training time without compromising accuracy. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.
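
Most of the evaluation metrics listed in this abstract follow directly from the four confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not results from the paper. AUC is left out because it needs ranked scores rather than counts.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # identical to sensitivity
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": recall,
        "specificity": safe_div(tn, tn + fp),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }

# Hypothetical counts for a nodule / non-nodule classifier.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

Sensitivity and recall are the same quantity, which is why they share a value in the returned dictionary.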

Super multi-step wind speed forecasting system with training set extension and horizontal–vertical integration neural network

Applied Energy ◽

10.1016/j.apenergy.2021.116908 ◽

2021 ◽

Vol 292 ◽

pp. 116908

Author(s):

Ling Liu ◽

Jujie Wang

Keyword(s):

Neural Network ◽

Wind Speed ◽

Vertical Integration ◽

Training Set ◽

Wind Speed Forecasting ◽

Forecasting System

Deep learning for denoising

Geophysics ◽

10.1190/geo2018-0668.1 ◽

2019 ◽

Vol 84 (6) ◽

pp. V333-V350 ◽

Cited By ~ 15

Author(s):

Siwei Yu ◽

Jianwei Ma ◽

Wenlong Wang

Keyword(s):

Neural Network ◽

Deep Learning ◽

Graphics Processing Unit ◽

Random Noise ◽

Stochastic Gradient Descent ◽

Processing Unit ◽

Noise Attenuation ◽

Optimal Parameters ◽

Training Set ◽

Unknown Variance

Compared with traditional seismic noise attenuation algorithms that depend on signal models and their corresponding prior assumptions, a deep neural network for noise removal is trained on a large training set in which the inputs are the raw data sets and the corresponding outputs are the desired clean data. After training is complete, the deep-learning (DL) method achieves adaptive denoising with no requirement for (1) accurate modeling of the signal and noise or (2) optimal parameter tuning. We call this intelligent denoising. We have used a convolutional neural network (CNN) as the basic tool for DL. In random and linear noise attenuation, the training set is generated with artificially added noise. In the multiple attenuation step, the training set is generated with the acoustic wave equation. Stochastic gradient descent is used to solve for the optimal parameters of the CNN. The runtime of DL on a graphics processing unit for denoising is of the same order as that of the f-x deconvolution method. Synthetic and field results indicate the potential applications of DL in automatic attenuation of random noise (with unknown variance), linear noise, and multiples.
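
The idea of generating a training set with artificially added noise can be sketched in a few lines. This is only an illustration of the input/target pairing the abstract describes: the function name `make_training_pairs`, the Gaussian noise model, the fixed seed, and the toy traces are assumptions, and the paper's actual data generation is not reproduced here.

```python
import random

def make_training_pairs(clean_traces, noise_std, seed=0):
    """Pair each clean trace with a copy corrupted by additive Gaussian noise.

    Returns a list of (noisy_input, clean_target) tuples.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pairs = []
    for trace in clean_traces:
        noisy = [sample + rng.gauss(0.0, noise_std) for sample in trace]
        pairs.append((noisy, list(trace)))
    return pairs

# Hypothetical clean traces standing in for noise-free seismic data.
clean_traces = [[0.0, 1.0, 0.0, -1.0], [0.5, 0.5, -0.5, -0.5]]
pairs = make_training_pairs(clean_traces, noise_std=0.1)
```

A denoising network would then be trained to map each noisy input back to its clean target.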

Neural network-based software sensor: training set design and application to a continuous pulp digester

Control Engineering Practice ◽

10.1016/j.conengprac.2004.02.013 ◽

2005 ◽

Vol 13 (2) ◽

pp. 135-143 ◽

Cited By ~ 27

Author(s):

Pascal Dufour ◽

Sharad Bhartiya ◽

Prasad S. Dhurjati ◽

Francis J. Doyle III

Keyword(s):

Neural Network ◽

Set Design ◽

Training Set ◽

Software Sensor ◽

Pulp Digester ◽

Design And Application

Learning spatio-temporal visual features by a large scale neural network model

Neuroscience Research ◽

10.1016/j.neures.2010.07.1925 ◽

2010 ◽

Vol 68 ◽

pp. e434

Author(s):

Naoto Yukinawa ◽

Shin Ishii

Keyword(s):

Neural Network ◽

Network Model ◽

Neural Network Model ◽

Large Scale ◽

Visual Features ◽

Spatio Temporal

Study on Neural Networks Usage to Analyse Correlation between Spectrum of Vibration Acceleration Signal from Pin of Ball Mill and its Filling Level

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.770.540 ◽

2015 ◽

Vol 770 ◽

pp. 540-546 ◽

Cited By ~ 1

Author(s):

Yuri Eremenko ◽

Dmitry Poleshchenko ◽

Anton Glushchenko

Keyword(s):

Neural Network ◽

Ball Mill ◽

Signal Amplitude ◽

Signal Spectrum ◽

Laboratory Model ◽

Training Set ◽

Vibration Acceleration ◽

Acceleration Signal ◽

The Neural Network ◽

Trained Neural Network

This paper considers the use of modern intelligent information processing methods to evaluate the filling level of a ball mill. For that purpose, a vibration acceleration signal was measured on a laboratory model of a mill using an accelerometer attached to a mill pin. It is concluded that the mill filling level cannot be measured from the amplitude of this signal alone, so the signal spectrum, processed by a neural network, is used instead. A training set for the neural network is formed using spectral analysis methods. The trained neural network is able to find the correlation between the mill pin vibration acceleration signal and the mill filling level. A test set is formed from data not included in the training set and is used to assess the network's ability to evaluate the mill filling degree. The neural network guarantees an error of no more than 7% in the evaluation of the mill filling level.
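
The step of turning a vibration signal into a spectrum that a neural network can consume can be sketched as below. The abstract does not name the spectral analysis method, so the plain discrete Fourier transform used here is an assumption, as are the function name `magnitude_spectrum` and the synthetic test signal. The naive O(N^2) transform is chosen only to keep the sketch dependency-free.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Magnitudes of the first N // 2 + 1 DFT bins of a real-valued signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]

# Hypothetical signal: a pure sinusoid completing 5 cycles over 64 samples.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spectrum = magnitude_spectrum(signal)

# Index of the dominant frequency bin; such a spectrum vector could serve
# as the input features of a neural network.
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

For a real measurement, the spectrum vector rather than the raw amplitude would be passed to the network, in line with the abstract's conclusion that amplitude alone is not sufficient.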
