Phonovisual Biases in Language: is the Lexicon Tied to the Visual World?

Author(s):  
Andrea Gregor de Varda ◽  
Carlo Strapparava

The present paper addresses the study of cross-linguistic and cross-modal iconicity within a deep learning framework. An LSTM-based Recurrent Neural Network is trained to associate the phonetic representation of a concrete word, encoded as a sequence of feature vectors, to the visual representation of its referent, expressed as an HCNN-transformed image. The processing network is then tested, without further training, in a language that does not appear in the training set and belongs to a different language family. The performance of the model is evaluated through a comparison with a randomized baseline; we show that such an imaginative network is capable of extracting language-independent generalizations in the mapping from linguistic sounds to visual features, providing empirical support for the hypothesis of a universal sound-symbolic substrate underlying all languages.

2021 ◽  
Author(s):  
Chang Liu ◽  
Chun Yang ◽  
Hai-bo Qin ◽  
Xiaobin Zhu ◽  
Xu-Cheng Yin

Scene text recognition is a popular topic and can benefit various tasks. Although many methods have been proposed for the close-set text recognition challenges, they cannot be directly applied to open-set scenarios, where the evaluation set contains novel characters not appearing in the training set. Conventional methods require collecting new data and retraining the model to handle these novel characters, which is an expensive and tedious process. In this paper, we propose a label-to-prototype learning framework to handle novel characters without retraining the model. In the proposed framework, novel characters are effectively mapped to their corresponding prototypes with a label-to-prototype learning module. This module is trained on characters with seen labels and can be easily generalized to novel characters. Additionally, feature-level rectification is conducted via topology-preserving transformation, resulting in better alignments between visual features and constructed prototypes while having a reasonably small impact on model speed. Extensive experiments show that our method achieves promising performance on a variety of zero-shot, close-set, and open-set text recognition datasets.


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Weili Zeng ◽  
Juan Li ◽  
Zhibin Quan ◽  
Xiaobo Lu

Due to the strong propagation causality of delays between airports, this paper proposes a delay prediction model based on a deep graph neural network to study delay prediction from the perspective of an airport network. We regard airports as nodes of a graph network and use a directed graph network to represent the relationships between airports. For adjacent airports, weights of edges are measured by the spherical distance between them, while the number of flight pairs between them is utilized for airports connected by flights. On this basis, a diffusion convolution kernel is constructed to capture characteristics of delay propagation between airports, and it is further integrated into the sequence-to-sequence LSTM neural network to establish a deep learning framework for delay prediction. We name this model the deep graph-embedded LSTM (DGLSTM). To verify the model’s effectiveness and superiority, we utilize the historical delay data of 325 airports in the United States from 2015 to 2018 as the model training set and test set. The experimental results suggest that the proposed method is superior to the existing mainstream methods in terms of accuracy and robustness.
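The abstract does not give the form of its diffusion convolution kernel. As a rough illustration only, the sketch below uses one common formulation: a K-step random walk on a directed, weighted graph, with each step scaled by a coefficient. The function name `diffusion_conv`, the coefficient list `thetas`, and the toy two-airport graph are assumptions made for this example, not details from the paper.

```python
import numpy as np

def diffusion_conv(weights, signal, thetas):
    """Weighted sum of random-walk diffusion steps on a directed graph.

    weights: (n, n) non-negative matrix, weights[i, j] is the edge i -> j
    signal:  (n, f) node features, e.g. recent delays per airport
    thetas:  one coefficient per diffusion step, starting with step 0
    """
    weights = np.asarray(weights, dtype=float)
    signal = np.asarray(signal, dtype=float)

    # Row-normalise so each node averages over the nodes it points to.
    out_degree = weights.sum(axis=1, keepdims=True)
    transition = np.divide(
        weights, out_degree,
        out=np.zeros_like(weights), where=out_degree > 0,
    )  # nodes without outgoing edges keep an all-zero row

    result = np.zeros_like(signal)
    diffused = signal.copy()
    for theta in thetas:
        result += theta * diffused          # add the current diffusion step
        diffused = transition @ diffused    # move one step further along edges
    return result

# Toy graph (hypothetical): airport 0 has a single edge to airport 1.
toy_weights = [[0, 1], [0, 0]]
toy_delays = [[0.0], [1.0]]
toy_output = diffusion_conv(toy_weights, toy_delays, thetas=[1.0, 0.5])
```

In a learned model the coefficients would be trainable parameters and the output would feed the recurrent cells; here they are fixed numbers so that the arithmetic can be checked by hand.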


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199332
Author(s):  
Xintao Ding ◽  
Boquan Li ◽  
Jinbao Wang

Indoor object detection is a very demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detectors and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties in Faster R-CNN to improve the detection performance. In detail, we first use mesh grids that are the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we then use 2D geometric constraints to refine the RPN-RoIs, in which the 2D constraint of every class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes in the training set. Comparison experiments are implemented on two indoor datasets, SUN2012 and NYUv2. Since the depth information is available in NYUv2, we involve depth constraints in GP-Faster and propose 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster increase the mean average precision.
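The 2D constraint described above can be pictured with a small sketch: build the convex hull of the (width, height) pairs of one class's ground-truth boxes, then keep only the proposals whose size falls inside that hull. The abstract does not say how RoIs are refined, so treating refinement as a simple filter is an assumption, as are the helper names and the toy box sizes. The hull is assumed to have at least three non-collinear points.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Monotone-chain convex hull, returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, point):
    """True if point lies inside or on the boundary of a CCW convex hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], point) >= 0 for i in range(n))

def filter_rois(rois, hull):
    """Keep (x1, y1, x2, y2) boxes whose (width, height) lies in the hull."""
    return [r for r in rois if inside_hull(hull, (r[2] - r[0], r[3] - r[1]))]

# Hypothetical (width, height) pairs of ground-truth boxes for one class.
gt_sizes = [(10, 10), (50, 10), (50, 40), (10, 40), (30, 20)]
size_hull = convex_hull(gt_sizes)

# One plausible proposal and one implausibly wide, flat proposal.
proposals = [(0, 0, 30, 20), (0, 0, 100, 5)]
kept = filter_rois(proposals, size_hull)
```

A per-class hull is cheap to precompute from the training annotations, and the containment test adds only a few multiplications per proposal.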


2021 ◽  
Vol 11 (6) ◽  
pp. 2838
Author(s):  
Nikitha Johnsirani Venkatesan ◽  
Dong Ryeol Shin ◽  
Choon Sung Nam

In the pharmaceutical field, early detection of lung nodules is indispensable for increasing patient survival. We can enhance the quality of the medical images by intensifying the radiation dose. However, a high radiation dose provokes cancer, which forces experts to use limited radiation. Using abrupt radiation generates noise in CT scans. We propose an optimal Convolutional Neural Network model in which Gaussian noise is removed for better classification and increased training accuracy. Experimental demonstration on the LUNA16 dataset of size 160 GB shows that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measurement, and area under the ROC curve (AUC) of the model performance are taken as evaluation metrics. We conducted a performance comparison of our proposed model on numerous platforms, like Apache Spark, GPU, and CPU, to reduce the training time without compromising the accuracy percentage. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.


Geophysics ◽  
2019 ◽  
Vol 84 (6) ◽  
pp. V333-V350 ◽  
Author(s):  
Siwei Yu ◽  
Jianwei Ma ◽  
Wenlong Wang

In contrast to traditional seismic noise attenuation algorithms, which depend on signal models and their corresponding prior assumptions, a deep neural network for noise removal is trained on a large training set in which the inputs are the raw data sets and the corresponding outputs are the desired clean data. After the completion of training, the deep-learning (DL) method achieves adaptive denoising with no requirement for (1) accurate modeling of the signal and noise or (2) optimal parameter tuning. We call this intelligent denoising. We have used a convolutional neural network (CNN) as the basic tool for DL. In random and linear noise attenuation, the training set is generated with artificially added noise. In the multiple attenuation step, the training set is generated with the acoustic wave equation. Stochastic gradient descent is used to solve for the optimal parameters of the CNN. The runtime of DL on a graphics processing unit for denoising has the same order as the f-x deconvolution method. Synthetic and field results indicate the potential applications of DL in automatic attenuation of random noise (with unknown variance), linear noise, and multiples.
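For the random-noise case, generating a training set with artificially added noise can be sketched as follows. The abstract does not specify the noise model, so zero-mean Gaussian noise at several standard deviations is an assumption here, as are the function name `make_noisy_pairs`, the noise levels, and the all-zero placeholder array standing in for a clean seismic section.

```python
import numpy as np

def make_noisy_pairs(clean, noise_levels, seed=0):
    """Build (noisy input, clean target) pairs from one clean section.

    clean:        2D array, e.g. time samples by traces
    noise_levels: standard deviations of the added Gaussian noise
    """
    rng = np.random.default_rng(seed)
    inputs, targets = [], []
    for sigma in noise_levels:
        noise = rng.normal(0.0, sigma, size=clean.shape)
        inputs.append(clean + noise)   # network input: corrupted data
        targets.append(clean)          # network target: desired clean data
    return np.stack(inputs), np.stack(targets)

# Placeholder "clean" section; real training would use synthetic or
# carefully processed records instead of zeros.
clean_section = np.zeros((4, 8))
noisy, target = make_noisy_pairs(clean_section, noise_levels=[0.0, 0.1, 0.5])
```

Mixing several noise levels in one training set is one simple way to expose a network to noise of varying strength, which matches the abstract's interest in noise with unknown variance.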


2005 ◽  
Vol 13 (2) ◽  
pp. 135-143 ◽  
Author(s):  
Pascal Dufour ◽  
Sharad Bhartiya ◽  
Prasad S. Dhurjati ◽  
Francis J. Doyle III

2015 ◽  
Vol 770 ◽  
pp. 540-546 ◽  
Author(s):  
Yuri Eremenko ◽  
Dmitry Poleshchenko ◽  
Anton Glushchenko

The use of modern intelligent information processing methods for evaluating the filling level of a ball mill is considered. For that purpose, a vibration acceleration signal was measured on a laboratory model of a mill with an accelerometer attached to a mill pin. It is concluded that the mill filling level cannot be measured from the amplitude of this signal alone, so the signal spectrum, processed by a neural network, is used instead. A training set for the neural network is formed with the help of spectral analysis methods. The trained neural network is able to find the correlation between the mill pin vibration acceleration signal and the mill filling level. A test set is formed from data not included in the training set and is used to assess the network's ability to evaluate the mill filling degree. The neural network guarantees no more than 7% error in the evaluation of the mill filling level.

