Geometric Regularization of Local Activations for Knowledge Transfer in Convolutional Neural Networks

Information ◽

10.3390/info12080333 ◽

2021 ◽

Vol 12 (8) ◽

pp. 333

Author(s):

Ilias Theodorakopoulos ◽

Foteini Fotopoulou ◽

George Economou

Keyword(s):

Neural Networks ◽

Knowledge Transfer ◽

Convolutional Neural Networks ◽

Feature Space ◽

Local Features ◽

Distance Measures ◽

Limited Data ◽

Geometrical Characteristics ◽

External Data ◽

Knowledge Distillation

In this work, we propose a mechanism for knowledge transfer between Convolutional Neural Networks via the geometric regularization of local features produced by the activations of convolutional layers. We formulate appropriate loss functions, driving a “student” model to adapt such that its local features exhibit similar geometrical characteristics to those of an “instructor” model, at corresponding layers. The investigated functions, inspired by manifold-to-manifold distance measures, are designed to compare the neighboring information inside the feature space of the involved activations without any restrictions in the features’ dimensionality, thus enabling knowledge transfer between different architectures. Experimental evidence demonstrates that the proposed technique is effective in different settings, including knowledge transfer to smaller models, transfer between different deep architectures, and harnessing knowledge from external data, producing models with increased accuracy compared to typical training. Furthermore, results indicate that the presented method can work synergistically with methods such as knowledge distillation, further increasing the accuracy of the trained models. Finally, experiments on training with limited data show that a combined regularization scheme can achieve the same generalization as non-regularized training with 50% of the data in the CIFAR-10 classification task.
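The abstract above does not give the loss functions themselves, so the following is only a rough sketch of the general idea of comparing neighborhood geometry between feature sets of different dimensionality. The function names (`pairwise_distances`, `normalized`, `geometry_mismatch`) and the choice of normalized pairwise Euclidean distances are assumptions made for illustration, not the authors' formulation.

```python
import math

def pairwise_distances(features):
    """N x N matrix of Euclidean distances between N local feature vectors."""
    n = len(features)
    return [[math.dist(features[i], features[j]) for j in range(n)]
            for i in range(n)]

def normalized(dmat):
    """Divide by the total distance so the comparison ignores global scale.
    Assumes at least two distinct feature vectors (total > 0)."""
    total = sum(sum(row) for row in dmat)
    return [[d / total for d in row] for row in dmat]

def geometry_mismatch(student_feats, teacher_feats):
    """Mean squared difference between the two normalized distance matrices.
    Only the number of features must match, not their dimensionality."""
    ds = normalized(pairwise_distances(student_feats))
    dt = normalized(pairwise_distances(teacher_feats))
    n = len(ds)
    return sum((ds[i][j] - dt[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

# Hypothetical toy data: 3-D "instructor" features and 2-D "student" features.
teacher = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
student_same_shape = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
student_other_shape = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]

loss_same = geometry_mismatch(student_same_shape, teacher)
loss_other = geometry_mismatch(student_other_shape, teacher)
```

Because both distance matrices are N x N regardless of feature dimension, a term of this kind can be computed between layers of different width, which is the property the abstract emphasizes.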

Ensemble Convolutional Neural Networks With Knowledge Transfer for Leather Defect Classification in Industrial Settings

IEEE Access ◽

10.1109/access.2020.3034731 ◽

2020 ◽

Vol 8 ◽

pp. 198600-198614

Author(s):

Masood Aslam ◽

Tariq M. Khan ◽

Syed Saud Naqvi ◽

Geoff Holmes ◽

Rafea Naffa

Keyword(s):

Neural Networks ◽

Knowledge Transfer ◽

Convolutional Neural Networks ◽

Defect Classification ◽

Industrial Settings

Inter-Class Angular Loss for Convolutional Neural Networks

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013894 ◽

2019 ◽

Vol 33 ◽

pp. 3894-3901 ◽

Cited By ~ 1

Author(s):

Le Hui ◽

Xiang Li ◽

Chen Gong ◽

Meng Fang ◽

Joey Tianyi Zhou ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Deep Neural Networks ◽

Learning Difficulties ◽

Feature Space ◽

Superior Performance ◽

Strongly Correlated ◽

Discriminative Ability ◽

Practical Applications ◽

Classification Tasks

Convolutional Neural Networks (CNNs) have shown great power in various classification tasks and have achieved remarkable results in practical applications. However, the distinct learning difficulties in discriminating different pairs of classes are largely ignored by the existing networks. For instance, in the CIFAR-10 dataset, distinguishing cats from dogs is usually harder than distinguishing horses from ships. By carefully studying the behavior of CNN models in the training process, we observe that the confusion level of two classes is strongly correlated with their angular separability in the feature space. That is, the larger the inter-class angle is, the lower the confusion will be. Based on this observation, we propose a novel loss function dubbed “Inter-Class Angular Loss” (ICAL), which explicitly models the class correlation and can be directly applied to many existing deep networks. By minimizing the proposed ICAL, the networks can effectively discriminate the examples in similar classes by enlarging the angle between their corresponding class vectors. Thorough experimental results on a series of vision and nonvision datasets confirm that ICAL critically improves the discriminative ability of various representative deep neural networks and generates superior performance to the original networks with conventional softmax loss.
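To make the notion of an inter-class angle concrete, here is a small sketch that measures the angle between class vectors. The class vectors, their values, and the helper names are hypothetical; the abstract does not define ICAL itself, so this shows only the quantity it refers to, not the loss.

```python
import math

def angle_degrees(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos))

def pairwise_class_angles(class_vectors):
    """Angle for every unordered pair of classes, keyed by (name_a, name_b)."""
    names = sorted(class_vectors)
    return {(a, b): angle_degrees(class_vectors[a], class_vectors[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

# Hypothetical 2-D class vectors, chosen only for illustration.
class_vectors = {"cat": [1.0, 0.0], "dog": [1.0, 1.0], "ship": [-1.0, 0.0]}
angles = pairwise_class_angles(class_vectors)

# The pair with the smallest angle is the one expected to be confused most.
most_confusable = min(angles, key=angles.get)
```

In this toy setup the cat and dog vectors are closest in angle, mirroring the abstract's observation that a smaller inter-class angle goes with higher confusion.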

Effective training of convolutional neural networks for age estimation based on knowledge distillation

Neural Computing and Applications ◽

10.1007/s00521-021-05981-0 ◽

2021 ◽

Author(s):

Antonio Greco ◽

Alessia Saggese ◽

Mario Vento ◽

Vincenzo Vigilante

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Age Estimation ◽

Resource Constraints ◽

Training Procedure ◽

Face Images ◽

Effective Training ◽

Knowledge Distillation ◽

Student Models ◽

Teacher Model

Age estimation from face images can be profitably employed in several applications, ranging from digital signage to social robotics, from business intelligence to access control. Only in recent years has the advent of deep learning allowed for the design of extremely accurate methods based on convolutional neural networks (CNNs) that achieve remarkable performance in various face analysis tasks. However, these networks are not always applicable in real scenarios, due to both time and resource constraints that the most accurate approaches often do not meet. Moreover, in the case of age estimation, there is a lack of a large and reliably annotated dataset for training deep neural networks. Within this context, we propose in this paper an effective training procedure of CNNs for age estimation based on knowledge distillation, allowing smaller and simpler “student” models to be trained to match the predictions of a larger “teacher” model. We experimentally show that such student models are able to almost reach the performance of the teacher, obtaining high accuracy over the LFW+, LAP 2016 and Adience datasets, while being up to 15 times faster. Furthermore, we evaluate the performance of the student models in the presence of image corruptions, and we demonstrate that some of them are even more resilient to these corruptions than the teacher model.
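The abstract does not state which distillation objective is used. As a generic illustration of training a student to match a teacher's predictions, the sketch below uses the widely used soft-target formulation: a temperature-softened KL term plus a cross-entropy term. The function names and the `temperature` and `alpha` defaults are assumptions, not values from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard

# Hypothetical logits for a 3-class problem; the true class index is 0.
teacher_logits = [2.0, 1.0, 0.0]
loss_matching = distillation_loss([2.0, 1.0, 0.0], teacher_logits, label=0)
loss_mismatched = distillation_loss([0.0, 1.0, 2.0], teacher_logits, label=0)
soft_only = distillation_loss([2.0, 1.0, 0.0], teacher_logits, label=0,
                              alpha=1.0)
```

With `alpha=1.0` only the soft-target term remains, so a student whose logits equal the teacher's incurs zero loss.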

Transfer of Learning in the Convolutional Neural Networks on Classifying Geometric Shapes Based on Local or Global Invariants

Frontiers in Computational Neuroscience ◽

10.3389/fncom.2021.637144 ◽

2021 ◽

Vol 15 ◽

Author(s):

Yufeng Zheng ◽

Jun Huang ◽

Tianwen Chen ◽

Yang Ou ◽

Wu Zhou

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Visual Information ◽

Transfer Of Learning ◽

Local Features ◽

Global Features ◽

Geometric Shapes ◽

Robust Learning ◽

Global Invariants

Convolutional neural networks (CNNs) are a powerful tool for image classification that has been widely adopted in applications of automated scene segmentation and identification. However, the mechanisms underlying CNN image classification remain to be elucidated. In this study, we developed a new approach to address this issue by investigating transfer of learning in representative CNNs (AlexNet, VGG, ResNet-101, and Inception-ResNet-v2) on classifying geometric shapes based on local/global features or invariants. While the local features are based on simple components, such as the orientation of a line segment or whether two lines are parallel, the global features are based on the whole object, such as whether an object has a hole or whether an object is inside another object. Six experiments were conducted to test two hypotheses on CNN shape classification. The first hypothesis is that transfer of learning based on local features is higher than transfer of learning based on global features. The second hypothesis is that the CNNs with more layers and advanced architectures have higher transfer of learning based on global features. The first two experiments examined how the CNNs transferred learning of discriminating local features (square, rectangle, trapezoid, and parallelogram). The other four experiments examined how the CNNs transferred learning of discriminating global features (presence of a hole, connectivity, and inside/outside relationship). While the CNNs exhibited robust learning on classifying shapes, transfer of learning varied from task to task, and model to model. The results rejected both hypotheses. First, some CNNs exhibited lower transfer of learning based on local features than that based on global features. Second, the advanced CNNs exhibited lower transfer of learning on global features than that of the earlier models.
Among the tested geometric features, we found that learning of discriminating inside/outside relationship was the most difficult to transfer, indicating an effective benchmark to develop future CNNs. In contrast to the “ImageNet” approach that employs natural images to train and analyze the CNNs, the results show proof of concept for the “ShapeNet” approach that employs well-defined geometric shapes to elucidate the strengths and limitations of the computation in CNN image classification. This “ShapeNet” approach will also provide insights into understanding visual information processing in the primate visual systems.

Evaluation of Local Features Using Convolutional Neural Networks for Person Re-Identification

Lecture Notes in Electrical Engineering - Communications, Signal Processing, and Systems ◽

10.1007/978-981-13-6504-1_107 ◽

2019 ◽

pp. 890-897

Author(s):

Shuang Liu ◽

Xiaolong Hao ◽

Zhong Zhang ◽

Mingzhu Shi

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Local Features

Redundancy-Aware Pruning of Convolutional Neural Networks

Neural Computation ◽

10.1162/neco_a_01330 ◽

2020 ◽

Vol 32 (12) ◽

pp. 2532-2556

Author(s):

Guotian Xie

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Orthogonal Transformation ◽

Feature Space ◽

Intermediate Space ◽

Pruning Method ◽

Input And Output ◽

Speed Up ◽

Original Feature

Pruning is an effective way to slim and speed up convolutional neural networks. Generally, previous work directly pruned neural networks in the original feature space without considering the correlation of neurons. We argue that such a way of pruning still keeps some redundancy in the pruned networks. In this letter, we propose to prune in the intermediate space in which the correlation of neurons is eliminated. To achieve this goal, the input and output of a convolutional layer are first mapped to an intermediate space by orthogonal transformation. Then neurons are evaluated and pruned in the intermediate space. Extensive experiments have shown that our redundancy-aware pruning method surpasses state-of-the-art pruning methods on both efficiency and accuracy. Notably, using our redundancy-aware pruning method, ResNet models with three times the speed-up could achieve competitive performance with fewer floating point operations per second even compared to DenseNet.
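The abstract does not describe how the orthogonal transformation is built or how neurons are scored. As a toy illustration of why decorrelation exposes redundancy, the sketch below rotates two strongly correlated hypothetical channels so that their covariance vanishes. Using variance as an importance score afterwards is an assumption for this example, not the paper's criterion.

```python
import math

def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def decorrelate_two_channels(xs, ys):
    """Rotate (x, y) by the angle that diagonalizes their 2x2 covariance.
    A rotation is an orthogonal transformation."""
    a, c, b = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    us = [cos_t * x + sin_t * y for x, y in zip(xs, ys)]
    vs = [-sin_t * x + cos_t * y for x, y in zip(xs, ys)]
    return us, vs

# Hypothetical activations of two nearly redundant channels.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8]

us, vs = decorrelate_two_channels(xs, ys)
variances = (covariance(us, us), covariance(vs, vs))
residual_correlation = covariance(us, vs)
```

In the original space both channels carry substantial variance and neither looks dispensable. After the rotation almost all of the variance sits in the first component, so the second becomes an obvious pruning candidate under the assumed variance criterion.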

Local features and global shape information in object classification by deep convolutional neural networks

Vision Research ◽

10.1016/j.visres.2020.04.003 ◽

2020 ◽

Vol 172 ◽

pp. 46-61 ◽

Cited By ~ 1

Author(s):

Nicholas Baker ◽

Hongjing Lu ◽

Gennady Erlikhman ◽

Philip J. Kellman

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Object Classification ◽

Local Features ◽

Deep Convolutional Neural Networks ◽

Shape Information ◽

Global Shape

Inferring the Disease-Associated miRNAs Based on Network Representation Learning and Convolutional Neural Networks

International Journal of Molecular Sciences ◽

10.3390/ijms20153648 ◽

2019 ◽

Vol 20 (15) ◽

pp. 3648 ◽

Cited By ~ 9

Author(s):

Xuan ◽

Sun ◽

Wang ◽

Zhang ◽

Pan

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Prediction Models ◽

Prediction Method ◽

Feature Space ◽

Representation Learning ◽

Superior Performance ◽

Network Representation ◽

Disease Associations ◽

Low Dimensional

Identification of disease-associated miRNAs (disease miRNAs) is critical for understanding etiology and pathogenesis. Most previous methods focus on integrating similarities and associating information contained in heterogeneous miRNA-disease networks. However, these methods establish only shallow prediction models that fail to capture complex relationships among miRNA similarities, disease similarities, and miRNA-disease associations. We propose a prediction method on the basis of network representation learning and convolutional neural networks to predict disease miRNAs, called CNNMDA. CNNMDA deeply integrates the similarity information of miRNAs and diseases, miRNA-disease associations, and representations of miRNAs and diseases in low-dimensional feature space. The new framework based on deep learning was built to learn the original and global representation of a miRNA-disease pair. First, diverse biological premises about miRNAs and diseases were combined to construct the embedding layer in the left part of the framework, from a biological perspective. Second, the various connection edges in the miRNA-disease network, such as similarity and association connections, were dependent on each other. Therefore, it was necessary to learn the low-dimensional representations of the miRNA and disease nodes based on the entire network. The right part of the framework learnt the low-dimensional representation of each miRNA and disease node based on non-negative matrix factorization, and these representations were used to establish the corresponding embedding layer. Finally, the left and right embedding layers went through convolutional modules to deeply learn the complex and non-linear relationships among the similarities and associations between miRNAs and diseases. Experimental results based on cross validation indicated that CNNMDA yields superior performance compared to several state-of-the-art methods.
Furthermore, case studies on lung, breast, and pancreatic neoplasms demonstrated the powerful ability of CNNMDA to discover potential disease miRNAs.

Knowledge Transfer via Decomposing Essential Information in Convolutional Neural Networks

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2020.3027837 ◽

2020 ◽

pp. 1-12

Author(s):

Seunghyun Lee ◽

Byung Cheol Song

Keyword(s):

Neural Networks ◽

Knowledge Transfer ◽

Convolutional Neural Networks ◽

Essential Information

Automatic Identification of Local Features Representing Image Content with the Use of Convolutional Neural Networks

Applied Sciences ◽

10.3390/app10155186 ◽

2020 ◽

Vol 10 (15) ◽

pp. 5186

Author(s):

Paweł Tarasiuk ◽

Arkadiusz Tomczyk ◽

Bartłomiej Stasiak

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Sparse Matrices ◽

Local Features ◽

Automatic Identification ◽

Image Content ◽

Data Set ◽

Practical Applications ◽

A Value ◽

Working Principle

Image analysis has many practical applications and proper representation of image content is its crucial element. In this work, a novel type of representation is proposed where an image is reduced to a set of highly sparse matrices. Equivalently, it can be viewed as a set of local features of different types, as precise coordinates of detected keypoints are given. Additionally, every keypoint has a value expressing feature intensity at a given location. These features are extracted from a dedicated convolutional neural network autoencoder. This kind of representation has many advantages. First of all, local features are not manually designed but are automatically trained for a given class of images. Second, as they are trained in a network that restores its input on the output, they may be expected to minimize information loss. Consequently, they can be used to solve similar tasks replacing original images; such an ability was illustrated with an image classification task. Third, the generated features, although automatically synthesized, are relatively easy to interpret. Taking a decoder part of our network, one can easily generate a visual building block connected with a specific feature. As the proposed method is entirely new, a detailed analysis of its properties for a relatively simple data set was conducted and is described in this work. Moreover, to present the quality of the trained features, they are compared with the results of convolutional neural networks having a similar working principle (sparse coding).
