Estimation of Pedestrian Pose Orientation Using Soft Target Training Based on Teacher–Student Framework

DuYeong Heo; Jae Nam; Byoung Ko

doi:10.3390/s19051147

Sensors ◽

10.3390/s19051147 ◽

2019 ◽

Vol 19 (5) ◽

pp. 1147 ◽

Cited By ~ 1

Author(s):

DuYeong Heo ◽

Jae Nam ◽

Byoung Ko

Keyword(s):

Supervised Learning ◽

Spatial Information ◽

Classification Performance ◽

Input Image ◽

Student Model ◽

Teacher Student ◽

Specific Shape ◽

Soft Target ◽

Target Data ◽

Teacher Model

Models trained with semi-supervised learning are known to generalise better than models learned solely from labelled data. We therefore propose a new method for estimating pedestrian pose orientation using a soft-target method, which is a type of semi-supervised learning. Because pose orientation estimation based on a convolutional neural network (CNN) requires large numbers of parameters and operations, we apply the teacher–student algorithm, combining a deep network with a random forest, to generate a compressed student model that is compact while achieving accuracy resembling that of the teacher model. After the teacher model is trained using hard-target data, its softened outputs (soft-target data) are used to train the student model. Moreover, because the orientation of a pedestrian exhibits specific shape patterns, a wavelet transform is applied to the input image as a pre-processing step, owing to its good spatial-frequency localisation and its ability to preserve both the spatial and gradient information of an image. As benchmarks reflecting real driving situations captured with a single camera, we used the TUD and KITTI datasets. We applied the proposed algorithm to various driving images in these datasets, and the results indicate that its pose orientation classification performance is better than that of other state-of-the-art CNN-based methods. In addition, the proposed student model runs faster than other deep CNNs owing to its shorter model structure with fewer parameters.
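
The soft-target step described in this abstract can be illustrated with the generic temperature-scaled softmax commonly used in teacher–student training. This is a minimal sketch, not the authors' implementation: the logits, the number of orientation classes (8) and the temperature value are all hypothetical, and the random-forest component of the student is not modelled.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer targets."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

# Hypothetical logits for 8 orientation classes (assumed values, not from the paper).
teacher_logits = [4.0, 2.5, 0.5, -1.0, -2.0, -1.5, 0.0, 2.0]
student_logits = [3.0, 2.0, 1.0, -0.5, -1.0, -1.0, 0.5, 1.5]
T = 4.0  # assumed temperature

hard_like = softmax(teacher_logits, temperature=1.0)   # sharply peaked teacher output
soft_targets = softmax(teacher_logits, temperature=T)  # softened teacher output
student_probs = softmax(student_logits, temperature=T)

# The student is trained to match the teacher's softened distribution.
loss = cross_entropy(soft_targets, student_probs)
```

Raising the temperature flattens the teacher's output, so the student also sees how the teacher ranks the non-winning classes rather than only the top label.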

Novel Model Based on Stacked Autoencoders with Sample-Wise Strategy for Fault Diagnosis

Mathematical Problems in Engineering ◽

10.1155/2019/8985657 ◽

2019 ◽

Vol 2019 ◽

pp. 1-10

Author(s):

Diehao Kong ◽

Xuefeng Yan

Keyword(s):

Fault Diagnosis ◽

Chemical Engineering ◽

Ground Truth ◽

Student Model ◽

Teacher Student ◽

Stacked Autoencoders ◽

Knowledge Distillation ◽

New Perspective ◽

Current Student ◽

Teacher Model

Autoencoders are used for fault diagnosis in chemical engineering. To improve their performance, experts have paid close attention to regularization strategies and to the creation of new and effective cost functions. However, existing methods modify only a single model. This study provides a new perspective for strengthening the fault diagnosis model: it attempts to gain useful information from one model (the teacher model) and applies it to a new model (the student model). The method pretrains the teacher model by fitting ground truth labels and then uses a sample-wise strategy to transfer knowledge from the teacher model. The transferred knowledge and the ground truth labels are then used to train the student model, which is identical to the teacher model in structure. The current student model is then used as the teacher of the next student model. After step-by-step teacher-student reconfiguration and training, the optimal model is selected for fault diagnosis. In addition, knowledge distillation is applied in the training procedures. The proposed method is applied to several benchmark problems to demonstrate its effectiveness.

Object Detection in Densely Packed Scenes via Semi-Supervised Learning with Dual Consistency

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/172 ◽

2021 ◽

Author(s):

Chao Ye ◽

Huaidong Zhang ◽

Xuemiao Xu ◽

Weiwei Cai ◽

Jing Qin ◽

...

Keyword(s):

Object Detection ◽

Deep Neural Networks ◽

State Of The Art ◽

Student Model ◽

Training Process ◽

Teacher Student ◽

Public Dataset ◽

Dual Consistency ◽

Bounding Boxes ◽

Teacher Model

Deep neural networks have been shown to be very powerful tools for object detection in various scenes. Their remarkable performance, however, depends heavily on the availability of large amounts of high-quality labeled data, which are time-consuming and costly to acquire for scenes with densely packed objects. We present a novel semi-supervised approach to this problem, built on a common teacher-student model and integrated with a novel intersection-over-union (IoU) aware consistency loss and a new proposal consistency loss. The IoU-aware consistency loss evaluates the IoU over the prediction pairs of the teacher model and the student model, which forces the predictions of the student model to closely approach those of the teacher model. The IoU-aware consistency loss also reweights the importance of different prediction pairs to suppress low-confidence pairs. The proposal consistency loss ensures proposal consistency between the two models, making it possible to involve the region proposal network in the training process with unlabeled data. We also construct a new dataset, RebarDSC, containing 2,125 rebar images annotated with 350,348 bounding boxes in total (164.9 annotations per image on average), to evaluate the proposed method. Extensive experiments are conducted on both the RebarDSC dataset and the well-known large public dataset SKU-110K. Experimental results corroborate that the proposed method improves object detection performance in densely packed scenes, consistently outperforming state-of-the-art approaches. The dataset is available at https://github.com/Armin1337/RebarDSC.
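
The IoU measure that the consistency loss above builds on can be sketched as follows. The `iou` function is the standard definition for axis-aligned boxes. The `iou_consistency_loss` function is only a hypothetical illustration of weighting prediction pairs by confidence; the abstract does not give the paper's actual loss, and the box coordinates and scores are made up.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def iou_consistency_loss(teacher_boxes, student_boxes, teacher_scores):
    """Hypothetical loss: confidence-weighted mean of (1 - IoU) over matched pairs.

    Pairs with a low teacher score contribute little, which mimics the idea of
    suppressing low-confidence pairs. This is an assumption, not the paper's formula.
    """
    weighted = sum(
        score * (1.0 - iou(t_box, s_box))
        for t_box, s_box, score in zip(teacher_boxes, student_boxes, teacher_scores)
    )
    total = sum(teacher_scores)
    return weighted / total if total > 0 else 0.0

# Made-up matched predictions from a teacher and a student detector.
teacher_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
student_boxes = [(5, 0, 15, 10), (20, 20, 30, 30)]
teacher_scores = [0.9, 0.1]

pair_iou = iou(teacher_boxes[0], student_boxes[0])  # half-overlapping boxes
loss = iou_consistency_loss(teacher_boxes, student_boxes, teacher_scores)
```

The loss reaches zero only when every student box coincides with its teacher counterpart, so minimising it pulls the student's predictions toward the teacher's.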

A Stochastic Model for Block Segmentation of Images Based on the Quadtree and the Bayes Code for It

Entropy ◽

10.3390/e23080991 ◽

2021 ◽

Vol 23 (8) ◽

pp. 991

Author(s):

Yuta Nakahara ◽

Toshiyasu Matsushima

Keyword(s):

Computational Cost ◽

Block Size ◽

Input Image ◽

Generative Model ◽

Image Size ◽

Variable Block ◽

General Data ◽

The Difference ◽

Segmentation Of Images ◽

Target Data

In information theory, lossless compression of general data is based on an explicit assumption of a stochastic generative model for the target data. In lossless image compression, however, researchers have mainly focused on the coding procedure that outputs the coded sequence from the input image, and the assumption of the stochastic generative model is implicit. In such studies, it is difficult to discuss the difference between the expected code length and the entropy of the stochastic generative model. We resolve this difficulty for a class of images that are non-stationary across segments. In this paper, we propose a novel stochastic generative model of images by redefining the implicit stochastic generative model in a previous coding procedure. Our model is based on the quadtree, so it effectively represents the variable block size segmentation of images. We then construct the Bayes code that is optimal for the proposed stochastic generative model. It requires a summation over all possible quadtrees weighted by their posterior probabilities. In general, the computational cost of this summation increases exponentially with the image size. However, we introduce an efficient algorithm that calculates it in polynomial order of the image size without loss of optimality. As a result, the derived algorithm has a better average coding rate than JBIG.
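
To see why a naive summation over all quadtrees is costly, one can count the quadtrees themselves. The sketch below is an illustration only, not the authors' algorithm. It assumes a square image whose side is a power of two and counts the segmentations in which each block is either kept whole or split into four equal sub-blocks, down to a maximum depth.

```python
def count_quadtrees(max_depth):
    """Number of distinct quadtrees whose leaves lie at depth <= max_depth.

    A node is either a leaf (1 way) or splits into 4 independent subtrees,
    giving the recurrence N(0) = 1 and N(d) = 1 + N(d - 1) ** 4.
    """
    count = 1
    for _ in range(max_depth):
        count = 1 + count ** 4
    return count

# Depth d corresponds to splitting a (2**d x 2**d)-pixel image down to single pixels.
counts = [count_quadtrees(d) for d in range(5)]
```

The count goes from 17 at depth 2 to 83,522 at depth 3 and exceeds 10^19 at depth 4, which is why enumerating every quadtree quickly becomes infeasible as the image grows.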

Consistency regularization teacher–student semi-supervised learning method for target recognition in SAR images

The Visual Computer ◽

10.1007/s00371-021-02287-z ◽

2021 ◽

Author(s):

Ye Tian ◽

Liguo Zhang ◽

Jianguo Sun ◽

Guisheng Yin ◽

Yuxin Dong

Keyword(s):

Supervised Learning ◽

Target Recognition ◽

Learning Method ◽

SAR Images ◽

Teacher Student

Attention-based deep learning networks for identification of human gait using radar micro-Doppler spectrograms

International Journal of Microwave and Wireless Technologies ◽

10.1017/s1759078721000830 ◽

2021 ◽

pp. 1-6

Author(s):

Hannah Garcia Doherty ◽

Roberto Arnaiz Burgueño ◽

Roeland P. Trommel ◽

Vasileios Papanastasiou ◽

Ronny I. A. Harmanny

Keyword(s):

Neural Networks ◽

Feature Vector ◽

Classification Performance ◽

Input Image ◽

Human Gait ◽

Learning Networks ◽

Class Label ◽

Deep Convolutional Neural Networks ◽

Network Layers ◽

Feature Dimension

Identification of human individuals within a group of 39 persons using micro-Doppler (μ-D) features has been investigated. Deep convolutional neural networks with two different training procedures have been used to perform classification. Visualization of the inner network layers revealed the sections of the input image that are most relevant when determining the class label of the target. A convolutional block attention module is added to provide a weighted feature vector in the channel and feature dimensions, highlighting the relevant μ-D feature-filled areas in the image and improving classification performance.

A semi-supervised learning detection method for vision-based monitoring of construction sites by integrating teacher-student networks and data augmentation

Advanced Engineering Informatics ◽

10.1016/j.aei.2021.101372 ◽

2021 ◽

Vol 50 ◽

pp. 101372

Author(s):

Bo Xiao ◽

Yuxuan Zhang ◽

Yuan Chen ◽

Xianfei Yin

Keyword(s):

Supervised Learning ◽

Data Augmentation ◽

Detection Method ◽

Construction Sites ◽

Teacher Student

Semi-Supervised Classification and its Application to Filtering IDS False Positives

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.427-429.2309 ◽

2013 ◽

Vol 427-429 ◽

pp. 2309-2312

Author(s):

Hai Bin Mei ◽

Ming Hua Zhang

Keyword(s):

Supervised Learning ◽

Supervised Classification ◽

Classification Performance ◽

False Positives ◽

Training Data ◽

Classification Model ◽

Classification Technique

Alert classifiers built with supervised classification techniques require large amounts of labeled training alerts. Preparing such training data is very difficult and expensive, which greatly restricts the accuracy and feasibility of current classifiers. This paper employs semi-supervised learning to build an alert classification model that reduces the number of labeled training alerts needed. Alert context properties are also introduced to improve the classification performance. Experiments have demonstrated the accuracy and feasibility of our approach.

The Use of Artificial-Intelligence-Based Ensembles for Intrusion Detection: A Review

Applied Computational Intelligence and Soft Computing ◽

10.1155/2012/850160 ◽

2012 ◽

Vol 2012 ◽

pp. 1-20 ◽

Cited By ~ 15

Author(s):

Gulshan Kumar ◽

Krishan Kumar

Keyword(s):

Artificial Intelligence ◽

Intrusion Detection ◽

Supervised Learning ◽

Ensemble Learning ◽

Classification Performance ◽

Intrusion Detection Systems ◽

Future Directions ◽

Comprehensive Review ◽

Detection Systems ◽

Combination Methods

In supervised learning-based classification, ensembles have been successfully applied to different application domains. In the literature, many researchers have proposed different ensembles by considering different combination methods, training datasets, base classifiers, and many other factors. Artificial-intelligence (AI) based techniques play a prominent role in the development of ensembles for intrusion detection (ID) and have many benefits over other techniques. However, there is no comprehensive review of ensembles in general, and of AI-based ensembles for ID in particular, that examines their current research status in solving the ID problem. Here, an updated review of ensembles and their taxonomies is presented in general. The paper also presents an updated review of various AI-based ensembles for ID over the last decade. The related studies of AI-based ensembles are compared using a set of evaluation metrics derived from (1) the architecture and approach followed; (2) the different methods used in the different phases of ensemble learning; and (3) other measures used to evaluate the classification performance of the ensembles. The paper also provides future directions for research in this area. The paper will aid understanding of the directions that ensemble research has taken, both in general and specifically in the field of intrusion detection systems (IDSs).
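
As one concrete instance of the "combination methods" this abstract mentions, the sketch below shows plain majority voting over the outputs of several base classifiers. It is an illustration only: the abstract does not single out this method, and the labels and classifier outputs are made up.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label predicted by the largest number of base classifiers."""
    label, _ = Counter(votes).most_common(1)[0]
    return label

def ensemble_predict(per_classifier_labels):
    """Combine predictions sample by sample.

    per_classifier_labels holds one list of labels per base classifier,
    all of the same length (one label per sample).
    """
    return [majority_vote(votes) for votes in zip(*per_classifier_labels)]

# Made-up outputs of three base classifiers on four network events.
base_outputs = [
    ["normal", "attack", "attack", "normal"],
    ["normal", "attack", "normal", "normal"],
    ["attack", "attack", "attack", "normal"],
]
combined = ensemble_predict(base_outputs)
```

With an odd number of classifiers and two classes there are no ties. Other settings would need an explicit tie-breaking rule.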

Teacher/Student Deep Semi-Supervised Learning for Training with Noisy Labels

2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) ◽

10.1109/icmla.2018.00147 ◽

2018 ◽

Author(s):

Zeyad Hailat ◽

Xue-Wen Chen

Keyword(s):

Supervised Learning ◽

Teacher Student ◽

Noisy Labels

Metric Embedding Learning on Multi-Directional Projections

Algorithms ◽

10.3390/a13060133 ◽

2020 ◽

Vol 13 (6) ◽

pp. 133 ◽

Cited By ~ 2

Author(s):

Gábor Kertész

Keyword(s):

Feature Extraction ◽

High Performance ◽

Classification Performance ◽

Difficult Problem ◽

Input Image ◽

Image Size ◽

Matching Problems ◽

Low Efficiency ◽

Multi Class Classification ◽

Directional Image

Image-based instance recognition is a difficult problem, in some cases even for the human eye. While the latest developments in computer vision, mostly driven by deep learning, have shown that high-performance models for classification or categorization can be engineered, the problem of discriminating similar objects with a low number of samples remains challenging. Advances from multi-class classification are applied to object matching problems, as the feature extraction techniques are the same: nature-inspired multi-layered convolutional nets learn the representations, and the output of such a model maps them to a multidimensional encoding space. A metric-based loss brings embeddings of the same instance close to each other. While these solutions achieve high classification performance, their efficiency is low because of the memory cost of the large number of parameters, which is related to the input image size. Shrinking the input means the model requires fewer trainable parameters, but performance decreases. This drawback is tackled by using compressed feature extraction, e.g., projections. In this paper, a multi-directional image projection transformation with fixed vector lengths (MDIPFL) is applied to one-shot recognition tasks, trained on Siamese and Triplet architectures. Results show that the MDIPFL-based approach achieves decent performance despite the significantly lower number of parameters.
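
The metric-based loss mentioned above can be illustrated with the widely used margin form of the triplet loss. This is a minimal sketch under assumptions: the abstract names Triplet architectures but does not state the exact loss, the margin and the 2-D embeddings are hypothetical, and some formulations use squared distances instead.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings: same-instance pair (anchor, positive), other instance (negative).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
far_negative = [3.0, 4.0]   # distance 5 from the anchor: constraint already satisfied
near_negative = [1.0, 0.0]  # distance 1 from the anchor: constraint violated

easy_loss = triplet_loss(anchor, positive, far_negative)
hard_loss = triplet_loss(anchor, positive, near_negative)
```

Only triplets that violate the margin produce a non-zero loss, so training effort concentrates on instances that the embedding still confuses.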
