scholarly journals Differential Evolution Based Layer-Wise Weight Pruning for Compressing Deep Neural Networks

Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 880
Author(s):  
Tao Wu ◽  
Xiaoyang Li ◽  
Deyun Zhou ◽  
Na Li ◽  
Jiao Shi

Deep neural networks have evolved significantly in the past decades and are now able to achieve better progression of sensor data. Nonetheless, most of the deep models verify the ruling maxim in deep learning—bigger is better—so they have very complex structures. As the models become more complex, the computational complexity and resource consumption of these deep models are increasing significantly, making them difficult to perform on resource-limited platforms, such as sensor platforms. In this paper, we observe that different layers often have different pruning requirements, and propose a differential evolutionary layer-wise weight pruning method. Firstly, the pruning sensitivity of each layer is analyzed, and then the network is compressed by iterating the weight pruning process. Unlike some other methods that deal with pruning ratio by greedy ways or statistical analysis, we establish an optimization model to find the optimal pruning sensitivity set for each layer. Differential evolution is an effective method based on population optimization which can be used to address this task. Furthermore, we adopt a strategy to recovery some of the removed connections to increase the capacity of the pruned model during the fine-tuning phase. The effectiveness of our method has been demonstrated in experimental studies. Our method compresses the number of weight parameters in LeNet-300-100, LeNet-5, AlexNet and VGG16 by 24×, 14×, 29× and 12×, respectively.

Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1685
Author(s):  
Sakorn Mekruksavanich ◽  
Anuchit Jitpattanakul

Sensor-based human activity recognition (S-HAR) has become an important and high-impact topic of research within human-centered computing. In the last decade, successful applications of S-HAR have been presented through fruitful academic research and industrial applications, including for healthcare monitoring, smart home controlling, and daily sport tracking. However, the growing requirements of many current applications for recognizing complex human activities (CHA) have begun to attract the attention of the HAR research field when compared with simple human activities (SHA). S-HAR has shown that deep learning (DL), a type of machine learning based on complicated artificial neural networks, has a significant degree of recognition efficiency. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are two different types of DL methods that have been successfully applied to the S-HAR challenge in recent years. In this paper, we focused on four RNN-based DL models (LSTMs, BiLSTMs, GRUs, and BiGRUs) that performed complex activity recognition tasks. The efficiency of four hybrid DL models that combine convolutional layers with the efficient RNN-based models was also studied. Experimental studies on the UTwente dataset demonstrated that the suggested hybrid RNN-based models achieved a high level of recognition performance along with a variety of performance indicators, including accuracy, F1-score, and confusion matrix. The experimental results show that the hybrid DL model called CNN-BiGRU outperformed the other DL models with a high accuracy of 98.89% when using only complex activity data. Moreover, the CNN-BiGRU model also achieved the highest recognition performance in other scenarios (99.44% by using only simple activity data and 98.78% with a combination of simple and complex activities).


Author(s):  
Chen Qi ◽  
Shibo Shen ◽  
Rongpeng Li ◽  
Zhifeng Zhao ◽  
Qing Liu ◽  
...  

AbstractNowadays, deep neural networks (DNNs) have been rapidly deployed to realize a number of functionalities like sensing, imaging, classification, recognition, etc. However, the computational-intensive requirement of DNNs makes it difficult to be applicable for resource-limited Internet of Things (IoT) devices. In this paper, we propose a novel pruning-based paradigm that aims to reduce the computational cost of DNNs, by uncovering a more compact structure and learning the effective weights therein, on the basis of not compromising the expressive capability of DNNs. In particular, our algorithm can achieve efficient end-to-end training that transfers a redundant neural network to a compact one with a specifically targeted compression rate directly. We comprehensively evaluate our approach on various representative benchmark datasets and compared with typical advanced convolutional neural network (CNN) architectures. The experimental results verify the superior performance and robust effectiveness of our scheme. For example, when pruning VGG on CIFAR-10, our proposed scheme is able to significantly reduce its FLOPs (floating-point operations) and number of parameters with a proportion of 76.2% and 94.1%, respectively, while still maintaining a satisfactory accuracy. To sum up, our scheme could facilitate the integration of DNNs into the common machine-learning-based IoT framework and establish distributed training of neural networks in both cloud and edge.


Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1511
Author(s):  
Taylor Simons ◽  
Dah-Jye Lee

There has been a recent surge in publications related to binarized neural networks (BNNs), which use binary values to represent both the weights and activations in deep neural networks (DNNs). Due to the bitwise nature of BNNs, there have been many efforts to implement BNNs on ASICs and FPGAs. While BNNs are excellent candidates for these kinds of resource-limited systems, most implementations still require very large FPGAs or CPU-FPGA co-processing systems. Our work focuses on reducing the computational cost of BNNs even further, making them more efficient to implement on FPGAs. We target embedded visual inspection tasks, like quality inspection sorting on manufactured parts and agricultural produce sorting. We propose a new binarized convolutional layer, called the neural jet features layer, that learns well-known classic computer vision kernels that are efficient to calculate as a group. We show that on visual inspection tasks, neural jet features perform comparably to standard BNN convolutional layers while using less computational resources. We also show that neural jet features tend to be more stable than BNN convolution layers when training small models.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Fuyong Xing ◽  
Yuanpu Xie ◽  
Xiaoshuang Shi ◽  
Pingjun Chen ◽  
Zizhao Zhang ◽  
...  

Abstract Background Nucleus or cell detection is a fundamental task in microscopy image analysis and supports many other quantitative studies such as object counting, segmentation, tracking, etc. Deep neural networks are emerging as a powerful tool for biomedical image computing; in particular, convolutional neural networks have been widely applied to nucleus/cell detection in microscopy images. However, almost all models are tailored for specific datasets and their applicability to other microscopy image data remains unknown. Some existing studies casually learn and evaluate deep neural networks on multiple microscopy datasets, but there are still several critical, open questions to be addressed. Results We analyze the applicability of deep models specifically for nucleus detection across a wide variety of microscopy image data. More specifically, we present a fully convolutional network-based regression model and extensively evaluate it on large-scale digital pathology and microscopy image datasets, which consist of 23 organs (or cancer diseases) and come from multiple institutions. We demonstrate that for a specific target dataset, training with images from the same types of organs might be usually necessary for nucleus detection. Although the images can be visually similar due to the same staining technique and imaging protocol, deep models learned with images from different organs might not deliver desirable results and would require model fine-tuning to be on a par with those trained with target data. We also observe that training with a mixture of target and other/non-target data does not always mean a higher accuracy of nucleus detection, and it might require proper data manipulation during model training to achieve good performance. Conclusions We conduct a systematic case study on deep models for nucleus detection in a wide variety of microscopy images, aiming to address several important but previously understudied questions. We present and extensively evaluate an end-to-end, pixel-to-pixel fully convolutional regression network and report a few significant findings, some of which might have not been reported in previous studies. The model performance analysis and observations would be helpful to nucleus detection in microscopy images.


2021 ◽  
Author(s):  
Akinori Minagi ◽  
Hokuto Hirano ◽  
Kazuhiro Takemoto

Abstract Transfer learning from natural images is well used in deep neural networks (DNNs) for medical image classification to achieve computer-aided clinical diagnosis. Although the adversarial vulnerability of DNNs hinders practical applications owing to the high stakes of diagnosis, adversarial attacks are expected to be limited because training data — which are often required for adversarial attacks — are generally unavailable in terms of security and privacy preservation. Nevertheless, we hypothesized that adversarial attacks are also possible using natural images because pre-trained models do not change significantly after fine-tuning. We focused on three representative DNN-based medical image classification tasks (i.e., skin cancer, referable diabetic retinopathy, and pneumonia classifications) and investigated whether medical DNN models with transfer learning are vulnerable to universal adversarial perturbations (UAPs), generated using natural images. UAPs from natural images are useful for both non-targeted and targeted attacks. The performance of UAPs from natural images was significantly higher than that of random controls, although slightly lower than that of UAPs from training images. Vulnerability to UAPs from natural images was observed between different natural image datasets and between different model architectures. The use of transfer learning causes a security hole, which decreases the reliability and safety of computer-based disease diagnosis. Model training from random initialization (without transfer learning) reduced the performance of UAPs from natural images; however, it did not completely avoid vulnerability to UAPs. The vulnerability of UAPs from natural images will become a remarkable security threat.


Author(s):  
Derya Soydaner

In recent years, we have witnessed the rise of deep learning. Deep neural networks have proved their success in many areas. However, the optimization of these networks has become more difficult as neural networks going deeper and datasets becoming bigger. Therefore, more advanced optimization algorithms have been proposed over the past years. In this study, widely used optimization algorithms for deep learning are examined in detail. To this end, these algorithms called adaptive gradient methods are implemented for both supervised and unsupervised tasks. The behavior of the algorithms during training and results on four image datasets, namely, MNIST, CIFAR-10, Kaggle Flowers and Labeled Faces in the Wild are compared by pointing out their differences against basic optimization algorithms.


Author(s):  
Wen Xu ◽  
Jing He ◽  
Yanfeng Shu

Transfer learning is an emerging technique in machine learning, by which we can solve a new task with the knowledge obtained from an old task in order to address the lack of labeled data. In particular deep domain adaptation (a branch of transfer learning) gets the most attention in recently published articles. The intuition behind this is that deep neural networks usually have a large capacity to learn representation from one dataset and part of the information can be further used for a new task. In this research, we firstly present the complete scenarios of transfer learning according to the domains and tasks. Secondly, we conduct a comprehensive survey related to deep domain adaptation and categorize the recent advances into three types based on implementing approaches: fine-tuning networks, adversarial domain adaptation, and sample-reconstruction approaches. Thirdly, we discuss the details of these methods and introduce some typical real-world applications. Finally, we conclude our work and explore some potential issues to be further addressed.


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 22802-22811
Author(s):  
Zhigang Li ◽  
Jialin Wang ◽  
Di Cai ◽  
Yingqi Li ◽  
Changxin Cai ◽  
...  

Author(s):  
Aydin Ayanzadeh ◽  
Sahand Vahidnia

In this paper, we leverage state of the art models on Imagenet data-sets. We use the pre-trained model and learned weighs to extract the feature from the Dog breeds identification data-set. Afterwards, we applied fine-tuning and dataaugmentation to increase the performance of our test accuracy in classification of dog breeds datasets. The performance of the proposed approaches are compared with the state of the art models of Image-Net datasets such as ResNet-50, DenseNet-121, DenseNet-169 and GoogleNet. we achieved 89.66% , 85.37% 84.01% and 82.08% test accuracy respectively which shows thesuperior performance of proposed method to the previous works on Stanford dog breeds datasets.


Sign in / Sign up

Export Citation Format

Share Document