Revisiting Dropout: Escaping Pressure for Training Neural Networks with Multiple Costs

Electronics ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 989
Author(s):  
Sangmin Woo ◽  
Kangil Kim ◽  
Junhyug Noh ◽  
Jong-Hun Shin ◽  
Seung-Hoon Na

A common approach to jointly learning multiple tasks with a shared structure is to optimize the model over a combined landscape of multiple sub-costs. However, the gradients derived from the individual sub-costs often conflict on cost plateaus, resulting in a subpar optimum. In this work, we shed light on this gradient-conflict problem and propose a solution named Cost-Out, which randomly drops sub-costs at each iteration. We provide theoretical and empirical evidence for the escaping pressure induced by the Cost-Out mechanism. While simple, the empirical results indicate that the proposed method can enhance performance on multi-task learning problems, including two-digit image classification sampled from the MNIST dataset and machine translation between English and French, Spanish, and German on the WMT14 datasets.
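The core mechanism described in the abstract, randomly dropping sub-costs at each training iteration, can be sketched as follows. This is our own minimal illustration, not the authors' implementation; the function name and the fallback rule for the all-dropped case are assumptions.

```python
import random

def cost_out(sub_costs, drop_prob=0.5, rng=random):
    """Sketch of the Cost-Out idea: randomly drop sub-costs each iteration.

    `sub_costs` is a list of scalar sub-cost values; each one is kept with
    probability 1 - drop_prob.  If every sub-cost happens to be dropped,
    one is kept at random so the combined cost never vanishes entirely.
    """
    kept = [c for c in sub_costs if rng.random() >= drop_prob]
    if not kept:
        kept = [rng.choice(sub_costs)]
    return sum(kept)
```

Because a different random subset of sub-costs is summed at every step, the gradient direction varies across iterations, which is the "escaping pressure" the paper analyzes.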

Author(s):  
Dr. I. Jeena Jacob

The classification of text, involving the identification and categorization of text, is a tedious and challenging task. This paper uses the Capsule Network (Caps-Net) to classify text: a unique architecture able to capture the basic attributes comprising the insights of a particular field, which helps bridge the knowledge gap between source and destination tasks, and which can learn more robust representations than convolutional neural networks (CNNs) in the image classification domain. Since multi-task learning shares insights between related tasks and thereby indirectly enriches the training data, a Caps-Net-based multi-task learning framework is proposed. The proposed architecture effectively classifies text and minimizes the interference experienced among the multiple tasks in multi-task learning. The architecture is evaluated on various text classification datasets, confirming the efficacy of the proposed framework.


This chapter provides additional empirical evidence on the efficiency of cooperative banks and savings banks by applying a stochastic frontier model to estimate cost efficiency in nine countries over the period 2005 to 2011. The empirical results suggest that a higher rate of gross domestic product (GDP) growth implies an increase in the inefficiency level, while smaller cooperative and savings banks are more efficient at managing costs than larger banks.


Deep learning allows us to build powerful models to solve problems like image classification, time series prediction, natural language processing, etc. This is achieved at the cost of huge storage and processing requirements, which are sometimes not feasible on machines with limited resources. In this paper, we compare different methods that tackle this problem with network pruning. A selected few pruning methodologies from the deep learning literature were implemented to demonstrate their results. Modern neural architectures combine different layers, such as convolutional layers, pooling layers, and dense layers. We compare pruning techniques for dense layers (such as unit/neuron pruning and weight pruning) as well as for convolutional layers (using the L1 norm, a Taylor expansion of the loss to determine the importance of convolutional filters, and Variable Importance in Projection using Partial Least Squares) for the image classification task. This study aims to ease the optimization overhead for academic as well as commercial use of deep neural networks.
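As a rough illustration of one of the surveyed techniques, magnitude-based weight pruning for a dense layer can be sketched as below. This is our own sketch, not code from the paper; the function name and the return of an explicit mask are assumptions.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Returns the pruned weight matrix and the binary mask that was applied,
    so the same mask can be reapplied after any fine-tuning steps.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights)
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = (np.abs(weights) > threshold).astype(weights.dtype)
    return weights * mask, mask
```

Unit/neuron pruning works analogously but zeroes whole rows or columns at once instead of individual entries.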


Author(s):  
Tuan Hoang ◽  
Thanh-Toan Do ◽  
Tam V. Nguyen ◽  
Ngai-Man Cheung

This paper proposes two novel techniques for training deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights, but not the quantized weights. To address this issue, we propose a novel method that directly updates the quantized weights, with learnable quantization levels, to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works treat all channels equally. However, the activation quantizers can be biased toward a few channels with high variance. To address this issue, we propose a method that takes the quantization errors of individual channels into account. With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on the image classification task using AlexNet, ResNet, and MobileNetV2 architectures on the CIFAR-100 and ImageNet datasets.
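A toy numeric illustration (entirely our own, not the authors' method) of why a single shared activation quantizer can be dominated by a high-variance channel: fitting one uniform quantizer to the whole tensor leaves the low-variance channel with a disproportionately large error.

```python
import numpy as np

def quantize_uniform(x, n_bits):
    """Uniform quantizer: snap x to 2**n_bits evenly spaced levels
    spanning the overall range of x."""
    levels = 2 ** n_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    return np.round((x - lo) / scale) * scale + lo

def per_channel_error(acts, n_bits):
    """Mean squared quantization error of each channel (axis 0) when one
    shared quantizer is fit to the whole activation tensor."""
    q = quantize_uniform(acts, n_bits)
    return ((acts - q) ** 2).mean(axis=1)
```

With the quantization range dictated by the wide channel, the narrow channel's values snap to distant levels, which is the kind of imbalance a per-channel error criterion is meant to correct.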


2000 ◽  
Vol 12 (8) ◽  
pp. 1869-1887 ◽  
Author(s):  
Holger Schwenk ◽  
Yoshua Bengio

Boosting is a general method for improving the performance of learning algorithms. A recently proposed boosting algorithm, AdaBoost, has been applied with great success to several benchmark machine learning problems, using mainly decision trees as base classifiers. In this article we investigate whether AdaBoost also works as well with neural networks, and we discuss the advantages and drawbacks of different versions of the AdaBoost algorithm. In particular, we compare training methods based on sampling the training set and on weighting the cost function. The results suggest that random resampling of the training data is not the main explanation for the improvements brought by AdaBoost. This is in contrast to bagging, which directly aims at reducing variance and for which random resampling is essential to obtain the reduction in generalization error. Our system achieves about 1.4% error on a data set of on-line handwritten digits from more than 200 writers. A boosted multilayer network achieved 1.5% error on the UCI letters data set and 8.1% error on the UCI satellite data set, which is significantly better than boosted decision trees.
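The reweighting step at the heart of AdaBoost.M1 can be sketched as below (based on the standard Freund-Schapire algorithm, not this article's code). The resulting weights can drive either of the two training variants the article compares: resampling the training set, or weighting each example's term in the cost function.

```python
import numpy as np

def adaboost_reweight(sample_weights, mispredicted):
    """One AdaBoost.M1 reweighting step (standard-algorithm sketch).

    `mispredicted` is a boolean array over the training set.  Correctly
    classified examples are scaled by beta = eps / (1 - eps), which is
    below 1 whenever the base learner beats chance, so the next learner
    concentrates on the remaining errors.
    """
    eps = float(np.average(mispredicted, weights=sample_weights))
    beta = eps / (1.0 - eps)
    w = sample_weights * np.where(mispredicted, 1.0, beta)
    return w / w.sum()  # renormalize to a distribution
```

In the weighted-cost variant, these weights multiply the per-example loss directly instead of being used as sampling probabilities, which is why resampling is not essential to AdaBoost's gains.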


Author(s):  
S. Matthew Liao

Abstract. A number of people believe that results from neuroscience have the potential to settle seemingly intractable debates concerning the nature, practice, and reliability of moral judgments. In particular, Joshua Greene has argued that evidence from neuroscience can be used to advance the long-standing debate between consequentialism and deontology. This paper first argues that charitably interpreted, Greene’s neuroscientific evidence can contribute to substantive ethical discussions by being part of an epistemic debunking argument. It then argues that taken as an epistemic debunking argument, Greene’s argument falls short in undermining deontological judgments. Lastly, it proposes that accepting Greene’s methodology at face value, neuroimaging results may in fact call into question the reliability of consequentialist judgments. The upshot is that Greene’s empirical results do not undermine deontology and that Greene’s project points toward a way by which empirical evidence such as neuroscientific evidence can play a role in normative debates.


2020 ◽  
Vol 2020 (10) ◽  
pp. 28-1-28-7 ◽  
Author(s):  
Kazuki Endo ◽  
Masayuki Tanaka ◽  
Masatoshi Okutomi

Classification of degraded images is important in practice because images are usually degraded by compression, noise, blurring, etc. Nevertheless, most research on image classification focuses only on clean images without any degradation. Some papers have proposed deep convolutional neural networks composed of an image restoration network and a classification network to classify degraded images. This paper proposes an alternative approach in which a degraded image and an additional degradation parameter are used together for classification. The proposed classification network therefore has two inputs: the degraded image and the degradation parameter. An estimation network for the degradation parameters is also incorporated when the degradation parameters of the degraded images are unknown. The experimental results show that the proposed method outperforms a straightforward approach in which the classification network is trained with degraded images only.


Actuators ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 30
Author(s):  
Pornthep Preechayasomboon ◽  
Eric Rombokas

Soft robotic actuators are now being used in practical applications; however, they are often limited to open-loop control that relies on the inherent compliance of the actuator. Achieving human-like manipulation and grasping with soft robotic actuators requires at least some form of sensing, which often comes at the cost of complex fabrication and purpose-built sensor structures. In this paper, we utilize the actuating fluid itself as a sensing medium to achieve high-fidelity proprioception in a soft actuator. As our sensors are somewhat unstructured, their readings are difficult to interpret using linear models. We therefore present a proof of concept of a method for deriving the pose of the soft actuator using recurrent neural networks. We present the experimental setup and our learned state estimator to show that our method is viable for achieving proprioception and is also robust to common sensor failures.

