Layer-wise learning based stochastic gradient descent method for the optimization of deep convolutional neural network

2019, Vol 37 (4), pp. 5641-5654
Author(s): Qinghe Zheng, Xinyu Tian, Nan Jiang, Mingqiang Yang

2018
Author(s): Kazunori D Yamada

ABSTRACT: In the deep learning era, stochastic gradient descent is the most common method used for optimizing neural network parameters. Among the various mathematical optimization methods, gradient descent is the most basic. Adjusting the learning rate is necessary for quick convergence, and with plain gradient descent this is normally done manually. Many optimizers have been developed to control the learning rate and increase convergence speed; generally, they adjust the learning rate automatically in response to learning status. These optimizers were gradually improved by incorporating the effective aspects of earlier methods. In this study, we developed a new optimizer: YamAdam. Our optimizer is based on Adam, which utilizes the first and second moments of previous gradients. In addition to the moment estimation system, we incorporated an advantageous part of AdaDelta, namely a unit correction system, into YamAdam. According to benchmark tests on some common datasets, our optimizer showed similar or faster convergence compared to existing methods. YamAdam is an option as an alternative optimizer for deep learning.
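The abstract does not give YamAdam's exact update rule, but the idea it describes — Adam's first- and second-moment estimates combined with an AdaDelta-style unit correction — can be sketched as follows. All hyperparameter names and values here (`beta1`, `beta2`, `eps`, and the running squared-update accumulator `u`) are illustrative assumptions, not the published formula.

```python
import numpy as np

def adam_with_unit_correction(grad_fn, theta, steps=5000,
                              beta1=0.9, beta2=0.999, eps=1e-6):
    """Adam-style moment estimates whose step is scaled by an
    AdaDelta-like running RMS of past updates (the 'unit correction'),
    so the update carries the same units as the parameters.
    Illustrative sketch only, not the published YamAdam rule."""
    m = np.zeros_like(theta)  # first moment: running mean of gradients
    v = np.zeros_like(theta)  # second moment: running mean of squared gradients
    u = np.zeros_like(theta)  # running mean of squared updates (AdaDelta-style)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction, as in Adam
        v_hat = v / (1 - beta2 ** t)
        step = np.sqrt(u + eps) / np.sqrt(v_hat + eps) * m_hat
        theta = theta - step
        u = beta2 * u + (1 - beta2) * step * step
    return theta

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x = adam_with_unit_correction(lambda th: 2 * (th - 3.0), np.array([0.0]))
```

Note that, unlike Adam's fixed global learning rate, the `sqrt(u + eps)` factor plays the role of a self-tuned step size, which is the appeal of AdaDelta's unit correction.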


2021, Vol 2021, pp. 1-11
Author(s): Jinhuan Duan, Xianxian Li, Shiqi Gao, Zili Zhong, Jinyan Wang

With the vigorous development of artificial intelligence technology, various engineering applications have been implemented one after another. The gradient descent method plays an important role in solving various optimization problems due to its simple structure, good stability, and easy implementation. However, in multinode machine learning systems, gradients usually need to be shared, which can cause privacy leakage, because attackers can infer training data from the gradient information. In this paper, to prevent gradient leakage while keeping the accuracy of the model, we propose the super stochastic gradient descent approach, which updates parameters by concealing the modulus length of each gradient vector and converting it into a unit vector. Furthermore, we analyze the security of the super stochastic gradient descent approach and demonstrate that our algorithm can defend against attacks on the gradient. Experimental results show that our approach is clearly superior to prevalent gradient descent approaches in terms of accuracy, robustness, and adaptability to large-scale batches. Interestingly, our algorithm can also resist model poisoning attacks to a certain extent.
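The core mechanism described above — sharing only the direction of the gradient while concealing its modulus length — can be illustrated with a minimal sketch. The function name and learning rate are hypothetical, and the paper's full protocol involves more than this single normalization step.

```python
import numpy as np

def unit_gradient_step(theta, grad, lr=0.1):
    """One 'unit vector' update: the gradient's modulus length is
    concealed by normalizing the gradient before it is shared or
    applied. Minimal sketch of the idea only."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return theta  # zero gradient: nothing to reveal or apply
    return theta - lr * (grad / norm)

# Whatever the raw gradient's scale, only its direction is used.
theta = unit_gradient_step(np.zeros(2), np.array([3.0, 4.0]), lr=0.1)
```

An attacker observing the shared vector sees only a point on the unit sphere, not the gradient magnitude that inversion attacks typically exploit.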


Sensors, 2020, Vol 20 (9), pp. 2510
Author(s): Nam D. Vo, Minsung Hong, Jason J. Jung

Previous recommendation systems applied the matrix factorization collaborative filtering (MFCF) technique only to single domains. Due to data sparsity, this approach has difficulty overcoming the cold-start problem. Thus, in this study, we focus on discovering latent features from domains to understand the relationships between domains (called domain coherence). This approach uses potential knowledge of the source domain to improve the quality of the target domain recommendation. In this paper, we consider applying MFCF to multiple domains. Mainly, by adopting the implicit stochastic gradient descent algorithm to optimize the objective function for prediction, multiple matrices from different domains are consolidated inside the cross-domain recommendation system (CDRS). Additionally, we design a conceptual framework for CDRS, which applies to different industrial scenarios for recommenders across domains. Moreover, an experiment is devised to validate the proposed method. Using a real-world dataset gathered from Amazon Food and MovieLens, experimental results show that the proposed method improves computation time and MSE on a utility matrix by 15.2% and 19.7%, respectively, over other methods. Notably, a much lower convergence value of the loss function was obtained in the experiment. Furthermore, a critical analysis of the results shows that there is a dynamic balance between prediction accuracy and computational complexity.
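As a rough illustration of the underlying building block, a plain single-domain matrix-factorization SGD loop (not the paper's cross-domain consolidation or its implicit-SGD variant) might look like the following; all names and hyperparameters are assumptions for the sketch.

```python
import numpy as np

def mf_sgd(R, mask, k=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Single-domain matrix-factorization SGD: fit R ~ P @ Q.T using
    only the observed entries (mask == 1). Illustrative sketch; the
    paper consolidates matrices from several domains in one objective."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
    Q = 0.1 * rng.standard_normal((n_items, k))   # item latent factors
    users, items = np.nonzero(mask)
    for _ in range(epochs):
        for u, i in zip(users, items):
            err = R[u, i] - P[u] @ Q[i]             # prediction error
            P[u] += lr * (err * Q[i] - reg * P[u])  # regularized SGD step
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

R = np.array([[5.0, 3.0], [4.0, 1.0]])  # tiny toy utility matrix
P, Q = mf_sgd(R, np.ones_like(R))
mse = np.mean((R - P @ Q.T) ** 2)  # reconstruction error on observed entries
```

In the cross-domain setting, latent factor matrices like `P` would be shared or aligned across the source and target domains rather than trained on one utility matrix alone.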


2021, Vol 7 (3), pp. 420
Author(s): Budi Nugroho, Eva Yulia Puspaningrum, M. Syahrul Munir

This research concerns the classification of Covid-19 pneumonia (pneumonia caused by the SARS-CoV-2 coronavirus) from chest X-ray images using a machine learning approach. The classification determines whether a person's lungs show Covid-19 pneumonia, ordinary pneumonia, or a normal/healthy condition. To achieve better classification performance, an optimization process is often applied during training. Many techniques are used for this optimization, including the Root-Mean-Square Propagation (RMSprop) and Stochastic Gradient Descent (SGD) algorithms. In this study, both methods were tested to measure their performance on Covid-19 pneumonia classification. The classification method uses a Convolutional Neural Network (CNN) with 5 convolutional layers with filter sizes of 16, 32, 64, 128, and 256. Training uses 3,900 images consisting of 1,300 Covid-19 pneumonia images, 1,300 pneumonia images, and 1,300 normal images, while validation uses 450 images and testing uses 225 images. Based on the experiments, the RMSprop optimization algorithm achieved 87.99% accuracy, 0.88 precision, 0.86 recall, and 0.87 F1 score, while the SGD optimization algorithm achieved 66.22% accuracy, 0.69 precision, 0.64 recall, and 0.67 F1 score. These results provide the important finding that the RMSprop optimization algorithm performs far better than SGD on Covid-19 pneumonia classification.
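The two optimizers compared in the study differ only in their update rules. A minimal NumPy sketch (hypothetical parameter values, not the study's CNN training setup) shows why RMSprop's per-parameter scaling can help on badly conditioned problems:

```python
import numpy as np

def rmsprop_step(theta, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    """RMSprop: divide the step by a running RMS of recent gradients,
    giving each parameter its own effective learning rate."""
    cache = decay * cache + (1 - decay) * grad ** 2
    return theta - lr * grad / (np.sqrt(cache) + eps), cache

def sgd_step(theta, grad, lr=1e-3):
    """Plain SGD: one fixed learning rate for every parameter."""
    return theta - lr * grad

# Badly scaled quadratic: loss = 0.5 * (100 * x1^2 + 0.01 * x2^2).
scales = np.array([100.0, 0.01])
x_sgd = np.array([1.0, 1.0])
x_rms = np.array([1.0, 1.0])
cache = np.zeros(2)
for _ in range(1500):
    x_sgd = sgd_step(x_sgd, scales * x_sgd)
    x_rms, cache = rmsprop_step(x_rms, scales * x_rms, cache)
loss_sgd = 0.5 * np.sum(scales * x_sgd ** 2)
loss_rms = 0.5 * np.sum(scales * x_rms ** 2)
```

With one fixed learning rate, SGD must move slowly enough for the steep direction, so the shallow direction barely progresses; RMSprop normalizes both, which is consistent with the large accuracy gap the study reports.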


Author(s): Mamta Bisht, Richa Gupta

Every writer's handwriting exhibits variation, skew, and slant, which makes recognising handwritten documents a challenging task. This article presents a study of the methods available in the literature for Devanagari handwritten character recognition and implements one using a convolutional neural network (CNN). The available methods are studied on different parameters, and a tabular comparison is presented which shows the superiority of the CNN model on the character recognition task. The proposed CNN model achieves well-acceptable accuracy using dropout and the stochastic gradient descent (SGD) optimizer.
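Of the two ingredients credited above, dropout is the less standard one to implement by hand. A minimal sketch of inverted dropout follows; the function name and drop probability are illustrative, not from the article.

```python
import numpy as np

def dropout_forward(x, p_drop=0.5, train=True, rng=None):
    """Inverted dropout: randomly zero activations during training and
    rescale the survivors so the expected activation is unchanged;
    acts as the identity at inference time."""
    if not train or p_drop == 0.0:
        return x
    rng = np.random.default_rng(0) if rng is None else rng
    keep = rng.random(x.shape) >= p_drop          # Bernoulli keep mask
    return x * keep / (1.0 - p_drop)              # rescale survivors

x = np.ones(1000)
train_out = dropout_forward(x, p_drop=0.5, train=True)
test_out = dropout_forward(x, train=False)
```

Because the rescaling happens at training time, no change to the forward pass is needed at inference, which is why dropout combines cleanly with a plain SGD training loop.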


2018, Vol 10 (03), pp. 1850004
Author(s): Grant Sheen

Wireless recording and real-time classification of brain waves are essential steps towards future wearable devices to assist Alzheimer's patients in conveying their thoughts. This work is concerned with efficient computation of a dimension-reduced neural network (NN) model on Alzheimer's patient data recorded by a wireless headset. Because wireless recording uses far fewer sensors than the electrodes of a traditional wired cap, and an Alzheimer's patient has a shorter attention span than a healthy person, the data is much more restrictive than is typical in neural robotics and mind-controlled games. To overcome this challenge, an alternating minimization (AM) method is developed for network training. AM minimizes a nonsmooth and nonconvex objective function one variable at a time while fixing the rest. The sub-problem for each variable is piecewise convex with a finite number of minima. The overall iterative AM method is descending and free of the step size (learning rate) required by the standard gradient descent method. The proposed model, trained by the AM method, significantly outperforms the standard NN model trained by stochastic gradient descent in classifying four daily thoughts, reaching accuracies around 90% for an Alzheimer's patient. Curved decision boundaries of the proposed model with multiple hidden neurons are derived analytically to establish the nonlinear nature of the classification.
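The AM scheme described above — exact minimization in one variable at a time, with no step size — can be illustrated on a toy smooth objective with closed-form coordinate minimizers. The paper's NN sub-problems are piecewise convex with finitely many minima; everything below is a simplified illustration.

```python
def alternating_minimization(argmin_1d, x0, sweeps=50):
    """Cycle through the variables, exactly minimizing the objective in
    one coordinate while fixing the rest. Each sweep is descending and
    no step size (learning rate) is needed."""
    x = list(x0)
    for _ in range(sweeps):
        for i in range(len(x)):
            x[i] = argmin_1d(i, x)  # exact 1-D minimizer in coordinate i
    return x

# Toy objective f(a, b) = (a - b)^2 + (a - 1)^2 + b^2.
# Setting each partial derivative to zero gives closed-form minimizers:
#   over a (b fixed): a = (b + 1) / 2;  over b (a fixed): b = a / 2.
def argmin_1d(i, x):
    a, b = x
    return (b + 1) / 2 if i == 0 else a / 2

sol = alternating_minimization(argmin_1d, [0.0, 0.0])  # approaches (2/3, 1/3)
```

Each coordinate update can only decrease the objective, so the iteration is monotonically descending, mirroring the descent property the paper establishes for its piecewise-convex sub-problems.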

