A generalized quadratic loss for SVM and Deep Learning

2021 ◽  
Author(s):  
Filippo Portera

We consider some supervised binary classification tasks and a regression task, where SVM and Deep Learning, at present, exhibit the best generalization performances. We extend the work [3] on a generalized quadratic loss for learning problems that examines pattern correlations in order to concentrate the learning problem into input-space regions where patterns are more densely distributed. From a shallow-methods point of view (e.g., SVM), since the mathematical derivation of problem (9) in [3] is incorrect, we restart from problem (8) in [3] and try to solve it with a procedure that iterates over the dual variables until the primal and dual objective functions converge. In addition, we propose another algorithm that tries to solve the classification problem directly from the primal problem formulation. We also make use of Multiple Kernel Learning to improve generalization performances. Moreover, we introduce for the first time a custom loss that takes into consideration pattern correlation for a shallow and a Deep Learning task. We propose some pattern selection criteria and report the results on 4 UCI data sets for the SVM method. We also report the results on a larger binary classification data set based on Twitter, again drawn from UCI, combined with shallow Neural Networks, with and without the generalized quadratic loss. Finally, we test our loss with a Deep Neural Network on a larger regression task taken from UCI. We compare the results of our optimizers with the well-known solver SVMlight and with Keras multi-layer Neural Networks with standard losses and with a parameterized generalized quadratic loss, and we obtain comparable results.
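As a rough illustration of the idea (not the authors' actual formulation, whose correlation matrix is defined in [3]), a quadratic loss of the form e^T S e with a pattern-similarity matrix S couples the errors of nearby patterns, so that dense input-space regions weigh more; an RBF kernel is used for S here purely as an assumption:

```python
import numpy as np

def generalized_quadratic_loss(errors, S):
    """Quadratic loss e^T S e weighted by a pattern-similarity matrix S.

    With S = I this reduces to the ordinary sum of squared errors;
    off-diagonal entries couple the errors of correlated patterns.
    """
    return errors @ S @ errors

def rbf_matrix(X, gamma=1.0):
    # Pairwise squared distances -> Gaussian similarities
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

X = np.array([[0.0], [0.1], [5.0]])  # two nearby patterns, one isolated
e = np.ones(3)                       # unit residuals on all patterns
loss_plain = generalized_quadratic_loss(e, np.eye(3))      # ordinary SSE
loss_corr = generalized_quadratic_loss(e, rbf_matrix(X))   # correlation-weighted
```

With identical residuals everywhere, the correlated loss exceeds the plain one because the two nearby patterns' errors reinforce each other.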

Author(s):  
Jamilu Adamu

Activation Functions are crucial parts of Deep Learning Artificial Neural Networks. From the biological point of view, a neuron is just a node with many inputs and one output. A neural network consists of many interconnected neurons. It is a “simple” device that receives data at the input and provides a response. The function of neurons is to process and transmit information; the neuron is the basic unit in the nervous system. Carly Vandergriendt (2018) stated that the human brain at birth consists of an estimated 100 billion neurons. The ability of a machine to mimic human intelligence is called Machine Learning. Deep Learning Artificial Neural Networks were designed to work like a human brain with the aid of an arbitrary choice of non-linear Activation Functions. Currently, there is no rule of thumb for the choice of Activation Functions beyond “try out different things and see what combinations lead to the best performance”; however, the choice of Activation Functions should not be trial and error. Jamilu (2019) proposed that Activation Functions should emanate from an AI-ML-Purified Data Set and that their choice should satisfy Jameel’s ANNAF Stochastic and/or Deterministic Criterion. The objective of this paper is to propose instances where Deep Learning Artificial Neural Networks are SUPERINTELLIGENT. Using Jameel’s ANNAF Stochastic and/or Deterministic Criterion, the paper proposes four classes in which Deep Learning Artificial Neural Networks are superintelligent, namely Stochastic Superintelligent, Deterministic Superintelligent, and Stochastic-Deterministic 1st- and 2nd-Level Superintelligent. A Normal Probabilistic-Deterministic case is also proposed.


2021 ◽  
Vol 35 (4) ◽  
pp. 341-347
Author(s):  
Aparna Gullapelly ◽  
Barnali Gupta Banik

Classifying moving objects in video surveillance is difficult, and it is especially challenging to distinguish rigid from non-rigid objects with high accuracy. Here, rigid and non-rigid objects are limited to vehicles and people. A CNN is used for the binary classification of rigid and non-rigid objects. A deep-learning system based on convolutional neural networks was trained in Python to categorize objects according to their appearance. The classification is supported by a data set containing two classes of images, rigid and non-rigid, that differ in illumination.


2021 ◽  
Vol 4 ◽  
Author(s):  
Stefano Markidis

Physics-Informed Neural Networks (PINN) are neural networks encoding the problem governing equations, such as Partial Differential Equations (PDE), as a part of the neural network. PINNs have emerged as a new essential tool to solve various challenging problems, including computing linear systems arising from PDEs, a task for which several traditional methods exist. In this work, we focus first on evaluating the potential of PINNs as linear solvers in the case of the Poisson equation, an omnipresent equation in scientific computing. We characterize PINN linear solvers in terms of accuracy and performance under different network configurations (depth, activation functions, input data set distribution). We highlight the critical role of transfer learning. Our results show that low-frequency components of the solution converge quickly as an effect of the F-principle. In contrast, an accurate solution of the high frequencies requires an exceedingly long time. To address this limitation, we propose integrating PINNs into traditional linear solvers. We show that this integration leads to the development of new solvers whose performance is on par with other high-performance solvers, such as PETSc conjugate gradient linear solvers, in terms of performance and accuracy. Overall, while the accuracy and computational performance are still a limiting factor for the direct use of PINN linear solvers, hybrid strategies combining old traditional linear solver approaches with new emerging deep-learning techniques are among the most promising methods for developing a new class of linear solvers.
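The proposed hybrid strategy can be sketched in a few lines: a PINN supplies a cheap, smooth initial guess that a conjugate gradient solver then refines. In this sketch the 1D Poisson discretization is standard, but the PINN output is mimicked by a damped copy of the exact solution (a hypothetical stand-in, since training a network here would obscure the point):

```python
import numpy as np

def poisson_matrix(n):
    """1D Poisson equation -u'' = f, Dirichlet BCs, 3-point stencil."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def cg(A, b, x0, tol=1e-10, maxiter=500):
    """Plain conjugate gradient; returns the solution and iterations used."""
    x = x0.astype(float).copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for it in range(maxiter):
        if np.sqrt(rs) < tol:
            return x, it
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, maxiter

n = 64
A = poisson_matrix(n)
b = np.ones(n)
x_exact = np.linalg.solve(A, b)
# A PINN would supply a smooth approximate solution as the starting guess;
# a damped copy of the exact solution stands in for it here (hypothetical).
x_pinn = 0.99 * x_exact
x_cold, iters_cold = cg(A, b, np.zeros(n))
x_warm, iters_warm = cg(A, b, x_pinn)
```

The warm-started solve needs no more iterations than the cold start, which is the mechanism behind pairing a PINN's fast low-frequency convergence with a classical solver's handling of the high frequencies.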


Minerals ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 1265
Author(s):  
Sebastian Iwaszenko ◽  
Leokadia Róg

The study of the petrographic structure of medium- and high-rank coals is important from both a cognitive and a utilitarian point of view. The petrographic constituents and their individual characteristics and features are responsible for the properties of coal and the way it behaves in various technological processes. This paper considers the application of convolutional neural networks to coal petrographic image segmentation. A U-Net-based model for segmentation is proposed. The network was trained to segment inertinite, liptinite, and vitrinite. Segmentations prepared manually by a domain expert were used as the ground truth. The results show that inertinite and vitrinite can be successfully segmented with minimal difference from the ground truth. Liptinite turned out to be much more difficult to segment; after applying transfer learning, moderate results were obtained. Nevertheless, the application of the U-Net-based network to petrographic image segmentation was successful. The results are good enough to consider the method as a supporting tool for domain experts in everyday work.
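The paper does not state which agreement metric it uses; intersection-over-union per maceral class is one common way to quantify the difference between a predicted segmentation and the ground truth:

```python
import numpy as np

def iou(pred, gt):
    """Intersection-over-union of two binary masks (one maceral class)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

# Toy 2x2 masks: prediction covers the ground-truth pixel plus one extra
pred = np.array([[1, 1], [0, 0]], dtype=bool)
gt = np.array([[1, 0], [0, 0]], dtype=bool)
score = iou(pred, gt)  # intersection 1, union 2
```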


2020 ◽  
Author(s):  
Huseyin Yaşar ◽  
Murat Ceylan

Abstract: At the end of 2019, a new type of virus belonging to the Coronaviridae family emerged, and it is considered that the virus in question is of zoonotic origin. The virus that emerged in China first affected that country and then spread worldwide. Pneumonia develops due to the Covid-19 virus in patients with severe disease symptoms. Many literature studies have demonstrated, with the help of chest X-ray imaging, the effects of the disease-induced pneumonia on the lungs. In this study, which aims at early diagnosis of Covid-19 disease using X-ray images, the deep-learning approach, a state-of-the-art artificial intelligence method, was used, and automatic classification of images was performed using Convolutional Neural Networks (CNN). The first training-test data set used in the study contained a total of 230 abnormal and 80 normal X-ray images, while the second training-test data set contained 476 X-ray images, of which 150 were abnormal and 326 normal. Thus, classification results were obtained for two data sets, containing predominantly abnormal and predominantly normal images, respectively. In the study, a 23-layer CNN architecture was developed. Results were obtained by using chest X-ray images directly in the training-test procedures, as well as the sub-band images obtained by applying the Dual-Tree Complex Wavelet Transform (DT-CWT) to those images. The same experiments were repeated using images obtained by applying the Local Binary Pattern (LBP) to the chest X-ray images. Additionally, a new result-generation algorithm was put forward, which combines the experimental results and improves the success of the study. In the experiments, training was carried out using k-fold cross-validation, with k chosen as 23.
Considering the highest results of the tests performed in the study, the values of sensitivity, specificity, accuracy and AUC for the first training-test data set were 1, 1, 0.9913 and 0.9996, while for the second training-test data set they were 1, 0.9969, 0.9958 and 0.9996, respectively. Considering the average highest results of the experiments, the values of sensitivity, specificity, accuracy and AUC for the first training-test data set were 0.9933, 0.9725, 0.9843 and 0.9988, while for the second training-test data set they were 0.9813, 0.9908, 0.9857 and 0.9983, respectively.
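The reported sensitivity, specificity and accuracy follow directly from confusion-matrix counts; a minimal sketch (with toy labels, not the study's data):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = abnormal)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # abnormal, correctly flagged
    tn = np.sum((y_true == 0) & (y_pred == 0))  # normal, correctly cleared
    fp = np.sum((y_true == 0) & (y_pred == 1))  # normal, wrongly flagged
    fn = np.sum((y_true == 1) & (y_pred == 0))  # abnormal, missed
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / y_true.size

# Toy example: 4 abnormal, 4 normal; one false negative, one false positive
sens, spec, acc = binary_metrics([1, 1, 1, 1, 0, 0, 0, 0],
                                 [1, 1, 1, 0, 0, 0, 0, 1])
```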


Author(s):  
Jivan Y. Patil ◽  
Girish P. Potdar

The ability to process, understand and interact in natural language is highly important for building an intelligent system, as it greatly affects the way of communicating with the system. Deep Neural Networks (DNNs) have achieved excellent performance on many machine learning problems and are widely used for applications in computer vision and supervised learning. Although DNNs work well when large labeled training sets are available, they cannot be used to map complex structures like sentences end-to-end. Existing approaches to conversational modeling are domain-specific and require handcrafted rules. This paper proposes a simple approach based on the recently proposed sequence-to-sequence framework. The proposed model generates a reply by predicting a sentence, using the chained probability of the given sentence(s) in the conversation. The model is trained end-to-end on a large data set. The proposed approach uses attention to focus text generation on the intent of the conversation, as well as beam search to generate near-optimal output with some diversity. Primary findings show that the model exhibits common-sense reasoning on a movie-transcript data set.
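Beam search as used here keeps only the few highest-scoring partial sentences at each decoding step; a toy sketch in which a hypothetical bigram table stands in for the decoder's next-token log-probabilities:

```python
import math

def beam_search(step_logprobs, start, beam_width=2, max_len=4):
    """Keep the beam_width best partial sequences at every decoding step.

    step_logprobs(token) -> dict mapping next tokens to log-probabilities.
    """
    beams = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, logp in step_logprobs(seq[-1]).items():
                candidates.append((seq + [tok], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]

# Hypothetical bigram model (not from the paper) used for illustration
table = {
    "<s>": {"hello": math.log(0.6), "hi": math.log(0.4)},
    "hello": {"world": math.log(0.9), "there": math.log(0.1)},
    "hi": {"there": math.log(0.8), "world": math.log(0.2)},
    "world": {"</s>": 0.0},
    "there": {"</s>": 0.0},
    "</s>": {"</s>": 0.0},
}
best = beam_search(lambda tok: table[tok], "<s>")
```

Greedy decoding would follow only the single best token at each step; keeping a beam lets a slightly worse first token win if its continuations score higher overall.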


2019 ◽  
Author(s):  
Dan MacLean

Abstract: Gene regulatory networks that control gene expression are widely studied, yet the interactions that make them up are difficult to predict from high-throughput data. Deep Learning methods such as convolutional neural networks can perform surprisingly good classifications on a variety of data types, and the matrix-like gene expression profiles would seem to be ideal input data for deep learning approaches. In this short study I compiled training sets of expression data using the Arabidopsis AtGenExpress global stress expression data set and known transcription factor-target interactions from the Arabidopsis PLACE database. I built and optimised convolutional neural networks, with the best model providing 95% accuracy of classification on a held-out validation set. Investigation of the activations within this model revealed that classification was based on positive correlation of expression profiles in short sections. This result shows that a convolutional neural network can be used to make classifications, and to reveal the basis of those classifications, for gene expression data sets, indicating that a convolutional neural network is a useful and interpretable tool for exploratory classification of biological data. The final model is available for download and as a web application.
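The reported basis of classification, positive correlation of expression profiles over short sections, can be probed directly with a sliding-window Pearson correlation; a minimal sketch with made-up profiles (not the AtGenExpress data):

```python
import numpy as np

def windowed_correlation(a, b, window=5):
    """Pearson correlation of two expression profiles over sliding windows."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.array([np.corrcoef(a[i:i + window], b[i:i + window])[0, 1]
                     for i in range(len(a) - window + 1)])

tf_profile = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0])
target = 2.0 * tf_profile + 0.5     # perfectly co-expressed target gene
scores = windowed_correlation(tf_profile, target)
```

A true regulatory pair would show high scores in some windows; unrelated genes would hover near zero.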


2020 ◽  
Vol 2 (2) ◽  
pp. 32-37
Author(s):  
P. RADIUK ◽  

Over the last decade, a set of machine learning algorithms called deep learning has led to significant improvements in computer vision and natural language recognition and processing. This has led to the widespread use of a variety of commercial, learning-based products in various fields of human activity. Despite this success, the use of deep neural networks remains a black box. Today, the process of setting hyperparameters and designing a network architecture requires experience and a lot of trial and error, and is based more on chance than on a scientific approach. At the same time, the task of simplifying deep learning is extremely urgent. To date, no simple ways have been invented to establish the optimal values of the learning hyperparameters, namely the learning rate, batch size, data set, momentum, and weight decay. Grid search and random search of the hyperparameter space are extremely resource-intensive. The choice of hyperparameters is critical for the training time and the final result. In addition, experts often choose one of the standard architectures (for example, ResNets) and ready-made sets of hyperparameters. However, such sets are usually suboptimal for specific practical tasks. The presented work offers an approach to finding the optimal set of hyperparameters for training a CNN. An integrated approach to all hyperparameters is valuable because of the interdependence between them. The aim of the work is to develop an approach for setting a set of hyperparameters that will reduce the time spent designing a CNN and ensure the efficiency of its work. In recent decades, the introduction of deep learning methods, in particular convolutional neural networks (CNNs), has led to impressive success in image and video processing. However, the training of CNNs has commonly been based on the employment of quasi-optimal hyperparameters.
Such an approach usually requires huge computational and time costs to train the network and does not guarantee a satisfactory result. However, hyperparameters play a crucial role in the effectiveness of a CNN, as diverse hyperparameters lead to models with significantly different characteristics. Poorly selected hyperparameters generally lead to low model performance. The issue of choosing optimal hyperparameters for CNNs has not been resolved yet. The presented work proposes several practical approaches to setting hyperparameters that reduce training time and increase the accuracy of the model. The article considers the behaviour of the training and validation loss during underfitting and overfitting, with guidelines at the end on how to reach the optimization point. The paper also considers the regulation of learning rate and momentum to accelerate network training. All experiments are based on the widespread CIFAR-10 and CIFAR-100 datasets.
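The regulation of learning rate and momentum mentioned above is often implemented as a cyclical, one-cycle-style schedule in which the two move inversely; a minimal sketch, with all numeric values illustrative rather than taken from the paper:

```python
def one_cycle(step, total_steps, lr_max=0.1, lr_min=0.001,
              mom_max=0.95, mom_min=0.85):
    """One-cycle-style schedule: the learning rate ramps linearly
    lr_min -> lr_max -> lr_min over training, while momentum moves
    inversely (high when the learning rate is low, and vice versa)."""
    half = total_steps / 2
    t = step / half if step <= half else (total_steps - step) / half
    lr = lr_min + t * (lr_max - lr_min)
    mom = mom_max - t * (mom_max - mom_min)
    return lr, mom

lr_start, mom_start = one_cycle(0, 100)   # low lr, high momentum
lr_mid, mom_mid = one_cycle(50, 100)      # peak lr, low momentum
```

Feeding these per-step values to an SGD optimizer replaces a fixed learning rate and momentum with the interdependent pair the article advocates tuning together.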


Author(s):  
Andrés Ruiz-Tagle Palazuelos ◽  
Enrique López Droguett ◽  
Rodrigo Pascual

With the availability of cheaper multi-sensor systems, one has access to massive, multi-dimensional sensor data for fault diagnostics and prognostics. However, from a time, engineering, and computational perspective, it is often cost-prohibitive to manually extract useful features and to label all the data. To address these challenges, deep learning techniques have been used in recent years. Among these, convolutional neural networks have shown remarkable performance in fault diagnostics and prognostics. However, this model presents limitations from a prognostics and health management perspective: to improve its feature-extraction generalization capabilities and reduce computation time, pooling operations are employed, which sub-sample the data and thus lose potentially valuable information about an asset's degradation process. Capsule neural networks have recently been proposed to address these problems, with strong results in computer vision-related classification tasks. This has motivated us to extend capsule neural networks to fault prognostics and, in particular, remaining useful life estimation. The proposed model, architecture and algorithm are tested and compared to other state-of-the-art deep learning models on the benchmark Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) turbofan data set. The results indicate that the proposed capsule neural networks are a promising approach for remaining useful life prognostics from multi-dimensional sensor data.
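Remaining-useful-life models on the C-MAPSS benchmark are commonly trained against a piecewise-linear RUL target rather than a raw countdown; a minimal sketch (the cap of 125 cycles is a common convention in the literature, not a value taken from this paper):

```python
def piecewise_rul(cycle, total_cycles, cap=125):
    """Piecewise-linear RUL target: held constant at `cap` early in life,
    when degradation is not yet observable in the sensors, then
    decreasing linearly to zero at failure."""
    return min(cap, total_cycles - cycle)

# Targets for an engine that fails after 200 cycles, sampled every 50
targets = [piecewise_rul(c, 200) for c in range(0, 201, 50)]
```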


2021 ◽  
Vol 12 ◽  
pp. 878-901
Author(s):  
Ido Azuri ◽  
Irit Rosenhek-Goldian ◽  
Neta Regev-Rudzki ◽  
Georg Fantner ◽  
Sidney R Cohen

Progress in computing capabilities has enhanced science in many ways. In recent years, various branches of machine learning have been key facilitators in forging new paths, ranging from categorizing big data to instrumental control, and from materials design to image analysis. Deep learning has the ability to identify abstract characteristics embedded within a data set, subsequently using that association to categorize, identify, and isolate subsets of the data. Scanning probe microscopy measures multimodal surface properties, combining morphology with electronic, mechanical, and other characteristics. In this review, we focus on a subset of deep learning algorithms, convolutional neural networks, and how they are transforming the acquisition and analysis of scanning probe data.

