Meta-Heuristic Optimization Methods for Quaternion-Valued Neural Networks

Jeremiah Bill; Lance Champagne; Bruce Cox; Trevor Bihl

doi:10.3390/math9090938

Mathematics ◽

10.3390/math9090938 ◽

2021 ◽

Vol 9 (9) ◽

pp. 938

Author(s):

Jeremiah Bill ◽

Lance Champagne ◽

Bruce Cox ◽

Trevor Bihl

Keyword(s):

Neural Networks ◽

Optimization Methods ◽

Network Size ◽

High Dimensional ◽

Proof Of Concept ◽

Activation Functions ◽

Hypercomplex Numbers ◽

Neural Network Structure ◽

Future Work ◽

High Dimensional Datasets

In recent years, real-valued neural networks have demonstrated promising, and often striking, results across a broad range of domains. This has driven a surge of applications utilizing high-dimensional datasets. While many techniques exist to alleviate issues of high dimensionality, they all incur a cost in terms of network size or computational runtime. This work examines the use of quaternions, a form of hypercomplex numbers, in neural networks. The constructed networks demonstrate the ability of quaternions to encode high-dimensional data in an efficient neural network structure, showing that hypercomplex neural networks reduce the total number of trainable parameters compared to their real-valued equivalents. Finally, this work introduces a novel training algorithm using a meta-heuristic approach that bypasses the need for analytic quaternion loss or activation functions. This algorithm allows for a broader range of activation functions than current quaternion networks support and presents a proof of concept for future work.
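As a rough illustration of the parameter saving described in the abstract above, the sketch below implements the Hamilton product of two quaternions and compares weight counts for a fully connected layer. It assumes the common convention in which one quaternion weight (four real numbers) connects one quaternion input to one quaternion output. The function names and layer sizes are hypothetical and are not taken from the paper, and biases are ignored.

```python
def hamilton_product(p, q):
    """Multiply two quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def real_dense_params(n_in, n_out):
    """Weights in a real-valued dense layer (biases ignored)."""
    return n_in * n_out


def quaternion_dense_params(n_in, n_out):
    """Real numbers stored by a quaternion dense layer (biases ignored).

    Assumes n_in and n_out count real features and are multiples of 4,
    so that every 4 real features are packed into one quaternion.
    """
    return (n_in // 4) * (n_out // 4) * 4


# Hypothetical layer: 64 real inputs, 32 real outputs.
print(real_dense_params(64, 32))        # 2048
print(quaternion_dense_params(64, 32))  # 512
```

Under this assumption the quaternion layer stores a quarter of the weights of the real-valued layer, because each quaternion weight is reused across all four components through the Hamilton product.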

RECOGNITION OF THE TEXT BY MEANS OF DEEP LEARNING

Bulletin Series of Physics & Mathematical Sciences ◽

10.51889/2020-1.1728-7901.68 ◽

2020 ◽

Vol 69 (1) ◽

pp. 378-383

Author(s):

T.A. Nurmukhanov ◽

B.S. Daribayev

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Neural Network Model ◽

Input Data ◽

Optimization Methods ◽

Text Recognition ◽

Activation Functions

Neural networks can perform many kinds of object classification and are used in many areas of recognition, one of the largest being text recognition. The paper considers the optimal way to build a network for text recognition, including the choice of activation functions and optimizers, and checks the correctness of text recognition with different optimization methods. The article is devoted to the analysis of convolutional neural networks, and a convolutional neural network model is trained with supervised learning ("learning with a teacher"). Supervised learning is a type of training in which the network is given both the input data and the desired result, so that it learns to produce the provided result from the input data.

Comparison of high-dimensional neural networks using hypercomplex numbers in a robot manipulator control

Artificial Life and Robotics ◽

10.1007/s10015-021-00687-x ◽

2021 ◽

Author(s):

Kazuhiko Takahashi

Keyword(s):

Neural Networks ◽

Robot Manipulator ◽

High Dimensional ◽

Hypercomplex Numbers ◽

Manipulator Control

Analysis of Non-Linear Activation Functions for Classification Tasks Using Convolutional Neural Networks

Recent Patents on Computer Science ◽

10.2174/2213275911666181025143029 ◽

2019 ◽

Vol 12 (3) ◽

pp. 156-161 ◽

Cited By ~ 3

Author(s):

Aman Dureja ◽

Payal Pahwa

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Primary Objective ◽

Experimental Comparison ◽

Activation Functions ◽

Practical Applications ◽

Network Activation ◽

Non Linear ◽

Hidden Layer

Background: Activation functions play an important role in building deep neural networks, and the choice of activation function affects both the optimization of the network and the quality of its results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used at the hidden layers of deep neural networks has not been identified. Objective: The primary objective of this analysis was to describe which activation function should be used at the hidden layers of deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured using a two-class dataset (Cat/Dog). The network used 3 convolutional layers, with a pooling layer after each convolutional layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the next 2000 images for testing it. Results: The experimental comparison was done by analyzing the network with different activation functions on each CNN layer. The validation error and accuracy on the Cat/Dog dataset were analyzed using the activation functions ReLU, Tanh, SELU, PReLU, and ELU at a number of hidden layers. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: A CNN model with ReLU hidden layers (3 hidden layers here) gives the best results and improves overall performance in terms of accuracy and speed. These advantages of ReLU at the hidden layers of a CNN are helpful for the effective and fast retrieval of images from databases.
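For reference, the five activation functions compared in the abstract above can be written as scalar functions. This is a minimal sketch using the standard textbook definitions, not the authors' code: the ELU `alpha=1.0` and the PReLU slope `a=0.25` are assumed illustrative defaults (in PReLU the slope is normally a learned parameter), and the SELU constants are the usual published values.

```python
import math

# Standard SELU constants (self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    """Rectified linear unit: zero for negative inputs."""
    return max(0.0, x)


def tanh(x):
    """Hyperbolic tangent, bounded in (-1, 1)."""
    return math.tanh(x)


def elu(x, alpha=1.0):
    """Exponential linear unit; alpha=1.0 is an assumed default."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    """Scaled ELU with fixed alpha and scale."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def prelu(x, a=0.25):
    """Parametric ReLU; the slope a is learned in practice (0.25 assumed here)."""
    return x if x > 0 else a * x


for f in (relu, tanh, elu, selu, prelu):
    print(f.__name__, f(-2.0), f(2.0))
```

The functions differ only in how they treat negative inputs, which is where the comparison in the abstract is decided.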

Hardware implementation of radial-basis neural networks with Gaussian activation functions on FPGA

Neural Computing and Applications ◽

10.1007/s00521-021-05706-3 ◽

2021 ◽

Author(s):

Volodymyr Shymkovych ◽

Sergii Telenyk ◽

Petro Kravets

Keyword(s):

Neural Networks ◽

Hardware Implementation ◽

Gaussian Function ◽

Activation Function ◽

Rbf Neural Networks ◽

Activation Functions ◽

Rbf Network ◽

Combination Scheme ◽

Radial Basis ◽

Hidden Layer

This article introduces a method for realizing the Gaussian activation function of radial-basis function (RBF) neural networks in hardware on field-programmable gate arrays (FPGAs). The results of modeling the Gaussian function on FPGA chips of different families are presented. RBF neural networks of various topologies have been synthesized and investigated. The hardware component implemented by this algorithm is an RBF neural network with four hidden-layer neurons and one neuron with a sigmoid activation function, built on an FPGA using 16-bit fixed-point numbers, which took 1193 logic elements (LUTs, look-up tables). Each hidden-layer neuron of the RBF network is designed on the FPGA as a separate computing unit. The speed, measured as the total delay of the combinational circuit of the RBF network block, was 101.579 ns. The Gaussian activation functions of the hidden layer occupy 106 LUTs, and their delay is 29.33 ns. The absolute error is ±0.005. These results were obtained by modeling on the Spartan 3 family of chips; modeling on chips of other series is also presented in the article. Hardware implementation of RBF neural networks with such speed allows them to be used in real-time control systems for high-speed objects.
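To give a feel for what 16-bit fixed-point arithmetic does to a Gaussian activation, the sketch below quantizes the input and output of exp(-r²/(2σ²)) to a signed 16-bit word and measures the worst-case deviation from the floating-point value. The Q4.12 split (12 fractional bits), the width σ = 1 and the sampled input range are assumptions made for illustration. The abstract does not state the number format, and this emulates storage rounding only, not the FPGA circuit itself.

```python
import math

FRAC_BITS = 12           # assumed Q4.12 split of a 16-bit word
SCALE = 1 << FRAC_BITS   # 4096 steps per unit


def to_fixed(x):
    """Round a float to a signed 16-bit fixed-point integer (saturating)."""
    raw = int(round(x * SCALE))
    return max(-32768, min(32767, raw))


def from_fixed(raw):
    """Convert a fixed-point integer back to a float."""
    return raw / SCALE


def gaussian(r, sigma=1.0):
    """Floating-point Gaussian activation."""
    return math.exp(-(r * r) / (2.0 * sigma * sigma))


def gaussian_fixed(r, sigma=1.0):
    """Gaussian activation with input and output rounded to Q4.12."""
    r_q = from_fixed(to_fixed(r))
    return from_fixed(to_fixed(gaussian(r_q, sigma)))


# Worst-case rounding error over r in [-4, 4], sampled every 0.01.
max_err = max(
    abs(gaussian(i / 100) - gaussian_fixed(i / 100)) for i in range(-400, 401)
)
print(max_err)
```

With these assumed settings the rounding error stays well below 0.001, so number representation alone would not account for an error of the size quoted in the abstract; the approximation used to compute the Gaussian in hardware would contribute as well.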

Neural networks trained with high-dimensional functions approximation data in high-dimensional space

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211417 ◽

2021 ◽

pp. 1-12

Author(s):

Jian Zheng ◽

Jianfeng Wang ◽

Yanping Chen ◽

Shuping Chen ◽

Jingjin Chen ◽

...

Keyword(s):

Neural Networks ◽

Dimensional Space ◽

Data Distribution ◽

High Dimensional ◽

Sufficient Information ◽

Sufficient Data ◽

High Dimensional Space ◽

Positive Effects ◽

The Neural Networks ◽

Using Data

Neural networks can approximate data because they contain many compact non-linear layers. In high-dimensional space, due to the curse of dimensionality, the data distribution becomes sparse, making it difficult to provide sufficient information. Hence, the task becomes even harder when neural networks approximate data in high-dimensional space. To address this issue, two deviations are derived according to the Lipschitz condition: the deviation of neural networks trained using high-dimensional functions, and the deviation of high-dimensional functions approximating data. The purpose of this is to improve the ability of neural networks to approximate high-dimensional space. Experimental results show that neural networks trained using high-dimensional functions outperform those trained using data in their capability to approximate data in high-dimensional space. We find that neural networks trained using high-dimensional functions are more suitable for high-dimensional space than those trained using data, so there is no need to retain sufficient data for neural network training. Our findings suggest that, in high-dimensional space, tuning the hidden layers of neural networks is unlikely to have a substantial positive effect on the precision of data approximation.

Sensuator: A Hybrid Sensor–Actuator Approach to Soft Robotic Proprioception Using Recurrent Neural Networks

Actuators ◽

10.3390/act10020030 ◽

2021 ◽

Vol 10 (2) ◽

pp. 30

Author(s):

Pornthep Preechayasomboon ◽

Eric Rombokas

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Linear Models ◽

Open Loop ◽

Proof Of Concept ◽

State Estimator ◽

Loop Control ◽

Practical Applications ◽

Soft Actuator ◽

The Cost

Soft robotic actuators are now being used in practical applications; however, they are often limited to open-loop control that relies on the inherent compliance of the actuator. Achieving human-like manipulation and grasping with soft robotic actuators requires at least some form of sensing, which often comes at the cost of complex fabrication and purposefully built sensor structures. In this paper, we utilize the actuating fluid itself as a sensing medium to achieve high-fidelity proprioception in a soft actuator. As our sensors are somewhat unstructured, their readings are difficult to interpret using linear models. We therefore present a proof of concept of a method for deriving the pose of the soft actuator using recurrent neural networks. We present the experimental setup and our learned state estimator to show that our method is viable for achieving proprioception and is also robust to common sensor failures.

Trigonometric Inference Providing Learning in Deep Neural Networks

Applied Sciences ◽

10.3390/app11156704 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6704

Author(s):

Jingyong Cai ◽

Masashi Takemoto ◽

Yuming Qiu ◽

Hironori Nakajo

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Trigonometric Approximation ◽

Model Parameters ◽

Training Algorithms ◽

Activation Functions ◽

Classical Training ◽

Sum Formula

Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and insufficient in many different scenarios. Previous work has shown the benefit of calculating activation functions, such as the sigmoid, with shift-and-add operations, although this fails to remove multiplications from training altogether. In this paper, we propose an innovative approach that can convert all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are converted to multiplications of their sine values, which are replaceable with simpler operations with the help of the product-to-sum formula. In addition, a rectified sine activation function is utilized for further converting layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple add-and-shift operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that this method is able to obtain a performance close to that of classical training algorithms. The approach we propose sheds new light on future hardware customization research for machine learning.
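The trigonometric step in the abstract above rests on two facts: for values near zero, x ≈ sin x, and the product-to-sum identity sin a · sin b = ½[cos(a − b) − cos(a + b)]. The sketch below checks both numerically in floating point. The function name and the sample values are hypothetical, and how the paper evaluates the cosine terms with shift-and-add hardware is not shown here.

```python
import math


def product_via_sines(w, e):
    """Approximate w * e for small w, e using the product-to-sum identity.

    sin(w) * sin(e) = 0.5 * (cos(w - e) - cos(w + e)), and sin(x) ~ x near 0.
    The halving corresponds to a one-bit right shift in fixed-point hardware.
    """
    return 0.5 * (math.cos(w - e) - math.cos(w + e))


# Hypothetical small weight and error signal, both clustered around zero.
w, e = 0.03, -0.02
exact = w * e
approx = product_via_sines(w, e)
print(exact, approx, abs(exact - approx))
```

The identity itself is exact, so the only approximation error comes from replacing each value by its sine, which grows as the values move away from zero.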

On extended dissipativity analysis for neural networks with time-varying delay and general activation functions

Advances in Difference Equations ◽

10.1186/s13662-016-0769-7 ◽

2016 ◽

Vol 2016 (1) ◽

Cited By ~ 4

Author(s):

Xin Wang ◽

Kun She ◽

Shouming Zhong ◽

Jun Cheng

Keyword(s):

Neural Networks ◽

Time Varying ◽

Activation Functions ◽

Time Varying Delay ◽

General Activation ◽

Dissipativity Analysis ◽

Varying Delay

Feature extraction and artificial neural networks for the on-the-fly classification of high-dimensional thermochemical spaces in adaptive-chemistry simulations – ERRATUM

Data-Centric Engineering ◽

10.1017/dce.2021.4 ◽

2021 ◽

Vol 2 ◽

Author(s):

Giuseppe D’Alessio ◽

Alberto Cuoci ◽

Alessandro Parente

Keyword(s):

Neural Networks ◽

Feature Extraction ◽

Artificial Neural Networks ◽

High Dimensional ◽

Artificial Neural ◽

Adaptive Chemistry

Augmenting Around-Device Interaction by Geomagnetic Field Built-in Sensor Utilization

Sensors ◽

10.3390/s21093087 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3087

Author(s):

Sandi Ljubic ◽

Franko Hržić ◽

Alen Salkanovic ◽

Ivan Štajduhar

Keyword(s):

Magnetic Field ◽

Neural Networks ◽

Curve Fitting ◽

Absolute Error ◽

Interaction Space ◽

Text Entry ◽

Proof Of Concept ◽

Field Sensor ◽

Device Interaction ◽

Input Techniques

In this paper, we investigate the possibilities for augmenting interaction around the mobile device, with the aim of enabling input techniques that do not rely on typical touch-based gestures. The presented research focuses on utilizing a built-in magnetic field sensor, whose readouts are intentionally affected by moving a strong permanent magnet around a smartphone device. Different approaches for supporting magnet-based Around-Device Interaction are applied, including magnetic field fingerprinting, curve-fitting modeling, and machine learning. We implemented the corresponding proof-of-concept applications that incorporate magnet-based interaction. Namely, text entry is achieved by discrete positioning of the magnet within a keyboard mockup, and free-move pointing is enabled by monitoring the magnet’s continuous movement in real time. The related solutions successfully expand both the interaction language and the interaction space in front of the device without altering its hardware or involving sophisticated peripherals. A controlled experiment was conducted for an initial evaluation of the provided text entry method. The obtained results were promising (text entry speed of nine words per minute) and served as a motivation for implementing new interaction modalities. The use of neural networks proved to be a better approach than curve fitting for supporting free-move pointing. We demonstrate how neural networks with a very small number of input parameters can be used to provide highly usable pointing with an acceptable level of error (mean absolute error of 3 mm for pointer position on the smartphone display).
