2/2 Training time optimization for balanced accuracy/complexity neural network models

CCCA12 ◽  
2012 ◽  
Author(s):  
Hector M. Romero Ugalde ◽  
Jean-Claude Carmona ◽  
Victor M. Alvarado
2019 ◽  
Vol 15 (3) ◽  
pp. 47-62 ◽  
Author(s):  
Chenghai Yu ◽  
Shupei Wang ◽  
Jiajun Guo

Chinese word segmentation is the basis of Chinese natural language processing (NLP). With the development of deep learning, various neural network models have been applied to Chinese word segmentation. However, current neural network models suffer from manual feature extraction, nonstandard word weights, an inability to use long-distance information effectively, and long training times. To address these problems, this article presents a CNN-Bidirectional GRU-CRF neural network model (CNN Bidirectional GRU CRF Network, CBiGCN), which breaks through the window limit of conventional methods and realizes true end-to-end processing. The model applies a five-tag set method and a bias-variable-weight greedy strategy, supplemented by the Goldstein-Armijo guidelines. It has a simple structure and is easy to operate. It learns features automatically, reduces the amount of task-specific knowledge needed in the form of handcrafted features and data pre-processing, and makes effective use of context information. The authors evaluate their system in an experiment on two Chinese word segmentation corpora. The experiment verified that the new model obtains better Chinese word segmentation results and greatly reduces training time.
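Tag-set methods of the kind mentioned in this abstract treat segmentation as labelling each character with its position inside a word. The sketch below is only an illustration: it uses the common four-tag B/M/E/S scheme (begin, middle, end, single), not the five-tag set of the paper, whose exact tags are not given in the abstract. The function names `to_bmes` and `from_bmes` and the example sentence are hypothetical.

```python
def to_bmes(words):
    """Convert a list of segmented words into one B/M/E/S tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first character B, last character E, anything between M
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def from_bmes(chars, tags):
    """Rebuild the word list from characters and their B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends on E or S
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


# Hypothetical example sentence, already segmented by hand.
segmented = ["我", "喜欢", "自然语言", "处理"]
sentence = "".join(segmented)
tags = to_bmes(segmented)
print(list(zip(sentence, tags)))
```

A sequence model such as the CNN-BiGRU-CRF described above would predict one tag per character, and a decoder like `from_bmes` would turn the predicted tags back into words.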


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1938
Author(s):  
Linling Qiu ◽  
Han Li ◽  
Meihong Wang ◽  
Xiaoli Wang

With its increasing incidence, cancer has become one of the main causes of mortality worldwide. In this work, we propose a novel attention-based neural network model named Gated Graph ATtention network (GGAT) for cancer prediction, in which a gating mechanism (GM) is introduced to work with the attention mechanism (AM) to overcome the limitation of previous work to 1-hop neighbourhood reasoning. In this way, GGAT can fully mine the potential correlation between related samples, which helps improve cancer prediction accuracy. Additionally, to simplify the datasets, we propose a hybrid feature selection algorithm that strictly selects gene features and significantly reduces training time without affecting prediction accuracy. To the best of our knowledge, the proposed GGAT achieves state-of-the-art results on the cancer prediction task on LIHC, LUAD, and KIRC compared to other traditional machine learning methods and neural network models, and improves accuracy by 1% to 2% on the Cora dataset compared to state-of-the-art graph neural network methods.


2021 ◽  
Vol 12 (07) ◽  
pp. 165-171
Author(s):  
Jonah Sokipriala ◽  
Sunny Orike

Fast detection and accurate classification of traffic signs is one of the major aspects of advanced driver assistance systems (ADAS) and intelligent transport systems (ITS). This paper presents a comparison between an 8-layer convolutional neural network (CNN) and state-of-the-art models such as VGG16 and ResNet50 for traffic sign classification on the GTSRB. Using a GPU to speed up processing, the design showed that, with various augmentations applied to the CNN, the 8-layer model was able to outperform the state-of-the-art models, with higher test accuracy, 50 times fewer training parameters, and faster training time. The 8-layer model achieved 96% test accuracy.


Time series prediction is forecasting based on analysis of the relationship pattern between the quantity to be predicted and the time variable. A recurrent neural network (RNN) model can recognize and learn the data pattern of a time series, but fluctuations in the data make the patterns difficult to learn. The data used for forecasting are tourist visits to the Tanah Lot tourist attraction in Bali over 10 years (2008-2017). Training with the RNN method on highly fluctuating data requires a relatively long time to recognize and learn the data patterns. Modifying the RNN method to use dynamic values for the learning rate and momentum can shorten learning time. The results showed that the learning time of the RNN with dynamic values was smaller than that of RNN variants such as the Elman RNN, Jordan RNN, Fully RNN, and LSTM, and of the feedforward method (Backpropagation). The resulting error value is an MSE of 0.05105. This value is smaller than that of the Fully RNN, Jordan RNN, LSTM, and feedforward methods. The Elman method has the shortest training time among the other models. The purpose of this research is to build a prediction design consisting of the sliding window technique, training with neural network models, and validation of results with k-fold cross-validation.
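The sliding window and k-fold steps named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `sliding_windows` and `k_fold_indices`, the window length of 3, the fold count of 3, and the visit counts are all made up for the example.

```python
def sliding_windows(series, window):
    """Turn a 1-D series into (inputs, target) pairs.

    Each pair holds `window` consecutive values as inputs and the
    value that immediately follows them as the prediction target.
    """
    pairs = []
    for start in range(len(series) - window):
        inputs = series[start:start + window]
        target = series[start + window]
        pairs.append((inputs, target))
    return pairs


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


# Made-up monthly visit counts, for illustration only.
visits = [120, 135, 128, 150, 160, 155, 170, 180, 175, 190, 200, 195]
pairs = sliding_windows(visits, window=3)
folds = list(k_fold_indices(len(pairs), k=3))

for train_idx, val_idx in folds:
    print(f"train on {len(train_idx)} windows, validate on {val_idx}")
```

Each fold would train a model on the `train` windows and report the error on the `validation` windows. Note that plain k-fold lets some training windows come later in time than the validation windows, which may or may not be acceptable depending on the forecasting setup.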


2021 ◽  
Vol 21 ◽  
pp. 330-335
Author(s):  
Maciej Wadas ◽  
Jakub Smołka

This paper presents the results of a performance analysis of the TensorFlow library used in machine learning and deep neural networks. The analysis focuses on comparing the parameters obtained when training a neural network model with the optimization algorithms Adam, Nadam, AdaMax, AdaDelta, and AdaGrad. Special attention has been paid to the differences in training efficiency between tasks run on a microprocessor and on a graphics card. For the study, neural network models were created to recognise Polish handwritten characters. The results showed that the most efficient algorithm is AdaMax, while the computer component used during the research affects only the training time of the neural network model.
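To make the difference between two of the compared optimizers concrete, the sketch below writes out the standard Adam and AdaMax update rules in plain Python, without TensorFlow. It is an illustration only: the toy loss f(x) = x², the starting point, the step count, and the shared learning rate of 0.1 are assumptions chosen for the example and say nothing about the paper's experiments or results.

```python
import math


def grad(x):
    # Gradient of the toy loss f(x) = x**2 (an assumption for illustration).
    return 2.0 * x


def run_adam(x, steps=200, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: scale the step by a running average of squared gradients."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def run_adamax(x, steps=200, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """AdaMax: scale the step by a decayed running maximum of |gradient|."""
    m = u = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        u = max(beta2 * u, abs(g))             # infinity-norm estimate
        x -= (lr / (1 - beta1 ** t)) * m / (u + eps)
    return x


x_adam = run_adam(5.0)
x_adamax = run_adamax(5.0)
print(f"Adam:   x = {x_adam:.4f}")
print(f"AdaMax: x = {x_adamax:.4f}")
```

The only structural difference is the denominator: Adam divides by the square root of an exponential average of squared gradients, while AdaMax divides by an exponentially decayed maximum of absolute gradients. The small `eps` in the AdaMax denominator is a safeguard against division by zero, added here as a practical assumption.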


2020 ◽  
Vol 5 ◽  
pp. 140-147 ◽  
Author(s):  
T.N. Aleksandrova ◽  
E.K. Ushakov ◽  
A.V. Orlova ◽  
...  

A series of neural network models used in the development of an aggregated digital twin of equipment as a cyber-physical system is presented. The twins of machining accuracy, chip formation, and tool wear are examined in detail. On their basis, systems for stabilizing the chip formation process during cutting and for diagnosing cutting tool wear are developed.

Keywords: cyberphysical system; neural network model of equipment; big data; digital twin of the chip formation; digital twin of the tool wear; digital twin of nanostructured coating choice


Energies ◽  
2021 ◽  
Vol 14 (14) ◽  
pp. 4242
Author(s):  
Fausto Valencia ◽  
Hugo Arcos ◽  
Franklin Quilumba

The purpose of this research is to evaluate artificial neural network models for predicting the stresses in a 400 MVA power transformer winding conductor caused by the circulation of fault currents. The models were compared on the behavior of their training, validation, and test errors. Different combinations of hyperparameters were analyzed by varying architectures, optimizers, and activation functions. The data for the process was created from finite element simulations performed in the FEMM software. The Artificial Neural Network was designed using the Keras framework. As a result, a model with one hidden layer, the Adam optimizer, and the ReLU activation function was the best-suited architecture for the problem at hand. The final Artificial Neural Network model predictions were compared with the Finite Element Method results, showing good agreement but with a much shorter solution time.

