Hybrid training procedure applied to recurrent neural networks

1996 ◽  
Author(s):  
Xavier Loiseau ◽  
Jan Sendler
2021 ◽  
Vol 54 (4) ◽  
pp. 1-38
Author(s):  
Varsha S. Lalapura ◽  
J. Amudha ◽  
Hariram Selvamurugan Satheesh

Recurrent Neural Networks (RNNs) are pervasive in artificial intelligence applications such as speech recognition, predictive healthcare, and creative art. Although they deliver highly accurate solutions, they are notoriously difficult to train. Meanwhile, the current expansion of the IoT demands that intelligent models be deployed at the edge, even as model sizes and network architectures keep growing. Design efforts aimed at greater accuracy have thus had the inverse effect on portability to edge devices, which operate under real-time constraints on memory, latency, and energy. This article provides detailed insight into the compression techniques widely disseminated in the deep learning regime, which have become key to mapping powerful RNNs onto resource-constrained devices. While compression of RNNs is the main focus of the survey, it also highlights the challenges encountered during training, since the training procedure directly influences both model performance and compressibility. Recent advances that address these training challenges are discussed along with their strengths and drawbacks. In short, the survey covers the three-step process of architecture selection, efficient training, and compression suitable for a resource-constrained environment. It thus serves as a comprehensive guide a developer can adapt when pairing a time-series problem with an RNN solution at the edge.
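To make the survey's subject concrete, below is a minimal, hypothetical sketch of magnitude-based weight pruning, one family of compression techniques such surveys cover for shrinking RNNs to fit edge devices. The matrix shape, sparsity level, and function name are illustrative assumptions, not prescriptions from the article.

```python
# Hypothetical sketch of magnitude-based pruning for an RNN weight matrix.
# Shapes and the 90% sparsity target are assumptions for illustration only.
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Toy recurrent (hidden-to-hidden) weight matrix of an RNN cell, hidden size 128 (assumed).
rng = np.random.default_rng(0)
W_hh = rng.normal(scale=0.1, size=(128, 128))

W_pruned = magnitude_prune(W_hh, sparsity=0.9)
print(f"nonzero weights: {np.count_nonzero(W_pruned)} / {W_pruned.size}")
```

The resulting sparse matrix can then be stored in a compressed format (e.g., CSR) so that memory and multiply-accumulate counts shrink roughly in proportion to the sparsity.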


2021 ◽  
Author(s):  
Forough Hassanibesheli ◽  
Niklas Boers ◽  
Jürgen Kurths

Most forecasting schemes in the geosciences, and in particular for predicting weather and climate indices such as the El Niño Southern Oscillation (ENSO), rely on process-based numerical models [1]. Although statistical modelling [2] and prediction approaches also have a long history, more recently, different machine learning techniques have been used to predict climatic time series. One supervised machine learning algorithm suited to temporal and sequential data processing and prediction is the recurrent neural network (RNN) [3]. In this study we develop an RNN-based method that (1) can learn the dynamics of a stochastic time series without requiring a huge amount of training data, and (2) has a comparatively simple structure and an efficient training procedure. Since this algorithm is suitable for investigating complex nonlinear time series such as climate time series, we apply it to different ENSO indices. We demonstrate that our model can capture key features of the complex system dynamics underlying ENSO variability, and that it can forecast ENSO accurately at longer lead times than other recent studies [4].

References:

[1] P. Bauer, A. Thorpe, and G. Brunet, "The quiet revolution of numerical weather prediction," Nature, vol. 525, no. 7567, pp. 47–55, 2015.
[2] D. Kondrashov, S. Kravtsov, A. W. Robertson, and M. Ghil, "A hierarchy of data-based ENSO models," Journal of Climate, vol. 18, no. 21, pp. 4425–4444, 2005.
[3] L. R. Medsker and L. Jain, "Recurrent neural networks," Design and Applications, vol. 5, 2001.
[4] Y.-G. Ham, J.-H. Kim, and J.-J. Luo, "Deep learning for multi-year ENSO forecasts," Nature, vol. 573, no. 7775, pp. 568–572, 2019.
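For readers unfamiliar with RNN-based time-series forecasting, the sketch below shows the general shape of such a pipeline: a small LSTM trained on a synthetic noisy oscillation standing in for an ENSO index. This is an illustration under stated assumptions, not the authors' model; the architecture, window length, and training settings are invented for the example.

```python
# Illustrative only: a tiny LSTM forecaster on a synthetic "climate index".
# Hidden size, window length, and optimizer settings are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic index: a slow oscillation plus stochastic noise (stand-in for ENSO).
t = torch.arange(0, 2000, dtype=torch.float32)
series = torch.sin(2 * torch.pi * t / 120) + 0.2 * torch.randn_like(t)

window = 60  # predict the next value from the past 60 steps (assumed)
X = torch.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

class Forecaster(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (batch, window)
        out, _ = self.lstm(x.unsqueeze(-1))     # (batch, window, hidden)
        return self.head(out[:, -1]).squeeze(-1)  # last hidden state -> scalar

model = Forecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(5):                          # short full-batch demo loop
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    print(f"epoch {epoch}: mse={loss.item():.4f}")
```

Multi-step (longer lead time) forecasts would feed the model's own predictions back in as inputs, which is where learning the underlying dynamics, rather than memorizing the series, matters.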


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call "Levenshtein augmentation", which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation demonstrated improved performance over both non-augmented data and conventional SMILES-randomization augmentation when used to train the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as "attentional gain": an enhancement in the pattern-recognition capabilities of the underlying network with respect to molecular motifs.
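The augmentation is named after the Levenshtein (edit) distance between SMILES strings. Below is a minimal sketch of that distance applied to a hypothetical reactant/product pair; how the paper turns this similarity score into training pairs is not reproduced here, and the example molecules are illustrative.

```python
# Sketch of the Levenshtein (edit) distance the augmentation builds on.
# The reactant/product SMILES below are illustrative, not from the paper.
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return prev[-1]

reactant = "CCO"    # ethanol (illustrative)
product = "CC=O"    # acetaldehyde (illustrative)
print(levenshtein(reactant, product))  # -> 1 (one inserted character)
```

Pairs whose SMILES are only a few edits apart share long identical sub-sequences, which is the local similarity the augmentation exploits when constructing reactant-product training pairs.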


Author(s):  
Faisal Ladhak ◽  
Ankur Gandhe ◽  
Markus Dreyer ◽  
Lambert Mathias ◽  
Ariya Rastrow ◽  
...  
