Designing deep neural networks for continual learning in an open world

Mapping Intimacies ◽

10.21248/gups.62487 ◽

2021 ◽

Author(s):

Martin Mundt

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Network Architecture ◽

Neural Network Training ◽

Neural Network Architecture ◽

Neural Architecture ◽

Network Training ◽

Classification Tasks ◽

Continual Learning

Deep learning with neural networks seems to have largely replaced the traditional design of computer vision systems. Automated methods that learn a plethora of parameters are now favored over the previously practiced selection of explicit mathematical operators for a specific task. The promise is that practitioners no longer need to take care of every individual step and can instead focus on gathering large amounts of data for neural network training. As a consequence, both a shift in mindset towards big datasets and a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some aspects of deep learning that are mentioned only implicitly or overlooked, highlight unstated assumptions, and finally introduce methods to address the resulting immediate weaknesses. In the author’s humble opinion, these prevalent shortcomings can be tied to the fact that the steps involved in the machine learning workflow are frequently decoupled. Success is predominantly judged by accuracy measures designed for evaluation with static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in the context of a particular application. Correspondingly, in this dissertation, three key challenges have been identified: (1) choice and flexibility of a neural network architecture; (2) identification and rejection of unseen unknown data to avoid false predictions; (3) continual learning without forgetting already learned information. The latter challenges were already crucial topics in older literature but seem to require a renaissance in modern deep learning literature.
At first, they may appear to pose independent research questions; however, the thesis posits that these aspects are intertwined and require a joint perspective in machine-learning-based systems. In summary, the essential question is how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context and which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. Thus, the central emphasis of this dissertation is to build on existing deep learning strengths while acknowledging the weaknesses mentioned, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form. The respective publications can be grouped according to the three challenges outlined above. Correspondingly, chapter 1 focuses on the choice and extensibility of neural network architectures, analyzed in the context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and is first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 addresses the complementary follow-up questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism is proposed that distinguishes unseen concepts from what is known in classification tasks while enabling consecutive training without forgetting or revisiting older classes. Once more, the role of the chosen neural network architecture is quantitatively reassessed.
Finally, chapter 3 culminates in an overarching view, where the developed parts are connected. Here, an extensive survey embeds the insights gained in the broader literature landscape and emphasizes the importance of a common frame of thought. The approach ultimately presented thus reflects the thesis's overall contribution: advancing neural-network-based machine learning towards a unified solution that ties together the choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.
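
The abstract does not describe the mechanism the thesis uses to separate known from unknown data. As a rough illustration of the general idea of rejecting inputs instead of forcing a prediction, the sketch below thresholds the maximum softmax probability. This is a simple, widely used baseline and not the author's method; the class labels and the `threshold` value are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_reject(logits, class_names, threshold=0.8):
    """Return the predicted class name, or None when confidence is too low.

    `threshold` is a hypothetical value chosen for illustration only.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None  # treated as "unknown" and not classified
    return class_names[best]


classes = ["crack", "spalling", "background"]  # hypothetical labels
confident = predict_or_reject([6.0, 0.5, 0.2], classes)  # one logit dominates
uncertain = predict_or_reject([1.0, 0.9, 1.1], classes)  # near-uniform scores
```

A plain softmax threshold can still be confidently wrong on inputs far from the training data, which is one reason dedicated unknown-data detection is treated as a research challenge in its own right.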

AMBIENT: Accelerated Convolutional Neural Network Architecture Search for Regulatory Genomics

10.1101/2021.02.25.432960 ◽

2021 ◽

Author(s):

Zijun Zhang ◽

Evan M. Cofer ◽

Olga G. Troyanskaya

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Network Architecture ◽

Environmental Issue ◽

Biological Sequences ◽

Neural Network Architecture ◽

Computing Power ◽

Neural Architecture

Convolutional neural networks (CNNs) have become a standard approach for modeling genomic sequences. CNNs can be built effectively with Neural Architecture Search (NAS), which trades computing power for accurate neural architectures. Yet, the consumption of immense computing power is a major practical, financial, and environmental issue for deep learning. Here, we present a novel NAS framework, AMBIENT, that generates highly accurate CNN architectures for biological sequences of diverse functions, while substantially reducing the computing cost of conventional NAS.

Potato Disease Classification Using Convolution Neural Networks

Advances in Animal Biosciences ◽

10.1017/s2040470017001376 ◽

2017 ◽

Vol 8 (2) ◽

pp. 244-249 ◽

Cited By ~ 16

Author(s):

D. Oppenheim ◽

G. Shani

Keyword(s):

Neural Network ◽

Deep Learning ◽

Image Data ◽

Plant Diseases ◽

Disease Classification ◽

Neural Network Training ◽

Potato Disease ◽

Network Training ◽

Different Shapes ◽

Classification Tasks

Many plant diseases have distinct visual symptoms that can be used to identify and classify them correctly. This paper presents a potato disease classification algorithm that leverages these distinct appearances and the recent advances in computer vision made possible by deep learning. The algorithm trains a deep convolutional neural network to classify the tubers into five classes: four disease classes and a healthy potato class. The database of images used in this study, containing potatoes of different shapes, sizes and diseases, was acquired, classified, and labelled manually by experts. The models were trained over different train-test splits to better understand the amount of image data needed to apply deep learning to such classification tasks.
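
Training over different train-test splits can be sketched as follows. The abstract does not state the split ratios or the dataset size, so the fractions, the seed, and the placeholder image identifiers below are assumptions made purely for illustration.

```python
import random


def split_dataset(items, train_fraction, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for labelled potato images.
image_ids = [f"potato_{i:04d}.png" for i in range(400)]

split_sizes = {}
for fraction in (0.1, 0.5, 0.9):  # hypothetical train fractions
    train, test = split_dataset(image_ids, fraction)
    split_sizes[fraction] = (len(train), len(test))
```

Comparing a model's test accuracy across such splits gives a rough picture of how much labelled data the task needs.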

optNet-50: An Optimized Residual Neural Network Architecture of Deep Learning for Driver's Distraction

2020 IEEE 23rd International Multitopic Conference (INMIC) ◽

10.1109/inmic50486.2020.9318087 ◽

2020 ◽

Author(s):

Tahir Abbas ◽

Syed Farooq Ali ◽

Aadil Zia Khan ◽

Irfan Kareem

Keyword(s):

Neural Network ◽

Deep Learning ◽

Network Architecture ◽

Neural Network Architecture

Machine-learning in astronomy

Proceedings of the International Astronomical Union ◽

10.1017/s1743921314013672 ◽

2014 ◽

Vol 10 (S306) ◽

pp. 279-287 ◽

Cited By ~ 2

Author(s):

Michael Hobson ◽

Philip Graff ◽

Farhan Feroz ◽

Anthony Lasenby

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Gamma Ray ◽

Neural Network Training ◽

Training Algorithm ◽

Data Description ◽

Astronomical Data ◽

Machine Learning Methods ◽

Network Training

Machine-learning methods may be used to perform many tasks required in the analysis of astronomical data, including: data description and interpretation, pattern recognition, prediction, classification, compression, inference and many more. An intuitive and well-established approach to machine learning is the use of artificial neural networks (NNs), which consist of a group of interconnected nodes, each of which processes information that it receives and then passes this product on to other nodes via weighted connections. In particular, I discuss the first public release of the generic neural network training algorithm, called SkyNet, and demonstrate its application to astronomical problems focusing on its use in the BAMBI package for accelerated Bayesian inference in cosmology, and the identification of gamma-ray bursters. The SkyNet and BAMBI packages, which are fully parallelised using MPI, are available at http://www.mrao.cam.ac.uk/software/.
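
The description of nodes passing processed information along weighted connections can be illustrated with a minimal feed-forward pass. This is a generic sketch, not code from SkyNet or BAMBI; the layer sizes, weights, and activation functions are arbitrary values picked for the example.

```python
import math


def dense_layer(inputs, weights, biases, activation):
    """Each node sums its weighted inputs, adds a bias, and applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through each layer in turn and return the final outputs."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense_layer(values, weights, biases, activation)
    return values


def identity(z):
    return z


# Hypothetical two-layer network: 2 inputs -> 2 hidden nodes (tanh) -> 1 output.
network = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], math.tanh),
    ([[1.0, -1.0]], [0.1], identity),
]
output = forward([1.0, 1.0], network)
```

Training such a network means adjusting the weights and biases so that the outputs match known targets, which is the job of a training algorithm.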

Modelling Peri-Perceptual Brain Processes in a Deep Learning Spiking Neural Network Architecture

Scientific Reports ◽

10.1038/s41598-018-27169-8 ◽

2018 ◽

Vol 8 (1) ◽

Cited By ~ 16

Author(s):

Zohreh Gholami Doborjeh ◽

Nikola Kasabov ◽

Maryam Gholami Doborjeh ◽

Alexander Sumich

Keyword(s):

Neural Network ◽

Deep Learning ◽

Network Architecture ◽

Spiking Neural Network ◽

Neural Network Architecture

Deep-Learning-Based Neural Network Training for State Estimation Enhancement: Application to Attitude Estimation

IEEE Transactions on Instrumentation and Measurement ◽

10.1109/tim.2019.2895495 ◽

2020 ◽

Vol 69 (1) ◽

pp. 24-34 ◽

Cited By ~ 5

Author(s):

Mohammad K. Al-Sharman ◽

Yahya Zweiri ◽

Mohammad Abdel Kareem Jaradat ◽

Raghad Al-Husari ◽

Dongming Gan ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

State Estimation ◽

Attitude Estimation ◽

Neural Network Training ◽

Network Training

Proactive Congestion Avoidance for Distributed Deep Learning

Sensors ◽

10.3390/s21010174 ◽

2020 ◽

Vol 21 (1) ◽

pp. 174

Author(s):

Minkoo Kang ◽

Gyeongsik Yang ◽

Yeonho Yoo ◽

Chuck Yoo

Keyword(s):

Neural Network ◽

Deep Learning ◽

Queue Length ◽

Deep Neural Network ◽

Congestion Avoidance ◽

Model Parameters ◽

Neural Network Training ◽

Network Congestion ◽

Network Training ◽

Deep Learning Model

This paper presents “Proactive Congestion Notification” (PCN), a congestion-avoidance technique for distributed deep learning (DDL). DDL is widely used to scale out and accelerate deep neural network training. In DDL, each worker trains a copy of the deep learning model with different training inputs and synchronizes the model gradients at the end of each iteration. However, it is well known that the network communication for synchronizing model parameters is the main bottleneck in DDL. Our key observation is that the DDL architecture makes each worker generate burst traffic every iteration, which causes network congestion and in turn degrades the throughput of DDL traffic. Based on this observation, the key idea behind PCN is to prevent potential congestion by proactively regulating the switch queue length before DDL burst traffic arrives at the switch, which prepares the switches for handling incoming DDL bursts. In our evaluation, PCN improves the throughput of DDL traffic by 72% on average.
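
The synchronous pattern described above, where each worker computes gradients on its own inputs and the gradients are combined at the end of every iteration, can be sketched in a single process. The one-parameter linear model, the data, and the function names are illustrative assumptions. The sketch does not model network traffic, switch queues, or PCN itself.

```python
def local_gradient(weight, shard):
    """Gradient of the mean squared error for y = weight * x on one worker's shard."""
    return sum(2.0 * (weight * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for the network synchronisation step: average the workers' gradients."""
    return sum(values) / len(values)


def train(shards, steps=200, learning_rate=0.01):
    weight = 0.0  # every worker starts from the same model copy
    for _ in range(steps):
        # In a real deployment each worker computes its gradient in parallel.
        grads = [local_gradient(weight, shard) for shard in shards]
        # All workers apply the same averaged gradient, so the copies stay identical.
        weight -= learning_rate * all_reduce_mean(grads)
    return weight


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data with true weight 3.0
shards = [data[0::2], data[1::2]]  # two workers, disjoint training inputs
learned_weight = train(shards)
```

Because every worker reaches the averaging step at roughly the same moment in each iteration, the exchange happens in bursts, which is the traffic pattern the abstract identifies as the cause of congestion.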

A machine learning method for generation of a neural network architecture: a continuous ID3 algorithm

IEEE Transactions on Neural Networks ◽

10.1109/72.125869 ◽

1992 ◽

Vol 3 (2) ◽

pp. 280-291 ◽

Cited By ~ 71

Author(s):

K.J. Cios ◽

N. Liu

Keyword(s):

Neural Network ◽

Machine Learning ◽

Network Architecture ◽

Machine Learning Method ◽

Learning Method ◽

Neural Network Architecture ◽

Id3 Algorithm

PARROT is a flexible recurrent neural network framework for analysis of large protein datasets

eLife ◽

10.7554/elife.70576 ◽

2021 ◽

Vol 10 ◽

Author(s):

Daniel Griffith ◽

Alex S Holehouse

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

High Throughput ◽

Recurrent Neural Network ◽

Transcriptional Activation ◽

Network Architecture ◽

Learning Approaches ◽

Large Protein ◽

Protein Datasets

The rise of high-throughput experiments has transformed how scientists approach biological questions. The ubiquity of large-scale assays that can test thousands of samples in a day has necessitated the development of new computational approaches to interpret this data. Among these tools, machine learning approaches are increasingly being utilized due to their ability to infer complex nonlinear patterns from high-dimensional data. Despite their effectiveness, machine learning (and in particular deep learning) approaches are not always accessible or easy to implement for those with limited computational expertise. Here we present PARROT, a general framework for training and applying deep learning-based predictors on large protein datasets. Using an internal recurrent neural network architecture, PARROT is capable of tackling both classification and regression tasks while only requiring raw protein sequences as input. We showcase the potential uses of PARROT on three diverse machine learning tasks: predicting phosphorylation sites, predicting transcriptional activation function of peptides generated by high-throughput reporter assays, and predicting the fibrillization propensity of amyloid beta with data generated by deep mutational scanning. Through these examples, we demonstrate that PARROT is easy to use, performs comparably to state-of-the-art computational tools, and is applicable for a wide array of biological problems.
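
A recurrent network that takes raw protein sequences as input needs each residue turned into a numeric vector first. The abstract does not say how PARROT does this, so the one-hot scheme below is an assumed, commonly used encoding and the function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Encode a protein sequence as a list of 20-dimensional one-hot vectors."""
    encoded = []
    for residue in sequence.upper():
        if residue not in INDEX:
            raise ValueError(f"unsupported residue: {residue!r}")
        vector = [0] * len(AMINO_ACIDS)
        vector[INDEX[residue]] = 1
        encoded.append(vector)
    return encoded


encoded = one_hot_encode("MKV")  # one vector per residue, in sequence order
```

The resulting list has one vector per residue, so sequences of different lengths produce inputs of different lengths, which a recurrent architecture can process step by step.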

Application of deep learning methods to predict ionosphere parameters in real time

E3S Web of Conferences ◽

10.1051/e3sconf/202019602007 ◽

2020 ◽

Vol 196 ◽

pp. 02007

Author(s):

Vladimir Mochalov ◽

Anastasia Mochalova

Keyword(s):

Neural Network ◽

Deep Learning ◽

Real Time ◽

Network Architecture ◽

Short Term Memory ◽

Neural Network Architecture ◽

Short Term ◽

Learning Methods ◽

Term Memory ◽

Long Short Term Memory

In this paper, previously obtained results on recognizing ionograms with deep learning are extended to predict the parameters of the ionosphere. After the ionospheric parameters have been identified on the ionogram using deep learning in real time, the parameters can be predicted for some time ahead on the basis of the newly obtained data. Examples of predicting the ionosphere parameters using the long short-term memory (LSTM) recurrent neural network architecture are given. The place of the parameter-prediction block within the system for analyzing ionospheric data using deep learning methods is shown.
