Brain-inspired Multimodal Learning Based on Neural Networks

2018 ◽  
Vol 4 (1) ◽  
pp. 61-72 ◽  
Author(s):  
Chang Liu ◽  
Fuchun Sun ◽  
Bo Zhang

Modern computational models have leveraged biological advances in human brain research. This study addresses the problem of multimodal learning with the help of brain-inspired models. Specifically, a unified multimodal learning architecture is proposed based on deep neural networks, which are inspired by the biology of the visual cortex of the human brain. This unified framework is validated by two practical multimodal learning tasks: image captioning, involving visual and natural language signals, and visual-haptic fusion, involving haptic and visual signals. Extensive experiments are conducted under the framework, and competitive results are achieved.

2018 ◽  
Author(s):  
Chi Zhang ◽  
Xiaohan Duan ◽  
Ruyuan Zhang ◽  
Li Tong

Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
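The idea of building a similarity graph from a batch of intermediate representations, and of comparing a student's geometry with a teacher's, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the loss, so the cosine similarity, the squared-difference distance, and the names `cosine_similarity_graph` and `geometry_distance` below are assumptions chosen for illustration, not the authors' method.

```python
import math

def cosine_similarity_graph(batch):
    """Dense cosine-similarity matrix for a batch of non-zero feature vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in batch]
    n = len(batch)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # leave the diagonal at zero: no self-loops
                dot = sum(a * b for a, b in zip(batch[i], batch[j]))
                graph[i][j] = dot / (norms[i] * norms[j])
    return graph

def geometry_distance(graph_a, graph_b):
    """Sum of squared edge-weight differences between two graphs on the same batch."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(graph_a, graph_b)
        for a, b in zip(row_a, row_b)
    )

# Hypothetical latent representations of the same three inputs.
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]  # rescaled per input, same angles

loss = geometry_distance(
    cosine_similarity_graph(teacher),
    cosine_similarity_graph(student),
)
```

Because cosine similarity ignores vector length, the student above has the same geometry as the teacher even though its representations differ, so the distance is zero up to rounding. Under these assumptions, a student could match a teacher's geometry without copying its latent vectors.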


2017 ◽  
Author(s):  
Stefania Bracci ◽  
Ioannis Kalfas ◽  
Hans Op de Beeck

Recent studies showed agreement between how the human brain and neural networks represent objects, suggesting that we might start to understand the underlying computations. However, we know that the human brain is prone to biases at many perceptual and cognitive levels, often shaped by learning history and evolutionary constraints. Here we explore one such bias, namely the bias to perceive animacy, and use the performance of neural networks as a benchmark. We performed an fMRI study that dissociated object appearance (what an object looks like) from object category (animate or inanimate) by constructing a stimulus set that includes animate objects (e.g., a cow), typical inanimate objects (e.g., a mug), and, crucially, inanimate objects that look like the animate objects (e.g., a cow-mug). Behavioral judgments and deep neural networks categorized images mainly by animacy, setting all objects (lookalike and inanimate) apart from the animate ones. In contrast, activity patterns in ventral occipitotemporal cortex (VTC) were strongly biased towards object appearance: animals and lookalikes were similarly represented and separated from the inanimate objects. Furthermore, this bias interfered with proper object identification, such as failing to signal that a cow-mug is a mug. The bias in VTC to represent a lookalike as animate was even present when participants performed a task requiring them to report the lookalikes as inanimate. In conclusion, VTC representations, in contrast to neural networks, fail to veridically represent objects when visual appearance is dissociated from animacy, probably due to a biased processing of visual features typical of animate objects.


2021 ◽  
Author(s):  
Ahoud Alhazmi ◽  
Abdulwahab Aljubairy ◽  
Wei Emma Zhang ◽  
Quan Z Sheng ◽  
Elaf Alhazmi

Author(s):  
Amira Ahmad Al-Sharkawy ◽  
Gehan A. Bahgat ◽  
Elsayed E. Hemayed ◽  
Samia Abdel-Razik Mashali

Object classification is essential in many applications today. Humans can easily classify objects in unconstrained environments, whereas classical classification techniques fell far short of human performance. Researchers therefore tried to mimic the human visual system, which eventually led to deep neural networks. This chapter reviews and analyzes the use of deep convolutional neural networks for object classification in constrained and unconstrained environments. It briefly reviews the classical techniques of object classification and the development of bio-inspired computational models, from neuroscience to the creation of deep neural networks. It also reviews the issues of constrained environments: hardware computing resources and memory, object appearance and background, and training and processing time. The datasets used to test performance are analyzed according to the environmental conditions of their images, and dataset bias is discussed.


2020 ◽  
Vol 20 (11) ◽  
pp. 6603-6608 ◽  
Author(s):  
Sung-Tae Lee ◽  
Suhwan Lim ◽  
Jong-Ho Bae ◽  
Dongseok Kwon ◽  
Hyeong-Su Kim ◽  
...  

Deep learning achieves state-of-the-art results in various machine learning tasks, but for applications that require real-time inference, the high computational cost of deep neural networks becomes an efficiency bottleneck. To overcome this cost, spiking neural networks (SNNs) have been proposed. Herein, we propose a hardware implementation of an SNN with gated Schottky diodes as synaptic devices. In addition, we apply L1 regularization for connection pruning of deep spiking neural networks that use gated Schottky diodes as synaptic devices. Applying L1 regularization eliminates the need for a re-training procedure because it prunes the weights based on the cost function. The compressed hardware-based SNN is energy efficient while achieving a classification accuracy of 97.85%, which is comparable to the 98.13% of software deep neural networks (DNNs).
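How an L1 penalty can prune connections during training, rather than in a separate prune-then-retrain pass, can be illustrated with a small sketch. The abstract does not describe the optimizer, so the proximal (soft-thresholding) update used here is a swapped-in, standard way of handling an L1 term, and the function names, weights, gradients, and hyperparameters are all hypothetical.

```python
def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink w toward zero by t, clamping at zero."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def l1_proximal_step(weights, grads, lr, lam):
    """One gradient step on the data loss, followed by L1 shrinkage of strength lam."""
    return [soft_threshold(w - lr * g, lr * lam) for w, g in zip(weights, grads)]

# Hypothetical synaptic weights and data-loss gradients (not from the paper).
weights = [0.80, -0.03, 0.02, -0.50]
grads = [0.10, 0.00, 0.05, -0.20]

updated = l1_proximal_step(weights, grads, lr=0.1, lam=0.5)

# Weights that landed exactly on zero are the pruned connections.
pruned_mask = [w == 0.0 for w in updated]
```

In this toy run, the two small weights are set exactly to zero while the two large ones are only slightly shrunk, which is the sense in which the penalty itself selects the connections to remove.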


2020 ◽  
pp. 105971232092291
Author(s):  
Guido Schillaci ◽  
Antonio Pico Villalpando ◽  
Verena V Hafner ◽  
Peter Hanappe ◽  
David Colliaux ◽  
...  

This work presents an architecture that generates curiosity-driven goal-directed exploration behaviours for an image sensor of a microfarming robot. A combination of deep neural networks for offline unsupervised learning of low-dimensional features from images and of online learning of shallow neural networks representing the inverse and forward kinematics of the system has been used. The artificial curiosity system assigns interest values to a set of pre-defined goals and drives the exploration towards those that are expected to maximise the learning progress. We propose the integration of an episodic memory in intrinsic motivation systems to face catastrophic forgetting issues, typically experienced when performing online updates of artificial neural networks. Our results show that adopting an episodic memory system not only prevents the computational models from quickly forgetting knowledge that has been previously acquired but also provides new avenues for modulating the balance between plasticity and stability of the models.
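The two mechanisms described above, choosing the goal with the highest expected learning progress and replaying stored samples during online updates, can be sketched as follows. The abstract gives no formulas, so the windowed error-difference measure, the replay policy (most recent stored samples), and every name below are assumptions made for illustration only.

```python
from collections import deque

def learning_progress(errors, window=3):
    """Mean error of the older window minus mean error of the most recent window."""
    if len(errors) < 2 * window:
        return 0.0  # not enough history to estimate progress
    older = errors[-2 * window:-window]
    recent = errors[-window:]
    return sum(older) / window - sum(recent) / window

def select_goal(error_history):
    """Pick the goal whose prediction error is currently dropping fastest."""
    return max(error_history, key=lambda g: learning_progress(error_history[g]))

class EpisodicMemory:
    """Fixed-capacity store of past samples, replayed alongside each new one."""

    def __init__(self, capacity):
        self.samples = deque(maxlen=capacity)

    def training_batch(self, new_sample, replay_count):
        # Mix previously stored samples with the new one before storing it.
        replayed = list(self.samples)[-replay_count:] if replay_count > 0 else []
        self.samples.append(new_sample)
        return replayed + [new_sample]

# Hypothetical prediction-error histories for three pre-defined goals.
error_history = {
    "goal_a": [0.9, 0.9, 0.9, 0.9, 0.9, 0.9],  # not learnable yet
    "goal_b": [0.9, 0.8, 0.7, 0.5, 0.4, 0.3],  # improving quickly
    "goal_c": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],  # already mastered
}
chosen = select_goal(error_history)

memory = EpisodicMemory(capacity=2)
batches = [memory.training_batch(s, replay_count=2) for s in ["s1", "s2", "s3", "s4"]]
```

Here `chosen` is the goal whose error is still falling, and each batch in `batches` combines the new sample with older ones, so an online update never sees only the latest observation. The memory capacity is one knob for trading plasticity against stability in such a sketch.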


2020 ◽  
Vol 1 (1) ◽  
pp. 17-23 ◽  
Author(s):  
Abdullah Ahmad Zarir ◽  
Saad Bashar ◽  
Amelia Ritahani Ismail


Author(s):  
Yunpeng Chen ◽  
Xiaojie Jin ◽  
Bingyi Kang ◽  
Jiashi Feng ◽  
Shuicheng Yan

The residual unit and its variations are widely used in building very deep neural networks to alleviate optimization difficulty. In this work, we revisit the standard residual function as well as several of its successful variants and propose a unified framework based on tensor Block Term Decomposition (BTD) to explain these apparently different residual functions from the tensor decomposition view. With the BTD framework, we further propose a novel basic network architecture, named the Collective Residual Unit (CRU). CRU further enhances the parameter efficiency of deep residual neural networks by sharing core factors derived from collective tensor factorization over the involved residual units. It enables efficient knowledge sharing across multiple residual units, reduces the number of model parameters, lowers the risk of over-fitting, and provides better generalization ability. Extensive experimental results show that our proposed CRU network brings outstanding parameter efficiency: it achieves classification performance comparable to ResNet-200 while using a model size as small as ResNet-50 on the ImageNet-1k and Places365-Standard benchmark datasets.


Author(s):  
Shiva Prasad Kasiviswanathan ◽  
Nina Narodytska ◽  
Hongxia Jin

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: given a target network architecture, can we design a 'smaller' network architecture that 'approximates' the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever-increasing storage and memory requirements of these networks pose a problem in resource-constrained environments. In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.

