Deep neural networks: a new framework for modelling biological vision and brain information processing

2015 ◽  
Author(s):  
Nikolaus Kriegeskorte

Recent advances in neural network modelling have enabled major strides in computer vision and other artificial intelligence applications. Human-level visual recognition abilities are coming within reach of artificial systems. Artificial neural networks are inspired by the brain and their computations could be implemented in biological neurons. Convolutional feedforward networks, which now dominate computer vision, take further inspiration from the architecture of the primate visual hierarchy. However, the current models are designed with engineering goals and not to model brain computations. Nevertheless, initial studies comparing internal representations between these models and primate brains find surprisingly similar representational spaces. With human-level performance no longer out of reach, we are entering an exciting new era, in which we will be able to build neurobiologically faithful feedforward and recurrent computational models of how biological brains perform high-level feats of intelligence, including vision.

2019 ◽  
Author(s):  
Lore Goetschalckx ◽  
Johan Wagemans

This is a preprint. Please find the published, peer-reviewed version of the paper here: https://peerj.com/articles/8169/. Images differ in their memorability in consistent ways across observers. What makes an image memorable is not fully understood to date. Most of the current insight is in terms of high-level semantic aspects, related to the content. However, research still shows consistent differences within semantic categories, suggesting a role for factors at other levels of processing in the visual hierarchy. To aid investigations into this role as well as contributions to the understanding of image memorability more generally, we present MemCat. MemCat is a category-based image set, consisting of 10K images representing five broader, memorability-relevant categories (animal, food, landscape, sports, and vehicle) and further divided into subcategories (e.g., bear). They were sampled from existing source image sets that offer bounding box annotations or more detailed segmentation masks. We collected memorability scores for all 10K images, each score based on the responses of on average 99 participants in a repeat-detection memory task. Replicating previous research, the collected memorability scores show high levels of consistency across observers. Currently, MemCat is the second largest memorability image set and the largest offering a category-based structure. MemCat can be used to study the factors underlying the variability in image memorability, including the variability within semantic categories. In addition, it offers a new benchmark dataset for the automatic prediction of memorability scores (e.g., with convolutional neural networks). Finally, MemCat allows the study of neural and behavioral correlates of memorability while controlling for semantic category.


2020 ◽  
Author(s):  
Zhe Xu

Despite the fact that artificial intelligence boosted with data-driven methods (e.g., deep neural networks) has surpassed human-level performance in various tasks, its application to autonomous systems still faces fundamental challenges such as lack of interpretability, intensive need for data and lack of verifiability. In this overview paper, I survey some attempts to address these fundamental challenges by explaining, guiding and verifying autonomous systems, taking into account limited availability of simulated and real data, the expressivity of high-level knowledge representations and the uncertainties of the underlying model. Specifically, this paper covers learning high-level knowledge from data for interpretable autonomous systems, guiding autonomous systems with high-level knowledge, and verifying and controlling autonomous systems against high-level specifications.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e8169 ◽  
Author(s):  
Lore Goetschalckx ◽  
Johan Wagemans

Images differ in their memorability in consistent ways across observers. What makes an image memorable is not fully understood to date. Most of the current insight is in terms of high-level semantic aspects, related to the content. However, research still shows consistent differences within semantic categories, suggesting a role for factors at other levels of processing in the visual hierarchy. To aid investigations into this role as well as contributions to the understanding of image memorability more generally, we present MemCat. MemCat is a category-based image set, consisting of 10K images representing five broader, memorability-relevant categories (animal, food, landscape, sports, and vehicle) and further divided into subcategories (e.g., bear). They were sampled from existing source image sets that offer bounding box annotations or more detailed segmentation masks. We collected memorability scores for all 10K images, each score based on the responses of on average 99 participants in a repeat-detection memory task. Replicating previous research, the collected memorability scores show high levels of consistency across observers. Currently, MemCat is the second largest memorability image set and the largest offering a category-based structure. MemCat can be used to study the factors underlying the variability in image memorability, including the variability within semantic categories. In addition, it offers a new benchmark dataset for the automatic prediction of memorability scores (e.g., with convolutional neural networks). Finally, MemCat allows the study of neural and behavioral correlates of memorability while controlling for semantic category.


2008 ◽  
Vol 18 (12) ◽  
pp. 3551-3609 ◽  
Author(s):  
MAKOTO ITOH ◽  
LEON O. CHUA

A visual illusion is a fallacious perception of reality or of some actually existing object. In this paper, we imitate the mechanism of the Ehrenstein illusion, neon color spreading illusion, watercolor illusion, Kanizsa illusion, shifted edges illusion, and hybrid image illusion using the Open Source Computer Vision Library (OpenCV). We also imitate these illusions using Cellular Neural Networks (CNNs). These imitations suggest that some illusions are processed by high-level brain functions. We next apply the morphological gradient operation to anomalous motion illusions. The processed images are classified into two kinds of images, which correspond to the central drift illusion and the peripheral drift illusion, respectively. This demonstrates that the contrast of the colors plays an important role in the anomalous motion illusion. We also imitate the anomalous motion illusions using both OpenCV and CNN. These imitations suggest that some visual illusions may be processed by the illusory movement of animations.
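
The morphological gradient mentioned in this abstract is conventionally defined as the difference between the dilation and the erosion of an image. The following is a minimal pure-Python sketch of that general definition, not the authors' OpenCV code; the 3x3 square structuring element, the border handling, and the names `morphological_gradient` and `toy` are assumptions made for illustration.

```python
def morphological_gradient(image):
    """Dilation minus erosion with a 3x3 square structuring element.

    `image` is a list of equal-length rows of numbers (a grayscale image).
    Border pixels use only the neighbours that fall inside the image
    (an assumption; other border conventions exist).
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # Dilation is the local maximum, erosion the local minimum.
            result[y][x] = max(window) - min(window)
    return result


# Hypothetical toy image: one dark pixel on a bright background.
toy = [
    [9, 9, 9, 9, 9],
    [9, 9, 9, 9, 9],
    [9, 9, 0, 9, 9],
    [9, 9, 9, 9, 9],
    [9, 9, 9, 9, 9],
]
edges = morphological_gradient(toy)
```

The result is non-zero only where the local contrast is non-zero, which is consistent with the abstract's remark that color contrast matters for the processed images.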


2019 ◽  
Author(s):  
Courtney J Spoerer ◽  
Tim C Kietzmann ◽  
Johannes Mehrer ◽  
Ian Charest ◽  
Nikolaus Kriegeskorte

Abstract: Deep feedforward neural network models of vision dominate in both computational neuroscience and engineering. The primate visual system, by contrast, contains abundant recurrent connections. Recurrent signal flow enables recycling of limited computational resources over time, and so might boost the performance of a physically finite brain or model. Here we show: (1) Recurrent convolutional neural network models outperform feedforward convolutional models matched in their number of parameters in large-scale visual recognition tasks on natural images. (2) Setting a confidence threshold, at which recurrent computations terminate and a decision is made, enables flexible trading of speed for accuracy. At a given confidence threshold, the model expends more time and energy on images that are harder to recognise, without requiring additional parameters for deeper computations. (3) The recurrent model’s reaction time for an image predicts the human reaction time for the same image better than several parameter-matched and state-of-the-art feedforward models. (4) Across confidence thresholds, the recurrent model emulates the behaviour of feedforward control models in that it achieves the same accuracy at approximately the same computational cost (mean number of floating-point operations). However, the recurrent model can be run longer (higher confidence threshold) and then outperforms parameter-matched feedforward comparison models. These results suggest that recurrent connectivity, a hallmark of biological visual systems, may be essential for understanding the accuracy, flexibility, and dynamics of human visual recognition.

Author summary: Deep neural networks provide the best current models of biological vision and achieve the highest performance in computer vision. Inspired by the primate brain, these models transform the image signals through a sequence of stages, leading to recognition. Unlike brains, in which outputs of a given computation are fed back into the same computation, these models do not process signals recurrently. The ability to recycle limited neural resources by processing information recurrently could explain the accuracy and flexibility of biological visual systems, which computer vision systems cannot yet match. Here we report that recurrent processing can improve recognition performance compared to similarly complex feedforward networks. Recurrent processing also enabled models to behave more flexibly and trade off speed for accuracy. Like humans, the recurrent network models can compute longer when an object is hard to recognise, which boosts their accuracy. The model’s recognition times predicted human recognition times for the same images. The performance and flexibility of recurrent neural network models illustrate that modeling biological vision can help us improve computer vision.
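
The confidence-threshold mechanism described in point (2) can be illustrated with a small sketch. It assumes the recurrent model emits one vector of class probabilities per time step and that confidence is the highest class probability; the abstract does not state which confidence measure the authors use, and the function name `decide_with_threshold` and the example readouts are hypothetical.

```python
def decide_with_threshold(readouts, threshold):
    """Return (predicted_class, steps_used) for one image.

    `readouts` holds one list of class probabilities per recurrent time step.
    Computation stops at the first step whose top probability reaches
    `threshold`; if no step does, the final step's prediction is used.
    Confidence = top class probability is an assumption of this sketch.
    """
    if not readouts:
        raise ValueError("readouts must contain at least one time step")
    for step, probs in enumerate(readouts, start=1):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            return best, step
    return best, len(readouts)


# Hypothetical readouts: an easy image is confident at once,
# a hard image only becomes confident after more recurrent steps.
easy = [[0.9, 0.1], [0.95, 0.05], [0.97, 0.03]]
hard = [[0.55, 0.45], [0.4, 0.6], [0.2, 0.8]]

easy_result = decide_with_threshold(easy, 0.75)
hard_result = decide_with_threshold(hard, 0.75)
fast_result = decide_with_threshold(hard, 0.5)
```

In this toy case a lower threshold makes the model answer the hard image after a single step but with a different label, while a higher threshold spends more steps before committing, which is the speed-accuracy trade the abstract describes.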


2020 ◽  
Author(s):  
Xingyi Yang ◽  
Yonghu Wang ◽  
Robert Laganiere

Pedestrian detection is considered one of the most challenging problems in computer vision, as it involves the combination of classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been demonstrated to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You Only Look Once model) is proposed as one of the state-of-the-art methods in CNN-based object detection, it remains very challenging to leverage this method for real-time pedestrian detection. In this paper, we propose a new framework called SA YOLOv3, a scale-aware You Only Look Once framework that improves on YOLOv3 in detecting small-scale pedestrian instances in real time. Our network introduces two sub-networks which detect pedestrians of different scales. Outputs from the sub-networks are then combined to generate robust detection results. Experimental results show that the proposed SA YOLOv3 framework outperforms YOLOv3 on public datasets and runs at an average of 11 fps on a GPU.


2020 ◽  
Author(s):  
Xingyi Yang ◽  
Yong Wang ◽  
Robert Laganiere

Pedestrian detection is considered one of the most challenging problems in computer vision, as it involves the combination of classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been demonstrated to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You Only Look Once model) is proposed as one of the state-of-the-art methods in CNN-based object detection, it remains very challenging to leverage this method for real-time pedestrian detection. In this paper, we propose a new framework called SA YOLOv3, a scale-aware You Only Look Once framework that improves on YOLOv3 in detecting small-scale pedestrian instances in real time. Our network introduces two sub-networks which detect pedestrians of different scales. Outputs from the sub-networks are then combined to generate robust detection results. Experimental results show that the proposed SA YOLOv3 framework outperforms YOLOv3 on public datasets and runs at an average of 11 fps on a GPU.


2021 ◽  
Vol 14 ◽  
Author(s):  
Joshua S. Rule ◽  
Maximilian Riesenhuber

Humans quickly and accurately learn new visual concepts from sparse data, sometimes just a single example. The impressive performance of artificial neural networks which hierarchically pool afferents across scales and positions suggests that the hierarchical organization of the human visual system is critical to its accuracy. These approaches, however, require orders of magnitude more examples than human learners. We used a benchmark deep learning model to show that the hierarchy can also be leveraged to vastly improve the speed of learning. We specifically show how previously learned but broadly tuned conceptual representations can be used to learn visual concepts from as few as two positive examples; reusing visual representations from earlier in the visual hierarchy, as in prior approaches, requires significantly more examples to perform comparably. These results suggest techniques for learning even more efficiently and provide a biologically plausible way to learn new visual concepts from few examples.


2018 ◽  
Vol 8 (4) ◽  
pp. 20180013 ◽  
Author(s):  
Kalanit Grill-Spector ◽  
Kevin S. Weiner ◽  
Jesse Gomez ◽  
Anthony Stigliani ◽  
Vaidehi S. Natu

A central goal in neuroscience is to understand how processing within the ventral visual stream enables rapid and robust perception and recognition. Recent neuroscientific discoveries have significantly advanced understanding of the function, structure and computations along the ventral visual stream that serve as the infrastructure supporting this behaviour. In parallel, significant advances in computational models, such as hierarchical deep neural networks (DNNs), have brought machine performance to a level that is commensurate with human performance. Here, we propose a new framework using the ventral face network as a model system to illustrate how increasing the neural accuracy of present DNNs may allow researchers to test the computational benefits of the functional architecture of the human brain. Thus, the review (i) considers specific neural implementational features of the ventral face network, (ii) describes similarities and differences between the functional architecture of the brain and DNNs, and (iii) provides a hypothesis for the computational value of implementational features within the brain that may improve DNN performance. Importantly, this new framework promotes the incorporation of neuroscientific findings into DNNs in order to test the computational benefits of fundamental organizational features of the visual system.

