Deep neural networks: a new framework for modelling biological vision and brain information processing

MemCat: A new category-based image set quantified on memorability

10.31234/osf.io/64xfa ◽

2019 ◽

Author(s):

Lore Goetschalckx ◽

Johan Wagemans

Keyword(s):

Neural Networks ◽

Memory Task ◽

Semantic Category ◽

Source Image ◽

Behavioral Correlates ◽

Semantic Categories ◽

Visual Hierarchy ◽

Image Set ◽

High Level ◽

Repeat Detection

This is a preprint. Please find the published, peer-reviewed version of the paper here: https://peerj.com/articles/8169/. Images differ in their memorability in consistent ways across observers. What makes an image memorable is not fully understood to date. Most of the current insight is in terms of high-level semantic aspects, related to the content. However, research still shows consistent differences within semantic categories, suggesting a role for factors at other levels of processing in the visual hierarchy. To aid investigations into this role as well as contributions to the understanding of image memorability more generally, we present MemCat. MemCat is a category-based image set, consisting of 10K images representing five broader, memorability-relevant categories (animal, food, landscape, sports, and vehicle) and further divided into subcategories (e.g., bear). They were sampled from existing source image sets that offer bounding box annotations or more detailed segmentation masks. We collected memorability scores for all 10K images, each score based on the responses of on average 99 participants in a repeat-detection memory task. Replicating previous research, the collected memorability scores show high levels of consistency across observers. Currently, MemCat is the second largest memorability image set and the largest offering a category-based structure. MemCat can be used to study the factors underlying the variability in image memorability, including the variability within semantic categories. In addition, it offers a new benchmark dataset for the automatic prediction of memorability scores (e.g., with convolutional neural networks). Finally, MemCat allows the study of neural and behavioral correlates of memorability while controlling for semantic category.


Interpretable, Data-Efficient and Verifiable Autonomy with High-Level Knowledge

10.36227/techrxiv.12591152 ◽

2020 ◽

Author(s):

Zhe Xu

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

Deep Neural Networks ◽

Autonomous Systems ◽

Real Data ◽

Data Driven ◽

Knowledge Representations ◽

Limited Availability ◽

High Level ◽

Level Performance

Despite the fact that artificial intelligence boosted with data-driven methods (e.g., deep neural networks) has surpassed human-level performance in various tasks, its application to autonomous systems still faces fundamental challenges such as lack of interpretability, intensive need for data and lack of verifiability. In this overview paper, I survey some attempts to address these fundamental challenges by explaining, guiding and verifying autonomous systems, taking into account limited availability of simulated and real data, the expressivity of high-level knowledge representations and the uncertainties of the underlying model. Specifically, this paper covers learning high-level knowledge from data for interpretable autonomous systems, guiding autonomous systems with high-level knowledge, and verifying and controlling autonomous systems against high-level specifications.


MemCat: a new category-based image set quantified on memorability

PeerJ ◽

10.7717/peerj.8169 ◽

2019 ◽

Vol 7 ◽

pp. e8169 ◽

Cited By ~ 1

Author(s):

Lore Goetschalckx ◽

Johan Wagemans

Keyword(s):

Neural Networks ◽

Memory Task ◽

Semantic Category ◽

Source Image ◽

Behavioral Correlates ◽

Semantic Categories ◽

Visual Hierarchy ◽

Image Set ◽

High Level ◽

Repeat Detection

Images differ in their memorability in consistent ways across observers. What makes an image memorable is not fully understood to date. Most of the current insight is in terms of high-level semantic aspects, related to the content. However, research still shows consistent differences within semantic categories, suggesting a role for factors at other levels of processing in the visual hierarchy. To aid investigations into this role as well as contributions to the understanding of image memorability more generally, we present MemCat. MemCat is a category-based image set, consisting of 10K images representing five broader, memorability-relevant categories (animal, food, landscape, sports, and vehicle) and further divided into subcategories (e.g., bear). They were sampled from existing source image sets that offer bounding box annotations or more detailed segmentation masks. We collected memorability scores for all 10K images, each score based on the responses of on average 99 participants in a repeat-detection memory task. Replicating previous research, the collected memorability scores show high levels of consistency across observers. Currently, MemCat is the second largest memorability image set and the largest offering a category-based structure. MemCat can be used to study the factors underlying the variability in image memorability, including the variability within semantic categories. In addition, it offers a new benchmark dataset for the automatic prediction of memorability scores (e.g., with convolutional neural networks). Finally, MemCat allows the study of neural and behavioral correlates of memorability while controlling for semantic category.


IMITATION OF VISUAL ILLUSIONS VIA OPENCV AND CNN

International Journal of Bifurcation and Chaos ◽

10.1142/s0218127408022573 ◽

2008 ◽

Vol 18 (12) ◽

pp. 3551-3609 ◽

Cited By ~ 6

Author(s):

MAKOTO ITOH ◽

LEON O. CHUA

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Visual Illusion ◽

Visual Illusions ◽

Color Spreading ◽

Brain Functions ◽

Morphological Gradient ◽

Illusory Movement ◽

Neon Color Spreading ◽

High Level

A visual illusion is the fallacious perception of reality or some actually existing object. In this paper, we imitate the mechanism of Ehrenstein illusion, neon color spreading illusion, watercolor illusion, Kanizsa illusion, shifted edges illusion, and hybrid image illusion using the Open Source Computer Vision Library (OpenCV). We also imitate these illusions using Cellular Neural Networks (CNNs). These imitations suggest that some illusions are processed by high-level brain functions. We next apply the morphological gradient operation to anomalous motion illusions. The processed images are classified into two kinds of images, which correspond to the central drift illusion and the peripheral drift illusion, respectively. This demonstrates that the contrast of the colors plays an important role in the anomalous motion illusion. We also imitate the anomalous motion illusions using both OpenCV and CNN. These imitations suggest that some visual illusions may be processed by the illusory movement of animations.
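
The morphological gradient mentioned in this abstract is, in its standard definition, the difference between the dilation and the erosion of an image, which highlights edges and local contrast. The following is only a minimal pure-Python sketch of that definition on a small grayscale grid; the function name `morphological_gradient`, the square window, and the toy image are illustrative assumptions, and this is not the authors' OpenCV code.

```python
def morphological_gradient(image, radius=1):
    """Dilation minus erosion over a square (2*radius + 1) window.

    `image` is a list of equal-length rows of numbers (hypothetical
    grayscale input). The window is clipped at the image borders.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood of (y, x), clipped at the borders.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # Dilation is the local maximum, erosion the local minimum.
            result[y][x] = max(window) - min(window)
    return result


# Toy image with a vertical step edge between columns 1 and 2.
step_edge = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
gradient = morphological_gradient(step_edge)
for row in gradient:
    print(row)
```

On this toy input, the nonzero responses sit on the two columns adjacent to the step, which is the sense in which the operation responds to contrast between neighbouring regions.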


Recurrent neural networks can explain flexible trading of speed and accuracy in biological vision

10.1101/677237 ◽

2019 ◽

Cited By ~ 6

Author(s):

Courtney J Spoerer ◽

Tim C Kietzmann ◽

Johannes Mehrer ◽

Ian Charest ◽

Nikolaus Kriegeskorte

Keyword(s):

Neural Network ◽

Neural Networks ◽

Computer Vision ◽

Visual Recognition ◽

Network Models ◽

Neural Network Models ◽

Biological Vision ◽

Visual Systems ◽

Confidence Threshold ◽

Recurrent Processing

Abstract: Deep feedforward neural network models of vision dominate in both computational neuroscience and engineering. The primate visual system, by contrast, contains abundant recurrent connections. Recurrent signal flow enables recycling of limited computational resources over time, and so might boost the performance of a physically finite brain or model. Here we show: (1) Recurrent convolutional neural network models outperform feedforward convolutional models matched in their number of parameters in large-scale visual recognition tasks on natural images. (2) Setting a confidence threshold, at which recurrent computations terminate and a decision is made, enables flexible trading of speed for accuracy. At a given confidence threshold, the model expends more time and energy on images that are harder to recognise, without requiring additional parameters for deeper computations. (3) The recurrent model’s reaction time for an image predicts the human reaction time for the same image better than several parameter-matched and state-of-the-art feedforward models. (4) Across confidence thresholds, the recurrent model emulates the behaviour of feedforward control models in that it achieves the same accuracy at approximately the same computational cost (mean number of floating-point operations). However, the recurrent model can be run longer (higher confidence threshold) and then outperforms parameter-matched feedforward comparison models. These results suggest that recurrent connectivity, a hallmark of biological visual systems, may be essential for understanding the accuracy, flexibility, and dynamics of human visual recognition.

Author summary: Deep neural networks provide the best current models of biological vision and achieve the highest performance in computer vision. Inspired by the primate brain, these models transform the image signals through a sequence of stages, leading to recognition. Unlike brains, in which outputs of a given computation are fed back into the same computation, these models do not process signals recurrently. The ability to recycle limited neural resources by processing information recurrently could explain the accuracy and flexibility of biological visual systems, which computer vision systems cannot yet match. Here we report that recurrent processing can improve recognition performance compared to similarly complex feedforward networks. Recurrent processing also enabled models to behave more flexibly and trade off speed for accuracy. Like humans, the recurrent network models can compute longer when an object is hard to recognise, which boosts their accuracy. The model’s recognition times predicted human recognition times for the same images. The performance and flexibility of recurrent neural network models illustrate that modeling biological vision can help us improve computer vision.
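
The confidence-threshold mechanism described in point (2) of this abstract can be illustrated with a small sketch: the model produces a readout at each recurrent time step, and computation stops at the first step whose confidence reaches the threshold. The code below is a hedged illustration only. The function name `decide_with_threshold`, the toy probability sequences, and the use of the maximum class probability as the confidence measure are all assumptions made here, not details taken from the paper.

```python
def decide_with_threshold(readouts, threshold):
    """Return (predicted_class, steps_used) for a sequence of readouts.

    `readouts` is a list of per-time-step class-probability lists
    (hypothetical model outputs). Confidence is taken to be the largest
    class probability, which is an assumption of this sketch. If the
    threshold is never reached, the last time step decides.
    """
    if not readouts:
        raise ValueError("at least one readout is required")
    for step, probs in enumerate(readouts, start=1):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            # Confident enough: terminate the recurrent computation here.
            return best, step
    # Threshold never reached: fall back to the final readout.
    return best, step


# Toy sequences: an "easy" image is confident immediately,
# a "hard" image needs more recurrent steps.
easy = [[0.90, 0.10], [0.95, 0.05]]
hard = [[0.55, 0.45], [0.70, 0.30], [0.92, 0.08]]

print(decide_with_threshold(easy, 0.8))
print(decide_with_threshold(hard, 0.8))
print(decide_with_threshold(hard, 0.6))
```

In this toy setting, raising the threshold makes the hard example use more steps before a decision, while the easy example still stops at the first step; this mirrors, in a very simplified form, the trade of speed for accuracy that the abstract describes.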


A scale-aware YOLO model for pedestrian detection

10.36227/techrxiv.13049129.v1 ◽

2020 ◽

Author(s):

Xingyi Yang ◽

Yonghu Wang ◽

Robert Laganiere

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Real Time ◽

State Of The Art ◽

Pedestrian Detection ◽

Small Scale ◽

Robust Detection ◽

Public Datasets ◽

Traditional Approaches ◽

New Framework

Pedestrian detection is considered one of the most challenging problems in computer vision, as it involves the combination of classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been demonstrated to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You Only Look Once model) has been proposed as one of the state-of-the-art methods in CNN-based object detection, it remains very challenging to leverage this method for real-time pedestrian detection. In this paper, we propose a new framework called SA YOLOv3, a scale-aware You Only Look Once framework which improves on YOLOv3 in detecting small-scale pedestrian instances in real time. Our network introduces two sub-networks which detect pedestrians of different scales. Outputs from the sub-networks are then combined to generate robust detection results. Experimental results show that the proposed SA YOLOv3 framework outperforms YOLOv3 on public datasets and runs at an average of 11 fps on a GPU.


A scale-aware YOLO model for pedestrian detection

10.36227/techrxiv.13049129 ◽

2020 ◽

Author(s):

Xingyi Yang ◽

Yong Wang ◽

Robert Laganiere

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Real Time ◽

State Of The Art ◽

Pedestrian Detection ◽

Small Scale ◽

Robust Detection ◽

Public Datasets ◽

Traditional Approaches ◽

New Framework

Pedestrian detection is considered one of the most challenging problems in computer vision, as it involves the combination of classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been demonstrated to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You Only Look Once model) has been proposed as one of the state-of-the-art methods in CNN-based object detection, it remains very challenging to leverage this method for real-time pedestrian detection. In this paper, we propose a new framework called SA YOLOv3, a scale-aware You Only Look Once framework which improves on YOLOv3 in detecting small-scale pedestrian instances in real time. Our network introduces two sub-networks which detect pedestrians of different scales. Outputs from the sub-networks are then combined to generate robust detection results. Experimental results show that the proposed SA YOLOv3 framework outperforms YOLOv3 on public datasets and runs at an average of 11 fps on a GPU.


Leveraging Prior Concept Learning Improves Generalization From Few Examples in Computational Models of Human Object Recognition

Frontiers in Computational Neuroscience ◽

10.3389/fncom.2020.586671 ◽

2021 ◽

Vol 14 ◽

Author(s):

Joshua S. Rule ◽

Maximilian Riesenhuber

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Deep Learning ◽

Computational Models ◽

Visual Representations ◽

Hierarchical Organization ◽

Visual Hierarchy ◽

Human Object ◽

Visual Concepts ◽

Deep Learning Model

Humans quickly and accurately learn new visual concepts from sparse data, sometimes just a single example. The impressive performance of artificial neural networks which hierarchically pool afferents across scales and positions suggests that the hierarchical organization of the human visual system is critical to its accuracy. These approaches, however, require orders of magnitude more examples than human learners. We used a benchmark deep learning model to show that the hierarchy can also be leveraged to vastly improve the speed of learning. We specifically show how previously learned but broadly tuned conceptual representations can be used to learn visual concepts from as few as two positive examples; reusing visual representations from earlier in the visual hierarchy, as in prior approaches, requires significantly more examples to perform comparably. These results suggest techniques for learning even more efficiently and provide a biologically plausible way to learn new visual concepts from few examples.


Interpretable, Data-Efficient and Verifiable Autonomy with High-Level Knowledge

10.36227/techrxiv.12591152.v1 ◽

2020 ◽

Author(s):

Zhe Xu

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

Deep Neural Networks ◽

Autonomous Systems ◽

Real Data ◽

Data Driven ◽

Knowledge Representations ◽

Limited Availability ◽

High Level ◽

Level Performance

Despite the fact that artificial intelligence boosted with data-driven methods (e.g., deep neural networks) has surpassed human-level performance in various tasks, its application to autonomous systems still faces fundamental challenges such as lack of interpretability, intensive need for data and lack of verifiability. In this overview paper, I survey some attempts to address these fundamental challenges by explaining, guiding and verifying autonomous systems, taking into account limited availability of simulated and real data, the expressivity of high-level knowledge representations and the uncertainties of the underlying model. Specifically, this paper covers learning high-level knowledge from data for interpretable autonomous systems, guiding autonomous systems with high-level knowledge, and verifying and controlling autonomous systems against high-level specifications.


The functional neuroanatomy of face perception: from brain measurements to deep neural networks

Interface Focus ◽

10.1098/rsfs.2018.0013 ◽

2018 ◽

Vol 8 (4) ◽

pp. 20180013 ◽

Cited By ~ 22

Author(s):

Kalanit Grill-Spector ◽

Kevin S. Weiner ◽

Jesse Gomez ◽

Anthony Stigliani ◽

Vaidehi S. Natu

Keyword(s):

Neural Networks ◽

Computational Models ◽

Human Performance ◽

Deep Neural Networks ◽

Functional Neuroanatomy ◽

Functional Architecture ◽

Visual Stream ◽

Ventral Visual Stream ◽

The Brain ◽

New Framework

A central goal in neuroscience is to understand how processing within the ventral visual stream enables rapid and robust perception and recognition. Recent neuroscientific discoveries have significantly advanced understanding of the function, structure and computations along the ventral visual stream that serve as the infrastructure supporting this behaviour. In parallel, significant advances in computational models, such as hierarchical deep neural networks (DNNs), have brought machine performance to a level that is commensurate with human performance. Here, we propose a new framework using the ventral face network as a model system to illustrate how increasing the neural accuracy of present DNNs may allow researchers to test the computational benefits of the functional architecture of the human brain. Thus, the review (i) considers specific neural implementational features of the ventral face network, (ii) describes similarities and differences between the functional architecture of the brain and DNNs, and (iii) provides a hypothesis for the computational value of implementational features within the brain that may improve DNN performance. Importantly, this new framework promotes the incorporation of neuroscientific findings into DNNs in order to test the computational benefits of fundamental organizational features of the visual system.
