Temporal Correlations in Presentation Order during Learning Affects Human Object Recognition

Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 33-33
Author(s):  
G M Wallis ◽  
H H Bülthoff

The view-based approach to object recognition supposes that objects are stored as a series of associated views. Although representation of these views as combinations of 2-D features allows generalisation to similar views, it remains unclear how very different views might be associated together to allow recognition from any viewpoint. One cue present in the real world, other than spatial similarity, is that we usually experience different objects in a temporally constrained, coherent order, and not as randomly ordered snapshots. In a series of recent neural-network simulations, Wallis and Baddeley (1997 Neural Computation 9 883 – 894) describe how the association of views on the basis of temporal as well as spatial correlations is both theoretically advantageous and biologically plausible. We describe an experiment aimed at testing their hypothesis in human object-recognition learning. We investigated recognition performance for faces previously presented in sequences. These sequences consisted of five views of five different people's faces, presented in orderly sequence from left to right profile in 45° steps. According to the temporal-association hypothesis, the visual system should associate the images together and represent them as different views of the same person's face, although in truth they are images of different people's faces. In a same/different task, subjects were asked to say whether two faces seen from different viewpoints were views of the same person or not. In accordance with the theory, discrimination errors increased for those faces seen earlier in the same sequence as compared with those faces which were not (p < 0.05).
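The temporal-association mechanism Wallis and Baddeley propose can be illustrated with a trace-style Hebbian update, in which a decaying trace of recent activity binds temporally adjacent views to the same output unit. The sketch below is illustrative only; the function name, parameters, and exact update form are simplified assumptions, not the published model:

```python
import numpy as np

def trace_update(w, views, eta=0.1, delta=0.8):
    """Associate temporally adjacent views via a trace-style Hebbian rule.

    w      : weight vector of one output unit (one entry per input feature)
    views  : sequence of input feature vectors, in temporal presentation order
    eta    : learning rate
    delta  : trace decay; delta > 0 blends past activity into the update,
             so successive views reinforce the same output unit
    """
    trace = 0.0
    for x in views:
        y = w @ x                                   # instantaneous activation
        trace = (1 - delta) * y + delta * trace     # temporal trace of activity
        w = w + eta * trace * x                     # Hebbian update driven by the trace
    return w
```

With delta = 0 the rule reduces to plain Hebbian learning on each view in isolation; with delta > 0, views shown in sequence are credited to the same unit even when they share no features.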

1998 ◽  
Vol 06 (03) ◽  
pp. 299-313 ◽  
Author(s):  
Guy Wallis

The view-based approach to object recognition relies upon the co-activation of 2-D pictorial elements or features. This approach can generalise recognition only across transformations of objects for which considerable physical similarity remains between the object and the stored 2-D images to which it is being compared. It is, therefore, unclear how completely novel views of an object might correctly be assigned to known views so as to allow correct recognition from any viewpoint. The answer to this problem may lie in the fact that in the real world we are presented with a further cue as to how we should associate these images, namely that we tend to view objects over extended periods of time. In this paper, neural-network and human psychophysics data on face recognition are presented which support the notion that recognition learning can be affected by the order in which images appear, as well as by their spatial similarity.


2016 ◽  
Author(s):  
Darren Seibert ◽  
Daniel L Yamins ◽  
Diego Ardila ◽  
Ha Hong ◽  
James J DiCarlo ◽  
...  

Human visual object recognition is subserved by a multitude of cortical areas. To make sense of this system, one line of research focused on the response properties of primary visual cortex neurons and developed theoretical models of a set of canonical computations, such as convolution, thresholding, exponentiation, and normalization, that could be hierarchically repeated to give rise to more complex representations. Another line of research focused on the response properties of high-level visual cortex and linked these to semantic categories useful for object recognition. Here, we hypothesized that the panoply of visual representations in the human ventral stream may be understood as emergent properties of a system constrained both by simple canonical computations and by top-level object-recognition functionality in a single unified framework (Yamins et al., 2014; Khaligh-Razavi and Kriegeskorte, 2014; Guclu and van Gerven, 2015). We built a deep convolutional neural network model optimized for object recognition and used representational similarity analysis to compare representations at various model levels to human functional imaging responses elicited by viewing hundreds of image stimuli. Neural network layers developed representations that corresponded in a hierarchically consistent fashion to visual areas from V1 to LOC. This correspondence increased with optimization of the model's recognition performance. These findings support a unified view of the ventral stream in which representations from the earliest to the latest stages can be understood as being built from basic computations inspired by models of early visual cortex and shaped by optimization for high-level, object-based performance constraints.
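Representational similarity analysis, as used here to compare model layers with imaging responses, reduces each representation to its pairwise dissimilarities between stimuli and then correlates those dissimilarity structures. A minimal sketch (Pearson correlation throughout for brevity; published work typically uses Spearman correlation and noise ceilings):

```python
import numpy as np

def rdm(responses):
    """Condensed representational dissimilarity matrix: 1 - Pearson r
    between response patterns for every pair of stimuli.

    responses : (n_stimuli, n_units) array, one response pattern per stimulus.
    """
    z = responses - responses.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    corr = z @ z.T                       # pairwise pattern correlations
    iu = np.triu_indices(len(responses), k=1)
    return 1.0 - corr[iu]                # upper triangle, as a flat vector

def rsa_score(model_responses, brain_responses):
    """Second-order similarity: correlation between the two RDMs."""
    a, b = rdm(model_responses), rdm(brain_responses)
    return np.corrcoef(a, b)[0, 1]
```

Because only the geometry of the stimulus space is compared, the model layer and the brain area need not have the same number of units or voxels.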


Author(s):  
Han Ding ◽  
Linwei Zhai ◽  
Cui Zhao ◽  
Songjiang Hou ◽  
Ge Wang ◽  
...  

This paper presents a non-invasive design, namely RF-ray, to recognize the shape and material of an object simultaneously. RF-ray places the object close to an RFID tag array and exploits the propagation and coupling effects between the RFID tags and the object for sensing. In contrast to prior proposals, RF-ray is capable of recognizing unseen objects, including unseen shape-material pairs and unseen materials within a certain container. To achieve this, RF-ray introduces a sensing-capability enhancement module and leverages a two-branch neural network for shape profiling and material identification, respectively. Furthermore, we incorporate a zero-shot-learning-based embedding module that draws on pre-trained linguistic features to generalize RF-ray to unseen materials. We build a prototype of RF-ray using commodity RFID devices. Comprehensive real-world experiments demonstrate that our system achieves high object recognition performance.
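The zero-shot step described above amounts to projecting the RF feature into a linguistic embedding space and labelling it with the nearest material word vector. A toy sketch with hypothetical two-dimensional word vectors (the actual embedding and projection networks are not specified in the abstract):

```python
import numpy as np

def zero_shot_classify(signal_embedding, word_embeddings):
    """Assign a sensed material to the class whose pre-trained linguistic
    embedding lies closest to the projected RF feature (cosine similarity).

    signal_embedding : RF feature already projected into the word-vector space
    word_embeddings  : dict mapping material name -> word vector
    """
    def cos(a, b):
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(word_embeddings,
               key=lambda m: cos(signal_embedding, word_embeddings[m]))
```

Because the class prototypes are word vectors rather than trained output units, a material never seen during RF training can still be named, provided its word embedding exists.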


2019 ◽  
Author(s):  
Eshed Margalit ◽  
Sarah B. Herald ◽  
Emily X. Meschke ◽  
Isabel Irawan ◽  
Rafael Maarek ◽  
...  

In 1968, Guzman showed how the myriad surfaces composing a highly complex and novel assemblage of volumes can be readily assigned to their appropriate volumes on the basis of the constraints offered by the vertices of coterminating edges. Of particular importance was the L-vertex, produced by the cotermination of two contours, which provides strong evidence for the termination of a 2-D surface. An X-junction, formed by the crossing of two contours without a change of direction at the crossing, played no role in the segmentation of the scene. If the potency of noise elements to affect recognition performance reflects their relevance to the segmentation of scenes, as suggested by Guzman, X-junctions would be expected to have little or no effect on shape-based object recognition, whereas L-junctions would be expected to have a strongly deleterious effect when they disrupt the smooth continuation of contours. Guzman's roles for the various vertices and junctions had never been put to systematic test with respect to human object recognition. By adding identical noise contours that produced either L-vertices or X-junctions to line drawings of objects, the two shape features could be compared with respect to how much they disrupted object recognition. Guzman's insights that irrelevant L-vertices should be disruptive and irrelevant X-junctions should have minimal effect were confirmed.


Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 202-202
Author(s):  
P Kalocsai ◽  
W I Biederman

A recognition model which defines a measure of shape similarity on the direct output of multiscale and multiorientation Gabor filters does not manifest qualitative aspects of human object recognition of contour-deleted images, in that: (a) it recognises recoverable and nonrecoverable contour-deleted images equally well, whereas humans recognise recoverable images much better; (b) it distinguishes complementary feature-deleted images, whereas humans do not. Adding some of the known connectivity patterns of the primary visual cortex to the model in the form of extension fields (connections between collinear and curvilinear units) among filters (a) increased the overall recognition performance of the model, (b) boosted the recognition rate of the recoverable images far more than that of the nonrecoverable ones, and (c) increased the similarity of complementary feature-deleted images, but not of part-deleted ones. These results correspond more closely to human psychophysical data. Interestingly, performance was approximately equivalent for narrow (±15°) and broad (±90°) extension fields.
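The model's front end, shape similarity computed directly on multiscale, multiorientation Gabor filter outputs, can be sketched as follows; the kernel parameters are illustrative assumptions, and the extension-field connectivity discussed above is not modelled here:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta):
    """Real-valued Gabor kernel at one scale (wavelength) and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * (size / 4) ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def gabor_signature(img, wavelengths=(4, 8), n_orient=4, size=9):
    """Vector of response energies from a small multiscale, multiorientation
    Gabor bank (circular correlation via the FFT, for brevity)."""
    F = np.fft.fft2(img)
    sig = []
    for wl in wavelengths:
        for k in range(n_orient):
            ker = gabor_kernel(size, wl, np.pi * k / n_orient)
            K = np.fft.fft2(ker, s=img.shape)
            resp = np.real(np.fft.ifft2(F * np.conj(K)))
            sig.append(np.sum(resp ** 2))   # summed energy per channel
    return np.array(sig)

def shape_similarity(img_a, img_b):
    """Cosine similarity between the two Gabor signatures."""
    a, b = gabor_signature(img_a), gabor_signature(img_b)
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

A similarity defined this way has no notion of contour recoverability, which is exactly the qualitative gap the extension fields are introduced to close.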


2021 ◽  
Vol 11 (11) ◽  
pp. 4758
Author(s):  
Ana Malta ◽  
Mateus Mendes ◽  
Torres Farinha

Maintenance professionals and other technical staff regularly need to learn to identify new parts in car engines and other equipment. The present work proposes a task-assistant model based on a deep-learning neural network. A YOLOv5 network is used for recognizing some of the constituent parts of an automobile. A dataset of car-engine images was created, and eight car parts were marked in the images. The neural network was then trained to detect each part. The results show that YOLOv5s is able to detect the parts in real-time video streams with high accuracy, making it useful as an aid for training professionals to work with new equipment through augmented reality. The architecture of an object recognition system using augmented-reality glasses is also designed.
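Running the trained detector requires the YOLOv5 runtime itself, but the post-processing such an assistant needs, turning raw detections into named annotations for the AR overlay, can be sketched independently. The eight part names below are hypothetical placeholders, since the abstract does not list the actual label set:

```python
# Hypothetical class names; the paper's actual eight labels are not given.
PART_NAMES = ["battery", "radiator", "alternator", "dipstick",
              "oil_cap", "fuse_box", "air_filter", "coolant_cap"]

def label_detections(detections, conf_threshold=0.5):
    """Turn raw (class_id, confidence, box) tuples, as produced by a YOLOv5
    detector, into named annotations for the AR overlay, dropping weak hits.

    detections : iterable of (class_id, confidence, (x1, y1, x2, y2))
    returns    : list of (part_name, confidence, box), highest confidence first
    """
    kept = [(PART_NAMES[c], conf, box)
            for c, conf, box in detections if conf >= conf_threshold]
    return sorted(kept, key=lambda d: -d[1])
```

Thresholding before display matters in this setting: a spurious low-confidence label floating over the wrong engine part would mislead exactly the novice users the assistant is meant to train.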


2020 ◽  
Vol 14 (2) ◽  
pp. 167-175
Author(s):  
Li Zhang ◽  
Volker Schwieger

The investigations on low-cost single-frequency GNSS receivers at the Institute of Engineering Geodesy (IIGS) show that u-blox GNSS receivers combined with low-cost antennas and self-constructed L1-optimized choke rings can reach an accuracy which almost meets the requirements of geodetic applications (see Zhang and Schwieger [25]). However, the quality (accuracy and reliability) of low-cost GNSS receiver data should still be improved, particularly in environments with obstructions. Multipath effects are a major error source for short baselines. A ground plate or a choke-ring ground plane can reduce multipath signals from horizontal reflectors (e.g. the ground), but such shielding cannot reduce multipath signals from vertical reflectors (e.g. walls). Because multipath effects are spatially and temporally correlated, an algorithm was developed for reducing the multipath effect by considering the spatial correlations of adjoining stations (see Zhang and Schwieger [24]). In this paper, an algorithm based on the temporal correlations is introduced. The developed algorithm operates on the periodic behaviour of the estimated coordinates rather than on carrier-phase raw data, which makes it easy to use, since coordinates are more accessible to users than raw data. The multipath effect causes periodic oscillations whose periods change over time; moreover, its influence on the coordinates is a mixture of multipath signals from different satellites and different reflectors. These two properties are used to reduce the multipath effect. The algorithm runs in two steps and iteratively. Test measurements were carried out in a multipath-intensive environment; the accuracies of the measurements were improved by about 50 % and the results can be delivered in near-real-time (in ca. 30 minutes), making the algorithm suitable for structural-health-monitoring applications.
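The published algorithm is a two-step iterative procedure on estimated coordinates; the toy sketch below illustrates only the underlying idea, suppressing a dominant periodic oscillation in a coordinate time series, and is not the authors' method:

```python
import numpy as np

def remove_dominant_oscillation(coords, n_components=1):
    """Suppress periodic multipath-like oscillation in a coordinate time
    series by zeroing its strongest non-DC Fourier components.

    coords       : 1-D array of coordinate estimates at a fixed sampling rate
    n_components : number of dominant oscillations to remove
    """
    spec = np.fft.rfft(coords)
    power = np.abs(spec)
    power[0] = 0.0                       # keep the mean (DC) untouched
    for _ in range(n_components):
        k = int(np.argmax(power))
        spec[k] = 0.0                    # remove the dominant oscillation
        power[k] = 0.0
    return np.fft.irfft(spec, n=len(coords))
```

A fixed-frequency filter like this would fail on real data precisely because, as the abstract notes, the multipath periods drift over time and several reflectors mix, which is why the published algorithm tracks the periodic behaviour iteratively instead.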


2019 ◽  
Vol 35 (05) ◽  
pp. 525-533
Author(s):  
Evrim Gülbetekin ◽  
Seda Bayraktar ◽  
Özlenen Özkan ◽  
Hilmi Uysal ◽  
Ömer Özkan

The authors tested face discrimination, face recognition, object discrimination, and object recognition in two face transplantation patients (FTPs) who had had facial injuries since infancy, a patient who had undergone facial surgery for a recent wound, and two control subjects. In Experiment 1, the authors showed them original faces and morphed forms of those faces and asked them to rate the similarity between the two. In Experiment 2, they showed old, new, and implicit faces and asked whether the participants recognized them or not. In Experiment 3, they showed them original objects and morphed forms of those objects and asked them to rate the similarity between the two. In Experiment 4, they showed old, new, and implicit objects and asked whether the participants recognized them or not. Object discrimination and object recognition performance did not differ between the FTPs and the controls. However, the face discrimination performance of FTP2 and the face recognition performance of FTP1 were poorer than those of the controls. The authors therefore concluded that the structure of the face might affect face processing.

