Temporal Correlations in Presentation Order during Learning Affects Human Object Recognition

Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 33-33
Author(s):  
G M Wallis ◽  
H H Bülthoff

The view-based approach to object recognition supposes that objects are stored as a series of associated views. Although representation of these views as combinations of 2-D features allows generalisation to similar views, it remains unclear how very different views might be associated together to allow recognition from any viewpoint. One cue present in the real world, other than spatial similarity, is that we usually experience different objects in a temporally constrained, coherent order, and not as randomly ordered snapshots. In a series of recent neural-network simulations, Wallis and Baddeley (1997 Neural Computation 9 883 – 894) describe how the association of views on the basis of temporal as well as spatial correlations is both theoretically advantageous and biologically plausible. We describe an experiment aimed at testing their hypothesis in human object-recognition learning. We investigated recognition performance for faces previously presented in sequences. These sequences consisted of five views of five different people's faces, presented in orderly sequence from left to right profile in 45° steps. According to the temporal-association hypothesis, the visual system should associate the images together and represent them as different views of the same person's face, although in truth they are images of different people's faces. In a same/different task, subjects were asked to say whether two faces seen from different viewpoints were views of the same person or not. In accordance with the theory, discrimination errors increased for those faces seen earlier in the same sequence as compared with those faces which were not (p < 0.05).
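The temporal-association mechanism Wallis and Baddeley propose can be illustrated with a trace-style Hebbian update, in which a decaying trace of recent activity binds temporally adjacent views to the same output unit. The sketch below is illustrative only; the function name, parameters, and exact update form are simplified assumptions, not the published model:

```python
import numpy as np

def trace_update(w, views, eta=0.1, delta=0.8):
    """Associate temporally adjacent views via a trace-style Hebbian rule.

    w      : weight vector of one output unit (one entry per input feature)
    views  : sequence of input feature vectors, in temporal presentation order
    eta    : learning rate
    delta  : trace decay; delta > 0 blends past activity into the update,
             so successive views reinforce the same output unit
    """
    trace = 0.0
    for x in views:
        y = w @ x                                   # instantaneous activation
        trace = (1 - delta) * y + delta * trace     # temporal trace of activity
        w = w + eta * trace * x                     # Hebbian update driven by the trace
    return w
```

With delta = 0 the rule reduces to plain Hebbian learning on each view in isolation; with delta > 0, views shown in sequence are credited to the same unit even when they share no features.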

1998 ◽  
Vol 06 (03) ◽  
pp. 299-313 ◽  
Author(s):  
Guy Wallis

The view-based approach to object recognition relies upon the co-activation of 2-D pictorial elements or features. This approach can generalise recognition only across transformations of objects for which considerable physical similarity remains between the object and the stored 2-D images to which it is being compared. It is, therefore, unclear how completely novel views of an object might correctly be assigned to known views so as to allow correct recognition from any viewpoint. The answer to this problem may lie in the fact that in the real world we are presented with a further cue as to how we should associate these images, namely that we tend to view objects over extended periods of time. In this paper, neural-network and human psychophysics data on face recognition are presented which support the notion that recognition learning can be affected by the order in which images appear, as well as by their spatial similarity.


2016 ◽  
Author(s):  
Darren Seibert ◽  
Daniel L Yamins ◽  
Diego Ardila ◽  
Ha Hong ◽  
James J DiCarlo ◽  
...  

Human visual object recognition is subserved by a multitude of cortical areas. To make sense of this system, one line of research focused on the response properties of primary visual cortex neurons and developed theoretical models of a set of canonical computations, such as convolution, thresholding, exponentiation, and normalization, that could be hierarchically repeated to give rise to more complex representations. Another line of research focused on the response properties of high-level visual cortex and linked these to semantic categories useful for object recognition. Here, we hypothesized that the panoply of visual representations in the human ventral stream may be understood as emergent properties of a system constrained both by simple canonical computations and by top-level object-recognition functionality in a single unified framework (Yamins et al., 2014; Khaligh-Razavi and Kriegeskorte, 2014; Guclu and van Gerven, 2015). We built a deep convolutional neural network model optimized for object recognition and used representational similarity analysis to compare representations at various model levels to human functional imaging responses elicited by viewing hundreds of image stimuli. Neural network layers developed representations that corresponded in a hierarchically consistent fashion to visual areas from V1 to LOC. This correspondence increased with optimization of the model's recognition performance. These findings support a unified view of the ventral stream in which representations from the earliest to the latest stages can be understood as being built from basic computations inspired by models of early visual cortex and shaped by optimization for high-level, object-based performance constraints.
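Representational similarity analysis, as used here to compare model layers with imaging responses, reduces each representation to its pairwise dissimilarities between stimuli and then correlates those dissimilarity structures. A minimal sketch (Pearson correlation throughout for brevity; published work typically uses Spearman correlation and noise ceilings):

```python
import numpy as np

def rdm(responses):
    """Condensed representational dissimilarity matrix: 1 - Pearson r
    between response patterns for every pair of stimuli.

    responses : (n_stimuli, n_units) array, one response pattern per stimulus.
    """
    z = responses - responses.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    corr = z @ z.T                       # pairwise pattern correlations
    iu = np.triu_indices(len(responses), k=1)
    return 1.0 - corr[iu]                # upper triangle, as a flat vector

def rsa_score(model_responses, brain_responses):
    """Second-order similarity: correlation between the two RDMs."""
    a, b = rdm(model_responses), rdm(brain_responses)
    return np.corrcoef(a, b)[0, 1]
```

Because only the geometry of the stimulus space is compared, the model layer and the brain area need not have the same number of units or voxels.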


Author(s):  
Han Ding ◽  
Linwei Zhai ◽  
Cui Zhao ◽  
Songjiang Hou ◽  
Ge Wang ◽  
...  

This paper presents a non-invasive design, namely RF-ray, to recognize the shape and material of an object simultaneously. RF-ray places the object close to an RFID tag array and exploits the propagation and coupling effects between the RFID tags and the object for sensing. In contrast to prior proposals, RF-ray is capable of recognizing unseen objects, including unseen shape-material pairs and unseen materials within a certain container. To achieve this, RF-ray introduces a sensing-capability enhancement module and leverages a two-branch neural network for shape profiling and material identification, respectively. Furthermore, we incorporate a zero-shot-learning-based embedding module that draws on pre-trained linguistic features to generalize RF-ray to unseen materials. We build a prototype of RF-ray using commodity RFID devices. Comprehensive real-world experiments demonstrate that our system achieves high object recognition performance.
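The zero-shot step described above amounts to projecting the RF feature into a linguistic embedding space and labelling it with the nearest material word vector. A toy sketch with hypothetical two-dimensional word vectors (the actual embedding and projection networks are not specified in the abstract):

```python
import numpy as np

def zero_shot_classify(signal_embedding, word_embeddings):
    """Assign a sensed material to the class whose pre-trained linguistic
    embedding lies closest to the projected RF feature (cosine similarity).

    signal_embedding : RF feature already projected into the word-vector space
    word_embeddings  : dict mapping material name -> word vector
    """
    def cos(a, b):
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(word_embeddings,
               key=lambda m: cos(signal_embedding, word_embeddings[m]))
```

Because the class prototypes are word vectors rather than trained output units, a material never seen during RF training can still be named, provided its word embedding exists.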


2019 ◽  
Author(s):  
Eshed Margalit ◽  
Sarah B. Herald ◽  
Emily X. Meschke ◽  
Isabel Irawan ◽  
Rafael Maarek ◽  
...  

In 1968, Guzman showed how the myriad surfaces composing a highly complex and novel assemblage of volumes can be readily assigned to their appropriate volumes on the basis of the constraints offered by the vertices of coterminating edges. Of particular importance was the L-vertex, produced by the cotermination of two contours, which provides strong evidence for the termination of a 2-D surface. An X-junction, formed by the crossing of two contours without a change of direction at the crossing, played no role in the segmentation of the scene. If the potency of noise elements to affect recognition performance reflects their relevance to the segmentation of scenes, as suggested by Guzman, X-junctions would be expected to have little or no effect on shape-based object recognition, whereas L-junctions would be expected to have a strongly deleterious effect when they disrupt the smooth continuation of contours. Guzman's roles for the various vertices and junctions had never been put to systematic test with respect to human object recognition. By adding identical noise contours that produced either L-vertices or X-junctions to line drawings of objects, the two shape features could be compared with respect to how much they disrupted object recognition. Guzman's insights that irrelevant L-vertices should be disruptive and irrelevant X-junctions should have minimal effect were confirmed.


Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 202-202
Author(s):  
P Kalocsai ◽  
W I Biederman

A recognition model which defines a measure of shape similarity on the direct output of multiscale and multiorientation Gabor filters does not manifest qualitative aspects of human object recognition of contour-deleted images, in that: (a) it recognises recoverable and nonrecoverable contour-deleted images equally well, whereas humans recognise recoverable images much better; (b) it distinguishes complementary feature-deleted images, whereas humans do not. Adding some of the known connectivity patterns of the primary visual cortex to the model in the form of extension fields (connections between collinear and curvilinear units) among filters (a) increased the overall recognition performance of the model, (b) boosted the recognition rate of the recoverable images far more than that of the nonrecoverable ones, and (c) increased the similarity of complementary feature-deleted images, but not of part-deleted ones. These results correspond more closely to human psychophysical data. Interestingly, performance was approximately equivalent for narrow (±15°) and broad (±90°) extension fields.
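The model's front end, shape similarity computed directly on multiscale, multiorientation Gabor filter outputs, can be sketched as follows; the kernel parameters are illustrative assumptions, and the extension-field connectivity discussed above is not modelled here:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta):
    """Real-valued Gabor kernel at one scale (wavelength) and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * (size / 4) ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def gabor_signature(img, wavelengths=(4, 8), n_orient=4, size=9):
    """Vector of response energies from a small multiscale, multiorientation
    Gabor bank (circular correlation via the FFT, for brevity)."""
    F = np.fft.fft2(img)
    sig = []
    for wl in wavelengths:
        for k in range(n_orient):
            ker = gabor_kernel(size, wl, np.pi * k / n_orient)
            K = np.fft.fft2(ker, s=img.shape)
            resp = np.real(np.fft.ifft2(F * np.conj(K)))
            sig.append(np.sum(resp ** 2))   # summed energy per channel
    return np.array(sig)

def shape_similarity(img_a, img_b):
    """Cosine similarity between the two Gabor signatures."""
    a, b = gabor_signature(img_a), gabor_signature(img_b)
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

A similarity defined this way has no notion of contour recoverability, which is exactly the qualitative gap the extension fields are introduced to close.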


2021 ◽  
Vol 11 (11) ◽  
pp. 4758
Author(s):  
Ana Malta ◽  
Mateus Mendes ◽  
Torres Farinha

Maintenance professionals and other technical staff regularly need to learn to identify new parts in car engines and other equipment. The present work proposes a task-assistant model based on a deep-learning neural network. A YOLOv5 network is used for recognizing some of the constituent parts of an automobile. A dataset of car-engine images was created, and eight car parts were marked in the images. The neural network was then trained to detect each part. The results show that YOLOv5s is able to detect the parts in real-time video streams with high accuracy, making it useful as an aid for training professionals to work with new equipment through augmented reality. The architecture of an object recognition system using augmented-reality glasses is also designed.
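Running the trained detector requires the YOLOv5 runtime itself, but the post-processing such an assistant needs, turning raw detections into named annotations for the AR overlay, can be sketched independently. The eight part names below are hypothetical placeholders, since the abstract does not list the actual label set:

```python
# Hypothetical class names; the paper's actual eight labels are not given.
PART_NAMES = ["battery", "radiator", "alternator", "dipstick",
              "oil_cap", "fuse_box", "air_filter", "coolant_cap"]

def label_detections(detections, conf_threshold=0.5):
    """Turn raw (class_id, confidence, box) tuples, as produced by a YOLOv5
    detector, into named annotations for the AR overlay, dropping weak hits.

    detections : iterable of (class_id, confidence, (x1, y1, x2, y2))
    returns    : list of (part_name, confidence, box), highest confidence first
    """
    kept = [(PART_NAMES[c], conf, box)
            for c, conf, box in detections if conf >= conf_threshold]
    return sorted(kept, key=lambda d: -d[1])
```

Thresholding before display matters in this setting: a spurious low-confidence label floating over the wrong engine part would mislead exactly the novice users the assistant is meant to train.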


2020 ◽  
Vol 14 (2) ◽  
pp. 167-175
Author(s):  
Li Zhang ◽  
Volker Schwieger

The investigations on low-cost single-frequency GNSS receivers at the Institute of Engineering Geodesy (IIGS) show that u-blox GNSS receivers combined with low-cost antennas and self-constructed L1-optimized choke rings can reach an accuracy which almost meets the requirements of geodetic applications (see Zhang and Schwieger [25]). However, the quality (accuracy and reliability) of low-cost GNSS receiver data should still be improved, particularly in environments with obstructions. Multipath effects are a major error source for short baselines. A ground plate or a choke-ring ground plane can reduce multipath signals from horizontal reflectors (e.g. the ground), but such shielding cannot reduce multipath signals from vertical reflectors (e.g. walls). Because multipath effects are spatially and temporally correlated, an algorithm was developed for reducing the multipath effect by considering the spatial correlations of adjoining stations (see Zhang and Schwieger [24]). In this paper, an algorithm based on the temporal correlations is introduced. The developed algorithm operates on the periodic behaviour of the estimated coordinates rather than on carrier-phase raw data, which makes it easy to use, since coordinates are more accessible to users than raw data. The multipath effect causes periodic oscillations whose periods change over time; moreover, its influence on the coordinates is a mixture of multipath signals from different satellites and different reflectors. These two properties are used to reduce the multipath effect. The algorithm runs in two steps and iteratively. Test measurements were carried out in a multipath-intensive environment; the accuracies of the measurements were improved by about 50 % and the results can be delivered in near-real-time (in ca. 30 minutes), making the algorithm suitable for structural-health-monitoring applications.
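The published algorithm is a two-step iterative procedure on estimated coordinates; the toy sketch below illustrates only the underlying idea, suppressing a dominant periodic oscillation in a coordinate time series, and is not the authors' method:

```python
import numpy as np

def remove_dominant_oscillation(coords, n_components=1):
    """Suppress periodic multipath-like oscillation in a coordinate time
    series by zeroing its strongest non-DC Fourier components.

    coords       : 1-D array of coordinate estimates at a fixed sampling rate
    n_components : number of dominant oscillations to remove
    """
    spec = np.fft.rfft(coords)
    power = np.abs(spec)
    power[0] = 0.0                       # keep the mean (DC) untouched
    for _ in range(n_components):
        k = int(np.argmax(power))
        spec[k] = 0.0                    # remove the dominant oscillation
        power[k] = 0.0
    return np.fft.irfft(spec, n=len(coords))
```

A fixed-frequency filter like this would fail on real data precisely because, as the abstract notes, the multipath periods drift over time and several reflectors mix, which is why the published algorithm tracks the periodic behaviour iteratively instead.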


2019 ◽  
Vol 35 (05) ◽  
pp. 525-533
Author(s):  
Evrim Gülbetekin ◽  
Seda Bayraktar ◽  
Özlenen Özkan ◽  
Hilmi Uysal ◽  
Ömer Özkan

The authors tested face discrimination, face recognition, object discrimination, and object recognition in two face transplantation patients (FTPs) who had had facial injuries since infancy, a patient who had undergone facial surgery for a recent wound, and two control subjects. In Experiment 1, the authors showed them original faces and morphed forms of those faces and asked them to rate the similarity between the two. In Experiment 2, they showed old, new, and implicit faces and asked whether the participants recognized them or not. In Experiment 3, they showed them original objects and morphed forms of those objects and asked them to rate the similarity between the two. In Experiment 4, they showed old, new, and implicit objects and asked whether the participants recognized them or not. Object discrimination and object recognition performance did not differ between the FTPs and the controls. However, the face discrimination performance of FTP2 and the face recognition performance of FTP1 were poorer than those of the controls. The authors therefore concluded that the structure of the face might affect face processing.

