Cross-task perceptual learning of object recognition in simulated retinal implant perception

2018
Author(s):
Lihui Wang
Fariba Sharifian
Jonathan Napp
Carola Nath
Stefan Pollmann

Abstract
The perception gained through retinal implants (RI) is limited, which calls for a training regime to improve patients' visual perception. Here we simulated RI vision and investigated whether object recognition in RI patients can be improved and maintained through training. Importantly, we asked whether the trained object recognition generalizes to a new task context and to new viewpoints of the trained objects. For this purpose, we adopted two training tasks: a naming task, in which participants had to choose the correct label for the presented object from among distractor labels, and a discrimination task, in which participants had to choose the correct object matching the presented label from among distractor objects. Our results showed that, regardless of task order, recognition performance improved in both tasks and the improvement lasted for at least a week. The improved object recognition, however, transferred only from the naming task to the discrimination task, not vice versa. Additionally, the trained object recognition transferred to new viewpoints of the trained objects only in the naming task, not in the discrimination task. Training with the naming task is therefore recommended for RI patients to achieve persistent and flexible visual perception.
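Simulated retinal-implant vision of the kind studied here is commonly rendered by reducing an image to a coarse grid of phosphenes with few distinguishable grey levels. A minimal sketch of such a simulation; the grid size and quantisation levels are illustrative assumptions, not the authors' exact parameters:

```python
def simulate_phosphenes(image, grid=4, levels=8):
    """Average-pool a grayscale image (list of lists, values 0-255) into
    (grid x grid) blocks and quantise each block to a few grey levels,
    mimicking the coarse percept of simulated retinal-implant vision."""
    h, w = len(image), len(image[0])
    bh, bw = h // grid, w // grid
    out = []
    for gy in range(grid):
        row = []
        for gx in range(grid):
            block = [image[y][x]
                     for y in range(gy * bh, (gy + 1) * bh)
                     for x in range(gx * bw, (gx + 1) * bw)]
            mean = sum(block) / len(block)
            step = 256 / levels
            row.append(int(mean // step) * int(step))  # quantised grey level
        out.append(row)
    return out
```

Training studies like the one above would present objects to participants through a renderer of roughly this kind.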

2019
Vol 35 (05)
pp. 525-533
Author(s):
Evrim Gülbetekin
Seda Bayraktar
Özlenen Özkan
Hilmi Uysal
Ömer Özkan

Abstract
The authors tested face discrimination, face recognition, object discrimination, and object recognition in two face transplantation patients (FTPs) who had had facial injuries since infancy, a patient who had undergone facial surgery for a recent wound, and two control subjects. In Experiment 1, the authors showed them original faces and morphed forms of those faces and asked them to rate the similarity between the two. In Experiment 2, they showed old, new, and implicit faces and asked whether the participants recognized them. In Experiment 3, they showed original objects and morphed forms of those objects and asked them to rate the similarity between the two. In Experiment 4, they showed old, new, and implicit objects and asked whether the participants recognized them. Object discrimination and object recognition performance did not differ between the FTPs and the controls. However, the face discrimination performance of FTP2 and the face recognition performance of FTP1 were poorer than those of the controls. Therefore, the authors concluded that the structure of the face might affect face processing.


Author(s):  
Abd El Rahman Shabayek
Olivier Morel
David Fofi

For a long time, it was thought that the sensing of polarization by animals was invariably related to behaviors such as navigation and orientation. Recently, it was found that polarization can be part of high-level visual perception, enabling a wide range of vision applications. Polarization vision can be used for most tasks of color vision, including object recognition, contrast enhancement, camouflage breaking, and signal detection and discrimination. The polarization-based visual behavior found in the animal kingdom is briefly covered. Then, the authors go in depth into bio-inspired applications based on polarization in computer vision and robotics. The aim is to provide a comprehensive survey highlighting the key principles of polarization-based techniques and how they are biologically inspired.
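As an illustration of the kind of cue such polarization-vision systems exploit, the degree and angle of linear polarization can be recovered from three intensity measurements taken through a linear polarizer at 0°, 45°, and 90°. This is the standard Stokes-parameter computation, shown here as a generic sketch rather than a method from the survey itself:

```python
import math

def linear_polarization(i0, i45, i90):
    """Per-pixel degree (DoLP, 0..1) and angle (AoP, radians) of linear
    polarization from intensities behind a polarizer at 0/45/90 degrees."""
    s0 = i0 + i90          # total intensity
    s1 = i0 - i90          # preference for 0 vs 90 degrees
    s2 = 2 * i45 - s0      # preference for 45 vs 135 degrees
    dolp = math.hypot(s1, s2) / s0 if s0 else 0.0
    aop = 0.5 * math.atan2(s2, s1)
    return dolp, aop
```

High DoLP regions (e.g. specular reflections off water or glass) are exactly the regions where polarization gives contrast that plain intensity imaging misses.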


1995
Vol 73 (4)
pp. 1341-1354
Author(s):
G. Sary
R. Vogels
G. Kovacs
G. A. Orban

1. We recorded from neurons responsive to gratings in the inferior temporal (IT) cortices of macaque monkeys. One of the monkeys performed an orientation discrimination task; the other maintained fixation during stimulus presentation. Stimuli consisted of gratings based on discontinuities in luminance, relative motion, and texture. 2. IT cells responded well to gratings defined solely by relative motion, implying either direct or indirect motion input into IT, an area that is part of the ventral visual cortical pathway. 3. Response strength in general did not depend on the cue used to define the gratings. Latency values observed for the two static grating types (luminance- and texture-defined gratings) were similar, but significantly shorter than those measured for the kinetic gratings. 4. Stimulus orientation had a significant effect in 27%, 27%, and 9% of the cells tested with luminance-, kinetic-, and texture-defined gratings, respectively. 5. Only a small proportion of cells were orientation sensitive for more than one defining cue. The average preferred orientation for luminance and kinetic gratings matched; the tuning width was similar for the two cues. 6. Our results indicate that IT cells may contribute to cue-invariant coding of boundaries and edges. We discuss the relevance of these results to visual perception.


2015
Vol 27 (9)
pp. 1708-1722
Author(s):
Edward B. O'Neil
Hilary C. Watson
Sonya Dhillon
Nancy J. Lobaugh
Andy C. H. Lee

Recent work has demonstrated that the perirhinal cortex (PRC) supports conjunctive object representations that aid object recognition memory following visual object interference. It is unclear, however, how these representations interact with other brain regions implicated in mnemonic retrieval and how congruent and incongruent interference influences the processing of targets and foils during object recognition. To address this, multivariate partial least squares was applied to fMRI data acquired during an interference match-to-sample task, in which participants made object or scene recognition judgments after object or scene interference. This revealed a pattern of activity sensitive to object recognition following congruent (i.e., object) interference that included PRC, prefrontal, and parietal regions. Moreover, functional connectivity analysis revealed a common pattern of PRC connectivity across interference and recognition conditions. Examination of eye movements during the same task in a separate study revealed that participants gazed more at targets than foils during correct object recognition decisions, regardless of interference congruency. By contrast, participants viewed foils more than targets for incorrect object memory judgments, but only after congruent interference. Our findings suggest that congruent interference makes object foils appear familiar and that a network of regions, including PRC, is recruited to overcome the effects of interference.


Perception
1997
Vol 26 (1_suppl)
pp. 33-33
Author(s):
G M Wallis
H H Bülthoff

The view-based approach to object recognition supposes that objects are stored as a series of associated views. Although representation of these views as combinations of 2-D features allows generalisation to similar views, it remains unclear how very different views might be associated together to allow recognition from any viewpoint. One cue present in the real world, other than spatial similarity, is that we usually experience different objects in a temporally constrained, coherent order, and not as randomly ordered snapshots. In a series of recent neural-network simulations, Wallis and Baddeley (1997 Neural Computation 9 883–894) describe how the association of views on the basis of temporal as well as spatial correlations is both theoretically advantageous and biologically plausible. We describe an experiment aimed at testing their hypothesis in human object-recognition learning. We investigated recognition performance for faces previously presented in sequences. These sequences consisted of five views of five different people's faces, presented in orderly sequence from left to right profile in 45° steps. According to the temporal-association hypothesis, the visual system should associate the images together and represent them as different views of the same person's face, although in truth they are images of different people's faces. In a same/different task, subjects were asked to say whether two faces seen from different viewpoints were views of the same person or not. In accordance with the theory, discrimination errors increased for faces seen earlier in the same sequence compared with faces which were not (p < 0.05).
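Temporal association of views is typically modelled with a trace learning rule, in which each weight update is driven by a decaying average of recent activity rather than by the current input alone. The sketch below is a deliberately simplified, hypothetical version (the output unit's activity is clamped to 1 while its sequence plays, and the parameter values are illustrative, not taken from Wallis and Baddeley's model):

```python
def trace_learn(views, eta=0.8, lr=0.1):
    """Hebbian learning with a memory trace:
        trace(t) = (1 - eta) * y(t) + eta * trace(t - 1)
    Because the trace carries activity across time steps, temporally
    adjacent views (e.g. a face rotating in 45-degree steps) all get
    bound to the same output unit even when they look very different.
    `views` is a list of feature vectors; returns the learned weights."""
    w = [0.0] * len(views[0])
    trace = 0.0
    for v in views:
        y = 1.0                              # unit clamped active for this sequence
        trace = (1 - eta) * y + eta * trace  # decaying activity trace
        w = [wi + lr * trace * vi for wi, vi in zip(w, v)]
    return w
```

With two orthogonal "views", both acquire positive weight on the same unit, which is the mechanism the experiment above probes: views shown in one sequence become harder to tell apart.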


Author(s):  
Michael S. Brickner
Amir Zvuloni

Thermal imaging (TI) systems transform the distribution of relative temperatures in a scene into a visible TV image. TI images differ significantly from regular TV images. Most TI systems allow their operators to select a preferred polarity, which determines the way in which gray shades represent different temperatures. Polarity may be set to either black hot (BH) or white hot (WH). The present experiments were designed to investigate the effects of polarity on object recognition performance in TI and to compare the object recognition performance of experts and novices. In the first experiment, twenty flight candidates were asked to recognize target objects in 60 dynamic TI recordings taken from two different TI systems. The targets included a variety of human-placed and natural objects. Each subject viewed half the targets in BH and the other half in WH polarity in a balanced experimental design. For 24 of the 60 targets, one direction of polarity produced better performance than the other. Although the direction of the superior polarity (BH or WH) was not consistent, the preferred representation of the target object was very consistent. For example, vegetation was more readily recognized when presented as dark objects on a brighter background. The results are discussed in terms of the importance of surface determinants versus edge determinants in the recognition of TI objects. In the second experiment, the performance of 10 expert TI users was found to be significantly more accurate, but not much faster, than that of 20 novice subjects.
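Switching a thermal frame between white-hot and black-hot polarity is, at the pixel level, just a grey-level inversion. A minimal sketch, assuming 8-bit frames (the actual systems in the study are not specified at this level):

```python
def invert_polarity(frame):
    """Convert between white-hot and black-hot thermal polarity by
    inverting 8-bit grey levels: warm regions that rendered bright
    now render dark, and vice versa. `frame` is a list of rows."""
    return [[255 - px for px in row] for row in frame]
```

Applying the inversion twice recovers the original frame, which is why operators can freely toggle polarity per target, as in the balanced design above.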


2021
Vol 7 (4)
pp. 65
Author(s):
Daniel Silva
Armando Sousa
Valter Costa

Object recognition represents the ability of a system to identify objects, humans or animals in images. Within this domain, this work presents a comparative analysis among different classification methods aiming at Tactode tile recognition. The covered methods include: (i) machine learning with HOG and SVM; (ii) deep learning with CNNs such as VGG16, VGG19, ResNet152, MobileNetV2, SSD and YOLOv4; (iii) matching of handcrafted features with SIFT, SURF, BRISK and ORB; and (iv) template matching. A dataset was created to train learning-based methods (i and ii), and with respect to the other methods (iii and iv), a template dataset was used. To evaluate the performance of the recognition methods, two test datasets were built: tactode_small and tactode_big, which consisted of 288 and 12,000 images, holding 2784 and 96,000 regions of interest for classification, respectively. SSD and YOLOv4 were the worst methods for their domain, whereas ResNet152 and MobileNetV2 showed that they were strong recognition methods. SURF, ORB and BRISK demonstrated great recognition performance, while SIFT was the worst of this type of method. The methods based on template matching attained reasonable recognition results, falling behind most other methods. The top three methods of this study were: VGG16 with an accuracy of 99.96% and 99.95% for tactode_small and tactode_big, respectively; VGG19 with an accuracy of 99.96% and 99.68% for the same datasets; and HOG and SVM, which reached an accuracy of 99.93% for tactode_small and 99.86% for tactode_big, while at the same time presenting average execution times of 0.323 s and 0.232 s on the respective datasets, being the fastest method overall. This work demonstrated that VGG16 was the best choice for this case study, since it minimised the misclassifications for both test datasets.
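Of the compared families, template matching (iv) is the simplest to sketch: score each region of interest against one template per class and return the best-scoring label. The toy version below uses a sum-of-squared-differences score on equal-sized grey-level patches; the paper's actual matching score and preprocessing are not specified here:

```python
def classify_by_template(roi, templates):
    """Return the label whose template is closest to the ROI under the
    sum-of-squared-differences score (lower is better). `roi` and every
    template are equal-sized 2-D grey-level arrays (lists of lists);
    `templates` maps label -> template image."""
    def ssd(a, b):
        return sum((pa - pb) ** 2
                   for ra, rb in zip(a, b)
                   for pa, pb in zip(ra, rb))
    return min(templates, key=lambda label: ssd(roi, templates[label]))
```

Its simplicity is also its weakness: SSD is sensitive to lighting, scale, and alignment, which is consistent with template matching trailing the learned methods in the comparison above.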


Author(s):  
Billy Peralta
Luis Alberto Caro

Generic object recognition algorithms usually require complex classification models because of intrinsic difficulties arising from problems such as changes in pose, lighting conditions, or partial occlusions. Decision trees present an inexpensive alternative for classification tasks and offer the advantage of being simple to understand. On the other hand, a common scheme for object recognition is given by the appearances of visual words, also known as the bag-of-words method. Although multiple co-occurrences of visual words are more informative regarding visual classes, a comprehensive evaluation of such combinations is unfeasible because it would result in a combinatorial explosion. In this paper, we propose to obtain the multiple co-occurrences of visual words using a variant of the CLIQUE subspace-clustering algorithm, improving the object recognition performance of simple decision trees. Experiments on standard object datasets show that our method improves the accuracy of classification of generic objects in comparison with traditional decision tree techniques, and is similar, in terms of accuracy, to ensemble techniques. In the future, we plan to evaluate other variants of decision trees and to apply other subspace-clustering algorithms.
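The co-occurrence features at the heart of this approach can be illustrated directly: given each image's bag of visual-word IDs, count which word pairs appear together across images and keep the most frequent pairs as extra features for the tree. This simplified sketch uses a global top-k cutoff, whereas the actual method prunes the combinatorial pair space with a CLIQUE-style subspace search:

```python
from collections import Counter
from itertools import combinations

def top_cooccurring_pairs(bags, k=2):
    """Count unordered pairs of distinct visual words occurring in the
    same image and return the k most frequent pairs; each selected pair
    can then serve as a binary feature for a decision tree."""
    counts = Counter()
    for bag in bags:
        # sorted(set(...)) makes each pair appear once per image,
        # in a canonical order, regardless of word repetitions.
        counts.update(combinations(sorted(set(bag)), 2))
    return [pair for pair, _ in counts.most_common(k)]
```

Keeping only frequent pairs is what sidesteps the combinatorial explosion mentioned above: the tree never sees the full space of word combinations.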

