Evidence for Good Recovery of Lengths of Real Objects Seen with Natural Stereo Viewing

Perception ◽  
1996 ◽  
Vol 25 (1_suppl) ◽  
pp. 51-51
Author(s):  
J P Frisby ◽  
D Buckley ◽  
P A Duke

How good is human size constancy for real objects seen with natural stereo viewing, which minimises the opportunity for monocular size cues to play a role? This question has attracted renewed interest in recent years, arising mainly from the work of Todd and his colleagues. They have argued, initially from experiments in which stereograms were used, but more recently from studies based on real scenes, that poor performance on length judgment tasks suggests that human vision is weak at computing metric representations. At ECVP '95, we described several experiments demonstrating quite good performance on the task of matching the lengths of two stationary real objects, gnarled wooden sticks, under binocular viewing with head held fixed (1995 Perception 24 Supplement, 129). We now report extensions to that work aimed at checking whether this good performance is maintained over three viewing distances (79, 158, and 355 cm), and when test and matching sticks are of different thicknesses. Matching performance was measured with a variety of indices: reliability, accuracy, Weber fraction, and absolute error. Relatively poor performance was observed when the sticks were viewed monocularly at the near and far distances but binocular viewing produced good performance at all distances. These results suggest that stereo can support good representations of metric scene structure when length judgments of natural objects are required under (quasi-)natural viewing. The implications of these results for theories of structure-from-stereo are discussed, and reasons are suggested why our results might differ from those of Todd and his colleagues.

Perception ◽  
1996 ◽  
Vol 25 (2) ◽  
pp. 129-154 ◽  
Author(s):  
John P Frisby ◽  
David Buckley ◽  
Philip A Duke

Six experiments are described in which good performance on the task of matching the lengths of two stationary real objects, gnarled wooden sticks, was demonstrated under a variety of binocular viewing conditions, including variations in viewing distance. Relatively poor matching performance was observed when the sticks were viewed monocularly in oscillatory motion, or monocularly and stationary. The results suggest that stereo can support good representations of metric scene structure when length judgments of natural objects are required under (quasi-)natural viewing. The implications of these results for theories of structure from stereo and structure from motion are discussed.


Perception ◽  
1989 ◽  
Vol 18 (4) ◽  
pp. 457-469 ◽  
Author(s):  
John Duncan

A new theory of visual search is tested experimentally with simple colour patches. The essential element of this new theory is that, whatever the search materials, efficiency increases continuously with (i) decreasing similarity between targets and nontargets, and (ii) increasing similarity between one nontarget and another. Control of ‘attention’ (access to visual short-term memory) is seen as a competitive interaction between display elements, and the theory shows how stimulus similarities influence the outcome of this competition. One alternative view is that parallel visual processes are limited to local mismatch detection. Search is parallel if the target forms a break in an otherwise homogeneous field, but is serial when absolute stimulus identification is required. It is shown, however, that even colour identification can be parallel, provided targets and nontargets are sufficiently dissimilar. A second alternative view is that search for simple features is parallel whereas search for conjunctions is serial. Conjunction search, however, has a characteristic similarity structure: different kinds of nontarget each share one relevant attribute with the target, but none with one another. When this structure is mimicked in search for colour patches, correspondingly poor performance is obtained.


2021 ◽  
Author(s):  
Lev Kiar Avberšek ◽  
Astrid Zeman ◽  
Hans P. Op de Beeck

The ontogenetic development of human vision, and the real-time neural processing of visual input, both exhibit a striking similarity: a sensitivity towards spatial frequencies that progresses in a coarse-to-fine manner. During early human development, sensitivity for higher spatial frequencies increases with age. In adulthood, when humans receive new visual input, low spatial frequencies are typically processed first before subsequently guiding the processing of higher spatial frequencies. We investigated to what extent this coarse-to-fine progression might impact visual representations in artificial vision and compared this to adult human representations. We simulated the coarse-to-fine progression of image processing in deep convolutional neural networks (CNNs) by gradually increasing spatial frequency information during training. We compared CNN performance, after standard and coarse-to-fine training, with a wide range of datasets from behavioural and neuroimaging experiments. In contrast to humans, CNNs that are trained using the standard protocol are very insensitive to low spatial frequency information, showing very poor performance when classifying such object images. By training CNNs using our coarse-to-fine method, we improved the classification accuracy of CNNs from 0% to 32% on low-pass filtered images taken from the ImageNet dataset. When comparing differently trained networks on images containing full spatial frequency information, we saw no representational differences. Overall, this integration of computational, neural, and behavioural findings shows the relevance of the exposure to and processing of input with a variation in spatial frequency content for some aspects of high-level object representations.
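The coarse-to-fine training regime described above amounts to low-pass filtering the training images with a cutoff that rises over training. A minimal sketch of this idea, assuming an FFT-based circular low-pass filter and a linear cutoff schedule (both illustrative choices, not the authors' exact protocol or parameters):

```python
import numpy as np

def low_pass(image, cutoff):
    """Keep only spatial frequencies within `cutoff` cycles/image (FFT masking)."""
    f = np.fft.fftshift(np.fft.fft2(image, axes=(0, 1)), axes=(0, 1))
    h, w = image.shape[:2]
    yy, xx = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    mask = np.hypot(yy, xx) <= cutoff  # circular mask centred on the DC component
    if image.ndim == 3:                # broadcast over colour channels
        mask = mask[:, :, None]
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask, axes=(0, 1)),
                                axes=(0, 1)))

def cutoff_for_epoch(epoch, n_epochs, max_cutoff):
    """Illustrative schedule: raise the cutoff linearly as training proceeds."""
    return max_cutoff * (epoch + 1) / n_epochs
```

At each epoch the network would then be trained on `low_pass(img, cutoff_for_epoch(...))`, so early epochs see only coarse structure and later epochs the full spectrum.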


2014 ◽  
Vol 496-500 ◽  
pp. 1869-1872
Author(s):  
Ye Tian ◽  
Zhen Wei Wang ◽  
Feng Chen

Human vision is generally regarded as a complicated process from sensation to consciousness. In other words, it involves both a projection from a 3-D object to a 2-D image and the cognition of real objects from 2-D images. The process of modelling a real object from a set of images is called 3-D reconstruction. Presently, camera calibration attracts many researchers; it comprises the internal parameters and the external parameters, such as the coordinates of the principal point and the parameters of rotation and translation. Some researchers have pointed out that the parallelepiped has a strict topological structure and geometric constraints, and is therefore suitable for camera self-calibration. This paper briefly explains parallelepiped methods and applies them to self-calibration. The experiments show that this method is flexible and effective.


Author(s):  
Simon Denman ◽  
Frank Lin ◽  
Vinod Chandran ◽  
Sridha Sridharan ◽  
Clinton Fookes

The time consuming and labour intensive task of identifying individuals in surveillance video is often challenged by poor resolution and the sheer volume of stored video. Faces or identifying marks such as tattoos are often too coarse for direct matching by machine or human vision. Object tracking and super-resolution can then be combined to facilitate the automated detection and enhancement of areas of interest. The object tracking process enables the automatic detection of people of interest, greatly reducing the amount of data for super-resolution. Smaller regions such as faces can also be tracked. A number of instances of such regions can then be utilized to obtain a super-resolved version for matching. Performance improvement from super-resolution is demonstrated using a face verification task. It is shown that there is a consistent improvement of approximately 7% in verification accuracy, using both Eigenface and Elastic Bunch Graph Matching approaches for automatic face verification, starting from faces with an eye to eye distance of 14 pixels. Visual improvement in image fidelity from super-resolved images over low-resolution and interpolated images is demonstrated on a small database. Current research and future directions in this area are also summarized.


2019 ◽  
Author(s):  
Daniel Kaiser ◽  
Greta Häberle ◽  
Radoslaw M. Cichy

Natural scenes are inherently structured, with meaningful objects appearing in predictable locations. Human vision is tuned to this structure: When scene structure is purposefully jumbled, perception is strongly impaired. Here, we tested how such perceptual effects are reflected in neural sensitivity to scene structure. During separate fMRI and EEG experiments, participants passively viewed scenes whose spatial structure (i.e., the position of scene parts) and categorical structure (i.e., the content of scene parts) could be intact or jumbled. Using multivariate decoding, we show that spatial (but not categorical) scene structure profoundly impacts cortical processing: Scene-selective responses in occipital and parahippocampal cortices (fMRI) and after 255 ms (EEG) accurately differentiated between spatially intact and jumbled scenes. Importantly, this differentiation was more pronounced for upright than for inverted scenes, indicating genuine sensitivity to spatial structure rather than sensitivity to low-level attributes. Our findings suggest that visual scene analysis is tightly linked to the spatial structure of our natural environments. This link between cortical processing and scene structure may be crucial for rapidly parsing naturalistic visual inputs.


Perception ◽  
10.1068/p3397 ◽  
2003 ◽  
Vol 32 (6) ◽  
pp. 699-706 ◽  
Author(s):  
Alexander Sokolov ◽  
Marina Pavlova

By varying target size, speed, and extent of visible motion we examined the timing accuracy in motion extrapolation. Small or large targets (0.2 or 0.8 deg) moved at either 2.5, 5, or 10 deg s⁻¹ across a horizontal path (2.5 or 10 deg) and then vanished behind an occluder. Observers responded when they judged that the target had reached a randomly specified position between 0 and 12 deg. With higher speeds, the timing accuracy (the inverse of absolute error) was better for small than for large targets, and for long than for short visible extents. With low speed, these effects were reversed. In addition, while long visible extents yielded a greater accuracy at high than at low speeds, for short extents the accuracy was much better with the low speed. The findings suggest that, when extrapolating motion with targets and visible extents of different sizes, the visual system implements different scaling algorithms depending on target speed. At higher speeds, processing of visible and occluded motion is likely to share a common scaling mechanism based on velocity transposition. Reverse effects for target size and extent of visible motion at low and high speeds converge with the assumption of two distinct speed-tuned motion-processing mechanisms in human vision.


2017 ◽  
Vol 4 (6) ◽  
pp. 161011 ◽  
Author(s):  
Stefano Ghirlanda ◽  
Johan Lind ◽  
Magnus Enquist

Humans stand out among animals for their unique capacities in domains such as language, culture and imitation, yet it has been difficult to identify cognitive elements that are specifically human. Most research has focused on how information is processed after it is acquired, e.g. in problem solving or ‘insight’ tasks, but we may also look for species differences in the initial acquisition and coding of information. Here, we show that non-human species have only a limited capacity to discriminate ordered sequences of stimuli. Collating data from 108 experiments on stimulus sequence discrimination (1540 data points from 14 bird and mammal species), we demonstrate pervasive and systematic errors, such as confusing a red–green sequence of lights with green–red and green–green sequences. These errors can persist after thousands of learning trials in tasks that humans learn to near perfection within tens of trials. To elucidate the causes of such poor performance, we formulate and test a mathematical model of non-human sequence discrimination, assuming that animals represent sequences as unstructured collections of memory traces. This representation carries only approximate information about stimulus duration, recency, order and frequency, yet our model predicts non-human performance with a 5.9% mean absolute error across 68 datasets. Because human-level cognition requires more accurate encoding of sequential information than afforded by memory traces, we conclude that improved coding of sequential information is a key cognitive element that may set humans apart from other animals.
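The memory-trace account sketched above can be illustrated with a toy model in which each stimulus leaves a trace that decays exponentially, so a sequence is represented only as an unstructured bag of trace strengths at decision time. The decay rate and the uniform one-step timing below are illustrative assumptions, not the fitted model from the paper:

```python
import math

def trace_representation(sequence, decay=0.5, step=1.0):
    """Bag-of-traces code for a stimulus sequence.

    `sequence` lists stimulus labels presented one per time step. The result
    maps each label to its summed trace strength at the end of the sequence;
    order survives only indirectly, via differential decay with recency.
    """
    traces = {}
    n = len(sequence)
    for i, stim in enumerate(sequence):
        age = (n - 1 - i) * step  # time elapsed since this presentation
        traces[stim] = traces.get(stim, 0.0) + math.exp(-decay * age)
    return traces
```

Under this code, a red–green sequence and a green–red sequence produce mirror-image trace profiles that overlap heavily, which is the kind of confusion the collated animal data show.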


2016 ◽  
Vol 833 ◽  
pp. 157-163 ◽  
Author(s):  
O.C.K. Jason ◽  
S.M.W. Masra ◽  
Mohd Saufee Muhammad

Satellite imaging consists of capturing images of the Earth through a series of artificial satellites. These images contain an abundance of information that can be used in several applications such as fishing, agriculture, regional planning, biodiversity conservation and many others. Digital image processing can help overcome the limitations of human vision by extracting key information from these images at a much higher rate through the speed of automation. This paper aims to achieve that by exploiting the potential of the Particle Swarm Optimization (PSO) algorithm in image segmentation. Various satellite images were segmented using the PSO algorithm before a trace of the objects isolated in each image was run to evaluate the accuracy of segmentation. Three objective measurements, Peak Signal to Noise Ratio (PSNR), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), were made on the outputs of segmentation using the PSO algorithm and the traditional segmentation technique, Otsu's method, for comparison. The proposed method, which applies the PSO algorithm, proved superior in producing images of higher quality and accuracy compared to the traditional segmentation technique.
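The three objective measures named above (PSNR, RMSE, MAE) are standard image-quality metrics. A minimal sketch of their computation, assuming 8-bit reference and test images held as equal-shaped NumPy arrays:

```python
import numpy as np

def rmse(a, b):
    """Root Mean Square Error between two images of equal shape."""
    return float(np.sqrt(np.mean((a.astype(float) - b.astype(float)) ** 2)))

def mae(a, b):
    """Mean Absolute Error between two images of equal shape."""
    return float(np.mean(np.abs(a.astype(float) - b.astype(float))))

def psnr(a, b, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB; higher means closer to the reference."""
    e = rmse(a, b)
    return float("inf") if e == 0 else 20.0 * np.log10(peak / e)
```

A higher PSNR and lower RMSE/MAE against a ground-truth segmentation would favour one method over the other, which is how the paper's comparison against Otsu's method is framed.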


1973 ◽  
Vol 16 (2) ◽  
pp. 257-266 ◽  
Author(s):  
Milo E. Bishop ◽  
Robert L. Ringel ◽  
Arthur S. House

The oral form-discrimination abilities of 18 orally educated and oriented deaf high school subjects were determined and compared to those of manually educated and oriented deaf subjects and normal-hearing subjects. The similarities and differences among the responses of the three groups were discussed and then compared to responses elicited from subjects with functional disorders of articulation. In general, the discrimination scores separated the manual deaf from the other two groups, particularly when differences in form shapes were involved in the test. The implications of the results for theories relating orosensory-discrimination abilities are discussed. It is postulated that, while a failure in oroperceptual functioning may lead to disorders of articulation, a failure to use the oral mechanism for speech activities, even in persons with normal orosensory capabilities, may result in poor performance on oroperceptual tasks.

