Natural scene statistics predict how humans pool information across space in the estimation of surface tilt

2019 ◽  
Author(s):  
Seha Kim ◽  
Johannes Burge

Abstract: Visual systems estimate the three-dimensional (3D) structure of scenes from information in two-dimensional (2D) retinal images. Visual systems use multiple sources of information to improve the accuracy of these estimates, including statistical knowledge of the probable spatial arrangements of natural scenes. Here, we examine how 3D surface tilts are spatially related in real-world scenes, and show that humans pool information across space when estimating surface tilt in accordance with these spatial relationships. We develop a hierarchical model of surface tilt estimation that is grounded in the statistics of tilt in natural scenes and images. The model computes a global tilt estimate by pooling local tilt estimates within an adaptive spatial neighborhood. The spatial neighborhood in which local estimates are pooled changes according to the value of the local estimate at a target location. The hierarchical model provides more accurate estimates of ground-truth tilt in natural scenes, and a better account of human performance, than a purely local model. Taken together, the results imply that the human visual system pools information about surface tilt across space in accordance with natural scene statistics.

2015 ◽  
Vol 15 (12) ◽  
pp. 726 ◽  
Author(s):  
Wendy Adams ◽  
James Elder ◽  
Erich Graf ◽  
Alex Muryy ◽  
Arthur Lugtigheid

2009 ◽  
Vol 26 (1) ◽  
pp. 35-49 ◽  
Author(s):  
THORSTEN HANSEN ◽  
KARL R. GEGENFURTNER

Abstract: Form vision is traditionally regarded as processing primarily achromatic information. Previous investigations into the statistics of color and luminance in natural scenes have claimed that luminance and chromatic edges are not independent of each other and that any chromatic edge most likely occurs together with a luminance edge of similar strength. Here we computed the joint statistics of luminance and chromatic edges in over 700 calibrated color images from natural scenes. We found that isoluminant edges exist in natural scenes and were not rarer than pure luminance edges. Most edges combined luminance and chromatic information but to varying degrees such that luminance and chromatic edges were statistically independent of each other. Independence increased along successive stages of visual processing from cones via postreceptoral color-opponent channels to edges. The results show that chromatic edge contrast is an independent source of information that can be linearly combined with other cues for the proper segmentation of objects in natural and artificial vision systems. Color vision may have evolved in response to the natural scene statistics to gain access to this independent information.
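A minimal sketch of this kind of joint edge statistic: compute edge strength separately on a luminance channel and on a red-green opponent channel, then correlate the two across pixels. The channel weights below are rough illustrative proxies, not the paper's calibrated cone-based definitions, and correlation is only a crude stand-in for a full independence analysis.

```python
import math

def edge_strength(channel):
    """Finite-difference edge magnitude for a 2D list of floats."""
    rows, cols = len(channel), len(channel[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            gx = channel[r][min(c + 1, cols - 1)] - channel[r][c]
            gy = channel[min(r + 1, rows - 1)][c] - channel[r][c]
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def lum_chrom_edge_correlation(r, g, b):
    """Correlation between luminance and red-green edge strengths
    (illustrative channel weights, not calibrated cone responses)."""
    rows, cols = len(r), len(r[0])
    lum = [[0.3 * r[i][j] + 0.6 * g[i][j] + 0.1 * b[i][j]
            for j in range(cols)] for i in range(rows)]
    rg = [[r[i][j] - g[i][j] for j in range(cols)] for i in range(rows)]
    e_lum = [v for row in edge_strength(lum) for v in row]
    e_rg = [v for row in edge_strength(rg) for v in row]
    return pearson(e_lum, e_rg)
```

On images where luminance and chromatic edges coincide spatially, the statistic approaches 1; statistical independence, as reported above, would push it toward 0 over a large image ensemble.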


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Seha Kim ◽  
Johannes Burge

Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, using a database of natural stereo images with ground-truth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world.


2017 ◽  
Author(s):  
Seha Kim ◽  
Johannes Burge

Abstract: Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment, but it is unknown how well humans perform this task in natural scenes. Here, with a high-fidelity database of natural stereo images with ground-truth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. With artificial stimuli, estimates are precise and unbiased. With natural stimuli, estimates are imprecise and strongly biased. An image-computable normative model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. These similarities suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. The current results help generalize our understanding of human vision from the lab to the real world.
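The core of a Bayes optimal estimator like the one described above can be sketched in a few lines: combine a likelihood over tilt (which, in the paper, is computed from image cues) with a prior from natural scene statistics, and read out the posterior circular mean. This toy version operates on already-discretized distributions; everything upstream of the likelihood is omitted.

```python
import math

def bayes_tilt_estimate(likelihood, prior):
    """Posterior-mean tilt (degrees) from a discretized likelihood and
    prior over 360 one-degree tilt bins. A toy stand-in: the paper's
    likelihood is computed from local image cues, not supplied directly."""
    assert len(likelihood) == len(prior) == 360
    post = [l * p for l, p in zip(likelihood, prior)]
    z = sum(post)
    post = [p / z for p in post]  # normalize the posterior
    # circular (vector) mean, since tilt is a periodic variable
    x = sum(p * math.cos(math.radians(t)) for t, p in enumerate(post))
    y = sum(p * math.sin(math.radians(t)) for t, p in enumerate(post))
    return math.degrees(math.atan2(y, x)) % 360.0
```

With a non-uniform prior, the estimate is pulled toward high-probability tilts (e.g., the cardinal tilts that dominate natural scenes), which is one way such a model produces the systematic biases reported for natural stimuli.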


2018 ◽  
Author(s):  
Noor Seijdel ◽  
Sara Jahfari ◽  
Iris I.A. Groen ◽  
H. Steven Scholte

A fundamental component of interacting with our environment is the gathering and interpretation of sensory information. When investigating how perceptual information shapes the mechanisms of decision-making, most researchers have relied on manipulated or unnatural information as perceptual input, yielding findings that may not generalize to real-world scenes. Unlike simplified, artificial stimuli, real-world scenes contain low-level regularities (natural scene statistics) that are informative about the structural complexity of a scene, which the brain could exploit during perceptual decision-making. In this study, participants performed an animal detection task on low-, medium-, or high-complexity scenes, as determined by two biologically plausible natural scene statistics: contrast energy (CE) and spatial coherence (SC). In Experiment 1, stimuli were sampled such that CE and SC jointly influenced scene complexity. Diffusion modeling showed that both the speed of information processing and the amount of evidence required were affected by low-level scene complexity. Experiments 2a and 2b refined these observations by showing that manipulating SC alone produced weaker but comparable effects on decision-making, whereas manipulating CE alone had no effect. Overall, performance was best for scenes of intermediate complexity. Our systematic definition of natural scene statistics quantifies how the complexity of natural scenes interacts with decision-making in an animal detection task. We speculate that the computation of CE and SC could serve as a signal for adjusting perceptual decision-making to the complexity of the input.
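In this line of work, CE and SC are commonly derived from a Weibull fit to the distribution of local contrast magnitudes, with CE related to the fitted scale and SC to the fitted shape. As a simplified stand-in for that fitting step (the original studies use a different estimation procedure), the sketch below fits a Weibull by the method of moments, solving for the shape parameter by bisection on the coefficient of variation.

```python
import math

def weibull_mom_fit(samples):
    """Method-of-moments Weibull fit (shape k, scale lam) to positive
    samples, e.g. local contrast magnitudes. Simplified stand-in for
    the fitting procedure used in the CE/SC literature."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    cv2 = var / mean ** 2  # squared coefficient of variation

    def cv2_of(k):
        # For Weibull(k, lam): CV^2 = Gamma(1+2/k)/Gamma(1+1/k)^2 - 1
        g1 = math.gamma(1 + 1 / k)
        g2 = math.gamma(1 + 2 / k)
        return g2 / g1 ** 2 - 1

    lo, hi = 0.05, 50.0          # cv2_of decreases monotonically in k
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cv2_of(mid) > cv2:    # mid's shape too small
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    lam = mean / math.gamma(1 + 1 / k)
    return k, lam
```

Under this parameterization, a heavier-tailed contrast distribution (lower k) corresponds to a sparser, more coherent scene, and a larger lam to higher overall contrast energy.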


2018 ◽  
Author(s):  
João Barbosa ◽  
Albert Compte

Abstract: Serial dependence, the bias of current estimates toward recent experiences, has been described experimentally during delayed estimation of many different visual features, with subjects tending to make estimates biased toward previous ones. It has been proposed that these attractive biases are an adaptive mechanism that helps stabilize perception in the face of correlated natural scene statistics, although this remains mostly theoretical. Color, which is strongly correlated in natural scenes, has never been studied with regard to its serial dependencies. Here, we found significant serial dependence in 6 out of 7 datasets with behavioral data of humans (total n=111) performing delayed estimation of color with uncorrelated sequential stimuli. Consistent with a drifting-memory model, serial dependence was stronger when referenced to the previous report rather than to the previous stimulus. In addition, it built up through the experimental session, suggesting metaplastic mechanisms operating at a slower time scale than previously proposed (e.g., short-term synaptic facilitation). Because, in contrast with natural scenes, the stimuli were temporally uncorrelated, this build-up casts doubt on serial dependence being an ongoing adaptation to the stable statistics of the environment.
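One common summary statistic for serial dependence in delayed estimation is the mean response error folded by the sign of the previous-trial offset: positive values indicate attraction toward the previous trial. The sketch below computes this for a circular feature such as hue angle; it is an illustrative measure, not the specific analysis pipeline of the study above (which also references errors to the previous report).

```python
def circ_diff(a, b, period=360.0):
    """Signed circular difference a - b, mapped into (-period/2, period/2]."""
    d = (a - b) % period
    return d - period if d > period / 2 else d

def serial_bias(stimuli, reports, period=360.0):
    """Mean error folded by the sign of the previous-stimulus offset.
    Positive output = attraction toward the previous trial's stimulus."""
    folded = []
    for t in range(1, len(stimuli)):
        err = circ_diff(reports[t], stimuli[t], period)       # current error
        prev = circ_diff(stimuli[t - 1], stimuli[t], period)  # previous offset
        if prev == 0:
            continue  # undefined fold direction when offset is zero
        folded.append(err if prev > 0 else -err)
    return sum(folded) / len(folded)
```

Running the same fold against previous reports instead of previous stimuli (swap `stimuli[t - 1]` for `reports[t - 1]`) gives the report-referenced variant that the abstract reports as stronger.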


2019 ◽  
Vol 31 (1) ◽  
pp. 109-125 ◽  
Author(s):  
Andrea De Cesarei ◽  
Shari Cavicchi ◽  
Antonia Micucci ◽  
Maurizio Codispoti

Understanding natural scenes involves the contribution of bottom-up analysis and top-down modulatory processes. However, the interaction of these processes during the categorization of natural scenes is not well understood. In the current study, we approached this issue using ERPs together with behavioral and computational data. We presented pictures of natural scenes and asked participants to categorize them in response to different questions (Is it an animal/vehicle? Is it indoors/outdoors? Are there one/two foreground elements?). ERPs for target scenes requiring a "yes" response began to differ from those for nontarget scenes at about 250 msec after picture onset, and this ERP difference was unmodulated by the categorization question. Earlier ERPs showed category-specific differences (e.g., between animals and vehicles), which were associated with the processing of scene statistics. From 180 msec after scene onset, these category-specific ERP differences were modulated by the categorization question that was asked. Categorization goals therefore modulate not only later stages associated with the target/nontarget decision but also earlier perceptual stages involved in the processing of scene statistics.


2017 ◽  
Author(s):  
Chih-Yang Chen ◽  
Lukas Sonnenberg ◽  
Simone Weller ◽  
Thede Witschel ◽  
Ziad M. Hafed

Visual brain areas exhibit tuning characteristics that are well suited for image statistics present in our natural environment. However, visual sensation is an active process, and if there are any brain areas that ought to be particularly 'in tune' with natural scene statistics, it would be sensory-motor areas critical for guiding behavior. Here we found that the primate superior colliculus, a structure instrumental for rapid visual exploration with saccades, detects low spatial frequencies, which are the most prevalent in natural scenes, much more rapidly than high spatial frequencies. Importantly, this accelerated detection happens independently of whether a neuron is more or less sensitive to low spatial frequencies to begin with. At the population level, the superior colliculus additionally over-represents low spatial frequencies in neural response sensitivity, even at near-foveal eccentricities. Thus, the superior colliculus possesses both temporal and response gain mechanisms for efficient gaze realignment in low-spatial-frequency dominated natural environments.


Author(s):  
Daryn R. Blanc-Goldhammer ◽  
Kevin J. MacKenzie

The minimum contrast needed for optimal text readability on additive displays (e.g., AR devices) depends on the spatial structure of the background and the text. Natural scenes and text follow similar spectral patterns; natural scenes can therefore mask low-contrast text, making it difficult to read. In a set of experiments, we determined the minimum viable contrast for readability on an additive display. Reading performance was assessed with a rapid serial visual presentation (RSVP) task. Text was additively overlaid on algorithmically generated images that mimicked natural scene statistics. When the text-to-background contrast ratio was below 1.6:1, participants' reading rates decreased rapidly. At low contrast ratios (<1.4:1), this effect was slightly mitigated for out-of-focus backgrounds, which reduced power at masking frequencies. Above 1.6:1, reading rate did not significantly increase, but subjective quality ratings did. These results help inform contrast requirements for the development of additive display systems in consumer products.
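On an additive display the text luminance adds to, rather than replaces, the background luminance. Assuming the contrast ratio above is defined as (background + text) : background, which is one natural definition for additive overlays (the study may define it differently), the 1.6:1 readability knee translates directly into a minimum added text luminance:

```python
def additive_contrast_ratio(l_background, l_text):
    """Contrast ratio for text additively overlaid on a background:
    (background + text) : background. Assumed definition, see lead-in."""
    return (l_background + l_text) / l_background

def min_text_luminance(l_background, target_ratio=1.6):
    """Smallest added text luminance (same units as background) that
    reaches the target ratio; 1.6:1 is the readability knee above."""
    return (target_ratio - 1.0) * l_background
```

For example, against a 100 cd/m^2 background, reaching 1.6:1 requires adding about 60 cd/m^2 of text luminance, and brighter backgrounds scale the requirement proportionally.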

