Closed-Loop Learning of Visual Control Policies

2007 ◽  
Vol 28 ◽  
pp. 349-391 ◽  
Author(s):  
S. R. Jodogne ◽  
J. H. Piater

In this paper we present a general, flexible framework for learning mappings from images to actions by interacting with the environment. The basic idea is to introduce a feature-based image classifier in front of a reinforcement learning algorithm. The classifier partitions the visual space according to the presence or absence of a few highly informative local descriptors that are incrementally selected in a sequence of attempts to remove perceptual aliasing. We also address the problem of overfitting in such a greedy algorithm. Finally, we show how high-level visual features can be generated when the power of local descriptors is insufficient for completely disambiguating the aliased states. This is done by building a hierarchy of composite features that consist of recursive spatial combinations of visual features. We demonstrate the efficacy of our algorithms by solving three visual navigation tasks and a visual version of the classical "Car on the Hill" control problem.
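The coupling the abstract describes, a descriptor-based partition of the visual space feeding an ordinary reinforcement learner, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the descriptor names, action set, helper functions, and Q-learning parameters are all hypothetical.

```python
from collections import defaultdict

ACTIONS = ["forward", "left", "right"]          # hypothetical action set

def perceptual_class(image_descriptors, selected):
    """Map an image to a discrete state: the presence/absence pattern of
    the currently selected informative local descriptors."""
    return tuple(d in image_descriptors for d in selected)

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Standard tabular Q-learning update over the classifier-induced states."""
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# When one perceptual class demands conflicting actions (perceptual
# aliasing), the paper's scheme would add a new, highly informative
# descriptor to `selected`, splitting that class into finer states.
Q = defaultdict(float)
selected = ["descriptor_17", "descriptor_42"]   # hypothetical descriptor pool
state = perceptual_class({"descriptor_42"}, selected)
q_update(Q, state, "forward", r=1.0, s_next=state)
```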

Author(s):  
Zewen Xu ◽  
Zheng Rong ◽  
Yihong Wu

Abstract In recent years, simultaneous localization and mapping in dynamic environments (dynamic SLAM) has attracted significant attention from both academia and industry. Some pioneering work on this technique has expanded the potential of robotic applications. Compared to standard SLAM under the static world assumption, dynamic SLAM divides features into static and dynamic categories and leverages each type of feature properly. Therefore, dynamic SLAM can provide more robust localization for intelligent robots that operate in complex dynamic environments. Additionally, to meet the demands of some high-level tasks, dynamic SLAM can be integrated with multiple object tracking. This article presents a survey on dynamic SLAM from the perspective of feature choices. A discussion of the advantages and disadvantages of different visual features is provided in this article.
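As an illustration of the static/dynamic feature split the survey is organized around, the sketch below applies one common geometric test: a tracked 3D point is flagged dynamic when its reprojection error under the estimated camera pose exceeds a pixel threshold. The function name and threshold are illustrative and are not taken from any specific system covered by the survey.

```python
import numpy as np

def classify_feature(p_world, uv_observed, K, R, t, thresh_px=2.0):
    """Flag a tracked 3D point as dynamic if its reprojection error under
    the estimated camera pose (R, t) exceeds a pixel threshold."""
    p_cam = R @ p_world + t                 # world -> camera frame
    uv_proj = K @ (p_cam / p_cam[2])        # pinhole projection to pixels
    error = np.linalg.norm(uv_proj[:2] - uv_observed)
    return "dynamic" if error > thresh_px else "static"
```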


Author(s):  
Maarten J. G. M. van Emmerik

Abstract Feature modeling enables the specification of a model with standardized high-level shape aspects that have a functional meaning for design or manufacturing. In this paper an interactive graphical approach to feature-based modeling is presented. The user can represent features as new CSG primitives, specified as a Boolean combination of halfspaces. Constraints between halfspaces specify the geometric characteristics of a feature and control feature validity. Once a new feature is defined and stored in a library, it can be used in other objects and positioned, oriented and dimensioned by direct manipulation with a graphics cursor. Constraints between features prevent feature interference and specify spatial relations between features.
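The representation described here, a feature as a Boolean combination of halfspaces, can be sketched with a small CSG point-membership test. The class names and the example slot feature are illustrative only; the paper's constraint machinery between halfspaces is merely indicated in comments.

```python
import numpy as np

class Halfspace:
    """A halfspace n . p <= d; the building block of the CSG primitives."""
    def __init__(self, normal, offset):
        self.n, self.d = np.asarray(normal, float), float(offset)
    def contains(self, p):
        return self.n @ p <= self.d

class Boolean:
    """A Boolean combination ('and', 'or', 'minus') of two CSG nodes."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def contains(self, p):
        a, b = self.left.contains(p), self.right.contains(p)
        if self.op == "and":
            return a and b
        return (a or b) if self.op == "or" else (a and not b)

# A 2-D slot feature: the intersection of four halfspaces. Constraints
# between halfspaces (e.g. parallel pairs at a fixed distance) would
# control the feature's dimensions and validity, as in the paper.
slot = Boolean("and",
               Boolean("and", Halfspace([1, 0], 1.0), Halfspace([-1, 0], 1.0)),
               Boolean("and", Halfspace([0, 1], 0.0), Halfspace([0, -1], 2.0)))
print(slot.contains(np.array([0.0, -1.0])))  # True: point lies inside the slot
```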


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Ziqiang Wang ◽  
Xia Sun ◽  
Lijun Sun ◽  
Yuchun Huang

In many image classification applications, it is common to extract multiple visual features from different views to describe an image. Since different visual features have their own specific statistical properties and discriminative powers for image classification, the conventional solution for multiple-view data is to concatenate these feature vectors into a new feature vector. However, this simple concatenation strategy not only ignores the complementary nature of different views, but also suffers from the "curse of dimensionality." To address this problem, we propose a novel multiview subspace learning algorithm in this paper, named multiview discriminative geometry preserving projection (MDGPP), for feature extraction and classification. MDGPP can not only preserve the intraclass geometry and interclass discrimination information under a single view, but also exploit the complementary properties of different views to obtain a low-dimensional optimal consensus embedding via an alternating-optimization-based iterative algorithm. Experimental results on face recognition and facial expression recognition demonstrate the effectiveness of the proposed algorithm.
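Since the abstract does not give MDGPP's objective, the following is only a schematic of the alternating optimization it describes: fix the view weights and solve for a shared projection, then reweight the views by how well the projection fits each of them. The scatter-style matrices S_inter/S_intra and the weighting exponent r are assumptions, not the paper's formulas.

```python
import numpy as np
from scipy.linalg import eigh

def consensus_projection(S_inter, S_intra, dim, iters=10, r=2.0):
    """Schematic alternating optimization for a shared low-dimensional
    projection P and per-view weights alpha over V views."""
    V = len(S_inter)
    d = S_inter[0].shape[0]
    alpha = np.full(V, 1.0 / V)                  # start with uniform weights
    for _ in range(iters):
        # Step 1: fix weights, combine views, solve a generalized
        # eigenproblem for the projection (trace-ratio style).
        A = sum(a**r * S for a, S in zip(alpha, S_inter))
        B = sum(a**r * S for a, S in zip(alpha, S_intra)) + 1e-6 * np.eye(d)
        _, U = eigh(A, B)
        P = U[:, -dim:]                          # top eigenvectors
        # Step 2: fix the projection, reweight views by their fit.
        fit = np.array([np.trace(P.T @ Sa @ P) /
                        max(np.trace(P.T @ Sb @ P), 1e-9)
                        for Sa, Sb in zip(S_inter, S_intra)])
        alpha = fit ** (1.0 / (r - 1.0))
        alpha /= alpha.sum()
    return P, alpha
```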


2019 ◽  
Vol 52 (3-4) ◽  
pp. 252-261 ◽  
Author(s):  
Xiaohua Cao ◽  
Daofan Liu ◽  
Xiaoyu Ren

An auto guide vehicle (AGV) inevitably deviates from its intended position as it travels. Current edge-detection approaches in visual navigation struggle to meet the demands of complex factory environments: they are easily affected by noise, which results in low measurement accuracy and instability. To avoid these defects of edge detection, an improved detection method based on image thinning and the Hough transform is proposed to measure the AGV's walking deviation. First, the lane-line image is preprocessed with grayscale conversion, threshold segmentation, and mathematical morphology; then a thinning algorithm extracts the skeleton of the lane line, and Hough detection combined with line fitting yields the equation of the guide line; finally, the AGV's walking deviation is calculated from this equation. Experimental results show that the proposed method copes with non-ideal factors of real environments, such as bright areas, path breaks, and clutter on the road, and extracts the guide-line parameters effectively, from which the walking deviation is obtained. The method proves feasible for AGV visual navigation in indoor environments.
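The preprocessing-thinning-Hough pipeline reads directly as a few OpenCV calls. The sketch below is a hedged reconstruction, not the authors' code: the file name, thresholds, and the centerline-based deviation definition are assumptions, and cv2.ximgproc.thinning requires the opencv-contrib package.

```python
import cv2
import numpy as np

# Preprocess: grayscale, Otsu threshold, morphological opening.
img = cv2.imread("lane.png")                     # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
bw = cv2.morphologyEx(bw, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

# Thinning: reduce the lane line to its one-pixel skeleton.
thin = cv2.ximgproc.thinning(bw)

# Hough detection on the skeleton; a real system would fit a line to
# all detected segments rather than take the first one.
lines = cv2.HoughLinesP(thin, 1, np.pi / 180, threshold=50,
                        minLineLength=100, maxLineGap=20)
x1, y1, x2, y2 = lines[0][0]

# Deviation: evaluate the guide line at the image's bottom row and
# measure the offset from the vertical centerline (one plausible
# definition of "walking deviation").
slope = (x2 - x1) / (y2 - y1 + 1e-9)             # x as a function of y
x_bottom = x1 + slope * (img.shape[0] - 1 - y1)
deviation_px = x_bottom - img.shape[1] / 2
print(f"walking deviation: {deviation_px:.1f} px")
```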


2013 ◽  
Vol 774-776 ◽  
pp. 1625-1628 ◽  
Author(s):  
Kai Hu ◽  
Wei Feng Chen ◽  
Dan Mao ◽  
Zi Chen Zheng ◽  
Jing Yi Duan

To make robots more intelligent, this paper proposes a new image feature, named ROLD-map, based on the Rank-Ordered Logarithmic Difference (ROLD); the feature lets researchers assess image complexity directly and accurately. Experimental data show that it can distinguish sky, trees, and road clearly in very little time. The feature provides a foundation for improving the precision of image recognition and offers a reference for applying such recognition in the visual navigation of robots.
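The abstract does not define the ROLD-map precisely; the sketch below assumes the standard rank-ordered logarithmic difference statistic from the image-denoising literature, computed per pixel over an 8-neighbourhood. The parameter values and normalization are guesses, not the paper's construction.

```python
import numpy as np

def rold_map(img, m=4, T=5.0):
    """Per-pixel ROLD statistic: clipped logarithmic differences to the
    8 neighbours, rank-ordered, with the m smallest summed. Smooth
    regions score low; textured or complex regions score higher."""
    img = img.astype(np.float64) / 255.0
    H, W = img.shape
    out = np.zeros_like(img)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            d = [1.0 + max(np.log(abs(img[i + di, j + dj] - img[i, j]) + 1e-9),
                           -T) / T
                 for di, dj in offsets]          # each value lies in [0, 1]
            out[i, j] = sum(sorted(d)[:m])       # rank-order: m smallest
    return out
```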


2021 ◽  
Author(s):  
Marek A. Pedziwiatr ◽  
Elisabeth von dem Hagen ◽  
Christoph Teufel

Humans constantly move their eyes to explore the environment and obtain information. Competing theories of gaze guidance consider the factors driving eye movements within a dichotomy between low-level visual features and high-level object representations. However, recent developments in object perception indicate a complex and intricate relationship between features and objects. Specifically, image-independent object-knowledge can generate objecthood by dynamically reconfiguring how feature space is carved up by the visual system. Here, we adopt this emerging perspective of object perception, moving away from the simplifying dichotomy between features and objects in explanations of gaze guidance. We recorded eye movements in response to stimuli that appear as meaningless patches on initial viewing but are experienced as coherent objects once relevant object-knowledge has been acquired. We demonstrate that gaze guidance differs substantially depending on whether observers experienced the same stimuli as meaningless patches or organised them into object representations. In particular, once observers had acquired the relevant prior object-knowledge, fixations on identical images became object-centred, less dispersed, and more consistent across observers. Observers' gaze behaviour also indicated a shift from exploratory information-sampling to a strategy of extracting information mainly from selected, object-related image areas. These effects were evident from the first fixations on the image. Importantly, however, eye movements were not fully determined by object representations but were best explained by a simple model that integrates image-computable features and high-level, knowledge-dependent object representations. Overall, the results show how information sampling via eye movements in humans is guided by a dynamic interaction between image-computable features and knowledge-driven perceptual organisation.
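The best-fitting model is reported only as an integration of image-computable features with knowledge-dependent object representations; a minimal schematic version is a weighted mixture of two normalized priority maps. The sources of the maps and the single weight w are assumptions, not the study's fitted model.

```python
import numpy as np

def fixation_prediction(saliency_map, object_map, w=0.5):
    """Predicted fixation density as a weighted mixture of a normalized
    image-computable saliency map and a knowledge-dependent object map."""
    s = saliency_map / saliency_map.sum()
    o = object_map / object_map.sum()
    p = w * o + (1.0 - w) * s               # simple linear integration
    return p / p.sum()
```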


2021 ◽  
Author(s):  
Maryam Nematollahi Arani

Object recognition has become a central topic in computer vision applications such as image search, robotics, and vehicle safety systems. However, it is a challenging task due to the limited discriminative power of low-level visual features in describing the considerably diverse range of high-level visual semantics of objects. The semantic gap between low-level visual features and high-level concepts is a bottleneck in most systems, and new content analysis models need to be developed to bridge it. In this thesis, algorithms based on conditional random fields (CRFs), from the class of probabilistic graphical models, are developed to tackle the problem of multiclass image labeling for object recognition. Image labeling assigns a specific semantic category, from a predefined set of object classes, to each pixel in the image. By capturing spatial interactions of visual concepts well, CRF modeling has proved to be a successful tool for image labeling. This thesis proposes novel approaches to empowering CRF modeling for robust image labeling. Our primary contributions are twofold. To better represent the feature distributions of CRF potentials, new feature functions based on generalized Gaussian mixture models (GGMM) are designed and their efficacy is investigated. Thanks to its shape parameter, a GGMM can properly fit the multi-modal and skewed distributions of data in natural images. The new model proves more successful than Gaussian and Laplacian mixture models, and it outperforms a deep neural network model on the Corel image set by 1% in accuracy. Further in this thesis, we apply scene-level contextual information to integrate the global visual semantics of the image with the pixel-wise dense inference of a fully-connected CRF, in order to preserve small objects of foreground classes and to make dense inference robust to initial misclassifications of the unary classifier. The proposed inference algorithm factorizes the joint probability of the labeling configuration and the image scene type to obtain prediction update equations both for labeling individual image pixels and for the overall scene type of the image. The proposed context-based dense CRF model outperforms the conventional dense CRF model by about 2% in labeling accuracy on the MSRC image set and by 4% on the SIFT Flow image set. The proposed model also obtains the highest scene classification rate of 86% on the MSRC dataset.
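For concreteness, here is the generalized Gaussian density underlying GGMM feature functions, in its standard parameterization (mu, alpha, beta: location, scale, shape); the mixture code is a generic sketch, not the thesis's implementation.

```python
import numpy as np
from scipy.special import gamma as G

def ggd_pdf(x, mu, alpha, beta):
    """Generalized Gaussian density: beta = 2 recovers a Gaussian,
    beta = 1 a Laplacian; other shapes fit heavier or lighter tails."""
    c = beta / (2.0 * alpha * G(1.0 / beta))
    return c * np.exp(-(np.abs(x - mu) / alpha) ** beta)

def ggmm_pdf(x, weights, mus, alphas, betas):
    """Mixture of generalized Gaussians, the building block the thesis
    uses for CRF potential feature functions."""
    return sum(w * ggd_pdf(x, m, a, b)
               for w, m, a, b in zip(weights, mus, alphas, betas))
```

The shape parameter is what lets the mixture track the skewed, heavy-tailed feature statistics of natural images that plain Gaussian or Laplacian mixtures fit poorly.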


2006 ◽  
Vol 18 (10) ◽  
pp. 1663-1665 ◽  
Author(s):  
Mark A. Elliott ◽  
Zhuanghua Shi ◽  
Sean D. Kelly

How does neuronal activity bring about the interpretation of visual space in terms of objects or complex perceptual events? When simple visual features group, spikes from neurons responding to different features can be integrated to within a few milliseconds. Considered a potential solution to the "binding problem," neuronal synchronization has been suggested as the glue that binds together different features of the same object. This idea receives some support from correlated- and periodic-stimulus motion paradigms, both of which suggest that the segregation of a figure from ground is a direct result of the temporal correlation of visual signals. One could say that perception of a highly correlated visual structure permits space to be bound in time. However, on closer analysis, the concept of perceptual synchrony is insufficient to explain the conditions under which events will be seen as simultaneous. Instead, the grouping effects ascribed to perceptual synchrony are better explained in terms of the intervals of time over which stimulus events integrate and seem to occur simultaneously. This point is supported by the equivalence of some of these measures with well-established estimates of the perceptual moment. It is time in extension, rather than the instant, that may best describe how seemingly simultaneous features group. This means that studies of perceptual synchrony are insufficient to address the binding problem.

