Place Recognition: An Overview of Vision Perspective

Zhiqiang Zeng; Jian Zhang; Xiaodong Wang; Yuming Chen; Chaoyang Zhu

doi:10.3390/app8112257

Place Recognition: An Overview of Vision Perspective

Applied Sciences ◽

10.3390/app8112257 ◽

2018 ◽

Vol 8 (11) ◽

pp. 2257 ◽

Cited By ~ 4

Author(s):

Zhiqiang Zeng ◽

Jian Zhang ◽

Xiaodong Wang ◽

Yuming Chen ◽

Chaoyang Zhu

Keyword(s):

Place Recognition ◽

Image Description ◽

Query Image ◽

Recognition Systems ◽

Image Representations ◽

A New Technique ◽

Visual Place Recognition ◽

Future Work ◽

Traditional Image

Place recognition is one of the most fundamental topics in the computer-vision and robotics communities, where the task is to accurately and efficiently recognize the location of a given query image. Despite years of knowledge accumulated in this field, place recognition still remains an open problem due to the various ways in which the appearance of real-world places may differ. This paper presents an overview of the place-recognition literature. Since condition-invariant and viewpoint-invariant features are essential factors to long-term robust visual place-recognition systems, we start with traditional image-description methodology developed in the past, which exploits techniques from the image-retrieval field. Recently, the rapid advances of related fields, such as object detection and image classification, have inspired a new technique to improve visual place-recognition systems, that is, convolutional neural networks (CNNs). Thus, we then introduce the recent progress of visual place-recognition systems based on CNNs to automatically learn better image representations for places. Finally, we close with discussions and mention of future work on place recognition.

Download Full-text

Visual Place Recognition in Long-term and Large-scale Environment based on CNN Feature

2018 IEEE Intelligent Vehicles Symposium (IV) ◽

10.1109/ivs.2018.8500686 ◽

2018 ◽

Cited By ~ 2

Author(s):

Jianliang Zhu ◽

Yunfeng Ai ◽

Bin Tian ◽

Dongpu Cao ◽

Sebastian Scherer

Keyword(s):

Large Scale ◽

Place Recognition ◽

Visual Place Recognition

Download Full-text

Tree-based indexing for real-time ConvNet landmark-based visual place recognition

International Journal of Advanced Robotic Systems ◽

10.1177/1729881416686951 ◽

2017 ◽

Vol 14 (1) ◽

pp. 172988141668695 ◽

Cited By ~ 4

Author(s):

Yi Hou ◽

Hong Zhang ◽

Shilin Zhou

Keyword(s):

Real Time ◽

Environmental Changes ◽

Dimensional Space ◽

Search Time ◽

Lookup Table ◽

Current View ◽

Place Recognition ◽

Linear Search ◽

Query Image ◽

Visual Place Recognition

Recent impressive studies on using ConvNet landmarks for visual place recognition take an approach that involves three steps: (a) detection of landmarks, (b) description of the landmarks by ConvNet features using a convolutional neural network, and (c) matching of the landmarks in the current view with those in the database views. Such an approach has been shown to achieve the state-of-the-art accuracy even under significant viewpoint and environmental changes. However, the computational burden in step (c) significantly prevents this approach from being applied in practice, due to the complexity of linear search in high-dimensional space of the ConvNet features. In this article, we propose two simple and efficient search methods to tackle this issue. Both methods are built upon tree-based indexing. Given a set of ConvNet features of a query image, the first method directly searches the features’ approximate nearest neighbors in a tree structure that is constructed from ConvNet features of database images. The database images are voted on by features in the query image, according to a lookup table which maps each ConvNet feature to its corresponding database image. The database image with the highest vote is considered the solution. Our second method uses a coarse-to-fine procedure: the coarse step uses the first method to coarsely find the top- N database images, and the fine step performs a linear search in Hamming space of the hash codes of the ConvNet features to determine the best match. Experimental results demonstrate that our methods achieve real-time search performance on five data sets with different sizes and various conditions. Most notably, by achieving an average search time of 0.035 seconds/query, our second method improves the matching efficiency by the three orders of magnitude over a linear search baseline on a database with 20,688 images, with negligible loss in place recognition accuracy.

Download Full-text

CityLearn: Diverse Real-World Environments for Sample-Efficient Navigation Policy Learning

10.36227/techrxiv.12063582.v1 ◽

2020 ◽

Author(s):

Marvin Chancán

Keyword(s):

Real World ◽

Autonomous Driving ◽

Visual Navigation ◽

Place Recognition ◽

Visual Appearance ◽

Learning Techniques ◽

Benchmark Datasets ◽

Image Representations ◽

Visual Place Recognition ◽

Self Motion

<div>Visual navigation tasks in real-world environments often require both self-motion and place recognition feedback. While deep reinforcement learning has shown success in solving these perception and decision-making problems in an end-to-end manner, these algorithms require large amounts of experience to learn navigation policies from high-dimensional data, which is generally impractical for real robots due to sample complexity. In this paper, we address these problems with two main contributions. We first leverage place recognition and deep learning techniques combined with goal destination feedback to generate compact, bimodal image representations that can then be used to effectively learn control policies from a small amount of experience. Second, we present an interactive framework, CityLearn, that enables for the first time training and deployment of navigation algorithms across city-sized, realistic environments with extreme visual appearance changes. CityLearn features more than 10 benchmark datasets, often used in visual place recognition and autonomous driving research, including over 100 recorded traversals across 60 cities around the world. We evaluate our approach on two CityLearn environments, training our navigation policy on a single traversal. Results show our method can be over 2 orders of magnitude faster than when using raw images, and can also generalize across extreme visual changes including day to night and summer to winter transitions.</div>

Download Full-text

NYU-VPR: Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymization Influences

10.1109/iros51168.2021.9636640 ◽

2021 ◽

Author(s):

Diwei Sheng ◽

Yuxiang Chai ◽

Xinru Li ◽

Chen Feng ◽

Jianzhe Lin ◽

...

Keyword(s):

Place Recognition ◽

Data Anonymization ◽

Visual Place Recognition

Download Full-text

Variational Bayesian Approach to Condition-Invariant Feature Extraction for Visual Place Recognition

Applied Sciences ◽

10.3390/app11198976 ◽

2021 ◽

Vol 11 (19) ◽

pp. 8976

Author(s):

Junghyun Oh ◽

Gyuho Eoh

Keyword(s):

Large Scale ◽

Environmental Changes ◽

Place Recognition ◽

Stochastic Variational Inference ◽

Invariant Features ◽

Variational Autoencoder ◽

Model Condition ◽

Visual Place Recognition ◽

Perceptual Changes

As mobile robots perform long-term operations in large-scale environments, coping with perceptual changes becomes an important issue recently. This paper introduces a stochastic variational inference and learning architecture that can extract condition-invariant features for visual place recognition in a changing environment. Under the assumption that a latent representation of the variational autoencoder can be divided into condition-invariant and condition-sensitive features, a new structure of the variation autoencoder is proposed and a variational lower bound is derived to train the model. After training the model, condition-invariant features are extracted from test images to calculate the similarity matrix, and the places can be recognized even in severe environmental changes. Experiments were conducted to verify the proposed method, and the experimental results showed that our assumption was reasonable and effective in recognizing places in changing environments.

Download Full-text

SRAL: Shared Representative Appearance Learning for Long-Term Visual Place Recognition

IEEE Robotics and Automation Letters ◽

10.1109/lra.2017.2662061 ◽

2017 ◽

Vol 2 (2) ◽

pp. 1172-1179 ◽

Cited By ~ 15

Author(s):

Fei Han ◽

Xue Yang ◽

Yiming Deng ◽

Mark Rentschler ◽

Dejun Yang ◽

...

Keyword(s):

Place Recognition ◽

Visual Place Recognition

Download Full-text

A spatio-temporal Long-term Memory approach for visual place recognition in mobile robotic navigation

Robotics and Autonomous Systems ◽

10.1016/j.robot.2012.12.004 ◽

2013 ◽

Vol 61 (12) ◽

pp. 1744-1758 ◽

Cited By ~ 4

Author(s):

Vu Anh Nguyen ◽

Janusz A. Starzyk ◽

Wooi-Boon Goh

Keyword(s):

Long Term Memory ◽

Place Recognition ◽

Robotic Navigation ◽

Term Memory ◽

Spatio Temporal ◽

Visual Place Recognition

Download Full-text

Long-Term Visual Robot Localization using Computational Models of the Neocortex

10.48011/asba.v2i1.1036 ◽

2020 ◽

Author(s):

Carlos Alexandre P. Pizzino ◽

Patricia A. Vargas ◽

Ramon R. Costa

Keyword(s):

Computational Models ◽

Internal Representation ◽

Fixed Number ◽

Place Recognition ◽

Biologically Inspired ◽

Hierarchical Temporal Memory ◽

Representational Capacity ◽

Visual Place Recognition ◽

High Degree

Visual place recognition is an essential capability for autonomous mobile robots which use cameras as their primary sensors. Although there has been a considerable amount of research in the topic, the high degree of image variability poses extra research challenges. Following advances in neuroscience, new biologically inspired models have been developed. Inspired by the human neocortex, hierarchical temporal memory model has potential to identify temporal sequences of spatial patterns using sparse distributed representations, which are known to have high representational capacity and high tolerance to noise. These features are interesting for place recognition applications. Some authors have proposed simplifications from the original framework, such as starting from an empty set of minicolumns and increasing the number of minicolumns on demand instead of the usage of a fixed number of minicolumns whose connections adapt over time. In this paper, we investigate the usage of framework originally proposed with the aim of extending the run-time during long-term operations. Results show that the proposed architecture can encode an internal representation of the world using a fixed number of cells in order to improve system scalability.

Download Full-text

Learning Context Flexible Attention Model for Long-Term Visual Place Recognition

IEEE Robotics and Automation Letters ◽

10.1109/lra.2018.2859916 ◽

2018 ◽

Vol 3 (4) ◽

pp. 4015-4022 ◽

Cited By ~ 19

Author(s):

Zetao Chen ◽

Lingqiao Liu ◽

Inkyu Sa ◽

Zongyuan Ge ◽

Margarita Chli

Keyword(s):

Place Recognition ◽

Learning Context ◽

Attention Model ◽

Visual Place Recognition

Download Full-text

Visual Place Recognition via Robust ℓ2-Norm Distance Based Holism and Landmark Integration

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018034 ◽

2019 ◽

Vol 33 ◽

pp. 8034-8041 ◽

Cited By ~ 1

Author(s):

Kai Liu ◽

Hua Wang ◽

Fei Han ◽

Hao Zhang

Keyword(s):

Large Scale ◽

Optimization Problem ◽

Data Sets ◽

Place Recognition ◽

Benchmark Data ◽

Localization And Mapping ◽

Novel Method ◽

Visual Place Recognition ◽

Holistic Representation

Visual place recognition is essential for large-scale simultaneous localization and mapping (SLAM). Long-term robot operations across different time of the days, months, and seasons introduce new challenges from significant environment appearance variations. In this paper, we propose a novel method to learn a location representation that can integrate the semantic landmarks of a place with its holistic representation. To promote the robustness of our new model against the drastic appearance variations due to long-term visual changes, we formulate our objective to use non-squared ℓ2-norm distances, which leads to a difficult optimization problem that minimizes the ratio of the ℓ2,1-norms of matrices. To solve our objective, we derive a new efficient iterative algorithm, whose convergence is rigorously guaranteed by theory. In addition, because our solution is strictly orthogonal, the learned location representations can have better place recognition capabilities. We evaluate the proposed method using two large-scale benchmark data sets, the CMU-VL and Nordland data sets. Experimental results have validated the effectiveness of our new method in long-term visual place recognition applications.

Download Full-text