A Two-Stream Symmetric Network with Bidirectional Ensemble for Aerial Image Matching

2020 ◽  
Vol 12 (3) ◽  
pp. 465 ◽  
Author(s):  
Jae-Hyun Park ◽  
Woo-Jeoung Nam ◽  
Seong-Whan Lee

In this paper, we propose a novel method to precisely match two aerial images that were obtained in different environments via a two-stream deep network. By internally augmenting the target image, the network considers the two streams with three input images and reflects the additional augmented pair in training. As a result, the training process of the deep network is regularized and the network becomes robust to the variance of aerial images. Furthermore, we introduce an ensemble method based on a bidirectional network, motivated by the isomorphic nature of the geometric transformation. We obtain two global transformation parameters without any additional network or parameters, which alleviates asymmetric matching results and enables a significant improvement in performance by fusing the two outcomes. For the experiments, we adopt aerial images from Google Earth and the International Society for Photogrammetry and Remote Sensing (ISPRS). To quantitatively assess our results, we apply the probability of correct keypoints (PCK) metric, which measures the degree of matching. The qualitative and quantitative results show a sizable performance gap compared to conventional methods for matching aerial images. All code, our trained model, and the dataset are available online.
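The PCK metric used above can be sketched in a few lines; this is a minimal version that assumes the keypoint error is normalized by the longer image side (normalization conventions vary between papers):

```python
import numpy as np

def pck(pred_pts, gt_pts, img_size, alpha=0.05):
    """Probability of Correct Keypoints: fraction of predicted keypoints
    lying within alpha * max(image side) of their ground-truth locations."""
    pred_pts = np.asarray(pred_pts, dtype=float)
    gt_pts = np.asarray(gt_pts, dtype=float)
    thresh = alpha * max(img_size)            # normalized distance threshold
    dists = np.linalg.norm(pred_pts - gt_pts, axis=1)
    return float(np.mean(dists <= thresh))

# Toy example: 240x240 image, threshold = 0.05 * 240 = 12 px
pred = [[10, 10], [100, 100], [200, 50]]
gt   = [[12, 11], [130, 100], [205, 52]]
print(pck(pred, gt, (240, 240)))  # 2 of 3 keypoints within 12 px -> 0.666...
```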

Author(s):  
T. Koch ◽  
X. Zhuo ◽  
P. Reinartz ◽  
F. Fraundorfer

This paper investigates the performance of SIFT-based image matching under large differences in image scaling and rotation, as is usually the case when trying to match images captured from UAVs and airplanes. This task represents an essential step for image registration and 3D-reconstruction applications. Various real-world examples presented in this paper show that SIFT, as well as A-SIFT, perform poorly or even fail in this matching scenario. Even if the scale difference between the images is known and eliminated beforehand, matching performance suffers from too few feature point detections, ambiguous feature point orientations, and the rejection of many correct matches when the ratio test is applied afterwards. Therefore, a new feature matching method is provided that overcomes these problems and delivers thousands of matches, by means of a novel feature point detection strategy, a one-to-many matching scheme, and the substitution of the ratio test with geometric constraints that retain geometrically correct matches in repetitive image regions. This method is designed for matching almost nadir-directed images with low scene depth, as is typical in UAV and aerial image matching scenarios. We tested the proposed method on different real-world image pairs. While standard SIFT failed for most of the datasets, plenty of geometrically correct matches could be found using our approach. Comparing the estimated fundamental matrices and homographies with ground-truth solutions, mean errors of a few pixels can be achieved.
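The ratio test and the one-to-many alternative discussed above can be illustrated on raw descriptor arrays. This is a simplified numpy sketch, not the authors' implementation; function names, the ratio value, and the `k` parameter are illustrative:

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Lowe's ratio test: keep a match only if the nearest neighbour is
    clearly closer than the second nearest. Repetitive image regions
    produce near-equal distances, so their matches are rejected."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches

def one_to_many_matches(desc1, desc2, k=3):
    """One-to-many scheme: keep the k nearest candidates per feature and
    defer disambiguation to geometric constraints (e.g. RANSAC), so that
    repetitive regions still contribute correct matches."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        for j in np.argsort(dists)[:k]:
            matches.append((i, int(j)))
    return matches
```

With two near-identical candidate descriptors, the ratio test discards the feature entirely, while the one-to-many scheme keeps both candidates for a later geometric check.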


Author(s):  
N. Yastikli ◽  
H. Bayraktar ◽  
Z. Erisir

Digital surface models (DSM) are the most popular products for determining the visible surface of the Earth, which includes all non-terrain objects such as vegetation, forest, and man-made constructions. Airborne light detection and ranging (LiDAR) is the preferred technique for high-resolution DSM generation over local coverage. The automatic generation of high-resolution DSMs is also possible with stereo image matching using aerial images. Image matching algorithms usually rely on feature-based matching for DSM generation: first, feature points are extracted, and then corresponding features are searched in the overlapping images. These algorithms face problems in areas with repetitive patterns, such as urban structures and forest.

Recent innovations in camera technology and image matching algorithms have enabled automatic dense DSM generation for large-scale city and environment modelling. The new pixel-wise matching approaches generate very high-resolution DSMs that correspond to the ground sample distance (GSD) of the original images. A number of research institutes and photogrammetric software vendors are currently developing software tools for dense DSM generation using aerial images. This new approach can be used for high-resolution DSM generation for large cities, rural areas, and forest, and even for nationwide applications. This study aimed at the performance validation of a high-resolution DSM generated by pixel-wise dense image matching in part of Istanbul. The study area includes different land classes, such as open areas, forest, and built-up areas, to test the performance of dense image matching in each class. The results obtained from this performance validation in the Istanbul test area showed that a high-resolution DSM corresponding to the GSD of the original aerial images can be generated successfully by pixel-wise dense image matching using both commercial and research-institution software.
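The core idea of pixel-wise matching on rectified stereo pairs can be illustrated with a deliberately naive SAD block matcher. Production dense matchers (e.g. semi-global matching) add cost aggregation and sub-pixel refinement, so this is only a conceptual sketch:

```python
import numpy as np

def dense_disparity(left, right, max_disp=16, win=2):
    """Naive pixel-wise matching: for every pixel of the left image, slide
    a small window along the same row of the right image and keep the
    disparity with the lowest sum of absolute differences (SAD).
    The resulting disparity is inversely proportional to depth, from
    which surface height (the DSM) can be derived."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.int32)
    pad_l = np.pad(left.astype(float), win, mode='edge')
    pad_r = np.pad(right.astype(float), win, mode='edge')
    for y in range(h):
        for x in range(w):
            patch = pad_l[y:y + 2*win + 1, x:x + 2*win + 1]
            best, best_d = np.inf, 0
            for d in range(min(max_disp, x) + 1):
                cand = pad_r[y:y + 2*win + 1, x - d:x - d + 2*win + 1]
                cost = np.abs(patch - cand).sum()
                if cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp
```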


2021 ◽  
Vol 13 (17) ◽  
pp. 3443
Author(s):  
Yuan Chen ◽  
Jie Jiang

The registration of multi-temporal remote sensing images with abundant information and complex changes is an important preprocessing step for subsequent applications. This paper presents a novel two-stage deep learning registration method based on sub-image matching. Unlike the conventional registration framework, the proposed network learns the mapping between matched sub-images and the geometric transformation parameters directly. In the first stage, the matching of sub-images (MSI), sub-images cropped from the images are matched through corresponding heatmaps, which are made of the predicted similarity of each sub-image pair. In the second stage, the estimation of transformation parameters (ETP), a network with a weight structure and position embedding estimates the global transformation parameters from the matched pairs. The network can deal with an uncertain number of matched sub-image inputs and reduces the impact of outliers. Furthermore, a sample sharing training strategy and an augmentation based on the bounding rectangle are introduced. We evaluated our method by comparing it with conventional and deep learning methods, qualitatively and quantitatively, on the Google Earth, ISPRS, and WHU Building datasets. The experiments showed that our method obtained a probability of correct keypoints (PCK) of over 99% at α = 0.05 (α: the normalized distance threshold) and achieved a maximum increase of 16.8% at α = 0.01 compared with the latest method. The results demonstrate that our method is robust and improves precision in the registration of optical remote sensing images with great variation.
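Heatmap-based sub-image matching as in the first stage can be illustrated with classical normalized cross-correlation; the paper's network predicts the similarities with learned features, so this sketch is an analogy only, with illustrative names and a toy pattern:

```python
import numpy as np

def similarity_heatmap(sub, image):
    """Slide a sub-image over the full image and score each position by
    normalized cross-correlation, yielding a matching heatmap whose peak
    marks the matched position."""
    sh, sw = sub.shape
    H, W = image.shape
    heat = np.full((H - sh + 1, W - sw + 1), -1.0)
    s = sub - sub.mean()
    s_norm = np.linalg.norm(s)
    for y in range(heat.shape[0]):
        for x in range(heat.shape[1]):
            win = image[y:y + sh, x:x + sw]
            w = win - win.mean()
            denom = s_norm * np.linalg.norm(w)
            if denom > 0:
                heat[y, x] = float((s * w).sum() / denom)
    return heat

# Toy pattern planted at row 4, column 5 of an empty image:
img = np.zeros((10, 10))
img[4:7, 5:8] = np.arange(1, 10).reshape(3, 3)
sub = img[4:7, 5:8].copy()
heat = similarity_heatmap(sub, img)
peak = np.unravel_index(np.argmax(heat), heat.shape)
print(int(peak[0]), int(peak[1]))  # -> 4 5
```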



Author(s):  
S. N. K. Amit ◽  
S. Saito ◽  
S. Sasaki ◽  
Y. Kiyoki ◽  
Y. Aoki

Google Earth's high-resolution imagery typically takes months of processing before new images appear online. This is a time-consuming, slow process, especially for post-disaster applications. The objective of this research is to develop a fast and effective method of updating maps by detecting local differences that occurred over different time series, so that only regions with differences are updated. In our system, aerial images from the Massachusetts road and building open datasets and the Saitama district dataset are used as input images. Semantic segmentation, a pixel-wise classification of images implemented with a deep neural network, is then applied to the input images. A deep neural network is used because it is not only efficient at learning highly discriminative image features such as roads and buildings, but also partially robust to incomplete and poorly registered target maps. Aerial images carrying this semantic information are stored as a database in 5D World Map and serve as ground-truth images. This system visualises multimedia data in five dimensions: three spatial dimensions, one temporal dimension, and one degenerated dimension combining semantics and colour. Next, a ground-truth image chosen from the 5D World Map database and a new aerial image with the same spatial information but a different time stamp are compared via a difference-extraction method. The map is updated only where local changes have occurred. Hence, map updating becomes cheaper, faster, and more effective, especially for post-disaster applications, by leaving unchanged regions alone and updating only the changed regions.
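At its core, the difference-extraction step reduces to comparing two co-registered per-pixel label maps; a minimal sketch, with illustrative label values:

```python
import numpy as np

def changed_regions(mask_old, mask_new):
    """Compare two per-pixel semantic label maps of the same area and
    return a boolean map of pixels whose class changed; only these
    regions need re-processing when updating the map."""
    assert mask_old.shape == mask_new.shape
    return mask_old != mask_new

old = np.array([[0, 0, 1],
                [0, 1, 1]])   # e.g. 0 = background, 1 = building
new = np.array([[0, 0, 1],
                [0, 0, 0]])   # a building disappeared (post-disaster)
diff = changed_regions(old, new)
print(diff.sum())  # 2 pixels changed -> update only those tiles
```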


2019 ◽  
Vol 11 (10) ◽  
pp. 1157 ◽  
Author(s):  
Jorge Fuentes-Pacheco ◽  
Juan Torres-Olivares ◽  
Edgar Roman-Rangel ◽  
Salvador Cervantes ◽  
Porfirio Juarez-Lopez ◽  
...  

Crop segmentation is an important task in Precision Agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown under difficult open-field circumstances: complex lighting conditions and the non-ideal crop maintenance practices of local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and the highly variable visual appearance of the leaves. We make our CNN code available to the research community, as well as the aerial image data set and a hand-made ground-truth segmentation with pixel precision, to facilitate comparison among different algorithms.
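One common reading of "mean accuracy" for a binary crop/non-crop segmentation is the mean of per-class pixel accuracies, which avoids rewarding the majority class alone. A small sketch under that assumption (the abstract does not spell out its exact formula):

```python
import numpy as np

def mean_accuracy(pred, gt):
    """Mean of per-class pixel accuracies for a binary segmentation:
    average the fraction of correctly labelled pixels within each class."""
    accs = []
    for c in (0, 1):
        sel = (gt == c)
        if sel.any():
            accs.append(float((pred[sel] == c).mean()))
    return sum(accs) / len(accs)

gt   = np.array([[1, 1, 0], [0, 0, 0]])   # 1 = crop, 0 = non-crop
pred = np.array([[1, 0, 0], [0, 0, 1]])
print(mean_accuracy(pred, gt))  # crop: 1/2, non-crop: 3/4 -> 0.625
```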


2012 ◽  
Vol 37 (4) ◽  
pp. 168-171 ◽  
Author(s):  
Birutė Ruzgienė ◽  
Qian Yi Xiang ◽  
Silvija Gečytė

The rectification of high-resolution digital aerial images or satellite imagery employed for large-scale city mapping is a modern technology that needs well-distributed and accurately defined control points. Digital satellite imagery, obtained using the widely known Google Earth software, can be applied for accurate city map construction. The method of five control points is suggested for imagery rectification, introducing the algorithm offered by Prof. Ruan Wei (Tongji University, Shanghai). Image rectification software created on the basis of this algorithm can correct image deformation with the required accuracy, is reliable, and retains the advantage of flexibility. Experimental research testing the applied technology was executed using GeoEye imagery from Google Earth over the city of Vilnius. Orthophoto maps at the scales of 1:1000 and 1:500 are generated following the five-control-point methodology. Reference data and rectification results are checked against those received from processing digital aerial images using a digital photogrammetry approach. The image rectification process applying the investigated method takes a short period of time (about 4-5 minutes) and uses only five control points. The accuracy of the created models satisfies the requirements for large-scale mapping.
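Rectification from a small set of control points amounts to fitting a global transform by least squares. The sketch below uses the standard direct linear transform (DLT) for a homography, where five point pairs give an overdetermined system; this is generic DLT, not Prof. Ruan Wei's specific algorithm:

```python
import numpy as np

def fit_homography(src, dst):
    """Estimate a projective transform (homography) from control points via
    the direct linear transform; with five points the 10x9 system is solved
    in the least-squares sense by taking the smallest singular vector."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]           # normalize the arbitrary scale

def warp_point(H, pt):
    """Apply the homography to a 2D point (homogeneous coordinates)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

Once `H` is estimated from the five control points, every pixel of the raw image can be resampled through `warp_point` to produce the rectified orthophoto grid.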


2021 ◽  
Vol 13 (14) ◽  
pp. 2656
Author(s):  
Furong Shi ◽  
Tong Zhang

Deep-learning technologies, especially convolutional neural networks (CNNs), have achieved great success in building extraction from aerial images. However, shape details are often lost during the down-sampling process, which results in discontinuous segmentation or inaccurate segmentation boundaries. To compensate for the loss of shape information, two shape-related auxiliary tasks (boundary prediction and distance estimation) were jointly learned with the building segmentation task in our proposed network. Meanwhile, two consistency-constraint losses were designed on top of the multi-task network to exploit the duality between the mask prediction and the two shape-related predictions. Specifically, an atrous spatial pyramid pooling (ASPP) module was appended to the top of the encoder of a U-shaped network to obtain multi-scale features. Based on these features, one regression loss and two classification losses were used for predicting the distance-transform map, the segmentation mask, and the boundary. Two inter-task consistency-loss functions were constructed to ensure consistency between distance maps and masks, and between masks and boundary maps. Experimental results on three public aerial image data sets showed that our method achieved superior performance over recent state-of-the-art models.
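The two auxiliary targets, a boundary map and a distance-transform map, can both be derived from a ground-truth mask for training. A brute-force numpy sketch; production pipelines would use an optimized linear-time distance transform:

```python
import numpy as np

def boundary_map(mask):
    """Boundary pixels: foreground pixels with at least one 4-connected
    background neighbour."""
    m = mask.astype(bool)
    padded = np.pad(m, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return m & ~interior

def distance_map(mask):
    """Brute-force distance transform: for each foreground pixel, the
    Euclidean distance to the nearest background pixel (fine for small
    masks only)."""
    ys, xs = np.nonzero(~mask.astype(bool))
    bg = np.stack([ys, xs], axis=1).astype(float)
    dist = np.zeros(mask.shape)
    for y, x in zip(*np.nonzero(mask)):
        dist[y, x] = np.sqrt(((bg - [y, x]) ** 2).sum(axis=1)).min()
    return dist
```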


2021 ◽  
pp. 174702182110097
Author(s):  
Niamh Hunnisett ◽  
Simone Favelle

Unfamiliar face identification is concerningly error prone, especially across changes in viewing conditions. Within-person variability has been shown to improve matching performance for unfamiliar faces, but this has only been demonstrated using images of a front view. In this study, we test whether the advantage of within-person variability from front views extends to matching to target images of a face rotated in view. Participants completed either a simultaneous matching task (Experiment 1) or a sequential matching task (Experiment 2) in which they were tested on their ability to match the identity of a face shown in an array of either one or three ambient front-view images, with a target image shown in front, three-quarter, or profile view. While the effect was stronger in Experiment 2, we found a consistent pattern in match trials across both experiments in that there was a multiple image matching benefit for front, three-quarter, and profile-view targets. We found multiple image effects for match trials only, indicating that providing observers with multiple ambient images confers an advantage for recognising different images of the same identity but not for discriminating between images of different identities. Signal detection measures also indicate a multiple image advantage despite a more liberal response bias for multiple image trials. Our results show that within-person variability information for unfamiliar faces can be generalised across views and can provide insights into the initial processes involved in the representation of familiar faces.
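The signal detection measures referred to above are typically d' (sensitivity) and the criterion c (where negative values indicate a liberal bias, i.e. a tendency to respond "same identity"). A stdlib sketch using a common log-linear correction; the study's exact correction is not stated:

```python
from statistics import NormalDist

def _rates(hits, misses, false_alarms, correct_rejections):
    """Hit and false-alarm rates with a log-linear correction so that
    z-scores stay finite when a rate would be exactly 0 or 1."""
    hr = (hits + 0.5) / (hits + misses + 1)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return hr, far

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    hr, far = _rates(hits, misses, false_alarms, correct_rejections)
    z = NormalDist().inv_cdf
    return z(hr) - z(far)

def criterion(hits, misses, false_alarms, correct_rejections):
    """Response bias c = -(z(hit rate) + z(false-alarm rate)) / 2;
    negative c means a liberal bias toward 'match' responses."""
    hr, far = _rates(hits, misses, false_alarms, correct_rejections)
    z = NormalDist().inv_cdf
    return -0.5 * (z(hr) + z(far))
```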

