Efficient Discrimination and Localization of Multimodal Remote Sensing Images Using CNN-Based Prediction of Localization Uncertainty

2020 ◽  
Vol 12 (4) ◽  
pp. 703 ◽  
Author(s):  
Mykhail Uss ◽  
Benoit Vozel ◽  
Vladimir Lukin ◽  
Kacem Chehdi

Detecting similarities between image patches and measuring their mutual displacement are important steps in the registration of multimodal remote sensing (RS) images. Deep learning approaches advance the discriminative power of learned similarity measures (SM). However, their ability to find the best spatial alignment of the compared patches is often ignored. We propose to unify the patch discrimination and localization problems by assuming that the more accurately two patches can be aligned, the more similar they are. The uncertainty or confidence in the localization of a patch pair serves as a similarity measure of these patches. We train a two-channel patch matching convolutional neural network (CNN), called DLSM, to solve a regression problem with uncertainty. This CNN takes two multimodal patches as input, and outputs a prediction of the translation vector between the input patches as well as the uncertainty of this prediction in the form of an error covariance matrix of the translation vector. The proposed patch matching CNN predicts a two-dimensional normal distribution of the translation vector rather than a single point estimate. The determinant of the covariance matrix is used as a measure of uncertainty in the matching of patches and also as a measure of similarity between patches. For training, we used a Siamese architecture with three towers. During training, the input of two towers is the same pair of multimodal patches but shifted by a random translation; the last tower is fed a pair of dissimilar patches. Experiments performed on a large set of real RS images show that the proposed DLSM has both a higher discriminative power and a more precise localization compared to existing hand-crafted SMs and SMs trained with conventional losses. Unlike existing SMs, DLSM correctly predicts the translation error distribution ellipse for different modalities, noise levels, and both isotropic and anisotropic structures.
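
The abstract above does not state the loss used to train DLSM. As a hedged illustration of regression with uncertainty, the sketch below uses the standard negative log-likelihood of a two-dimensional Gaussian, which is one common choice for this kind of problem, and reads the covariance determinant as the uncertainty score. The function names `gaussian_nll_2d` and `uncertainty_score` are hypothetical and are not taken from the paper.

```python
import numpy as np

def gaussian_nll_2d(t_true, t_pred, cov):
    """Negative log-likelihood of t_true under N(t_pred, cov) for a 2-D translation.

    Assumption: a plain Gaussian NLL; the paper's actual loss may differ.
    """
    diff = np.asarray(t_true, dtype=float) - np.asarray(t_pred, dtype=float)
    # Cholesky factor L with cov = L @ L.T; raises LinAlgError if cov is not positive definite.
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    # Mahalanobis term diff^T cov^{-1} diff, computed as ||L^{-1} diff||^2.
    z = np.linalg.solve(chol, diff)
    maha = float(z @ z)
    # log det(cov) = 2 * sum(log(diag(L))); it penalizes inflated uncertainty.
    logdet = 2.0 * float(np.sum(np.log(np.diag(chol))))
    # For dimension d = 2 the constant (d / 2) * log(2 * pi) equals log(2 * pi).
    return 0.5 * (maha + logdet) + float(np.log(2.0 * np.pi))

def uncertainty_score(cov):
    """Determinant of the predicted covariance: smaller means a more confident match."""
    return float(np.linalg.det(np.asarray(cov, dtype=float)))
```

Under this reading, a patch pair whose predicted covariance has a small determinant would be ranked as more similar than a pair with a large one, which matches the role the abstract assigns to the determinant.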

2021 ◽  
Vol 13 (11) ◽  
pp. 2171
Author(s):  
Yuhao Qing ◽  
Wenyi Liu ◽  
Liuyan Feng ◽  
Wanjia Gao

Despite significant progress in object detection tasks, remote sensing image target detection is still challenging owing to complex backgrounds, large differences in target sizes, and uneven distribution of rotating objects. In this study, we consider model accuracy, inference speed, and detection of objects at any angle. We propose a RepVGG-YOLO network that uses an improved RepVGG model as the backbone feature extraction network, which performs the initial feature extraction from the input image and takes both network training accuracy and inference speed into account. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess the features output by the backbone network. The FPN and PANet modules integrate feature maps of different layers, combine context information on multiple scales, accumulate multiple features, and strengthen feature information extraction. Finally, to maximize the detection accuracy of objects of all sizes, we use four target detection scales at the network output to enhance feature extraction from small remote sensing target pixels. To handle objects at arbitrary angles, we improved the classification loss function using the circular smooth label technique, turning the angle regression problem into a classification problem and increasing the detection accuracy of objects at any angle. We conducted experiments on two public datasets, DOTA and HRSC2016. Our results show the proposed method performs better than previous methods.
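
To make the idea of turning angle regression into classification more concrete, here is a minimal sketch of a circular smooth label encoding. It assumes angles in the range [0, 180) degrees, one bin per degree, and a Gaussian window; none of these settings is given in the abstract, and the names `circular_smooth_label` and `decode_angle` are hypothetical.

```python
import numpy as np

def circular_smooth_label(angle_deg, num_bins=180, sigma=6.0):
    """Encode an angle as a smoothed classification target that wraps around.

    num_bins and sigma are illustrative values, not taken from the paper.
    """
    bin_width = 180.0 / num_bins  # assumption: angles live in [0, 180)
    center = int(angle_deg // bin_width) % num_bins
    bins = np.arange(num_bins)
    # Circular distance from every bin to the true bin, so bin 0 is next to the last bin.
    dist = np.abs(bins - center)
    dist = np.minimum(dist, num_bins - dist)
    # Gaussian window: 1.0 at the true bin, decaying for neighbouring bins.
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))

def decode_angle(label, num_bins=180):
    """Recover the angle (lower edge of the highest-scoring bin) in degrees."""
    return float(np.argmax(label)) * (180.0 / num_bins)
```

Because the distance is circular, an angle of 179 degrees gives the same target weight to bin 0 as to bin 178, so predictions just across the angular boundary are not penalized as if they were far away.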


2020 ◽  
Vol 12 (11) ◽  
pp. 1772
Author(s):  
Brian Alan Johnson ◽  
Lei Ma

Image segmentation and geographic object-based image analysis (GEOBIA) were proposed around the turn of the century as a means to analyze high-spatial-resolution remote sensing images. Since then, object-based approaches have been used to analyze a wide range of images for numerous applications. In this Editorial, we present some highlights of image segmentation and GEOBIA research from the last two years (2018–2019), including a Special Issue published in the journal Remote Sensing. As a final contribution of this special issue, we have shared the views of 45 other researchers (corresponding authors of published papers on GEOBIA in 2018–2019) on the current state and future priorities of this field, gathered through an online survey. Most researchers surveyed acknowledged that image segmentation/GEOBIA approaches have achieved a high level of maturity, although the need for more free user-friendly software and tools, further automation, better integration with new machine-learning approaches (including deep learning), and more suitable accuracy assessment methods was frequently pointed out.


2018 ◽  
Vol 10 (8) ◽  
pp. 1298 ◽  
Author(s):  
Lei Yin ◽  
Xiangjun Wang ◽  
Yubo Ni ◽  
Kai Zhou ◽  
Jilong Zhang

Multi-camera systems are widely used in the fields of airborne remote sensing and unmanned aerial vehicle imaging. The measurement precision of these systems depends on the accuracy of the extrinsic parameters. Therefore, it is important to accurately calibrate the extrinsic parameters between the onboard cameras. Unlike conventional multi-camera calibration methods that rely on a common field of view (FOV), calibration of multi-camera systems without overlapping FOVs poses particular difficulties. In this paper, we propose a calibration method for a multi-camera system without common FOVs, intended for aerial photogrammetry. First, the extrinsic parameters of any two cameras in a multi-camera system are calibrated, and the extrinsic matrix is optimized by minimizing the re-projection error. Then, the extrinsic parameters of each camera are unified to the system reference coordinate system by using a global optimization method. A simulation experiment and a physical verification experiment are designed to verify the proposed algorithm. The experimental results show that this method is feasible. The rotation error angle of the camera’s extrinsic parameters is less than 0.001 rad and the translation error is less than 0.08 mm.
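
The following sketch illustrates two ingredients mentioned in the abstract above: expressing every camera in a single reference coordinate system, and measuring a rotation error angle in radians. It only composes pairwise rigid transforms and does not reproduce the authors' global optimization. The helper names and the convention that `pairwise[i]` maps camera i+1 coordinates into camera i coordinates are assumptions made for illustration.

```python
import numpy as np

def make_extrinsic(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation R and a translation t."""
    T = np.eye(4)
    T[:3, :3] = np.asarray(R, dtype=float)
    T[:3, 3] = np.asarray(t, dtype=float)
    return T

def chain_to_reference(pairwise):
    """Compose pairwise extrinsics so that every camera is expressed in camera 0.

    Assumed convention: pairwise[i] maps camera i+1 coordinates to camera i coordinates.
    Returns a list whose k-th entry maps camera k coordinates to camera 0 coordinates.
    """
    to_ref = [np.eye(4)]
    for T in pairwise:
        to_ref.append(to_ref[-1] @ T)
    return to_ref

def rotation_error_angle(R_est, R_true):
    """Angle in radians of the residual rotation R_est @ R_true.T."""
    R_delta = np.asarray(R_est, dtype=float) @ np.asarray(R_true, dtype=float).T
    # For a rotation matrix, trace = 1 + 2 * cos(angle); clip guards against round-off.
    cos_theta = (np.trace(R_delta) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```

A simple chain like this accumulates the error of each pairwise estimate, which is one reason a global optimization step, as described in the abstract, is used to refine the unified parameters.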


Author(s):  
Ying Qin

This study extracts the comments from a large-scale translation corpus of Chinese EFL learners to study the taxonomy of translation errors. Two unsupervised machine learning approaches are used to obtain computational evidence for the translation error taxonomy. After manual revision, ten types of English-to-Chinese (E2C) and eight types of Chinese-to-English (C2E) translation errors are finally confirmed. There probably exist three categories of top-level errors according to the hierarchical clustering results. In addition, three supervised learning methods are applied to automatically recognize the types of errors, among which the highest performance reaches F1 = 0.85 on E2C and F1 = 0.90 on C2E translation. Further comparison with intuitive or theoretical studies on translation taxonomy reveals some phenomena that accompany the language skill improvement of Chinese learners. Analysis of translation problems based on machine learning provides objective insight into and understanding of the students' translations.


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3929 ◽  
Author(s):  
Grigorios Tsagkatakis ◽  
Anastasia Aidini ◽  
Konstantina Fotiadou ◽  
Michalis Giannopoulos ◽  
Anastasia Pentari ◽  
...  

Deep Learning, and Deep Neural Networks in particular, have established themselves as the new norm in signal and data processing, achieving state-of-the-art performance in image, audio, and natural language understanding. In remote sensing, a large body of research has been devoted to the application of deep learning for typical supervised learning tasks such as classification. A smaller yet equally important effort has been devoted to addressing the challenges associated with the enhancement of low-quality observations from remote sensing platforms. Addressing such challenges is of paramount importance, both in itself, since high-altitude imaging, environmental conditions, and imaging system trade-offs lead to low-quality observations, and as a means to facilitate subsequent analysis, such as classification and detection. In this paper, we provide a comprehensive review of deep-learning methods for the enhancement of remote sensing observations, focusing on critical tasks including single and multi-band super-resolution, denoising, restoration, pan-sharpening, and fusion, among others. In addition to the detailed analysis and comparison of recently presented approaches, different research avenues which could be explored in the future are also discussed.


2020 ◽  
Vol 12 (10) ◽  
pp. 1586
Author(s):  
Leonardo F. Arias-Rodriguez ◽  
Zheng Duan ◽  
Rodrigo Sepúlveda ◽  
Sergio I. Martinez-Martinez ◽  
Markus Disse

Remote-sensing-based machine learning approaches for estimating two water quality parameters, Secchi Disk Depth (SDD) and Turbidity, were developed for the Valle de Bravo reservoir in central Mexico. This waterbody is a multipurpose reservoir, which provides drinking water to the metropolitan area of Mexico City. Evaluating MERIS imagery is a valuable way to reveal the water quality status of inland waters over the last decade. This study combined in-situ measurements collected across the reservoir with remote sensing reflectance data from the Medium Resolution Imaging Spectrometer (MERIS). Machine learning approaches of varying complexity were tested, and the optimal model for SDD and Turbidity was determined. Cross-validation demonstrated that the satellite-based estimates are consistent with the in-situ measurements for both SDD and Turbidity, with R² values of 0.81 to 0.86 and RMSE of 0.15 m and 0.95 nephelometric turbidity units (NTU), respectively. The best model was applied to a time series of MERIS images to analyze the spatial and temporal variations of the reservoir’s water quality from 2002 to 2012. The analysis revealed yearly patterns caused by dry and rainy seasons, and several disruptions were identified. The reservoir varied from trophic to intermittent hypertrophic status, while SDD ranged from 0 to 1.93 m and Turbidity reached up to 23.70 NTU. Results suggest that the drought events of 2006 and 2009 were correlated with a deterioration in water quality. The water quality displayed slow recovery through 2011–2012. This study demonstrates the usefulness of satellite observations for supporting inland water quality monitoring and water management in this region.
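
For readers unfamiliar with the two agreement metrics quoted above, the sketch below shows how RMSE and R² are typically computed from paired in-situ and satellite-derived values. It assumes R² is the coefficient of determination; the abstract does not say whether it is that or a squared correlation coefficient. The function names are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the measurements (e.g. m or NTU)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (an assumed definition of R²)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)           # unexplained variation
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total variation around the mean
    return float(1.0 - ss_res / ss_tot)
```

In a cross-validation setting, both functions would be applied to the held-out measurements of each fold, so the scores reflect data the model did not see during fitting.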


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1302
Author(s):  
Luis Naranjo-Zeledón ◽  
Mario Chacón-Rivas ◽  
Jesús Peral ◽  
Antonio Ferrández

The study of phonological proximity makes it possible to establish a basis for future decision-making in the treatment of sign languages. Knowing how close the signs in a set are to one another makes it easier to decide how to study them through clustering, and to teach the language to third parties based on similarities. In addition, it lays the foundation for strengthening disambiguation modules in automatic recognition systems. To the best of our knowledge, this is the first study of its kind for Costa Rican Sign Language (LESCO, for its Spanish acronym), and it forms the basis for one of the modules of the already operational sign and speech editing system called the International Platform for Sign Language Edition (PIELS). A database of 2665 signs, grouped into eight contexts, is used, and similarity measures are compared using standard statistical formulas to measure their degree of correlation. This corpus will be especially useful in machine learning approaches. In this work, we have proposed an analysis of different similarity measures between signs in order to determine the phonological proximity between them. After analyzing the results obtained, we can conclude that LESCO is a sign language with high levels of phonological proximity, particularly in the orientation and location components, but with noticeably lower levels in the form component. We have also concluded, as an outstanding contribution of our research, that automatic recognition systems can base their first prototypes on the contexts or sign domains that map to clusters with lower levels of similarity. As mentioned, the results obtained have multiple applications, such as in teaching or in Natural Language Processing for automatic recognition tasks.

