Fully Convolutional Networks and Geographic Object-Based Image Analysis for the Classification of VHR Imagery

2019 ◽  
Vol 11 (5) ◽  
pp. 597 ◽  
Author(s):  
Nicholus Mboga ◽  
Stefanos Georganos ◽  
Tais Grippa ◽  
Moritz Lennert ◽  
Sabine Vanhuysse ◽  
...  

Land cover maps classified with deep learning methods such as convolutional neural networks (CNNs) and fully convolutional networks (FCNs) usually have high classification accuracy, but the detailed structures of objects are lost or smoothed. In this work, we develop a methodology based on an FCN that is trained in an end-to-end fashion using only aerial RGB images as input. Skip connections are introduced into the FCN architecture to recover fine spatial details from the lower convolutional layers. The experiments are conducted on the city of Goma in the Democratic Republic of Congo. We compare the results to a state-of-the-art approach based on a semi-automatic geographic object-based image analysis (GEOBIA) processing chain. Both methods obtain state-of-the-art classification accuracies: the FCN and the best baseline method reach an overall accuracy (OA) of 91.3% and 89.5%, respectively. The maps have good visual quality, and the FCN skip architecture reduces the rounded edges that are characteristic of FCN maps. Additional experiments refine the FCN classified maps using GEOBIA segments generated at different scales and minimum segment sizes. An OA of up to 91.5% is achieved, accompanied by improved edge delineation in the FCN maps. Future work will involve explicitly incorporating boundary information from the GEOBIA segmentation into the FCN pipeline in an end-to-end fashion. Finally, we observe that the FCN has a lower computational cost than the standard patch-based CNN approach, especially at inference.
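
The refinement of a per-pixel classified map with segments can be illustrated with a minimal sketch. The abstract does not state which rule the authors use, so the majority vote below is an assumption, and the function name `refine_with_segments` and the toy arrays are hypothetical.

```python
from collections import Counter


def refine_with_segments(class_map, segment_map):
    """Relabel every pixel with the most frequent class inside its segment.

    Assumption: a simple majority vote; the abstract does not specify the rule.
    class_map and segment_map are equally sized 2D lists of integers.
    """
    votes = {}
    for class_row, segment_row in zip(class_map, segment_map):
        for cls, seg in zip(class_row, segment_row):
            votes.setdefault(seg, Counter())[cls] += 1
    # Ties go to the class that was encountered first within the segment.
    winner = {seg: counts.most_common(1)[0][0] for seg, counts in votes.items()}
    return [[winner[seg] for seg in segment_row] for segment_row in segment_map]


# Toy per-pixel class map with two isolated labels that disagree with their segment.
fcn_map = [
    [1, 1, 2, 2],
    [1, 2, 2, 2],
    [1, 1, 2, 1],
]
# Toy segmentation: segment 0 covers the left half, segment 1 the right half.
segments = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
refined = refine_with_segments(fcn_map, segments)
```

In this toy case the two stray pixels are absorbed by their segments, so the class boundary follows the segment boundary.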

2019 ◽  
Vol 11 (6) ◽  
pp. 684 ◽  
Author(s):  
Maria Papadomanolaki ◽  
Maria Vakalopoulou ◽  
Konstantinos Karantzalos

Deep learning architectures have received much attention in recent years, demonstrating state-of-the-art performance in several segmentation, classification and other computer vision tasks. Most of these deep networks are based on either convolutional or fully convolutional architectures. In this paper, we propose a novel object-based deep-learning framework for semantic segmentation in very high-resolution satellite data. In particular, we exploit object-based priors integrated into a fully convolutional neural network by incorporating an anisotropic diffusion data preprocessing step and an additional loss term during the training process. Under this constrained framework, the goal is to ensure that pixels belonging to the same object are assigned to the same semantic category. We thoroughly compared the novel object-based framework with the currently dominant convolutional and fully convolutional deep networks. In particular, numerous experiments were conducted on the publicly available ISPRS WGII/4 benchmark datasets, namely Vaihingen and Potsdam, for validation and inter-comparison based on a variety of metrics. Quantitatively, experimental results indicate that, overall, the proposed object-based framework slightly outperformed the current state-of-the-art fully convolutional networks by more than 1% in terms of overall accuracy, while intersection over union results improved for all semantic categories. Qualitatively, man-made classes with stricter geometry, such as buildings, benefited most from our method, especially along object boundaries, highlighting the great potential of the developed approach.


2021 ◽  
Vol 11 (15) ◽  
pp. 6975
Author(s):  
Tao Zhang ◽  
Lun He ◽  
Xudong Li ◽  
Guoqing Feng

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved a high level of accuracy on large datasets and made breakthrough progress. However, lipreading is still far from being solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing gradients during training and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model, using an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. This partly mitigates the vanishing gradients and limited performance of RNNs (LSTM, GRU), and yields a notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than with the state-of-the-art method, and that accuracy improves by 2.4% on the GRID dataset.
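
As a rough illustration of how a CTC output is turned into a label sequence, the sketch below applies the standard greedy collapse rule (merge consecutive repeats, then drop blanks). It is not taken from the paper: the function name `ctc_greedy_decode`, the choice of 0 as the blank index and the toy frame labels are assumptions.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame label sequence into an output sequence.

    Standard CTC collapse rule: consecutive repeats are merged, then blanks
    are removed. A blank between two identical labels keeps both of them.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical per-frame argmax labels for eight video frames (0 is the blank).
frames = [0, 3, 3, 0, 3, 5, 5, 0]
decoded = ctc_greedy_decode(frames)
```

Because a blank separates the two runs of label 3, both are kept, which is how CTC can emit repeated characters.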


2012 ◽  
Vol 18 (2) ◽  
pp. 302-326 ◽  
Author(s):  
Cristiane Nunes Francisco ◽  
Cláudia Maria de Almeida

This article evaluates the performance of two semantic networks generated by data mining for land cover classification using geographic object-based image analysis (GEOBIA). One network used statistical and textural descriptors, and the other used statistical descriptors only. The dataset consisted of ALOS/AVNIR images fused with ALOS/PRISM images, together with relief data from the TOPODATA database. The study area is the municipality of Nova Friburgo, which covers 933 km² in the mountainous region of the state of Rio de Janeiro. The Kappa index achieved by the decision-tree classification using statistical and textural descriptors was 0.81, while the value for the classification derived from statistical descriptors only was 0.84. Given these indices, both results are of excellent quality in terms of classification accuracy. A hypothesis test between the two indices shows, at the 5% significance level, that there is no difference in accuracy between the two classifications.
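
For readers unfamiliar with the Kappa index, the sketch below computes Cohen's kappa from a confusion matrix using the standard definition, (observed agreement - chance agreement) / (1 - chance agreement). The confusion matrix is invented for illustration and is not data from the article.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix with 100 validation samples.
confusion = [
    [45, 5],
    [10, 40],
]
kappa = cohens_kappa(confusion)
```

Here the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is about 0.70.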


Sensors ◽  
2020 ◽  
Vol 20 (14) ◽  
pp. 3818
Author(s):  
Ye Zhang ◽  
Yi Hou ◽  
Shilin Zhou ◽  
Kewei Ouyang

Recent advances in time series classification (TSC) have exploited deep neural networks (DNNs) to improve performance. One promising approach encodes time series as recurrence plot (RP) images so that state-of-the-art DNNs can be leveraged to improve accuracy. This approach has been shown to achieve impressive results and has attracted growing interest from the community. However, it remains an open problem how to handle not only the variability in the distinctive region scale and the length of sequences but also the tendency confusion problem. In this paper, we tackle the problem using Multi-scale Signed Recurrence Plots (MS-RP), an improvement of RP, and propose a novel method based on MS-RP images and Fully Convolutional Networks (FCN) for TSC. The method first introduces the phase space dimension and time delay embedding of RP to produce multi-scale RP images; then, by using an asymmetrical structure, the constructed RP images can represent very long sequences (>700 points). Next, MS-RP images are obtained by multiplying by designed sign masks in order to remove the tendency confusion. Finally, the FCN is trained with MS-RP images to perform classification. Experimental results on 45 benchmark datasets demonstrate that our method improves the state of the art in terms of classification accuracy and visualization evaluation.
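
The basic recurrence plot that MS-RP builds on can be sketched as follows. This is the plain thresholded RP with time-delay embedding, not the multi-scale signed variant proposed in the paper; the function names, the threshold `eps` and the toy series are assumptions.

```python
import math


def embed(series, dim, delay):
    """Time-delay embedding: state i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    n_states = len(series) - (dim - 1) * delay
    return [[series[i + k * delay] for k in range(dim)] for i in range(n_states)]


def recurrence_plot(series, dim=2, delay=1, eps=0.5):
    """Binary matrix R with R[i][j] = 1 when states i and j lie within eps of each other."""
    states = embed(series, dim, delay)
    return [[1 if math.dist(a, b) <= eps else 0 for b in states] for a in states]


# Toy periodic series: states two steps apart coincide, so recurrences repeat.
series = [0.0, 1.0, 0.0, 1.0, 0.0]
rp = recurrence_plot(series, dim=2, delay=1, eps=0.5)
```

The resulting matrix is symmetric, which is why tendency information (rising versus falling) is lost in a plain RP; the abstract states that the sign masks in MS-RP are designed to remove that confusion.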


2019 ◽  
Vol 8 (1) ◽  
pp. 46 ◽  
Author(s):  
François Merciol ◽  
Loïc Faucqueur ◽  
Bharath Damodaran ◽  
Pierre-Yves Rémy ◽  
Baudouin Desclée ◽  
...  

Land cover mapping has benefited greatly from the introduction of the Geographic Object-Based Image Analysis (GEOBIA) paradigm, which made it possible to move from a pixelwise analysis to the processing of elements with richer semantic content, namely objects or regions. However, this paradigm requires defining an appropriate scale, which can be challenging in a large-area study where a wide range of landscapes can be observed. We propose here to conduct the multiscale analysis based on hierarchical representations, from which features known as differential attribute profiles are derived for each pixel. Efficient and scalable algorithms for the construction and analysis of such representations, together with an optimized usage of the random forest classifier, provide us with a semi-supervised framework in which a user can drive the mapping of elements such as Small Woody Features over a very large area. Indeed, the proposed open-source methodology has been successfully used to derive a part of the High Resolution Layers (HRL) product of the Copernicus Land Monitoring service, thus showing how the GEOBIA framework can be used in a big data scenario made of more than 38,000 Very High Resolution (VHR) satellite images representing more than 120 TB of data.


2019 ◽  
Vol 8 (12) ◽  
pp. 551 ◽  
Author(s):  
Raphael Knevels ◽  
Helene Petschko ◽  
Philip Leopold ◽  
Alexander Brenning

With the increased availability of high-resolution digital terrain models (HRDTM) generated using airborne light detection and ranging (LiDAR), new opportunities arise for improved mapping of geohazards such as landslides. While the visual interpretation of LiDAR HRDTM hillshades is a widely used approach, the automatic detection of landslides promises to significantly speed up the compilation of inventories. Previous studies on automatic landslide detection often used a combination of optical imagery and geomorphometric data, and were implemented in commercial software. The objective of this study was to investigate the potential of open source software for automated landslide detection based solely on HRDTM-derived data in a study area in Burgenland, Austria. We implemented a geographic object-based image analysis (GEOBIA) consisting of (1) the calculation of land-surface variables, textural features and shape metrics, (2) the automated optimization of segmentation scale parameters, (3) region-growing segmentation of the landscape, (4) the supervised classification of landslide parts (scarp and body) using support vector machines (SVM), and (5) an assessment of the overall classification performance using a landslide inventory. We used the free and open source data-analysis environment R and its coupled geographic information system (GIS) software for the analysis; our code is included in the Supplementary Materials. The developed approach achieved a good performance (κ = 0.42) in the identification of landslides.


Water ◽  
2019 ◽  
Vol 11 (6) ◽  
pp. 1133 ◽  
Author(s):  
Mark Randall ◽  
Rasmus Fensholt ◽  
Yongyong Zhang ◽  
Marina Bergen Jensen

China’s Sponge City initiative will involve widespread installation of new stormwater infrastructure, including green roofs, permeable pavements and rain gardens, in at least 30 cities. Hydrologic modelling can support the planning of Sponge Cities at the catchment scale; however, highly detailed spatial data for model input can be challenging to compile from the various authorities or, if available, may not be sufficiently detailed or up to date. Remote sensing methods show great promise for mitigating this challenge due to their ability to efficiently classify satellite images into categories relevant to a specific application. In this study, Geographic Object-Based Image Analysis (GEOBIA) was applied to WorldView-3 satellite imagery (2017) to create a detailed land cover map of an urban catchment area in Beijing. While a Bayesian machine learning classifier alone provided an overall land cover classification accuracy of 63%, the subsequent inclusion of a series of refining rules in combination with supplementary data (including elevation and parcel delineations) yielded a significantly improved overall accuracy of 76%. The results highlight the limitations of automated classification based on satellite imagery alone and the value of supplementary data and additional rules for refining classification results. Catchment-scale hydrologic modelling based on the generated land cover results indicated that 61 to 82% of rainfall volume could be captured for a range of 24 h design storms under varying degrees of Sponge City implementation.

