Semantic Segmentation of Building Roof in Dense Urban Environment with Deep Convolutional Neural Network: A Case Study Using GF2 VHR Imagery in China

Sensors ◽  
2019 ◽  
Vol 19 (5) ◽  
pp. 1164 ◽  
Author(s):  
Yuchu Qin ◽  
Yunchao Wu ◽  
Bin Li ◽  
Shuai Gao ◽  
Miao Liu ◽  
...  

This paper presents a novel approach for semantic segmentation of building roofs in dense urban environments with a Deep Convolutional Neural Network (DCNN) using Chinese Very High Resolution (VHR) satellite (i.e., GF2) imagery. To provide an operational end-to-end approach for accurately mapping building roofs with feature extraction and image segmentation, a fully convolutional DCNN with both convolutional and deconvolutional layers is designed to perform building roof segmentation. We selected typical cities with dense and diverse urban environments in different metropolitan regions of China as study areas, and sample images were collected over these cities. High-performance GPU-mounted workstations are employed to perform the model training and optimization. With the building roof samples collected over different cities, the predictive model with convolution layers is developed for building roof segmentation. The validation shows that the overall accuracy (OA) and the mean Intersection over Union (mIOU) of the DCNN-based semantic segmentation results are 94.67% and 0.85, respectively, and the results refined with a conditional random field (CRF) achieved an OA of 94.69% and an mIOU of 0.83. The results suggest that the proposed approach is a promising solution for building roof mapping with VHR images over large areas in dense urban environments with different building patterns. With the operational acquisition of GF2 VHR imagery, an automated pipeline for operational built-up area monitoring is expected to be developed, and timely updates of the building roof map could be applied in urban management and in the assessment of human settlement-related sustainable development goals over large areas.
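The two validation metrics quoted above, OA and mIOU, can both be derived from a confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names and the toy roof/non-roof labels are hypothetical, and classes with no pixels are simply skipped when averaging.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are reference classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the reference class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def mean_iou(matrix):
    """Average over classes of TP / (TP + FP + FN); empty classes are skipped."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example (made-up labels): 0 = non-roof, 1 = roof, flattened label maps.
reference = [0, 0, 0, 0, 1, 1, 1, 1]
predicted = [0, 0, 0, 1, 1, 1, 1, 0]
cm = confusion_matrix(reference, predicted, num_classes=2)
oa = overall_accuracy(cm)
miou = mean_iou(cm)
```

Because OA counts every correctly labeled pixel while mIOU penalizes both false positives and false negatives per class, the two metrics can move in opposite directions, as they do for the CRF-refined results reported above.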

Author(s):  
Yuchu Qin ◽  
Yunchao Wu ◽  
Bin Li ◽  
Shuai Gao ◽  
Miao Liu ◽  
...  

This paper presents a novel approach for semantic segmentation of building roofs in dense urban environments with a Deep Convolutional Neural Network (DCNN), using imagery acquired by a Chinese Very High Resolution (VHR) satellite mission, GaoFen-2 (GF-2). To provide an operational end-to-end workflow for accurate building roof mapping with feature extraction as well as image segmentation, a fully convolutional DCNN with both convolutional and deconvolutional layers is designed to perform the VHR image analysis for labeling pixels. Because urban patterns and building styles vary widely over large areas, sample image data sets of building roofs and non-building roofs are collected over different metropolitan regions in China. We selected typical cities with dense urban environments in each metropolitan region as study areas for collecting training and test samples. A high-performance cluster with GPU-mounted workstations is employed to perform the model training and optimization. With the building roof samples collected over different cities, the predictive model with multiple neural network layers is developed for building roof labeling. The validation of the building roof map shows that the overall accuracy (OA) and the mean Intersection over Union (mIOU) of the DCNN-based segmentation are 94.67% and 0.85, respectively, while the CRF-refined segmentation achieved an OA of 94.69% and an mIOU of 0.83. The results suggest that the proposed approach is a promising solution for building roof mapping with VHR images over large areas across different urban and building patterns. With the operational acquisition of GF2 VHR imagery, an automated pipeline for operational built-up area monitoring and timely updates of the building roof map over large areas is expected to be developed.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1365
Author(s):  
Tao Zheng ◽  
Zhizhao Duan ◽  
Jin Wang ◽  
Guodong Lu ◽  
Shengjie Li ◽  
...  

Semantic segmentation of room maps is essential for mobile robots to execute their tasks. In this work, we propose a new approach to obtaining the semantic labels of 2D lidar room maps that combines distance transform watershed-based pre-segmentation with a purpose-built neural network that classifies sampled lidar information. To label room maps efficiently, precisely and quickly, we designed a low-power, high-performance method that can be deployed on Raspberry Pi devices with low computing power. In the training stage, a lidar is simulated to collect the lidar detection line maps of each point in the manually labelled map, and these line maps and the corresponding labels are used to train the designed neural network. In the testing stage, the new map is first pre-segmented into simple cells with the distance transform watershed method, and the lidar detection line maps are then classified with the trained neural network. Optimized areas for sparse sampling points are derived from the distance transform result generated in the pre-segmentation process, which prevents sampling points selected in boundary regions from influencing the semantic labeling results. A prototype mobile robot was developed to verify the proposed method, and its feasibility, validity, robustness and high efficiency were verified in a series of tests. The proposed method achieved high recall and precision: the mean recall is 0.965 and the mean precision is 0.943.
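The role of the distance transform in keeping sampling points away from boundary regions can be illustrated with a small grid example. This is a minimal sketch under stated assumptions, not the authors' pipeline: it uses a breadth-first search with 4-connected (Manhattan) distance, omits the watershed step entirely, and the function names and the `min_clearance` threshold are hypothetical.

```python
from collections import deque


def distance_transform(grid):
    """For every cell, return the 4-connected step count to the nearest obstacle (1)."""
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every obstacle cell at distance 0.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                dist[r][c] = 0
                queue.append((r, c))
    # Multi-source breadth-first search outward from the obstacles.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def interior_sampling_points(dist, min_clearance):
    """Keep only cells at least min_clearance steps away from any obstacle."""
    return [
        (r, c)
        for r, row in enumerate(dist)
        for c, d in enumerate(row)
        if d is not None and d >= min_clearance
    ]


# Toy occupancy map: 1 = wall, 0 = free space (a single small room).
room = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
dist = distance_transform(room)
points = interior_sampling_points(dist, min_clearance=2)
```

Raising `min_clearance` shrinks the candidate region toward the room centre, which is the effect the abstract describes: points near walls and doorways are excluded before classification.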


2020 ◽  
pp. 15-21
Author(s):  
R. N. Kvetny ◽  
R. V. Masliy ◽  
A. M. Kyrylenko ◽  
V. V. Shcherba

The article is devoted to the study of object detection in images using neural networks. The structure of convolutional neural networks used for image processing is considered. The formation of the convolutional layer (Fig. 1), the sub-sampling layer (Fig. 2) and the fully connected layer (Fig. 3) is described in detail. An overview is given of popular high-performance convolutional neural network architectures used for object detection: R-FCN, YOLO, Faster R-CNN, SSD and DetectNet. The basic stages of image processing by the DetectNet neural network, which is designed to detect objects in images, are discussed. NVIDIA DIGITS was used to create and train models, and several DetectNet models were trained using this environment. The parameters of the experiments (Table 1) and a comparison of the quality of the trained models (Table 2) are presented. Images from the KITTI database were used as training and validation data. KITTI was created to improve self-driving systems, which depend on built-in devices, one of which could be the Jetson TX2. KITTI's images feature several object classes, including cars and pedestrians. Model training and testing were performed using a Jetson TX2 supercomputer. Five models were trained that differed in the Base learning rate parameter. The results obtained make it possible to find a compromise value for the Base learning rate parameter to quickly obtain a model with a high mAP value. The quality of the best model obtained on the KITTI validation dataset is mAP = 57.8%.
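The convolutional and sub-sampling layers mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the article: it uses a single channel, no padding, stride 1 for the convolution and non-overlapping max pooling for the sub-sampling, and the image and kernel values are made up.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows or columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b] for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 0],
    [1, 0, 1, 2, 3],
    [2, 1, 0, 1, 2],
    [3, 2, 1, 0, 1],
]
kernel = [[1, 0], [0, 1]]  # adds each pixel to its lower-right neighbour
features = conv2d_valid(image, kernel)  # 5x5 input -> 4x4 feature map
pooled = max_pool(features)             # 4x4 feature map -> 2x2 after sub-sampling
```

In a trained network the kernel values are learned rather than fixed, and many kernels run in parallel to produce a stack of feature maps.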


Author(s):  
M Vaishnavi ◽  
K Varshitha ◽  
G Usha ◽  
C Mounika ◽  
C Narasimha

This paper proposes a novel approach to semantic segmentation, a growing challenge that has drawn keen interest in methods that are both fast and accurate. We address the problem with SegNet, aiming to improve accuracy, computational time, and inference time. The model uses max-pooling and batch normalization techniques to map low-resolution features to the input resolution for pixel-wise classification. The architecture consists of an encoder, which takes the input image and is built from 13 convolutional layers, and a decoder, followed by a pixel-wise classification layer. Compared with other architectures, SegNet provides good performance with competitive inference time and efficient memory use. We therefore present a deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet.
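The abstract does not spell out how low-resolution features are mapped back to the input resolution. In the original SegNet design, the decoder reuses the positions recorded during max-pooling in the encoder. The following is a minimal single-channel sketch of that idea, with hypothetical function names and made-up values; it is not the authors' implementation.

```python
def max_pool_with_indices(fmap, size=2):
    """Max-pool non-overlapping windows and remember where each maximum came from."""
    pooled, indices = [], []
    for i in range(0, len(fmap) - size + 1, size):
        pooled_row, index_row = [], []
        for j in range(0, len(fmap[0]) - size + 1, size):
            window = [
                (fmap[i + a][j + b], (i + a, j + b))
                for a in range(size)
                for b in range(size)
            ]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, out_shape):
    """Place each pooled value back at its recorded position; all other cells stay 0."""
    rows, cols = out_shape
    out = [[0] * cols for _ in range(rows)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


encoder_map = [
    [1, 5, 2, 0],
    [3, 4, 1, 7],
    [0, 2, 9, 1],
    [6, 1, 3, 2],
]
pooled, indices = max_pool_with_indices(encoder_map)
decoder_map = max_unpool(pooled, indices, (4, 4))
```

The unpooled map is sparse; in the full architecture, decoder convolutions then densify it. Storing only the indices rather than whole encoder feature maps is what keeps the memory footprint low.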


2019 ◽  
Vol 2019 ◽  
pp. 1-13
Author(s):  
Huan Luo ◽  
Miaohua Huang ◽  
Wei Xiong

The durability and reliability of structural components are usually assessed based on fatigue loading under operating conditions. To obtain accurate fatigue loading in the form of continuous strain histories, a novel approach is proposed that combines a recurrent neural network with a simplified semianalytical method. The recurrent neural network, a nonlinear autoregressive model with exogenous inputs (NLARX), is applied to determine the relationship between external loads and the corresponding fatigue loading. Owing to the generalization ability of NLARX, the semianalytical method, which is used to obtain the sample database for NLARX model training and testing, is implemented with a simplified multibody model. Durability tests of a torsion beam rear suspension are introduced to demonstrate the effectiveness of the proposed approach. The experimental results show that the proposed approach achieves better estimation results than the conventional semianalytical method.
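For orientation, a nonlinear autoregressive model with exogenous inputs is commonly written in the generic form below. This is the textbook formulation, not an equation taken from the paper: here $y$ stands for the fatigue loading (strain), $u$ for the external loads, $f$ for the nonlinear map realized by the network, and the lag orders $n_y$ and $n_u$ are symbols introduced for illustration.

```latex
\hat{y}(t) = f\bigl(y(t-1), \dots, y(t-n_y),\; u(t-1), \dots, u(t-n_u)\bigr)
```

The feedback of past outputs $y(t-1), \dots, y(t-n_y)$ into the prediction is what makes the model recurrent.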


Author(s):  
L. Mohr ◽  
R. Benauer ◽  
P. Leitl ◽  
F. Fraundorfer

Precise models of the impact of explosions in urban environments provide novel and valuable information in disaster management for developing precautionary, preventive and mitigating measures. Yet to date, no methods exist that enable accurate predictions of the process and effect of detonations at particular locations. We propose a novel approach that closes this gap by combining state-of-the-art methods from photogrammetric 3D reconstruction, semantic segmentation and computation-based numerical simulation. In a first step, we create an accurate urban 3D reconstruction from georeferenced aerial images. The resulting city model is then enriched with semantic information obtained from the original source images as well as from registered terrestrial images using deep neural networks. This allows for an efficient automatic preparation of a 3D model suitable for use as a geometry for the numerical investigations. Using this approach, we are able to provide recent and precise models of an area of interest in an automated fashion. Within the model, we are then able to define the explosive charge size and location and simulate the resulting blast wave propagation using CFD simulation. This provides a full estimate of the expected pressure propagation for a defined charge size. From these results, the resulting damage and its extent, as well as possible access routes or countermeasures, can be estimated. Using georeferenced sources allows simulation results to be integrated into and used within the existing geoinformation systems of disaster management units, providing novel inputs for training, preparation and prevention. We demonstrate our proposed approach by evaluating expected glass breakage and expected damage impairing the structural integrity of buildings depending on the charge size, using a 3D reconstruction from aerial images of an area in the inner city of Graz, Austria.


2021 ◽  
Vol 6 (2 (114)) ◽  
pp. 86-95
Author(s):  
Vadym Slyusar ◽  
Mykhailo Protsenko ◽  
Anton Chernukha ◽  
Vasyl Melkin ◽  
Olena Petrova ◽  
...  

This paper considers a neural network model for semantic segmentation of images of monitored objects in aerial photographs. Unmanned aerial vehicles monitor objects by analyzing (processing) aerial photographs and video streams. The results of aerial photography are processed manually by an operator; however, handling a large number of aerial photographs poses objective difficulties for the operator, which is why it is advisable to automate this process. Analysis of the models showed that the U-Net model (Germany), which is a convolutional neural network, is the most suitable basic model for semantic segmentation of images of monitored objects in aerial photographs. This model has been improved by using a wavelet layer and the optimal values of the model training parameters: a learning rate (step) of 0.001, 60 epochs, and the Adam optimization algorithm. The training was conducted on a set of segmented images acquired from aerial photographs (with a resolution of 6,000×4,000 pixels) using the Image Labeler software in the mathematical programming environment MATLAB R2020b (USA). As a result, a new model for semantic segmentation of images of monitored objects in aerial photographs, with the proposed name U-NetWavelet, was built. The effectiveness of the improved model was investigated by processing 80 aerial photographs. Accuracy, sensitivity, and segmentation error were selected as the main indicators of the model's efficiency. The use of a modified wavelet layer has made it possible to adapt the size of an aerial photograph to the parameters of the input layer of the neural network and to improve the efficiency of image segmentation in aerial photographs; the application of a convolutional neural network has allowed this process to be automated.


2020 ◽  
Vol 96 (3s) ◽  
pp. 585-588
Author(s):  
С.Е. Фролова ◽  
Е.С. Янакова

Methods are proposed for building prototyping platforms for high-performance systems-on-chip (SoC) for artificial intelligence tasks. The requirements for platforms of this class and the principles for modifying an SoC design for implementation in a prototype are described, together with methods of debugging projects on the prototyping platform. Results are presented for computer vision algorithms based on neural network technologies running on an FPGA prototype of the ELcore semantic cores.

