Crowd Counting with Semantic Scene Segmentation in Helicopter Footage

Sensors, 2020, Vol 20 (17), pp. 4855
Author(s): Gergely Csönde, Yoshihide Sekimoto, Takehiro Kashiyama

Continually improving crowd counting neural networks have been developed in recent years. The accuracy of these networks has reached such high levels that further improvement is becoming very difficult. However, this high accuracy comes without deeper semantic information, such as social roles (e.g., student, company worker, or police officer) or location-based roles (e.g., pedestrian, tenant, or construction worker). Some of these roles can be learned from the same set of features that identify an entity as human, whereas others require wider contextual information from the person's surroundings. The primary end goal of developing recognition software is to use it in autonomous decision-making systems. Therefore, it must be foolproof; that is, it must have a good semantic understanding of the input. In this study, we focus on counting pedestrians in helicopter footage and introduce a dataset created from helicopter videos for this purpose. We use semantic segmentation to extract the required additional contextual information from the surroundings of an entity. We demonstrate that it is possible to increase the pedestrian counting accuracy in this manner. Furthermore, we show that crowd counting and semantic segmentation can be achieved simultaneously, with comparable or even improved accuracy, by using the same crowd counting neural network for both tasks through hard parameter sharing. The presented method is generic and can be applied to arbitrary crowd density estimation methods. A link to the dataset is available at the end of the paper.
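
Hard parameter sharing, as named in the abstract above, means that both tasks read from the same learned features and differ only in their output heads. The following is a minimal sketch under stated assumptions, not the authors' network: the class name `SharedTwoHeadNet`, the layer sizes, and the per-pixel linear layers are all hypothetical and chosen only to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)

class SharedTwoHeadNet:
    """Toy per-pixel network (hypothetical): one shared layer, two task heads."""

    def __init__(self, in_ch=3, hidden=8, n_classes=4):
        # Shared parameters: used by BOTH tasks (the "hard sharing" part).
        self.w_shared = rng.normal(size=(in_ch, hidden)) * 0.1
        # Task-specific parameters: one small head per task.
        self.w_density = rng.normal(size=(hidden, 1)) * 0.1
        self.w_seg = rng.normal(size=(hidden, n_classes)) * 0.1

    def forward(self, image):
        # image has shape (H, W, in_ch); matmul acts on the channel axis.
        features = np.maximum(image @ self.w_shared, 0.0)             # shared ReLU features
        density = np.maximum(features @ self.w_density, 0.0)[..., 0]  # (H, W), non-negative
        seg_logits = features @ self.w_seg                            # (H, W, n_classes)
        return density, seg_logits

net = SharedTwoHeadNet()
image = rng.random((16, 16, 3))
density, seg_logits = net.forward(image)
estimated_count = density.sum()          # count = sum over the density map
labels = seg_logits.argmax(axis=-1)      # per-pixel scene class
```

In a trained model the two task losses would both backpropagate into `w_shared`, so each task acts as a regularizer for the other; this sketch shows only the forward pass.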

Sensors, 2021, Vol 21 (11), pp. 3777
Author(s): Yani Zhang, Huailin Zhao, Zuodong Duan, Liangjun Huang, Jiahao Deng, ...

In this paper, we propose a novel congested crowd counting network for crowd density estimation, i.e., the Adaptive Multi-scale Context Aggregation Network (MSCANet). MSCANet efficiently leverages spatial context information to accomplish crowd density estimation in a complicated crowd scene. To achieve this, a multi-scale context learning block, called the Multi-scale Context Aggregation module (MSCA), is proposed to first extract information at different scales and then adaptively aggregate it to capture the full scale of the crowd. By employing multiple MSCAs in a cascaded manner, MSCANet can deeply utilize the spatial context information and modulate preliminary features into more distinguishing and scale-sensitive features, which are finally passed through a 1 × 1 convolution to obtain the crowd density results. Extensive experiments on three challenging crowd counting benchmarks showed that our model yielded compelling performance against other state-of-the-art methods. To further demonstrate the generality of MSCANet, we extended our method to two relevant tasks: crowd localization and remote sensing object counting. The results of these extension experiments also confirmed the effectiveness of MSCANet.
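
The idea of extracting information at several scales and then aggregating it adaptively can be illustrated with a small sketch. This is not the MSCA module itself: the box-average branches, the window sizes, and the per-pixel softmax weighting are assumptions made for illustration, and the names `box_average` and `aggregate_scales` are hypothetical.

```python
import numpy as np

def box_average(x, k):
    """Average each pixel over a k x k window (k odd), replicating edges."""
    half = k // 2
    padded = np.pad(x, half, mode="edge")
    out = np.zeros(x.shape, dtype=float)
    for dr in range(k):
        for dc in range(k):
            out += padded[dr:dr + x.shape[0], dc:dc + x.shape[1]]
    return out / (k * k)

def aggregate_scales(x, scores, windows=(1, 3, 5)):
    """Blend per-scale branches with per-pixel softmax weights.

    scores has shape (H, W, len(windows)); a real network would predict it.
    """
    branches = np.stack([box_average(x, k) for k in windows], axis=-1)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    weights = e / e.sum(axis=-1, keepdims=True)
    return (weights * branches).sum(axis=-1), weights

rng = np.random.default_rng(0)
feature_map = np.full((8, 8), 2.0)       # constant map: every scale agrees
scores = rng.normal(size=(8, 8, 3))      # stand-in for learned scale scores
fused, weights = aggregate_scales(feature_map, scores)
```

Because the weights at each pixel sum to one, a constant input is returned unchanged whatever the scores are, which makes a convenient sanity check.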


Author(s): Nermin Kamal Abdel-Wahab Negied, Elsayed B. Hemayed, Magda Fayek

This work presents a new approach for crowd counting and classification based on human thermal and motion features. The technique is efficient for automatic crowd density estimation and for determining the type of motion. Crowd density is measured without camera calibration or prior knowledge about the input videos. It needs no human intervention, so it can be used in fully automated crowd control systems. Two new features are introduced for crowd counting: the first represents the thermal characteristics of humans and is expressed as the ratio between their temperature and the ambient temperature. The second describes human motion characteristics and is measured as the ratio between human motion velocity and the rigidity of the ambient environment. For human beings, each ratio should exceed a predetermined threshold. These features have been investigated and shown to give accurate crowd counting performance in real time. Moreover, the two features are combined to classify a crowd into one of three main types: fully mobile, fully static, or a mix of both. Finally, the proposed system offers several advantages: it preserves privacy, it is reliable for both homogeneous and inhomogeneous crowds, it does not depend on a particular direction in motion detection, and it places no restriction on crowd size. The experimental results demonstrate the effectiveness of the approach.


Electronics, 2021, Vol 10 (11), pp. 1293
Author(s): Khalil Khan, Rehan Ullah Khan, Waleed Albattah, Durre Nayab, Ali Mustafa Qamar, ...

Crowd counting is an active research area within scene analysis. Over the last 20 years, researchers have proposed various algorithms for crowd counting in real-time scenarios, motivated by applications in disaster management systems, public events, safety monitoring, and so on. In our paper, we propose an end-to-end semantic segmentation framework for crowd counting in densely crowded images. The framework is based on semantic scene segmentation using an optimized convolutional neural network. It highlights the foreground and suppresses the background. It encodes high-density maps through a guided attention mechanism. We obtain the crowd count by integrating the density maps. The algorithm classifies the crowd count of each image into groups to adapt to variations in crowd count. It overcomes the scale variations of a crowded image through multi-scale features extracted from the images. We conducted experiments on four standard crowd-counting datasets and report better results than previous work.
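
Obtaining a count by integrating a density map works because each annotated person contributes a kernel of unit mass, so the sum over the map equals the number of people. The sketch below illustrates this with Gaussian kernels; the kernel size, sigma, and the function names are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.5):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def density_map(shape, heads, size=7, sigma=1.5):
    """Place one unit-mass kernel at each annotated head position (row, col)."""
    half = size // 2
    k = gaussian_kernel(size, sigma)
    # Work on a padded canvas so kernels near the border can be placed whole.
    padded = np.zeros((shape[0] + 2 * half, shape[1] + 2 * half))
    for r, c in heads:
        padded[r:r + size, c:c + size] += k
    # Crop back to the image size; kernel mass falling outside the image is lost.
    return padded[half:-half, half:-half]

heads = [(10, 10), (20, 30), (40, 15)]   # hypothetical annotations, away from borders
density = density_map((64, 64), heads)
count = density.sum()                    # integrating the map gives the crowd count
```

A counting network is typically trained to regress such a map from the image, and its predicted count is the sum of the predicted map.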


Sensors, 2021, Vol 21 (11), pp. 3848
Author(s): Wei Cui, Meng Yao, Yuanjie Hao, Ziwei Wang, Xin He, ...

Pixel-based semantic segmentation models fail to effectively express geographic objects and their topological relationships. Therefore, in semantic segmentation of remote sensing images, these models can neither avoid salt-and-pepper effects nor achieve high accuracy. To solve these problems, object-based models such as graph neural networks (GNNs) are considered. However, traditional GNNs directly use similarity or spatial correlations between nodes to aggregate node information, which relies too heavily on the contextual information of the sample. The contextual information of the sample is often distorted, which reduces the node classification accuracy. To solve this problem, a knowledge and geo-object-based graph convolutional network (KGGCN) is proposed. The KGGCN uses superpixel blocks as nodes of the graph network and combines prior knowledge with spatial correlations during information aggregation. By incorporating the prior knowledge obtained from all samples of the study area, the receptive field of a node is extended from its sample context to the whole study area. Thus, the distortion of the sample context is overcome effectively. Experiments demonstrate that our model improves by 3.7% over the baseline model, Cluster GCN, and by 4.1% over U-Net.


2020, Vol 1651, pp. 012060
Author(s): Fujian Feng, Shuang Liu, Yongzheng Pan, Xin He, Jiayin Wei, ...

Author(s): Xinghao Ding, Fujin He, Zhirui Lin, Yu Wang, Huimin Guo, ...

2020, Vol 34 (07), pp. 11693-11700
Author(s): Ao Luo, Fan Yang, Xin Li, Dong Nie, Zhicheng Jiao, ...

Crowd counting is an important yet challenging task because of large variations in scale and density. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture is still a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to relieve the problem by interweaving the multi-scale features for crowd density and its auxiliary task (localization) and performing joint reasoning over a graph. Specifically, HyGnn integrates a hybrid graph to jointly represent the task-specific feature maps of different scales as nodes, and two types of relations as edges: (i) multi-scale relations capturing the feature dependencies across scales and (ii) mutually beneficial relations that build bridges for cooperation between counting and localization. Thus, through message passing, HyGnn can capture and distill richer relations between nodes to obtain more powerful representations, providing robust and accurate results. HyGnn performs well on four challenging datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF, outperforming the state-of-the-art algorithms by a large margin.
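
The hybrid graph described above can be sketched as a small adjacency matrix plus one round of message passing. This is only an illustration under stated assumptions, not HyGnn: each node is reduced to a feature vector rather than a feature map, cross-task edges are assumed to link nodes at the same scale, and the mean-aggregation update in `message_passing_step` is a generic choice.

```python
import numpy as np

def message_passing_step(h, adj, w_self, w_msg):
    """One round: each node combines its own features with the mean neighbour message."""
    deg = adj.sum(axis=1, keepdims=True)
    norm_adj = adj / np.maximum(deg, 1.0)    # row-normalize; isolated nodes receive nothing
    messages = norm_adj @ (h @ w_msg)
    return np.tanh(h @ w_self + messages)

n_scales, n_tasks, dim = 3, 2, 4             # tasks: 0 = counting, 1 = localization
nodes = [(t, s) for t in range(n_tasks) for s in range(n_scales)]

adj = np.zeros((len(nodes), len(nodes)))
for i, (ti, si) in enumerate(nodes):
    for j, (tj, sj) in enumerate(nodes):
        if i == j:
            continue
        multi_scale = ti == tj and si != sj  # (i) same task, different scale
        mutual = ti != tj and si == sj       # (ii) different task, same scale (assumption)
        if multi_scale or mutual:
            adj[i, j] = 1.0

rng = np.random.default_rng(0)
h = rng.normal(size=(len(nodes), dim))       # stand-in node features
w_self = rng.normal(size=(dim, dim)) * 0.1
w_msg = rng.normal(size=(dim, dim)) * 0.1
h_next = message_passing_step(h, adj, w_self, w_msg)
```

With three scales and two tasks, every node ends up with three neighbours: two from its own task at other scales and one from the other task at the same scale.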

