Crowd Counting with Semantic Scene Segmentation in Helicopter Footage

Gergely Csönde; Yoshihide Sekimoto; Takehiro Kashiyama

doi:10.3390/s20174855

Crowd Counting with Semantic Scene Segmentation in Helicopter Footage

Sensors ◽

10.3390/s20174855 ◽

2020 ◽

Vol 20 (17) ◽

pp. 4855

Author(s):

Gergely Csönde ◽

Yoshihide Sekimoto ◽

Takehiro Kashiyama

Keyword(s):

Contextual Information ◽

Semantic Segmentation ◽

Estimation Methods ◽

Crowd Counting ◽

Crowd Density Estimation ◽

Crowd Density ◽

Pedestrian Counting ◽

Semantic Scene ◽

Recognition Software ◽

Improved Accuracy

Crowd counting neural networks have improved continually in recent years, and their accuracy has reached a level at which further improvement is becoming very difficult. However, this high accuracy lacks deeper semantic information, such as social roles (e.g., student, company worker, or police officer) or location-based roles (e.g., pedestrian, tenant, or construction worker). Some of these can be learned from the same set of features as the human nature of an entity, whereas others require wider contextual information from the human's surroundings. The primary end goal of developing recognition software is to integrate it into autonomous decision-making systems. Therefore, it must be foolproof; that is, it must have a good semantic understanding of the input. In this study, we focus on counting pedestrians in helicopter footage and introduce a dataset created from helicopter videos for this purpose. We use semantic segmentation to extract the required additional contextual information from the surroundings of an entity. We demonstrate that it is possible to increase the pedestrian counting accuracy in this manner. Furthermore, we show that crowd counting and semantic segmentation can be achieved simultaneously, with comparable or even improved accuracy, by using the same crowd counting neural network for both tasks through hard parameter sharing. The presented method is generic and can be applied to arbitrary crowd density estimation methods. A link to the dataset is available at the end of the paper.
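As a rough illustration of hard parameter sharing, the toy sketch below runs one shared feature extractor and feeds its output to two task heads, one for per-pixel density and one for a segmentation label. Everything in it is an assumption made for illustration: the two-channel pixels, the hand-picked weights, the class names, and the function names (`shared_trunk`, `density_head`, `segmentation_head`, `forward`) do not come from the paper.

```python
# Toy sketch of hard parameter sharing. All weights, channel meanings and
# names are hypothetical; this is not the network described in the paper.
from typing import List, Tuple

Pixel = Tuple[float, float]  # two made-up input channels per pixel


def shared_trunk(pixel: Pixel) -> List[float]:
    """Shared layers: a single feature vector that both heads consume."""
    a, b = pixel
    return [max(0.0, a - b), max(0.0, a + b)]  # tiny ReLU layer


def density_head(features: List[float]) -> float:
    """Counting head: non-negative density value for one pixel."""
    return max(0.0, 0.5 * features[0] + 0.25 * features[1])


def segmentation_head(features: List[float]) -> int:
    """Segmentation head: index of the best-scoring class.

    Class 0 and class 1 are placeholder scene labels.
    """
    scores = [features[1] - features[0], features[0]]
    return scores.index(max(scores))


def forward(image: List[List[Pixel]]):
    """Return (count, density map, label map) from one shared pass."""
    density, labels = [], []
    for row in image:
        feats = [shared_trunk(p) for p in row]  # computed once, used twice
        density.append([density_head(f) for f in feats])
        labels.append([segmentation_head(f) for f in feats])
    count = sum(sum(r) for r in density)  # the count is the density integral
    return count, density, labels
```

In a trained network the two heads would be optimized jointly, so gradients from both tasks update the shared trunk; the sketch only shows the forward data flow.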

Pedestrian Counting Based on Crowd Density Estimation and Lucas-Kanade Optical Flow

2013 Seventh International Conference on Image and Graphics ◽

10.1109/icig.2013.98 ◽

2013 ◽

Cited By ~ 4

Author(s):

Zeyu Wu ◽

Huicheng Zheng ◽

Jing Wang

Keyword(s):

Optical Flow ◽

Density Estimation ◽

Crowd Density Estimation ◽

Crowd Density ◽

Pedestrian Counting

Congested Crowd Counting via Adaptive Multi-Scale Context Learning

Sensors ◽

10.3390/s21113777 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3777

Author(s):

Yani Zhang ◽

Huailin Zhao ◽

Zuodong Duan ◽

Liangjun Huang ◽

Jiahao Deng ◽

...

Keyword(s):

Density Estimation ◽

State Of The Art ◽

Spatial Context ◽

Context Information ◽

Crowd Counting ◽

Multi Scale ◽

Context Learning ◽

Crowd Density Estimation ◽

Crowd Density ◽

Density Results

In this paper, we propose a novel congested crowd counting network for crowd density estimation, i.e., the Adaptive Multi-scale Context Aggregation Network (MSCANet). MSCANet efficiently leverages spatial context information to accomplish crowd density estimation in complicated crowd scenes. To achieve this, a multi-scale context learning block, called the Multi-scale Context Aggregation module (MSCA), is proposed to first extract information at different scales and then adaptively aggregate it to capture the full scale range of the crowd. By employing multiple MSCAs in a cascaded manner, MSCANet can deeply utilize the spatial context information and modulate preliminary features into more distinguishing and scale-sensitive features, which are finally passed through a 1 × 1 convolution to obtain the crowd density results. Extensive experiments on three challenging crowd counting benchmarks showed that our model yielded compelling performance against other state-of-the-art methods. To further demonstrate the generality of MSCANet, we extend our method to two related tasks: crowd localization and remote sensing object counting. The results of these extension experiments also confirmed the effectiveness of MSCANet.
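The two operations named in this abstract can be sketched in a few lines, under loud assumptions. The abstract does not say how the adaptive aggregation is computed, so the softmax gating in `aggregate` is a stand-in chosen for illustration, and all names and weights are hypothetical. The `conv1x1` function is the standard definition of a 1 × 1 convolution: a per-pixel dot product across channels.

```python
# Sketch only: softmax gating is an assumed aggregation rule, not MSCANet's.
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(branches: List[List[float]], gate_logits: List[float]) -> List[float]:
    """Blend per-scale feature vectors for one pixel with softmax weights."""
    weights = softmax(gate_logits)
    channels = len(branches[0])
    return [
        sum(w * branch[c] for w, branch in zip(weights, branches))
        for c in range(channels)
    ]


def conv1x1(feature_map: List[List[List[float]]],
            weights: List[float],
            bias: float = 0.0) -> List[List[float]]:
    """1 x 1 convolution with one output channel.

    feature_map[y][x] is a list of C channel values; the result holds one
    density value per pixel.
    """
    return [
        [sum(w * f for w, f in zip(weights, pixel)) + bias for pixel in row]
        for row in feature_map
    ]
```

For example, `aggregate([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])` weights both scales equally, and feeding the blended vector through `conv1x1` yields a single density value for that pixel.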

HSBS: A Human’s Heat Signature and Background Subtraction Hybrid Approach for Crowd Counting and Analysis

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001416550259 ◽

2016 ◽

Vol 30 (08) ◽

pp. 1655025 ◽

Cited By ~ 1

Author(s):

Nermin Kamal Abdel-Wahab Negied ◽

Elsayed B. Hemayed ◽

Magda Fayek

Keyword(s):

Hybrid Approach ◽

Thermal Characteristics ◽

Human Beings ◽

Crowd Counting ◽

Ambient Environment ◽

Crowd Control ◽

Crowd Density Estimation ◽

Crowd Density ◽

Environment Temperature ◽

Motion Characteristics

This work presents a new approach for crowd counting and classification based on human thermal and motion features. The technique is efficient for automatic crowd density estimation and for determining the type of motion. Crowd density is measured without any need for camera calibration or prior knowledge about the input videos. It does not need any human intervention, so it can be used in fully automated crowd control systems. Two new features are introduced for crowd counting: the first represents the thermal characteristics of humans and is expressed as the ratio between their temperature and the temperature of their ambient environment. The second describes human motion characteristics and is measured as the ratio between human motion velocity and the ambient environment rigidity. Each ratio should exceed a certain predetermined threshold for human beings. These features have been investigated and shown to give accurate crowd counting performance in real time. Moreover, the two features are combined to classify crowds into one of three main types: fully mobile, fully static, or a mix of both. Finally, the proposed system offers several advantages: it preserves privacy, is reliable for homogeneous and inhomogeneous crowds, does not depend on a particular direction in motion detection, and places no restriction on crowd size. The experimental results demonstrate the effectiveness of the approach.

Cross-Line Pedestrian Counting Based on Spatially-Consistent Two-Stage Local Crowd Density Estimation and Accumulation

IEEE Transactions on Circuits and Systems for Video Technology ◽

10.1109/tcsvt.2018.2807806 ◽

2019 ◽

Vol 29 (3) ◽

pp. 787-799 ◽

Cited By ~ 3

Author(s):

Huicheng Zheng ◽

Zijian Lin ◽

Jiepeng Cen ◽

Zeyu Wu ◽

Yadan Zhao

Keyword(s):

Density Estimation ◽

Two Stage ◽

Crowd Density Estimation ◽

Crowd Density ◽

Pedestrian Counting

Crowd Counting Using End-to-End Semantic Image Segmentation

Electronics ◽

10.3390/electronics10111293 ◽

2021 ◽

Vol 10 (11) ◽

pp. 1293

Author(s):

Khalil Khan ◽

Rehan Ullah Khan ◽

Waleed Albattah ◽

Durre Nayab ◽

Ali Mustafa Qamar ◽

...

Keyword(s):

Semantic Segmentation ◽

Research Area ◽

Crowd Counting ◽

Public Events ◽

Multi Scale ◽

Density Maps ◽

End To End ◽

Active Research ◽

Semantic Scene ◽

Active Research Area

Crowd counting is an active research area within scene analysis. Over the last 20 years, researchers have proposed various algorithms for crowd counting in real-time scenarios because of its many applications in disaster management systems, public events, safety monitoring, and so on. In our paper, we proposed an end-to-end semantic segmentation framework for crowd counting in densely crowded images. Our proposed framework was based on semantic scene segmentation using an optimized convolutional neural network. The framework successfully highlighted the foreground and suppressed the background. The framework encoded the high-density maps through a guided attention mechanism. We obtained the crowd count by integrating the density maps. Our proposed algorithm classified the crowd count of each image into groups to adapt to variations in crowd count. Our algorithm overcame the scale variations of a crowded image through multi-scale features extracted from the images. We conducted experiments with four standard crowd-counting datasets and report better results than previously published ones.
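Obtaining a count by integrating a density map reduces, in the discrete case, to summing the map's values. The sketch below shows that step together with an optional foreground mask that zeroes out background pixels. The function name `crowd_count`, the mask convention (1 for foreground, 0 for background), and the sample values are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: count = sum of density values, optionally masked.
# The mask convention and all sample values are hypothetical.
from typing import List, Optional


def crowd_count(density: List[List[float]],
                foreground_mask: Optional[List[List[int]]] = None) -> float:
    """Sum a density map, ignoring pixels the mask marks as background."""
    total = 0.0
    for y, row in enumerate(density):
        for x, value in enumerate(row):
            keep = 1 if foreground_mask is None else foreground_mask[y][x]
            total += value * keep
    return total


density_map = [[0.5, 0.25],
               [0.25, 0.5]]
mask = [[1, 0],
        [1, 1]]
unmasked = crowd_count(density_map)      # every pixel contributes
masked = crowd_count(density_map, mask)  # top-right pixel is suppressed
```

Masking lowers the estimate only where the density map assigns mass to background regions, which is the kind of error that foreground highlighting is meant to remove.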

Knowledge and Geo-Object Based Graph Convolutional Network for Remote Sensing Semantic Segmentation

Sensors ◽

10.3390/s21113848 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3848

Author(s):

Wei Cui ◽

Meng Yao ◽

Yuanjie Hao ◽

Ziwei Wang ◽

Xin He ◽

...

Keyword(s):

Remote Sensing ◽

Prior Knowledge ◽

Contextual Information ◽

Information Aggregation ◽

Semantic Segmentation ◽

Spatial Correlations ◽

Convolutional Network ◽

Object Based ◽

Graph Neural Networks ◽

Salt And Pepper

Pixel-based semantic segmentation models fail to effectively express geographic objects and their topological relationships. Therefore, in semantic segmentation of remote sensing images, these models cannot avoid salt-and-pepper effects or achieve high accuracy. To solve these problems, object-based models such as graph neural networks (GNNs) are considered. However, traditional GNNs directly use similarity or spatial correlations between nodes to aggregate the nodes' information, which relies too much on the contextual information of the sample. The contextual information of the sample is often distorted, which reduces the node classification accuracy. To solve this problem, a knowledge and geo-object-based graph convolutional network (KGGCN) is proposed. The KGGCN uses superpixel blocks as nodes of the graph network and combines prior knowledge with spatial correlations during information aggregation. By incorporating the prior knowledge obtained from all samples of the study area, the receptive field of the node is extended from its sample context to the study area. Thus, the distortion of the sample context is overcome effectively. Experiments demonstrate that our model improves by 3.7% on the baseline model, Cluster GCN, and by 4.1% on U-Net.

Crowd density estimation method based on floor area

Journal of Physics Conference Series ◽

10.1088/1742-6596/1651/1/012060 ◽

2020 ◽

Vol 1651 ◽

pp. 012060

Author(s):

Fujian Feng ◽

Shuang Liu ◽

Yongzheng Pan ◽

Xin He ◽

Jiayin Wei ◽

...

Keyword(s):

Density Estimation ◽

Estimation Method ◽

Floor Area ◽

Crowd Density Estimation ◽

Crowd Density

Crowd Density Estimation Using Fusion of Multi-Layer Features

IEEE Transactions on Intelligent Transportation Systems ◽

10.1109/tits.2020.2983475 ◽

2020 ◽

pp. 1-12

Author(s):

Xinghao Ding ◽

Fujin He ◽

Zhirui Lin ◽

Yu Wang ◽

Huimin Guo ◽

...

Keyword(s):

Density Estimation ◽

Crowd Density Estimation ◽

Crowd Density

Hybrid Graph Neural Networks for Crowd Counting

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6839 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11693-11700 ◽

Cited By ~ 2

Author(s):

Ao Luo ◽

Fan Yang ◽

Xin Li ◽

Dong Nie ◽

Zhicheng Jiao ◽

...

Keyword(s):

Network Architecture ◽

Message Passing ◽

Large Scale ◽

State Of The Art ◽

Density Variation ◽

Feature Maps ◽

Crowd Counting ◽

Multi Scale ◽

Crowd Density ◽

Graph Neural Networks

Crowd counting is an important yet challenging task because of large variations in scale and density. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture is still a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to alleviate the problem by interweaving the multi-scale features for crowd density and for its auxiliary task (localization) and performing joint reasoning over a graph. Specifically, HyGnn integrates a hybrid graph to jointly represent the task-specific feature maps of different scales as nodes, and two types of relations as edges: (i) multi-scale relations capturing the feature dependencies across scales and (ii) mutually beneficial relations building bridges for the cooperation between counting and localization. Thus, through message passing, HyGnn can capture and distill richer relations between nodes to obtain more powerful representations, providing robust and accurate results. Our HyGnn performs well on four challenging datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF, outperforming the state-of-the-art algorithms by a large margin.
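To make the idea of message passing over such a graph concrete, here is a generic single round in which each node averages the features of its incoming neighbours and blends the result with its own features. This is a textbook-style mean aggregation chosen for illustration, not HyGnn's actual update rule; the node names, the 0.5 blending factor, and the function name `message_passing_step` are all hypothetical.

```python
# Generic mean-aggregation message passing; not the HyGnn update rule.
from typing import Dict, List, Tuple


def message_passing_step(features: Dict[str, List[float]],
                         edges: List[Tuple[str, str]]) -> Dict[str, List[float]]:
    """One round: each node mixes its features with the mean of its inputs.

    edges holds directed (source, destination) pairs.
    """
    incoming: Dict[str, List[List[float]]] = {node: [] for node in features}
    for src, dst in edges:
        incoming[dst].append(features[src])

    updated: Dict[str, List[float]] = {}
    for node, own in features.items():
        messages = incoming[node]
        if not messages:  # isolated node keeps its features
            updated[node] = list(own)
            continue
        mean = [sum(m[c] for m in messages) / len(messages)
                for c in range(len(own))]
        updated[node] = [0.5 * o + 0.5 * m for o, m in zip(own, mean)]
    return updated


# Hypothetical graph: two counting nodes at different scales plus one
# localization node, mirroring the two edge types named in the abstract.
nodes = {"count_s1": [1.0], "count_s2": [3.0], "loc_s1": [5.0]}
links = [("count_s2", "count_s1"),  # cross-scale relation
         ("loc_s1", "count_s1")]    # counting/localization relation
result = message_passing_step(nodes, links)
```

Here `count_s1` receives messages over both edge types, so its updated feature reflects both the coarser scale and the localization branch, while the other two nodes are unchanged because nothing points at them.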
