A Crowd Counting Framework Combining with Crowd Location

Journal of Advanced Transportation ◽

10.1155/2021/6664281 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Jin Zhang ◽

Sheng Chen ◽

Sen Tian ◽

Wenan Gong ◽

Guoshan Cai ◽

...

Keyword(s):

Neural Network ◽

Input Image ◽

Residual Network ◽

Crowd Counting ◽

The Past ◽

Detection And Counting ◽

Urban Safety ◽

Feature Information ◽

Density Map ◽

New Framework

In the past ten years, crowd detection and counting have been applied in many fields such as station crowd statistics, urban safety prevention, and people flow statistics. However, obtaining accurate positions and improving the performance of crowd counting in dense scenes still face challenges, and it is worthwhile devoting much effort to this. In this paper, a new framework is proposed to resolve the problem. The proposed framework includes two parts. The first part is a fully convolutional neural network (CNN) consisting of backend and upsampling. In the first part, backend uses the residual network (ResNet) to encode the features of the input picture, and upsampling uses the deconvolution layer to decode the feature information. The first part processes the input image, and the processed image is input to the second part. The second part is a peak confidence map (PCM), which is proposed based on an improvement over the density map (DM). Compared with DM, PCM can not only solve the problem of crowd counting but also accurately predict the location of the person. The experimental results on several datasets (Beijing-BRT, Mall, Shanghai Tech, and UCF_CC_50 datasets) show that the proposed framework can achieve higher crowd counting performance in dense scenarios and can accurately predict the location of crowds.

Download Full-text

Convolutional Neural Network for Crowd Counting on Metro Platforms

Symmetry ◽

10.3390/sym13040703 ◽

2021 ◽

Vol 13 (4) ◽

pp. 703

Author(s):

Jun Zhang ◽

Jiaze Liu ◽

Zhizhong Wang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Estimation Error ◽

Image Features ◽

Urban Rail Transit ◽

Crowd Counting ◽

Passenger Flow ◽

Urban Rail ◽

Density Map ◽

Flow Detection

Owing to the increased use of urban rail transit, the flow of passengers on metro platforms tends to increase sharply during peak periods. Monitoring passenger flow in such areas is important for security-related reasons. In this paper, in order to solve the problem of metro platform passenger flow detection, we propose a CNN (convolutional neural network)-based network called the MP (metro platform)-CNN to accurately count people on metro platforms. The proposed method is composed of three major components: a group of convolutional neural networks is used on the front end to extract image features, a multiscale feature extraction module is used to enhance multiscale features, and transposed convolution is used for upsampling to generate a high-quality density map. Currently, existing crowd-counting datasets do not adequately cover all of the challenging situations considered in this study. Therefore, we collected images from surveillance videos of a metro platform to form a dataset containing 627 images, with 9243 annotated heads. The results of the extensive experiments showed that our method performed well on the self-built dataset and the estimation error was minimum. Moreover, the proposed method could compete with other methods on four standard crowd-counting datasets.

Download Full-text

Low-Rank and Sparse Based Deep-Fusion Convolutional Neural Network for Crowd Counting

Mathematical Problems in Engineering ◽

10.1155/2017/5046727 ◽

2017 ◽

Vol 2017 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Siqi Tang ◽

Zhisong Pan ◽

Xingyu Zhou

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

State Of The Art ◽

Regression Method ◽

Low Rank ◽

Counting Method ◽

Direct Integral ◽

Crowd Counting ◽

Counting Methods ◽

Density Map

This paper proposes an accurate crowd counting method based on convolutional neural network and low-rank and sparse structure. To this end, we firstly propose an effective deep-fusion convolutional neural network to promote the density map regression accuracy. Furthermore, we figure out that most of the existing CNN based crowd counting methods obtain overall counting by direct integral of estimated density map, which limits the accuracy of counting. Instead of direct integral, we adopt a regression method based on low-rank and sparse penalty to promote accuracy of the projection from density map to global counting. Experiments demonstrate the importance of such regression process on promoting the crowd counting performance. The proposed low-rank and sparse based deep-fusion convolutional neural network (LFCNN) outperforms existing crowd counting methods and achieves the state-of-the-art performance.

Download Full-text

A High-Density Crowd Counting Method Based on Convolutional Feature Fusion

Applied Sciences ◽

10.3390/app8122367 ◽

2018 ◽

Vol 8 (12) ◽

pp. 2367 ◽

Cited By ~ 5

Author(s):

Hongling Luo ◽

Jun Sang ◽

Weiqun Wu ◽

Hong Xiang ◽

Zhili Xiang ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Large Scale ◽

Feature Fusion ◽

High Density ◽

Counting Problem ◽

Crowd Counting ◽

Low Level ◽

Density Map ◽

High Level

In recent years, the trampling events due to overcrowding have occurred frequently, which leads to the demand for crowd counting under a high-density environment. At present, there are few studies on monitoring crowds in a large-scale crowded environment, while there exists technology drawbacks and a lack of mature systems. Aiming to solve the crowd counting problem with high-density under complex environments, a feature fusion-based deep convolutional neural network method FF-CNN (Feature Fusion of Convolutional Neural Network) was proposed in this paper. The proposed FF-CNN mapped the crowd image to its crowd density map, and then obtained the head count by integration. The geometry adaptive kernels were adopted to generate high-quality density maps which were used as ground truths for network training. The deconvolution technique was used to achieve the fusion of high-level and low-level features to get richer features, and two loss functions, i.e., density map loss and absolute count loss, were used for joint optimization. In order to increase the sample diversity, the original images were cropped with a random cropping method for each iteration. The experimental results of FF-CNN on the ShanghaiTech public dataset showed that the fusion of low-level and high-level features can extract richer features to improve the precision of density map estimation, and further improve the accuracy of crowd counting.

Download Full-text

MH-MetroNet—A Multi-Head CNN for Passenger-Crowd Attendance Estimation

Journal of Imaging ◽

10.3390/jimaging6070062 ◽

2020 ◽

Vol 6 (7) ◽

pp. 62

Author(s):

Pier Luigi Mazzeo ◽

Riccardo Contino ◽

Paolo Spagnolo ◽

Cosimo Distante ◽

Ettore Stella ◽

...

Keyword(s):

Network Architecture ◽

Network Performance ◽

State Of The Art ◽

Mean Absolute Error ◽

Absolute Error ◽

Input Image ◽

Mean Square ◽

Metro Station ◽

Crowd Counting ◽

Density Map

Knowing an accurate passengers attendance estimation on each metro car contributes to the safely coordination and sorting the crowd-passenger in each metro station. In this work we propose a multi-head Convolutional Neural Network (CNN) architecture trained to infer an estimation of passenger attendance in a metro car. The proposed network architecture consists of two main parts: a convolutional backbone, which extracts features over the whole input image, and a multi-head layers able to estimate a density map, needed to predict the number of people within the crowd image. The network performance is first evaluated on publicly available crowd counting datasets, including the ShanghaiTech part_A, ShanghaiTech part_B and UCF_CC_50, and then trained and tested on our dataset acquired in subway cars in Italy. In both cases a comparison is made against the most relevant and latest state of the art crowd counting architectures, showing that our proposed MH-MetroNet architecture outperforms in terms of Mean Absolute Error (MAE) and Mean Square Error (MSE) and passenger-crowd people number prediction.

Download Full-text

Multi-Stream Networks and Ground Truth Generation for Crowd Counting

International journal of electrical and computer engineering systems ◽

10.32985/ijeces.11.1.4 ◽

2020 ◽

Vol 11 (1) ◽

pp. 33-41

Author(s):

Rodolfo Quispe ◽

Darwin Ttito ◽

Adín Rivera ◽

Helio Pedrini

Keyword(s):

Neural Network ◽

Network Architecture ◽

Receptive Fields ◽

Ground Truth ◽

Scene Analysis ◽

Stream Networks ◽

Single Image ◽

Crowd Counting ◽

Ground Truth Generation ◽

Density Map

Crowd scene analysis has received a lot of attention recently due to a wide variety of applications, e.g., forensic science, urban planning, surveillance and security. In this context, a challenging task is known as crowd counting [1–6], whose main purpose is to estimate the number of people present in a single image. A multi-stream convolutional neural network is developed and evaluated in this paper, which receives an image as input and produces a density map that represents the spatial distribution of people in an end-to-end fashion. In order to address complex crowd counting issues, such as extremely unconstrained scale and perspective changes, the network architecture utilizes receptive fields with different size filters for each stream. In addition, we investigate the influence of the two most common fashions on the generation of ground truths and propose a hybrid method based on tiny face detection and scale interpolation. Experiments conducted on two challenging datasets, UCF-CC-50 and ShanghaiTech, demonstrate that the use of our ground truth generation methods achieves superior results.

Download Full-text

One Shot Crowd Counting with Deep Scale Adaptive Neural Network

Electronics ◽

10.3390/electronics8060701 ◽

2019 ◽

Vol 8 (6) ◽

pp. 701 ◽

Cited By ~ 1

Author(s):

Junfeng Wu ◽

Zhiyang Li ◽

Wenyu Qu ◽

Yizhi Zhou

Keyword(s):

Neural Network ◽

Adaptive Neural Network ◽

Crowd Counting ◽

Crowd Density ◽

Proposed Model ◽

Perspective Image ◽

Perspective Effect ◽

Camera Perspective ◽

Density Map ◽

Public Datasets

This paper aims to utilize the deep learning architecture to break through the limitations of camera perspective, image background, uneven crowd density distribution and pedestrian occlusion to estimate crowd density accurately. In this paper, we proposed a new neural network called Deep Scale-Adaptive Convolutional Neural Network (DSA-CNN), which can convert a single crowd image to density map for crowd counting directly. For a crowd image with any size and resolution, our algorithm can output the density map of the crowd image by end-to-end method and finally estimate the number of the crowd in the image. The proposed DSA-CNN consists of two parts: the seven layers CNN network structure and DSA modules. In order to ensure the proposed method is robust to camera perspective effect, DSA-CNN has adopted different sizes of filters in the network and combines them ingeniously. In order to reduce the depth of the data to increase the speed of training, the proposed method utilized 1 × 1 filter in DSA module. To validate the effectiveness of the proposed model, we conducted comparative experiments on four popular public datasets (ShanghiTech dataset, UCF_CC_50 dataset, WorldExpo’10 dataset and UCSD dataset). We compare the proposed method with other well-known algorithms on the MAE and MSE indicators, such as MCNN, Switching-CNN, CSRNet, CP-CNN and Cascaded-MTL. Experimental results show that the proposed method has excellent performance. In addition, we found that the proposed model is easily trained, which further increases the usability of the proposed model.

Download Full-text

Where are the People? A Multi-Stream Convolutional Neural Network for Crowd Counting via Density Map from Complex Images

2019 International Conference on Systems, Signals and Image Processing (IWSSIP) ◽

10.1109/iwssip.2019.8787217 ◽

2019 ◽

Cited By ~ 2

Author(s):

Darwin Ttito ◽

Rodolfo Quispe ◽

Adin Ramirez Rivera ◽

Helio Pedrini

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Crowd Counting ◽

The People ◽

Density Map ◽

Complex Images

Download Full-text

Crowd Counting using Deep Recurrent Spatial-Aware Network

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/118 ◽

2018 ◽

Cited By ~ 48

Author(s):

Lingbo Liu ◽

Hongjun Wang ◽

Guanbin Li ◽

Wanli Ouyang ◽

Liang Lin

Keyword(s):

Neural Network ◽

Real World ◽

Local Refinement ◽

Crowd Counting ◽

Multi Scale ◽

Residual Learning ◽

Crowd Density ◽

Real World Applications ◽

Refinement Process ◽

Density Map

Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the camera’s perspective that causes huge appearance variations in people’s scales and rotations. Conventional methods address such challenges by resorting to fixed multi-scale architectures that are often unable to cover the largely varied scales while ignoring the rotation variations. In this paper, we propose a unified neural network framework, named Deep Recurrent Spatial-Aware Network, which adaptively addresses the two issues in a learnable spatial transform module with a region-wise refinement process. Specifically, our framework incorporates a Recurrent Spatial-Aware Refinement (RSAR) module iteratively conducting two components: i) a Spatial Transformer Network that dynamically locates an attentional region from the crowd density map and transforms it to the suitable scale and rotation for optimal crowd estimation; ii) a Local Refinement Network that refines the density map of the attended region with residual learning. Extensive experiments on four challenging benchmarks show the effectiveness of our approach. Specifically, comparing with the existing best-performing methods, we achieve an improvement of 12\% on the largest dataset WorldExpo’10 and 22.8\% on the most challenging dataset UCF\_CC\_50

Download Full-text

CoCoX: Generating Conceptual and Counterfactual Explanations via Fault-Lines

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i03.5643 ◽

2020 ◽

Vol 34 (03) ◽

pp. 2594-2601

Author(s):

Arjun Akula ◽

Shuai Wang ◽

Song-Chun Zhu

Keyword(s):

Neural Network ◽

State Of The Art ◽

Input Image ◽

Classification Model ◽

Learning Models ◽

Fault Line ◽

Semantic Level ◽

Explainable Ai ◽

Fault Lines ◽

Classification Category

We present CoCoX (short for Conceptual and Counterfactual Explanations), a model for explaining decisions made by a deep convolutional neural network (CNN). In Cognitive Psychology, the factors (or semantic-level features) that humans zoom in on when they imagine an alternative to a model prediction are often referred to as fault-lines. Motivated by this, our CoCoX model explains decisions made by a CNN using fault-lines. Specifically, given an input image I for which a CNN classification model M predicts class cpred, our fault-line based explanation identifies the minimal semantic-level features (e.g., stripes on zebra, pointed ears of dog), referred to as explainable concepts, that need to be added to or deleted from I in order to alter the classification category of I by M to another specified class calt. We argue that, due to the conceptual and counterfactual nature of fault-lines, our CoCoX explanations are practical and more natural for both expert and non-expert users to understand the internal workings of complex deep learning models. Extensive quantitative and qualitative experiments verify our hypotheses, showing that CoCoX significantly outperforms the state-of-the-art explainable AI models. Our implementation is available at https://github.com/arjunakula/CoCoX

Download Full-text

Framework for Indoor Elements Classification via Inductive Learning on Floor Plan Graphs

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020097 ◽

2021 ◽

Vol 10 (2) ◽

pp. 97

Author(s):

Jaeyoung Song ◽

Kiyun Yu

Keyword(s):

Neural Network ◽

Inductive Learning ◽

Network Data ◽

Floor Plan ◽

Vector Data ◽

Adjacency Graph ◽

Different Types ◽

Image Pixels ◽

Learning Frameworks ◽

New Framework

This paper presents a new framework to classify floor plan elements and represent them in a vector format. Unlike existing approaches using image-based learning frameworks as the first step to segment the image pixels, we first convert the input floor plan image into vector data and utilize a graph neural network. Our framework consists of three steps. (1) image pre-processing and vectorization of the floor plan image; (2) region adjacency graph conversion; and (3) the graph neural network on converted floor plan graphs. Our approach is able to capture different types of indoor elements including basic elements, such as walls, doors, and symbols, as well as spatial elements, such as rooms and corridors. In addition, the proposed method can also detect element shapes. Experimental results show that our framework can classify indoor elements with an F1 score of 95%, with scale and rotation invariance. Furthermore, we propose a new graph neural network model that takes the distance between nodes into account, which is a valuable feature of spatial network data.

Download Full-text