scholarly journals Multi-Scale Guided Attention Network for Crowd Counting

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Pengfei Li ◽  
Min Zhang ◽  
Jian Wan ◽  
Ming Jiang

The CNN-based crowd counting method uses image pyramid and dense connection to fuse features to solve the problems of multiscale and information loss. However, these operations lead to information redundancy and confusion between crowd and background information. In this paper, we propose a multi-scale guided attention network (MGANet) to solve the above problems. Specifically, the multilayer features of the network are fused by a top-down approach to obtain multiscale information and context information. The attention mechanism is used to guide the acquired features of each layer in space and channel so that the network pays more attention to the crowd in the image, ignores irrelevant information, and further integrates to obtain the final high-quality density map. Besides, we propose a counting loss function combining SSIM Loss, MAE Loss, and MSE Loss to achieve effective network convergence. We experiment on four major datasets and obtain good results. The effectiveness of the network modules is proved by the corresponding ablation experiments. The source code is available at https://github.com/lpfworld/MGANet.

2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Siqi Tang ◽  
Zhisong Pan ◽  
Xingyu Zhou

This paper proposes an accurate crowd counting method based on convolutional neural network and low-rank and sparse structure. To this end, we firstly propose an effective deep-fusion convolutional neural network to promote the density map regression accuracy. Furthermore, we figure out that most of the existing CNN based crowd counting methods obtain overall counting by direct integral of estimated density map, which limits the accuracy of counting. Instead of direct integral, we adopt a regression method based on low-rank and sparse penalty to promote accuracy of the projection from density map to global counting. Experiments demonstrate the importance of such regression process on promoting the crowd counting performance. The proposed low-rank and sparse based deep-fusion convolutional neural network (LFCNN) outperforms existing crowd counting methods and achieves the state-of-the-art performance.


Author(s):  
Shufang Li ◽  
Zhengping Hu ◽  
Mengyao Zhao ◽  
Zhe Sun

Author(s):  
Lingbo Liu ◽  
Hongjun Wang ◽  
Guanbin Li ◽  
Wanli Ouyang ◽  
Liang Lin

Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the camera’s perspective that causes huge appearance variations in people’s scales and rotations. Conventional methods address such challenges by resorting to fixed multi-scale architectures that are often unable to cover the largely varied scales while ignoring the rotation variations. In this paper, we propose a unified neural network framework, named Deep Recurrent Spatial-Aware Network, which adaptively addresses the two issues in a learnable spatial transform module with a region-wise refinement process. Specifically, our framework incorporates a Recurrent Spatial-Aware Refinement (RSAR) module iteratively conducting two components: i) a Spatial Transformer Network that dynamically locates an attentional region from the crowd density map and transforms it to the suitable scale and rotation for optimal crowd estimation; ii) a Local Refinement Network that refines the density map of the attended region with residual learning. Extensive experiments on four challenging benchmarks show the effectiveness of our approach. Specifically, comparing with the existing best-performing methods, we achieve an improvement of 12\% on the largest dataset WorldExpo’10 and 22.8\% on the most challenging dataset UCF\_CC\_50


Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 703
Author(s):  
Jun Zhang ◽  
Jiaze Liu ◽  
Zhizhong Wang

Owing to the increased use of urban rail transit, the flow of passengers on metro platforms tends to increase sharply during peak periods. Monitoring passenger flow in such areas is important for security-related reasons. In this paper, in order to solve the problem of metro platform passenger flow detection, we propose a CNN (convolutional neural network)-based network called the MP (metro platform)-CNN to accurately count people on metro platforms. The proposed method is composed of three major components: a group of convolutional neural networks is used on the front end to extract image features, a multiscale feature extraction module is used to enhance multiscale features, and transposed convolution is used for upsampling to generate a high-quality density map. Currently, existing crowd-counting datasets do not adequately cover all of the challenging situations considered in this study. Therefore, we collected images from surveillance videos of a metro platform to form a dataset containing 627 images, with 9243 annotated heads. The results of the extensive experiments showed that our method performed well on the self-built dataset and the estimation error was minimum. Moreover, the proposed method could compete with other methods on four standard crowd-counting datasets.


Author(s):  
Haixing Li ◽  
Haibo Luo ◽  
Wang Huan ◽  
Zelin Shi ◽  
Chongnan Yan ◽  
...  

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Chih-Wei Lin ◽  
Yu Hong ◽  
Jinfu Liu

Abstract Background Glioma is a malignant brain tumor; its location is complex and is difficult to remove surgically. To diagnosis the brain tumor, doctors can precisely diagnose and localize the disease using medical images. However, the computer-assisted diagnosis for the brain tumor diagnosis is still the problem because the rough segmentation of the brain tumor makes the internal grade of the tumor incorrect. Methods In this paper, we proposed an Aggregation-and-Attention Network for brain tumor segmentation. The proposed network takes the U-Net as the backbone, aggregates multi-scale semantic information, and focuses on crucial information to perform brain tumor segmentation. To this end, we proposed an enhanced down-sampling module and Up-Sampling Layer to compensate for the information loss. The multi-scale connection module is to construct the multi-receptive semantic fusion between encoder and decoder. Furthermore, we designed a dual-attention fusion module that can extract and enhance the spatial relationship of magnetic resonance imaging and applied the strategy of deep supervision in different parts of the proposed network. Results Experimental results show that the performance of the proposed framework is the best on the BraTS2020 dataset, compared with the-state-of-art networks. The performance of the proposed framework surpasses all the comparison networks, and its average accuracies of the four indexes are 0.860, 0.885, 0.932, and 1.2325, respectively. Conclusions The framework and modules of the proposed framework are scientific and practical, which can extract and aggregate useful semantic information and enhance the ability of glioma segmentation.


Author(s):  
Anran Zhang ◽  
Xiaolong Jiang ◽  
Baochang Zhang ◽  
Xianbin Cao
Keyword(s):  

2020 ◽  
Vol 34 (07) ◽  
pp. 11693-11700 ◽  
Author(s):  
Ao Luo ◽  
Fan Yang ◽  
Xin Li ◽  
Dong Nie ◽  
Zhicheng Jiao ◽  
...  

Crowd counting is an important yet challenging task due to the large scale and density variation. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture is still a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn) which targets to relieve the problem by interweaving the multi-scale features for crowd density as well as its auxiliary task (localization) together and performing joint reasoning over a graph. Specifically, HyGnn integrates a hybrid graph to jointly represent the task-specific feature maps of different scales as nodes, and two types of relations as edges: (i) multi-scale relations capturing the feature dependencies across scales and (ii) mutual beneficial relations building bridges for the cooperation between counting and localization. Thus, through message passing, HyGnn can capture and distill richer relations between nodes to obtain more powerful representations, providing robust and accurate results. Our HyGnn performs significantly well on four challenging datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF, outperforming the state-of-the-art algorithms by a large margin.


Sign in / Sign up

Export Citation Format

Share Document