JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering 2020 Vol.21 No.8 P.1150-1160

http://doi.org/10.1631/FITEE.1900282

A novel convolutional neural network method for crowd counting

Author(s): Jie-hao Huang, Xiao-guang Di, Jun-de Wu, Ai-yue Chen
Affiliation(s): Control and Simulation Center, Harbin Institute of Technology, Harbin 150080, China
Corresponding email(s): 18s004055@hit.edu.cn, dixiaoguang@hit.edu.cn
Key Words: Crowd counting, Density estimation, Segmentation prior map, Uniform function


Jie-hao Huang, Xiao-guang Di, Jun-de Wu, Ai-yue Chen. A novel convolutional neural network method for crowd counting[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(8): 1150-1160.

@article{Huang2020,
title="A novel convolutional neural network method for crowd counting",
author="Jie-hao Huang and Xiao-guang Di and Jun-de Wu and Ai-yue Chen",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="21",
number="8",
pages="1150-1160",
year="2020",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1900282"
}

%0 Journal Article
%T A novel convolutional neural network method for crowd counting
%A Jie-hao Huang
%A Xiao-guang Di
%A Jun-de Wu
%A Ai-yue Chen
%J Frontiers of Information Technology & Electronic Engineering
%V 21
%N 8
%P 1150-1160
%@ 2095-9184
%D 2020
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1900282

TY  - JOUR
T1  - A novel convolutional neural network method for crowd counting
A1  - Jie-hao Huang
A1  - Xiao-guang Di
A1  - Jun-de Wu
A1  - Ai-yue Chen
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 21
IS  - 8
SP  - 1150
EP  - 1160
SN  - 2095-9184
Y1  - 2020
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1900282
ER  -


Abstract: Crowd density estimation is challenging because head sizes vary widely within a crowd. Existing methods typically use a multi-column convolutional neural network (MCNN) to adapt to this variation, which produces an averaging effect across areas of different density and introduces considerable noise into the density map. To address this problem, we propose the segmentation-aware prior network (SAPNet), which generates a high-quality, noise-free density map based on a coarse head-segmentation map. SAPNet is composed of two networks: a foreground-segmentation convolutional neural network (FS-CNN) as the front end and a crowd-regression convolutional neural network (CR-CNN) as the back end. Using only the single-dot annotations, we generate ground-truth head-segmentation masks with a uniform function. Based on this ground truth, FS-CNN outputs a coarse head-segmentation map, which helps eliminate noise in regions of the density map that contain no people. Taking the head-segmentation map generated by the front end as input, CR-CNN performs accurate crowd counting and generates a high-quality density map. We evaluate SAPNet on four datasets (ShanghaiTech, UCF-CC-50, WorldExpo’10, and UCSD) and achieve state-of-the-art performance on ShanghaiTech Part B and UCF-CC-50.
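The abstract does not give the form of the uniform function or of the density kernel, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a fixed-size square window around each dot for the head mask and a normalized Gaussian per dot for the density map; the function names, `half_size`, and `sigma` are all hypothetical. The last step multiplies a noisy density map by the mask, standing in for how a segmentation prior could suppress responses in regions without people.

```python
import numpy as np

def uniform_head_mask(points, shape, half_size=2):
    """Binary mask with a square window set to 1 around each dot (assumed form)."""
    h, w = shape
    mask = np.zeros((h, w), dtype=np.float64)
    for row, col in points:
        r0, r1 = max(row - half_size, 0), min(row + half_size + 1, h)
        c0, c1 = max(col - half_size, 0), min(col + half_size + 1, w)
        mask[r0:r1, c0:c1] = 1.0
    return mask

def gaussian_density_map(points, shape, sigma=2.0):
    """One normalized Gaussian per dot, so the map sums to the head count."""
    h, w = shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    density = np.zeros((h, w), dtype=np.float64)
    for row, col in points:
        kernel = np.exp(-((rows - row) ** 2 + (cols - col) ** 2) / (2 * sigma ** 2))
        density += kernel / kernel.sum()
    return density

def apply_segmentation_prior(density, mask):
    """Zero out density values that fall outside the head mask."""
    return density * mask

# Two hypothetical dot annotations given as (row, col) on a 40x50 image.
points = [(10, 10), (20, 30)]
shape = (40, 50)

mask = uniform_head_mask(points, shape)
density = gaussian_density_map(points, shape)

# Simulate background noise, then suppress it with the mask.
noisy = density + 0.01
masked = apply_segmentation_prior(noisy, mask)
```

In this sketch the mask is exact because it is built from the annotations; in SAPNet the corresponding map is predicted by FS-CNN and is described as coarse, so the suppression would be approximate.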

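A novel convolutional neural network method for crowd counting

Jie-hao Huang, Xiao-guang Di, Jun-de Wu, Ai-yue Chen
Control and Simulation Center, Harbin Institute of Technology, Harbin 150080, China

Abstract: Crowd density estimation is a challenging task because head sizes vary over a wide range within a crowd. Existing methods all adopt multi-column convolutional neural networks to adapt to this variation, but this causes an averaging effect across regions of different density in the density map and introduces extra noise. To solve this problem, we propose a new neural network method based on a segmentation prior map, which generates a high-quality, noise-free density map on the basis of a segmentation map. The network consists of two main parts: a crowd foreground-segmentation neural network at the front end and a crowd-regression neural network at the back end. When the dataset provides only single-dot head annotations, a uniform function is used to generate the ground-truth head masks. Based on this ground truth, the foreground-segmentation network outputs a crowd segmentation map, which effectively reduces noise in unpopulated regions of the density map. The crowd segmentation map is fed into the crowd-regression network, which generates a high-quality crowd density map and provides an accurate count estimate. The method is validated on four public datasets (ShanghaiTech, UCF-CC-50, WorldExpo’10, and UCSD); on ShanghaiTech Part B and UCF-CC-50 it achieves the current best results.

Key words: Crowd counting; Density estimation; Segmentation prior map; Uniform function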


