
CLC number: TP723


Crosschecked: 2019-11-12


 ORCID:

Wen-jing Kang

https://orcid.org/0000-0002-7779-0106

Gong-liang Liu

https://orcid.org/0000-0001-7534-4201


Frontiers of Information Technology & Electronic Engineering  2020 Vol.21 No.3 P.405-421

http://doi.org/10.1631/FITEE.1900245


A quantitative attribute-based benchmark methodology for single-target visual tracking


Author(s):  Wen-jing Kang, Chang Liu, Gong-liang Liu

Affiliation(s):  School of Information Science and Engineering, Harbin Institute of Technology, Weihai 264209, China; School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798

Corresponding email(s):   kwjqq@hit.edu.cn, liuc0051@e.ntu.edu.sg, liugl@hit.edu.cn

Key Words:  Visual tracking, Performance evaluation, Visual attributes, Computer vision


Wen-jing Kang, Chang Liu, Gong-liang Liu. A quantitative attribute-based benchmark methodology for single-target visual tracking[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(3): 405-421.

@article{Kang2020quantitative,
title="A quantitative attribute-based benchmark methodology for single-target visual tracking",
author="Wen-jing Kang, Chang Liu, Gong-liang Liu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="21",
number="3",
pages="405-421",
year="2020",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1900245"
}

%0 Journal Article
%T A quantitative attribute-based benchmark methodology for single-target visual tracking
%A Wen-jing Kang
%A Chang Liu
%A Gong-liang Liu
%J Frontiers of Information Technology & Electronic Engineering
%V 21
%N 3
%P 405-421
%@ 2095-9184
%D 2020
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1900245

TY - JOUR
T1 - A quantitative attribute-based benchmark methodology for single-target visual tracking
A1 - Wen-jing Kang
A1 - Chang Liu
A1 - Gong-liang Liu
JO - Frontiers of Information Technology & Electronic Engineering
VL - 21
IS - 3
SP - 405
EP - 421
SN - 2095-9184
Y1 - 2020
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1900245
ER -


Abstract: 
In the past several years, various visual object tracking benchmarks have been proposed, and some of them have been widely used to evaluate numerous recently proposed trackers. However, most of the discussions focus on overall performance and cannot describe a tracker's strengths and weaknesses in detail. Meanwhile, several benchmark measures that are often used in tests lack a convincing interpretation. In this paper, 12 frame-wise visual attributes that reflect different aspects of the characteristics of image sequences are collated, and a normalized quantitative formulaic definition is given for each of them for the first time. Based on these definitions, we propose two novel test methodologies, a correlation-based test and a weight-based test, which provide a more intuitive and clearer demonstration of a tracker's performance on each aspect. These methods are then applied to the raw results of one of the best-known tracking challenges, the Visual Object Tracking (VOT) Challenge 2017. The tests show that most trackers did not perform well when the size of the target changed rapidly or intensely, and that even advanced deep learning based trackers did not fully solve the problem. Although the scale of the target is not considered in the calculation of the center location error, in practical tests the center location error is still sensitive to changes in target size.
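
The closing observation about the center location error (CLE) is easiest to see from the definitions of the two standard frame-wise measures. The minimal Python sketch below gives the textbook forms of CLE and region overlap (the paper's exact implementation may differ, and the function names are illustrative): scale never enters the CLE formula, yet the same pixel offset is negligible for a large target and severe for a small one, which overlap captures and raw CLE does not.

import numpy as np

def center_location_error(pred, gt):
    # Euclidean distance between the centers of two (x, y, w, h) boxes,
    # where (x, y) is the top-left corner; target scale never enters.
    pcx, pcy = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gcx, gcy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return float(np.hypot(pcx - gcx, pcy - gcy))

def region_overlap(pred, gt):
    # Intersection-over-union of two (x, y, w, h) boxes; scale-invariant.
    ix = max(0.0, min(pred[0] + pred[2], gt[0] + gt[2]) - max(pred[0], gt[0]))
    iy = max(0.0, min(pred[1] + pred[3], gt[1] + gt[3]) - max(pred[1], gt[1]))
    inter = ix * iy
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0.0 else 0.0

# The same 10-pixel center offset is negligible for a large target but
# severe for a small one: overlap reflects this, raw CLE does not.
large = ((100, 100, 200, 200), (110, 100, 200, 200))
small = ((100, 100, 20, 20), (110, 100, 20, 20))
for name, (p, g) in (("large", large), ("small", small)):
    print(name, center_location_error(p, g), round(region_overlap(p, g), 3))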

A quantitative attribute-based benchmark methodology for single-target visual tracking

Wen-jing Kang1, Chang Liu1,2, Gong-liang Liu1
1School of Information Science and Engineering, Harbin Institute of Technology, Weihai 264209, China
2School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798

Abstract: Visual tracking is one of the most popular research topics in computer vision. In recent years, many advanced tracking algorithms and performance evaluation benchmarks have been released and have achieved great success. However, most existing evaluation systems are designed to measure overall performance; they cannot assess a tracker's strengths and weaknesses through targeted, detailed analysis, and many commonly used metrics lack a convincing interpretation. This paper examines the details of tracking evaluation systems from three aspects: test data, test methods, and test metrics. First, 12 frame-wise visual attributes reflecting different characteristics of image sequences are collated, and a normalized quantitative formula is given for each of them for the first time. Based on these attribute definitions, two new test methodologies are proposed, a correlation-based test and a weight-based test, which enable the evaluation system to assess each aspect of a tracker's performance more intuitively and clearly. The proposed methods are then applied to the well-known tracking challenge, the Visual Object Tracking (VOT) Challenge 2017. The results show that most trackers performed poorly when the target size changed rapidly or intensely, and that even advanced deep learning based trackers did not solve this problem well. In addition, although the center location error (CLE) metric does not take target scale into account, it is still sensitive to changes in target size in practical tests.

Keywords: Visual tracking; Performance evaluation; Visual attributes; Computer vision
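
As a concrete illustration of the correlation-based test described above, the sketch below correlates per-frame values of one normalized attribute with per-frame tracking accuracy on synthetic data; a strongly negative coefficient would flag that attribute as a weakness of the tracker. The function name, the synthetic data, and the choice of Pearson correlation are illustrative assumptions, not the authors' exact procedure.

import numpy as np
from scipy.stats import pearsonr

def correlation_based_test(attribute, accuracy):
    # Correlate per-frame attribute values (normalized to [0, 1]) with
    # per-frame tracking accuracy; a strongly negative r marks a weakness.
    r, p = pearsonr(attribute, accuracy)
    return r, p

# Synthetic example: accuracy degrades as the size-change attribute grows.
rng = np.random.default_rng(0)
size_change = rng.uniform(0.0, 1.0, 500)           # one attribute, per frame
noise = rng.normal(0.0, 0.1, 500)
accuracy = np.clip(0.8 - 0.5 * size_change + noise, 0.0, 1.0)  # e.g., overlap
r, p = correlation_based_test(size_change, accuracy)
print(f"Pearson r = {r:.2f}, p = {p:.3g}")         # r near -0.8 on this data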


