CLC number: TN919.8
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2018-03-15
Xu-guang Zuo, Lu Yu. Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(3): 459-470.
@article{title="Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots",
author="Xu-guang Zuo and Lu Yu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="3",
pages="459-470",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601552"
}
%0 Journal Article
%T Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots
%A Xu-guang Zuo
%A Lu Yu
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 3
%P 459-470
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601552
TY - JOUR
T1 - Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots
A1 - Xu-guang Zuo
A1 - Lu Yu
JO - Frontiers of Information Technology & Electronic Engineering
VL - 19
IS - 3
SP - 459
EP - 470
SN - 2095-9184
Y1 - 2018
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1601552
ER -
Abstract: The latest video coding standard, High Efficiency Video Coding (HEVC), achieves much higher coding efficiency than previous video coding standards. In particular, its hierarchical B-picture prediction structure removes temporal redundancy among neighboring frames remarkably well. In practice, videos available to consumers, such as TV series, movies, and talk shows, usually contain many repeated shots. According to our observations, when these videos are encoded by HEVC with the hierarchical B-picture structure, the temporal correlation within each shot is well exploited, but the long-term correlation between repeated shots is not. We propose a long-term prediction (LTP) scheme that exploits the long-term temporal correlation between correlated shots in a video. The long-term reference (LTR) frames of a source video are chosen by clustering similar shots and extracting representative frames, and a modified hierarchical B-picture coding structure based on LTR frames is introduced to support long-term temporal prediction. An adaptive quantization method is further designed for LTR frames to improve the overall video coding efficiency. Experimental results show that the new coding scheme achieves a coding gain of up to 22.86%.
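The LTR selection step described in the abstract (cluster similar shots, then extract a representative for each cluster) can be illustrated with a minimal sketch. The abstract does not say which features or which clustering algorithm the scheme uses, so everything here is an assumption: each shot is taken to be already summarized by a feature vector (for example, a normalized color histogram), a simple greedy threshold clustering stands in for the actual method, and the names `cluster_shots`, `pick_ltr_shots`, and `threshold` are hypothetical.

```python
from math import dist


def cluster_shots(features, threshold):
    """Greedy clustering (an assumed stand-in, not the paper's method).

    Each shot joins the first cluster whose centroid lies within
    `threshold` (Euclidean distance); otherwise it starts a new cluster.
    Returns (clusters, centroids), where clusters holds shot indices.
    """
    clusters = []
    centroids = []
    for idx, feat in enumerate(features):
        for c, centroid in enumerate(centroids):
            if dist(feat, centroid) <= threshold:
                clusters[c].append(idx)
                n = len(clusters[c])
                # Update the running mean of the cluster.
                centroids[c] = [(m * (n - 1) + x) / n
                                for m, x in zip(centroid, feat)]
                break
        else:
            clusters.append([idx])
            centroids.append(list(feat))
    return clusters, centroids


def pick_ltr_shots(features, threshold):
    """Return one representative shot index per cluster of repeated shots.

    Clusters with a single shot are skipped: a shot that never repeats
    gains nothing from a long-term reference.
    """
    clusters, centroids = cluster_shots(features, threshold)
    representatives = []
    for members, centroid in zip(clusters, centroids):
        if len(members) < 2:
            continue
        # The shot closest to the centroid represents the cluster.
        representatives.append(
            min(members, key=lambda i: dist(features[i], centroid)))
    return representatives


# Toy input: shots 0, 2, and 4 show the same scene; shots 1 and 3 do not repeat.
shot_features = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.2], [0.5, 0.5], [0.9, 0.1]]
ltr_shots = pick_ltr_shots(shot_features, threshold=0.3)
```

In this toy input only one scene repeats, so a single representative shot is selected. A real encoder would then pick a frame from that shot as the LTR frame; that step, along with the modified coding structure and adaptive quantization, is outside this sketch.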