On-line Access: 2023-12-01

Received: 2023-01-13

Revision Accepted: 2023-07-06


Frontiers of Information Technology & Electronic Engineering (formerly Journal of Zhejiang University SCIENCE C), 2023

http://doi.org/10.1631/FITEE.2300024


Enhancing action discrimination via category-specific frame clustering for weakly supervised temporal action localization


Author(s):  Huifen XIA, Yongzhao ZHAN, Honglin LIU, Xiaopeng REN

Affiliation(s):  School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China; more

Corresponding email(s):   106338096@qq.com, yzzhan@ujs.edu.cn, 534110389@qq.com, 1815772718@qq.com

Key Words:  Weakly supervised, Temporal action localization, Single-frame annotation, Category specific, Action discrimination


Huifen XIA, Yongzhao ZHAN, Honglin LIU, Xiaopeng REN. Enhancing action discrimination via category-specific frame clustering for weakly supervised temporal action localization[J]. Frontiers of Information Technology & Electronic Engineering, 2023. https://doi.org/10.1631/FITEE.2300024



Abstract: 
Temporal action localization (TAL) is the task of detecting the start and end times of action instances in an untrimmed video and classifying them. As the number of action categories per video increases, existing weakly supervised temporal action localization (W-TAL) methods that rely only on video-level labels cannot provide sufficient supervision, so single-frame supervision has attracted the interest of researchers. Existing paradigms model single-frame annotations from the perspective of video snippet sequences, neglect the action discrimination of the annotated frames, and pay insufficient attention to their correlations within the same category. For a given category, the annotated frames exhibit distinctive appearance characteristics or clear action patterns. Thus, a novel method that enhances action discrimination via category-specific frame clustering for W-TAL is proposed. Specifically, the K-means clustering algorithm is employed to aggregate the annotated discriminative frames of the same category, and the resulting clusters are regarded as exemplars that exhibit the characteristics of the action category. Class activation scores are then obtained by calculating the similarities between a frame and the exemplars of the various categories. This category-specific representation modeling provides complementary guidance to the snippet sequence modeling in the mainline. Accordingly, a convex combination fusion mechanism for annotated frames and snippet sequences is presented to enhance the consistency of action discrimination, generating a robust class activation sequence for precise action classification and localization. Owing to this supplementary guidance on action discrimination for video snippet sequences, our method outperforms existing single-frame annotation based methods. Experiments conducted on three datasets (THUMOS14, GTEA, and BEOID) show that our method achieves high localization performance compared with state-of-the-art methods.
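The pipeline described in the abstract can be sketched in a few steps: cluster the annotated frames of each category with K-means to obtain exemplars, score an arbitrary frame against each category by its similarity to that category's exemplars, and fuse the resulting scores with the snippet-sequence class activation sequence by a convex combination. The sketch below is a minimal illustration under assumed simplifications (frames as plain feature vectors, cosine similarity, a hand-rolled K-means); the function names, the cluster count `k`, and the fusion weight `alpha` are hypothetical choices, not the paper's actual implementation.

```python
import numpy as np

def category_exemplars(frames, labels, k=2, iters=20, seed=0):
    """Run K-means on the annotated frames of each category; the k
    centroids serve as exemplars of that category's appearance."""
    rng = np.random.default_rng(seed)
    exemplars = {}
    for c in np.unique(labels):
        X = frames[labels == c]
        # initialize centroids from random frames of this category
        cent = X[rng.choice(len(X), size=min(k, len(X)), replace=False)]
        for _ in range(iters):
            # assign each frame to its nearest centroid, then update
            d = np.linalg.norm(X[:, None] - cent[None], axis=-1)
            assign = d.argmin(axis=1)
            for j in range(len(cent)):
                if np.any(assign == j):
                    cent[j] = X[assign == j].mean(axis=0)
        exemplars[c] = cent
    return exemplars

def class_activation_scores(frame, exemplars):
    """Score a frame against each category as the maximum cosine
    similarity to that category's exemplars."""
    f = frame / (np.linalg.norm(frame) + 1e-8)
    scores = {}
    for c, cent in exemplars.items():
        e = cent / (np.linalg.norm(cent, axis=1, keepdims=True) + 1e-8)
        scores[c] = float((e @ f).max())
    return scores

def fuse(cas_snippet, cas_cluster, alpha=0.6):
    """Convex combination of the snippet-sequence CAS and the
    cluster-based CAS (alpha in [0, 1])."""
    return alpha * cas_snippet + (1 - alpha) * cas_cluster
```

With well-separated synthetic features, a frame drawn from one category scores highest against its own exemplars, and the fused sequence inherits discrimination from both branches.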




Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE