Affiliation(s): College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; Songshan Laboratory, Zhengzhou 450000, China; School of Electronic Engineering, Xidian University, Xi’an 710401, China
Deng LI, Peng LI, Aming WU, Yahong HAN. Prototype-guided cross-task knowledge distillation[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2400383
@article{FITEE.2400383,
  title = "Prototype-guided cross-task knowledge distillation",
  author = "Deng LI and Peng LI and Aming WU and Yahong HAN",
  journal = "Frontiers of Information Technology \& Electronic Engineering",
  year = "in press",
  publisher = "Zhejiang University Press \& Springer",
  doi = "10.1631/FITEE.2400383"
}
%0 Journal Article
%T Prototype-guided cross-task knowledge distillation
%A Deng LI
%A Peng LI
%A Aming WU
%A Yahong HAN
%J Frontiers of Information Technology & Electronic Engineering
%P
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2400383
TY  - JOUR
T1  - Prototype-guided cross-task knowledge distillation
A1  - Deng LI
A1  - Peng LI
A1  - Aming WU
A1  - Yahong HAN
JO  - Frontiers of Information Technology & Electronic Engineering
SP  -
EP  -
SN  - 2095-9184
Y1  - in press
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2400383
ER  -
Abstract: Large-scale pretrained models have recently shown clear benefits across a variety of tasks. However, their enormous computational complexity and storage demands make them difficult to deploy in real-world scenarios. Most existing knowledge distillation methods require the teacher and student models to share the same label space, which restricts their application in practice. To relax this label-space constraint, we propose a prototype-guided cross-task knowledge distillation (ProC-KD) method that transfers the intrinsic local-level object knowledge of the teacher network to various task scenarios. First, to better learn generalized knowledge in cross-task scenarios, we present a prototype learning module that learns invariant intrinsic local representations of objects from the teacher network. Second, for diverse downstream tasks, we propose a task-adaptive feature augmentation module that enhances the student network's features with the learned generalized prototype representations and guides the student network's learning to improve its generalization ability. Experimental results on various visual tasks demonstrate the effectiveness of our approach in cross-task knowledge distillation scenarios.