Affiliation(s):
School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212016, China; Jiangsu Engineering Research Center of Big Data Ubiquitous Perception and Intelligent Agricultural Applications, Zhenjiang 212016, China
Abstract: Semi-supervised sound event detection (SSED) typically leverages large amounts of unlabeled and synthetic data during training to improve model generalization and reduce overfitting to a limited set of labeled data. However, this training process is often disturbed by noise introduced by pseudo-labels or domain knowledge gaps. To alleviate this noise interference in class distribution learning, we propose an efficient semi-supervised class distribution learning method based on dynamic prompt tuning, named prompting class distribution optimization (PADO). Specifically, when modeling real labeled data, PADO dynamically incorporates independent learnable prompt tokens to explore prior knowledge about the true distribution. This prior knowledge then serves as prompt information that dynamically interacts with the posterior, noisy class distribution information. In this way, PADO optimizes the class distribution while maintaining model generalization, significantly improving the efficiency of class distribution learning. Compared with state-of-the-art (SOTA) methods on the DCASE 2019, 2020, and 2021 challenge SSED datasets, PADO demonstrates significant performance improvements. Furthermore, it can be readily extended to other benchmark models.