Journal of Zhejiang University SCIENCE C
ISSN 1869-1951 (Print), 1869-196X (Online), Monthly
2014 Vol.15 No.2 P.107-118
Transfer active learning by querying committee
Abstract: In real applications of inductive learning for classification, labeled instances are often scarce, and having an oracle label them is often expensive and time-consuming. Active learning on a single task aims to select only informative unlabeled instances for querying, improving classification accuracy while reducing the querying cost. However, an inevitable problem in active learning is that the informativeness measures used to select queries are commonly based on initial hypotheses sampled from only a few labeled instances. In such circumstances, the initial hypotheses are unreliable and may deviate from the true distribution underlying the target task. Consequently, the informativeness measures may select irrelevant instances. A promising way to compensate for this problem is to borrow useful knowledge from other sources with abundant labeled information, which is called transfer learning. However, a significant challenge in transfer learning is how to measure the similarity between the source and the target tasks. One needs to be aware of differing distributions or label assignments in unrelated source tasks; otherwise, they will degrade performance during transfer. In addition, how to design an effective strategy that avoids selecting irrelevant samples to query is still an open question. To tackle these issues, we propose a hybrid algorithm for active learning with the help of transfer learning, adopting a divergence measure to alleviate the negative transfer caused by distribution differences. To avoid querying irrelevant instances, we also present an adaptive strategy that eliminates unnecessary instances in the input space and unnecessary models in the model space. Extensive experiments on both synthetic and real data sets show that the proposed algorithm queries fewer instances while achieving higher accuracy, and that it converges faster than state-of-the-art methods.
Key words: Active learning, Transfer learning, Classification
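Innovation: An expert system and a hybrid model are adopted to further optimize the transfer learning method. With guidance from the experts, active learning theory can better provide the most valuable data. This study therefore introduces an expert system to assist the design of the transfer algorithm, and uses active learning theory for the manual selection of unknown data, to compensate for the weak performance of transfer learning algorithms when the initial data set is scarce.
Methods: A large amount of redundant data (the source data) serves as the expert system. A threshold is set during iteration, and experts and data sets that fail to meet it are eliminated, which greatly improves the algorithm's performance.
Conclusions: Combining active learning with transfer learning compensates for the transfer learning algorithm's heavy dependence on the quality of the initial data set, avoids negative transfer, and greatly improves the algorithm's performance.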
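As a rough illustration of the idea summarized above, the following Python sketch shows a generic query-by-committee loop step: committee members (standing in for source-derived experts) are pruned by a threshold, and the unlabeled instance with the most committee disagreement is chosen for querying. This is not the authors' algorithm. The function names, the use of vote entropy as the disagreement measure, and the use of error rate on labeled target data as the elimination criterion are all assumptions made for illustration; the paper itself adopts a divergence measure whose form is not given in this summary.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Disagreement among committee votes for one instance (0 means full agreement)."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def prune_committee(committee, labeled, max_error):
    """Keep members whose error rate on the labeled target data is at most max_error.

    Assumption: error rate is a simple stand-in for the paper's divergence measure.
    """
    kept = []
    for member in committee:
        errors = sum(1 for x, y in labeled if member(x) != y)
        if errors / len(labeled) <= max_error:
            kept.append(member)
    return kept


def select_query(committee, pool):
    """Return the pool instance on which the committee disagrees most."""
    return max(pool, key=lambda x: vote_entropy([m(x) for m in committee]))


def make_threshold_classifier(t):
    """Hypothetical 1-D expert: predicts 1 when x exceeds t, else 0."""
    return lambda x: 1 if x > t else 0


# Three plausible experts plus one unrelated expert with inverted labels.
committee = [make_threshold_classifier(t) for t in (0.2, 0.5, 0.8)]
committee.append(lambda x: 0 if x > 0.5 else 1)

labeled = [(0.1, 0), (0.9, 1)]   # the few labeled target instances
pool = [0.05, 0.6, 0.95]         # unlabeled target instances

committee = prune_committee(committee, labeled, max_error=0.25)
query = select_query(committee, pool)
print(len(committee), query)
```

In this toy setting the inverted expert misclassifies both labeled instances and is eliminated, and the remaining experts disagree only on the middle instance, so that instance is the one sent to the oracle.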
DOI: 10.1631/jzus.C1300167
CLC number: TP3
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2014-01-15