JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering 2018 Vol.19 No.1 P.91-103

Layer-wise domain correction for unsupervised domain adaptation

Author(s): Shuang Li, Shi-ji Song, Cheng Wu
Affiliation(s): Automation Department, Tsinghua University, Beijing 100084, China
Corresponding email(s): l-s12@mails.tsinghua.edu.cn, shijis@mail.tsinghua.edu.cn, wuc@tsinghua.edu.cn
Key Words: Unsupervised domain adaptation, Maximum mean discrepancy, Residual network, Deep learning


Shuang Li, Shi-ji Song, Cheng Wu. Layer-wise domain correction for unsupervised domain adaptation[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(1): 91-103.

@article{title="Layer-wise domain correction for unsupervised domain adaptation",
author="Shuang Li and Shi-ji Song and Cheng Wu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="1",
pages="91-103",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1700774"
}

%0 Journal Article
%T Layer-wise domain correction for unsupervised domain adaptation
%A Shuang Li
%A Shi-ji Song
%A Cheng Wu
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 1
%P 91-103
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1700774

TY - JOUR
T1 - Layer-wise domain correction for unsupervised domain adaptation
A1 - Shuang Li
A1 - Shi-ji Song
A1 - Cheng Wu
JO - Frontiers of Information Technology & Electronic Engineering
VL - 19
IS - 1
SP - 91
EP - 103
SN - 2095-9184
Y1 - 2018
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1700774
ER -


Abstract: Deep neural networks have been successfully applied to numerous machine learning tasks because of their impressive feature abstraction capabilities. However, conventional deep networks assume that the training and test data are sampled from the same distribution, and this assumption is often violated in real-world scenarios. To address the domain shift or data bias problems, we introduce layer-wise domain correction (LDC), a new unsupervised domain adaptation algorithm that adapts an existing deep network through additive correction layers spaced throughout the network. Through the additive layers, the representations of source and target domains can be perfectly aligned. The corrections, which are trained via maximum mean discrepancy, adapt to the target domain while increasing the representational capacity of the network. LDC requires no target labels, achieves state-of-the-art performance across several adaptation benchmarks, and requires significantly less training time than existing adaptation methods.
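
For readers unfamiliar with maximum mean discrepancy (MMD), the abstract's training criterion can be made concrete with the standard empirical estimator. This is a sketch of the usual biased kernel estimate, not necessarily the exact form used in the article; the symbols $x_i^s$, $x_j^t$, $n_s$, $n_t$ and the Gaussian kernel $k$ with bandwidth $\sigma$ are notation assumed here for illustration.

```latex
\widehat{\mathrm{MMD}}^2(X_s, X_t)
  = \frac{1}{n_s^2} \sum_{i=1}^{n_s} \sum_{j=1}^{n_s} k\!\left(x_i^s, x_j^s\right)
  + \frac{1}{n_t^2} \sum_{i=1}^{n_t} \sum_{j=1}^{n_t} k\!\left(x_i^t, x_j^t\right)
  - \frac{2}{n_s n_t} \sum_{i=1}^{n_s} \sum_{j=1}^{n_t} k\!\left(x_i^s, x_j^t\right),
\qquad
k(x, y) = \exp\!\left(-\frac{\lVert x - y \rVert^2}{2\sigma^2}\right)
```

Here $X_s$ and $X_t$ would be the source and target representations at a given layer. The estimate is zero when the two samples coincide and grows as their distributions move apart, which is what makes it usable as an alignment loss that needs no target labels.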

The online version of this article contains electronic supplementary materials, which are available to authorized users.

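A deep layer-wise domain correction algorithm for unsupervised domain adaptation

Summary: Thanks to their powerful feature abstraction capabilities, deep neural networks have been successfully applied in many areas of machine learning. However, conventional deep networks assume that training and test samples come from the same distribution, an assumption that does not hold in many practical applications. To address the domain shift problem with deep networks, this paper proposes layer-wise domain correction (LDC), a deep domain adaptation algorithm. The algorithm adapts a source-domain network to the target domain by adding domain correction layers to an existing deep network. The correction layers, added layer by layer, minimize the maximum mean discrepancy (MMD) distance between the features of the two domains, thereby perfectly matching the feature representations of source and target samples. At the same time, the increased network depth greatly improves the representational capacity of the network. LDC requires no labeled samples from the target domain, achieved the best results at the time on several cross-domain classification and recognition datasets, and trains nearly 10 times faster than existing deep domain adaptation algorithms.

Key words: Unsupervised domain adaptation; Maximum mean discrepancy; Residual network; Deep learning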
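
The idea summarized above, an additive correction that reduces the MMD between source and target features, can be illustrated with a small toy sketch. It is not the authors' implementation: the synthetic 2-D Gaussian "features", the fixed bandwidth, the choice to correct the target features, and the mean-difference shift standing in for a learned correction layer are all assumptions made for illustration, and the helper names (`mmd_squared`, `add_correction`, `mean_vector`) are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors given as lists."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


def add_correction(features, correction):
    """Additive correction: every feature vector h becomes h + correction."""
    return [[h_i + c_i for h_i, c_i in zip(h, correction)] for h in features]


def mean_vector(features):
    """Per-dimension mean of a list of feature vectors."""
    dim = len(features[0])
    return [sum(h[i] for h in features) / len(features) for i in range(dim)]


# Synthetic stand-ins for layer activations (assumption: 2-D Gaussians,
# with the target domain shifted relative to the source domain).
random.seed(0)
source = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(100)]
target = [[random.gauss(1.5, 1.0), random.gauss(-1.0, 1.0)] for _ in range(100)]

# Hypothetical stand-in for a trained correction layer: a constant shift
# equal to the difference of the sample means. A real correction layer
# would be a learned function of the input, trained by minimizing MMD.
correction = [s - t for s, t in zip(mean_vector(source), mean_vector(target))]
corrected_target = add_correction(target, correction)

mmd_before = mmd_squared(source, target)
mmd_after = mmd_squared(source, corrected_target)
print(f"squared MMD before correction: {mmd_before:.4f}")
print(f"squared MMD after correction:  {mmd_after:.4f}")
```

In this toy setting the domains differ only by a shift, so a constant additive term is enough to bring the MMD close to zero. Real domain shift is rarely this simple, which is presumably why the corrections are learned and placed at several depths of the network.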


