CLC number: TP391.9
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2020-06-10
Liang Hou, Xiao-yi Luo, Zi-yang Wang, Jun Liang. Representation learning via a semi-supervised stacked distance autoencoder for image classification[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(7): 1005-1018.
@article{Hou2020,
title="Representation learning via a semi-supervised stacked distance autoencoder for image classification",
author="Liang Hou and Xiao-yi Luo and Zi-yang Wang and Jun Liang",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="21",
number="7",
pages="1005--1018",
year="2020",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1900116"
}
%0 Journal Article
%T Representation learning via a semi-supervised stacked distance autoencoder for image classification
%A Liang Hou
%A Xiao-yi Luo
%A Zi-yang Wang
%A Jun Liang
%J Frontiers of Information Technology & Electronic Engineering
%V 21
%N 7
%P 1005-1018
%@ 2095-9184
%D 2020
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1900116
TY  - JOUR
T1  - Representation learning via a semi-supervised stacked distance autoencoder for image classification
A1  - Liang Hou
A1  - Xiao-yi Luo
A1  - Zi-yang Wang
A1  - Jun Liang
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 21
IS  - 7
SP  - 1005
EP  - 1018
SN  - 2095-9184
Y1  - 2020
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1900116
ER  -
Abstract: Image classification is an important application of deep learning. In a typical classification task, accuracy depends strongly on the features extracted by the deep learning method. An autoencoder is a special type of neural network that is often used for dimensionality reduction and feature extraction. The proposed method builds on the traditional autoencoder by incorporating “distance” information between samples from different categories; the resulting model is called a semi-supervised distance autoencoder. Each layer is first pre-trained in an unsupervised manner, and the optimized parameters then serve as the initial values for the subsequent supervised training. To obtain more suitable features, we replace the basic single-hidden-layer autoencoder structure with a stacked model. A series of experiments is carried out to test the performance of different models on several datasets: MNIST, street view house numbers (SVHN), the German traffic sign recognition benchmark (GTSRB), and CIFAR-10. The proposed semi-supervised distance autoencoder is compared with the traditional autoencoder, sparse autoencoder, and supervised autoencoder. Experimental results verify the effectiveness of the proposed model.
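As an illustration of the idea described in the abstract, the following is a minimal NumPy sketch of a single-hidden-layer autoencoder objective with an added distance term between samples of different categories. The abstract does not give the exact loss, so the hinge form of the penalty, the function names, and the `lam` and `margin` parameters are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def encode(x, w, b):
    """Hidden representation of a single-hidden-layer autoencoder."""
    return sigmoid(x @ w + b)


def reconstruction_loss(x, w, b, w_out, b_out):
    """Mean squared reconstruction error of the plain autoencoder."""
    x_hat = sigmoid(encode(x, w, b) @ w_out + b_out)
    return float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))


def distance_penalty(h, labels, margin=1.0):
    """Assumed form: hinge penalty on pairs of samples from different
    categories whose squared hidden-space distance is below `margin`."""
    diff = h[:, None, :] - h[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    different = labels[:, None] != labels[None, :]
    if not different.any():
        return 0.0
    hinge = np.maximum(0.0, margin - sq_dist)
    return float(hinge[different].mean())


def distance_autoencoder_loss(x, labels, params, lam=0.1, margin=1.0):
    """Reconstruction error plus the weighted (hypothetical) distance term."""
    w, b, w_out, b_out = params
    h = encode(x, w, b)
    recon = reconstruction_loss(x, w, b, w_out, b_out)
    return recon + lam * distance_penalty(h, labels, margin)
```

In this sketch the penalty vanishes when all samples in a batch share one label, so the objective reduces to the ordinary reconstruction loss. Layer-wise pre-training and stacking, as described in the abstract, would apply such a loss to each layer in turn and are not shown here.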