CLC number: TP391.4
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2019-04-11
Yue Wu, Can Wang, Yue-qing Zhang, Jia-jun Bu. Unsupervised feature selection via joint local learning and group sparse regression[J]. Frontiers of Information Technology & Electronic Engineering, 2019, 20(4): 538-553.
@article{title="Unsupervised feature selection via joint local learning and group sparse regression",
author="Yue Wu, Can Wang, Yue-qing Zhang, Jia-jun Bu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="20",
number="4",
pages="538-553",
year="2019",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1700804"
}
%0 Journal Article
%T Unsupervised feature selection via joint local learning and group sparse regression
%A Yue Wu
%A Can Wang
%A Yue-qing Zhang
%A Jia-jun Bu
%J Frontiers of Information Technology & Electronic Engineering
%V 20
%N 4
%P 538-553
%@ 2095-9184
%D 2019
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1700804
TY - JOUR
T1 - Unsupervised feature selection via joint local learning and group sparse regression
A1 - Yue Wu
A1 - Can Wang
A1 - Yue-qing Zhang
A1 - Jia-jun Bu
JO - Frontiers of Information Technology & Electronic Engineering
VL - 20
IS - 4
SP - 538
EP - 553
SN - 2095-9184
Y1 - 2019
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1700804
ER -
Abstract: Feature selection has attracted a great deal of interest over the past decades. By selecting meaningful feature subsets, the performance of learning algorithms can be effectively improved. Because label information is expensive to obtain, unsupervised feature selection methods are more widely used than supervised ones. The key to unsupervised feature selection is to find features that effectively reflect the underlying data distribution. However, because of the inevitable redundancy and noise in a dataset, the intrinsic data distribution is not best revealed when all features are used. To address this issue, we propose a novel unsupervised feature selection algorithm via joint local learning and group sparse regression (JLLGSR). JLLGSR combines local-learning-based clustering with group-sparsity-regularized regression in a single formulation, and seeks features that respect both the manifold structure and the group sparse structure of the data space. An iterative optimization method is developed in which the weights converge on the important features, and the selected features in turn improve the clustering results. Experiments on multiple real-world datasets (images, voices, and web pages) demonstrate the effectiveness of JLLGSR.
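The idea of group-sparsity-regularized regression mentioned in the abstract can be illustrated with a minimal sketch. This is not the JLLGSR formulation itself, which the abstract does not spell out; it is a generic l2,1-norm regularized least-squares problem solved by iteratively reweighted least squares, a common technique for this type of penalty. The function name l21_feature_scores, the parameters alpha, n_iter, and eps, and the target matrix Y (standing in for cluster indicators that a clustering step would supply) are all assumptions made for illustration.

```python
import numpy as np

def l21_feature_scores(X, Y, alpha=1.0, n_iter=50, eps=1e-8):
    """Score features by the row norms of W, where W approximately minimizes
    ||X W - Y||_F^2 + alpha * ||W||_{2,1}.

    X: (n_samples, n_features) data matrix.
    Y: (n_samples, n_targets) targets, e.g. pseudo cluster indicators
       (hypothetical stand-in; not the authors' construction).
    """
    d = X.shape[1]
    XtX = X.T @ X
    XtY = X.T @ Y
    D = np.eye(d)  # reweighting matrix, refined at each iteration
    for _ in range(n_iter):
        # With D fixed, the stationarity condition is (X^T X + alpha D) W = X^T Y.
        W = np.linalg.solve(XtX + alpha * D, XtY)
        row_norms = np.sqrt((W ** 2).sum(axis=1))
        # D_ii = 1 / (2 ||w_i||); eps guards against division by zero.
        D = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    # A large row norm means the feature is used across all targets.
    return np.sqrt((W ** 2).sum(axis=1))

# Toy usage: only the first three of ten features drive the targets.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
W_true = np.zeros((10, 2))
W_true[:3] = [[2.0, -1.0], [1.5, 1.0], [-1.0, 2.0]]
Y_demo = X_demo @ W_true + 0.01 * rng.normal(size=(200, 2))
demo_scores = l21_feature_scores(X_demo, Y_demo)
top_features = np.argsort(demo_scores)[::-1][:3]
```

Because the penalty acts on whole rows of W, a feature is either kept for all targets or suppressed for all of them, which is what makes the row norms usable as a feature ranking.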