JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering 2017 Vol.18 No.4 P.445-463

A systematic review of structured sparse learning

Author(s): Lin-bo Qiao, Bo-feng Zhang, Jin-shu Su, Xi-cheng Lu
Affiliation(s): College of Computer, National University of Defense Technology, Changsha 410073, China
Corresponding email(s): qiao.linbo@nudt.edu.cn, bfzhang@nudt.edu.cn, sjs@nudt.edu.cn, xclu@nudt.edu.cn
Key Words: Sparse learning, Structured sparse learning, Structured regularization


Lin-bo Qiao, Bo-feng Zhang, Jin-shu Su, Xi-cheng Lu. A systematic review of structured sparse learning[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(4): 445-463.

@article{title="A systematic review of structured sparse learning",
author="Lin-bo Qiao, Bo-feng Zhang, Jin-shu Su, Xi-cheng Lu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="18",
number="4",
pages="445-463",
year="2017",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601489"
}

%0 Journal Article
%T A systematic review of structured sparse learning
%A Lin-bo Qiao
%A Bo-feng Zhang
%A Jin-shu Su
%A Xi-cheng Lu
%J Frontiers of Information Technology & Electronic Engineering
%V 18
%N 4
%P 445-463
%@ 2095-9184
%D 2017
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601489

TY - JOUR
T1 - A systematic review of structured sparse learning
A1 - Lin-bo Qiao
A1 - Bo-feng Zhang
A1 - Jin-shu Su
A1 - Xi-cheng Lu
JO - Frontiers of Information Technology & Electronic Engineering
VL - 18
IS - 4
SP - 445
EP - 463
SN - 2095-9184
Y1 - 2017
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1601489
ER -


Abstract: High-dimensional data arising from diverse scientific research fields and industrial development have led to increased interest in sparse learning, owing to its model parsimony and computational advantages. Under the assumption of sparsity, many computational problems can be handled efficiently in practice. Structured sparse learning encodes the structural information of the variables and has been quite successful in numerous research fields. As various types of structure have been discovered, many kinds of structured regularization have been proposed. By exploiting specific structural information, these regularizations have greatly improved the efficacy of sparse learning algorithms. In this article, we present a systematic review of structured sparse learning, covering ideas, formulations, algorithms, and applications. We present these algorithms in the unified framework of minimizing the sum of loss and penalty functions, summarize publicly accessible software implementations, and compare the computational complexity of typical optimization methods for solving structured sparse learning problems. In experiments, we present applications in unsupervised learning, for structured signal recovery and hierarchical image reconstruction, and in supervised learning, in the context of a novel graph-guided logistic regression.

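A systematic review of structured sparse learning (Chinese summary)

Summary: Sparse learning has attracted growing attention because of its parsimony and computational advantages; under sparsity, many computational problems can be handled efficiently in practice. Structured sparse learning goes further by encoding structural information, and it has succeeded in many research fields. As various types of structure have been discovered, a range of structured regularization functions has been proposed. By exploiting specific structural information, these regularization functions have greatly improved the performance of sparse learning algorithms. In this article, we systematically review structured sparse learning in terms of ideas, formulations, algorithms, and applications. We place these algorithms in the unified framework of minimizing the sum of a loss function and a penalty function, summarize open-source software implementations, and compare the computational complexity of typical optimization algorithms for structured sparse learning problems. In the experiments, we present applications of unsupervised learning to structured signal recovery and hierarchical image reconstruction, and an application of graph-guided logistic regression to supervised learning.

Key words: Structured sparse learning; Algorithms; Applications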
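
The "loss plus penalty" framework mentioned in the abstract can be illustrated with a small sketch. The example below is not taken from the article: it assumes a least-squares loss, a non-overlapping group Lasso penalty, and a plain proximal gradient iteration, and the function names (group_soft_threshold, prox_grad_group_lasso) and the synthetic data are hypothetical choices made only for illustration.

```python
import numpy as np

def group_soft_threshold(w, groups, tau):
    """Proximal operator of tau * sum_g ||w_g||_2 for non-overlapping groups."""
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        if norm > tau:
            # Shrink the whole group toward zero; groups with norm <= tau stay zero.
            out[g] = (1.0 - tau / norm) * w[g]
    return out

def prox_grad_group_lasso(X, y, groups, lam, n_iter=500):
    """Minimize (1/2n) * ||Xw - y||^2 + lam * sum_g ||w_g||_2 by proximal gradient."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the Lipschitz constant of the gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n              # gradient of the smooth loss
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w

# Illustrative synthetic problem (assumed data): only the first group is active.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
groups = [[0, 1, 2], [3, 4, 5]]
w_true = np.array([1.0, -2.0, 0.5, 0.0, 0.0, 0.0])
y = X @ w_true + 0.01 * rng.standard_normal(50)
w_hat = prox_grad_group_lasso(X, y, groups, lam=0.1)
print(np.round(w_hat, 3))
```

The penalty enters only through its proximal operator, so under the same assumptions a different structured penalty could be tried by swapping group_soft_threshold for another proximal map while leaving the gradient step on the loss unchanged.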


