CLC number: TP181
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2023-04-20
Han YAN, Chongquan ZHONG, Yuhu WU, Liyong ZHANG, Wei LU. A hybrid-model optimization algorithm based on the Gaussian process and particle swarm optimization for mixed-variable CNN hyperparameter automatic search[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2200515
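A hybrid-model optimization algorithm based on the Gaussian process and particle swarm optimization for CNN hyperparameter automatic search
School of Control Science and Engineering, Dalian University of Technology, Dalian 116024, China
Abstract: Convolutional neural networks (CNNs) have developed rapidly in many real-world application fields. However, CNN performance depends heavily on the hyperparameters, and configuring suitable hyperparameters for a CNN usually faces three challenges: (1) the mixed-variable encoding of different types of CNN hyperparameters; (2) the expensive computational cost of evaluating candidate models; (3) ensuring both the convergence rate and the model performance during the search. To address these problems, a hybrid-model optimization algorithm based on the Gaussian process (GP) and particle swarm optimization (PSO), named GPPSO, is proposed to automatically search for the optimal CNN hyperparameter configuration. First, a new encoding method is designed to efficiently encode the different types of hyperparameters in CNNs. Second, a hybrid-surrogate-assisted (HSA) model is proposed to reduce the high computational cost of evaluating candidate models. Finally, a new activation function is designed to improve model performance and ensure the convergence rate. Extensive experiments on image classification benchmark datasets verify that GPPSO outperforms state-of-the-art methods. Metal fracture diagnosis is used as a case study to verify the effectiveness of GPPSO in a practical application. Experimental results show that GPPSO needs only 0.04 and 1.70 GPU days to achieve recognition accuracies of 95.26% and 76.36% on the CIFAR-10 and CIFAR-100 datasets, respectively.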
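The mixed-variable encoding challenge named in the abstract can be illustrated with a small sketch. This is not the encoding method of the paper, which the abstract does not describe; it is one common approach, assumed here for illustration, in which every PSO particle lives in the unit hypercube and is decoded into continuous, integer, and categorical hyperparameters. The search space, the hyperparameter names, and the `decode` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Union

Value = Union[int, float, str]


@dataclass(frozen=True)
class Continuous:
    low: float
    high: float


@dataclass(frozen=True)
class Integer:
    low: int
    high: int


@dataclass(frozen=True)
class Categorical:
    choices: Sequence[str]


Spec = Union[Continuous, Integer, Categorical]

# Hypothetical search space; names and ranges are illustrative only.
SEARCH_SPACE: Dict[str, Spec] = {
    "learning_rate": Continuous(1e-4, 1e-1),
    "num_filters": Integer(16, 128),
    "kernel_size": Categorical(("3x3", "5x5", "7x7")),
}


def decode(position: Sequence[float], space: Dict[str, Spec]) -> Dict[str, Value]:
    """Map a particle position in [0, 1]^d to a mixed-variable configuration."""
    if len(position) != len(space):
        raise ValueError("position and search space must have the same length")
    config: Dict[str, Value] = {}
    for x, (name, spec) in zip(position, space.items()):
        x = min(max(x, 0.0), 1.0)  # clamp particles that left the unit cube
        if isinstance(spec, Continuous):
            # Linear rescaling of the unit interval onto [low, high].
            config[name] = spec.low + x * (spec.high - spec.low)
        elif isinstance(spec, Integer):
            # Split [0, 1] into equal-width bins, one per integer value.
            span = spec.high - spec.low
            config[name] = spec.low + min(int(x * (span + 1)), span)
        else:
            # Split [0, 1] into equal-width bins, one per category.
            last = len(spec.choices) - 1
            config[name] = spec.choices[min(int(x * len(spec.choices)), last)]
    return config
```

With such a decoder, a standard continuous PSO update can move particles freely, and only the decoded configuration is passed to the (expensive or surrogate) evaluator. Equal-width binning is a design choice of this sketch; it gives each discrete value the same share of the search interval.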