CLC number: O235; N93
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2020-03-31
Yi-fei Pu, Jian Wang. Fractional-order global optimal backpropagation machine trained by an improved fractional-order steepest descent method[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(6): 809-833. https://doi.org/10.1631/FITEE.1900593
Abstract: We introduce the fractional-order global optimal backpropagation machine, a fractional-order backpropagation neural network (FBPNN) trained by an improved fractional-order steepest descent method (FSDM). The FBPNN is a state-of-the-art fractional-order branch of the family of backpropagation neural networks (BPNNs) and differs from most previous classic first-order BPNNs, which are trained by the traditional first-order steepest descent method. The reverse incremental search of the proposed FBPNN proceeds in the negative directions of the approximate fractional-order partial derivatives of the square error. First, the theoretical concept of an FBPNN trained by an improved FSDM is described mathematically. Then, the mathematical proof of fractional-order global optimal convergence, an assumption about the network structure, and the fractional-order multi-scale global optimization of the FBPNN are analyzed in detail. Finally, three types of experiments compare the performance of an FBPNN with that of a classic first-order BPNN: example function approximation, fractional-order multi-scale global optimization, and a comparison of global search and error-fitting abilities on real data. The FBPNN's stronger ability to locate the global optimal solution is its major advantage over a classic first-order BPNN.
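The core idea, a reverse incremental search along the negative approximate fractional-order partial derivatives of the square error, can be illustrated with a minimal sketch. The sketch below is not the authors' exact FSDM or FBPNN: it assumes a commonly used power-function approximation of a Caputo-type fractional derivative applied to a simple linear least-squares error, and the names (fractional_grad, nu, lr) and the toy data are illustrative only.

    # Minimal sketch of fractional-order steepest descent (not the authors' exact FSDM).
    # Assumption: the order-nu partial derivative of the square error is approximated by
    # the power-function rule  D^nu f(w) ~= f'(w) * |w - c|^(1 - nu) / Gamma(2 - nu),
    # with the previous iterate as the lower terminal c and 0 < nu < 1.
    # For nu = 1 the scale factor is 1 and the update reduces to classic steepest descent.
    import math
    import numpy as np

    def square_error(w, X, y):
        """Square error of a linear model y ~= X @ w."""
        r = X @ w - y
        return 0.5 * float(r @ r)

    def first_order_grad(w, X, y):
        """Classic first-order gradient of the square error."""
        return X.T @ (X @ w - y)

    def fractional_grad(w, w_prev, X, y, nu=0.9, eps=1e-8):
        """Approximate order-nu partial derivatives of the square error (illustrative)."""
        g = first_order_grad(w, X, y)
        scale = (np.abs(w - w_prev) + eps) ** (1.0 - nu) / math.gamma(2.0 - nu)
        return g * scale

    # Toy data: recover a known linear map from noisy observations.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + 0.01 * rng.normal(size=50)

    w = np.zeros(3)
    w_prev = w - 0.1          # lower terminal for the fractional derivative
    lr, nu = 0.02, 0.9
    for _ in range(300):
        g = fractional_grad(w, w_prev, X, y, nu=nu)
        w_prev, w = w, w - lr * g   # reverse search along the negative fractional gradient
    print("estimated w:", np.round(w, 3), " square error:", square_error(w, X, y))

In this simplified view the fractional order nu acts as an extra degree of freedom that rescales the search direction; the paper's FSDM and its global-convergence analysis for the full FBPNN are considerably more elaborate.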