CLC number: TP277
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2022-05-09
https://orcid.org/0000-0002-7512-0168
https://orcid.org/0000-0002-0528-2778
Weijun WANG, Yun WANG, Jun WANG, Xinyun FANG, Yuchen HE. Ensemble enhanced active learning mixture discriminant analysis model and its application for semi-supervised fault classification[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(12): 1814-1827.
@article{FITEE2200053,
title="Ensemble enhanced active learning mixture discriminant analysis model and its application for semi-supervised fault classification",
author="Weijun WANG and Yun WANG and Jun WANG and Xinyun FANG and Yuchen HE",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="23",
number="12",
pages="1814--1827",
year="2022",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2200053"
}
%0 Journal Article
%T Ensemble enhanced active learning mixture discriminant analysis model and its application for semi-supervised fault classification
%A Weijun WANG
%A Yun WANG
%A Jun WANG
%A Xinyun FANG
%A Yuchen HE
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 12
%P 1814-1827
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2200053
TY  - JOUR
T1  - Ensemble enhanced active learning mixture discriminant analysis model and its application for semi-supervised fault classification
A1  - Weijun WANG
A1  - Yun WANG
A1  - Jun WANG
A1  - Xinyun FANG
A1  - Yuchen HE
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 23
IS  - 12
SP  - 1814
EP  - 1827
SN  - 2095-9184
Y1  - 2022
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2200053
ER  -
Abstract: As an indispensable part of process monitoring, fault classification relies heavily on sufficient process knowledge. However, data labels are often difficult to acquire because of limited sampling conditions or expensive laboratory analysis, which can degrade classification performance. To address this problem, a new semi-supervised fault classification strategy is proposed in which enhanced active learning is employed to evaluate the value of each unlabeled sample with respect to a specific labeled dataset. Unlabeled samples with large values then serve as supplementary information for the training dataset. In addition, several indexes and criteria are introduced so that human labeling interference is greatly reduced. Finally, the fault classification effectiveness of the proposed method is evaluated using a numerical example and the Tennessee Eastman process.