
On-line Access: 2026-03-25
Received: 2025-04-11
Revision Accepted: 2025-07-11
Crosschecked: 2026-03-25
Chengyu YU, Hongling YU, Xiaofeng QU, Baoxi LIU, Liangsi XU, Xinyu LIU, Xiangyu CHEN. Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions[J]. Journal of Zhejiang University Science A, 2026, 27(3): 215-230.
@article{yu2026predicting,
title="Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions",
author="Chengyu YU and Hongling YU and Xiaofeng QU and Baoxi LIU and Liangsi XU and Xinyu LIU and Xiangyu CHEN",
journal="Journal of Zhejiang University Science A",
volume="27",
number="3",
pages="215-230",
year="2026",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/jzus.A2500127"
}
%0 Journal Article
%T Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions
%A Chengyu YU
%A Hongling YU
%A Xiaofeng QU
%A Baoxi LIU
%A Liangsi XU
%A Xinyu LIU
%A Xiangyu CHEN
%J Journal of Zhejiang University SCIENCE A
%V 27
%N 3
%P 215-230
%@ 1673-565X
%D 2026
%I Zhejiang University Press & Springer
%R 10.1631/jzus.A2500127
TY - JOUR
T1 - Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions
A1 - Chengyu YU
A1 - Hongling YU
A1 - Xiaofeng QU
A1 - Baoxi LIU
A1 - Liangsi XU
A1 - Xinyu LIU
A1 - Xiangyu CHEN
JO - Journal of Zhejiang University Science A
VL - 27
IS - 3
SP - 215
EP - 230
SN - 1673-565X
Y1 - 2026
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A2500127
ER -
Abstract: Accurate prediction of the permeability coefficient is crucial for evaluating the compaction quality of earthworks. However, on-site testing during compaction is often time-consuming and expensive, so few samples are available, which reduces prediction accuracy. Moreover, most current predictive models have limited capabilities and tend to be black-box models with poor explainability. To overcome these issues, this study proposed a new method to predict the permeability coefficient of earth-rock material based on an improved generative adversarial network (GAN) and an explainable osprey optimization algorithm–Huber loss–light gradient boosting machine (OOA–HL–LightGBM). First, by introducing the Wasserstein distance as the loss function into the conditional generative adversarial network (CGAN), a Wasserstein conditional generative adversarial network (WCGAN) was proposed to generate high-quality data, addressing the insufficient information caused by small samples. Second, with material and compaction parameters as inputs, a high-accuracy permeability coefficient prediction model was developed using LightGBM with the Huber loss function and the OOA. Finally, the Shapley additive explanations (SHAP) method was introduced into OOA–HL–LightGBM to analyze the specific roles of different features within the dataset and thereby enhance the credibility of the prediction results. The proposed method was applied to a large-scale high-core rockfill dam in southwestern China to verify its effectiveness and superiority.
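The Huber loss named in the abstract can be illustrated with a short sketch. This is the textbook definition (Huber, 1964), not the authors' implementation: the function names and the threshold `delta` are assumptions chosen for illustration, and the abstract does not state which threshold the paper uses. The loss is quadratic for small residuals and linear for large ones, which limits the influence of outlying measurements in a small sample.

```python
import numpy as np


def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for |r| <= delta, linear beyond it.

    `delta` is an illustrative threshold, not a value taken from the paper.
    """
    r = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    abs_r = np.abs(r)
    quadratic = 0.5 * r ** 2
    linear = delta * (abs_r - 0.5 * delta)
    return float(np.mean(np.where(abs_r <= delta, quadratic, linear)))


def huber_gradient(y_true, y_pred, delta=1.0):
    """Derivative of the per-sample Huber loss with respect to y_pred.

    Equal to the residual inside the quadratic zone and clipped to
    +/- delta outside it, so large errors contribute a bounded gradient.
    """
    r = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return np.clip(r, -delta, delta)


if __name__ == "__main__":
    y_true = [0.0, 0.0]
    y_pred = [0.5, 3.0]  # the second prediction plays the role of an outlier
    print(huber_loss(y_true, y_pred))
    print(huber_gradient(y_true, y_pred))
```

With a squared-error loss the second residual would contribute a gradient of 3.0; under the Huber loss it is capped at `delta`, which is the robustness property that motivates pairing this loss with a gradient boosting model.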