On-line Access: 2026-03-25

Received: 2025-04-11

Revision Accepted: 2025-07-11

Crosschecked: 2026-03-25


 ORCID:

Chengyu YU

https://orcid.org/0009-0009-9126-7344

Hongling YU

https://orcid.org/0009-0009-1478-7338

Journal of Zhejiang University SCIENCE A 2026 Vol.27 No.3 P.215-230

http://doi.org/10.1631/jzus.A2500127


Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions


Author(s):  Chengyu YU, Hongling YU, Xiaofeng QU, Baoxi LIU, Liangsi XU, Xinyu LIU, Xiangyu CHEN

Affiliation(s):  1. College of Water Resources and Civil Engineering, China Agricultural University, Beijing 100083, China

Corresponding email(s):   yuhongling@cau.edu.cn

Key Words:  Permeability coefficient prediction, Light gradient boosting machine (LightGBM), Wasserstein conditional generative adversarial network (WCGAN), Shapley additive explanation (SHAP)


Chengyu YU, Hongling YU, Xiaofeng QU, Baoxi LIU, Liangsi XU, Xinyu LIU, Xiangyu CHEN. Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions[J]. Journal of Zhejiang University Science A, 2026, 27(3): 215-230.

@article{title="Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions",
author="Chengyu YU, Hongling YU, Xiaofeng QU, Baoxi LIU, Liangsi XU, Xinyu LIU, Xiangyu CHEN",
journal="Journal of Zhejiang University Science A",
volume="27",
number="3",
pages="215-230",
year="2026",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A2500127"
}

%0 Journal Article
%T Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions
%A Chengyu YU
%A Hongling YU
%A Xiaofeng QU
%A Baoxi LIU
%A Liangsi XU
%A Xinyu LIU
%A Xiangyu CHEN
%J Journal of Zhejiang University SCIENCE A
%V 27
%N 3
%P 215-230
%@ 1673-565X
%D 2026
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A2500127

TY - JOUR
T1 - Predicting permeability coefficients of earth-rock material using an improved generative adversarial network and explainable ensemble learning under small sample conditions
A1 - Chengyu YU
A1 - Hongling YU
A1 - Xiaofeng QU
A1 - Baoxi LIU
A1 - Liangsi XU
A1 - Xinyu LIU
A1 - Xiangyu CHEN
JO - Journal of Zhejiang University Science A
VL - 27
IS - 3
SP - 215
EP - 230
SN - 1673-565X
Y1 - 2026
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A2500127
ER -


Abstract: 
Accurate prediction of the permeability coefficient is crucial for evaluating the compaction quality of earthworks. However, on-site testing during compaction is often time-consuming and expensive, so few samples are available, which reduces prediction accuracy. Moreover, most current predictive models have limited capability and tend to be black boxes with poor explainability. To overcome these issues, we propose a new method for predicting the permeability coefficient of earth-rock material based on an improved generative adversarial network (GAN) and an explainable osprey optimization algorithm–Huber loss–light gradient boosting machine (OOA–HL–LightGBM) model. First, the Wasserstein distance is introduced as the loss function of the conditional generative adversarial network (CGAN), yielding a Wasserstein conditional generative adversarial network (WCGAN) that generates high-quality data and addresses the lack of information caused by small samples. Second, with material and compaction parameters as inputs, a high-accuracy permeability coefficient prediction model is developed using LightGBM with the Huber loss function and the OOA. Finally, the Shapley additive explanation (SHAP) method is applied to OOA–HL–LightGBM to analyze the specific roles of the different features in the dataset and to enhance the credibility of the prediction results. The proposed method was applied to a large-scale high-core rockfill dam in southwestern China to verify its effectiveness and superiority.
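To make the role of the Wasserstein distance more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual Wasserstein GAN formulation, in which a critic assigns real-valued scores to real and generated samples; the function names `critic_loss`, `generator_loss`, and `wasserstein_1d` are hypothetical. The network architecture, conditioning variables, and training details of the WCGAN are not given in this abstract, so the sketch works on plain lists of numbers.

```python
def critic_loss(real_scores, fake_scores):
    """Critic loss in the usual WGAN form (assumed here):
    mean score on generated samples minus mean score on real samples.
    Minimizing it pushes real scores up and generated scores down."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real


def generator_loss(fake_scores):
    """Generator loss in the usual WGAN form (assumed here):
    the negative mean critic score on generated samples."""
    return -sum(fake_scores) / len(fake_scores)


def wasserstein_1d(real, fake):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.
    In one dimension it reduces to the mean absolute difference
    between the sorted values of the two samples."""
    if len(real) != len(fake):
        raise ValueError("samples must have the same size")
    pairs = zip(sorted(real), sorted(fake))
    return sum(abs(r - f) for r, f in pairs) / len(real)


if __name__ == "__main__":
    # Illustrative numbers only; they are not data from the study.
    print(critic_loss([1.0, 3.0], [0.0, 1.0]))
    print(generator_loss([0.0, 1.0]))
    print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

The one-dimensional distance is included only as a sanity check of what the critic estimates: shifting every sample by a constant changes the distance by exactly that constant.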

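Predicting permeability coefficients of earth-rock material under small sample conditions based on an improved generative adversarial network and explainable ensemble learning

Authors: Chengyu YU, Hongling YU, Xiaofeng QU, Baoxi LIU, Liangsi XU, Xinyu LIU, Xiangyu CHEN
Affiliation: College of Water Resources and Civil Engineering, China Agricultural University, Beijing 100083, China
Aim: To address the problems of small samples and insufficient explainability in predicting the permeability coefficient of earth-rock material, this paper explores a data augmentation method based on a generative model to improve the generalization ability of the prediction model under small sample conditions. An explainable ensemble learning algorithm is also incorporated to enhance the credibility of the prediction results and to achieve high-accuracy prediction of the permeability coefficient of earth-rock material.
Innovations: 1. A data augmentation method based on an improved generative adversarial network is proposed, which effectively improves the performance of the permeability coefficient prediction model under small sample conditions; 2. A permeability coefficient prediction model based on an improved light gradient boosting machine (LightGBM) is built and combined with an optimization algorithm to predict the permeability coefficient with higher accuracy; 3. The Shapley additive explanation (SHAP) method is used to explain the prediction results globally and locally, enhancing the explainability of the model; 4. The proposed method is applied to a real earthwork engineering case to verify its effectiveness in engineering practice.
Methods: 1. The Wasserstein distance is introduced as the loss function into the conditional generative adversarial network, and the resulting Wasserstein conditional generative adversarial network is used for data augmentation; 2. The LightGBM algorithm is used to build a high-accuracy permeability coefficient prediction model with the Huber loss function and the osprey optimization algorithm; 3. The SHAP method is used to identify the key features that influence the prediction results and to analyze the specific roles of the different features in the dataset.
Conclusions: 1. The data augmentation method based on the Wasserstein conditional generative adversarial network can generate high-quality samples and effectively solves the small-sample data problem; 2. The permeability coefficient prediction model built on LightGBM combined with the Huber loss and osprey optimization has high predictive performance; 3. The SHAP method enables global and local analysis of the prediction results and improves the explainability of the prediction model.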
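The Huber loss named in the methods above can be sketched as follows, under stated assumptions. Gradient boosting fits each new tree to the first and second derivatives of the loss with respect to the current prediction, so the sketch returns both. The threshold `delta` and the function names `huber_loss` and `huber_grad_hess` are hypothetical; the source does not give the threshold or say how the loss was supplied to LightGBM.

```python
def huber_loss(residual, delta=1.0):
    """Huber loss for a single residual r = prediction - target:
    quadratic when |r| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def huber_grad_hess(y_true, y_pred, delta=1.0):
    """Per-sample first and second derivatives of the Huber loss
    with respect to the prediction."""
    grads, hess = [], []
    for t, p in zip(y_true, y_pred):
        r = p - t
        if abs(r) <= delta:
            # Quadratic region: behaves like squared error.
            grads.append(r)
            hess.append(1.0)
        else:
            # Linear region: the gradient is capped at +/- delta,
            # which limits the influence of outliers.
            grads.append(delta if r > 0 else -delta)
            hess.append(0.0)
    return grads, hess


if __name__ == "__main__":
    # Illustrative numbers only; they are not data from the study.
    g, h = huber_grad_hess([0.0, 0.0, 0.0], [0.5, 3.0, -3.0], delta=1.0)
    print(g, h)
```

The second derivative is zero in the linear region, which can be awkward for Newton-style leaf updates; whether and how this was handled in the study is not stated here.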

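Keywords: Permeability coefficient prediction; Light gradient boosting machine (LightGBM); Wasserstein conditional generative adversarial network; Shapley additive explanation (SHAP)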



