
CLC number: TP391
On-line Access: 2025-11-17
Received: 2025-04-30
Revision Accepted: 2025-11-18
Crosschecked: 2025-09-05
Jiaqi SHI, Xulong ZHANG, Xiaoyang QU, Junfei XIE, Jianzong WANG. Knowledge distillation for financial large language models: a systematic review of strategies, applications, and evaluation[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2500282
Knowledge distillation for financial large language models: a systematic review of strategies, applications, and evaluation

1 Ping An Technology (Shenzhen) Co., Ltd., Shenzhen 518046, China
2 Institute of Advanced Technology, University of Science and Technology of China, Hefei 230027, China

Abstract: Financial large language models offer great potential for financial applications. However, prohibitive deployment costs and high inference latency remain major obstacles. As an important compression method, knowledge distillation provides an effective solution to these challenges. This paper presents a comprehensive survey of how knowledge distillation interacts with financial large language models, covering three core aspects: strategies, applications, and evaluation. At the strategy level, a structured taxonomy is introduced to comparatively analyze existing distillation paths. At the application level, a logical upstream-midstream-downstream framework is proposed to systematically explain the practical value of distilled models in the financial domain. At the evaluation level, to address the lack of standards in finance, a comprehensive evaluation framework is constructed that assesses models along multiple dimensions, including financial accuracy, reasoning fidelity, and robustness. In summary, this paper aims to provide a clear roadmap for this interdisciplinary field and to accelerate the development of distilled financial large language models.

Key words:
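For readers new to the technique, the following minimal sketch illustrates the classic white-box soft-label objective that many of the distillation strategies surveyed here build on: a temperature-scaled KL divergence against the teacher's logits blended with standard cross-entropy on ground-truth labels. The function name, temperature T, and mixing weight alpha are illustrative assumptions for this sketch, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Illustrative sketch only: T and alpha are assumed hyperparameters.
    # Soft-target term: KL divergence between teacher and student
    # distributions at temperature T, rescaled by T^2 so its gradient
    # magnitude stays comparable to the hard-target term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # Blend the two terms; alpha trades teacher imitation against label fit.
    return alpha * soft + (1.0 - alpha) * hard
```

In a token-level LLM setting, student_logits and teacher_logits would be of shape (batch x sequence, vocabulary) and labels would hold the next-token ids; the surveyed strategies differ mainly in what signal replaces or augments the soft-target term.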

