Full Text:   <440>

Summary:  <82>

Suppl. Mater.: 

CLC number: 

On-line Access: 2025-11-17

Received: 2025-06-18

Revision Accepted: 2025-11-18

Crosschecked: 2025-09-29

Cited: 0

Clicked: 387

Citations:  Bibtex RefMan EndNote GB/T7714

 ORCID:

Shurui XU

https://orcid.org/0009-0005-9669-8582

Feng LUO

https://orcid.org/0009-0006-3851-843X

Shuyan LI

https://orcid.org/0000-0002-5107-0338

Mengzhen FAN

https://orcid.org/0009-0006-2391-4659

Zhongtian SUN

https://orcid.org/0000-0003-0489-5203

-   Go to

Article info.
Open peer comments

Frontiers of Information Technology & Electronic Engineering  2025 Vol.26 No.10 P.1871-1878

http://doi.org/10.1631/FITEE.2500421


Three trustworthiness challenges in large language model-based financial systems: real-world examples and mitigation strategies


Author(s):  Shurui XU, Feng LUO, Shuyan LI, Mengzhen FAN, Zhongtian SUN

Affiliation(s):  School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast BT9 5BN, UK; more

Corresponding email(s):   li-sy16@tsinghua.org.cn

Key Words: 


Shurui XU, Feng LUO, Shuyan LI, Mengzhen FAN, Zhongtian SUN. Three trustworthiness challenges in large language model-based financial systems: real-world examples and mitigation strategies[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(10): 1871-1878.

@article{title="Three trustworthiness challenges in large language model-based financial systems: real-world examples and mitigation strategies",
author="Shurui XU, Feng LUO, Shuyan LI, Mengzhen FAN, Zhongtian SUN",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="26",
number="10",
pages="1871-1878",
year="2025",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2500421"
}

%0 Journal Article
%T Three trustworthiness challenges in large language model-based financial systems: real-world examples and mitigation strategies
%A Shurui XU
%A Feng LUO
%A Shuyan LI
%A Mengzhen FAN
%A Zhongtian SUN
%J Frontiers of Information Technology & Electronic Engineering
%V 26
%N 10
%P 1871-1878
%@ 2095-9184
%D 2025
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2500421

TY - JOUR
T1 - Three trustworthiness challenges in large language model-based financial systems: real-world examples and mitigation strategies
A1 - Shurui XU
A1 - Feng LUO
A1 - Shuyan LI
A1 - Mengzhen FAN
A1 - Zhongtian SUN
J0 - Frontiers of Information Technology & Electronic Engineering
VL - 26
IS - 10
SP - 1871
EP - 1878
%@ 2095-9184
Y1 - 2025
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/FITEE.2500421


Abstract: 
The integration of large language models (LLMs) into financial applications has demonstrated remarkable potential for enhancing decision-making processes, automating operations, and delivering personalized services. However, the high-stakes nature of financial systems demands a very high level of trustworthiness that current LLMs often fail to meet. This study identifies and examines three major trustworthiness challenges in LLM-based financial systems: (1) jailbreak prompts that exploit vulnerabilities in model alignment to produce harmful or noncompliant responses; (2) hallucination, where models generate factually incorrect outputs that can mislead financial decision-making; (3) bias and fairness concerns, where demographic or institutional bias embedded in LLMs may result in unfair treatment of individuals or regions. To make these risks concrete, we design three finance-relevant probes and evaluate a set of mainstream LLMs spanning both proprietary and open-source families. Across models, we observe risky behavior in at least one scenario per probe. Based on these findings, we systematically summarize the existing mitigation strategies that aim to address these risks. We argue that resolving these issues is vital not only for ensuring the responsible use of artificial intelligence (AI) in the financial sector but also for enabling its safe and scalable deployment.

基于大语言模型的金融系统面临的三大可信度挑战:现实案例与缓解策略

徐殊睿1,罗凤2,李舒燕1,范梦真3,孙中天4
1贝尔法斯特女王大学电子、电气工程与计算机科学学院,英国贝尔法斯特,BT9 5BN
2莱斯大学计算机科学系,美国得克萨斯州休斯敦市,77005
3北京大学汇丰商学院牛津校区,英国英格兰,OX1 5HR
4肯特大学计算机学院,英国肯特郡坎特伯雷,CT2 7NZ
摘要:大语言模型(LLM)在金融应用中的集成展现出显著潜力,可提升决策流程、实现操作自动化并提供个性化服务。然而,金融系统的高风险特性要求极高的可信度,而当前LLM往往难以满足这一要求。本研究识别并探讨了基于LLM的金融系统中的3大可信度挑战:(1)逃逸式提示—利用模型对齐漏洞生成有害或违规响应;(2)幻觉现象—模型产出事实错误的输出误导金融决策;(3)偏见与公平性问题—LLM内嵌的人口统计或制度偏见可能导致个体或区域遭受不公平对待。为具体呈现这些风险,我们设计了3项金融相关测试,并对涵盖专有与开源家族的主流LLM进行评估。在所有模型中,每项测试至少出现一次风险行为。基于这些发现,系统性地总结了现有风险缓解策略。我们认为,解决这些问题不仅对确保金融领域人工智能的负责任使用至关重要,更是实现其安全可扩展部署的关键所在。

关键词:可信赖的人工智能;大语言模型;金融;金融科技

Darkslateblue:Affiliate; Royal Blue:Author; Turquoise:Article

Reference

[1]Andriushchenko M, Croce F, Flammarion N, 2025. Jailbreaking leading safety-aligned LLMs with simple adaptive attacks. 13th Int Conf on Learning Representations.

[2]Authority FIR, 2009. Financial Industry Regulatory Authority. https://www.kurtalawfirm.com/wp-content/uploads/2019064126802-Clearview-Trading-Advisors-Inc.-CRD-142873-Gregg-H.-Ettin-CRD-1604260-AWC-geg-2022-1670804406029.pdf [Accessed on Mar. 23, 2025].

[3]Barry M, Caillaut G, Halftermeyer P, et al., 2025. GraphRAG: leveraging graph-based efficiency to minimize hallucinations in LLM-driven RAG for finance data. Proc Workshop on Generative AI and Knowledge Graphs, p.54-65. https://hal.science/hal-04907346

[4]Bowen DEIII, Price SM, Stein LCD, et al., 2025. Measuring and Mitigating Racial Disparities in LLMs: Evidence from a Mortgage Underwriting Experiment. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4812158 [Accessed on Mar. 23, 2025].

[5]Calver J, Church P, Ford J, et al., 2024. AI in financial services—the legal and regulatory landscape. In: Law 2024. Edward Elgar Publishing, Cheltenham, p.420-458.

[6]Choe J, Kim J, Jung W, 2025. Hierarchical retrieval with evidence curation for open-domain financial question answering on standardized documents. Findings of the Association for Computational Linguistics, p.16663-16681.

[7]Davis HA, 2007. Summary of selected FINRA regulatory notices. J Invest Compl, 8(4):60-67.

[8]Dong MM, Stratopoulos TC, Wang VX, 2024. A scoping review of ChatGPT research in accounting and finance. Int J Account Inform Syst, 55:100715.

[9]Gallegos IO, Rossi RA, Barrow J, et al., 2024. Bias and fairness in large language models: a survey. Comput Linguist, 50(3):1097-1179.

[10]Huang L, Yu WJ, Ma WT, et al., 2024. A survey on hallucination in large language models: principles, taxonomy, challenges, and open questions.

[11]Kabra S, Jha A, Reddy CK, 2025. Reasoning towards fairness: mitigating bias in language models through reasoning-guided fine-tuning.

[12]Khachaturov D, Mullins R, 2025. Adversarial suffix filtering: a defense pipeline for LLMs.

[13]Kumar R, Kumar H, Shalini K, 2025. Detecting and mitigating bias in LLMs through knowledge graph-augmented training. Int Conf on Artificial Intelligence and Data Engineering, p.608-613.

[14]Lee J, Stevens N, Han SC, 2025. Language models in finance (FinLLMs). Neur Comput Appl, 37:24853-24867.

[15]Li XY, Chen ZP, Zhang JM, et al., 2024. Benchmarking bias in large language models during role-playing.

[16]Liu XG, Xu N, Chen MH, et al., 2024. AutoDAN: generating stealthy jailbreak prompts on aligned large language models. 12th Int Conf on Learning Representations.

[17]Lundberg SM, Lee SI, 2017. A unified approach to interpreting model predictions. Proc 31st Int Conf on Neural Information Processing Systems, p.4768-4777.

[18]Manakul P, Liusie A, Gales MJF, 2023. SelfCheckGPT: zero-resource black-box hallucination detection for generative large language models. Proc Conf on Empirical Methods in Natural Language Processing, p.9004-9017.

[19]Nakagawa K, Hirano M, Fujimoto Y, 2024. Evaluating company-specific biases in financial sentiment analysis using large language models. IEEE Int Conf on Big Data, p.6614-6623.

[20]Ribeiro MT, Singh S, Guestrin C, 2016. “Why should I trust you?”: explaining the predictions of any classifier. Proc 22nd ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining, p.1135-1144.

[21]Sharma M, Tong M, Mu J, et al., 2025. Constitutional classifiers: defending against universal jailbreaks across thousands of hours of red teaming.

[22]Shen XY, Chen ZY, Backes M, et al., 2024. “Do anything now”: characterizing and evaluating in-the-wild jailbreak prompts on large language models. Proc ACM SIGSAC Conf on Computer and Communications Security, p.1671-1685.

[23]Simpson S, Nukpezah J, Brooks K, et al., 2025. Parity benchmark for measuring bias in LLMs. AI Ethics, 5(3):3087-3101.

[24]Tatsat H, Shater A, 2025. Beyond the black box: interpretability of LLMs in finance.

[25]Wu Z, Wang J, Zou C, et al., 2025. Towards competent AI for fundamental analysis in finance: a benchmark dataset and evaluation. https://arxiv.org/abs/2506.07315

[26]Yan SQ, Gu JC, Zhu Y, et al., 2024. Corrective retrieval augmented generation.

[27]Yu JH, Lin XW, Yu Z, et al., 2024. GPTFUZZER: red teaming large language models with auto-generated jailbreak prompts.

[28]Zhang YX, Zhou F, 2024. Bias mitigation in fine-tuning pre-trained models for enhanced fairness and efficiency.

[29]Zhou YJ, Han YF, Zhuang HM, et al., 2025. Defending jailbreak prompts via in-context adversarial game.

[30]Zou A, Wang Z, Kolter JZ, et al., 2023. Universal and transferable adversarial attacks on aligned language models.

Open peer comments: Debate/Discuss/Question/Opinion

<1>

Please provide your name, email address and a comment





Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2025 Journal of Zhejiang University-SCIENCE