
On-line Access: 2025-11-17

Received: 2025-04-10

Revision Accepted: 2025-04-28

Crosschecked: 2025-11-18


 ORCID:

Shuoling LIU

https://orcid.org/0009-0003-1960-3004

Qiang YANG

https://orcid.org/0000-0001-5059-8360


Frontiers of Information Technology & Electronic Engineering  2025 Vol.26 No.10 P.1862-1870

https://doi.org/10.1631/FITEE.2500227


When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations


Author(s):  Shuoling LIU, Liyuan CHEN, Jiangpeng YAN, Yuhang JIANG, Xiaoyu WANG, Xiu LI, Qiang YANG

Affiliation(s):  The Hong Kong University of Science and Technology, Hong Kong 999077, China; E Fund Management Co., Ltd., Guangzhou 510000, China; WeBank, Shenzhen 518054, China; Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China

Corresponding email(s):   liushuoling@efunds.com.cn, chenly@efunds.com.cn, yanjiangpeng@efunds.com.cn, li.xiu@sz.tsinghua.edu.cn, qyang@cse.ust.hk

Key Words:  Large language models; Model reasoning; Artificial intelligence; Financial technology


Shuoling LIU, Liyuan CHEN, Jiangpeng YAN, Yuhang JIANG, Xiaoyu WANG, Xiu LI, Qiang YANG. When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(10): 1862-1870.

@article{Liu2025DeepSeekR1,
title="When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations",
author="Shuoling LIU and Liyuan CHEN and Jiangpeng YAN and Yuhang JIANG and Xiaoyu WANG and Xiu LI and Qiang YANG",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="26",
number="10",
pages="1862-1870",
year="2025",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2500227"
}

%0 Journal Article
%T When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations
%A Shuoling LIU
%A Liyuan CHEN
%A Jiangpeng YAN
%A Yuhang JIANG
%A Xiaoyu WANG
%A Xiu LI
%A Qiang YANG
%J Frontiers of Information Technology & Electronic Engineering
%V 26
%N 10
%P 1862-1870
%@ 2095-9184
%D 2025
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2500227

TY - JOUR
T1 - When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations
A1 - Shuoling LIU
A1 - Liyuan CHEN
A1 - Jiangpeng YAN
A1 - Yuhang JIANG
A1 - Xiaoyu WANG
A1 - Xiu LI
A1 - Qiang YANG
JO - Frontiers of Information Technology & Electronic Engineering
VL - 26
IS - 10
SP - 1862
EP - 1870
SN - 2095-9184
Y1 - 2025
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2500227
ER -


Abstract: 
How the recent progress in reasoning large language models (LLMs), especially the new open-source model DeepSeek-R1, can benefit financial services remains an underexplored question. While LLMs have already ignited numerous applications in the financial sector, from financial news analysis to general customer interaction, DeepSeek-R1 goes further: its multi-stage training pipeline, which integrates reinforcement learning, unlocks advanced reasoning for more complex financial queries, and it comes with distilled student models for resource-constrained scenarios. In this paper, we first introduce the technological preliminaries of DeepSeek-R1. We then benchmark DeepSeek-R1 and its distilled students on two public financial question–answer (QA) datasets as a starting point for interdisciplinary research on financial artificial intelligence (AI). Finally, we discuss the opportunities that DeepSeek-R1 offers to current financial services, its current limitations, and three future research directions. We conclude by arguing for a measured approach to adopting reasoning LLMs in financial AI.
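The benchmarking setup described above can be reproduced in outline with only a few lines of code. Below is a minimal sketch (not the authors' actual harness) of scoring a reasoning model on multiple-choice financial QA through DeepSeek's OpenAI-compatible API; the toy dataset item, the system prompt, and the answer-extraction regex are illustrative assumptions, and real benchmarks such as FinEval define their own formats and metrics.

# Minimal benchmarking sketch (illustrative, not the paper's code).
# Assumes the `openai` Python package and a DEEPSEEK_API_KEY environment
# variable; DeepSeek exposes an OpenAI-compatible endpoint, where
# "deepseek-reasoner" serves the R1-series reasoning model.
import os
import re

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

# Toy stand-in for items from a public financial QA dataset.
QA_ITEMS = [
    {
        "question": (
            "If a bond's yield to maturity rises, its price generally: "
            "A. rises  B. falls  C. stays unchanged  D. drops to zero"
        ),
        "answer": "B",
    },
]

def ask(question: str) -> str:
    """Send one question to the model and extract its final answer letter."""
    resp = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[
            {"role": "system",
             "content": "Answer the multiple-choice question. "
                        "Finish with a line of the form 'Answer: <letter>'."},
            {"role": "user", "content": question},
        ],
    )
    text = resp.choices[0].message.content or ""
    match = re.search(r"Answer:\s*([A-D])", text)
    return match.group(1) if match else ""

correct = sum(ask(item["question"]) == item["answer"] for item in QA_ITEMS)
print(f"accuracy: {correct / len(QA_ITEMS):.2%}")

Comparing R1 against its distilled students under this setup amounts to pointing the same client at a locally served student (e.g., behind a vLLM OpenAI-compatible server) and changing only base_url and model, which is what makes such side-by-side comparisons cheap to run.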

When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations

Shuoling LIU^1,2, Liyuan CHEN^2, Jiangpeng YAN^2,4, Yuhang JIANG^2, Xiaoyu WANG^2, Xiu LI^4, Qiang YANG^1,3
1 The Hong Kong University of Science and Technology, Hong Kong 999077, China
2 E Fund Management Co., Ltd., Guangzhou 510000, China
3 WeBank, Shenzhen 518054, China
4 Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
Abstract: In financial services, the potential value of reasoning large language models, especially the emerging open-source model DeepSeek-R1, is still at an early stage of exploration. Although general-purpose large language models have already seen wide use in scenarios such as financial news analysis and customer interaction, DeepSeek-R1, through a multi-stage training pipeline that integrates reinforcement learning, unlocks advanced reasoning capabilities: it handles complex financial question-answering tasks accurately and also provides lightweight distilled student models for resource-constrained environments, markedly improving deployment flexibility. Taking an interdisciplinary perspective on financial artificial intelligence, this paper first analyzes the technical architecture and core principles of DeepSeek-R1, and then conducts a preliminary yet comprehensive performance benchmark of DeepSeek-R1 and its distilled models on two public financial question-answering datasets. On this basis, it examines the opportunities the model creates for financial services, analyzes its current limitations, and proposes three future research directions. The goal is to provide a theoretical basis and practical guidance for the sound adoption and further development of reasoning large language models in financial artificial intelligence, advancing financial technology to a higher level.

Key words: Large language models; Model reasoning; Artificial intelligence; Financial technology



