
On-line Access: 2025-11-17
Received: 2025-04-10
Revision Accepted: 2025-04-28
Crosschecked: 2025-11-18
Citation formats: GB/T 7714, BibTeX, EndNote, RIS (RefMan)
Shuoling LIU, Liyuan CHEN, Jiangpeng YAN, Yuhang JIANG, Xiaoyu WANG, Xiu LI, Qiang YANG. When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(10): 1862-1870.
@article{liu2025deepseek,
title="When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations",
author="Shuoling LIU and Liyuan CHEN and Jiangpeng YAN and Yuhang JIANG and Xiaoyu WANG and Xiu LI and Qiang YANG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="26",
number="10",
pages="1862-1870",
year="2025",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2500227"
}
%0 Journal Article
%T When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations
%A Shuoling LIU
%A Liyuan CHEN
%A Jiangpeng YAN
%A Yuhang JIANG
%A Xiaoyu WANG
%A Xiu LI
%A Qiang YANG
%J Frontiers of Information Technology & Electronic Engineering
%V 26
%N 10
%P 1862-1870
%@ 2095-9184
%D 2025
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2500227
TY - JOUR
T1 - When DeepSeek-R1 meets financial applications: benchmarking, opportunities, and limitations
A1 - Shuoling LIU
A1 - Liyuan CHEN
A1 - Jiangpeng YAN
A1 - Yuhang JIANG
A1 - Xiaoyu WANG
A1 - Xiu LI
A1 - Qiang YANG
JO - Frontiers of Information Technology & Electronic Engineering
VL - 26
IS - 10
SP - 1862
EP - 1870
SN - 2095-9184
Y1 - 2025
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2500227
ER -
Abstract: How the recent progress of reasoning large language models (LLMs), especially the new open-source model DeepSeek-R1, can benefit financial services remains an underexplored problem. While LLMs have already ignited numerous applications within the financial sector, including financial news analysis and general customer interactions, DeepSeek-R1 further unlocks advanced reasoning abilities through multiple reinforcement-learning-integrated training stages, enabling it to handle more complex financial queries, and provides distilled student models for resource-constrained scenarios. In this paper, we first introduce the technological preliminaries of DeepSeek-R1. We then benchmark DeepSeek-R1 and its distilled students on two public financial question-answering (QA) datasets as a starting point for interdisciplinary research on financial artificial intelligence (AI). Finally, we discuss the opportunities that DeepSeek-R1 offers to current financial services, its present limitations, and three future research directions. In conclusion, we argue for a prudent approach to adopting reasoning LLMs for financial AI.
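As a minimal illustration of the benchmarking workflow the abstract describes, the Python sketch below scores a reasoning model on a financial QA set. It is a sketch under stated assumptions, not the authors' evaluation harness: it assumes DeepSeek's OpenAI-compatible API (base_url https://api.deepseek.com, model name deepseek-reasoner), while the file financial_qa.jsonl, its question/answer fields, and the containment scoring rule are hypothetical placeholders, not artifacts from the paper.

# A minimal benchmarking sketch, not the authors' evaluation harness.
# Assumptions: DeepSeek's OpenAI-compatible endpoint and the
# "deepseek-reasoner" model name; "financial_qa.jsonl" is a hypothetical
# file of {"question": ..., "answer": ...} JSON records, one per line.
import json

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

def ask(question: str) -> str:
    # Query the reasoning model and return only its final answer text
    # (the chain of thought is produced internally before the answer).
    resp = client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

with open("financial_qa.jsonl", encoding="utf-8") as f:
    items = [json.loads(line) for line in f]

# Naive containment scoring; a real benchmark needs a stricter answer parser.
correct = sum(
    item["answer"].strip().lower() in ask(item["question"]).lower()
    for item in items
)
print(f"Accuracy: {correct}/{len(items)} = {correct / len(items):.2%}")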