
CLC number: TP391.1
On-line Access: 2026-03-23
Received: 2025-09-23
Revision Accepted: 2026-02-10
Crosschecked: 2026-03-23
Ronghui LIU, Wei CUI, Xiaojun LIANG, Weihua GUI. DDiNER: domain dictionary-guided Chinese named entity recognition for complex industrial contexts[J]. Journal of Zhejiang University Science C, 2026, 27(3): 1-12.
@article{title="DDiNER: domain dictionary-guided Chinese named entity recognition for complex industrial contexts",
author="Ronghui LIU, Wei CUI, Xiaojun LIANG, Weihua GUI",
journal="Journal of Zhejiang University Science C",
volume="27",
number="3",
pages="1-12",
year="2026",
publisher="Zhejiang University Press & Springer",
doi="10.1631/ENG.ITEE.2025.0047"
}
Abstract: Accurate Chinese named entity recognition (NER) in the process industry is crucial for applications such as information extraction, knowledge graph construction, and intelligent decision-making. However, challenges such as ambiguous entity boundaries, semantic overlap, and limited annotated data significantly hinder performance. To address these issues, this study proposes DDiNER, a domain dictionary-guided Chinese NER framework that integrates a hierarchical industrial domain dictionary with bidirectional encoder representations from Transformers (BERT) via a hierarchical lexicon adapter (HLA), combined with bidirectional long short-term memory (BiLSTM) and conditional random field (CRF) layers for multilevel feature fusion. Experimental results show that DDiNER achieves superior performance, with average precision, recall, and F1-score of 95.75%, 95.73%, and 95.74%, respectively, outperforming state-of-the-art models. Validation on an independent dataset confirms its robustness and its ability to recognize unseen and long-tail entities. This study provides an effective and scalable solution for industrial Chinese NER, with significant potential for downstream intelligent applications.
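The abstract's final decoding stage (a CRF layer over BiLSTM features) can be illustrated with a minimal Viterbi decoder. This is a generic sketch, not the paper's implementation: the tag set, emission scores, and transition penalties below are hypothetical stand-ins for what a trained BiLSTM would emit and a trained CRF would learn.

```python
# Minimal Viterbi decoding for a linear-chain CRF (pure Python sketch).
# In a real NER model, `emissions` would come from BiLSTM outputs over
# BERT/HLA features, and `transitions` would be learned parameters.

def viterbi_decode(emissions, transitions, tags):
    """emissions: list of {tag: score} dicts, one per token.
    transitions: {(prev_tag, cur_tag): score}, 0.0 if unspecified.
    Returns the highest-scoring tag sequence."""
    # Initialize with the first token's emission scores.
    best = {t: (emissions[0][t], [t]) for t in tags}
    for em in emissions[1:]:
        new_best = {}
        for cur in tags:
            # Choose the previous tag maximizing score + transition + emission.
            score, path = max(
                (best[p][0] + transitions.get((p, cur), 0.0) + em[cur],
                 best[p][1] + [cur])
                for p in tags
            )
            new_best[cur] = (score, path)
        best = new_best
    return max(best.values())[1]


# Illustrative 3-token example with BIO tags; scores are made up.
tags = ["B", "I", "O"]
emissions = [{"B": 2.0, "I": 0.1, "O": 0.5},
             {"B": 0.2, "I": 1.8, "O": 0.4},
             {"B": 0.3, "I": 0.2, "O": 1.5}]
# Penalize the illegal O -> I transition, as BIO constraints require.
transitions = {("O", "I"): -10.0}
print(viterbi_decode(emissions, transitions, tags))  # ['B', 'I', 'O']
```

The CRF's value over per-token softmax classification is exactly this joint decoding: transition scores let the model rule out invalid label sequences (such as an I tag following O), which matters for the ambiguous entity boundaries the abstract highlights.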