Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

DDiNER: domain dictionary-guided Chinese named entity recognition for complex industrial contexts

Abstract: Accurate Chinese named entity recognition (NER) in the process industry is crucial for applications such as information extraction, knowledge graph construction, and intelligent decision-making. However, ambiguous entity boundaries, semantic overlaps, and limited annotated data significantly hinder performance. To address these issues, this study proposes DDiNER, a domain dictionary-guided Chinese NER framework. DDiNER integrates a hierarchical industrial domain dictionary with bidirectional encoder representations from Transformers (BERT) via a hierarchical lexicon adapter (HLA), and combines the result with bidirectional long short-term memory (BiLSTM) and conditional random field (CRF) layers for multilevel feature fusion. Experimental results show that DDiNER achieves average precision, recall, and F1-scores of 95.75%, 95.73%, and 95.74%, respectively, outperforming state-of-the-art models. Validation on an independent dataset confirms its robustness and its strong capability in recognizing unseen and long-tail entities. This study provides an effective and scalable solution for industrial Chinese NER, with significant potential for downstream intelligent applications.
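The abstract does not describe how the HLA consumes the domain dictionary. As an illustration only, the sketch below shows a preprocessing step common in lexicon-enhanced Chinese NER: for each character, collect the dictionary words that cover it, together with their category. The function name `match_lexicon`, the toy dictionary, its category labels, and the `max_len` limit are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def match_lexicon(sentence, lexicon, max_len=6):
    """Return, for each character position, the dictionary words covering it.

    `lexicon` maps a surface form to a category label. Every substring of up
    to `max_len` characters is looked up; each hit is attached to all the
    character positions it spans.
    """
    matches = defaultdict(list)
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                for pos in range(start, end):
                    matches[pos].append((word, lexicon[word]))
    return [matches[i] for i in range(n)]


# Hypothetical toy dictionary (not from the paper):
# 回转窑 = rotary kiln, 窑 = kiln, 温度 = temperature.
lexicon = {"回转窑": "EQUIPMENT", "窑": "EQUIPMENT", "温度": "PARAMETER"}
sentence = "回转窑温度过高"  # "the rotary kiln temperature is too high"

per_char = match_lexicon(sentence, lexicon)
for ch, words in zip(sentence, per_char):
    print(ch, words)
```

In a lexicon-adapter design, per-character word lists of this kind would typically be embedded and fused with the character representations inside the encoder. How DDiNER performs that fusion, and how it uses the dictionary hierarchy, is described in the full paper rather than in this abstract.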

Key words: Named entity recognition (NER); Process industry; Domain dictionary; Hierarchical lexicon adapter (HLA)

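Chinese Summary (translated): DDiNER: a domain dictionary-guided Chinese named entity recognition method for complex industrial scenarios

Ronghui LIU (刘荣辉)1,2, Wei CUI (崔巍)1,2, Xiaojun LIANG (梁骁俊)2, Weihua GUI (桂卫华)2,3
1School of Future Technology, South China University of Technology, Guangzhou 510641, China
2Peng Cheng Laboratory, Shenzhen 518055, China
3School of Automation, Central South University, Changsha 410083, China
Abstract: In industrial processes, accurate Chinese named entity recognition (NER) is important for applications such as information extraction, knowledge graph construction, and intelligent decision-making. However, ambiguous entity boundaries, semantic overlap, and insufficient annotated data severely limit its performance. To address these challenges, this paper proposes DDiNER, a domain dictionary-guided Chinese NER framework. The framework uses a hierarchical lexicon adapter to fuse a hierarchical industrial domain dictionary with the bidirectional encoder representation model, and combines a bidirectional long short-term memory network with a conditional random field to achieve multilevel feature fusion. Experimental results show that DDiNER performs well, with average precision, recall, and F1-score reaching 95.75%, 95.73%, and 95.74%, respectively, significantly better than existing mainstream methods. Validation on an independent dataset confirms that the model is robust and generalizes well when recognizing unregistered and long-tail entities. This study provides an efficient and scalable solution for Chinese NER in the industrial domain, with significant potential for downstream intelligent applications.

Key words: Named entity recognition; Process industry; Domain dictionary; Hierarchical lexicon adapter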


DOI: 10.1631/ENG.ITEE.2025.0047
CLC number: TP391.1

On-line Access: 2026-03-23
Received: 2025-09-23
Revision Accepted: 2026-02-10
Crosschecked: 2026-03-23

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE