Affiliation(s): School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Beijing Institute of Technology Southeast Academy of Information Technology, Putian 351100, China; Huawei Noah’s Ark Lab, Shenzhen 518129, China
Yinghao LI, Heyan HUANG, Baojun WANG, Yang GAO. DRMSpell: dynamically reweighting multimodality for Chinese spelling correction[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2300816
@article{FITEE.2300816,
  title="DRMSpell: dynamically reweighting multimodality for Chinese spelling correction",
  author="Yinghao LI and Heyan HUANG and Baojun WANG and Yang GAO",
  journal="Frontiers of Information Technology \& Electronic Engineering",
  year="in press",
  publisher="Zhejiang University Press \& Springer",
  doi="10.1631/FITEE.2300816"
}

%0 Journal Article
%T DRMSpell: dynamically reweighting multimodality for Chinese spelling correction
%A Yinghao LI
%A Heyan HUANG
%A Baojun WANG
%A Yang GAO
%J Frontiers of Information Technology & Electronic Engineering
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%U https://doi.org/10.1631/FITEE.2300816

TY  - JOUR
T1  - DRMSpell: dynamically reweighting multimodality for Chinese spelling correction
A1  - Yinghao LI
A1  - Heyan HUANG
A1  - Baojun WANG
A1  - Yang GAO
JO  - Frontiers of Information Technology & Electronic Engineering
SN  - 2095-9184
Y1  - in press
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2300816
ER  - 
Abstract: Chinese spelling correction (CSC) aims to detect and correct spelling errors in Chinese text. The task is difficult because Chinese is phonetically complex: its phonetic representations, known as pinyin, carry distinct tonal variations, and one pinyin can correspond to many characters. This complexity makes CSC essential for ensuring the accuracy and clarity of written communication. Recent research has incorporated external knowledge into models through phonological and visual modalities. However, these methods do not use the modality information in a way targeted at the different types of errors. In this paper, we propose DRMSpell, a multimodal pretrained language model for CSC that takes the interaction between modalities into account. A dynamically reweighting multimodality (DRM) module reweights the modalities to obtain more multimodal information. To make full use of this information and further strengthen the model, an independent-modality masking strategy (IMS) independently masks the three modalities of a token in the pretraining stage. Our method achieves state-of-the-art performance on most metrics of widely used benchmarks. The experimental results demonstrate that our method can model the interactive information between modalities and is robust to incorrect modal information.
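The abstract gives no implementation details for IMS, so the following is only a minimal sketch of what "independently masking the three modalities of a token" could look like. The modality names (`semantic`, `phonetic`, `visual`), the function `mask_modalities`, the `[MASK]` placeholder, and the 15% default masking probability are all illustrative assumptions, not taken from the paper.

```python
import random

# Assumed names for the three modalities of a token; the abstract does not name them.
MODALITIES = ("semantic", "phonetic", "visual")


def mask_modalities(tokens, mask_prob=0.15, mask_value="[MASK]", rng=None):
    """Mask each modality of each token with an independent random draw.

    tokens: list of dicts mapping a modality name to its feature
            (for example a character, a pinyin string, a glyph id).
    Returns (masked_tokens, mask_flags); the input list is not modified.
    mask_prob and mask_value are hypothetical defaults for illustration.
    """
    rng = rng or random.Random()
    masked_tokens, mask_flags = [], []
    for token in tokens:
        new_token, flags = {}, {}
        for name in MODALITIES:
            # One separate draw per modality, so a token may have any
            # subset of its modalities masked, including none or all.
            hit = rng.random() < mask_prob
            new_token[name] = mask_value if hit else token[name]
            flags[name] = hit
        masked_tokens.append(new_token)
        mask_flags.append(flags)
    return masked_tokens, mask_flags


if __name__ == "__main__":
    example = [{"semantic": "天", "phonetic": "tian1", "visual": "glyph:天"}]
    masked, flags = mask_modalities(example, mask_prob=0.5, rng=random.Random(0))
    print(masked, flags)
```

Because each modality gets its own draw, a pretraining objective built on this sketch would sometimes have to recover a token from only one or two intact modalities, which is one plausible reading of how such masking could encourage robustness to incorrect modal information.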