On-line Access: 2024-12-16

Received: 2004-06-11

Revision Accepted: 2024-10-10

Frontiers of Information Technology & Electronic Engineering 

Accepted manuscript available online (unedited version)


Shared-weight multimodal translation model for recognizing Chinese variant characters


Author(s):  Yuankang SUN, Bing LI, Lexiang LI, Peng YANG, Dongmei YANG

Affiliation(s):  School of Computer Science and Engineering, Southeast University, Nanjing 210000, China

Corresponding email(s):  syk@seu.edu.cn, libing@seu.edu.cn, lexiangli@seu.edu.cn, pengyang@seu.edu.cn

Key Words:  Chinese variant characters; Multimodal model; Translation model; Phonology and morphology


Yuankang SUN, Bing LI, Lexiang LI, Peng YANG, Dongmei YANG. Shared-weight multimodal translation model for recognizing Chinese variant characters[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2400504

@article{title="Shared-weight multimodal translation model for recognizing Chinese variant characters",
author="Yuankang SUN, Bing LI, Lexiang LI, Peng YANG, Dongmei YANG",
journal="Frontiers of Information Technology & Electronic Engineering",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/FITEE.2400504"
}

%0 Journal Article
%T Shared-weight multimodal translation model for recognizing Chinese variant characters
%A Yuankang SUN
%A Bing LI
%A Lexiang LI
%A Peng YANG
%A Dongmei YANG
%J Frontiers of Information Technology & Electronic Engineering
%P
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
doi="https://doi.org/10.1631/FITEE.2400504"

TY - JOUR
T1 - Shared-weight multimodal translation model for recognizing Chinese variant characters
A1 - Yuankang SUN
A1 - Bing LI
A1 - Lexiang LI
A1 - Peng YANG
A1 - Dongmei YANG
J0 - Frontiers of Information Technology & Electronic Engineering
SP -
EP -
%@ 2095-9184
Y1 - in press
PB - Zhejiang University Press & Springer
ER -
doi="https://doi.org/10.1631/FITEE.2400504"


Abstract: 
Recognizing Chinese variant characters addresses the semantic ambiguity and confusion that such characters introduce, which pose risks to the security of Web content and complicate the governance of sensitive words. Most existing approaches focus on acquiring contextual knowledge from Chinese corpora and vocabularies during pretraining and often overlook the inherent phonological and morphological characteristics of the Chinese language. To address this gap, we propose a shared-weight multimodal translation model (SMTM) that integrates the phonology of Pinyin and the morphology of fonts into each Chinese character token to learn the deeper semantics of variant texts. Specifically, we encode the Pinyin features of Chinese characters with an embedding layer and extract their font features directly with convolutional neural networks. Because the source and target sentences in the variant-character-recognition task are similar across modalities, we design a shared-weight embedding mechanism that uses heuristic information from the source sentences to generate target sentences during training. Experimental results show that SMTM achieves 89.550% on bilingual evaluation understudy-1 (BLEU1) and 79.480% on F1, improvements of 4.344% and 3.088%, respectively, over the state-of-the-art baseline model.
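
The shared-weight multimodal embedding described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not give the architecture, so the fusion by element-wise sum, the 3x3 convolution with global max pooling, the toy 5x5 glyph bitmaps, the tone-numbered Pinyin strings, and every name below (`SharedMultimodalEmbedding`, `conv_glyph_features`, `GLYPHS`, `PINYIN`) are assumptions made for illustration. The example pair, the variant text 威信 standing in for 微信, is also only illustrative.

```python
import random


def conv_glyph_features(bitmap, kernels):
    """Valid 3x3 convolution + ReLU + global max pooling: one feature per kernel."""
    rows, cols = len(bitmap), len(bitmap[0])
    features = []
    for kernel in kernels:
        best = 0.0  # ReLU floor: negative responses are clipped to zero
        for r in range(rows - 2):
            for c in range(cols - 2):
                response = sum(
                    kernel[i][j] * bitmap[r + i][c + j]
                    for i in range(3)
                    for j in range(3)
                )
                best = max(best, response)
        features.append(best)
    return features


class SharedMultimodalEmbedding:
    """One set of weights used to embed BOTH source and target tokens."""

    def __init__(self, chars, pinyins, dim=4, seed=0):
        rng = random.Random(seed)

        def vec():
            return [rng.uniform(-0.1, 0.1) for _ in range(dim)]

        self.char_table = {ch: vec() for ch in chars}        # character identity
        self.pinyin_table = {py: vec() for py in pinyins}    # phonology
        self.kernels = [                                     # morphology (font image)
            [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
            for _ in range(dim)
        ]

    def embed(self, char, pinyin, bitmap):
        # Assumed fusion: element-wise sum of the three modality vectors.
        glyph = conv_glyph_features(bitmap, self.kernels)
        return [
            c + p + g
            for c, p, g in zip(self.char_table[char], self.pinyin_table[pinyin], glyph)
        ]

    def embed_sentence(self, tokens):
        return [self.embed(*token) for token in tokens]


# Hypothetical toy data: real font images would be much larger renderings.
GLYPHS = {
    "威": [[1, 1, 1, 1, 1], [1, 0, 0, 0, 1], [1, 0, 1, 0, 1], [1, 0, 0, 0, 1], [1, 1, 1, 1, 1]],
    "微": [[1, 0, 1, 0, 1], [0, 1, 0, 1, 0], [1, 0, 1, 0, 1], [0, 1, 0, 1, 0], [1, 0, 1, 0, 1]],
    "信": [[0, 0, 1, 0, 0], [0, 1, 1, 1, 0], [1, 0, 1, 0, 1], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0]],
}
PINYIN = {"威": "wei1", "微": "wei1", "信": "xin4"}


def to_tokens(text):
    """Attach Pinyin and glyph bitmap to every character of a sentence."""
    return [(ch, PINYIN[ch], GLYPHS[ch]) for ch in text]


shared = SharedMultimodalEmbedding(chars=GLYPHS, pinyins=sorted(set(PINYIN.values())))
source_vecs = shared.embed_sentence(to_tokens("威信"))  # variant (source) sentence
target_vecs = shared.embed_sentence(to_tokens("微信"))  # standard (target) sentence
```

Because the same `shared` object embeds both sentences, a character that appears unchanged on both sides (信) receives an identical vector, and the variant and standard characters (威 and 微) draw on the same Pinyin entry. In a trained model this weight sharing is what would let information from the source side guide generation of the target side; here the weights are random and nothing is trained.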
