CLC number: TP391
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2018-04-12
Zhong-lin Ye, Hai-xing Zhao. Syntactic word embedding based on dependency syntax and polysemous analysis[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(4): 524-535.
@article{title="Syntactic word embedding based on dependency syntax and polysemous analysis",
author="Zhong-lin Ye, Hai-xing Zhao",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="4",
pages="524-535",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601846"
}
Abstract: Most word embedding models suffer from the following problems: (1) models based on bag-of-words contexts completely neglect the structural relations within sentences; (2) each word is assigned a single embedding, so the model cannot discriminate among the senses of polysemous words; (3) word embeddings easily drift toward the contextual-structure similarity of sentences. To address these problems, we propose an easy-to-use representation algorithm for syntactic word embedding (SWE). The main procedures are: (1) polysemous words are tagged with sense labels by a polysemous tagging algorithm based on latent Dirichlet allocation (LDA); (2) the symbols '+' and '−' are adopted to indicate the directions of the dependency-syntax relations; (3) stopwords and their dependencies are deleted; (4) a dependency skip is applied to connect indirect dependencies; (5) the resulting dependency-based contexts are fed into a word2vec model. Experimental results show that our model produces desirable word embeddings in similarity evaluation tasks. Moreover, semantic and syntactic features can be captured from the dependency-based syntactic contexts, which exhibit less topical and more syntactic similarity. We conclude that SWE outperforms single-embedding learning models.
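Steps (2)–(4) of the procedure above can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the function name `extract_contexts`, the tiny stopword list, and the triple format `(head, relation, dependent)` are all assumptions, and sentences are presumed to be already dependency-parsed.

```python
STOPWORDS = {"the", "a", "of", "to"}  # illustrative stopword list

def extract_contexts(triples):
    """Turn dependency triples into direction-marked (word, context) pairs.

    '+' marks a context reached along an outgoing dependency
    (head -> dependent); '-' marks the inverse direction, as in step (2).
    """
    # Step (3): delete stopwords together with their dependencies.
    kept = [t for t in triples
            if t[0] not in STOPWORDS and t[2] not in STOPWORDS]

    # Step (4): "dependency skip" — when a stopword is removed, reconnect
    # its head directly to the stopword's own dependents, merging the two
    # relation labels so the indirect dependency is preserved.
    skipped = []
    for head, rel, dep in triples:
        if dep in STOPWORDS:
            for h2, r2, d2 in triples:
                if h2 == dep and d2 not in STOPWORDS:
                    skipped.append((head, f"{rel}_{r2}", d2))

    # Step (2): emit direction-marked contexts for both endpoints.
    pairs = []
    for head, rel, dep in kept + skipped:
        pairs.append((head, f"+{rel}_{dep}"))   # outgoing edge
        pairs.append((dep, f"-{rel}_{head}"))   # incoming edge
    return pairs

triples = [("ate", "nsubj", "cat"), ("ate", "prep", "of"), ("of", "pobj", "fish")]
pairs = extract_contexts(triples)
```

For step (5), these arbitrary word–context pairs cannot be consumed by a stock linear-window word2vec; a variant that trains on explicit (word, context) pairs, such as Levy and Goldberg's word2vecf, would be needed.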