CLC number: TP391
On-line Access: 2018-06-07
Received: 2016-12-21
Revision Accepted: 2017-04-17
Crosschecked: 2018-04-12
Zhong-lin Ye, Hai-xing Zhao. Syntactic word embedding based on dependency syntax and polysemous analysis[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(4): 524-535.
@article{title="Syntactic word embedding based on dependency syntax and polysemous analysis",
author="Zhong-lin Ye and Hai-xing Zhao",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="4",
pages="524-535",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601846"
}
%0 Journal Article
%T Syntactic word embedding based on dependency syntax and polysemous analysis
%A Zhong-lin Ye
%A Hai-xing Zhao
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 4
%P 524-535
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601846
TY - JOUR
T1 - Syntactic word embedding based on dependency syntax and polysemous analysis
A1 - Zhong-lin Ye
A1 - Hai-xing Zhao
JO - Frontiers of Information Technology & Electronic Engineering
VL - 19
IS - 4
SP - 524
EP - 535
SN - 2095-9184
Y1 - 2018
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1601846
ER -
Abstract: Most word embedding models share three problems: (1) models based on bag-of-words contexts completely neglect the structural relations within sentences; (2) each word is assigned a single embedding, so the model cannot discriminate between the senses of polysemous words; (3) the learned embeddings tend to reflect the contextual structure similarity of sentences. To solve these problems, we propose an easy-to-use representation algorithm, syntactic word embedding (SWE). The main procedures are: (1) a polysemous tagging algorithm based on latent Dirichlet allocation (LDA) is used for polysemous representation; (2) the symbols ‘+’ and ‘−’ are adopted to indicate the directions of the dependency syntax; (3) stopwords and their dependencies are deleted; (4) dependency skip is applied to connect indirect dependencies; (5) the dependency-based contexts are fed into a word2vec model. Experimental results show that our model generates desirable word embeddings in similarity evaluation tasks. In addition, semantic and syntactic features can be captured from dependency-based syntactic contexts, and the resulting embeddings exhibit less topical and more syntactic similarity. We conclude that SWE outperforms single-embedding learning models.
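Procedures (2) to (4) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy parse, the stopword list, the function names, and the `word/relation+` context format are all assumptions made for illustration. In particular, the abstract does not say which direction each symbol denotes, so the sketch assumes '+' marks a dependent of the target word and '-' (ASCII here) marks its head. The LDA-based polysemous tagging of procedure (1) and the word2vec training of procedure (5) are not shown.

```python
from collections import defaultdict

# Hypothetical toy parse of "scientist discovers star with telescope".
# Each arc is (head, relation, dependent); a real pipeline would obtain
# these arcs from a dependency parser.
ARCS = [
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep", "with"),
    ("with", "pobj", "telescope"),
]

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"with", "the", "a", "of"}


def skip_stopwords(arcs, stopwords):
    """Delete stopwords and reconnect their dependents to the stopword's head.

    Assumption: the reconnected arc keeps the relation label of the arc
    that reaches the content word.
    """
    children = defaultdict(list)
    for head, rel, dep in arcs:
        children[head].append((rel, dep))

    result = []

    def attach(head, rel, dep):
        if dep in stopwords:
            # Dependency skip: link the stopword's dependents to `head`.
            for sub_rel, sub_dep in children[dep]:
                attach(head, sub_rel, sub_dep)
        else:
            result.append((head, rel, dep))

    for head, rel, dep in arcs:
        # Arcs headed by a stopword are reached only through the skip above.
        if head not in stopwords:
            attach(head, rel, dep)
    return result


def dependency_contexts(arcs):
    """Turn arcs into (word, context) pairs with direction markers."""
    pairs = []
    for head, rel, dep in arcs:
        pairs.append((head, f"{dep}/{rel}+"))  # dependent, seen from the head
        pairs.append((dep, f"{head}/{rel}-"))  # head, seen from the dependent
    return pairs


if __name__ == "__main__":
    for word, context in dependency_contexts(skip_stopwords(ARCS, STOPWORDS)):
        print(word, context)
```

In this toy example the stopword "with" disappears and "telescope" is attached directly to "discovers", so the pairs include `discovers telescope/pobj+` and `telescope discovers/pobj-`. Pairs of this kind could then be passed to a word2vec-style trainer that accepts arbitrary contexts.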