JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering 2018 Vol.19 No.2 P.260-272

Words alignment based on association rules for cross-domain sentiment classification

Author(s): Xi-bin Jia, Ya Jin, Ning Li, Xing Su, Barry Cardiff, Bir Bhanu
Affiliation(s): Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
Corresponding email(s): jiaxibin@bjut.edu.cn, jinya@emails.bjut.edu.cn, xingsu@bjut.edu.cn
Key Words: Sentiment classification, Cross-domain, Association rules


Xi-bin Jia, Ya Jin, Ning Li, Xing Su, Barry Cardiff, Bir Bhanu. Words alignment based on association rules for cross-domain sentiment classification[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(2): 260-272.

@article{title="Words alignment based on association rules for cross-domain sentiment classification",
author="Xi-bin Jia, Ya Jin, Ning Li, Xing Su, Barry Cardiff, Bir Bhanu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="2",
pages="260-272",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601679"
}

%0 Journal Article
%T Words alignment based on association rules for cross-domain sentiment classification
%A Xi-bin Jia
%A Ya Jin
%A Ning Li
%A Xing Su
%A Barry Cardiff
%A Bir Bhanu
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 2
%P 260-272
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601679

TY - JOUR
T1 - Words alignment based on association rules for cross-domain sentiment classification
A1 - Xi-bin Jia
A1 - Ya Jin
A1 - Ning Li
A1 - Xing Su
A1 - Barry Cardiff
A1 - Bir Bhanu
JO - Frontiers of Information Technology & Electronic Engineering
VL - 19
IS - 2
SP - 260
EP - 272
SN - 2095-9184
Y1 - 2018
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1601679
ER -


Abstract: Automatic classification of sentiment data (e.g., reviews, blogs) has many applications in enterprise user management systems and can help us understand people's attitudes toward products or services. However, it is difficult to train a sentiment classifier that is accurate across domains. A major reason is that people often use different words to express the same sentiment in different domains, and no direct mapping between these words is readily available to reduce the differences between domains. As a result, the accuracy of a sentiment classifier trained in one domain declines sharply when it is applied to other domains. In this paper, we propose a novel approach called words alignment based on association rules (WAAR) for cross-domain sentiment classification. WAAR establishes an indirect mapping between domain-specific words in different domains by learning strong association rules between domain-shared words and domain-specific words within the same domain. In this way, the differences between the source domain and the target domain can be reduced to some extent, and a more accurate cross-domain classifier can be trained. Experimental results on Amazon® datasets show the effectiveness of our approach in improving the performance of cross-domain sentiment classification.

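Cross-domain sentiment classification algorithm using word alignment based on association rules

Summary: Text sentiment classification is used in enterprise user management systems; by automatically analyzing sentiment-bearing text such as reviews and blogs, it helps businesses better understand users' attitudes toward products or services. However, reviews, blogs, and similar content often come from different application domains, and it is very difficult to train a classifier that accurately predicts sentiment polarity for every domain. The main reason is that people usually use different feature words to express the same sentiment in different domains, and it is hard to find a direct mapping function that relates feature words across domains and thereby eliminates the differences between domains. Consequently, when a classifier trained in one domain is applied directly to another domain, its accuracy drops sharply because of these differences. This paper proposes a new cross-domain sentiment classification algorithm that aligns feature words based on association rules. The algorithm mines pairs of strongly associated domain-shared words and domain-specific words within the same domain to establish direct mappings and then, using the domain-shared words as a bridge, establishes indirect mappings between the domain-specific words of different domains. This eliminates the differences between the source domain and the target domain to some extent and effectively improves cross-domain sentiment classification accuracy. Experimental results on the Amazon dataset show that the algorithm improves cross-domain sentiment classification performance.

Keywords: Sentiment classification; Cross-domain; Association rules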
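
The bridging idea in the abstract (strong association rules inside each domain, with domain-shared words linking domain-specific words across domains) can be illustrated with a small Python sketch. This is not the authors' WAAR implementation: the function names `strong_rules` and `align`, the support and confidence thresholds, and the toy reviews are all assumptions made for illustration, and the rule direction (domain-specific word implies domain-shared word) is just one plausible reading of the abstract.

```python
from collections import Counter
from itertools import product


def strong_rules(docs, shared, min_support=0.3, min_confidence=0.6):
    """Map each domain-shared word to the domain-specific words that
    strongly imply it (rule: specific -> shared) within one domain.

    Hypothetical helper; thresholds are illustrative, not from the paper.
    """
    n = len(docs)
    word_count = Counter()
    pair_count = Counter()
    for doc in docs:
        words = set(doc)
        word_count.update(words)
        # Count co-occurrences of (shared word, specific word) per document.
        for s, d in product(words & shared, words - shared):
            pair_count[(s, d)] += 1
    rules = {}
    for (s, d), c in pair_count.items():
        support = c / n                  # fraction of docs containing both
        confidence = c / word_count[d]   # estimate of P(shared | specific)
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(s, set()).add(d)
    return rules


def align(source_docs, target_docs, shared, **thresholds):
    """Pair up specific words of two domains via a common shared word."""
    src = strong_rules(source_docs, shared, **thresholds)
    tgt = strong_rules(target_docs, shared, **thresholds)
    return {s: (src[s], tgt[s]) for s in sorted(src.keys() & tgt.keys())}


# Toy, made-up reviews (not taken from the Amazon datasets).
shared = {"great", "poor"}
books = [
    ["great", "gripping"],
    ["great", "gripping", "plot"],
    ["poor", "boring"],
    ["poor", "boring", "plot"],
]
kitchen = [
    ["great", "sturdy"],
    ["great", "sturdy", "blender"],
    ["poor", "flimsy"],
    ["poor", "flimsy", "blender"],
]
pairs = align(books, kitchen, shared, min_support=0.5, min_confidence=0.9)
for shared_word, (src_words, tgt_words) in pairs.items():
    print(shared_word, sorted(src_words), sorted(tgt_words))
```

In this toy setting, "gripping" and "sturdy" end up aligned through the shared word "great", while sentiment-neutral words such as "plot" and "blender" fall below the thresholds. A real system would also need a principled way to choose the domain-shared words and the thresholds, which the abstract does not specify.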


