JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering 2018 Vol.19 No.5 P.651-661

Cross-lingual implicit discourse relation recognition with co-training

Author(s): Yao-jie Lu, Mu Xu, Chang-xing Wu, De-yi Xiong, Hong-ji Wang, Jin-song Su
Affiliation(s): School of Software, Xiamen University, Xiamen 361005, China
Corresponding email(s): jssu@xmu.edu.cn
Key Words: Cross-lingual, Implicit discourse relation recognition, Co-training

Yao-jie Lu, Mu Xu, Chang-xing Wu, De-yi Xiong, Hong-ji Wang, Jin-song Su. Cross-lingual implicit discourse relation recognition with co-training[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(5): 651-661.

@article{Lu2018,
title="Cross-lingual implicit discourse relation recognition with co-training",
author="Yao-jie Lu and Mu Xu and Chang-xing Wu and De-yi Xiong and Hong-ji Wang and Jin-song Su",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="5",
pages="651-661",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601865"
}

%0 Journal Article
%T Cross-lingual implicit discourse relation recognition with co-training
%A Yao-jie Lu
%A Mu Xu
%A Chang-xing Wu
%A De-yi Xiong
%A Hong-ji Wang
%A Jin-song Su
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 5
%P 651-661
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601865

TY - JOUR
T1 - Cross-lingual implicit discourse relation recognition with co-training
A1 - Yao-jie Lu
A1 - Mu Xu
A1 - Chang-xing Wu
A1 - De-yi Xiong
A1 - Hong-ji Wang
A1 - Jin-song Su
JO - Frontiers of Information Technology & Electronic Engineering
VL - 19
IS - 5
SP - 651
EP - 661
SN - 2095-9184
Y1 - 2018
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1601865
ER -

Abstract: Research on implicit discourse relation recognition (DRR) for Chinese is hindered by a lack of labeled corpora, whereas discourse corpora are available in other languages, such as English. In this paper, we propose a cross-lingual implicit DRR framework that exploits an available English corpus for the Chinese DRR task. We use machine translation to generate Chinese instances from a labeled English discourse corpus, so that each instance has two independent views: a Chinese view and an English view. We then train two classifiers, one on each view, by co-training, which also exploits unlabeled Chinese data to improve implicit DRR for Chinese. Experimental results demonstrate the effectiveness of our method.
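The abstract does not state which classifiers, features, or selection rule the authors use, so the following is only a minimal sketch of generic two-view co-training under stated assumptions. The `NaiveBayes` stand-in classifier, the helpers `fit_views` and `co_train`, the parameters `per_round` and `min_conf`, the relation labels, and the toy instances are all hypothetical and chosen purely for illustration. Each instance is a pair of token lists (Chinese view, English view); in every round each view's classifier pseudo-labels the unlabeled instances it is most confident about, and both classifiers are then retrained on the enlarged set.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes over token lists (hypothetical stand-in classifier)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict_proba(self, tokens):
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, count in self.class_counts.items():
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for t in tokens:
                if t in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((self.token_counts[label][t] + 1) / denom)
            log_scores[label] = score
        top = max(log_scores.values())
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exp.values())
        return {label: v / norm for label, v in exp.items()}


def fit_views(labeled):
    """Train one classifier per view: index 0 is Chinese, index 1 is English."""
    labels = [label for _, label in labeled]
    zh_clf = NaiveBayes().fit([inst[0] for inst, _ in labeled], labels)
    en_clf = NaiveBayes().fit([inst[1] for inst, _ in labeled], labels)
    return zh_clf, en_clf


def co_train(labeled, unlabeled, rounds=5, per_round=1, min_conf=0.6):
    """Generic co-training loop; per_round and min_conf are illustrative knobs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        picked = {}  # pool index -> pseudo-label
        for view, clf in enumerate(fit_views(labeled)):
            scored = []
            for idx, inst in enumerate(pool):
                probs = clf.predict_proba(inst[view])
                label = max(probs, key=probs.get)
                scored.append((probs[label], idx, label))
            scored.sort(reverse=True)
            for conf, idx, label in scored[:per_round]:
                if conf >= min_conf:
                    picked.setdefault(idx, label)
        if not picked:
            break
        labeled += [(pool[i], label) for i, label in picked.items()]
        pool = [inst for i, inst in enumerate(pool) if i not in picked]
    return fit_views(labeled)


# Toy data (hypothetical): (Chinese tokens, English tokens) pairs with relation labels.
LABELED = [
    ((["下雨", "取消", "比赛"], ["rain", "cancelled", "match"]), "Contingency"),
    ((["喜欢", "茶", "讨厌", "咖啡"], ["likes", "tea", "hates", "coffee"]), "Comparison"),
]
UNLABELED = [
    (["降雨", "推迟", "航班"], ["rain", "delayed", "flight"]),
    (["喜欢", "猫", "讨厌", "狗"], ["enjoys", "cats", "dislikes", "dogs"]),
]

zh_clf, en_clf = co_train(LABELED, UNLABELED)
probs = zh_clf.predict_proba(["航班", "推迟"])
print(max(probs, key=probs.get), probs)
```

In this toy setup, the first unlabeled instance contains no Chinese token seen in the labeled data, so only the English classifier can label it with confidence; once it is added to the training set, the Chinese classifier learns its tokens as well. The second instance works the other way round. This complementarity between the two views is what co-training relies on.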

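Cross-lingual implicit discourse relation recognition based on co-training

Summary: A lack of annotated corpora has hindered progress in research on Chinese implicit discourse relation recognition, whereas discourse relation corpora are available in other languages, such as English. We propose a cross-lingual implicit discourse relation recognition framework that uses an English corpus for the Chinese task. Machine translation is used to generate Chinese instances from a labeled English discourse relation corpus, so each instance has two independent views: Chinese and English. Two classifiers, one per view, are then learned through co-training, with unlabeled Chinese data used to help Chinese implicit discourse relation recognition. Experimental results show that the method is effective.

Keywords: Cross-lingual; Implicit discourse relation; Co-training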
