CLC number: TP391.4

On-line Access: 2017-04-12

Received: 2016-05-15

Revision Accepted: 2016-10-15

Crosschecked: 2017-03-29

ORCID: Yuan-ping Nie, http://orcid.org/0000-0002-8351-4108

Frontiers of Information Technology & Electronic Engineering  2017 Vol.18 No.4 P.535-544

http://doi.org/10.1631/FITEE.1601232


Attention-based encoder-decoder model for answer selection in question answering


Author(s):  Yuan-ping Nie, Yi Han, Jiu-ming Huang, Bo Jiao, Ai-ping Li

Affiliation(s):  College of Computer, National University of Defense Technology, Changsha 410073, China

Corresponding email(s):   yuanpingnie@nudt.edu.cn

Key Words:  Question answering, Answer selection, Attention, Deep learning


Yuan-ping Nie, Yi Han, Jiu-ming Huang, Bo Jiao, Ai-ping Li. Attention-based encoder-decoder model for answer selection in question answering[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(4): 535-544.

Abstract: 
One of the key challenges in question answering is bridging the lexical gap between questions and answers, because a question and a correct answer may share no words at all. Machine translation models have been shown to help bridge this lexical gap between question-answer pairs. In this paper, we introduce an attention-based deep learning model for the answer selection task in question answering. The proposed model employs a bidirectional long short-term memory (LSTM) encoder-decoder, an architecture that has proved effective in machine translation, to bridge the lexical gap between questions and answers. Our model also uses a step attention mechanism that allows the question to focus on certain parts of the candidate answer. Finally, we evaluate our model on a benchmark dataset; the results show that our approach outperforms existing approaches. Integrating the model significantly improves the performance of our question answering system in the TREC 2015 LiveQA task.

An answer selection method based on an attention encoder-decoder model

Summary: One of the key challenges in question answering is bridging the semantic gap between questions and answers. Machine translation models have been shown to be effective in narrowing this gap. This paper proposes an attention-based deep neural network model for the answer selection task in question answering systems. The model adopts an encoder-decoder based on bidirectional long short-term memory (LSTM), an architecture that has achieved outstanding results in machine translation. We also apply an attention mechanism to improve the model's performance. The effectiveness of the approach is verified on a public dataset, and integrating the model significantly improves the performance of our question answering system in the TREC 2015 LiveQA task.

Keywords: question answering; answer selection; attention mechanism; deep learning
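
To make the architecture described in the abstract concrete, the following is a minimal sketch in Python/PyTorch (a framework the paper does not specify) of an attention-based bidirectional LSTM pair scorer for answer selection: the question and the candidate answer are each encoded by a BiLSTM, the question representation attends over the answer's hidden states, and the pair is scored by cosine similarity. The class name, layer sizes, mean pooling, and the scoring function are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveAnswerSelector(nn.Module):
    """Hypothetical sketch: BiLSTM encoders plus a simple step-attention layer.
    The question vector weights each answer position before the pair is scored."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.q_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.a_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(4 * hidden_dim, 1)  # scores each answer step given the question

    def forward(self, question, answer):
        # question: (B, Lq) token ids; answer: (B, La) token ids
        q_out, _ = self.q_lstm(self.embed(question))   # (B, Lq, 2H)
        a_out, _ = self.a_lstm(self.embed(answer))     # (B, La, 2H)
        q_vec = q_out.mean(dim=1)                      # pooled question vector (B, 2H)

        # attention step: weight each answer position by its relevance to the question
        q_expand = q_vec.unsqueeze(1).expand(-1, a_out.size(1), -1)
        scores = self.attn(torch.cat([a_out, q_expand], dim=-1)).squeeze(-1)  # (B, La)
        weights = F.softmax(scores, dim=-1)
        a_vec = torch.bmm(weights.unsqueeze(1), a_out).squeeze(1)             # (B, 2H)

        return F.cosine_similarity(q_vec, a_vec, dim=-1)  # one score per question-answer pair

# Toy usage: score two candidate answers for the same question (random token ids).
model = AttentiveAnswerSelector(vocab_size=1000)
q = torch.randint(1, 1000, (2, 8))    # the question, duplicated for two candidates
a = torch.randint(1, 1000, (2, 12))   # two candidate answers
print(model(q, a))                    # higher score = more relevant candidate

In practice such a scorer would be trained with a pairwise ranking loss over (question, correct answer, incorrect answer) triples, a common setup for answer selection; the authors' own training objective is not stated in this abstract.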

