CLC number: TP391.1
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2017-01-20
Hui Chen, Bao-gang Wei, Yi-ming Li, Yong-huai Liu, Wen-hao Zhu. An easy-to-use evaluation framework for benchmarking entity recognition and disambiguation systems[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(2): 195-205.
@article{Chen2017,
title="An easy-to-use evaluation framework for benchmarking entity recognition and disambiguation systems",
author="Hui Chen and Bao-gang Wei and Yi-ming Li and Yong-huai Liu and Wen-hao Zhu",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="18",
number="2",
pages="195--205",
year="2017",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1500473"
}
%0 Journal Article
%T An easy-to-use evaluation framework for benchmarking entity recognition and disambiguation systems
%A Hui Chen
%A Bao-gang Wei
%A Yi-ming Li
%A Yong-huai Liu
%A Wen-hao Zhu
%J Frontiers of Information Technology & Electronic Engineering
%V 18
%N 2
%P 195-205
%@ 2095-9184
%D 2017
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1500473
TY  - JOUR
T1  - An easy-to-use evaluation framework for benchmarking entity recognition and disambiguation systems
A1  - Hui Chen
A1  - Bao-gang Wei
A1  - Yi-ming Li
A1  - Yong-huai Liu
A1  - Wen-hao Zhu
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 18
IS  - 2
SP  - 195
EP  - 205
SN  - 2095-9184
Y1  - 2017
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1500473
ER  -
Abstract: Entity recognition and disambiguation (ERD) is a crucial technique for knowledge base population and information extraction. In recent years, numerous papers have been published on this subject, and various ERD systems have been developed. However, the field still lacks clarity on how to compare these systems fairly and completely, so a unified evaluation framework is increasingly needed. In this paper, we present an easy-to-use evaluation framework (EUEF), which aims to simplify the evaluation process and enable a fair comparison of ERD systems. EUEF is released to the public as open source and is designed to be easily extended with new ERD systems, datasets, and evaluation metrics. With EUEF, it is easy to identify the strengths and weaknesses of a specific ERD system and its components. We use EUEF to compare several popular, publicly available ERD systems and draw conclusions from a detailed analysis.