CLC number: TP182
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2020-08-19
Citations: Bibtex RefMan EndNote GB/T7714
Junfang Jia, Guoqiang Li. Learning natural ordering of tags in domain-specific Q&A sites[J]. Frontiers of Information Technology & Electronic Engineering, 2021, 22(2): 170-184.
@article{FITEE.1900645,
title="Learning natural ordering of tags in domain-specific Q&A sites",
author="Junfang Jia and Guoqiang Li",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="22",
number="2",
pages="170-184",
year="2021",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1900645"
}
%0 Journal Article
%T Learning natural ordering of tags in domain-specific Q&A sites
%A Junfang Jia
%A Guoqiang Li
%J Frontiers of Information Technology & Electronic Engineering
%V 22
%N 2
%P 170-184
%@ 2095-9184
%D 2021
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1900645
TY - JOUR
T1 - Learning natural ordering of tags in domain-specific Q&A sites
A1 - Junfang Jia
A1 - Guoqiang Li
JO - Frontiers of Information Technology & Electronic Engineering
VL - 22
IS - 2
SP - 170
EP - 184
SN - 2095-9184
Y1 - 2021
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1900645
ER -
Abstract: Tagging is a defining characteristic of Web 2.0. It allows users of social computing systems (e.g., question and answer (Q&A) sites) to annotate content with free terms. However, is tagging really a free action? Existing work has shown that users in an online community can develop an implicit consensus about which tags best describe the content. However, no work has studied the regularities in how users order tags during tagging. In this paper, we focus on the natural ordering of tags in domain-specific Q&A sites. We study the tag sequences of millions of questions in four Q&A sites: CodeProject, SegmentFault, Biostars, and CareerCup. Our results show that users of these Q&A sites can develop an implicit consensus about the order in which they assign tags to questions. We also study the relationships between tags that can explain the emergence of this natural ordering. Our study opens a path to improving existing tag recommendation and Q&A site navigation by leveraging the natural ordering of tags.
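To make the idea of an "implicit consensus about order" concrete, the following is a minimal sketch, not the authors' method. It assumes each question carries an ordered list of distinct tags and measures, for a pair of tags, how often one precedes the other when both appear on the same question. The function names and the toy tag sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def pairwise_order_counts(tag_sequences):
    """Count how often tag a appears before tag b across all sequences.

    Assumes the tags within one sequence are distinct.
    """
    counts = Counter()
    for seq in tag_sequences:
        # combinations() keeps input order, so a always precedes b in seq.
        for a, b in combinations(seq, 2):
            counts[(a, b)] += 1
    return counts


def order_agreement(counts, a, b):
    """Fraction of co-occurrences in which a precedes b.

    A value near 0.5 suggests no preferred order; a value near 0 or 1
    suggests a consistent order. Returns None if the tags never co-occur.
    """
    forward, backward = counts[(a, b)], counts[(b, a)]
    total = forward + backward
    return forward / total if total else None


# Hypothetical toy data: each inner list is the tag sequence of one question.
questions = [
    ["python", "pandas", "dataframe"],
    ["python", "pandas"],
    ["pandas", "python"],
    ["python", "dataframe"],
]

counts = pairwise_order_counts(questions)
print(order_agreement(counts, "python", "pandas"))
```

On real data, a strongly skewed agreement score for many tag pairs would be one simple indication that users order tags consistently rather than arbitrarily.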