Frontiers of Information Technology & Electronic Engineering  2021 Vol.22 No.2 P.170-184


Learning natural ordering of tags in domain-specific Q&A sites

Author(s):  Junfang Jia, Guoqiang Li

Affiliation(s):  School of Computer and Network Engineering, Shanxi Datong University, Datong 037009, China

Corresponding email(s):   jiajunfang816@163.com, li.g@sjtu.edu.cn

Key Words:  Question and answering (Q&A) sites, Tagging, Natural order, Skip gram

Junfang Jia, Guoqiang Li. Learning natural ordering of tags in domain-specific Q&A sites[J]. Frontiers of Information Technology & Electronic Engineering, 2021, 22(2): 170-184.

Tagging is a defining characteristic of Web 2.0. It allows users of social computing systems (e.g., question and answering (Q&A) sites) to annotate content with free terms. But is tagging really a free action? Existing work has shown that users can develop an implicit consensus about which tags best describe the content in an online community. However, no work has studied the regularities in how users order tags during tagging. In this paper, we focus on the natural ordering of tags in domain-specific Q&A sites. We study the tag sequences of millions of questions in four Q&A sites: CodeProject, SegmentFault, Biostars, and CareerCup. Our results show that users of these Q&A sites can develop an implicit consensus about the order in which they assign tags to questions. We study the relationships between tags that can explain the emergence of this natural ordering. Our study opens a path to improving existing tag recommendation and Q&A site navigation by leveraging the natural ordering of tags.
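
As a rough illustration of what an implicit consensus on tag order could look like in data, the sketch below counts, for each pair of tags, how often one precedes the other across questions. This is not the method used in the paper; the function names `pairwise_order_counts` and `order_agreement` and the toy tag lists are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def pairwise_order_counts(tag_sequences):
    """Count how often tag a appears before tag b across all questions."""
    counts = Counter()
    for tags in tag_sequences:
        # combinations() preserves input order, so (a, b) means a precedes b
        for a, b in combinations(tags, 2):
            counts[(a, b)] += 1
    return counts


def order_agreement(counts, a, b):
    """Fraction of co-occurrences in which a precedes b (None if never together)."""
    forward, backward = counts[(a, b)], counts[(b, a)]
    total = forward + backward
    return forward / total if total else None


# Toy, made-up data: each list is the ordered tag sequence of one question
questions = [
    ["python", "pandas", "dataframe"],
    ["python", "pandas"],
    ["pandas", "python"],
    ["python", "numpy"],
]
counts = pairwise_order_counts(questions)
print(order_agreement(counts, "python", "pandas"))
```

An agreement value close to 1 or 0 for a pair would suggest that users consistently place one tag before the other, while a value near 0.5 would suggest no preferred order.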



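Abstract (translated from Chinese): Tagging is an important feature of Web 2.0. It lets users of social computing systems (e.g., Q&A sites) annotate content freely. But is tagging really free and unconstrained? Existing work shows that users can often reach an implicit consensus about which tags best describe the content of an online community. However, no study has yet examined the regularities in how users order tags while tagging. This paper focuses on the natural ordering of tags in domain-specific Q&A sites and studies the tag sequences of millions of questions on four Q&A sites: CodeProject, SegmentFault, Biostars, and CareerCup. The results show that users of these sites can reach an implicit consensus on the order of question tags. The relationships between tags that can explain the emergence of this natural ordering are also studied. The study makes it possible to improve existing tag recommendation and Q&A site navigation by leveraging the natural ordering of tags.

Key words (translated from Chinese): Q&A sites; Tagging; Natural order; Skip gram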

