CLC number: TP393.09

On-line Access: 2014-01-29

Received: 2013-10-08

Revision Accepted: 2013-12-22

Crosschecked: 2014-01-15

Journal of Zhejiang University SCIENCE C 2014 Vol.15 No.2 P.81-90

http://doi.org/10.1631/jzus.C1300281


Querying dynamic communities in online social networks


Author(s):  Li Weigang, Edans F. O. Sandes, Jianya Zheng, Alba C. M. A. de Melo, Lorna Uden

Affiliation(s):  Department of Computer Science, University of Brasilia, Brasilia 70910-900, Brazil

Corresponding email(s):   weigang@cic.unb.br

Key Words:  Follow Model, Hadoop, MapReduce, Querying, Twitter


Li Weigang, Edans F. O. Sandes, Jianya Zheng, Alba C. M. A. de Melo, Lorna Uden. Querying dynamic communities in online social networks[J]. Journal of Zhejiang University Science C, 2014, 15(2): 81-90.

@article{Weigang2014,
title="Querying dynamic communities in online social networks",
author="Li Weigang and Edans F. O. Sandes and Jianya Zheng and Alba C. M. A. de Melo and Lorna Uden",
journal="Journal of Zhejiang University Science C",
volume="15",
number="2",
pages="81-90",
year="2014",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.C1300281"
}

%0 Journal Article
%T Querying dynamic communities in online social networks
%A Li Weigang
%A Edans F. O. Sandes
%A Jianya Zheng
%A Alba C. M. A. de Melo
%A Lorna Uden
%J Journal of Zhejiang University SCIENCE C
%V 15
%N 2
%P 81-90
%@ 1869-1951
%D 2014
%I Zhejiang University Press & Springer
%R 10.1631/jzus.C1300281

TY  - JOUR
T1  - Querying dynamic communities in online social networks
A1  - Li Weigang
A1  - Edans F. O. Sandes
A1  - Jianya Zheng
A1  - Alba C. M. A. de Melo
A1  - Lorna Uden
JO  - Journal of Zhejiang University Science C
VL  - 15
IS  - 2
SP  - 81
EP  - 90
SN  - 1869-1951
Y1  - 2014
PB  - Zhejiang University Press & Springer
DO  - 10.1631/jzus.C1300281
ER  - 


Abstract: 
Online social networks (OSNs) offer people the opportunity to join communities where they share a common interest or objective. Such communities are useful for studying human behavior, information diffusion, and group dynamics. Because the membership of a community changes constantly, an efficient solution is needed to query information in real time. This paper introduces the Follow Model to represent the basic relationship between users in OSNs, and combines it with MapReduce to develop new parallel algorithms for querying. Two models, one for the reverse relation and one for the high-order relation between users, were implemented in Hadoop. Using 75 GB of message data and 26 GB of relation network data from Twitter, a case study was carried out on two dynamic discussion communities: #musicmonday and #beatcancer. The querying performance demonstrates that the new solution, as implemented in Hadoop, significantly improves the ability to find useful information in OSNs.
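
As a rough illustration of the reverse-relation query mentioned in the abstract, the following Python sketch simulates a map step keyed by followee and a reduce step that collects followers. It runs in memory rather than on Hadoop. The sample edges and the names `map_followee`, `reduce_follower`, and `run_job` are assumptions made for illustration and are not taken from the paper's implementation.

```python
from collections import defaultdict

# Hypothetical sample data: (follower, followee) pairs, i.e. "a follows b".
FOLLOWS = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
    ("dave", "carol"),
]


def map_followee(edge):
    """Map step: key each edge by the followee so followers can be grouped."""
    follower, followee = edge
    yield followee, follower


def reduce_follower(followee, followers):
    """Reduce step: return the de-duplicated, sorted followers of one user."""
    return followee, sorted(set(followers))


def run_job(edges):
    """Simulate map, shuffle, and reduce in memory (no Hadoop involved)."""
    shuffled = defaultdict(list)
    for edge in edges:
        for key, value in map_followee(edge):
            shuffled[key].append(value)
    return dict(reduce_follower(k, v) for k, v in shuffled.items())


followers_of = run_job(FOLLOWS)
print(followers_of["bob"])  # ['alice', 'carol']
```

On a real cluster the shuffle would be performed by the framework; here a dictionary stands in for it so the data flow can be followed end to end.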

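Querying dynamic communities in online social networks

Objective: Dynamic communities in online social networks form in real time, with bursts of information and rapid propagation. Promptly finding useful information within these communities in a big-data environment is a challenging task in this field. This paper adopts a logical model that describes user relationships (the Follow Model) and, combining it with the MapReduce concept, explores algorithms based on the MapFollowee & ReduceFollower mechanism implemented on a Hadoop system.
Innovation: Research on online social networks lacks practical and convenient foundational models. The Follow Model provides an effective meta-model for studying problems such as dynamic community querying and microblog retweet prediction. Combined with the MapReduce idea, the algorithms in this paper offer practical parallel algorithms for querying dynamic communities in online social networks, that is, for dynamic querying over big data.
Method highlights: The functions that make up the Follow Model describe the relationships between microblog users concisely and accurately, and have three properties: antisymmetry and symmetry, extensibility, and composability. Flexible use of these properties yields the two classes of query algorithms proposed in this paper: reverse relation query algorithms and high-order relation query algorithms.
Conclusions: This paper studies how dynamic communities form in online social networks, particularly on Twitter and Sina Weibo, and proposes a logical model that describes the relationships between users, the Follow Model. Combining this model with the MapReduce idea, it proposes parallel algorithms for querying information in these dynamic communities. In particular, a practical test that queries two communities on Twitter demonstrates the effectiveness of the algorithms in a big-data environment.

Key words: Follow Model, Hadoop, MapReduce, Information querying, Twitter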
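
One possible reading of the symmetry and composability properties behind the high-order relation, sketched in Python under stated assumptions: the follow graph is a dictionary from each user to the set of users they follow, a high-order relation is obtained by applying the followee lookup repeatedly, and a symmetric relation is a mutual follow. The names `high_order_followees` and `is_mutual` and the sample graph are hypothetical and do not come from the paper.

```python
# Hypothetical follow graph: user -> set of users they follow.
GRAPH = {
    "alice": {"bob"},
    "bob": {"carol", "alice"},
    "carol": {"dave"},
}


def high_order_followees(graph, user, order):
    """Compose the followee lookup `order` times, starting from `user`."""
    frontier = {user}
    for _ in range(order):
        next_frontier = set()
        for member in frontier:
            # Users missing from the graph follow nobody.
            next_frontier |= graph.get(member, set())
        frontier = next_frontier
    return frontier


def is_mutual(graph, a, b):
    """Symmetric case: a follows b and b follows a."""
    return b in graph.get(a, set()) and a in graph.get(b, set())


print(sorted(high_order_followees(GRAPH, "alice", 2)))  # ['alice', 'carol']
print(is_mutual(GRAPH, "alice", "bob"))  # True
```

Each round of the loop corresponds to one more application of the relation, which is the step a parallel job would distribute across workers for a large graph.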
