Journal of Zhejiang University

Frontiers of Information Technology & Electronic Engineering 2020 Vol.21 No.3 P.384-404

http://doi.org/10.1631/FITEE.1900127

Large-scale graph processing systems: a survey

Author(s): Ning Liu, Dong-sheng Li, Yi-ming Zhang, Xiong-lve Li
Affiliation(s): 1. Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410000, China
Corresponding email(s): liuning17a@nudt.edu.cn, dsli@nudt.edu.cn, zhangyiming@nudt.edu.cn, lixionglve17@nudt.edu.cn
Key Words: Graph workloads, Graph applications, Graph processing systems

Ning Liu, Dong-sheng Li, Yi-ming Zhang, Xiong-lve Li. Large-scale graph processing systems: a survey[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(3): 384-404.

@article{title="Large-scale graph processing systems: a survey",
author="Ning Liu, Dong-sheng Li, Yi-ming Zhang, Xiong-lve Li",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="21",
number="3",
pages="384-404",
year="2020",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1900127"
}

%0 Journal Article
%T Large-scale graph processing systems: a survey
%A Ning Liu
%A Dong-sheng Li
%A Yi-ming Zhang
%A Xiong-lve Li
%J Frontiers of Information Technology & Electronic Engineering
%V 21
%N 3
%P 384-404
%@ 2095-9184
%D 2020
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1900127

TY - JOUR
T1 - Large-scale graph processing systems: a survey
A1 - Ning Liu
A1 - Dong-sheng Li
A1 - Yi-ming Zhang
A1 - Xiong-lve Li
JO - Frontiers of Information Technology & Electronic Engineering
VL - 21
IS - 3
SP - 384
EP - 404
SN - 2095-9184
Y1 - 2020
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1900127
ER -

Abstract: Graphs are an important data structure for describing relationships between entities. Many real-world application domains depend heavily on graph data. However, graph applications differ substantially from traditional applications, and general-purpose platforms process them inefficiently, which has motivated research on dedicated graph processing platforms. In this survey, we systematically categorize graph workloads and applications, and review existing graph processing platforms in detail, dividing them into general-purpose and specialized systems. We analyze their implementation technologies, including programming models, partitioning strategies, communication models, execution models, and fault tolerance strategies. Finally, we analyze recent advances and present four open problems for future research.

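Large-scale graph processing systems: a survey

Ning Liu, Dong-sheng Li, Yi-ming Zhang, Xiong-lve Li
Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410000, China

Abstract: A graph is an important data structure for describing relationships between entities. Many real-world application domains rely heavily on graph data. However, because graph computing applications differ markedly from traditional applications, processing them on general-purpose platforms is inefficient, which has greatly driven research on dedicated graph computing systems. This survey systematically classifies graph algorithms and graph computing applications, divides existing graph computing systems into general-purpose and specialized systems, and summarizes them in detail. It analyzes in depth the implementation technologies of graph computing systems, including programming models, partitioning strategies, communication models, execution models, and fault tolerance mechanisms. Finally, it analyzes the latest advances in graph computing and proposes four problems that require further study.

Key words: Graph algorithms; Graph computing applications; Graph computing systems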
