CLC number: TP391

On-line Access: 2022-03-22

Received: 2020-11-22

Revision Accepted: 2022-04-22

Crosschecked: 2021-01-10

Frontiers of Information Technology & Electronic Engineering  2022 Vol.23 No.3 P.409-421


NGAT: attention in breadth and depth exploration for semi-supervised graph representation learning

Author(s):  Jianke HU, Yin ZHANG

Affiliation(s):  College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

Corresponding email(s):   yinzh@zju.edu.cn

Key Words:  Graph learning, Semi-supervised learning, Node classification, Attention

Jianke HU, Yin ZHANG. NGAT: attention in breadth and depth exploration for semi-supervised graph representation learning[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(3): 409-421.

@article{title="NGAT: attention in breadth and depth exploration for semi-supervised graph representation learning",
author="Jianke HU and Yin ZHANG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23", number="3", pages="409-421", year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2000657"
}

%0 Journal Article
%T NGAT: attention in breadth and depth exploration for semi-supervised graph representation learning
%A Jianke HU
%A Yin ZHANG
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 3
%P 409-421
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2000657

TY - JOUR
T1 - NGAT: attention in breadth and depth exploration for semi-supervised graph representation learning
A1 - Jianke HU
A1 - Yin ZHANG
JO - Frontiers of Information Technology & Electronic Engineering
VL - 23
IS - 3
SP - 409
EP - 421
SN - 2095-9184
Y1 - 2022
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2000657
ER -

Graph neural networks (GNNs) have recently achieved remarkable performance in representation learning on graph-structured data. However, as the number of layers increases, GNNs based on neighborhood aggregation degrade because of oversmoothing, which is a major bottleneck in applying GNNs to real-world graphs. Many efforts have been made to improve how feature information is aggregated from directly connected nodes, i.e., breadth exploration. These models, however, perform best only with three or fewer layers, and their performance drops rapidly as more layers are added. To alleviate oversmoothing, we propose a nested graph attention network (NGAT), which can work in a semi-supervised manner. In addition to breadth exploration, a k-layer NGAT uses a layer-wise aggregation strategy guided by the attention mechanism to selectively leverage feature information from the kth-order neighborhood, i.e., depth exploration. Even with 10 or more layers, NGAT can balance preserving locality (including root node features and the local structure) against aggregating information from a large neighborhood. In experiments on standard node classification tasks, NGAT outperforms other recent models and achieves state-of-the-art performance.
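The two ideas in the abstract, oversmoothing under repeated neighborhood aggregation and attention across layers, can be illustrated with a small sketch. This is not the NGAT formulation from the paper: it assumes plain mean aggregation for breadth exploration and an unlearned dot-product score against the root node features for depth exploration, and the helper names (closed_neighborhoods, propagate, depth_attention, spread) are hypothetical.

```python
import math


def closed_neighborhoods(edges, n):
    """For each node, list the node itself plus its direct neighbors."""
    nbrs = [{i} for i in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return [sorted(s) for s in nbrs]


def propagate(nbrs, feats):
    """Breadth exploration (assumed form): average features over each closed neighborhood."""
    dim = len(feats[0])
    return [[sum(feats[j][d] for j in nb) / len(nb) for d in range(dim)]
            for nb in nbrs]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def depth_attention(layers, node):
    """Depth exploration (assumed form): mix one node's per-layer representations.

    Each layer is scored by its dot product with the node's root (layer-0)
    features; a trained model would use learned attention parameters instead.
    """
    root = layers[0][node]
    scores = [sum(a * b for a, b in zip(root, h[node])) for h in layers]
    weights = softmax(scores)
    dim = len(root)
    mixed = [sum(w * h[node][d] for w, h in zip(weights, layers))
             for d in range(dim)]
    return mixed, weights


def spread(h):
    """Gap between the largest and smallest first feature across nodes."""
    column = [row[0] for row in h]
    return max(column) - min(column)


# Toy path graph 0-1-2-3 with two feature "classes" (illustrative data only).
edges = [(0, 1), (1, 2), (2, 3)]
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
nbrs = closed_neighborhoods(edges, 4)

# Stack 10 rounds of aggregation; layers[k] reaches the k-th-order neighborhood.
layers = [feats]
for _ in range(10):
    layers.append(propagate(nbrs, layers[-1]))

print("spread at layer 0: ", spread(layers[0]))
print("spread at layer 10:", spread(layers[10]))

mixed, weights = depth_attention(layers, node=0)
print("layer weights for node 0:", [round(w, 3) for w in weights])
print("mixed representation:    ", [round(x, 3) for x in mixed])
```

In this toy setting the node features become nearly indistinguishable after 10 rounds of averaging, which is the oversmoothing effect, while the layer-wise mix for node 0 stays closer to its root features than the deepest layer alone does.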







