CLC number: TP393.0

On-line Access: 2022-12-14

Received: 2022-05-18

Revision Accepted: 2022-10-11

Crosschecked: 2022-12-17

 ORCID:

Jinshu SU

https://orcid.org/0000-0001-9273-616X

Baokang ZHAO

https://orcid.org/0000-0001-9200-9018

Frontiers of Information Technology & Electronic Engineering 

Accepted manuscript available online (unedited version)


Technology trends in large-scale high-efficiency network computing


Author(s):  Jinshu SU, Baokang ZHAO, Yi DAI, Jijun CAO, Ziling WEI, Na ZHAO, Congxi SONG, Yujing LIU, Yusheng XIA

Affiliation(s):  School of Computer, National University of Defense Technology, Changsha 410073, China

Corresponding email(s):  sjs@nudt.edu.cn, bkzhao@nudt.edu.cn

Key Words:  Supercomputing; Cloud computing; Network technology; Development trends


Jinshu SU, Baokang ZHAO, Yi DAI, Jijun CAO, Ziling WEI, Na ZHAO, Congxi SONG, Yujing LIU, Yusheng XIA. Technology trends in large-scale high-efficiency network computing[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(12): 1733-1746.

@article{title="Technology trends in large-scale high-efficiency network computing",
author="Jinshu SU, Baokang ZHAO, Yi DAI, Jijun CAO, Ziling WEI, Na ZHAO, Congxi SONG, Yujing LIU, Yusheng XIA",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23",
number="12",
pages="1733-1746",
year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2200217"
}

%0 Journal Article
%T Technology trends in large-scale high-efficiency network computing
%A Jinshu SU
%A Baokang ZHAO
%A Yi DAI
%A Jijun CAO
%A Ziling WEI
%A Na ZHAO
%A Congxi SONG
%A Yujing LIU
%A Yusheng XIA
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 12
%P 1733-1746
%@ 1869-1951
%D 2022
%I Zhejiang University Press & Springer

TY - JOUR
T1 - Technology trends in large-scale high-efficiency network computing
A1 - Jinshu SU
A1 - Baokang ZHAO
A1 - Yi DAI
A1 - Jijun CAO
A1 - Ziling WEI
A1 - Na ZHAO
A1 - Congxi SONG
A1 - Yujing LIU
A1 - Yusheng XIA
JO - Frontiers of Information Technology & Electronic Engineering
VL - 23
IS - 12
SP - 1733
EP - 1746
SN - 1869-1951
Y1 - 2022
PB - Zhejiang University Press & Springer
ER -


Abstract: 
Network technology is the basis for large-scale high-efficiency network computing, such as supercomputing, cloud computing, big data processing, and artificial intelligence computing. The network technologies of network computing systems in different fields borrow from one another while also being designed and optimized for their own targets. Taking a comprehensive view, this paper summarizes three development trends for network technologies across these fields: integration, differentiation, and optimization. Integration means that network technologies in different fields no longer have clear boundaries; differentiation means that different application fields have their own unique solutions, or that new application requirements give rise to innovative ones; and optimization means that technologies are optimized for specific scenarios. This paper can help academic researchers consider what should be done in the future and help industry practitioners consider how to build efficient, practical network systems.

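Development trends of network technology in large-scale high-efficiency network computing

Jinshu SU1,2, Baokang ZHAO1, Yi DAI1, Jijun CAO1, Ziling WEI1, Na ZHAO1, Congxi SONG1, Yujing LIU1, Yusheng XIA2
1School of Computer, National University of Defense Technology, Changsha 410073, China
2Academy of Military Sciences, Beijing 100091, China
Abstract: Network technology is the foundation of large-scale high-efficiency computing such as supercomputing, cloud computing, big data, and artificial intelligence. Network technologies in different fields borrow from one another while each is designed and optimized for its own targets. Taking a comprehensive view, this paper holds that the development trends of network technology in large-scale high-efficiency network computing fall mainly into three aspects: integration, differentiation, and optimization. Integration is reflected in the absence of clear dividing lines between network technologies in different fields; differentiation is reflected in unique solutions in different fields or innovative solutions under new application requirements; optimization is reflected in optimized technical implementations for specific scenarios. This paper offers scholars in related fields material for thinking about future research directions, and offers practitioners in related industries direction for building more practical and efficient network systems.

Key words: Supercomputing; Cloud computing; Network technology; Development trends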

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2023 Journal of Zhejiang University-SCIENCE