CLC number: TP315

On-line Access: 2022-06-17

Received: 2021-01-09

Revision Accepted: 2022-07-05

Crosschecked: 2021-02-14


 ORCID:

Mingtian SHAO

https://orcid.org/0000-0003-2368-4946

Kai LU

https://orcid.org/0000-0002-6378-7002


Frontiers of Information Technology & Electronic Engineering  2022 Vol.23 No.6 P.845-857

http://doi.org/10.1631/FITEE.2100016


Self-deployed execution environment for high performance computing


Author(s):  Mingtian SHAO, Kai LU, Wenzhe ZHANG

Affiliation(s):  College of Computer, National University of Defense Technology, Changsha 410073, China

Corresponding email(s):   shaomt@nudt.edu.cn, lukainudt@163.com, zhangwenzhe@nudt.edu.cn

Key Words:  Execution environment, High performance computing, Light-weight, Isolation, Overlay


Mingtian SHAO, Kai LU, Wenzhe ZHANG. Self-deployed execution environment for high performance computing[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(6): 845-857.

@article{shao2022sdee,
title="Self-deployed execution environment for high performance computing",
author="Mingtian SHAO and Kai LU and Wenzhe ZHANG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23",
number="6",
pages="845--857",
year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2100016"
}

%0 Journal Article
%T Self-deployed execution environment for high performance computing
%A Mingtian SHAO
%A Kai LU
%A Wenzhe ZHANG
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 6
%P 845-857
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2100016

TY  - JOUR
T1  - Self-deployed execution environment for high performance computing
A1  - Mingtian SHAO
A1  - Kai LU
A1  - Wenzhe ZHANG
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 23
IS  - 6
SP  - 845
EP  - 857
SN  - 2095-9184
Y1  - 2022
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2100016
ER  -


Abstract: 
Traditional high performance computing (HPC) systems provide a standard preset environment to support scientific computation. However, HPC systems must now support increasingly diverse applications, such as artificial intelligence and big data, and the standard preset environment can no longer meet their requirements. Users who run these emerging applications on HPC systems must manually maintain each application's specific dependencies (libraries, environment variables, and so on), which increases their development and deployment burden. Moreover, the multi-user mode raises privacy problems among users. Containers such as Docker and Singularity can encapsulate a job's execution environment, but in a highly customized HPC system their cross-environment application deployment is limited. Container images also impose a maintenance burden on system administrators. To address these problems, we propose a self-deployed execution environment (SDEE) for HPC. SDEE combines the advantages of traditional virtualization and modern containers. It provides the user with an isolated and customizable environment (similar to a virtual machine) in which the user is the root user. The user develops and debugs the application and deploys its special dependencies in this environment, and can then load the job to compute nodes directly through the traditional HPC job management system. The job and its dependencies are analyzed, packaged, deployed, and executed automatically. This process enables transparent and rapid job deployment, which not only reduces the burden on users but also protects user privacy. Experiments show that the overhead introduced by SDEE is negligible and lower than that of both Docker and Singularity.
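
The abstract does not describe how SDEE analyzes and packages a job, so the following is only a rough illustration of the general idea, not the authors' implementation. It assumes the user's environment is a directory tree layered over a read-only base tree (as an overlay file system would present it), and that "packaging" means archiving only the files the user added or changed. The function names `changed_files` and `package_job` are hypothetical.

```python
import filecmp
import os
import tarfile


def changed_files(base_dir, env_dir):
    """Return sorted relative paths in env_dir that are new or differ from base_dir.

    Assumption: the base tree stands in for the preset HPC environment and
    env_dir for the user's customized view of it.
    """
    changed = []
    for root, _dirs, files in os.walk(env_dir):
        for name in files:
            env_path = os.path.join(root, name)
            rel = os.path.relpath(env_path, env_dir)
            base_path = os.path.join(base_dir, rel)
            # Keep files that do not exist in the base, or whose bytes differ.
            if not os.path.isfile(base_path) or not filecmp.cmp(
                env_path, base_path, shallow=False
            ):
                changed.append(rel)
    return sorted(changed)


def package_job(base_dir, env_dir, archive_path):
    """Archive only the user's changes as a gzip tarball; return the packed paths."""
    rel_paths = changed_files(base_dir, env_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in rel_paths:
            tar.add(os.path.join(env_dir, rel), arcname=rel)
    return rel_paths
```

Under these assumptions, the resulting archive would be what gets shipped to a compute node and unpacked on top of the node's own base environment; files identical to the base are never copied, which is one plausible reason such a scheme could stay light-weight.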




