CLC number: TP316

On-line Access: 2018-04-09

Received: 2016-08-16

Revision Accepted: 2016-11-07

Crosschecked: 2018-02-15

ORCID: Kai Lu, http://orcid.org/0000-0002-8798-2195

Frontiers of Information Technology & Electronic Engineering  2018 Vol.19 No.2 P.192-205

http://doi.org/10.1631/FITEE.1601477


Versionized process based on non-volatile random-access memory for fine-grained fault tolerance


Author(s):  Wen-zhe Zhang, Kai Lu, Xiao-ping Wang

Affiliation(s):  Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Changsha 410073, China

Corresponding email(s):   lukainudt@163.com

Key Words:  Non-volatile memory, Byte-persistence, Versionized process, Version number


Wen-zhe Zhang, Kai Lu, Xiao-ping Wang. Versionized process based on non-volatile random-access memory for fine-grained fault tolerance[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(2): 192-205.

@article{FITEE.1601477,
title="Versionized process based on non-volatile random-access memory for fine-grained fault tolerance",
author="Wen-zhe Zhang and Kai Lu and Xiao-ping Wang",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="2",
pages="192--205",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601477"
}

%0 Journal Article
%T Versionized process based on non-volatile random-access memory for fine-grained fault tolerance
%A Wen-zhe Zhang
%A Kai Lu
%A Xiao-ping Wang
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 2
%P 192-205
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601477

TY  - JOUR
T1  - Versionized process based on non-volatile random-access memory for fine-grained fault tolerance
A1  - Wen-zhe Zhang
A1  - Kai Lu
A1  - Xiao-ping Wang
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 19
IS  - 2
SP  - 192
EP  - 205
SN  - 2095-9184
Y1  - 2018
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1601477
ER  -


Abstract: 
Non-volatile random-access memory (NVRAM) technology is maturing rapidly, and its byte-persistence feature allows the design of new and efficient fault tolerance mechanisms. In this paper, we propose the versionized process (VerP), a new process model based on NVRAM that is natively non-volatile and fault-tolerant. We introduce an intermediate software layer that lets a process run directly on NVRAM with all of its state stored there, and then propose a mechanism to versionize all the process data. Each piece of process data is given a version number, which increases each time that piece of data is modified. The version numbers let us trace the modification of any data and recover the data to a consistent state after a system crash. Compared with traditional checkpoint methods, our work can achieve fine-grained fault tolerance at very little cost.
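
The abstract does not spell out how VerP stores or compares version numbers, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy in-memory dictionary standing in for NVRAM, and the names `VersionedStore`, `write`, `mark_consistent`, `modified_since_consistent`, and `recover` are all hypothetical. Each piece of data carries a version number that is incremented on every modification. Comparing version numbers against the last consistent point shows which data changed, and those items are rolled back after a simulated crash.

```python
from typing import Any, Dict, List, Tuple


class VersionedStore:
    """Toy stand-in for process data kept in NVRAM (hypothetical, illustration only)."""

    def __init__(self) -> None:
        # name -> (value, version): the live state that the process mutates.
        self.live: Dict[str, Tuple[Any, int]] = {}
        # name -> (value, version) as of the last consistent point.
        self.stable: Dict[str, Tuple[Any, int]] = {}

    def write(self, name: str, value: Any) -> int:
        """Modify one piece of data and increase its version number."""
        _, version = self.live.get(name, (None, 0))
        self.live[name] = (value, version + 1)
        return version + 1

    def mark_consistent(self) -> None:
        """Record the current values and versions as a consistent state."""
        self.stable = dict(self.live)

    def modified_since_consistent(self) -> List[str]:
        """Trace modifications by comparing live and stable version numbers."""
        return sorted(
            name
            for name, (_, version) in self.live.items()
            if self.stable.get(name, (None, 0))[1] != version
        )

    def recover(self) -> None:
        """After a simulated crash, roll modified data back to the consistent state."""
        for name in self.modified_since_consistent():
            if name in self.stable:
                self.live[name] = self.stable[name]
            else:
                # Data created after the consistent point is discarded.
                del self.live[name]
```

In this sketch, only data whose version number differs from the recorded one is touched during recovery, which is one way to read "fine-grained" in contrast to restoring a whole checkpoint image. A real NVRAM design would also need ordered, durable writes, which this toy ignores.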

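Fine-grained fault tolerance with versionized processes based on non-volatile memory

Summary: The byte-granularity persistence and non-volatility offered by emerging non-volatile memory (NVRAM) will strongly support the design of new fault tolerance techniques. We propose a new fault-tolerant process model based on NVRAM, the versionized process (VerP). By introducing a software intermediate layer between traditional software and hardware, the model decouples software from hardware and reorganizes all process data on NVRAM, thereby supporting native fault tolerance for processes on NVRAM. Furthermore, each piece of process data is assigned a version number, and consistent updates of the process's non-volatile data are achieved by updating the version numbers. Compared with traditional checkpoint mechanisms, VerP efficiently supports fine-grained fault tolerance.

Keywords: Non-volatile memory; Byte-granularity persistence; Versionized process; Version number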
