CLC number: TP333
On-line Access: 2023-05-31
Received: 2022-10-15
Revision Accepted: 2023-05-31
Crosschecked: 2023-02-12
Yaofeng TU, Rong XIAO, Yinjun HAN, Zhenghua CHEN, Hao JIN, Xuecheng QI, Xinyuan SUN. DDUC: an erasure-coded system with decoupled data updating and coding[J]. Frontiers of Information Technology & Electronic Engineering, 2023, 24(5): 716-730.
@article{Tu2023DDUC,
title="DDUC: an erasure-coded system with decoupled data updating and coding",
author="Yaofeng TU and Rong XIAO and Yinjun HAN and Zhenghua CHEN and Hao JIN and Xuecheng QI and Xinyuan SUN",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="24",
number="5",
pages="716-730",
year="2023",
publisher="Zhejiang University Press & Springer",
issn="2095-9184",
doi="10.1631/FITEE.2200466"
}
Abstract: In distributed storage systems, replication and erasure coding (EC) are the common methods for data redundancy. Compared with replication, EC has better storage efficiency but incurs higher update overhead. Moreover, the consistency and reliability problems caused by concurrent updates pose new challenges for applying EC. Many works focus on optimizing EC itself, for example through algorithm optimization or novel data update methods, but they do not address these consistency and reliability problems. In this paper, we introduce decoupled data updating and coding (DDUC), a storage system that decouples data updating from EC encoding, and propose a data placement policy that combines replication and parity blocks. For an (N,M) EC system, the data are placed as N groups of M+1 replicas, and the redundant data blocks of the same stripe are placed in the parity nodes, so that the parity nodes can autonomously perform local EC encoding. Based on this policy, we implement a two-phase data update method: in phase 1, data are updated in replica mode; in phase 2, the parity nodes perform EC encoding independently. This solves the problem of data reliability degradation caused by concurrent updates while ensuring high concurrency performance. DDUC also uses two hardware features of persistent memory (PMem), byte addressability and eight-byte atomic writes, to implement a lightweight logging mechanism that improves performance while ensuring data consistency. Experimental results show that the concurrent access performance of the proposed storage system is 1.70–3.73 times that of the state-of-the-art storage system Ceph, and its latency is only 3.4%–5.9% of Ceph's.
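The placement policy and two-phase update described in the abstract can be illustrated with a small in-memory sketch. It is not the authors' implementation: the class and method names (`Stripe`, `ParityNode`, `update`, `encode_all`) are hypothetical, the reading that each data block is replicated once on its data node and once on each of the M parity nodes is an assumption drawn from the abstract, and a plain bytewise XOR stands in for the real EC encoding. PMem logging and failure handling are omitted.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks.

    Placeholder for real EC encoding: an actual (N, M) code such as
    Reed-Solomon would give each parity node different coefficients.
    """
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class DataNode:
    def __init__(self, block_size):
        self.block = bytes(block_size)


class ParityNode:
    def __init__(self, n, block_size):
        # Assumption: a parity node keeps a replica of every data block
        # in the stripe, so it can encode without contacting other nodes.
        self.replicas = [bytes(block_size) for _ in range(n)]
        self.parity = None  # None means "encoding pending"

    def encode(self):
        # Phase 2: local encoding from the replicas held on this node.
        self.parity = xor_blocks(self.replicas)


class Stripe:
    """One (N, M) stripe: N data nodes plus M parity nodes (hypothetical model)."""

    def __init__(self, n, m, block_size):
        self.data_nodes = [DataNode(block_size) for _ in range(n)]
        self.parity_nodes = [ParityNode(n, block_size) for _ in range(m)]

    def update(self, index, payload):
        # Phase 1: update in replica mode, writing all M+1 copies of the block.
        self.data_nodes[index].block = payload
        for node in self.parity_nodes:
            node.replicas[index] = payload
            node.parity = None  # parity is stale until phase 2 runs

    def encode_all(self):
        # Phase 2 runs independently on each parity node.
        for node in self.parity_nodes:
            node.encode()

    def replica_count(self, index):
        current = self.data_nodes[index].block
        return 1 + sum(1 for node in self.parity_nodes
                       if node.replicas[index] == current)


stripe = Stripe(n=4, m=2, block_size=4)
stripe.update(1, b"\x01\x02\x03\x04")
stripe.update(3, b"\x10\x20\x30\x40")
print(stripe.replica_count(1))   # copies of block 1 after phase 1
stripe.encode_all()
print(stripe.parity_nodes[0].parity.hex())
```

In this model an update returns after phase 1 with M+1 copies of the block already written, and phase 2 needs no cross-node reads because each parity node encodes from its own replicas.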