Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2022 Vol.23 No.6 P.858-875
SA-RSR: a read-optimal data recovery strategy for XOR-coded distributed storage systems
Abstract: To ensure the reliability and availability of data, redundancy strategies are always required for distributed storage systems. Erasure coding, one of the representative redundancy strategies, has the advantage of low storage overhead, which facilitates its employment in distributed storage systems. Among the various erasure coding schemes, XOR-based erasure codes are becoming popular due to their high computing speed. When a single-node failure occurs in such coding schemes, a process called data recovery takes place to retrieve the failed node's lost data from surviving nodes. However, data transmission during the data recovery process usually requires a considerable amount of time. Current research has focused mainly on reducing the amount of data needed for data recovery to reduce the time required for data transmission, but it has encountered problems such as significant complexity and local optima. In this paper, we propose a random search recovery algorithm, named SA-RSR, to speed up single-node failure recovery of XOR-based erasure codes. SA-RSR uses a simulated annealing technique to search for an optimal recovery solution that reads and transmits a minimum amount of data. In addition, this search process can be done in polynomial time. We evaluate SA-RSR with a variety of XOR-based erasure codes in simulations and in a real storage system, Ceph. Experimental results in Ceph show that SA-RSR reduces the amount of data required for recovery by up to 30.0% and improves the performance of data recovery by up to 20.36% compared to the conventional recovery method.
Key words: Distributed storage system; Data reliability and availability; XOR-based erasure codes; Single-node failure; Data recovery
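The idea summarized in the abstract, using simulated annealing to search for a single-node recovery plan that reads as little surviving data as possible, can be illustrated with a small sketch. This is not the authors' SA-RSR implementation. It assumes a toy RDP-style XOR layout with prime p = 5, in which each lost symbol can be rebuilt from either its row-parity or its diagonal-parity equation, and all names (`options`, `cost`, `anneal`) and the annealing parameters are hypothetical choices made for illustration.

```python
import math
import random
from functools import reduce
from operator import xor as _xor

P = 5                                 # assumed prime parameter of the toy RDP-style layout
ROWS = P - 1                          # rows per stripe
ROW_PARITY, DIAG_PARITY = P - 1, P    # column indices of the two parity disks

def xor(values):
    """XOR-fold an iterable of integers."""
    return reduce(_xor, values, 0)

def row_eq(r):
    """Cells of row r (data plus row parity); their XOR is zero."""
    return {(r, c) for c in range(P)}

def diag_eq(d):
    """Cells of diagonal d (0 <= d <= P-2) plus its parity; their XOR is zero."""
    cells = {((d - c) % P, c) for c in range(P) if (d - c) % P < ROWS}
    return cells | {(d, DIAG_PARITY)}

def encode(rng):
    """Fill a stripe with random data symbols, then compute both parities."""
    s = {(r, c): rng.randrange(256) for r in range(ROWS) for c in range(P - 1)}
    for r in range(ROWS):
        s[(r, ROW_PARITY)] = xor(s[x] for x in row_eq(r) if x[1] != ROW_PARITY)
    for d in range(ROWS):
        s[(d, DIAG_PARITY)] = xor(s[x] for x in diag_eq(d) if x[1] != DIAG_PARITY)
    return s

def options(failed):
    """For each lost cell of a failed data disk, the read set of each usable equation."""
    opts = []
    for r in range(ROWS):
        lost = (r, failed)
        choices = [row_eq(r) - {lost}]
        d = (r + failed) % P
        if d < ROWS:                  # diagonal P-1 has no parity symbol in this layout
            choices.append(diag_eq(d) - {lost})
        opts.append(choices)
    return opts

def cost(opts, plan):
    """Number of distinct surviving cells that a plan reads."""
    return len(set().union(*(opts[i][k] for i, k in enumerate(plan))))

def anneal(opts, steps=2000, t0=2.0, alpha=0.995, seed=0):
    """Simulated annealing over per-cell equation choices; returns (plan, reads)."""
    rng = random.Random(seed)
    plan = [0] * len(opts)            # start from conventional all-row recovery
    cur = best = cost(opts, plan)
    best_plan, t = plan[:], t0
    for _ in range(steps):
        i = rng.randrange(len(opts))
        cand = plan[:]
        cand[i] = rng.randrange(len(opts[i]))
        c = cost(opts, cand)
        # Always accept improvements; accept worse plans with Boltzmann probability.
        if c <= cur or rng.random() < math.exp((cur - c) / t):
            plan, cur = cand, c
            if cur < best:
                best, best_plan = cur, plan[:]
        t *= alpha                    # geometric cooling schedule
    return best_plan, best

opts = options(failed=0)
plan, reads = anneal(opts)
print("plan:", plan, "reads:", reads, "conventional reads:", cost(opts, [0] * ROWS))
```

Because cells shared between a row equation and a diagonal equation are read only once, mixing the two kinds of equations can lower the total read count below that of row-only recovery. In this sketch the search space is tiny, so exhaustive search would also work; a randomized search only becomes worthwhile when the number of candidate plans is too large to enumerate.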
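1 School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
2 Beijing Institute of Electronic System Engineering, Beijing 100854, China
Abstract (translated from the Chinese): Redundancy strategies are widely used in distributed storage systems to guarantee data reliability and availability. Erasure coding is a representative redundancy strategy whose low storage overhead has encouraged its adoption in distributed storage systems. Among erasure coding schemes, XOR-based erasure codes are becoming increasingly popular because of their high computational efficiency. When a single node fails in a storage system that uses XOR-based erasure codes, data recovery is performed: data are downloaded from the surviving nodes and used to rebuild the data of the failed node. However, data transmission during recovery usually takes a considerable amount of time. Current research focuses mainly on shortening transmission time by reducing the amount of data needed for recovery, but it suffers from problems such as high complexity and local optima. This paper proposes a random search recovery algorithm, SA-RSR, which speeds up single-node failure recovery for XOR-based erasure codes. SA-RSR uses simulated annealing to find an optimal recovery scheme that reads and transmits the minimum amount of data, and the search can be completed in polynomial time. To verify its effectiveness, the method is evaluated in simulations with a variety of XOR-based erasure codes and in a real storage system, Ceph. Experimental results show that, compared with the conventional recovery method, SA-RSR reduces the amount of data read and transmitted by 30% and improves data recovery performance by 20.36%.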
DOI: 10.1631/FITEE.2100242
CLC number: TP391.4
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2021-08-16