Journal of Zhejiang University

ENGINEERING Information Technology & Electronic Engineering 2026 Vol.27 No.2 P.1-16

http://doi.org/10.1631/ENG.ITEE.2025.0152

GC bypass: decoupling GC from the flash translation layer to eliminate GC-induced long-tail latency inside SSD

Author(s): Shiqiang NIE, Jie NIU, Yingzhao SHAO, Xiaobo LI, Mingming ZHANG, Weiguo WU
Affiliation(s): 1. School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
Corresponding email(s): wgwu@xjtu.edu.cn
Key Words: Solid-state drive (SSD), NAND flash, Garbage collection (GC), Interconnected network, Flash channel


Shiqiang NIE, Jie NIU, Yingzhao SHAO, Xiaobo LI, Mingming ZHANG, Weiguo WU. GC bypass: decoupling GC from the flash translation layer to eliminate GC-induced long-tail latency inside SSD[J]. Journal of Zhejiang University Science C, 2026, 27(2): 1-16.

@article{title="GC bypass: decoupling GC from the flash translation layer to eliminate GC-induced long-tail latency inside SSD",
author="Shiqiang NIE, Jie NIU, Yingzhao SHAO, Xiaobo LI, Mingming ZHANG, Weiguo WU",
journal="Journal of Zhejiang University Science C",
volume="27",
number="2",
pages="1-16",
year="2026",
publisher="Zhejiang University Press & Springer",
doi="10.1631/ENG.ITEE.2025.0152"
}

%0 Journal Article
%T GC bypass: decoupling GC from the flash translation layer to eliminate GC-induced long-tail latency inside SSD
%A Shiqiang NIE
%A Jie NIU
%A Yingzhao SHAO
%A Xiaobo LI
%A Mingming ZHANG
%A Weiguo WU
%J Frontiers of Information Technology & Electronic Engineering
%V 27
%N 2
%P 1-16
%@ 1869-1951
%D 2026
%I Zhejiang University Press & Springer
%DOI 10.1631/ENG.ITEE.2025.0152

TY - JOUR
T1 - GC bypass: decoupling GC from the flash translation layer to eliminate GC-induced long-tail latency inside SSD
A1 - Shiqiang NIE
A1 - Jie NIU
A1 - Yingzhao SHAO
A1 - Xiaobo LI
A1 - Mingming ZHANG
A1 - Weiguo WU
JO - Frontiers of Information Technology & Electronic Engineering
VL - 27
IS - 2
SP - 1
EP - 16
SN - 1869-1951
Y1 - 2026
PB - Zhejiang University Press & Springer
DO - 10.1631/ENG.ITEE.2025.0152
ER -


Abstract: NAND flash-based solid-state drives (SSDs) have been adopted by many data centers due to their high performance and low power consumption. However, the physical characteristics of the underlying flash memory necessitate garbage collection (GC) operations. Valid page migration during GC contributes significantly to latency overhead and competes with user I/O requests for flash channel bandwidth and controller resources over shared physical paths, leading to path conflicts and elevated long-tail latency. The existing Venice scheme introduces a low-cost interconnected network with path reservation mechanisms to provide substantial path diversity for SSDs. Nevertheless, its fair scheduling policy lacks priority differentiation between I/O and GC requests. In this paper, we propose GC bypass, which leverages Venice’s path diversity while enforcing GC request transmission through dedicated controllers. GC bypass decomposes GC requests into sub-requests and assigns low priority to valid page writes, enabling high-priority operations, including user I/O, valid page reads, and block erases, to preempt paths reserved by low-priority requests. Valid pages that fail to secure reserved paths are temporarily buffered for retry. Experimental results demonstrate that GC bypass reduces the 99.99th-percentile long-tail latency by up to 25% compared to Venice. GC bypass effectively mitigates interference between critical I/O operations and background maintenance tasks while maintaining the architectural benefits of path diversity.
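
The priority rule described in the abstract can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the authors' implementation: it flattens the interconnected network into a fixed pool of interchangeable paths, it does not model the dedicated GC controllers, and every name in it (`Request`, `PathArbiter`, `decompose_gc`, `retry_buffer`) is hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    LOW = 0   # GC valid-page writes
    HIGH = 1  # user I/O, GC valid-page reads, block erases


@dataclass
class Request:
    kind: str  # "user_io", "gc_read", "gc_write", or "gc_erase"
    page: int

    @property
    def priority(self) -> Priority:
        # Only valid-page writes are demoted, as the abstract states.
        return Priority.LOW if self.kind == "gc_write" else Priority.HIGH


def decompose_gc(valid_pages):
    """Split one GC operation into per-page read/write sub-requests plus an erase."""
    subs = []
    for page in valid_pages:
        subs.append(Request("gc_read", page))
        subs.append(Request("gc_write", page))
    subs.append(Request("gc_erase", -1))  # -1: the erase targets a block, not a page
    return subs


class PathArbiter:
    """Toy path reservation: a pool of paths, each held by at most one request."""

    def __init__(self, num_paths: int):
        self.holders = [None] * num_paths  # request currently holding each path
        self.retry_buffer = deque()        # GC writes waiting to retry

    def reserve(self, req: Request) -> bool:
        # 1. Take any free path.
        for i, holder in enumerate(self.holders):
            if holder is None:
                self.holders[i] = req
                return True
        # 2. A high-priority request may preempt a low-priority holder,
        #    which is then buffered for a later retry.
        if req.priority == Priority.HIGH:
            for i, holder in enumerate(self.holders):
                if holder.priority == Priority.LOW:
                    self.retry_buffer.append(holder)
                    self.holders[i] = req
                    return True
        # 3. A low-priority request that finds no path is buffered for retry.
        #    (A blocked high-priority request is simply left to its caller.)
        if req.priority == Priority.LOW:
            self.retry_buffer.append(req)
        return False


arbiter = PathArbiter(num_paths=1)
gc_write = Request("gc_write", 7)
user_io = Request("user_io", 3)
arbiter.reserve(gc_write)  # the GC write takes the only path
arbiter.reserve(user_io)   # user I/O preempts it; the GC write is buffered
```

In this sketch, a fair policy without priorities would correspond to dropping step 2, so the user I/O request would have to wait behind the GC write. How preempted transfers are resumed and when the buffer is drained are left out because the abstract does not specify them.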

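A design method that decouples GC from the main controller to suppress garbage-collection long-tail latency in solid-state drives

Shiqiang NIE¹, Jie NIU¹, Yingzhao SHAO², Xiaobo LI², Mingming ZHANG², Weiguo WU¹
¹School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
²Intelligent Computing Center, China Academy of Space Technology, Xi’an 710000, China
Abstract: NAND flash-based solid-state drives (SSDs) are widely used in data centers because of their high performance and low power consumption. However, the physical characteristics of flash memory give rise to garbage collection (GC) operations. Valid page migration during GC competes with user I/O requests for flash channel bandwidth and controller resources, causing path conflicts and long-tail latency. The existing Venice scheme introduces a low-cost interconnected network and uses a path reservation mechanism to give SSDs considerable path diversity. However, its fair scheduling policy does not differentiate priority between I/O requests and GC requests. This paper proposes a GC bypass mechanism that makes full use of Venice's path diversity while forcing GC requests to be transmitted through dedicated controllers. GC bypass decomposes GC requests into sub-requests and assigns low priority to valid page writes, so that high-priority operations, including user I/O, valid page reads, and block erases, can preempt paths reserved by low-priority requests; valid pages that fail to obtain a reserved path are temporarily buffered for retry. Experimental results show that, compared with Venice, GC bypass reduces the 99.99th-percentile long-tail latency by up to 25%. GC bypass effectively mitigates interference between critical I/O operations and background maintenance tasks while retaining the architectural benefits of path diversity.

Key words: Solid-state drive; NAND flash; Garbage collection; Interconnected network; Flash channel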


