Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2026 Vol.27 No.3 P.1-20
FTHOE: a Hamiltonian-driven fault-tolerant routing algorithm for wafer-scale interconnection networks
Abstract: As application scenarios grow more complex, wafer-scale systems place increasingly stringent reliability requirements on their interconnection networks. Process-induced manufacturing defects and environmental disturbances are unavoidable, so node and link faults occur frequently in these networks, which makes fault tolerance a key factor in overall system reliability. To handle chiplet node faults and link faults in wafer-scale interconnection networks, this paper proposes FTHOE, a load-balanced fault-tolerant routing algorithm that requires no virtual channels. FTHOE builds on a Hamiltonian routing strategy and the odd–even turn model. Using the local fault vector information available at the current node, it dynamically adjusts the priority of output port selection, which shortens detour paths around faulty regions and reduces the probability that packets become trapped in fault neighborhoods. FTHOE also retains the adaptiveness of Hamiltonian-based routing under fault conditions and thus preserves a relatively high degree of minimal-path diversity, which improves network load balancing and overall communication performance. Simulation results show that, compared with existing fault-tolerant routing algorithms, FTHOE significantly reduces average network latency and improves throughput, and that it remains robust in both fault tolerance and load balancing under complex fault scenarios.
Key words: Wafer-scale system; Fault-tolerant; Hamiltonian path; Odd–even turn model; Load balancing
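The abstract names two building blocks, a Hamiltonian routing strategy and the odd–even turn model. The sketch below illustrates only these two standard ingredients on a 2D mesh; it is not the FTHOE algorithm itself, whose fault-vector handling and port-priority rules are not given in this record. The function names, the (x, y) coordinate convention, and the snake-like row ordering are assumptions made for illustration.

```python
# Illustrative sketch only: generic building blocks, not the FTHOE algorithm.
# Assumed conventions: node (x, y), x = column, y = row, both starting at 0.

def hamiltonian_label(x: int, y: int, width: int) -> int:
    """Position of node (x, y) on a snake-like Hamiltonian path of a 2D mesh.

    Even rows are traversed left to right, odd rows right to left, so
    consecutive labels always belong to neighbouring nodes.
    """
    if y % 2 == 0:
        return y * width + x
    return y * width + (width - 1 - x)


# Turns prohibited by the odd-even turn model, keyed by column parity.
# A turn is written as (current travel direction, new travel direction).
FORBIDDEN_TURNS = {
    0: {("E", "N"), ("E", "S")},  # even column: no east-to-north/south turns
    1: {("N", "W"), ("S", "W")},  # odd column: no north/south-to-west turns
}


def turn_allowed(column: int, heading: str, new_heading: str) -> bool:
    """Return True if the 90-degree turn is permitted in this column."""
    return (heading, new_heading) not in FORBIDDEN_TURNS[column % 2]


if __name__ == "__main__":
    width, height = 4, 3
    for y in range(height):
        print([hamiltonian_label(x, y, width) for x in range(width)])
    print(turn_allowed(2, "E", "N"), turn_allowed(3, "E", "N"))
```

A fault-tolerant scheme of the kind described would additionally consult per-node fault information when ranking the permitted output ports; that step is omitted here because the record does not specify it.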
1 Information Engineering University, Zhengzhou 450001, China
2 Institute of Big Data, Fudan University, Shanghai 200433, China
DOI: 10.1631/ENG.ITEE.2025.0005
CLC number: TP393.03
On-line Access: 2026-03-23
Received: 2025-08-29
Revision Accepted: 2026-02-02
Crosschecked: 2026-03-23