On-line Access: 2024-11-05
Received: 2024-06-26
Revision Accepted: 2024-09-18
Jiajia JIAO, Ran WEN, Hong YANG. An end-to-end automatic methodology to accelerate the accuracy evaluation of DNN under hardware transient faults[J]. Frontiers of Information Technology & Electronic Engineering, 1998, -1(-1): .
@article{title="An end-to-end automatic methodology to accelerate the accuracy evaluation of DNN under hardware transient faults",
author="Jiajia JIAO, Ran WEN, Hong YANG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="-1",
number="-1",
pages="",
year="1998",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2400547"
}
Abstract: Hardware transient faults have been shown to significantly affect deep neural networks (DNNs), whose safety-critical misclassification (SCM) probabilities in autonomous vehicles, healthcare, and space applications can increase by up to 4x. However, evaluating this accuracy degradation with accurate fault injection is time-consuming, requiring several hours or even a couple of days on a complete simulation platform. To accelerate the evaluation of hardware transient faults on DNNs, we design a unified, end-to-end automatic methodology, A-Mean. It uses the silent data corruption (SDC) rates of basic operations (such as convolution, addition, multiplication, ReLU, and max pooling) together with a two-level mean mechanism to rapidly compute the overall SDC rate, from which it estimates both the general classification metric, accuracy, and the application-specific metric, SCM. More importantly, a max policy is used to determine the SDC boundary of non-sequential structures in DNNs. A worst-case scheme is then used to calculate the enlarged SCM and halved accuracy under transient faults by merging the static SDC results with the original data from a one-time dynamic fault-free execution. All of these steps are implemented automatically, so this easy-to-use tool can be employed for the prompt evaluation of transient faults on diverse DNNs. In addition, a novel metric, fault sensitivity, is defined to jointly characterize the higher SCM and lower accuracy induced by transient faults. Comparative results against a state-of-the-art fault injection method on five DNN models and four datasets show that the proposed estimation method, A-Mean, achieves up to 922.80x speedup, with just 4.20% SCM loss and 0.77% accuracy loss on average.
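The two-level mean and the max policy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that the first-level mean averages the SDC rates of the operations in a layer, that the second-level mean averages across layers, and that a non-sequential structure takes the maximum over its parallel branches. The per-operation SDC rates, the `OP_SDC` table, and the function names are all hypothetical.

```python
from statistics import mean

# Hypothetical per-operation SDC rates (illustrative values, not from the paper).
OP_SDC = {"conv": 0.02, "add": 0.01, "mul": 0.015, "relu": 0.005, "maxpool": 0.008}


def layer_sdc(ops):
    """Assumed first-level mean: average the SDC rates of one layer's operations."""
    return mean(OP_SDC[op] for op in ops)


def block_sdc(block):
    """SDC rate of one block.

    A block is either a layer (a list of operation names) or a non-sequential
    structure written as {"branches": [...]}, where each branch is a list of blocks.
    """
    if isinstance(block, dict):
        # Assumed max policy: the worst parallel branch bounds the structure.
        return max(network_sdc(branch) for branch in block["branches"])
    return layer_sdc(block)


def network_sdc(blocks):
    """Assumed second-level mean: average the SDC rates of consecutive blocks."""
    if not blocks:
        # An empty branch (e.g. an identity shortcut) contributes no operations.
        return 0.0
    return mean(block_sdc(b) for b in blocks)


# A toy model: one layer, a two-branch structure with a shortcut-like branch, one layer.
model = [
    ["conv", "relu"],
    {"branches": [[["conv", "relu"], ["conv"]], [["add"]]]},
    ["maxpool"],
]
overall = network_sdc(model)
print(f"estimated overall SDC rate: {overall:.5f}")
```

Because the estimate is built only from static per-operation rates, it needs no fault injection runs, which is the source of the speedup the abstract reports; how the resulting SDC rate is merged with fault-free execution data to obtain accuracy and SCM is not specified in the abstract and is left out of this sketch.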