On-line Access: 2022-11-16
Received: 2022-08-27
Revision Accepted: 2022-10-19
Jianbin FANG, Peng ZHANG, Chun HUANG, Tao TANG, Kai LU, Ruibo WANG, Zheng WANG. Programming bare-metal accelerators with heterogeneous threading models: a case study of Matrix-3000[J]. Frontiers of Information Technology & Electronic Engineering, 1998, -1(-1): .
@article{title="Programming bare-metal accelerators with heterogeneous threading models: a case study of Matrix-3000",
author="Jianbin FANG, Peng ZHANG, Chun HUANG, Tao TANG, Kai LU, Ruibo WANG, Zheng WANG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="-1",
number="-1",
pages="",
year="1998",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2200359"
}
%0 Journal Article
%T Programming bare-metal accelerators with heterogeneous threading models: a case study of Matrix-3000
%A Jianbin FANG
%A Peng ZHANG
%A Chun HUANG
%A Tao TANG
%A Kai LU
%A Ruibo WANG
%A Zheng WANG
%J Frontiers of Information Technology & Electronic Engineering
%V -1
%N -1
%P
%@ 2095-9184
%D 1998
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2200359
TY - JOUR
T1 - Programming bare-metal accelerators with heterogeneous threading models: a case study of Matrix-3000
A1 - Jianbin FANG
A1 - Peng ZHANG
A1 - Chun HUANG
A1 - Tao TANG
A1 - Kai LU
A1 - Ruibo WANG
A1 - Zheng WANG
J0 - Frontiers of Information Technology & Electronic Engineering
VL - -1
IS - -1
SP -
EP -
%@ 2095-9184
Y1 - 1998
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/FITEE.2200359
Abstract: As the hardware industry moves toward specialized heterogeneous many-cores to avoid the effects of the power wall, software developers are finding it hard to deal with the complexity of these systems. This article shares our experience of developing a programming model and its supporting compiler and libraries for Matrix-3000, which is designed for next-generation exascale supercomputers but has a complex memory hierarchy and processor organization. To assist its software development, we built a software stack from scratch that includes a low-level programming interface and a high-level OpenCL compiler. Our low-level programming model offers native programming support for the bare-metal accelerators of Matrix-3000, while the high-level model allows programmers to use the OpenCL programming standard. We detail our design choices and highlight the lessons learned from developing systems software to enable the programming of bare-metal accelerators. Our programming models have been deployed in the production environment of an exascale prototype system.