CLC number: TP314
Received: 2007-04-18
Revision Accepted: 2007-11-27
Jian-peng ZHOU, Ce SHI. Efficient SIMD optimization for media processors[J]. Journal of Zhejiang University Science A, 2008, 9(4): 524-530.
@article{title="Efficient SIMD optimization for media processors",
author="Jian-peng ZHOU, Ce SHI",
journal="Journal of Zhejiang University Science A",
volume="9",
number="4",
pages="524-530",
year="2008",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A071203"
}
%0 Journal Article
%T Efficient SIMD optimization for media processors
%A Jian-peng ZHOU
%A Ce SHI
%J Journal of Zhejiang University SCIENCE A
%V 9
%N 4
%P 524-530
%@ 1673-565X
%D 2008
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A071203
TY - JOUR
T1 - Efficient SIMD optimization for media processors
A1 - Jian-peng ZHOU
A1 - Ce SHI
JO - Journal of Zhejiang University Science A
VL - 9
IS - 4
SP - 524
EP - 530
SN - 1673-565X
Y1 - 2008
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A071203
ER -
Abstract: Single instruction multiple data (SIMD) instructions are often implemented in modern media processors. Although SIMD instructions are useful in multimedia applications, most compilers support them poorly. This paper focuses on SIMD instruction generation for media processors. We present an efficient code optimization approach that is integrated into a retargetable C compiler. SIMD instructions are generated by finding and combining identical operations in programs. Experimental results for the UltraSPARC VIS instruction set show a speedup factor of up to 2.639.
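To illustrate what combining identical operations means in practice, the following is a minimal sketch in portable C. It is not the paper's algorithm or its compiler output: the function names (`add4_scalar`, `pack4`, `unpack4`, `add4_packed`) are hypothetical, and the packed add is emulated with ordinary 64-bit integer arithmetic rather than a real SIMD instruction. It assumes four independent 16-bit additions that a compiler could merge into a single partitioned add on a 64-bit register.

```c
#include <stdint.h>

/* Scalar form: four identical, independent 16-bit additions.
   This is the kind of pattern a SIMD optimizer looks for. */
void add4_scalar(const uint16_t a[4], const uint16_t b[4], uint16_t out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = (uint16_t)(a[i] + b[i]);
}

/* Pack four 16-bit lanes into one 64-bit word (lane 0 in the low bits). */
uint64_t pack4(const uint16_t v[4]) {
    uint64_t w = 0;
    for (int i = 0; i < 4; i++)
        w |= (uint64_t)v[i] << (16 * i);
    return w;
}

/* Unpack a 64-bit word back into four 16-bit lanes. */
void unpack4(uint64_t w, uint16_t out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = (uint16_t)(w >> (16 * i));
}

/* Combined form: one partitioned add over all four lanes.
   The top bit of each lane is masked off before the add so that
   carries cannot cross lane boundaries, then restored with XOR.
   This emulates what a single packed-add instruction would do. */
uint64_t add4_packed(uint64_t x, uint64_t y) {
    const uint64_t H = 0x8000800080008000ULL; /* top bit of each lane */
    uint64_t low = (x & ~H) + (y & ~H);       /* add the low 15 bits per lane */
    return low ^ ((x ^ y) & H);               /* fold in the top bits */
}
```

Calling `unpack4(add4_packed(pack4(a), pack4(b)), p)` should give the same four results as `add4_scalar(a, b, s)`, including lanes that wrap around modulo 2^16. On a processor with a native partitioned add, the body of `add4_packed` would correspond to a single instruction.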