Bit-Level Optimization for High-Level Synthesis and FPGA-Based Acceleration

Jiyu Zhang1,  Zhiru Zhang2,  Sheng Zhou2,  Mingxing Tan1,  Xianhua Liu1,  Xu Cheng1,  Jason Cong3
1Peking University, 2AutoESL Design Technologies, Inc., 3University of California, Los Angeles


Abstract

Automated hardware design from behavior-level abstractions has drawn wide interest in the FPGA-based acceleration and configurable computing research communities. However, in many high-level programming languages, such as C/C++, bitwise access and computation cannot be described as directly as in hardware description languages, and high-level synthesis of algorithmic descriptions may generate suboptimal implementations for applications that are intensive in bitwise computation. In this paper we introduce a bit-level transformation and optimization approach to assist high-level synthesis of algorithmic descriptions. We introduce a bit-flow graph to capture bit-value information. Analysis and optimizing transformations are performed on this representation, and the optimized results are transformed back into standard data-flow graphs extended with a few instructions that represent bitwise access. This allows high-level synthesis tools to automatically generate higher-quality circuits. Experiments show that our algorithm reduces slice usage by 29.8% on average for a set of real-life benchmarks on Xilinx Virtex-4 FPGAs. At the same time, the clock period is reduced by 13.6% on average, with an 11.4% reduction in latency. In addition, the synthesized accelerator modules achieve up to a 2900X performance speedup over an embedded PowerPC microprocessor.
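
As an illustration of the gap between C/C++ and hardware description languages noted above, the following minimal sketch reverses the bit order of a 32-bit word in C. It is not taken from the paper's benchmarks, and the function name bit_reverse32 is hypothetical. In hardware, bit reversal is a fixed permutation of wires and needs no logic, whereas the C description expresses it as a loop of shifts, masks, and ORs that a synthesis tool must recognize and simplify.

```c
#include <stdint.h>

/* Hypothetical helper (not from the paper): reverse the bit order of a
 * 32-bit word, so that input bit i becomes output bit 31 - i. */
static uint32_t bit_reverse32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);  /* append the current lowest bit of x to r */
        x >>= 1;                  /* move on to the next bit of x */
    }
    return r;
}
```

For example, bit_reverse32(0x00000001u) returns 0x80000000u. Synthesized operation by operation, the loop body yields shifters, AND gates, and OR gates, which is the kind of overhead that a bit-level analysis could plausibly reduce to simple wiring.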