Designing Application-Specific Approximate Operators for Energy-Efficient AI Accelerators
Description
A plethora of recent works has focused on optimization techniques that reduce the overall computational complexity and memory footprint of machine learning (ML) models so that they can be deployed on resource-constrained embedded systems. These techniques mainly exploit the inherent error resilience of ML models to introduce deliberate approximations at various layers of the computation stack. Because multiply-accumulate (MAC) is the most commonly used operation in ML models, most related state-of-the-art works have proposed approximate architectures for multiplication and addition. However, most of these works lack a consistent design methodology. Furthermore, the approximate arithmetic operators are designed without considering an application's accuracy-performance constraints; such an application-agnostic design methodology can yield approximate operators that fail to satisfy those constraints. In this session, we will focus on a framework for designing application-specific approximate arithmetic operators for FPGA-based systems. It combines circuit-level modeling of the 6-input lookup tables (LUTs) and associated carry chains of modern FPGAs with novel design space exploration methods to design approximate operators that leverage the inherent robustness of ML applications. For various benchmark applications, the framework reports more non-dominated approximate operators, with better hypervolume contribution, than state-of-the-art designs.
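To make the idea of an approximate arithmetic operator concrete, the sketch below shows one classic approximation style: a truncated multiplier that drops low-order operand bits to save logic. This is purely illustrative of the general concept; the truncation width `t` and the function names are hypothetical and are not part of the framework presented in the session.

```python
# Illustrative sketch only: a truncated (lower-bit-dropping) approximate
# multiplier, one of many possible approximations of the MAC datapath.
# `t` is a hypothetical knob trading accuracy for hardware cost.

def truncated_mul(a: int, b: int, t: int = 4) -> int:
    """Approximate a * b by zeroing the t least-significant bits of each operand."""
    mask = ~((1 << t) - 1)
    return (a & mask) * (b & mask)

def relative_error(a: int, b: int, t: int = 4) -> float:
    """Relative error of the approximation versus the exact product."""
    exact = a * b
    return abs(exact - truncated_mul(a, b, t)) / exact if exact else 0.0
```

For large operands the relative error stays small, which is why error-resilient ML workloads can often tolerate such operators; an application-aware framework searches over many such design knobs to find operators that meet a given accuracy-performance target.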
Time
Wednesday, June 28, 15:00 - 15:30 CEST
Computer Science, Machine Learning, and Applied Mathematics