The file type is application/pdf.
Sample-Wise Dynamic Precision Quantization for Neural Network Acceleration
2022
IEICE Electronics Express
Quantization is a well-known method for compressing and accelerating deep neural networks (DNNs). In this work, we propose the Sample-Wise Dynamic Precision (SWDP) quantization scheme, which switches the bit-width of weights and activations in the model at runtime according to the task difficulty of each input sample. Using low-precision networks for easy input images improves computational and energy efficiency. We also propose an adaptive hardware design for the efficient
doi:10.1587/elex.19.20220229
fatcat:j3prleu5wjhdfoy5ldew4ygmcq
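
The idea summarized in the abstract, choosing a bit-width per input sample at runtime, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy linear classifier, the symmetric uniform quantizer, the use of softmax confidence as a stand-in for "task difficulty", and the names `quantize`, `forward`, `dynamic_precision_predict`, `low_bits`, `high_bits`, and `threshold` are all hypothetical. The abstract does not say how SWDP estimates difficulty.

```python
import numpy as np

def quantize(x, bits):
    """Symmetric uniform fake-quantization of x to the given bit-width."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4 bits, 127 for 8 bits
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()                   # nothing to scale
    scale = max_abs / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def softmax(z):
    e = np.exp(z - np.max(z))             # subtract max for numerical stability
    return e / e.sum()

def forward(x, weights, bits):
    """Toy linear classifier with weights and activations quantized to `bits`."""
    logits = quantize(weights, bits) @ quantize(x, bits)
    return softmax(logits)

def dynamic_precision_predict(x, weights, low_bits=4, high_bits=8, threshold=0.8):
    """Hypothetical policy (an assumption, not the paper's method):
    run at low precision first and escalate only if confidence is low."""
    probs = forward(x, weights, low_bits)
    if probs.max() >= threshold:
        return int(probs.argmax()), low_bits     # "easy" sample
    probs = forward(x, weights, high_bits)       # "hard" sample: retry at high precision
    return int(probs.argmax()), high_bits

if __name__ == "__main__":
    w = np.eye(3)
    print(dynamic_precision_predict(np.array([5.0, 0.0, 0.0]), w))  # clear-cut input
    print(dynamic_precision_predict(np.array([1.0, 0.9, 0.0]), w))  # ambiguous input
```

In this sketch a clear-cut input stays on the cheap low-precision path, while an ambiguous one is re-evaluated at higher precision. A real design would likely avoid the second full pass, for example by predicting difficulty up front, but the abstract gives no detail on that point.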