A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks
[article]
2022
arXiv
pre-print
Quantized neural networks typically require a smaller memory footprint and lower computational complexity, which is crucial for efficient deployment. However, quantization inevitably leads to a distribution divergence from the original network, which generally degrades performance. Considerable effort has been made to tackle this issue, but most existing approaches lack statistical considerations and depend on several manual configurations. In this paper, we present an adaptive-mapping
arXiv:2112.15139v4
fatcat:q4hc7zq2obf3bcvaysii2vizoi