HAWQV3: Dyadic Neural Network Quantization
[article]
2021
arXiv
pre-print
Current low-precision quantization algorithms often have the hidden cost of conversion back and forth from floating point to quantized integer values. ...
This hidden cost limits the latency improvement realized by quantizing Neural Networks. To address this, we present HAWQV3, a novel mixed-precision integer-only quantization framework. ...
Adaptive quantization for deep neural network. arXiv preprint arXiv:1712.01048, 2017b. ...
arXiv:2011.10680v3
fatcat:xjbfg4cpqrbc7bp2fbztj5jyea
HAWQ-V3: Dyadic Neural Network Quantization
2021
International Conference on Machine Learning
Current low-precision quantization algorithms often have the hidden cost of conversion back and forth from floating point to quantized integer values. ...
This hidden cost limits the latency improvement realized by quantizing Neural Networks. To address this, we present HAWQ-V3, a novel mixed-precision integer-only quantization framework. ...
Over the past decade, we have observed significant improvements in the accuracy of Neural Networks (NNs) for various tasks. ...
dblp:conf/icml/YaoDZGYTW0WMK21
fatcat:q76iqcffj5hmhhtrqqd3py7r3e
A Survey of Quantization Methods for Efficient Neural Network Inference
[article]
2021
arXiv
pre-print
In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. ...
Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. ...
Dyadic quantization is another class of integer-only quantization, where all the scaling is performed with dyadic numbers, which are rational numbers with integer values in their numerator and a power ...
arXiv:2103.13630v3
fatcat:5274u5yy65ch7erdt3waqz4di4
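The dyadic scaling mentioned in the survey snippet above can be illustrated with a small sketch. A dyadic rational has the form b / 2^c with integer b and c, so multiplying by it needs only an integer multiply and a bit shift. The function names `to_dyadic` and `dyadic_rescale` and the 16-bit default shift are assumptions chosen for illustration. They are not taken from the survey or from the HAWQ-V3 code.

```python
from typing import Tuple


def to_dyadic(scale: float, shift_bits: int = 16) -> Tuple[int, int]:
    """Approximate a positive real scale by b / 2**shift_bits with integer b.

    shift_bits is an illustrative default, not a value from the cited papers.
    """
    b = round(scale * (1 << shift_bits))
    return b, shift_bits


def dyadic_rescale(acc: int, b: int, c: int) -> int:
    """Compute approximately acc * b / 2**c using integer operations only.

    Adding 2**(c - 1) before the shift rounds to nearest instead of
    truncating. Requires c >= 1.
    """
    return (acc * b + (1 << (c - 1))) >> c


# Rescale an integer accumulator by a real-valued factor of about 0.0123
# without any floating-point arithmetic at inference time.
b, c = to_dyadic(0.0123)
print(dyadic_rescale(1000, b, c))  # prints 12
```

The floating-point multiplication in `to_dyadic` happens once, offline. Only `dyadic_rescale` would run per value during inference, and it stays in integer arithmetic throughout.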
Applications and Techniques for Fast Machine Learning in Science
2022
Frontiers in Big Data
HAWQV3: Dyadic neural network quantization. arXiv preprint arXiv:2011.10680. Yao, Z., Gholami, A., Keutzer, K., and Mahoney, M. (2019). Pyhessian: Neural networks through the lens of the hessian. ...
A survey of quantization methods for efficient neural network inference. ...
doi:10.3389/fdata.2022.787421
pmid:35496379
pmcid:PMC9041419
fatcat:5w2exf7vvrfvnhln7nj5uppjga