Distribution Adaptive INT8 Quantization for Training CNNs
[article]
2021
arXiv
pre-print
In this paper, we propose a novel INT8 quantization training framework for convolutional neural networks to address the above issues. ...
Experimental results on a broad range of computer vision tasks, such as image classification, object detection, and video classification, demonstrate that the proposed Distribution Adaptive INT8 Quantization ...
Adaptive INT8 Quantization for convolution layers. ...
arXiv:2102.04782v1
fatcat:wf4c73uofnditfaxcnsjlrm5ky
Fixed-Point Back-Propagation Training
2020
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Therefore, by keeping the data distribution stable through a layer-wise precision-adaptive quantization, we are able to directly train deep neural networks using low bit-width fixed-point data and achieve ...
In this paper, we propose a novel training approach, which applies a layer-wise precision-adaptive quantization in deep neural networks. ...
[Table excerpt comparing training schemes and accuracy loss: float32 with 2.9% loss (AlexNet); [36] int8 with 4% loss (AlexNet); [1] int16/float32 with < 1% loss (ResNet50); [7] int16 with < 1% (ResNet50) and 2% (Translation); the proposed Adaptive Fixed-Point scheme uses int8∼16 (CNN), int8 ...]
doi:10.1109/cvpr42600.2020.00240
dblp:conf/cvpr/ZhangLZLHZGGDZC20
fatcat:lbgxpo7xlfgctnxt5m5y3oajim
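Editor's note: the snippet above describes keeping each layer's data distribution stable by adapting the fixed-point precision per layer, but the selection rule itself is not quoted. The following is a minimal sketch of the general idea under stated assumptions: the candidate bit-widths, the relative-error metric, and the tolerance are illustrative choices, not taken from the paper, and the function names are hypothetical.

    import numpy as np

    def quantize_symmetric(x, bits):
        """Uniform symmetric fixed-point quantization of a tensor to `bits` bits."""
        qmax = 2 ** (bits - 1) - 1
        amax = float(np.max(np.abs(x)))
        scale = amax / qmax if amax > 0 else 1.0
        q = np.clip(np.round(x / scale), -qmax - 1, qmax)
        return q * scale  # dequantized values, used only to measure the error

    def pick_layer_bitwidth(x, candidate_bits=(8, 12, 16), tolerance=1e-2):
        """Pick the smallest bit-width whose relative quantization error is below
        `tolerance`. Illustrative criterion only; the paper's layer-wise rule may differ."""
        x_norm = np.linalg.norm(x) + 1e-12
        for bits in candidate_bits:
            err = np.linalg.norm(x - quantize_symmetric(x, bits)) / x_norm
            if err < tolerance:
                return bits
        return candidate_bits[-1]

    # Example: a layer with heavy-tailed values may need more than 8 bits.
    rng = np.random.default_rng(0)
    layer_acts = rng.standard_t(df=3, size=10_000).astype(np.float32)
    print(pick_layer_bitwidth(layer_acts))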
Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
[article]
2020
arXiv
pre-print
Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers. ...
Recently emerged quantization techniques have been applied to the inference of deep neural networks for fast and efficient execution. ...
Therefore, the bit-width requirement needs to be measured dynamically for different networks and tasks. [Table excerpt: a prior int16 scheme with < 1% loss (ResNet50) and 2% (Translation); the proposed Adaptive Precision scheme uses int8∼16 (CNN) and int8∼24 (RNN) ...]
arXiv:1911.00361v2
fatcat:z4nppbliirhune4zeplzabrttu
Towards Unified INT8 Training for Convolutional Neural Network
2020
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Then, we theoretically give an in-depth analysis of the convergence bound and derive two principles for stable INT8 training. ...
We believe that this pioneering study will help lead the community towards a fully unified INT8 training for convolutional neural networks. ...
average time for a round of INT8 training. ...
doi:10.1109/cvpr42600.2020.00204
dblp:conf/cvpr/ZhuGYLWLYY20
fatcat:7ujbnvuumrbp5ogz7vgxrxukl4
CoopNet: Cooperative Convolutional Neural Network for Low-Power MCUs
[article]
2019
arXiv
pre-print
Fixed-point quantization and binarization are two reduction methods adopted to deploy Convolutional Neural Networks (CNN) on end-nodes powered by low-power micro-controller units (MCUs). ...
designs where quantization and binarization are applied separately. ...
Fixed-Point Quantization: While CNN training is usually run using a 32-bit floating-point representation, recent studies, e.g. ...
arXiv:1911.08606v2
fatcat:n4xgvllrd5ccropx7mmirdbdau
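Editor's note: the CoopNet snippet contrasts float32 training with fixed-point deployment on low-power MCUs. As a generic illustration of fixed-point quantization (not CoopNet's specific scheme), the sketch below converts float weights to a signed 8-bit Q2.5-style format; the word and fraction widths are arbitrary choices for the example.

    import numpy as np

    def to_fixed_point(x, frac_bits=5, word_bits=8):
        """Quantize float values to a signed fixed-point format with `word_bits`
        total bits and `frac_bits` fractional bits; codes are integers scaled
        by 2**frac_bits."""
        scale = 2 ** frac_bits
        qmin, qmax = -(2 ** (word_bits - 1)), 2 ** (word_bits - 1) - 1
        return np.clip(np.round(x * scale), qmin, qmax).astype(np.int8)

    def from_fixed_point(q, frac_bits=5):
        return q.astype(np.float32) / (2 ** frac_bits)

    w = np.array([0.31, -1.2, 0.007, 2.9], dtype=np.float32)
    q = to_fixed_point(w)        # int8 codes: [10, -38, 0, 93]
    print(from_fixed_point(q))   # -> [0.3125, -1.1875, 0.0, 2.90625]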
Towards Unified INT8 Training for Convolutional Neural Network
[article]
2019
arXiv
pre-print
Then, we theoretically give an in-depth analysis of the convergence bound and derive two principles for stable INT8 training. ...
We believe that this pioneering study will help lead the community towards a fully unified INT8 training for convolutional neural networks. ...
Table 3. Ablation study on clipping method for INT8 training. ...
arXiv:1912.12607v1
fatcat:di6pjnz7prei3krgot2cqiln7e
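Editor's note: the ablation referenced above concerns how values are clipped before being quantized to INT8 during training. The paper's derived clipping rule is not shown in the snippet; the sketch below assumes a simple percentile-based clip of the gradients followed by symmetric INT8 quantization, purely as an illustration (the percentile value and function name are assumptions).

    import numpy as np

    def clip_and_quantize_grad(g, clip_percentile=99.9):
        """Clip gradients to a percentile-based range, then quantize symmetrically
        to INT8. Percentile clipping is an illustrative choice, not the paper's rule."""
        c = max(float(np.percentile(np.abs(g), clip_percentile)), 1e-12)
        g_clipped = np.clip(g, -c, c)
        scale = c / 127.0
        q = np.round(g_clipped / scale).astype(np.int8)  # INT8 codes
        return q, scale                                  # dequantize with q * scale

    rng = np.random.default_rng(1)
    g = rng.normal(scale=1e-3, size=4096).astype(np.float32)
    q, s = clip_and_quantize_grad(g)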
Training Deep Neural Network in Limited Precision
[article]
2018
arXiv
pre-print
Extensive experiments on various network architectures and benchmarks verify the effectiveness of the proposed technique for low-precision training. ...
We also proposed a simple guideline to help select the appropriate bit-width for the last FC layer followed by a softmax nonlinearity layer. ...
Figure 2b shows the distribution of s, y, and ∂L/∂s at an early stage of training and their quantized levels. ...
arXiv:1810.05486v1
fatcat:bedhcu447zcm7gaxluzfhe3agm
Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation
[article]
2020
arXiv
pre-print
We also present a workflow for 8-bit quantization that is able to maintain accuracy within 1% of the floating-point baseline on all networks studied, including models that are more difficult to quantize ...
In this paper we review the mathematical aspects of quantization parameters and evaluate their choices on a wide range of neural network models for different application domains, including vision, speech ...
to adapt to the quantized weights and activations. ...
arXiv:2004.09602v1
fatcat:ykqrhfoa7zdqbjj7n6pd3l5u2i
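Editor's note: the entry above evaluates choices of quantization parameters for 8-bit inference. As one concrete, commonly used choice (not necessarily the exact workflow studied in the paper), the sketch below derives a per-tensor symmetric INT8 scale from a calibration set via max calibration; the function names are illustrative.

    import numpy as np

    def calibrate_scale_max(calib_batches):
        """Per-tensor symmetric INT8 scale from max-absolute-value calibration."""
        amax = max(float(np.max(np.abs(b))) for b in calib_batches)
        return max(amax, 1e-12) / 127.0

    def quantize_int8(x, scale):
        return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

    rng = np.random.default_rng(2)
    calib = [rng.normal(size=(32, 64)).astype(np.float32) for _ in range(8)]
    scale = calibrate_scale_max(calib)
    x_q = quantize_int8(calib[0], scale)      # int8 codes
    x_hat = x_q.astype(np.float32) * scale    # dequantized approximation

Percentile or entropy-based calibration would replace the max reduction; restricting codes to [-127, 127] keeps the mapping symmetric about zero.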
Efficient Integer-Arithmetic-Only Convolutional Neural Networks
[article]
2020
arXiv
pre-print
Based on the proposed method, our trained 8-bit integer ResNet outperforms the 8-bit networks of Google's TensorFlow and NVIDIA's TensorRT for image recognition. ...
We analyze this phenomenon and find that the decline is due to activation quantization. ...
range for the vast majority of a normal distribution. ...
arXiv:2006.11735v1
fatcat:pw6br3es4beylbsediumduupsq
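Editor's note: the snippet above traces the accuracy drop to activation quantization and to choosing a range that covers the vast majority of a roughly normal activation distribution. One simple way to realize that, shown here as an assumption rather than the paper's rule, is to clip at a few standard deviations around the mean before uniform quantization.

    import numpy as np

    def quantize_activations_sigma_clip(a, n_sigma=3.0, bits=8):
        """Clip activations to mean ± n_sigma * std (covering ~99.7% of a normal
        distribution for n_sigma=3), then quantize uniformly to `bits` bits.
        The 3-sigma clip is an illustrative assumption, not the paper's method."""
        lo = float(a.mean() - n_sigma * a.std())
        hi = float(a.mean() + n_sigma * a.std())
        levels = 2 ** bits - 1
        scale = (hi - lo) / levels if hi > lo else 1.0
        q = np.clip(np.round((a - lo) / scale), 0, levels).astype(np.uint8)
        return q, scale, lo      # dequantize with q * scale + lo

    rng = np.random.default_rng(3)
    acts = np.maximum(rng.normal(size=10_000).astype(np.float32), 0.0)  # ReLU-like
    q, scale, lo = quantize_activations_sigma_clip(acts)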
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks
[article]
2020
arXiv
pre-print
We propose a method of training quantization thresholds (TQT) for uniform symmetric quantizers using standard backpropagation and gradient descent. ...
We present analytical support for the general robustness of our methods and empirically validate them on various CNNs for ImageNet classification. ...
However the quantization thresholds are typically fixed after initial calibration, leading to (a) lack of ability to adapt to changing weight and activation distributions during training, and (b) calibration ...
arXiv:1903.08066v3
fatcat:jrqqnpmekjb2dav3hsh36a7ctu
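Editor's note: TQT trains the clipping thresholds of uniform symmetric quantizers with standard backpropagation. The PyTorch sketch below captures that idea under simplifying assumptions: a per-tensor scale, straight-through rounding, and a plain learned log-threshold rather than the paper's log2-domain, power-of-two-scale formulation; the class name is hypothetical.

    import torch
    import torch.nn as nn

    class LearnedThresholdQuantizer(nn.Module):
        """Uniform symmetric quantizer with a trainable clipping threshold.
        Rounding uses a straight-through estimator so gradients reach both the
        input and the threshold. Simplified relative to the TQT paper."""

        def __init__(self, bits=8, init_threshold=1.0):
            super().__init__()
            self.qmax = 2 ** (bits - 1) - 1
            self.log_t = nn.Parameter(torch.tensor(float(init_threshold)).log())

        def forward(self, x):
            t = self.log_t.exp()                        # positive clipping threshold
            scale = t / self.qmax
            x_clipped = torch.min(torch.max(x, -t), t)  # clip to [-t, t]
            q = torch.round(x_clipped / scale)
            # straight-through estimator: forward uses the rounded value,
            # backward treats rounding as the identity
            q = (q - x_clipped / scale).detach() + x_clipped / scale
            return q * scale

    quant = LearnedThresholdQuantizer()
    x = torch.randn(16, 32, requires_grad=True)
    quant(x).sum().backward()   # gradients flow to both x and the threshold parameter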
A Quantized CNN-Based Microfluidic Lensless-Sensing Mobile Blood-Acquisition and Analysis System
2019
Sensors
For a better tradeoff between accuracy and hardware cost, an integer-only quantization algorithm is proposed. ...
We designed a CNN accelerator architecture for the integer-only quantization algorithm and the dual configuration register group and implemented them on field-programmable gate arrays (FPGAs). ...
To verify the benefits of using the integer quantization structure for the CNN accelerator hardware circuit, we also designed fp16 and int16 precision circuits with the same parallelism as our int8 quantization ...
doi:10.3390/s19235103
pmid:31766471
pmcid:PMC6928811
fatcat:epjmifmhdfewzeu3sabpisqgc4
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
[article]
2022
arXiv
pre-print
To adopt convolutional neural networks (CNN) for a range of resource-constrained targets, it is necessary to compress the CNN models by performing quantization, whereby precision representation is converted ...
In addition, to compensate for the accuracy drop without retraining, previous studies on post-training quantization have proposed several complementary methods: calibration, schemes, clipping, granularity ...
This scheme adaptively switches the quantization method depending on the distribution of real values. ...
arXiv:2202.05048v1
fatcat:di2u46sh75dsnkmk6afnppzaxi
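Editor's note: the snippet above says Quantune switches the quantization scheme adaptively depending on the distribution of the real values. A minimal dispatcher in that spirit might look like the sketch below; the symmetry test, its tolerance, and the function name are illustrative assumptions, not Quantune's actual decision rule.

    import numpy as np

    def choose_scheme(x, symmetry_tol=0.1):
        """Pick symmetric quantization when the value range is roughly centred on
        zero, otherwise asymmetric (affine). The tolerance is an illustrative guess."""
        lo, hi = float(x.min()), float(x.max())
        spread = max(abs(lo), abs(hi), 1e-12)
        if abs(abs(lo) - abs(hi)) / spread < symmetry_tol:
            scale = spread / 127.0
            return "symmetric", scale, 0
        scale = max(hi - lo, 1e-12) / 255.0
        zero_point = int(round(-lo / scale))
        return "asymmetric", scale, zero_point

    rng = np.random.default_rng(4)
    print(choose_scheme(rng.normal(size=1000)))          # zero-centred values
    print(choose_scheme(np.abs(rng.normal(size=1000))))  # one-sided values, e.g. post-ReLU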
Is Integer Arithmetic Enough for Deep Learning Training?
[article]
2022
arXiv
pre-print
, distribution adjustment, or gradient clipping. ...
As such, quantization has attracted the attention of researchers in recent years. ...
Acknowledgments and Disclosure of Funding The authors would like to thank Richard Wu and Vanessa Courville for their constructive comments. ...
arXiv:2207.08822v1
fatcat:3lzsckkui5cihep3mtjocycmeu
AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks
[article]
2020
arXiv
pre-print
The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN. ...
In this paper, we first define two quantitative metrics, i.e., the clipping error and the rounding error, to analyze the quantization error distribution. ...
For example, all the floating-point weights are quantized to an integer in [0, 255] in INT8 quantization [11] . ...
arXiv:2007.03903v1
fatcat:qhxdwtpu7be23kliotwlszfrnq
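Editor's note: reference [11] in the snippet above maps floating-point weights to integers in [0, 255]. Independent of AUSN's superposition scheme, a short worked example of that affine (asymmetric) 8-bit mapping is sketched below; the function name is illustrative.

    import numpy as np

    def affine_quantize_uint8(w):
        """Map floating-point values to integers in [0, 255] with an affine
        (scale, zero_point) transform, as in standard asymmetric 8-bit quantization."""
        lo, hi = float(w.min()), float(w.max())
        scale = max(hi - lo, 1e-12) / 255.0
        zero_point = int(round(-lo / scale))
        q = np.clip(np.round(w / scale) + zero_point, 0, 255).astype(np.uint8)
        return q, scale, zero_point

    w = np.array([-0.50, -0.10, 0.00, 0.25, 0.75], dtype=np.float32)
    q, scale, zp = affine_quantize_uint8(w)
    # scale = 1.25/255 ≈ 0.0049, zero_point = 102
    # q ≈ [0, 82, 102, 153, 255]; dequantize with (q - zp) * scale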
Neural Network Compression Framework for fast model inference
[article]
2020
arXiv
pre-print
The framework can be used within the training samples, which are supplied with it, or as a standalone package that can be seamlessly integrated into the existing training code with minimal adaptations. ...
In this work we present a new framework for neural networks compression with fine-tuning, which we called Neural Network Compression Framework (NNCF). ...
As a result, we were able to train INT8-quantized and INT8-quantized+sparse object detection models available in mmdetection ... [Table caption fragment: box mAP values for models trained and tested on the COCO dataset.]
arXiv:2002.08679v4
fatcat:5syyycecjfbptnberplaxa3nha
Showing results 1 — 15 out of 217 results