217 Hits in 2.5 sec

Distribution Adaptive INT8 Quantization for Training CNNs [article]

Kang Zhao, Sida Huang, Pan Pan, Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu
2021 arXiv   pre-print
In this paper, we propose a novel INT8 quantization training framework for convolutional neural networks to address the above issues.  ...  Experimental results on a broad range of computer vision tasks, such as image classification, object detection and video classification, demonstrate that the proposed Distribution Adaptive INT8 Quantization  ...  Adaptive INT8 Quantization for the convolution layer.  ... 
arXiv:2102.04782v1 fatcat:wf4c73uofnditfaxcnsjlrm5ky
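
This entry proposes an INT8 training scheme whose quantizer adapts to the layer's data distribution. For context, a minimal sketch of the generic symmetric per-tensor INT8 mapping such schemes build on (plain NumPy; the max-based scale and function names are illustrative assumptions, not the paper's distribution-adaptive method):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map floats to [-127, 127]."""
    scale = max(np.abs(x).max() / 127.0, 1e-8)   # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the INT8 codes."""
    return q.astype(np.float32) * scale

# Example: quantize a conv weight tensor of shape (out_ch, in_ch, kH, kW)
w = np.random.randn(64, 32, 3, 3).astype(np.float32)
q, s = quantize_int8(w)
print("mean abs quantization error:", np.abs(dequantize_int8(q, s) - w).mean())
```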

Fixed-Point Back-Propagation Training

Xishan Zhang, Shaoli Liu, Rui Zhang, Chang Liu, Di Huang, Shiyi Zhou, Jiaming Guo, Qi Guo, Zidong Du, Tian Zhi, Yunji Chen
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Therefore, by keeping the data distribution stable through a layer-wise precision-adaptive quantization, we are able to directly train deep neural networks using low bit-width fixed-point data and achieve  ...  In this paper, we propose a novel training approach, which applies a layer-wise precision-adaptive quantization in deep neural networks.  ...  [Table fragment comparing fixed-point training methods: a prior float32-based method with 2.9% accuracy loss (AlexNet); [36] int8, 4% loss (AlexNet); [1] int16/float32 and [7] int16, <1% loss (ResNet50), 2% (Translation); the proposed Adaptive Fixed-Point scheme uses int8∼16 for CNNs.]  ... 
doi:10.1109/cvpr42600.2020.00240 dblp:conf/cvpr/ZhangLZLHZGGDZC20 fatcat:lbgxpo7xlfgctnxt5m5y3oajim
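
This paper (and its arXiv preprint listed next) selects the fixed-point bit width per layer so that the quantized data distribution stays close to the original. A minimal sketch of that idea, assuming a mean-relative-error criterion and a fixed tolerance, neither of which is taken from the paper:

```python
import numpy as np

def quantize_fixed_point(x: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric fixed-point fake-quantization at the given bit width."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(np.abs(x).max() / qmax, 1e-8)
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def pick_bit_width(x: np.ndarray, tol: float = 0.01, candidates=(8, 12, 16)) -> int:
    """Pick the smallest candidate bit width whose relative error stays below tol."""
    for bits in candidates:
        err = np.abs(quantize_fixed_point(x, bits) - x).mean() / (np.abs(x).mean() + 1e-8)
        if err < tol:
            return bits
    return candidates[-1]

# Example: tensors with different distributions may need different precision
smooth_grad = np.random.randn(1000) * 1e-3
spiky_grad  = np.concatenate([np.random.randn(990) * 1e-4, np.random.randn(10)])
print(pick_bit_width(smooth_grad), pick_bit_width(spiky_grad))
```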

Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers [article]

Xishan Zhang, Shaoli Liu, Rui Zhang, Chang Liu, Di Huang, Shiyi Zhou, Jiaming Guo, Yu Kang, Qi Guo, Zidong Du, Yunji Chen
2020 arXiv   pre-print
Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers.  ...  Recently emerged quantization techniques have been applied to the inference of deep neural networks for fast and efficient execution.  ...  Therefore, it is necessary to dynamically measure the bit-width requirement for different networks and tasks. [Table fragment: <1% loss (ResNet50), 2% (Translation); the proposed Adaptive Precision scheme uses int8∼16 (CNN) and int8∼24 (RNN).]  ... 
arXiv:1911.00361v2 fatcat:z4nppbliirhune4zeplzabrttu

Towards Unified INT8 Training for Convolutional Neural Network

Feng Zhu, Ruihao Gong, Fengwei Yu, Xianglong Liu, Yanfei Wang, Zhelong Li, Xiuqi Yang, Junjie Yan
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Then, we theoretically give an in-depth analysis of the convergence bound and derive two principles for stable INT8 training.  ...  We believe that this pioneering study will help lead the community towards a fully unified INT8 training for convolutional neural networks.  ...  average time for a round of INT8 training.  ... 
doi:10.1109/cvpr42600.2020.00204 dblp:conf/cvpr/ZhuGYLWLYY20 fatcat:7ujbnvuumrbp5ogz7vgxrxukl4

CoopNet: Cooperative Convolutional Neural Network for Low-Power MCUs [article]

Luca Mocerino, Andrea Calimera
2019 arXiv   pre-print
Fixed-point quantization and binarization are two reduction methods adopted to deploy Convolutional Neural Networks (CNNs) on end-nodes powered by low-power micro-controller units (MCUs).  ...  designs where quantization and binarization are applied separately.  ...  Fixed-Point Quantization: While CNN training is usually run using a 32-bit floating-point representation, recent studies, e.g.  ... 
arXiv:1911.08606v2 fatcat:n4xgvllrd5ccropx7mmirdbdau

Towards Unified INT8 Training for Convolutional Neural Network [article]

Feng Zhu, Ruihao Gong, Fengwei Yu, Xianglong Liu, Yanfei Wang, Zhelong Li, Xiuqi Yang, Junjie Yan
2019 arXiv   pre-print
Then, we theoretically give an in-depth analysis of the convergence bound and derive two principles for stable INT8 training.  ...  We believe that this pioneering study will help lead the community towards a fully unified INT8 training for convolutional neural networks.  ...  Table 3. Ablation study on clipping method for INT8 training.  ... 
arXiv:1912.12607v1 fatcat:di6pjnz7prei3krgot2cqiln7e
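
Both versions of this paper derive principles for stable INT8 training and ablate gradient-clipping methods. The sketch below illustrates the general idea of clipping gradient outliers before choosing the INT8 scale; the percentile-based rule is an illustrative stand-in for the clipping criteria derived in the paper:

```python
import numpy as np

def clip_then_quantize_grad(g: np.ndarray, clip_pct: float = 99.9):
    """Clip extreme gradient values, then quantize the clipped range to INT8."""
    clip_val = np.percentile(np.abs(g), clip_pct)   # ignore rare outliers
    g_clipped = np.clip(g, -clip_val, clip_val)
    scale = max(clip_val / 127.0, 1e-8)
    q = np.clip(np.round(g_clipped / scale), -127, 127).astype(np.int8)
    return q, scale

# Example: a heavy-tailed gradient tensor, as often seen early in training
g = np.random.standard_cauchy(10000).astype(np.float32) * 1e-3
q, s = clip_then_quantize_grad(g)
print("scale:", s, "int8 range used:", int(q.min()), int(q.max()))
```

Without the clipping step, a single outlier would inflate the scale and collapse most gradient values onto a few integer levels.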

Training Deep Neural Network in Limited Precision [article]

Hyunsun Park, Jun Haeng Lee, Youngmin Oh, Sangwon Ha, Seungwon Lee
2018 arXiv   pre-print
Extensive experiments on various network architectures and benchmarks verify the effectiveness of the proposed technique for low precision training.  ...  We also propose a simple guideline to help select the appropriate bit-width for the last FC layer followed by a softmax nonlinearity layer.  ...  Figure 2b shows the distribution of s, y, and ∂L/∂s at an early stage of training and their quantized levels.  ... 
arXiv:1810.05486v1 fatcat:bedhcu447zcm7gaxluzfhe3agm

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation [article]

Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, Paulius Micikevicius
2020 arXiv   pre-print
We also present a workflow for 8-bit quantization that is able to maintain accuracy within 1% of the floating-point baseline on all networks studied, including models that are more difficult to quantize  ...  In this paper we review the mathematical aspects of quantization parameters and evaluate their choices on a wide range of neural network models for different application domains, including vision, speech  ...  to adapt to the quantized weights and activations.  ... 
arXiv:2004.09602v1 fatcat:ykqrhfoa7zdqbjj7n6pd3l5u2i
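
This review evaluates choices of quantization parameters, in particular how the clipping range is calibrated. The sketch below contrasts two common calibration rules, absolute maximum versus a high percentile; the 99.99th-percentile setting is an illustrative assumption rather than the paper's recommendation:

```python
import numpy as np

def calibrate_max(activations: np.ndarray) -> float:
    """Range calibration using the largest absolute activation observed."""
    return float(np.abs(activations).max())

def calibrate_percentile(activations: np.ndarray, pct: float = 99.99) -> float:
    """Range calibration that discards the most extreme outliers."""
    return float(np.percentile(np.abs(activations), pct))

# Example: calibration data with a few large outliers
acts = np.abs(np.random.randn(100_000).astype(np.float32))
acts[:5] = 40.0                                   # rare spikes
t_max, t_pct = calibrate_max(acts), calibrate_percentile(acts)
print(f"max calibration:        threshold {t_max:.2f}, int8 step {t_max / 127:.4f}")
print(f"percentile calibration: threshold {t_pct:.2f}, int8 step {t_pct / 127:.4f}")
```

A smaller calibrated threshold gives finer resolution for the bulk of the values at the cost of clipping the outliers.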

Efficient Integer-Arithmetic-Only Convolutional Neural Networks [article]

Hengrui Zhao and Dong Liu and Houqiang Li
2020 arXiv   pre-print
Based on the proposed method, our trained 8-bit integer ResNet outperforms the 8-bit networks of Google's TensorFlow and NVIDIA's TensorRT for image recognition.  ...  We analyze this phenomenon and find that the decline is due to activation quantization.  ...  range for the vast majority of a normal distribution.  ... 
arXiv:2006.11735v1 fatcat:pw6br3es4beylbsediumduupsq
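
The last fragment above refers to a clipping range that covers the vast majority of a normally distributed tensor. A quick worked check of how much of a Gaussian lies within a ±k·σ clipping range (the multiples of σ are chosen only for illustration):

```python
from math import erf, sqrt

# Fraction of a zero-mean Gaussian inside the symmetric range [-k*sigma, k*sigma]
for k in (2, 3, 4):
    coverage = erf(k / sqrt(2.0))
    print(f"clip at {k} sigma -> keeps {coverage * 100:.2f}% of values unclipped")
```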

Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks [article]

Sambhav R. Jain, Albert Gural, Michael Wu, Chris H. Dick
2020 arXiv   pre-print
We propose a method of training quantization thresholds (TQT) for uniform symmetric quantizers using standard backpropagation and gradient descent.  ...  We present analytical support for the general robustness of our methods and empirically validate them on various CNNs for ImageNet classification.  ...  However the quantization thresholds are typically fixed after initial calibration, leading to (a) lack of ability to adapt to changing weight and activation distributions during training, and (b) calibration  ... 
arXiv:1903.08066v3 fatcat:jrqqnpmekjb2dav3hsh36a7ctu
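
This paper trains the quantization thresholds themselves with standard backpropagation. The PyTorch sketch below shows the general mechanism of a learnable log2-threshold combined with a straight-through estimator for rounding; it is a simplified stand-in and does not reproduce TQT's exact gradient formulation or its power-of-two scaling constraint:

```python
import torch

class LearnedThresholdQuantizer(torch.nn.Module):
    """Uniform symmetric fake-quantizer whose clipping threshold is trained."""
    def __init__(self, init_threshold: float = 1.0, bits: int = 8):
        super().__init__()
        # train log2(t) so the threshold stays positive and scales multiplicatively
        self.log2_t = torch.nn.Parameter(torch.log2(torch.tensor(init_threshold)))
        self.qmax = 2 ** (bits - 1) - 1          # 127 for int8

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = 2.0 ** self.log2_t                   # clipping threshold
        scale = t / self.qmax
        x_clip = torch.minimum(torch.maximum(x, -t), t)   # differentiable w.r.t. t
        x_q = torch.round(x_clip / scale) * scale
        # straight-through estimator: treat rounding as identity in the backward pass
        return x_clip + (x_q - x_clip).detach()

# Example: the threshold receives gradients like any other parameter
quant = LearnedThresholdQuantizer(init_threshold=1.0)
x = torch.randn(1024)
loss = (quant(x) - x).pow(2).mean()              # proxy quantization loss
loss.backward()
print("d(loss)/d(log2_t):", quant.log2_t.grad.item())
```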

A Quantized CNN-Based Microfluidic Lensless-Sensing Mobile Blood-Acquisition and Analysis System

Liao, Yu, Tian, Li, Li
2019 Sensors  
For a better tradeoff between accuracy and hardware cost, an integer-only quantization algorithm is proposed.  ...  We designed a CNN accelerator architecture for the integer-only quantization algorithm and the dual configuration register group and implemented them in field-programmable gate arrays (FPGA).  ...  To verify the benefits of using the integer quantization structure for the CNN accelerator hardware circuit, we also designed fp16 and int16 precision circuits with the same parallelism as our int8 quantization  ... 
doi:10.3390/s19235103 pmid:31766471 pmcid:PMC6928811 fatcat:epjmifmhdfewzeu3sabpisqgc4

Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment [article]

Jemin Lee, Misun Yu, Yongin Kwon, Teaho Kim
2022 arXiv   pre-print
To adopt convolutional neural networks (CNN) for a range of resource-constrained targets, it is necessary to compress the CNN models by performing quantization, whereby precision representation is converted  ...  In addition, to compensate for the accuracy drop without retraining, previous studies on post-training quantization have proposed several complementary methods: calibration, schemes, clipping, granularity  ...  This scheme adaptively switches the quantization method depending on the distribution of real values.  ... 
arXiv:2202.05048v1 fatcat:di2u46sh75dsnkmk6afnppzaxi
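
The last fragment notes that the scheme switches the quantization method according to the distribution of real values. Quantune actually learns this choice with gradient-boosted trees over many configurations; the rule below is only a hand-written illustration of distribution-dependent scheme selection and is not taken from the paper:

```python
import numpy as np

def choose_scheme(x: np.ndarray) -> str:
    """Hypothetical rule: prefer asymmetric quantization when the range is lopsided."""
    lo, hi = float(x.min()), float(x.max())
    imbalance = abs(abs(hi) - abs(lo)) / max(abs(hi), abs(lo), 1e-8)
    return "asymmetric uint8" if imbalance > 0.25 else "symmetric int8"

print(choose_scheme(np.random.randn(10_000)))                    # weights: roughly symmetric
print(choose_scheme(np.maximum(np.random.randn(10_000), 0.0)))   # post-ReLU activations: one-sided
```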

Is Integer Arithmetic Enough for Deep Learning Training? [article]

Alireza Ghaffari, Marzieh S. Tahaei, Mohammadreza Tayaranian, Masoud Asgharian, Vahid Partovi Nia
2022 arXiv   pre-print
, distribution adjustment, or gradient clipping.  ...  As such, quantization has attracted the attention of researchers in recent years.  ...  Acknowledgments and Disclosure of Funding The authors would like to thank Richard Wu and Vanessa Courville for their constructive comments.  ... 
arXiv:2207.08822v1 fatcat:3lzsckkui5cihep3mtjocycmeu

AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks [article]

Liu Fangxin, Zhao Wenbo, Wang Yanzhi, Dai Changzhi, Jiang Li
2020 arXiv   pre-print
The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN.  ...  In this paper, we first define two quantitative metrics, i.e., the clipping error and the rounding error, to analyze the quantization error distribution.  ...  For example, all the floating-point weights are quantized to an integer in [0, 255] in INT8 quantization [11].  ... 
arXiv:2007.03903v1 fatcat:qhxdwtpu7be23kliotwlszfrnq
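
The last fragment recalls that plain INT8 quantization maps all floating-point weights to integers in [0, 255]. A minimal sketch of that standard asymmetric (affine) mapping with a zero point, i.e. the baseline that AUSN approximates rather than AUSN's superposition scheme itself:

```python
import numpy as np

def affine_quantize_uint8(x: np.ndarray):
    """Asymmetric (affine) quantization: map [min(x), max(x)] onto [0, 255]."""
    x_min, x_max = float(x.min()), float(x.max())
    scale = max((x_max - x_min) / 255.0, 1e-8)
    zero_point = int(round(-x_min / scale))          # integer code representing 0.0
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def affine_dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    return (q.astype(np.float32) - zero_point) * scale

# Example: weights with an asymmetric range use the full [0, 255] grid
w = np.random.uniform(-0.3, 1.0, size=1000).astype(np.float32)
q, s, z = affine_quantize_uint8(w)
print("zero point:", z, "max abs error:", np.abs(affine_dequantize(q, s, z) - w).max())
```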

Neural Network Compression Framework for fast model inference [article]

Alexander Kozlov and Ivan Lazarevich and Vasily Shamporov and Nikolay Lyalyushkin and Yury Gorbachev
2020 arXiv   pre-print
The framework can be used within the training samples, which are supplied with it, or as a standalone package that can be seamlessly integrated into the existing training code with minimal adaptations.  ...  In this work we present a new framework for neural network compression with fine-tuning, which we call the Neural Network Compression Framework (NNCF).  ...  As a result, we were able to train INT8-quantized and INT8-quantized+sparse object detection models available in mmdetection (box mAP values reported for models trained and tested on the COCO dataset).  ... 
arXiv:2002.08679v4 fatcat:5syyycecjfbptnberplaxa3nha
Showing results 1 — 15 out of 217 results