Quantization of Weights of Neural Networks with Negligible Decreasing of Prediction Accuracy

Zoran Peric, Bojan Denic, Milan Savic, Milan Dincic, Darko Mihajlov
2021 Information Technology and Control  
The impact of weights compression on the NN (neural network) performance is analyzed, indicating good matching with the theoretical results and showing negligible decreasing of the prediction accuracy  ...  Quantization and compression of neural network parameters using the uniform scalar quantization is carried out in this paper.  ...  Acknowledgement This work has been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia and by the Science Fund of the Republic of Serbia (Grant No. 6527104  ... 
doi:10.5755/j01.itc.50.3.28468 fatcat:kiemvc7llna6rjtuvx2ri6amwq

SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization

Shijie Cao, Lingxiao Ma, Wencong Xiao, Chen Zhang, Yunxin Liu, Lintao Zhang, Lanshun Nie, Zhi Yang
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
negligible accuracy drop compared with the original network.  ...  We experimentally demonstrate that a highly quantized version of the original network is sufficient in predicting the output sparsity accurately, and verify that leveraging such sparsity in inference incurs  ...  a negligible drop of model accuracy.  ... 
doi:10.1109/cvpr.2019.01147 dblp:conf/cvpr/CaoMXZLZNY19 fatcat:sl5jrlgzx5g4tlgyh6jrzv3api

Compressing Neural Networks With Inter Prediction and Linear Transformation

Kang-Ho Lee, Sung-Ho Bae
2021 IEEE Access  
QUANTIZATION: Quantization reduces the representation bits of original weights in neural networks. [13] proposed a weight quantization using weight discretization in neural networks.  ...  INTER-LAYER KERNEL PREDICTION: As shown in Table 1, the proposed ILKP decreases the sizes of all the test models more than 4× with negligible performance drop compared to the baseline models.  ... 
doi:10.1109/access.2021.3077596 fatcat:smb4ig3hgzds5ff2w4e3ic2wee

Joint Pruning Quantization for Extremely Sparse Neural Networks [article]

Po-Hsiang Yu, Sih-Sian Wu, Jan P. Klopp, Liang-Gee Chen, Shao-Yi Chien
2020 arXiv   pre-print
We investigate pruning and quantization for deep neural networks.  ...  In addition, to compare with other works, we demonstrate that our pruning stage alone beats the state-of-the-art when applied to ResNet on CIFAR10 and ImageNet.  ...  Figure 10: Pruning Results of Different Threshold of (a) the accuracy and (b) the achieved weight sparsity. Figure 11: Accuracy comparison with different weight bits with various stereo neural networks  ... 
arXiv:2010.01892v1 fatcat:y6fyzyx6wnehhisxzlbw3m7zda

Weight-Quantized SqueezeNet for Resource-Constrained Robot Vacuums for Indoor Obstacle Classification

Qian Huang
2022 AI  
With the rapid development of artificial intelligence (AI) theory, particularly deep learning neural networks, robot vacuums equipped with AI power can automatically clean indoor floors by using intelligent  ...  In this work, we propose a weight-quantized SqueezeNet model for robot vacuums.  ...  Conflicts of Interest: The author declares no conflict of interest.  ... 
doi:10.3390/ai3010011 fatcat:6yv34bhlsngq5a6nl3crtsz6e4

Non-Volatile Memory Array Based Quantization- and Noise-Resilient LSTM Neural Networks

Wen Ma, Pi-Feng Chiu, Won Ho Choi, Minghai Qin, Daniel Bedau, Martin Lueker-Boden
2019 2019 IEEE International Conference on Rebooting Computing (ICRC)  
Reasonable levels of ADC quantization noise and weight noise can be naturally tolerated within our NVM-based quantized LSTM network.  ...  Long short-term memory (LSTM) neural networks have been widely used for natural language processing, time series prediction and many other sequential data tasks.  ...  Fig. 9. Comparison of our NVM-based quantized LSTM neural network with other hardware platforms. TABLE I.  ... 
doi:10.1109/icrc.2019.8914713 dblp:conf/icrc/MaCCQBL19 fatcat:fxp3gkrhpfd45gz5iiemswgsuq

A Neural Network Based ECG Classification Processor with Exploitation of Heartbeat Similarity

Jiaquan Wu, Feiteng Li, Zhijian Chen, Yu Pu, Mengyuan Zhan
2019 IEEE Access  
A lightweight classification algorithm that integrates both bi-directional long short-term memory (BLSTM) and convolutional neural networks (CNN) is proposed to deliver high accuracy with minimal network  ...  This paper presents a neural network based processor with improved computation efficiency, which aims at multiclass heartbeat recognition in wearable devices.  ...  The results show that the proposed adaptive-grained method with 32 quantitative values achieves a high similarity of 49.1% with negligible accuracy loss owing to the error tolerant ability of neural networks  ... 
doi:10.1109/access.2019.2956179 fatcat:nomq2ps4jvcmfo5u2lo2maacpu

DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients [article]

Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou
2018 arXiv   pre-print
We propose DoReFa-Net, a method to train convolutional neural networks that have low bitwidth weights and activations using low bitwidth parameter gradients.  ...  Moreover, as bit convolutions can be efficiently implemented on CPU, FPGA, ASIC and GPU, DoReFa-Net opens the way to accelerate training of low bitwidth neural network on these hardware.  ...  neural network with bit convolution kernel.  ... 
arXiv:1606.06160v3 fatcat:5hbcfuhwz5cvrhq7uwuzuklaie

AdaBits: Neural Network Quantization with Adaptive Bit-Widths [article]

Qing Jin, Linjie Yang, Zhenyu Liao
2020 arXiv   pre-print
Deep neural networks with adaptive configurations have gained increasing attention due to the instant and flexible deployment of these models on platforms with different resource budgets.  ...  neural networks, offering a distinct opportunity for improved accuracy-efficiency trade-off as well as instant adaptation according to the platform constraints in real-world applications.  ...  The authors would like to appreciate invaluable discussion with Professor Hao Chen from University of California Davis and Professor Yi Ma from University of California Berkeley.  ... 
arXiv:1912.09666v2 fatcat:xpo4gkvcnfbydlyvj5g5pgbepq

ECQ^x: Explainability-Driven Quantization for Low-Bit and Sparse DNNs [article]

Daniel Becking, Maximilian Dreyer, Wojciech Samek, Karsten Müller, Sebastian Lapuschkin
2022 arXiv   pre-print
The remarkable success of deep neural networks (DNNs) in various applications is accompanied by a significant increase in network parameters and arithmetic operations.  ...  The ultimate goal is to preserve the most relevant weights in quantization clusters of highest information content.  ...  The ECQ^x 4-bit quantization achieves a compression ratio for VGG16 of 103× with a negligible drop in accuracy of −0.1%.  ... 
arXiv:2109.04236v2 fatcat:xgrrejz33zfnpjclq4olzhix7i

ProgressiveNN: Achieving Computational Scalability with Dynamic Bit-Precision Adjustment by MSB-first Accumulative Computation

Junnosuke Suzuki, Tomohiro Kaneko, Kota Ando, Kazutoshi Hirose, Kazushi Kawamura, Thiem Van Chu, Masato Motomura, Jaehoon Yu
2021 International Journal of Networking and Computing  
The evaluation result indicates that the accuracy increases by 1.3% with an average bit-length of 2 compared with only the 2-bit BWB network.  ...  It also shows that BN retraining suppresses accuracy degradation of training performed with low computational cost and restores inference accuracy to 65% at 1-bit width inference.  ...  Ternary weight networks [6] introduce ternary weights with zero added to improve accuracy.  ... 
doi:10.15803/ijnc.11.2_338 fatcat:r5jmpw2vdfcgfhpevh6rr7oaza

Compressing Deep Convolutional Networks using Vector Quantization [article]

Yunchao Gong and Liu Liu and Ming Yang and Lubomir Bourdev
2014 arXiv   pre-print
For the 1000-category classification task in the ImageNet challenge, we are able to achieve 16-24 times compression of the network with only 1% loss of classification accuracy using the state-of-the-art  ...  Simply applying k-means clustering to the weights or conducting product quantization can lead to a very good balance between model size and recognition accuracy.  ...  decrease of accuracy.  ... 
arXiv:1412.6115v1 fatcat:qmfcwljfjjaubmfjw3mxgegn2y

Building A Size Constrained Predictive Models for Video Classification [chapter]

Miha Skalic, David Austin
2019 Lecture Notes in Computer Science  
Our final solution consists of several submodels belonging to Fisher vectors, NetVlad, Deep Bag of Frames and Recurrent neural networks model families.  ...  To make the classifier efficient under size constraints we introduced model distillation, partial weights quantization and training with exponential moving average.  ...  To minimize drop of accuracy we used partial weights quantization, where only variables with more than 17,000 elements were quantized.  ... 
doi:10.1007/978-3-030-11018-5_27 fatcat:ryt34btiqzcbthvnimohfa5u4q

HAO: Hardware-aware neural Architecture Optimization for Efficient Inference [article]

Zhen Dong, Yizhao Gao, Qijing Huang, John Wawrzynek, Hayden K.H. So, Kurt Keutzer
2021 arXiv   pre-print
With low computational cost, our algorithm can generate quantized networks that achieve state-of-the-art accuracy and hardware performance on Xilinx Zynq (ZU3EG) FPGA for image classification on ImageNet  ...  We use an accuracy predictor for different DNN subgraphs with different quantization schemes and generate accuracy-latency pareto frontiers.  ...  HAO can balance the efficiency and perturbation, and we observe that the 8-bit counterpart of HAO 6/7-bit result runs 5% slower with negligible accuracy gain.  ... 
arXiv:2104.12766v1 fatcat:wvpt6sil4zhf5dknqhv5zj76lu

Binary Quantization Analysis of Neural Networks Weights on MNIST Dataset

Zoran H. Peric, Bojan D. Denic, Milan S. Savic, Nikola J. Vucic, Nikola B. Simic
2021 Elektronika ir Elektrotechnika  
This paper considers the design of a binary scalar quantizer of Laplacian source and its application in compressed neural networks.  ...  Binary quantizers are further implemented for compressing neural network weights and its performance is analysed for a simple classification task.  ...  DESIGN OF BINARY QUANTIZER FOR THE REFERENCE VARIANCE: Let us consider a symmetrical binary (N = 2 levels) scalar quantizer presented  ... 
doi:10.5755/j02.eie.28881 fatcat:bl77womljnh3ph6u3v6ihp2szm