Quantization of Weights of Neural Networks with Negligible Decreasing of Prediction Accuracy
2021
Information Technology and Control
The impact of weight compression on the NN (neural network) performance is analyzed, indicating good agreement with the theoretical results and showing a negligible decrease in prediction accuracy ...
Quantization and compression of neural network parameters using uniform scalar quantization are carried out in this paper. ...
Acknowledgement This work has been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia and by the Science Fund of the Republic of Serbia (Grant No. 6527104 ...
doi:10.5755/j01.itc.50.3.28468
fatcat:kiemvc7llna6rjtuvx2ri6amwq
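The first entry's method is uniform scalar quantization of network weights. As a rough illustration only (not the paper's exact design: the bit width and the symmetric clipping range max|w| are assumptions), a NumPy sketch of such a quantizer:

```python
import numpy as np

def uniform_quantize(weights, num_bits=8):
    """Symmetric uniform scalar quantization of a weight array.

    Each weight is mapped to the nearest of 2**num_bits equally spaced
    levels on [-w_max, w_max]. Using w_max = max|w| as the support is an
    illustrative assumption, not the cited paper's exact design.
    """
    w_max = np.max(np.abs(weights))
    if w_max == 0:
        return weights.copy()
    num_levels = 2 ** num_bits
    step = 2 * w_max / (num_levels - 1)           # quantization step size
    codes = np.round((weights + w_max) / step)    # integer codes 0 .. num_levels-1
    return codes * step - w_max                   # dequantized weights

# Example: 8-bit quantization of a random weight matrix
w = np.random.randn(128, 64).astype(np.float32)
w_q = uniform_quantize(w, num_bits=8)
print("max abs error:", np.max(np.abs(w - w_q)))  # bounded by step / 2
```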
SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
negligible accuracy drop compared with the original network. ...
We experimentally demonstrate that a highly quantized version of the original network is sufficient for predicting the output sparsity accurately, and verify that leveraging such sparsity in inference incurs a negligible drop in model accuracy. ...
doi:10.1109/cvpr.2019.01147
dblp:conf/cvpr/CaoMXZLZNY19
fatcat:sl5jrlgzx5g4tlgyh6jrzv3api
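As the SeerNet snippets describe it, a heavily quantized copy of a layer is evaluated first to predict which outputs survive the ReLU, and full-precision compute is then spent only on that predicted support. A hypothetical NumPy sketch of this two-pass idea for a fully connected layer, with an assumed 4-bit predictor and column-level masking for simplicity:

```python
import numpy as np

def low_bit(x, num_bits=4):
    """Crude symmetric quantizer used only for the cheap prediction pass."""
    m = np.max(np.abs(x))
    if m == 0:
        return x
    scale = m / (2 ** (num_bits - 1) - 1)
    return np.round(x / scale) * scale

def sparsity_predicted_layer(x, W, b):
    """Two-pass fully connected layer with ReLU, in the spirit of SeerNet.

    Pass 1: a low-bit matmul predicts which outputs survive the ReLU.
    Pass 2: full precision is spent only on output units predicted active
    (whole columns here, for simplicity of the NumPy indexing).
    """
    y_low = low_bit(x) @ low_bit(W) + b            # cheap prediction pass
    active = np.where((y_low > 0).any(axis=0))[0]  # units predicted non-zero

    y = np.zeros_like(y_low)
    y[:, active] = x @ W[:, active] + b[active]    # exact compute on the support
    return np.maximum(y, 0.0)

# Example
x = np.random.randn(32, 256)
W, b = np.random.randn(256, 100), np.random.randn(100)
out = sparsity_predicted_layer(x, W, b)
```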
Compressing Neural Networks With Inter Prediction and Linear Transformation
2021
IEEE Access
QUANTIZATION: Quantization reduces the number of bits used to represent the original weights in neural networks. [13] proposed weight quantization using weight discretization in neural networks. ...
INTER-LAYER KERNEL PREDICTION: As shown in Table 1, the proposed ILKP reduces the size of all test models by more than 4× with a negligible performance drop compared to the baseline models. ...
doi:10.1109/access.2021.3077596
fatcat:smb4ig3hgzds5ff2w4e3ic2wee
Joint Pruning Quantization for Extremely Sparse Neural Networks
[article]
2020
arXiv
pre-print
We investigate pruning and quantization for deep neural networks. ...
In addition, to compare with other works, we demonstrate that our pruning stage alone beats the state-of-the-art when applied to ResNet on CIFAR10 and ImageNet. ...
Figure 10: Pruning results for different thresholds: (a) accuracy and (b) achieved weight sparsity. Figure 11: Accuracy comparison for different weight bit widths across various stereo neural networks ...
arXiv:2010.01892v1
fatcat:y6fyzyx6wnehhisxzlbw3m7zda
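The joint pruning-and-quantization entry reports results across pruning thresholds and weight bit widths. A generic sequential sketch, assuming simple magnitude pruning followed by uniform quantization of the surviving weights (the cited paper optimizes the two stages jointly rather than one after the other):

```python
import numpy as np

def prune_then_quantize(weights, threshold=0.05, num_bits=4):
    """Magnitude pruning followed by uniform quantization of surviving weights.

    Sequential illustration only: weights below the magnitude threshold are
    zeroed, the rest are uniformly quantized. The threshold and bit width
    are arbitrary; the cited paper treats pruning and quantization jointly.
    """
    mask = np.abs(weights) >= threshold
    pruned = weights * mask

    w_max = np.max(np.abs(pruned))
    if w_max == 0:
        return pruned, mask
    step = 2 * w_max / (2 ** num_bits - 1)
    quantized = np.round(pruned / step) * step
    return quantized * mask, mask                 # keep pruned positions exactly zero

# Example
w = 0.1 * np.random.randn(256, 256)
w_pq, mask = prune_then_quantize(w)
print("weight sparsity:", 1.0 - mask.mean())
```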
Weight-Quantized SqueezeNet for Resource-Constrained Robot Vacuums for Indoor Obstacle Classification
2022
AI
With the rapid development of artificial intelligence (AI) theory, particularly deep neural networks, robot vacuums equipped with AI can automatically clean indoor floors by using intelligent ...
In this work, we propose a weight-quantized SqueezeNet model for robot vacuums. ...
Conflicts of Interest: The author declares no conflict of interest. ...
doi:10.3390/ai3010011
fatcat:6yv34bhlsngq5a6nl3crtsz6e4
Non-Volatile Memory Array Based Quantization- and Noise-Resilient LSTM Neural Networks
2019
2019 IEEE International Conference on Rebooting Computing (ICRC)
Reasonable levels of ADC quantization noise and weight noise can be naturally tolerated within our NVM-based quantized LSTM network. ...
Long short-term memory (LSTM) neural networks have been widely used for natural language processing, time series prediction and many other sequential data tasks. ...
Fig. 9. Comparison of our NVM-based quantized LSTM neural network with other hardware platforms.
doi:10.1109/icrc.2019.8914713
dblp:conf/icrc/MaCCQBL19
fatcat:fxp3gkrhpfd45gz5iiemswgsuq
A Neural Network Based ECG Classification Processor with Exploitation of Heartbeat Similarity
2019
IEEE Access
A lightweight classification algorithm that integrates both bi-directional long short-term memory (BLSTM) and convolutional neural networks (CNN) is proposed to deliver high accuracy with minimal network ...
This paper presents a neural network based processor with improved computation efficiency, which aims at multiclass heartbeat recognition in wearable devices. ...
The results show that the proposed adaptive-grained method with 32 quantitative values achieves a high similarity of 49.1% with negligible accuracy loss, owing to the error-tolerant ability of neural networks ...
doi:10.1109/access.2019.2956179
fatcat:nomq2ps4jvcmfo5u2lo2maacpu
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
[article]
2018
arXiv
pre-print
We propose DoReFa-Net, a method to train convolutional neural networks that have low bitwidth weights and activations using low bitwidth parameter gradients. ...
Moreover, as bit convolutions can be efficiently implemented on CPU, FPGA, ASIC and GPU, DoReFa-Net opens the way to accelerating the training of low-bitwidth neural networks on such hardware. ...
neural network with bit convolution kernel. ...
arXiv:1606.06160v3
fatcat:5hbcfuhwz5cvrhq7uwuzuklaie
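The DoReFa-Net entry quantizes weights, activations, and gradients to low bit widths. Below is a NumPy sketch of the paper's k-bit weight quantizer, forward pass only; the straight-through gradient estimator used during training is omitted:

```python
import numpy as np

def quantize_k(r, k):
    """The quantize_k operator: maps r in [0, 1] onto 2**k uniform levels."""
    n = 2 ** k - 1
    return np.round(r * n) / n

def dorefa_weight_quantize(w, k):
    """k-bit weight quantization of DoReFa-Net (forward pass only).

    Weights are squashed with tanh, affinely mapped into [0, 1], quantized
    with quantize_k, and mapped back to [-1, 1]. The straight-through
    estimator used for gradients during training is not shown here.
    """
    t = np.tanh(w)
    r = t / (2 * np.max(np.abs(t))) + 0.5
    return 2 * quantize_k(r, k) - 1

# Example: 2-bit weights take at most 4 distinct values
w = np.random.randn(64, 64)
print(np.unique(dorefa_weight_quantize(w, k=2)))
```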
AdaBits: Neural Network Quantization with Adaptive Bit-Widths
[article]
2020
arXiv
pre-print
Deep neural networks with adaptive configurations have gained increasing attention due to the instant and flexible deployment of these models on platforms with different resource budgets. ...
neural networks, offering a distinct opportunity for improved accuracy-efficiency trade-off as well as instant adaptation according to the platform constraints in real-world applications. ...
The authors would like to thank Professor Hao Chen from the University of California, Davis and Professor Yi Ma from the University of California, Berkeley for invaluable discussions. ...
arXiv:1912.09666v2
fatcat:xpo4gkvcnfbydlyvj5g5pgbepq
ECQ^x: Explainability-Driven Quantization for Low-Bit and Sparse DNNs
[article]
2022
arXiv
pre-print
The remarkable success of deep neural networks (DNNs) in various applications is accompanied by a significant increase in network parameters and arithmetic operations. ...
The ultimate goal is to preserve the most relevant weights in quantization clusters of highest information content. ...
The ECQ^x 4-bit quantization achieves a compression ratio of 103× for VGG16 with a negligible accuracy drop of −0.1%. ...
arXiv:2109.04236v2
fatcat:xgrrejz33zfnpjclq4olzhix7i
ProgressiveNN: Achieving Computational Scalability with Dynamic Bit-Precision Adjustment by MSB-first Accumulative Computation
2021
International Journal of Networking and Computing
The evaluation result indicates that the accuracy increases by 1.3% with an average bit-length of 2 compared with only the 2-bit BWB network. ...
It also shows that BN retraining suppresses the accuracy degradation of training performed at low computational cost and restores inference accuracy to 65% for 1-bit-width inference. ...
Ternary weight networks [6] introduce ternary weights with zero added to improve accuracy. ...
doi:10.15803/ijnc.11.2_338
fatcat:r5jmpw2vdfcgfhpevh6rr7oaza
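The ProgressiveNN snippet mentions ternary weight networks, which constrain weights to {-α, 0, +α}. A sketch in that spirit, assuming the common threshold heuristic of 0.7 × mean(|w|) and a single per-tensor scale α:

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Ternary weight quantization in the spirit of ternary weight networks.

    Weights become {-alpha, 0, +alpha}: values inside the threshold band are
    zeroed, the rest keep their sign and share one scale alpha. The
    0.7 * mean(|w|) threshold is the usual TWN heuristic, assumed here.
    """
    delta = delta_factor * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask

# Example: three distinct values remain
w = np.random.randn(512)
print(np.unique(ternarize(w)))
```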
Compressing Deep Convolutional Networks using Vector Quantization
[article]
2014
arXiv
pre-print
For the 1000-category classification task in the ImageNet challenge, we are able to achieve 16-24 times compression of the network with only 1% loss of classification accuracy using the state-of-the-art ...
Simply applying k-means clustering to the weights or conducting product quantization can lead to a very good balance between model size and recognition accuracy. ...
decrease of accuracy. ...
arXiv:1412.6115v1
fatcat:qmfcwljfjjaubmfjw3mxgegn2y
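This entry compresses networks by clustering weight values (k-means or product quantization) so that only cluster indices and a small codebook are stored. A plain Lloyd's-iteration sketch of scalar k-means weight sharing, with an arbitrary codebook size of 16:

```python
import numpy as np

def kmeans_weight_sharing(w, num_clusters=16, num_iters=20):
    """Scalar k-means weight sharing: each weight becomes its cluster centroid.

    The compressed model stores log2(num_clusters)-bit indices plus the small
    codebook of centroids. Plain Lloyd iterations are used here; the cited
    paper also covers product quantization of weight sub-vectors.
    """
    flat = w.ravel()
    centroids = np.linspace(flat.min(), flat.max(), num_clusters)  # init codebook
    for _ in range(num_iters):
        labels = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for c in range(num_clusters):
            members = flat[labels == c]
            if members.size:
                centroids[c] = members.mean()
    labels = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids[labels].reshape(w.shape), centroids, labels

# Example: a 16-entry codebook leaves at most 16 distinct weight values
w = np.random.randn(256, 128).astype(np.float32)
w_shared, codebook, idx = kmeans_weight_sharing(w)
print("distinct values:", np.unique(w_shared).size)
```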
Building A Size Constrained Predictive Models for Video Classification
[chapter]
2019
Lecture Notes in Computer Science
Our final solution consists of several submodels belonging to Fisher vectors, NetVlad, Deep Bag of Frames and Recurrent neural networks model families. ...
To make the classifier efficient under size constraints we introduced model distillation, partial weights quantization and training with exponential moving average. ...
To minimize the drop in accuracy we used partial weight quantization, where only variables with more than 17,000 elements were quantized. ...
doi:10.1007/978-3-030-11018-5_27
fatcat:ryt34btiqzcbthvnimohfa5u4q
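The partial weight quantization described above quantizes only tensors large enough to matter for model size, leaving small tensors such as biases in full precision. A hypothetical sketch of that size-gated rule over a dict of parameter arrays, using an assumed 8-bit uniform quantizer as a stand-in (the parameter names are made up for illustration):

```python
import numpy as np

def quantize_tensor(w, num_bits=8):
    """Assumed stand-in: symmetric uniform quantization to num_bits."""
    w_max = np.max(np.abs(w))
    if w_max == 0:
        return w
    step = 2 * w_max / (2 ** num_bits - 1)
    return np.round(w / step) * step

def partial_quantize(params, min_elements=17_000, num_bits=8):
    """Quantize only parameters with more than `min_elements` entries.

    Small tensors (biases, normalization parameters) stay in full precision,
    limiting the accuracy impact while still shrinking most of the model.
    """
    return {name: quantize_tensor(w, num_bits) if w.size > min_elements else w
            for name, w in params.items()}

# Example (hypothetical parameter names)
params = {"conv1/kernel": np.random.randn(3, 3, 64, 128),  # 73,728 elements -> quantized
          "conv1/bias": np.random.randn(128)}              # 128 elements    -> untouched
compressed = partial_quantize(params)
```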
HAO: Hardware-aware neural Architecture Optimization for Efficient Inference
[article]
2021
arXiv
pre-print
With low computational cost, our algorithm can generate quantized networks that achieve state-of-the-art accuracy and hardware performance on Xilinx Zynq (ZU3EG) FPGA for image classification on ImageNet ...
We use an accuracy predictor for different DNN subgraphs with different quantization schemes and generate accuracy-latency pareto frontiers. ...
HAO can balance efficiency and perturbation, and we observe that the 8-bit counterpart of the HAO 6/7-bit result runs 5% slower with a negligible accuracy gain. ...
arXiv:2104.12766v1
fatcat:wvpt6sil4zhf5dknqhv5zj76lu
Binary Quantization Analysis of Neural Networks Weights on MNIST Dataset
2021
Elektronika ir Elektrotechnika
This paper considers the design of a binary scalar quantizer of Laplacian source and its application in compressed neural networks. ...
Binary quantizers are further implemented for compressing neural network weights and its performance is analysed for a simple classification task. ...
DESIGN OF BINARY QUANTIZER FOR THE REFERENCE VARIANCE: Let us consider a symmetrical binary (N = 2 levels) scalar quantizer presented ...
doi:10.5755/j02.eie.28881
fatcat:bl77womljnh3ph6u3v6ihp2szm
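For a zero-mean symmetric source with the decision threshold at zero, the MSE-optimal binary quantizer maps each sample to the conditional mean of its half of the density; for a Laplacian source with variance σ², those levels are ±σ/√2. A sketch applying this design, assuming (as the entry's analysis does) that the weights are approximately Laplacian:

```python
import numpy as np

def binary_quantize(w, sigma=None):
    """Symmetric binary (N = 2) scalar quantizer with decision threshold at zero.

    For a zero-mean Laplacian source with variance sigma**2, the MSE-optimal
    representation levels are +/- sigma / sqrt(2), i.e. the conditional means
    of the two halves of the density. sigma is estimated from the data if not
    supplied.
    """
    if sigma is None:
        sigma = np.std(w)
    level = sigma / np.sqrt(2.0)
    return np.where(w >= 0, level, -level)

# Example: for a unit-variance Laplacian input the SQNR is about 3 dB
rng = np.random.default_rng(0)
w = rng.laplace(scale=1.0 / np.sqrt(2.0), size=100_000)
w_b = binary_quantize(w)
print("SQNR [dB]:", 10 * np.log10(np.var(w) / np.mean((w - w_b) ** 2)))
```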
Showing results 1 — 15 out of 2,838 results