Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters
[article]
2019
arXiv
pre-print
The application of low-bit quantization allows a 50% reduction of the DNN memory footprint while the STOI performance drops by only 2.7%. ...
Effective employment of deep neural networks (DNNs) in mobile devices and embedded systems is hampered by requirements for memory and computational power. ...
arXiv:1911.00527v1
fatcat:vvrqusiyirft7j6qrfn677priy
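The 50% figure in this entry corresponds to halving the per-parameter storage, for instance keeping float32 weights as 16-bit integers plus a scale. A minimal NumPy sketch of symmetric per-tensor quantization, offered only as an illustration of the general idea rather than the paper's actual scheme:

```python
# Illustrative sketch (not the paper's exact scheme): symmetric uniform
# quantization of float32 weights to a lower bit width, which shrinks the
# stored parameter size roughly in proportion to the bit width.
import numpy as np

def quantize_symmetric(w: np.ndarray, bits: int):
    """Map float32 weights to signed integers of the given bit width."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits, 32767 for 16 bits
    scale = np.abs(w).max() / qmax        # one scale per tensor (assumption)
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q.astype(np.int16 if bits > 8 else np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 512).astype(np.float32)      # toy weight matrix
q16, s = quantize_symmetric(w, bits=16)
print(w.nbytes, q16.nbytes)                             # 2097152 vs 1048576 bytes: ~50% smaller
print(np.abs(w - dequantize(q16, s)).max())             # small reconstruction error
```

With 8-bit integers the same tensor would shrink to a quarter of its original size, at the cost of a larger reconstruction error.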
Low-complexity Recurrent Neural Network-based Polar Decoder with Weight Quantization Mechanism
[article]
2019
arXiv
pre-print
However, neural networks are memory-intensive and hinder the deployment of DL in communication systems. ...
In this work, a low-complexity recurrent neural network (RNN) polar decoder with codebook-based weight quantization is proposed. ...
the required number of parameters for the neural network decoder. ...
arXiv:1810.12154v2
fatcat:mmdc2qyhf5co7bmiuleuhwgr6a
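As a rough illustration of codebook-based weight quantization (the clustering below is a plain 1-D k-means and is an assumption, not the paper's decoder design), one can cluster a weight tensor into 2^b centroids and store only small integer indices plus the codebook:

```python
# Hedged illustration of codebook-based weight quantization: cluster the
# weights into 2**bits centroids and store indices plus the codebook.
import numpy as np

def codebook_quantize(w: np.ndarray, bits: int = 4, iters: int = 20):
    flat = w.ravel()
    # Initialize centroids uniformly over the weight range, then run
    # plain Lloyd iterations (a simple 1-D k-means).
    codebook = np.linspace(flat.min(), flat.max(), 2 ** bits)
    for _ in range(iters):
        idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        for k in range(codebook.size):
            members = flat[idx == k]
            if members.size:
                codebook[k] = members.mean()
    idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return idx.astype(np.uint8).reshape(w.shape), codebook

w = np.random.randn(256, 256).astype(np.float32)
idx, codebook = codebook_quantize(w, bits=4)
w_hat = codebook[idx]                        # reconstructed weights
print(np.abs(w - w_hat).max())               # clustering error
# Storage: packed 4-bit indices + tiny codebook vs. 32-bit floats.
print(w.nbytes, idx.size // 2 + codebook.nbytes)
```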
A Survey of Convolutional Neural Networks on Edge with Reconfigurable Computing
2019
Algorithms
CNNs achieve better results at the cost of higher computing and memory requirements. Inference of convolutional neural networks is therefore usually done in centralized high-performance platforms. ...
The convolutional neural network (CNN) is one of the most used deep learning models for image detection and classification, due to its high accuracy when compared to other machine learning algorithms. ...
To reduce memory bandwidth and power requirements, the arithmetic units support 8- and 12-bit quantization. ...
doi:10.3390/a12080154
fatcat:jbdak7eisbcjtj6ba5hlpvnq5y
Weight-Quantized SqueezeNet for Resource-Constrained Robot Vacuums for Indoor Obstacle Classification
2022
AI
With the rapid development of artificial intelligence (AI) theory, particularly deep learning neural networks, robot vacuums equipped with AI power can automatically clean indoor floors by using intelligent ...
As a result, these existing deep AI models require far more memory space than a typical robot vacuum can provide. ...
doi:10.3390/ai3010011
fatcat:6yv34bhlsngq5a6nl3crtsz6e4
To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference
2018
2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom)
We experimentally show how two mainstream compression techniques, data quantization and pruning, perform on these network architectures and the implications of compression techniques for the model ...
The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. ...
The reduction in the storage size is consistent across neural networks as the size of a network is dominated by its weights. ...
doi:10.1109/bdcloud.2018.00110
dblp:conf/ispa/QinRYWG0FFW18
fatcat:q6zjgqhqcngplgl67lh7fnjwsm
To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference
[article]
2018
arXiv
pre-print
We experimentally show how two mainstream compression techniques, data quantization and pruning, perform on these network architectures and the implications of compression techniques for the model ...
The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. ...
The reduction in the storage size is consistent across neural networks as the size of a network is dominated by its weights. ...
arXiv:1810.08899v1
fatcat:ewhclsajprh7tcp7kylxkftf3m
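The two techniques this paper characterizes, pruning and data quantization, can be sketched on a single weight tensor as follows. The 80% pruning ratio, 8-bit format, and sparse-storage accounting are illustrative assumptions, not the paper's experimental setup:

```python
# Magnitude pruning followed by 8-bit quantization of the surviving weights,
# with a rough estimate of the resulting storage size.
import numpy as np

w = np.random.randn(512, 512).astype(np.float32)

# 1) Magnitude pruning: drop the 80% of weights closest to zero.
threshold = np.quantile(np.abs(w), 0.80)
mask = np.abs(w) >= threshold
pruned = w * mask

# 2) 8-bit symmetric quantization of the remaining weights.
scale = np.abs(pruned).max() / 127.0
q = np.clip(np.round(pruned / scale), -127, 127).astype(np.int8)

dense_bytes = w.nbytes                                    # 32-bit dense baseline
sparse_bytes = int(mask.sum()) * (q.dtype.itemsize + 4)   # int8 value + int32 index per nonzero
print(dense_bytes, sparse_bytes)                          # roughly 1 MB vs ~0.26 MB
```

This matches the observation in the entry that storage savings are consistent across networks, since the weights dominate the model size.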
WRPN: Training and Inference using Wide Reduced-Precision Networks
[article]
2017
arXiv
pre-print
For computer vision applications, prior works have shown the efficacy of reducing the numeric precision of model parameters (network weights) in deep neural networks but also that reducing the precision ...
We reduce the precision of activation maps (along with model parameters) using a novel quantization scheme and increase the number of filter maps in a layer, and find that this scheme compensates or surpasses ...
Due to such efficiency benefits, there are many existing works that have proposed low-precision deep neural networks (DNNs), even down to 2-bit ternary mode [5] and 1-bit mode [4, 1]. ...
arXiv:1704.03079v1
fatcat:iv6g7b3yvzhfzovq3ibvc7dhwi
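A common clip-and-round recipe for k-bit activation quantization, given here as a generic sketch rather than WRPN's exact formulation:

```python
# Quantize post-ReLU activations onto 2**bits - 1 uniform levels in [0, 1].
import numpy as np

def quantize_activations(a: np.ndarray, bits: int) -> np.ndarray:
    """Clip activations to [0, 1] and round them onto a uniform grid."""
    levels = 2 ** bits - 1
    a = np.clip(a, 0.0, 1.0)
    return np.round(a * levels) / levels

x = np.random.rand(4, 64).astype(np.float32)         # toy post-ReLU activations
print(np.unique(quantize_activations(x, bits=2)))    # at most 4 distinct values
```

The widening of filter maps described in the entry is orthogonal to this step: it adds capacity back to compensate for the coarser activations.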
Quantization of Deep Neural Networks for Accurate Edge Computing
[article]
2021
arXiv
pre-print
with 3.5x-6.4x memory reduction. ...
Deep neural networks (DNNs) have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. ...
Quantized neural networks, binarized neural networks, and XNOR-net [29] reduced the weights to only 1 bit and the activations to 1-2 bits, resulting in a large reduction in memory and computation cost ...
arXiv:2104.12046v2
fatcat:dltil2m2yrgnbp6vgfrf46l6va
RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks
2019
IEICE transactions on information and systems
With the continuous refinement of Deep Neural Networks (DNNs), a series of deep and complex networks such as Residual Networks (ResNets) show impressive prediction accuracy in image classification tasks ...
In this paper, we present the quantized and reconstructed deep neural network (QR-DNN) technique, which first inserts batch normalization (BN) layers in the network during training, and later removes them ...
From the perspective of computation and memory cost, extensive quantization methods [10]-[13] have been proposed to quantize neural networks with low-precision weights and activations, which drastically ...
doi:10.1587/transinf.2018rcp0008
fatcat:gmlkmykd6bevnfiqwklwicfiwy
Quantization and Deployment of Deep Neural Networks on Microcontrollers
2021
Sensors
This work focuses on quantization and deployment of deep neural networks onto low-power 32-bit microcontrollers. ...
However, there is still room for optimization of deep neural networks onto embedded devices. ...
In Figure 6, we can observe that the accuracy obtained using 8-bit and 16-bit quantization is similar only for deep neural networks exhibiting a reduced number of parameters, in other words a low memory ...
doi:10.3390/s21092984
pmid:33922868
pmcid:PMC8122998
fatcat:3hzk3tkvxbgurcy3o6wxmnssgm
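Deployment flows for 32-bit microcontrollers typically rely on affine (zero-point) 8-bit quantization. The sketch below shows the generic recipe and is not tied to the specific toolchain evaluated in this paper:

```python
# Affine (zero-point) 8-bit quantization: map a float range [min, max]
# onto unsigned 8-bit integers, then reconstruct approximately.
import numpy as np

def affine_quantize(x: np.ndarray, bits: int = 8):
    qmin, qmax = 0, 2 ** bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (qmax - qmin) or 1.0        # guard against a constant tensor
    zero_point = int(round(qmin - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def affine_dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.randn(1000).astype(np.float32)
q, s, z = affine_quantize(x)
print(np.abs(x - affine_dequantize(q, s, z)).max())  # within about one quantization step
```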
ECQ^x: Explainability-Driven Quantization for Low-Bit and Sparse DNNs
[article]
2022
arXiv
pre-print
The remarkable success of deep neural networks (DNNs) in various applications is accompanied by a significant increase in network parameters and arithmetic operations. ...
Experimental results show that this novel Entropy-Constrained and XAI-adjusted Quantization (ECQ^x) method generates ultra low-precision (2-5 bit) and simultaneously sparse neural networks while maintaining ...
For instance, a reduction from standard 32-bit precision to 8-bit or 4-bit directly leads to a memory reduction of almost 4× and 8×, respectively. ...
arXiv:2109.04236v2
fatcat:xgrrejz33zfnpjclq4olzhix7i
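The "almost 4× and 8×" figures can be checked with back-of-the-envelope arithmetic; the parameter count below is hypothetical, and the "almost" reflects the small overhead of scales or codebooks:

```python
# Parameter memory at different bit widths for a hypothetical network.
params = 60_000_000
for bits in (32, 8, 4):
    print(bits, params * bits / 8 / 1e6, "MB")
# 32 -> 240 MB, 8 -> 60 MB (4x smaller), 4 -> 30 MB (8x smaller)
```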
Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression
[article]
2018
arXiv
pre-print
Deep learning algorithms have shown tremendous success in many recognition tasks; however, these algorithms typically include a deep neural network (DNN) structure and a large number of parameters, which ...
However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks has not been comprehensively explored. ...
A DNN model with 8X compression and 3-bit weight quantization (10X weight memory reduction) shows minimal accuracy degradation of 0.45% compared to the high-precision, uncompressed network. ...
arXiv:1804.07370v1
fatcat:hirsopx7czbexffugk6a3iixmm
A Targeted Acceleration and Compression Framework for Low bit Neural Networks
[article]
2019
arXiv
pre-print
1-bit deep neural networks (DNNs), of which both the activations and weights are binarized, are attracting more and more attention due to their high computational efficiency and low memory requirement ...
For the fully connected layers, the binarization operation is replaced by network pruning and low-bit quantization. ...
However, 1-bit deep neural networks suffer a sharp reduction in prediction accuracy when both activations and weights are binarized. ...
arXiv:1907.05271v1
fatcat:akcy4tq2ozailmivuvso6lkfae
Iteratively Training Look-Up Tables for Network Quantization
[article]
2018
arXiv
pre-print
Operating deep neural networks on devices with limited resources requires the reduction of their memory footprints and computational requirements. ...
In order to obtain fully multiplier-less networks, we also introduce a multiplier-less version of batch normalization. ...
We have presented look-up table quantization, a novel approach for the reduction of size and computations of deep neural networks. ...
arXiv:1811.05355v1
fatcat:usziojexc5c5vfufvcoaknk4ma
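One common way to obtain a multiplier-less batch normalization, offered here as an assumption about the general idea rather than this paper's exact construction, is to fold BN into an affine transform and round the per-channel scale to the nearest power of two, so the multiply becomes a bit shift:

```python
# Fold inference-time batch normalization into y = a*x + b, then replace
# each scale a with the nearest power of two (shift-friendly).
import numpy as np

def fold_bn(gamma, beta, mean, var, eps=1e-5):
    a = gamma / np.sqrt(var + eps)
    b = beta - a * mean
    return a, b

def power_of_two(a):
    """Replace each scale with sign(a) * 2**round(log2|a|)."""
    return np.sign(a) * 2.0 ** np.round(np.log2(np.abs(a)))

gamma, beta = np.float32([1.3, 0.7]), np.float32([0.1, -0.2])
mean, var = np.float32([0.0, 0.5]), np.float32([1.0, 2.0])

a, b = fold_bn(gamma, beta, mean, var)
a_shift = power_of_two(a)                  # e.g. 1.30 -> 1.0, 0.49 -> 0.5
x = np.float32([[0.4, -1.2]])
print(a * x + b, a_shift * x + b)          # exact folded BN vs. shift-based BN
```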
Compact recurrent neural networks for acoustic event detection on low-energy low-complexity platforms
2020
IEEE Journal on Selected Topics in Signal Processing
This challenge discourages IoT implementation, where an efficient use of resources is required. ...
We test our approach on an ARM Cortex-M4, particularly focusing on issues related to 8-bit quantization. ...
These results paved the way to the effective use of deep neural networks on a low-power microcontroller to enable SED. ...
doi:10.1109/jstsp.2020.2969775
fatcat:pjsjujou6zcj7i2jggmgsgcwla
Showing results 1 — 15 out of 3,979 results