A copy of this work (application/pdf) was available on the public web and has been preserved in the Wayback Machine; the capture dates from 2020. The original URL can also be visited.
To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference
[article]
2018
arXiv
pre-print
Recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, DNN inference can take a long time on resource-constrained computing devices. Model compression techniques can address the computational cost of deep inference on embedded devices. These techniques are highly attractive because they do not rely on specialized hardware or on computation offloading, which is often infeasible due to privacy concerns or high latency. However, it remains ...
arXiv:1810.08899v1
fatcat:ewhclsajprh7tcp7kylxkftf3m
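The abstract describes model compression as a way to speed up on-device inference without specialized hardware or offloading. As a minimal sketch of one common compression technique, not the paper's own method or experimental setup, the following shows post-training dynamic quantization in PyTorch; the toy model and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy network standing in for a DNN to be deployed on an embedded device
# (illustrative assumption; not a model from the paper).
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Post-training dynamic quantization: Linear weights are stored as int8
# and activations are quantized on the fly at inference time. No
# retraining and no specialized hardware are required.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Run an inference with the compressed model.
x = torch.randn(1, 128)
with torch.no_grad():
    out = quantized(x)
print(out.shape)  # torch.Size([1, 10])
```

Storing weights as int8 roughly quarters the memory footprint of the affected layers relative to float32, which is one reason compression of this kind is attractive for embedded inference.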