10 Hits in 2.3 sec

Slimmable Compressive Autoencoders for Practical Neural Image Compression [article]

Fei Yang, Luis Herranz, Yongmei Cheng, Mikhail G. Mozerov
2022 arXiv   pre-print
Focusing on practical image compression, we propose slimmable compressive autoencoders (SlimCAEs), where rate (R) and distortion (D) are jointly optimized for different capacities.  ...  Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance.  ...  Acknowledgments We acknowledge the support from Huawei Kirin Solution and the Spanish Government funding for projects RTI2018-102285-A-I00 and RYC2019-027020-I.  ... 
arXiv:2103.15726v2 fatcat:5wrayrmvnjedxink26lpnuuph4
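
The core building block this abstract implies is a convolution whose channel width can be switched at run time over one shared weight tensor. Below is a minimal PyTorch sketch of such a layer; the class name, width list, and layer shapes are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of a width-switchable ("slimmable") convolution, the kind
# of layer a slimmable autoencoder could be built from. Illustrative only.
import torch
import torch.nn as nn

class SlimmableConv2d(nn.Conv2d):
    """Conv layer that can run at several channel widths sharing one weight tensor."""
    def __init__(self, max_in, max_out, kernel_size, widths, **kw):
        super().__init__(max_in, max_out, kernel_size, **kw)
        self.widths = widths          # e.g. [48, 96, 144, 192] output channels
        self.active = widths[-1]      # default: full capacity

    def set_width(self, w):
        assert w in self.widths
        self.active = w

    def forward(self, x):
        c_in = x.shape[1]
        weight = self.weight[: self.active, : c_in]   # slice the shared weights
        bias = self.bias[: self.active] if self.bias is not None else None
        return nn.functional.conv2d(x, weight, bias, self.stride,
                                    self.padding, self.dilation, self.groups)

conv = SlimmableConv2d(3, 192, 5, widths=[48, 96, 144, 192], stride=2, padding=2)
conv.set_width(48)                    # low-rate, low-compute operating point
y = conv(torch.randn(1, 3, 64, 64))   # -> (1, 48, 32, 32)
```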

Slimmable Video Codec [article]

Zhaocheng Liu, Luis Herranz, Fei Yang, Saiping Zhang, Shuai Wan, Marta Mrak, Marc Górriz Blanch
2022 arXiv   pre-print
practical video compression.  ...  In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder.  ...  While SlimCAE showed that slimmable codecs are promising approaches for practical neural image compression, SlimVC further advances this potential for the case of practical neural video compression.  ... 
arXiv:2205.06754v1 fatcat:uaays3pmrzaapgwuqpb2zzrgzq
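
A hedged sketch of what "a slimmable temporal entropy model in a slimmable autoencoder" implies operationally: one set of shared weights, a width chosen per deployment, and each frame's latent coded conditioned on the previous latent. Every class and method below is a hypothetical stand-in, not the SlimVC code.

```python
class SlimVideoCodec:
    """Hypothetical stand-in: NOT the SlimVC implementation."""
    def __init__(self, widths=(48, 96, 144, 192)):
        self.widths = widths
        self.width = widths[-1]

    def set_width(self, w):
        # pick one of the shared-weight subnetworks (rate/complexity point)
        assert w in self.widths
        self.width = w

    def analysis(self, frame):
        return frame[: self.width]    # stand-in for a slimmable encoder

    def entropy_code(self, latent, context):
        # a slimmable temporal entropy model would predict this latent's
        # distribution from `context` (the previous latent) before coding
        return f"{len(latent)}-dim latent, temporal context: {context is not None}"

codec = SlimVideoCodec()
codec.set_width(96)                   # choose a low-rate operating point
prev, stream = None, []
for frame in [list(range(192))] * 3:  # dummy "frames"
    latent = codec.analysis(frame)
    stream.append(codec.entropy_code(latent, context=prev))
    prev = latent
```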

Single-Training Collaborative Object Detectors Adaptive to Bandwidth and Computation [article]

Juliano S. Assine, J. C. S. Santos Filho, Eduardo Valle
2021 arXiv   pre-print
Our design is robust to the choice of base architecture and compressor and should adapt well to future architectures.  ...  In this work, we help to bridge that gap, introducing the first configurable solution for object detection that manages the triple communication-computation-accuracy trade-off with a single set of weights  ...  The compressive autoencoder inserted at the split point is what allows a full-neural architecture.  ... 
arXiv:2105.00591v2 fatcat:ysh24mdaencwrj33pxybsbwlsm
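
The "compressive autoencoder inserted at the split point" suggests the usual split-computing layout: the device runs the early backbone stages plus a channel-squeezing encoder, and the server decodes and finishes inference. A minimal PyTorch sketch, assuming a torchvision ResNet-50 and an illustrative 1x1-conv bottleneck (the split point and bottleneck size are assumptions):

```python
# Split computing with a compressive bottleneck; layout is illustrative.
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50(weights=None)
device_side = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                            backbone.maxpool, backbone.layer1)
server_side = nn.Sequential(backbone.layer2, backbone.layer3, backbone.layer4)

# compressive autoencoder at the split point: squeeze 256 channels to 32
encoder = nn.Conv2d(256, 32, 1)   # runs on the edge device before transmission
decoder = nn.Conv2d(32, 256, 1)   # runs on the server after reception

x = torch.randn(1, 3, 224, 224)
f = encoder(device_side(x))       # compact feature to send over the network
y = server_side(decoder(f))       # server reconstructs and finishes inference
```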

DPICT: Deep Progressive Image Compression Using Trit-Planes [article]

Jae-Han Lee, Seungmin Jeon, Kwang Pyo Choi, Youngo Park, Chang-Su Kim
2022 arXiv   pre-print
Since the compression network is less optimized for cases in which fewer trit-planes are used, we develop a postprocessing network for refining reconstructed images at low rates.  ...  We propose the deep progressive image compression using trit-planes (DPICT) algorithm, which is the first learning-based codec supporting fine granular scalability (FGS).  ...  [47] used slimmable neural networks [49, 50] to perform low-rate compression using only a fraction of the parameters and the highest-rate compression using all parameters.  ... 
arXiv:2112.06334v2 fatcat:5jxv4yrp2rcsdk2tqkvv3eftby
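
The trit-plane representation itself is easy to illustrate: a quantized latent is written as base-3 digit planes, most significant first, so a decoder can stop after any prefix of planes and still reconstruct a coarser latent. A NumPy sketch of that representation (not the DPICT codebase):

```python
# Base-3 digit-plane decomposition; illustrates progressive decoding only.
import numpy as np

def to_trit_planes(q, num_planes):
    """Split q (non-negative ints) into trit-planes, most significant first."""
    return [(q // 3**k) % 3 for k in range(num_planes - 1, -1, -1)]

def from_trit_planes(planes, num_planes):
    """Reconstruct from however many leading planes were received."""
    q = np.zeros_like(planes[0])
    for i, p in enumerate(planes):
        q = q + p * 3**(num_planes - 1 - i)
    return q  # coarser when fewer planes are available

q = np.array([[7, 25], [0, 13]])          # e.g. quantized latent magnitudes
planes = to_trit_planes(q, num_planes=3)  # 3 planes cover values < 27
print(from_trit_planes(planes[:2], 3))    # progressive, partial reconstruction
```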

LC-FDNet: Learned Lossless Image Compression with Frequency Decomposition Network [article]

Hochang Rhee, Yeong Il Jang, Seyun Kim, Nam Ik Cho
2021 arXiv   pre-print
We initially compress the low-frequency components and then use them as additional input for encoding the remaining high-frequency region.  ...  Recent learning-based lossless image compression methods encode an image in units of subimages and achieve performance comparable to conventional non-learning algorithms.  ...  Slimmable compressive autoencoders for practical neural image compression ... fundamentals, standards and practice.  ... 
arXiv:2112.06417v1 fatcat:sedhmbgq3bhizkwqpzanozbk6i
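
The decomposition the abstract describes can be sketched with any smoothing filter: a low-frequency approximation is coded first, and the residual detail is coded conditioned on it, which makes the round trip trivially lossless. The uniform filter below is an illustrative stand-in for the paper's learned decomposition.

```python
# Low/high-frequency split for conditional lossless coding; filter is a stand-in.
import numpy as np
from scipy.ndimage import uniform_filter

img = np.random.randint(0, 256, (64, 64)).astype(np.float64)

low = uniform_filter(img, size=4)   # smooth approximation: code this first
high = img - low                    # residual detail: code conditioned on `low`

# lossless round trip: the decoder recovers `low`, then adds the residual
assert np.allclose(low + high, img)
```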

ELIC: Efficient Learned Image Compression with Unevenly Grouped Space-Channel Contextual Adaptive Coding [article]

Dailan He, Ziming Yang, Weikun Peng, Rui Ma, Hongwei Qin, Yan Wang
2022 arXiv   pre-print
For the sake of practicality, a thorough investigation of the architecture design of learned image compression, regarding both compression performance and running speed, is essential.  ...  Recently, learned image compression techniques have achieved remarkable performance, even surpassing the best manually designed lossy image coders, and they are promising candidates for large-scale adoption.  ...  Note that VVC is mainly designed for the YUV 4:2:0 colorspace rather than YUV 4:4:4, as the former better reflects the sensitivity of human perception.  ... 
arXiv:2203.10886v2 fatcat:h355yljrzvgv7fi6essn6e4hzy
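
The title's "unevenly grouped" channel coding can be pictured as splitting the latent's channels into progressively larger groups that are coded sequentially, each group conditioned on the ones already decoded. The group sizes and latent shape below are assumptions for illustration:

```python
# Uneven channel grouping for sequential conditional coding; sizes illustrative.
import torch

latent = torch.randn(1, 320, 16, 16)          # hypothetical latent tensor
group_sizes = [16, 16, 32, 64, 192]           # uneven split, early groups small
groups = torch.split(latent, group_sizes, dim=1)

decoded = []
for g in groups:
    # a real codec would predict this group's entropy parameters from
    # `decoded` (channel context) plus spatial context, then arithmetic-code it
    decoded.append(g)
recon = torch.cat(decoded, dim=1)
assert torch.equal(recon, latent)
```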

ThumbNet: One Thumbnail Image Contains All You Need for Recognition

Chen Zhao, Bernard Ghanem
2020 Proceedings of the 28th ACM International Conference on Multimedia  
images for generic classification tasks.  ...  Current works mostly seek to compress the network by reducing its parameters or parameter-incurred computation, neglecting the influence of the input image on the system complexity.  ...  Therefore, it is of practical significance to accelerate and compress CNNs for test-time deployment.  ... 
doi:10.1145/3394171.3413937 dblp:conf/mm/ZhaoG20 fatcat:2kvbgljcavax7bz52b4d5hc65i
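
The input-side acceleration the abstract points at can be sketched in a few lines: downscale the image once and run the unchanged network on the thumbnail, so convolutional compute falls roughly with pixel count. Model choice and sizes below are illustrative, not the ThumbNet pipeline.

```python
# Inference on a thumbnail instead of the full-resolution input; illustrative.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
x = torch.randn(1, 3, 224, 224)                       # full-resolution input
thumb = F.interpolate(x, size=(112, 112), mode="bilinear",
                      align_corners=False)            # 4x fewer pixels
with torch.no_grad():
    logits = model(thumb)                             # ~4x less conv compute
```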

ThumbNet: One Thumbnail Image Contains All You Need for Recognition [article]

Chen Zhao, Bernard Ghanem
2020 arXiv   pre-print
images for generic classification tasks.  ...  Current works mostly seek to compress the network by reducing its parameters or parameter-incurred computation, neglecting the influence of the input image on the system complexity.  ...  Therefore, it is of practical significance to accelerate and compress CNNs for test-time deployment by inferring on thumbnail images.  ... 
arXiv:1904.05034v3 fatcat:j3tja6lyojfjpiv76xglxdm5ba

Learning Task-Oriented Communication for Edge Inference: An Information Bottleneck Approach [article]

Jiawei Shao, Yuyi Mao, Jun Zhang
2021 arXiv   pre-print
Furthermore, considering dynamic channel conditions in practical communication systems, we propose a variable-length feature encoding scheme based on dynamic neural networks to adaptively adjust the activated  ...  This paper investigates task-oriented communication for edge inference, where a low-end edge device transmits the extracted feature vector of a local data sample to a powerful edge server for processing  ...  There are also some variants of dynamic neural networks, including slimmable neural networks and the "Once-for-All" architecture.  ... 
arXiv:2102.04170v2 fatcat:2o2mlfblxfdfzfk5x3m4dqjche
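
A hedged sketch of variable-length feature encoding: the device chooses how many feature dimensions to activate (and transmit) from the current channel quality, spending more dimensions when the channel is poor. The SNR thresholds and widths below are made up for illustration.

```python
# Channel-adaptive feature width selection; thresholds are illustrative.
import torch
import torch.nn as nn

encoder = nn.Linear(512, 256)      # max feature width the encoder supports

def select_width(snr_db):
    if snr_db > 15: return 64      # good channel: a short vector suffices
    if snr_db > 5:  return 128     # moderate channel: spend more dimensions
    return 256                     # poor channel: maximum redundancy/width

feat = encoder(torch.randn(1, 512))
k = select_width(snr_db=8.0)
tx = feat[:, :k]                   # transmit only the first k dimensions
```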

Make A Long Image Short: Adaptive Token Length for Vision Transformers [article]

Yichen Zhu, Yuqin Zhu, Jie Du, Yi Wang, Zhicai Ou, Feifei Feng, Jian Tang
2021 arXiv   pre-print
Motivated by the proverb "A picture is worth a thousand words", we aim to accelerate the ViT model by making a long image short.  ...  The vision transformer splits each image into a fixed-length sequence of tokens and processes the tokens in the same way as words in natural language processing.  ...  Model Compression.  ... 
arXiv:2112.01686v2 fatcat:fzenydaarjg3jffxjwv324qsa4
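
Token count in a ViT is set by the patch grid, so resizing the image before patch embedding directly shortens the token sequence; deciding per image how short it can be is the paper's contribution and is omitted here. A sketch of the token-count effect alone:

```python
# How input resolution controls ViT sequence length; patch size illustrative.
import torch
import torch.nn.functional as F

def patchify(img, patch=16):
    B, C, H, W = img.shape
    tokens = img.unfold(2, patch, patch).unfold(3, patch, patch)
    return tokens.reshape(B, C, -1, patch, patch).transpose(1, 2)  # (B, N, C, p, p)

img = torch.randn(1, 3, 224, 224)
long_seq = patchify(img)                                   # 14*14 = 196 tokens
short = F.interpolate(img, size=(112, 112), mode="bilinear",
                      align_corners=False)
short_seq = patchify(short)                                # 7*7 = 49 tokens
print(long_seq.shape[1], short_seq.shape[1])               # 196 vs 49
```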