
Dynamic Network Quantization for Efficient Video Inference [article]

Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Aude Oliva, Rogerio Feris, Kate Saenko
2021 arXiv   pre-print
Motivated by the effectiveness of quantization for boosting efficiency, in this paper, we propose a dynamic network quantization framework, that selects optimal precision for each frame conditioned on  ...  the input for efficient video recognition.  ... 
arXiv:2108.10394v1 fatcat:5gv2uwr6mfbdtku32g5slmgtia

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths [article]

Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Naigang Wang, Bowen Pan, Kailash Gopalakrishnan, Aude Oliva, Rogerio Feris, Kate Saenko
2021 arXiv   pre-print
Quantizing deep networks with adaptive bit-widths is a promising technique for efficient inference across many devices and resource constraints.  ...  network during inference for instant adaptation in different scenarios.  ...  Despite recent progress in network quantization for improving efficiency of deep networks, most of the existing methods repeat the quantization process and retrain the low-precision network from scratch  ... 
arXiv:2103.01435v3 fatcat:qukv5qgpxjfxnixlor6blwfbi4

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions [article]

Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, Weisheng Dong, Jianbo Shi
2021 arXiv   pre-print
Though recognition accuracy is usually the first concern for new progresses, efficiency is actually rather important and sometimes critical for both academic research and industrial applications.  ...  Moreover, insightful views on the opportunities and challenges of efficiency are also highly required for the entire community.  ...  [Figure residue: Efficient Visual Recognition with Deep Neural Networks (Image, Video, Point); Fast Run-time Inference: Compact Networks (CNNs, RNNs, ..., NAS), Tensor Decomposition (CP,  ... ]
arXiv:2108.13055v2 fatcat:nf3lymdbvzgl7otl7gjkk5qitq

Editorial: Special Issue on Compact Deep Neural Networks With Industrial Applications

Lixin Fan, Diana Marculescu, Werner Bailer, Yurong Chen
2020 IEEE Journal on Selected Topics in Signal Processing  
Shen et al. propose in "Dual Dynamic Inference: Enabling More Efficient, Adaptive and Controllable Deep Inference" a framework for dynamic inference on resource constrained hardware.  ...  An application of computationally efficient neural networks for safety monitoring is described in "Computationally Efficient Spatio-Temporal Dynamic Texture Recognition for Volatile Organic Compound  ... 
doi:10.1109/jstsp.2020.3006323 fatcat:d75ni7ocajb4pemovq2l3ton4i

Implicit Neural Video Compression [article]

Yunfan Zhang, Ties van Rozendaal, Johann Brehmer, Markus Nagel, Taco Cohen
2021 arXiv   pre-print
We further lower the bitrate by storing the network weights with learned integer quantization.  ...  Together with a small residual network, this allows us to efficiently compress P-frames relative to the previous frame.  ... 
arXiv:2112.11312v1 fatcat:ogd256n4qzfc7epkankzy4ktuy

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
2022 ACM Transactions on Design Automation of Electronic Systems  
Apart from general acceleration techniques, we also showcase several task-specific accelerations for point cloud, video, and natural language processing by exploiting their spatial sparsity and temporal  ...  To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization.  ...  Tiny Video Networks [227] automatically design highly efficient models for video understanding.  ... 
doi:10.1145/3486618 fatcat:h6xwv2slo5eklift2fl24usine

Training for temporal sparsity in deep neural networks, application in video processing [article]

Amirreza Yousefzadeh, Manolis Sifalakis
2021 arXiv   pre-print
On the other hand, temporal sparsity is an inherent feature of bio-inspired spiking neural networks (SNNs), which neuromorphic processing exploits for hardware efficiency.  ...  Activation sparsity improves compute efficiency and resource utilization in sparsity-aware neural network accelerators.  ...  ., 2018], the authors propose a novel DNN inference algorithm (AMS) alongside a DNN accelerator optimized for video-based inference.  ... 
arXiv:2107.07305v1 fatcat:zgkh4hmffndulpbymffq6vjmbq

Recurrent Residual Module for Fast Inference in Videos [article]

Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
2018 arXiv   pre-print
In this work, we propose a framework called Recurrent Residual Module (RRM) to accelerate the CNN inference for video recognition tasks.  ...  Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection.  ...  Efficient inference engine: To implement the RRM framework efficiently, we utilize dynamic sparse matrix-vector multiplication (DSPMV) technique.  ... 
arXiv:1802.09723v1 fatcat:lap3sjz62rcibptwvphgq3bq2y

Efficient Integer-Arithmetic-Only Convolutional Neural Networks [article]

Hengrui Zhao, Dong Liu, Houqiang Li
2020 arXiv   pre-print
We also experiment on VDSR for image super-resolution and on VRCNN for compression artifact reduction, both of which serve for regression tasks that natively require high inference accuracy.  ...  However, previous works usually report a decline in the inference accuracy when converting well-trained floating-point-number (FPN) networks into integer networks.  ...  Different from parameter quantization that quantizes static weights and biases into integer, activation quantization is dynamic as it quantizes the computed activations into integer during network inference  ... 
arXiv:2006.11735v1 fatcat:pw6br3es4beylbsediumduupsq

AET-EFN: A Versatile Design for Static and Dynamic Event-Based Vision [article]

Chang Liu, Xiaojuan Qi, Edmund Lam, Ngai Wong
2021 arXiv   pre-print
Our method is also efficient and achieves the fastest inference speed among others.  ...  dynamic scenes.  ...  We also present a two-branch neural network design, the Event Frame Net (EFN), which is versatile for static and dynamic event-based vision.  ... 
arXiv:2103.11645v1 fatcat:g4mm3ugg7vclpfmtlxxwfupbpq

EMC2-NIPS 2019 Abstracts of Invited Talks

2019 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS)  
This talk will uncover the need for building accurate, platform-specific power and latency models for convolutional neural networks (CNNs) and efficient hardware-aware CNN design methodologies, thus allowing  ...  , computer vision, autonomous navigation/exploration and video/image processing.  ...  We accelerate computation-intensive AI applications including (TSM) for efficient video recognition and PVCNN for efficient 3D recognition on point clouds.  ... 
doi:10.1109/emc2-nips53020.2019.00007 fatcat:bvtcsgwxsrh3bmwh6tba3ly3ra

Overview of the Neural Network Compression and Representation (NNR) Standard

Heiner Kirchhoffer, Paul Haase, Wojciech Samek, Karsten Muller, Hamed Rezazadegan-Tavakoli, Francesco Cricri, Emre Aksu, Miska M. Hannuksela, Wei Jiang, Wei Wang, Shan Liu, Swayambhoo Jain (+3 others)
2021 IEEE transactions on circuits and systems for video technology (Print)  
Neural Network Coding and Representation (NNR) is the first international standard for efficient compression of neural networks (NNs).  ...  The NNR standard contains compression-efficient quantization and deep context-adaptive binary arithmetic coding (DeepCABAC) as core encoding and decoding technologies, as well as neural network parameter  ...  Acknowledgment: The authors would like to thank the experts of ISO/IEC MPEG and in particular the MPEG NNR group for their contributions.  ... 
doi:10.1109/tcsvt.2021.3095970 fatcat:wl4tvtpjaveuxng6fqu4tost5y

Recurrent Residual Module for Fast Inference in Videos

Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
In this work, we propose a framework called Recurrent Residual Module (RRM) to accelerate the CNN inference for video recognition tasks.  ...  Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection.  ...  Efficient inference engine: To implement the RRM framework efficiently, we utilize dynamic sparse matrix-vector multiplication (DSPMV) technique.  ... 
doi:10.1109/cvpr.2018.00166 dblp:conf/cvpr/PanLFHZL18 fatcat:pc2z4kxacrcthckdljv2d6rcoy

A practical convolutional neural network as loop filter for intra frame [article]

Xiaodan Song, Jiabao Yao, Lulu Zhou, Li Wang, Xiaoyang Wu, Di Xie, Shiliang Pu
2018 arXiv   pre-print
To ensure consistency, dynamic fixed points (DFP) are adopted in testing CNN. Parameters in the compressed model are first quantized to DFP and then used for inference of CNN.  ...  First, different model is used for frames encoded with different quantization parameter (QP), respectively. It is expensive for hardware.  ...  Before DFP inference, parameters in the compressed model are first quantized and converted to DFP. To recover performance loss due to quantization, fine-tuning is established [12].  ... 
arXiv:1805.06121v1 fatcat:3kuhmvtotrd2plt3rn6r7qtb5q

ProgressiveNN: Achieving Computational Scalability with Dynamic Bit-Precision Adjustment by MSB-first Accumulative Computation

Junnosuke Suzuki, Tomohiro Kaneko, Kota Ando, Kazutoshi Hirose, Kazushi Kawamura, Thiem Van Chu, Masato Motomura, Jaehoon Yu
2021 International Journal of Networking and Computing  
BWB quantization decomposes and transforms each parameter into a bitwise format for ABS inference, which then utilizes the parameters in the most-significant-bit-first order, enabling progressive inference  ...  This paper also presents a method to dynamically adjust the bit-precision of the ProgressiveNN to achieve a better trade-off between computational resource use and accuracy for practical applications using  ...  For applications using sequential data such as audio and video, it is possible to determine proper computational cost based on previous inference results.  ... 
doi:10.15803/ijnc.11.2_338 fatcat:r5jmpw2vdfcgfhpevh6rr7oaza