185 Hits in 2.0 sec

Generative and Discriminative Deep Belief Network Classifiers: Comparisons Under an Approximate Computing Framework [article]

Siqiao Ruan, Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das
2021 arXiv   pre-print
To enable low power implementations, we consider efficient bitwidth reduction and pruning for the class of Deep Learning algorithms known as Discriminative Deep Belief Networks (DDBNs) for embedded-device  ...  Based on our analysis, we provide novel insights and recommendations for choice of training objectives, bitwidth values, and accuracy sensitivity with respect to the amount of labeled data for implementing  ...  A DDBN is a stochastic neural network that extracts a deep hierarchical representation from data.  ... 
arXiv:2102.00534v1 fatcat:sma3ahi6bbfafa5drwmooipn6u
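
As a rough illustration of what bitwidth reduction means for a stochastic network of this kind, the sketch below quantizes the weights of a single RBM-style hidden layer (the building block of a DBN) to a chosen bitwidth before sampling its hidden units. This is a generic sketch, not the authors' framework; the layer sizes and the 4-bit choice are arbitrary assumptions.

```python
# Minimal sketch: one stochastic hidden layer with bitwidth-reduced weights.
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight matrix to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale) * scale

def sample_hidden(v, w, c, rng):
    """Stochastic hidden units: h ~ Bernoulli(sigmoid(v @ W + c))."""
    p = 1.0 / (1.0 + np.exp(-(v @ w + c)))
    return (rng.random(p.shape) < p).astype(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(784, 256))   # visible-to-hidden weights
c = np.zeros(256)                            # hidden biases
v = rng.random((1, 784))                     # one visible configuration
h = sample_hidden(v, quantize_uniform(w, bits=4), c, rng)   # 4-bit weights
```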

A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA [article]

Xinyu Zhang, Srinjoy Das, Ojash Neopane, Ken Kreutz-Delgado
2017 arXiv   pre-print
In support of such applications, various FPGA accelerator architectures have been proposed for convolutional neural networks (CNNs) that enable high performance for classification tasks at lower power ... However, to date, there has been little research on the use of FPGA implementations of deconvolutional neural networks (DCNNs). ... A deconvolutional neural network (DCNN) converts latent space representations to high-dimensional data similar to the training set by applying successive deconvolution operations ...
arXiv:1705.02583v1 fatcat:qpscnbb5lzfmxfisbmautgy6om
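
For readers unfamiliar with deconvolutional networks, the following minimal sketch shows the core operation the abstract describes: successive transposed convolutions mapping a low-dimensional latent code to a higher-dimensional, image-like output. It uses PyTorch's ConvTranspose2d with arbitrary layer sizes; it is not the paper's FPGA design.

```python
# Minimal DCNN-style decoder: latent code -> image-like tensor.
import torch
import torch.nn as nn

decoder = nn.Sequential(
    nn.ConvTranspose2d(100, 64, kernel_size=4, stride=1),             # 1x1 -> 4x4
    nn.ReLU(),
    nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),   # 4x4 -> 8x8
    nn.ReLU(),
    nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),    # 8x8 -> 16x16
    nn.Tanh(),
)

z = torch.randn(8, 100, 1, 1)   # batch of latent codes
imgs = decoder(z)               # (8, 1, 16, 16) generated samples
```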

Hardware-Centric AutoML for Mixed-Precision Quantization

Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
2020 International Journal of Computer Vision  
Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. ... We interpret the implications of different quantization policies, which offer insights for both neural network architecture design and hardware architecture design. ... Acknowledgements We thank NSF Career Award #1943349, MIT-IBM Watson AI Lab, Samsung, SONY, Xilinx, TI and AWS for supporting this research. ...
doi:10.1007/s11263-020-01339-6 fatcat:c565raez3faihf7jhfiidb47mu
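
A mixed-precision policy simply assigns a (possibly different) bitwidth to each layer. The sketch below applies such a policy to a toy model with plain uniform quantization; the policy dict is a hypothetical placeholder, whereas HAQ learns the policy with reinforcement learning driven by hardware feedback.

```python
# Illustrative sketch only: applying a per-layer bitwidth policy to weights.
import torch
import torch.nn as nn

def quantize_tensor(w, bits):
    """Symmetric uniform quantization of a tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
policy = {"0": 4, "2": 8}    # hypothetical per-layer bitwidths

with torch.no_grad():
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear) and name in policy:
            module.weight.copy_(quantize_tensor(module.weight, policy[name]))
```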

HAQ: Hardware-Aware Automated Quantization With Mixed Precision

Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
There is plenty of specialized hardware for neural networks, but little research has been done on specialized neural network optimization for a particular hardware architecture. ... Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. ... We thank MIT Quest for Intelligence, MIT-IBM Watson AI Lab, Xilinx, Samsung, Intel, ARM, Qualcomm, and SONY for supporting this research. ...
doi:10.1109/cvpr.2019.00881 dblp:conf/cvpr/WangLLLH19 fatcat:xlb3d7riejcm7c4udpzgqnk67y

ReLeQ: A Reinforcement Learning Approach for Deep Quantization of Neural Networks [article]

Ahmed T. Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Amir Yazdanbakhsh, Hadi Esmaeilzadeh
2020 arXiv   pre-print
Deep Neural Networks (DNNs) typically require massive amounts of computational resources for inference in computer vision applications. ... As such, deep quantization opens a large hyper-parameter space (the bitwidth of each layer), the exploration of which is a major challenge. ... Here, we employ RL in the context of quantization to choose an appropriate quantization bitwidth for each layer of a network. Training algorithms for quantized neural networks. ...
arXiv:1811.01704v4 fatcat:eb5bznr55zftfnvgxt6dpv5f2i
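
To make the search problem concrete, here is a heavily simplified stand-in: candidate per-layer bitwidth assignments are scored by a reward that trades accuracy against a cost proxy. Random search replaces the RL agent, and evaluate_accuracy is a hypothetical user-supplied callback; none of this is ReLeQ's actual algorithm.

```python
# Toy bitwidth search: maximize accuracy minus a cost penalty.
import random

LAYERS = 4                        # hypothetical network depth
CANDIDATE_BITS = [2, 3, 4, 5, 8]

def cost(bits):
    """Relative compute/storage proxy: average bitwidth, normalized to [0, 1]."""
    return sum(bits) / (len(bits) * max(CANDIDATE_BITS))

def search(evaluate_accuracy, trials=50, penalty=0.1, seed=0):
    """Random search over per-layer bitwidth assignments (stand-in for RL)."""
    rng = random.Random(seed)
    best, best_reward = None, float("-inf")
    for _ in range(trials):
        bits = [rng.choice(CANDIDATE_BITS) for _ in range(LAYERS)]
        reward = evaluate_accuracy(bits) - penalty * cost(bits)
        if reward > best_reward:
            best, best_reward = bits, reward
    return best

# Dummy evaluator: pretend accuracy drops slightly for every 2-bit layer.
print(search(lambda bits: 0.9 - 0.02 * bits.count(2)))
```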

HAQ: Hardware-Aware Automated Quantization with Mixed Precision [article]

Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
2019 arXiv   pre-print
Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. ... We interpret the implications of different quantization policies, which offer insights for both neural network architecture design and hardware architecture design. ... We thank MIT Quest for Intelligence, MIT-IBM Watson AI Lab, Xilinx, Samsung, Intel, ARM, Qualcomm, and SONY for supporting this research. ...
arXiv:1811.08886v3 fatcat:csobmurle5bxrjr73mgyydz6ee

Generative Low-bitwidth Data Free Quantization [article]

Shoukai Xu, Haokun Li, Bohan Zhuang, Jing Liu, Jiezhang Cao, Chuangrun Liang, Mingkui Tan
2020 arXiv   pre-print
In this paper, we investigate a simple-yet-effective method called Generative Low-bitwidth Data Free Quantization (GDFQ) to remove the data dependence burden.  ...  Neural network quantization is an effective way to compress deep models and improve their execution latency and energy efficiency, so that they can be deployed on mobile or embedded devices.  ...  Acknowledgements This work was partially supported by the Key-Area Research and Development Program of Guangdong Province 2018B010107001, Program for Guangdong Introducing Innovative and Entrepreneurial  ... 
arXiv:2003.03603v3 fatcat:vudozwttcnajbolh7v4ntajwba

CodeX: Bit-Flexible Encoding for Streaming-based FPGA Acceleration of DNNs [article]

Mohammad Samragh, Mojan Javaheripi, Farinaz Koushanfar
2019 arXiv   pre-print
The CodeX full-stack framework comprises a compiler that takes a high-level Python description of an arbitrary neural network architecture. ... This paper proposes CodeX, an end-to-end framework that facilitates encoding, bitwidth customization, fine-tuning, and implementation of neural networks on FPGA platforms. ... Deep Neural Networks (DNNs) are being widely developed for various machine learning applications, many of which are required to run on embedded devices. ...
arXiv:1901.05582v1 fatcat:l4lsuogaknhwlpcbyncbich42q

NICE: Noise Injection and Clamping Estimation for Neural Network Quantization

Chaim Baskin, Evgenii Zheltonozhkii, Tal Rozen, Natan Liss, Yoav Chai, Eli Schwartz, Raja Giryes, Alexander M. Bronstein, Avi Mendelson
2021 Mathematics  
The method proposed in this work trains quantized neural networks by noise injection and a learned clamping, which improve accuracy. ... To overcome this challenge, some solutions have been proposed for quantizing the weights and activations of these networks, which accelerate the runtime significantly. ... Deep neural networks are important tools in the machine learning arsenal. ...
doi:10.3390/math9172144 fatcat:zgrkaxcoazd3vdesjviiv25vca
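
The general recipe described in the abstract can be sketched as follows: during training, uniform noise of one quantization step stands in for quantization error, and the clamping threshold is a learnable parameter; at inference time real rounding is applied. This is an illustration of the idea, not the authors' code, and the default bitwidth and initial clamp value are arbitrary assumptions.

```python
# Sketch of noise-injection training with a learned clamping range.
import torch
import torch.nn as nn

class NoisyClampQuant(nn.Module):
    """Activation quantizer: learned clamp [0, alpha], noise while training."""
    def __init__(self, bits=4, init_alpha=6.0):
        super().__init__()
        self.bits = bits
        self.alpha = nn.Parameter(torch.tensor(init_alpha))   # learned clamp

    def forward(self, x):
        x = torch.min(torch.relu(x), self.alpha)        # clamp to [0, alpha]
        step = self.alpha / (2 ** self.bits - 1)        # quantization step
        if self.training:
            return x + (torch.rand_like(x) - 0.5) * step    # noise injection
        return torch.round(x / step) * step                 # real quantization

layer = NoisyClampQuant(bits=4)
out = layer(torch.randn(8, 16))   # training-mode forward pass
```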

Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference

Jianghao Shen, Yue Wang, Pengfei Xu, Yonggan Fu, Zhangyang Wang, Yingyan Lin
2020 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference  
While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice. ... For each input, DFS dynamically assigns a bitwidth to both weights and activations of each layer, where fully executing and skipping could be viewed as two "extremes" (i.e., full bitwidth and zero bitwidth ... Two-Step Training Procedure: Given a pre-trained CNN model A, our goal is to jointly train A and its gating network for a targeted computational budget. ...
doi:10.1609/aaai.v34i04.6025 fatcat:mp66ycyvdzap7bnrdd2vegprpm
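
A toy version of input-dependent bitwidth selection might look like the sketch below: a small gate inspects each input and picks one of a few candidate bitwidths for a layer, with bitwidth 0 meaning the layer is bypassed. The gate, layer sizes, and candidate set are illustrative assumptions, not the paper's DFS design.

```python
# Per-input bitwidth selection with a tiny gating network (illustrative only).
import torch
import torch.nn as nn

CANDIDATE_BITS = [0, 4, 8]        # 0 = skip the layer entirely

def quantize(t, bits):
    qmax = 2 ** (bits - 1) - 1
    scale = t.abs().max().clamp(min=1e-8) / qmax
    return torch.round(t / scale) * scale

class GatedLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, len(CANDIDATE_BITS))   # per-input selector

    def forward(self, x):
        choice = self.gate(x).argmax(dim=-1)              # one decision per input
        out = torch.empty_like(x)
        for i, bits in enumerate(CANDIDATE_BITS):
            mask = choice == i
            if not mask.any():
                continue
            if bits == 0:
                out[mask] = x[mask]                       # skipped: identity
            else:
                w = quantize(self.fc.weight, bits)
                out[mask] = torch.relu(quantize(x[mask], bits) @ w.t() + self.fc.bias)
        return out

y = GatedLayer(16)(torch.randn(8, 16))
```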

Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference [article]

Jianghao Shen, Yonggan Fu, Yue Wang, Pengfei Xu, Zhangyang Wang, Yingyan Lin
2020 arXiv   pre-print
While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice. ... For each input, DFS dynamically assigns a bitwidth to both weights and activations of each layer, where fully executing and skipping could be viewed as two "extremes" (i.e., full bitwidth and zero bitwidth ... Two-Step Training Procedure: Given a pre-trained CNN model A, our goal is to jointly train A and its gating network for a targeted computational budget. ...
arXiv:2001.00705v1 fatcat:72xef5fl4bhwxcwdkprrrsqz4i

NICE: Noise Injection and Clamping Estimation for Neural Network Quantization [article]

Chaim Baskin, Natan Liss, Yoav Chai, Evgenii Zheltonozhskii, Eli Schwartz, Raja Giryes, Avi Mendelson, Alexander M. Bronstein
2018 arXiv   pre-print
The NICE method proposed in this work trains quantized neural networks by noise injection and a learned clamping, which improve the accuracy. ... Convolutional Neural Networks (CNN) are very popular in many fields including computer vision, speech recognition, natural language processing, to name a few. ... Conclusion We introduced NICE, a training scheme for quantized neural networks. ...
arXiv:1810.00162v2 fatcat:jep74coe3bgg5jminzmhy7l6uq

PT-MMD: A Novel Statistical Framework for the Evaluation of Generative Systems [article]

Alexander Potapov, Ian Colbert, Ken Kreutz-Delgado, Alexander Cloninger, Srinjoy Das
2019 arXiv   pre-print
Stochastic-sampling-based Generative Neural Networks, such as Restricted Boltzmann Machines and Generative Adversarial Networks, are now used for applications such as denoising, image occlusion removal ... We demonstrate the effectiveness of this metric for two cases: (1) Selection of bitwidth and activation function complexity to achieve minimum power-at-performance for Restricted Boltzmann Machines; (2 ... Thanks to CENIC for the 100Gbps networks. AC was partially supported by NSF grant DMS-1819222. The authors would also like to thank Dhiman Sengupta at UCSD. ...
arXiv:1910.12454v1 fatcat:zs3grrwrzjghhjxiyjoeygnnpy
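
For reference, a bare-bones version of an MMD statistic with a permutation test (the "PT" in PT-MMD) can be written in a few lines of numpy; the RBF kernel, bandwidth, and permutation count below are arbitrary choices, not the authors' exact setup.

```python
# Biased MMD^2 estimate with an RBF kernel, plus a permutation-test p-value.
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    return (rbf_kernel(x, x, sigma).mean()
            + rbf_kernel(y, y, sigma).mean()
            - 2 * rbf_kernel(x, y, sigma).mean())

def permutation_test(x, y, n_perm=200, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    observed = mmd2(x, y, sigma)
    pooled, n = np.vstack([x, y]), len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        count += mmd2(pooled[perm[:n]], pooled[perm[n:]], sigma) >= observed
    return observed, (count + 1) / (n_perm + 1)   # statistic, p-value estimate

x = np.random.default_rng(1).normal(size=(50, 2))
y = np.random.default_rng(2).normal(loc=0.5, size=(50, 2))
print(permutation_test(x, y))
```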

UNIQ: Uniform Noise Injection for Non-Uniform Quantization of Neural Networks [article]

Chaim Baskin, Eli Schwartz, Evgenii Zheltonozhskii, Natan Liss, Raja Giryes, Alex M. Bronstein, Avi Mendelson
2018 arXiv   pre-print
We present a novel method for neural network quantization that emulates a non-uniform k-quantile quantizer, which adapts to the distribution of the quantized parameters.  ...  Our approach provides a novel alternative to the existing uniform quantization techniques for neural networks.  ...  in low-precision arithmetic, we start by outlining several common quantization schemes and discussing their suitability for deep neural networks.  ... 
arXiv:1804.10969v3 fatcat:hpzkvj2m6vhc5oj4kczl6hsfdy
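
The k-quantile idea mentioned in the abstract places bin edges at quantiles of the weight distribution, so every bin holds roughly the same number of weights. The numpy sketch below quantizes a weight matrix that way, replacing each weight with its bin's mean; it illustrates the quantizer only, not UNIQ's noise-injection training procedure.

```python
# Non-uniform k-quantile quantizer: quantile bin edges, bin-mean codewords.
import numpy as np

def k_quantile_quantize(w, k=16):
    flat = w.ravel()
    edges = np.quantile(flat, np.linspace(0, 1, k + 1))
    bins = np.clip(np.searchsorted(edges, flat, side="right") - 1, 0, k - 1)
    centers = np.array([flat[bins == i].mean() if np.any(bins == i)
                        else 0.5 * (edges[i] + edges[i + 1]) for i in range(k)])
    return centers[bins].reshape(w.shape)

w = np.random.default_rng(0).normal(size=(256, 128))
wq = k_quantile_quantize(w, k=16)   # 16 levels ~ a 4-bit non-uniform codebook
```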

Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks [article]

Yochai Zur, Chaim Baskin, Evgenii Zheltonozhskii, Brian Chmiel, Itay Evron, Alex M. Bronstein, Avi Mendelson
2019 arXiv   pre-print
Recently, deep learning has become a de facto standard in machine learning with convolutional neural networks (CNNs) demonstrating spectacular success on a wide variety of tasks. ... While mainstream deep learning methods train the neural network's weights while keeping the network architecture fixed, the emerging neural architecture search (NAS) techniques make the latter also amenable ... Introduction Convolutional neural networks (CNNs) have become a main solution for computer vision tasks. However, high computation requirements complicate their usage in low-power systems. ...
arXiv:1904.09872v4 fatcat:7swtp7ya5ja6rn5woa6vkcxfve
Showing results 1 — 15 out of 185 results