MobileNetV2: Inverted Residuals and Linear Bottlenecks

Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
We demonstrate that this improves performance and provide an intuition that led to this design.  ...  We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite.  ...  Acknowledgments We would like to thank Matt Streeter and Sergey Ioffe for their helpful feedback and discussion.  ... 
doi:10.1109/cvpr.2018.00474 dblp:conf/cvpr/SandlerHZZC18 fatcat:mwxsieetgjaxjlnynflic2gc2q

MobileNetV2: Inverted Residuals and Linear Bottlenecks [article]

Mark Sandler and Andrew Howard and Menglong Zhu and Andrey Zhmoginov and Liang-Chieh Chen
2019 arXiv   pre-print
We demonstrate that this improves performance and provide an intuition that led to this design.  ...  representations in the input an MobileNetV2 uses lightweight depthwise convolutions to filter features in the intermediate expansion layer.  ...  Acknowledgments We would like to thank Matt Streeter and Sergey Ioffe for their helpful feedback and discussion.  ... 
arXiv:1801.04381v4 fatcat:obatzv53rrgyvpzmhs5iq425xu

Structured Multi-Hashing for Model Compression [article]

Elad Eban, Yair Movshovitz-Attias, Hao Wu, Mark Sandler, Andrew Poon, Yerlan Idelbayev, Miguel A. Carreira-Perpinan
2019 arXiv   pre-print
In this work we combine ideas from weight hashing and dimensionality reductions resulting in a simple and powerful structured multi-hashing method based on matrix products that allows direct control of  ...  We demonstrate the strength of our approach by compressing models from the ResNet, EfficientNet, and MobileNet architecture families.  ...  We ask the question: Can deep models be accurate when using an extremely small number of trainable variables? Can this be done for an architecture that was not specifically designed for this purpose?  ... 
arXiv:1911.11177v1 fatcat:6osyq4nuqnazhf6fzmgoebvrra

Structured Multi-Hashing for Model Compression

Elad Eban, Yair Movshovitz-Attias, Hao Wu, Mark Sandler, Andrew Poon, Yerlan Idelbayev, Miguel A. Carreira-Perpinan
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
In this work we combine ideas from weight hashing and dimensionality reductions resulting in a simple and powerful structured multi-hashing method based on matrix products that allows direct control of  ...  We demonstrate the strength of our approach by compressing models from the ResNet, EfficientNet, and MobileNet architecture families.  ...  We ask the question: Can deep models be accurate when using an extremely small number of trainable variables? Can this be done for an architecture that was not specifically designed for this purpose?  ... 
doi:10.1109/cvpr42600.2020.01192 dblp:conf/cvpr/EbanMWSPIC20 fatcat:dmax6v7ztzct5mh4ijvwba3ioy

EDSSA: An Encoder-Decoder Semantic Segmentation Networks Accelerator on OpenCL-Based FPGA Platform

Hongzhi Huang, Yakun Wu, Mengqi Yu, Xuesong Shi, Fei Qiao, Li Luo, Qi Wei, Xinjun Liu
2020 Sensors  
We introduce the related technologies, architecture design, algorithm optimization, and hardware implementation of the Encoder-Decoder semantic segmentation network SegNet as an example, and undertake  ...  Using an Intel Arria-10 GX1150 platform for evaluation, our work achieves a throughput higher than 432.8 GOP/s with power consumption of about 20 W, which is a 1.2× improvement in the energy-efficiency  ...  Conflicts of Interest: The authors declare no conflict of interest. Sensors 2020, 20, 3969  ... 
doi:10.3390/s20143969 pmid:32708851 fatcat:bfzh6ou5djdf5j46734bnbdjgy

Recurrently Decomposable 2-D Convolvers for FPGA-Based Digital Image Processing

Zhao-Bin Ma, Yang Yang, Yun-Xia Liu, Anil Anthony Bharath
2016 IEEE Transactions on Circuits and Systems - II - Express Briefs  
The conclusion is that RD based architectures achieve higher area efficiency than other previously reported state-of-the-art methods, especially for larger convolution masks.  ...  An area efficiency metric is also suggested, which allows the most appropriate architecture to be selected.  ...  ACKNOWLEDGEMENT The authors would like to thank the associate editor and the reviewers for helpful comments that greatly improved this brief.  ... 
doi:10.1109/tcsii.2016.2536202 fatcat:udiakci4ureb7m6rzlorlbrl6a

A Multiplier-less Implementation of Two-Dimensional Circular-Support Wavelet Transform on FPGA

Jassim Abdul-Jabbar, Zahraa Abede, Akram Dawood
2013 Iraqi Journal for Electrical And Electronic Engineering  
The FPGA (Spartan-3E) Kit is used to implement the resulting architecture in a multiplier-less manner and to calculate the die area and the critical path or maximum frequency of operation.  ...  The designed 2-D wavelet filter bank is realized in a separable architecture.  ...  The corresponding 2-D wavelet filters are designed. Then a multiplier-less implementation of such 2-D CSWT on FPGA is proposed.  ... 
doi:10.37917/ijeee.9.1.2 fatcat:lt3vzhkypfaxfjbbeitbhcbecu

Review of Efficient Discrete Wavelet Filter based CSD Technique

Naveen Raikwar, Navneet Kaur
2016 International Journal of Computer Applications  
With the use of this architecture design the speed of the wavelet packet transforms will be increased by a factor of two but the occupied area of the circuit will be less than double.  ...  We have proposed canonic signed digit (CSD) arithmetic based design for low complexity and efficient implementation of discrete wavelet packet transform.  ...  The proposed architecture can therefore be used for area-delay efficient and energy efficient implementation of multi-level 2-Dimensional Discrete Wavelet Transform by using Daubechies as well  ... 
doi:10.5120/ijca2016911926 fatcat:7qrx4fp3c5ektefiyi6bhr7d7y

An Efficient Area and Power 2D-DWT Lifting using Radix-8 Modified Booth Algorithm

S.Angel Latha Mary, N. Dharani
2014 International Journal of Communication and Networking System  
Lifting based and Convolution based designs have been suggested for efficient VLSI implementation of 2D-DWT.  ...  This paper presents an efficient implementation of high speed multiplier using the shift and adds method, Radix-8 modified Booth multiplier algorithm.  ...  This transform is very efficient for multi-resolution decomposition of signals.  ... 
doi:10.20894/ijcnes.103.003.001.001 fatcat:vgkz2pluvrhqnfzzudi5yygupi

Time and Area Efficient 2-D DWT using Multiplier-less Canonic Signed Digit Technique

2019 International journal of recent technology and engineering  
A multiplier-less equipment usage approach gives an answer to diminish chip region, lower equipment intricacy and raise throughput of calculation of the DWT design. The proposed design outline is (i) priority  ...  Based on the proposed design outline four separate design approaches and concurrent architectures are presented in this thesis for area-delay and power efficient realization of multilevel 2-D DWT. In this  ...  The proposed CSD-based 1-D DWT structure involves significantly less logic resources than the similar existing multiplier-less designs and, it has less bit-cycle period than others.  ... 
doi:10.35940/ijrte.d7419.118419 fatcat:patdxpvc25c2tprrdxn5cnwe3y

CapsAcc: An Efficient Hardware Accelerator for CapsuleNets with Data Reuse [article]

Alberto Marchisio, Muhammad Abdullah Hanif, Muhammad Shafique
2018 arXiv   pre-print
State-of-the-art convolutional DNN accelerators would not work efficiently for CapsuleNets, as their designs do not account for key operations involved in CapsuleNets, like squashing and dynamic routing  ...  Recently, CapsuleNets have overtaken traditional DNNs, because of their improved generalization ability due to the multi-dimensional capsules, in contrast to the single-dimensional neurons.  ...  Squashing The squashing is an activation function designed to efficiently fit for the prediction vector.  ... 
arXiv:1811.08932v1 fatcat:72ctodtjwrca7ngmppf3rqttoy

An Efficient VLSI Architecture of Fixed and Reconfigurable FIR based on Booth Multiplier

Ms. Kanaka K, Ilayaraja M.E
2017 IJIREEICE  
filters, the block implementation of direct-form FIR structure has less ADP than the proposed structure by using Booth Multipliers.  ...  The possibility of realization of block FIR filter in transpose form configuration for area-delay efficient realization of large order FIR filters for both fixed and reconfigurable applications.  ...  Besides, it is shown that the proposed systolic designs for circular convolution can be used for computation of linear convolution as well.  ... 
doi:10.17148/ijireeice.2017.5644 fatcat:wjgn2hyr7ne63iuy5cpdh47gxm

Design and Implementation of Deep Neural Network for Edge Computing

Junyang ZHANG, Yang GUO, Xiao HU, Rongzhen LI
2018 IEICE transactions on information and systems  
For an edge oriented computing vector processor, combined with a specific neural network model, a new data layout method for putting the input feature maps in DDR, rearrangement of the convolutional kernel  ...  Aiming at the difficulty of parallelism of two-dimensional matrix convolution, a method of parallelizing the matrix convolution calculation in the third dimension is proposed, by setting the vector register  ...  Acknowledgments We would like to thank the project of (2016YFB0200401) the National Key Research and Development Program of China and (60133007, 61572025) the National Natural Science Foundation of China  ... 
doi:10.1587/transinf.2018edp7044 fatcat:2qmt4l76grebbiwwt54smmqlyq

HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions [article]

Duo Li, Aojun Zhou, Anbang Yao
2019 arXiv   pre-print
MobileNets, a class of top-performing convolutional neural network architectures in terms of accuracy and efficiency trade-off, are increasingly used in many resource-aware vision applications.  ...  of less than 40 MFLOPs.  ...  Inception series [16, 38] customizes group convolution application by coupling it with multi-branch design.  ... 
arXiv:1908.03888v1 fatcat:emm3kleypnfzffosz7xhivjwj4

Evolutionary Neural Architecture Search Supporting Approximate Multipliers [article]

Michal Pinos and Vojtech Mrazek and Lukas Sekanina
2021 arXiv   pre-print
We propose a multi-objective NAS method based on Cartesian genetic programming for evolving convolutional neural networks (CNN).  ...  The most suitable approximate multipliers are automatically selected from a library of approximate multipliers.  ...  The computational experiments were supported by The Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project "e-Infrastructure  ... 
arXiv:2101.11883v1 fatcat:ser3sjlspjfjplryeh5mxo3oum