413 Hits in 4.6 sec

DeepCache

Mengwei Xu, Mengze Zhu, Yunxin Liu, Felix Xiaozhu Lin, Xuanzhe Liu
2018 Proceedings of the 24th Annual International Conference on Mobile Computing and Networking - MobiCom '18  
We present DeepCache, a principled cache design for deep learning inference in continuous mobile vision.  ...  It addresses a key challenge raised by mobile vision: the cache must operate under video scene variation, while trading off among cacheability, overhead, and loss in model accuracy.  ...  Xuanzhe Liu is the paper's corresponding author.  ...
doi:10.1145/3241539.3241563 dblp:conf/mobicom/XuZLLL18 fatcat:w4pwzh3trjeb5ealpwq4i6c4ea
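The DeepCache entry describes reusing cached CNN results for image regions that change little between video frames. A minimal, hypothetical sketch of that general idea is below; the block size, threshold, and function name are illustrative assumptions, not DeepCache's actual matching algorithm.

```python
import numpy as np

def match_blocks(prev_frame: np.ndarray, cur_frame: np.ndarray,
                 block: int = 16, threshold: float = 8.0) -> np.ndarray:
    """Mark blocks of cur_frame that are similar enough to prev_frame to
    reuse previously cached CNN feature-map regions (illustrative heuristic)."""
    h, w = cur_frame.shape[:2]
    rows, cols = h // block, w // block
    reusable = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            a = prev_frame[r*block:(r+1)*block, c*block:(c+1)*block].astype(np.float32)
            b = cur_frame[r*block:(r+1)*block, c*block:(c+1)*block].astype(np.float32)
            # Mean absolute pixel difference as a cheap similarity proxy.
            reusable[r, c] = np.abs(a - b).mean() < threshold
    return reusable

# Example: on a mostly static scene, most blocks are reusable, so most of the
# convolutional work for those regions could be skipped instead of recomputed.
prev = np.random.randint(0, 256, (224, 224), dtype=np.uint8)
cur = prev.copy()
cur[:32, :32] += 50  # only the top-left corner changed
print(match_blocks(prev, cur).mean())  # fraction of reusable blocks
```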

Boosting Mobile CNN Inference through Semantic Memory [article]

Yun Li, Chen Zhang, Shihao Han, Li Lyna Zhang, Baoqun Yin, Yunxin Liu, Mengwei Xu
2021 arXiv   pre-print
SMTM is prototyped on commodity CNN engine and runs on both mobile CPU and GPU.  ...  low-cost yet accurate cache and lookup; (2) it uses a novel metric in determining the exit timing considering different layers' inherent characteristics; (3) it adaptively adjusts the cache size and semantic  ...  Optimizing FPGA-based accelerator design for deep convolutional neural networks.  ...  control approach to resource-efficient deep neural networks on mobile devices.  ...
arXiv:2112.02644v1 fatcat:gfyecsojvzgaxjgmlwihov26ju
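The SMTM snippet mentions a cache lookup over intermediate semantics and a layer-aware early-exit decision. The sketch below only illustrates that general pattern; the pooled-signature lookup, the distance metric, and the per-layer thresholds are assumptions for illustration, not SMTM's actual design.

```python
import numpy as np

def maybe_exit_early(feature: np.ndarray, cache: dict, layer_threshold: float):
    """Compare a pooled intermediate feature (H x W x C) against cached class
    signatures; return a label if the match is confident enough, else None.
    (Illustrative sketch, not SMTM's actual lookup or exit metric.)"""
    key = feature.mean(axis=(0, 1))            # cheap pooled signature (length C)
    best_label, best_dist = None, np.inf
    for label, signature in cache.items():
        dist = np.linalg.norm(key - signature)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist < layer_threshold else None

def run_with_semantic_cache(layers, thresholds, cache_per_layer, classify, x):
    """Run layer by layer; return a label early on a cache hit, otherwise
    classify the final features as usual."""
    for layer, thr, cache in zip(layers, thresholds, cache_per_layer):
        x = layer(x)
        hit = maybe_exit_early(x, cache, thr)
        if hit is not None:
            return hit              # early exit: skip the remaining layers
    return classify(x)              # no hit: fall back to the full model
```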

Edge Intelligence: Architectures, Challenges, and Applications [article]

Dianlei Xu, Tong Li, Yong Li, Xiang Su, Sasu Tarkoma, Tao Jiang, Jon Crowcroft, Pan Hui
2020 arXiv   pre-print
Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in locations close to where data is captured based on artificial intelligence.  ...  We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems  ...  Xu et al. propose CNNCache, a cache-based software accelerator for mobile continuous vision applications, which reuses the computation of similar image regions to avoid unnecessary computation and saves  ... 
arXiv:2003.12172v2 fatcat:xbrylsvb7bey5idirunacux6pe

Deep Learning on Computational-Resource-Limited Platforms: A Survey

Chunlei Chen, Peng Zhang, Huixiang Zhang, Jiangyan Dai, Yugen Yi, Huihui Zhang, Yonghui Zhang
2020 Mobile Information Systems  
network.  ...  Subsequently, we explore the underlying reasons for the high computational overhead of DL through reviewing the fundamental concepts including capacity, generalization, and backpropagation of a neural  ...  Huynh et al. developed a tool DeepMon for continuous vision applications based on commodity mobile GPUs [37] .  ... 
doi:10.1155/2020/8454327 fatcat:pocvmihd7jcw7ig544s2yduovu

2020-2021 Index IEEE Transactions on Computers Vol. 70

2021 IEEE Transactions on Computers
The Author Index contains the primary entry for each item, listed under the first author's name.  ...  Convolutional neural networks: PermCNN: Energy-Efficient Convolutional Neural Network Hardware Architecture With Permuted Diagonal Structure, TC Feb. 2021, 163-173.  ...  Distributed Deep Convolutional Neural Networks for the Internet-of-Things.  ...
doi:10.1109/tc.2021.3134810 fatcat:p5otlsapynbwvjmqogj47kv5qa

CNN-MERP: An FPGA-Based Memory-Efficient Reconfigurable Processor for Forward and Backward Propagation of Convolutional Neural Networks [article]

Xushen Han, Dajiang Zhou, Shihao Wang, Shinji Kimura
2017 arXiv   pre-print
Large-scale deep convolutional neural networks (CNNs) are widely used in machine learning applications.  ...  CNN-MERP incorporates an efficient memory hierarchy that significantly reduces the bandwidth requirements from multiple optimizations including on/off-chip data allocation, data flow optimization and data reuse  ...  neural network.  ... 
arXiv:1703.07348v1 fatcat:e3e4v6h6qncu3faseqpwn75is4
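CNN-MERP's abstract credits much of its bandwidth reduction to on/off-chip data allocation and data reuse. The back-of-the-envelope model below is only a hypothetical illustration of why tiling plus on-chip weight buffering cuts off-chip traffic; the counting model and the numbers are assumptions, not figures from the paper (halo reads at tile borders are ignored).

```python
def offchip_reads(h, w, c_in, c_out, k, tile=None):
    """Count off-chip element reads for one conv layer under two schedules:
    no on-chip reuse (every multiply fetches both operands off chip) versus a
    tiled schedule that buffers one input tile plus all weights on chip.
    Hypothetical back-of-the-envelope model, not CNN-MERP's exact hierarchy."""
    macs = c_out * h * w * c_in * k * k
    if tile is None:
        return 2 * macs                               # two operand fetches per MAC
    tiles = (h // tile) * (w // tile)
    input_reads = c_in * h * w                        # each input pixel fetched once
    weight_reads = c_out * c_in * k * k * tiles       # weights re-fetched per tile
    return input_reads + weight_reads

print(offchip_reads(56, 56, 64, 64, 3))            # ~2.3e8 fetches with no reuse
print(offchip_reads(56, 56, 64, 64, 3, tile=14))   # ~7.9e5 with a 14x14 on-chip tile
```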

Alleviating Bottlenecks for DNN Execution on GPUs via Opportunistic Computing [article]

Xianwei Cheng, Hui Zhao, Mahmut Kandemir, Saraju Mohanty, Beilei Jiang
2019 arXiv   pre-print
As one of the most widely used platforms for DNN acceleration, GPUs face the bottleneck of on-chip bandwidth.  ...  Simple algorithms such as direct convolution are finding their way in embedded machine learning.  ...  Kim et al. proposed a kernel decomposition method used for binary weight neural networks for operation reduction [19].  ...
arXiv:1910.07055v1 fatcat:y3zminsolngzxkbnkbadmhoche

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
2022 ACM Transactions on Design Automation of Electronic Systems  
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing, and speech recognition.  ...  Apart from general acceleration techniques, we also showcase several task-specific accelerations for point cloud, video, and natural language processing by exploiting their spatial sparsity and temporal  ...  Such continuous relaxation allows optimizing neural network architectures in the continuous space using gradient descent, which greatly improves the search efficiency.  ... 
doi:10.1145/3486618 fatcat:h6xwv2slo5eklift2fl24usine
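The last fragment of this entry refers to the continuous relaxation used in differentiable architecture search: a discrete choice among candidate operations is replaced by a softmax-weighted mixture, so the architecture parameters can be trained by gradient descent alongside the network weights. Below is a generic DARTS-style sketch under that assumption; the candidate ops and parameter values are made up for illustration.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def mixed_op(x, alphas, candidate_ops):
    """Continuous relaxation of a discrete operation choice on one edge:
    the output is a softmax-weighted sum of all candidate ops, so `alphas`
    is differentiable. (Generic sketch, not code from this survey.)"""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, candidate_ops))

# Hypothetical candidate operations on one edge of the search cell.
ops = [lambda x: x,                  # identity / skip connection
       lambda x: np.maximum(x, 0),   # ReLU as a stand-in for a conv op
       lambda x: np.zeros_like(x)]   # "zero" op, which effectively prunes the edge
alphas = np.array([0.1, 1.5, -0.3])  # learnable architecture parameters
x = np.random.randn(4, 4)
print(mixed_op(x, alphas, ops).shape)  # (4, 4)
```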

Boda-RTC: Productive Generation of Portable, Efficient Code for Convolutional Neural Networks on Mobile Computing Platforms [article]

Matthew Moskewicz and Forrest Iandola and Kurt Keutzer
2016 arXiv   pre-print
In particular, convolutional neural networks (CNNs) have been applied to many image based machine learning tasks and have yielded strong results.  ...  The popularity of neural networks (NNs) spans academia, industry, and popular culture.  ...  INTRODUCTION AND MOTIVATION Convolutional neural network (CNN) based approaches have become dominant in a broad variety of computer vision applications, including object detection [12] , video classification  ... 
arXiv:1606.00094v2 fatcat:khbnzb3z6bdcpmd4dughdiflpm

Recent Advances in Efficient Computation of Deep Convolutional Neural Networks [article]

Jian Cheng, Peisong Wang, Gang Li, Qinghao Hu, Hanqing Lu
2018 arXiv   pre-print
As for hardware implementation of deep neural networks, a batch of accelerators based on FPGA/ASIC have been proposed in recent years.  ...  At the same time, the computational complexity and resource consumption of these networks also continue to increase.  ...  Stochastic computing, which represents continuous values through streams of random bits, has been investigated for hardware acceleration of deep neural networks [66, 71, 39].  ...
arXiv:1802.00939v2 fatcat:5mchdjcrc5czhgracihs4jvbmq
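As a concrete reminder of what the stochastic-computing fragment refers to: in the unipolar encoding, a value in [0, 1] becomes the density of ones in a random bitstream, and multiplication reduces to AND-ing two independent streams. The snippet below is a textbook illustration of that idea, not code from any accelerator covered by the survey.

```python
import numpy as np

def to_bitstream(p: float, length: int, rng) -> np.ndarray:
    """Encode a value p in [0, 1] as a random bitstream whose mean is p."""
    return (rng.random(length) < p).astype(np.uint8)

def stochastic_multiply(a: float, b: float, length: int = 4096, seed: int = 0) -> float:
    """Unipolar stochastic multiplication: AND-ing two independent bitstreams
    yields a stream whose ones-density approximates a*b."""
    rng = np.random.default_rng(seed)
    sa = to_bitstream(a, length, rng)
    sb = to_bitstream(b, length, rng)
    return float(np.mean(sa & sb))

print(stochastic_multiply(0.6, 0.5))  # ~0.30; accuracy improves with stream length
```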

A Survey of Near-Data Processing Architectures for Neural Networks [article]

Mehdi Hassanpour, Marc Riera, Antonio González
2021 arXiv   pre-print
network (NN)-based accelerators has grown significantly.  ...  Emerging memory technologies, such as ReRAM and 3D-stacked, are promising for efficiently architecting NDP-based accelerators for NN due to their capabilities to work as both: High-density/low-energy storage  ...  Dally, “Learning both weights and connections for efficient neural network,” in Advances in neural information  ...
arXiv:2112.12630v1 fatcat:drkwrztkazd3hlblxc7i4kgn2a

Autonomous Learning System Towards Mobile Intelligence

Mengwei Xu (Institute of Software, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; Key Laboratory of High Confidence Software Technologies of Ministry of Education, Peking University, Beijing 100871, China), Yuanqiang Liu, Kang Huang, Xuanzhe Liu, Gang Huang
2021 International Journal of Software and Informatics  
Furthermore, by optimization techniques such as model compression, neural network compiler, and runtime cache reuse, AutLearn can significantly reduce the on-client training cost.  ...  AutLearn can also remarkably cut the computational and energy cost of neural network training on mobile devices.  ...  The classic convolutional neural network MobileNet is trained based on AutLearn to realize image classification on mobiles. MobileNet is a network structure specially designed for mobile devices.  ... 
doi:10.21655/ijsi.1673-7288.00247 fatcat:fxbhsznxdvcs3aflfqwpmdlgou

SPARCNet

Adam Page, Ali Jafari, Colin Shea, Tinoosh Mohsenin
2017 ACM Journal on Emerging Technologies in Computing Systems  
In the second contribution, we propose SPARCNet, a hardware accelerator for efficient deployment of SPARse Convolutional NETworks.  ...  In particular, deep convolutional neural networks have been shown to dominate on several popular public benchmarks such as the ImageNet database.  ...  In order to demonstrate both phases, a common convolutional neural network topology is explored within the context of the computer vision dataset CIFAR.  ... 
doi:10.1145/3005448 fatcat:quxiy72jtrfipdpeup75mhiizm

SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads [article]

Sam Likun Xi, Yuan Yao, Kshitij Bhardwaj, Paul Whatmough, Gu-Yeon Wei, David Brooks
2019 arXiv   pre-print
In recent years, there have been tremendous advances in hardware acceleration of deep neural networks.  ...  SMAUG offers researchers a wide range of capabilities for evaluating DNN workloads, from diverse network topologies to easy accelerator modeling and SoC integration.  ...  In particular, for both performance and energy efficiency, dedicated hardware accelerators for deep neural networks (DNNs) have received a phenomenal amount of interest [1]-[7].  ...
arXiv:1912.04481v2 fatcat:akewc2b7xvbm7malvjxcx6xj2i

ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network [article]

David Gschwend
2020 arXiv   pre-print
Convolutional Neural Networks (CNNs) presently achieve record-breaking accuracies in all image understanding benchmarks, but have a very high computational complexity.  ...  It accelerates the full network based on a nested-loop algorithm which minimizes the number of arithmetic operations and memory accesses.  ...
arXiv:2005.06892v1 fatcat:tduahjb5w5cjromemahngmt3gy
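For readers unfamiliar with the "nested-loop algorithm" phrasing: direct convolution is literally a six-deep loop nest over output channels, output pixels, input channels, and kernel positions, and FPGA designs such as ZynqNet reorder and tile that nest to cut arithmetic operations and memory accesses. The reference loop nest below is a generic illustration of the starting point, not ZynqNet's actual schedule.

```python
import numpy as np

def conv2d_nested(ifm: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Direct convolution as explicit nested loops (stride 1, no padding).
    ifm: (C_in, H, W); weights: (C_out, C_in, K, K). Generic illustration of
    the loop nest that FPGA accelerators reorder and tile."""
    c_in, h, w = ifm.shape
    c_out, _, k, _ = weights.shape
    ofm = np.zeros((c_out, h - k + 1, w - k + 1), dtype=np.float32)
    for co in range(c_out):             # output channels
        for y in range(h - k + 1):      # output rows
            for x in range(w - k + 1):  # output columns
                acc = 0.0
                for ci in range(c_in):          # input channels
                    for ky in range(k):         # kernel rows
                        for kx in range(k):     # kernel columns
                            acc += ifm[ci, y + ky, x + kx] * weights[co, ci, ky, kx]
                ofm[co, y, x] = acc
    return ofm

out = conv2d_nested(np.random.randn(3, 8, 8).astype(np.float32),
                    np.random.randn(4, 3, 3, 3).astype(np.float32))
print(out.shape)  # (4, 6, 6)
```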
Showing results 1 — 15 out of 413 results