4,052 Hits in 4.8 sec

GPGPU Accelerated Deep Object Classification on a Heterogeneous Mobile Platform

Syed Rizvi, Gianpiero Cabodi, Denis Patti, Gianluca Francini
2016 Electronics  
Deep convolutional neural networks achieve state-of-the-art performance in image classification.  ...  The Compute Unified Device Architecture (CUDA)-based implementation of the proposed approach is evaluated over three different image classification networks on a Tegra K1 CPU-GPU mobile processor.  ...  A mobile-GPU accelerated deep neural network flow is presented in [25]. Different techniques to optimize the various components of a typical neural network flow on mobile devices are discussed.  ... 
doi:10.3390/electronics5040088 fatcat:oexwebv5wvarlomsmxihvvd2qi

Deep Learning Acceleration Techniques for Real Time Mobile Vision Applications [article]

Gael Kamdem De Teyou
2019 arXiv   pre-print
As a consequence, the possibility of implementing deep neural networks to mobile environments has attracted a lot of researchers.  ...  For the particular case of computer vision, several algorithms like object detection in real time videos have been proposed and they work well on Desktop GPUs and distributed computing platforms.  ...  CNNdroid presented in [44], is an open source GPU-accelerated library, dubbed CNNdroid, which is specifically designed and optimized for execution of trained deep CNNs on Android-based mobile devices  ... 
arXiv:1905.03418v2 fatcat:mxtgdesm2fafbjmyuck5jkphpa

PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones [article]

Gang Chen, Shengyu He, Haitao Meng, Kai Huang
2019 arXiv   pre-print
In this paper, we propose PhoneBit, a GPU-accelerated BNN inference engine for Android-based mobile devices that fully exploits the computing power of BNNs on mobile GPUs.  ...  Over the last years, a great success of deep neural networks (DNNs) has been witnessed in computer vision and other fields.  ...  To the best of our knowledge, such GPU-accelerated BNN libraries are not available yet on mobile platforms. III. BACKGROUND A.  ... 
arXiv:1912.04050v1 fatcat:2tdhoyu2qvcg3pddi5uoojlcea

Optimized Deep Neural Networks for Real-Time Object Classification on Embedded GPUs

Syed Rizvi, Gianpiero Cabodi, Gianluca Francini
2017 Applied Sciences  
Convolution is the most computationally intensive task of the Convolutional Neural Network (CNN). It requires a lot of memory storage and computational power.  ...  Proposed flow is evaluated on two different embedded platforms: first on an Nvidia Jetson TX1 embedded board and then on a Tegra K1 GPU of an Nvidia Shield K1 Tablet.  ...  Author Contributions: Syed Tahir Hussain Rizvi conducted the experiments and worked on the draft of the paper. Gianpiero Cabodi and Gianluca Francini are the academic tutors.  ... 
doi:10.3390/app7080826 fatcat:i7fwrv7umjcorhl7hwm3qnyrby

RSTensorFlow

Moustafa Alzantot, Yingnan Wang, Zhengshuang Ren, Mani B. Srivastava
2017 Proceedings of the 1st International Workshop on Deep Learning for Mobile Systems and Applications - EMDL '17  
We evaluate our system on different android phones models to study the trade-offs of running different neural network operations on the GPU.  ...  We also compare the performance of running different models architectures such as convolutional and recurrent neural networks on CPU only vs using heterogeneous computing resources.  ...  Acknowledgments This research was supported in part by the NIH Center of Excellence for Mobile Sensor Data-to-Knowledge (MD2K) under award 1-U54EB020404-01, and by the U.S.  ... 
doi:10.1145/3089801.3089805 pmid:29629431 pmcid:PMC5889131 dblp:conf/mobisys/AlzantotWRS17 fatcat:j3hd5ruiu5hshd2qs2icfvj32q

AI Benchmark: Running Deep Neural Networks on Android Smartphones [article]

Andrey Ignatov, Radu Timofte, William Chou, Ke Wang, Max Wu, Tim Hartley, Luc Van Gool
2018 arXiv   pre-print
We give an overview of the hardware acceleration resources available on four main mobile chipset platforms: Qualcomm, HiSilicon, MediaTek and Samsung.  ...  In this paper, we present a study of the current state of deep learning in the Android ecosystem and describe available frameworks, programming models and the limitations of running AI on smartphones.  ...  Especially interesting in the context of AI and deep learning are Nvidia Tegra platforms that are supporting CUDA [68] and cuDNN [69] GPU-accelerated libraries of primitives for deep neural networks  ... 
arXiv:1810.01109v2 fatcat:ad76mlp7vjdyddzm5sesq3cmee

AI Benchmark: Running Deep Neural Networks on Android Smartphones [chapter]

Andrey Ignatov, Radu Timofte, William Chou, Ke Wang, Max Wu, Tim Hartley, Luc Van Gool
2019 Lecture Notes in Computer Science  
We give an overview of the hardware acceleration resources available on four main mobile chipset platforms: Qualcomm, HiSilicon, MediaTek and Samsung.  ...  In this paper, we present a study of the current state of deep learning in the Android ecosystem and describe available frameworks, programming models and the limitations of running AI on smartphones.  ...  Especially interesting in the context of AI and deep learning are Nvidia Tegra platforms that are supporting CUDA [32] and cuDNN [10] GPU-accelerated libraries of primitives for deep neural networks  ... 
doi:10.1007/978-3-030-11021-5_19 fatcat:vxurra2fmbf2xigbbthpwzmgta

CNNdroid

Seyyed Salar Latifi Oskouei, Hossein Golestani, Matin Hashemi, Soheil Ghiasi
2016 Proceedings of the 2016 ACM on Multimedia Conference - MM '16  
We present a GPU-accelerated library, dubbed CNNdroid, for execution of trained deep CNNs on Android-based mobile devices.  ...  Many mobile applications running on smartphones and wearable devices would potentially benefit from the accuracy and scalability of deep CNN-based machine learning algorithms.  ...  On mobile platforms, to the best of our knowledge, such GPU-accelerated libraries are not available.  ... 
doi:10.1145/2964284.2973801 dblp:conf/mm/OskoueiGHG16 fatcat:kxsw7qyp6jaqrjg25wpvnizxuu

Hardware-aware mobile building block evaluation for computer vision [article]

Maxim Bonnaerens, Matthias Freiberger, Marian Verhelst, Joni Dambre
2022 arXiv   pre-print
Our comparison uses pareto fronts based on randomly sampled networks from a design space to capture the underlying accuracy/complexity trade-offs.  ...  This highlights the importance of benchmarking building blocks as a preselection step in the design process of a neural network.  ...  HARDWARE-AWARE MOBILE BUILDING BLOCKS EVALUATION In this section we show the results of our evaluation of mobile building blocks for convolutional neural networks on various hardware platforms.  ... 
arXiv:2208.12694v1 fatcat:gitqi2qkxnhndhsh3kvllzuxqq

MB-CNN: Memristive Binary Convolutional Neural Networks for Embedded Mobile Devices

Arjun Pal Chowdhury, Pranav Kulkarni, Mahdi Nazm Bojnordi
2018 Journal of Low Power Electronics and Applications  
In particular, convolutional neural networks have emerged as one of the most powerful techniques in computer vision, speech recognition, and AI applications that can improve the mobile user experience.  ...  Applications of neural networks have gained significant importance in embedded mobile devices and Internet of Things (IoT) nodes.  ...  [81] proposed FPGA solution to accelerate convolution neural network in an embedded device. GPU based CNN acceleration for low power embedded device was proposed by Motamedi et al. [82].  ... 
doi:10.3390/jlpea8040038 fatcat:qwrw67tx4ffuthzy5xi4o4ee2y

Neural Architecture Search Survey: A Hardware Perspective

Krishna Teja Chitty-Venkata, Arun K. Somani
2022 ACM Computing Surveys  
We review the problem of automating hardware-aware architectural design process of Deep Neural Networks (DNNs).  ...  of neural algorithm and hardware accelerator specifications.  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s).  ... 
doi:10.1145/3524500 fatcat:4ibnwmgbdnbhjpk4u7soc6aom4

GPU-Based Embedded Intelligence Architectures and Applications

Li Minn Ang, Kah Phooi Seng
2021 Electronics  
This paper gives a comprehensive review and representative studies of the emerging and current paradigms for GPU-based EI with the focus on the architecture, technologies and applications: (1) First, the  ...  technologies for GPU-based deep learning techniques and applications are discussed in detail; and (3) Third, various architecture technologies for machine learning techniques and applications are discussed  ...  The face recognition task was realized using a convolution neural network (CNN) on a hardware platform utilizing an embedded GPU.  ... 
doi:10.3390/electronics10080952 fatcat:paubm2sevbhixi2in63ayflmti

Deep Learning for Mobile Multimedia

Kaoru Ota, Minh Son Dao, Vasileios Mezaris, Francesco G. B. De Natale
2017 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
Specifically, in recent years powerful and compact GPUs have been released at affordable prices, which allow accelerating the computation of the weights of DNNs.  ...  area, looking back to the evolution of neural networks, and arriving to the most recent results in terms of methodologies, technologies and applications for mobile environments. in Facebook.  ...  DeepSense [51], a mobile GPU-based deep convolution neural network (CNN) framework, is designed to run CNNs on mobile devices that are equipped with GPUs.  ... 
doi:10.1145/3092831 fatcat:ez2fcgckhjawlfywyecest4jqy

A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks

Vinayak Gokhale, Jonghoon Jin, Aysegul Dundar, Berin Martini, Eugenio Culurciello
2014 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops  
In this paper we present nn-X: a scalable, low-power coprocessor for enabling real-time execution of deep neural networks. nn-X is implemented on programmable logic devices and comprises an array of configurable  ...  These collections perform the most common operations in deep networks: convolution, subsampling and non-linear functions.  ...  Deep networks like convolutional neural networks are inherently parallel and can be accelerated on custom hardware to give a low powered mobile system capable of achieving high performance.  ... 
doi:10.1109/cvprw.2014.106 dblp:conf/cvpr/GokhaleJDMC14 fatcat:vgkyi5s5kfdxnp6cxgmosyswdu

Accelerating convolutional neural network by exploiting sparsity on GPUs [article]

Weizhi Xu, Shengyu Fan, Hui Yu, Xin Fu
2022 arXiv   pre-print
Convolutional neural network (CNN) is an important deep learning method. The convolution operation takes a large proportion of the total execution time for CNN.  ...  Based on these observations, we propose two new methods to accelerate CNN on GPUs. The first method focuses on accelerating convolution operation and reducing the calculation of zero values.  ...  Besides, a quantized approach is proposed to accelerate and compress convolutional networks on mobile devices.  ... 
arXiv:1909.09927v5 fatcat:xdydyeme3baxfebyiztqtnsmui