Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers [article]

Manuele Rusci, Marco Fariselli, Alessandro Capotondi, Luca Benini
2020 arXiv   pre-print
The severe on-chip memory limitations are currently preventing the deployment of the most accurate Deep Neural Network (DNN) models on tiny MicroController Units (MCUs), even if leveraging an effective  ...  To tackle this issue, in this paper we present an automated mixed-precision quantization flow based on the HAQ framework but tailored for the memory and computational characteristics of MCU devices.  ...  Acknowledgments Authors thank the Italian Supercomputing Center CINECA for the access to their HPC facilities.  ... 
arXiv:2008.05124v1 fatcat:2jdeg7vqpzhddbcv2mol5lyahy

Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node [article]

Alfio Di Mauro, Francesco Conti, Pasquale Davide Schiavone, Davide Rossi, Luca Benini
2020 arXiv   pre-print
envelope of 674uW - low enough to enable always-on operation in ultra-low power smart cameras, long-lifetime environmental sensors, and insect-sized pico-drones.  ...  Binary Neural Networks (BNNs) have been shown to be robust to random bit-level noise, making aggressive voltage scaling attractive as a power-saving technique for both logic and SRAMs.  ...  In this work, we advance the state-of-the-art with regards to ultra-low power deep inference with BNNs with three key contributions: i) We propose a strategy to execute noisy BNNs on microcontrollers.  ... 
arXiv:2007.08952v1 fatcat:wj5ecbpaejb7dimlbbz3maomjy

MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers [article]

Colby Banbury, Chuteng Zhou, Igor Fedorov, Ramon Matas Navarro, Urmish Thakker, Dibakar Gope, Vijay Janapa Reddi, Matthew Mattina, Paul N. Whatmough
2021 arXiv   pre-print
However, so-called TinyML presents severe technical challenges, as deep neural network inference demands a large compute and memory budget.  ...  ., the mapping from a given neural network architecture to its inference latency/energy on an MCU.  ...  Gural & Murmann (2019) propose a novel convolution kernel, reducing activation memory and enabling inference on low-end MCUs.  ... 
arXiv:2010.11267v6 fatcat:cte3gwj2wnh3nlg3rnonvbpazu

A Construction Kit for Efficient Low Power Neural Network Accelerator Designs [article]

Petar Jokic, Erfan Azarkhish, Andrea Bonetti, Marc Pons, Stephane Emery, Luca Benini
2021 arXiv   pre-print
Reported optimizations range from up to 10'000x memory savings to 33x energy reductions, providing chip designers an overview of design choices for implementing efficient low power neural network accelerators  ...  Driven by the rapid evolution of network architectures and their algorithmic features, accelerator designs are constantly updated and improved.  ...  Furthermore, knowledge distillation can be used for low-precision quantization [177], improving the accuracy of a highly quantized model using "distilled" knowledge from a larger (higher precision) teacher  ... 
arXiv:2106.12810v1 fatcat:gx7cspazc5fdfoi64t2zjth7am

FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation

Ahmad Shawahna, Sadiq M. Sait, Aiman El-Maleh, Irfan Ahmad
2022 IEEE Access  
In this paper, we propose a novel framework referred to as the Fixed-Point Quantizer of deep neural Networks (FxP-QNet) that flexibly designs a mixed low-precision DNN for integer-arithmetic-only deployment  ...  Specifically, the FxP-QNet gradually adapts the quantization level for each data-structure of each layer based on the trade-off between the network accuracy and the low-precision requirements.  ...  DESIGNING MIXED LOW-PRECISION DEEP NEURAL NETWORKS FOR INTEGER-ONLY DEPLOYMENT In this section, we provide an insight into our Fixed-Point Quantizer of deep neural Networks (FxP-QNet).  ... 
doi:10.1109/access.2022.3157893 fatcat:zbuthvleuvb2tjimyveamne5sq

Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware [article]

Bharath Sudharsan, Dineshkumar Sundaram, Pankesh Patel, John G. Breslin, Muhammad Intizar Ali, Schahram Dustdar, Albert Zomaya, Rajiv Ranjan
2022 arXiv   pre-print
Researchers and developers can use our optimization sequence to optimize high memory, computation demanding models in multiple aspects in order to produce small size, low latency, low-power consuming models  ...  ., are powered by hardware with a constrained specification (low memory, clock speed and processor) which is insufficient to accommodate and execute large, high-quality models.  ...  It is broadly divided into three categories: Machine Learning on Microcontrollers, which focuses on deep optimization of ML models to enable its accommodation and execution on resource-constrained MCUs  ... 
arXiv:2204.10183v1 fatcat:7yelkcwgdvcg5n4t4tmwymsln4

μBrain: An Event-Driven and Fully Synthesizable Architecture for Spiking Neural Networks

Jan Stuijt, Manolis Sifalakis, Amirreza Yousefzadeh, Federico Corradi
2021 Frontiers in Neuroscience  
For these reasons, μBrain is ultra-low-power and offers software-to-hardware fidelity. μBrain enables always-on neuromorphic computing in IoT sensor nodes that require running on battery power for years  ...  application-specific inference.  ...  The networks they trained were one 2D-CNN (seven layers deep) in tandem with a 1D TCN (10 layers deep) with 16 bit fixed-precision weights, which is to be contrasted with our 2-3 layer SNN of only 4-bit  ... 
doi:10.3389/fnins.2021.664208 pmid:34093116 pmcid:PMC8170091 fatcat:edo2oa6xc5ba5jjtdzhhz5c244

Prospector: Multiscale Energy Measurement of Networked Embedded Systems with Wideband Power Signals

Kenji R. Yamamoto, Paul G. Flikkema
2009 2009 International Conference on Computational Science and Engineering  
Experimental results for a prototype Prospector system with a contemporary 16-bit ultra-low power microcontroller show that it can effectively measure power over the extreme time and magnitude scales found  ...  It is based on computer-based control of multimeters to maximize accuracy, precision, flexibility, and minimize target system overhead.  ...  Each node can be considered a micro-scale power grid consisting of multiple consumers, including a microcontroller, one or more radio transceivers, memory chips, I/O peripherals and an array of transducers  ... 
doi:10.1109/cse.2009.413 dblp:conf/cse/YamamotoF09 fatcat:2qmbwjaisbfjnmzufykh63va6i

Memory-Efficient AI Algorithm for Infant Sleeping Death Syndrome Detection in Smart Buildings

Qian Huang, Chenghung Hsieh, Jiaen Hsieh, Chunchen Liu
2021 AI  
Therefore, our proposed memory-efficient AI algorithm has great potential to be deployed and to run on edge devices, such as micro-controllers and Raspberry Pi, which have low memory footprint, limited  ...  Our proposed AI algorithm only requires 6.4 MB of memory space, while other existing AI algorithms for sleep posture detection require 58.2 MB to 275 MB of memory space.  ...  on the newly developed datasets.  ... 
doi:10.3390/ai2040042 fatcat:f3wg3klhgjezjdsuqmcpm3amsy

MLPerf Tiny Benchmark [article]

Colby Banbury, Vijay Janapa Reddi, Peter Torelli, Jeremy Holleman, Nat Jeffries, Csaba Kiraly, Pietro Montino, David Kanter, Sebastian Ahmed, Danilo Pau, Urmish Thakker, Antonio Torrini (+10 others)
2021 arXiv   pre-print
To meet this need, we present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems.  ...  Additionally, MLPerf Tiny implements a modular design that enables benchmark submitters to show the benefits of their product, regardless of where it falls on the ML deployment stack, in a fair and reproducible  ...  The TFLite model for IC is 96KB in size and fits on most 32-bit embedded microcontrollers.  ... 
arXiv:2106.07597v4 fatcat:ps4y36uq4nevxfbe7p3tne4opu

The Final Frontier: Deep Learning in Space [article]

Vivek Kothari, Edgar Liberis, Nicholas D. Lane
2020 arXiv   pre-print
In this work, we identify deep learning in space as one of development directions for mobile and embedded machine learning.  ...  We detail and contextualise compute platform of satellites and draw parallels with embedded systems and current research in deep learning for resource-constrained environments.  ...  Aakanksha Chowdhery and other anonymous reviewers for their input throughout the submission process for ACM HotMobile 2020.  ... 
arXiv:2001.10362v2 fatcat:4wqslkravnebrilhvhmtsce4ee

Machine Learning Systems for Intelligent Services in the IoT: A Survey [article]

Wiebke Toussaint, Aaron Yi Ding
2020 arXiv   pre-print
This survey moves beyond existing ML algorithms and cloud-driven design to investigate the less-explored systems, scaling and socio-technical aspects for consolidating ML and IoT.  ...  It covers the latest developments (up to 2020) on scaling and distributing ML across cloud, edge, and IoT devices.  ...  [165] also allows for quantization with mixed precision on different neural network layers on the two MobilenetV1/2 image models.  ... 
arXiv:2006.04950v3 fatcat:xrjcioqkrrhpvgmwmutiajgfbe

Evolutionary Algorithms in Approximate Computing: A Survey

Lukas Sekanina
2021 Journal of Integrated Circuits and Systems  
The neural architecture search enabling the automated hardware-aware design of approximate deep neural networks was identified as a newly emerging topic in this area.  ...  This paper deals with evolutionary approximation as one of the popular approximation methods.  ...  EVOLUTIONARY APPROXIMATION IN CNNS The unprecedented success of machine learning methods based on deep neural networks comes with the very high computation cost needed to train these networks [61].  ... 
doi:10.29292/jics.v16i2.499 fatcat:fyklv5cl7zfdrccvkmqjg6lfle

Evolutionary Algorithms in Approximate Computing: A Survey [article]

Lukas Sekanina
2021 arXiv   pre-print
The neural architecture search enabling the automated hardware-aware design of approximate deep neural networks was identified as a newly emerging topic in this area.  ...  This paper deals with evolutionary approximation as one of the popular approximation methods.  ...  EVOLUTIONARY APPROXIMATION IN CNNS The unprecedented success of machine learning methods based on deep neural networks comes with the very high computation cost needed to train these networks [61].  ... 
arXiv:2108.07000v1 fatcat:fgmccjbnlvet5h5kehmt6hslm4

Boost Precision Agriculture with Unmanned Aerial Vehicle Remote Sensing and Edge Intelligence: A Survey

Jia Liu, Jianjian Xiang, Yongjun Jin, Renhua Liu, Jining Yan, Lizhe Wang
2021 Remote Sensing  
However, most DL-based methods place high computation, memory and network demands on resources.  ...  Cloud computing can increase processing efficiency with high scalability and low cost, but results in high latency and great pressure on the network bandwidth.  ...  In recent years, low-bit quantization is becoming popular for deep model compression and acceleration.  ... 
doi:10.3390/rs13214387 fatcat:amrm5blon5hmhnk7arme2vsqwq