Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers
[article]
2020
arXiv
pre-print
The severe on-chip memory limitations are currently preventing the deployment of the most accurate Deep Neural Network (DNN) models on tiny MicroController Units (MCUs), even if leveraging an effective ...
To tackle this issue, in this paper we present an automated mixed-precision quantization flow based on the HAQ framework but tailored for the memory and computational characteristics of MCU devices. ...
Acknowledgments: The authors thank the Italian Supercomputing Center CINECA for access to their HPC facilities. ...
arXiv:2008.05124v1
fatcat:2jdeg7vqpzhddbcv2mol5lyahy
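The flow described above builds on HAQ's reinforcement-learning search; purely as an illustration of the underlying problem it automates, here is a minimal greedy sketch that assigns per-layer weight bitwidths under an MCU flash budget. All numbers (layer sizes, budget) are hypothetical, and a real flow would also weigh accuracy loss, not just memory.

```python
# Minimal sketch of the problem a mixed-precision flow solves: choose a
# per-layer weight bitwidth so the model fits an MCU memory budget.
# Hypothetical layer sizes; the paper's HAQ-based flow uses reinforcement
# learning rather than this greedy heuristic.

layer_params = [4_000, 36_000, 147_000, 512_000]  # weights per layer (made up)
candidate_bits = [8, 4, 2]
flash_budget_bits = 2_000_000  # e.g. ~250 KB of flash for weights

# Start everything at 8 bits, then lower precision on the largest layers
# first until the model fits.
bits = [8] * len(layer_params)

def total_bits():
    return sum(p * b for p, b in zip(layer_params, bits))

for i in sorted(range(len(layer_params)), key=lambda i: -layer_params[i]):
    for b in candidate_bits:
        bits[i] = b
        if total_bits() <= flash_budget_bits:
            break
    if total_bits() <= flash_budget_bits:
        break

print(bits, total_bits())  # e.g. [8, 8, 4, 2] -> fits the budget
```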
Always-On 674μW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node
[article]
2020
arXiv
pre-print
... envelope of 674μW, low enough to enable always-on operation in ultra-low-power smart cameras, long-lifetime environmental sensors, and insect-sized pico-drones. ...
Binary Neural Networks (BNNs) have been shown to be robust to random bit-level noise, making aggressive voltage scaling attractive as a power-saving technique for both logic and SRAMs. ...
In this work, we advance the state-of-the-art with regards to ultra-low power deep inference with BNNs with three key contributions: i) We propose a strategy to execute noisy BNNs on microcontrollers. ...
arXiv:2007.08952v1
fatcat:wj5ecbpaejb7dimlbbz3maomjy
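BNNs owe both their efficiency and their bit-error resilience to the fact that, with weights and activations constrained to ±1, a dot product reduces to XNOR plus popcount. A minimal sketch of that trick, with hypothetical bit-packed vectors:

```python
# Minimal sketch of the XNOR-popcount trick behind BNN inference:
# with weights and activations in {-1, +1} packed as bits (1 = +1, 0 = -1),
# a dot product reduces to XNOR plus popcount.

N = 8                      # vector length (hypothetical)
a_bits = 0b10110010        # packed activations (made up)
w_bits = 0b10010110        # packed weights (made up)

xnor = ~(a_bits ^ w_bits) & ((1 << N) - 1)   # 1 where the signs agree
matches = bin(xnor).count("1")
dot = 2 * matches - N      # agreements minus disagreements

# Reference computation with the unpacked +/-1 vectors:
a = [1 if (a_bits >> i) & 1 else -1 for i in range(N)]
w = [1 if (w_bits >> i) & 1 else -1 for i in range(N)]
assert dot == sum(x * y for x, y in zip(a, w))
print(dot)
```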
MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers
[article]
2021
arXiv
pre-print
However, so-called TinyML presents severe technical challenges, as deep neural network inference demands a large compute and memory budget. ...
... the mapping from a given neural network architecture to its inference latency/energy on an MCU. ...
Gural & Murmann (2019) propose a novel convolution kernel, reducing activation memory and enabling inference on low-end MCUs. ...
arXiv:2010.11267v6
fatcat:cte3gwj2wnh3nlg3rnonvbpazu
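A key observation in MicroNets is that MCU inference latency scales roughly linearly with a model's op count, so ops can serve as a cheap latency proxy during architecture search. A minimal sketch of fitting such a linear proxy by least squares; the (ops, latency) measurements below are invented for illustration:

```python
# Fit a linear latency proxy latency = slope * ops + intercept from
# measured (op count, latency in ms) pairs. Values are made up.

measurements = [(2e6, 21.0), (5e6, 50.5), (8e6, 83.0), (12e6, 120.5)]

n = len(measurements)
sx = sum(x for x, _ in measurements)
sy = sum(y for _, y in measurements)
sxx = sum(x * x for x, _ in measurements)
sxy = sum(x * y for x, y in measurements)

slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)    # ms per op
intercept = (sy - slope * sx) / n                     # fixed overhead

def predict_latency_ms(ops):
    return slope * ops + intercept

print(predict_latency_ms(6e6))  # predicted latency for a candidate model
```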
A Construction Kit for Efficient Low Power Neural Network Accelerator Designs
[article]
2021
arXiv
pre-print
Reported optimizations range from up to 10,000× memory savings to 33× energy reductions, providing chip designers with an overview of design choices for implementing efficient low-power neural network accelerators ...
Driven by the rapid evolution of network architectures and their algorithmic features, accelerator designs are constantly updated and improved. ...
Furthermore, knowledge distillation can be used for low-precision quantization [177], improving the accuracy of a highly quantized model using "distilled" knowledge from a larger (higher-precision) teacher ...
arXiv:2106.12810v1
fatcat:gx7cspazc5fdfoi64t2zjth7am
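As a rough illustration of the distillation objective referenced above [177], the sketch below computes the standard Hinton-style loss: hard-label cross-entropy plus temperature-softened KL divergence against a teacher. The logits, temperature, and mixing weight are all hypothetical:

```python
# Toy distillation loss: a low-precision student matches the
# temperature-softened outputs of a full-precision teacher.
import math

def softmax(logits, T=1.0):
    m = max(l / T for l in logits)
    exps = [math.exp(l / T - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

teacher_logits = [4.0, 1.5, 0.2]   # hypothetical full-precision teacher
student_logits = [3.1, 1.9, 0.4]   # hypothetical quantized student
label, T, alpha = 0, 4.0, 0.5

# Hard-label cross-entropy for the student.
ce = -math.log(softmax(student_logits)[label])

# KL divergence between softened teacher and student distributions,
# scaled by T^2 as in Hinton et al.'s formulation.
p = softmax(teacher_logits, T)
q = softmax(student_logits, T)
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

loss = alpha * ce + (1 - alpha) * (T ** 2) * kl
print(loss)
```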
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
2022
IEEE Access
In this paper, we propose a novel framework referred to as the Fixed-Point Quantizer of deep neural Networks (FxP-QNet) that flexibly designs a mixed low-precision DNN for integer-arithmetic-only deployment ...
Specifically, the FxP-QNet gradually adapts the quantization level for each data-structure of each layer based on the trade-off between the network accuracy and the low-precision requirements. ...
DESIGNING MIXED LOW-PRECISION DEEP NEURAL NETWORKS FOR INTEGER-ONLY DEPLOYMENT: In this section, we provide insight into our Fixed-Point Quantizer of deep neural Networks (FxP-QNet). ...
doi:10.1109/access.2022.3157893
fatcat:zbuthvleuvb2tjimyveamne5sq
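A minimal sketch of the dynamic fixed-point idea: each tensor gets its own fractional length, chosen so the largest magnitude still fits the integer range. This is a simplification of FxP-QNet's per-data-structure adaptation, and the weight values and bitwidth are hypothetical:

```python
# Dynamic fixed point: values become integers scaled by 2^-frac_bits,
# with frac_bits chosen per tensor from its dynamic range.
import math

def quantize_fxp(values, total_bits):
    """Signed fixed point: sign bit + int_bits + frac_bits = total_bits."""
    max_abs = max(abs(v) for v in values)
    int_bits = max(0, math.ceil(math.log2(max_abs + 1e-12)))
    frac_bits = total_bits - 1 - int_bits
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = [min(max(round(v * scale), qmin), qmax) for v in values]
    return q, frac_bits

weights = [0.72, -1.35, 0.05, 2.4, -0.6]   # made-up tensor
q, frac_bits = quantize_fxp(weights, total_bits=8)
dequant = [qi / (2 ** frac_bits) for qi in q]
print(q, frac_bits, dequant)
```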
Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware
[article]
2022
arXiv
pre-print
Researchers and developers can use our optimization sequence to optimize memory- and computation-demanding models along multiple dimensions, producing small, low-latency, low-power models ...
... are powered by hardware with constrained specifications (low memory, clock speed, and processor) that are insufficient to accommodate and execute large, high-quality models. ...
It is broadly divided into three categories: Machine Learning on Microcontrollers, which focuses on deep optimization of ML models to enable their accommodation and execution on resource-constrained MCUs ...
arXiv:2204.10183v1
fatcat:7yelkcwgdvcg5n4t4tmwymsln4
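As a toy illustration of such an optimization sequence, the sketch below chains two common components, magnitude pruning followed by 8-bit affine quantization, on a bare weight list. The values are invented, and the paper's actual sequence covers more components than these two:

```python
# Two-stage optimization sequence on a made-up weight list:
# magnitude pruning, then affine uint8 quantization of the result.

weights = [0.8, -0.02, 0.45, 0.01, -0.9, 0.03, 0.6, -0.5]

# Step 1: prune weights at or below the median magnitude to zero.
k = len(weights) // 2
cutoff = sorted(abs(w) for w in weights)[k - 1]
pruned = [0.0 if abs(w) <= cutoff else w for w in weights]

# Step 2: affine quantization of the surviving weights to [0, 255].
lo, hi = min(pruned), max(pruned)
scale = (hi - lo) / 255 or 1.0
zero_point = round(-lo / scale)
quantized = [round(w / scale) + zero_point for w in pruned]

print(pruned)
print(quantized, scale, zero_point)
```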
μBrain: An Event-Driven and Fully Synthesizable Architecture for Spiking Neural Networks
2021
Frontiers in Neuroscience
For these reasons, μBrain is ultra-low-power and offers software-to-hardware fidelity. μBrain enables always-on neuromorphic computing in IoT sensor nodes that require running on battery power for years ...
... application-specific inference. ...
The networks they trained were one 2D-CNN (seven layers deep) in tandem with a 1D TCN (10 layers deep) with 16-bit fixed-precision weights, which is to be contrasted with our 2-3 layer SNN of only 4-bit ...
doi:10.3389/fnins.2021.664208
pmid:34093116
pmcid:PMC8170091
fatcat:edo2oa6xc5ba5jjtdzhhz5c244
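A minimal sketch of the event-driven neuron model that architectures like μBrain implement in hardware: an integrate-and-fire neuron that only performs work when an input spike arrives. Weights, leak, threshold, and the spike train are all hypothetical:

```python
# Event-driven integrate-and-fire neuron: the membrane potential decays
# each timestep, input spike events add their synapse weight, and the
# neuron emits a spike and resets when it crosses threshold.

threshold = 1.0
leak = 0.9                            # potential decay per timestep
weights = {0: 0.4, 1: 0.35, 2: 0.5}   # synapse id -> weight (made up)
potential = 0.0

# (timestep, synapse id) input events; no event means no computation.
events = [(0, 0), (0, 1), (2, 2), (3, 0), (3, 2)]

for t in range(5):
    potential *= leak
    for (et, syn) in events:
        if et == t:
            potential += weights[syn]
    if potential >= threshold:
        print(f"t={t}: output spike")
        potential = 0.0               # reset after firing
```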
Prospector: Multiscale Energy Measurement of Networked Embedded Systems with Wideband Power Signals
2009
2009 International Conference on Computational Science and Engineering
Experimental results for a prototype Prospector system with a contemporary 16-bit ultra-low power microcontroller show that it can effectively measure power over the extreme time and magnitude scales found ...
It is based on computer-based control of multimeters to maximize accuracy, precision, and flexibility, and to minimize target-system overhead. ...
Each node can be considered a micro-scale power grid consisting of multiple consumers, including a microcontroller, one or more radio transceivers, memory chips, I/O peripherals and an array of transducers ...
doi:10.1109/cse.2009.413
dblp:conf/cse/YamamotoF09
fatcat:2qmbwjaisbfjnmzufykh63va6i
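The measurement problem Prospector targets is recovering energy from a power signal whose magnitude and duty cycle span several orders of magnitude. A minimal sketch using trapezoidal integration over (time, power) samples; the wake/sleep numbers below are invented:

```python
# Energy from sampled power via trapezoidal integration; a sensor node's
# power spans ~microwatts (sleep) to ~milliwatts (radio bursts).

samples = [  # (time in s, power in W) for one wake/sleep cycle (made up)
    (0.000, 12e-6),   # sleep
    (0.010, 12e-6),
    (0.0101, 9e-3),   # wake: MCU + radio burst
    (0.013, 9e-3),
    (0.0131, 12e-6),  # back to sleep
    (0.020, 12e-6),
]

energy_j = sum(
    0.5 * (p0 + p1) * (t1 - t0)
    for (t0, p0), (t1, p1) in zip(samples, samples[1:])
)
print(f"{energy_j * 1e6:.1f} uJ per cycle")
```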
Memory-Efficient AI Algorithm for Infant Sleeping Death Syndrome Detection in Smart Buildings
2021
AI
Therefore, our proposed memory-efficient AI algorithm has great potential to be deployed and to run on edge devices, such as micro-controllers and Raspberry Pi, which have low memory footprint, limited ...
Our proposed AI algorithm only requires 6.4 MB of memory space, while other existing AI algorithms for sleep posture detection require 58.2 MB to 275 MB of memory space. ...
... on the newly developed datasets. ...
doi:10.3390/ai2040042
fatcat:f3wg3klhgjezjdsuqmcpm3amsy
MLPerf Tiny Benchmark
[article]
2021
arXiv
pre-print
To meet this need, we present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems. ...
Additionally, MLPerf Tiny implements a modular design that enables benchmark submitters to show the benefits of their product, regardless of where it falls on the ML deployment stack, in a fair and reproducible ...
The TFLite model for IC is 96 KB in size and fits on most 32-bit embedded microcontrollers. ...
arXiv:2106.07597v4
fatcat:ps4y36uq4nevxfbe7p3tne4opu
The Final Frontier: Deep Learning in Space
[article]
2020
arXiv
pre-print
In this work, we identify deep learning in space as one of the development directions for mobile and embedded machine learning. ...
We detail and contextualise the compute platforms of satellites and draw parallels with embedded systems and current research in deep learning for resource-constrained environments. ...
Aakanksha Chowdhery and other anonymous reviewers for their input throughout the submission process for ACM HotMobile 2020. ...
arXiv:2001.10362v2
fatcat:4wqslkravnebrilhvhmtsce4ee
Machine Learning Systems for Intelligent Services in the IoT: A Survey
[article]
2020
arXiv
pre-print
This survey moves beyond existing ML algorithms and cloud-driven design to investigate the less-explored systems, scaling and socio-technical aspects for consolidating ML and IoT. ...
It covers the latest developments (up to 2020) on scaling and distributing ML across cloud, edge, and IoT devices. ...
[165] also allows for mixed-precision quantization across different neural network layers for the two MobileNetV1/2 image models. ...
arXiv:2006.04950v3
fatcat:xrjcioqkrrhpvgmwmutiajgfbe
Evolutionary Algorithms in Approximate Computing: A Survey
2021
Journal of Integrated Circuits and Systems
The neural architecture search enabling the automated hardware-aware design of approximate deep neural networks was identified as a newly emerging topic in this area. ...
This paper deals with evolutionary approximation as one of the popular approximation methods. ...
EVOLUTIONARY APPROXIMATION IN CNNS: The unprecedented success of machine learning methods based on deep neural networks comes with the very high computation cost needed to train these networks [61]. ...
doi:10.29292/jics.v16i2.499
fatcat:fyklv5cl7zfdrccvkmqjg6lfle
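A minimal sketch of the evolutionary loop this survey covers, here applied to per-layer bitwidth selection with a toy fitness that trades cost against an error proxy. The methods actually surveyed (e.g. CGP-based circuit approximation) evaluate candidates on real data and hardware models; everything below is a hypothetical stand-in:

```python
# (1 + 4) evolution strategy over per-layer bitwidths: mutate candidates
# and keep the fittest under a made-up cost/error trade-off.
import random

random.seed(0)
n_layers, bit_choices = 4, [2, 4, 8]

def fitness(cfg):
    # Toy proxy: lower bits cut cost but add "error"; real methods measure
    # the approximated network or circuit on data.
    cost = sum(cfg)
    error = sum((8 - b) ** 2 for b in cfg)
    return -(cost + 0.15 * error)

def mutate(cfg):
    child = list(cfg)
    child[random.randrange(n_layers)] = random.choice(bit_choices)
    return child

parent = [8] * n_layers
for gen in range(30):
    children = [mutate(parent) for _ in range(4)]
    parent = max(children + [parent], key=fitness)

print(parent, fitness(parent))
```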
Evolutionary Algorithms in Approximate Computing: A Survey
[article]
2021
arXiv
pre-print
The neural architecture search enabling the automated hardware-aware design of approximate deep neural networks was identified as a newly emerging topic in this area. ...
This paper deals with evolutionary approximation as one of the popular approximation methods. ...
EVOLUTIONARY APPROXIMATION IN CNNS: The unprecedented success of machine learning methods based on deep neural networks comes with the very high computation cost needed to train these networks [61]. ...
arXiv:2108.07000v1
fatcat:fgmccjbnlvet5h5kehmt6hslm4
Boost Precision Agriculture with Unmanned Aerial Vehicle Remote Sensing and Edge Intelligence: A Survey
2021
Remote Sensing
However, most DL-based methods place high computation, memory and network demands on resources. ...
Cloud computing can increase processing efficiency with high scalability and low cost, but results in high latency and great pressure on the network bandwidth. ...
In recent years, low-bit quantization is becoming popular for deep model compression and acceleration. ...
doi:10.3390/rs13214387
fatcat:amrm5blon5hmhnk7arme2vsqwq