Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems
[article]
2022
arXiv
pre-print
To address these challenges, we will introduce a series of effective design methods in this book chapter to enable efficient algorithms, compilers, and various optimizations for embedded systems. ...
However, these emerging AI applications also come with increasing computation and memory demands, which are challenging to handle, especially for embedded systems, where limited computation/memory resources ...
These methods can be categorized into efficient machine learning algorithms, accelerator and compiler designs, and various co-design and optimization strategies. ...
arXiv:2206.03326v1
fatcat:th66tbqxibez7hmctl2ytdiroa
TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
[article]
2019
arXiv
pre-print
Machine-learning applications rely on efficient parallel processing to achieve performance, and they employ a variety of technologies to improve performance, including compiler technology. ...
But compilers in machine-learning frameworks lack a deep understanding of parallelism, causing them to lose performance by missing optimizations on parallel computation. ...
ACKNOWLEDGMENTS The authors acknowledge the MIT Lincoln Laboratory Supercomputing Center for providing HPC resources that have contributed to the research results reported in this paper. ...
arXiv:1908.11338v1
fatcat:fpe4tmadwnbd3adplv4gfjbrpe
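Tapir's contribution is to represent fork-join task parallelism directly in the compiler's intermediate representation, so that standard optimizations can reason about parallel regions instead of treating them as opaque calls. As a rough source-level illustration of the fork-join pattern itself (not of Tapir's IR or of the TensorFlow/XLA integration described above), the C++17 sketch below forks half of a reduction into an asynchronous task and joins the result; the threshold and sizes are arbitrary choices for the example.

```cpp
// Toy illustration of the fork-join pattern that Tapir models in compiler IR.
// This is NOT TapirXLA code; it only shows the source-level idiom, assuming
// a plain C++17 toolchain with std::async available.
#include <cstdio>
#include <future>
#include <numeric>
#include <vector>

// Recursively split the range, "fork" the left half as an async task,
// compute the right half in the current thread, then "join" the results.
long long parallel_sum(const std::vector<long long>& v, size_t lo, size_t hi) {
  if (hi - lo < 100000)  // small ranges: fall back to a serial sum
    return std::accumulate(v.begin() + lo, v.begin() + hi, 0LL);
  size_t mid = lo + (hi - lo) / 2;
  auto left = std::async(std::launch::async, parallel_sum,
                         std::cref(v), lo, mid);      // fork
  long long right = parallel_sum(v, mid, hi);
  return left.get() + right;                          // join
}

int main() {
  std::vector<long long> v(1 << 20, 1);
  std::printf("%lld\n", parallel_sum(v, 0, v.size()));  // prints 1048576
}
```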
An Evaluation of Autotuning Techniques for the Compiler Optimization Problems
2016
Design, Automation, and Test in Europe
In this paper, we evaluate our different autotuning approaches, including the use of Design Space Exploration (DSE) techniques and machine learning, to further tackle both the problems of selecting and the ...
To address the problem, different optimization techniques have been used for traversing and pruning the huge space and for adaptability and portability. ...
Conclusion: the paper presents two main approaches to the compiler autotuning problem, using DSE and machine learning. ...
dblp:conf/date/AshouriPS16
fatcat:pqyxnkf735gi3iz3o6s7tzrjtu
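The entry above selects compiler optimizations via design space exploration (DSE) and machine learning. A minimal sketch of the simplest form of such exploration is shown below: randomly sample flag subsets, compile, measure, keep the best. It assumes gcc on the PATH and a hypothetical benchmark file bench.c, and it stands in for, rather than reproduces, the authors' framework; a real autotuner would average repeated runs and train a predictive model over the measurements.

```cpp
// Minimal random design-space-exploration sketch for compiler-flag selection.
// Hypothetical setup: gcc on PATH and a benchmark source file "bench.c";
// this is a toy illustration, not the DSE/ML framework from the paper.
#include <array>
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <random>
#include <string>

int main() {
  const std::array<std::string, 4> flags = {
      "-O2", "-funroll-loops", "-ftree-vectorize", "-fomit-frame-pointer"};
  std::mt19937 rng(42);
  std::bernoulli_distribution pick(0.5);

  double best_time = 1e30;
  std::string best_cfg;
  for (int trial = 0; trial < 20; ++trial) {
    std::string cfg;
    for (const auto& f : flags)
      if (pick(rng)) cfg += f + " ";

    // Compile the benchmark with the sampled flag subset.
    std::string compile = "gcc " + cfg + "bench.c -o bench_bin";
    if (std::system(compile.c_str()) != 0) continue;

    // Time one run of the produced binary (a real autotuner would average
    // several runs and may learn a model to predict good configurations).
    auto t0 = std::chrono::steady_clock::now();
    if (std::system("./bench_bin") != 0) continue;
    auto t1 = std::chrono::steady_clock::now();
    double secs = std::chrono::duration<double>(t1 - t0).count();

    if (secs < best_time) { best_time = secs; best_cfg = cfg; }
    std::printf("trial %d: [%s] %.3fs\n", trial, cfg.c_str(), secs);
  }
  std::printf("best: [%s] %.3fs\n", best_cfg.c_str(), best_time);
}
```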
TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems
[article]
2021
arXiv
pre-print
We introduce TensorFlow Lite Micro (TF Micro), an open-source ML inference framework for running deep-learning models on embedded systems. ...
As a result, the machine-learning (ML) models and associated ML inference framework must not only execute efficiently but also operate in a few kilobytes of memory. ...
INTRODUCTION Tiny machine learning (TinyML) is a burgeoning field at the intersection of embedded systems and machine learning. ...
arXiv:2010.08678v3
fatcat:hemuhoadafd6hi3wg7exoyqnue
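TF Micro targets devices with only a few kilobytes of memory, so the application statically provides a tensor arena and registers only the operators its model uses. The sketch below is based on the publicly documented TFLM C++ API; header paths and constructor details have varied across releases, and the model data, op set, and arena size are assumptions made for illustration.

```cpp
// Rough sketch of a TensorFlow Lite Micro inference call, based on the
// publicly documented C++ API. Header paths and constructor arguments have
// varied across TFLM releases; g_model_data is a placeholder for a
// flatbuffer model compiled into the binary.
#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

extern const unsigned char g_model_data[];  // model flatbuffer (placeholder)

constexpr int kArenaSize = 10 * 1024;            // all tensors live here
alignas(16) static uint8_t tensor_arena[kArenaSize];

int run_inference(float x) {
  const tflite::Model* model = tflite::GetModel(g_model_data);

  // Register only the ops the model needs to keep the binary small
  // (this example assumes a model with FullyConnected + Softmax).
  static tflite::MicroMutableOpResolver<2> resolver;
  resolver.AddFullyConnected();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter interpreter(model, resolver,
                                              tensor_arena, kArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

  interpreter.input(0)->data.f[0] = x;       // write the single input value
  if (interpreter.Invoke() != kTfLiteOk) return -1;
  return static_cast<int>(interpreter.output(0)->data.f[0] * 100);
}
```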
TinyIREE: An ML Execution Environment for Embedded Systems from Compilation to Deployment
[article]
2022
arXiv
pre-print
Machine learning model deployment for training and execution has been an important topic for industry and academic research in the last decade. ...
In this paper, we present IREE, a unified compiler and runtime stack with the explicit goal to scale down machine learning programs to the smallest footprints for mobile and edge devices, while maintaining ...
Brehler was supported by the German Federal Ministry of Education and Research (BMBF) as part of the AIA project under grant number 01IS19060A and the Competence Center Machine Learning Rhine-Ruhr (ML2R ...
arXiv:2205.14479v1
fatcat:u77u6s5ypvbcleebc7ec6upvme
The Berlin Big Data Center (BBDC)
2018
it - Information Technology
Framework for transparent and reproducible benchmark experiments of distributed data processing systems, approaches to foster the interpretability of machine learning models and finally provide an overview ...
However, writing efficient implementations of data analysis programs on these systems requires a deep understanding of systems programming, prohibiting large groups of data scientists and analysts from ...
analysis and machine learning algorithms. ...
doi:10.1515/itit-2018-0016
fatcat:mnxe772elba5rekvxds5j4xgpm
An Automatic Compiler Optimizations Selection Framework for Embedded Applications
2009
2009 International Conference on Embedded Software and Systems
This paper aims at system-wide compiler optimizations selection for embedded applications. ...
For this framework, we implemented compiler optimization selection algorithms and evaluated their efficiency with and without performance-monitoring hardware support. ...
... a grant from the National Science Council (95C2443-2), and a grant from Excellent Research Projects of National Taiwan University (96R0062-AE00-07). ...
doi:10.1109/icess.2009.86
dblp:conf/icess/HungTLC09
fatcat:qh4mbsdxcffl3fduxugp5fe5ti
Memory Utilization and Machine Learning Techniques for Compiler Optimization
2021
ITM Web of Conferences
This article aims to provide an overall survey of the cache optimization methods, multi memory allocation features and explore the scope of machine learning in compiler optimization to attain a sustainable ...
The realm of compiler suites that possess and apply efficient optimization methods provide a wide array of beneficial attributes that help programs execute efficiently with low execution time and minimal ...
The paper provides a brief overview of the properties of the memory-based and machine learning techniques that have been employed for compiler optimization. ...
doi:10.1051/itmconf/20213701021
fatcat:b7lrzsnszrbcdbxqreb2qmqlby
A Tensor Intermediate Representation for Machine Learning Systems (面向机器学习系统的张量中间表示)
2021
Scientia Sinica Informationis
His current research interests include compiler, programming model and machine learning. ...
Figure 7: Optimizations based on the tensor IR.
Zhuang Y M, Wen Y B, Li W, et al. A tensor intermediate representation for machine learning systems (in Chinese). ...
Machine learning compilers are crucial to machine learning systems. ...
doi:10.1360/ssi-2020-0398
fatcat:3hhamznebngb5eb7gee7x42tqq
TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems
2021
Conference on Machine Learning and Systems
INTRODUCTION Tiny machine learning (TinyML) is a burgeoning field at the intersection of embedded systems and machine learning. ...
TensorFlow Lite Micro (TFLM) is an open-source ML inference framework for running deep-learning models on embedded systems. ...
We extend our gratitude to many individuals, teams, and organizations: Fredrik ...
dblp:conf/mlsys/DavidDJRJLKNNWW21
fatcat:hvsi6bzy5vakdcun4sgfgehe3m
TinyIREE: An ML Execution Environment for Embedded Systems from Compilation to Deployment
2022
IEEE Micro
Machine learning model deployment for training and execution has been an important topic for industry and academic research in the last decade. ...
In this paper, we present IREE, a unified compiler and runtime stack with the explicit goal to scale down machine learning programs to the smallest footprints for mobile and edge devices, while maintaining ...
We would also like to thank Scott Main for the review and ...
doi:10.1109/mm.2022.3178068
fatcat:7lyffgoqdbgpxetruv7ubjzqiy
Implementing Domain-Specific Languages for Heterogeneous Parallel Computing
2011
IEEE Micro
Even worse, code optimized for one system is neither portable to another system nor guaranteed to deliver high performance on it. ...
Common examples are Pthreads or OpenMP for multicore CPU, OpenCL or CUDA for GPU, and message passing interface (MPI) for clusters. ...
Acknowledgments We thank the anonymous reviewers for their valuable feedback. ...
doi:10.1109/mm.2011.68
fatcat:a77o6qub7fhyrlqccdiyqkbj3y
Language virtualization for heterogeneous parallel computing
2010
Proceedings of the ACM international conference on Object oriented programming systems languages and applications - OOPSLA '10
We define criteria for language virtualization and present techniques to achieve them. ...
We propose language virtualization as a new principle that enables the construction of highly efficient parallel domain specific languages that are embedded in a common host language. ...
Data Analysis and Machine Learning with OptiML: the second example of the use of language virtualization is the implementation of OptiML, a DSL for machine learning. ...
doi:10.1145/1869459.1869527
dblp:conf/oopsla/ChafiDMRSHOO10
fatcat:wgsarnltdnebfmfkp6xejvwl6m
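OptiML and the other DSLs in this line of work are embedded in Scala and rely on language virtualization to intercept host-language constructs. The toy C++ sketch below is not that machinery; it only illustrates the underlying idea of an embedded DSL, where operator overloading in the host language builds a deferred expression that the DSL implementation can later evaluate, optimize, or offload.

```cpp
// Toy embedded-DSL illustration in C++ (not the Scala-based virtualization
// from the paper): operator overloading builds an expression object that is
// evaluated lazily, which is the hook a real DSL uses to optimize/offload.
#include <cstdio>
#include <vector>

struct Vec;  // forward declaration

// Deferred "a + b" node: nothing is computed until eval() is called.
struct AddExpr {
  const Vec& a;
  const Vec& b;
  std::vector<double> eval() const;
};

struct Vec {
  std::vector<double> data;
  AddExpr operator+(const Vec& other) const { return AddExpr{*this, other}; }
};

std::vector<double> AddExpr::eval() const {
  std::vector<double> out(a.data.size());
  for (size_t i = 0; i < out.size(); ++i) out[i] = a.data[i] + b.data[i];
  return out;
}

int main() {
  Vec x{{1, 2, 3}}, y{{10, 20, 30}};
  AddExpr expr = x + y;                                 // builds the expression, no work yet
  for (double v : expr.eval()) std::printf("%g ", v);   // prints: 11 21 31
  std::printf("\n");
}
```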
Language virtualization for heterogeneous parallel computing
2010
SIGPLAN notices
We define criteria for language virtualization and present techniques to achieve them. ...
We propose language virtualization as a new principle that enables the construction of highly efficient parallel domain specific languages that are embedded in a common host language. ...
Data Analysis and Machine Learning with OptiML: the second example of the use of language virtualization is the implementation of OptiML, a DSL for machine learning. ...
doi:10.1145/1932682.1869527
fatcat:b26cnd5a5vbmpktbmjb7gcigfm
Running a Java VM inside an operating system kernel
2008
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments - VEE '08
For this purpose, we first implemented a compact Java Virtual Machine with a Just-In-Time compiler on the Intel IA32 instruction set architecture at the user space. ...
In this paper, we describe such an approach, where a lightweight Java virtual machine is embedded within the kernel for flexible extension of kernel network I/O. ...
Third, the virtual machine is embedded inside an operating system (Embedding). ...
doi:10.1145/1346256.1346279
dblp:conf/vee/OkumuraCM08
fatcat:aia2na66lzfztkqy26snkfr5ne