209 Hits in 6.8 sec

A Hardware-Software Blueprint for Flexible Deep Learning Specialization [article]

Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
2019 arXiv   pre-print
VTA achieves this flexibility via a parametrizable architecture, two-level ISA, and a JIT compiler.  ...  VTA is integrated and open-sourced into Apache TVM, a state-of-the-art deep learning compilation stack that provides flexibility for diverse models and divergent hardware backends.  ...  An end-to-end approach requires integration between frameworks, systems, compilers, and architecture in order to execute state of the art machine learning using hardware acceleration.  ... 
arXiv:1807.04188v3 fatcat:wpafekkrqzffzfe7vulaa6qnva

Constrained Multi-Objective Optimization for Automated Machine Learning [article]

Steven Gardner, Oleg Golovidov, Joshua Griffin, Patrick Koch, Wayne Thompson, Brett Wujek, Yan Xu
2019 arXiv   pre-print
Automated machine learning has gained a lot of attention recently. Building and selecting the right machine learning models is often a multi-objective optimization problem.  ...  Incorporation of multiple objectives and constraints in the model exploration and selection process provides the flexibility needed to satisfy trade-offs necessary in practical machine learning applications.  ...  RELATED WORK Jin [13], [15] claims that machine learning is inherently a multi-objective task and provides a compilation of various multi-objective applications including feature extraction, accuracy  ...
arXiv:1908.04909v1 fatcat:asuumybspjelta4stwxfiajwd4
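The constrained multi-objective selection described above boils down to retaining only non-dominated candidates. A minimal sketch of a Pareto-front filter, assuming hypothetical models scored on two minimized objectives, prediction error and inference cost (the model names and numbers are illustrative, not from the paper):

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b on every
    objective and strictly better on at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

# Hypothetical (error, cost) scores for candidate models.
models = {"tree": (0.12, 1.0), "forest": (0.08, 5.0),
          "boosted": (0.07, 4.0), "linear": (0.20, 0.1)}
front = pareto_front(list(models.values()))
```

Here "forest" drops out because "boosted" beats it on both objectives; the remaining three candidates are the trade-off set a constrained selector would then choose from.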

Autotuning of Exascale Applications With Anomalies Detection

Dragi Kimovski, Roland Mathá, Gabriel Iuhasz, Fabrizio Marozzo, Dana Petcu, Radu Prodan
2021 Frontiers in Big Data  
Furthermore, the autotuner employs a machine learning-based event detection approach to detect events and anomalies during application execution, such as hardware failures or communication bottlenecks.  ...  Therefore, we introduce a novel approach for autotuning exascale applications based on a genetic multi-objective optimization algorithm integrated within the ASPIDE exascale computing framework.  ...
doi:10.3389/fdata.2021.657218 pmid:34901840 pmcid:PMC8661695 fatcat:zcsclapt5bgrpnmzzscr7j7n5y

SiblingRivalry

Jason Ansel, Maciej Pacula, Yee Lok Wong, Cy Chan, Marek Olszewski, Una-May O'Reilly, Saman Amarasinghe
2012 Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems - CASES '12  
We present SiblingRivalry, a new model for always-on online autotuning that allows parallel programs to continuously adapt and optimize themselves to their environment.  ...  However, autotuning can be burdensome to the deployment of a program, since the tuning process can take a long time and should be rerun whenever the program, microarchitecture, execution environment, or  ...  Our approach to multi-objective optimization is a hybrid of a Pareto-based EA [16, 37] and a weighted objectives EA.  ...
doi:10.1145/2380403.2380425 dblp:conf/cases/AnselPWCOOA12 fatcat:doghqgnt25eqngsaeccktifeii

Guiding Optimizations with Meliora: A Deep Walk down Memory Lane [article]

Kewen Meng, Boyana Norris
2020 arXiv   pre-print
However, since models are not typically available, programmers, compilers or autotuners cannot use them easily to guide optimizations and are limited to heuristic-based methods that potentially take a  ...  To that end, we are building the Meliora code analysis infrastructure for machine learning-based performance model generation of arbitrary codes based on static analysis of intermediate language representations  ...  Machine learning alleviates the complexity of analytical model creation and has been a technique widely used in compiler optimization. Wen et al.  ... 
arXiv:2006.09473v1 fatcat:36nmpcmxvjax3a7p2ypodnnyqi

A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications [article]

Matthew W. Moskewicz, Ali Jannesari, Kurt Keutzer
2016 arXiv   pre-print
Toward this end, this work presents a framework to enable productive, high-efficiency GPU programming for DNN computations across hardware platforms and programming models.  ...  In particular, the framework provides specific support for metaprogramming, autotuning, and DNN-tailored data types.  ...  Such an approach is a middle ground between the traditional library and compiler approaches to the mapping problem: • Compared to a traditional library, our approach is more complex but much more flexible  ... 
arXiv:1611.06945v1 fatcat:clgpegm2ubd6lowwclnheqjf7q

OpenTuner

Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, Saman Amarasinghe
2014 Proceedings of the 23rd international conference on Parallel architectures and compilation - PACT '14  
This paper introduces OpenTuner, a new open source framework for building domain-specific multi-objective program autotuners.  ...  Program autotuning has been shown to achieve better or more portable performance in a number of domains.  ...  We gratefully acknowledge Connelly Barnes for helpful discussions and bug fixes related to autotuning the Halide project.  ... 
doi:10.1145/2628071.2628092 dblp:conf/IEEEpact/AnselKVRBOA14 fatcat:4cg2bxgwsvc77enqvshlq4e424
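Frameworks like the one above build autotuners around a common loop: enumerate or search a configuration space, measure each candidate, and keep the best. A toy sketch of that loop (this is not the OpenTuner API; the search space and the synthetic `measure` cost are invented stand-ins for compiling and timing a real candidate):

```python
import itertools

# Illustrative search space of tuning knobs for a hypothetical kernel.
SEARCH_SPACE = {"block": [16, 32, 64, 128], "unroll": [1, 2, 4, 8]}

def measure(cfg):
    # Stand-in for compiling and timing a candidate implementation;
    # this synthetic cost is minimized at block=64, unroll=4.
    return abs(cfg["block"] - 64) / 64 + abs(cfg["unroll"] - 4) / 4

def autotune(space):
    """Exhaustively sweep the space, keeping the cheapest configuration.
    Real autotuners replace the sweep with smarter search techniques."""
    keys = list(space)
    best_cfg, best_cost = None, float("inf")
    for values in itertools.product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        cost = measure(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost

best, best_cost = autotune(SEARCH_SPACE)
```

The exhaustive sweep is only viable for tiny spaces; the papers in this listing differ mainly in how they replace it (evolutionary search, ensembles of techniques, learned models) and in what `measure` runs.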

Collective Mind, Part II: Towards Performance- and Cost-Aware Software Engineering as a Natural Science [article]

Grigori Fursin, Abdul Memon, Christophe Guillon, Anton Lokhmotov
2015 arXiv   pre-print
These wrappers are connected with a public Collective Mind autotuning infrastructure and repository of knowledge (c-mind.org/repo) to continuously monitor various important characteristics of these pieces  ...  hardware to be able to predict best optimizations and improve compilers and hardware depending on usage scenarios and requirements.  ...  This approach contrasts with some existing works on machine learning for compilation and architecture.  ... 
arXiv:1506.06256v1 fatcat:l3xrlisen5gglk5jtiusfx67dy

Portable performance on heterogeneous architectures

Phitchaya Mangpo Phothilimthana, Jason Ansel, Jonathan Ragan-Kelley, Saman Amarasinghe
2013 Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems - ASPLOS '13  
To address the problem of efficiently programming machines with increasingly heterogeneous computational resources, we propose a programming model in which the best mapping of programs to processors and  ...  These choices are given to an empirical autotuning framework that allows the space of possible implementations to be searched at installation time.  ...  a graphics card used to conduct experiments.  ... 
doi:10.1145/2451116.2451162 dblp:conf/asplos/PhothilimthanaARA13 fatcat:npfbunqajvhhjm4dgb53ojihvy

Portable performance on heterogeneous architectures

Phitchaya Mangpo Phothilimthana, Jason Ansel, Jonathan Ragan-Kelley, Saman Amarasinghe
2013 SIGPLAN notices  
To address the problem of efficiently programming machines with increasingly heterogeneous computational resources, we propose a programming model in which the best mapping of programs to processors and  ...  These choices are given to an empirical autotuning framework that allows the space of possible implementations to be searched at installation time.  ...  a graphics card used to conduct experiments.  ... 
doi:10.1145/2499368.2451162 fatcat:zabwd4gvizdpjhfrnmadr4i63y

Pegasus: Performance Engineering for Software Applications Targeting HPC Systems

Pedro Pinto, Joao Bispo, Joao Cardoso, Jorge Gomes Barbosa, Davide Gadioli, Gianluca Palermo, Jan Martinovic, Martin Golasowski, Katerina Slaninova, Radim Cmar, CRISTINA SILVANO
2020 IEEE Transactions on Software Engineering  
This paper presents Pegasus, a performance engineering approach supported by a framework that consists of a source-to-source compiler, controlled and guided by strategies programmed in a Domain-Specific  ...  Developing and optimizing software applications for high performance and energy efficiency is a very challenging task, even when considering a single target machine.  ...  Then, we pass this information to the autotuner so it can choose the best-suited number of samples to use.  ...
doi:10.1109/tse.2020.3001257 fatcat:hselpvqyh5hlhmqp72q4b3m5um

Machine learning for predictive auto-tuning with boosted regression trees

James Bergstra, Nicolas Pinto, David Cox
2012 2012 Innovative Parallel Computing (InPar)  
We show that machine learning methods for non-linear regression can be used to estimate timing models from data, capturing the best of both approaches.  ...  Two major approaches to auto-tuning are empirical and model-based: empirical autotuning is a generic but slow approach that works by measuring runtimes of candidate implementations, model-based auto-tuning  ...  [Schapire, 2001, Friedman, 2002]. In a recent empirical study of a range of machine learning regression problems, boosted decision trees were found to be among the best and easiest models to apply [Caruana  ...
doi:10.1109/inpar.2012.6339587 fatcat:pbgr3u3b4nfhtjc3zatwovfkzy
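The idea of replacing measured runtimes with a learned timing model can be sketched with a small boosted ensemble of regression stumps fit to hypothetical (tile size, runtime) pairs. This pure-Python toy stands in for the boosted regression trees the paper uses and is not the authors' implementation:

```python
def fit_stump(xs, ys):
    """Best single-split regression stump on 1-D inputs: pick the
    threshold whose left/right means minimize squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, rounds=300, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the
    residuals left by the ensemble so far."""
    stumps, preds = [], [0.0] * len(xs)
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, preds)]
        s = fit_stump(xs, resid)
        stumps.append(s)
        preds = [p + lr * s(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Hypothetical training data: runtime vs. a tile-size knob.
xs = [8, 16, 32, 64, 128]
ys = [9.0, 5.0, 2.0, 3.0, 7.0]
model = boost(xs, ys)
```

Once fit, `model(tile)` predicts a runtime without executing the candidate, which is exactly the trade the paper explores: cheap model queries in place of slow empirical measurements.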

Autotuning GPU-Accelerated QAP Solvers for Power and Performance

Abhilash Chaparala, Clara Novoa, Apan Qasem
2015 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems  
QAP is known to be NP-hard and requires heuristic approaches for most real data sets.  ...  On a series of experiments on the well-known QAPLIB data sets, our autotuned solutions run at least an order-of-magnitude faster than baseline implementations.  ...  CNS-1305302 and a CAREER award no. CNS-1253292. Equipment support was provided by Nvidia.  ...
doi:10.1109/hpcc-css-icess.2015.121 dblp:conf/hpcc/ChaparalaNQ15 fatcat:hecvhtqm3bdxzjasvihp65bqxq
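Heuristic QAP solvers of this kind typically parallelize a pairwise-exchange local search over facility-to-location assignments. An illustrative serial sketch on a made-up 4x4 instance (not a QAPLIB problem; the flow and distance matrices are invented):

```python
import itertools

# Toy QAP instance: F[i][j] is flow between facilities i and j,
# D[a][b] is distance between locations a and b.
F = [[0, 3, 1, 2], [3, 0, 1, 4], [1, 1, 0, 2], [2, 4, 2, 0]]
D = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]

def cost(p):
    """QAP objective: sum of flow * distance under assignment p,
    where facility i is placed at location p[i]."""
    n = len(p)
    return sum(F[i][j] * D[p[i]][p[j]] for i in range(n) for j in range(n))

def two_exchange(p):
    """Pairwise-exchange local search: keep swapping two facilities'
    locations while any swap lowers the objective."""
    p, best = list(p), cost(p)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(len(p)), 2):
            p[i], p[j] = p[j], p[i]
            c = cost(p)
            if c < best:
                best, improved = c, True
            else:
                p[i], p[j] = p[j], p[i]  # undo non-improving swap
    return p, best

perm, c = two_exchange([0, 1, 2, 3])
```

The O(n^2) swap neighborhood evaluated each pass is what a GPU solver maps onto threads; the autotuning in the paper then picks launch parameters for that kernel.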

Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions [article]

Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S. Moses, Sven Verdoolaege, Andrew Adams, Albert Cohen
2018 arXiv   pre-print
Our contributions include (1) a language close to the mathematics of deep learning called Tensor Comprehensions, (2) a polyhedral Just-In-Time compiler to convert a mathematical description of a deep learning  ...  by an autotuner.  ...  We also design the domain language to cover a variety of existing and emerging machine learning models.  ... 
arXiv:1802.04730v3 fatcat:2ef5ete4mvao5bz43h7z7dtlwi

Memory Utilization and Machine Learning Techniques for Compiler Optimization

A V Shreyas Madhav, Siddarth Singaravel, A Karmel, J. Kannan R., P. Kommers, A. S, A. Quadir Md
2021 ITM Web of Conferences  
This article aims to provide an overall survey of the cache optimization methods, multi memory allocation features and explore the scope of machine learning in compiler optimization to attain a sustainable  ...  Different compilers provide a certain degree of optimization possibilities and applying the appropriate optimization strategies to complex programs can have a significant impact on the overall performance  ...  This approach towards figuring out compiler optimizations is categorized as autotuning [21] or iterative compilation [22].  ...
doi:10.1051/itmconf/20213701021 fatcat:b7lrzsnszrbcdbxqreb2qmqlby
Showing results 1 — 15 out of 209 results