Libra: Tailoring SIMD Execution Using Heterogeneous Hardware and Dynamic Configurability

Yongjun Park, Jason Jong Kyu Park, Hyunchul Park, Scott Mahlke
2012 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture  
through the use of heterogeneous hardware across the SIMD lanes.  ...  The Libra accelerator increases SIMD utility by blurring the divide between vector and instruction parallelism to support efficient execution of a wider range of loops, and it increases hardware utilization  ...  This research is supported by Samsung Advanced Institute of Technology and the National Science Foundation under grants CCF-0916689 and CNS-0964478.  ... 
doi:10.1109/micro.2012.17 dblp:conf/micro/ParkPPM12 fatcat:3skvsbe2vbeujmh2ctwoqwmthe

Software transparent dynamic binary translation for coarse-grain reconfigurable architectures

Matthew A. Watkins, Tony Nowatzki, Anthony Carno
2016 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA)  
custom hardware and the flexibility of software.  ...  In this work we propose DORA, a Dynamic Optimizer for Reconfigurable Architectures, which achieves substantial (2X) power and performance improvements while having low hardware and insertion overhead and  ...  We also thank David Albonesi and the anonymous reviewers for their feedback on the paper.  ... 
doi:10.1109/hpca.2016.7446060 dblp:conf/hpca/WatkinsNC16 fatcat:ssmt2kzalba2xoozatcp6imlxq

Construction and exploitation of VLIW ASIPs with heterogeneous vector-widths

Erkan Diken, Roel Jordans, Rosilde Corvino, Lech Jóźwiak, Henk Corporaal, Felipe Augusto Chies
2014 Microprocessors and microsystems  
This paper proposes the use of heterogeneous vector widths and a method to explore the heterogeneous vector widths for VLIW ASIPs.  ...  A large part of the DLP is usually exploited through application vectorization and implementation of vector operations in processors executing the applications.  ...  Dynamic configurability enables lane resource to execute as a traditional SIMD processor, be re-purposed to behave as a clustered VLIW processor, or combinations of both.  ... 
doi:10.1016/j.micpro.2014.05.004 fatcat:hua42e74vbgllnl4aejatueoe4

Stream-Dataflow Acceleration

Tony Nowatzki, Vinay Gangadhar, Newsha Ardalani, Karthikeyan Sankaralingam
2017 Proceedings of the 44th Annual International Symposium on Computer Architecture - ISCA '17  
SIMD, GPGPUs) are insufficient, as evidenced by the order-of-magnitude improvements and industry adoption of application and domain-specific accelerators in important areas like machine learning, computer  ...  This paper explores the hardware and software implications, describes its detailed microarchitecture, and evaluates an implementation.  ...  ACKNOWLEDGMENTS We would first like to thank the anonymous reviewers for their detailed questions and suggestions which helped us to clarify the presentation.  ... 
doi:10.1145/3079856.3080255 dblp:conf/isca/NowatzkiGAS17 fatcat:xm36xv6cbfevveabvmpafgjtli

Stream-Dataflow Acceleration

Tony Nowatzki, Vinay Gangadhar, Newsha Ardalani, Karthikeyan Sankaralingam
2017 SIGARCH Computer Architecture News  
SIMD, GPGPUs) are insufficient, as evidenced by the order-of-magnitude improvements and industry adoption of application and domain-specific accelerators in important areas like machine learning, computer  ...  This paper explores the hardware and software implications, describes its detailed microarchitecture, and evaluates an implementation.  ...  ACKNOWLEDGMENTS We would first like to thank the anonymous reviewers for their detailed questions and suggestions which helped us to clarify the presentation.  ... 
doi:10.1145/3140659.3080255 fatcat:g5spj35pyvh7jlr6i3qr5ertlq

Exploring the potential of heterogeneous von neumann/dataflow execution models

Tony Nowatzki, Vinay Gangadhar, Karthikeyan Sankaralingam
2015 Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15  
Mahlke, “Libra: Tailoring simd execution using heterogeneous hardware and dynamic configurability,”  ...  [3] M. Budiu, P. V. Artigas, and S. C. Goldstein, “Dataflow: A complement to superscalar,” in ISPASS, 2005.  ... 
doi:10.1145/2749469.2750380 dblp:conf/isca/NowatzkiGS15 fatcat:hql7xymzgjch3jv4dk5mvbesji

Applications and Techniques for Fast Machine Learning in Science [article]

Allison McCarn Deiana, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini (+74 others)
2021 arXiv   pre-print
training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms.  ...  This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions.  ...  Much of the advancements within ML over the past few years have originated from the use of heterogeneous computing hardware.  ... 
arXiv:2110.13041v1 fatcat:cvbo2hmfgfcuxi7abezypw2qrm

Applications and Techniques for Fast Machine Learning in Science

Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik (+35 others)
2022 Frontiers in Big Data  
training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms.  ...  This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions.  ...  “Dynamic application reconfiguration on heterogeneous hardware,” in Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (Providence, RI: VEE).  ... 
doi:10.3389/fdata.2022.787421 pmid:35496379 pmcid:PMC9041419 fatcat:5w2exf7vvrfvnhln7nj5uppjga

Applications and Techniques for Fast Machine Learning in Science

Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik (+35 others)
2022
training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms.  ...  This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions.  ...  Much of the advancements within ML over the past few years have originated from the use of heterogeneous computing hardware.  ... 
doi:10.26083/tuprints-00021245 fatcat:q5g26rdbfbfozmfcywdpew56be