Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation

Haicheng Wu, Gregory Diamos, Srihari Cadambi, Sudhakar Yalamanchili
2012 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture  
Based on this classification, we propose a compiler framework, Kernel Weaver, that can automatically fuse relational algebra operators thereby eliminating redundant data movement.  ...  Kernel fusion fuses the code bodies of two GPU kernels to i) reduce data footprint to cut down data movement throughout GPU and CPU memory hierarchy, and ii) enlarge compiler optimization scope.  ...  First there is a need for the efficient GPU implementations of RA primitives.  ... 
doi:10.1109/micro.2012.19 dblp:conf/micro/WuDCY12 fatcat:stqpjqeehjasbmpnhadwbf5pui
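
The kernel-fusion idea quoted in this abstract can be illustrated with a minimal CPU-side sketch in Python. It is not the Kernel Weaver implementation and it does not run on a GPU: the function names, the tuple layout, and the threshold predicate are all hypothetical. It only shows how fusing a SELECT with a following PROJECT removes the intermediate buffer that the unfused pipeline has to write and read back.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, int]  # hypothetical layout: (key, value)


def select(rows: List[Row], pred: Callable[[Row], bool]) -> List[Row]:
    """Stand-in for a standalone SELECT kernel: materializes its full output."""
    return [r for r in rows if pred(r)]


def project_key(rows: List[Row]) -> List[int]:
    """Stand-in for a standalone PROJECT kernel: reads the materialized input."""
    return [r[0] for r in rows]


def unfused(rows: List[Row], threshold: int) -> List[int]:
    # Two separate passes; `tmp` plays the role of the intermediate
    # result that would travel through the memory hierarchy.
    tmp = select(rows, lambda r: r[1] > threshold)
    return project_key(tmp)


def fused(rows: List[Row], threshold: int) -> List[int]:
    # One pass: the predicate and the projection share a single loop body,
    # so no intermediate relation is ever stored.
    out: List[int] = []
    for key, value in rows:
        if value > threshold:
            out.append(key)
    return out


if __name__ == "__main__":
    data = [(1, 5), (2, 20), (3, 11)]
    print(unfused(data, 10), fused(data, 10))
```

Both versions return the same keys; the fused one simply never builds `tmp`, which corresponds to the smaller data footprint the snippet describes.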

Multipredicate Join Algorithms for Accelerating Relational Graph Processing on GPUs

Haicheng Wu, Daniel Zinn, Molham Aref, Sudhakar Yalamanchili
2014 Very Large Data Bases Conference  
In particular, we present two different join algorithms for the GPU.  ...  The second is a novel approach, inspired by the former but more suitable for GPU architectures. Our preliminary performance benchmarks show that for both approaches using GPUs is cost-effective.  ...  Computing (ISTC-CC).  ... 
dblp:conf/vldb/WuZAY14 fatcat:hfdyv5lmm5fzfo227xaa5wgyka


Periklis Chrysogelos, Manos Karpathiotakis, Raja Appuswamy, Anastasia Ailamaki
2019 Proceedings of the VLDB Endowment  
...  query parallelization techniques used by analytical database engines are designed for homogeneous multicore servers, where query plans are parallelized across CPUs to process data stored in cache coherent  ...  In doing so, we show that efficiently exploiting CPU-GPU parallelism can provide 2.8x and 6.4x improvement in performance compared to state-of-the-art CPU-based and GPU-based DBMS.  ...  Kernel Weaver [35] is a compiler that automatically tries to fuse multiple relational operations together into a single kernel, in order to i) reduce data movement and ii) enable additional compiler  ... 
doi:10.14778/3303753.3303760 fatcat:rvqcdroc5nadjcnc6r5r4ksyzy

Memory-Efficient Object-Oriented Programming on GPUs [article]

Matthias Springer
2019 arXiv   pre-print
Our goal is to bring efficient, object-oriented programming to massively parallel SIMD architectures, especially GPUs.  ...  Object-oriented programming is often regarded as too inefficient for high-performance computing (HPC), despite the fact that many important HPC problems have an inherent object structure.  ...  For example, Harlan [84] supports nested kernels, Futhark [82] has support for nested parallelism and a powerful fusion engine for map/reduce combinations, and Kernel Weaver focuses on database queries  ... 
arXiv:1908.05845v1 fatcat:5o5fn5jcbbfjhl4ikg65tml7by

TCUDB: Accelerating Database with Tensor Processors [article]

Yu-Ching Hu and Yuliang Li and Hung-Wei Tseng
2021 arXiv   pre-print
Matrix multiplication was considered inefficient in the past; however, this strategy has remained largely unexplored in conventional GPU-based databases, which primarily rely on vector or scalar processing  ...  TCUDB achieves up to 288x speedup compared to a baseline GPU-based query engine.  ...  Kernel weaver: Automatically fusing database primitives for efficient GPU computation. In MICRO.  ...  Proceedings of the 2009 ACM SIGMOD International Conference on Management of  ... 
arXiv:2112.07552v1 fatcat:y6furfuc7nh5jml3cew7mqngry

A Roadmap for Big Model [article]

Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han (+88 others)
2022 arXiv   pre-print
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm.  ...  We introduce 16 specific BM-related topics in those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability  ...  NCCL is short for NVIDIA Collective Communications Library, which provides inter-GPU communication primitives that are topology-aware and easily integrated into applications.  ... 
arXiv:2203.14101v4 fatcat:rdikzudoezak5b36cf6hhne5u4

QCLab: a framework for query compilation on modern hardware platforms

Henning Funke, Technische Universität Dortmund
They were designed for practical use and enable efficient processing, even when workload characteristics are challenging.  ...  Together they serve as basis for building highly efficient query compilers. The techniques make efficient use of communication channels and of the large processing capacities of modern systems.  ...  Many thanks to Thomas Neumann for reviewing the thesis, and to Erich Schubert and Johannes Fischer for taking part in the defense.  ... 
doi:10.17877/de290r-22777 fatcat:bgxsxabtaja6xleu3gv5z7pohm

CARS 2021: Computer Assisted Radiology and Surgery Proceedings of the 35th International Congress and Exhibition Munich, Germany, June 21–25, 2021

2021 International Journal of Computer Assisted Radiology and Surgery  
Conclusion This study presented an efficient method for automatically detecting malignant tumors in breast MRI.  ...  GPU memory use.  ... 
doi:10.1007/s11548-021-02375-4 pmid:34085172 fatcat:6d564hsv2fbybkhw4wvc7uuxcy

Hypothesis Generation in Climate Research with Interactive Visual Data Exploration

J. Kehrer, F. Ladstadter, P. Muigg, H. Doleisch, A. Steiner, H. Hauser
2008 IEEE Transactions on Visualization and Computer Graphics  
The CFD data is courtesy of Innovative Computational Engineering GmbH, Leoben, Austria.  ...  This article is accepted for publication in IEEE Transactions on Visualization and Computer Graphics, 17 (7) Acknowledgments The authors thank Thomas Nocke, Michael Flechsig, and colleagues from the  ...  Therefore, parameter ranges affecting for instance the computational analysis can be narrowed down efficiently.  ... 
doi:10.1109/tvcg.2008.139 pmid:18989013 fatcat:t6tahgdxjzgsfbkaxds564vyzy

Transparently migrating Java objects at runtime in an infrastructure-as-a-service cloud

Fritz Schrogl, Schahram Dustdar, Philipp Wolfgang Leitner
Finally a conclusion about the work done and possible tasks for future development are given.  ...  In the last couple of years Cloud Computing has become an important topic in industry and computer science.  ...  For example installing a database server to an EC2 instance would be pointless, because database changes would be lost when the instance gets terminated.  ... 
doi:10.34726/hss.2014.22149 fatcat:lkfehccgy5hpdnh7gvcgty46ga

Arrows for knowledge-based circuits [article]

Peter Gammie, The Australian National University
Knowledge-based programs (KBPs) are a formalism for directly relating agents' knowledge and behaviour in a way that has proven useful for specifying distributed systems.  ...  Here we present a scheme for compiling KBPs to executable automata in finite environments with a proof of correctness in Isabelle/HOL.  ...  Acknowledgements I thank my parents, Liz and Richard, for their imperturbable support, and Clem Baker-Finch and John Lloyd for their encouragement and perspective.  ... 
doi:10.25911/5d78d9f7865db fatcat:rhsri4b2zbhvphsr2qc4zv6dze