
Toward Efficient In-memory Data Analytics on NUMA Systems [article]

Puya Memarzia, Suprio Ray, Virendra C Bhavsar
2020 arXiv   pre-print
In this work, we evaluate a variety of strategies that aim to accelerate memory-intensive data analytics workloads on NUMA systems.  ...  Data analytics systems commonly utilize in-memory query processing techniques to achieve better throughput and lower latency.  ...  We believe this is due to PostgreSQL's rigid multi-process query processing approach. Next we evaluate the effect of memory allocator overriding on MonetDB.  ... 
arXiv:1908.01860v3 fatcat:3ri4vadygzce5ao5dslmakn7zm

Accelerating Database Systems Using FPGAs: A Survey

Philippos Papaphilippou, Wayne Luk
2018 2018 28th International Conference on Field Programmable Logic and Applications (FPL)  
This survey presents a systematic review of research relating to accelerating analytical database systems using FPGAs.  ...  Database systems are key to a variety of applications, and FPGA-based accelerators have shown promise in supporting high-performance database systems.  ...  The authors would like to thank Chris Brooks and Rosie Prior from dunnhumby for their valuable involvement in the partnership program.  ... 
doi:10.1109/fpl.2018.00030 dblp:conf/fpl/PapaphilippouL18 fatcat:gcnfescocngjbkdysdzj3rhpvy

An Elastic Multi-Core Allocation Mechanism for Database Systems

Simone Dominico, Eduardo Cunha de Almeida, Jorge Augusto Meira, Marco Antonio Zanata Alves
2018 2018 IEEE 34th International Conference on Data Engineering (ICDE)  
During the parallel execution of queries in Non-Uniform Memory Access (NUMA) systems, the Operating System (OS) maps the threads (or processes) from modern database systems to the available cores among  ...  In this paper we present a data-distribution aware and elastic multi-core allocation mechanism to improve the OS mapping of database threads in NUMA systems.  ... 
doi:10.1109/icde.2018.00050 dblp:conf/icde/DominicoAMA18 fatcat:s47g75rylfaxzpyvdd2mgmatyq

Partitioning Strategy Selection for In-Memory Graph Pattern Matching on Multiprocessor Systems [chapter]

Alexander Krause, Thomas Kissinger, Dirk Habich, Hannes Voigt, Wolfgang Lehner
2017 Lecture Notes in Computer Science  
The continuously increasing size of the underlying graphs requires highly parallel in-memory graph processing engines that need to consider non-uniform memory access (NUMA) and concurrency issues to scale  ...  Hence, we present a classification of graph partitioning strategies and evaluate representative algorithms on medium and large-scale NUMA systems in this paper.  ...  Acknowledgments This work is partly funded within the Collaborative Research Center SFB 912 (HAEC).  ... 
doi:10.1007/978-3-319-64203-1_11 fatcat:5wgamlxqinanlphktayjn5ooh4

HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory

Jie Ren, Minjia Zhang, Dong Li
2020 Neural Information Processing Systems  
The state-of-the-art approximate nearest neighbor search (ANNS) algorithms face a fundamental tradeoff between query latency and accuracy, because of small main memory capacity: To store indices in main  ...  The emergence of heterogeneous memory (HM) brings opportunities to largely increase memory capacity and break the above tradeoff: Using HM, billions of data points can be placed in main memory on a single  ...  Acknowledgments and Disclosure of Funding This work was partially supported by U.S. National Science Foundation (CNS-1617967, CCF-1553645 and CCF-1718194).  ... 
dblp:conf/nips/0015ZL20 fatcat:fg42ojrwmvhuxpsrwkjg2g54rq

Column Scan Acceleration in Hybrid CPU-FPGA Systems

Nusrat Jahan Lisa, Annett Ungethüm, Dirk Habich, Wolfgang Lehner, Tuan D. A. Nguyen, Akash Kumar
2018 Very Large Data Bases Conference  
Nowadays, in-memory column store database systems are state-of-the-art for analytical workloads.  ...  The advantage of those hybrid systems is that the FPGA has usually direct access to the main memory of the CPU avoiding data copy which is a necessary procedure in other hybrid systems like CPU-GPU architectures  ...  INTRODUCTION In our data-driven world, efficient query processing is still an important aspect due to the ever-growing amount of data.  ... 
dblp:conf/vldb/LisaUHLN018 fatcat:mf2a3ca2effh5mz4abppxiyaau

Revisiting the Design of Data Stream Processing Systems on Multi-Core Processors

Shuhao Zhang, Bingsheng He, Daniel Dahlmeier, Amelie Chi Zhou, Thomas Heinze
2017 2017 IEEE 33rd International Conference on Data Engineering (ICDE)  
Multiple sockets bring non-uniform memory access (NUMA) effects. In this paper, we revisit the aforementioned design aspects on a modern scale-up server.  ...  CPU socket, b) the lack of NUMA-aware mechanism causes major drawback on the scalability of DSP systems on multi-socket architectures.  ...  ACKNOWLEDGEMENT This work is partially funded by a MoE AcRF Tier 1 grant (T1 251RES1610), a startup grant of NUS in Singapore and NSFC Project 61628204 in China.  ... 
doi:10.1109/icde.2017.119 dblp:conf/icde/ZhangHDZH17 fatcat:7p6i7kni3rhsfpoim3x6rcsotu

Morsel-driven parallelism

Viktor Leis, Peter Boncz, Alfons Kemper, Thomas Neumann
2014 Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14  
With modern computer architecture evolving, two problems conspire against the state-of-the-art approaches in parallel query execution: (i) to take advantage of many-cores, all query work must be distributed  ...  Further, the dispatcher is aware of data locality of the NUMA-local morsels and operator state, such that the great majority of executions takes place on NUMA-local memory.  ...  the most important relational operators. • A systematic approach to integrating NUMA-awareness into database systems.  ... 
doi:10.1145/2588555.2610507 dblp:conf/sigmod/LeisBK014 fatcat:l2s367jiwfggnleamjxic2gcgq


George Prekas, Marios Kogias, Edouard Bugnion
2017 Proceedings of the 26th Symposium on Operating Systems Principles - SOSP '17  
We evaluate ZYGOS with a networked version of Silo, a state-of-the-art in-memory transactional database, running TPC-C.  ...  This paper focuses on the efficient scheduling on multicore systems of very fine-grain networked tasks, which are the typical building block of online data-intensive applications.  ...  This work was funded in part by the Microsoft-EPFL Joint Research Center, the NanoTera YINS project, and a VMware Research Grant. George Prekas is supported by a Google Graduate Research Fellowship.  ... 
doi:10.1145/3132747.3132780 dblp:conf/sosp/PrekasKB17 fatcat:ocyr44ciajgdhd3td7gaeaihnq

Analyzing efficient stream processing on modern hardware

Steffen Zeuch, Bonaventura Del Monte, Jeyhun Karimov, Clemens Lutz, Manuel Renz, Jonas Traub, Sebastian Breß, Tilmann Rabl, Volker Markl
2019 Proceedings of the VLDB Endowment  
Furthermore, many state-of-the-art SPEs rely on a Java Virtual Machine to achieve platform independence and speed up system development by abstracting from the underlying hardware.  ...  In this paper, we show that taking the underlying hardware into account is essential to exploit modern hardware efficiently.  ...  This work was funded by the EU projects E2Data (780245), DFG Priority Program "Scalable Data Management for Future Hardware" (MA4662-5), and the German Ministry for Education and Research as BBDC I (01IS14013A  ... 
doi:10.14778/3303753.3303758 fatcat:3ugpwvys3vf2vba2npn2n2t47m

Software challenges in extreme scale systems

Vivek Sarkar, William Harrod, Allan E Snavely
2009 Journal of Physics, Conference Series  
Since the late 1980s, this has focused on scalable single VLSI chip designs integrating both dense memory and logic into "Processing In Memory" (PIM) architectures, efficient execution models to support  ...  He also leads the UPC language effort, a consortium of industry and academic research institutions aiming to produce a unified approach to parallel C programming based on global address space methods.  ...  A companion process could assist the application developer in identifying the problem in several ways. For example, the companion process could make memory watch points significantly more efficient.  ... 
doi:10.1088/1742-6596/180/1/012045 fatcat:iukutry2dvbitfdh6ng7kgz564

An experimental evaluation of large scale GBDT systems

Fangcheng Fu, Jiawei Jiang, Yingxia Shao, Bin Cui
2019 Proceedings of the VLDB Endowment  
Then we conduct an in-depth systematic analysis and summarize the advantageous scenarios of the quadrants.  ...  To validate our analysis empirically, we implement different quadrants in the same code base and compare them under extensive workloads, and finally compare Vero with other state-of-the-art systems over  ...  Using the horizontal approach, the memory consumption would be 56.6GB and the total communication cost would be 900GB for merely one tree in the worst case.  ... 
doi:10.14778/3342263.3342273 fatcat:h3lo7wel25fp3niclkoi2mvrf4

Quantum Monte Carlo for large chemical systems: Implementing efficient strategies for petascale platforms and beyond [article]

Anthony Scemama, William Jalby
2012 arXiv   pre-print
This novel scheme is based on the use of the highly localized character of atomic Gaussian basis functions (not the molecular orbitals as usually done), ii.) the possibility of keeping the memory footprint  ...  Various strategies to implement efficiently QMC simulations for large chemical systems are presented.  ...  The authors would also like to thank Bull, GENCI and CEA for their help in this project.  ... 
arXiv:1209.6630v2 fatcat:ymr5olx27fby5l5wqcfscsd64y


Johns Paul, Jiong He, Bingsheng He
2016 Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16  
state-of-the-art kernel-based query processing approaches, with improvement up to 48%.  ...  In this paper, we propose GPL, a novel pipelined query execution engine to improve the resource utilization of query co-processing on the GPU.  ...  Acknowledgement This work is in part supported by Singapore Ministry of Education Academic Research Fund Tier 2 under Grant MOE2012-T2-2-067.  ... 
doi:10.1145/2882903.2915224 dblp:conf/sigmod/PaulHH16 fatcat:oa5agxjyvbbtbh7ty2t73dlkdm

Application-level power and performance characterization and optimization on IBM Blue Gene/Q systems

R. Bertran, Y. Sugawara, H. M. Jacobson, A. Buyuktosunoglu, P. Bose
2013 IBM Journal of Research and Development  
Energy accounting for shared virtualized environments under DVFS using PMC-based power models.  ...  Argonne National Laboratory and the Lawrence Livermore National Laboratory on behalf of the U.S.  ...  Department of Energy, under Lawrence Livermore National Laboratory subcontract no. B554331. In addition, we gratefully acknowledge Robert Walkup and John Gunnels for their help with the benchmarks.  ... 
doi:10.1147/jrd.2012.2227580 fatcat:e2yb6krghzdvfkaavxy7ha3li4