286 Hits in 4.0 sec

Engineering a compact parallel delaunay algorithm in 3D

Daniel K. Blandford, Guy E. Blelloch, Clemens Kadow
2006 Proceedings of the twenty-second annual symposium on Computational geometry - SCG '06  
We describe an implementation of a compact parallel algorithm for 3D Delaunay tetrahedralization on a 64-processor shared-memory machine.  ...  Our algorithm uses a concurrent version of the Bowyer-Watson incremental insertion, and a thread-safe space-efficient structure for representing the mesh.  ...  All data locks are "test-locks" rather than "wait-locks": if a thread fails to acquire a lock, it aborts the operation rather than waiting for the lock to become free.  ... 
doi:10.1145/1137856.1137900 dblp:conf/compgeom/BlandfordBK06 fatcat:q7xrzyoydjaqfh6tmzhoezfblm
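
The "test-lock" behaviour mentioned in the snippet above (abort instead of waiting when a lock is taken) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `try_update` and the dictionary standing in for shared data are hypothetical names chosen for illustration, and the only library call relied on is `threading.Lock.acquire(blocking=False)`.

```python
import threading

def try_update(lock, shared, key, value):
    """Attempt one update; give up at once if the lock is already held."""
    # acquire(blocking=False) returns False immediately instead of waiting.
    if not lock.acquire(blocking=False):
        return False  # abort the operation; the caller may retry later
    try:
        shared[key] = value
        return True
    finally:
        lock.release()

lock = threading.Lock()
shared = {}  # hypothetical stand-in for the shared data

first = try_update(lock, shared, "a", 1)   # lock is free, so the update runs
lock.acquire()                             # simulate another thread holding the lock
second = try_update(lock, shared, "b", 2)  # lock is held, so the update aborts
lock.release()
```

A caller that receives `False` would typically move on to other work and retry later, which is the property that distinguishes a test-lock from a wait-lock.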

AIFM: High-Performance, Application-Integrated Far Memory

Zhenyuan Ruan, Malte Schwarzkopf, Marcos K. Aguilera, Adam Belay
2020 USENIX Symposium on Operating Systems Design and Implementation  
Our key insight is that exposing application-level semantics to a high-performance runtime makes efficient remoteable memory possible.  ...  AIFM achieves the same common-case access latency for far memory as for local RAM; it avoids read and write amplification that paging-based approaches suffer; it allows data structure engineers to build  ...  Hashtable Hash tables provide unordered maps that typically see random accesses, often with high temporal locality.  ... 
dblp:conf/osdi/RuanSAB20 fatcat:ycvlr6txszhhpkh7onb22kg4sa

Processing Database Joins over a Shared-Nothing System of Multicore Machines [article]

Abhirup Chakraborty
2018 arXiv   pre-print
, parallelizes the communication by allowing multiple simultaneous data transfers (send/receive), and removes synchronization barriers (a scalability bottleneck in a distributed data processing system)  ...  By exploiting multiple processing cores within the individual machines, we implement a system to process database joins that parallelizes computation within each node, pipelines the computation with communication  ...  Line 7 releases the memory in the relevant bucket within the hashtable frame, and line 9 frees up the memory occupied by the hashtable frame when all the buckets within the hashtable frame are processed  ... 
arXiv:1804.09324v1 fatcat:hxfal6qrr5f7jfhz2xoko7gvjm

McRT-STM

Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hudson, Chi Cao Minh, Benjamin Hertzberg
2006 Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '06  
We also show a MCAS implementation that works on arbitrary values, coexists with the STM, and can be used as a more efficient form of transactional memory.  ...  Our STM tries to maximize throughput by almost always making readers/writers wait when a lock is not available.  ...  The only difference is that the wait locations are guarded by a mutex. Readers and writers acquire the mutex to allow race-free signaling. Upgrades are handled during validation.  ... 
doi:10.1145/1122971.1123001 dblp:conf/ppopp/SahaAHMH06 fatcat:3uuogbmfqbg4pjvykeqdbi7pju

Lock-free dynamic hash tables with open addressing

H. Gao, J. F. Groote, W. H. Hesselink
2005 Distributed computing  
We present an efficient lock-free algorithm for parallel accessible hash tables with open addressing, which promises more robust performance and reliability than conventional lock-based implementations  ...  Lock-free algorithms are hard to design correctly, even when apparently straightforward.  ...  Introduction We are interested in efficient, reliable, parallel algorithms.  ... 
doi:10.1007/s00446-004-0115-2 fatcat:qrklbzlyo5bvpkp22c3ndjsele

Lock-free dynamic hash tables with open addressing [article]

Hui Gao, Jan Friso Groote, Wim H. Hesselink
2004 arXiv   pre-print
We present an efficient lock-free algorithm for parallel accessible hash tables with open addressing, which promises more robust performance and reliability than conventional lock-based implementations  ...  Lock-free algorithms are hard to design correctly, even when apparently straightforward.  ...  Introduction We are interested in efficient, reliable, parallel algorithms.  ... 
arXiv:cs/0303011v4 fatcat:bneui2yz5vcfnftwls536m7lyq

Compiler and runtime support for efficient software transactional memory

Ali-Reza Adl-Tabatabai, Brian T. Lewis, Vijay Menon, Brian R. Murphy, Bratin Saha, Tatiana Shpeisman
2006 SIGPLAN notices  
Programmers have traditionally used locks to synchronize concurrent access to shared data.  ...  Our system efficiently implements nested transactions that support both composition of transactions and partial roll back.  ...  The coarse-grained synchronized implementation does not scale at all, as all operations to the hashtable are serialized and no parallelization is permitted.  ... 
doi:10.1145/1133255.1133985 fatcat:74y2op54xrfvjgvnvuxn4ozk24

Compiler and runtime support for efficient software transactional memory

Ali-Reza Adl-Tabatabai, Brian T. Lewis, Vijay Menon, Brian R. Murphy, Bratin Saha, Tatiana Shpeisman
2006 Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation - PLDI '06  
Programmers have traditionally used locks to synchronize concurrent access to shared data.  ...  Our system efficiently implements nested transactions that support both composition of transactions and partial roll back.  ...  The coarse-grained synchronized implementation does not scale at all, as all operations to the hashtable are serialized and no parallelization is permitted.  ... 
doi:10.1145/1133981.1133985 dblp:conf/pldi/Adl-TabatabaiLMMSS06 fatcat:p3khq7enrneatok732mqktw3ti

The Atomos transactional programming language

Brian D. Carlstrom, Austen McDonald, Hassan Chafi, JaeWoong Chung, Chi Cao Minh, Christos Kozyrakis, Kunle Olukotun
2006 Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation - PLDI '06  
The Atomos watch statement allows programmers to specify fine-grained watch sets used with the Atomos retry conditional waiting statement for efficient transactional conflict-driven wakeup even in transactional  ...  The results demonstrate both the improvements in parallel programming ease and parallel program performance provided by Atomos.  ...  Hashtable and HashMap use a single mutex, while ConcurrentHashMap uses fine-grained locking to support concurrent access.  ... 
doi:10.1145/1133981.1133983 dblp:conf/pldi/CarlstromMCCMKO06 fatcat:zqoupp3chzayti2drnu3yf64c4

The Atomos transactional programming language

Brian D. Carlstrom, Austen McDonald, Hassan Chafi, JaeWoong Chung, Chi Cao Minh, Christos Kozyrakis, Kunle Olukotun
2006 SIGPLAN notices  
The Atomos watch statement allows programmers to specify fine-grained watch sets used with the Atomos retry conditional waiting statement for efficient transactional conflict-driven wakeup even in transactional  ...  The results demonstrate both the improvements in parallel programming ease and parallel program performance provided by Atomos.  ...  Hashtable and HashMap use a single mutex, while ConcurrentHashMap uses fine-grained locking to support concurrent access.  ... 
doi:10.1145/1133255.1133983 fatcat:uinaqmbfgzd6bezfc5hafbneey

Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks

Ran Liu, Heng Zhang, Haibo Chen
2014 USENIX Annual Technical Conference  
Further, some scalable rwlocks cannot cope with OS semantics like sleeping inside critical sections, preemption and conditional wait.  ...  Reader-writer locks (rwlocks) aim to maximize parallelism among readers, but many existing rwlocks either cause readers to contend, or significantly extend writer latency, or both.  ...  Compared to RCU, it trades obstruction-free reader access for a much stronger and clearer semantic and much shorter writer latency.  ... 
dblp:conf/usenix/0003ZC14 fatcat:nuxtc5rktjajhilekl4z7rftrq

The Shared Map pattern

Beverly A Sanders
2010 Proceedings of the 2010 Workshop on Parallel Programming Patterns - ParaPLoP '10  
Shared Map is a parallel design pattern that addresses the problem "How can a map (or dictionary) data structure shared by multiple concurrent threads or processes be safely and efficiently implemented?  ...  Many computations, including those performed in parallel, utilize a data structure that can be viewed as a mapping between a key and an associated value.  ...  Almost lock-free gets The lock striping approach significantly increases the amount of concurrency.  ... 
doi:10.1145/1953611.1953625 fatcat:pygti3nj3vbdjdsvemgvplyyfq
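
As a rough illustration of the lock striping idea named in the snippet above, the following Python sketch guards each group of keys with its own lock, so that operations on keys in different stripes do not contend. The class name `StripedMap`, the stripe count, and the method names are assumptions made for this example and are not taken from the paper. The sketch also locks on reads, so it does not show the "almost lock-free gets" variant.

```python
import threading

class StripedMap:
    """Hypothetical map that uses one lock per stripe instead of one global lock."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._buckets = [{} for _ in range(stripes)]

    def _index(self, key):
        # Keys with the same index share a lock; other keys use other locks.
        return hash(key) % len(self._locks)

    def put(self, key, value):
        i = self._index(key)
        with self._locks[i]:
            self._buckets[i][key] = value

    def get(self, key, default=None):
        i = self._index(key)
        with self._locks[i]:
            return self._buckets[i].get(key, default)

def fill(shared_map, start, count):
    """Insert `count` consecutive integer keys, each mapped to its square."""
    for k in range(start, start + count):
        shared_map.put(k, k * k)

striped = StripedMap(stripes=8)
workers = [threading.Thread(target=fill, args=(striped, n * 100, 100)) for n in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

With a single lock, all four writer threads would serialize on it. With stripes, two threads block each other only when their keys fall into the same stripe.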

Fast, Dynamically-Sized Concurrent Hash Table [chapter]

J. Barnat, P. Ročkai, V. Štill, J. Weiser
2015 Lecture Notes in Computer Science  
Among the main design criteria were the ability to efficiently use variable-length keys, dynamic table resizing to accommodate data sets of unpredictable size and fully concurrent read-write access.  ...  We present a new design and a C++ implementation of a high-performance, cache-efficient hash table suitable for use in implementation of parallel programs in shared memory.  ...  Concurrent Access As we have discussed, open hashing is more cache efficient, and compared to a simple closed hashing scheme is also more space efficient.  ... 
doi:10.1007/978-3-319-23404-5_5 fatcat:6gn5y2bipnh2lkyhoefufmh6ou

Continuously measuring critical section pressure with the free-lunch profiler

Florian David, Gael Thomas, Julia Lawall, Gilles Muller
2014 Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications - OOPSLA '14  
We propose Free Lunch, a new profiler designed to identify locks and critical sections that hamper scalability.  ...  In an evaluation on over thirty applications, we found that the overhead of Free Lunch is never greater than 6%.  ...  The class java.util.Hashtable uses a lock to ensure mutual exclusion on each access to the hashtable, leading to a bottleneck.  ... 
doi:10.1145/2660193.2660210 dblp:conf/oopsla/DavidTLM14 fatcat:dze4bni7qvgq7hkgpqk4jzgvie

Continuously measuring critical section pressure with the free-lunch profiler

Florian David, Gael Thomas, Julia Lawall, Gilles Muller
2014 SIGPLAN notices  
We propose Free Lunch, a new profiler designed to identify locks and critical sections that hamper scalability.  ...  In an evaluation on over thirty applications, we found that the overhead of Free Lunch is never greater than 6%.  ...  The class java.util.Hashtable uses a lock to ensure mutual exclusion on each access to the hashtable, leading to a bottleneck.  ... 
doi:10.1145/2714064.2660210 fatcat:iugfaynxdrcf7o7ophhzxw5kqi