
Vulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically

Abdullah Muzahid, Shanxiang Qi, Josep Torrellas
2012 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture  
Past work has focused on detecting data races as proxies for Sequential Consistency (SC) violations. However, most data races do not violate SC.  ...  This paper presents Vulcan, the first hardware scheme to precisely detect SC violations at runtime, in programs running on a relaxed-consistency machine.  ...  Background A Sequential Consistency Violation (SCV) occurs when the memory operations of a program have executed in an order that does not conform to any SC interleaving.  ... 
doi:10.1109/micro.2012.41 dblp:conf/micro/MuzahidQT12 fatcat:xee34gn7bfgtznyqsdzmoiwupq


Yuanfeng Peng, Benjamin P. Wood, Joseph Devietti
2017 Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-50 '17  
Existing software race detectors are precise but slow, and hardware support for precise data race detection relies on assumptions like type safety that many programs violate in practice.  ...  Data race detection is a useful dynamic analysis for multithreaded programs that is a key building block in record-and-replay, enforcing strong consistency models, and detecting concurrency bugs.  ...  This work is supported by the National Science Foundation through grant #1337174.  ... 
doi:10.1145/3123939.3123946 dblp:conf/micro/PengWD17 fatcat:evzd7kwfkrdadc7s6k2bo5fzpu


Xuehai Qian, Josep Torrellas, Benjamin Sahelices, Depei Qian
2013 Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems - ASPLOS '13  
Sequential Consistency (SC) is the most intuitive memory model, and SC Violations (SCVs) produce unintuitive, typically incorrect executions.  ...  This paper presents Volition, the first hardware scheme that detects SCVs in a relaxed-consistency machine precisely, in a scalable manner, and for an arbitrary number of processors in the cycle.  ...  Acknowledgements This work was supported in part by NSF grants CCF-1012759 and CNS-1116237; Intel under the Illinois-Intel Parallelism Center (I2PC); Spanish Gov. & European ERDF under grants TIN2010-21291  ... 
doi:10.1145/2451116.2451174 dblp:conf/asplos/QianTSQ13 fatcat:uhgygpewsrczne6gaoddud6374

A co-synthesis approach to embedded system design automation

Rajesh K. Gupta, Giovanni de Micheli
1996 Design automation for embedded systems  
Constraint analysis is then used to define hardware and software portions of functionality. We describe algorithms and techniques used in developing a practical co-synthesis framework, Vulcan.  ...  This co-synthesis is based on synthesis techniques for digital hardware and software compilation under constraints.  ...  However, this would not support any branching nor reordering of data arrivals since dynamic scheduling of operations in hardware would not be supported.  ... 
doi:10.1007/bf00134684 fatcat:ulkbdi72xndkvefw2ua2fyguoe


Yuelu Duan, Abdullah Muzahid, Josep Torrellas
2013 Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13  
If fences were largely free, faster fine-grained concurrent algorithms could be devised, and compilers could guarantee Sequential Consistency (SC) at little cost.  ...  Only when an incorrect reordering of accesses is about to happen, does the hardware stall to prevent it.  ...  As an example, in this section, we outline the design for a system supporting release consistency (RC). Under RC, the hardware can reorder rd-rd and wr-wr accesses, in addition to wr-rd as in TSO.  ... 
doi:10.1145/2485922.2485941 dblp:conf/isca/DuanMT13 fatcat:ryu4sa3nkrbn5ggkz2vbzdneyq


Yuelu Duan, Abdullah Muzahid, Josep Torrellas
2013 SIGARCH Computer Architecture News  
If fences were largely free, faster fine-grained concurrent algorithms could be devised, and compilers could guarantee Sequential Consistency (SC) at little cost.  ...  Only when an incorrect reordering of accesses is about to happen, does the hardware stall to prevent it.  ...  As an example, in this section, we outline the design for a system supporting release consistency (RC). Under RC, the hardware can reorder rd-rd and wr-wr accesses, in addition to wr-rd as in TSO.  ... 
doi:10.1145/2508148.2485941 fatcat:r7ifvoafhvcbnm23r27agixnte

Codex-dp: co-design of communicating systems using dynamic programming

Jui-Ming Chang, M. Pedram
2000 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
In this paper, we present a novel algorithm based on dynamic programming with binning to find, subject to a given deadline, the minimum-cost coarse-grain hardware/software partitioning and mapping of communicating  ...  Introduction Previous work on system level synthesis has focused mainly on fine-grain hardware/software partitioning. Examples include Vulcan II [1] and Cosyma [2].  ...  The task graph for example 4 is a large task graph taken from [16] which performs the voice activity detection in a GSM phone.  ... 
doi:10.1109/43.851989 fatcat:tyfujodoevdkrg54nj6kukdlxy

Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation

Ruoyu Zhou, Timothy M. Jones
2019 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)  
runtime checks and speculation guard against data dependence violations.  ...  It allows us to parallelise even those loops containing dynamically discovered code.  ...  We have presented Janus, a framework for dynamic binary parallelisation that incorporates static analysis, profile information, and runtime checks.  ... 
doi:10.1109/cgo.2019.8661196 dblp:conf/cgo/Zhou019 fatcat:u23nm37usncgrhblqtzwjcpkoy

The Qweak Experiment: A Search for New Physics at the TeV Scale via a Measurement of the Proton's Weak Charge [article]

R. D. Carlini, J.M. Finn, S. Kowalski, S. A. Page, D. S. Armstrong, A. Asaturyan, T. Averett, J. Benesch, J. Birchall, P. Bosted, A. Bruell, C. L. Capuano (+71 others)
2012 arXiv   pre-print
We propose a new precision measurement of parity-violating electron scattering on the proton at very low Q^2 and forward angles to challenge predictions of the Standard Model and search for new physics  ...  A 2200 hour measurement of the parity violating asymmetry in elastic ep scattering at Q^2=0.03 (GeV/c)^2 employing 180 μA of 85 determine the proton's weak charge with approximately 4 and systematic errors  ...  The final collimator design consists of three sequential elements, the middle of which is the acceptance-defining collimator, with the other two inserted for 'clean-up' purposes.  ... 
arXiv:1202.1255v2 fatcat:4jfnrcqzxzachjbcgbxawxg2ui

Towards PDES in a Message-Driven Paradigm

Eric Mikida, Nikhil Jain, Laxmikant Kale, Elsa Gonsiorowski, Christopher D. Carothers, Peter D. Barnes, David Jefferson
2016 Proceedings of the 2016 annual ACM Conference on SIGSIM Principles of Advanced Discrete Simulation - SIGSIM-PADS '16  
Accurate simulation of dynamically varying behavior of large components in these domains requires the DES engines to be scalable and adaptive in order to complete simulations in a reasonable time.  ...  In this paper, we first show that the programming model of Charm++ is highly suitable for implementing a PDES engine such as ROSS.  ...  This work used resources of the Argonne Leadership Computing Facility at Argonne National Laboratory, which is supported by the Office of Science of the U.S.  ... 
doi:10.1145/2901378.2901393 dblp:conf/pads/MikidaJKGCBJ16 fatcat:5uoci6tmxzcubjltnk7xl76d7u

Techniques for minimizing and balancing I/O during functional partitioning

F. Vahid
1999 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
The FunctionBus allows choice of any size for internal I/O by trading off I/O size for performance, while port calling allows distribution of external I/O almost arbitrarily among modules.  ...  Recent work has demonstrated numerous benefits of functionally partitioning a behavioral process into mutually exclusive subprocesses before synthesizing each process into a custom digital-hardware processor  ...  Each subprocess pj will, when generated, consist of a loop that detects a request for one of the subprocess' procedures, receives the necessary input parameters, calls the procedure, and sends back any  ... 
doi:10.1109/43.739060 fatcat:bysbwz6jkbabvdrjiky3lfhumu

Asymmetric Memory Fences

Yuelu Duan, Nima Honarmand, Josep Torrellas
2015 SIGPLAN notices  
., Strong Fence or sF) for the less performance-critical thread(s). We call the result an Asymmetric fence group. We also propose a taxonomy of Asymmetric fence groups under TSO.  ...  Fences prevent a Sequential Consistency Violation (SCV) when multiple fences execute concurrently, each one invoked by a different thread and, as a group, prevent a cycle of dependences [29].  ...  Our work is also related to schemes that enforce SC or identify SC violations. Examples are Conflict Ordering (CO) [21], End-to-End SC [22, 31], Vulcan [24], and Volition [25].  ... 
doi:10.1145/2775054.2694388 fatcat:u3crywms4bbu5lud42uejds5si

Implementation of a versatile research data acquisition system using a commercially available medical ultrasound scanner

Martin Christian Hemmsen, Svetoslav Ivanov Nikolov, Mads Møller Pedersen, Michael Johannes Pihl, Marie Sand Enevoldsen, Jens Munk Hansen, Jørgen Arendt Jensen
2012 IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control  
Three examples of system use are presented in this paper: evaluation of synthetic aperture sequential beamformation, transverse oscillation for blood velocity estimation, and acquisition of spectral velocity  ...  The system consists of a standard PC equipped with a camera link and an ultrasound scanner equipped with a research interface.  ...  Fig. 4. Visualization of wire and contrast phantom measurement, using (a) dynamic receive focusing beamforming and (b) synthetic aperture sequential beamformation.  ... 
doi:10.1109/tuffc.2012.2349 pmid:22828844 fatcat:y3ce67khrrdo5dgd5kqfi7rm3i

Programming languages for distributed computing systems

Henri E. Bal, Jennifer G. Steiner, Andrew S. Tanenbaum
1989 ACM Computing Surveys  
When distributed systems first appeared, they were programmed in traditional sequential languages, usually with the addition of a few library procedures for sending and receiving messages.  ...  Researchers all over the world began designing new programming languages specifically for implementing distributed applications.  ...  The third and final requirement for distributed programming support, therefore, is the ability to detect and recover from partial failure of the system.  ... 
doi:10.1145/72551.72552 fatcat:y2afbdzlpbdgrhutfhgfuhfjmq

Architectural Support to Accelerate Fine-Grain Program Monitoring

Sotiria Fytraki
DefUse detects a bug when the inferred invariant is violated. Hardware-Based Monitoring Systems A number of proposals have sought to provide hardware support for a variety of monitoring tools.  ...  Hardware Support for Dynamic Information Flow Tracking Early hardware-only proposals implement the monitor directly in hardware and hardwire the monitoring policy.  ... 
doi:10.5075/epfl-thesis-6257 fatcat:mkvqgcbs3zaqnhqd6p66fkhpky