A Survey on Thread-Level Speculation Techniques

Alvaro Estebanez, Diego R. Llanos, Arturo Gonzalez-Escribano
2016 ACM Computing Surveys  
Thread-Level Speculation (TLS) is a promising technique that allows the parallel execution of sequential code without relying on a prior, compile-time dependence analysis.  ...  In this work we introduce the technique, present a taxonomy of TLS solutions, and summarize and put into perspective the most relevant advances in this field.  ...  This paper is dedicated in loving memory of Dr. Agustín de Dios Hernández.  ... 
doi:10.1145/2938369 fatcat:yqqyjoaidvci3d4dyuw2jc2p2i

Exploiting Postdominance for Speculative Parallelization

Mayank Agarwal, Kshitiz Malik, Kevin M. Woley, Sam S. Stone, Matthew I. Frank
2007 2007 IEEE 13th International Symposium on High Performance Computer Architecture  
Task-selection policies are critical to the performance of any architecture that uses speculation to extract parallel tasks from a sequential thread.  ...  The specific contributions of this paper include, first, a description of task selection based on immediate postdominance for a system that speculatively creates tasks.  ...  Acknowledgments We are grateful to several people for helping to make this paper possible. Sanjay Patel was an enthusiastic contributor to the early formulation of this work.  ... 
doi:10.1109/hpca.2007.346207 dblp:conf/hpca/AgarwalMWSF07 fatcat:hthp2iczz5dhrpcyiw2syrzxza

A Survey of Coarse-Grained Reconfigurable Architecture and Design

Leibo Liu, Jianfeng Zhu, Zhaoshi Li, Yanan Lu, Yangdong Deng, Jie Han, Shouyi Yin, Shaojun Wei
2019 ACM Computing Surveys  
However, CGRAs are not yet mature in terms of programmability, productivity, and adaptability.  ...  This article reviews the architecture and design of CGRAs thoroughly for the purpose of exploiting their full potential. First, a novel multidimensional taxonomy is proposed.  ...  The task-level speculation (TLS) technique can perform multiple threads that may have internal data dependence in parallel. Threads should be squashed at the detection of any dependence violations.  ... 
doi:10.1145/3357375 fatcat:pqi4d33i6bg45a6llswhwd44qi

A Survey of Computer Architecture Simulation Techniques and Tools

Ayaz Akram, Lina Sawalha
2019 IEEE Access  
Computer architecture simulators play an important role in advancing computer architecture research.  ...  We believe that this paper will be a very useful resource for the computer architecture community especially for early-stage computer architecture and systems researchers to gain exposure to the existing  ...  ACKNOWLEDGEMENT The authors would like to thank the anonymous reviewers for their valuable feedback and comments.  ... 
doi:10.1109/access.2019.2917698 fatcat:zbf5dapusrewbiti6pmgg6le74

A Primer on Memory Consistency and Cache Coherence, Second Edition

Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill, David A. Wood
2020 Synthesis Lectures on Computer Architecture  
In the former (also known as static testing), a real or simulated multiprocessor is tested for consistency violations prior to being deployed.  ...  In online testing (also known as dynamic testing), hardware support is added to a multiprocessor for detecting such violations during execution.  ...  When the write-back for data1 reaches the global directory/LLC, the LLC upon finding that the block has sharers in the CPU, must forward an Inv request for data1 to the CPU local directory, which in  ... 
doi:10.2200/s00962ed2v01y201910cac049 fatcat:diry32l6dva5xbsgzuc7fvq7ie

The use of multithreading for exception handling

C.B. Zilles, J.S. Emer, G.S. Sohi
MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture  
Acknowledgements We thank Amir Roth, Milo Martin and the anonymous reviewers for their comments and valuable suggestions on earlier drafts of this paper and Rebecca Stamm and George Chrysos for providing  ...  This work is supported in part by National Science Foundation Grant MIP-9505853, and an equipment donation from Intel Corp. Craig Zilles was supported by an NSF Graduate Fellowship.  ...  This work explores using separate threads in a multithreaded processor for exception handling to avoid squashing in-flight instructions.  ... 
doi:10.1109/micro.1999.809460 dblp:conf/micro/ZillesES99 fatcat:3nj5yrzbwrflzodzvorqprsi2q

Transactional Memory, 2nd edition

Tim Harris, James Larus, Ravi Rajwar
2010 Synthesis Lectures on Computer Architecture  
for multithreaded programming in C/C++ [30].  ...  The hardware system (Figure 5.9) focuses on speculative parallelization of a single-threaded program into multiple threads and then executing the program on a multiprocessor.  ...  , programming languages for parallel computing, tools for verifying program correctness, and techniques for compiler  ... 
doi:10.2200/s00272ed1v01y201006cac011 fatcat:25d3gvp5zrfqlgpzdzknqouofi

Do Inputs Matter? Using Data-Dependence Profiling to Evaluate Thread Level Speculation in the BlueGene/Q

Arnamoy Bhattacharyya
Thread Level Speculation (TLS) is a hardware/software technique that guarantees correct speculative parallel execution of the program even in the presence of may dependences.  ...  In the era of many-core architectures, it is necessary to fully exploit the maximum available parallelism in computer programs.  ...  For speculatively parallelizing loops with function calls, new dependences may be introduced that lead to dependence violation and thread squashing.  ... 
doi:10.7939/r3xx1x fatcat:mavrkxqcdrdmhgta2x54glueb4

Techniques for Shared Resource Management in Systems with Throughput Processors [article]

Rachata Ausavarungnirun
2018 arXiv   pre-print
Graphics Processing Units (GPUs) are a prime example of throughput processors that can deliver high performance for applications ranging from typical graphics applications to general-purpose data parallel  ...  We identify and eliminate performance bottlenecks caused by major sources of interference throughout the memory hierarchy.  ...  In addition to my family, I would like to thank my advisor, Prof. Onur Mutlu, for providing me with great research environment.  ... 
arXiv:1803.06958v1 fatcat:3mqbwegpkvdrpk6sqwb3ooyh7e

Variability Mitigation in Nanometer CMOS Integrated Systems: A Survey of Techniques From Circuits to Software

Abbas Rahimi, Luca Benini, Rajesh K. Gupta
2016 Proceedings of the IEEE  
We find that parallel architectures and parallelism in general provide the best means to combat and exploit variability to design resilient and efficient systems.  ...  These can be combined in various ways to achieve specific goals related to observability and controllability of the variability effects, providing means to achieve cross-layer or hybrid resilience.  ...  A variability-aware task dispatching technique enhances predictability and energy efficiency for multimedia streaming applications running on parallel multiprocessor arrays [112].  ... 
doi:10.1109/jproc.2016.2518864 fatcat:sxrsu3excbdg5p7sk4iczz262y

FUTURE COMPUTING 2011

Kendall Nygard, Pascal Lorenz, Miriam Capretz, Hiroyuki Sato, Cristina Seceleanu, Marek Druzdzel, Radu Calinescu (+23 others)
2011 The Third International Conference on Future Computational Technologies and Applications   unpublished
We are grateful to the members of the FUTURE COMPUTING 2011 organizing committee for their help in handling the logistics and for their work to make this professional meeting a success.  ...  We hope that FUTURE COMPUTING 2011 was a successful international forum for the exchange of ideas and results between academia and industry and for the promotion of progress in the field of future computational  ...  ACKNOWLEDGEMENT The first author thanks the Japan Society for Promotion of Science for its support.  ... 

ICCGI 2015 The Tenth International Multi-Conference on Computing in the Global Information Technology

Dan Tamir, Mirela Danubianu, Dominic Girardi, Bernhard Freudenthaler, Pablo Adasme (+134 others)
We also gratefully thank the members of the ICCGI 2015 organizing committee for their help in handling the logistics and for their work that made this professional meeting a success.  ...  We hope ICCGI 2015 was a successful international forum for the exchange of ideas and results between academia and industry and to promote further progress in the field of computing in the global information  ...  Because CMP architectures share resources among processing cores, the placement of threads or thread affinity can significantly impact the execution of a multithreaded workload.  ... 

Special Issue: Ant Colonies and Multi-Agent Systems. Guest Editors: Nadia Nedjah, Luiza de Macedo Mourelle

Anton Železnikar, Matjaž Gams, Drago Torkar, Tomaž Banovec, Ciril Baškovič, Andrej Jerman-Blažič, Jožko Čuk, Vladislav Rajkovič, Ivan Bratko, Marko Jagodič (+14 others)
Precise branch prediction is required to overcome this performance limitation imposed on high performance architecture and is the key to many techniques for enhancing and exploiting Instruction-Level Parallelism  ...  This operation is time consuming for large operands, which is always the case in cryptography.  ...  Loy for his help in generating the contour graphs.  ...