A Review on Security in Cache Memories
2016
Indian Journal of Science and Technology
XOR operations, extended Hamming codes and multi-bit clustered ECC. ...
Findings: Discussed solutions involve in the design of secured cryptographic based algorithms, secure aware cache mapping and low power cache design by employing techniques such as code convertors, nested ...
Cache-coloring based technique 7 for saving leakage energy in multi-tasking systems involves in dealing with energy efficiency which is a prominent factor. ...
doi:10.17485/ijst/2016/v9i48/96037
fatcat:vv5p5sksczacdp35bndfpd6o3a
A Survey on Cache Management Mechanisms for Real-Time Embedded Systems
2015
ACM Computing Surveys
In this article, we present a survey of cache management techniques for real-time embedded systems, from the first studies of the field in 1990 up to the latest research published in 2014. ...
Recently, many research works have proposed different techniques to deal with caches in multicore processors in the context of real-time systems. ...
Suhendra and Mitra (this work was discussed in Section 3) were the first authors to evaluate the combination of cache partitioning and cache locking in the context of multicore real-systems [Suhendra ...
doi:10.1145/2830555
fatcat:nckhashqprghfnbcaqqu7vk5vi
Hardware-Accelerated Platforms and Infrastructures for Network Functions: A Survey of Enabling Technologies and Research Studies
2020
IEEE Access
system, and the tag hit circuit for address translations through the TLB is the critical path for the full configuration version. ...
Latencies in multi-core systems affect the overall system performance, especially for latency-critical packet processing functions. ...
doi:10.1109/access.2020.3008250
fatcat:kv4znpypqbatfk2m3lpzvzb2nu
GeantV
2021
Computing and Software for Big Science
Full detector simulation was among the largest CPU consumers in all CERN experiment software stacks for the first two runs of the Large Hadron Collider. ...
In this context, the GeantV R&D project was launched, aiming to redesign the legacy particle transport code in order to benefit from features of fine-grained parallelism, including vectorization and increased ...
This allowed a large reduction in contention in the multi-threaded basket mode. An important feature for fine-grained workflows is load balancing. ...
doi:10.1007/s41781-020-00048-6
fatcat:5gekpngjdzbunapugtxuujx6vm
GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP
[article]
2020
arXiv
pre-print
Full detector simulation was among the largest CPU consumers in all CERN experiment software stacks for the first two runs of the Large Hadron Collider (LHC). ...
In this context, the GeantV R&D project was launched, aiming to redesign the legacy particle transport codes in order to make them benefit from fine-grained parallelism features such as vectorization, ...
This allowed a large reduction in contention in the multi-threaded basket mode. An important feature for fine-grained workflows is load balancing. ...
arXiv:2005.00949v3
fatcat:pwileoh23vdnlaf3dafxqth3iq
A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps
[article]
2016
arXiv
pre-print
For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive. ...
CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency. ...
Acknowledgments We thank the reviewers for their valuable suggestions. We thank the members of the SAFARI group for their feedback and the stimulating research environment they provide. ...
arXiv:1602.01348v1
fatcat:qbzuknzcyncrticap55x4i5dhi
A case for core-assisted bottleneck acceleration in GPUs
2015
SIGARCH Computer Architecture News
For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive. ...
CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency. ...
Gennady Pekhimenko is supported in part by a Microsoft Research Fellowship. Rachata Ausavarungnirun is supported in part by the Royal Thai Government scholarship. ...
doi:10.1145/2872887.2750399
fatcat:mdd25bfj25frrnazvn5aj2cfxm
A case for core-assisted bottleneck acceleration in GPUs
2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15
For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive. ...
CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency. ...
Gennady Pekhimenko is supported in part by a Microsoft Research Fellowship. Rachata Ausavarungnirun is supported in part by the Royal Thai Government scholarship. ...
doi:10.1145/2749469.2750399
dblp:conf/isca/VijaykumarPJ0AD15
fatcat:vow55cmt3zhlxmg5o3x2lxx6ri
Enhancing Programmability, Portability, and Performance with Rich Cross-Layer Abstractions
[article]
2019
arXiv
pre-print
We propose 4 different approaches to designing richer abstractions between the application, system software, and hardware architecture in different contexts to significantly improve programmability, portability ...
In doing so, they enable a rich space of hardware-software cooperative mechanisms to optimize for performance. ...
, system-level tasks, etc. ...
arXiv:1911.05660v1
fatcat:w5f3g4isqbcphm2jjfzjtvrjnq
Integrating accelerators in heterogeneous systems
2021
This work studies programmability enhancing abstractions in the context of accelerators and heterogeneous systems. ...
I study both suitability in terms of existing operational semantics, as well as design considerations necessary for efficient implementation. First, I study the mapping of high-level dynamic languages to ...
Predator-prey game
Multi-tasking model: Like the Stroop model, the multi-tasking model represents conflict in representation. ...
doi:10.7282/t3-8w0g-6257
fatcat:4gjpec34kbb2djjofn777eicmi
OASIcs, Volume 55, WCET'16, Complete Volume
[article]
2016
We want to thank Bendikt Huber for porting the Lift benchmark from Java to C. We want to thank Niklas Holsti from Tidorum Ltd for contributing DEBIE in open-source. ...
The author would like to thank the organizers, the speakers and the participants of the Optimizing Real-Time Systems workshop on Parallelization of real-time tasks: they have inspired this paper. ...
for Timing Analysis on Embedded Multi-Cores
Figure 1: A parallel task defined as a set of sub-tasks subject to precedence constraints (DAG). ...
doi:10.4230/oasics.wcet.2016
fatcat:2smyzkdh7rdndcptwasq4qsfue
Exploiting Cache Locality At Run-Time
1998
Each original is also photographed in one exposure and is included in reduced form at the back of the book. ...
Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer. ...
Memory-access space shrinking: In our run-time system, for any given parallel loop, the memory-access pattern of its parallel tasks is captured by a multi-dimensional memory-access space in Section ...
doi:10.21220/s2-h1zs-0y44
fatcat:vw646cw6ojbnzenug53gilrmby
Cache-aware development of high integrity real-time systems
2012
Cost, performance and availability considerations are forcing even the most conservative high-integrity embedded real-time systems industry to migrate from simple hardware processors to ones equipped with ...
Caches are perceived as an additional source of complexity, which has potential for shattering the guarantees of cost- and schedule-constrained qualification of their systems. ...
Nemer et al. in [111] present a task timing analysis for statically scheduled multi-tasking systems that accounts for the interleaving of non-preemptable tasks in computing the abstract cache states ...
doi:10.6092/unibo/amsdottorato/4591
fatcat:f2wb2o6m45fprk74sqybdfhxxy
OASIcs, Volume 57, WCET'17, Complete Volume
[article]
2017
We wish to thank Mathieu Serrurier for patiently sharing with us bits of its expertise in machine learning. Acknowledgements. ...
We would also like to thank the anonymous reviewers for their comments on an earlier draft of this paper and the suggestions for the continuation of the work. F. Markovic, J. Carlson, and R. ...
avionic safety-critical embedded system (different than trace1) where the task is a two-mode application executing on a multi-core platform. ...
doi:10.4230/oasics.wcet.2017
fatcat:og6ez4wyovho3fy2sewsoqixqu
Finding, Measuring, and Reducing Inefficiencies in Contemporary Computer Systems
2017
Computer systems have become increasingly diverse and specialized in recent years. ...
The first of the five case studies is Parallel Block Vectors, a new profiling method for understanding parallel programs with a fine-grained, code-centric perspective aids in both future hardware design ...
An extreme form of specialized processor, accelerators have shown great promise in reducing power, saving space in embedded systems, and improving performance for target programs. ...
doi:10.7916/d8kw5fvr
fatcat:mnpba7jj7rdwjk54og7ecnuvja