A Review on Security in Cache Memories

R. Vijay Sai, S. Saravanan
2016 Indian Journal of Science and Technology  
XOR operations, extended Hamming codes and multi-bit clustered ECC.  ...  Findings: Discussed solutions involve in the design of secured cryptographic based algorithms, secure aware cache mapping and low power cache design by employing techniques such as code convertors, nested  ...  Cache-coloring based technique 7 for saving leakage energy in multi-tasking systems involves in dealing with energy efficiency which is a prominent factor.  ... 
doi:10.17485/ijst/2016/v9i48/96037 fatcat:vv5p5sksczacdp35bndfpd6o3a

A Survey on Cache Management Mechanisms for Real-Time Embedded Systems

Giovani Gracioli, Ahmed Alhammad, Renato Mancuso, Antônio Augusto Fröhlich, Rodolfo Pellizzoni
2015 ACM Computing Surveys  
In this article, we present a survey of cache management techniques for real-time embedded systems, from the first studies of the field in 1990 up to the latest research published in 2014.  ...  Recently, many research works have proposed different techniques to deal with caches in multicore processors in the context of real-time systems.  ...  Suhendra and Mitra (this work was discussed in Section 3) were the first authors to evaluate the combination of cache partitioning and cache locking in the context of multicore real-systems [Suhendra  ... 
doi:10.1145/2830555 fatcat:nckhashqprghfnbcaqqu7vk5vi

Hardware-Accelerated Platforms and Infrastructures for Network Functions: A Survey of Enabling Technologies and Research Studies

Prateek Shantharama, Akhilesh S. Thyagaturu, Martin Reisslein
2020 IEEE Access  
system, and the tag hit circuit for address translations through the TLB is the critical path for the full configuration version.  ...  Latencies in multi-core systems affect the overall system performance, especially for latency-critical packet processing functions.  ... 
doi:10.1109/access.2020.3008250 fatcat:kv4znpypqbatfk2m3lpzvzb2nu

GeantV

G. Amadio, A. Ananya, J. Apostolakis, M. Bandieramonte, S. Banerjee, A. Bhattacharyya, C. Bianchini, G. Bitzes, P. Canal, F. Carminati, O. Chaparro-Amaro, G. Cosmo (+30 others)
2021 Computing and Software for Big Science  
Full detector simulation was among the largest CPU consumers in all CERN experiment software stacks for the first two runs of the Large Hadron Collider.  ...  In this context, the GeantV R&D project was launched, aiming to redesign the legacy particle transport code in order to benefit from features of fine-grained parallelism, including vectorization and increased  ...  This allowed a large reduction in contention in the multi-threaded basket mode. An important feature for fine-grained workflows is load balancing.  ... 
doi:10.1007/s41781-020-00048-6 fatcat:5gekpngjdzbunapugtxuujx6vm

GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP [article]

G. Amadio, A. Ananya, J. Apostolakis, M. Bandieramonte, S. Banerjee, A. Bhattacharyya, C. Bianchini, G. Bitzes, P. Canal, F. Carminati, O. Chaparro-Amaro, G. Cosmo (+30 others)
2020 arXiv   pre-print
Full detector simulation was among the largest CPU consumers in all CERN experiment software stacks for the first two runs of the Large Hadron Collider (LHC).  ...  In this context, the GeantV R&D project was launched, aiming to redesign the legacy particle transport codes in order to make them benefit from fine-grained parallelism features such as vectorization,  ...  This allowed a large reduction in contention in the multi-threaded basket mode. An important feature for fine-grained workflows is load balancing.  ... 
arXiv:2005.00949v3 fatcat:pwileoh23vdnlaf3dafxqth3iq

A Framework for Accelerating Bottlenecks in GPU Execution with Assist Warps [article]

Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Saugata Ghose, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
2016 arXiv   pre-print
For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive.  ...  CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency.  ...  Acknowledgments We thank the reviewers for their valuable suggestions. We thank the members of the SAFARI group for their feedback and the stimulating research environment they provide.  ... 
arXiv:1602.01348v1 fatcat:qbzuknzcyncrticap55x4i5dhi

A case for core-assisted bottleneck acceleration in GPUs

Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
2015 SIGARCH Computer Architecture News  
For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive.  ...  CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency.  ...  Gennady Pekhimenko is supported in part by a Microsoft Research Fellowship. Rachata Ausavarungnirun is supported in part by the Royal Thai Government scholarship.  ... 
doi:10.1145/2872887.2750399 fatcat:mdd25bfj25frrnazvn5aj2cfxm

A case for core-assisted bottleneck acceleration in GPUs

Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
2015 Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15  
For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive.  ...  CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency.  ...  Gennady Pekhimenko is supported in part by a Microsoft Research Fellowship. Rachata Ausavarungnirun is supported in part by the Royal Thai Government scholarship.  ... 
doi:10.1145/2749469.2750399 dblp:conf/isca/VijaykumarPJ0AD15 fatcat:vow55cmt3zhlxmg5o3x2lxx6ri

Enhancing Programmability, Portability, and Performance with Rich Cross-Layer Abstractions [article]

Nandita Vijaykumar
2019 arXiv   pre-print
We propose 4 different approaches to designing richer abstractions between the application, system software, and hardware architecture in different contexts to significantly improve programmability, portability  ...  In doing so, they enable a rich space of hardware-software cooperative mechanisms to optimize for performance.  ...  , system-level tasks, etc.  ... 
arXiv:1911.05660v1 fatcat:w5f3g4isqbcphm2jjfzjtvrjnq

Integrating accelerators in heterogeneous systems

Ján Veselý
2021
This work studies programmability enhancing abstractions in the context of accelerators and heterogeneous systems.  ...  I study both suitability in terms of existing operational semantics, as well as design considerations necessary for efficient implementation. First, I study the mapping of high-level dynamic languages to  ...  Predator-prey game Multi-tasking model Like the Stroop model, the multi-tasking model represents conflict in representation.  ... 
doi:10.7282/t3-8w0g-6257 fatcat:4gjpec34kbb2djjofn777eicmi

OASIcs, Volume 55, WCET'16, Complete Volume [article]

Martin Schoeberl
2016
We want to thank Bendikt Huber for porting the Lift benchmark from Java to C. We want to thank Niklas Holsti from Tidorum Ltd for contributing DEBIE in open-source.  ...  The author would like to thank the organizers, the speakers and the participants of the Optimizing Real-Time Systems workshop on Parallelization of real-time tasks: they have inspired this paper.  ...  Figure 1: A parallel task defined as a set of sub-tasks subject to precedence constraints (DAG).  ... 
doi:10.4230/oasics.wcet.2016 fatcat:2smyzkdh7rdndcptwasq4qsfue

Exploiting Cache Locality At Run-Time

Yong Yan
1998
Each original is also photographed in one exposure and is included in reduced form at the back of the book.  ...  Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.  ...  Memory-access space shrinking: In our run-time system, for any given parallel loop, the memory-access pattern of its parallel tasks is captured by a multi-dimensional memory-access space in Section  ... 
doi:10.21220/s2-h1zs-0y44 fatcat:vw646cw6ojbnzenug53gilrmby

Cache-aware development of high integrity real-time systems

Enrico Mezzetti, Tullio Vardanega
2012
Cost, performance and availability considerations are forcing even the most conservative high-integrity embedded real-time systems industry to migrate from simple hardware processors to ones equipped with  ...  Caches are perceived as an additional source of complexity, which has potential for shattering the guarantees of cost- and schedule-constrained qualification of their systems.  ...  Nemer et al. in [111] present a task timing analysis for statically scheduled multi-tasking systems that accounts for the interleaving of non-preemptable tasks in computing the abstract cache states  ... 
doi:10.6092/unibo/amsdottorato/4591 fatcat:f2wb2o6m45fprk74sqybdfhxxy

OASIcs, Volume 57, WCET'17, Complete Volume [article]

Jan Reineke
2017
We wish to thank Mathieu Serrurier for patiently sharing with us bits of its expertise in machine learning. Acknowledgements.  ...  We would also like to thank the anonymous reviewers for their comments on an earlier draft of this paper and the suggestions for the continuation of the work. F. Markovic, J. Carlson, and R.  ...  avionic safety-critical embedded system (different than trace1 ) where the task is a two-mode application executing on a multi-core platform.  ... 
doi:10.4230/oasics.wcet.2017 fatcat:og6ez4wyovho3fy2sewsoqixqu

Finding, Measuring, and Reducing Inefficiencies in Contemporary Computer Systems

Melanie Rae Kambadur
2017
Computer systems have become increasingly diverse and specialized in recent years.  ...  The first of the five case studies is Parallel Block Vectors, a new profiling method for understanding parallel programs with a fine-grained, code-centric perspective aids in both future hardware design  ...  An extreme form of specialized processor, accelerators have shown great promise in reducing power, saving space in embedded systems, and improving performance for target programs.  ... 
doi:10.7916/d8kw5fvr fatcat:mnpba7jj7rdwjk54og7ecnuvja