
UNified Instruction/Translation/Data (UNITD) coherence: One protocol to rule them all

Bogdan F. Romanescu, Alvin R. Lebeck, Daniel J. Sorin, Anne Bracy
2010 HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture  
UNITD eliminates the need for the software TLB shootdown routine, a procedure known to be performance costly and non-scalable.  ...  In UNITD coherence protocols, the TLBs participate in the cache coherence protocol just like the instruction and data caches, without requiring any changes to the existing coherence protocol.  ...  coherence through a software routine called TLB shootdown [6].  ... 
doi:10.1109/hpca.2010.5416643 dblp:conf/hpca/RomanescuLSB10 fatcat:m5quti22qbgnrmp3izf4thdpey

Architectural and Operating System Support for Virtual Memory

Abhishek Bhattacharjee, Daniel Lustig
2017 Synthesis Lectures on Computer Architecture  
The process by which stale TLB entries are invalidated is known as a TLB shootdown. The details of how TLB shootdowns are performed vary widely by architecture.  ...  invalidate the old TLB entry, and proceed.  ...  OS support for efficient TLB coherence: As discussed at the end of the last chapter, the systems community has also recently been studying ways to improve the overheads of TLB co-  ... 
doi:10.2200/s00795ed1v01y201708cac042 fatcat:4re5afn53jhu7ezxwtb25ja3ca

Efficient virtual memory for big memory servers

Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, Mark D. Hill, Michael M. Swift
2013 SIGARCH Computer Architecture News  
For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%.  ...  They consume as much as 10% of execution cycles on TLB misses, even using large pages.  ...  TLB Incoherence: The x86-64 architecture does not automatically invalidate or update a TLB entry when the corresponding memory-resident PTE is modified.  ... 
doi:10.1145/2508148.2485943 fatcat:vix4kkpe5veefmas7inuv72uay

Efficient virtual memory for big memory servers

Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, Mark D. Hill, Michael M. Swift
2013 Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13  
For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%.  ...  They consume as much as 10% of execution cycles on TLB misses, even using large pages.  ...  TLB Incoherence: The x86-64 architecture does not automatically invalidate or update a TLB entry when the corresponding memory-resident PTE is modified.  ... 
doi:10.1145/2485922.2485943 dblp:conf/isca/BasuGCHS13 fatcat:2p7dghs7g5axrn7dh2tttcufoe

Revisiting virtual L1 caches: A practical design using dynamic synonym remapping

Hongil Yoon, Gurindar S. Sohi
2016 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA)  
Virtual caches have potentially lower access latency and energy consumption than physical caches due to not consulting the TLB prior to every cache access.  ...  Experimental results based on real world applications show over 96% dynamic energy savings for TLB lookups without significant impact on performance compared to the system with ideal virtual caches (less  ...  If the counter becomes zero, the ASDT entry can be invalidated. To find the matching ASDT entry, the PPN is needed, via a TLB lookup.  ... 
doi:10.1109/hpca.2016.7446066 dblp:conf/hpca/YoonS16 fatcat:45lj4bwfene7dm7xlv3c63vtm4

FIMCE

Siqi Zhao, Xuhua Ding
2018 ACM Transactions on Privacy and Security  
The operating system and the hypervisor are expected to invalidate the relevant TLB entries after updating the page tables.  ...  The MMU traverses the paging tables only when the TLBs do not store the matching entry.  ...  Stifling Attack The stifling attack prevents the CPU core controlled by the malicious thread from responding to the hypervisor's TLB shootdown, so that its stale TLB entry is not invalidated.  ... 
doi:10.1145/3195181 fatcat:75dls56oxfarporefzodz6k2me

On the Effectiveness of Virtualization Based Memory Isolation on Multicore Platforms

Siqi Zhao, Xuhua Ding
2017 2017 IEEE European Symposium on Security and Privacy (EuroS&P)  
When more than one core accesses an address space, the core that changes the mapping is supposed to perform TLB shootdown to invalidate any existing entries on other cores.  ...  Therefore, when an address mapping is updated, the software needs to explicitly invalidate the corresponding TLB entry.  ... 
doi:10.1109/eurosp.2017.25 dblp:conf/eurosp/ZhaoD17 fatcat:4og7nnfmmfhn7m5r2sqmmqzmqe

The Design and Implementation of Hyperupcalls

Nadav Amit, Michael Wei
2018 USENIX Annual Technical Conference  
For the TLB use case which handles TLB shootdown to inactive cores, our hyperupcall runs faster than native code since the TLB flush is deferred. The overhead of verifying a hyperupcall is minimal.  ...  VCPU re-entry; (2) TLB flush is performed even when the VCPU interrupts are disabled, as experimentally it improves performance.  ... 
dblp:conf/usenix/AmitW18 fatcat:skkqeqbmizeyllyojrzu2ta6wu

Robust architectural support for transactional memory in the power architecture

Harold W. Cain, Maged M. Michael, Brad Frey, Cathy May, Derek Williams, Hung Le
2013 Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13  
In the process of commercializing the feature, we had to resolve some previously unexplored interactions between TM and existing features of the ISA, for example translation shootdown, interrupt handling  ...  a TLB invalidation instruction.  ...  support TLB shootdown via invalidation instructions that result in a system-wide TLB shootdown [13, 12], including the Power ISA, such reliance on interrupts or locally executed instructions is not  ... 
doi:10.1145/2485922.2485942 dblp:conf/isca/CainMFMWL13 fatcat:yn4uooo6sbbu7kwaso3biiefu4

Robust architectural support for transactional memory in the power architecture

Harold W. Cain, Maged M. Michael, Brad Frey, Cathy May, Derek Williams, Hung Le
2013 SIGARCH Computer Architecture News  
In the process of commercializing the feature, we had to resolve some previously unexplored interactions between TM and existing features of the ISA, for example translation shootdown, interrupt handling  ...  a TLB invalidation instruction.  ...  support TLB shootdown via invalidation instructions that result in a system-wide TLB shootdown [13, 12], including the Power ISA, such reliance on interrupts or locally executed instructions is not  ... 
doi:10.1145/2508148.2485942 fatcat:5a57nba23fcxzoydwisggrmf7e

NrOS: Effective Replication and Sharing in an Operating System

Ankit Bhardwaj, Chinmay Kulkarni, Reto Achermann, Irina Calciu, Sanidhya Kashyap, Ryan Stutsman, Amy Tai, Gerd Zellweger
2021 USENIX Symposium on Operating Systems Design and Implementation  
Finally, they perform the TLB invalidation. Meanwhile the initiator invalidates its own TLB entries, and then it waits for all acknowledgments from the other cores before it returns to userspace.  ...  ., removing or modifying page table entries) requires the OS to flush TLB entries on cores where the process is active to ensure TLB coherence.  ... 
dblp:conf/osdi/Bhardwaj0ACKSTZ21 fatcat:s7dunts2yrgetof65wpczb3tb4

Boosting Inter-Process Communication with Architectural Support

Yubin Xia, Dong Du, Zhichao Hua, Binyu Zang, Haibo Chen, Haibing Guan
2022 ACM Transactions on Computer Systems  
Exception | Fault Instruction | Description: Invalid x-entry | xcall | Calling an invalid x-entry; Invalid xcall-cap | xcall | Calling an x-entry without xcall-cap.  ...  However, memory remapping requires the kernel's involvement and causes TLB shootdown.  ... 
doi:10.1145/3532861 fatcat:nuxfib6szrdixhlgyysbgt2xai

Measuring Software Performance on Linux [article]

Martin Becker, Samarjit Chakraborty
2018 arXiv   pre-print
Depending on the processor and OS, the TLB might be invalidated.  ...  We suggest using grouping and avoiding multiplexing as long as the workload is repeatable, to obtain self-consistent and precise results.  ... 
arXiv:1811.01412v2 fatcat:joqggip6jbhejkjplvinaff35m

Techniques for Shared Resource Management in Systems with Throughput Processors [article]

Rachata Ausavarungnirun
2018 arXiv   pre-print
cooperative technique that modifies the memory allocation policy to enable large page support in order to further reduce the inter-address-space interference at the shared Translation Lookaside Buffer (TLB  ...  Similarly, TLB shootdowns are required when a GPU core changes its address space or when a page table entry is updated.  ...  If the SM locates a valid large page entry for the request (i.e., the page is coalesced), it avoids looking up TLB base page entries.  ... 
arXiv:1803.06958v1 fatcat:3mqbwegpkvdrpk6sqwb3ooyh7e

The evolution of an x86 virtual machine monitor

Ole Agesen, Alex Garthwaite, Jeffrey Sheldon, Pratap Subrahmanyam
2010 ACM SIGOPS Operating Systems Review  
We review how the x86 architecture was originally virtualized in the days of the Pentium II (1998), and follow the evolution of the virtual machine monitor forward through the introduction of virtual SMP  ...  To bridge from shared page tables to per-core TLBs, a precisely coordinated "TLB shootdown" approach must be used to invalidate mappings.  ...  Self-modifying code suffers performance overheads as the outer VMM needs to invalidate translated code whenever its "source" code changes.  ... 
doi:10.1145/1899928.1899930 fatcat:m3hzn2xk35bmjk4qqruszxhx7y