UNified Instruction/Translation/Data (UNITD) coherence: One protocol to rule them all
2010
HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture
UNITD eliminates the need for the software TLB shootdown routine, a procedure known to be performance costly and non-scalable. ...
In UNITD coherence protocols, the TLBs participate in the cache coherence protocol just like the instruction and data caches, without requiring any changes to the existing coherence protocol. ...
coherence through a software routine called TLB shootdown [6]. ...
doi:10.1109/hpca.2010.5416643
dblp:conf/hpca/RomanescuLSB10
fatcat:m5quti22qbgnrmp3izf4thdpey
Architectural and Operating System Support for Virtual Memory
2017
Synthesis Lectures on Computer Architecture
The process by which stale TLB entries are invalidated is known as a TLB shootdown. The details of how TLB shootdowns are performed vary widely by architecture. ...
invalidate the old TLB entry, and proceed. ...
OS support for efficient TLB coherence: As discussed at the end of the last chapter, the systems community has also recently been studying ways to improve the overheads of TLB co-
Authors' Biographies ...
doi:10.2200/s00795ed1v01y201708cac042
fatcat:4re5afn53jhu7ezxwtb25ja3ca
Efficient virtual memory for big memory servers
2013
SIGARCH Computer Architecture News
For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%. ...
They consume as much as 10% of execution cycles on TLB misses, even using large pages. ...
TLB Incoherence: The x86-64 architecture does not automatically invalidate or update a TLB entry when the corresponding memory-resident PTE is modified. ... graph500: 99.99; memcached: 99.99; mySQL: 92.40 ...
doi:10.1145/2508148.2485943
fatcat:vix4kkpe5veefmas7inuv72uay
Efficient virtual memory for big memory servers
2013
Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13
For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%. ...
They consume as much as 10% of execution cycles on TLB misses, even using large pages. ...
TLB Incoherence: The x86-64 architecture does not automatically invalidate or update a TLB entry when the corresponding memory-resident PTE is modified. ... graph500: 99.99; memcached: 99.99; mySQL: 92.40 ...
doi:10.1145/2485922.2485943
dblp:conf/isca/BasuGCHS13
fatcat:2p7dghs7g5axrn7dh2tttcufoe
Revisiting virtual L1 caches: A practical design using dynamic synonym remapping
2016
2016 IEEE International Symposium on High Performance Computer Architecture (HPCA)
Virtual caches have potentially lower access latency and energy consumption than physical caches because they do not consult the TLB prior to every cache access. ...
Experimental results based on real world applications show over 96% dynamic energy savings for TLB lookups without significant impact on performance compared to the system with ideal virtual caches (less ...
If the counter becomes zero, the ASDT entry can be invalidated. To find the matching ASDT entry, the PPN is needed, via a TLB lookup. ...
doi:10.1109/hpca.2016.7446066
dblp:conf/hpca/YoonS16
fatcat:45lj4bwfene7dm7xlv3c63vtm4
FIMCE
2018
ACM Transactions on Privacy and Security
The operating system and the hypervisor are expected to invalidate the relevant TLB entries after updating the page tables. ...
The MMU traverses the paging tables only when the TLBs do not store the matching entry. ...
Stifling Attack The stifling attack prevents the CPU core controlled by the malicious thread from responding to the hypervisor's TLB shootdown, so that its stale TLB entry is not invalidated. ...
doi:10.1145/3195181
fatcat:75dls56oxfarporefzodz6k2me
On the Effectiveness of Virtualization Based Memory Isolation on Multicore Platforms
2017
2017 IEEE European Symposium on Security and Privacy (EuroS&P)
When more than one core accesses an address space, the core that changes the mapping is supposed to perform TLB shootdown to invalidate any existing entries on other cores. ...
Therefore, when an address mapping is updated, the software needs to explicitly invalidate the corresponding TLB entry. ...
doi:10.1109/eurosp.2017.25
dblp:conf/eurosp/ZhaoD17
fatcat:4og7nnfmmfhn7m5r2sqmmqzmqe
The Design and Implementation of Hyperupcalls
2018
USENIX Annual Technical Conference
For the TLB use case which handles TLB shootdown to inactive cores, our hyperupcall runs faster than native code since the TLB flush is deferred. The overhead of verifying a hyperupcall is minimal. ...
VCPU re-entry; (2) TLB flush is performed even when the VCPU interrupts are disabled, as experimentally it improves performance. ...
dblp:conf/usenix/AmitW18
fatcat:skkqeqbmizeyllyojrzu2ta6wu
Robust architectural support for transactional memory in the power architecture
2013
Proceedings of the 40th Annual International Symposium on Computer Architecture - ISCA '13
In the process of commercializing the feature, we had to resolve some previously unexplored interactions between TM and existing features of the ISA, for example translation shootdown, interrupt handling ...
a TLB invalidation instruction. ...
support TLB shootdown via invalidation instructions that result in a system-wide TLB shootdown [13, 12], including the Power ISA, such reliance on interrupts or locally executed instructions is not ...
doi:10.1145/2485922.2485942
dblp:conf/isca/CainMFMWL13
fatcat:yn4uooo6sbbu7kwaso3biiefu4
Robust architectural support for transactional memory in the power architecture
2013
SIGARCH Computer Architecture News
In the process of commercializing the feature, we had to resolve some previously unexplored interactions between TM and existing features of the ISA, for example translation shootdown, interrupt handling ...
a TLB invalidation instruction. ...
support TLB shootdown via invalidation instructions that result in a system-wide TLB shootdown [13, 12], including the Power ISA, such reliance on interrupts or locally executed instructions is not ...
doi:10.1145/2508148.2485942
fatcat:5a57nba23fcxzoydwisggrmf7e
NrOS: Effective Replication and Sharing in an Operating System
2021
USENIX Symposium on Operating Systems Design and Implementation
Finally, they perform the TLB invalidation. Meanwhile the initiator invalidates its own TLB entries, and then it waits for all acknowledgments from the other cores before it returns to userspace. ...
... removing or modifying page table entries) requires the OS to flush TLB entries on cores where the process is active to ensure TLB coherence. ...
dblp:conf/osdi/Bhardwaj0ACKSTZ21
fatcat:s7dunts2yrgetof65wpczb3tb4
Boosting Inter-Process Communication with Architectural Support
2022
ACM Transactions on Computer Systems
Exception / Fault Instruction / Description: Invalid x-entry / xcall / Calling an invalid x-entry; Invalid xcall-cap / xcall / Calling an x-entry without xcall-cap. ...
However, memory remapping requires kernel's involvement and causes TLB shootdown. ...
doi:10.1145/3532861
fatcat:nuxfib6szrdixhlgyysbgt2xai
Measuring Software Performance on Linux
[article]
2018
arXiv
pre-print
Depending on the processor and OS, the TLB might be invalidated. ...
We suggest using grouping and avoiding multiplexing as long as the workload is repeatable, to obtain self-consistent and precise results. ...
arXiv:1811.01412v2
fatcat:joqggip6jbhejkjplvinaff35m
Techniques for Shared Resource Management in Systems with Throughput Processors
[article]
2018
arXiv
pre-print
cooperative technique that modifies the memory allocation policy to enable large page support in order to further reduce the inter-address-space interference at the shared Translation Lookaside Buffer (TLB ...
Similarly, TLB shootdowns are required when a GPU core changes its address space or when a page table entry is updated. ...
If the SM locates a valid large page entry for the request (i.e., the page is coalesced), it avoids looking up TLB base page entries. ...
arXiv:1803.06958v1
fatcat:3mqbwegpkvdrpk6sqwb3ooyh7e
The evolution of an x86 virtual machine monitor
2010
ACM SIGOPS Operating Systems Review
We review how the x86 architecture was originally virtualized in the days of the Pentium II (1998), and follow the evolution of the virtual machine monitor forward through the introduction of virtual SMP ...
To bridge from shared page tables to per-core TLBs, a precisely coordinated "TLB shootdown" approach must be used to invalidate mappings. ...
Self-modifying code suffers performance overheads as the outer VMM needs to invalidate translated code whenever its "source" code changes. ...
doi:10.1145/1899928.1899930
fatcat:m3hzn2xk35bmjk4qqruszxhx7y