Compile time instruction cache optimizations
1994
SIGARCH Computer Architecture News
The technique can be applied at compile time or as part of object modules optimization. The technique is based on replication of code together with algorithms for code placement. ...
This paper presents a new approach for improving performance of instruction cache based systems. ...
instruction cache systems that can be applied during both compilation time and object modules optimization. ...
doi:10.1145/181993.182001
fatcat:tbefcxhclfatdctymigdeo6ogi
Compile time instruction cache optimizations
[chapter]
1994
Lecture Notes in Computer Science
The technique can be applied at compile time or as part of object modules optimization. The technique is based on replication of code together with algorithms for code placement. ...
This paper presents a new approach for improving performance of instruction cache based systems. ...
instruction cache systems that can be applied during both compilation time and object modules optimization. ...
doi:10.1007/3-540-57877-3_27
fatcat:liwcwwhzd5a6vnrm7ey4t2icje
Fast and efficient partial code reordering
2006
Proceedings of the 2006 international symposium on Memory management - ISMM '06
For example, our simulation results show that eliminating all instruction cache misses improves performance by as much as 16% for a modestly sized instruction cache. ...
Poor instruction cache locality can degrade performance on modern architectures. ...
We simulate a fully associative instruction cache and compare with a direct-mapped cache with the same access time to show how much performance is lost to instruction cache conflict misses. ...
doi:10.1145/1133956.1133980
dblp:conf/iwmm/HuangBGM06
fatcat:gp3azl4lqrgmvplq7w373vfx5i
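The comparison described in the snippet above (a fully associative cache versus a direct-mapped cache of the same size, with the miss gap attributed to conflict misses) can be sketched in a few lines. This is an illustrative toy simulator, not the paper's experimental setup; the line size, cache size, and address trace are assumptions chosen to expose a conflict.

```python
# Toy sketch: count misses for one instruction-address trace under a
# direct-mapped cache and a fully associative LRU cache of equal size.
# All parameters below are illustrative assumptions.
from collections import OrderedDict

LINE = 64          # cache line size in bytes (assumed)
NUM_LINES = 8      # total lines in each cache (assumed)

def direct_mapped_misses(trace):
    lines = [None] * NUM_LINES
    misses = 0
    for addr in trace:
        idx = (addr // LINE) % NUM_LINES      # set index
        tag = addr // LINE // NUM_LINES       # tag
        if lines[idx] != tag:                 # conflict or compulsory miss
            lines[idx] = tag
            misses += 1
    return misses

def fully_assoc_misses(trace):
    lru = OrderedDict()                       # line address -> None, LRU order
    misses = 0
    for addr in trace:
        line = addr // LINE
        if line in lru:
            lru.move_to_end(line)             # hit: refresh LRU position
        else:
            misses += 1                       # compulsory/capacity miss only
            lru[line] = None
            if len(lru) > NUM_LINES:
                lru.popitem(last=False)       # evict least recently used
    return misses

# Two code regions whose lines map to the same direct-mapped set:
trace = [0x0000, 0x0200, 0x0000, 0x0200] * 4
extra = direct_mapped_misses(trace) - fully_assoc_misses(trace)
```

Here every access misses in the direct-mapped cache (the two lines ping-pong in one set) while the fully associative cache takes only two compulsory misses, so `extra` isolates the conflict misses.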
Towards a Time-predictable Dual-Issue Microprocessor: The Patmos Approach
2011
Design, Automation, and Test in Europe
Current processors are optimized for average case performance, often leading to a high worst-case execution time (WCET). ...
To fill the dual-issue pipeline with enough useful instructions, Patmos relies on a customized compiler. ...
Traditionally, compilers seek to optimize the average execution time by focusing the effort on frequently executed hot paths. ...
doi:10.4230/oasics.ppes.2011.11
dblp:conf/date/SchoeberlSPBP11
fatcat:f3mbwaezuvbeppzfkclo426g2q
The superblock: An effective technique for VLIW and superscalar compilation
1993
Journal of Supercomputing
Index terms - code scheduling, control-intensive programs, instruction-level parallel processing, optimizing compiler, profile information, speculative execution, superblock, superscalar processor, ...
Superblock optimization and scheduling have been implemented in the IMPACT-I compiler. ...
Instruction Cache Effects: The expansion of code from superblock formation and superblock optimizations will have an effect on instruction cache performance. ...
doi:10.1007/bf01205185
fatcat:pvcamk2wbbd3vknrkecwp5mpfu
The Superblock: An Effective Technique for VLIW and Superscalar Compilation
[chapter]
1993
Instruction-Level Parallelism
Index terms - code scheduling, control-intensive programs, instruction-level parallel processing, optimizing compiler, profile information, speculative execution, superblock, superscalar processor, ...
Superblock optimization and scheduling have been implemented in the IMPACT-I compiler. ...
Instruction Cache Effects: The expansion of code from superblock formation and superblock optimizations will have an effect on instruction cache performance. ...
doi:10.1007/978-1-4615-3200-2_7
fatcat:rktyy2dkd5dapokuxhkxa77wqe
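The code expansion these snippets mention comes from tail duplication: blocks on the hot path that have side entrances are replicated so the path becomes a single-entry superblock. A minimal sketch, assuming a toy CFG representation (block names plus a predecessor map; not the IMPACT-I data structures):

```python
# Toy sketch of superblock formation by tail duplication. A block on the
# hot path with more than one predecessor has a side entrance, so it is
# duplicated; the copy (marked with ') joins the superblock and the
# original keeps serving the side entrance. CFG format is assumed.
def form_superblock(hot_path, preds):
    superblock = [hot_path[0]]                # entry block is kept as-is
    for b in hot_path[1:]:
        if len(preds[b]) > 1:                 # side entrance detected
            superblock.append(b + "'")        # duplicated copy (code growth)
        else:
            superblock.append(b)
    return superblock

# Hot path A -> B -> C, where C is also reached from D (a side entrance):
preds = {"A": [], "B": ["A"], "C": ["B", "D"]}
```

With this CFG, `form_superblock(["A", "B", "C"], preds)` duplicates only `C`, illustrating why superblock formation trades instruction cache footprint for a straighter hot path.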
Identifying the sources of cache misses in Java programs without relying on hardware counters
2012
Proceedings of the 2012 international symposium on Memory Management - ISMM '12
On average, our technique selected only 2.8% of the load and store instructions generated by the JIT compiler and these instructions accounted for 47% of the L1D cache misses and 49% of the L2 cache misses ...
To prove the effectiveness of our technique in compiler optimizations, we prototyped object placement optimizations, which align objects in cache lines or collocate paired objects in the same cache line ...
Applications in Runtime Optimization In this section, we demonstrate the usefulness of our technique to identify the instructions that cause frequent cache misses in compiler optimizations. ...
doi:10.1145/2258996.2259014
dblp:conf/iwmm/InoueN12
fatcat:eoohpkrr3rcvxizl2q5rodvlva
Identifying the sources of cache misses in Java programs without relying on hardware counters
2013
SIGPLAN notices
On average, our technique selected only 2.8% of the load and store instructions generated by the JIT compiler and these instructions accounted for 47% of the L1D cache misses and 49% of the L2 cache misses ...
To prove the effectiveness of our technique in compiler optimizations, we prototyped object placement optimizations, which align objects in cache lines or collocate paired objects in the same cache line ...
Applications in Runtime Optimization In this section, we demonstrate the usefulness of our technique to identify the instructions that cause frequent cache misses in compiler optimizations. ...
doi:10.1145/2426642.2259014
fatcat:ngcbqjkpkbcr5gpuu2xw527ula
The benefits and costs of DyC's run-time optimizations
2000
ACM Transactions on Programming Languages and Systems
The dynamic optimizations are preplanned at static compile time in order to reduce their run-time cost; we call this staging. ...
Just-in-time compilers (JITs) for Java do all compilation of the entire program at run time. ...
ACKNOWLEDGMENTS We owe thanks to Tryggve Fossum and John O'Donnell for the source for the Alpha version of the Multiflow compiler. ...
doi:10.1145/365151.367161
fatcat:bfdhmbxdyvcyjjxrgvlymynbwq
EFFECTIVENESS OF COMPILER-DIRECTED PREFETCHING ON DATA MINING BENCHMARKS
2012
Journal of Circuits, Systems and Computers
The integration of multithreaded execution onto a single die makes it even more difficult for the compiler to insert prefetch instructions, since optimizations that are effective for single-threaded execution ...
Our study reveals that although properly inserted prefetch instructions can often effectively reduce memory access latencies for data mining applications, the compiler is not always able to exploit this ...
In case of PLSA, prefetch instructions inserted by the compiler are redundant, and prefetch the same data multiple times. ...
doi:10.1142/s0218126612400063
fatcat:qivel6cnrvhn5o4gkn7c27dlqa
An evaluation of staged run-time optimizations in DyC
1999
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation - PLDI '99
Polyvariant specialization allows multiple compiled versions of a division to be produced, each specialized for different values of the run-time-constant variables. ...
of variables being treated as run-time constants; each combination is called a division. ...
owe thanks to David Grove and the anonymous PLDI reviewers for improving the quality of our discussion, and to Tryggve Fossum and John O'Donnell for the source for the Alpha version of the Multiflow compiler ...
doi:10.1145/301618.301683
dblp:conf/pldi/GrantPMCE99
fatcat:x5pwez5bzrdljegw2z5tyocieq
An evaluation of staged run-time optimizations in DyC
1999
SIGPLAN notices
Polyvariant specialization allows multiple compiled versions of a division to be produced, each specialized for different values of the run-time-constant variables. ...
of variables being treated as run-time constants; each combination is called a division. ...
owe thanks to David Grove and the anonymous PLDI reviewers for improving the quality of our discussion, and to Tryggve Fossum and John O'Donnell for the source for the Alpha version of the Multiflow compiler ...
doi:10.1145/301631.301683
fatcat:wjl6ahz2xngc5krqpleys5hwqq
A Study of the Performance Potential for Dynamic Instruction Hints Selection
[chapter]
2006
Lecture Notes in Computer Science
Instruction hints have become an important way to communicate compile-time information to the hardware. ...
They can be generated by the compiler and the post-link optimizer to reduce cache misses, improve branch prediction and minimize other performance bottlenecks. ...
For instance, new instructions such as data and instruction cache prefetch have been introduced and they have been effectively used by the compiler and post-link optimizers (including runtime optimizers ...
doi:10.1007/11859802_7
fatcat:tkw4ji4j5zca3j2otayn4ueugm
Worst-Case Execution Time Based Optimization of Real-Time Java Programs
2012
2012 IEEE 15th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing
Therefore, a compiler for real-time systems shall include optimizations that aim to minimize the WCET. One effective compiler optimization is method inlining. ...
Standard compilers optimize execution time for the average case. However, in hard real-time systems the worst-case execution time (WCET) is of primary importance. ...
ACKNOWLEDGMENT Part of this work was supported by the FP7-ICT Project 288008 Time-predictable Multi-Core Architecture for Embedded Systems (T-CREST). ...
doi:10.1109/isorc.2012.17
dblp:conf/isorc/HeppS12
fatcat:5x2xa7fsyfcutoa5v3inntfwke
Compiler-based optimizations impact on embedded software power consumption
2009
2009 Joint IEEE North-East Workshop on Circuits and Systems and TAISA Conference
Moreover, we inspect the optimizations effect on some other execution characteristics, such as the memory references and the data cache miss rate. ...
The results show that the most aggressive performance optimization option -O3 reduces the execution time, on average, by 95%, while it increases the power consumption by 25%. ...
: cache misses, memory references, instructions per cycle, and CPU stall cycles. ...
doi:10.1109/newcas.2009.5290480
fatcat:gfx6cvyjsbdbvi5766wnqi5a2q