On the Optimality of Register Saturation
2005
Electronic Notes in Theoretical Computer Science
Second, we prove that the problem of reducing the register saturation is NP-hard. Our detailed experiments in this paper show that our previous heuristics [14] are nearly optimal. ...
However, in a previous work [14], we introduced and mathematically studied the register saturation (RS) concept. ...
Optimal Register Saturation Reduction. In the case where the register saturation RS_t(G) exceeds the number of available registers R_t of type t, we must add extra serial arcs into the DAG G to ...
doi:10.1016/j.entcs.2005.01.033
fatcat:qnuc3x3xgrbvlhyknxlolleed4
On the optimality of register saturation
Workshops on Mobile and Wireless Networking/High Performance Scientific, Engineering Computing/Network Design and Architecture/Optical Networks Control and Management/Ad Hoc and Sensor Networks/Compile and Run Time Techniques for Parallel Computing ICPP 2004
Second, we prove that the problem of reducing the register saturation is NP-hard. Our detailed experiments in this paper show that our previous heuristics [14] are nearly optimal. ...
However, in a previous work [14], we introduced and mathematically studied the register saturation (RS) concept. ...
Optimal Register Saturation Reduction. In the case where the register saturation RS_t(G) exceeds the number of available registers R_t of type t, we must add extra serial arcs into the DAG G to ...
doi:10.1109/icppw.2004.1328069
dblp:conf/icppw/Touati04
fatcat:jozox4v6k5bu5n4wyy4hgbabje
Register Saturation in Superscalar and VLIW Codes
[chapter]
2001
Lecture Notes in Computer Science
In this work, we mathematically study and extend the approach which consists of computing the exact upper-bound of the register need for all the valid schedules, independently of the functional unit constraints ...
Its aim was to add some serial arcs to the original DAG such that the worst register need does not exceed the number of available registers. ...
As a consequence, our heuristic does not compute an upper bound of the optimal register saturation, and the optimal RS can therefore be greater than the one computed by Greedy-k. ...
doi:10.1007/3-540-45306-7_15
fatcat:xbrndfdvbfclhctgrs2ykq5bxq
Periodic register saturation in innermost loops
2009
Parallel Computing
We call this upper-limit the periodic register saturation (PRS) of the data dependence graph (DDG). ...
It extends the register saturation (RS) concept to periodic instruction schedules, i.e., software pipelining (SWP). ...
This research result would not succeed without the valuable support of the University of Versailles Saint-Quentin en Yvelines, INRIA-Rocquencourt and INRIA-Saclay in France. ...
doi:10.1016/j.parco.2008.12.001
fatcat:brbfjekj4jgdxksqcvkhc7h7vi
Optimal speech codec implementation on ARM9E (v5E architecture) RISC processor for next-generation mobile multimedia
2004
Visual Communications and Image Processing 2004
Our optimization techniques are based on identification of algorithms, which could exploit either the DSP features or the RISC features or both. ...
By a systematic application of these optimization techniques for a GSM-AMR (NB) codec on the ARM9E core, we could achieve more than 77% improvement over the baseline codec and almost 33% (worst-case) ...
One saturation block performs a double and saturate, required for fractional MAC (Q15 x Q15 + Q31→Q31); the other performs a straight saturation of the accumulated value. ...
doi:10.1117/12.532455
dblp:conf/vcip/BanglaVB04
fatcat:fqvxhjmbyzhrhe6567tnduvyqq
Register Saturation in Instruction Level Parallelism
2005
International journal of parallel programming
Our deeper analysis of the problem and our formal methods enable us to provide nearly optimal heuristics and strategies for register optimization in the face of ILP. ...
We call this computed limit the register saturation (RS) of the DAG. Its aim is to detect possible obsolete register constraints, i.e., when RS does not exceed the number of available registers. ...
Consequently, our heuristics do not compute an upper bound of the optimal register saturation, and the optimal RS can be greater than the one computed by Greedy-k. ...
doi:10.1007/s10766-005-6466-x
fatcat:zbtnwdgvp5fbtfs5rqjoxooo2a
A Low-Power Multithreaded Processor for Software Defined Radio
2006
Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology
Using a super-computer class vectorizing compiler, the SB3010 achieves real-time performance in software on a variety of communication protocols including 802.11b, GPS, AM/FM radio, Bluetooth, GPRS, and ...
We also describe the processor's programming environment and the SB3010 platform, a complete system-on-chip solution for software defined radio. ...
To reduce the number of ports, the VRF uses a novel technique, which divides it into two register banks; one for even threads and one for odd threads. ...
doi:10.1007/s11265-006-7267-1
fatcat:nfqhlyks6bhmnfyg2opfpsm4xy
Effective compiler generation by architecture description
2006
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers and tool support for embedded systems - LCTES '06
From a specification, we can derive an optimized tree pattern matching instruction selector, a register allocator and an instruction scheduler. ...
Architecture description languages (ADLs) provide a single concise architecture specification for the generation of hardware, instruction set simulators and compilers. ...
Acknowledgments This work is supported in part by Infineon Technologies Austria and the Christian Doppler Forschungsgesellschaft. We like to thank ...
doi:10.1145/1134650.1134671
dblp:conf/lctrts/FarfelederKSB06
fatcat:a57pdprmkbeozk4jfmwzbxll4q
Effective compiler generation by architecture description
2006
SIGPLAN notices
From a specification, we can derive an optimized tree pattern matching instruction selector, a register allocator and an instruction scheduler. ...
Architecture description languages (ADLs) provide a single concise architecture specification for the generation of hardware, instruction set simulators and compilers. ...
Acknowledgments This work is supported in part by Infineon Technologies Austria and the Christian Doppler Forschungsgesellschaft. We like to thank ...
doi:10.1145/1159974.1134671
fatcat:yanz6oia2fhdtm25l32fokvuba
Analysis of Execution Efficiency in the Microthreaded Processor UTLEON3
[chapter]
2011
Lecture Notes in Computer Science
As the compiler specifies the blocksize parameter for each family of threads individually, it can optimize the register file utilization of the processor. ...
We analyse the impact of long-latency instructions, the family blocksize parameter, and the thread switch modifier on the execution efficiency of families of threads in a single-core configuration of the UTLEON3 ...
The paper reflects only the authors' view; neither the European Commission nor the Czech Ministry of Education are liable for any use that may be made of the information contained herein. ...
doi:10.1007/978-3-642-19137-4_10
fatcat:iwxumipjabaxre2qv7yznnhu44
Universality and Optimality of Programmable Quantum Processors
2006
Acta Physica Hungarica A: Heavy Ion Physics
We define several characteristics for quantifying optimality, and we study in detail the performance of three types of programmable quantum processors based on (1) the C-NOT gate, (2) the SWAP operation ...
We also investigate the optimality of the so-called U-processors, and we compare the optimal approximative implementation of U(1) qubit rotations with the known probabilistic implementation as introduced ...
In order to realize n unitary transformations of the data register, one must use an n-dimensional program register. ...
doi:10.1556/aph.26.2006.3-4.8
fatcat:ntnf4erezndt3elhba23r4kwem
PLX: An Instruction Set Architecture and Testbed for Multimedia Information Processing
2005
Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology
We demonstrate the use and high performance of PLX on some frequently-used code kernels selected from image, video, and graphics processing applications: discrete cosine transform, pixel padding, clip ...
Another design goal of PLX is to facilitate exploration and evaluation of novel techniques in instruction set architecture, microarchitecture, arithmetic, VLSI implementations, compiler optimizations, ...
Acknowledgments PLX is a project of the Princeton Architecture Laboratory for Multimedia and Security (PALMS). ...
doi:10.1007/s11265-005-4940-8
fatcat:vzndq4zbfvdt7cey2yncygiuoi
Loner: utilizing the CPU vector datapath to process scalar integer data
2022
Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction
In this paper, we present Loner, a profile-guided compiler methodology for optimizing scalar integer loops using the otherwise idle vector datapath. ...
Thus, CPU vector registers and functional units frequently sit idle while the scalar datapath unilaterally executes code. ...
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. ...
doi:10.1145/3497776.3517767
fatcat:2cvuymu7tjemldf3bftrtyhpyi
The Evaluation of Traffic Control in Changsha City
2012
Procedia - Social and Behavioral Sciences
The second issue is the low saturation flow observed at the intersections, which appears to be 20 to 30% lower than in comparable situations in Europe or North America. ...
Lastly, the signal timing of a 13-node network in the CBD of Changsha has been optimized with TRANSYT-14. ...
The disobedience of drivers has been registered and it appears indeed that the number of such actions has a relationship with the traffic performance, i.e. the more disobedience occurs on an intersection ...
doi:10.1016/j.sbspro.2012.04.094
fatcat:6ko7siqggngqbeubw3764ixz44
Optimal Pulsing Schemes for Galileo Pseudolite Signals
2007
Journal of Global Positioning Systems
Basically these studies have been focused on the GPS pseudolites and the proposed pulsing schemes are optimised for the GPS signals (RTCM, RTCA). ...
Simulations based on the Galileo signal structure (codes, chipping rates, cross correlation properties) have been performed and the results will be presented. ...
the spacing is defined as the optimal one. ...
doi:10.5081/jgps.6.2.133
fatcat:7zmf54htpffdbl2bpue2yv5szu