ASIP architecture exploration for efficient IPSec encryption
2007
ACM Transactions on Embedded Computing Systems
Efficient ASIP design requires an iterative architecture exploration loop: gradual refinement of processor architecture starting from an initial template. ...
This paper describes an architecture exploration loop for an ASIP coprocessor which implements common encryption functionality used in symmetric block cipher algorithms for IPsec. ...
Tool based processor architecture exploration loop
Fig. 3. Blowfish
Fig. 6. Parallel S-Box access in the execution stage. ...
doi:10.1145/1234675.1234679
fatcat:webkzsdkrvho7k3xqivnhskywu
ASIP Architecture Exploration for Efficient Ipsec Encryption: A Case Study
[chapter]
2004
Lecture Notes in Computer Science
Efficient ASIP design requires an iterative architecture exploration loop: gradual refinement of processor architecture starting from an initial template. ...
This paper describes an architecture exploration loop for an ASIP coprocessor which implements common encryption functionality used in symmetric block cipher algorithms for IPsec. ...
Tool based processor architecture exploration loop
Fig. 3. Blowfish
Fig. 6. Parallel S-Box access in the execution stage. ...
doi:10.1007/978-3-540-30113-4_4
fatcat:e477r42nlzbpjbl3u6y7p3txn4
Power Aware Framework for Dense Matrix Operations in Multimedia Processors
2005
2005 Pakistan Section Multitopic Conference
The approach is illustrated using functional unit usage within a VLIW architecture for low power, which improves energy dissipation up to 34% and CPU performance up to 87% for an idct example. ...
In this paper we analyze the use of Decision Tree Grafting, Blocking and Loop Unfolding to improve the performance of dense matrix computations on high performance multimedia processors. ...
The approach is illustrated using functional unit usage within a VLIW architecture and identifies a new operation rebinding technique for low power. ...
doi:10.1109/inmic.2005.334414
fatcat:pzmkdxwzx5a5flh4kczmsxv5za
Exploring the potential of heterogeneous von neumann/dataflow execution models
2015
Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15
Lipasti, “Revolver: Processor architecture for power efficient loop execution,” in HPCA, 2014. ...
... lar workloads, completely obviating the need for short-vector ...
... executed program regions, a combination of power-efficient hardware structures, and a set of compiler techniques. ...
... bines known dataflow-architecture techniques for high en- ...
doi:10.1145/2749469.2750380
dblp:conf/isca/NowatzkiGS15
fatcat:hql7xymzgjch3jv4dk5mvbesji
Exploring the potential of heterogeneous von neumann/dataflow execution models
2015
SIGARCH Computer Architecture News
General purpose processors (GPPs), from small in-order designs to many-issue out-of-order, incur large power overheads which must be addressed for future technology generations. ...
Interestingly, well known explicit-dataflow architectures eliminate these overheads by directly executing the data-dependence graph and eschewing instruction-precise recoverability. ...
Support for this research was provided by NSF under the grant CNS-1228782 and by a Google US/Canada PhD Fellowship. ...
doi:10.1145/2872887.2750380
fatcat:f7i5ox5p6vgq5eqd65isiyhe2a
Ara: A 1 GHz+ Scalable and Energy-Efficient RISC-V Vector Processor with Multi-Precision Floating Point Support in 22 nm FD-SOI
[article]
2019
arXiv
pre-print
and outlines directions to maintain high energy efficiency even for small matrix sizes where the vector architecture achieves suboptimal utilization of the available FPUs. ...
An analysis on several vectorizable linear algebra computation kernels for a range of different matrix and vector sizes gives insight into performance limitations and bottlenecks for vector processors ...
ACKNOWLEDGMENTS We would like to thank Frank Gürkaynak and Francesco Conti for the helpful discussions and insights. ...
arXiv:1906.00478v3
fatcat:h7zn4tkpqjf6xd35iuacpkre2a
Techniques for low energy software
1997
Proceedings of the 1997 international symposium on Low power electronics and design - ISLPED '97
In addition, several compiler techniques such as loop unrolling, software pipelining, and recursion elimination, as well as the effects of different algorithms on power and energy consumption, are studied. ...
This evaluation methodology is useful for computer architects to evaluate energy improvements of their hardware, compiler writers to evaluate energy of the compiled code and program writers to evaluate ...
[15] build the instruction level power models after the design has been completed using actual current measurements of the processor chip as it executes instruction patterns. Landman et al. ...
doi:10.1145/263272.263286
dblp:conf/islped/MehtaOICG97
fatcat:eoggc6xvprb6ta7k2rvov5x7bu
Evaluación de parámetros de optimización GCC
2012
Ingenierías USBMed
In the meantime, compilers will need to become even more efficient at utilizing the underlying system architecture through self-optimization. ...
Furthermore, sometimes such code may be less efficient than a code that has been compiled for generic hardware. ...
doi:10.21500/20275846.272
fatcat:z4v2ut2s3jfixkeugcl5ogapbu
Performance Estimation of a LEON 3FT Processor Based Design for Spacecraft Applications
2014
IOSR Journal of Electronics and Communication Engineering
A set of selected benchmark programs have been executed on the superior processor mainly to track the execution times. ...
The content of this paper is intended to highlight the performance of the 32-bit LEON 3FT processor in terms of execution speed in comparison with the currently used 16-bit processor. ...
The logics may also be called within multiple looping constructs for the purpose of testing complex looping times. ...
[Figure labels: PC running GRMON; Protoboard containing LEON 3FT processor; RS232 cable; UART interface]
doi:10.9790/2834-09354854
fatcat:zycynh5kpzc6ljpjujkoe7nw64
Task Scheduling Frameworks for Heterogeneous Computing Toward Exascale
2018
International Journal of Advanced Computer Science and Applications
The race for Exascale Computing has naturally led computer architecture to transit from the multicore era and into the heterogeneous era. ...
They investigate the important role of optimization and tackle intelligently scheduled tasks on the combination of CPU and GPU cores in achieving the peace of performance and power ...
In [77] the researchers study the impact of power variation of scheduling multi programming concurrently. They present an efficient algorithm for power capping.
doi:10.14569/ijacsa.2018.091029
fatcat:xqr3zoybwjbq5etrsb3msdjrxq
Performance efficiency of context-flow system-on-chip platform
2003
ICCAD-2003. International Conference on Computer Aided Design (IEEE Cat. No.03CH37486)
We demonstrate the performance efficiency of this architecture over bus based and packet-switch based networks by two case studies using a multi-processor architecture simulator. ...
Recent efforts in adapting computer networks into system-on-chip (SOC), or network-on-chip, present a setback to traditional computer systems for lack of an effective programming model, while not ...
For this purpose, the memory space and register files were replicated, one per PE, and the main execution loop of the simulator was modified to execute one instruction from each PE code at each simulation ...
doi:10.1109/iccad.2003.159711
fatcat:fksjka24dzed7db225wv3mw33y
Snitch: A 10 kGE Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads
[article]
2020
arXiv
pre-print
more flexible than a contemporary vector processor lane, achieving a 2× energy-efficiency improvement. ...
With increasing integration density, the quest for energy efficiency becomes the number one design concern. ...
similar compute per area efficiency with around 6% for all execution units [8]. ...
arXiv:2002.10143v1
fatcat:jrugjgr4yzdyro4tka3czt6x64
A Survey on Coarse-Grained Reconfigurable Architectures from a Performance Perspective
2020
IEEE Access
recent research has shown performance or power benefits for multiple applications [10]-[14]. ...
These limitations have been recognized for decades (e.g., [15]-[17]), and have driven forth a different branch of reconfigurable architecture: the Coarse-Grained Reconfigurable Architecture (CGRA). ...
and power-efficiency. ...
doi:10.1109/access.2020.3012084
fatcat:xx6k4lxbjbc4tjebbymp42w634
Customized architectures for faster route finding in GPS-based navigation systems
2010
2010 IEEE 8th Symposium on Application Specific Processors (SASP)
In this paper, we present a practical approach to extract small-scale parallelism by shifting priority queue operations to a secondary tightly-coupled processor. ...
We obtain a substantial speedup on real-world graphs (in particular, road maps), allowing the development of navigation systems that are more responsive, and also lower in total power consumption. ...
In [3], a novel loop acceleration architecture and a dynamic algorithm for mapping loops onto the loop accelerators are presented and analyzed. ...
doi:10.1109/sasp.2010.5521148
dblp:conf/sasp/LoewPM10
fatcat:tnsiaalz25bezbh77ekn2wkyta
3D tomography back-projection parallelization on FPGAs using opencl
2017
2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)
For this purpose, we start with evaluating different custom OpenCL implementations of the backprojection algorithm. ...
This paper deals with the evaluation of FPGAs resurgence for hardware acceleration applied to computed tomography on the back-projection operator used in iterative reconstruction algorithms. ...
A key difficulty for single work-item implementations is loop handling, because the Altera Offline Compiler default behaviour is to have each loop iteration executed sequentially, thus drastically reducing ...
doi:10.1109/dasip.2017.8122119
dblp:conf/dasip/MartelliGME17
fatcat:ujzjjcughzckplnqpknwmm6e7u