
A First Step Towards Time Optimal Software Pipelining of Loops with Control Flows [chapter]

Han-Saem Yun, Jihong Kim, Soo-Mook Moon
2001 Lecture Notes in Computer Science  
However, surprisingly, there have been few theoretical or empirical results on time optimal software pipelining of loops with control flows.  ...  First, we propose a necessary and sufficient condition for a loop with control flows to have an optimally software-pipelined program. We also present a decision procedure to compute the condition.  ...  The results solve two fundamental open problems on time optimal software pipelining of loops with control flows.  ... 
doi:10.1007/3-540-45306-7_13 fatcat:7yeiysh5qrfmnh2g7f5q77zmwe

Generation of Control and Data Flow Graphs from Scheduled and Pipelined Assembly Code [chapter]

David C. Zaretsky, Gaurav Mittal, Robert Dick, Prith Banerjee
2006 Lecture Notes in Computer Science  
This process consists of three stages: generating a control flow graph, linearizing the assembly code, and generating the data flow graph.  ...  High-level synthesis tools generally convert abstract designs described in a high-level language into a control and data flow graph (CDFG), which is then optimized and mapped to hardware.  ...  Building a CDFG consists of a two-step process: building the control flow graph (CFG), which represents the path of control in the design, and building the data flow graph (DFG), which represents the data  ... 
doi:10.1007/978-3-540-69330-7_6 fatcat:3lb6xhacmvgdblemqmzuqpgjhu

Design space minimization with timing and code size optimization for embedded DSP

Qingfeng Zhuge, Zili Shao, Bin Xiao, Edwin H.-M. Sha
2003 Proceedings of the 1st IEEE/ACM/IFIP international conference on Hardware/software codesign & system synthesis - CODES+ISSS '03  
Theories are presented to produce a small set of feasible design choices with provable quality.  ...  This paper presents an Integrated Framework for Design Optimization and Space Minimization (IDOM) towards finding the minimum configuration satisfying timing and code size constraints.  ...  The loop schedule length is reduced from four control steps to one control step for software-pipelined loop.  ... 
doi:10.1145/944645.944685 dblp:conf/codes/ZhugeSXS03 fatcat:2tycfkldyjbojhvotoz7f6lsn4

Design space minimization with timing and code size optimization for embedded DSP

Qingfeng Zhuge, Zili Shao, Bin Xiao, Edwin H.-M. Sha
2003 Proceedings of the 1st IEEE/ACM/IFIP international conference on Hardware/software codesign & system synthesis - CODES+ISSS '03  
Theories are presented to produce a small set of feasible design choices with provable quality.  ...  This paper presents an Integrated Framework for Design Optimization and Space Minimization (IDOM) towards finding the minimum configuration satisfying timing and code size constraints.  ...  The loop schedule length is reduced from four control steps to one control step for software-pipelined loop.  ... 
doi:10.1145/944682.944685 fatcat:ufyl2p5dqrgm7j4mwhynjgblfi

Design optimization and space minimization considering timing and code size via retiming and unfolding

Qingfeng Zhuge, Chun Xue, Zili Shao, Meilin Liu, Meikang Qiu, Edwin H.-M. Sha
2006 Microprocessors and microsystems  
The increasingly complicated DSP processors and applications with strict timing and code size constraints require design automation tools to consider multiple optimizations such as software pipelining  ...  It provides an efficient technique for reducing the code size of any software-pipelined loops. In this paper, we propose an Integrated Framework for Design Optimization and Space Minimization (IDOM).  ...  The loop schedule length is reduced from four control steps to one control step for software-pipelined loop.  ... 
doi:10.1016/j.micpro.2005.11.002 fatcat:qq7qef422rafvpfdeagzwnlguq

From software to accelerators with LegUp high-level synthesis

Andrew Canis, Jongsok Choi, Blair Fort, Ruolong Lian, Qijing Huang, Nazanin Calagar, Marcel Gort, Jia Jun Qin, Mark Aldham, Tomasz Czajkowski, Stephen Brown, Jason Anderson
2013 2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)  
With LegUp, a designer can start from an embedded application running on a processor and incrementally migrate portions of the program to hardware accelerators implemented on an FPGA.  ...  This paper presents an overview of the LegUp design methodology and system architecture, and discusses ongoing work on profiling, hardware/software partitioning, hardware accelerator quality improvements  ...  The financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC) and Altera Corporation is gratefully acknowledged.  ... 
doi:10.1109/cases.2013.6662524 dblp:conf/cases/CanisCFLHCGQACBA13 fatcat:mkl646vbefa43irr2i725vmh6u

Global Software Pipelining with Iteration Preselection [chapter]

David Gregg
2000 Lecture Notes in Computer Science  
Software pipelining loops containing multiple paths is a very difficult problem. Loop shifting offers the possibility of a close to optimal schedule with acceptable code growth.  ...  We separate loop shifting from scheduling, and present new, non-greedy heuristics. Experimental results show that our approach yields better performance and less code growth.  ...  Watson Research Center for providing us with the Chameleon experimental test-bed. Special thanks to Mayan Moudgill and Michael Gschwind.  ... 
doi:10.1007/3-540-46423-9_13 fatcat:6xu5clunuzcvxdcl5s3jqfk43e

Architectural synthesis of computational pipelines with decoupled memory access

Shaoyi Cheng, John Wawrzynek
2014 2014 International Conference on Field-Programmable Technology (FPT)  
With this approach, for a set of non-regular algorithm kernels written in C, a performance improvement of 3.3 to 9.1x is observed over direct C-to-Hardware mapping using a state-of-the-art HLS tool.  ...  In this paper, we present an automatic flow to refactor and restructure processor-centric software implementations, making them better suited for FPGA platforms.  ...  The ASPIRE Lab is funded by DARPA Award Number HR0011-12-2-0016, the Center for Future Architecture Research, a member of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA  ... 
doi:10.1109/fpt.2014.7082758 dblp:conf/fpt/ChengW14 fatcat:wdka47i2qjefpjqumez6nkoiyq

A coarse-grained stream architecture for cryo-electron microscopy images 3D reconstruction

Wendi Wang, Bo Duan, Wen Tang, Chunming Zhang, Guangming Tang, Peiheng Zhang, Ninghui Sun
2012 Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays - FPGA '12  
The proposed stream architecture is built by first offloading computing-intensive software kernels to dedicated hardware modules, which emphasizes the importance of optimizing computing dominated data  ...  The efficiency of the proposed stream architecture is justified by the reported 2.54 times speedup over a 4-cores CPU.  ...  First, operators of the DFG are scheduled to different pipeline stages with respect to data dependency.  ... 
doi:10.1145/2145694.2145719 dblp:conf/fpga/WangDTZTZS12 fatcat:tb63lp3urvcdfhe5njamqngm24

HIR: An MLIR-based Intermediate Representation for Hardware Accelerator Description [article]

Kingshuk Majumder, Uday Bondhugula
2021 arXiv pre-print
While offering rich optimization opportunities and a high level abstraction, HIR enables sharing of optimizations, utilities and passes with software compiler infrastructure.  ...  HIR combines high level language features, such as loops and multi-dimensional tensors, with programmer defined explicit scheduling, to provide a high-level IR suitable for DSL compiler pipelines without  ...  Listing 1 shows the syntax of the for loop. The loop takes a lower bound, an upper bound, a step, and start time as its inputs.  ... 
arXiv:2103.00194v1 fatcat:vwv7jfr2ofgxjih7uamqxvv4xe

Are Hls Tools Healthy? The C-Cubed Project

M. Dossis, G. Dimitriou
2015 Zenodo  
This paper completes with a number of experiments that were executed using the author's methodology and they are used to evaluate the specific HLS tools.  ...  The present article is a practical perspective of the required fully automated and formal tools, which are needed to constitute integral parts in Electronic Design Automation (EDA) flows.  ...  accuracy analysis and optimization of polynomial data-flow graphs with respect to a reference model that is found in many DSP applications [9], a technique to improve nested loop pipelining for HLS,  ... 
doi:10.5281/zenodo.16989 fatcat:fcirdvxuwvgjhd73eqcqr554f4

Deep jam: conversion of coarse-grain parallelism to instruction-level and vector parallelism for irregular applications

P. Carribault, A. Cohen, W. Jalby
2005 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)  
We show that good speedups can be achieved through deep jam, a new transformation of the program control- and data-flow.  ...  Deep jam combines scalar and array renaming with a generalized form of recursive unroll-and-jam; it brings together independent instructions across irregular control structures, removing memory-based dependences  ...  We would like to thank Antoine Joux for providing the SHA-0 cryptanalysis code, Christophe Lemuet and Jean-Thomas Acquaviva for their contributions to the manual optimization of this code, and Gonzalo  ... 
doi:10.1109/pact.2005.16 dblp:conf/IEEEpact/CarribaultCJ05 fatcat:pkr5izrmbjbw7jwsgjdeuow37e

A simple, verified validator for software pipelining

Jean-Baptiste Tristan, Xavier Leroy
2010 SIGPLAN notices  
Software pipelining is a loop optimization that overlaps the execution of several iterations of a loop to expose more instruction-level parallelism.  ...  It can result in first-class performance characteristics, but at the cost of significant obfuscation of the code, making this optimization difficult to test and debug.  ...  The use of structured control above is a notational convenience: in reality, software pipelining is performed on a flow graph representation of control (CFG), and step 1 actually isolates a sub-graph of  ... 
doi:10.1145/1707801.1706311 fatcat:f4gqjlfxevc3hfeqzp7lwbtlou

A simple, verified validator for software pipelining

Jean-Baptiste Tristan, Xavier Leroy
2010 Proceedings of the 37th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '10  
Software pipelining is a loop optimization that overlaps the execution of several iterations of a loop to expose more instruction-level parallelism.  ...  It can result in first-class performance characteristics, but at the cost of significant obfuscation of the code, making this optimization difficult to test and debug.  ...  The use of structured control above is a notational convenience: in reality, software pipelining is performed on a flow graph representation of control (CFG), and step 1 actually isolates a sub-graph of  ... 
doi:10.1145/1706299.1706311 dblp:conf/popl/TristanL10 fatcat:7ou7vfuasjf23pon4dunnenhqu

Optimal code size reduction for software-pipelined and unfolded loops

Qingfeng Zhuge, Bin Xiao, Zili Shao, Edwin H.-M. Sha, Chantana Chantrapornchai
2002 Proceedings of the 15th international symposium on System Synthesis - ISSS '02  
We propose a code size reduction framework to achieve the optimal code size of software-pipelined and unfolded loops by using conditional registers.  ...  The experimental results on several well-known benchmarks show the effectiveness of our code size reduction technique in controlling the code size of optimized loops.  ...  The schedule length of the new loop body is then reduced from two control steps to one control step. Hence, every retiming operation corresponds to a software pipelining operation.  ... 
doi:10.1145/581199.581232 fatcat:rawexlc46regfpjxevwz7dpyvy