Hierarchical Dataflow Model for efficient programming of clustered manycore processors

Julien Hascoet, Karol Desnos, Jean-Francois Nezan, Benoit Dupont de Dinechin
2017 2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP)  
This paper introduces a technique for deploying hierarchical dataflow graphs efficiently onto MPSoC.  ...  Dataflow Models of Computation (MoCs) are increasingly used for developing parallel applications as their high level of abstraction eases the automation of mapping, task scheduling and memory allocation  ...  Second because fine-grained synchronizations can strongly degrade system performance.  ... 
doi:10.1109/asap.2017.7995270 dblp:conf/asap/HascoetDND17 fatcat:zrl6ffctonhrzixhzhh35nbzdy

Improving the scalability of parallel N-body applications with an event driven constraint based execution model [article]

Chirag Dekate, Matthew Anderson, Maciej Brodowicz, Hartmut Kaiser, Bryce Adelstein-Lelbach, Thomas Sterling
2011 arXiv   pre-print
This paper explores the space of effective parallel execution of ephemeral graphs that are dynamically generated using the Barnes-Hut algorithm to exemplify dynamic workloads.  ...  For comparison, results using conventional execution model semantics are also presented.  ...  Acknowledgments: We would like to thank Steven Brandt and Dylan Stark for stimulating discussions. We acknowledge support comes from NSF grants 1048019 and 1029161 to Louisiana State University.  ... 
arXiv:1109.5190v1 fatcat:ashsmlawhbbglfif4ykcu5vc3q

Software synthesis from the dataflow interchange format

Chia-Jui Hsu, Ming-Yung Ko, Shuvra S. Bhattacharyya
2005 Proceedings of the 2005 workshop on Software and compilers for embedded systems - SCOPES '05  
Furthermore, the DIF-to-C framework provides a standard, vendor-neutral mechanism for linking coarse grain dataflow optimizations with fine grain hand-optimized libraries and the large body of optimization  ...  The dataflow interchange format (DIF) [11] and the associated DIF package have been developed for specifying, working with, and transferring dataflow-based DSP designs across tools.  ...  Here, by a schedule, we mean a sequence of actor firings or more generally, any sequencing mechanism for executing actors (including static, dynamic, and hybrid static/dynamic sequencing).  ... 
doi:10.1145/1140389.1140394 dblp:conf/scopes/HsuB05 fatcat:7m4rr2bdwjgk7brhgtdwg4ipiu

Exploring the potential of heterogeneous von neumann/dataflow execution models

Tony Nowatzki, Vinay Gangadhar, Karthikeyan Sankaralingam
2015 SIGARCH Computer Architecture News  
To this end, we propose the Specialization Engine for Explicit-Dataflow (SEED).  ...  This paper makes the observation that if both out-of-order and explicit-dataflow were available in one processor, many types of GPP cores can benefit from dynamically switching during certain phases of  ...  Support for this research was provided by NSF under the grant CNS-1228782 and by a Google US/Canada PhD Fellowship.  ... 
doi:10.1145/2872887.2750380 fatcat:f7i5ox5p6vgq5eqd65isiyhe2a

Application-aware Retiming of Accelerators: A High-level Data-driven Approach [article]

Ana Lava, Mahdi Jelodari Mamaghani, Siamak Mohammadi, Steve Furber
2016 arXiv   pre-print
This paper proposes a memory smart technique for a particular class of adaptive systems: Elastic Circuits which enjoy slack elasticity at fine level of granularity.  ...  By forming coarse synchronous islands the available fine grained adaptivity is sacrificed.  ...  Power Model The dynamic and static power of our dataflow circuits can be modelled using the following equations: $P_{\text{dynamic}} \propto A \cdot f \cdot C \cdot V_{DD}^{2}$ (2) where ($\tfrac{1}{2} \le A \le 1$) as in our elastic controllers  ... 
arXiv:1612.08163v1 fatcat:ju6oyld3jjc2zpsbhdexfforaq

Move Fast and Meet Deadlines: Fine-grained Real-time Stream Processing with Cameo [article]

Le Xu, Shivaram Venkataraman, Indranil Gupta, Luo Mai, Rahul Potharaju
2020 arXiv   pre-print
Our framework, called Cameo, uses fine-grained stream processing (inspired by actor computation models), and is able to provide high resource utilization while meeting latency targets.  ...  Cameo dynamically calculates and propagates priorities of events based on user latency targets and query semantics.  ...  Acknowledgements We thank Matei Zaharia and our anonymous referees at NSDI 2020 for their reviews and help with improving the paper. We thank Kai Zeng for providing feedback on initial ideas.  ... 
arXiv:2010.03035v1 fatcat:losceanybfeudogr5cqvgfh6ja

Exploring the potential of heterogeneous von neumann/dataflow execution models

Tony Nowatzki, Vinay Gangadhar, Karthikeyan Sankaralingam
2015 Proceedings of the 42nd Annual International Symposium on Computer Architecture - ISCA '15  
SEED: An Architecture for Fine-Grain Dataflow Specialization  ...  of fine-grained interleaving of execution models significant enough?  ...  fine grain inside an application.  ... 
doi:10.1145/2749469.2750380 dblp:conf/isca/NowatzkiGS15 fatcat:hql7xymzgjch3jv4dk5mvbesji

Hardware/Software Cosynthesis of DSP Systems [chapter]

Shuvra Bhattacharyya
2001 Signal Processing and Communications  
Thus, we do not explore techniques for fine-grain cosynthesis [21], including synthesis of application-specific instruction processors (ASIPs) [43], nor do we explore cosynthesis for control-dominant  ...  Motivation for coarse-grain dataflow specification stems from the growing trend towards specifying, analyzing, and verifying embedded system designs in terms of domain-specific concurrency models [33]  ...  Cyclo-static dataflow Cyclo-static dataflow (CSDF) and scalable synchronous dataflow (described in Section 6) are presently the most widely-used alternatives to SDF.  ... 
doi:10.1201/9780203908068.ch8 fatcat:z2tvsup2f5adtehjabmyop6jpi

Megaphone: Latency-conscious state migration for distributed streaming dataflows [article]

Moritz Hoffmann, Andrea Lattuada, Frank McSherry, Vasiliki Kalavri, John Liagouris, Timothy Roscoe
2019 arXiv   pre-print
We design and implement Megaphone, a data migration mechanism for stateful distributed dataflow engines with latency objectives.  ...  Megaphone is implemented as a library on an unmodified timely dataflow implementation, and provides an operator interface compatible with its existing APIs.  ...  Megaphone plans fine-grained migrations using the logical timestamps of the stream processor, and interleaves the migrations with regular streaming dataflow processing.  ... 
arXiv:1812.01371v3 fatcat:6czxmg757ndotiuotaznq2wnza

Megaphone

Moritz Hoffmann, Andrea Lattuada, Frank McSherry
2019 Proceedings of the VLDB Endowment  
We design and implement Megaphone, a data migration mechanism for stateful distributed dataflow engines with latency objectives.  ...  Megaphone is implemented as a library on an unmodified timely dataflow implementation, and provides an operator interface compatible with its existing APIs.  ...  Megaphone plans fine-grained migrations using the logical timestamps of the stream processor, and interleaves the migrations with regular streaming dataflow processing.  ... 
doi:10.14778/3329772.3329777 fatcat:2l2szo4445c67m5y7jm4m4o3tu

Energy efficiency and performance management of parallel dataflow applications

Simon Holmbacka, Erwan Nogues, Maxime Pelcat, Sebastien Lafond, Johan Lilius
2014 Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing  
cores (DVFS vs.  ...  Rather than providing this information by hand, dataflow frameworks such as PREESM [16] provide tools for explicit parallelization by single rate Synchronous Data Flow (SDF) transforms, which can be  ... 
doi:10.1109/dasip.2014.7115624 dblp:conf/dasip/HolmbackaNPLL14 fatcat:6wyoryrtnjdxrjeo4w6b5maw4q

Cavs: A Vertex-centric Programming Interface for Dynamic Neural Networks [article]

Hao Zhang, Shizhen Xu, Graham Neubig, Wei Dai, Qirong Ho, Guangwen Yang, Eric P. Xing
2017 arXiv   pre-print
Existing dataflow-based programming models for DL---both static and dynamic declaration---either cannot readily express these dynamic models, or are inefficient due to repeated dataflow graph construction  ...  on training of various dynamic NN architectures, and ablations demonstrate the contribution of our proposed batching and memory management strategies.  ...  DyNet proposes an autobatching strategy that searches for batching opportunities by profiling every fine-grained operator, while this step itself has non-negligible overhead, and loses the opportunities  ... 
arXiv:1712.04048v1 fatcat:uha5kzwolzh6xidsk3lgqfd6y4

Cavs: An Efficient Runtime System for Dynamic Neural Networks

Shizhen Xu, Hao Zhang, Graham Neubig, Wei Dai, Jin Kyu Kim, Zhijie Deng, Qirong Ho, Guangwen Yang, Eric P. Xing
2018 USENIX Annual Technical Conference  
Cavs represents a dynamic NN as a static vertex function F and a dynamic instance-specific graph G.  ...  However, existing DL programming models are inefficient in handling dynamic network architectures because of: (1) substantial overhead caused by repeating dataflow graph construction and processing every  ...  DyNet proposes an auto-batching strategy that searches for batching opportunities by profiling every fine-grained operator, while this step itself has non-negligible overhead ( §5.2).  ... 
dblp:conf/usenix/XuZN0KDHYX18 fatcat:tmihvtu625d7bax3wfpd5uh5sa

An empirical characterization of stream programs and its implications for language and compiler design

William Thies, Saman Amarasinghe
2010 Proceedings of the 19th international conference on Parallel architectures and compilation techniques - PACT '10  
The lessons learned have implications for the design of future architectures, languages and compilers for the streaming domain.  ...  In order to develop effective compilation techniques for the streaming domain, it is important to understand the common characteristics of these programs.  ...  Many of the advanced scheduling strategies for synchronous dataflow graphs have the highest payoff when the input and output rates of neighboring filters are mismatched.  ... 
doi:10.1145/1854273.1854319 dblp:conf/IEEEpact/ThiesA10 fatcat:67vnec7u2ja7rjst55lplqo52q

Scheduling multiple independent hard-real-time jobs on a heterogeneous multiprocessor

Orlando Moreira, Frederico Valente, Marco Bekooij
2007 Proceedings of the 7th ACM & IEEE international conference on Embedded software - EMSOFT '07  
This paper proposes a scheduling strategy and an automatic scheduling flow that enable the simultaneous execution of multiple hard-real-time dataflow jobs.  ...  We show how a combination of Time-Division Multiplex (TDM) and static-order scheduling can be modeled as additional nodes and edges on top of the dataflow representation of the job using Single-Rate Dataflow  ...  is more fine-grained; and it also allows for a different scheduling mechanism per processor.  ... 
doi:10.1145/1289927.1289941 dblp:conf/emsoft/MoreiraVB07 fatcat:mcmcoc35ljcvhjjjhec5y5dwba