Compiler-directed Data Partitioning for Multicluster Processors
International Symposium on Code Generation and Optimization (CGO'06)
This work proposes a compiler-directed approach to synergistically partition both data objects and computation across multiple clusters. ...
The distribution of data objects is generally ignored. In this work, we examine explicit partitioning of data objects and its effects on operation partitioning. ...
GLOBAL DATA PARTITIONING This section introduces our compiler-directed Global Data Partitioning (GDP) approach for jointly partitioning data objects and computation across a multicluster architecture. ...
doi:10.1109/cgo.2006.9
dblp:conf/cgo/ChuM06
fatcat:hd7fbatnyre75lckc3vdbygo5u
Cost-sensitive partitioning in an architecture synthesis system for multicluster processors
2004
IEEE Micro
This article focuses on the latter topic: compiler-directed architecture synthesis. More specifically, we examine compiler-directed synthesis of an ASIP's data path architecture. ...
Hierarchical multicluster data path synthesis system Figure 1 shows our hierarchical system for multicluster architecture synthesis. ...
doi:10.1109/mm.2004.7
fatcat:hi3tnsrh4bdblhsrggff4weibu
Code and data partitioning for fine-grain parallelism
2007
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools - LCTES '07
This paper focuses on an alternative compiler-directed method for program parallelization by exploiting fine-grain instruction-level parallelism (ILP). ...
Introduction The recent shift to multicore designs for mainstream processors offers the potential to improve the performance of current applications. ...
Our profile-guided data access partitioning technique was implemented as part of the Trimaran compiler infrastructure, a retargetable compiler for VLIW/EPIC processors. ...
doi:10.1145/1254766.1254798
dblp:conf/lctrts/ChuM07
fatcat:emwrcua3ofavdaa6fgsxtemvlm
Code and data partitioning for fine-grain parallelism
2007
SIGPLAN notices
This paper focuses on an alternative compiler-directed method for program parallelization by exploiting fine-grain instruction-level parallelism (ILP). ...
Introduction The recent shift to multicore designs for mainstream processors offers the potential to improve the performance of current applications. ...
Our profile-guided data access partitioning technique was implemented as part of the Trimaran compiler infrastructure, a retargetable compiler for VLIW/EPIC processors. ...
doi:10.1145/1273444.1254798
fatcat:ergokj7amnghppgaj36vqfknji
Data Access Partitioning for Fine-grain Parallelism on Multicore Architectures
2007
40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007)
We propose a profile-guided method for partitioning memory accesses across distributed data caches. ...
Overall, our data partitioning reduces stall cycles by up to 51% versus data-incognizant partitioning, and has an overall speedup average of 30% over a single core processor. ...
RELATED WORK The topic of compiler partitioning for distributed architectures has been studied significantly in the past, especially in the context of multicluster VLIW processors. ...
doi:10.1109/micro.2007.15
dblp:conf/micro/ChuRM07
fatcat:z2rxc2rffra3fjb27p67l6ga2m
Data Access Partitioning for Fine-grain Parallelism on Multicore Architectures
2007
Microarchitecture (MICRO), Proceedings of the Annual International Symposium on
We propose a profile-guided method for partitioning memory accesses across distributed data caches. ...
Overall, our data partitioning reduces stall cycles by up to 51% versus data-incognizant partitioning, and has an overall speedup average of 30% over a single core processor. ...
RELATED WORK The topic of compiler partitioning for distributed architectures has been studied significantly in the past, especially in the context of multicluster VLIW processors. ...
doi:10.1109/micro.2007.4408269
fatcat:2dypqbgdajampkgtqmgjzx7hoi
Extending Multicore Architectures to Exploit Hybrid Parallelism in Single-thread Applications
2007
2007 IEEE 13th International Symposium on High Performance Computer Architecture
This paper describes the Voltron architecture and associated compiler support for orchestrating bi-modal execution. ...
However, general-purpose applications do not provide many opportunities for identifying such threads, due to frequent use of pointers, recursive data structures, if-then-else branches, small function bodies ...
Acknowledgments We thank Mike Schlansker for his excellent comments and suggestions for this work. Much gratitude goes to the anonymous referees who provided helpful feedback on this work. ...
doi:10.1109/hpca.2007.346182
dblp:conf/hpca/ZhongLM07
fatcat:sauqiioqtvfaro65x6xyffqr6m
Experiments with an ocean circulation model on CEDAR
1992
Proceedings of the 6th international conference on Supercomputing - ICS '92
The code was parameterized to offer several choices for data partitionings of the computational domain, for placement strategies for the data in the memory hierarchy, and for the number of clusters and ...
We present the design of the GFDL ocean circulation model as adapted for simulations of the Mediterranean basin for the Cedar multicluster architecture. ...
For Cedar, we consider partitionings to be of two types, one primary, taking into account data partitioning across clusters, the other secondary, specifying the partitioning across vector processors in ...
doi:10.1145/143369.143440
dblp:conf/ics/DeRoseGG92
fatcat:xejyme4nirf2xbkhudgqoik5ue
A distributed control path architecture for VLIW processors
2005
14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)
In this paper, we propose a distributed control path architecture for VLIW processors (DVLIW) to overcome the scalability problem of VLIW control paths. ...
DVLIW employs a multicluster design where each cluster contains a local instruction memory that provides all intra-cluster control. ...
We also thank the anonymous referees for their excellent suggestions and feedback. ...
doi:10.1109/pact.2005.5
dblp:conf/IEEEpact/ZhongFMS05
fatcat:j6lo66yhp5dufhcpnot4s4fziu
algorithms for multiclusters. ...
It is desired to have a tool to map parallel processes to processors (or cores) automatically. ...
INTRODUCTION SMP (Symmetric Multi-Processor) clusters and multiclusters are widely used to execute message-passing parallel applications. ...
doi:10.1145/1183401.1183451
dblp:conf/ics/ChenCHRK06
fatcat:y2etu5dounefpiygz6l4ucabtq
Parallel hyperspectral image processing on distributed multicluster systems
2011
Journal of Applied Remote Sensing
Such approaches work well for individual compute clusters, but, due to the inherently large wide-area communication overheads, these are generally not applied in distributed multicluster systems. ...
As individual cluster computers often cannot satisfy the computational demands of emerging problems in hyperspectral imaging, there is a growing need for distributed supercomputing using multicluster systems ...
Acknowledgments This work has been supported by the Netherlands Organization for Scientific Research (NWO) under Grant No. 643.000.602 (JADE-MM: Adaptive High-Performance Distributed Multimedia Computing ...
doi:10.1117/1.3595292
fatcat:ijiih7lb7naubjhcgg6bmyq3au
Region-based hierarchical operation partitioning for multicluster processors
2003
Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation - PLDI '03
The main challenge associated with clustered architectures is compiler support to effectively partition operations across the available resources on each cluster. ...
In this work, we present a novel technique for clustering operations based on graph partitioning methods. ...
Research on partitioning for multiprocessors has many similarities to clustering for multicluster processors. ...
doi:10.1145/781163.781165
fatcat:4mha7dlh3reixbibxtvmuywjrm
Region-based hierarchical operation partitioning for multicluster processors
2003
Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation - PLDI '03
The main challenge associated with clustered architectures is compiler support to effectively partition operations across the available resources on each cluster. ...
In this work, we present a novel technique for clustering operations based on graph partitioning methods. ...
Research on partitioning for multiprocessors has many similarities to clustering for multicluster processors. ...
doi:10.1145/781131.781165
dblp:conf/pldi/ChuFM03
fatcat:j6344vduifawnkfzxrylgpmm3u
Region-based hierarchical operation partitioning for multicluster processors
2003
SIGPLAN notices
The main challenge associated with clustered architectures is compiler support to effectively partition operations across the available resources on each cluster. ...
In this work, we present a novel technique for clustering operations based on graph partitioning methods. ...
Research on partitioning for multiprocessors has many similarities to clustering for multicluster processors. ...
doi:10.1145/780822.781165
fatcat:iub2t4orwnb3zo6osvt323wflq
Parallelization and performance of Conjugate Gradient algorithms on the Cedar hierarchical-memory multiprocessor
1991
Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming - PPOPP '91
We describe its parallel implementation on the Cedar hierarchical memory multiprocessor from both angles, explicit manual parallelization and automatic compilation. ...
The broad application range makes it an interesting object for investigating novel architectures and programming systems. ...
A much less understood area in parallelizing compilation is entered once we attempt to partition data and distribute them to different processors or processor clusters. ...
doi:10.1145/109625.109644
dblp:conf/ppopp/MeierE91
fatcat:eakzchabpvht3dt5fg3njrkswa