
Systematic data reuse exploration methodology for irregular access patterns

T. Van Achteren, R. Lauwereins, F. Catthoor
Proceedings 13th International Symposium on System Synthesis  
They work well for homogeneous signal access patterns but cannot handle other cases.  ...  where holes are present in the signal access pattern.  ...  Current methodology for data reuse exploration In this Section we give a short summary of the current methodology for data reuse exploration [14] [7].  ... 
doi:10.1109/isss.2000.874037 dblp:conf/isss/AchterenLC00 fatcat:xyen2nkyqzerpbyjeyyqnfecli

Scaling irregular parallel codes with minimal programming effort

Dimitrios S. Nikolopoulos, Constantine D. Polychronopoulos, Eduard Ayguadé
2001 Proceedings of the 2001 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '01  
We present a simple runtime methodology for scaling irregular applications parallelized with the standard OpenMP interface.  ...  Irregular parallel applications are a particularly challenging application domain for parallel programming models, since they require domain specific data distribution and load balancing algorithms.  ...  Acknowledgments We are grateful to the ECMWF and Siegfried Benkner for providing us with the irregular kernels.  ... 
doi:10.1145/582034.582050 dblp:conf/sc/NikolopoulosPA01 fatcat:iq75fa4my5bsjbfe5kmx4fq2te

A Systematic Design Space Exploration Approach to Customising Multi-Processor Architectures: Exemplified Using Graphics Processors [chapter]

Ben Cope, Peter Y. K. Cheung, Wayne Luk, Lee Howes
2011 Lecture Notes in Computer Science  
A systematic approach to customising Homogeneous Multi-Processor (HoMP) architectures is described. The approach involves a novel design space exploration tool and a parameterisable system model.  ...  We also analyse on-chip and off-chip memory access for systems with one or more processing elements (PEs), and study the impact of the number of threads per PE on the amount of off-chip memory access and  ...  A systematic design space methodology to explore the customisation options for a HoMP. The key feature is the notion of pre- and post-fab options (Section 4). 3.  ... 
doi:10.1007/978-3-642-24568-8_4 fatcat:7r43f3e5hjhdje5g7njmb2zczq

Array Size Computation under Uniform Overlapping and Irregular Accesses

Angeliki Kritikakou, Francky Catthoor, Vasilios Kelefouras, Costas Goutis
2016 ACM Transactions on Design Automation of Electronic Systems  
We propose a methodology to compute the minimum resources required for storing an array which keeps the exploration time low and provides a near-optimal result for regularly and non-regularly occurring  ...  Otherwise their exploration time is increased with an increase over the number of the different accessed parts of the array.  ...  In [Wuytack et al. 1998] a data access graph based on polytopes is used to describe all the memory operations in time for a given array, which is used as input to the data reuse exploration and decision  ... 
doi:10.1145/2818643 fatcat:olifuqxswjdcnb27sijdu73jpy

50 & 25 Years Ago

Erich Neuhold
2020 Computer  
(p. 27): "Recently, researchers have focused on achieving the above goals through organizational changes and methodologies for systematic software reuse.  ...  (p. 39) "Irregular computations: In many important applications, compile-time analysis is insufficient when communication patterns are data dependent and known only at runtime.  ... 
doi:10.1109/mc.2020.3010073 fatcat:gd35v3gdvjcv7g4xqpmyvlw7k4

Quantifying Data Locality in Dynamic Parallelism in GPUs

Xulong Tang, Ashutosh Pattnaik, Onur Kayiran, Adwait Jog, Mahmut Taylan Kandemir, Chita Das
2018 Proceedings of the ACM on Measurement and Analysis of Computing Systems  
We observe that, for DP applications, data reuse is highly irregular and is heavily dependent on the application and its input.  ...  Thus, existing techniques cannot exploit data reuse effectively for DP applications.  ...  ACKNOWLEDGMENT We thank Ganesh Ananthanarayanan for shepherding our paper. We also thank the anonymous reviewers for their constructive feedback.  ... 
doi:10.1145/3287318 fatcat:zmop6pak6jefve6jtypricmo2a

Characteristics of workloads used in high performance and technical computing

Razvan Cheveresan, Matt Ramsay, Chris Feucht, Ilya Sharapov
2007 Proceedings of the 21st annual international conference on Supercomputing - ICS '07  
Since prefetching plays an important role in the performance of computational workloads, we explore the prefetching potential and for parallel workloads we study the sharing properties of memory accesses  ...  We also analyze memory access patterns including various aspects of cache utilization and locality properties of address distributions.  ...  Interesting data points and conclusions from this data include: 1) the lack of speedup for GTC, where memory access patterns are irregular and sufficiently complex that there is no effective software prefetching  ... 
doi:10.1145/1274971.1274984 dblp:conf/ics/CheveresanRFS07 fatcat:ptpam3kzxzcebp6jm3m3cahlaa

Specializing Coherence, Consistency, and Push/Pull for GPU Graph Analytics [article]

Giordano Salvador, Wesley H. Darvin, Muhammad Huzaifa, Johnathan Alsop, Matthew D. Sinclair, Sarita V. Adve
2020 arXiv pre-print
Third, we show that the design dimensions explored here are inter-dependent, reinforcing the need for software-hardware co-design in the above design dimensions.  ...  This work provides the first study to explore the interaction of update propagation with and without fine-grained synchronization (push vs. pull), emerging coherence protocols (GPU vs.  ...  Each implementation represents a design specialization that can be made for the irregular graph workloads that we explore.  ... 
arXiv:2002.10245v2 fatcat:xth5zg7erbdadhte4zhelgopp4

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights [article]

Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li
2021 arXiv pre-print
Unstructured sparsity and tensors with varying dimensions yield irregular computation, communication, and memory access patterns; processing them on hardware accelerators in a conventional manner does  ...  The takeaways from this paper include: understanding the key challenges in accelerating sparse, irregular-shaped, and quantized tensors; understanding enhancements in accelerator systems for supporting  ...  Further, a systematic methodology for mapping communication onto interconnect topology can enable design space exploration of interconnects needed for accelerating target ML models, allowing minimum overhead  ... 
arXiv:2007.00864v2 fatcat:k4o2xboh4vbudadfiriiwjp7uu

The knowledge circulated-organisational management for accomplishing e-learning

2009 Knowledge Management & E-Learning: An International Journal  
This means "knowledge in universities circulated-systematic process" of finding, selecting, organising, distilling and presenting information in a way that improves a learner's competency and/or ability  ...  In order to construct such educational management systems, the fundamental processing modules are required, such as a distributed file system, synchronous data communications, etc.  ...  In order with the average values for three patterns: Progressive-pattern>Regressive-pattern >Spiral-pattern.  ... 
doi:10.34105/j.kmel.2009.01.002 fatcat:ebk47nqtorfvpbffyo3tbkikeu

Use of Computation-Unit Integrated Memories in High-Level Synthesis

C. Huang, S. Ravi, A. Raghunathan, N.K. Jha
2006 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
Efficient data reuse of register files have also been fully exploited to further improve system performance.  ...  This paper addresses the challenge of providing a systematic synthesis framework for a CIM-based architecture.  ...  The issue of systematic data reuse for irregular access patterns is discussed in [29].  ... 
doi:10.1109/tcad.2005.862749 fatcat:k5x37m5ilvbfbm27oziovprs2y

Understanding object-level memory access patterns across the spectrum

Xu Ji, Chao Wang, Nosayba El-Sayed, Xiaosong Ma, Youngjae Kim, Sudharshan S. Vazhkudai, Wei Xue, Daniel Sanchez
2017 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '17  
Especially we thank our shepherd, Simon Hammond, for his guidance and responsiveness. This work was supported in part by the National Key  ...  Also, the same application cannot be expected to have the same memory access pattern for different input problem sizes or input data.  ...  Our systematic study of object-level access patterns (across 38 applications from diverse domains) supplements these existing approaches and provides practical implications for ongoing architectural, system  ... 
doi:10.1145/3126908.3126917 dblp:conf/sc/JiWEMKVXS17 fatcat:nqmd4px5hfawfhp6itubchgjr4

Data Service API Design for Data Analytics [chapter]

Yun Zhang, Liming Zhu, Xiwei Xu, Shiping Chen, An Binh Tran
2018 Lecture Notes in Computer Science  
Last but not least, current data services do not support the reuse of data exploration processes and the data derived from data analysts.  ...  Across the entire data analytics lifecycle, data service can be regarded as a method for data retrieval and exploration.  ...  The discussion focuses on the data preparation for data analytics by analyzing the data retrieval patterns and data exploration features.  ... 
doi:10.1007/978-3-319-94376-3_6 fatcat:cg3pqi5tzbe6blkyg557aqif5q

A Portable Optimization Engine for Accelerating Irregular Data-Traversal Applications on SIMD Architectures

Bin Ren, Todd Mytkowicz, Gagan Agrawal
2014 ACM Transactions on Architecture and Code Optimization (TACO)  
This article develops support for exploiting such data parallelism for a class of nonnumeric, nongraphic applications, which perform computations while traversing many independent, irregular data structures  ...  To address this challenge, we develop a set of data layout optimizations that improve spatial locality for applications that traverse many irregular data structures.  ...  We refer to this traversal pattern as sparse buckets accesses.  ... 
doi:10.1145/2632215 fatcat:qegvfixkezacxgkrg557sphsje

Generation of Heterogeneous Distributed Architectures for Memory-Intensive Applications Through High-Level Synthesis

Chao Huang, Srivaths Ravi, Anand Raghunathan, Niraj K. Jha
2007 IEEE Transactions on Very Large Scale Integration (vlsi) Systems  
The high-level synthesis (HLS) techniques presented in this paper are motivated by the fact that many memory-intensive applications exhibit irregular array data access patterns.  ...  Synthesis should, therefore, be capable of determining a partitioned architecture, wherein array data and computations may have to be heterogeneously distributed for achieving the best performance speed-up  ...  The issue of systematic data reuse for irregular access patterns is discussed in [13], which motivates us to explore data duplication effects among partitioned subsystems in our paper.  ... 
doi:10.1109/tvlsi.2007.904096 fatcat:czc256r4zfc7hbe44ir6smrqwu