81,869 Hits in 9.1 sec

Communication strategies for out-of-core programs on distributed memory machines

Rajesh Bordawekar, Alok Choudhary
1995 Proceedings of the 9th international conference on Supercomputing - ICS '95  
In this paper, we show that communication in out-of-core distributed memory problems requires both interprocessor communication and file I/O.  ...  We first describe how the communication is done for in-core programs and then describe three communication strategies for out-of-core programs.  ...  The overall time required for an out-of-core program can be computed as the sum of the times for local I/O ($T_{I/O}$), in-core computation ($T_{comp}$), and communication ($T_{comm}$).  ...
doi:10.1145/224538.224642 dblp:conf/ics/BordawekarC95 fatcat:wkskvguwtbedraqhwrkii5qck4
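Spelled out, the cost decomposition in this snippet is a simple additive model. A minimal LaTeX rendering, with the symbol names ($T_{I/O}$, $T_{comp}$, $T_{comm}$) reconstructed from the garbled subscripts rather than taken verbatim from the paper:

```latex
% Additive cost model for an out-of-core program (symbol names assumed):
% total time = local file I/O + in-core computation + communication.
T_{\mathrm{oc}} \;=\; T_{\mathrm{I/O}} + T_{\mathrm{comp}} + T_{\mathrm{comm}}
```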

Compilation and Communication Strategies for Out-of-Core Programs on Distributed Memory Machines

Rajesh Bordawekar, Alok Choudhary, J. Ramanujam
1996 Journal of Parallel and Distributed Computing  
Finally, we discuss how the out-of-core and in-core communication methods can be used in virtual memory environments on distributed memory machines.  ...  We present three methods for performing communication in out-of-core distributed memory problems.  ...  This work was performed in part using the Intel Paragon System operated by Caltech on behalf of the Center for Advanced Computing Research (CACR). Access to this facility was provided by CRPC.  ...
doi:10.1006/jpdc.1996.0148 fatcat:sd3os7rby5fmvmjw2epmoyvczi

A model and compilation strategy for out-of-core data parallel programs

Rajesh Bordawekar, Alok Choudhary, Ken Kennedy, Charles Koelbel, Michael Paleczny
1995 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming - PPOPP '95  
out-of-core problems.  ...  Our results compare several communication methods and I/O optimizations using two out-of-core problems, Jacobi iteration and LU factorization.  ...  The strategy for compiling an entire program is conceptually identical, but may involve more  ...  Out-of-core Compilation: For out-of-core programs, in addition to the steps in Figure 3, the compiler  ...
doi:10.1145/209936.209938 dblp:conf/ppopp/BordawekarCKKP95 fatcat:6adhfbdhtbhqzf67silnz3mgs4
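For readers skimming these abstracts, the staging pattern behind out-of-core Jacobi iteration is easy to sketch: the array lives in a file and is relaxed one in-core tile (plus stencil ghost rows) at a time. The sketch below is a hedged single-node illustration; the file layout, tile size, and helper names are assumptions, boundary tiles are omitted, and the interprocessor-communication side the papers study is not shown.

```c
/* Single-node sketch of one out-of-core Jacobi sweep: the N x N array
 * of doubles lives in a file and is staged through memory TILE rows at
 * a time.  File layout, sizes, and helper names are illustrative, not
 * the compiler output described in the paper. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N    4096   /* global array is N x N doubles on disk */
#define TILE  256   /* rows per in-core tile                 */

static void read_rows(FILE *f, double *buf, long row0, long rows) {
    fseek(f, row0 * N * (long)sizeof(double), SEEK_SET);
    fread(buf, sizeof(double), (size_t)(rows * N), f);
}

static void write_rows(FILE *f, const double *buf, long row0, long rows) {
    fseek(f, row0 * N * (long)sizeof(double), SEEK_SET);
    fwrite(buf, sizeof(double), (size_t)(rows * N), f);
}

int main(void) {
    FILE *in  = fopen("grid_in.dat",  "rb");
    FILE *out = fopen("grid_out.dat", "r+b");
    if (!in || !out) { perror("open"); return 1; }

    /* Tile plus one ghost row above and below for the 5-point stencil. */
    double *slab = malloc((TILE + 2) * N * sizeof(double));
    double *newt = malloc(TILE * N * sizeof(double));

    /* Boundary tiles omitted for brevity. */
    for (long row0 = 1; row0 + TILE <= N - 1; row0 += TILE) {
        read_rows(in, slab, row0 - 1, TILE + 2);      /* stage tile + ghosts */
        for (long i = 0; i < TILE; i++) {             /* in-core relaxation  */
            memcpy(newt + i * N, slab + (i + 1) * N, N * sizeof(double));
            for (long j = 1; j < N - 1; j++) {
                const double *up = slab + i * N, *mid = slab + (i + 1) * N,
                             *dn = slab + (i + 2) * N;
                newt[i * N + j] = 0.25 * (up[j] + dn[j] + mid[j-1] + mid[j+1]);
            }
        }
        write_rows(out, newt, row0, TILE);            /* flush tile to disk  */
    }
    free(slab); free(newt);
    fclose(in); fclose(out);
    return 0;
}
```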

A model and compilation strategy for out-of-core data parallel programs

Rajesh Bordawekar, Alok Choudhary, Ken Kennedy, Charles Koelbel, Michael Paleczny
1995 SIGPLAN notices  
out-of-core problems.  ...  Our results compare several communication methods and I/O optimizations using two out-of-core problems, Jacobi iteration and LU factorization.  ...  The strategy for compiling an entire program is conceptually identical, but may involve more  ...  Out-of-core Compilation: For out-of-core programs, in addition to the steps in Figure 3, the compiler  ...
doi:10.1145/209937.209938 fatcat:dnlp7stqvrfsxo3jusg42mibyy

Large-Scale Scientific Irregular Computing on Clusters and Grids [chapter]

Peter Brezany, Marian Bubak, Maciej Malawski, Katarzyna Zając
2002 Lecture Notes in Computer Science  
Data sets involved in many scientific applications are often too massive to fit into the main memory of even the most powerful computers; they must therefore reside on disk, and thus communication between  ...  The experimental performance results achieved on a cluster of PCs are included.  ...  The out-of-core parallelization strategy is a natural extension of the in-core approach.  ...
doi:10.1007/3-540-46043-8_49 fatcat:zxaunyxcpzh75ixhthaopounvy

Compilation techniques for out-of-core parallel computations

M. Kandemir, A. Choudhary, J. Ramanujam, R. Bordawekar
1998 Parallel Computing  
Since writing an efficient out-of-core version of a program is a difficult task and virtual memory systems do not perform well on scientific computations, we believe that there is a clear need for compiler  ...  In this paper, we first present an out-of-core compilation strategy based on a disk storage abstraction.  ...  issues involved in compilation of out-of-core codes on distributed-memory message-passing machines.  ... 
doi:10.1016/s0167-8191(98)00027-1 fatcat:lrtlsrx5k5aebe7uin72hdbuxm

Hybrid MPI/openMP application on multicore architectures: the case of profit-sharing life insurance policies valuation

P. L. De Angelis, F. Perla, P. Zanetti
2013 Applied Mathematical Sciences  
Further, since in the future an increasing number of cores per chip (tens and even hundreds) and smaller per-core resources, such as memory, are expected, it seems necessary, in the implementation of very large-scale  ...  The DISAR (Dynamic Investment Strategy with Accounting Rules) system, an asset-liability management software package for monitoring portfolios of life insurance policies, has been proven to be extremely efficient  ...  In [9, 10] we reported numerical experiments carried out applying to DISAR a parallelisation strategy based on the distribution of Monte Carlo trajectories among processors.  ...
doi:10.12988/ams.2013.37357 fatcat:66n26qiyyne4th2mtgu43ofq4a
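A hedged sketch of the hybrid scheme these results rest on: MPI ranks partition the Monte Carlo trajectories and OpenMP threads share each rank's slice, so a single process per node or socket keeps per-core memory pressure low. The payoff function and trajectory count below are placeholders, not DISAR's actual valuation model.

```c
/* Hybrid MPI/OpenMP Monte Carlo skeleton: trajectories are split across
 * MPI ranks, and OpenMP threads share each rank's slice.  The "payoff"
 * is a stand-in, not the insurance model from the paper. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

/* Deterministic per-trajectory pseudo-random payoff (placeholder). */
static double simulate_trajectory(long id) {
    unsigned long s = (unsigned long)id * 6364136223846793005UL
                      + 1442695040888963407UL;
    s ^= s >> 33;
    double x = (double)(s & 0xFFFFFF) / (double)0x1000000; /* in [0,1) */
    return x * x;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long total = 1000000;                /* total trajectories */
    long per_rank = total / size;
    long lo = rank * per_rank;                 /* this rank's slice  */
    long hi = (rank == size - 1) ? total : lo + per_rank;

    double local = 0.0;
    #pragma omp parallel for reduction(+:local)  /* threads in rank  */
    for (long i = lo; i < hi; i++)
        local += simulate_trajectory(i);

    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("Monte Carlo estimate: %f\n", global / total);

    MPI_Finalize();
    return 0;
}
```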

Compiler and runtime support for out-of-core HPF programs

Rajeev Thakur, Rajesh Bordawekar, Alok Choudhary
1994 Proceedings of the 8th international conference on Supercomputing - ICS '94  
There has been considerable research on compiling in-core data parallel programs for distributed memory machines [6, 22, 21].  ...  to translate out-of-core programs written in a data-parallel language like HPF into node programs for distributed memory machines with explicit communication and parallel I/O.  ...
doi:10.1145/181181.181571 dblp:conf/ics/ThakurBC94 fatcat:l4puhr7ixjbnbbibnp3nnt7uce
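To make "translating a data-parallel program into node programs with explicit communication" concrete, here is a rough sketch of the node program such a compiler might emit for a BLOCK-distributed one-dimensional stencil. The HPF-like source in the comment and all sizes are illustrative; this is not the actual output of the compiler the paper describes.

```c
/* Node-program sketch for a BLOCK-distributed 1-D stencil, i.e. what a
 * data-parallel compiler might emit for something like:
 *   !HPF$ DISTRIBUTE A(BLOCK)
 *   FORALL (i = 2:n-1)  B(i) = 0.5 * (A(i-1) + A(i+1))
 * Sizes and names are illustrative only. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int nlocal = 1024;                         /* owned block    */
    double *A = calloc(nlocal + 2, sizeof(double));  /* +2 ghost cells */
    double *B = calloc(nlocal + 2, sizeof(double));

    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Explicit communication inserted by the compiler: ghost exchange. */
    MPI_Sendrecv(&A[nlocal], 1, MPI_DOUBLE, right, 0,
                 &A[0],      1, MPI_DOUBLE, left,  0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&A[1],          1, MPI_DOUBLE, left,  1,
                 &A[nlocal + 1], 1, MPI_DOUBLE, right, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Purely local loop over the owned block. */
    for (int i = 1; i <= nlocal; i++)
        B[i] = 0.5 * (A[i - 1] + A[i + 1]);

    free(A); free(B);
    MPI_Finalize();
    return 0;
}
```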

Many-core virtual machines

Stefan Marr, Theo D'Hondt
2010 Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion - SPLASH '10  
from clock rate to core count, i.e., the number of computing units on a single chip.  ...  We propose to search for common abstractions for concurrency models to enable multi-language virtual machines to support a wide range of them.  ...  For instance, the Cell B.E. uses a distributed memory model with an inter-core ring network for explicit memory transfers [3].  ...
doi:10.1145/1869542.1869593 dblp:conf/oopsla/MarrD10 fatcat:vzsvanfjkjarzbksjnpw66yb6y

Using Data Dependencies to Improve Task-Based Scheduling Strategies on NUMA Architectures [chapter]

Philippe Virouleau, François Broquedis, Thierry Gautier, Fabrice Rastello
2016 Lecture Notes in Computer Science  
We also evaluate their performance on linear algebra applications executed on a 192-core NUMA machine, reporting noticeable performance improvements when considering both the architecture topology and  ...  This paper introduces several heuristics for these strategies and their implementations in our OpenMP runtime XKAAPI.  ...  work is integrated and supported by the ELCI project, a French FSN ("Fond pour la Société Numérique") project that associates academic and industrial partners to design and provide a software environment for  ...
doi:10.1007/978-3-319-43659-3_39 fatcat:i6ou3fm2efcz5np6rafvfnuvkm
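The dependency information such scheduling strategies consume is expressed with standard OpenMP depend clauses. A minimal sketch, assuming nothing about XKAAPI itself (the NUMA-aware placement heuristics live in the runtime and are not shown):

```c
/* Minimal OpenMP task-dependency example: the runtime sees which task
 * writes which block and can, as in the paper, schedule consumers near
 * the NUMA node holding their data.  The kernel is a stand-in. */
#include <omp.h>
#include <stdio.h>

#define NB 4                                   /* number of data blocks */

static void work(double *blk) { *blk += 1.0; } /* stand-in kernel       */

int main(void) {
    double blocks[NB] = {0};

    #pragma omp parallel
    #pragma omp single
    for (int i = 0; i < NB; i++) {
        /* Producer task: declares it writes block i. */
        #pragma omp task depend(out: blocks[i])
        work(&blocks[i]);

        /* Consumer task: ordered after the producer of block i;
         * tasks on different blocks may still run concurrently. */
        #pragma omp task depend(in: blocks[i])
        printf("block %d processed by thread %d\n", i, omp_get_thread_num());
    }                       /* implicit barrier joins all tasks */
    return 0;
}
```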

An events based algorithm for distributing concurrent tasks on multi-core architectures

David W. Holmes, John R. Williams, Peter Tilke
2010 Computer Physics Communications  
The H-Dispatch approach achieves near-linear speed-up, with a parallel efficiency of 85% on a 24-core machine.  ...  In this paper, a programming model is presented which enables scalable parallel performance on multi-core shared memory architectures.  ...  Also George Chrysanthakopoulos and Henrik Nielsen of Microsoft Research for their assistance with Robotics Studio and the CCR.  ...
doi:10.1016/j.cpc.2009.10.009 fatcat:5c7pf24jtvaovlwj5kaql66ose
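As a quick check on the quoted figure: parallel efficiency is conventionally $E = S/p$ for speed-up $S$ on $p$ cores, so the reported numbers imply

```latex
% Reported efficiency E = 0.85 on p = 24 cores implies a speed-up of
S = E \cdot p = 0.85 \times 24 \approx 20.4
```

i.e., the 24-core run is roughly 20x faster than the serial baseline.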

Have abstraction and eat performance, too: optimized heterogeneous computing with parallel patterns

Kevin J. Brown, HyoukJoong Lee, Tiark Rompf, Arvind K. Sujeeth, Christopher De Sa, Christopher Aberger, Kunle Olukotun
2016 Proceedings of the 2016 International Symposium on Code Generation and Optimization - CGO 2016  
To optimize distributed applications both for modern hardware and for modern programmers we need a programming model that is sufficiently expressive to support a variety of parallel applications, sufficiently  ...  High performance in modern computing platforms requires programs to be parallel, distributed, and run on heterogeneous hardware.  ...  Acknowledgments We are grateful to the anonymous reviewers for their comments and suggestions.  ... 
doi:10.1145/2854038.2854042 dblp:conf/cgo/BrownLRSSAO16 fatcat:cye5j5gi3vfgzh7xyku5cfttq4

Scalable Dynamic Load Balancing Using UPC

Stephen Olivier, Jan Prins
2008 2008 37th International Conference on Parallel Processing  
However, obtaining performance portability with UPC in both shared memory and distributed memory settings requires the careful use of one-sided reads and writes to minimize the impact of high-latency communication  ...  Our implementation achieves better scaling and parallel efficiency in both shared memory and distributed memory settings than previous efforts using UPC [1] and MPI [2].  ...  Acknowledgment: The authors thank the Renaissance Computing Institute for the use of the Kitty Hawk cluster and the University of North Carolina for the use of the Topsail cluster and the SGI Altix.  ...
doi:10.1109/icpp.2008.19 dblp:conf/icpp/OlivierP08 fatcat:wgivv2ozofgjlm6fvlgkuxrqvm
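UPC's one-sided reads and writes let one thread access another's shared data without the owner posting a matching receive. The same idea can be illustrated in plain C with MPI's RMA interface; this is an analogy only, not the paper's UPC code.

```c
/* One-sided communication illustrated with MPI RMA (an analogy to the
 * UPC one-sided reads/writes the snippet mentions, not the paper's
 * actual UPC implementation). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank exposes one int of shared state in a window. */
    int local = rank * 100;
    MPI_Win win;
    MPI_Win_create(&local, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* One-sided read: fetch the neighbour's value; the neighbour
     * executes no matching send or receive. */
    int fetched = -1;
    int peer = (rank + 1) % size;
    MPI_Win_fence(0, win);
    MPI_Get(&fetched, 1, MPI_INT, peer, 0, 1, MPI_INT, win);
    MPI_Win_fence(0, win);

    printf("rank %d read %d from rank %d\n", rank, fetched, peer);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```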

PicoGrid: A Web-Based Distributed Computing Framework for Heterogeneous Networks Using Java

Nipun Wittayasooporn, Apikrit Panichevaluk, Yan Zhao
2015 Engineering Journal  
We propose a framework for distributed computing applications in heterogeneous networks. The system is simple to deploy and can run on any operating system that supports the Java Virtual Machine.  ...  Our system also does not affect the normal use of client machines, guaranteeing a satisfactory user experience.  ...  core) on the machine.  ...
doi:10.4186/ej.2015.19.1.153 fatcat:l5penh5bvzak7oobc3wyo2xzlm

Graph3S: A Simple, Speedy and Scalable Distributed Graph Processing System [article]

Xubo Wang, Lu Qin, Lijun Chang, Ying Zhang, Dong Wen, Xuemin Lin
2020 arXiv pre-print
Our observation is that enhancing the communication flexibility of a system leads to gains in both system efficiency and scalability, as well as simpler usage.  ...  The rapidly increasing data volume calls for efficient and scalable graph data processing.  ...  distributed memory machines.  ...
arXiv:2003.00680v1 fatcat:uhbqkkslfverhprieeuy44l2zm
Showing results 1 — 15 out of 81,869 results