Optimizing Shape Design with Distributed Parallel Genetic Programming on GPUs
[chapter]
2012
Studies in Computational Intelligence
This technique is well suited for a distributed parallel system to increase efficiency. ...
Fitness evaluation of the genetic programming technique is accomplished through a custom implementation of a fluid dynamics solver running on graphics processing units (GPUs). ...
The technique uses a so-called distributed parallel evolutionary algorithm to optimize the solution, along with a general purpose parallel fluid dynamics solver to evaluate the shape parameters. ...
doi:10.1007/978-3-642-28789-3_3
fatcat:c625ysjdlffe7m4nddrtgsgayu
Parallel Programming Of A Reservoir Simulator
1992
Jurnal Teknologi
This study concerns applying parallel programming to reservoir simulation using a 32-Mbyte, 12-processor parallel computer. ...
Matrix generation was parallelized using monitors as macros to synchronize calculation. The performance of the simulator was measured by the speedup. ...
For convergence, the simulation proceeds to the next time step, otherwise the iteration is repeated. ...
doi:10.11113/jt.v19.1053
fatcat:i6pt4oylnjaj7e2tnb2nf7qjei
Runtime vs. Manual Data Distribution for Architecture-Agnostic Shared-Memory Programming Models
2002
International journal of parallel programming
These techniques can be used to effectively replace manual data distribution in regular applications. ...
This paper compares data distribution methodologies for scaling the performance of OpenMP on NUMA architectures. ...
Irregular parallel applications appear to be one class of programs where page migration is not an option and domain-specific knowledge may be required to encode proper algorithms for data and load ...
doi:10.1023/a:1019899812171
dblp:journals/ijpp/NikolopoulosAP02
fatcat:3neic6ykybhkzpytgjf4kqpija
Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations
2002
SIAM Review
Thanks also to Bruce Hendrickson and the other anonymous referees for their suggestions that helped improve the paper. ...
This work investigates the performance and the programming effort for the Conjugate Gradient (CG) iterative solver for sparse matrices on each of these architectural platforms using their ...
A second group of iterative techniques uses a projection process, which is a canonical way of extracting an approximate solution from a subspace. ...
doi:10.1137/s00361445003820
fatcat:o7uwxsbcfnf4ppcfc2sbpxncmm
Parallelization of an Unsteady ALE Solver with Deforming Mesh Using OpenACC
2017
Scientific Programming
This paper presents a parallel, GPU-based, deforming mesh-enabled unsteady numerical solver for solving moving body problems by using OpenACC. ...
And both 2D and 3D cases are conducted to validate the efficiency, correctness, and accuracy of the present solver. ...
of an ALE solver that is able to simulate unsteady flows with deforming mesh. ...
doi:10.1155/2017/4610138
fatcat:bfjxxfuesbcufekrfelw2tgxly
Migrant threads on process farms: parallel programming with Ariadne
1998
Concurrency Practice and Experience
Sequential programs are readily converted into parallel programs for shared or distributed memory, with low development effort. ...
We present a novel and portable threads-based system for the development of concurrent applications on shared and distributed memory environments. ...
Distributed applications described in Section 6 include a distributed successive over-relaxation (SOR) linear solver, a particle-physics application, and adaptive quadrature. ...
doi:10.1002/(sici)1096-9128(19980810)10:9<673::aid-cpe362>3.0.co;2-5
fatcat:ixdg54l7v5b2zibur5fwljtzuy
Programming with transactional coherence and consistency (TCC)
2004
SIGPLAN notices
The performance of these programs may then easily be optimized, based on feedback from real program execution, using a few simple techniques. ...
We describe two basic programming language constructs for decomposing programs into transactions, a loop conversion syntax and a general transaction-forking mechanism. ...
ACKNOWLEDGEMENTS This work was supported by NSF grant CCR-0220138 and DARPA PCA program grants F29601-01-2-0085 and F29601-03-2-0117. ...
doi:10.1145/1037187.1024395
fatcat:izhh37goeffmhlpl3xizvyv66m
Programming with transactional coherence and consistency (TCC)
2004
ACM SIGOPS Operating Systems Review
The performance of these programs may then easily be optimized, based on feedback from real program execution, using a few simple techniques. ...
We describe two basic programming language constructs for decomposing programs into transactions, a loop conversion syntax and a general transaction-forking mechanism. ...
ACKNOWLEDGEMENTS This work was supported by NSF grant CCR-0220138 and DARPA PCA program grants F29601-01-2-0085 and F29601-03-2-0117. ...
doi:10.1145/1037949.1024395
fatcat:tyuk7ppydbdxfgqf6ym5cxaxpe
Programming with transactional coherence and consistency (TCC)
2004
Proceedings of the 11th international conference on Architectural support for programming languages and operating systems - ASPLOS-XI
The performance of these programs may then easily be optimized, based on feedback from real program execution, using a few simple techniques. ...
We describe two basic programming language constructs for decomposing programs into transactions, a loop conversion syntax and a general transaction-forking mechanism. ...
ACKNOWLEDGEMENTS This work was supported by NSF grant CCR-0220138 and DARPA PCA program grants F29601-01-2-0085 and F29601-03-2-0117. ...
doi:10.1145/1024393.1024395
dblp:conf/asplos/HammondCWHCKO04
fatcat:6dkmd6hpdjbo5pkle2uxasqbju
Programming with transactional coherence and consistency (TCC)
2004
SIGARCH Computer Architecture News
The performance of these programs may then easily be optimized, based on feedback from real program execution, using a few simple techniques. ...
We describe two basic programming language constructs for decomposing programs into transactions, a loop conversion syntax and a general transaction-forking mechanism. ...
ACKNOWLEDGEMENTS This work was supported by NSF grant CCR-0220138 and DARPA PCA program grants F29601-01-2-0085 and F29601-03-2-0117. ...
doi:10.1145/1037947.1024395
fatcat:tjylhp5ikjbofblefxpdr3wpei
Towards Architecture-Adaptable Parallel Programming
1997
Scientific Programming
In this article, we propose a solution to this problem in the form of an architecture-adaptable programming environment. ...
From a pragmatic point of view, this is not a major liability since our strategy will be useful in building domain-specific problem solving environments and application-oriented compilers, which can be ...
I am grateful to my parents for teaching me to dream and to work hard to make those dreams come true. This work was supported by NSF grant ASC-9208971. ...
doi:10.1155/1997/586912
fatcat:2el6aq4kwjb2zlwh42mty7wj6u
Thread scheduling for cache locality
1996
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems - ASPLOS-VII
Experiments with several application programs, on two systems with different cache structures, show that our thread scheduling method can improve program performance by reducing second-level cache misses ...
This paper describes a method to improve the cache locality of sequential programs by scheduling fine-grained threads. ...
Acknowledgements We would like to thank Thomas Anderson, Susan Eggers, ...
doi:10.1145/237090.237151
dblp:conf/asplos/PhilbinEADL96
fatcat:idrgeas7v5fsxim4ir3mzdntp4
The SPLASH-2 programs
1995
Proceedings of the 22nd annual international symposium on Computer architecture - ISCA '95
The other, related goal is methodological: to assist people who will use the programs in architectural evaluations to prune the space of application and machine parameters in an informed and meaningful ...
The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. ...
We simulate a cache-coherent shared address space multiprocessor with physically distributed memory and one processor per node. ...
doi:10.1145/223982.223990
dblp:conf/isca/WooOTSG95
fatcat:46sii34aejgf7myonpew5lhqxu
The SPLASH-2 programs
1995
SIGARCH Computer Architecture News
The other, related goal is methodological: to assist people who will use the programs in architectural evaluations to prune the space of application and machine parameters in an informed and meaningful ...
The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. ...
We simulate a cache-coherent shared address space multiprocessor with physically distributed memory and one processor per node. ...
doi:10.1145/225830.223990
fatcat:t5jsmginbrffzff6x57qpv4nra
A low-computation-complexity, energy-efficient, and high-performance linear program solver based on primal dual interior point method using memristor crossbars
2018
Nano Communication Networks
Wang, A low-computation-complexity, energy-efficient, and high-performance linear program solver based on primal dual interior point method using memristor crossbars, Nano Communication Networks (2018) ...
Abstract Linear programming is required in a wide variety of applications, including routing, scheduling, and various optimization problems. ...
Thus, a more robust feasibility detection technique is required to guarantee an optimal solution is given. ...
doi:10.1016/j.nancom.2018.01.001
fatcat:glqyjfvqkbechlvgqv2uvzztee