88,642 Hits in 4.1 sec

Page 26 of Journal of Research and Practice in Information Technology Vol. 20, Issue 1 [page]

1988 Journal of Research and Practice in Information Technology  
We use a refined static data dependency analysis to predict the mode of arguments and suggest the method of generating a dataflow graph which exploits the AND-parallelism based on the predicted mode.  ...  (1983): The AND/OR Model for Parallel Interpretation of Logic Programs, PhD thesis, Dept. of Information and Computer Science, Univ. of California, Irvine.  ... 

Compiler Synthesis of Task Graphs for Parallel Program Performance Prediction [chapter]

Vikram Adve, Rizos Sakellariou
2001 Lecture Notes in Computer Science  
This work was carried out while the authors were with the Computer Science Department at Rice University.  ...  Acknowledgements: The authors would like to acknowledge the valuable input that several members of the POEMS project have provided to the development of the application representation.  ...  The Static Task Graph: The static task graph (STG) captures the static parallel structure of a program and is defined only by the program per se.  ... 
doi:10.1007/3-540-45574-4_14 fatcat:56txodm3ubbrfpjg3zt7ryavdq

Optimizing Mixture of Experts using Dynamic Recompilations [article]

Ferdinand Kossmann, Zhihao Jia, Alex Aiken
2022 arXiv   pre-print
To address the limitation of these frameworks, we present DynaMoE, a DNN library that uses dynamic recompilations to optimize and adapt the use of computational resources to the dynamic needs of Mixture  ...  The Mixture of Experts architecture allows for outrageously large neural networks by scaling model parameter size independently from computational demand (FLOPs).  ...  Other popular frameworks also fall into one of these two categories: Caffe, Jax, Theano, and CNTK employ static computation graphs while Chainer and DyNet employ dynamic computation graphs.  ... 
arXiv:2205.01848v1 fatcat:nzminir74ze7tkzereaevzpxri

A GPU Task-Parallel Model with Dependency Resolution

Stanley Tzeng, Brandon Lloyd, John D. Owens
2012 Computer  
We apply our methods to intra prediction in the H.264 video codec and an N-queens backtracking problem.  ...  We present two dependency-aware scheduling schemes, static and dynamic, and analyze their behavior using a synthetic workload.  ...  His interests include computer graphics, parallel programming, and performance optimization.  ... 
doi:10.1109/mc.2012.255 fatcat:bzm74dafengdvm57bwjz4bdko4

Compiler-supported simulation of highly scalable parallel applications

Vikram S. Adve, Rajive Bagrodia, Ewa Deelman, Thomas Phan, Rizos Sakellariou
1999 Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '99  
We use a compiler-synthesized static task graph model to identify the control-flow and the subset of the computations that determine the parallelism, communication and synchronization of the code, and to  ...  This information allows us to avoid executing or simulating large portions of the computational code during the simulation.  ...  This work was supported by DARPA/ITO under Contract N66001-97-C-8533, "End-to-End Performance Modeling of Large Heterogeneous Adaptive Parallel/Distributed Computer/Communication Systems," (http://www.cs.utexas.edu  ... 
doi:10.1145/331532.331533 dblp:conf/sc/AdveBDPS99 fatcat:obdwb2s3uzdovmmxdmgnpjdkgm

Parallel program performance prediction using deterministic task graph analysis

Vikram S. Adve, Mary K. Vernon
2004 ACM Transactions on Computer Systems  
This solution technique, which we call deterministic task graph analysis, applies to parallel programs with arbitrary task graphs and a wide range of static and dynamic task scheduling methods.  ...  In this paper, we consider analytical techniques for predicting detailed performance characteristics of a single shared memory parallel program for a particular input.  ...  The Pamela language is restricted to series-parallel task graphs, and to simple static or work-conserving dynamic task scheduling.  ... 
doi:10.1145/966785.966788 fatcat:z4ojdrx6infnvaj4zgkcljzne4

Efficient Parallelization of H.264 Decoding with Macro Block Level Scheduling

Jike Chong, Nadathur Satish, Bryan Catanzaro, Kaushik Ravindran, Kurt Keutzer
2007 Multimedia and Expo, 2007 IEEE International Conference on  
The run time MB level scheduling increases the efficiency of parallel execution in the rest of the H.264 decoder, providing 60% speedup over greedy dynamic scheduling and 9-15% speedup over static compile  ...  Preparsing is a functional parallelization technique to resolve this front end bottleneck.  ...  Static scheduling algorithms typically assume an application description in the form of a directed acyclic precedence task graph.  ... 
doi:10.1109/icme.2007.4285040 dblp:conf/icmcs/ChongSCRK07 fatcat:dapqndwswfdpbgxyw7jqrc2vr4

A Performance Prediction Methodology for Data-dependent Parallel Applications

P. Fritzsche, C. Roig, A. Ripoll, E. Luque, A. Hernandez
2006 2006 IEEE International Conference on Cluster Computing  
The development of a new prediction methodology to estimate the performance of data-dependent parallel applications is the primary target of this study.  ...  The increase in the use of parallel distributed architectures in order to solve large-scale scientific problems has generated the need for performance prediction for both deterministic applications and  ...  Static parallel techniques use source code or pseudo code as their main input.  ... 
doi:10.1109/clustr.2006.311879 dblp:conf/cluster/FritzscheRRLH06 fatcat:cxmek2gg7jgndfcent2s2ciqcy

CLOUD WORKFLOW SCHEDULING BASED ON STANDARD DEVIATION OF PREDICTIVE RESOURCE AVAILABILITY

Vijayalakshmi A. Lepakshi
2017 International Journal of Advanced Research in Computer Science  
In this paper, we propose a new heuristic called Workflow Scheduling based on Standard Deviation of Predictive Resource Availability in cloud computing considers the dynamic nature of cloud resources and  ...  Hence, non-availability of these allocated resources may cause delays in completion of execution of parallel applications.  ...  The SLR is the ratio of the parallel execution time to the sum of weights of the critical paths tasks on the fastest processors. c) Speedup: Speedup of scheduling algorithm is computed by dividing sequential  ... 
doi:10.26483/ijarcs.v8i7.4214 fatcat:pijtvpp3pfhkpm4bjqlt2dspla

Predictive Probabilistic Resource Availability based Cloud Workflow Scheduling (PPRA)

Chitra S, Dr. Prashanth C S R
2017 IOSR Journal of Computer Engineering  
We propose a new static workflow scheduling algorithm called Predictive Probabilistic Resource Availability based Cloud Workflow Scheduling (PPRA) with the objective of minimizing makespan considering  ...  Cloud Computing provides access to a shared pool of computing resources such as servers, storage, computer networks and services, which can be rapidly provisioned and released, for the execution of various  ...  where is the average computation cost of the given graph and is set randomly.  ... 
doi:10.9790/0661-1904015463 fatcat:s4n3rxlmt5cuna55kkqcnptq3i

Page 5551 of Mathematical Reviews Vol. , Issue 95i [page]

1995 Mathematical Reviews  
A method is explained that allows the conflict-free and synchronous-parallel substitution of all components of a graph.”  ...  Summary: “We describe a new parallel algorithm for computing a maximal matching in a graph. The algorithm runs in time O(log* n) on (m+n)/log*n EREW PRAM processors.  ... 

On the Combination of Argumentation Solvers into Parallel Portfolios [chapter]

Mauro Vallati, Federico Cerutti, Massimiliano Giacomin
2017 Lecture Notes in Computer Science  
In particular, four methodologies aim at combining solvers in static portfolios, while two methodologies are designed for the dynamic configuration of parallel portfolios.  ...  In this work, we introduce six methodologies for the automatic configuration of parallel portfolios of argumentation solvers for enumerating the preferred extensions of a given framework.  ...  Acknowledgement The authors would like to acknowledge the use of the University of Huddersfield Queensgate Grid in carrying out this work.  ... 
doi:10.1007/978-3-319-63004-5_25 fatcat:mccwklzg2baphciqzq7b62budi

Skeletons for parallel image processing: an overview of the SKIPPER project

Jocelyn Sérot, Dominique Ginhac
2002 Parallel Computing  
This paper focuses on these implementation issues, by making a comparative survey, according to a set of four criteria (efficiency, expressivity, portability, predictability), of these implementation techniques  ...  The main goal of the Skipper project was to demonstrate the applicability of skeleton-based parallel programming techniques to the fast prototyping of reactive vision applications.  ...  , predictability, portability and expressivity. 4 Static data-flow.  ... 
doi:10.1016/s0167-8191(02)00189-8 fatcat:347kuuvfyrgx3pnjjfc2qsy5yq

POEMS: end-to-end performance design of large parallel adaptive computational systems

M.K. Vernon, P.J. Teller, D.J. Sundaram-Stukel, R. Sakellariou, J.R. Rice, E.N. Houstis, A. Dube, E. Deelman, J.C. Browne, R. Bagrodia, V.S. Adve
2000 IEEE Transactions on Software Engineering  
A single application representation based on static and dynamic task graphs serves as a common workload representation for all these modeling approaches.  ...  This composition can be specified using a generalized graph model of a parallel system, together with interface specifications that carry information about component behaviors and evaluation methods.  ...  The authors also would like to thank Thomas Phan and Steven Docy for their help with the use of MPI-Sim to predict the Sweep3D performance on the SP/2, the Office of Academic Computing at UCLA, and Paul  ... 
doi:10.1109/32.881716 fatcat:w47k2yff2jde3jrnvcf3rrvidu

Iterative solution of large, sparse linear systems on a static data flow architecture: Performance studies

Daniel A. Reed, Merrell L. Patrick
1985 IEEE transactions on computers  
The applicability of static data flow architectures to the iterative solution of sparse linear systems of equations is investigated.  ...  An analytic performance model of a static data flow computation is developed.  ...  AN ANALYTIC PERFORMANCE MODEL OF STATIC DATA FLOW COMPUTATION As we have noted, the M.I.T. static data flow architecture provides two levels of parallelism: pipelined and spatial.  ... 
doi:10.1109/tc.1985.6312190 fatcat:ck3jkp7iwvbmldhhay7fu556hy