Hierarchical Parallelism Control for Multigrain Parallel Processing [chapter]

Motoki Obata, Jun Shirako, Hiroki Kaminaga, Kazuhisa Ishizaka, Hironori Kasahara
2005 Lecture Notes in Computer Science  
In order to efficiently use hierarchical parallelism of each nest level, or layer, in multigrain parallel processing, it is required to determine how many processors or groups of processors should be assigned  ...  This paper proposes an automatic hierarchical parallelism control scheme to assign suitable number of processors to each layer so that the parallelism of each hierarchy can be used efficiently.  ...  Performance evaluation This section evaluates the performance of the proposed parallelism control scheme for multigrain parallel processing on IBM RS6000 SP 604e High Node 8 processors SMP server.  ... 
doi:10.1007/11596110_3 fatcat:3krnjuzrlbcehni2xviqj5conu

Multigrain Parallelization for Model-Based Design Applications Using the OSCAR Compiler [chapter]

Dan Umeda, Takahiro Suzuki, Hiroki Mikami, Keiji Kimura, Hironori Kasahara
2016 Lecture Notes in Computer Science  
Since embedded systems require real-time processing, the use of multi-core CPUs poses more opportunities for accelerating program execution to satisfy the real-time constraints.  ...  While prior approaches exploit parallelism among blocks by inspecting MATLAB/Simulink models, this may lose an opportunity for fully exploiting parallelism of the whole program because models potentially  ...  ., LTD. for providing the anomaly detection model. We would like to express appreciation to A&D CO., LTD..  ... 
doi:10.1007/978-3-319-29778-1_8 fatcat:2jhp6m2iqrfrhardeqkddqwicm

Parallel, multigrain iterative solvers for hiding network latencies on MPPs and networks of clusters

James R. McCombs, Andreas Stathopoulos
2003 Parallel Computing  
To test the effectiveness of the multigrain parallelism, we implemented a multigrain, block Jacobi-Davidson algorithm for computing a few extreme eigenvalues of a symmetric matrix.  ...  We call this combination of fine and coarse-grain parallelism multigrain.  ...  be implemented with multigrain parallelism.  ... 
doi:10.1016/s0167-8191(03)00101-7 fatcat:d3gbbga5m5dapnf4z4zvc7gdw4

Performance of OSCAR Multigrain Parallelizing Compiler on SMP Servers [chapter]

Kazuhisa Ishizaka, Takamichi Miyamoto, Jun Shirako, Motoki Obata, Keiji Kimura, Hironori Kasahara
2005 Lecture Notes in Computer Science  
The OSCAR compiler hierarchically exploits the coarse grain task parallelism among loops, subroutines and basic blocks and the near fine grain parallelism among statements inside a basic block in addition  ...  This paper describes performance of OSCAR multigrain parallelizing compiler on various SMP servers, such as IBM pSeries 690, Sun Fire V880, Sun Ultra 80, NEC TX7/i6010 and SGI Altix 3700.  ...  Also, the authors thank NEC Soft, Ltd. and SGI Japan, Ltd. for the kind offer of the use of the NEC TX7/i6010 and SGI Altix 3700 System for this research.  ... 
doi:10.1007/11532378_23 fatcat:ipm637l2brevhi5ycvdbshkeby

Compiler Control Power Saving Scheme for Multi Core Processors [chapter]

Jun Shirako, Naoto Oshiyama, Yasutaka Wada, Hiroaki Shikano, Keiji Kimura, Hironori Kasahara
2006 Lecture Notes in Computer Science  
This paper proposes a compilation scheme for reduction of power consumption under the multigrain parallel processing environment that controls Voltage/Frequency and power supply of each processor core  ...  To this end, the compiler for a multi core processor is expected not only to parallelize program effectively, but also to control the voltage and clock frequency of processors and storages carefully inside  ...  core processors for real time consumer electronics".  ... 
doi:10.1007/978-3-540-69330-7_25 fatcat:3trynyg6abeybembh56nntvqjq

OSCAR API for Real-Time Low-Power Multicores and Its Performance on Multicores and SMP Servers [chapter]

Keiji Kimura, Masayoshi Mase, Hiroki Mikami, Takamichi Miyamoto, Jun Shirako, Hironori Kasahara
2010 Lecture Notes in Computer Science  
By using the OSCAR API as an interface between the OSCAR compiler and backend compilers, the OSCAR compiler enables hierarchical multigrain parallel processing with memory optimization under capacity restriction  ...  for cache memory, local memory, distributed shared memory, and on-chip/off-chip shared memory; data transfer using a DMA controller; and power reduction control using DVFS (Dynamic Voltage and Frequency  ...  Acknowledgement This paper is supported by the METI/NEDO projects "Multicore Technology for Realtime Consumer Electronics", "Heterogeneous Multicore for Consumer Electronics" and "Low Power Manycore Processor  ... 
doi:10.1007/978-3-642-13374-9_13 fatcat:n75vuldrrzcfnpq76er5pwv6dq

Automatic Coarse Grain Task Parallel Processing on SMP Using OpenMP [chapter]

Hironori Kasahara, Motoki Obata, Kazuhisa Ishizaka
2001 Lecture Notes in Computer Science  
This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on an SMP machine.  ...  based on hierarchical coarse grain task parallel processing concept.  ...  The coarse grain task parallel processing scheme in OSCAR multigrain automatic parallelizing compiler consists of the following steps. 1.  ... 
doi:10.1007/3-540-45574-4_13 fatcat:lqmfuzm7jvgkhomgv5fg7cnzmy

Parallelization of automotive engine control software on embedded multi-core processor using OSCAR compiler

Y. Kanehagi, D. Umeda, A. Hayashi, K. Kimura, H. Kasahara
2013 2013 IEEE COOL Chips XVI  
On the other hand, this paper is the first to have successfully parallelized practical automotive engine control software using an automatic multigrain parallelizing compiler, the OSCAR Compiler  ...  distribution [2] rather than improvement of response time, or performance by parallel processing.  ...  Faster processing of engine control programs is required; new sophisticated control functions will be used; parallel processing of engine control programs on multi-core processors is required  ... 
doi:10.1109/coolchips.2013.6547921 dblp:conf/coolchips/KanehagiUHKK13 fatcat:h7uv2kq24rfcfof4zfomu3wmvy

Evaluation of Power Consumption at Execution of Multiple Automatically Parallelized and Power Controlled Media Applications on the RP2 Low-Power Multicore [chapter]

Hiroki Mikami, Shumpei Kitaki, Masayoshi Mase, Akihiro Hayashi, Mamoru Shimaoka, Keiji Kimura, Masato Edahiro, Hironori Kasahara
2013 Lecture Notes in Computer Science  
OSCAR compiler enables the hierarchical multigrain parallel processing and power reduction control using DVFS (Dynamic Voltage and Frequency Scaling), clock gating and power gating for each processor core  ...  This paper confirmed parallel processing and power reduction by OSCAR compiler are efficient for multiple application executions.  ...  A part of this research has been supported by NEDO "Advanced Heterogeneous Multiprocessor", NEDO "Multi core processors for realtime consumer electronics" and STARC "Automatic Parallelizing Compiler Cooperative  ... 
doi:10.1007/978-3-642-36036-7_3 fatcat:hgyucsbr5fgsnkednsi2hp3f2i

OSCAR Compiler Controlled Multicore Power Reduction on Android Platform [chapter]

Hideo Yamamoto, Tomohiro Hirano, Kohei Muto, Hiroki Mikami, Takashi Goto, Dominic Hillenbrand, Moriyuki Takamura, Keiji Kimura, Hironori Kasahara
2014 Lecture Notes in Computer Science  
The OSCAR Compiler enables automatic exploitation of multigrain parallelism within a sequential program, and automatically generates a parallelized code with the OSCAR Multi-Platform API power reduction  ...  This paper evaluates the power reduction control by the OSCAR Automatic Parallelizing Compiler on an Android platform with the newly developed precise power measurement environment on the ODROID-X2, a  ...  Multigrain Parallel Processing and Low Power Optimization by the OSCAR Compiler The OSCAR (Optimally Scheduled Advanced multiprocessor) Compiler exploits multigrain parallelism, which consists of coarse  ... 
doi:10.1007/978-3-319-09967-5_9 fatcat:4k37wdwdcnhajkapju3nm5cp6i

Effective cross-platform, multilevel parallelism via dynamic adaptive execution

W. Ko, M. Yankelevsky, D.S. Nikolopoulos, C.D. Polychronopoulos
2002 Proceedings 16th International Parallel and Distributed Processing Symposium  
This paper presents preliminary efforts to develop compilation and execution environments that achieve performance portability of multilevel parallelization on hierarchical architectures.  ...  This algorithm can be used as a rule of thumb for automatic multilevel parallelization. The effectiveness of the approach is demonstrated on the NAS benchmarks running on two architectural platforms.  ...  These hierarchical architectures provide the hardware needed to utilize multigrain parallelism in programs.  ... 
doi:10.1109/ipdps.2002.1016495 dblp:conf/ipps/KoYNP02 fatcat:ge65dyzbvbhzxdt6dhvbagls4a

Generating Fine-Grain Multithreaded Applications Using a Multigrain Approach

Jaime Arteaga, Stéphane Zuckerman, Guang R. Gao
2017 ACM Transactions on Architecture and Code Optimization (TACO)  
To evaluate the type of applications that benefit from executing in a unified fine-grain program execution model, this article presents a multigrain parallel programming environment for the generation  ...  data-intensive workloads with irregular and dynamic parallelism, reaching speedups as high as 2.6× for Graph500 and 51× for NAS Data Cube.  ...  MULTIGRAIN PARALLEL PROGRAMMING ENVIRONMENT This section presents two of the main components of our multigrain parallel programming environment for the fine-grain execution of OpenMP programs.  ... 
doi:10.1145/3155288 fatcat:pqpz423lufdb5d7xycqscenssm

Scalable black-box prediction models for multi-dimensional adaptation on NUMA multi-cores

Aleksandr Khasymski, Dimitrios S. Nikolopoulos
2014 International Journal of Parallel, Emergent and Distributed Systems  
Research interests Parallel computing systems hardware-software boundary: Runtime systems, ubiquitous parallel programming models, memory management, energy-efficient parallel execution, operating systems  ...  Parallel computer architecture: chip multiprocessors, computational accelerators, heterogeneous computer architectures, emerging memory technologies, heterogeneous memory hierarchies.  ...  Malleable Memory Mapping: User-Level Control of Memory Bounds for Effective Program Adaptation.  ... 
doi:10.1080/17445760.2014.895346 fatcat:32mn2ijvbja5njdo43axnsjmjq

Factory: An Object-Oriented Parallel Programming Substrate for Deep Multiprocessors [chapter]

Scott Schneider, Christos D. Antonopoulos, Dimitrios S. Nikolopoulos
2005 Lecture Notes in Computer Science  
Recent advances in processor technology such as Simultaneous Multithreading (SMT) and Chip Multiprocessing (CMP) enable parallel processing on a single die.  ...  This paper introduces Factory, an object-oriented parallel programming substrate which allows programmers to express multigrain parallelism, but alleviates them from having to manage it.  ...  As such, Factory can serve as a runtime library for next-generation, object-oriented parallel programming systems that target deep, multigrain parallel architectures.  ... 
doi:10.1007/11557654_28 fatcat:kd7m5pwvxfc35pt4bv2z3sfsqe

MGS

Donald Yeung, John Kubiatowicz, Anant Agarwal
1996 Proceedings of the 23rd annual international symposium on Computer architecture - ISCA '96  
Parallel workstations, each comprising 10-100 processors, promise cost-effective general-purpose multiprocessing.  ...  The authors would like to thank Kavita Bala, Fred Chong, Fredrik Dahlgren, Matt Frank, Silvina Hanono, Kirk Johnson, Kathy Knobe, Victor Lee, and Deborah Wallach for providing valuable comments on early  ...  While cluster locality can be exploited by any hierarchical shared memory system, multigrain locality can only be exploited by hierarchical systems that use different grains of sharing at different levels  ... 
doi:10.1145/232973.232980 dblp:conf/isca/YeungKA96 fatcat:nprbszfczrhnzosqc2fb3cktom