2,852 Hits in 4.0 sec

A peta-scalable CPU-GPU algorithm for global atmospheric simulations

Chao Yang, Weimin Zheng, Wei Xue, Haohuan Fu, Lin Gan, Linfeng Li, Yangtong Xu, Yutong Lu, Jiachang Sun, Guangwen Yang
2013 Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '13  
In this paper, we propose a peta-scalable hybrid algorithm that is successfully applied in a cubed-sphere shallow-water model for global atmospheric simulations.  ...  We employ an adjustable partition between CPUs and GPUs to achieve a balanced utilization of the entire hybrid system, and present a pipe-flow scheme to conduct conflict-free inter-node communication on  ...  Acknowledgments We would like to thank Xiao-Chuan Cai for insightful discussion on scalable algorithms in climate modeling, thank Paulius Micikevicius for giving suggestions on optimizing the CUDA kernel  ... 
doi:10.1145/2442516.2442518 dblp:conf/ppopp/YangXFGLXLSYZ13 fatcat:lebcyz3yqfbehb3rpojpzhs3mi

Buffer management in wormhole-routed torus multicomputer networks

Kamala Kotapati, Sivarama P. Dandamudi
2000 Future generations computer systems  
developed to perform the desired study on the interconnection network.  ...  One area that requires study is in the design of a programmable router that uses the hybrid buffer organization which could respond to the dynamic conditions in the network.  ... 
doi:10.1016/s0167-739x(99)00128-4 fatcat:6emrnixhcjhrvfhzskliadp5vm

An area-efficient high-throughput hybrid interconnection network for single-chip parallel processing

Aydin O. Balkan, Gang Qu, Uzi Vishkin
2008 Proceedings of the 45th annual conference on Design automation - DAC '08  
Earlier studies [5] proposed a specific Mesh-of-Trees (MoT) on-chip network that provides high performance (high throughput and low latency) for large amounts of parallelism with high traffic rates.  ...  A recently proposed Mesh-of-Trees (MoT) network provides high throughput and low latency at relatively high area cost. In this paper, we introduce a hybrid MoT-BF network that combines MoT network with  ...  CONCLUSION A hybrid network architecture incorporating mesh-of-trees (MoT) and butterfly (BF) networks is presented.  ... 
doi:10.1145/1391469.1391583 dblp:conf/dac/BalkanQV08 fatcat:coyehvtb4nepthyw4e42saf6ci

Memory-centric system interconnect design with Hybrid Memory Cubes

Gwangsun Kim, John Kim, Jung Ho Ahn, Jaeha Kim
2013 Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques  
As a result, the HMC (Hybrid Memory Cube) has recently been proposed to improve DRAM bandwidth as well as energy efficiency. In this paper, we explore different system interconnect designs with HMCs.  ...  Memory bandwidth has been one of the most critical system performance bottlenecks.  ...  Fig. 3. Block diagram of a hybrid memory cube (HMC) that consists of several layers of DRAM dies and a logic layer on the bottom. The logic layer consists of an intra-HMC network.  ... 
doi:10.1109/pact.2013.6618812 dblp:conf/IEEEpact/KimKAK13 fatcat:msleo2kuejezjg5xba3p65pamm

Off-chip communication architectures for high throughput network processors

Jacob Engel, Taskin Kocak
2009 Computer Communications  
In recent years there is a significant increase in memory bandwidth demand on line cards as a result of higher line rates, an increase in deep packet inspection operations and an unstoppable expansion  ...  Therefore, indirect interconnects are replaced with direct, packet-based networks such as mesh, torus or k-ary n-cubes.  ...  A study done by Dally in [12] demonstrated that low-dimensional networks provide better performance than high-dimensional networks.  ... 
doi:10.1016/j.comcom.2008.12.043 fatcat:n44wcdfjgjhzji5tzqgx2abo3q

SOCD Sort on Centralized Diamond Architecture

Kamal Jadidy Aval, Masumeh Damrudi
2015 International Journal of Computer and Communication Engineering  
Parallel sorting is a technique which researchers have studied from the time parallelism was proposed as a way of making fast algorithms.  ...  Different parallel sorting techniques on different architectures have been studied for many years. Sorting is one of the most important operations in different algorithms.  ...  Fig. 1. Examples of static interconnection network topologies: linear array (1D), mesh (2D), cube (3D), hypercube (4D), multi-mesh of trees (hybrid) [16], pyramid (hybrid).  ... 
doi:10.17706/ijcce.2016.5.4.246-252 fatcat:t76x3ie7pvdzjb77f4ps4h74hu

Star-crossed cube: an alternative to star graph

Nibedita ADHIKARI, Chitta Ranjan TRIPATHY
2014 Turkish Journal of Electrical Engineering and Computer Sciences  
This paper introduces a new interconnection network topology called the star-crossed cube (SCQ).  ...  The current study proposes a new class of IN topology called the star-CQ (SCQ(m,n)). It is a product graph on the n-star and m-dimensional CQ (CQ(m)).  ...  Introduction The performance of a distributed memory parallel computer heavily depends on the effectiveness of its interconnection network (IN) [1, 2].  ... 
doi:10.3906/elk-1202-44 fatcat:l5m3ppir6vbvdgym3nmcghy73i

Triangle Hyper Hexa-cell Interconnection Network A Novel Interconnection Network

Asmaa Aljawawdeh, Esraa Emriziq, Saher Manaseer
2019 International Journal of Advanced Computer Science and Applications  
This work proposes a new topology; a hybrid topology between hyper hexa cell topology and triangle topology.  ...  The interconnection networks play the main role in many applications, because it has a direct influence on it.  ...  The proposed algorithm on tree-hyper cube can perform for irregular shape, but this study didn't perform for regular shape and show some weakness such as increase cost, dilation one, and congestion one  ... 
doi:10.14569/ijacsa.2019.0100378 fatcat:vgx4dpasuvdzvhdz35qee3o7fa

Author Index

2019 2019 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS)  
Peak Load Data Forecasting Based on Long Short Term Memory Comparison of the Energy Efficiency in Fat-Tree and B-Cube Based Data Center Network Implement Time Based One Time Password and Secure Hash Algorithm  ...  Indonesian YouTube Video Comments: Case Study of the Indonesian Government's Plan to Move the Capital City Purnama, Devi IF2.6 237 Handoff Mechanism in Wireless Mesh Network Using Deficit Round  ... 
doi:10.1109/icimcis48181.2019.8985220 fatcat:ytjzkbeyovfbbcagwqyavn53yu

Automatic hippocampal surface generation via 3D U-net and active shape modeling with hybrid particle swarm optimization [article]

Pinyuan Zhong, Yue Zhang, Xiaoying Tang
2021 arXiv   pre-print
Secondly, ASM was performed on a group of pre-obtained template surfaces to generate mean shape and shape variation parameters through principal component analysis.  ...  Ultimately, hybrid particle swarm optimization was utilized to search for the optimal shape variation parameters that best match the segmentation.  ...  INTRODUCTION The hippocampus has a close relationship with the memory function of the human brain.  ... 
arXiv:2109.06817v1 fatcat:wyssqh5cyfgmrfirfpefpco6ui

Interconnection networks: dimensions in design

Abraham
1996 Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing ICPPW-96  
The interconnection network is the switching fabric responsible for providing communication between all processors in a parallel computer.  ...  The speakers for this panel session were asked to address the following question: for a given range of number of commodity high-performance processors (e.g. 256 to 1024) what interconnection network should  ...  The most widely known direct network topologies are the multi-dimensional meshes and tori, also called k-ary n-cubes. The simplest of these networks is the one dimensional ring.  ... 
doi:10.1109/icppw.1996.538589 dblp:conf/icpp/Abraham96 fatcat:i3ehpmobmff5pohkxzgef7hcrm

Server-based rendering of large 3D scenes for mobile devices using G-buffer cube maps

Juergen Doellner, Benjamin Hagedorn, Jan Klimke
2012 Proceedings of the 17th International Conference on 3D Web Technology - Web3D '12  
represented by G-buffer cube maps, for a requested camera setting.  ...  The client reconstruction process uses these cube maps to reconstruct the 3D scene and allows users to operate on and interact with that representation.  ...  Either it executes on the same CPU as the Render Master, on a different CPU on the same computer, or a different computer connected via a network.  ... 
doi:10.1145/2338714.2338729 dblp:conf/vrml/DoellnerHK12 fatcat:tmd4kuvw3vcwrnw4otwetqfjfe

GOTPM: a parallel hybrid particle-mesh treecode

John Dubinski, Juhan Kim, Changbom Park, Robin Humble
2004 New Astronomy  
We describe a parallel, cosmological N-body code based on a hybrid scheme using the particle-mesh (PM) and Barnes-Hut (BH) oct-tree algorithm.  ...  The gravitational potential is determined on a mesh using a standard PM method with particle forces determined through interpolation.  ...  Memory budget The memory budget for a pure n-body simulation can be broken down into particle memory and mesh memory.  ... 
doi:10.1016/j.newast.2003.08.002 fatcat:soiq7v6fvfafzd4owkcmf6zyzu

Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms [article]

Amani AlOnazi, David Keyes, Alexey Lastovetsky, Vladimir Rychkov
2015 arXiv   pre-print
In this work, we study optimizations aimed at acceleration of OpenFOAM-based applications on emerging hybrid heterogeneous platforms.  ...  In our study, we use two OpenFOAM applications, icoFoam and laplacianFoam, both based on Krylov iterative methods.  ...  steady state 7-point Laplacian with Dirichlet boundary conditions in a 3D cube using different none/preconditioned CG solvers with a tolerance of 10^-8 on different mesh sizes running on one socket, which  ... 
arXiv:1505.07630v1 fatcat:ncqycx2hmvbaxjbfeavwhyw3v4

Continual Learning Approach for Improving the Data and Computation Mapping in Near-Memory Processing System [article]

Pritam Majumder, Jiayi Huang, Sungkeun Kim, Abdullah Muzahid, Dylan Siegers, Chia-Che Tsai, Eun Jung Kim
2021 arXiv   pre-print
To meet the bandwidth and capacity demands of memory-centric computing, 3D memory has been adopted to form a scalable memory-cube network.  ...  Along with NMP and memory system development, the mapping for placing data and guiding computation in the memory-cube network has become crucial in driving the performance improvement in NMP.  ...  For the memory network, we model 4×4 and 8×8 mesh networks with 3D memory cubes that consist of vaults, banks and router switches.  ... 
arXiv:2104.13671v1 fatcat:fe2slbojkndufibfikgxpajqqe