IsoNet: Hardware-Based Job Queue Management for Many-Core Architectures

Junghee Lee, Chrysostomos Nicopoulos, Hyung Gyu Lee, Shreepad Panth, Sung Kyu Lim, Jongman Kim
2013 IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Thus, IsoNet is a network tasked with maintaining equal load between the processing cores of the CMP.  ...  IsoNet is a lightweight job queue manager responsible for administering the list of jobs to be executed, and maintaining load balance among all CMP cores.  ...  Section II serves as a concise preamble on the basics of parallel programming on modern CMPs and provides motivation on the need for hardware-based dynamic load balancing.  ... 
doi:10.1109/tvlsi.2012.2202699 fatcat:kcd2je7dave6dmyx6rjmze3g6q

Fast and Memory-Efficient Traffic Classification with Deep Packet Inspection in CMP Architecture

Tingwen Liu, Yong Sun, Li Guo
2010 IEEE Fifth International Conference on Networking, Architecture, and Storage
Furthermore, we design a fast and memory-efficient system of a two-layer architecture for traffic classification with the help of regular expressions in multi-core architecture, which is different from  ...  The classic way to identify flows, e.g., examining the port numbers in the packet headers, becomes ineffective.  ...  We design a new load balancing algorithm based on static preallocation and adaptive adjustments, which could be used in network flow-based parallel processing system in CMP architecture.  ... 
doi:10.1109/nas.2010.43 dblp:conf/nas/LiuSG10 fatcat:4w3i67ejyfdmppejmi4ap5nn24

The Cell Broadband Engine: Exploiting Multiple Levels of Parallelism in a Chip Multiprocessor

Michael Gschwind
2007 International Journal of Parallel Programming
in many designs fails to exploit all the levels of available parallelism in many workloads for CMP systems.  ...  By taking advantage of opportunities at all levels of the system, this CMP revolutionizes parallel architectures to deliver previously unattained levels of single chip performance.  ...  In the process, a range of new innovative solutions has been proposed and implemented, from CMPs based on homogeneous single ISA systems (Piranha, Cyclops, Xbox360), to heterogenous multi-ISA systems (  ... 
doi:10.1007/s10766-007-0035-4 fatcat:zcndew73k5bevgjnuc3h3lbtya

Scalability of Multimedia Applications on Next-Generation Processors

Guy Amit, Yaron Caspi, Ran Vitale, Adi Pinhas
2006 IEEE International Conference on Multimedia and Expo
In particular, the study discusses the decomposition method, load balancing, synchronization primitives, interaction with the operating system and hardware issues such as cache hierarchy and memory bandwidth  ...  In the near future, the majority of personal computers are expected to have several processing units. This is referred to as Core Multiprocessing (CMP).  ...  Yefet for their contributions to this research.  ... 
doi:10.1109/icme.2006.262503 dblp:conf/icmcs/AmitCVP06 fatcat:fk3dzpldjbfoxl6myovc4zzdkm

Evaluation Scheme for NoC-based CMP with Integrated Processor Management System

Dawid Zydek, Henry Selvaraj, Leszek Koszałka, Iwona Poźniak-Koszałka
2010 International Journal of Electronics and Telecommunications  
Analyzed results reveal that CMP with a PA controlled by IFF allocation algorithm for mesh systems and torus-based NoC driven by DORLB routing with express-virtual-channel flow control achieved the best  ...  With the opportunities and benefits offered by Chip Multiprocessors (CMPs), there are many challenges that need to be addressed  ...  Tiled Chip Multiprocessor (CMP) architectures with many Processing Elements (PEs) integrated on one die attract a lot of attention and they are dominant trend in parallel processing systems  ... 
doi:10.2478/v10177-010-0021-4 fatcat:cqeqzba2vrgi7lovnnxw475svq

Event-driven configuration of a neural network CMP system over an homogeneous interconnect fabric

M.M. Khan, A.D. Rast, J. Navaridas, X. Jin, L.A. Plana, M. Luján, S. Temple, C. Patterson, D. Richards, J.V. Woods, J. Miguel-Alonso, S.B. Furber
2011 Parallel Computing  
SpiNNaker is a parallel Chip Multiprocessor (CMP) system for neural network (NN) simulation.  ...  Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that  ...  A system for neural network simulation will be, correspondingly, architecturally different from parallel systems designed mostly for general-purpose computing.  ... 
doi:10.1016/j.parco.2011.02.003 fatcat:xznsapdhsne4xemiwgns2aop34

Energy characteristic of a processor allocator and a network-on-chip

Dawid Zydek, Henry Selvaraj, Grzegorz Borowik, Tadeusz Łuba
2011 International Journal of Applied Mathematics and Computer Science  
Simulation results show that a PA with an IFF allocation algorithm for mesh systems and a torus-based NoC with express-virtual-channel flow control are very energy efficient.  ...  Besides efficient on-chip processing elements, a well-designed Processor Allocator (PA) and a Network-on-Chip (NoC) are also important factors in the energy budget of novel CMPs.  ...  Acknowledgment This work has been supported by the European Union in the framework of the European Social Fund through the Warsaw University of Technology Development Programme, and by the Ministry of  ... 
doi:10.2478/v10006-011-0029-7 fatcat:cby3q2ucczczzpdknrm5lwk2ly

Location of Processor Allocator and Job Scheduler and Its Impact on CMP Performance

Dawid Zydek, Grzegorz Chmaj, Alaa Shawky, Henry Selvaraj
2012 International Journal of Electronics and Telecommunications  
We present energy models for the researched CMP components, mathematical model of the system, and experimentation system.  ...  Processors that are being developed and used as nodes in HPC systems are Chip Multiprocessors (CMPs) with a number of cores.  ...  We investigate NoC architectures with: 1) 2D-Mesh and 2D-Torus topologies, 2) Virtual-Channel (VC) and Express Virtual-Channel (EVC) flow controls, 3) Dimensional Order Routing with Load Balance extension  ... 
doi:10.2478/v10177-012-0001-y fatcat:pzfc77eyhzeg3g4jko4l643cka

Characterizing processor architectures for programmable network interfaces

Patrick Crowley, Marc E. Fiuczynski, Jean-Loup Baer, Brian N. Bershad
2000 Proceedings of the 14th international conference on Supercomputing - ICS '00  
To date, no performance data exist to aid in the decision of what processor architecture to use in next generation network processor. Our goal is to remedy this situation.  ...  The network interface environment is simulated in detail, and our results indicate that SMT is the architecture best suited to this environment.  ...  ., flow management) and emerging applications (HTTP load balancing) to round out our network processor workloads.  ... 
doi:10.1145/335231.335237 dblp:conf/ics/CrowleyFBB00 fatcat:zf23lcuw6fflfpbkppcbkzxnqa

Characterizing processor architectures for programmable network interfaces

Patrick Crowley, Marc E. Fiuczynski, Jean-Loup Baer, Brian N. Bershad
2014 25th Anniversary International Conference on Supercomputing Anniversary Volume
To date, no performance data exist to aid in the decision of what processor architecture to use in next generation network processor. Our goal is to remedy this situation.  ...  The network interface environment is simulated in detail, and our results indicate that SMT is the architecture best suited to this environment.  ...  (HTTP load balancing) to round out our network processor workloads.  ... 
doi:10.1145/2591635.2667178 fatcat:f5h26uvk5rf65oi3sgg5e4mjna

Author Index

2008 IEEE International Symposium on Parallel and Distributed Processing
DiCo-CMP: Efficient Cache Coherency in Tiled CMP Architectures García, Miguel Angel Perimeter Quadrature-based Metric for Estimating FPGA Fragmentation in 2D HW Multitasking Garey, Larry A Linear Solver  ...  for Component Based Hardware/Software Interaction in Embedded Real-Time Systems Foughali, Laidi A Parallel Insular Model for Location Areas Planning in Mobile Networks Fu, Song Random Choices for Churn  ... 
doi:10.1109/ipdps.2008.4536576 fatcat:7unikf5ywjhjtdd6xtrmcom3gq

High-Performance Energy-Efficient Multicore Embedded Computing

A. Munir, S. Ranka, A. Gordon-Ross
2012 IEEE Transactions on Parallel and Distributed Systems  
Embedded systems differ from traditional high-performance supercomputers in that power is a first-order constraint for embedded systems; whereas, performance is the major benchmark for supercomputers.  ...  Finally, we present design challenges and future research directions for HPEEC system development.  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSERC and the NSF.  ... 
doi:10.1109/tpds.2011.214 fatcat:vagqmojdsjevvc2u2ewqrcjjpq

Implementation and Evaluation of Fock Matrix Calculation Program on the Cell Processor

Hiroaki Honda, Tetsuo Hayashi, Yuichi Inadomi, Koji Inoue, Kazuaki J. Murakami, Theodore E. Simos, George Maroulis
2007 AIP Conference Proceedings  
Recently, the Chip Multi-processors (CMPs), which has many processor cores onto a chip, are proposed for further performance improvement.  ...  Various processor architectures have been proposed until today, and the performance has improved remarkably.  ...  ACKNOWLEDGMENTS This work has been supported in part by the Grant-in-Aid for Young Scientists (A) No.17680005 of the Ministry of Education, Science, Sports and Culture (MEXT).  ... 
doi:10.1063/1.2836167 fatcat:umbyy6vpifhefkspzadsngmmhm

Event-Driven Configuration of a Neural Network CMP System over a Homogeneous Interconnect Fabric

Muhammad Mukaram Khan, Javier Navaridas Palma, Alexander D. Rast, Xin Jin, Luis A. Plana, Mikel Lujan, John Viv Woods, Jose Miguel-Alonso, Steve B. Furber
2009 Eighth International Symposium on Parallel and Distributed Computing
The architecture of SpiNNaker, a parallel chip multiprocessor (CMP) system for neural network simulation, is in this class.  ...  Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that  ...  The architecture of SpiNNaker, a parallel chip multiprocessor (CMP) system for neural network simulation, is in this class.  ... 
doi:10.1109/ispdc.2009.25 dblp:conf/ispdc/KhanNRJPLWMF09 fatcat:bx2ce5yagfbntmzx47s5dq3fdu

Embracing heterogeneity with dynamic core boosting

Hyoun Kyu Cho, Scott Mahlke
2014 Proceedings of the 11th ACM Conference on Computing Frontiers - CF '14  
Even for embarrassingly parallel programs in the form of SPMD (single program multiple data), the threads are not perfectly balanced due to control flow divergence, non-deterministic memory latencies,  ...  We propose Dynamic Core Boosting (DCB), a software-hardware cooperative system that mitigates the workload imbalance problem in performance asymmetric CMPs.  ...  Increasing core-to-core process variation also creates performance asymmetry in CMPs [29].  ... 
doi:10.1145/2597917.2597932 dblp:conf/cf/ChoM14 fatcat:tx22yxw3mnbu7dpe5ccwealwua