Fast switching of threads between cores

Richard Strong, Jayaram Mudigonda, Jeffrey C. Mogul, Nathan Binkert, Dean Tullsen
2009 ACM SIGOPS Operating Systems Review  
We address the software costs of switching threads between cores in a multicore processor.  ...  Fast core switching enables a variety of potential improvements, such as thread migration for thermal management, fine-grained load balancing, and exploiting asymmetric multicores, where performance asymmetry  ...  Acknowledgments We could not have done this work without significant help from Rakesh Kumar, Partha Ranganathan, Vanish Talwar, and the members of the M5 community.  ... 
doi:10.1145/1531793.1531801 fatcat:qaieoix2srgb7nuxskqqdl644q

XMOS architecture XS1 chips

David May
2011 2011 IEEE Hot Chips 23 Symposium (HCS)  
Fast barrier synchronisation: one instruction per thread. Guarantee that each of n threads has 1/n core cycles.  ...  An XCore can power down when all of its threads are waiting - event-driven processing. Thread Scheduler. Concurrency and Thread Scheduler. Fast initiation and termination of threads - forking and joining.  ... 
doi:10.1109/hotchips.2011.7477496 fatcat:lvx6did4xfdvdhuwhw5wn6yszu

Page 29 of English Electric Journal Vol. 17, Issue 5 [page]

1962 English Electric Journal  
Fast speed selection is until all stand lockout switches are in the run position and inoperative when running above thread speed a change of selection is ineffective.  ...  The composite arrangement of some twenty-six flat-back and cubicle type boards and resistance frameworks led to a considerable interconnecting multi-core cables, connections between  ... 

Evaluating private vs. shared last-level caches for energy efficiency in asymmetric multi-cores

Anthony Gutierrez, Ronald G. Dreslinski, Trevor Mudge
2014 2014 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV)  
We show that for switching threads between cores at intervals on the order of 100k or more instructions, the performance difference is negligible when private last-level caches are used in place of shared  ...  In this work we explore the tradeoffs between energy and performance for several last-level cache configurations in an asymmetric multi-core system.  ...  The scope of this work focuses on how to support fast switching of threads between cores.  ... 
doi:10.1109/samos.2014.6893211 dblp:conf/samos/GutierrezDM14 fatcat:tl3npdfhdrcy7kh5gyk3h73lva

Hardware-modulated parallelism in chip multiprocessors

Julia Chen, Philo Juang, Kevin Ko, Gilberto Contreras, David Penry, Ram Rangan, Adam Stoler, Li-Shiuan Peh, Margaret Martonosi
2005 SIGARCH Computer Architecture News  
of threads to cores at low overheads.  ...  This is achieved with modest amounts of hardware support that allows for low overheads in thread creation, scheduling and context-switching.  ...  Acknowledgments We thank David August and his group at Princeton for their support of our use of the Liberty Simulation Environment (LSE) and for extensive discussions on the NDP architecture.  ... 
doi:10.1145/1105734.1105742 fatcat:d5iferqj5fghrmndyw6b4ill6y

Micromagnetic Simulations on GPU, A Case Study: Vortex Core Switching by High-Frequency Magnetic Fields

Ben Van de Wiele, Arne Vansteenkiste, Matthias Kammerer, Bartel Van Waeyenberge, Luc Dupre, Daniël De Zutter
2012 IEEE transactions on magnetics  
By exploiting MUMAX's numerical power we were able to explore new switching opportunities at moderate field amplitudes in the frequency range between 5 and 12 GHz.  ...  Vortex core switching can be induced by exciting the gyrotropic eigenmode, e.g., by applying cyclic magnetic fields with typically a sub-gigahertz frequency.  ...  Here, the number of parallel threads (i.e. the number of cores) is relatively small, but the communication is fast.  ... 
doi:10.1109/tmag.2011.2179551 fatcat:apvs3retv5bulm3x5mak2t3f74

Real-Time and Real-Fast Performance of General-Purpose and Real-Time Operating Systems in Multithreaded Physical Simulation of Complex Mechanical Systems

Carlos Garre, Domenico Mundo, Marco Gubitosa, Alessandro Toso
2014 Mathematical Problems in Engineering  
This type of applications is usually present in the automotive industry and requires a good trade-off between real-fast and real-time performance.  ...  The performance of an RTOS and a GPOS is compared by running a tire model scalable on the number of degrees-of-freedom and parallel threads.  ...  chosen as the maximum number of threads for all the tests, which means a maximum of 2 threads per virtual core (or 4 threads per physical core).  ... 
doi:10.1155/2014/945850 fatcat:cp6dxqyp5nblhmza4icqaiiwru

A memory access model for highly-threaded many-core architectures

Lin Ma, Kunal Agrawal, Roger D. Chamberlain
2014 Future generations computer systems  
between them; this fast context-switch mechanism is used to hide the memory access latency of transferring data from slow large (and often global) memory to fast, small (and typically local) memory  ...  A number of highly-threaded, many-core architectures hide memory-access latency by low-overhead context switching among a large number of threads.  ...  In contrast, highly-threaded, many-core machines are explicitly designed to have a large number of threads per core and a fast context switching mechanism.  ... 
doi:10.1016/j.future.2013.06.020 fatcat:qhvb6p445ra4vow5cvwvp6kbxq

High Speed Cycle-Approximate Simulation of Embedded Cache-Incoherent and Coherent Chip-Multiprocessors

Christopher Thompson, Miles Gould, Nigel Topham
2018 International journal of parallel programming  
The complexity parameter influences the number of switches and number of layers in the switching network before adding cores and peripherals, extra switches are added if the configured complexity does  ...  The platform comprises clusters of 1-8 processor cores, connected through a per-cluster arbiter to an ARM AMBA AXI [3] based packet switched network, which connects cores to memory banks and devices, such  ...  Acknowledgements This work has made use of the resources provided by the Edinburgh Compute and Data Facility (ECDF) [25].  ... 
doi:10.1007/s10766-018-0566-x fatcat:pmvu23xlafaejhva5o2vzldr4e

An evaluation of parallel optimization for OpenSolaris® network stack

Hongbo Zou, Wenji Wu, Xian-He Sun, Phil DeMar, Matt Crawford
2010 IEEE Local Computer Network Conference  
The fundamental goal of multiprocessing is improved performance through the introduction of additional hardware threads or cores (referred to as "cores" for simplicity).  ...  OpenSolaris has redesigned and parallelized to better utilize additional cores.  ...  The metrics of interest are: 1) smtx - the number of times cores failed to obtain a mutex immediately; 2) migr - the number of thread migrations between cores; 3) l2_miss - the number of L2 cache misses  ... 
doi:10.1109/lcn.2010.5735726 dblp:conf/lcn/ZouWSDC10 fatcat:rg6pieupjnd6dnjpbajxnpzys4

A Memory Access Model for Highly-threaded Many-core Architectures

Lin Ma, Kunal Agrawal, Roger D. Chamberlain
2012 2012 IEEE 18th International Conference on Parallel and Distributed Systems  
Many-core architectures are excellent in hiding memory-access latency by low-overhead context switching among a large number of threads.  ...  In this paper, we introduce the Threaded Many-core Memory (TMM) model which is meant to capture the important characteristics of these highly-threaded, many-core machines.  ...  The high-level characteristics that we will focus on are: (1) These many-core machines have a large number of threads and a super fast context switching mechanism.  ... 
doi:10.1109/icpads.2012.54 dblp:conf/icpads/MaAC12 fatcat:gabo4pghavhl3fcmxjokhoqdjy

Design of interleaved multithreading for Network Processors on Chip

H.C. Freitas, F.L. Madruga, M. Alves, P. Navaux
2009 2009 IEEE International Symposium on Circuits and Systems  
Thread level parallelism and multi-core processors are current alternatives to increase performance of general-purpose applications.  ...  In the same way, Networks-on-Chip (NoCs) are the main alternatives for supporting packet throughput for the next generations of many-core processors.  ...  In common architectures without multithreading support, there is a high latency to switch thread context between register bank and memory.  ... 
doi:10.1109/iscas.2009.5118237 dblp:conf/iscas/FreitasMAN09 fatcat:zgromdz24bfhxnfltptrjxcth4

Performance Evaluation of DPDK Open vSwitch with Parallelization Feature on Multi-Core Platform

Guanchao Sun (Key Lab of Beijing Network Technology, School of Computer Science and Engineering, Beihang University, Beijing, China), Wei Li, Di Wang
2018 Journal of Communications  
Open vSwitch is one of the popular open source software of Openflow switch. However, the performance degradation of Open vSwitch caused by processing smaller packets has been widely criticized.  ...  In this paper, we analyze the parallel performance of OVS-DPDK on the multi-core platform from two aspects of throughput and delay.  ...  This prevents threads from switching frequently between different cores, resulting in frequent cache miss and cache write back, which would lead to significant performance losses.  ... 
doi:10.12720/jcm.13.11.685-690 fatcat:jahny7ac6rfe3ayzozcz24xcyy

A Generic Implementation of Barriers Using Optical Interconnects

Sandeep Chandran, Eldhose Peter, Preeti Ranjan Panda, Smruti R. Sarangi
2016 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID)  
multiple applications, context switches, thread migrations, and variability in the number of active threads.  ...  One of these protocols is a centralized protocol (suitable for less cores), and the other is a distributed protocol, which is scalable.  ...  In specific, they assume that the number of cores is equal to the number of threads, every thread is interested in entering the barrier, there are no context switches or thread migrations, and we do not  ... 
doi:10.1109/vlsid.2016.16 dblp:conf/vlsid/ChandranPPS16 fatcat:fpjhsj3ftzb5xpx45v4pzxumda


Barry Smith, Jed Brown, Matthew Knepley, Karl Rupp, Mark Adams
2018 Figshare  
to exascale (on emerging architectures) using node-aware MPI techniques, including neighborhood collectives and portable shared memory within a node, instead of threads.  ...  We provide technical details on why we feel the use of threads does not offer any fundamental performance advantage over using processes for high-performance computing and hence why we plan to extend PETSc  ...  Department of Energy, Office of Science, Advanced Scientific Computing Research under Contract DE-AC02-06CH11357.  ... 
doi:10.6084/m9.figshare.5824950.v1 fatcat:uxdoiebu55fhths6rrlysojjlu