
Addressing End-to-End Memory Access Latency in NoC-Based Multicores

Akbar Sharifi, Emre Kultursay, Mahmut Kandemir, Chita R. Das
2012 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture  
NoC Based Multicores  ...  Contribution of NoC to end-to-end memory access latency  ...  Request Message, Response Message  ...  End-to-End Memory Latency components: L1-L2, L2-Mem, Mem, Mem-L2, L2-L1  ... 
doi:10.1109/micro.2012.35 dblp:conf/micro/SharifiKKD12 fatcat:3sbvyhlxzvhipppibxrdsu5lva

Models of Communication for Multicore Processors

Martin Schoeberl, Rasmus Bo Sorensen, Jens Sparso
2015 2015 IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops  
To efficiently use multicore processors we need to ensure that almost all data communication stays on chip, i.e., the bits moved between tasks executing on different processor cores do not leave the chip  ...  In this paper we explore the different hardware mechanisms for on-chip communication and how they support or favor different models of communication.  ...  ACKNOWLEDGMENT The work presented in this paper was partially funded by the Danish Council for Independent Research | Technology and Production Sciences under the project RTEMP, contract no. 12-127600  ... 
doi:10.1109/isorcw.2015.57 dblp:conf/isorc/SchoeberlSS15 fatcat:2xfammjbbnb5vkdm6zfd5wxefy

Hierarchical Cluster Based NoC Design Using Wireless Interconnects for Coherence Support

Tanya Shreedhar, Sujay Deb
2016 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID)  
In this paper, we propose a novel multicore hierarchical cluster based architecture that utilizes a hybrid NoC having both wired and wireless interconnects.  ...  To the best of our knowledge, our work is the first to utilize wireless interconnects to solve the cache coherence problem in multicore system on NoC.  ...  L2 cache access latency The L2 cache access latency for an architecture is directly proportional to inter-core Manhattan distance.  ... 
doi:10.1109/vlsid.2016.54 dblp:conf/vlsid/ShreedharD16 fatcat:aed5n6evv5bgvfzv67zpjkxce4

A Time-Predictable Memory Network-on-Chip

Martin Schoeberl, David Vh Chong, Wolfgang Puffitsch, Jens Sparsø, Marc Herbstritt
2014 Worst-Case Execution Time Analysis  
The memory network-on-chip is organized as a tree with time-division multiplexing (TDM) of accesses to the shared memory.  ...  The TDM based arbitration completely decouples processor cores and allows WCET analysis of the memory accesses on individual cores without considering the tasks on the other cores.  ...  Source Access The source of the described memory NoCs is open source under the simplified BSD licenses and available at GitHub within the Patmos project: patmos.  ... 
doi:10.4230/oasics.wcet.2014.53 dblp:conf/wcet/SchoeberlCPS14 fatcat:flub4mutdfby7cp4l54cqayxky

A Multicore Processor for Time-Critical Applications

Martin Schoeberl, Luca Pezzarossa, Jens Sparso
2018 IEEE design & test  
Time-critical applications need a processor and software where it is possible to prove that all critical tasks will complete in time.  ...  This paper presents such a radically different design of a multicore processor for future time-critical systems.  ...  Acknowledgment We would like to thank all T-CREST team members and students who helped to build this platform and for all the joy of the discussions during the project meetings and dinners: Sahar Abbaspour  ... 
doi:10.1109/mdat.2018.2791809 fatcat:rqywywqm2nd3bhiad6atqrumdy

Support for the logical execution time model on a time-predictable multicore processor

Florian Kluge, Martin Schoeberl, Theo Ungerer
2016 ACM SIGBED Review  
In this work, we extend a multicore operating system running on a time-predictable multicore processor to support the LET model.  ...  We report our experiences and present results on the costs in terms of memory and execution time.  ...  In contrast to the AEthereal family of NoCs our NoC implements TDM arbitration from end-to-end. I.e., access to the scratchpad memory (SPM) is scheduled with the NoC TDM schedule.  ... 
doi:10.1145/3015037.3015047 fatcat:65x7ewyknjello7mmd7khnlmgy

Reducing NoC and Memory Contention for Manycores [chapter]

Vishwanathan Chandru, Frank Mueller
2016 Lecture Notes in Computer Science  
Such systems provide increased processing power and system availability, but often impose latencies and contention for memory accesses as multiple cores try to reference data at the same time.  ...  Experiments show that targeted memory allocation results in reduced execution times and NoC contention, the latter of which has not been studied before at this scale.  ...  This work was funded in part by NSF grants 1239246 and 1058779 as well as a grant from AFOSR via Securboration.  ... 
doi:10.1007/978-3-319-30695-7_22 fatcat:ntdpzv2epnevtpdmysb5hb5dr4

One-way shared memory

Martin Schoeberl
2018 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)  
With the use of time-division multiplexing for the memory accesses and the network-on-chip routers we achieve a time-predictable solution where the communication latency and bandwidth can be bounded.  ...  A network-on-chip constantly copies data from a sender core-local memory to a receiver core-local memory.  ...  Acknowledgment The work presented in this paper was partially funded by the Danish Council for Independent Research | Technology and Production Sciences under the project PREDICT (http: //  ... 
doi:10.23919/date.2018.8342017 dblp:conf/date/Schoeberl18 fatcat:rwl2ywung5ggpplqq5vsxbnz7e


Martin Schoeberl, Luca Pezzarossa, Jens Sparsø
2019 Proceedings of the 12th International Workshop on Network on Chip Architectures - NoCArc  
Static scheduled traffic allows computing upper bounds for end-to-end latencies of messages, which is a requirement for building multicore real-time systems.  ...  Message passing using a network-on-chip (NoC) is an efficient way to provide core-to-core communication on a multicore processor.  ...  Hoplite is modified to prioritize deflections and perform traffic shaping at the network interface to provide guarantees on end-to-end latencies for packets.  ... 
doi:10.1145/3356045.3360714 dblp:conf/micro/SchoeberlPS19 fatcat:lbnmtfqco5glnfybfvmq4nxr4i

Message Passing on a Time-predictable Multicore Processor

Rasmus Bo Sorensen, Wolfgang Puffitsch, Martin Schoeberl, Jens Sparso
2015 2015 IEEE 18th International Symposium on Real-Time Distributed Computing  
We combine these WCET numbers with the calculation of the network latency of a message and then provide a statically computed end-to-end latency for this core-to-core message.  ...  For a multicore processor to be time-predictable, communication between processor cores needs to be time-predictable as well.  ...  ACKNOWLEDGMENTS The work presented in this paper was funded by the Danish Council for Independent Research | Technology and Production Sciences under the project RTEMP, contract no. 12-127600.  ... 
doi:10.1109/isorc.2015.15 dblp:conf/isorc/SorensenPSS15 fatcat:urvck7u3unaidluc4d6hp66goq

The Connection-Then-Credit Flow Control Protocol for Heterogeneous Multicore Systems-on-Chip

Nicola Concer, Luciano Bononi, Michael Soulie, Riccardo Locatelli, Luca P. Carloni
2010 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
Connection-then-credits (CTC) is a novel end-to-end flow control protocol to handle message-dependent deadlocks in best-effort networks-on-chip (NoC) for embedded multicore systems-on-chip (SoCs).  ...  Index Terms: End-to-end flow control, message-dependent deadlock, multicore systems-on-chip (SoC), network interface design, networks-on-chip (NoC).  ...  end-to-end flow control protocol to regulate the access to the NoC [23], [24].  ... 
doi:10.1109/tcad.2010.2048592 fatcat:kx6vrrp3qjcm5izrujsopqu6wa

End-to-end schedulability tests for multiprocessor embedded systems based on networks-on-chip with priority-preemptive arbitration

Leandro Soares Indrusiak
2014 Journal of systems architecture  
Simulation-based techniques can be used to evaluate whether a particular NoC-based platform configuration is able to meet the timing constraints of an application, but they can only evaluate a finite set  ...  This paper presents a particular NoC-based multiprocessor architecture, as well as a number of analytical methods that can be derived from that architecture, aiming to allow designers to check, for a given  ...  ACKNOWLEDGEMENTS The author would like to thank Zheng Shi, Alan Burns, Osmar Marchi dos Santos and Borislav Nikolic for the discussions on the tests presented in Section 4; and Paris Mesidis, Adrian Racu  ... 
doi:10.1016/j.sysarc.2014.05.002 fatcat:vrthcy3scvhxxfvwkus4stmfuy

Scratchpad Memories with Ownership

Martin Schoeberl, Torur Biskopsto Strom, Oktay Baris, Jens Sparso
2019 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)  
The base architecture uses time-division multiplexing for the arbitration of the access to the shared SPM.  ...  Having exclusive access to the SPM reduces the access time to a single clock cycle.  ...  Acknowledgment The work presented in this paper was partially funded by the Danish Council for Independent Research | Technology and Production Sciences under the project PREDICT (no. 4184-00127A) and by  ... 
doi:10.23919/date.2019.8714926 dblp:conf/date/SchoeberlSBS19 fatcat:yrmpvhq4ezhglii4sfrm3yoqha

Scalable, accurate multicore simulation in the 1000-core era

Mieszko Lis, Pengju Ren, Myong Hyon Cho, Keun Sup Shim, Christopher W. Fletcher, Omer Khan, Srinivas Devadas
HORNET can run in network-only mode using synthetic traffic or traces, directly emulate a MIPS-based multicore, or function as the memory subsystem for native applications executed under the Pin instrumentation  ...  We present HORNET, a parallel, highly configurable, cycle-level multicore simulator based on an ingress-queued wormhole router NoC architecture.  ...  frugal communication; the plentiful bandwidth and relatively short latencies available in NoC-based multicores make this kind of optimization less critical today.  ... 
doi:10.1109/ispass.2011.5762734 dblp:conf/ispass/LisRCSFKD11 fatcat:337x5tplaraezctrx7jtlzhqtq

Command-Triggered Microcode Execution for Distributed Shared Memory Based Multi-Core Network-on-Chips

Xiaowen Chen
2015 Journal of Software  
Memory accesses are handled by the programmable coprocessor to hit the right memory banks in local or remote nodes.  ...  As microcode examples, basic DSM functions (Virtual-to-Physical address translation, shared memory access and synchronization) are implemented by following the proposed hardware/software co-design flow  ...  It plots the average read transaction latency for uniform and hotspot traffic versus burst length in an 8×8 mesh multicore NoC. The burst length varies from 1, 2, 4, 6 to 8 words.  ... 
doi:10.17706/jsw.10.2.142-161 fatcat:jovscxriofb5hfi7brsnphkiqe