Addressing End-to-End Memory Access Latency in NoC-Based Multicores
2012
2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
[Figure: NoC-based multicore with memory controllers MC0 to MC3; a request message and a response message traverse the segments L1-L2, L2-Mem, Mem, Mem-L2, and L2-L1, which together make up the end-to-end memory latency. Plot: contribution of the NoC to end-to-end memory access latency.] ...
doi:10.1109/micro.2012.35
dblp:conf/micro/SharifiKKD12
fatcat:3sbvyhlxzvhipppibxrdsu5lva
Models of Communication for Multicore Processors
2015
2015 IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops
To efficiently use multicore processors we need to ensure that almost all data communication stays on chip, i.e., the bits moved between tasks executing on different processor cores do not leave the chip ...
In this paper we explore the different hardware mechanisms for on-chip communication and how they support or favor different models of communication. ...
ACKNOWLEDGMENT The work presented in this paper was partially funded by the Danish Council for Independent Research | Technology and Production Sciences under the project RTEMP, contract no. 12-127600 ...
doi:10.1109/isorcw.2015.57
dblp:conf/isorc/SchoeberlSS15
fatcat:2xfammjbbnb5vkdm6zfd5wxefy
Hierarchical Cluster Based NoC Design Using Wireless Interconnects for Coherence Support
2016
2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID)
In this paper, we propose a novel multicore hierarchical cluster based architecture that utilizes a hybrid NoC having both wired and wireless interconnects. ...
To the best of our knowledge, our work is the first to utilize wireless interconnects to solve the cache coherence problem in multicore system on NoC. ...
L2 cache access latency The L2 cache access latency for an architecture is directly proportional to inter-core Manhattan distance. ...
doi:10.1109/vlsid.2016.54
dblp:conf/vlsid/ShreedharD16
fatcat:aed5n6evv5bgvfzv67zpjkxce4
A Time-Predictable Memory Network-on-Chip
2014
Worst-Case Execution Time Analysis
The memory network-on-chip is organized as a tree with time-division multiplexing (TDM) of accesses to the shared memory. ...
The TDM based arbitration completely decouples processor cores and allows WCET analysis of the memory accesses on individual cores without considering the tasks on the other cores. ...
Source Access The source of the described memory NoCs is open source under the simplified BSD license and available at GitHub within the Patmos project: https://github.com/t-crest/patmos. ...
doi:10.4230/oasics.wcet.2014.53
dblp:conf/wcet/SchoeberlCPS14
fatcat:flub4mutdfby7cp4l54cqayxky
A Multicore Processor for Time-Critical Applications
2018
IEEE Design & Test
Time-critical applications need a processor and software where it is possible to prove that all critical tasks will complete in time. ...
This paper presents such a radically different design of a multicore processor for future time-critical systems. ...
Acknowledgment We would like to thank all T-CREST team members and students who helped to build this platform and for all the joy of the discussions during the project meetings and dinners: Sahar Abbaspour ...
doi:10.1109/mdat.2018.2791809
fatcat:rqywywqm2nd3bhiad6atqrumdy
Support for the logical execution time model on a time-predictable multicore processor
2016
ACM SIGBED Review
In this work, we extend a multicore operating system running on a time-predictable multicore processor to support the LET model. ...
We report our experiences and present results on the costs in terms of memory and execution time. ...
In contrast to the AEthereal family of NoCs, our NoC implements TDM arbitration end to end, i.e., access to the scratchpad memory (SPM) is scheduled with the NoC TDM schedule. ...
doi:10.1145/3015037.3015047
fatcat:65x7ewyknjello7mmd7khnlmgy
Reducing NoC and Memory Contention for Manycores
[chapter]
2016
Lecture Notes in Computer Science
Such systems provide increased processing power and system availability, but often impose latencies and contention for memory accesses as multiple cores try to reference data at the same time. ...
Experiments show that targeted memory allocation results in reduced execution times and NoC contention, the latter of which has not been studied before at this scale. ...
This work was funded in part by NSF grants 1239246 and 1058779 as well as a grant from AFOSR via Securboration. ...
doi:10.1007/978-3-319-30695-7_22
fatcat:ntdpzv2epnevtpdmysb5hb5dr4
One-way shared memory
2018
2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)
With the use of time-division multiplexing for the memory accesses and the network-on-chip routers we achieve a time-predictable solution where the communication latency and bandwidth can be bounded. ...
A network-on-chip constantly copies data from a sender core-local memory to a receiver core-local memory. ...
Acknowledgment The work presented in this paper was partially funded by the Danish Council for Independent Research | Technology and Production Sciences under the project PREDICT (http://predict.compute.dtu.dk ...
doi:10.23919/date.2018.8342017
dblp:conf/date/Schoeberl18
fatcat:rwl2ywung5ggpplqq5vsxbnz7e
Static scheduled traffic allows computing upper bounds for end-to-end latencies of messages, which is a requirement for building multicore real-time systems. ...
Message passing using a network-on-chip (NoC) is an efficient way to provide core-to-core communication on a multicore processor. ...
Hoplite is modified to prioritize deflections and perform traffic shaping at the network interface to provide guarantees on end-to-end latencies for packets. ...
doi:10.1145/3356045.3360714
dblp:conf/micro/SchoeberlPS19
fatcat:lbnmtfqco5glnfybfvmq4nxr4i
Message Passing on a Time-predictable Multicore Processor
2015
2015 IEEE 18th International Symposium on Real-Time Distributed Computing
We combine these WCET numbers with the calculation of the network latency of a message and then provide a statically computed end-to-end latency for this core-to-core message. ...
For a multicore processor to be time-predictable, communication between processor cores needs to be time-predictable as well. ...
ACKNOWLEDGMENTS The work presented in this paper was funded by the Danish Council for Independent Research | Technology and Production Sciences under the project RTEMP, contract no. 12-127600. ...
doi:10.1109/isorc.2015.15
dblp:conf/isorc/SorensenPSS15
fatcat:urvck7u3unaidluc4d6hp66goq
The Connection-Then-Credit Flow Control Protocol for Heterogeneous Multicore Systems-on-Chip
2010
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Connection-then-credits (CTC) is a novel end-to-end flow control protocol to handle message-dependent deadlocks in best-effort networks-on-chip (NoC) for embedded multicore systems-on-chip (SoCs). ...
Index Terms: End-to-end flow control, message-dependent deadlock, multicore systems-on-chip (SoC), network interface design, networks-on-chip (NoC). ...
end-to-end flow control protocol to regulate the access to the NoC [23], [24]. ...
doi:10.1109/tcad.2010.2048592
fatcat:kx6vrrp3qjcm5izrujsopqu6wa
End-to-end schedulability tests for multiprocessor embedded systems based on networks-on-chip with priority-preemptive arbitration
2014
Journal of Systems Architecture
Simulation-based techniques can be used to evaluate whether a particular NoC-based platform configuration is able to meet the timing constraints of an application, but they can only evaluate a finite set ...
This paper presents a particular NoC-based multiprocessor architecture, as well as a number of analytical methods that can be derived from that architecture, aiming to allow designers to check, for a given ...
ACKNOWLEDGEMENTS The author would like to thank Zheng Shi, Alan Burns, Osmar Marchi dos Santos and Borislav Nikolic for the discussions on the tests presented in Section 4; and Paris Mesidis, Adrian Racu ...
doi:10.1016/j.sysarc.2014.05.002
fatcat:vrthcy3scvhxxfvwkus4stmfuy
Scratchpad Memories with Ownership
2019
2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)
The base architecture uses time-division multiplexing for the arbitration of the access to the shared SPM. ...
Having exclusive access to the SPM reduces the access time to a single clock cycle. ...
Acknowledgment The work presented in this paper was partially funded by the Danish Council for Independent Research | Technology and Production Sciences under the project PREDICT (no. 4184-00127A) and by ...
doi:10.23919/date.2019.8714926
dblp:conf/date/SchoeberlSBS19
fatcat:yrmpvhq4ezhglii4sfrm3yoqha
Scalable, accurate multicore simulation in the 1000-core era
2011
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
HORNET can run in network-only mode using synthetic traffic or traces, directly emulate a MIPS-based multicore, or function as the memory subsystem for native applications executed under the Pin instrumentation ...
We present HORNET, a parallel, highly configurable, cycle-level multicore simulator based on an ingress-queued wormhole router NoC architecture. ...
frugal communication; the plentiful bandwidth and relatively short latencies available in NoC-based multicores make this kind of optimization less critical today. ...
doi:10.1109/ispass.2011.5762734
dblp:conf/ispass/LisRCSFKD11
fatcat:337x5tplaraezctrx7jtlzhqtq
Command-Triggered Microcode Execution for Distributed Shared Memory Based Multi-Core Network-on-Chips
2015
Journal of Software
Memory accesses are handled by the programmable coprocessor to hit the right memory banks in local or remote nodes. ...
As microcode examples, basic DSM functions (Virtual-to-Physical address translation, shared memory access and synchronization) are implemented by following the proposed hardware/software co-design flow ...
It plots the average read transaction latency for uniform and hotspot traffic versus burst length in an 8×8 mesh multicore NoC. The burst length varies from 1, 2, 4, 6 to 8 words. ...
doi:10.17706/jsw.10.2.142-161
fatcat:jovscxriofb5hfi7brsnphkiqe