QoS for high-performance SMT processors in embedded systems

F.J. Cazorla, A. Ramirez, M. Valero, P.M.W. Knijnenburg, R. Sakellariou, E. Fernandez
2004 IEEE Micro  
Embedded systems have specific constraints and characteristics, such as real-time constraints, low-power requirements, and severe cost limitations, that differentiate them from general-purpose systems. Processors for embedded systems are typically simple, with short pipelines and in-order execution. When they are used for real-time applications, they also lack unpredictable components, such as caches and branch predictors. These bare processors provide predictable performance, and hence they can guarantee worst-case execution times of real-time applications. However, embedded systems must host increasingly complex applications and have increasingly higher data throughput rates. To meet these growing demands, future embedded processors will resemble current high-performance processors. For example, the new Philips TriMedia already has a deep pipeline, L1 and L2 caches, and branch prediction. 1 But because of their unpredictable components, such processors have unpredictable execution times, so they are difficult to use in real-time applications. Because embedded processors must be low in cost, obtaining as much performance as possible from each resource is desirable. Hence, a viable option is a simultaneous multithreading (SMT) processor, which shares many resources between several threads for a good cost-performance tradeoff. 2 An SMT design adapts a superscalar processor's front end to fetch from several threads, while the back end is shared. An instruction fetch policy decides from which threads to fetch instructions, thereby implicitly determining how internal processor resources, such as rename registers or instruction queue (IQ) entries, are allocated to threads. SMT processors have high throughput but, because of uncontrolled interference between threads, poor performance predictability, even worse than that of superscalar processors running only one thread. This poses problems for the suitability of high-performance SMT processors in real-time systems. Other resource-sharing approaches include multiprocessors, which share only the higher
doi:10.1109/mm.2004.37 fatcat:pt23h6klhvd3lffgxnshrt7rse