Editor: Marko Bertogna; Article No. 25

Tao Qian, Frank Mueller, Yufeng Xin
2017 22 Leibniz International Proceedings in Informatics Schloss Dagstuhl-Leibniz-Zentrum für Informatik   unpublished
In a distributed computing environment, guaranteeing hard deadlines for real-time messages is essential to ensure the schedulability of real-time tasks. Since the capabilities of the shared transmission resources are limited, e.g., the buffer size on network devices is limited, it becomes a challenge to design an effective and feasible resource sharing policy based on both the demand of real-time packet transmissions and the limits of resource capabilities. We address this challenge with two cooperative mechanisms. First, we design a static routing algorithm to find forwarding paths for packets that guarantee their hard deadlines. The routing algorithm employs a validation-based backtracking procedure that derives the demand of a set of real-time packets on each shared network device and checks whether this demand can be met on the device. Second, we design a packet scheduler that runs on network devices to transmit messages according to our routing requirements. We implement these mechanisms on virtual software-defined network (SDN) switches and evaluate them on real hardware in a local cluster to demonstrate the feasibility and effectiveness of our routing algorithm and packet scheduler.

1998 ACM Subject Classification C.3 Real-Time and Embedded Systems

1 Introduction

In a distributed computing environment, multiple compute nodes share communication resources to transmit data in order to collaborate with each other. In such systems, employing an effective resource sharing mechanism is essential to meet the real-time requirements of tasks. These mechanisms can be divided into two categories. In the first category, the compute nodes control how they themselves utilize shared resources. Past research has studied mechanisms that shape the resource access pattern to increase timing predictability, e.g., memory sharing based on limiting the memory bandwidth for different cores [31, 32] and network bandwidth limitation on compute nodes connected via Ethernet [23]. These mechanisms are passive since they can only reduce the probability of resource contention instead of preventing contention in the first place. Thus, passive mechanisms usually guarantee only probabilistic deadlines. The second category includes active resource sharing mechanisms, which either assign the resource

* This work was supported in part by NSF grants 1525609, 1329780, 1239246, 0905181 and 0958311.
† Aranya Chakrabortty helped to scope the problem in discussions.