Scalable, distributed cycle-breaking algorithms for gigabit Ethernet backbones

Francesco De Pellegrini, David Starobinski, Mark G. Karpovsky, Lev B. Levitin
2006 Journal of Optical Networking  
Ethernet networks rely on the so-called spanning tree protocol (IEEE 802.1d) in order to break cycles, thereby avoiding the possibility of infinitely circulating packets and deadlocks. This protocol imposes a severe penalty on the performance and scalability of large Gigabit Ethernet backbones, since it makes inefficient use of fibers and may lead to bottlenecks. In this paper, we propose a significantly more scalable cycle-breaking approach, based on the novel theory of turn-prohibition.
Specifically, we introduce, analyze and evaluate a new algorithm, called Tree-Based Turn-Prohibition (TBTP). We show that this polynomial-time algorithm maintains backward-compatibility with the IEEE 802.1d standard and never prohibits more than 1/2 of the turns in the network, for any given graph and any given spanning tree. We further introduce a distributed version of the algorithm that nodes in the network can run asynchronously. Through extensive simulations on a variety of graph topologies, we show that the TBTP algorithm can lead to an order of magnitude improvement over the spanning tree protocol with respect to throughput and end-to-end delay metrics. In addition, we propose and evaluate heuristics to determine the replacement order of legacy switches that results in the fastest performance improvement.

© 2005 Optical Society of America. OCIS codes: 060.4250, 060.4510.

Gigabit Ethernet has the same plug-and-play functionalities as its Ethernet (10 Mb/s) and Fast Ethernet (100 Mb/s) precursors, requiring minimal manual intervention for connecting hosts to the network. In addition, Gigabit Ethernet relies on full-duplex technologies and on a flow control (backpressure) mechanism that significantly reduce the amount of congestion and packet loss in the network [1, 4]. More specifically, the flow control mechanism (IEEE 802.3x) prevents switches from losing packets due to buffer overflow. This protocol makes use of Pause messages, whereby a congested receiver can ask the transmitter to suspend (pause) its transmissions. Each Pause message includes a timer value that specifies how long the transmitter needs to remain quiet.

Currently, the network topology for Gigabit Ethernet follows the traditional rules of Ethernet. The spanning tree protocol (IEEE 802.1d) is used to avoid the occurrence of any cycle in the network, thus pruning the network into a tree topology [5]. The reasons for breaking cycles are twofold.
The first is to prevent broadcast packets (or packets with unknown destination) from circulating forever in the network. Unlike IP packets, Ethernet packets do not have a Time-to-Live (TTL) field. Moreover, Ethernet switches must be transparent, which means that they are not allowed to modify headers of Ethernet packets. The second reason is to prevent the occurrence of deadlocks as a result of the IEEE 802.3x flow control mechanism [6]. Such deadlocks may occur when Pause messages are sent from one switch to another along a circular path, leading to a situation where no switch is allowed to transmit. The use of a spanning tree precludes this problem, since deadlocks cannot arise in an acyclic network [7].

The spanning tree protocol works well in LANs, which are often organized hierarchically and under-utilized [8]. However, it imposes a severe penalty on the performance and scalability of large Gigabit Ethernet backbones, since a spanning tree allows the use of only one cycle-free path in the entire network. As pointed out by the Metro Ethernet Forum, an industry-wide initiative promoting the use of optical Ethernet in metropolitan area networks, this leads to inefficient utilization of expensive fiber links and may result in uneven load distribution and bottlenecks, especially close to the root [9, 10].

One of the current approaches to address this issue is to overlay the physical network with logical networks, referred to as virtual LANs [5]. A spanning tree instance is then run separately for each virtual LAN (or group of virtual LANs). This approach of maintaining multiple spanning trees can add significant complexity to network management and be very CPU-intensive [9]. Other approaches based on multiple spanning trees are described in [11, 12].

In this paper, we propose a significantly more scalable approach, based on the novel theory of turn-prohibition [13, 14], in order to solve the cycle-breaking problem in Gigabit Ethernet backbones.
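To make the cost of pruning a network into a tree concrete, the following minimal Python sketch counts how many links a spanning tree leaves idle. The 4 x 4 grid topology, the helper names `grid_links` and `bfs_spanning_tree`, and the use of breadth-first search as a stand-in for the IEEE 802.1d root election and path-cost rules are all illustrative assumptions and are not taken from the paper.

```python
from collections import deque


def grid_links(rows, cols):
    """Return the undirected links of a rows x cols grid as (low, high) node pairs."""
    links = set()
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:          # link to the right-hand neighbor
                links.add((node, node + 1))
            if r + 1 < rows:          # link to the neighbor below
                links.add((node, node + cols))
    return links


def bfs_spanning_tree(links, root):
    """Build a spanning tree by breadth-first search (a stand-in for 802.1d)."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    visited = {root}
    tree = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in visited:
                visited.add(neighbor)
                tree.add((min(node, neighbor), max(node, neighbor)))
                queue.append(neighbor)
    return tree


links = grid_links(4, 4)
tree = bfs_spanning_tree(links, root=0)
idle = links - tree  # links that carry no traffic once cycles are broken

print(f"{len(links)} links, {len(tree)} on the tree, {len(idle)} idle")
# prints "24 links, 15 on the tree, 9 idle"
```

Any spanning tree of a connected graph with n nodes keeps exactly n - 1 links, so the idle count in this sketch does not depend on which tree is chosen; only the location of the idle links changes.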
Turn-prohibition is much less restrictive than link-prohibition, the approach employed to construct a spanning tree. The main idea is to consider pairs of links around nodes, referred to as turns [15], and show that all the cycles in a network can be broken through the prohibition of carefully selected turns in the network. A turn (a, b, c) around node b is prohibited if no packet can be forwarded from link (a, b) to link (b, c). One of the main challenges in making use of the turn-prohibition approach is to maintain backward-compatibility with the IEEE 802.1d standard. Our main contribution in this paper is to propose and analyze a novel algorithm, called Tree-Based Turn-Prohibition (TBTP), that addresses this issue. This algorithm receives a graph along with a spanning tree as its input and generates a set of prohibited turns, as
doi:10.1364/jon.5.000122 fatcat:y7isgbwl6rbnragufsl36uzkwq