Layer 2 Switching and Spanning Tree Protocol (STP)


As the article “Datalink networking concepts” explained, you can use bridges and switches to segment a LAN into smaller collision domains. This article looks in close detail at the operations of bridges and switches. Specifically, you will investigate the transparent functionality that occurs when a switch is building and utilizing its frame forwarding logic, as well as the peculiar nature of Spanning Tree Protocol (STP) in redundant switched networks.

Switching Functionality

Switches forward frames based on Layer 2 Ethernet MAC addresses. These devices receive Ethernet frames transmitted from other devices and dynamically build a MAC address table based on the source MAC address inside those frames. This MAC address table is commonly referred to as a Content Addressable Memory (CAM) table.

These dynamic entries in the CAM table are not permanent, however. After the switch or bridge stops receiving frames from a certain MAC address for a set aging time (this varies, but it’s typically five minutes), the entry is removed from the CAM table to save memory and processor resources. The exceptions to this are static MAC entries that have been manually configured on a port-by-port basis for security and control purposes.

When deciding which port to forward an Ethernet frame out of, a switch consults this CAM table and forwards the frame based on the destination MAC address in the Ethernet header. In instances where the destination MAC address is not in the table, the switch copies and forwards the frame out every port except the one on which it was received. This action is commonly known as flooding.

Recall that switches segment LANs into collision domains; however, all of their ports still belong to a single broadcast domain. Switches do not have entries for the broadcast address (FFFF.FFFF.FFFF) or multicast addresses (0100.5E00.0000-0100.5E7F.FFFF) in their CAM tables. As previously mentioned, when a bridge or a switch receives a frame with a destination MAC address not in its table, it floods that frame out every port except the one on which it arrived.

For instance, consider the switched topology example illustrated in the picture below. When computers A, B, and C and printer D originally sent an Ethernet frame, the switch recorded the source MAC address of that frame and the associated port in its CAM table. If computer A sends an Ethernet frame destined for printer D’s MAC address of 1111.2222.3333, the switch forwards that frame out only its FastEthernet 0/14 interface. If computer A sends a broadcast with a destination of FFFF.FFFF.FFFF, that entry does not exist in the CAM table, so the frame is flooded out all interfaces except for FastEthernet 0/1, the interface on which it arrived.

[Figure: switched topology with computers A, B, and C and printer D]

Notice that computers B and C are plugged into a hub. So what happens when computer B sends an Ethernet frame to computer C? The frame hits the Layer 1 hub, which regenerates the signal out all ports except the one it came in on (regardless of the MAC address, because it is a Physical layer device). When the frame reaches the switch, the switch realizes that the source and destination MAC addresses reside on the same interface, so it does not send that frame on to any other ports. This process is commonly known as filtering.
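The learning, forwarding, flooding, and filtering behavior described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name, the port name Fa0/5 for the hub, and the MAC addresses of computers A, B, and C are hypothetical, and the five-minute aging time is simply the typical value quoted earlier.

```python
AGING_SECONDS = 300  # typical five-minute aging time quoted in the text


class Switch:
    """Toy model of a transparent switch; not a real device API."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.cam = {}  # MAC address -> (port, time the address was last seen)

    def receive(self, in_port, src_mac, dst_mac, now=0.0):
        """Return the list of ports the frame is sent out of."""
        # Learn: record the source MAC against the arrival port.
        self.cam[src_mac] = (in_port, now)
        # Age out dynamic entries that have been silent for too long.
        self.cam = {mac: (port, seen) for mac, (port, seen) in self.cam.items()
                    if now - seen < AGING_SECONDS}
        entry = self.cam.get(dst_mac)
        if entry is None:
            # Unknown unicast, broadcast, or multicast: flood.
            return [p for p in self.ports if p != in_port]
        if entry[0] == in_port:
            # Destination lives on the arrival port: filter.
            return []
        return [entry[0]]


sw = Switch(["Fa0/1", "Fa0/5", "Fa0/14"])
sw.receive("Fa0/14", "1111.2222.3333", "ffff.ffff.ffff")        # printer D is learned
print(sw.receive("Fa0/1", "aaaa.aaaa.aaaa", "1111.2222.3333"))  # ['Fa0/14']
print(sw.receive("Fa0/1", "aaaa.aaaa.aaaa", "ffff.ffff.ffff"))  # ['Fa0/5', 'Fa0/14']
sw.receive("Fa0/5", "bbbb.bbbb.bbbb", "ffff.ffff.ffff")         # computer B is learned
print(sw.receive("Fa0/5", "cccc.cccc.cccc", "bbbb.bbbb.bbbb"))  # [] (filtered)
```

The broadcast address never appears as a source address, so it is never learned, which is why broadcasts always take the flooding branch.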

Frame Transmission Methods

Switches are often classified based on the method in which they process and forward frames in and out of their interfaces. This classification differs depending on the device’s processing capabilities and the manufacturer. The three transmission methods that a bridge or switch may use are discussed in the following section.

Store and forward

Aptly named, the store-and-forward method of frame transmission involves the switch buffering (storing temporarily in a small memory location) the entire Ethernet frame and performing a cyclic redundancy check (CRC) on that frame to make sure it is not corrupted (damaged or abnormally changed during transmission). If the CRC calculation detects a bad frame, the frame is dropped at that point. Thus, the frame is forwarded only if the CRC calculation indicates a normal frame.

Because the entire frame is checked, store-and-forward switching is said to be latency (delay) varying. In other words, depending on the payload inside the frame, the switch takes varying processing times to buffer the entire frame and perform the CRC before sending it to its destination. Although this method sounds like a lengthy process, this is the most widely used method of switching in Cisco Catalyst switches because the hardware processors for the interfaces are so advanced and robust that the switch hardly works up a sweat.
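As a rough illustration of the store-and-forward decision, the sketch below recomputes a CRC-32 over a fully buffered frame and forwards it only on a match. It assumes the frame check sequence is the last four bytes of the frame in little-endian order and uses Python's `zlib.crc32`, which implements the same CRC-32 polynomial as Ethernet; the function names and the sample frame contents are hypothetical.

```python
import struct
import zlib


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Append a CRC-32 frame check sequence, as a sender would."""
    return frame_without_fcs + struct.pack("<I", zlib.crc32(frame_without_fcs))


def store_and_forward(frame: bytes) -> bool:
    """Buffer the whole frame, recompute the CRC, and report whether to forward it."""
    body, received_fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == received_fcs


# Destination MAC, source MAC, EtherType, then a made-up payload.
good = append_fcs(b"\x11\x11\x22\x22\x33\x33" + b"\xaa" * 6 + b"\x08\x00" + b"payload")
bad = bytes([good[0] ^ 0x01]) + good[1:]  # flip a single bit "in transit"

print(store_and_forward(good))  # True: forwarded
print(store_and_forward(bad))   # False: dropped
```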

Cut-through

Cut-through transmissions are practically the antithesis of store-and-forward frame transmission. In fact, instead of processing the entire frame, cut-through switching entails the switch buffering just enough information to know where to forward the frame before sending it on to another segment. In other words, it looks only up to the destination MAC address in the Ethernet header and sends it on regardless of whether the frame contains errors.

This hot-potato method of frame transmission was once appealing for devices with low processing power. Because the switch has to inspect only the beginning of the Ethernet frame header, latency is minimal with this method. The downside of cut-through switching, however, is that it still passes bad frames on to other segments because it does not perform CRC calculations of any kind.

Fragment-free

In true Goldilocks fashion, if cut-through is too hot and store-and-forward is too cold, fragment-free may be just right for you. Fragment-free is a hybrid of the two transmission methods because it buffers up to the first 64 bytes of the frame (collisions normally occur within the first 64 bytes). This is not as fast as cut-through; nevertheless, it ensures that many of the invalid frames in the LAN are not transmitted on to other segments. The following picture illustrates how much of an Ethernet frame is buffered and processed with each of the three transmission methods discussed.

[Figure: portion of an Ethernet frame buffered by each of the three transmission methods]
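To put rough numbers on the comparison, the following sketch computes how many bytes each method buffers before it can start forwarding and how long that takes on a given link. It assumes the byte count starts at the destination MAC address (the first 6 bytes of the frame, ignoring the preamble) and it ignores lookup and queuing time; the function names are hypothetical.

```python
def bytes_buffered(method: str, frame_len: int) -> int:
    """Bytes a switch must receive before it can begin forwarding."""
    if method == "cut-through":
        return min(6, frame_len)    # only the destination MAC address
    if method == "fragment-free":
        return min(64, frame_len)   # the first 64 bytes
    if method == "store-and-forward":
        return frame_len            # the entire frame, so the CRC can be checked
    raise ValueError(f"unknown method: {method}")


def buffering_delay_us(method: str, frame_len: int, link_bps: int) -> float:
    """Time in microseconds spent receiving the buffered portion of the frame."""
    return bytes_buffered(method, frame_len) * 8 / link_bps * 1e6


for method in ("cut-through", "fragment-free", "store-and-forward"):
    delay = buffering_delay_us(method, 1518, 100_000_000)
    print(f"{method}: {delay:.2f} microseconds")
```

For a 1518-byte frame on a 100 Mbps link, only the store-and-forward figure depends on the frame length, which is why that method is described as latency varying.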

Half and Full-Duplex Connection

Data communication on switch ports can occur in either half-duplex or full-duplex mode. Half-duplex connections are unidirectional in that data can be sent in only one direction at a time. This is similar to two-way radios or walkie-talkies, in which only one person can speak at one time. With half-duplex communication in an Ethernet network, CSMA/CD (carrier sense multiple access with collision detection) is enabled, which results in only 50 to 60% of the bandwidth on the link being available for use.

Full duplex, on the other hand, is indicative of two-way communication in which devices can send and receive information at the same time. With these connections, CSMA/CD is automatically disabled, allowing for theoretically 100% of the bandwidth in both directions. In fact, it uses the two wires that typically are used for detecting collisions to simultaneously transmit and receive. Because CSMA/CD is disabled, the connection has to be in an environment where collisions cannot occur. In other words, the device must be connected to a switch or directly connected to another device with a crossover cable.

Switching design

You have already seen how switches operate when connected to end-user devices such as PCs, printers, and servers. However, when switches are connected to other switches to form a redundant network, a switching loop can occur. The following picture illustrates a scenario in which a switching loop can occur.

[Figure: redundant switched topology in which a switching loop can occur]

In this design, redundant links interconnect the switches. Although it is a good idea to have redundancy in the network, a problem arises when a computer sends out a frame with a broadcast, multicast, or unknown unicast destination MAC address. Recall that any of these three transmissions causes a switch to copy and flood that frame out all ports except for the one on which it came in. So if computer A sends a broadcast, Switch A floods it out to Switches B and D. Again, because this is a broadcast message, Switches B and D flood that frame out to Switch C. Staying true to its design, Switch C floods the frame back to Switches B and D, and so on. Broadcasts continuously circle the switched network until they ultimately consume all the available bandwidth and all traffic ceases to flow. This unsettling scenario is called a broadcast storm, and it can be avoided completely by using a Layer 2 protocol sent among switches called the Spanning Tree Protocol.
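A small simulation makes the loop visible. The sketch below assumes the four switches form a ring (A to B, B to C, C to D, D to A) and tracks how many copies of one broadcast are still in flight between switches after each round of flooding; the function name and the data layout are hypothetical.

```python
RING = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
# The same network with the link between C and D taken out of service.
TREE = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B"], "D": ["A"]}


def flood_rounds(links, origin, rounds):
    """Count the copies of one broadcast in flight after each flooding round."""
    # Each copy is tracked as (switch that receives it, switch it came from).
    in_flight = [(nbr, origin) for nbr in links[origin]]
    history = [len(in_flight)]
    for _ in range(rounds):
        # Every switch floods out all inter-switch links except the arrival link.
        in_flight = [(nbr, sw) for sw, came_from in in_flight
                     for nbr in links[sw] if nbr != came_from]
        history.append(len(in_flight))
    return history


print(flood_rounds(RING, "A", 5))  # [2, 2, 2, 2, 2, 2]: the copies never die out
print(flood_rounds(TREE, "A", 5))  # [2, 1, 0, 0, 0, 0]: the broadcast drains away
```

In the ring, one copy circulates in each direction indefinitely, and every pass is also delivered to the attached hosts. Removing a single link from the forwarding topology is enough to let the broadcast die out.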

Spanning Tree Protocol

Once a proprietary protocol from DEC, Spanning Tree Protocol (STP) was standardized by the IEEE in the 802.1D specification. STP allows networks to maintain a level of redundancy while disabling the detrimental side effects that can occur, such as broadcast storms. Enabled by default on most switches, STP forms noncircular (no looping) paths throughout the internetwork by performing an election and basing calculations on the election. These calculations dictate which ports should remain in a non-forwarding (known as blocking) state to eliminate redundant loops that can cause broadcast storms. STP also can react to changes in the switched network to ensure that the redundant links may be used in the event of a topology change such as a link going down. The following sections explain exactly how this remarkable protocol operates behind the scenes in a LAN.

Root Bridge

As previously mentioned, STP performs an election in the switched topology. The winner of this election serves as the base of all calculations and ultimately becomes the root of the spanning tree. Conveniently, this elected bridge or switch is called the root bridge. From the root bridge, non-circular branches extend throughout the switched network like those of a tree: a spanning tree. So how does this election take place? You can rule out voting because each bridge or switch believes itself to be the root bridge at startup. The deciding factor on who becomes the root bridge is something referred to as the Bridge ID. The Bridge ID comprises two components:

  • Priority: This is an arbitrary number from 0 to 61440, which can be administratively set in increments of 4096. The default value for priority is 32768, or 8000 in hex.
  • MAC address: The 48-bit MAC address of the switch itself.

The device with the lowest Bridge ID becomes the root bridge. If a new switch or bridge is added with a lower Bridge ID to the switched network, a new election takes place, and that switch ultimately becomes the new root bridge for the switched network.
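Because the Bridge ID is compared priority first and MAC address second, the election maps naturally onto tuple comparison. The sketch below is a minimal illustration; the switch names and MAC addresses are hypothetical, and the function names are not part of any real switch API.

```python
def bridge_id(priority: int, mac: str) -> tuple:
    """Build a comparable Bridge ID from a priority and a dotted-hex MAC address."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 from 0 to 61440")
    return (priority, int(mac.replace(".", ""), 16))


def elect_root(switches: dict) -> str:
    """Return the name of the switch with the lowest Bridge ID."""
    return min(switches, key=lambda name: bridge_id(*switches[name]))


# Hypothetical switches, all left at the default priority of 32768.
switches = {
    "A": (32768, "0000.0c11.1111"),
    "B": (32768, "0000.0c22.2222"),
    "C": (32768, "0000.0c33.3333"),
    "D": (32768, "0000.0c44.4444"),
}
print(elect_root(switches))  # A: the lowest MAC address breaks the tie

# Lowering the priority on D overrides the MAC address comparison.
switches["D"] = (4096, "0000.0c44.4444")
print(elect_root(switches))  # D
```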

Consider the example in the picture below. Notice that all switches have their default priority value of 32768 in their Bridge ID. Thus, the lowest MAC address ultimately dictates who will win the election. Because switch A has the lowest MAC address in the switched network, it will be the root bridge.

[Figure: root bridge election among switches with the default priority]

Because this election process occurs automatically with bridges and switches, it is highly advised that you lower the priority on a robust and reliable switch in your internetwork so that it wins the election, as opposed to letting this election occur by chance. This is especially true because manufacturers choose the MAC address, and a lower MAC address could very well mean an old or low-end switch or bridge, which might not be the best choice for your root bridge. How to manually set the priority is discussed in further articles.

These Bridge IDs are advertised between switches through Bridge Protocol Data Units (BPDUs). These messages are sent as multicasts every two seconds by the root bridge out its interfaces to other switches on adjoining segments, which, in turn, forward them on to other connected switches. These messages also contain the Bridge ID of the root bridge in every update that is sent. As long as a switch receives only BPDUs that contain a higher Bridge ID than its own, it remains the root bridge (because all devices assume they are the root at startup).

Root Ports

In addition to the local Bridge ID and the root Bridge ID, BPDUs contain information that helps switches perform calculations to decide which ports should be forwarding and which should be blocking to create a loop-free switched network. The key to this calculation lies within the cumulative cost back to the root bridge. Although it sounds as if these Cisco switches are keeping track of how much you paid for them, this is not what is meant by the term “cost”. The cost is inversely related to the bandwidth of each link segment. Because of this inverse relationship, the lower the cumulative cost back to the root bridge, the faster the path is. The following table lists the standard costs used today in switches. It is possible to change these values administratively if you want to control which link becomes the best path to the root bridge.

Interface | Cost
10 Gbps | 2
1 Gbps | 4
100 Mbps | 19
10 Mbps | 100

After the root bridge is determined, each non-root switch or bridge forms an association back to the root bridge based on the lowest cumulative path cost back to the root. Whichever interface has the fastest route to the root bridge automatically becomes a forwarding port called the root port.

The root bridge advertises a root path cost of 0 to Switches B and D. As the BPDU enters their interfaces, they add the cost value of that interface and advertise the result to any adjacent switches on other segments. Every non-root bridge determines its fastest path back to the root by looking at the BPDUs that it receives from other switches. For instance, Switch B knows that going out of the top segment back to the root has a cost of 4, and going through Switch C has a cost of 42. Because the top segment has the lowest cumulative cost, that becomes the root port for Switch B.
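The arithmetic behind those figures can be reproduced with a short sketch. The link speeds below are an assumption inferred from the costs quoted in the text (1 Gbps between A and B and between A and D, 100 Mbps between B and C and between C and D); the function names are hypothetical.

```python
STP_COST = {"10Gbps": 2, "1Gbps": 4, "100Mbps": 19, "10Mbps": 100}

# Assumed topology: inferred from the costs of 4, 23, and 42 quoted in the text.
LINKS = {("A", "B"): "1Gbps", ("A", "D"): "1Gbps",
         ("B", "C"): "100Mbps", ("C", "D"): "100Mbps"}


def neighbors(switch):
    """Yield (neighbor, link cost) for every link attached to a switch."""
    for (x, y), speed in LINKS.items():
        if x == switch:
            yield y, STP_COST[speed]
        elif y == switch:
            yield x, STP_COST[speed]


def path_costs(switch, root, seen=frozenset()):
    """Return the cumulative cost of every loop-free path from switch to root."""
    if switch == root:
        return [0]
    costs = []
    for nbr, cost in neighbors(switch):
        if nbr not in seen:
            costs += [cost + rest for rest in path_costs(nbr, root, seen | {switch})]
    return costs


print(sorted(path_costs("B", "A")))  # [4, 42]: the direct link wins
print(sorted(path_costs("C", "A")))  # [23, 23]: a tie, resolved by the tiebreakers
```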

What would happen if there were a tie in the root path cost? For instance, Switch C has two equal-cost paths of 23 back to the root bridge, one through Switch B and one through Switch D. In the event of a tie, the following criteria are evaluated in order to determine the root port:

  1. The port connected to the switch advertising the lowest Bridge ID is chosen.
  2. If the Bridge ID is the same (parallel links to the same switch), the lowest port priority is used. The port priority is an arbitrary number assigned to an interface that can be administratively set to choose one link over another. The default value is 128.
  3. If the port priority is also the same, the ultimate tiebreaker is the lowest interface number (for example, FastEthernet 0/1 over FastEthernet 0/6), because the links are otherwise identical.
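The tiebreakers above amount to comparing candidates field by field, which the sketch below expresses as tuple comparison in Python. It follows the order given in this article (cost, then Bridge ID, then port priority, then interface number); the interface names, MAC addresses, and the `Candidate` type are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    root_path_cost: int
    neighbor_bridge_id: tuple  # (priority, MAC address as an integer)
    port_priority: int
    interface_number: int
    interface: str


def pick_root_port(candidates):
    """Pick the root port: lowest cost first, then each tiebreaker in order."""
    return min(candidates, key=lambda c: c[:4]).interface


# Switch C: equal cost of 23 through Switch B and through Switch D.
via_b = Candidate(23, (32768, 0x00000C222222), 128, 1, "Fa0/1")
via_d = Candidate(23, (32768, 0x00000C444444), 128, 2, "Fa0/2")
print(pick_root_port([via_b, via_d]))  # Fa0/1: Switch B has the lower Bridge ID

# Parallel links to the same switch: the port priority decides, then the number.
same = (32768, 0x00000C222222)
print(pick_root_port([Candidate(19, same, 128, 1, "Fa0/1"),
                      Candidate(19, same, 64, 6, "Fa0/6")]))   # Fa0/6
print(pick_root_port([Candidate(19, same, 128, 1, "Fa0/1"),
                      Candidate(19, same, 128, 6, "Fa0/6")]))  # Fa0/1
```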

The following figure expands on the switched networking example to include the path costs.

[Figure: switched network example with path costs]

Designated Ports

After every switch has determined its root port, the switches and bridges determine which port is to become the designated port for each segment that connects two switches. As the name states, the designated port is the port on each interconnecting segment that is designated to forward traffic from that segment toward the root bridge. This too is determined through a calculation of the fastest way back to the root bridge. In the case of a tie, the same decision criteria described earlier for root ports apply to designated ports.

In the following figure, the designated ports have been calculated based on which switch on each segment is advertising the lowest cumulative cost back to the root. For instance, the BPDUs from Switch B to Switch C are advertising a root path cost of 4, whereas the BPDUs being sent from Switch C to Switch B are advertising 23. Because Switch B has the lower root path cost, its port is the designated port for that segment.

[Figure: designated ports in the switched network example]

Blocked Ports

To this point, the discussion has focused on how to determine which ports will be forwarding traffic in a switched network. Yet to be addressed is the original point of STP, which is to remove any potential switching loops. To remove potential switching loops, switches and bridges keep any port that is not a root port or a designated port in a blocked state. Keep in mind that a blocked port is not disabled (shut down); the interface is just not participating in forwarding any data. Blocked interfaces still receive BPDUs from other switches so that they can react to any changes in the topology.

In the following figure, notice that all the root ports have been elected, as well as the designated ports for each segment. Notice on the segment between switch C and Switch D that a port connected to switch C is not a root port or a designated port. This port blocks user data to ensure that a switching loop does not occur and expose the network to broadcast storms. This also means that any devices connected to switch C sending Ethernet data to any device connected to switch D will ultimately go through switch B, and then switch A, to finally arrive at switch D.

[Figure: root ports, designated ports, and the blocked port in the switched network example]
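Putting the three roles together, the following sketch assigns a role to every inter-switch port in the example network. It is a simplified model, not the full protocol: the Bridge IDs and link costs are assumptions consistent with the figures quoted in this article, ports are identified as (switch, neighbor) pairs, and ties are broken only by Bridge ID.

```python
# Assumed values: default priorities, hypothetical MACs, costs inferred from the text.
BRIDGE_ID = {"A": (32768, 0x00000C111111), "B": (32768, 0x00000C222222),
             "C": (32768, 0x00000C333333), "D": (32768, 0x00000C444444)}
LINK_COST = {("A", "B"): 4, ("A", "D"): 4, ("B", "C"): 19, ("C", "D"): 19}


def port_roles(bridge_id, link_cost):
    """Return {(switch, neighbor): role} for every inter-switch port."""
    root = min(bridge_id, key=bridge_id.get)
    # Root path cost per switch, found by repeatedly relaxing every link.
    cost = {s: float("inf") for s in bridge_id}
    cost[root] = 0
    for _ in bridge_id:
        for (x, y), c in link_cost.items():
            cost[x] = min(cost[x], cost[y] + c)
            cost[y] = min(cost[y], cost[x] + c)
    # Every port starts out blocking until it earns another role.
    roles = {}
    for x, y in link_cost:
        roles[(x, y)] = roles[(y, x)] = "blocking"
    # Root port: lowest cumulative cost, then lowest neighbor Bridge ID.
    for s in bridge_id:
        if s == root:
            continue
        options = []
        for (x, y), c in link_cost.items():
            if s in (x, y):
                nbr = y if s == x else x
                options.append((cost[nbr] + c, bridge_id[nbr], nbr))
        roles[(s, min(options)[2])] = "root"
    # Designated port: the end of each segment with the lower root path cost.
    for x, y in link_cost:
        winner = min((x, y), key=lambda s: (cost[s], bridge_id[s]))
        roles[(winner, y if winner == x else x)] = "designated"
    return roles


roles = port_roles(BRIDGE_ID, LINK_COST)
print([port for port, role in roles.items() if role == "blocking"])  # [('C', 'D')]
```

Only the port on Switch C that faces Switch D ends up blocking, which matches the path described above: traffic from Switch C reaches Switch D by way of Switch B and Switch A.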

Port State Transitions

You now know how STP removes switching loops in your switched LAN by electing a root bridge and calculating which ports should forward based on the lowest root path cost. However, as explained earlier, STP must be able to react to topology changes, such as a segment or switch going down, to ensure the redundant design is put to good use. When this type of change occurs, ports that were once in a blocking state could quite possibly transition to the forwarding state.

If ports were to transition immediately from a blocking state to a forwarding state, they could easily cause loops in the network because the topology change would not have had a chance to propagate throughout the entire switched network. To remedy this dilemma, STP moves a port through two intermediate states before it reaches the forwarding state. In these transitional states, the switch ensures that enough time has passed to propagate the changes, and it undergoes a pre-forwarding routine to ensure that it will know where to forward the data when the interface is forwarding. The following table displays, in order, the possible STP states, their functions, and the time it takes to transition out of each state.

State | Function | Transition time
Disabled | The interface is administratively shut down or inoperative as a result of a security violation. | N/A
Blocking | Does not forward any user data. All ports start out in this state. Does not send BPDUs but still receives them to react to topology changes. | 0 to 20 seconds
Listening | Begins to transition to a forwarding state by listening to and sending BPDUs. No user data sent. | 15 seconds
Learning | Begins to build the CAM table from MAC addresses learned on the interface. No user data sent. | 15 seconds
Forwarding | User data forwarded. |

It may initially take the switch 20 seconds to start the transition process to the listening state because that is the default time limit that STP uses to consider a neighbor switch to be down. In other words, if a switch misses 10 consecutive BPDUs (equal to 20 seconds) from an adjoining switch or bridge, it considers that device to be dead and must react to the new topology. This 20-second timer is known as the max-age timer.

When a topology change occurs in the network, a nonroot switch sends a specific BPDU called a Topology Change Notification (TCN) out its root port back to the root bridge. This is one of the only times that a BPDU does not originate from the root bridge. As soon as the root bridge receives that notification, it sends a special BPDU to all the switches in the network telling them to age out old MAC entries in their CAM tables after about 15 seconds (the forward delay time) instead of the default 300 seconds. At that point, the switches start rebuilding their CAM tables to reflect the new topology and forward frames accordingly.

The listening and learning states wait 15 seconds each by default, but this can be administratively changed if you have a relatively small switched network. These 15-second intervals are commonly referred to as forward delays because they delay the transition to a forwarding state. It is important to consider that it could take up to 50 seconds for an interface to transition to a forwarding state when the topology changes. Consequently, no data is transferred during those 50 seconds, which in the networking world is about 10 phone calls from complaining end users.
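The 50-second figure is simply the sum of the default timers, as the sketch below shows. It models the worst case, in which the port waits out the full max-age timer before it starts listening; the constant and function names are hypothetical.

```python
HELLO = 2           # seconds between BPDUs from the root bridge
MAX_AGE = 20        # seconds without BPDUs before a neighbor is considered down
FORWARD_DELAY = 15  # seconds spent in each of the listening and learning states

# Worst-case time spent in each state before the port starts forwarding.
TRANSITIONS = [("blocking", MAX_AGE),
               ("listening", FORWARD_DELAY),
               ("learning", FORWARD_DELAY)]


def state_at(seconds_since_failure: float) -> str:
    """Return the state of a formerly blocked port at a given time after a failure."""
    elapsed = 0
    for state, duration in TRANSITIONS:
        elapsed += duration
        if seconds_since_failure < elapsed:
            return state
    return "forwarding"


print(MAX_AGE // HELLO)                               # 10 missed BPDUs
print(sum(duration for _, duration in TRANSITIONS))   # 50 seconds to converge
print([state_at(t) for t in (0, 19, 20, 35, 49, 50)])
# ['blocking', 'blocking', 'listening', 'learning', 'learning', 'forwarding']
```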

The max-age and forward delay timers are based on a default network diameter of seven switches, including the root bridge. Diameter (in switching terms) refers to the number of bridges or switches between any two hosts. If your network has, for instance, a diameter of only 2, you can decrease these timers because it doesn’t take as long to propagate a change in the topology. Another benefit of STP is that these timers are ultimately dictated by the root bridge. Thus, to change the timers, you have to configure the change on only the root bridge, and it is propagated to the other switches. This change could possibly backfire and cause switching loops if you later add more switches to the network and forget to change the timers. The next section discusses some safer alternatives to speed up the convergence time of STP when a topology change occurs.
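To see how the diameter feeds into the timers, the sketch below uses the formulas commonly given in Cisco's documentation on tuning STP timers. These formulas do not come from this article, and rounding the forward delay up to a whole second is an assumption; treat the result as an illustration rather than a configuration recommendation.

```python
import math


def stp_timers(diameter: int, hello: int = 2):
    """Return (max_age, forward_delay) in seconds for a given network diameter."""
    # Assumed formulas, as commonly cited for 802.1D timer tuning.
    max_age = 4 * hello + 2 * diameter - 2
    forward_delay = math.ceil((4 * hello + 3 * diameter) / 2)
    return max_age, forward_delay


print(stp_timers(7))  # (20, 15): the defaults quoted above
print(stp_timers(2))  # (10, 7): a small network can converge faster
```

With a diameter of 2, the worst-case convergence time drops from 50 seconds to 10 + 7 + 7 = 24 seconds under these assumptions.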

Initial switch configurations

Catalyst switches, for the most part, are designed so that the default state of the switch allows for basic Layer 2 functionality without requiring any configuration from the administrator. For example, the physical interfaces on the switch are already enabled, which means that you can plug a cable into the switch and the interface operates without requiring you to perform a no shutdown on that interface. Does that mean you don’t have to learn about Catalyst switch commands? No such luck.

The majority of the administrative configurations such as configuring hostnames, login banners, passwords, and telnet/SSH access are identical to the configurations of the router IOS, as described in the article “Foundation Cisco IOS operations”.