The Latest Fad in Data Center Underlay Networking – Leaf and Spine Architecture

April 7, 2016

– Anuj Goel, Consultant-IP & Optics Networks, Nokia

A traditional three-tiered model was designed for use in general networks, usually segmented into pods which constrained the location of devices such as virtual servers. For many years, data center networks have been built in layers that, when diagrammed, suggest a hierarchical tree. As this hierarchy runs up against limitations, a new model is taking its place.

In the hierarchical tree data center, the bottom of the tree is the access layer, where hosts connect to the network. The middle layer is the aggregation, or distribution, layer, to which the access layer is redundantly connected. The aggregation layer provides connectivity to adjacent access layer switches and data center rows, and in turn to the top of the tree, known as the core.

The hierarchical tree data center model

The core layer provides routing services to other parts of the data center, as well as to services outside of the data center such as the Internet, geographically separated data centers and other remote locations.


The Leaf-Spine Model

The hierarchical model scales reasonably well, but it is prone to bottlenecks when uplinks between layers are oversubscribed. Latency accumulates as traffic flows through each layer, and redundant links are blocked when the Spanning Tree Protocol (STP) is in use. Another issue is that inter-subnet traffic trombones up to the Core node, even though the overwhelming majority of data center traffic is East-West (E-W) traffic.


The Leaf-Spine Architecture

With Leaf-Spine configurations, devices attached to different Leaf switches are always the same number of hops apart, so traffic between them sees a predictable and consistent latency. This is possible because the topology has only two layers, the Leaf layer and the Spine layer. The Leaf layer consists of access switches that connect to devices like servers, firewalls, load balancers, and edge routers. The Spine layer (made up of switches that perform routing) is the backbone of the network, where every Leaf switch connects to every Spine switch.
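The "same number of segments" property can be checked with a small model. The following is a minimal sketch, not a vendor tool: the switch names and the 4-leaf, 2-spine sizing are assumptions chosen for illustration. It builds the full leaf-to-spine mesh and uses a breadth-first search to confirm that every pair of Leaf switches is exactly two links apart.

```python
from collections import deque
from itertools import combinations


def build_leaf_spine(num_leaves, num_spines):
    """Return an adjacency map where every leaf links to every spine."""
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    graph = {node: set() for node in leaves + spines}
    for leaf in leaves:
        for spine in spines:
            graph[leaf].add(spine)
            graph[spine].add(leaf)
    return graph, leaves, spines


def hop_count(graph, src, dst):
    """Breadth-first search: number of links on a shortest path."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # unreachable


# Hypothetical fabric size, chosen only for the example.
graph, leaves, spines = build_leaf_spine(num_leaves=4, num_spines=2)
leaf_to_leaf_hops = {hop_count(graph, a, b) for a, b in combinations(leaves, 2)}
print(leaf_to_leaf_hops)  # {2}
```

Because no Leaf connects directly to another Leaf, and every Leaf reaches every Spine, the set of hop counts collapses to a single value regardless of how many switches are added.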

To preserve this predictable distance between devices, dynamic Layer 3 routing interconnects the two layers. Dynamic routing lets the best path be determined and adjusted in response to network changes. This type of network suits data center architectures that focus on “East-West” traffic, meaning traffic that travels between devices inside the data center rather than out to a different site or network. The approach avoids the intrinsic limitations of Spanning Tree by using routing protocols and related techniques to build a dynamic network.

Advantages of Leaf-Spine Architecture

With Leaf-Spine, the network uses Layer 3 routing. All routes are kept active through Equal-Cost Multipathing (ECMP), so all connections can be used at the same time while the network remains stable and loop-free. In a three-tiered network running a traditional Layer 2 protocol such as the Spanning Tree Protocol (STP), the protocol must be configured correctly on all devices, and all of the assumptions it relies on must be taken into account. One easy mistake is assigning the wrong device priorities, which can lead to an inefficient path setup. Replacing STP between the Access and Aggregation layers with Layer 3 routing results in a much more stable environment.
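One common way to use all equal-cost paths at once is to hash each flow onto an uplink, which keeps the packets of one flow on one path. The sketch below illustrates that idea only. The uplink names, the flow tuples, and the use of SHA-256 are assumptions for the example; real switches use their own hardware hash functions.

```python
import hashlib
from collections import Counter


def pick_uplink(flow, uplinks):
    """Map a flow 5-tuple to one uplink; a given flow always maps the same way."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(uplinks)
    return uplinks[index]


# Hypothetical leaf with one uplink to each of four spines.
uplinks = ["spine1", "spine2", "spine3", "spine4"]

# Hypothetical flows: (src IP, dst IP, protocol, src port, dst port).
flows = [("10.0.1.10", "10.0.2.20", "tcp", 40000 + i, 443) for i in range(1000)]

usage = Counter(pick_uplink(flow, uplinks) for flow in flows)
print(usage)  # flows per uplink; all uplinks carry traffic at the same time
```

Under STP, by contrast, the redundant uplinks would be blocked and a single active link would carry every flow.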

Another advantage is the ease of adding hardware and capacity. When links become oversubscribed (meaning that more traffic is generated than the active links can carry at one time), expanding capacity is straightforward. An additional Spine switch may be added and uplinks extended to every Leaf switch, which adds interlayer bandwidth and reduces the oversubscription. When device port capacity becomes an issue, a new Leaf switch can be added by connecting it to every Spine switch and applying the network configuration to the new switch. This ease of expansion lets the IT department scale the network without managing or disrupting Layer 2 switching protocols.
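The effect of adding Spine switches on oversubscription is simple arithmetic: the ratio is a Leaf's total server-facing bandwidth divided by its total uplink bandwidth. The port counts and speeds below (48 x 10G server ports, one 40G uplink per Spine) are hypothetical figures chosen for illustration, not values from the article.

```python
def oversubscription_ratio(server_ports, server_port_gbps, uplinks, uplink_gbps):
    """Downlink bandwidth divided by uplink bandwidth for one leaf switch."""
    downlink = server_ports * server_port_gbps
    uplink = uplinks * uplink_gbps
    return downlink / uplink


# Hypothetical leaf: 48 x 10G server ports, one 40G uplink per spine.
for spines in (2, 4, 6):
    ratio = oversubscription_ratio(48, 10, spines, 40)
    print(f"{spines} spines: {ratio:.1f}:1")

# 2 spines: 6.0:1
# 4 spines: 3.0:1
# 6 spines: 2.0:1
```

Each added Spine contributes one more uplink per Leaf, so the ratio falls without any change to the Leaf's server-facing ports.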

Use Cases for Leaf-Spine Architecture

Web scale applications, where server location within the network is static, benefit from Leaf-Spine. Layer 3 routing between the layers of the architecture does not hinder these applications because they do not require server mobility. Removing the Spanning Tree Protocol (STP) results in more stable and reliable performance for East-West traffic flows. Scalability of the architecture is also improved.

Enterprise applications that rely on mobile virtual machines (e.g. vMotion) raise an issue: a server must be supportable anywhere within the data center. Layer 3 routing, with no VLANs extending between Leaf switches, breaks this requirement. To work around this, a solution such as Software Defined Networking (SDN) can be employed to create a virtual Layer 2 network, or overlay tunnels, on top of the Leaf-Spine network. Servers can then move freely within the environment with no detriment to the “East-West” performance, scalability, and stability of the Leaf-Spine topology.
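The overlay idea can be sketched as a lookup table that maps each virtual machine to the Leaf that currently hosts it. The article does not name a specific overlay protocol, so the class, field names, and addresses below are all hypothetical and do not represent any real encapsulation format. The point is that a move changes only the mapping, while the virtual machine keeps its own addresses.

```python
class OverlayDirectory:
    """Hypothetical controller table: which leaf currently hosts each VM."""

    def __init__(self):
        self._location = {}  # VM MAC address -> leaf tunnel endpoint IP

    def register(self, vm_mac, leaf_ip):
        self._location[vm_mac] = leaf_ip

    def move(self, vm_mac, new_leaf_ip):
        # Only the mapping changes; the VM keeps its MAC and IP addresses.
        self._location[vm_mac] = new_leaf_ip

    def encapsulate(self, src_leaf_ip, dst_vm_mac, payload):
        """Wrap a Layer 2 frame in an outer header routed over the underlay."""
        return {
            "outer_src": src_leaf_ip,
            "outer_dst": self._location[dst_vm_mac],
            "inner_dst_mac": dst_vm_mac,
            "payload": payload,
        }


directory = OverlayDirectory()
directory.register("00:50:56:aa:bb:01", "10.255.0.11")  # VM behind leaf1

before = directory.encapsulate("10.255.0.12", "00:50:56:aa:bb:01", b"hello")
directory.move("00:50:56:aa:bb:01", "10.255.0.13")  # VM migrates to leaf3
after = directory.encapsulate("10.255.0.12", "00:50:56:aa:bb:01", b"hello")

print(before["outer_dst"], "->", after["outer_dst"])
# 10.255.0.11 -> 10.255.0.13
```

The underlay still routes purely at Layer 3 between Leaf addresses, which is why the overlay does not give up the ECMP and stability properties of the fabric.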

Other Considerations – BGP as Control Plane Protocol

Routing in traditional data centers (DCs) was handled by IGPs (OSPF, IS-IS, EIGRP). For large-scale DCs, however, there is a lot more complexity to consider. An IP Fabric needs a few services: prefix distribution, prefix filtering, traffic engineering, traffic tagging, and multi-vendor stability. Creating an IP Fabric is an incremental process; not many people build out the entire network to the maximum scale from day one. It’s critical that the IP Fabric architecture not change over time and that the protocols used are stable across different vendors.

BGP pulls ahead as the best protocol choice for creating an IP Fabric. It excels in prefix filtering, traffic engineering, and traffic tagging. BGP was intended for large-scale networks and has a number of advantages over IGPs like OSPF and IS-IS:

  • Failure propagation has a limited scope, which makes the network more stable
  • Troubleshooting: the path shows precisely where a prefix originated and how it was propagated
  • Traffic Engineering through manipulation of path attributes
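The troubleshooting point follows from how BGP builds the AS path: each router prepends its own AS number when it advertises a route to an external peer, and a router rejects any route whose path already contains its own AS number. The sketch below assumes an eBGP design with one private AS number per switch; that numbering plan and the switch names are assumptions for illustration, not something the article specifies.

```python
# Hypothetical private ASNs, one per switch (an assumed numbering plan).
ASN = {"leaf1": 65101, "leaf2": 65102, "spine1": 65001}


def advertise(as_path, sender):
    """An eBGP speaker prepends its own ASN before advertising a route."""
    return [ASN[sender]] + as_path


def accept(as_path, receiver):
    """Loop prevention: reject a route whose AS path holds the receiver's ASN."""
    return ASN[receiver] not in as_path


# leaf1 originates a prefix and advertises it to spine1.
at_spine1 = advertise([], "leaf1")
# spine1 re-advertises it to leaf2, and also back toward leaf1.
at_leaf2 = advertise(at_spine1, "spine1")
back_at_leaf1 = advertise(at_spine1, "spine1")

print(at_leaf2)                        # [65001, 65101]
print(accept(at_leaf2, "leaf2"))       # True
print(accept(back_at_leaf1, "leaf1"))  # False: leaf1 sees its own ASN

origin_asn = at_leaf2[-1]              # rightmost ASN is the originator
print(origin_asn)                      # 65101
```

Reading the path from right to left shows where the prefix originated (leaf1) and which switch relayed it (spine1).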


Leaf-Spine networks offer many benefits over the traditional 3-tier model. The use of BGP for Layer 3 routing with Equal-Cost Multipathing (ECMP) improves stability and total available bandwidth by utilizing all available uplinks. With easily adaptable configuration and design, Leaf-Spine simplifies the IT department’s management of oversubscription and scalability. Eliminating the Spanning Tree Protocol (STP) greatly improves network stability. By using new tools, and by overcoming inherent limitations with complementary solutions such as SDN, Leaf-Spine environments allow IT departments and data centers to thrive while meeting the needs of the business.
