Lippis Report 200: Say Goodbye to Three-Tier Spanning Tree and Hello to Two-Tier Active-Active DC Networks
On October 8th at Ixia’s iSimCity in Santa Clara, CA, the industry’s first public test began of data center switches that boast active-active multi-path protocols, such as Transparent Interconnection of Lots of Links (TRILL) and Shortest Path Bridging (SPB), which eliminate active-standby Spanning Tree Protocol (STP) in the design of modern computer networks. This is a big deal, as active-active is one of the main designs to speed up application performance in public and private cloud computing. To flatten and scale up cloud networks, the industry is offering multiple active-active fabric options, such as Cisco’s FabricPath, Juniper’s QFabric, Brocade’s VCS Fabric, Avaya’s VENA, Arista’s SDCN, Extreme’s Open Fabric, HP’s FlexFabric, IBM’s DOVE, etc. Some of these offerings are built with standard active-active protocols, such as TRILL and SPBM; others are proprietary. And while the public active-active test is available without fee to all vendors, only four out of 17 companies are ready and have the confidence to test their products: Arista Networks, Avaya, Brocade and Extreme Networks. In this Lippis Report Research Note, we share what we have learned thus far in the Lippis/Ixia Open Industry Active-Active Cloud Network Fabric Test for Two-Tier Ethernet Network Architecture.
A Realistic Approach To Dynamic Workload Scaling
Our industry has been ramping up to build private and public cloud infrastructure with 10GbE, increasingly 40GbE and, soon, 100GbE data center switches. IT architects are looking for direction as to what the next generation data center/cloud network architecture is. Based upon our broad industry touch points, many architects are exploring a two-tier Ethernet fabric to increase application performance, thanks to the lower latency of a fully meshed non-blocking network fabric or partial mesh. The set of tests taking place at iSimCity this autumn will provide empirical data that will enable the Lippis Report to advise IT architects on implementing a two-tier network architecture based upon active-active protocols. But the data will speak for itself, and what we have seen, thus far, is promising.
There are many drivers of this change in network design, such as server virtualization, increased compute density and scale, hyperlinked servers, mobile computing, cloud economics, etc. All of these drivers are fundamentally shifting traffic patterns toward east-west flows on top of existing north-south flows. At the center of next generation data center/cloud networking design are active-active protocols.
Active-active eliminates STP in the design of modern computer networks. STP has been used since the 1980s, when the first Ethernet networks were implemented with Digital Equipment Corporation’s 10/10 Ethernet bridges, a phenomenally successful product. But STP, by design, works in active-standby mode, shutting down one of two links between switches, for example. As such, precious bandwidth and cost are wasted, and most importantly, it fosters a three-tier network design of access, aggregation and core switching. Enter active-active protocols, such as TRILL and SPB.
To deliver a two-tier network, STP’s active-standby is replaced with active-active multi-path links between servers and Top-of-Rack (ToR) switches, between ToR and Core switches, and between Core switches. To increase bandwidth beyond, say, 10GbE between switches, LAG (Link Aggregation Group) and/or M-LAG (Multi-Chassis LAG) is used to increase link speeds to 10, 20, 30, 80 … Gb/s. Some use ECMP (Equal Cost Multi-Path) while others use their own aggregation protocol. Most IT architects today use M-LAG between ToR and Core switches and LAG between servers and ToR switches to flatten their networks and reduce the number of devices packets have to flow through to reach their destination. The two-tier design is based upon the assumption that large bandwidth trunks are created between ToR and Core switches, thanks to M-LAG, while active-active protocols best utilize the available links to forward packets.
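A LAG forwards each flow over a single member link, typically chosen by hashing the flow’s packet headers, so packets within a flow stay in order while aggregate capacity scales with the member count. A minimal sketch of the idea follows; it is illustrative only, as real switches hash in hardware with vendor-specific header fields:

```python
import hashlib

def lag_member(flow, num_links):
    """Pick a LAG member link for a flow by hashing its 5-tuple.
    Every packet of one flow hashes to the same link, preserving order."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_links

# A 4 x 10GbE LAG offers 40 Gb/s aggregate; any one flow is capped at 10 Gb/s.
links = 4
flow_a = ("10.0.0.1", "10.0.1.1", 6, 49152, 80)  # src, dst, proto, sport, dport
flow_b = ("10.0.0.2", "10.0.1.1", 6, 49153, 80)

print("flow_a ->", lag_member(flow_a, links))
print("flow_b ->", lag_member(flow_b, links))
print("aggregate capacity:", links * 10, "Gb/s")
```

This is also why a LAG raises aggregate bandwidth but not single-flow bandwidth: one elephant flow still lands on one 10GbE member.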
The data center LAN switching market is $6 billion in size and undergoing a significant evolution, as an expected 80% of data traffic will be east-west by 2014, fundamentally changing network architecture requirements. Ethernet fabrics are the optimal platform to address converged storage/networking and high-growth virtualization and cloud businesses, plus the foundation for Open Networking, including Software-Defined Networking. Each networking vendor is seeking thought leadership with game-changing fabric technology. The Ethernet fabric addressable market is greater than $8 billion when Fibre Channel is included. Also, margins for data center switching are much greater than for all other Ethernet switch categories. In short, this market is in play as all vendors innovate to deliver Ethernet fabric solutions to market.
But IT architects do not have comparative active-active protocol performance information to assist them in purchase decisions and product differentiation. New data center Ethernet fabric designs require attributes such as automated configuration to support VM moves across L3 boundaries, low latency, high performance and resiliency under north-south plus east-west flows, low power consumption and a minimum number of network tiers. In addition to these fabric attributes, nearly all IT business leaders are looking for a fundamentally different, meaning low-cost, operational model. During past industry cycles, open industry tests contributed to growing the network market by shortening sales cycles: they eliminated the need for IT departments to conduct internal performance tests themselves, assuming they could afford the equipment and possessed the appropriate skill sets.
In addition, questions about performance and resilience were eliminated as obstacles during product acquisition, thanks to reliable and repeatable industry data being available, speeding up market adoption. This was true in the LAN plus bridge/router industry battles as well as in every major Ethernet switch evolution. Yet there are no broad industry comparative metrics for active-active protocols, the cornerstone of today’s two-tier network fabric.
Lippis Enterprises has developed a series of open industry performance plus reliability tests and teamed with Ixia for their execution. The goal of the evaluation is to provide the industry with comparative performance and reliability test data across all active-active protocols. Both modular switching (Core plus End-of-Row) products and fixed ToR configuration switches are currently being tested. A final test report, due out in January 2013, will profile each supplier’s products and results, followed by a comparative section with industry recommendations.
Testing is taking place now through mid-December. Thus far, we have tested Brocade’s VDX 6720 ToR switches and its new VDX 8770 series of modular Core switches. Brocade’s Virtual Cluster Switching (VCS) fabric is based upon TRILL and is included in its VDX switch family. From what we have seen thus far, it’s impressive.
Brocade’s VCS approach to data center Ethernet fabric design supports flows of both storage and datagram traffic over one physical Ethernet network fabric. Some of the Brocade VCS fabric technology attributes we observed and tested are its automated provisioning of a switch entering the fabric, Layer 1, 2 plus 3 multi-pathing, VM awareness, scale-out and high performance.
Below is a quick checklist of what’s unique about Brocade’s VCS fabric built within its VDX 6720 and VDX 8770.
TRILL-based standard fabric over 1G, 10G and 40GbE.
VDX switch support of direct Fibre Channel and Fibre Channel over Ethernet connections, reducing cabling and storage switch cost.
Layer 3 multi-pathing, where a single VM may have as many as four default gateways, distributing load at Layer 3 and avoiding inefficient traffic patterns, such as tromboning.
Auto provisioning of Layer 1, 2 and 3 active-active multi-pathing.
40GbE links between VDX 8770 or Core switches to create a high-speed fabric.
VM aware to support auto provisioning of VM joins, moves and removes.
Distributed Intelligence, where each switch’s Channel Access Method is automatically updated and distributed to all switches when network changes occur.
Software-Defined Networking ready with VXLAN support and OpenFlow table distribution, thanks to its distributed intelligence.
These attributes and more enable the Brocade VCS fabric to deliver an Ethernet fabric with auto-provisioning and discovery features that can be leveraged to deliver an elastic and scalable cloud network with minimal IT operational staff for its management.
So how are we testing Arista, Avaya, Brocade and Extreme’s active-active fabrics? We seek fabric performance information, that is, measured latency and packet loss. We also seek reliability: how fast does the fabric recover from a disruption, such as a ToR switch going offline, topology changes, etc.? As we are testing both standards-based TRILL and SPB active-active protocols and MLAG implementations, there are two test suites. For standard TRILL and SPB, Ixia’s IxNetwork software simulates a core of our design; more on this below. For MLAG, we manually configure a logical core network within Ixia test equipment that distributes and receives traffic load through the vendor’s ToR and Core switch fabric.
MLAG Fabric Test
There are two MLAG fabric tests: one that supports servers that are single homed, and one that is dual homed. When servers are dual homed and either a server port or ToR switch is disconnected, fabric failover measurement is impossible as the server implements its own failover to its second NIC that’s connected to a secondary ToR switch. Therefore, to test reliability of the fabric, we start with single-homed server connections to a ToR switch.
Single-Homed MLAG Test
In the single-homed MLAG test set-up, servers are single homed with one server per port. Note that IxNetwork simulates the servers. There are three VLANs per server and 128 hosts/MACs per VLAN. The number of server-facing ports is a maximum of 32 10GbE, aggregated into eight LAGs, with each LAG being a maximum of four 10GbE links. Two traffic profiles are used during the test: traditional line-rate traffic stress, meaning specific packet sizes, and our CloudPerf Internet Mix, or IMIX, which Henry He and Michael Githens of Ixia and I developed for the Lippis Data Center Switch Benchmark test.
For the line-rate traffic stress test, a logical mesh consisting of both unicast and multicast flows is generated. For a 32 10GbE-port configuration, unicast traffic is generated on Ixia ports 1-16 in a full mesh. Traffic within each server’s three VLANs is distributed at 60%, 30% and 10% of line rate; that is, VLAN1 carries 60%, VLAN2 carries 30% and VLAN3 carries 10% of the traffic load. In addition to unicast traffic, multicast traffic flows into each VLAN and into the fabric simultaneously. The multicast configuration starts on Ixia ports 17 to 32, for example, via a fan-out to eight multicast groups per VLAN; that is, VLAN1 on Ixia port 17 will generate 100% line rate of multicast traffic to Ixia ports 25 through 32. Therefore, a 64-port ToR switch will receive unicast and multicast traffic on each of its 32 server-facing ports while the second 32 10GbE ports connect to its Core switch. Each ToR 10GbE server-facing port supports a server and three VLANs with a mix of unicast and multicast traffic. Further, reverse-direction unicast traffic is set to 12.5% of line rate per port and is contained within Ixia ports 17 to 32.
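The offered-load arithmetic above can be tallied in a few lines. This sketch simply restates the figures from the test description (port counts, VLAN split and reverse-direction load are taken directly from the text):

```python
# Tally the single-homed MLAG test parameters described above.
line_rate_gbps = 10
server_ports = 32
vlans_per_server = 3
hosts_per_vlan = 128

# Total simulated hosts behind the server-facing ports.
total_hosts = server_ports * vlans_per_server * hosts_per_vlan

# Unicast load split across each server's three VLANs: 60/30/10 of line rate.
vlan_split = (0.60, 0.30, 0.10)
unicast_gbps_per_vlan = [share * line_rate_gbps for share in vlan_split]

# Reverse-direction unicast on ports 17-32: 12.5% of line rate per port.
reverse_gbps = 0.125 * line_rate_gbps

print(total_hosts)            # 12288 simulated hosts
print(unicast_gbps_per_vlan)  # [6.0, 3.0, 1.0] Gb/s
print(reverse_gbps)           # 1.25 Gb/s per port
```

At these numbers the ToR pair is tracking over twelve thousand MAC addresses while forwarding a mixed unicast/multicast load, which is what makes the test a meaningful stress of fabric tables as well as raw throughput.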
We strive for a fully meshed non-blocking design to run the above-mentioned traffic profiles through the fabric and measure performance plus reliability. But the maximum number of ports into the fabric is constrained by the ToR switch’s port density. That is, a 24-port 10GbE ToR switch will be able to support 12 server-facing and 12 fabric-facing ports, while the Core switch may have a density of a hundred-plus 10GbE ports. As the figure below depicts, this test requires two ToR and two Core switches as the fundamental fabric building block. With Ixia test gear distributing traffic into the fabric and measuring its output, we collect fabric performance data. A ToR switch is powered down, and we observe and measure failover time, packet loss and the time required for convergence upon the new topology. In the January 2013 report, we’ll note how each vendor interconnects its switches to create a fabric. Most Core switches support greater than 64 10GbE ports and thus should support the above traffic loads. However, Core switch contribution to latency and convergence time is important and is being noted. The weakest link is the ToR switch; therefore, we look for performance and reliability under heavy load here.
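Failover time in tests of this kind is commonly derived from frame loss: when traffic is offered at a known constant frame rate, the outage duration is the number of frames lost divided by that rate. A sketch of the arithmetic follows; it is illustrative, as the report does not state the exact formula the Ixia gear applies:

```python
def failover_time_ms(frames_lost: int, offered_rate_fps: float) -> float:
    """Estimate the outage window from frames lost at a constant offered rate."""
    return frames_lost / offered_rate_fps * 1000.0

# 64-byte frames at 10GbE line rate arrive at 14,880,952 frames/s:
# each frame occupies 64 + 20 bytes (preamble plus inter-frame gap),
# i.e. 672 bits on the wire, and 10e9 / 672 ~= 14,880,952.
line_rate_fps = 14_880_952

# Losing ~1.49M frames during a ToR power-down implies a ~100 ms failover.
print(round(failover_time_ms(1_488_095, line_rate_fps), 1))  # 100.0
```

The same calculation applies whether the disruption is a powered-down ToR, a pulled link or a core topology change; only the measured frame-loss count differs.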
MLAG Dual Homed
In the MLAG dual-homed configuration, we create two compute zones. A zone could be a compute cluster, such as a Hadoop cluster, where processes and computation are performed in each zone. Another example is within the Financial Services High Frequency Trading (HFT) environment where one compute zone runs HFT algorithms while another processes and manages customer accounts. In this MLAG test, servers are obviously dual homed to different ToR switches; that is, two 10GbE links are connected to different ToR switches. The logical network configuration is defined to support traffic flows both within and across compute zones. We allow Inter-Switch Links or ISLs within compute zones. East-west traffic that flows across compute zones must traverse Core switches over the fabric.
Again, there are three VLANs per server and 128 hosts/MACs per VLAN. The number of server-facing ports is a maximum of 32 10GbE, aggregated into eight LAGs, with each LAG being a maximum of four 10GbE links. We increase the number of ToRs to four. Two traffic profiles are used to generate traffic during the test: traditional line-rate traffic stress and the CloudPerf Internet Mix, or IMIX. The same logical networks are used as in the single-homed scenario, where performance and reliability are measured. We test for reliability differently in this scenario, shutting down server ports, ToR ports, ToR switches, Core switches and the links between them, and measuring packet loss plus reconfiguration time.
SPB/TRILL Simulation Test
For standards-based active-active TRILL and SPB products, we focus on the ToR switch, as the core fabric is simulated by Ixia’s IxNetwork. The configuration for this test includes two ToR switches only; no Core switches are needed. We define the number of server-facing ports per ToR switch to be 16 10GbE, aggregated into four LAGs, with each LAG being four 10GbE ports. Core-facing ports are 16 10GbE. The same server, VLAN and traffic profiles are used to distribute flows into and over the fabric. Performance, that is, packet loss and latency, is measured. For reliability, or the failover test, we disable the Ixia-simulated BCB/RBridge2 (see below) on the fly and measure packet loss plus latency.
The above active-active fabric tests will provide the first set of empirical industry data as to how active-active products perform. The configurations of two ToR switches connected either into a simulated TRILL/SPB fabric or into Core switches utilizing MLAG were designed to be building blocks so that IT architects can approximate the performance they may experience in their data centers. In addition to performance and reliability, we’re also testing for the level of work required to join, move and remove VMs from the fabric.
The number of firms ready to test in public is itself telling of the maturity of active-active products. However, while we are early in our iSimCity testing schedule, what we see is impressive and promising. Some active-active fabrics do offer auto-provisioning, discovery, reliability and performance at scale, with support for both storage and datagram traffic.