Posted: December 11, 2011
[This post was written with Andrew Lambeth. A version of it has been posted on Search Networking, but I wasn’t totally happy with how that turned out. So here is a revised version.]
There has been a lot of talk about fabrics lately. An awful lot. However, our experience has been that the message is somewhat muddled, and there is confusion about what fabrics are and what they can do for the industry. It is our opinion that the move to fabrics is one of the more significant events in datacenter networking. It is important to understand this movement not only for its direct impact on the way we build datacenter networks, but also for the indirect implications it will have on the networking industry more broadly. (I’d like to point out that there is nothing really “new” in this writeup. Ivan and Brad have been covering these issues for a long time. However, I do think it is worth refining the discussion a bit, and perhaps providing an additional perspective.)
Let’s first tackle the questions of “why fabric” and “why now.” The short answer is that the traditional network architecture was not designed for modern datacenter workloads.
The longer answer is that datacenter design has evolved to treat all aspects of the infrastructure (compute, storage, and network) as generic pools of resources. This means that any workload should be able to run anywhere. However, traditional datacenter network design does not make this easy. The classic three-tier architecture (top of rack (ToR), aggregation, core) offers non-uniform bandwidth and latency depending on the traffic matrix. For example, hosts connected to the same ToR switch will have more bandwidth (and lower latency) between them than hosts communicating through an aggregation switch, which in turn will have access to more total bandwidth than hosts trying to communicate through the core router. The net result? Deciding where to put a workload matters. Allocating workloads to ports becomes a constant bin-packing problem, and in dynamic environments the result is very likely to be a suboptimal allocation of bandwidth to workloads, or suboptimal utilization of compute due to placement constraints.
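To make the placement problem concrete, here is a rough sketch of how worst-case bandwidth between two hosts shrinks as traffic climbs the tree. The link speed and oversubscription ratios are invented for illustration, and the (pod, tor, port) host naming is a hypothetical convention, not something taken from a real deployment.

```python
# Illustrative three-tier model. Every number here is an assumption.
HOST_LINK_GBPS = 10.0   # access link speed of each host
TOR_OVERSUB = 4.0       # ToR: host-facing capacity / uplink capacity
AGG_OVERSUB = 2.5       # aggregation: downlink capacity / core-facing capacity


def worst_case_gbps(host_a, host_b):
    """Per-host bandwidth when every host sharing the path sends at once.

    Each host is a (pod, tor, port) tuple; "pod" names the aggregation
    block the ToR hangs off.
    """
    pod_a, tor_a, _ = host_a
    pod_b, tor_b, _ = host_b
    if (pod_a, tor_a) == (pod_b, tor_b):
        # Traffic never leaves the ToR switch.
        return HOST_LINK_GBPS
    if pod_a == pod_b:
        # Traffic crosses the ToR uplinks to the aggregation tier.
        return HOST_LINK_GBPS / TOR_OVERSUB
    # Traffic crosses both the ToR uplinks and the core-facing links.
    return HOST_LINK_GBPS / (TOR_OVERSUB * AGG_OVERSUB)


source = ("pod1", "tor1", 1)
for peer in [("pod1", "tor1", 2), ("pod1", "tor2", 1), ("pod2", "tor1", 1)]:
    print(peer, worst_case_gbps(source, peer), "Gbps")
```

With these made-up ratios, the same pair of workloads sees 10, 2.5, or 1 Gbps depending purely on where the peer happens to be plugged in, which is exactly why placement turns into bin packing.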
Enter fabric. In our vernacular (admittedly there is ample disagreement on what exactly a fabric is), a fabric is a physical network which doesn’t constrain workload placement. Basically, this means that communication between any two ports should have the same latency, and the bandwidth between any two disjoint subsets of ports should be non-oversubscribed. Or more simply, the physical network operates much as a backplane does within a network chassis.
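One simple way to reason about the non-oversubscribed property is to compare host-facing and fabric-facing capacity on each edge switch. The sketch below assumes a two-tier design in which every edge switch has some number of uplinks into the fabric; the port counts and speeds are hypothetical. A ratio of 1.0 or less at the edge is necessary for the property above but not sufficient on its own, since the fabric must also spread traffic evenly across those uplinks.

```python
def oversubscription_ratio(host_ports, host_gbps, uplinks, uplink_gbps):
    """Host-facing capacity divided by fabric-facing capacity on one edge switch."""
    return (host_ports * host_gbps) / (uplinks * uplink_gbps)


def is_non_oversubscribed(host_ports, host_gbps, uplinks, uplink_gbps):
    """True when the uplinks can carry every host port sending at line rate."""
    return oversubscription_ratio(host_ports, host_gbps, uplinks, uplink_gbps) <= 1.0


# Hypothetical edge switch: 48 x 10G host ports, 4 x 40G uplinks.
print(oversubscription_ratio(48, 10, 4, 40))   # 3.0, so 3:1 oversubscribed

# Hypothetical edge switch: 16 x 10G host ports, 4 x 40G uplinks.
print(is_non_oversubscribed(16, 10, 4, 40))    # True
```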
The big question is, in addition to highly available bandwidth, what should a fabric offer? Let’s get the obvious out of the way. In order to offer multicast, the fabric should support packet replication in hardware as well as a way to manage multicast groups. Also, the fabric should probably offer some QoS support in which packet markings indicate the relative priority to aid packet triage during congestion.
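As a toy illustration of priority-based triage, the sketch below models a congested egress buffer that can hold only a fixed number of packets and sheds the lowest-priority ones first. The integer priority field, the packet names, and the buffer size are all assumptions made for the example; real switches do this in hardware queues, not in a sort.

```python
def triage(packets, capacity):
    """Split packets into (kept, dropped) for a buffer of `capacity` slots.

    `packets` is a list of (priority, name) tuples in arrival order, where a
    larger priority means more important. Ties go to the earlier arrival,
    and both result lists preserve arrival order.
    """
    # Rank indices by priority (high first), then by arrival (early first).
    ranked = sorted(range(len(packets)), key=lambda i: (-packets[i][0], i))
    keep = set(ranked[:capacity])
    kept = [p for i, p in enumerate(packets) if i in keep]
    dropped = [p for i, p in enumerate(packets) if i not in keep]
    return kept, dropped


arrivals = [(0, "bulk-1"), (5, "voice-1"), (0, "bulk-2"), (3, "web-1"), (5, "voice-2")]
kept, dropped = triage(arrivals, capacity=3)
print("kept:   ", kept)
print("dropped:", dropped)
```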
But what else should the fabric support? Most vendor fabrics on the market tout a wide array of additional capabilities. Examples include isolation primitives (VLAN and otherwise), security primitives, support for end-host mobility, and support for programmability, just to name a few.
Clearly these features add value in a classic enterprise or campus network. However, the modern datacenter hosts very different types of workloads. In particular, datacenter system design often employs overlays at the end hosts which duplicate most of these functions. Take, for example, a large web service: it isn’t uncommon for load balancing, mobility, failover, isolation and security to be implemented within the load balancer or the back-end application logic. Or take a distributed compute platform, where similar properties are often implemented within the distribution harness rather than relying on the fabric. Even virtualized hosting environments (such as IaaS) are starting to use overlays to implement these features within the vswitch (see for example NVGRE or VXLAN).
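To give a feel for how little the fabric needs to know about an overlay, here is a minimal sketch of VXLAN-style encapsulation as an edge vswitch might perform it. It uses the 8-byte header layout later standardized in RFC 7348 (a flags byte with the VNI-valid bit, 24 reserved bits, a 24-bit virtual network identifier, and 8 more reserved bits) and leaves out the outer Ethernet, IP, and UDP headers entirely. The function names are ours, not from any real vswitch.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" bit in the flags byte


def vxlan_encap(vni, inner_frame):
    """Prefix an inner Ethernet frame (bytes) with an 8-byte VXLAN header."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags byte + 24 reserved bits. Word 1: 24-bit VNI + 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet):
    """Return (vni, inner_frame) from a VXLAN payload."""
    if len(packet) < 8:
        raise ValueError("packet shorter than a VXLAN header")
    word0, word1 = struct.unpack("!II", packet[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word1 >> 8, packet[8:]


tenant_frame = b"inner ethernet frame bytes"
wire = vxlan_encap(5000, tenant_frame)
print(wire[:8].hex())
print(vxlan_decap(wire))
```

The tenant identifier lives entirely in a header that only the edges parse; the fabric in between just forwards the outer packet.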
There is good reason to implement these functions as overlays at the edge. At a minimum, it allows compatibility with any fabric design. But much more importantly, the edge has extremely rich semantics with regard to true end-to-end addressing, security contexts, sessions, mobility events, and so on. Implementing these functions at the edge also allows system builders to evolve them without having to change the fabric.
In such environments, the primary purpose of the fabric is to provide raw bandwidth, and price/performance, not features/performance, is king. This is probably why many of the datacenter networks we are familiar with (both in big data and hosting) are in fact IP fabrics. Simple, cheap and effective. That is also why many next-generation fabric companies and projects are focused on providing low-cost IP fabrics.
If existing deployments of the most advanced datacenters in the world are any indication, edge software is going to consume a lot of functionality that has traditionally been in the network. It is a non-disruptive disruption whose benefits are obvious and simple to articulate. Yet the implications it could have on the traditional network supply chain are profound.