Defining “Fabric” in the Era of Overlays

[This post was written with Andrew Lambeth.  A version of it has been posted on Search Networking, but I wasn’t totally happy with how that turned out.  So here is a revised version. ]

There has been a lot of talk about fabrics lately.  An awful lot.  However, our experience has been that the message is somewhat muddled, and there is confusion about what fabrics are and what they can do for the industry.  It is our opinion that the move to fabric is one of the more significant events in datacenter networking.  And it is important to understand this movement not only for the direct impact it is having on the way we build datacenter networks, but for the indirect implications it will have on the networking industry more broadly. (I’d like to point out that there is nothing really “new” in this writeup.  Ivan and Brad have been covering these issues for a long time.  However, I do think it is worth refining the discussion a bit, and perhaps providing an additional perspective.)

Let’s first tackle the questions of “why fabric?” and “why now?”  The short answer is that the traditional network architecture was not designed for modern datacenter workloads.

The longer answer is that datacenter design has evolved to treat all aspects of the infrastructure (compute, storage, and network) as generic pools of resources.  This means that any workload should be able to run anywhere.  However, traditional datacenter network design does not make this easy.  The classic three-tier architecture (top of rack (ToR), aggregation, core) has non-uniform access to bandwidth and latency depending on the traffic matrix.  For example, hosts connected to the same ToR switch will have more bandwidth (and lower latency) than hosts connected through an aggregation switch, which will again have access to more total bandwidth than hosts trying to communicate through the core router.  The net result?  Deciding where to put a workload matters: allocating workloads to ports becomes a constant bin-packing problem, and in dynamic environments the result is very likely a suboptimal allocation of bandwidth to workloads, or suboptimal utilization of compute due to placement constraints.
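
To make the placement penalty concrete, here is a toy model of a three-tier topology.  The link speed and oversubscription ratios are illustrative assumptions (a common rule of thumb, not figures from any particular deployment); the point is simply that the worst-case bandwidth between two hosts depends on the highest tier their traffic must cross.

```python
# Toy model: worst-case per-host bandwidth in a three-tier network
# depends on where the two communicating hosts are placed.
# The ratios below are illustrative assumptions only.

TIER_OVERSUBSCRIPTION = {
    "tor": 1.0,    # same ToR switch: full access-link bandwidth
    "agg": 4.0,    # assumed 4:1 oversubscription at aggregation
    "core": 16.0,  # assumed compounded oversubscription at the core
}

ACCESS_LINK_GBPS = 10.0

def worst_case_bandwidth(shared_tier: str) -> float:
    """Worst-case per-host bandwidth when all hosts under the shared
    tier transmit at once (an all-to-all traffic matrix)."""
    return ACCESS_LINK_GBPS / TIER_OVERSUBSCRIPTION[shared_tier]

for tier in ("tor", "agg", "core"):
    print(f"{tier}: {worst_case_bandwidth(tier):.3f} Gbps per host")
```

Under these assumptions, two hosts on the same ToR see 10 Gbps, while two hosts that must cross the core may see well under 1 Gbps — exactly the non-uniformity that makes placement a bin-packing problem.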

Enter fabric.  In our vernacular (admittedly there is ample disagreement on what exactly a fabric is), a fabric is a physical network which doesn’t constrain workload placement.  Basically, this means that communication between any two ports should see the same latency, and the bandwidth between any disjoint subsets of ports is non-oversubscribed.  Or more simply, the physical network operates much as a backplane does within a network chassis.
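
The non-oversubscription property has a simple arithmetic check for a two-tier leaf/spine design: each leaf’s aggregate uplink capacity must cover its aggregate downlink capacity.  The port counts and speeds below are assumptions for illustration.

```python
# Check whether a leaf switch in a leaf/spine fabric is
# non-oversubscribed: uplink capacity >= downlink capacity.

def is_non_oversubscribed(downlinks: int, downlink_gbps: float,
                          uplinks: int, uplink_gbps: float) -> bool:
    """True if aggregate uplink capacity covers aggregate downlink
    capacity, i.e. the leaf cannot be a bottleneck."""
    return uplinks * uplink_gbps >= downlinks * downlink_gbps

# 48 x 10G server ports with 12 x 40G uplinks: 480G down, 480G up.
print(is_non_oversubscribed(48, 10, 12, 40))  # balanced -> True
# The same leaf with only 6 uplinks is 2:1 oversubscribed.
print(is_non_oversubscribed(48, 10, 6, 40))   # -> False
```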

The big question is, in addition to highly available bandwidth, what should a fabric offer?  Let’s get the obvious out of the way.  In order to offer multicast, the fabric should support packet replication in hardware as well as a way to manage multicast groups.  Also, the fabric should probably offer some QoS support in which packet markings indicate the relative priority to aid packet triage during congestion.

But what else should the fabric support?  Most vendor fabrics on the market tout a wide array of additional capabilities. For example, isolation primitives (VLAN and otherwise), security primitives, support for end-host mobility,  and support for programmability, just to name a few.

Clearly these features add value in a classic enterprise or campus network.  However, the modern datacenter hosts very different types of workloads.  In particular, datacenter system design often employs overlays at the end hosts which duplicate most of these functions.  Take, for example, a large web service: it isn’t uncommon for load balancing, mobility, failover, isolation, and security to be implemented within the load balancer or the back-end application logic.  Or consider a distributed compute platform: similar properties are often implemented within the distribution harness rather than relying on the fabric.  Even virtualized hosting environments (such as IaaS) are starting to use overlays to implement these features within the vswitch (see for example NVGRE or VXLAN).
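
To give a flavor of what the vswitch actually does in the VXLAN case: the tenant’s original Ethernet frame is wrapped in an 8-byte VXLAN header carrying a 24-bit virtual network identifier (VNI), per RFC 7348, and then carried over UDP/IP across the plain IP fabric.  A minimal sketch of building that header (the inner frame here is a placeholder, not real traffic):

```python
import struct

# Sketch of the VXLAN encapsulation an edge vswitch performs.
# Header layout per RFC 7348: 1 byte of flags (0x08 = VNI valid),
# 3 reserved bytes, 24-bit VNI, 1 reserved byte.

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given 24-bit VNI."""
    assert 0 <= vni < 2**24, "VNI is a 24-bit field"
    flags = 0x08 << 24            # I flag set: VNI field is valid
    return struct.pack("!II", flags, vni << 8)

inner_frame = b"\x00" * 64        # placeholder for the tenant's frame
encapsulated = vxlan_header(vni=5000) + inner_frame
print(len(encapsulated))          # 8-byte header + 64-byte frame = 72
```

The fabric only ever sees the outer UDP/IP packet; all of the tenant-facing semantics (which virtual network, which tenant MAC addresses) live in the overlay header that the edge writes and reads.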

There is good reason to implement these functions as overlays at the edge.  Minimally it allows compatibility with any fabric design.  But much more importantly, the edge has extremely rich semantics with regard to true end-to-end addressing, security contexts, sessions, mobility events, and so on.  And implementing at the edge allows the system builders to evolve these features without having to change the fabric.

In such environments, the primary purpose of the fabric is to provide raw bandwidth, and price/performance, not features/performance, is king.  This is probably why many of the datacenter networks we are familiar with (in both big data and hosting) are in fact IP fabrics.  Simple, cheap, and effective.  That is also why many next-generation fabric companies and projects are focused on providing low-cost IP fabrics.

If existing deployments of the most advanced datacenters in the world are any indication, edge software is going to consume a lot of functionality that has traditionally been in the network.  It is a non-disruptive disruption whose benefits are obvious and simple to articulate. Yet the implications it could have on the traditional network supply chain are profound.

7 Comments on “Defining “Fabric” in the Era of Overlays”

  1. Hi Martin,

    I have a suggestion, if I may. When referring to the “modern datacenter workloads”, would it be possible to clarify what is meant by these? Are they the likes of EC2 instances, where application resiliency is provided by workload software itself, or does it also include the traditional Enterprise apps, which rely on infrastructure for their availability?

    Why I’m asking is because the latter ones are usually only portable within fairly small clusters (32 hosts in VMware’s case, for example), and as such, can be fairly easily designed for without the need for a massive fabric.

    I’m not saying there’s no need for those fabrics; I’m just highlighting that the need for them isn’t as universal as it may appear to some readers.

    — D

    • Hi Dmitri,

      Yes, this is a fair point. By modern datacenter workloads, I mean those that treat compute like a resource pool at sizes that far exceed the bisectional capacity of a ToR switch. This can be a virtualized datacenter, a distributed computation environment, the back-end to a web service, etc. These are most likely found in service providers, web giants, and large enterprise.

      thanks for the comment,


  2. Brad Hedlund says:

    Expanding on the point about “Simplicity” … In the era of software switching overlays, the configuration of the physical switch fabric has no bearing on the network topology as observed by the application — the topology is constructed by the overlay, not the physical network.

    The impact of this is twofold: (1) a vastly simplified physical network deployment, and (2) a more robust overall fabric, because the failure of a physical switch causes no change to the topology as observed by the application — physical failures are hidden under the overlay.

    If your physical IP fabric is designed properly, physical switch/link failure will also have very minimal impact on application throughput.

    Thanks for the link!



  3. […] Martin Casado wrote about Defining “Fabric” in the Era of Overlays […]

  4. I couldn’t agree more. The DC of the near future will be a simple layer-3 Clos network, with various application-specific or virtualized layer 2 solutions on top. But those will be software-based, and the network itself will be “dumb”. Software is flexible, hardware should be simple, fast, and general purpose.

  5. This is the most awesome article I have read in a long time, for a variety of reasons:

    1: I am new to the field of n/w virtualization and how datacenter networks are built

    2: Most crisp, to-the-point explanation of what fabrics and overlays are, and why we need to move traditional n/w functionality to the edge

    3: Why traditional n/w design didn’t meet the requirements of modern datacenter workloads

    I am guessing in a few years you will have a book in this area, and it will be awesome. Thanks for writing.
