Is OpenFlow/SDN Good at Forwarding?

[A lot of the content of this post was drawn from conversations with Juan Lage, Rajiv Ramanathan, and Mohammad Attar]

Some of the more aggressive buzz around OpenFlow has lauded it as a mechanism for re-implementing networking in total. While in some (fairly fanciful) reality that could be the case, categorical statements of this nature tend to hide the practical design trade-offs that exist between any set of technologies within a design continuum.

What do I mean by that? Just that there are things that OpenFlow and more broadly SDN are better suited for than others.

In fact, the original work that led to OpenFlow was not meant to re-implement all of networking. That’s not to say that this isn’t a worthwhile (if quixotic) goal. Yet our focus was to explore new methods for managing datapath state that was difficult to handle using the traditional approach of full distribution with eventual consistency.

And what sort of state might that be? Clearly destination-based, shortest path forwarding state can be calculated in a distributed fashion. But there is a lot more state in the datapath beyond that used for standard forwarding (filters, tagging, policy routing, QoS policy, etc.). And there are a lot more desired uses for networks than vanilla destination-based forwarding.

Of course, much of this “other” state is not computed algorithmically today. Rather, it is updated manually, or through scripts whose function more closely resembles macro replacement than computation.

Still, our goal was (and still is) to compute this state programmatically.

So the question really boils down to whether the algorithm needed to compute the datapath state is easily distributed. For example, if the state management algorithm has any of the following properties it probably isn’t, and is therefore a potential candidate for SDN.

  • It is not amenable to being split up into many smaller pieces (as opposed to fewer, larger instances). The limiting property in these cases is often excessive communication overhead between the control nodes.
  • It is not amenable to running on heterogeneous compute environments, for example those with varying processor speeds and available memory.
  • It is not amenable to relatively long RTTs for communication between distributed instances. In the purely distributed case, the upper bound for communicating between any two nodes scales linearly with the longest loop-free path.
  • It requires sophisticated distributed coordination between instances (for example, distributed locking or leader election).

There are many examples of algorithms that have these properties which can be (and are) used in networking. The one I generally use as an example is a runtime policy compiler. One of our earliest SDN implementations was effectively a Datalog compiler that would take a topologically independent network policy and compile it into flows (in the form of ACL and policy routing rules).
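To make the policy-compiler idea a little more concrete, here is a minimal sketch in Python. It is not Datalog and it is not the implementation mentioned above. The group names, the `Rule` type, and the `compile_policy` function are all hypothetical, and real rules would carry far more than a source and destination address. The point is only that the policy names groups, not addresses or switches, and the compiler expands it against whatever bindings currently exist.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Rule:
    """One ACL-style flow entry (hypothetical shape): match plus action."""
    src_ip: str
    dst_ip: str
    action: str  # "allow" or "deny"


def compile_policy(policy, bindings):
    """Expand a group-level policy into per-address rules.

    policy:   list of (src_group, dst_group, action) tuples. It says
              nothing about addresses or topology.
    bindings: dict mapping group name -> set of IP addresses currently
              bound to that group.
    """
    rules = []
    for src_group, dst_group, action in policy:
        srcs = sorted(bindings.get(src_group, ()))
        dsts = sorted(bindings.get(dst_group, ()))
        for src_ip, dst_ip in product(srcs, dsts):
            rules.append(Rule(src_ip, dst_ip, action))
    return rules


# Illustrative inputs only; the groups and addresses are made up.
policy = [("guests", "finance", "deny"), ("staff", "finance", "allow")]
bindings = {
    "guests": {"10.0.1.5"},
    "staff": {"10.0.2.7", "10.0.2.8"},
    "finance": {"10.0.9.1"},
}
rules = compile_policy(policy, bindings)
```

When an end point moves or joins a group, only `bindings` changes and the rules are recomputed. That recomputation needs a view of all the bindings at once, which is part of why this kind of algorithm does not split neatly across many small nodes.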

Other examples include implementing global solvers to optimize routes for power, cost, security policy, etc., and managing distributed virtual network contexts.

The distribution properties of most of these algorithms are well understood (and have been for decades).  And so it is fairly straightforward to put together an argument which demonstrates that SDN offers advantages over traditional approaches in these environments. In fact, outside of OpenFlow, it isn’t uncommon to see elements of the control plane decoupled from the dataplane in security, management, and virtualization products.

However, networking’s raison d’être, its killer app, is forwarding. Plain ol’ vanilla forwarding. And as we all know, the networking community long ago developed algorithms to do that which distribute wonderfully across many heterogeneous compute nodes.

So, that raises the question: does SDN provide any value to the simple problem of forwarding? That is, if the sole purpose of my network is to move traffic between two end-points, should I use a trusty distributed algorithm (like an L3 stack) that is well understood and has matured over the last couple of decades? Or is there some compelling reason to use an SDN approach?

This is the question we’d like to explore in this post. The punchline (as always, for the impatient) is that I find it very difficult to argue that SDN has value when it comes to providing simple connectivity. That’s simply not the point in the design space that OpenFlow was created for. And simple distributed approaches, like L3 + ECMP, tend to work very well. On the other hand, in environments where transport is expensive along some dimension, and global optimization provides value, SDN starts to become attractive.

First, let’s take a look at the problem of forwarding:

Forwarding to me simply means finding a path between a source and a destination in a network topology. Of course, you don’t want the path to suck, meaning that the algorithm should efficiently use available bandwidth and not choose horribly suboptimal hop counts.

For the purposes of this discussion, I’m going to consider two scenarios in which we want to do forwarding: (a) bandwidth and connectivity are cheap and plentiful, and all paths between two points are roughly equivalent; and (b) none of these properties hold.

We’ll start with the latter case.

In many networks, not all paths are equal. Some may be more expensive than others due to costs of third party transit, some may be more congested than others leading to queuing delays and loss, different paths may support different latencies or maximum bandwidth limits, and so on.

In some deployments, the forwarding problem can be further complicated by security constraints (all flows of type X must go through middleboxes) and other policy requirements.

Further, some of these properties change in real time. Take for example the cost of transit. The price of a link could increase dramatically after some threshold of use has been hit. Or consider how a varying traffic matrix may affect the queuing delay and available bandwidth of network paths.
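As a toy illustration of the transit example, the sketch below prices a link differently once its usage crosses a threshold and then picks the cheapest link for the next unit of traffic. The link names, prices, and thresholds are invented for illustration, and `marginal_price` and `cheapest_link` are hypothetical helpers, not part of any real TE system.

```python
def marginal_price(used_gb, threshold_gb, base_price, overage_price):
    """Price per GB for the next unit of traffic sent over a link."""
    return base_price if used_gb < threshold_gb else overage_price


def cheapest_link(links):
    """Return the name of the link with the lowest marginal price right now."""
    return min(links, key=lambda name: marginal_price(**links[name]))


# Made-up numbers: two transit links with tiered pricing.
links = {
    "transit_a": dict(used_gb=900, threshold_gb=1000,
                      base_price=0.02, overage_price=0.10),
    "transit_b": dict(used_gb=100, threshold_gb=1000,
                      base_price=0.03, overage_price=0.12),
}

before = cheapest_link(links)          # transit_a is still in its cheap tier

links["transit_a"]["used_gb"] = 1000   # usage crosses the threshold
after = cheapest_link(links)           # the best choice flips to transit_b
```

The best path depends on a usage counter that changes as traffic flows, so whatever computes paths has to keep learning about those changes.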

Under such conditions, one can start to make an argument for SDN over traditional distributed routing protocols.

Why? For two reasons. First, the complexity of the computation increases with the number of properties being optimized over. It also increases with the complexity of the policy model. For example, a policy that operates over source, protocol, and destination is going to be more difficult to compute than one that only considers destination.

Second, as the frequency with which these properties change increases, the amount of information that needs to be disseminated increases. An SDN approach can greatly limit the total amount of information that needs to hit the wire by reducing the distribution of the control nodes. Fully distributed protocols generally flood this information, as it isn’t clear which nodes need to know about the updates. There have been many proposals in the literature to address these problems, but to my knowledge they’ve seen little or no adoption.

I presume these costs are why a lot of TE engines operate offline where you can throw a lot of compute and memory at the problem.

So, if optimality is really important (due to the cost of getting it wrong), and the inputs change a lot, and there is a lot of stuff being optimized over, SDN can help.

Now, what about the case in which bandwidth is abundant, relatively cheap and any sensible path between two points will do?

The canonical example for this is the datacenter fabric. In order to accommodate workload placement anywhere, datacenter physical networks are often built using non-oversubscribed topologies. And the cost of equipment to build these is plummeting. If you are comfortable with an extremely raw supply channel, it’s possible to get 48 ports of 10G today for under $5k. That’s pretty damn cheap.

So, say you’re building a datacenter fabric, you’ve purchased piles of cheap 10G gear and you’ve wired up a fat tree of some sort. Do you then configure OSPF with ECMP (or some other multipathing approach) and be done with it? Or does it make sense to attempt to use an SDN approach to compute the forwarding paths?

It turns out that efficiently calculating forwarding paths in a highly connected graph, and converging those paths on failure is something distributed protocols do really, really well. So well, in fact, that it’s hard to find areas to improve.

For example, the common approach of using multipathing approximates Valiant load balancing, which effectively means the following: if you send a packet to an arbitrary point in the network, and that point forwards it to the destination, then for a regular topology and any traffic matrix, you’ll be pretty close to fully using the fabric bandwidth (within a factor of two, given some assumptions on flow arrival rates).

That’s a pretty stunning statement. What’s more stunning is that it can be accomplished with local decisions, and without requiring per-flow state, or any additional control overhead in the network. It also obviates the need for a control loop to monitor and respond to changes in the network. This latter property is particularly nice as the care and feeding of control loops to prevent oscillations or divergence can add a ton of complexity to the system.
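To show what a "local decision without per-flow state" can look like, here is a minimal sketch of hash-based next-hop selection in the style of ECMP. The `ecmp_next_hop` function and the uplink names are hypothetical, and SHA-256 is used only because it is handy in Python. I am assuming real switches use much cheaper hardware hash functions over similar header fields.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop for a flow using only information local to the switch.

    flow:      5-tuple (src_ip, dst_ip, protocol, src_port, dst_port)
    next_hops: list of equal-cost next hops toward the destination
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Same flow -> same hash -> same next hop, so packets of a flow stay in
    # order without the switch remembering anything about the flow.
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


uplinks = ["spine1", "spine2", "spine3", "spine4"]
flow = ("10.0.1.5", "10.0.9.1", 6, 49152, 443)

first = ecmp_next_hop(flow, uplinks)
second = ecmp_next_hop(flow, uplinks)  # identical to `first`

# Many distinct flows between the same two hosts spread across the uplinks.
used = {
    ecmp_next_hop(("10.0.1.5", "10.0.9.1", 6, port, 443), uplinks)
    for port in range(49152, 50152)
}
```

Nothing here talks to a controller or to another switch, and no table grows with the number of flows.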

On the other hand, trying to scale a solution using classic OpenFlow almost certainly won’t work. To begin with, the use of n-tuples (say per-flow, or even per host/destination pair) will most likely result in table space exhaustion. Even with very large tables (hundreds of thousands of entries), the solution is unlikely to be suitable for even moderately sized datacenters.

Further, to efficiently use the fabric, multipathing should be done per flow. It’s highly unlikely (in my experience) that having the controller participate in flow setup will have the desired performance and scale characteristics. Therefore the multipathing should be done in the fabric (which is possible in a later version of OpenFlow like 1.1 or the upcoming 1.2).

Given these constraints, an SDN approach would probably look a lot like a traditional routing protocol. That is, the resulting state would most likely be destination IP prefixes (so we can take advantage of aggregation and reduce the table requirements by a factor of N over source-destination pairs). Further, multipathing, and link failure detection would have to be done on the switch.
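Some rough arithmetic shows why the match granularity matters so much. The host count and the hosts-per-prefix figure below are made-up inputs, `table_entries` is a hypothetical helper, and the per-pair number is a worst case that assumes a switch could see every source/destination pair.

```python
def table_entries(num_hosts, hosts_per_prefix):
    """Rough worst-case entry counts for three match granularities."""
    # One entry per ordered (source host, destination host) pair.
    per_pair = num_hosts * (num_hosts - 1)
    # One entry per destination host.
    per_host = num_hosts
    # One entry per destination prefix (ceiling division).
    per_prefix = -(-num_hosts // hosts_per_prefix)
    return per_pair, per_host, per_prefix


# Assumed sizes: 10,000 hosts, 40 hosts behind each aggregated prefix.
per_pair, per_host, per_prefix = table_entries(
    num_hosts=10_000, hosts_per_prefix=40
)
# per_pair   = 99,990,000 entries
# per_host   = 10,000 entries
# per_prefix = 250 entries
```

Even a table with hundreds of thousands of entries is orders of magnitude short of the per-pair figure, while destination prefixes fit comfortably.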

Another complication of using SDN is establishing connectivity to the controller. This clearly requires each switch to run something to bootstrap communication, like a traditional protocol. In most SDN implementations I know of, L3 is used for this purpose. So now, not only are we effectively mimicking L3 in the controller to manage datapath state, we also haven’t been able to get rid of the distributed approach, potentially doubling the control complexity network wide.


So, does SDN provide value for forwarding in these environments? Given the previous discussion it is difficult to argue in favor of SDN.

Does SDN provide more functionality? Unlikely. Being limited to manipulating destination prefixes, with multipathing carried out by the switches, robs SDN of much of its value. The controller cannot make decisions on anything but the destination. And since the controller doesn’t know a priori which path a flow will take, it will be difficult to add additional table rules without replicating state all over the network (clearly a scalability issue).

How about operational simplicity? One might argue that in order to scale an IGP one would have to manually configure areas, which presumably would be done automatically with SDN. However, it is difficult to argue that building a separate control network, or configuring one in-band, is any less complex than a simple ID to switch mapping.

What about scale? I’ll leave the details of that discussion to another post. Juan Lage, Rajiv Ramanathan, and I have had an on-again, off-again e-mail discussion comparing the scaling properties of SDN to those of L3 for building a fabric. The upshot is that there are nice proof-points on both sides, but given today’s switching chipsets, even with SDN, you basically end up populating the same tables that L3 does. So any scale argument revolves around reducing the RTT to the controller through a single-function control network, and reducing the need to flood. Neither of these tricks is likely to produce a significant scale advantage for SDN, but they seem to produce some.

So, what’s the take-away?

It’s worth remembering that SDN, like any technology, is actually a point in the design space, and is not necessarily the best option for all deployment environments.

My mental model for SDN starts with looking at the forwarding state, and asking the question “what algorithm needs to run to compute that state”. The second question I ask is, “can that algorithm be easily distributed amongst many nodes with different compute and memory resources”. In many use cases, the answer to this latter question is “not really”. The examples I provide generally reduce to global solvers used for finding optimal solutions with many (often changing) variables. And in such cases, SDN is a shoo-in.

However, in the case of building out networking fabrics in which bandwidth is plentiful and shortest path (or something similar) is sufficient, there are a number of great, well tested distributed algorithms for the job. In such environments, it’s difficult to argue the merits of building out a separate control network, adding additional control nodes, and running vastly less mature software.

But perhaps, that’s just me. Thoughts?

16 Comments on “Is OpenFlow/SDN Good at Forwarding?”

  1. Fantastic summary of what a lot of us have been trying to say for the last 6 months. Thank you, it’s so nice to have something so well articulated.

  2. Sunny Rajagopalan says:

    Excellent post, Martin. This is what I’ve been arguing as well. For regular forwarding, we already have excellent distributed control protocols. Centralizing them in an external controller would scale only if you then implemented the controller in a distributed fashion.

    So the end result would be that you took something that was already distributed and re-implemented them in a different distributed fashion. It’s hard to argue for value in that.

    Aside from the external TE engine example, can you give other scenarios where openflow/sdn would bring in a real value add?

    • Like I mention in the post, policy based networking is a great candidate (NAC, isolated communities of interest with mobile end points, etc.), as is network virtualization (manipulating the tagging or tunneling logic network wide). I’ve also seen great use cases involving many-casting over multiple channels, service interposition, and path optimization for the purposes of power consumption reduction or reducing transport costs. In these cases, the algorithm used to update the state generally converges on a linear solver, or a more general constraint solver which is much more amenable to an SDN style approach. If you look at a switch datapath, I would argue that pretty much the only state we know how to update well in a purely distributed fashion are the L2 and L3 tables and MPLS. There is a heck of a lot more datapath state to deal with however.

  3. James Liao says:

    Great post.

    If the legacy protocols are good at certain applications while OpenFlow is good at another, my next question would be whether OpenFlow should/could be used in the same environment of these legacy protocols.

    If the answer is yes, the hardware vendors will need to address issues of hybrid application? If the answer is no, will we see specialized hardware to optimize for OpenFlow applications?

    • Thanks James. Yes, I think this is a real and important issue. Most deployments I know of use a hybrid approach, and it’s done in an ad-hoc fashion. However, Dave Ward is spearheading an effort to look into this issue more formally within ONF.

      • David Meyer says:

        It’s probably important to note that the “Integrated Mode” being proposed currently in the ONF (part of the “hybrid switch” proposal) changes what I would consider to be the core aspect of OpenFlow, namely, the ability to directly program forwarding state. In Integrated Mode, OF is for the most part an API for inserting routes into the RIB (i.e., it is treated like another routing protocol), or for programming classifiers at the edge. What is interesting about OF in this case is the ability to configure ephemeral or transient state; this is a very different architectural model than “classic OF” (for example, what is the switch model exposed by a RIB?). In any event, APIs to the RIB and/or ACL TCAMs would seem to be something very different than what has been previously proposed as OpenFlow (or even SDN).

        That said, the discussion around Integrated Mode is central to the industry and where OF/SDN is going, precisely because it exposes the core question: Is there a value proposition in being able to directly program the forwarding plane of a switch (fabric) with an off-box control element (centralized or distributed)? The model embodied in Integrated Mode would seem to imply that the answer to this question is no.

  4. Rob Sherwood says:

    Interesting post as always!

    One thing though: you say a few times that SDN is just one point in the design space and this confuses me a bit. I think of SDN as a constantly evolving and hard to pin down concept, so I guess I would describe SDN more as “a new dimension to the design space”. Specifically, that when the data and control planes are decoupled, you have more flexibility about how to propagate control plane state. I guess maybe another way of looking at your article is saying that “the flexibility SDN provides is useful in some cases (e.g., virtualization) and not as useful in others (e.g., vanilla forwarding)”, which is a statement that I think many people would agree with.

    • Hey Rob,

      Your last sentence is exactly what I was getting at. To me, SDN offers the ability to decouple the distribution model from the physical topology, and it does so at the cost of the mechanism for that decoupling (additional control channel and control components). So if it is not providing additional value, then a system implementor has to weigh the ancillary benefits (better programming environment, better testing support, etc.) against the overhead that SDN carries with it.

  5. Thanks for writing this up, Martin. I agree with you, and you’ve made the argument more compellingly than I have.

    There may still be benefit in some SDN capabilities in a L3+ECMP datacenter network operating in a hybrid mode. Let the existing routing protocols control normal forwarding, but layer SDN capabilities on top. For example, implement active application-level connectivity monitoring using OpenFlow’s ability to inject packets out any port and install rules on adjacent devices to capture probe packets and deliver them back to the controller. This type of usage probably doesn’t require an additional control channel, relying on normal L3 connectivity to connect the devices and the controller.

  6. I disagree there is no value in having an OpenFlow interface on gear implemented in the traditional way (distributed intelligence). I think there is a distinct lack of imagination going on if someone thinks OpenFlow is only good for centralized intelligence.

    Also there are degrees of distribution. So do you use two controllers for your whole datacenter? I would argue no. What about a controller per rack or row or more abstractly… per application/application set? I think that last one is very interesting, personally…

    Whatever constraints there are around forwarding performance now will be quantified for SDN/OpenFlow and addressed in the future, I predict… assuming the bus keeps moving in that direction.

    • Thanks for the comment. However, I’m unclear exactly what you’re responding to.

      – To your first paragraph, no one is arguing against OpenFlow on traditional gear.

      – To your second paragraph, there is no mention of centralization in the post, or a limited distribution model.

      – To your third paragraph, there is no mention of performance (except flow-setups to the controller which can easily be avoided when using OpenFlow so is not really relevant).

      Or more to the point, I work on a controller which uses traditional gear, has both a tightly clustered and a federated distribution model, and does not impact performance. So clearly, I agree with you.

      • hmm…

        I’m not sure what I was responding to either. I remember writing this with a clear notion in my head that someone argued there is no point to OpenFlow unless you centralize all network intelligence and that any other model (distributed, even if it’s less distributed but still not centralized) would be a zero net gain.

        Clearly that’s not what is in your post.

        I’m not crazy.

        Also, I agree with your post entirely.


        • Well, I certainly appreciate the sentiment, and I do think it bears repeating. A common strawman fallacy used against OpenFlow is the assumption of centralization. Thanks for the comment.

  7. Brad Hedlund says:

    Extremely well written arguments, but I must admit, I’m still sitting on the fence here. I tend to believe there is a market out there for fully automated systems, vswitch to pswitch, operating at a relatively small scale (think small to mid-market enterprise). We also know that the data center fabric in this system will be constructed in a very prescriptive and deterministic topology (as most fabrics are). Therefore such a system can likely make assumptions in building the initial control path to the SDN controller. Something along the lines of the PortLand proposal. If the initial control path deployment is simplified (and removed as a detractor), the arguments for SDN controlling the physical and virtual elements of such a system become compelling. No?
