Visibility, Debugging and Network Virtualization (Part 1)

[This post was written by Martin Casado and Amar Padmanabhan, with helpful input from Scott Lowe, Bruce Davie, and T. Sridhar]

This is the first in a multi-part discussion on visibility and debugging in networks that provide network virtualization, and specifically in the case where virtualization is implemented using edge overlays.

In this post, we’re primarily going to cover some background, including current challenges to visibility and debugging in virtual data centers, and how the abstractions provided by virtual networking provide a foundation for addressing them.

The macro point is that much of the difficulty in visibility and troubleshooting in today’s environments is due to the lack of consistent abstractions that both provide an aggregate view of distributed state and hide unnecessary complexity. And that network virtualization not only provides virtual abstractions that can be used to directly address many of the most pressing issues, but also provides a global view that can greatly aid in troubleshooting and debugging the physical network as well.

A Messy State of Affairs

While it’s common to blame server virtualization for complicating network visibility and troubleshooting, this isn’t entirely accurate. It is quite possible to build a static virtual datacenter and, assuming the vSwitch provides sufficient visibility and control (which vSwitches have for years), the properties are very similar to those of physical networks. Even if VM mobility is allowed, simple distributed switching will keep counters and ACLs consistent.

A more defensible position is that server virtualization encourages behavior that greatly complicates visibility and debugging of networks. This is primarily seen as server virtualization gives way to full datacenter virtualization and, as a result, various forms of network virtualization are creeping in. However, this is often implemented as a collection of disparate (and distributed) mechanisms, without exposing simplified, unified abstractions for monitoring and debugging. And the result of introducing a new layer of indirection without the proper abstractions is, as one would expect, chaos. Our point here is not that network virtualization creates this chaos – as we’ll show below, the reverse can be true, provided one pays attention to the creation of useful abstractions as part of the network virtualization solution.

Let’s consider some of the common visibility issues that can arise. Network virtualization is generally implemented with a tag (for segmentation) or tunneling (introducing a new address space), and this can hide traffic, confuse analysis of end-to-end reachability, and cause double counting (or undercounting) of bytes or even packets. Further, the edge understanding of the tag may change over time, and any network traces collected would therefore become stale unless also updated. Often, logically grouped VMs, like those of a single application or tenant, are scattered throughout the datacenter (not necessarily on the same VLAN), and there isn’t any network-visible identifier that signifies the grouping. For example, it can be hard to say something like “mirror all traffic associated with tenant A”, or “how many bytes has tenant A sent”. Similarly, ACLs and other state affecting reachability are distributed across multiple locations (source, destination, vswitches, pswitches, etc.) and can be difficult to analyze in aggregate. Overlapping address spaces, and dynamically assigned IP addresses, preclude any simplistic IP-based monitoring schemes. And of course, dynamic provisioning, random VM placement, and VM mobility can all make matters worse.
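To make the last point concrete, here is a minimal Python sketch (the flow-record format and all names are hypothetical, purely for illustration) of why “how many bytes has tenant A sent” is awkward to answer from the physical network alone: accounting keyed on IP addresses silently merges tenants the moment address spaces overlap, and correct attribution needs a VM-to-tenant mapping that only the virtualization layer knows.

```python
from collections import defaultdict

# Hypothetical flow records as a physical collector might see them:
# (source IP, bytes). Tenants A and B both use 10.0.0.0/24, so the
# records are ambiguous on their own.
flow_records = [
    ("10.0.0.5", 1400),   # tenant A's VM
    ("10.0.0.5", 900),    # tenant B's VM -- same IP, different tenant
    ("10.0.0.7", 2000),   # tenant A's VM
]

# Naive, IP-keyed accounting silently merges the two tenants.
naive = defaultdict(int)
for ip, nbytes in flow_records:
    naive[ip] += nbytes
print(dict(naive))  # {'10.0.0.5': 2300, '10.0.0.7': 2000} -- tenant information is gone

# Correct attribution needs an out-of-band mapping that only the virtualization
# layer knows, e.g. keyed by (hypervisor, vNIC), not by IP address.
vnic_to_tenant = {("hv1", "vnic3"): "A", ("hv2", "vnic1"): "B", ("hv1", "vnic9"): "A"}
flow_records_with_vnic = [
    ("hv1", "vnic3", 1400),
    ("hv2", "vnic1", 900),
    ("hv1", "vnic9", 2000),
]
per_tenant = defaultdict(int)
for hv, vnic, nbytes in flow_records_with_vnic:
    per_tenant[vnic_to_tenant[(hv, vnic)]] += nbytes
print(dict(per_tenant))  # {'A': 3400, 'B': 900}
```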

Yes, there are solutions to many of these issues, but in aggregate, they can present a real hurdle to smooth operations, billing and troubleshooting in public and private data centers. Fortunately, it doesn’t have to be this way.

Life Becomes Easy When the Abstractions are Right

So much of computer science falls into place when the right abstractions are used. Servers provide a good example of this. Compute virtualization has been around in pieces since the introduction of the operating system. Memory, IO, and the instruction sets have long been virtualized and provide the basis of modern multi-process systems. However, until the popularization of the virtual machine abstraction, these virtualization primitives did not greatly impact the operations of servers themselves. This is because there was no inclusive abstraction that represented a full server (a basic unit of operations in an IT shop). With virtual machines, all state associated with a workload is represented holistically, allowing us to create, destroy, save, introspect, track, clone, modify, limit, etc. Visibility and monitoring in multi-user environments arguably became easier as well. Independent of which applications and operating systems are installed, it’s possible to know exactly how much memory, I/O and CPU a virtual machine is consuming, and that usage can generally be attributed back to a user.

So it is with network virtualization – the virtual network abstraction can provide many of the same benefits as the virtual machine abstraction. However, it also provides an additional benefit that isn’t so commonly enjoyed with server virtualization: network virtualization provides an aggregated view of distributed state. With manual distributed state management being one of the most pressing operational challenges in today’s data centers, this is a significant win.

To illustrate this, we’ll provide a quick primer on network virtualization and then go through an example of visibility and debugging in a network virtualization environment.

Network Virtualization as it Pertains to Visibility and Monitoring

Network virtualization, like server virtualization, exposes a virtual network that looks like a physical network, but has the operational model of a virtual machine. Virtual networks (if implemented completely) support L2-L7 network functions, complex virtual topologies, counters, and management interfaces. The particular implementation of network virtualization we’ll be discussing is edge overlays, in which the mechanism used to introduce the address space for the virtual domain is an L2-over-L3 tunnel mesh terminated at the edge (likely the vswitch). However, the point of this particular post is not to focus on how the network virtualization is implemented, but rather on how decoupling the logical view from the physical transport affects visibility and troubleshooting.
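To make the decoupling a bit more tangible, here is a toy model in Python (the class and field names are ours, not any product’s API) of the two views an edge overlay maintains: a logical view that counters and policy attach to, and a physical transport view of tunnel endpoints, tied together by a binding that only the virtualization layer touches when a VM moves.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LogicalPort:
    logical_switch: str   # e.g. "tenant-A-web-tier"
    port_id: str          # attachment point for one vNIC

@dataclass(frozen=True)
class TransportLocation:
    hypervisor_ip: str    # tunnel endpoint on the physical network
    tunnel_key: int       # key identifying the logical network on the wire

# The binding below is the crux of the decoupling: the logical side never
# changes when a VM moves; only the transport side is updated.
bindings = {
    LogicalPort("tenant-A-web-tier", "lp-1"): TransportLocation("192.0.2.11", 5001),
    LogicalPort("tenant-A-web-tier", "lp-2"): TransportLocation("192.0.2.42", 5001),
}

def migrate(port: LogicalPort, new_hypervisor_ip: str) -> None:
    """VM mobility: rebind the logical port to a new tunnel endpoint.
    Counters, ACLs, and anything else attached to the logical port stay put."""
    old = bindings[port]
    bindings[port] = TransportLocation(new_hypervisor_ip, old.tunnel_key)

migrate(LogicalPort("tenant-A-web-tier", "lp-1"), "192.0.2.77")
print(bindings)
```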

A virtual network (in most modern implementations, at least) is a logically centralized entity. Consequently, it can be monitored and managed much like a single physical switch. Rx/Tx counters can be monitored to determine usage. ACL counters can be read to determine if something is being dropped due to policy configuration. Mirroring of a virtual switch can siphon off traffic from an entire virtual network independent of where the VMs are or what physical network is being used in the datacenter. And of course, all of this is kept consistent independent of VM mobility or even changes to the physical network.
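As a sketch of what “monitor it like a single switch” could look like in practice (the data layout below is hypothetical, not the interface of any particular controller), the operator asks questions of one logical object; the controller is assumed to have already aggregated the numbers from whichever vswitches currently host the VMs:

```python
# Hypothetical, minimal stand-in for a logical-switch monitoring view.
logical_switch = {
    "name": "tenant-A",
    "ports": {
        "lp-1": {"rx_bytes": 10_240, "tx_bytes": 98_304, "rx_drops": 0},
        "lp-2": {"rx_bytes": 55_000, "tx_bytes": 1_200,  "rx_drops": 12},
    },
    "acl_rules": [
        {"id": "allow-web",    "match": "tcp dst 443", "action": "allow", "hits": 1_532},
        {"id": "default-deny", "match": "*",           "action": "drop",  "hits": 12},
    ],
}

# "How many bytes has tenant A sent?" -- one aggregate over the logical view,
# regardless of where the VMs run or how the physical network is built.
total_tx = sum(p["tx_bytes"] for p in logical_switch["ports"].values())
print(f"tenant A tx bytes: {total_tx}")

# "Is policy dropping anything?" -- read the ACL counters centrally.
for rule in logical_switch["acl_rules"]:
    if rule["action"] == "drop" and rule["hits"]:
        print(f"rule {rule['id']} has dropped {rule['hits']} packets")
```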

The introduction of a logically centralized virtual network abstraction addresses many of the problems found in today’s virtualized data centers. The virtualization of counters can be used for billing and accounting without worrying about VM movements, the hiding or double counting of traffic, or the distribution of VMs and network services across the datacenter. The virtualization of security configuration (e.g. ACLs) and their counters turns a messy distributed state problem into a familiar central rule set. In fact, in the next post, we’ll describe how we use this aggregate view to perform header space analysis to answer sophisticated reachability questions over state that would traditionally be littered throughout the datacenter. The virtualization of management interfaces natively provides accurate, multi-tenant support of existing tool chains (e.g. NetFlow, SNMP, sFlow, etc.), and also resolves the problem of state gathering when VMs are dispersed across a datacenter.
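To give a flavor of the kind of reachability question this enables (a deliberately tiny sketch with made-up rule semantics, nothing like full header space analysis and not any real policy language), a centrally held, ordered rule set lets “can A reach B on port X?” be answered with a first-match lookup instead of a hunt across devices:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    src_prefix: str            # crude string-prefix match, e.g. "10.1.0."
    dst_prefix: str
    dst_port: Optional[int]    # None matches any port
    action: str                # "allow" or "drop"

# One ordered, centrally held rule set for the logical network.
rules = [
    Rule("10.1.0.", "10.2.0.", 443, "allow"),
    Rule("10.1.0.", "10.2.0.", None, "drop"),
    Rule("", "", None, "allow"),   # default allow for everything else
]

def reachable(src_ip: str, dst_ip: str, dst_port: int) -> bool:
    """First-match evaluation over the central rule set."""
    for r in rules:
        if (src_ip.startswith(r.src_prefix)
                and dst_ip.startswith(r.dst_prefix)
                and (r.dst_port is None or r.dst_port == dst_port)):
            return r.action == "allow"
    return False

print(reachable("10.1.0.5", "10.2.0.9", 443))  # True
print(reachable("10.1.0.5", "10.2.0.9", 22))   # False -- caught by the drop rule
```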

Impact On Workflow

However, as with virtual machines, while some level of sanity has been restored to the environment, the number of entities to monitor has increased. Instead of having to divine what is going on in a single, distributed, dynamic, complex network, there are now multiple, much simpler (relatively static) networks that must be monitored. These networks are (a) the physical network, which now only needs to be concerned with packet transport (and thus has become vastly simpler) and (b) the various logical networks implemented on top of it.

In essence, visibility and troubleshooting must now take into account the new logical layer. Fortunately, because virtualization doesn’t change the basic abstractions, existing tools can be used. However, as with the introduction of any virtual layer, there will be times when the mapping of physical resources to virtual ones becomes essential.

We’ll use troubleshooting as an example. Let’s assume that VM A can’t talk to VM B. The steps one takes to determine what is going on are as follows (a sketch of this two-step workflow appears after the list):

  1. Existing tools are pointed to the affected virtual network, and rx/tx counters are inspected, as well as any ACLs and forwarding rules. If something in the virtual abstraction is dropping the packets (like an ACL), we know what the problem is, and we’re done.
  2. If it looks like nothing in the virtual space is dropping the packet, it becomes a physical network troubleshooting problem. The virtual network can now reveal the relevant physical network and paths to monitor. In fact, often this process can be fully automated (as we’ll describe in the next post). In the system we work on, for example, you can often detect, solely from the edge, which links in the physical network are dropping packets (or where some amount of packet loss is occurring).

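Here is a minimal sketch of that workflow in Python (the data structures are invented for illustration; a real controller would supply the logical drop information and resolve the physical path automatically):

```python
# Hypothetical data a controller could expose; the structure is ours, for illustration only.
logical_state = {
    # e.g. [{"rule": "default-deny", "port": "lp-2"}] if policy is dropping the traffic
    "acl_drops": [],
}
physical_path = [
    {"link": "hv1-uplink",   "loss_rate": 0.0},
    {"link": "spine3-leaf7", "loss_rate": 0.02},   # 2% loss on one physical link
    {"link": "hv2-uplink",   "loss_rate": 0.0},
]

def diagnose(logical_state, physical_path):
    # Step 1: if the virtual abstraction is dropping the packets, report it and stop.
    if logical_state["acl_drops"]:
        d = logical_state["acl_drops"][0]
        return f"dropped by ACL rule {d['rule']} at logical port {d['port']}"
    # Step 2: otherwise it is a physical problem; the logical view narrows the
    # search to the links actually carrying this traffic.
    lossy = [hop["link"] for hop in physical_path if hop["loss_rate"] > 0]
    if lossy:
        return "physical loss on: " + ", ".join(lossy)
    return "no drop in either layer; compare installed vs expected forwarding state"

print(diagnose(logical_state, physical_path))
```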
A number of network visibility, management, and root cause detection tools are already undergoing the changes needed to make this a one-step process from the operator’s point of view. However, it is important to understand what’s going on under the covers.

Wrapping Up for Now

This post was really aimed at teeing up the topic of visibility and debugging in a virtual network environment. In the next post, we’ll go through a specific example of an edge overlay network virtualization solution, how it provides visibility into the virtual networks, and how it enables advanced troubleshooting of the physical network. In future posts, we’ll also cover tool chains that are already being adapted to take advantage of the visibility and troubleshooting gains possible with network virtualization.


12 Comments on “Visibility, Debugging and Network Virtualization (Part 1)”

  1. Colin Scott says:

    Just curious, isn’t it possible that the network controllers themselves are buggy?

    Going back to your example, let’s assume that VM A can’t talk to VM B. Suppose that:
    – We examine the virtual configuration, and verify that nothing in the virtual abstraction is dropping packets.
    – We examine the physical network, and find that the switches are forwarding packets between hypervisors correctly.
    – We examine the vSwitches themselves, and discover that their routing tables do not properly implement the virtual networks’ policies. We ultimately track the problem to a bug in the controllers that installed the routing entries.

    Presumably network controllers will become stable at some point, just as modern operating systems and hypervisors are generally bug-free. But how long are we from that point? [cf. http://goo.gl/8zydR]

    • Of course, there can be bugs anywhere in the system, including the management tool that’s displaying the information (we’ve run into this problem before, in which the flows being displayed were not representative of what was actually in the forwarding table). However, given the logical view and the physical forwarding state (of which the vswitch flows are a part), you should be able to have an end-to-end accounting of how a packet is handled. Even then, you can have a disconnect between the flows as state in the network and how the forwarding mechanisms (whether hardware or software) interpret them. But that problem exists independent of network virtualization.

  2. Thanks for taking the time to post this, guys. So, if I understand correctly, the idea is to create a complete abstraction of the virtual network from the physical network in the design. In developing your virtual network design you don’t intermix constructs between the two networks. You wouldn’t, say, have a VLAN that spans both your physical network and your virtual network, allowing you to simplify network troubleshooting between VM Host A & B. It would also make troubleshooting the underlying physical network simple. I think I get this, even if on the surface it looks like the work is doubled because of two independent networks.

    Do the counters break when tracking virtual to physical ports? What are the considerations when needing to support a design that is completely abstract for the virtual hosts on the network but has application-level interaction with physical hosts within the same logical constructs, such as a VLAN that has to span both constructs due to application requirements?

    • Hi Keith. This is a great question.

      Under our definition, a virtual network introduces a new address space. So VLAN 3 in virtual network A is completely different from VLAN 3 in virtual network B, *unless* those two virtual networks are bridged together, or bridged to the same physical L2 domain.

      With regards to counters, physical counters should continue to accurately depict how many packets/bytes have been transferred on the wire. Virtual counters, on the other hand, will only provide information on bytes/packets that have traversed the virtual networks. This means that they don’t double count any tags or headers used to implement the virtualization, and they remain consistent across mobility events.
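      To make the relationship between the two counters concrete, a small back-of-the-envelope sketch (the 50-byte figure is just an illustrative per-packet encapsulation overhead, not a claim about any particular encapsulation):

      ```python
      ENCAP_OVERHEAD = 50           # illustrative per-packet overhead of an L2-over-L3 encapsulation
      packets, frame_bytes = 1_000, 1_400
      virtual_bytes = packets * frame_bytes                      # what the tenant's VMs actually sent
      physical_bytes = virtual_bytes + packets * ENCAP_OVERHEAD  # what the wire (and pswitch counters) see
      print(virtual_bytes, physical_bytes)                       # 1400000 vs 1450000
      ```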

  3. Paul Gleichauf says:

    Edge overlays are preferable to the classical physical network model, particularly for extension up to the application layer, and it’s hard to argue against the advantages of your approach in setting up the arguments to follow in your anticipated posts in this thread. Prior experience also favors the appropriate use of tags and the right abstractions. I look forward to learning more about what the right abstractions are, but wonder whether the two principles are sufficient.

    Instead of starting a debate now, perhaps I can encourage consideration of a few complicating issues in later posts in the series. They include:

    1) Time constraints, how dynamic virtual networks can and should be: limiting and measuring the rate of change.
    2) Satisfaction of resource constraints and the degree to which traffic engineering can be avoided
    3) Definition of the extent of virtual traffic visibility at the edge
    4) Violations of the physical-virtual boundary for security (in addition to debugging)
    5) The management network as a separate physically protected channel or a secured prioritized virtual one

    A more detailed exposition for the patient reader is below.

    In a virtual world things can change very quickly, but should they if we are to avoid chaos? Many protocols assume both sufficient sampling time to gather meaningful statistics as well as depend upon relatively long convergence times to settle. Our abstractions must take into account how fast things are allowed to change, especially since many of the processes you already have described will be automated. Moreover as in any network we have to be able to correlate monitored events, so global time services with sufficient resolution are required. Virtual machines cannot really march to their own drummer. This is even more significant, and more tied to the physical realm, when latency must be tightly controlled.

    In the spatial dimension, have you made an implicit assumption of no resource contention, that is over-provisioned transports and links are always available and especially application-level traffic engineering is minimal? One of the big selling points for virtualization has traditionally been both higher and more flexible utilization. Customers may resist over-provisioning when their link costs are especially high, and therefore be willing to take on the cost of traffic engineering. L3 over L2 meshes may not be universally attractive, even in a data center (which is not identified in this blog posting as even the main design target as opposed to the discussion in the earlier What Should Networks Do For Applications).

    There is likely to be wide agreement that the network is the ideal place to sense and arbitrate between edge compute node traffic demands. It is very hard and likely misplaced to have edge nodes react to other than their own flows and adjust their behavior accordingly. The tricky bit is getting the abstractions right to convey the necessary relevant information to whatever controller/management-console/policy-decision-point initiates appropriate reactions. End-to-end encryption, where the end is a compute node, greatly complicates trustworthy monitoring, so in cases where the soft switch can be allowed to terminate and tag traffic, things are much simpler. Is there recognition under various compliance and security models that this is sufficient? Even assuming that all parties are authenticated and traffic is protected between switches, in the past there have been issues with this model for some categories of customer. In addition, when the network is virtualized across multiple cooperating soft switches, the security model can become painful to coordinate.

    Many cooperating virtual and physical devices are difficult to secure on multiple fronts that can violate the nice assumptions of a well-defined separation between the physical and virtual. It is not a good idea to package secret keys in virtual images. Some form of hardware support for protected keystores is a much preferred solution. This is especially important when you cannot afford to trust every node, physical or virtual. Certificate management, including both provisioning and expirations (again a tie to consistent global clocks), is important and must be continuously monitored, protected, and automated, but on relatively slow timescales.

    Lastly, if you want to make sure that you can always monitor and always take management actions on your network it may be simplest to have a separate physical network for management. Large scale data centers seem to have readopted this old pattern that dates back to at least SS7 circuit switches. Otherwise one is again faced with providing traffic engineering to appropriately prioritize the management plane to make sure one always has an accurate and effective global ability to sense and react to problems.

    In summary, could this mean that our evolving virtual networks should retain several significant physical anchors and constraints to be sufficiently manageable, flexible, secure and generally useful?

    • Paul, I think your comment is way more insightful than the post it responds to ;)

      Excellent discussion points all around. Perhaps you and I should work together to highlight these issues in a subsequent post?

  4. […] Terry Slattery ruminates on the power of creating (and using) the right abstraction in networking. The value of the “right abstraction” has come up a number of times; it was a featured discussion point of Martin Casado’s talk at the OpenStack Summit in Portland in April, and takes center stage in a recent post over at Network Heresy. […]

  5. […] making it more scalable, enabling rapid deployment of networking services, and providing centralized operational visibility and monitoring into the state of the virtual and physical […]

  6. With regard to troubleshooting networks, I’ve encountered a number of situations where an error counter was incrementing, but there was no information regarding the endpoints that were impacted; for example, a packet whose forwarding lookup failed. Many of these errors then require a packet capture on the interface or device that reported the error. Virtualization won’t help solve this problem for physical network issues (i.e. the underlay). The hardware will probably need to record N bits of the header in order to identify the endpoints that were impacted.

    Paul makes a good point about a separate management network. However, the management network will eventually fail. A fall-back mechanism would be to send management traffic in-band, which would require prioritization of management traffic. If prioritization is going to be required to handle the backup scenario, doesn’t the problem simply become one of shared fate?

  7. dovydas says:

    It’s been a year. Where is the second part of this great discussion?

