Raising the Bar on SDN and Virtual Routing

One of the questions being asked by both network operators and larger enterprises is how SDN can play a role in their future WAN.  In some sense, though it’s an obvious question, it’s the wrong one.  The broad issue is how virtualization can play a role; SDN is one option within that larger question.  If you look at the issues systematically, it’s possible to uncover some paths forward, and even to decide which is likely to bear the most fruit.  But most important, it’s likely you’ll realize that the best path to the future lies in the symbiosis of all the virtualization directions, and realize why we may not be taking that path already.

Virtualization lets us create real behaviors by committing resources to abstract functionality.  If we apply it to connection/transport (not to service features above the network) there are two ways that it can change how we build networks.  The first is to virtualize network devices (routers and switches) and then commit them in place of the real thing.  The second is to virtualize network paths, meaning tunnels.  I would assert that in the WAN, the first of these two things is a cloud/NFV application and the second is an SDN application.

When a user connects to a service, they get two tangible things.  One is a conduit for data-plane traffic, which is delivered according to the forwarding rules of the service.  The other is “control plane” traffic that isn’t addressed to the other user(s) but to the network/service itself.  If you connected two users with a bare pipe that carried IP or Ethernet, chances are they wouldn’t be able to communicate.  Their devices would expect control exchanges, and the network elements designed to support those exchanges wouldn’t exist.

SDN in OpenFlow form doesn’t handle control packets on its own.  If we want an SDN network to look like an Ethernet or router network, we have to think in terms of satisfying all of the control- and data-plane relationships.  For IP in particular, that likely means providing a specific edge function to emulate the real devices.  The question becomes “why bother?” when you have the option of just deploying virtual routers or switches.
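To make the control-plane/data-plane split concrete, here is a minimal sketch of the sorting an edge function would have to do before it could emulate a real device.  The packet dictionaries, the `classify` function, and the gateway address are all hypothetical; only the protocol numbers (ARP EtherType 0x0806, OSPF as IP protocol 89, BGP on TCP port 179) are standard.  A real edge would need to recognize far more than these three cases.

```python
# Standard protocol identifiers.
ARP_ETHERTYPE = 0x0806
IPV4_ETHERTYPE = 0x0800
OSPF_PROTO = 89
TCP_PROTO = 6
BGP_PORT = 179


def classify(packet, gateway_ip):
    """Label a packet "control" if it is meant for the network itself, else "data".

    `packet` is a hypothetical dict of already-parsed header fields;
    `gateway_ip` is the address the emulated router would own.
    """
    ethertype = packet.get("ethertype")
    if ethertype == ARP_ETHERTYPE:
        return "control"  # address resolution is answered by the network edge
    if ethertype == IPV4_ETHERTYPE:
        if packet.get("ip_proto") == OSPF_PROTO:
            return "control"  # routing adjacency traffic on the access link
        is_bgp = (packet.get("ip_proto") == TCP_PROTO
                  and BGP_PORT in (packet.get("src_port"), packet.get("dst_port")))
        if is_bgp and packet.get("dst_ip") == gateway_ip:
            return "control"  # BGP session with the network, not with another user
    return "data"  # everything else is simply forwarded
```

Anything labeled "control" would go to whatever emulates the router or switch; anything labeled "data" could be handed straight to a forwarding rule.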

We couldn’t build the Internet on virtual routing alone; some paths have too much traffic in aggregate.  What we could do is to build any large IP network for an individual user, or even individual service, by segregating its traffic below the IP layer and doing its routing on a per-user, per-service basis.  That’s the biggest value of virtual routing; you can build your own “VPN” with virtual devices instead of with a segment of a real device.  Now your VPN is a lot more private.
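As a rough illustration of per-user routing, the sketch below gives each user a separate virtual router with its own route table, so two users can use the same address space without interfering.  The `VirtualRouter` class, the user names, and the tunnel names are invented for the example; the lookup is an ordinary longest-prefix match built on Python's standard `ipaddress` module.

```python
import ipaddress


class VirtualRouter:
    """A per-user route table; a hypothetical stand-in for a hosted virtual router."""

    def __init__(self, owner):
        self.owner = owner
        self.routes = []  # list of (network, tunnel name) pairs

    def add_route(self, prefix, tunnel):
        self.routes.append((ipaddress.ip_network(prefix), tunnel))

    def lookup(self, destination):
        """Return the tunnel for the longest matching prefix, or None if no route matches."""
        addr = ipaddress.ip_address(destination)
        matches = [(net, tunnel) for net, tunnel in self.routes if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda match: match[0].prefixlen)[1]


# Two users with overlapping private address space, each with a private router.
acme = VirtualRouter("acme")
acme.add_route("10.0.0.0/8", "acme-core")
acme.add_route("10.1.0.0/16", "acme-branch")

globex = VirtualRouter("globex")
globex.add_route("10.1.0.0/16", "globex-dc")

acme_choice = acme.lookup("10.1.2.3")      # most specific acme route wins
globex_choice = globex.lookup("10.1.2.3")  # same address, different user, different tunnel
```

The same destination address resolves to a different tunnel for each user, which is the sense in which the VPN becomes "a lot more private".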

The challenge with this is the below-IP segregation, which is where SDN comes in.  A virtual router looks like a router.  SDN creates what looks like a tunnel, a pipe.  That’s a Layer 1 artifact, something that looks like a TDM pipe or an optical trunk or lambda.  The strength of SDN in the WAN, IMHO, lies in its ability to support virtual routing.

To make virtual routing truly useful we have to be able to build a virtual underlayment to our “IP network” that segregates traffic by user/service and does the basic aggregation needed to maintain reasonable transport efficiency.  The virtual subnets that virtual routing creates when used this way are typically going to be contained enough that servers could host the virtual routers we need.  The structure can be agile enough to support reconfiguration in response to failures or even changes in load and traffic patterns, because the paths the virtual pipes create and the locations of the virtual routers can be determined dynamically.

This model could also help SDN along.  It’s difficult to make SDN emulate a complete end-to-end service, both because of the scaling issues of the central controller and because of the control-plane exchanges.  It’s easy to create an SDN tunnel; a stitched sequence of forwarding paths does that without further need for processing.  Transport tunnel routing isn’t as dynamic as per-user flow routing, so the controller has less to do and the scope of the network could be larger without creating controller demands that tax the performance and availability constraints of real servers.
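A stitched tunnel can be sketched in a few lines.  In the example below a hypothetical controller picks a path across a small topology and emits one forwarding entry per switch along it.  The topology, the switch names, the tunnel ID, and the entry format are all assumptions made for illustration; a real controller would express these entries in whatever form its switches accept, and would pick paths on more than hop count.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a list of switch names or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


def stitch_tunnel(path, tunnel_id):
    """Build one entry per switch: match the tunnel tag, forward toward the next hop.

    The final switch gets no entry here; delivery at the egress is left out of the sketch.
    """
    return [
        {"switch": here, "match_tunnel": tunnel_id, "forward_to": next_hop}
        for here, next_hop in zip(path, path[1:])
    ]


# Hypothetical five-switch topology with two routes from A to D.
LINKS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
path = shortest_path(LINKS, "A", "D")
entries = stitch_tunnel(path, tunnel_id=100)
```

Once the entries are installed, the switches forward on the tunnel tag alone.  The controller is involved again only when the tunnel has to be rerouted, which is why tunnel routing puts so much less load on it than per-user flow routing.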

If we suggest this is the appropriate model for a service network, then we can immediately point to something that virtual router vendors need to be able to handle better: the “adjacency problem”.  The trouble with multiplying the tunnels below Layer 3 to do traffic segmentation and manage trunk loading is that we may create too many such paths, making it difficult to control failovers.  It’s possible to settle this issue in two basic ways: use static routing or create a virtual BGP core.  Static routing doesn’t work well in public IP networks but there’s no reason it couldn’t be applied in a VPN.  Virtual BGP cores could abstract all of the path choices by generating what looks like a giant virtual BGP router.  You could use virtual routers for this BGP core, or do what Google did and create what’s essentially a BGP edge shell around SDN.
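Some back-of-envelope arithmetic shows how quickly the adjacency problem grows.  The sketch below assumes a full mesh of tunnels among the virtual routers, with several parallel tunnels per pair for segmentation, and compares it with the case where each router sees only a single virtual BGP core.  The function names and the example numbers are illustrative, not measurements from any real network.

```python
def full_mesh_adjacencies(routers, tunnels_per_pair=1):
    """Adjacencies when every pair of routers is joined directly by one or more tunnels."""
    return tunnels_per_pair * routers * (routers - 1) // 2


def single_core_adjacencies(routers):
    """Adjacencies when each router peers only with one abstracted core."""
    return routers


# Hypothetical VPN: 20 virtual routers, 3 parallel tunnels per pair.
mesh = full_mesh_adjacencies(20, tunnels_per_pair=3)  # 3 * 20 * 19 / 2 = 570
core = single_core_adjacencies(20)                    # 20
```

The mesh count grows with the square of the number of routers while the core count grows linearly, which is the motivation for hiding the path choices behind something that looks like one big BGP router.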

This approach of marrying virtual routing with OpenFlow-style SDN could also be adapted for use with the overlay-SDN model popularized by Nicira/VMware.  Overlay SDN doesn’t present its user interface out of Layer 2/3 devices, but rather from endpoint processes hosted locally to the user.  It could work, in theory, over any set of tunnels that provide physical connectivity among the endpoint hosting locations, which means we could run it over Layer 1 facilities or over tunnels at L2 or L3.

I mentioned NFV earlier, and I think you can see that virtual routing/switching could be a cloud application or an NFV application.  Both allow for hosting the functionality, but NFV offers more dynamism in deployment/redeployment and more explicit management integration (at least potentially).  If you envisioned a fairly static positioning of your network assets, cloud-based virtual routers/switches would serve.  If you were looking at something more dynamic (likely because it was bigger and more exposed to changes in the conditions of the hosting points and physical connections) you could introduce NFV to optimize placement and replacement.

I think the SDN community is trying to solve too many problems.  I think that virtual router supporters aren’t solving enough.  If we step up to the question of virtual networks for a moment, we can see a new model that can make optimal use of both technologies and at the same time build a better and more agile structure, something that could change security and reliability practices forever and also alter the balance of power in networking.

That’s why we can’t expect this idea to get universal support.  There are players in the network equipment space (like Brocade) who aren’t exposed enough to the legacy switch/router market for a shift in approach to hurt them as much as (or more than) it helps.  Certainly server players (HP comes to mind, supported by Intel/Wind River) with aggressive SDN/NFV programs could field something like this.  The mainstream network people, even those with virtual router plans, are likely to be concerned about the loss of revenue from physical switch/router sales.  The question is whether a player with little to lose will create market momentum sufficient to drag everyone along.  We may find that out in 2015.