Virtualization and Cost Reduction

Is virtualization a cost-saving strategy?  An article in Light Reading on Monday asks whether a single-vendor virtual network is more likely to save money than a multi-vendor network (Huawei is a major source for the data).  That’s a reasonable question, perhaps, but I don’t think it’s the central question here.  That honor goes to the question I opened with, and that big question introduces other, better questions of its own.

The operators I recently connected with were all over the map regarding whether “virtualization” saved money, or was even intended to save money.  The reason was that there are many different views of what constitutes “virtualization”.  Is it NFV?  If so, most operators don’t think it will deliver substantial bottom-line savings.  Is it SDN?  Operators generally think SDN can save money.  Is it white-box switch technology?  Nearly all operators think white boxes will save money, but most doubt they’re “virtualization”.  But the big point I got was that most operators see virtualization as a means of creating agility or resiliency, and thus aren’t assigning savings goals to it.  To me, it’s the SDN, white-box, and virtualization-goals topics that need some exploration, and I’m going to attack them in the opposite order.

Almost every operator sees virtualization as a key element in any feature-hosting model, whether or not they believe in NFV broadly.  Right now, there are three camps on virtual-feature-hosting strategies.  The smallest, which is represented to a degree in most operators but discounted in terms of bottom-line impact, is “true” NFV.  The next camp is the 5G cloud-native group, who believe that 5G will employ cloud-native technology for feature hosting but won’t conform to the NFV ISG specs.  The last is the group that believes they’ll rely on virtualization, probably in vSwitch and SDN form, in all their data center strategies.

A small but growing number of operators see virtualization as a general tool for making their transport networks cheaper.  “Transport” here means largely mobile metro or IP core, and in both cases the operators are looking primarily at a form of SDN, often roughly mirroring Google’s Andromeda model of an SDN core surrounded by an IP/BGP “emulator” edge that lets their structure fit into the Internet seamlessly.  They’re, in a real sense, doing “network virtualization” as I’ve described it, abstracting the entire transport network as a black box that looks like IP from the outside.
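
To make the black-box idea concrete, here’s a minimal sketch in Python.  Every class and method name is invented for illustration; this isn’t any real controller’s API, just the shape of the Andromeda-style split: an edge that presents ordinary BGP behavior to the outside, and a core whose internal paths are programmed centrally and never exposed to peers.

```python
# Illustrative sketch only: names are invented, not a real SDN controller API.
# It models the split between a BGP "emulator" edge that looks like ordinary
# IP from the outside and an SDN core whose paths are computed centrally.

class SdnCore:
    """Central controller view of the transport interior."""
    def __init__(self):
        self.paths = {}  # (ingress, egress) -> list of internal hops

    def program_path(self, ingress, egress, hops):
        # A real controller would push flow rules (e.g. via OpenFlow);
        # here we just record the chosen internal route.
        self.paths[(ingress, egress)] = hops

class BgpEdgeEmulator:
    """Edge element that speaks standard BGP to external peers."""
    def __init__(self, name, core):
        self.name = name
        self.core = core
        self.advertised = set()

    def advertise(self, prefix):
        # Externally this is an ordinary BGP route announcement; the peer
        # never learns that an SDN core sits behind it.
        self.advertised.add(prefix)

    def forward(self, egress_edge):
        # Internally, forwarding follows the controller-programmed path,
        # not hop-by-hop IP routing.
        return self.core.paths.get((self.name, egress_edge.name))

# From the outside, the whole structure behaves like one big IP/BGP network.
core = SdnCore()
east, west = BgpEdgeEmulator("east", core), BgpEdgeEmulator("west", core)
east.advertise("203.0.113.0/24")
core.program_path("east", "west", ["s1", "s4", "s7"])
print(east.forward(west))  # ['s1', 's4', 's7']
```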

At the access network level, the greatest interest is in the white-box model.  Operators are generally more white-box focused where there’s a large number of new devices likely to be deployed, which means both the 5G edge and CPE.  This separation is interesting because it connects this approach to the NFV camp in feature hosting.  In fact, about a third of operators see the possibility that white-box uCPE and NFV could combine usefully.  That’s far less likely for missions like 5G edge, where operators overall tend to favor the AT&T-launched DANOS model of a highly programmable edge software OS that marries to a white-box switch.

You might wonder what this has to do with multi-vendor, and the answer at the high level is “not much”.  Most of the operator virtualization objectives are implicitly (at the least) linked to a no-vendor or open model.  But do single- or multi-vendor virtual networks really create significant differences in TCO, as Huawei’s comments in the article suggest?  That’s a lot more complicated.

Virtual networks are typically transport overlays, meaning they ride on something else, something “real”.  If the networks involve something that has to be explicitly hosted, then they introduce capex.  In all cases, since the virtual network is still a network in terms of behavior, they all introduce opex.  But the numbers Huawei suggests seem to be averages across a universe with an enormous variation in experiences.

Huawei tells Light Reading that virtualization will initially cause costs to rise, and then with single-vendor implementation, costs will fall to 91% of the original amount.  Even if we consider the “original amount” to be the pre-virtualization costs, every operator knows darn well that they’d never approve a project that delivered a 9% reduction in TCO.  In fact, the article from Light Reading I cited on Monday quoted an operator as saying that NFV projects had to deliver a 40% reduction in TCO to even be interesting to talk about.  But the reality is, as I noted, that most virtualization isn’t really about lowering TCO, it’s about improving agility and scalability, or about avoiding cost rather than lowering a current cost.
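
The arithmetic behind that rejection is easy to check.  Here’s a back-of-the-envelope sketch in Python, using only the percentages cited above and an arbitrary cost base (the units don’t matter):

```python
# Back-of-the-envelope check of the figures cited above, with an
# arbitrary pre-virtualization TCO of 100.
baseline_tco = 100.0
post_virtualization_tco = 0.91 * baseline_tco    # Huawei's single-vendor figure
savings = (baseline_tco - post_virtualization_tco) / baseline_tco
approval_threshold = 0.40                        # the operator's stated bar

print(f"Projected TCO reduction: {savings:.0%}")                       # 9%
print(f"Clears the 40% project bar: {savings >= approval_threshold}")  # False
```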

What about the multi-vendor assertion?  I go back to my point that most virtualization processes are really about getting rid of vendors in favor of open platforms, both hardware (white-box) and software (virtual features/functions or software switching/routing).  What I think Huawei may be doing (besides an expected bit of self-promotion) is suggesting that if you have an application for virtualization that’s pretty broad, as 5G would be, you could approach that application in single- or multi-vendor form.  In the latter case, there could indeed be cost issues, according to my own operator sources, but it’s more complicated than being just a “multi-vendor” problem.

There are, as they say, many ways to skin a cat (no disrespect to cats intended!), and vendors tend to look for differentiation, which means there’s probably a different way selected for each of the major suppliers you might involve in 5G.  The 5G connection is critical here, first because 5G is Huawei’s biggest current target and second because 5G is at the moment incredibly vague overall, and particularly so in terms of how virtualization is used.  In fact, 5G could well be the space that proves that most of the “NFV” implementations being proposed are “NINOs”, meaning “NFV in name only.”  They don’t abide by the NFV ISG spec because the spec is of no value in applying virtual functions to 5G.  It does bring a feeling of openness and comfort to wary operators, though.

Going back to Google Andromeda, the thing about virtualization is that it either has to be packaged inside a network abstraction that, at the edge, conforms to standard industry protocols and practices (IP/BGP in Google’s case), or you have to adopt a specific virtual-network standard.  What would that be?  Actually, the problem with virtualization standards is that we don’t have a broad, universal picture of a virtual-network ecosystem, so we don’t really know what “standards” we should expect.  That makes the standards idea a kind of Wild West myth, and so it could well be dangerous to consider a multi-vendor virtual network.

Even trying to create rational standards and interoperability could be politically dangerous for vendors.  Back when NFV looked more promising, HPE created a press tornado by proposing to emphasize specific vendor relationships in its implementation of NFV, in order to evolve toward a secure, orderly, and deployable community of elements.  It didn’t sound “open” and HPE got trashed, but of course it was more open than a single-vendor approach.  The HPE strategy was proof that we weren’t fully addressing the integration issues of virtual networking, leaving too much to be handled by individual vendors.

Standardizing virtualization interfaces at this point would be difficult, because we don’t have a single source of them, and because many of the attempts that have been made aren’t likely to be effective.  NFV wouldn’t have onboarding problems if it had effective virtualization interfaces to standardize.  Multiply NFV’s issues by the ones that could be created by ETSI ZTA, 3GPP 5G, and ONF SDN, and you can see how much disorder even “standards” could create.

What we need may be less a virtualization standard than a virtualization architecture.  Let’s assume that we’re going to virtualize a device, a box.  The model is that we have one or more hosted elements “inside”, and these behave so as to present an external interface set that fully matches the device the virtualization/abstraction represents.  We could call this an “abstraction domain”.  The model could then be extended by saying that a collection of ADs that operate as a unit is also an AD, presenting the kind of interface that a network/subnetwork of devices would present.  Inside an AD, then, can be anything that can be harmonized to the external interface requirements of the thing the AD’s abstraction represents.  Abstract a box and you look like a box; abstract a network or subnetwork and that’s what you look like.  Inside is whatever works.
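
Here’s a minimal sketch of that recursive model in Python.  The class names and interface sets are invented for this post; there’s no standard API here, just the key property: a composite of ADs satisfies the same external-interface contract as a single AD.

```python
# Illustrative sketch of the "abstraction domain" (AD) idea; all names are
# invented, not drawn from any standard. The key property is recursion:
# a composite of ADs exposes the same kind of external interface as a
# single AD, so "inside is whatever works".

from abc import ABC, abstractmethod

class AbstractionDomain(ABC):
    @abstractmethod
    def external_interfaces(self):
        """The interface set this AD presents to the outside world; it must
        fully match that of the device or network being abstracted."""

class DeviceAD(AbstractionDomain):
    """Abstracts a single box; inside could be a white box, a VNF, anything."""
    def __init__(self, interfaces):
        self._interfaces = set(interfaces)

    def external_interfaces(self):
        return self._interfaces

class CompositeAD(AbstractionDomain):
    """A collection of ADs operating as a unit is itself an AD: it presents
    the interface set a network/subnetwork of devices would present."""
    def __init__(self, members, exposed):
        self.members = list(members)   # inner ADs, hidden from the outside
        self._exposed = set(exposed)   # the subnetwork-level interface set

    def external_interfaces(self):
        return self._exposed

# Abstract a box, you look like a box; abstract a subnetwork, you look
# like a subnetwork. The interior composition is invisible either way.
router = DeviceAD({"IP", "BGP"})
switch = DeviceAD({"Ethernet"})
metro = CompositeAD([router, switch], exposed={"IP", "BGP"})
assert metro.external_interfaces() == router.external_interfaces()
```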

This approach, I think, is more realistic than one that tries to define a new set of control/management protocols for the virtualization process.  A side benefit is that it creates interoperability where we already need it and have it, at the device/network level.  I think that if 5G virtualization is ever going to amount to anything, this is how we’ll need it to be done.

One thing I think is clear about 5G and transformation alike is that operators are committed to open models.  Sorry, Huawei, but I reject the notion of a single-vendor solution as the ideal for optimizing TCO.  Why bother with open technology if that’s the approach you take with it?  We need to make open approaches work, not only for operators but for the market at large.