According to a decent interview/story in Light Reading, Broadcom has an NFV strategy (no surprise) and it’s one that favors the notion of a hybrid technology approach, augmenting COTS with specialized technology to enhance performance (no surprise there either; they make the stuff). But just because you’re opportunistic doesn’t mean you’re wrong, and there are a number of factors that complicate the seemingly simple notion of NFV savings through off-the-shelf hardware. Specialization of hardware, or hybridization as Broadcom would put it, is one of them, but Broadcom will need to think about all the others too.
Factor number one is that capital savings aren’t really what carriers want these days. Most of the operators who have delved into NFV seriously have already decided that capital savings won’t make enough difference to help them. NFV targets primarily specialized appliances, and there aren’t enough of them in a network to drive a big shift in cost. Not only that, most operators agree that capital savings would likely top out at about 25%, and most of that could easily be eaten up in operations costs. Replacing a single branch access box with three or four containers linked through service chaining generates more components, thus more complexity, thus more cost. That means that first and foremost, an NFV strategy has to address the totality of the business issues, primarily operational efficiency and service agility.
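To make that arithmetic concrete, here’s a back-of-the-envelope sketch. The 25% ceiling on capital savings comes from the operator comments above; every other figure (box cost, component counts, per-component operations cost) is purely an illustrative assumption, not data from any operator.

```python
# Back-of-the-envelope TCO comparison: one dedicated appliance versus a
# service chain of hosted functions. All numbers are illustrative assumptions
# except the ~25% capex-savings ceiling operators themselves cite.

appliance_capex = 10_000.0         # assumed cost of the dedicated branch box
appliance_opex_per_year = 2_000.0  # assumed annual operations cost for one box

capex_savings = 0.25               # operators' ~25% ceiling on capital savings
chain_components = 4               # one box replaced by several chained functions
opex_per_component = 900.0         # assumed annual operations cost per component

chain_capex = appliance_capex * (1 - capex_savings)
chain_opex_per_year = chain_components * opex_per_component

years = 5
appliance_tco = appliance_capex + years * appliance_opex_per_year
chain_tco = chain_capex + years * chain_opex_per_year

print(f"Appliance {years}-year TCO:     {appliance_tco:,.0f}")
print(f"Service-chain {years}-year TCO: {chain_tco:,.0f}")
# Under these assumptions the 2,500 of capex saved is wiped out by 1,600 per
# year of added operations cost; the chain is more expensive over five years.
```

The point isn’t the specific numbers, which are invented; it’s that any per-component operations penalty compounds every year while the capex saving is taken once.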
Factor number two is that many of the NFV targets are really not even NFV applications. IMS and EPC, as I’ve pointed out, are more like simple cloud hosting, which we can already do. Service chaining, assuming you really want to approach it right, is more likely to require an agile edge box that you can load with the proper functionality, as a number of vendors have already noted. The good news for Broadcom is that it’s easier to justify specialized hardware in a custom device, even one that’s service-edge-based.
You can probably see a point here already. Yes, it is true that many possible applications of servers in networking would benefit from specialized hardware, and from specialized software for that matter. We should be thinking about what an optimized NFV platform would look like overall, and then recognizing that COTS may be giving up too much performance for too little gain. But if capex reduction isn’t really the target, then optimizing hardware to support the real NFV mission demands that we know what that mission is.
We need good data plane performance for NFV, to be sure, and there are probably applications that would demand specialized elements like content-addressable memory or high-speed arithmetic processing. Having these things might be essential in creating an NFV server to fulfill our mission, but they won’t make the mission’s business case. For that, we need to manage the complexity of networks and services, not only for NFV but for SDN and everything else.
Remember my comments yesterday on functional agility and the movement of service logic toward an almost-transactional model? Yes, you need to have high-performance data and control paths to make that work at scale, but you also have to stitch together functional components ad hoc to meet service demands. You’re not “provisioning” something in the traditional sense, because you’ve made many of the key service components multi-tenant so they’re there all the time. Yes, you’ll need horizontal scaling and load balancing just like you do in a realistic NFV IMS/EPC implementation, but you also need to be able to find components quickly, steer work reliably, and do all of the stuff that distributed OLTP systems have to do. We don’t get to this end-game immediately, but we do start along the path as soon as we admit that service chains and multi-VNF services are more complicated, and less easily operationalized, than a service made up of a couple of shared or dedicated boxes. Even if the new configuration costs less, capex-wise, it’s not likely to demonstrate any stellar TCO benefit unless we can control that complexity. Service automation isn’t about doing what we do today, but with software. It’s about doing what needs to be done for the next generation, stuff that will kill the business case if we don’t do it right.
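To make the “find components quickly and steer work reliably” point concrete, here’s a minimal sketch of the kind of registry-and-steering logic involved. The class and method names are mine, invented for illustration; they don’t come from the ISG specs or any vendor implementation, and real systems would add health checks, state handling, and policy.

```python
import itertools
from collections import defaultdict

class ComponentRegistry:
    """Toy registry of multi-tenant service components.

    Instances announce themselves as they scale horizontally; callers look up
    a function type and get the next unit of work steered to a live instance.
    Names and structure here are illustrative only.
    """

    def __init__(self):
        self._instances = defaultdict(list)  # function type -> instance endpoints
        self._round_robin = {}               # function type -> cycling iterator

    def register(self, function_type, endpoint):
        # A new instance (e.g. spun up by horizontal scaling) announces itself.
        self._instances[function_type].append(endpoint)
        self._round_robin[function_type] = itertools.cycle(self._instances[function_type])

    def steer(self, function_type):
        # Pick an instance for the next unit of work; simple round-robin here
        # stands in for real load balancing and health-aware steering.
        if not self._instances[function_type]:
            raise LookupError(f"no live instances of {function_type}")
        return next(self._round_robin[function_type])

# Usage: nothing is "provisioned" per customer; the shared components are
# already running, and each request is steered to one of them.
registry = ComponentRegistry()
registry.register("firewall", "10.0.0.11:8080")
registry.register("firewall", "10.0.0.12:8080")
print(registry.steer("firewall"))   # 10.0.0.11:8080
print(registry.steer("firewall"))   # 10.0.0.12:8080
```

The design point is the OLTP-like one from the paragraph above: the hard part isn’t hosting the functions, it’s the bookkeeping that finds them and distributes work among them without a human provisioning step.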
What this means for Broadcom’s hybrid approach is that it could be totally correct, even smart, even essential, but still not be sufficient. There has to be NFV deployment before you worry about how efficient it is. If you can identify a solid driver for NFV deployment, you have a specific business case with specific dollar benefits, against which you can apply the cost for COTS, for augmented servers, or whatever. If you can’t identify the driver, then it doesn’t matter whether you have hybrid hardware or pure COTS.
Of course, Broadcom could define mechanisms for its augmented/hybrid hardware approach to reduce complexity. There are credible missions for hardware augmentation in operations and workflow, for the OLTP-like stuff. The question is whether Broadcom is looking for these missions. In the article, the Broadcom CTO uses switches as an example of a mission that justifies special hardware rather than COTS, which is probably true. But recall that NFV wasn’t targeted at switching/routing, though there will certainly be a need for switches/routers in virtual networks of virtual functions. In those missions, which are more contained than transit switching/routing missions, it may or may not be true that special hardware is needed. The same goes for the agility/operations missions, the ones Broadcom needs to have proved out, by itself or by somebody else, before its hybrid argument takes hold.
So here we are again, dancing with NFV missions, which is probably as frustrating for the operators as it is for me (or you, my reader). It would be nice if we could lay out the totality of NFV, the business drivers, quantified benefits, requirements to be met for each benefit/driver, and do the math. That’s not likely to happen right away, but eventually it’s inevitable. My spring survey said that operators believed that their NFV trials were proving technology but not making the business case. NFV, they say, is feasible technically, but the extent to which it can deploy depends on those hazy operations savings and service agility benefits, things that we still have not explored much in trials.
The ISG is looking for a mission for Phase Two; here’s one. We need to look at NFV in context, in service applications that are credible and that have service revenues and cost targets. We need to understand how various assumptions about servers, augmented hardware, specialized software, and edge versus central hosting will impact the benefit case, and how far we need to extend the concept of “management and orchestration” beyond virtual functions to capture enough costs and complexity to be meaningful.