How Apple Gave Contextual Services a Big Boost

Apple announced its iWatch yesterday, and it’s likely to become an icon among the hip crowd, a crowd in which I’m unhesitatingly not counting myself.  But notwithstanding the forces that drive Apple addicts to buy everything under that famous logo, useful or not, iWatch does raise some interesting questions and suggest some interesting trends.  So while I’m not rushing out to get one, I do propose to look at what the announcement could mean for the industry, and in particular for mobility and contextual services.  We may all be rushing to dance to Apple’s new tune, in a sense, Apple fans or not.

With the exception of biometrics, there’s not much you can do with an iWatch that you couldn’t do with an iPhone in a pure-function sense, and even Apple isn’t likely to be relying on simple repackaging of functionality as the basis for growth in its share price.  In my view, iWatch is intended to be a step along a trend line that has taken us from phones as instruments of calling to phones as a means of driving our lives along.

Most of what people do with smartphones requires, no surprise, that you access the phone.  You dig it out of the pocket or purse and diddle with it to text, answer IMs or emails, or update social networks.  That’s more convenient than sitting down at the computer, but it’s still not exactly instant gratification.  First and foremost, the iWatch is about improving the gratification dimension.  It’s a handy agent element for the iPhone, a more generalized extension than a Bluetooth headset, but like a headset it lets you do something without handling the phone itself.  Somebody calls or texts, and instead of digging for the phone you glance at your iWatch to see who and what.  You can generate simple responses, you can mute the alert, all without diddling with your phone.

Cynics among us might say that this is another gilding of the lily but I have to point out that cellphones gilded the black phone lily and smartphones did the same for cellphones.  The truth is that we’re trying to figure out how to best interact with a gadget whose role in our lives is expanding radically and rapidly.  The iWatch may not have all the answers at this point, but it’s asking an important question about how to best support interactions with a device that’s more than a phone and getting more functional every day.

The big question for iWatch, in my view, is where it might drive Siri.  Our same hypothetical cynics will look at the small face and limited messaging capabilities of the iWatch and suggest that it will further trivialize human interaction.  Yuppies will refuse to communicate with anything other than iWatch-suggested texts and eventually carry this simplified model of communication to the rest of their lives, making conversation seem like it’s extracted from first-grade reading material.  Well, maybe.  It seems more likely that iWatch will create enormous pressure for voice control and response, and then for what I’ve been calling “contextual services”.

It’s obvious that a super-Siri could immeasurably enhance the utility of iWatch, simply because there would be more things an owner could do without hauling out the phone.  Apple is already in a competition with Google and Microsoft over voice-operated personal assistants, but it doesn’t seem that Apple has really worked hard to get Siri to measure up to its potential (recent commercials touting Microsoft Cortana’s superiority could change that).  Perhaps now Apple will have a second reason to push Siri forward.

The “Cortana factor” could also join with iWatch in moving Apple toward contextual services in general.  Right now, Cortana offers a few contextual capabilities relating to location and calls that Siri lacks.  Rather than getting into a jab-for-jab match with Microsoft, iWatch might induce Apple to look at a general architecture for contextual services, something that could be truly revolutionary.

iWatch is a limited vehicle for interaction; it’s convenient but hardly rich.  Clues from any source would help a lot in getting the number of things a user might mean or want down to a point where iWatch could navigate the field of options.  “Buzz me when I get close to my hotel” and “buzz me when I pass a bar with my favorite wine” aren’t that far apart in terms of the iWatch interaction, but there’s a lot more behind the second than the first, and more behind the first than asking for a reminder to buy flowers triggered by being near a flower shop.  All of them demand contextual understanding of the request.

Contextual services are a big opportunity, if you address them.  Apple is taking a big risk with iWatch, I think, but not in the sense that some on Wall Street think.  The problem is that iWatch makes it very clear that any wearable adjunct to a smartphone is crying out for contextual, voice-assistant support, simply because there’s no strong mission for such a wearable element other than to act as an agent for simple interactions.  It follows that the more things you can turn into simple interactions through the application of context and speech recognition, the more valuable the gadget is.  Will Google or Microsoft or even Amazon miss that point?  Any of them could field something here.

In fact, anyone could provide contextual services and a good framework.  Carriers, startups, anyone.  A standards group or industry group could be spawned to create a specification for wearable-tech-to-smartphone interactions, and the whole space would then open up.  That’s something any Apple competitor might consider if only to poison Apple’s well should iWatch succeed.

This impacts even vendors who have nothing to do with wearable tech, watches, or even high-level services.  What we’re seeing here is an example of how something outside the network creates a new set of opportunities that the network, in part at least, would have to fulfill.  Operators don’t get to plan iWatch launches, they have to react to the conditions the launch creates, and they have to plan to do that at price points appropriate to the consumer market.  So even though iWatch today is still a shadow of the revolution many say it is, there’s still time for it to become truly revolutionary, and drag all of networking along for the ride.

Should SDN be About OpenDaylight and not OpenFlow?

I had a couple of very interesting discussions recently on SDN, and they point out what I think might be the emerging key issue for the whole concept.  That issue is whether SDN is about services or about protocols, and it’s been framed to me recently by discussions on the evolution of OpenDaylight.

The central concept of what most people would call “true” SDN is the OpenFlow protocol, which provides the means for a central controller to determine network routing by managing per-device forwarding tables.  Unlike traditional switching/routing, which “learns” network topology and reachability through adaptive exchanges with adjacent/connected elements, SDN imposes forwarding rules based on whatever policies the controller likes.
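To make the forwarding-table point concrete, here’s a minimal sketch of a controller imposing a rule, written against the open-source Ryu OpenFlow framework rather than any product discussed here; the ports, priority, and match fields are purely illustrative.

```python
# Minimal sketch: a central controller imposing a forwarding rule via OpenFlow 1.3,
# using the open-source Ryu framework.  Ports and priority are illustrative only.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class PolicyForwarder(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        # Policy, not adaptive learning: anything arriving on port 1 goes out port 2.
        match = parser.OFPMatch(in_port=1)
        actions = [parser.OFPActionOutput(2)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                      match=match, instructions=inst))
```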

Interestingly, this definition makes three interdependent but not necessarily tightly bound points.  Point one is that there is a protocol—OpenFlow—to control the devices.  Point two is that there is a central mechanism for deciding what network routing should look like, and point three is that there is a service conception behind what that central control point decides.  In the SDN world, we’ve seen a polarization of solutions based on which of these points is most important.  Can you build “SDN” if you can create services—“Networking-as-a-Service” or NaaS—regardless of how you do it?  Can you do SDN without central control, without OpenFlow?  All valid questions, since buyers of next-gen networking will buy/do what is useful to them, which might be any or all of these things.

What’s interesting about OpenDaylight in this context is that it’s been, at least for the companies I talked with recently, a kind of mind-opening experience.  All the people involved in the projects I reviewed had started with the notion of ODL as a model for an OpenFlow controller.  All are now questioning whether that’s what’s really important.

OpenDaylight is in fact a controller model, but it takes a different slant on the process.  You have higher-layer “northbound” applications that frame some conception of NaaS, and then impose that conception on hardware through what’s essentially a plug-in architecture.  Unlike a pure SDN controller, OpenDaylight can speak a theoretically infinite number of control protocols southward into the network equipment/software layer.
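For a sense of what “northbound in, plug-in out” looks like in practice, here’s a hedged sketch of pushing a flow through OpenDaylight’s RESTCONF interface rather than speaking OpenFlow directly; the controller address, credentials, and exact YANG-modeled payload vary by ODL release, so treat the details as placeholders.

```python
# Hedged sketch: programming a flow through OpenDaylight's northbound RESTCONF API.
# URL, credentials, and the YANG-modeled payload below are release-dependent placeholders.
import requests

ODL = "http://controller:8181/restconf/config"
AUTH = ("admin", "admin")  # default in early ODL releases; never leave this in production

flow = {
    "flow-node-inventory:flow": [{
        "id": "1",
        "table_id": 0,
        "priority": 10,
        "match": {"in-port": "openflow:1:1"},
        "instructions": {"instruction": [{
            "order": 0,
            "apply-actions": {"action": [{
                "order": 0,
                "output-action": {"output-node-connector": "2"}
            }]}
        }]}
    }]
}

resp = requests.put(
    f"{ODL}/opendaylight-inventory:nodes/node/openflow:1/table/0/flow/1",
    json=flow, auth=AUTH, headers={"Content-Type": "application/json"})
resp.raise_for_status()
```

The same northbound request could just as easily be resolved by a non-OpenFlow plugin southbound, which is the whole point of the shim.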

What got my conversational partners interested in OpenDaylight beyond OpenFlow was the question of evolution.  You can use ODL and OpenFlow to control legacy devices that support the OpenFlow protocol, but you could also use it to control legacy devices in other ways.  My subjects didn’t like the OpenFlow option on some of their hardware, so they asked the following question:  “Could I frame up some notions of NaaS that are meaningful to my business case, then simply apply them across a mixed infrastructure using whatever protocol was handy?”  And that was the start of something big for them.

The companies quickly realized that what was going to make the SDN business case for them was the intersection of two things.  First, what useful NaaS models could be identified and applied to current and evolving applications?  Second, what is the cost of supporting those useful NaaS models through a full set of network-control options?

One quick realization that came out of these questions was that Cisco’s notion of ACI wasn’t as dumb or manipulative as at least some of these companies had believed.  If you take a NaaS-centric view of SDN, you’re really taking a view that depends on an API converting goals into network behaviors.  How that’s done is secondary (it matters only in a cost sense).  If making no changes to current devices or practices is an option, then there’s no risk of increased operations complexity and cost.

The second realization was that the value of stepping beyond an SDN model that simply manipulates current adaptive behavior likely lies in NaaS models that can’t easily be induced from current IP and Ethernet devices behaving in their usual adaptive/cooperative way.  These companies all realized that cutting off the option of enhanced NaaS models would be taking a risk, so they were looking for a way of managing that risk.  That became the mission for OpenDaylight.

OpenDaylight is essentially a shim layer that operates between NaaS models and infrastructure.  If you have NaaS models that map to simple control of legacy IP or Ethernet connectivity, then there is no OpenFlow involved because you don’t need to change forwarding models for NaaS reasons.  That would make OpenFlow adoption totally dependent on the credibility of substituting white-box switches for current adaptive switch/routers.  That’s a savings in capex with a so-far-difficult-to-quantify accompanying change in opex.  In short, it’s a future issue for these SDN planners.

One of the companies is a service provider looking at SDN to redo their access and metro network architecture, and in particular to combine Ethernet and optics more efficiently.  They had been very interested in the evolving OpenFlow-for-optics specification, but as they got into OpenDaylight as a way of accommodating new NaaS models as they evolve, they realized that the OpenDaylight approach could eliminate the need for OpenFlow’s optical extension.  Why not just put a plugin at the bottom of OpenDaylight that talks the language of the current optical network?  Whether OpenFlow’s optical extension is useful or not (“not” is my own view), it’s obvious to my carrier friends that in the near term the devices don’t support it.  Why invent a new control protocol and refit devices to use it when the old protocols would work via OpenDaylight?

I think this illustrates an important point about OpenDaylight, which is that we should think of it not as an SDN strategy but as a NaaS strategy.  It could be, for networking in general, what OpenStack’s Neutron is to cloud networking.  You frame a NaaS in the top layer of ODL’s stack, then push control out through a plugin at the bottom in a way that’s appropriate to the devices you’re using—new or old.

In NFV terms, this means that you could view OpenDaylight as a kind of Infrastructure Manager that crosses between the two models of IM that the ISG has been kicking around.  You could use OpenDaylight as an element of a Virtual Infrastructure Manager to connect cloud elements, essentially an implementation of Neutron.  You could also use it as what some are calling a Wide-area Infrastructure Manager (WIM) that’s used to connect NFV data centers or even connect NFV service zones to metro/access and WAN services created through legacy technology.

In terms of industry history, OpenDaylight grew out of the SDN wave the ONF started, but I wonder if the student might now have surpassed its teacher.  Should we be thinking about SDN and NFV and the cloud in terms of NaaS and OpenDaylight, leaving OpenFlow as a device-control option?  My conversations with some SDN prospects suggest that might be a good plan.

A Closer Look at HP’s OpenNFV

One of the most interesting players in the NFV game is HP.  Not only do they have a strong position in IT, the cloud, and the data center, they also have a strong OSS/BSS position, and they’re prominent in all of the standards groups involved in network evolution.  That includes NFV, of course, but also SDN.  In a prior blog I noted that operators thought HP’s OpenNFV architecture was the strongest, or one of the strongest, in the market, but that it was still not fully delivered.  HP was surprised at that view, and I spent some quality phone time with the key HP people to understand their perspective, and how it might relate to or contrast with the views expressed by the operators in my survey early this year.

OpenNFV is a suite and an architecture.  HP explains it (not surprisingly) by referencing the NFV ISG’s architecture, and this is actually a good place to start because OpenNFV is surprisingly inclusive.  In particular, HP is committed to both open infrastructure (NFVI) and open third-party VNFs.  The latter point is critical in my view, though as you’ll see I still have some concerns about the details.

The heart of HP’s OpenNFV is the NFV Director, which provides the central MANO functionality outlined in the ISG’s E2E architecture diagram.  HP’s Director can support both external management elements (including the ISG’s Virtual Infrastructure Manager, or VIM) and an internal version, so it’s a bit broader than the pure orchestration model of the ISG.  The NFV Director developed out of HP’s Service Activator (an element of HP’s OSS solution), which is used today to deploy MPLS and GigE services, so it can control legacy deployment out of the box.  In one of the current PoCs, HP says that only 20% of the configuration is NFV and the rest is legacy.

The OSS roots of the NFV Director mean that HP’s approach is “higher-level” orchestration in the sense that it can manipulate service-layer elements, but HP also extends orchestration downward to a low level using the same logic.  Services are composed from service elements that can be “service-like” or “resource-like” in nature.  The elements are structured in the Customer-Facing and Resource-Facing services approach that the TMF has used historically (they’re apparently moving away from it now), and that both my CloudNFV and ExperiaSphere architectures employed.

Part of HP’s approach is a strong state/event orientation.  Service elements appear to have states and can be driven by events, and there’s policy logic that, for example, lets you decide where you are (state-wise), where you need to be, and how to get there.  This approach builds a lot of service intelligence and policy management into the model itself, which I think is the right approach.
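HP’s implementation is proprietary, so the sketch below is purely hypothetical, but it shows the flavor of a state/event-driven service element: a policy table maps “where I am” plus “what just happened” to “what to do next”, keeping the intelligence in the model rather than in external code.

```python
# Hypothetical sketch of a state/event-driven service element (not HP's implementation).
# A policy table maps (current state, event) to a next state and an action, so the
# model itself carries the "where am I, where should I be, how do I get there" logic.
from enum import Enum, auto


class State(Enum):
    ORDERED = auto()
    DEPLOYING = auto()
    ACTIVE = auto()
    FAULT = auto()


class ServiceElement:
    def __init__(self, name):
        self.name = name
        self.state = State.ORDERED
        # (state, event) -> (next_state, action): the "policy logic"
        self.policy = {
            (State.ORDERED, "activate"): (State.DEPLOYING, self.deploy),
            (State.DEPLOYING, "deployed"): (State.ACTIVE, None),
            (State.ACTIVE, "failure"): (State.FAULT, self.remediate),
            (State.FAULT, "recovered"): (State.ACTIVE, None),
        }

    def handle(self, event):
        next_state, action = self.policy.get((self.state, event), (self.state, None))
        self.state = next_state
        if action:
            action()

    def deploy(self):
        print(f"{self.name}: deploying resources")

    def remediate(self):
        print(f"{self.name}: running remediation policy")


vpn_edge = ServiceElement("vpn-edge")
vpn_edge.handle("activate")   # ORDERED -> DEPLOYING, triggers deploy()
vpn_edge.handle("deployed")   # DEPLOYING -> ACTIVE
```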

The Director supports multiple VIMs and so can orchestrate services that require different parts of the network to be based on different vendors or technologies.  HP embraces the notion of an infrastructure manager for non-NFV elements and for SDN services, which is how they can control legacy infrastructure.  The “internal” VIMs work with the same plugin architecture used in OpenStack, and HP believes it’s likely that multiple NFV data centers would mean multiple VIMs and cross-data-center orchestration.  They’ve obviously thought through this aspect of NFV deployment pretty carefully.

If I have a quarrel with the HP approach, it’s that they may be following the ISG E2E process very literally when there are aspects of the ISG model I’m concerned about.  I don’t see any virtualization of management; HP follows the model of the ISG in presuming that management elements are composed into virtual functions and that resource MIBs are potentially accessed directly.  A Director element called “agent-less monitoring” provides infrastructure event detection, something that seems like it could evolve into a more “virtualized” management approach, but HP isn’t yet there with respect to using Repositories and derived management views.  I’m concerned, as I’ve noted in other blogs, about the scalability and security of the native ISG model.

The logical question here is whether HP can make the NFV business case, or at least do as good a job of making it as anyone can at this stage.  The business case for NFV is complicated, and I think there’s no question that the V2 targets for NFV Director will be needed to drive NFV deployments in the real world, but it looks like HP has a good chance of delivering an execution that lives up to the architecture.  One of the interesting things about NFV Director is that it evolved from a very different point than most NFV approaches.  It’s an evolution of HP’s Service Activator, an OSS component, and that gives it a jump on others in two ways.

First, obviously, Service Activator can activate legacy services, which means that NFV Director can as well, so legacy and NFV elements can be commingled in deployments.  Classic NFV considers the parts of the network not implemented as virtual functions to be out of scope, which can seriously limit the ability of an implementation to deliver meaningful overall gains in service agility and operations efficiency.

Second, Service Activator was always a high-level orchestrator, meaning it was expected that there would be a second level of technical orchestration below it for most services.  I’ve said from the first that NFV orchestration really had to be multi-leveled, or specifics of network control would end up percolating into service definitions, creating major headaches when changes to infrastructure were made.  Not only that, unless infrastructure is completely homogeneous (and even current infrastructure isn’t), infrastructure-specific orchestration at the service level means that service models might have to accommodate different equipment in different places in the network.  None of that makes any sense if you’re looking to reduce operations costs.

I noted in a prior blog that I had attended a carrier meeting where two operations types sitting next to each other had totally different visions of OSS evolution.  HP heard that same thing, apparently, because they reflect the same two options—limit OSS impact or use NFV to restructure operations overall—in OpenNFV.  They have a nice slide pair that describes how the Director can be essentially a siding down which NFV elements can drive before resuming the mainstream of operations, just to handle the virtualization issue.  They also show that the Director could be used to drive lower-level processes from service models, essentially restructuring operations to be model-driven.

I’ve always liked the HP architecture, and while their current implementation hasn’t reached the V2 state that would bring full OSS/BSS and NMS integration to the table, I’m pretty comfortable they’ll get there.  Given that their model seems far more likely to meet NFV service agility and operations efficiency goals than other more limited approaches, I think it’s fair to say that HP’s OpenNFV could be the best major-vendor offering out there for service modernization in an SDN and NFV world.

Could Cisco Create Substance for the Internet of Everything?

Cisco has long been an advocate of what I like to call “Chicken-Little” marketing; you run around yelling that the traffic sky is falling and expect operators to accept sub-par ROIs to do their part in addressing the issue.  Their “Internet of Everything” story has always had elements of that angle, but it’s also been a bit muddled in terms of details.  As Light Reading reported yesterday, they’ve now made things “clearer”, but not necessarily any more plausible.

What Cisco is doing with IoE is taking every trend in networking, from the cloud to mobility to content to telepresence…you name it…and making it an explicit part of “everything”.  Semantically it’s hard to quarrel with that, but at the end of the day it’s hard to resolve the question of evolution if you look at every possible organism in every possible ecosystem.  Thus, I have to admit a strong instinctive reaction to shout out a dismissive negative in response to IoE.  That would probably be justified given that Cisco almost certainly intends it to be nothing more than a positioning exercise, a way of seducing the media to write stories.

But is there more to this, or could there be?  The truth is that increasingly there is only one network—the Internet.  We have a lot of drivers that seek to change how it works, how it’s built, but at the end of the day we still have one network that can’t evolve in a zillion different ways.  All our revolutions have to tie into some macro trend.  You could reasonably name it the Internet of Everything, even.  What you need isn’t a name, though, it’s a strategy.  To get one we have to look at the factors that shape the “everything”.

Factor number one is regulatory.  As Cisco touts all the great stuff you could do with the Internet, they implicitly support a neutrality model that would eliminate the possibility of any business model other than all-you-can-eat bill-and-keep.  The decision by many of the larger ISPs to charge Netflix isn’t a bad one, it’s a market experiment in a business model suitable for massive increases in content traffic.  We don’t need to eradicate these experiments in the name of fairness (or unfairness, depending on your perspective) we need to encourage more of them.

Would it be helpful to be able to dial in more capacity or QoS?  Interestingly, advocates of “more neutrality” seem to agree that it would be fine for users to pay for QoS on their own.  It’s not OK for them to pay indirectly, though, by having their content provider pay the ISP and the user pay the content provider.  This is (in my view, rightfully) what the courts had a problem with in the old Neutrality Order.  We should have both options, all options in fact.  We should also have the option of settlement for QoS on cross-ISP traffic flows, so users could get QoS not only for their content delivery (much of which is from CDNs directly peered with the ISP) but end to end for any relationship.  How could you say you had an Internet of Everything without that capability?

Factor two is security.  It’s already clear that the Internet is terribly insecure, and that a lot of the problem could be easily fixed.  We don’t validate much of anything online—you could spoof an email address, a DNS update, even an individual packet’s source address.  Just making the Internet more accountable would be helpful, and if every ISP had a code of online protective conduct that it enforced, then interconnected only with ISPs who did likewise, you’d have a pretty good start toward making the Internet at least accountable, and that’s the first step toward security.

You could also improve access security.  We have fingerprint readers and retinal scanners already; why not make them mandatory so that every user could be authenticated biometrically rather than by passwords that most people can’t make complicated enough or change often enough so they’re secure?  You could take at least some positive steps just by having a user associated with an in-device hard code—something that says “This is my phone” and that can be changed (presumably when the phone is sold legitimately) only through multi-step processes.

SDN could improve security by creating application-specific subnetworks that users and servers must be explicitly admitted to.  No more having people slip across the DMZ in a network.  You could even have some IP addresses be unreachable except by specific central-control intervention; forwarding rules would have to admit connections to highly secure assets explicitly, rather than letting those addresses fall into a pool that’s routed by subnet or some other portion of the IP address.
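As a toy illustration of that “explicit admission” idea (not any real controller’s API), the sketch below treats reachability as a whitelist: the default action is drop, and a protected address simply doesn’t exist for you unless central control has installed a rule that says otherwise.

```python
# Toy sketch of "explicit admission" forwarding: the default is drop, and a destination
# is unreachable unless central control has installed a matching rule.  The subnets and
# hosts below are invented for illustration.
import ipaddress

ADMITTED = {
    # (source subnet, destination host) pairs the controller has explicitly allowed
    ("10.1.2.0/24", "10.9.0.5"),   # payroll clients -> payroll server
    ("10.1.3.0/24", "10.9.0.7"),   # HR clients -> HR server
}


def forward(src_ip, dst_ip):
    for subnet, host in ADMITTED:
        if ipaddress.ip_address(src_ip) in ipaddress.ip_network(subnet) and dst_ip == host:
            return "FORWARD"
    return "DROP"  # no rule, no reachability; there is no default route to fall back on


print(forward("10.1.2.14", "10.9.0.5"))  # FORWARD
print(forward("10.1.2.14", "10.9.0.7"))  # DROP
```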

But the final point may be the hardest to achieve—the IoE has to be profitable enough that all the stakeholders are prepared to play their roles.  The FCC Chairman is reportedly bemoaning the lack of broadband competition in the US.  Well, Mr. Wheeler, competition happens when there’s an opportunity valuable enough for multiple providers to want to exploit it.  Clearly, given that carriers both in the US and elsewhere are investing in non-network business lines (some even investing in networking in other countries), they are prepared to put money into their business.  If they’re not putting it into their network, there’s something wrong with the ROI.

I think that Cisco’s IoE definition is a call for a complete rethinking of networking.  The Internet, as we know it, is not IoE.  We’re not investing in the right stuff, offering the right services, supporting the right business relationships, providing the right rules to govern behavior, or even ensuring that everyone is who they say they are.  We’ve asked a network model that was designed for nothing more than connecting the scientific and academic community to aspire to be an Internet of Everything.  There’s more to it than that, and what I’d like Cisco to do is address all of the real requirements, all the places where change is needed, with a clear position and realistic proposals.

Ciena’s Numbers Show We’re Not Facing Networking’s Future Squarely

I’ve been blogging about the fact that the Internet’s pricing model has been undermining the revenue potential of “connection services”.  If you can’t charge incrementally for bits, then there’s less value in investing to generate them.  One thing this has meant for the industry is pressure on network equipment vendors; operators don’t want to buy as much when their ROI is low.  However, you can’t be a “carrier” or “operator” without bits, and many have seen fiber vendors like Ciena as the winners in the equipment space.

Well, maybe not “winners”.  Ciena reported its quarter, and while the company beat estimates on both profit and revenue, it offered what the Street characterized as a “soft” outlook.  Revenue guidance was significantly lower than Street estimates and gross margins were also projected to be off, suggesting considerable price pressure.  So what does this tell us?

First, it proves that if bits aren’t profitable, operators will do everything they can to reduce their cost of generating them.  You can argue that the price pressure on switching, routing, and network features spawned interest in SDN and NFV as attempts to streamline infrastructure at the higher layers and cut costs there.  The thing is, we have little operator SDN deployment at this point, no NFV deployment, and the cost pressures remain.  So operators will economize where they have to, which is everywhere.

This is also why Huawei is riding high and presents such a formidable threat to the rest of the vendors in the space.  As the acknowledged price leader, Huawei can expect to sell and sustain acceptable profits at price points where other vendors would see their margins trashed and their stock prices in the toilet.  Huawei doesn’t have to innovate to get ahead in this game; they are ahead and all they need to do is innovate enough to prevent a competitor from gaining a feature advantage that would offset Huawei’s pricing power.

The second thing it tells us is that Ciena and the optical players haven’t done enough with things like SDN.  The one thing that is absolutely true about networks is that they are built on Level 1—the physical layer—no matter what goes on above.  Optics are here to stay, and that’s clearly something you can’t say confidently about the other layers.  Ciena could develop an SDN vision that could build upward from the transport/physical layer and infringe on the features and capabilities of what’s above.  If they were to do that they could effectively steal switching/routing market share without fielding a switch or router.  That in my view is what they should have done by now, and they have not done it.

Switching and routing started off as aggregating functions.  We can’t build networks by running an optical mesh between every possible communicating pair in the world.  However, nobody needs to do that.  Increasingly our networks are really metro star configurations where users are trying to reach not other users but service points of presence.  We should be asking “What kind of network topology and infrastructure builds a network that carries profitable traffic?”  I contend that such a network would be heavy on fiber and would have higher-layer gadgets that look more like giant BRASs than like switches or routers.  What other gear would be there?  What functions are needed above metro optical transport?  Those are the questions that vendors like Ciena needed to answer.

The third thing we can learn from Ciena is that you can “steal market share from operations”.  There is one budget for networking, one aggregate cost of service that has to compare favorably with the aggregate price.  While new service revenues (gained through “agility” for example) are always nice, the fact is that the only absolutely credible source of new infrastructure spending is a proportional reduction in operations costs.  Don’t rob optical-layer, or even higher-level, Peters to pay your Paul, rob somebody outside the whole equipment framework.  If we could bring about a reduction of ten billion dollars in operations and administration in networking we could increase network equipment TAM by the same amount without changing our ROI assumptions/needs a whit.

Everybody seems to have gotten somewhat interested in OSS/BSS, even major vendors like Cisco, who recently invested in an operations startup.  But billing systems are not going to move the ball much by themselves.  To make a revolutionary change in operations cost we’re going to need a revolutionary change in operations practices.  The OSS/BSS vendors are unlikely to drive this sort of change, for the same reasons that network equipment vendors aren’t anxious to drive revolutionary changes in network architecture.  So the network vendors should be robbing operations Peters here, and that includes Ciena.  There’s found money in operations, money that could help a lot of revenue and margin lines down the road.  Find it.

The final thing we’re learning from Ciena is that the operators themselves are not going to drive this process forward as they should.  Since the Modified Final Judgment and the Telecom Act and similar “privatization” initiatives globally, operators have not only gotten out of the equipment business (Western Electric, a Bell subsidiary, used to make gear for the Bell System) but out of the habit of driving equipment details.  Today, faced with a new requirement, operators tell vendors “Well, we need x, y, and z” and then stand back and hope somebody hands the items to them on the proverbial silver platter.  Obviously, when those things are contrary to vendors’ own continued revenue and profit growth, the vendors aren’t falling all over themselves to do the hand-off.

Operators do not understand how to run projects to build infrastructure in a software-driven age of networking.  Yes, there are some who do, but the processes they initiate to face the future are not run like software projects even though software is the expected output.  And, perhaps worst of all, the projects have an implied timeline measured in years when the Investor Relations people in the operator space would tell them they don’t have years to get a new approach rolled out.

Ciena didn’t have a bad quarter, but it should have had a great one, could have had one, and didn’t.  The difference between not-bad-ness and greatness isn’t that hard to bridge.  Will somebody else bridge it?  That should be something Ciena management fears more than just a continuation of the price pressure trends we’ve faced for years.

Can OpenStack Take Over NFV Implementation Leadership?

Some time ago, the Linux Foundation, in cooperation with the NFV ISG, launched an initiative (Open Platform for NFV, or OPNFV) aimed at implementing the NFV specifications.  OpenStack has now launched a “Team” with the same goals.  It’s not unusual for multiple groups to aim at the same target these days, both in standards and in implementation, but when that happens you have to ask which effort is likely to bear fruit.  It’s early in both processes, but I’ll offer my views, organizing them by what I think are the primary issues in implementation.

NFV is really three pieces: the virtual network functions (VNFs) deployed to create service features, the NFV Infrastructure (NFVI) that serves as the platform for hosting and connecting the VNFs, and the management/orchestration (MANO) functions that handle deployment and management.  The goal of all of this is to create a framework for deploying service logic in an agile way on a pool of virtualized resources.  The NFV ISG’s approach is outlined in its end-to-end specification, which is where I think the implementation problems start.

If you looked at the three pieces in a top-down way, you’d have to conclude that the first step should have been to define what you expected to be your source of VNFs.  It’s been my contention from day one that the great majority of VNFs will have to come from existing open-source and proprietary software, the former largely running on Linux and the latter often adapted from embedded logic used by vendors.  If we were to look at this pool of VNF candidates, we’d see that the stuff is already written and maps pretty well to what could be considered applications for the cloud.

Neither the Linux Foundation nor OpenStack seems to have started with the mandate that their strategy run existing logic with minimal modification and rely largely on current virtualization tools, but obviously there’s a greater chance that OpenStack would do that.  OpenStack does seem to be taking things a bit more from the top, while the OPNFV activity is focusing initially on NFVI.  So here, advantage OpenStack.

One of the specific problems that arises when you don’t consider the current applications as sources of VNF logic is that you define network requirements in terms that aren’t appropriate to the way the VNF source software is written.  Here I think the ISG created its own problem with the notion of “forwarding graphs” as descriptions of pathways between VNFs.  Anyone who has ever written software that communicates over IP networks knows that “pathways” aren’t a network-level feature.  IP applications communicate via an IP address and port in most cases.  The goal should be not to define the logical paths between components, but rather the network framework in which the components expect to run.  If I have four components that expect to be inside the same subnet, I should create a subnet and then put them in it.  Once that’s done, the way that they pass information among themselves is something I should neither know nor care about.  IP is connectionless.  But forwarding graphs encourage everyone to think in terms of creating tunnels, a step that’s unnecessary at best and at worst incompatible with the way the logic works.
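Here’s a hedged sketch of the subnet-first style using the OpenStack SDK; the cloud name, CIDR, component names, and image/flavor IDs are placeholders, but the shape is the point: one network abstraction, components dropped into it, and no per-path description anywhere.

```python
# Hedged sketch: the "create a subnet and put the components in it" pattern, using the
# OpenStack SDK.  Cloud name, CIDR, component names, image and flavor IDs are placeholders.
import openstack

conn = openstack.connect(cloud="mycloud")  # assumes a clouds.yaml entry named "mycloud"

# One network abstraction for the cooperating VNF components...
net = conn.network.create_network(name="vnf-internal")
conn.network.create_subnet(network_id=net.id, ip_version=4,
                           cidr="10.20.0.0/24", name="vnf-internal-v4")

# ...and the components are simply launched inside it.  How they pass information among
# themselves afterward is their business; no per-path "forwarding graph" is described.
for component in ("fw", "dpi", "nat", "proxy"):
    conn.compute.create_server(
        name=f"vnf-{component}",
        image_id="IMAGE-UUID-PLACEHOLDER",
        flavor_id="FLAVOR-UUID-PLACEHOLDER",
        networks=[{"uuid": net.id}],
    )
```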

OpenStack actually builds up cloud applications the way NFV should expect, so it would be logical to assume that OpenStack will do this part right.  The problem is that the forwarding-graph stuff is in the spec, and the OpenStack team seems, at least for now, to be taking it seriously.  Thus, here, we have to say that they’re starting wrong.  OPNFV isn’t starting at all at this point, so we can’t call a winner or even a leading player yet.

The next issue is the way NFV relates to infrastructure.  There are different virtualization and cloud options, different network technologies.  If MANO descriptions of NFV deployments are specific to the hosting and network technologies—if they contain detailed instructions appropriate to a single hosting approach or connection model—then every service would have to be drawn up in multiple ways reflecting all the infrastructure combinations it might use, and changed every time infrastructure changed.  This, to me, cries out for a “double abstraction” model where a generic model for deployment or connection is mapped to whatever infrastructure is selected.  How that would be done in MANO isn’t clear in the spec, in my view.  OpenStack handles a low-level network abstraction because Neutron lets you map a generic model to connection tools via plugins.  OpenStack’s orchestration model, now evolving as HEAT, has the potential to provide tighter integration of hardware with software deployment (DevOps).  OPNFV, focused on NFVI, could address this stuff quickly by defining virtualization models for all resources, but it’s not happening yet.  Advantage OpenStack.
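The double-abstraction idea is easy to sketch in code, with the caveat that the classes and driver names below are invented for illustration rather than drawn from any MANO spec: a generic connection model is written once, and an infrastructure-specific driver chosen at deployment time does the mapping, which is essentially what Neutron’s plugin layer does for cloud networking.

```python
# Hypothetical sketch of "double abstraction": a service describes a generic connection
# model once; a driver chosen at deployment time maps it onto whatever infrastructure is
# actually there.  Class and driver names are invented for illustration.

class ConnectionModel:
    """Infrastructure-neutral description: 'these endpoints share a subnet'."""
    def __init__(self, name, endpoints, cidr):
        self.name, self.endpoints, self.cidr = name, endpoints, cidr


class NeutronDriver:
    def realize(self, model):
        print(f"[neutron] create network/subnet {model.cidr} and attach {model.endpoints}")


class LegacyCliDriver:
    def realize(self, model):
        print(f"[legacy] push VLAN and ACL config for {model.name} to existing switches")


def deploy(model, infrastructure):
    # The service definition never changes; only the driver binding does.
    drivers = {"openstack": NeutronDriver(), "legacy": LegacyCliDriver()}
    drivers[infrastructure].realize(model)


deploy(ConnectionModel("vnf-internal", ["fw", "dpi"], "10.20.0.0/24"), "openstack")
deploy(ConnectionModel("branch-lan", ["cpe-1", "cpe-2"], "192.168.10.0/24"), "legacy")
```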

There’s a related issue here, which I think is going to be critical.  NFV isn’t going to be the total platform for many (if any) services; it will combine with elements of legacy infrastructure for access (Ethernet, for example) and service connectivity (VPNs, for example).  Deploying the NFV parts automatically while relying on different processes to deploy the connecting legacy network elements seems to put any overall agility and operations efficiency benefits at risk.  An effective double-level abstraction like the one I’ve suggested for NFV elements should in any case be able to describe legacy provisioning, and so allow operators to build services from evolving mixtures of NFV and legacy elements.  In my view, OpenStack’s Neutron has the potential to do some of this out of the box, and HEAT would add even more capabilities.  Again, OPNFV has yet to address this issue as far as I can determine.  Advantage OpenStack.

The final point deals with virtualization and management.  The presumptive management model in the ISG E2E architecture is to expose infrastructure elements into the VNF’s “space” and also to compose “management VNFs” into virtual-function-based service features so that the features can be managed.  This approach is workable in the abstract, but in my view it’s full of problems at scale.  I think management views—both how VNF elements view infrastructure and how management systems view VNFs—have to be virtualized and composed in order to map evolving service elements to existing operations practices.  Similarly, you have to be able to recompose your management views to accommodate the changes in operations you’ve made to gain agility and efficiency.  I don’t see either of the two groups doing anything in this area, so there’s no way to call a winner.

So where does this leave us?  I think there are a couple of areas in NFV implementation where the OpenStack people have an obvious lead.  I don’t think there’s any area where I can say that OPNFV has the upper hand, at least not yet.  Despite the fact that OpenStack’s NFV Team is a kind of new kid, I think it may be the most likely forum for the development of a logical NFV implementation.  However, as I’ve pointed out here, there are some issues that, if not addressed, will IMHO lead to crippling flaws in the implementation.  Those flaws would undermine the NFV business case, or force operators to adopt vendor-driven solutions that are more comprehensive.  It may well be that neither of these open implementation initiatives will succeed, in which case there is likely no hope for any standard implementation other than a de facto vendor solution.

Should the IP/Ethernet Control Plane be a “Layer”?

There’s been a lot of talk about “layers” in SDN and even in NFV, and I’ve blogged in the past that most of it is at a minimum inconsistent with real OSI-model principles.  Some of it is just nonsense.  Interestingly, given that we seem to want to invent layer behaviors ad hoc, we’re ignoring a real “layer”, one that could actually be the glue for a lot of stuff in both the SDN and NFV space.  It’s signaling, or what in IP is called the “control plane”.

When you dial a conventional phone (or most mobile phones) you create a set of messages that aren’t your call but rather a request for service.  When you click on a URL, you’re generating something very similar—a “signal” in the form of a DNS request that will eventually return the IP address of the site you’re trying to access.  The fact is that every Internet or IP service relationship we have today, and nearly every Ethernet one as well, is based in part on what are typically called control-plane interactions that lay the groundwork for data exchanges.

We don’t recognize the role of the control plane until we don’t have it.  Recently a big ISP lost DNS service and it ended up breaking the Internet services of millions.  I’ve had recurrent problems with Verizon’s FiOS service because of DNS reliability or loading, and as a result I changed to another DNS provider.  DNS is also often the means used to direct traffic to one of several points of service—load balancing.  In SDN, you still need DNS services and many think that DNS “signaling” is a better way to set up paths than waiting for traffic to be presented for routing.  In NFV, DNS may well be the core of any effective strategy for horizontal scaling.  And DNS is only one control-plane service/protocol.
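DNS-based load balancing, the example I keep coming back to, fits in a few lines; the sketch below uses the dnspython library and an illustrative hostname, and a real deployment would add health checks and TTL handling.

```python
# Minimal sketch of DNS as a control-plane load balancer: the "signal" (a DNS query)
# decides which of several service points the data plane will talk to.  Uses the
# dnspython library; the hostname is illustrative and real systems add health checks.
import random
import dns.resolver


def pick_service_point(name):
    answers = dns.resolver.resolve(name, "A")      # dnspython 2.x; 1.x uses .query()
    addresses = [rdata.address for rdata in answers]
    # The choice made here, round-robin or weighted, is the whole load-balancing decision.
    return random.choice(addresses)


print(pick_service_point("www.example.com"))
```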

One of the central truths of our networking revolutions is that all of them necessarily end up creating what’s popularly called “Networking-as-a-Service” or NaaS.  This control plane stuff comes into play with any conception of NaaS, and so ignoring it not only hampers realistic implementation of SDN or NFV, it also impedes our understanding of what has to be done.

Suppose you have this little app called “generate-a-packet”.  When invoked, it sends a packet to some URL.  Now suppose that you want that app to be supported by our evolving SDN/NFV world, through the medium of NaaS.  How does it work?

One way it doesn’t work is that you present your packet out of the blue and hope for the best.  If you’re sending a packet to a URL you need two things—an address for yourself and one for your target site.  So presumably your “NaaS” has three functions available.  One is “get me an address for myself”, based on a DHCP request.  One is “decode this URL”, based on DNS, and the other is “transport”.  In a sense we could say that the first service is “initialize” or “pick up the phone”, the second is “dial information and set up a call”, and the final one is “talk”.  We could visualize web access in terms similar to a phone call, with the process of clicking a URL setting up a session.  We could drive an SDN route from user to website based on this kind of NaaS.
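Here’s a hedged sketch of those three NaaS functions using plain sockets; the “initialize” step stands in for what DHCP normally does, the hostname and port are placeholders, and the point is simply that two of the three functions are control-plane exchanges, not data.

```python
# Sketch of the three NaaS primitives behind "generate-a-packet": initialize (know my
# own address; normally DHCP's job, approximated here), resolve (the DNS "dial"), and
# transport (the "talk").  Hostname and port are placeholders.
import socket


def initialize():
    # Stand-in for DHCP: ask the OS which local address it would use to reach out.
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.connect(("8.8.8.8", 80))   # connect() on a UDP socket sends no traffic
    my_addr = probe.getsockname()[0]
    probe.close()
    return my_addr


def resolve(hostname):
    # The "dial information" step: a DNS control-plane exchange, not a data exchange.
    return socket.gethostbyname(hostname)


def transport(dst_addr, payload, port=9999):
    # The "talk" step: only now does a data-plane packet exist.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.sendto(payload, (dst_addr, port))
    s.close()


me = initialize()
target = resolve("www.example.com")
transport(target, b"hello from " + me.encode())
```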

The problem with this is that we don’t have any way to signal that we’re done, so in an SDN world we’d end up having a route for every URL path the user ever clicked on.  Is that necessarily bad, though?  In an Internet sense, taking this approach eventually reduces to replicating generalized routing tables because it scales better to provide uniform connectivity than to manage all the individual “sessions”.  In a VPN though, it might be very rational.  The user doesn’t need an infinite variety of destinations, only the ones they’re supposed to be talking to.

What this proves is that it’s possible to conceptualize ever-so-slightly-different services if you go back to control-plane exchanges and NaaS.  It’s also possible to resolve some ambiguities in current “services” by adding some new commands, in the form of new signaling.  We signaled via DNS to decode an address and that was taken as a session setup request.  Could we signal to tear down a session?  Surely.  In fact, there have been numerous proposals to set up IP relationships inside SIP sessions, so SIP could be the mechanism.

This may not be the best approach for the Internet at large but it could be fine not only for VPNs but also for content relationships.  In SDN applications to mobile services, session-NaaS for content delivery could allow us to bind users to cache points that varied as the user moved around.  The central address management process simply sets either end of the session to the correct source/destination.

In the NFV world, moving to a NaaS with explicit sessions could relieve one of the big challenges of horizontal scaling, which is how you keep a sequence of packets going to the same destination when they’re related.  “Stateful load balancing” isn’t easy because we don’t know what packets go together as a single request.  We could know if the application bounded the requests in a “session”.
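A toy sketch of what an explicit session buys the load balancer: if a hypothetical session ID rides with each packet, pinning a session to one instance is a one-line hash, with no guessing about which packets belong together.

```python
# Toy sketch: stateful ("sticky") load balancing made trivial by an explicit session ID.
# The session ID carried with each packet is the hypothetical NaaS addition; without it
# the balancer has to guess which packets form a single request.
import hashlib

BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]   # scaled-out VNF instances (placeholders)


def backend_for(session_id):
    # Consistent choice per session: every packet in a session lands on the same instance.
    digest = hashlib.sha256(session_id.encode()).digest()
    return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]


for pkt_session in ("sess-42", "sess-42", "sess-99"):
    print(pkt_session, "->", backend_for(pkt_session))
```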

I’m not trying to rewrite how IP or the Internet works here.  In fact, quite a bit of this could be done by simply using current signaling/control plane protocols in a different way.  More could be done if we were to assume that some critical applications used a NaaS that had features different from those of traditional IP or Ethernet.  Exploiting those features would make services based on SDN or NFV better and not just cheaper, and that could boost the business case for deployment.

Signaling and control-plane functions sit in an untapped gap between the high-volume data-plane movement of packets and the management processes that provision infrastructure.  It may be that if we’re going to get the most out of either SDN or NFV we have to fit some features into this gap, to exploit some of the extra things that could be done with signaling.  I contend that DNS-based load balancing, already in use, is an example of this truth.  We need to look for other examples, other opportunities, and see if realizing them will move the ball.