How SDN Models Might Decide the “Orchestration Wars”

One of the interesting things about SDN is that it may be getting clearer because of things going on elsewhere.  We still have a lot of SDN-washing, more models of what people like to call “SDN” than most would like, but there’s some clarity emerging on just how SDN might end up deploying.  I commented on SDN trends in connection with HP’s ConteXtream deal last week, and some of you wanted a more general discussion, so here goes!

There have been three distinct SDN models from the first.  The most familiar is the ONF OpenFlow “purist” SDN, the one that seems to focus on SDN controllers and white box switches.  The second is the overlay model, popularized by Nicira, which virtualizes networks by adding a layer on top that’s actually a superlayer of Level 3, not Level 4 as some would like to think.  The final model is the “software-controlled routing/switching” model, represented in hardware by most vendors, Cisco in particular, and in software by virtual-router products like Vyatta from Brocade.

Virtual switching and routing, and overlay SDN, are both the result of a desire to virtualize networks just as we did with computing.  In effect, these create a network-as-a-service framework.  That can be done with ONF-flavored SDN too, but the white-box focus has tended to push this model to a data center switching role, and to a role providing for explicit forwarding control in the other models.

Too many recipes spoil the broth more decisively than too many chefs.  Diversity of approach and even mission isn’t the sort of thing that creates market momentum.  What I think is changing things for SDN is a pair of trends that are themselves related.  The first is rapidly growing interest in explicit network-as-a-service, both for cloud computing and for inclusion in retail service offerings.  The second is NFV, and it may be that NFV is why the NaaS interest was finally kindled in earnest.

NFV postulates infrastructure controlled by a “manager” of some sort.  Initially this was limited to a virtual infrastructure manager, but many in the ISG are now accepting a “connection manager” responsible for internal VNF connectivity, and some are accepting that this might be extended to support connection to the user.  The important notion here is the “manager” concept itself.  You have to presume (the ISG isn’t completely clear here so “presuming” is a given) that a manager gets some abstract service model and converts it into a service.  That’s a pretty good description of NaaS.

If a manager can turn an abstraction into connections under the control of MANO in NFV, it’s not rocket science to extend the notion to applications where there’s no NFV at all.  I could use an NFV VIM, for example, to deploy cloud computing.  I could use a “connection manager” to deploy NaaS as a broad retail and internal service.

In NFV, most would think of the connection manager as controlling SDN, meaning that there’s an SDN controller down below.  That would likely be true for NFV inter-VNF connections, and it could also be true for edge connections to NFV services.  But logically most “connections” beyond the NFVI in NFV would be made through legacy infrastructure, so connection managers should be able to control that too.  Some would use OpenDaylight, and others might simply provide a “legacy connection manager” element.
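To make the “manager” notion a bit more concrete, here’s a minimal sketch (Python, purely illustrative; the class and method names are my own assumptions, not anything defined by the ISG) of how the same abstract NaaS request could be handed to either an SDN-backed or a legacy-backed connection manager:

```python
# Hypothetical sketch: a "connection manager" turns an abstract NaaS request
# into connectivity, regardless of what sits underneath it.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class NaaSRequest:
    """Abstract service model: what to connect, not how to connect it."""
    service_id: str
    endpoints: list          # e.g. VNF ports or user access points
    model: str               # e.g. "LINE", "LAN", "IP-Subnet"

class ConnectionManager(ABC):
    @abstractmethod
    def build(self, request: NaaSRequest) -> None: ...

class SdnConnectionManager(ConnectionManager):
    """Realizes the abstraction by driving an SDN controller (OpenDaylight, say)."""
    def __init__(self, controller):
        self.controller = controller          # hypothetical controller adapter
    def build(self, request):
        for src, dst in zip(request.endpoints, request.endpoints[1:]):
            self.controller.install_path(request.service_id, src, dst)

class LegacyConnectionManager(ConnectionManager):
    """Realizes the same abstraction against legacy gear via an NMS adapter."""
    def __init__(self, nms):
        self.nms = nms                        # hypothetical NMS/EMS adapter
    def build(self, request):
        self.nms.provision(service=request.service_id,
                           model=request.model,
                           endpoints=request.endpoints)

# Whatever sits above (MANO or not) only ever calls ConnectionManager.build(),
# which is what lets the same NaaS model work with or without NFV underneath.
```

The point of the sketch is simply that whatever sits above the manager never needs to know which implementation it got.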

It’s this that makes things so interesting, because if we can use connection managers to create NaaS and we can have legacy connection managers, we can then use legacy infrastructure for NaaS.  The manager-NaaS model then becomes a general way of addressing infrastructure to coerce it into making a service that can be sold.

In TMF terms, this might mean that “managers” are providers of resource-facing services.  If that’s true, then orchestration at the service level, meaning within OSS/BSS, might be able to do all of the higher-level orchestration involved in NFV.  “Higher-level” here would mean above MANO, above the process of deploying and controlling virtual functions.

Oracle sort-of-positioned itself in this camp with their NFV strategy.  I commented that it was the most operations-centric view, that it had TMF-style customer-facing and resource-facing services, and that it seemed to be positioned not as a limited implementation of VNF orchestration but as a broader approach, perhaps one “above” NFV.

I’ve been saying for quite a while that you need total cross-technology, vertically-integrated-with-OSS/BSS orchestration to make the service agility and operations efficiency business cases for NFV.  There have always been three options for getting there.  First, you could extend NFV and MANO principles upward.  Second, you could extend OSS/BSS downward, and third, you could put an orchestration stub between the two that did the heavy lifting and matching between the environments of OSS/BSS and NFV.  How would an SDN-and-legacy NaaS model influence which of these options would be best, or most likely to prevail?

It might not change much, even if the NaaS story comes about.  The NFV ISG has taken a very narrow position on its mission—it’s about VNFs.  If you presume that the evolution to NFV comes about because services are converted from appliance/device-based to VNF-based, then the easiest way to orchestrate would likely be to extend MANO upward.  If you presume that NFV deploys to improve service agility and operations efficiency, then orchestration has to provide those things, and even if you orchestrated VNF-based versions of current services you’d still have the same operations problems unless something attacked that area too.

There’s some pressure from operators conducting NFV trials to broaden the trials to include operations, and also some pressure to demonstrate specific efficiency and agility benefits.  However, these trials and PoCs are based on the ISG model of NFV and so they’ve been slow to advance beyond the defined scope of that body.  Operators haven’t told me of any useful SDN orchestration PoCs or trials, and most of the operations modernization work in operators is tied up in long-term transformation projects.

That’s what’s creating the race, IMHO.  NFV could win it by growing “up”, literally, toward the higher operations levels and “out” to embrace legacy elements and broader connection-based services.  SDN could win it by linking itself convincingly to an operations orchestration approach, and OSS/BSS could win it by defining strong SDN and NFV connections for itself.

Who will win is probably up to vendors.  OSS/BSS has always moved at a pace that makes glaciers look like the classic roadrunner.  NFV is making some progress on generating a usefully broad mission, but not very quickly.  So I’m thinking that the question will come down to SDN.  Can SDN embrace an orchestration-and-manager model?  The competitive dynamic that might be emerging is what will answer that question.

How HP’s ConteXtream Deal Might Change the Game

Hint:  It’s not how you think!

HP is certainly at least one of the functionality leaders in the NFV race, and the fact that they’re an IT player is important to senior management at many operators.  They’ve won what’s arguably the most important NFV deal yet (Telefonica), and they’re on track to deliver convincingly on operations integration.  In one regard, though, they reminded me of the Tin Man in the Wizard of Oz; they lacked a heart.

SDN may be the critical heart of any NFV deployment, and HP’s SDN position was “referential” in that they support OpenDaylight.  That’s not enough for some operators, and HP has now fixed that by acquiring ConteXtream.  The move may signal some new HP aggression in the SDN/NFV space, and it may even move the ball in terms of “network as a service.”  Or it may be a simple tactical play, one that could even go wrong.

ConteXtream provides a form of “overlay SDN” not unlike the Nicira model that first popularized SDN as a concept.  That approach offers three potentially significant benefits.  First, overlay SDN is infrastructure-independent and so it doesn’t force HP to take a position on network equipment technology.  Second, ConteXtream implements OpenDaylight and so it reinforces HP’s strategic SDN commitments but realizes them in a form operators can buy and deploy (and they have done so already).  Finally, overlay SDN models are fairly easy to make end-to-end.  All you need is some software to do the encapsulation/decapsulation at any access point and you have a virtual network that behaves much like a VPN but can be deployed with incredible agility and in astonishing numbers.
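To illustrate that encapsulation point, here’s a toy sketch (Python, not production code; real overlays use VXLAN, NVGRE, or GENEVE carried over UDP/IP) of the wrapper idea: an outer header carries a virtual-network ID, and the underlay simply forwards ordinary packets.

```python
# Toy illustration of the overlay principle: wrap the tenant frame in an outer
# header that carries a virtual-network ID; the underlay forwards plain packets.
import struct

def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a minimal VXLAN-style header: flags word plus a 24-bit VNI."""
    return struct.pack("!I", 0x08000000) + struct.pack("!I", vni << 8) + inner_frame

def decapsulate(packet: bytes) -> tuple:
    """At the far-end access point, recover the VNI and the original frame."""
    vni = struct.unpack("!I", packet[4:8])[0] >> 8
    return vni, packet[8:]

# Standing up another virtual network is just picking another VNI; the routers
# and switches underneath never need to be touched or even know about it.
vni, frame = decapsulate(encapsulate(5001, b"tenant ethernet frame"))
assert (vni, frame) == (5001, b"tenant ethernet frame")
```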

There’s also a tactical issue that ConteXtream addresses.  HP has been forced to work with SDN from other players, including major competitors, at key NFV accounts.  This obviously keeps the competitors in play where HP would like to have complete control, and having their own SDN strategy could be a big step in that direction.

The best of all possible worlds would be that HP takes the ConteXtream deal for tactical benefits and strategic potential, but it’s simply not possible to tell whether that’s the case.  In his blog on the acquisition, Saar Gillai (head of HP’s cloud and NFV business) cites both service-chaining benefits and subscriber connection benefits for ConteXtream, and that could be an indicator that HP intends to use the deal both to support its current PoC activity (where other SDN vendors are sticking their noses in) and to address broader SDN service issues.

The connection between all of this and NaaS is the big question, for SDN and for NFV.  There is no question that SDN is critical “inside” NFV where connections among VNFs have to be made quickly, efficiently, and in large numbers.  If you have SDN in place for that purpose, it would make sense to use SDN to provide connections outside the NFV enclave too, and that could open not only a broader NaaS/SDN business but also expand the scope of NFV.

We have virtual networks today using legacy technology: VPNs based on IP/MPLS and VLANs based on Ethernet standards.  The problem with the approach is that these networks are based on legacy technology; they have to be supported at the protocol/device level and there are limitations in their number and the speed at which you can set them up.  Overlay SDN can add virtual networking to legacy networking, removing most of the barriers to quick setup and large numbers of users without changing the underlying network or even requiring any specific technology or vendor down there.  I’ve blogged a number of times about the benefits of application-specific and service-specific virtual networks; overlay SDN can create them.  You could have NaaS for real.

This summary demonstrates three possible values of ConteXtream: you can connect VNFs with it, extend NFV to users with it, and build it into cloud and telecom services even without NFV.  HP seems committed to both of the NFV-related missions, and if they were to embrace the third, pure-SDN mission and add in their operations-and-legacy orchestration capability, they could build services that could support new NaaS models, which could start a legitimate SDN arms race to replace the current PR extravaganza we tend to have.

The competitive dynamic between HP and Alcatel-Lucent might be a factor in that.  Nuage is still in my view the premier SDN technology, but as I’ve noted Alcatel-Lucent has tended to soft-ball Nuage positioning, perhaps out of fear of overhanging their IP group’s products and perhaps simply because of product-silo-driven positioning that I’ve already commented on.  HP fires a shot directly at Alcatel-Lucent with the ConteXtream deal, one they might not be able to ignore.

If Alcatel-Lucent takes a more aggressive NaaS position, it would follow that Juniper and Cisco could also be forced to respond in kind.  It’s possible that virtual-router leader Brocade (with Vyatta) could then respond, and all of this could create a new model for network connectivity based on SDN.

NFV could be impacted by this evolution too.  Overlay SDN doesn’t provide a direct means of coupling connection-layer QoS to the network layers that could actually influence it.  You can do that with operations orchestration of the type used in NFV, which could pull NFV orchestration up the stack.

Oracle’s positioning might also play in this.  Oracle has been pushing a TMF-and-operations-centric vision for NFV, but its Oracle SDN strategy for enterprises includes firewall, NAT, and load-balancing features that are most often considered part of NFV service chaining.  Augmented SDN could then be seen as a way of bridging some NFV features to the enterprise.  Since both Alcatel-Lucent and HP have SDN now and since both have NFV features, might this presage an “NFV lite” that adds some capabilities to the enterprise?  Remember that Alcatel-Lucent’s Rapport makes IMS a populist-enterprise proposition.

The net here is that an arms race with SDN could actually open another path to both operations orchestration and service opportunity.  In fact, you could secure a better cost/revenue picture from operations orchestration and SDN in the near term than from NFV, presuming you did NFV without those opex orchestration enhancements.  Some might think that means that NFV is at risk of being bypassed, but it’s not that simple.

In the long term, service provider infrastructure is a cloud on top of dumb pipes.  We have over a trillion dollars a year on the table from cloud-based services so dynamic that NFV will be needed to deploy and orchestrate them no matter whether we call them “network functions” or “cloud applications”.  What NFV is at risk for is the loss of a valuable early deployment kicker.  We can do without NFV today, but we’ll be sorry tomorrow if we try to do that.

How Operator Constituencies are Groping the SDN/NFV Elephant

I often get emails on my blogs from network operators (and of course network vendors too, but those are another story).  One of the things I get from those emails that I find particularly fascinating is the difference in perspective on SDN and NFV between the pillars of power in operator organizations.  We talk all the time about what “operators” think or want, but the fact is that their thoughts and wants are not exactly organized.  Ask a given operator a question on their plans for SDN or NFV and you’ll get different pictures depending on who and where you touch, just like the classic joke of “groping the elephant” behind the screen and identifying the parts as trees, cliffs, snakes, etc.  I thought it might be interesting to sort the views out, particularly with some feedback on the comments I made in Tuesday’s blog.

One group of people, the “standards and CTO” people, see SDN and NFV as technology evolutions, primarily.  To their minds, the value proposition for them is either fairly well established or not particularly relevant to their own job descriptions, and the question is whether you can make the technologies work.  This group generally staffs the industry initiatives like the ONF and the NFV ISG, and they’re focused on defining the way that things work so that SDN and NFV can actually do what legacy technology already does, and more.

Within this group, SDN and NFV deployment is seen as something for 2018 to 2020, because it will take that long to go through the systematic changes in infrastructure that deployment of either would require.  SDN is generally seen by this group as having an earlier application than NFV, and as requiring fewer enhancements or changes to specifications to make it work.  Among the S&CTO crowd, almost 80% think SDN is already in the state needed for deployment.  NFV is seen as ready for deployment by less than half that number, though both technologies are thought to be ready for field trials.

The greatest challenge for both SDN and NFV is seen by the S&CTO group as “product availability.”  For SDN the feeling is that a mature controller from a credible source and switch/router products (including “white box” switches) from credible sources will have to be available before full deployment would be considered.  The group sees vendors as dragging their feet, and in more than half the cases actively obstructing the standards process for both SDN and NFV.

The second group is the CIO and Business/Operations group.  This group believes that current activities for both SDN and NFV border on “science projects” because the linkage of the new technologies and their expected behaviors to OSS/BSS has not been effectively defined.  That means that from the perspective of CIO&O, the value proposition for SDN or NFV deployment is incomplete.

Whose fault is that?  Most everyone’s.  Three-quarters of the group think that the ONF and the NFV ISG have dropped the ball on management and operations integration.  They believe that proposing new infrastructure demands proposing effective management/operations models for it.  A slightly smaller percentage thinks that the TMF has dropped the ball, and that the body has simply taken too long and shown too little insight in moving toward a better operations model.  Almost all of them blame vendors for what they see as the same-old-same-old attitude toward operations/management—let the other guy do it.

For this group, the biggest problem for SDN and NFV is the lack of a suitable operations framework to support the technologies themselves, much less to deliver on the scope of benefits most hope for.  One in nine think that SDN/NFV can be operationalized based on current specifications.  One in eleven think that operations efficiency benefits or service agility could even be delivered based on current ops models.  Almost exactly the same number think that vendors will have to advance their own solutions, and this group holds out the most hope for vendors who have OSS/BSS elements in their portfolios.

The third group is the CFO organization, and this group has very different views from both the other groups.  For CFOs, the problem isn’t and never was “SDN or NFV deployment” per se.  They don’t see technology deployment in a vacuum; it has to be linked to a systematic shift in business practices—what has been called “transformation”.

Transformation is apparently a cyclical process.  Operators’ CFO groups say that transformation has been going on for an average of a decade, though at first it was focused on “TDM to IP”.  It’s enlightening to hear what they think is the reason for cyclic transformation: eight out of ten said the IP transformation didn’t deliver what they’d hoped for in terms of operations efficiency and service agility.  To many of these, the lesson is not to tie “transformation” as a goal to a specific technology realization.  Rather, you have to tie technologies to the goal.

That’s where the problem lies.  Fewer than one in twelve in the CFO group thought that SDN or NFV had a convincing link to transformation.  Interestingly, less than a third thought it was a priority to create such a link, which shows that CFOs and their direct reports are more interested in business results than in promoting technologies.  And if you press, more than half the CFO group thinks that the right approach to transformation should be “top down”, meaning it should focus on service lifecycles and operations and not on infrastructure.  Not surprisingly, this group tends to take an operations-centric view of the technology needed, and they also believe that vendors will have to play a key role in bringing the transformation about; standards are too slow.

The final group is Network Operations, the group responsible for ongoing management of the network.  This group (perhaps unsurprisingly) has the least cohesion in the views they express.  In the main, they see the issues of SDN and NFV to be in another group’s court for now.  However, I do get a couple of consistent comments from the operations group.

The first is that they believe that NFV and SDN are both more operationally complex than the alternatives and that there has been little or nothing done in trials or PoCs to address this.  At this point they’re seeing this as a pure network management problem not a customer care problem.

The second comment from operations is that vendors who supply traditional gear are telling their operations contacts that both SDN and NFV won’t roll out suitably until 2016.  There are no reports of vendors suggesting they consider SDN/NFV alternatives in current purchases, or consider holding back on deals to prevent being locked into an older technology.

If you dig through all this, there are a couple of themes that stand out.

First, operators themselves are not organizing their own teams around an NFV strategy and engaging all their constituencies.  Even where a group cites something that is clearly identified with another (CFO people and service lifecycle, for example) there is little or no attempt to coordinate the interests.  As a result, there is no unified view of either SDN or NFV across all the constituencies and no solid broad support for either concept.  Few operators are making SDN/NFV a cross-constituency priority, and some are planning “transformation” without any specific goal to employ either SDN or NFV in it.

Second, too much time has been and is being spent proving something nobody seems to doubt, which is the concept of NFV.  The problem is that the benefit case for SDN and NFV necessarily cuts across all the groups, and nothing is really uniting them.  Most of this problem can be attributed to the relatively narrow scope of both SDN and NFV standardization; both are too limited to cover enough infrastructure and practices to make a convincing business case.

Third, this is getting way too complicated.  Even the CTO team thinks that the work of the respective standards bodies is making both SDN and NFV more complex and likely more expensive to implement.  Some operations people noted that they were being told to “forget five-nines” with traditional networks at the same time as the standards people were trying to ensure that every aspect of reliability/availability was addressed through service definitions, redundant VNFs, failovers, and so forth.  They wonder how such a setup would ever even meet current costs, and those costs have to be lowered.

Vendors are finding themselves on the horns of a dilemma.  On the one hand, the current trials are too narrow to be likely to advance the cause of SDN or NFV.  One operator said that both technologies were at risk of becoming “the next ATM.”  On the other hand, an attempt by vendors to broaden the trials not only works against their primary CTO-group contacts’ interests, it introduces potential delay.  So do you rush forward to support an engagement that doesn’t have convincing backing or funding, or rush to the funding and potentially lose the engagement?

I think it’s truth time here.  Fewer than 15% of lab trials have historically resulted in deployment, so simple statistics say that betting on the current processes to succeed through inertia alone is risky.  Not only that, the current “benefit-and-justification gap” screams for a vendor who is willing to face reality, and once somebody prominent charts a course to real validation of the SDN/NFV business case, they’ll leap into a lead that may be hard for competitors to overcome.

Did Alcatel-Lucent Document the Wrong Path to SDN/NFV?

Studies are always a good thing, and we have one now that could be particularly good for the industry.  Alcatel-Lucent’s Bell Labs worked with AD Little to generate a report (registration required) that outlines the benefits to be expected from SDN/NFV adoption.  There’s good news and bad in the report, at least from the perspective of most operators, and good thinking and bad as well.  On balance, it’s a very useful read because it illustrates just what’s important to drive SDN/NFV forward, and the gap that exists among interested parties on the best way to do that.

Two early quotes from the report set the stage. The first cites the new world that mobility and the Internet have created:  “In this new environment, significant change is needed to the nature of the services offered and the network implementing them. These changes must allow the network to participate and contribute to the development of the cloud ecosystem.”  The second frames a mission:  “The foundation of this crucial change is next-generation cloud/IP transformation, enabled by NFV and SDN. In our view, it is a clear imperative for the industry.”

Everything good and bad about the report is encapsulated in these quotes.  The future has been changed by a new notion of services.  The role of SDN and NFV is to facilitate a cloud/IP transformation.  I agree with both these points, and I think most others would as well.  That would mean that the benefits of SDN and NFV should be achieved through the transition to the cloud.  That’s a point that competitor HP just made, and also one I believe to be true.

The authors focused on EU operators probably because they’ve led in SDN/NFV standardization and are also leaders in the lab trials and PoCs.  These activities are generally more tactically focused, and that becomes clear when the authors state the presumption of the study:  “What is clear, however, is that virtualization, programmability and network automation, enabled by these new technologies, will drive down industry operating costs considerably.”

The “enabled by these new technologies” part is the key point, I think.  There are two interpretations possible for the phrase.  One is that new technologies will open new operations models as they deploy, changing the way we operationalize by changing what we operationalize.  The other is that the new technologies will be tied to legacy services as well as to changes in services driven by changes in network infrastructure.  The report takes the former position: SDN and NFV will transform us as they deploy, and I think that raises some serious questions.

Accounts from writers who discussed the report with authors say that the timeline for realizing these from-the-inside benefits is quite long—ten years.  That delay is associated with the need to modernize infrastructure, meaning to adopt hosted and software-defined elements that displace long-lived equipment already installed.  There is a presumption that the pace of savings matches the pace of adoption, remember.

“Opex” to most of us would mean human operations costs, and it certainly means that to operators, but the study seems to miss two dimensions of opex.  First, there is nothing presented on which you could base any labor assumptions, at least not that I can see.  That may be why the labor savings quantified are a relatively small piece of the total savings projected.  Second, there is nothing presented to assess the operations complexity created by the SDN or NFV processes themselves.  If I replace a box with three or four cloud-hosted, SDN-connected, chained services I’ve created something more operationally complex not less.  If I don’t have compensatory service automation benefits, I come out in the hole not ahead.

I’m also forced to be wary about the wording in the benefit claims.  The study says that “the efficiency impact of onboarding NFV and SDN for these operators could be worth [italics mine] 14 billion euros per year, equal to 10 percent of total OPEX. The results are driven by savings from automation and simplification.”  That kind of statement begs substantiation.  Why that number?  What specific savings from automation and simplification?

From a process perspective, the industry has produced no accepted framework for operationalizing either SDN or NFV.  Only about 20% of operators say they have any tests or trials that would even address that question, and only one of them (in my last conversations) believed these tests/trials would actually prove (or disprove) a business case.

We could fix all this uncertainty by making one critical change in our assumptions, taking the other path in the question of what applying “new technologies” means.  That assumption is that service modeling and orchestration principles created for NFV infrastructure would be immediately applied through suitable infrastructure managers to legacy infrastructure and services as well.  In short, if you can operationalize everything using the same tools, you can gain network-wide agility and efficiency benefits.

If we’re applying operational benefits network-wide, then savings accrue at the pace we can change operations not the pace we can change infrastructure.  The study says about ten percent of opex is eliminated; my figures say about 44% would be impacted, and the difference is pretty close to the portion of infrastructure you could convert to SDN/NFV using the study’s guidelines.  Apply opex benefits to a larger problem and you get a more valuable solution.

Perhaps my most serious concern with the assumptions of the study is the implied scope of the future carrier business.  We are saying that the network is transformed by new opportunity, but that the operators’ role in this transformation is confined to the least-profitable piece, the bit-pushing.  The examples of new services offered are all simply refreshes of legacy services, packaging them in a more tactical way, shortening time-to-revenue for new things.  There’s nothing to help operators play in the OTT revolution that’s created their most dramatic challenges.  Good stuff is happening, we’re saying, and you have to be prepared to carry water for the team.  I disagree strongly with that.  IoT for example has many of the attributes of early telephony.  It involves massive coordinated investment and an architecture that keeps the pieces from becoming single-player fiefdoms that are useless to everyone else.  Why would a common carrier not want to play there, and why would we not want them to?

The future of the network is to broaden the notion of services.  I can’t see why operators would invest in that and then leave all the benefits on the table for others to seize.

All of the basic points in the document regarding the value of agility and efficiency are sound; they’re just misapplied.  If you want to fix “the network” then you have to address the network as a whole.  NFV and SDN can change pieces of the network but the bulk of capital infrastructure will not be impacted even in the long term—access aggregation, metro transport, and so forth are not things you can virtualize.  In the near term the capital inertia is too large for any significant movement at all.  It’s a matter of timing.

I think this study is important for what it shows about the two possible approaches to SDN and NFV.  It shows that if we expect SDN and NFV to change operations only where SDN or NFV displace legacy technology, it will take too long, particularly given that operators agree that their revenue and cost per bit curves cross over, on average, in 2017.  Alcatel-Lucent and ADL have proved that we can generate “second-generation” benefits with NFV and SDN, but they missed the fact that to get early benefits to drive evolution and to deliver on the “coulds” the report cites, we need to make profound service lifecycle changes for every service, every infrastructure element.  And we need that right now.

Alcatel-Lucent isn’t the only vendor to take a conservative stance on SDN and NFV, of course.  Operators, at the CFO level at least, generally favor the computer vendors as SDN/NFV partners because they believe that class of supplier won’t drag their feet to protect legacy equipment sales.  The operators, in the recent project documents I’ve seen, are expanding their scope of projected changes.  They want “transformation” not evolution, and the question is who’s going to give it to them.

Does “SDN” Muddy Alcatel-Lucent’s Opto-Electrical Integration?

Pretty much everything these days is based on networking and software, which by current standards means that everything is SDN.  The trend to wash stuff with terms like SDN and NFV is so pervasive that you almost have to ignore it at least at first to get to the reality of the announcement.  So it is with Alcatel-Lucent’s announcement of its Network Services Platform, which is sort-of-SDN and sort-of-NFV in a functional sense, but clearly in the SDN camp from the perspective of the media and even many in Alcatel-Lucent.

At the basic level, NSP is a platform that sits on top of legacy optical and IP/MPLS infrastructure and provides operational control and unified management across vendor boundaries.  It exposes a unified set of APIs upward to the OSS/BSS/NMS world where services are created and managed, and it allows operators to manage a combination of optics and IP/MPLS as a cohesive transport network.  It could create the very “underlay” model I’ve suggested is the right answer for the network of the future.

The notion of unified software control is the “SDN” hook that a lot of people have picked up on, not least because Alcatel-Lucent uses the term “SDN” to describe it in their press release.  Many SDN purists would disagree, of course.  The SDN approach they’d take is more like that of OpenDaylight, which creates an SDN controller and puts legacy devices and their interfaces underneath that controller as an alternative to OpenFlow as the control protocol.  The Alcatel-Lucent approach is actually a bit closer to what the ETSI NFV ISG seems to be converging on, which is a WAN Infrastructure Manager (WIM) that runs in parallel with the Virtual Infrastructure Manager (VIM).  Most in ETSI seem to put SDN and SDN controllers inside the VIM and use them for intra-VNF connectivity.

Network operators all seem to agree that 1) you need to evolve SDN out of current legacy infrastructure and 2) SDN has to have a relationship with NFV.  The question is how that should be accomplished—and Alcatel-Lucent’s NSP approach illustrates one of the viable options.  Does it illustrate the best one?  That’s harder to say, and it’s harder yet to say why they’ve taken this particular approach.

You probably all know my own thinking here.  I believe that all network services should be viewed as NaaS in the first place, meaning that I think that connection-model abstractions should be mapped either to legacy, to SDN, to NFV, or to mixed-model infrastructure as circumstances dictate.  I also believe that “infrastructure management” is a generic concept at the level where software control is applied, even if the infrastructure is for hosting stuff and not connecting it.

The Alcatel-Lucent approach, laid out in their NSP white paper, seems to make NSP and SDN more a parallel path than the unified element that their PR suggested.  The paper shows an NFV figure that places both NSP and Nuage under the operations/management layer rather than showing Nuage under NSP.  It’s also interesting to note that the Alcatel-Lucent press release never mentions NFV or positions either NSP or SDN relative to NFV.

Which I think is a mistake.  If you read the PR and even the white paper, you get the distinct impression that this is really all about optics and MPLS products.  There really is a need for coordinated SDN/legacy deployment so you can’t say that Alcatel-Lucent was guilty of SDN-washing, but they do seem to have missed an opportunity to position NSP in a more powerful way.  That’s worrying because it could be a reflection of their age-old silos problem.

I remember commenting a couple years after the merger that created Alcatel-Lucent that the new company seemed to be competing as much with themselves as with real competitors.  More, in some cases, because internal promotion and status was linked to relative success.  This problem kept Alcatel-Lucent from fully leveraging its diverse assets, to the point where in my view it prevented the company from giving Cisco a real run for its money in the network operator space.

In today’s Alcatel-Lucent we have three dominant business segments responsible for facing the future.  One is the IP group, which has been the darling of Wall Street because of its ability to create engagement and hold or gain market share.  Another is the CloudBand stuff, where Alcatel-Lucent’s NFV lives, and the third is Nuage, the SDN people.  It’s hard to read the NSP material and not wonder just how much these three units are communicating.

If NSP is in fact what the press release suggests, which is an umbrella strategy to unify legacy and SDN technology under a common software-defined abstraction, then it’s a step forward for Alcatel-Lucent, but the company would still have to prove that their approach is stronger than one based on OpenDaylight (I think that would be possible because of management holes in ODL).  If NSP is what Figure 4 of their white paper says it is, then it’s leaving the SDN/legacy separation to be handled by “the higher level” meaning operations and management systems.  That to me undermines the value proposition for NSP in the first place, because facilitating software definition can’t simply be letting the other guy do the heavy lifting.

Some of the operators, particularly those in the EU, have what I think is the right idea about the SDN/NFV relationship and their ideas should, in my view, be the standard against which stuff like NSP is measured.  Their vision is of an NFV-orchestrated world where services can call on multiple “infrastructure managers” that could provide both intra-VNF and end-to-end connectivity and use both SDN and legacy network elements.  It seems to me that this vision would benefit from a unified model of SDN/legacy control, which would be in effect a merging of Alcatel-Lucent’s NSP and Nuage positioning.

Which may be why we don’t hear about it.  This positioning would cut across all three of those business segments, uniting what may be three positions that Alcatel-Lucent would like to keep separate for the moment.  That’s not necessarily a bad thing (Cisco is in my view dedicated to keeping the evolution toward SDN/NFV from impacting legacy sales in the current period), but it does create vulnerabilities.

Both SDN and NFV demand radical future benefits to justify the comprehensive infrastructure changes they’d involve.  We need to see what those benefits are to make the journey, and vendors who invoke the sacred names of SDN or NFV have to be prepared to show how their stuff fits in a compelling vision of that future infrastructure.  Vendors who don’t have to defend the present, meaning in particular software/server giants like HP or Oracle, could run rampant in the market while network-equipment competitors are still shifting from silo to silo.

I don’t think that legacy and SDN should be parallel concepts integrated at the ops level.  The NaaS model, IMHO, is best served by having vertical integration of higher (service) and lower (transport) layers based on policy or analytics.  Alcatel-Lucent actually has analytics and has all the pieces it needs to create the right model, and NSP is functionally a good step.  To make the benefits both real and clear, though, it seems they may have to fight those same old product-silo demons.

A Security/Compliance “Model” for SDN and NFV

We know we need to have security and compliance in SDN and NFV, simply because we have them today in other technologies.  We also know, in at least a loose sense, that the same sort of processes that secure legacy technology could also secure SDN and NFV.  The challenge, I think, is in making “security” or “compliance” a part of an SDN or NFV object or abstraction so that it could be deployed automatically and managed as part of the SDN/NFV service.

Security, and “compliance” in the sense of meeting standards for data/process protection, have three distinct meanings.  First, they are a requirement that can be stated by a user.  Second, they are an attribute of a specific service, connection, or process.  Finally, they are a remedy that can be applied as needed.  If we want a good security/compliance model for SDN and NFV we need to address all three of these meanings.

The notion of attributes and remedies is particularly significant as we start to see security features built into network and data center architectures.  This trend holds great potential benefits, but also risks, because there’s no such thing as a blanket security approach, nor is “compliance” meaningful without understanding what you’re complying with.  Both security and compliance are evolving requirement sets with evolving ways of addressing them.  That means we have to be able to define the precise boundaries of any security/compliance strategy and we have to be able to implement it in an agile way, one that won’t interfere with overall service agility goals.

Let’s start by looking at a “service”.  In either SDN or NFV it’s my contention that a service is represented by an “object” or model element.  At the highest level, this service object is where user requirements would be stated, and so it’s reasonable to say that a service object should have a requirements section where security/compliance needs are stated.  Think of these as being things like “I need secure paths for information” and “I need secure storage and processing”.

When a service object is decomposed, meaning when it’s analyzed at lower levels of the structure on the path toward making actual resource assignments, the options down there should be explored with an eye to these requirements.  In a simple sense, we either have to use elements of a service that meet the service-level requirements (just as we’d have to do for capacity, SLA, etc.) or we have to remedy the deficiencies.  The path to that starts by looking at a “decomposable” view of a service.

At this next level, a “service” can be described as a connection model and a set of connected elements.  Draw an oval on a sheet of paper—that’s the connection model.  Under the oval draw some lines with little stick figures at the end, and that represents the users/endpoints.  Above the oval draw some more lines with little gear-sets, and those represent the service processes.  That’s a simple but pretty complete view of a service.

If a service consists of the stuff in our little diagram, then what we have to do to deploy one is to commit the resources needed for the pieces.  Security and compliance requirements would then have to be matched to the attributes of the elements in our catalog of service components.  If we have a connection model of “IP-Subnet” then we’d look at our model resources to find one that had security and compliance attributes that matched our service model.  Similarly, we’d have to identify service processes (if they were used) that would also match requirements.

My view is that all these resources in the catalog would be objects as well, built up from even lower-level things that would eventually lead to resources.  A service architect could therefore build an IP-Subnet that had security/compliance attributes and committed resources to fulfill them, and another IP-Subnet that had no such attributes.  The service order process would then pick the right decomposition based on the stated requirements.

It’s possible, of course, that there are no such decompositions provided.  In that case, there has to be a remedy process applied.  If you want to have the service creation and management process fully automated (which I think everyone would say is the goal) then the application of the remedy has to be automated too.  What might that look like?

Like another service model, obviously.  If we look at our original oval-and-line model, we could see that the “lines” connecting the connection model to the service processes and users could also be decomposable.  We could, for example, visualize such a line as being either an “Access-Pipe” or a “Secure-Access-Pipe”.  If it’s the latter then we can meet the security requirements if we also have an IP-Subnet that has the security attribute.  If not, then we’d have to apply an End-to-End-Security process, which could invoke encryption at each of our user or service process connections.
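Here is a minimal sketch of how that selection might look in software (Python; the catalog structure and the function names are my own invention, not anything from the NFV specifications): requirements stated on the service object are matched against the attributes of candidate decompositions, and a remedy is composed in only when no candidate matches.

```python
# Hypothetical sketch: requirements on the service object are matched against
# the attributes of candidate decompositions in a catalog; a remedy is composed
# in only when nothing in the catalog satisfies the requirements directly.
CATALOG = {
    "IP-Subnet": [
        {"attributes": {"secure": True, "compliant": True}, "build": "hardened-subnet"},
        {"attributes": {},                                  "build": "plain-subnet"},
    ],
}

def decompose(element, requirements):
    """Return the first catalog option whose attributes cover the requirements."""
    for option in CATALOG.get(element, []):
        if all(option["attributes"].get(k) == v for k, v in requirements.items()):
            return [option["build"]]
    return None

def resolve(element, requirements):
    choice = decompose(element, requirements)
    if choice is None:
        # Remedy path: take the ordinary form of the element and add a process
        # such as End-to-End-Security (encryption at each connection point).
        choice = (decompose(element, {}) or [element]) + ["End-to-End-Security"]
    return choice

print(resolve("IP-Subnet", {"secure": True}))     # ['hardened-subnet']
print(resolve("Access-Pipe", {"secure": True}))   # ['Access-Pipe', 'End-to-End-Security']
```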

Just to make things a bit more interesting, you can probably see that an encryption add-on, to be credible, might have to be on the user premises.  Think of it as a form of vCPE.  If the customer has the required equipment in which to load the encryption function, we’re home free.  If not, then the customer’s access pipe for that branch would not have a secure option associated with it.  In that case there would be no way to meet the service requirements unless the customer equipment were to be updated.

I think there are two things that this example makes clear.  First is that it’s possible to define “security” and “compliance” as a three-piece process that can be modeled and automated just like anything else.  Second, the ability of a given SDN or NFV deployment tool to do that automating will depend on the sophistication of the service modeling process.

A “service model” should reflect a structural hierarchy in which each element can be decomposed downward into something else, until you reach an atomic resource.  That resource might be a “real” thing like a server, or it might be a virtual artifact like a VPN that is composed by commanding an NMS that represents an opaque resource structure.  The hierarchy has to be supported by software that can apply rules based on requirements and attributes to select decomposition paths to follow.

At one level, NFV MANO at least should be able to do this sort of thing, given that there are a (growing) number of service attributes that it proposes to apply for resource selection.  At another level, there’s no detail on how MANO would handle selective decomposition of models or even enough detail to know whether the models would be naturally hierarchical.  There’s also the question of whether the process of decomposition could become so complex (as it attempts to deal with attributes and requirements and remedies) that it would be impossible to run it efficiently.

It’s my view that implementations of service modeling can meet requirements like security and compliance only if the modeling language can express arbitrary attributes and requirements and match them at the model level rather than having each combination be hard-coded into the logic.  That should be a factor that operators look at when reviewing both SDN and NFV tools.

Can Effective NFV Management/Analytics Solve SDN’s Management Problem?

Both SDN and NFV deal with virtualization, with abstractions realized by instantiating something on real infrastructure.  Both have management issues that stem from those factors, which means that they share many of the same management problems.  Formally speaking, neither SDN nor NFV seems to be solving its problems, but there are signs that some of the NFV vendors are being forced to face them, and are facing them with analytics.  That may then solve the problems for SDN as well, and for the cloud.

You could argue that our first exposure to the reality of virtualization came from OpenStack.  In OpenStack, we had a series of models (CPU, network, image store, etc.) that represented collections of abstractions.  When something deployed you made the abstractions real.  DevOps, which came into its own in the cloud even though the concept preceded it, also recognized that abstract models were at least a viable approach to defining deployment.  OASIS TOSCA carries that forward today.

The basic problem abstractions create is that the abstraction represents the appearance, meaning what the user would see in a logical sense.  If you have a VPN, which is an abstraction, you expect to use and manage the VPN.  That the VPN has real elements may be something you’d have to come to terms with at some point, because you can’t fix real devices in the virtual world, but this coping with reality stuff is always problematic, and that’s true here in virtual management too.

My personal solution to this problem was what I called derived operations, which means that the management of an abstraction is done by creating a set of formulative bindings between management variables in the abstract plane and real variables drawn from real resources.  It’s not unlike driving a car; you have controls that are logical for the behavior of driving and these are linked to real car parts in such a way as to make the logical change you command convert to changes in auto elements that make that logical change real.

In one sense, derived operations is simple.  You could say “object status = worst-of(subordinate status)” or something similar.  In virtualization environments of course, you don’t know what the subordinates are until resources are allocated.  That means two levels of binding—you have to link abstract management variables to other abstract variables that will on deployment be linked to real variables.  You also have to accept the fact that in many cases you will have layers of abstraction created to facilitate composition.  Why define a VPN in detail when you can use a VPN abstraction as part of any service that uses VPNs?  But all of this is at least understood.
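Here’s a small sketch of that idea (Python, purely illustrative; the class and variable names are mine): a service object’s status is a formula over whatever it happens to be bound to, and the bindings are only filled in when resources are actually allocated.

```python
# Minimal sketch of derived operations: a service object's status is a formula
# over whatever it was bound to at deployment time, not a hard-coded list.
STATUS_ORDER = {"ok": 0, "degraded": 1, "fault": 2}

def worst_of(statuses):
    """object status = worst-of(subordinate status)"""
    return max(statuses, key=STATUS_ORDER.get, default="ok")

class ServiceObject:
    def __init__(self, name, formula=worst_of):
        self.name = name
        self.formula = formula
        self.bindings = []        # filled in only when resources are allocated

    def bind(self, subordinate):
        self.bindings.append(subordinate)

    @property
    def status(self):
        # Two levels of binding: subordinates may themselves be abstractions
        # whose formulas eventually resolve down to real resource variables.
        return self.formula(b.status for b in self.bindings)

class Resource:
    def __init__(self, status="ok"):
        self.status = status

vpn = ServiceObject("VPN")
vpn.bind(Resource("ok"))
vpn.bind(Resource("degraded"))
print(vpn.status)                 # -> "degraded"
```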

The next problem is more significant.  Say we have an abstract service object.  It has a formula set to describe its status.  We change the value of one of the resource status variables, one that probably impacts a dozen or so service objects.  How do we tell those objects that something changed?  If we’re in the world of “polled management” we assume that when somebody or something looks at our service object, we would refresh its variables by running the formulative bindings it contains.

Well, OK, but even that may not work.  It’s not efficient to keep running a management function just to see if something changed.  We’d be wasting a lot of cycles and potentially polling for state too many times from too many places.  What we need is the concept of an event.

Events are things that have to be handled, and the handling is usually described by referencing a table of “operating states” and events, the intersection of which identifies the process to be invoked.  We know how to do this sort of thing because virtually every protocol handler is written around such a table.  The challenge comes in distributed event sources.  Say a trunk that supports a thousand connections fails.  That failure has to be propagated up to the thousand service models that are impacted, but how does the resource management process know to do that?
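Before getting to that question, here is what such a state/event table might look like in miniature (Python; the states, events, and processes are invented for illustration):

```python
# Miniature state/event table: the (state, event) pair selects both the process
# to run and the next state, exactly as a protocol handler would.
def start_remediation(svc): print(f"{svc}: reroute or redeploy")
def clear_alarm(svc):       print(f"{svc}: back to normal")
def ignore(svc):            pass

TABLE = {
    ("active",   "trunk_fail"):    (start_remediation, "degraded"),
    ("degraded", "trunk_restore"): (clear_alarm,       "active"),
    ("degraded", "trunk_fail"):    (ignore,            "degraded"),
}

def handle(service, state, event):
    process, next_state = TABLE.get((state, event), (ignore, state))
    process(service)
    return next_state

state = "active"
state = handle("svc-42", state, "trunk_fail")       # remediation runs, now degraded
state = handle("svc-42", state, "trunk_restore")    # alarm clears, back to active
```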

This is where analytics should come in.  Unfortunately, the use of analytics in SDN or NFV management has gotten trivialized because there are a number of ways it could be used, one of which is simply to support management of resources independent of services.  Remember the notion of “directory-enabled networking” where you have a featureless pool of capacity that you draw on up to a point determined by admission control?  Well that’s the way that independent analytics works.  Fix resource faults and let services take care of themselves.

If you want real service management you have to correlate resource events with service conditions, which means you have to provide some way of activating a service management process to analyze the formulary bindings that define variable relationships, and anchor some of those binding conditions as “events” to be processed.  If I find a status of “fault” here, generate an event.

When you consider this point, you’ve summarized what I think are the three requirements for SDN/NFV analytics:

  1. The proactive requirement, which says that analytics applied to resource conditions should be able to do proactive management to prevent faults from happening. Some of this is traditional capacity planning, some might drive admission control for new services, and some might impact service routing.
  2. The resource pool management requirement, which says that actual resource pools have to be managed as pools of real resources with real remediation through craft intervention as the goal. At some point you have to dispatch a tech to pull a board or jiggle a plug or something.
  3. The event analysis requirement, which says that analytics has to be able to detect resource events and launch a chain of service-level reactions by tracking the events along the formulary bindings up to the services.

The nature of the services being supported determines the priority of these three requirements for a given installation, but if you presume the current service mix then you have to presume all three requirements are fulfilled.  Given that “service chaining” and “virtual CPE” both presume some level of service-level agreement because they’re likely first applied to business services, that means that early analytics models for SDN/NFV management would have to address the event analysis requirement that’s the current stumbling block.

From an implementation perspective, it’s my view that no SDN/NFV analytics approach is useful if it doesn’t rely on a repository.  Real-time event-passing and management of services from the customer and customer-service side would generate too much management traffic and load the real devices at the bottom of the chain.  So I think all of this has to be based on a repository and query function, something that at least some of the current NFV implementations already support.
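A rough sketch of the repository idea (Python, illustrative only; the class and method names are my own): resource events are recorded once, and any number of service-side status queries are answered from the store rather than by polling the devices themselves.

```python
# Rough sketch of the repository idea: resource events are written once, and
# service-side status queries are answered from the store, not by polling gear.
import time
from collections import defaultdict

class StatusRepository:
    def __init__(self):
        self._events = defaultdict(list)      # resource_id -> event history

    def record(self, resource_id, status):
        """Called once per change by telemetry or an analytics pipeline."""
        self._events[resource_id].append((time.time(), status))

    def current_status(self, resource_id):
        history = self._events.get(resource_id)
        return history[-1][1] if history else "unknown"

repo = StatusRepository()
repo.record("trunk-7", "fault")

# A thousand impacted service views can now be refreshed from the repository
# without a thousand polls landing on the real device at the bottom of the chain.
print(repo.current_status("trunk-7"))         # -> "fault"
```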

Where this is important to SDN is that if you can describe SDN services as a modeled tree of abstract objects with formulary bindings to other objects and to the underlying resources, you can manage SDN exactly as you manage NFV.  For vendors who have a righteous model for NFV, that could be a huge benefit because SDN is both an element of any logical NFV deployment and a service strategy in and of itself.  Next time you look at management analytics, therefore, look for those three capabilities and how modeling and formulary binding can link them into a true service management strategy.

We Know We’re Building Clouds and not Networks in the Future; We Just Don’t Know How

It’s obvious that there’s a relationship between NFV and the cloud, but for many the relationship is simply one of common ancestry.  You host apps in the cloud and with NFV you host features there.  Well, there’s a lot more to the cloud than just hosting apps or features.  The cloud is the model of the network of the future, and so a vendor’s cloud-savvy may be critical to their credibility as an NFV partner.

If NFV was a simple value proposition, a single golden application that when implemented would suddenly arrest the convergence of revenue and cost per bit, we’d have it easy.  The problem is that the “golden applications” for NFV that we see today are golden largely because they’re contained and easily addressed.  Our experience with them, at the service level, isn’t going to take us to where we need to be.  So we have to look to other value propositions.

We hear a lot about how service agility, service chaining, and SDN are going to create new opportunities for network operators.  The net of this is the assertion made by many, that bit-based services can still be profitable.  Well, the facts don’t bear that out.  In the most recent quarter, for example, enterprise services for Verizon were off by 6% and made up only about 10% of revenues.  My survey data shows that enterprises now have premises tools for firewalls, NAT, and so forth, and that the path to substituting carrier services for these tools is littered with delays caused by financial cycles (depreciation of the current assets) and fears of operational impacts.  Operator CFOs privately think that service chaining isn’t going to make a meaningful difference in that converging revenue/cost curve.

The services most talked about today aren’t suitable for driving broad transformation simply because they’re not broadly consumed.  They don’t impact enough total cost or introduce enough incremental revenue.  That’s what makes NFV a kind of game of bridge.  We have to build a future infrastructure that will make money, so we have to bridge between the implementation of our early services and the implementation of future services.  We have to get to that future with some flair, meaning that we’ll have to build out the future-facing tools while addressing near-term costs and revenues.  Where the features of NFV that we hear about matter is in that bridging.

The challenge is that you need both banks to make a bridge.  We know where we are now, and we understand that things like service chaining can lead us forward.  What’s less clear is what “forward” means in terms of features and technologies.  We can’t expect operators to renew their infrastructure for every step of the transition.  We need transitioning infrastructure, transitioning operations practices.

Saar Gillai, SVP and general manager of HP’s NFV Business Unit, did what I think might be the first vendor blog post on this topic.  In it he lays out four phases of NFV: decouple, virtualize, cloudify, and decompose.  This starts with breaking up appliances by decoupling function from one-off hardware, moves to virtualizing the software for the functions, adds in agile cloud hosting of both functions and applications, and finally decomposes both elements of services and apps to permit more agile composition of new stuff.

I think these phases are the right framework.  Making them work in practice will involve two things.  First, we really do need to understand the kind of network future we’re building.  We cannot even recognize the right decisions without that standard of future mission to guide us.  Second, we need to find a way to accelerate benefits to fund the changes.  Without that, the first baby steps won’t provide enough benefit to justify broad deployment.

I’ve talked about orchestration of operations and its importance before, but I want to point out another aspect of orchestration that might well be just as important.  It’s the other face of service management, which is service logic.  We have to recognize a key point, which is that as we accelerate service cycles we reach a point where what we’re describing in a service model is how it works and not how it’s built.

Right now, we’re visualizing an implementation of NFV and orchestration that deploys stuff.  Its role is the service management side: putting resources into place, connecting them, and sustaining their expected operating state and SLA.  Even this service management mission is broader than current standards activities can support, because we don’t fully control mixtures of legacy, SDN, and NFV and because we don’t automate the management processes along with the deployment.  Taking orchestration to the service logic level seems daunting.

Well, maybe that’s because we’re thinking too narrowly about services.  If you believe that OTT players have the secret to future revenues, then you believe in service logic.  Most OTT revenues come from advertising, which is composed into web pages dynamically and delivered.  That’s not management—we don’t build web pages and store them in anticipation of their being used.  We look at cookies and decide what ads to serve, and we then serve them.  Service logic.

Some of the implementations of NFV orchestration also touch, or could touch, on service logic.  Overture’s Ensemble Service Orchestrator uses BPMN to describe a deployment workflow.  Similar notation could describe service logic.  In my original ExperiaSphere project, the alpha demonstration involved finding a video based on a web-search API, finding its serving location, and describing an expedited delivery plan.  That could be viewed as deployment, but it is also very close to being service logic.

In fact, anything that can select processes based on state/event analysis can do both service management and service logic, which means that any NFV implementation that met current requirements for operations orchestration could likely become a service-logic orchestration tool too.  That’s good, because service logic and cloud support for it is the big brass ring of the future that every operator and every vendor is trying (even when they don’t fully realize it) to grasp.
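
Here’s a minimal sketch of that state/event process selection, with an illustrative table and made-up process names.  The same dispatch mechanism handles a deployment event (service management) and a page-view event (service logic); only the processes bound to the table entries differ.

```python
# A sketch of state/event process selection driving both service management
# and service logic.  Table entries and process names are illustrative.
def deploy_resources(ctx):   return "deploying " + ctx["service"]
def repair_fault(ctx):       return "repairing " + ctx["service"]
def select_ad(ctx):          return "serving ad for segment " + ctx["segment"]

STATE_EVENT_TABLE = {
    ("ordered", "activate"):  deploy_resources,   # service management
    ("active",  "fault"):     repair_fault,       # service management
    ("active",  "page-view"): select_ad,          # service logic
}

def handle(state, event, context):
    process = STATE_EVENT_TABLE.get((state, event))
    return process(context) if process else "no-op"

print(handle("ordered", "activate", {"service": "vpn-101"}))
print(handle("active", "page-view", {"service": "vpn-101", "segment": "sports"}))
```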

Service-logic orchestration means you could orchestrate the flow of work among service elements in real time.  This clearly isn’t suitable for high-speed data-path applications but I think those applications are just a small step above pushing bits.  What future services have to offer is some sort of functional utility, something like exploiting LBS to find things or serve ads, or exploit community behavior to do the same.  In short, service logic is part of realizing mobile/behavioral opportunities.

Service logic is about handling events.  So is service management.  What separates the two (besides intent) is the performance requirements involved and the fact that service logic is likely to compose experiences not from things newly deployed but from things already available as resources.  These are the “microservices” I’ve talked about.
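
A small sketch of what composing an experience from microservices might look like, assuming hypothetical service names: nothing is deployed per customer, the logic simply binds resources that are already running into an experience in real time.

```python
# A sketch of service logic composing an experience from already-running,
# multi-tenant microservices.  Service names and lookups are illustrative.
MICROSERVICES = {
    "locate":  lambda user: {"lat": 40.7, "lon": -74.0},       # LBS lookup
    "ads":     lambda place: ["coffee shop nearby"],           # ad selection
    "deliver": lambda items: f"pushed {len(items)} item(s)",   # delivery
}

def compose_experience(user):
    # Bind available resources into an experience in real time;
    # nothing is spun up per customer.
    place = MICROSERVICES["locate"](user)
    items = MICROSERVICES["ads"](place)
    return MICROSERVICES["deliver"](items)

print(compose_experience("subscriber-42"))
```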

I think cloud-hosted microservices are the key to NFV’s future.  There is, I think, some recognition that they’re important even in the ETSI ISG’s current work; multi-tenancy of service elements is an issue under discussion.  The challenge is advancing our thinking before we’ve advanced to the future we’re supposed to be preparing for.

NFV can’t be about building networks a different way, or building bit-based services a different way, or building services for a small fraction of the customers a different way.  We need to make massive changes in both cost and revenue and that will take transformative architectures.  Those are the architectures we need to be thinking about right now, because 2015 is grinding along and we’re running out of time before operators are forced to look beyond NFV and SDN and even the cloud for their transformation.

Cisco: Facing their Past to Save their Future

Here is an interesting question for you.  If the gazelle evolves, does the lion also have to change?  Of course, you’d say.  A food chain generates a chain reaction to any significant alterations.  Well, then, how about this one.  If network services evolve to something very different, does enterprise network equipment also have to evolve?  That’s the question that should be plaguing Cisco and maybe other network vendors as well.

If you look at Cisco’s quarter you see what probably surprises nobody at all.  Their enterprise numbers were pretty good and their service provider results were dim.  Here’s Chambers’ comment from the earnings call:  “We are managing continued challenges in our service provider business, which declined 7%, as global service provider Capex remained under pressure and industry consolidation continues.”  There are two questions begging to be asked regarding these numbers.  First, why are operators holding back while enterprises spend?  Second, will changes in the operator business model inevitably impact the enterprise?

Everyone buys stuff for ROI.  For enterprises, the return comes in the form of improved worker productivity, lower support costs, and lower equipment costs.  My surveys suggested that enterprises responded to the 2008 economic crisis by holding back on “modernization spending”.  They’re not doing that as much now, though they’re also not backfilling to make up for past neglect.  Whatever the details, enterprises really can’t just stop spending on networking because networking supports their operations.  Even if your accounting department isn’t a profit center, you can’t stop making payments or collecting on invoices.

For operators it’s more complicated.  They sell services based on expensive and long-lived infrastructure.  They could certainly decide to exit service markets that weren’t profitable, or to invest only where profit could be had.  Verizon, remember, doesn’t offer FiOS everywhere.  They sell it where they can make money, and they’re trying to sell off their access business to rural telcos where FiOS isn’t going to pay off.  Operators also have the option to under-invest in infrastructure and allow service quality to decline if it’s impossible to make money by providing what customers want.

I think all of these factors explain the current Cisco profit picture.  Operators are saying that their profit per bit is declining so they’re not rushing out to spend on infrastructure to generate more bits.  Enterprises are tied to network-centric application paradigms for productivity enhancement.  The latter are carrying spending better than the former.

The latter are also spending on the services of the former.  When we didn’t have IP VPNs, enterprises bought routers for WAN transport.  Today those products aren’t necessary.  The question is whether new services could change the enterprise network composition as old ones did.  If they do, then Cisco’s enterprise business is also in jeopardy.

The cloud could be another issue for enterprise spending on network equipment.  Most enterprise switching goes into data centers, and if there is a significant migration of applications from the enterprise data center to the public cloud, there would be a drop in enterprise data center switching spending.  This could be somewhat offset by gains on the provider side, but obviously cloud computing can’t work if there’s no economy of scale, so we’d have to assume that compensatory cloud provider data center switching spending gains would be significantly smaller than the enterprise losses.

The big question, though, is whether the evolution of “services” that network operators and even equipment vendors are proposing would impact the way enterprises buy equipment.  One obvious example is the virtual CPE stuff.  Today we’d often terminate business services to branch offices in a router or custom appliance.  What operators plan to do is to terminate it in a cheap little interface stub backed up by hosted functionality in the cloud.  There are a lot more branches than headquarters locations, so if this technology switch succeeds then enterprise branch networking could change radically.

Then there’s NaaS.  We hear that SDN could let us dial up a connection ad hoc, letting enterprises buy bandwidth as needed on a per-application and per-user basis.  What does this do to traditional networking?  Even carrier networks that are at least partially justified by VPN services might be changed if suddenly we were just building connections on demand.  Underneath a VPN is IP routing.  Underneath SDN forwarding paths is…well, nothing specific.

Virtual network elements could let enterprises bypass the whole notion of Ethernet or IP services and devices and simply funnel tunnels into servers and client devices over pretty much featureless optical or opto-electrical paths.  An “overlay SDN” technology like the original Nicira stuff, now part of VMware, or the Nuage SDN products from Alcatel-Lucent could be used to build this kind of network today.  At the very least it could dumb down the client/branch side of the network, and from their latest announcement it’s clear that Nuage is aiming at integrating enterprise data center and branch networking, even to the extent of supporting combined operations.

If you combine NaaS and NFV principles you get network services that are composed rather than provisioned in the old sense.  Think of it as a kind of 3-D printer for services.  You send a blueprint to the Great Service Composer and you get what you asked for spun out in minutes.  This would be a profound change in not only services but applications, including cloud computing.  All of a sudden application features aren’t put anywhere in particular, they’re just asked for and supplied from the most economical source or with the most suitable SLA.

What Cisco is facing, what Cisco should fear, isn’t white box switching.  The fact is that we’ve not done much yet to make “forwarding engines” like OpenFlow devices into alternative network components.  We’ve just made them into switches and routers.  That would have to change, though, if we expect to have the kind of things I’ve noted here, and if it does change at the service level then it will pull through a transformation even at the enterprise equipment level.

This doesn’t mean that I advocate Cisco jumping with both feet into the deep end of the NaaS-and-NFV pool.  I think they’d simply have too much to lose.  Networking is an industry whose depreciation cycles are very long, and it will take time for the service providers and enterprises to adapt their infrastructure to a new model even if they understand that model and accept its consequences.  Cisco could, in a stroke, make the future more understandable and acceptable, but I don’t think they could win in it quite yet.  Till they reach that tipping point, I think we’re going to hear the same story of hopefulness for old technology and blowing kisses at the new.

The Real Lesson of the Verizon/AOL Deal

The network operators, particularly the telcos, are in a battle with themselves, attempting a “transformation” from their sale-of-bits model of the past to something different and yet not precisely defined.  One likely offshoot of this, Verizon’s decision to acquire AOL, has generated a lot of comment and I think it may also offer some insights into where operators are heading in terms of services and infrastructure.  That has implications for both SDN and NFV.

AOL reported about $2.6 billion in revenues, compared to Verizon’s $128 billion.  I don’t think, nor do most Street analysts, that Verizon was buying AOL for their revenue.  Verizon’s revenues, in fact, are about a quarter of the whole global ad spend.  You don’t find relief for a large problem at a small scale.  That means they were likely buying them for their mobile and ad platform—their service-layer technology.  So does this mean that telcos are going to become OTTs?

Not possible.  Obviously somebody has to carry traffic, so the most that could happen is that telcos add OTT services to their inventory.  But even that has challenges as a long-term business model because of seismic shifts in carrier profits.  OTTs get access to the customer essentially free, and telcos get paid for providing customer services, right?  Right, but remember that operators everywhere are saying their revenue and cost per bit curves will cross about 2017.  If the operator ends up providing Internet at a loss and making up the loss with OTT services, their service profits will be lower than the “real” OTTs’.  They have to make up a loss the others don’t bear.

What is possible?  Start with the fact that the converging cost/revenue per bit curves can be kept separate by lowering costs or raising revenues, or some combination of the two.  We’re used to thinking of “revolutionary” technologies like SDN, NFV, and the cloud as being the total solution, meaning that we’d address both cost and revenue with a common investment.  Not only that, operators themselves have made that assumption and most still do.  Of the 47 global operators I’ve surveyed, 44 have presented “transformation plans” to address both costs and revenues.  Well, AOL isn’t likely to impact Verizon’s costs, so it just might be that Verizon has decided to split what nearly everyone else has combined.  That could be big news.

It’s not that NFV can place ads or control mobile experiences directly.  As announcements for session-service platforms from Alcatel-Lucent, Huawei, and Oracle have already shown, you really need a kind of “PaaS” in which specialized service apps live.  You can deploy these platforms with NFV, operationalize them with NFV principles (if you have any NFV operations features from your vendor), and perhaps even connect them using SDN.  All of the vendors who have announced these platforms have taken pains to endorse NFV compatibility.

AOL isn’t a vendor, nor are they a carrier.  I’ve never heard them say a word about NFV or SDN and it’s unlikely they designed their mobile or ad platforms to be SDN-consuming or NFV-compliant.  That means that what Verizon bought isn’t either of these things, and that’s where the interesting point comes in.

Late last year, a number of operator CFOs told me they needed a resolution to their cost/revenue problems quickly.  Many said that unless they could make the case for NFV as a path to resolving those problems before the end of 2016, they’d have to find another path.  Might it be that Verizon has decided that they need a revenue-generating architecture, NFV or not?  AOL proves an important point, which is that many service platforms for mobile and advertising are multi-tenant.  You don’t have to “deploy” them for every user; they’re just hosted for multi-tenant services.  IMS is like that, after all.  So for these types of service you don’t “need” NFV.  It’s a cloud service.

Two things generate a need for orchestrated service behavior.  One is deployment frequency.  If I have to spin up an instance of a service feature set for every customer and service, then I have to worry a lot about costs for that process.  If I spin it up once for all customers and services, then I’m much less concerned about the spin-up costs.  The other factor is management and SLAs.  If a service has to be managed explicitly because it has an explicit SLA, then I have to orchestrate service lifecycle processes to make that management efficient.  If a service is best-efforts I’ll manage resources against service load and roll the dice, just like the Internet does.
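
Those two factors amount to a simple test, sketched below with hypothetical inputs: per-customer virtual CPE, with its frequent spin-ups, passes it, while a shared, best-efforts ad platform does not.

```python
# A sketch of the two-factor test described above; the rule and the example
# inputs are illustrative assumptions, not a formal standard.
def needs_orchestration(per_customer_spin_up, explicit_sla):
    """Orchestrate lifecycle processes when services are spun up frequently
    (per customer/service) or carry an explicit SLA; otherwise treat the
    feature as a shared, best-efforts cloud service."""
    return per_customer_spin_up or explicit_sla

# Per-customer virtual CPE: frequent spin-ups, so orchestration pays off.
print(needs_orchestration(per_customer_spin_up=True, explicit_sla=False))   # True
# A shared, best-efforts ad platform: hosted once for all tenants.
print(needs_orchestration(per_customer_spin_up=False, explicit_sla=False))  # False
```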

If we can address service revenue growth outside SDN/NFV, how about the cost side?  Service automation is a path to reducing opex, but you could also envision opex reduction coming about as a result of “infrastructure simplification”.  Suppose you built a vertically integrated stack of virtual switching/routing, agile SDN tunnels, and agile optics, all presented as NaaS?  Could this stack not have much lower intrinsic operations cost even without a lot of add-on service automation or orchestration?  Sure could.

The risk to operators in both these “outlaw” solution strategies is that they’ll end up with silos by service, major infrastructure transformation costs to secure benefits, vendor lock-in, and so forth.  The operators don’t want to face these risks, but that’s not the primary driver.  The point I think Verizon/AOL may prove is that operators’ priority is solving their business problem not consuming a specific technology.  If SDN or NFV or even the cloud don’t boost the bottom line appropriately, they won’t be invested in.

I believe that SDN, NFV, and the cloud combine to solve all of the operators’ revenue and cost challenges to the extent that any infrastructure technology can.  The challenge is defining a scope for them to address enough benefits to justify costs.  The pathway to making this work is simple.

First, the cloud is the infrastructure of the future.  The network only connects cloud stuff and connects the cloud with users.  Thus, we need an architecture for service hosting and one for the building of services from hosted components.  I suggested a “microservices” approach to this but any useful architecture is fine.  We just can’t build toward a future with no specific technology framework to describe it.

Second, network infrastructure delivers NaaS.  Every service in the future has to be defined as a “connection model” and a set of endpoints.  The former offers the rules for delivery and the latter defines the things that can emit and receive traffic.  The definition, in abstract, has to then be resolved into resource behaviors appropriate to the devices involved.  Maybe it’s legacy/MPLS or GRE or VLAN.  Maybe it’s SDN.  Whatever it is, the resolution of the model defines the commitment of resources.
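
Here’s a minimal sketch of that resolution step, with made-up resolver names: the abstract definition, a connection model plus a set of endpoints, stays the same, and only its resolution into resource behaviors changes with the domain underneath.

```python
# A sketch of NaaS resolution: the same abstract connection model and
# endpoint set resolve differently per domain.  Names are illustrative.
RESOLVERS = {
    "legacy": lambda model, eps: f"provision MPLS/VLAN {model} for {eps}",
    "sdn":    lambda model, eps: f"install OpenFlow paths ({model}) for {eps}",
}

def resolve_naas(connection_model, endpoints, domain):
    # The abstract definition is identical either way; only the resolution
    # into resource commitments differs by domain.
    return RESOLVERS[domain](connection_model, endpoints)

print(resolve_naas("multipoint", ["hq", "branch-1", "branch-2"], "legacy"))
print(resolve_naas("multipoint", ["hq", "branch-1", "branch-2"], "sdn"))
```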

Third, services have to be modeled from intent through to resource structure in a common chain of decomposition phases based on a harmonized modeling approach.  We have to isolate service definitions at the intent/functional level from changes made to how a function is fulfilled when the service is ordered.

Finally, service automation has to completely bind human and machine operations processes to the model, so that all of the lifecycle phases of the service, from authoring it to tearing it down when it’s no longer needed, are automated based on a common description of what’s there and what has to be done.
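
A final sketch, again with hypothetical phase names and handlers, of what binding operations processes to the lifecycle of a single service model might look like: every phase, from authoring to teardown, is driven by the same model description.

```python
# A sketch of binding operations processes to lifecycle phases of one
# service model.  Phase names and handlers are illustrative assumptions.
LIFECYCLE_BINDINGS = {
    "author":    lambda m: f"catalog entry created for {m['name']}",
    "order":     lambda m: f"decomposed {m['name']} into resource commitments",
    "operate":   lambda m: f"managing {m['name']} against its SLA",
    "terminate": lambda m: f"released resources for {m['name']}",
}

def run_lifecycle(model):
    # The same model object drives every phase; no manual hand-offs.
    return [LIFECYCLE_BINDINGS[phase](model)
            for phase in ("author", "order", "operate", "terminate")]

for step in run_lifecycle({"name": "vpn-with-firewall"}):
    print(step)
```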

The unifying concept here is that there has to be a unifying concept here.  Operators, including Verizon, may be risking long-term efficiencies by addressing their “transformation” processes without that unification.  But they have to address them, and I guess that if no unified approach is available then lack of unity is better than lack of profit.

There should have been a product for Verizon to buy, not a company.  We’ve known about the problem of profit-per-bit for almost a decade, and all of the technology concepts needed to solve it have been around for at least two or three years.  Everyone in the industry will now be watching to see whether Verizon’s move can get them to the right place.  What we have to learn from this is that the industry is failing its customers.