Did Alcatel-Lucent Document the Wrong Path to SDN/NFV?

Studies are always a good thing, and we have one now that could be particularly good for the industry.  Alcatel-Lucent’s Bell Labs worked with AD Little to generate a report (registration required) that outlines the benefits to be expected from SDN/NFV adoption.  There’s good news and bad in the report, at least from the perspective of most operators, and good thinking and bad as well.  On balance, it’s a very useful read because it illustrates just what’s important to drive SDN/NFV forward, and the gap that exists among interested parties on the best way to do that.

Two early quotes from the report set the stage. The first cites the new world that mobility and the Internet have created:  “In this new environment, significant change is needed to the nature of the services offered and the network implementing them. These changes must allow the network to participate and contribute to the development of the cloud ecosystem.”  The second frames a mission:  “The foundation of this crucial change is next-generation cloud/IP transformation, enabled by NFV and SDN. In our view, it is a clear imperative for the industry.”

Everything good and bad about the report is encapsulated in these quotes.  The future has been changed by a new notion of services.  The role of SDN and NFV is to facilitate a cloud/IP transformation.  I agree with both these points, and I think most others would as well.  That would mean that the benefits of SDN and NFV should be achieved through the transition to the cloud.  That’s a point that competitor HP just made, and also one I believe to be true.

The authors focused on EU operators probably because they’ve led in SDN/NFV standardization and are also leaders in the lab trials and PoCs.  These activities are generally more tactically focused, and that becomes clear when the authors state the presumption of the study:  “What is clear, however, is that virtualization, programmability and network automation, enabled by these new technologies, will drive down industry operating costs considerably.”

The “enabled by these new technologies” part is the key point, I think.  There are two interpretations possible for the phrase.  One is that new technologies will open new operations models as they deploy, changing the way we operationalize by changing what we operationalize.  The other is that the new technologies will be tied to legacy services as well as to changes in services driven by changes in network infrastructure.  The report takes the former position; SDN and NFV will transform us as they deploy, and I think that raises some serious questions.

Accounts from writers who discussed the report with the authors say that the timeline for realizing these from-the-inside benefits is quite long—ten years.  That delay is associated with the need to modernize infrastructure, meaning to adopt hosted and software-defined elements that displace long-lived equipment already installed.  There is a presumption that the pace of savings matches the pace of adoption, remember.

“Opex” to most of us would mean human operations costs, and it certainly means that to operators, but the study seems to miss two dimensions of opex.  First, there is nothing presented on which you could base any labor assumptions, at least not that I can see.  That may be why the labor savings quantified are a relatively small piece of the total savings projected.  Second, there is nothing presented to assess the operations complexity created by the SDN or NFV processes themselves.  If I replace a box with three or four cloud-hosted, SDN-connected, chained services, I’ve created something more operationally complex, not less.  If I don’t have compensatory service automation benefits, I come out in the hole, not ahead.

I’m also forced to be wary about the wording in the benefit claims.  The study says that “the efficiency impact of onboarding NFV and SDN for these operators could be worth [italics mine] 14 billion euros per year, equal to 10 percent of total OPEX. The results are driven by savings from automation and simplification.”  That kind of statement begs substantiation.  Why that number?  What specific savings from automation and simplification?

From a process perspective, the industry has produced no accepted framework for operationalizing either SDN or NFV.  Only about 20% of operators say they have any tests or trials that would even address that question, and only one of them (in my last conversations) believed these tests/trials would actually prove (or disprove) a business case.

We could fix all this uncertainty by making one critical change in our assumptions, taking the other path in the question of what applying “new technologies” means.  That assumption is that service modeling and orchestration principles created for NFV infrastructure would be immediately applied through suitable infrastructure managers to legacy infrastructure and services as well.  In short, if you can operationalize everything using the same tools, you can gain network-wide agility and efficiency benefits.

If we’re applying operational benefits network-wide, then savings accrue at the pace we can change operations, not the pace we can change infrastructure.  The study says about ten percent of opex is eliminated; my figures say about 44% would be impacted, and the difference is pretty close to the portion of infrastructure you could convert to SDN/NFV using the study’s guidelines.  Apply opex benefits to a larger problem and you get a more valuable solution.

Perhaps my most serious concern with the assumptions of the study is the implied scope of the future carrier business.  We are saying that the network is transformed by new opportunity, but that the operators’ role in this transformation is confined to the least-profitable piece, the bit-pushing.  The examples of new services offered are all simply refreshes of legacy services, packaging them in a more tactical way, shortening time-to-revenue for new things.  There’s nothing to help operators play in the OTT revolution that’s created their most dramatic challenges.  Good stuff is happening, we’re saying, and you have to be prepared to carry water for the team.  I disagree strongly with that.  IoT, for example, has many of the attributes of early telephony.  It involves a massive coordinated investment and an architecture that keeps the pieces from fragmenting into single-element fiefdoms that are useless to everyone else.  Why would a common carrier not want to play there, and why would we not want them to?

The future of the network is to broaden the notion of services.  I can’t see why operators would invest in that and then leave all the benefits on the table for others to seize.

All of the basic points in the document regarding the value of agility and efficiency are sound, they’re just misapplied.  If you want to fix “the network” then you have to address the network as a whole.  NFV and SDN can change pieces of the network but the bulk of capital infrastructure will not be impacted even in the long term—access aggregation, metro transport, and so forth are not things you can virtualize.  In the near term the capital inertia is too large for any significant movement at all.  It’s a matter of timing.

I think this study is important for what it shows about the two possible approaches to SDN and NFV.  It shows that if we expect SDN and NFV to change operations only where SDN or NFV displace legacy technology, it will take too long, particularly given that operators agree that their revenue and cost per bit curves cross over, on average, in 2017.  Alcatel-Lucent and ADL have proved that we can generate “second-generation” benefits with NFV and SDN, but they missed the fact that to get early benefits to drive evolution and to deliver on the “coulds” the report cites, we need to make profound service lifecycle changes for every service, every infrastructure element.  And we need that right now.

Alcatel-Lucent isn’t the only vendor to take a conservative stance on SDN and NFV, of course.  Operators, at the CFO level at least, generally favor the computer vendors as SDN/NFV partners because they believe that class of supplier won’t drag their feet to protect legacy equipment sales.  The operators, in the recent project documents I’ve seen, are expanding their scope of projected changes.  They want “transformation” not evolution, and the question is who’s going to give it to them.

Does “SDN” Muddy Alcatel-Lucent’s Opto-Electrical Integration?

Pretty much everything these days is based on networking and software, which by current standards means that everything is SDN.  The trend to wash stuff with terms like SDN and NFV is so pervasive that you almost have to ignore it at least at first to get to the reality of the announcement.  So it is with Alcatel-Lucent’s announcement of its Network Services Platform, which is sort-of-SDN and sort-of-NFV in a functional sense, but clearly in the SDN camp from the perspective of the media and even many in Alcatel-Lucent.

At the basic level, NSP is a platform that sits on top of legacy optical and IP/MPLS infrastructure and provides operational control and unified management across vendor boundaries.  It exposes a unified set of APIs upward to the OSS/BSS/NMS world where services are created and managed, and it allows operators to manage a combination of optics and IP/MPLS as a cohesive transport network.  It could create the very “underlay” model I’ve suggested is the right answer for the network of the future.

The notion of unified software control is the “SDN” hook that a lot of people have picked up on, not the least because Alcatel-Lucent uses the term “SDN” to describe it in their press release.  Many SDN purists would disagree, of course.  The SDN approach they’d take is more like that of Open Daylight, which creates an SDN controller and puts legacy devices and interfaces underneath that controller as an alternative to OpenFlow as a control protocol.  The Alcatel-Lucent approach is actually a bit closer to what the ETSI NFV ISG seems to be converging on, which is a WAN Infrastructure Manager (WIM) that runs in parallel with the Virtual Infrastructure Manager (VIM).  Most in ETSI seem to put SDN and SDN controllers inside the VIM and use them for intra-VNF connectivity.

Network operators all seem to agree that 1) you need to evolve SDN out of current legacy infrastructure and 2) SDN has to have a relationship with NFV.  The question is how that should be accomplished—and Alcatel-Lucent’s NSP approach illustrates one of the viable options.  Does it illustrate the best one?  That’s harder to say, and it’s harder yet to say why they’ve taken this particular approach.

You probably all know my own thinking here.  I believe that all network services should be viewed as NaaS in the first place, meaning that I think that connection-model abstractions should be mapped either to legacy, to SDN, to NFV, or to mixed-model infrastructure as circumstances dictate.  I also believe that “infrastructure management” is a generic concept at the level where software control is applied, even if the infrastructure is for hosting stuff and not connecting it.

The Alcatel-Lucent approach, laid out in their NSP white paper, seems to make NSP and SDN more a parallel path than the unified element that their PR suggested.  The paper shows an NFV figure that places both NSP and Nuage under the operations/management layer rather than showing Nuage under NSP.  It’s also interesting to note that the Alcatel-Lucent press release never mentions NFV or positions either NSP or SDN relative to NFV.

Which I think is a mistake.  If you read the PR and even the white paper, you get the distinct impression that this is really all about optics and MPLS products.  There really is a need for coordinated SDN/legacy deployment, so you can’t say that Alcatel-Lucent was guilty of SDN-washing, but they do seem to have missed an opportunity to position NSP in a more powerful way.  That’s worrying because it could be a reflection of their age-old silos problem.

I remember commenting a couple years after the merger that created Alcatel-Lucent that the new company seemed to be competing as much with themselves as with real competitors.  More, in some cases, because internal promotion and status was linked to relative success.  This problem kept Alcatel-Lucent from fully leveraging its diverse assets, to the point where in my view it prevented the company from giving Cisco a real run for its money in the network operator space.

In today’s Alcatel-Lucent we have three dominant business segments responsible for facing the future.  One is the IP group, which has been the darling of Wall Street because of its ability to create engagement and hold or gain market share.  Another is the Cloudband stuff where Alcatel-Lucent’s NFV lives, and the third is Nuage, the SDN people.  It’s hard to read the NSP material and not wonder just how much these three units are communicating.

If NSP is in fact what the press release suggests, which is an umbrella strategy to unify legacy and SDN technology under a common software-defined-and-decoded abstraction, then it’s a step forward for Alcatel-Lucent, but the company would still have to prove that their approach is stronger than one based on Open Daylight (I think that would be possible because of management holes in ODL).  If NSP is what Figure 4 of their white paper says it is, then it’s leaving the SDN/legacy separation to be handled by “the higher level” meaning operations and management systems.  That to me undermines the value proposition for NSP in the first place, because facilitating software definition can’t be simply letting the other guy do the heavy lifting.

Some of the operators, particularly those in the EU, have what I think is the right idea about the SDN/NFV relationship and their ideas should, in my view, be the standard against which stuff like NSP is measured.  Their vision is of an NFV-orchestrated world where services can call on multiple “infrastructure managers” that could provide both intra-VNF and end-to-end connectivity and use both SDN and legacy network elements.  It seems to me that this vision would benefit from a unified model of SDN/legacy control, which would be in effect a merging of Alcatel-Lucent’s NSP and Nuage positioning.

Which may be why we don’t hear about it.  This positioning would cut across all three of those business units, uniting what may be three positions that Alcatel-Lucent would like to keep separate for the moment.  That’s not necessarily a bad thing (Cisco is in my view dedicated to keeping the evolution toward SDN/NFV from impacting legacy sales in the current period), but it does create vulnerabilities.

Both SDN and NFV demand radical future benefits to justify the comprehensive infrastructure changes they’d involve.  We need to see what those benefits are to make the journey, and vendors who invoke the sacred names of SDN or NFV have to be prepared to show how their stuff fits in a compelling vision of that future infrastructure.  Vendors who don’t have to defend the present, meaning in particular software/server giants like HP or Oracle, could run rampant in the market while network-equipment competitors are still shifting from silo to silo.

I don’t think that legacy and SDN should be parallel concepts integrated at the ops level.  The NaaS model, IMHO, is best served by having vertical integration of higher (service) and lower (transport) layers based on policy or analytics.  Alcatel-Lucent actually has analytics and has all the pieces it needs to create the right model, and NSP is functionally a good step.  To make the benefits both real and clear, though, it seems they may have to fight those same old product-silo demons.

A Security/Compliance “Model” for SDN and NFV

We know we need to have security and compliance in SDN and NFV, simply because we have them today in other technologies.  We also know, in at least a loose sense, that the same sort of processes that secure legacy technology could also secure SDN and NFV.  The challenge, I think, is in making “security” or “compliance” a part of an SDN or NFV object or abstraction so that it could be deployed automatically and managed as part of the SDN/NFV service.

Security, or “compliance” in the sense of meeting standards for data/process protection, has three distinct meanings.  First, it is a requirement that can be stated by a user.  Second, it is an attribute of a specific service, connection, or process.  Finally, it is a remedy that can be applied as needed.  If we want a good security/compliance model for SDN and NFV we need to address all three of these meanings.

The notion of attributes and remedies is particularly significant as we start to see security features built into network and data center architectures.  This trend holds great potential benefits, but also risks, because there’s no such thing as a blanket security approach, nor is “compliance” meaningful without understanding what you’re complying with.  Both security and compliance are evolving requirement sets with evolving ways of addressing them.  That means we have to be able to define the precise boundaries of any security/compliance strategy and we have to be able to implement it in an agile way, one that won’t interfere with overall service agility goals.

Let’s start by looking at a “service”.  In either SDN or NFV it’s my contention that a service is represented by an “object” or model element.  At the highest level, this service object is where user requirements would be stated, and so it’s reasonable to say that a service object should have a requirements section where security/compliance needs are stated.  Think of these as being things like “I need secure paths for information” and “I need secure storage and processing”.
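
To make that concrete, here is a minimal sketch in Python of a service object that carries a requirements section alongside its other top-level properties.  Everything here is a hypothetical illustration of the idea, not code from any actual SDN/NFV toolkit.

    from dataclasses import dataclass, field

    @dataclass
    class ServiceObject:
        """Top-level model element representing a service as the user ordered it."""
        name: str
        requirements: set = field(default_factory=set)   # e.g. "secure-paths", "secure-storage"
        properties: dict = field(default_factory=dict)   # capacity, SLA, endpoints, and so on

    # A user order that states its security/compliance needs up front:
    order = ServiceObject(
        name="branch-vpn",
        requirements={"secure-paths", "secure-storage"},
        properties={"sites": 12, "bandwidth_mbps": 50},
    )
    print(order.requirements)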

When a service object is decomposed, meaning when it’s analyzed at lower levels of the structure on the path toward making actual resource assignments, the options down there should be explored with an eye to these requirements.  In a simple sense, we either have to use elements of a service that meet the service-level requirements (just as we’d have to do for capacity, SLA, etc.) or we have to remedy the deficiencies.  The path to that starts by looking at a “decomposable” view of a service.

At this next level, a “service” can be described as a connection model and a set of connected elements.  Draw an oval on a sheet of paper—that’s the connection model.  Under the oval draw some lines with little stick figures at the end, and that represents the users/endpoints.  Above the oval draw some more lines with little gear-sets, and those represent the service processes.  That’s a simple but pretty complete view of a service.

If a service consists of the stuff in our little diagram, then what we have to do to deploy one is to commit the resources needed for the pieces.  Security and compliance requirements would then have to be matched to the attributes of the elements in our catalog of service components.  If we have a connection model of “IP-Subnet” then we’d look at our model resources to find one that had security and compliance attributes that matched our service model.  Similarly, we’d have to identify service processes (if they were used) that would also match requirements.

My view is that all these resources in the catalog would be objects as well, built up from even lower-level things that would eventually lead to resources.  A service architect could therefore build an IP-Subnet that had security/compliance attributes and committed resources to fulfill them, and another IP-Subnet that had no such attributes.  The service order process would then pick the right decomposition based on the stated requirements.
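
Here is one way that selection might look, again as a hedged Python sketch: two catalog decompositions of the same IP-Subnet abstraction, and a selector that picks whichever option’s attributes cover the stated requirements.  The catalog entries and attribute names are invented for illustration.

    # Hypothetical catalog: two decompositions of the same "IP-Subnet" abstraction,
    # one built from ordinary resources and one whose resources carry
    # security/compliance attributes.
    CATALOG = {
        "IP-Subnet": [
            {"decomposition": "ip-subnet-basic",  "attributes": set()},
            {"decomposition": "ip-subnet-secure", "attributes": {"secure-paths", "secure-storage"}},
        ],
    }

    def select_decomposition(abstraction, requirements):
        """Pick the first catalog option whose attributes cover the stated requirements."""
        for option in CATALOG.get(abstraction, []):
            if requirements <= option["attributes"]:
                return option["decomposition"]
        return None   # nothing matches; a remedy process has to be applied instead

    print(select_decomposition("IP-Subnet", {"secure-paths"}))   # ip-subnet-secure
    print(select_decomposition("IP-Subnet", set()))              # ip-subnet-basic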

It’s possible, of course, that there are no such decompositions provided.  In that case, there has to be a remedy process applied.  If you want to have the service creation and management process fully automated (which I think everyone would say is the goal) then the application of the remedy has to be automated too.  What might that look like?

Like another service model, obviously.  If we look at our original oval-and-line model, we could see that the “lines” connecting the connection model to the service processes and users could also be decomposable.  We could, for example, visualize such a line as being either an “Access-Pipe” or a “Secure-Access-Pipe”.  If it’s the latter then we can meet the security requirements if we also have an IP-Subnet that has the security attribute.  If not, then we’d have to apply an End-to-End-Security process, which could invoke encryption at each of our user or service process connections.

Just to make things a bit more interesting, you can probably see that an encryption add-on, to be credible, might have to be on the user premises.  Think of it as a form of vCPE.  If the customer has the required equipment in which to load the encryption function, we’re home free.  If not, then the customer’s access pipe for that branch would not have a secure option associated with it.  In that case there would be no way to meet the service requirements unless the customer equipment were to be updated.
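
A rough sketch of that remedy logic, assuming hypothetical site flags and element names, might look like this:

    def resolve_access(site, secure_required):
        """Decompose a branch access line, applying the encryption remedy if needed."""
        if not secure_required:
            return "Access-Pipe"
        if site.get("secure_access_available"):
            return "Secure-Access-Pipe"
        if site.get("vcpe_capable"):
            # Remedy: an End-to-End-Security process hosted in the branch vCPE.
            return "Access-Pipe + End-to-End-Security(vCPE encryption)"
        # No native option and nowhere to host the remedy: the order can't be met
        # without a customer equipment upgrade.
        raise ValueError(f"security requirement cannot be met at {site['name']}")

    print(resolve_access({"name": "branch-1", "secure_access_available": True}, True))
    print(resolve_access({"name": "branch-2", "vcpe_capable": True}, True))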

I think there are two things that this example makes clear.  First is that it’s possible to define “security” and “compliance” as a three-piece process that can be modeled and automated just like anything else.  Second, the ability of a given SDN or NFV deployment tool to do that automating will depend on the sophistication of the service modeling process.

A “service model” should reflect a structural hierarchy in which each element can be decomposed downward into something else, until you reach an atomic resource.  That resource might be a “real” thing like a server, or it might be a virtual artifact like a VPN that is composed by commanding an NMS that represents an opaque resource structure.  The hierarchy has to be supported by software that can apply rules based on requirements and attributes to select decomposition paths to follow.
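
Assuming such a hierarchy, the walk itself can be a short recursion; the model structure below is purely illustrative.

    # Hypothetical hierarchy: each element decomposes into children until something
    # atomic is reached: a real resource, or an opaque NMS-composed artifact.
    MODEL = {
        "vpn-service": ["IP-Subnet", "Access-Pipe"],
        "IP-Subnet":   ["mpls-vpn-via-nms"],
        "Access-Pipe": ["ethernet-port", "vcpe-host"],
    }

    def decompose(element):
        """Anything without an entry in MODEL is treated as an atomic resource."""
        children = MODEL.get(element)
        if not children:
            return [element]
        resources = []
        for child in children:
            resources.extend(decompose(child))
        return resources

    print(decompose("vpn-service"))   # ['mpls-vpn-via-nms', 'ethernet-port', 'vcpe-host']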

At one level, NFV MANO at least should be able to do this sort of thing, given that there are a (growing) number of service attributes that it proposes to apply for resource selection.  At another level, there’s no detail on how MANO would handle selective decomposition of models or even enough detail to know whether the models would be naturally hierarchical.  There’s also the question of whether the process of decomposition could become so complex (as it attempts to deal with attributes and requirements and remedies) that it would be impossible to run it efficiently.

It’s my view that implementations of service modeling can meet requirements like security and compliance only if the modeling language can express arbitrary attributes and requirements and match them at the model level rather than having each combination be hard-coded into the logic.  That should be a factor that operators look at when reviewing both SDN and NFV tools.

Can Effective NFV Management/Analytics Solve SDN’s Management Problem?

Both SDN and NFV deal with virtualization, with abstractions realized by instantiating something on real infrastructure.  Both have management issues that stem from those factors, which means that they share many of the same management problems.  Formally speaking, neither SDN nor NFV seems to be solving its problems, but there are signs that some of the NFV vendors are being forced to face them, and are facing them with analytics.  That may then solve the problems for SDN as well, and for the cloud.

You could argue that our first exposure to the reality of virtualization came from OpenStack.  In OpenStack, we had a series of models (CPU, network, image store, etc.) that represented collections of abstractions.  When something deployed you made the abstractions real.  DevOps, which also came into its own in the cloud even though the concept preceded the cloud, also recognized that abstract models were at least a viable approach to defining deployment.  OASIS TOSCA carries that forward today.

The basic problem abstractions create is that the abstraction represents the appearance, meaning what the user would see in a logical sense.  If you have a VPN, which is an abstraction, you expect to use and manage the VPN.  At some point you have to come to terms with the fact that the VPN is built from real elements, because faults live in real devices and can’t be fixed from inside the virtual view, and that coping-with-reality problem is exactly what makes virtual management hard.

My personal solution to this problem was what I called derived operations, which means that the management of an abstraction is done by creating a set of formulary bindings between management variables in the abstract plane and real variables drawn from real resources.  It’s not unlike driving a car; you have controls that are logical for the behavior of driving and these are linked to real car parts in such a way as to make the logical change you command convert to changes in auto elements that make that logical change real.

In one sense, derived operations is simple.  You could say “object status = worst-of(subordinate status)” or something similar.  In virtualization environments of course, you don’t know what the subordinates are until resources are allocated.  That means two levels of binding—you have to link abstract management variables to other abstract variables that will on deployment be linked to real variables.  You also have to accept the fact that in many cases you will have layers of abstraction created to facilitate composition.  Why define a VPN in detail when you can use a VPN abstraction as part of any service that uses VPNs?  But all of this is at least understood.
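
As a rough Python illustration of those two levels of binding (the worst-of formula comes from the text above; the class and variable names are invented):

    SEVERITY = {"ok": 0, "degraded": 1, "fault": 2}

    def worst_of(statuses):
        """The 'object status = worst-of(subordinate status)' formula."""
        return max(statuses, key=lambda s: SEVERITY[s], default="ok")

    class ServiceStatus:
        """Derived operations: the service's status is computed, never stored."""
        def __init__(self, subordinates):
            self.subordinates = subordinates   # abstract names known at design time
            self.bindings = {}                 # filled in only when resources are assigned

        def bind(self, subordinate, resource_status_fn):
            self.bindings[subordinate] = resource_status_fn

        def status(self):
            return worst_of(fn() for fn in self.bindings.values())

    svc = ServiceStatus(["vpn", "vcpe"])
    svc.bind("vpn", lambda: "ok")          # at deployment these lambdas would read
    svc.bind("vcpe", lambda: "degraded")   # real resource variables
    print(svc.status())                    # degraded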

The next problem is more significant.  Say we have an abstract service object.  It has a formula set to describe its status.  We change the value of one of the resource status variables, one that probably impacts a dozen or so service objects.  How do we tell those objects that something changed?  If we’re in the world of “polled management” we assume that when somebody or something looks at our service object, we would refresh its variables by running the formulative bindings it contains.

Well, OK, but even that may not work.  It’s not efficient to keep running a management function just to see if something changed.  We’d be wasting a lot of cycles and potentially polling for state too many times from too many places.  What we need is the concept of an event.

Events are things that have to be handled, and the handling is usually described by referencing a table of “operating states” and events, the intersection of which identifies the process to be invoked.  We know how to do this sort of thing because virtually every protocol handler is written around such a table.  The challenge comes in distributed event sources.  Say a trunk that supports a thousand connections fails.  That failure has to be propagated up to the thousand service models that are impacted, but how does the resource management process know to do that?
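
The mechanism itself is familiar; a toy version of such a table in Python might look like the following, where the states, events, and processes are all invented for illustration.

    def start_remediation(ctx):
        ctx["state"] = "degraded"
        print("remediation process invoked")

    def clear_fault(ctx):
        ctx["state"] = "active"
        print("alarms cleared")

    def ignore(ctx):
        pass

    # The (state, event) intersection names the process to be invoked.
    STATE_EVENT_TABLE = {
        ("active",   "trunk-fail"):    start_remediation,
        ("degraded", "trunk-restore"): clear_fault,
        ("degraded", "trunk-fail"):    ignore,   # already being handled
    }

    def dispatch(ctx, event):
        STATE_EVENT_TABLE.get((ctx["state"], event), ignore)(ctx)

    service = {"state": "active"}
    dispatch(service, "trunk-fail")
    dispatch(service, "trunk-restore")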

This is where analytics should come in.  Unfortunately, the use of analytics in SDN or NFV management has gotten trivialized because there are a number of ways it could be used, one of which is simply to support management of resources independent of services.  Remember the notion of “directory-enabled networking” where you have a featureless pool of capacity that you draw on up to a point determined by admission control?  Well that’s the way that independent analytics works.  Fix resource faults and let services take care of themselves.

If you want real service management you have to correlate resource events with service conditions, which means you have to provide some way of activating a service management process to analyze the formulary bindings that define variable relationships, and anchor some of those binding conditions as “events” to be processed.  If I find a status of “fault” here, generate an event.

When you consider this point, you’ve summarized what I think are the three requirements for SDN/NFV analytics:

  1. The proactive requirement, which says that analytics applied to resource conditions should be able to do proactive management to prevent faults from happening. Some of this is traditional capacity planning, some might drive admission control for new services, and some might impact service routing.
  2. The resource pool management requirement, which says that actual resource pools have to be managed as pools of real resources with real remediation through craft intervention as the goal. At some point you have to dispatch a tech to pull a board or jiggle a plug or something.
  3. The event analysis requirement, which says that analytics has to be able to detect resource events and launch a chain of service-level reactions by tracking the events along the formulary bindings up to the services.

The nature of the services being supported determines the priority of these three requirements for a given installation, but if you presume the current service mix then you have to presume all three requirements are fulfilled.  Given that “service chaining” and “virtual CPE” both presume some level of service-level agreement because they’re likely first applied to business services, that means that early analytics models for SDN/NFV management would have to address the event analysis requirement that’s the current stumbling block.
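
A sketch of that event analysis requirement, assuming the formulary bindings are indexed by resource when they’re created, could be as simple as this (the index and names are hypothetical):

    from collections import defaultdict

    # Reverse index built as bindings are created at deployment time:
    # resource -> service models whose formulas reference it.
    IMPACTED = defaultdict(set)

    def bind(service, resource):
        IMPACTED[resource].add(service)

    def on_resource_event(resource, event):
        """Event analysis: fan one resource event out to every bound service model."""
        for service in IMPACTED[resource]:
            print(f"re-evaluating {service}: {resource} reported {event}")

    bind("vpn-cust-001", "trunk-17")
    bind("vpn-cust-002", "trunk-17")
    on_resource_event("trunk-17", "fault")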

From an implementation perspective, it’s my view that no SDN/NFV analytics approach is useful if it doesn’t rely on a repository.  Real-time event-passing and management of services from the customer and customer-service side would generate too much management traffic and load the real devices at the bottom of the chain.  So I think all of this has to be based on a repository and query function, something that at least some of the current NFV implementations already support.

Where this is important to SDN is that if you can describe SDN services as a modeled tree of abstract objects with formulary bindings to other objects and to the underlying resources, you can manage SDN exactly as you manage NFV.  For vendors who have a righteous model for NFV, that could be a huge benefit because SDN is both an element of any logical NFV deployment and a service strategy in and of itself.  Next time you look at management analytics, therefore, look for those three capabilities and how modeling and formulary binding can link them into a true service management strategy.

We Know We’re Building Clouds and not Networks in the Future; We Just Don’t Know How

It’s obvious that there’s a relationship between NFV and the cloud, but for many the relationship is simply one of common ancestry.  You host apps in the cloud and with NFV you host features there.  Well, there’s a lot more to the cloud than just hosting apps or features.  The cloud is the model of the network of the future, and so a vendor’s cloud-savvy may be critical to their credibility as an NFV partner.

If NFV was a simple value proposition, a single golden application that when implemented would suddenly arrest the convergence of revenue and cost per bit, we’d have it easy.  The problem is that the “golden applications” for NFV that we see today are golden largely because they’re contained and easily addressed.  Our experience with them, at the service level, isn’t going to take us to where we need to be.  So we have to look to other value propositions.

We hear a lot about how service agility, service chaining, and SDN are going to create new opportunities for network operators.  The net of this is the assertion made by many, that bit-based services can still be profitable.  Well, the facts don’t bear that out.  In the most recent quarter, for example, enterprise services for Verizon were off by 6% and made up only about 10% of revenues.  My survey data shows that enterprises now have premises tools for firewalls, NAT, and so forth, and that the path to substitute carrier services for these tools is littered with delays caused by financial cycles (depreciation of the current assets) and fears of operational impacts.  Operator CFOs privately think that service chaining isn’t going to make a meaningful difference in that converging revenue/cost curve.

The services most talked about today aren’t suitable for driving broad transformation simply because they’re not broadly consumed.  They don’t impact enough total cost or introduce enough incremental revenue.  That’s what makes NFV a kind of game of bridge.  We have to build a future infrastructure that will make money, so we have to bridge between the implementation of our early services and the implementation of future services.  We have to get to that future with some flair, meaning that we’ll have to build out the future-facing tools while addressing near-term costs and revenues.  Where the features of NFV that we hear about matter is in that bridging.

The challenge is that you need both banks to make a bridge.  We know where we are now, and we understand that things like service chaining can lead us forward.  What’s less clear is what “forward” means in terms of features and technologies.  We can’t expect operators to renew their infrastructure for every step of the transition.  We need transitioning infrastructure, transitioning operations practices.

Saar Gillai, SVP and general manager of HP’s NFV Business Unit, did what I think might be the first blog post made by a vendor on this topic.  In it he lays out four phases of NFV as decouple, virtualize, cloudify, and decompose.  This starts with breaking up appliances by decoupling function from one-off hardware, moves to virtualizing the software for the functions, adds in agile cloud hosting of both functions and applications, and finally decomposes both elements of services and apps to permit more agile composition of new stuff.

I think these phases are the right framework.  Making them work in practice will involve two things.  First, we really do need to understand the kind of network future we’re building.  We cannot even recognize the right decisions without that standard of future mission to guide us.  Second, we need to find a way to accelerate benefits to fund the changes.  Without that, the first baby steps won’t provide enough benefit to justify broad deployment.

I’ve talked about orchestration of operations and its importance before, but I want to point out another aspect of orchestration that might well be just as important.  It’s the other face of service management, which is service logic.  We have to recognize a key point, which is that as we accelerate service cycles we reach a point where what we’re describing in a service model is how it works and not how it’s built.

Right now, we’re visualizing an implementation of NFV and orchestration that deploys stuff.  Its role is the service management side—putting resources into place, connecting them, and sustaining their expected operating state and SLA.  Even this service management mission is broader than current standards activities can support, because we don’t fully control mixtures of legacy, SDN, and NFV and because we don’t automate the management processes along with the deployment.  Taking orchestration to the service logic level seems daunting.

Well, maybe that’s because we’re thinking too narrowly about services.  If you believe that OTT players have the secret to future revenues, then you believe in service logic.  Most OTT revenues come from advertising, which is composed into web pages dynamically and delivered.  That’s not management—we don’t build web pages and store them in anticipation of their being used.  We look at cookies and decide what ads to serve, and we then serve them.  Service logic.

Some of the implementations of NFV orchestration also touch, or could touch, on service logic.  Overture’s Ensemble Service Orchestrator uses BPMN to describe a deployment workflow.  Similar notation could describe service logic.  In my original ExperiaSphere project, the alpha demonstration involved finding a video based on a web-search API, finding its serving location, and describing an expedited delivery plan.  That could be viewed as deployment, but it is also very close to being service logic.

In fact, anything that can select processes based on state/event analysis can do both service management and service logic, which means that it’s likely that any NFV implementation that met current requirements for operations orchestration could become service logic orchestration tools too.  That’s good, because service logic and cloud support for it is the big brass ring of the future that every operator and every vendor is trying (even when they don’t fully realize it) to grasp.

Service-logic orchestration means you could orchestrate the flow of work among service elements in real time.  This clearly isn’t suitable for high-speed data-path applications but I think those applications are just a small step above pushing bits.  What future services have to offer is some sort of functional utility, something like exploiting LBS to find things or serve ads, or exploit community behavior to do the same.  In short, service logic is part of realizing mobile/behavioral opportunities.

Service logic is about handling events.  So is service management.  What separates the two (besides intent) is the performance requirements involved and the fact that service logic is likely to compose experiences not from things newly deployed but from things that are available as resources.  These are the “microservices” I’ve talked about.
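
As a loose illustration of the difference, here is a Python sketch of service logic composing an experience from microservices that are already deployed and multi-tenant.  Every function and name below is hypothetical; nothing is deployed per order, the logic simply sequences work across what’s already there.

    # Already-running, multi-tenant microservices; nothing is spun up per customer.
    def locate(user):
        return {"lat": 40.7, "lon": -74.0}

    def pick_ad(user, location):
        return "ad-123"

    def deliver(user, ad):
        return f"served {ad} to {user}"

    def experience(user):
        """Service logic: orchestrate a real-time flow of work across available services."""
        location = locate(user)
        ad = pick_ad(user, location)
        return deliver(user, ad)

    print(experience("subscriber-42"))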

I think cloud-hosted microservices are the key to NFV’s future.  There is, I think, some recognition that they’re important even in the ETSI ISG’s current work; multi-tenancy of service elements is an issue under discussion.  The challenge is advancing our thinking before we’ve advanced to the future we’re supposed to be preparing for.

NFV can’t be about building networks a different way, or building bit-based services a different way, or building services for a small fraction of the customers a different way.  We need to make massive changes in both cost and revenue and that will take transformative architectures.  Those are the architectures we need to be thinking about right now, because 2015 is grinding along and we’re running out of time before operators are forced to look beyond NFV and SDN and even the cloud for their transformation.

Cisco: Facing their Past to Save their Future

Here is an interesting question for you.  If the gazelle evolves, does the lion also have to change?  Of course, you’d say.  A food chain generates a chain reaction to any significant alterations.  Well, then, how about this one.  If network services evolve to something very different, does enterprise network equipment also have to evolve?  That’s the question that should be plaguing Cisco and maybe other network vendors as well.

If you look at Cisco’s quarter you see what probably surprises nobody at all.  Their enterprise numbers were pretty good and their service provider results were dim.  Here’s Chambers’ comment from the earnings call:  “We are managing continued challenges in our service provider business, which declined 7%, as global service provider Capex remained under pressure and industry consolidation continues.”  There are two questions begging to be asked regarding these numbers.  First, why are operators holding back while enterprises spend?  Second, will changes in the operator business model inevitably impact the enterprise?

Everyone buys stuff for ROI.  For enterprises, the return comes in the form of improved worker productivity, lower support costs, and lower equipment costs.  My surveys suggested that enterprises responded to the 2008 economic crisis by holding back on “modernization spending”.  They’re not doing that as much now, though they’re also not backfilling to make up for past neglect.  Whatever the details, enterprises really can’t just stop spending on networking because networking supports their operations.  Even in a bad year you can’t stop making payments or collecting on invoices.

For operators it’s more complicated.  They sell services based on expensive and long-lived infrastructure.  They could certainly decide to exit service markets that weren’t profitable, or to invest only where profit could be had.  Verizon, remember, doesn’t offer FiOS everywhere.  They sell it where they can make money, and they’re trying to sell off their access business to rural telcos where FiOS isn’t going to pay off.  Operators also have the option to under-invest in infrastructure and allow service quality to decline if it’s impossible to make money by providing what customers want.

I think all of these factors explain the current Cisco profit picture.  Operators are saying that their profit per bit is declining so they’re not rushing out to spend on infrastructure to generate more bits.  Enterprises are tied to network-centric application paradigms for productivity enhancement.  The latter are carrying spending better than the former.

The latter are also spending on the services of the former.  When we didn’t have IP VPNs, enterprises bought routers for WAN transport.  Today those products aren’t necessary.  The question is whether new services could change the enterprise network composition as old ones did.  If they do, then Cisco’s enterprise business is also in jeopardy.

The cloud could be another issue for enterprise spending on network equipment.  Most enterprise switching goes into data centers, and if there is a significant migration of applications from the enterprise data center to the public cloud, there would be a drop in enterprise data center switching spending.  This could be somewhat offset by gains on the provider side, but obviously cloud computing can’t work if there’s no economy of scale, so we’d have to assume that compensatory cloud provider data center switching spending gains would be significantly smaller than the enterprise losses.

The big question, though, is whether the evolution of “services” that network operators and even equipment vendors are proposing would impact the way enterprises buy equipment.  One obvious example is the virtual CPE stuff.  Today we’d often terminate business services to branch offices in a router or custom appliance.  What operators plan to do is to terminate it in a cheap little interface stub backed up by hosted functionality in the cloud.  There are a lot more branches than headquarters locations, so if this technology switch succeeds then enterprise branch networking could change radically.

Then there’s NaaS.  We hear that SDN could let us dial up a connection ad hoc, letting enterprises buy bandwidth as needed on a per-application and per-user basis.  What does this do to traditional networking?  Even carrier networks that are at least partially justified by VPN services might be changed if suddenly we were just building connections on demand.  Underneath a VPN is IP routing.  Underneath SDN forwarding paths is…well, nothing specific.

Virtual network elements could let enterprises bypass the whole notion of Ethernet or IP services and devices and simply funnel tunnels into servers and client devices over pretty much featureless optical or opto-electrical paths.  An “overlay SDN” technology like the original Nicira stuff, now part of VMware, or the Nuage SDN products from Alcatel-Lucent could be used to build this kind of network today.  At the very least it could dumb down the client/branch side of the network, and from their latest announcement it’s clear that Nuage is aiming at integrating enterprise data center networking and branch networking even to the extent of supporting combined operations.

If you combine NaaS and NFV principles you get network services that are composed rather than provisioned in the old sense.  Think of it as a kind of 3-D printer for services.  You send a blueprint to the Great Service Composer and you get what you asked for spun out in minutes.  This would be a profound change in not only services but applications, including cloud computing.  All of a sudden application features aren’t put anywhere in particular, they’re just asked for and supplied from the most economical source or with the most suitable SLA.

What Cisco is facing, what Cisco should fear, isn’t white box switching.  The fact is that we’ve not done much yet to make “forwarding engines” like OpenFlow devices into alternative network components.  We’ve just made them into switches and routers.  That would have to change, though, if we expect to have the kind of things I’ve noted here, and if it does change at the service level then it will pull through a transformation even at the enterprise equipment level.

This doesn’t mean that I advocate Cisco jumping with both feet into the deep end of the NaaS-and-NFV pool.  I think they’d simply have too much to lose.  Networking is an industry whose depreciation cycles are very long, and it will take time for the service providers and enterprises to adapt their infrastructure to a new model even if they understand that model and accept its consequences.  Cisco could, in a stroke, make the future more understandable and acceptable, but I don’t think they could win in it quite yet.  Till they reach that tipping point, I think we’re going to hear the same story of hopefulness for old technology and blowing kisses at the new.

The Real Lesson of the Verizon/AOL Deal

The network operators, particularly the telcos, are in a battle with themselves, attempting a “transformation” from their sale-of-bits model of the past to something different and yet not precisely defined.  One likely offshoot of this, Verizon’s decision to acquire AOL, has generated a lot of comment and I think it may also offer some insights into where operators are heading in terms of services and infrastructure.  That has implications for both SDN and NFV.

AOL reported about $2.6 billion in revenues, compared to Verizon’s $128 billion.  I don’t think, nor do most Street analysts, that Verizon was buying AOL for their revenue.  Verizon’s revenues, in fact, are about a quarter of the whole global ad spend.  You don’t find relief for a large problem at a small scale.  That means they were likely buying them for their mobile and ad platform—their service-layer technology.  So does this mean that telcos are going to become OTTs?

Not possible.  Obviously somebody has to carry traffic, so the most that could happen was that telcos could add OTT services to their inventory.  But even that has challenges as a long-term business model because of seismic shifts in carrier profits.  OTTs get access to the customer essentially free, and telcos get paid for providing customer services, right?  Right, but remember that operators everywhere are saying their revenue and cost per bit curves will cross about 2017.  If the operator ends up providing Internet at a loss and making up the loss with OTT services, their service profits will be lower than the “real” OTTs’.  They have to make up a loss the others don’t bear.

What is possible?  Start with the fact that the converging cost/revenue per bit curves can be kept separate by improving costs or raising revenues, or any combination thereof.  We’re used to thinking of “revolutionary” technologies like SDN, NFV, and the cloud as being the total solution, meaning that we’d address both cost and revenue with a common investment.  Not only that, operators themselves have made that assumption and most still do so.  Of the 47 global operators I’ve surveyed, 44 have presented “transformation plans” to address both costs and revenues.  Well, AOL isn’t likely to impact Verizon’s cost, so it just might be that Verizon has decided to split what nearly everyone else has combined.  That could be big news.

It’s not that NFV can place ads or control mobile experiences directly.  As announcements for session-service platforms from Alcatel-Lucent, Huawei, and Oracle have already shown, you really need a kind of “PaaS” in which specialized service apps live.  You can deploy these platforms with NFV, operationalize them with NFV principles (if you have any NFV operations features from your vendor), and perhaps even connect them using SDN.  All of the vendors who have announced these platforms have taken pains to endorse NFV compatibility.

AOL isn’t a vendor, nor are they a carrier.  I’ve never heard them say a word about NFV or SDN and it’s unlikely they designed their mobile or ad platforms to be SDN-consuming or NFV-compliant.  That means that what Verizon bought isn’t either of these things, and that’s where the interesting point comes in.

Late last year, a number of operator CFOs told me that they needed a resolution to their cost/revenue problems quickly.  Many said that unless they could make the case for NFV as a path of resolution for these problems before the end of 2016, they’d have to find another path.  Might it be that Verizon has decided that they need a revenue-generating architecture, NFV or not?  AOL proves an important point, which is that many service platforms for mobile and advertising are multi-tenant.  You don’t have to “deploy” them for every user; they’re just hosted for multi-tenant services.  IMS is like that, after all.  So for these types of service you don’t “need” NFV.  It’s a cloud service.

Two things generate a need for orchestrated service behavior.  One is deployment frequency.  If I have to spin up an instance of a service feature set for every customer and service, then I have to worry a lot about costs for that process.  If I spin it up once for all customers and services, then I’m much less concerned about the spin-up costs.  The other factor is management and SLAs.  If a service has to be managed explicitly because it has an explicit SLA, then I have to orchestrate service lifecycle processes to make that management efficient.  If a service is best-efforts I’ll manage resources against service load and roll the dice, just like the Internet does.

If we can address service revenue growth outside SDN/NFV, how about the cost side?  Service automation is a path to reducing opex, but you could also envision opex reduction coming about as a result of “infrastructure simplification”.  Suppose you built a vertically integrated stack of virtual switching/routing, agile SDN tunnels, and agile optics, all presented as NaaS?  Could this stack not have much lower intrinsic operations cost even without a lot of add-on service automation or orchestration?  Sure could.

The risk to operators in both these “outlaw” solution strategies is that they’ll end up with silos by service, major infrastructure transformation costs to secure benefits, vendor lock-in, and so forth.  The operators don’t want to face these risks, but that’s not the primary driver.  The point I think Verizon/AOL may prove is that operators’ priority is solving their business problem not consuming a specific technology.  If SDN or NFV or even the cloud don’t boost the bottom line appropriately, they won’t be invested in.

I believe that SDN, NFV, and the cloud combine to solve all of the operators’ revenue and cost challenges to the extent that any infrastructure technology can.  The challenge is defining a scope for them to address enough benefits to justify costs.  The pathway to making this work is simple.

First, the cloud is the infrastructure of the future.  The network only connects cloud stuff and connects the cloud with users.  Thus, we need an architecture for service hosting and one for the building of services from hosted components.  I suggested a “microservices” approach to this but any useful architecture is fine.  We just can’t build toward a future with no specific technology framework to describe it.

Second, network infrastructure delivers NaaS.  Every service in the future has to be defined as a “connection model” and a set of endpoints.  The former offers the rules for delivery and the latter defines the things that can emit and receive traffic.  The definition, in abstract, has to then be resolved into resource behaviors appropriate to the devices involved.  Maybe it’s legacy/MPLS or GRE or VLAN.  Maybe it’s SDN.  Whatever it is, the resolution of the model defines the commitment of resources.
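
A hypothetical sketch of that resolution step might look like the following; the connection models, domains, and builder functions are all invented for illustration.

    # NaaS resolution: a service is a connection model plus endpoints, and the model
    # is resolved into whatever resource behavior the target domain actually offers.
    def build_mpls_vpn(endpoints):
        return f"provision MPLS VPN across {endpoints}"

    def build_sdn_paths(endpoints):
        return f"install OpenFlow forwarding paths across {endpoints}"

    RESOLVERS = {
        ("IP-Subnet", "legacy"): build_mpls_vpn,
        ("IP-Subnet", "sdn"):    build_sdn_paths,
    }

    def deliver_naas(connection_model, endpoints, domain):
        return RESOLVERS[(connection_model, domain)](endpoints)

    print(deliver_naas("IP-Subnet", ["hq", "branch-1"], "legacy"))
    print(deliver_naas("IP-Subnet", ["hq", "branch-1"], "sdn"))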

Third, services have to be modeled from intent through to resource structure through a common, phases-of-decomposition, chain of progress based on a harmonized modeling approach.  We have to isolate service definitions at the intent/functional level from changes made to how a function is fulfilled when the service is ordered.

Finally, service automation has to completely bind human and machine operations processes to the model so that all of the lifecycle phases of the service, from authoring it to tearing it down when it’s no longer needed, are automated based on a common description of what’s there and what has to be done.

The unifying concept here is that there has to be a unifying concept here.  Operators, including Verizon, may be risking long-term efficiencies by addressing their “transformation” processes without that unification.  But they have to address them, and I guess that if no unified approach is available then lack of unity is better than lack of profit.

There should have been a product for Verizon to buy, not a company.  We’ve known about the problem of profit-per-bit for almost a decade, and all of the technology concepts needed to solve it have been around for at least two or three years.  Everyone in the industry will now be watching to see whether Verizon’s move can get them to the right place.  What we have to learn from is that the industry is failing its customers.

What SDN and NFV REALLY Mean for the Network of the Future

There are reasons to do things and there are justifications and any CFO knows the difference.  In the last four blogs, I’ve talked about the value propositions for SDN and NFV, how they’re impacted by limited perceptions of the things SDN or NFV have to do, and the kind of holistic model that could define an SDN-and-NFV future.  From those, I think we can make some statements on how we’d get to such a future, meaning how near-term value propositions and steps could pave the way for SDN or NFV on a large scale.

Generally speaking, a new technology could be valuable to a network operator for three reasons:

  1. It reduces capital spending or improves cash flow—what could be called “capex” reasons.
  2. It reduces operating/operations costs, or “opex”.
  3. It improves revenue generation by creating new services or creating services more quickly.

Any of these reasons could be justifications for some sort of transformation project, but these projects in particular would likely be limited in scope.  For example, you can justify something like service chaining in business Ethernet access services based on capex.  The problem is that the target services account for only a small fraction of typical operator revenues and costs.  If the problem operators face is what they say it is, which is that revenue per bit and cost per bit are going to cross over in 2017ish timeframes, then we need something more massive than diddling down in the single-digit percents-of-revenue-and-costs.

The assessment I’ve made based on fairly extensive carrier dialog can be summarized as follows:

First, there is little chance that operators would gain significant bottom-line value from the capex value proposition alone.  “We can get 20% by beating up Huawei on price” is a real comment that shows the challenge of a capex-only solution.  No matter what anyone says, capex is not going to justify SDN or NFV deployment at any scale, period.  All it could do is provide a service-specific architecture in an age where operators are trying to avoid being service-specific.

Second, operations savings could be very compelling not only as a direct benefit source but as a means of insuring that other benefits aren’t diluted by opex increases associated with a new technology.  They could also be obtained at least in part by completely modernizing OSS/BSS even without changing the network, though they could be enhanced with network modernization.  The problem is that cost management always runs out of steam as costs approach zero.  You need an end-game.

Finally, new revenues created by “service agility” are very difficult to prove out in the near term because they typically require considerable change in both service infrastructure and operations and service lifecycle practices/processes.  However, this benefit is likely as durable as changes in the market would be.  Service agility demands not only an agile architecture but a framework for finding value and merchandizing it.

To me, the net of this is both interesting and (probably) controversial.  In the near term we’d be well served if we could simply orchestrate management and service lifecycle processes; we could avoid making network investment at any significant level and still gain a lot of ground.  The question is whether there is anything in place to accomplish that goal.  Where NFV comes in is not that you need it for operations efficiency, but that you need operations efficiency for NFV, and if you don’t get it in an NFV-harmonious way you may compromise service agility down the line.  NFV is a bridge between the current model and the future one if you extend its MANO concept as you should.

The same is true with SDN.  We don’t need SDN to create old services a different way, but to create new services.  Those who believe that everything we’re proposing to do with SDN can be done already are right in the service sense.  The problem is with our proposals, though, and not with SDN.  We are unable to frame the notion of a service in the abstract.  SDN should have forced us to face that question, as NFV should have forced us to face orchestration of operations.  Neither did what they should have.

Everything starts with opex, but operators themselves are divided on just how to achieve operations orchestration.  There’s a camp that believes that OSS/BSS simply has to become “event-driven”.   Another camp thinks that it has to be completely modernized, and a third camp thinks the whole notion is beyond redemption and should be scrapped.  The problem here is that the first group has no specific notion of what event-driven OSS/BSS would look like, the second has no specific track in mind for modernization to follow, and the third has no suggested replacement.

If we accept the 2017 deadline then any major operations overhaul is not in the cards.  I believe that effective operations orchestration could be done by applying NFV principles beyond the target scope of NFV as ETSI has set that scope.  I think I’ve proved that in two separate projects, in fact.  I’m not sure that we could get there any other way at this point, certainly not by 2017.  If we’re going to fix NFV in this area, though, it’s vendors who will have to do the heavy lifting.  The ETSI ISG and OPNFV alike are still down at the bottom of what has to be a top-down process.

I also believe that SDN needs its own set of benefit-boosters, but unlike NFV it’s hard to say exactly where those boosters are going to come from.  IMHO, the best way to promote SDN would be to make it clear how SDN-based services could be created and operationalized in NFV.  There is an ETSI ISG activity that’s working on that, but again IMHO it’s hampered by the fact that operations hasn’t been accepted as in-scope by NFV at large.  In addition, there’s virtually nothing useful you can do in operations automation if you don’t control legacy infrastructure elements and that was also declared out of scope by ETSI (though it’s working back in).

You have to lead with operations efficiency, or neither SDN nor NFV can hope to get off the ground in time to be useful.  You could probably get some gains in the service agility area even if you did nothing but totally orchestrate operations, but total orchestration of anything demands IMHO that you have some sort of “intent model” against which you can organize operations processes.  The challenge is that at least some operations processes relate to service-and-infrastructure activity that might then force you to move downward and start automating there.  That’s why I think the logical approach is to use an NFV-compatible approach to orchestrate infrastructure and network processes, and to enhance NFV as needed to make that work.
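
To show what I mean by organizing operations processes against an intent model, here’s a minimal sketch.  The states, events, and handlers are my own invented names, not anything taken from the ETSI documents; the point is only that billing, assurance, and other operations processes bind to model events rather than to the technology underneath.

```python
# Minimal "intent model" sketch: a service element exposes a state and events,
# and operations processes attach to those events.  Names are illustrative.

from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class IntentElement:
    name: str
    state: str = "ordered"                       # e.g. ordered -> active -> degraded
    handlers: Dict[str, List[Callable]] = field(default_factory=dict)

    def on(self, event: str, process: Callable) -> None:
        """Bind an operations process (billing, assurance, scaling...) to an event."""
        self.handlers.setdefault(event, []).append(process)

    def raise_event(self, event: str) -> None:
        for process in self.handlers.get(event, []):
            process(self)

vpn = IntentElement("corporate-vpn")
vpn.on("activated", lambda e: print(f"start billing for {e.name}"))
vpn.on("fault", lambda e: print(f"open trouble ticket for {e.name}"))
vpn.state = "active"
vpn.raise_event("activated")
vpn.raise_event("fault")
```

The same handlers would fire whether the thing underneath is a legacy router, an SDN domain, or a set of VNFs, which is exactly the property you need if operations orchestration is going to lead while infrastructure evolves behind it.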

Presuming we can get to the future with some flair, the challenge is making money there on new services.  I believe that there is little or no credibility to incremental service revenue gains from tweaking the characteristics of transport/connection services.  Customers who are offered on-demand bandwidth, for example, will exercise the option if their total cost is reduced, which means operator revenues would be reduced.  Thus, I believe that all credible new revenue has to come from something other than connection/transport.

Which means, I think, that all credible future revenues are going to come from something that looks at the service-implementation level like cloud computing.  New features, new applications, new services will all end up being hosted, and the way that all this good stuff manages to be profitable will depend on the application of service automation and orchestration techniques to cloud deployment.  It will also depend on the availability of features and apps to host, of course, and on service provider marketing that can sell all the stuff.  That latter set of requirements can’t be met in the near term, and even the orchestration requirements probably have to evolve out of whatever we do for near-term service orchestration or we’ll be stranding a lot of costs and practices.

This is what poses the challenge for operators.  I can make a lot of favorable cost changes in the present with little more than better operations.  Over time those changes will have to “descend the stairs” toward infrastructure in order to continue to capture new savings, and eventually we’ll get to the basement.  I can drive a lot of future revenue growth from effective cloud infrastructure, but not right away.  Logically, what we have to do here is leverage one concept (ops efficiency) in the near term and another (service agility) in the longer term, then bridge between them.

SDN and NFV, and even the cloud, are facing a crisis of visualization.  We have ideas so revolutionary that we’re trivializing them just to make them consumable in the near term.  You don’t need revolutionary ideas in the near term, folks.  In fact, there wouldn’t be time to develop them.  We have an unparalleled opportunity to remake networking and services, and we are going to meet that opportunity one way or another.  I think it’s time to decide what the opportunity really is and how we really get there effectively.

Have We Had the Solution to SDN Control All Along?

The question of how the network of the future could work, how SDN in particular could be introduced and managed, needs to be answered.  What’s really interesting is that it might have been answered already, and that not only are we not running to explore the solution, we might be running away.

One of the goals of SDN (at least the OpenFlow version) was to substitute central and explicit control over forwarding tables for adaptive protocols and behavior.  An SDN controller is expected to “rule” on explicit requests for routes or to set up default routes based on some policies.  The challenge is that central control of a very large network with a single controller raises major scalability issues.  The solution so far has been to apply SDN to relatively limited domains and to rely on legacy interworking techniques to connect those domains.  But that limits SDN’s benefits.  Is there a better way?

SDN displaces adaptive routing.  Could it be that there were developments in the past that could be considered in deciding how SDN networks might work?  Remember the line “Return with us now to those thrilling days of yesteryear?”  Well, it’s not the Lone Ranger here, but maybe the “Lonely Protocol” I’m talking about.  Let’s go back in time to the late ’90s and early 2000s.  We were in the age of frame relay and ATM, and startups like Ipsilon, and we were struggling to figure out how to make connection-oriented protocols work with IP.  Along came a solution: NHRP, the Next Hop Resolution Protocol.

“Call-oriented” services like ATM required that a connection be made by setting up a virtual circuit to something specific.  If we imagine an IP user at the edge of a frame relay network wanting to do an IP session with another such user, you can see the problem.  Both users are on the network but the network doesn’t support the normal IP discovery processes.  The standards (in the IETF) at the time even came up with a name for this kind of network—an “NBMA” or “non-broadcast multi-access” network.

NHRP was a solution to the NBMA problem.  You had a “next-hop server” that was a lot like a DNS.  The server would provide an NHRP agent at the edge of an NBMA with the network address (in the NBMA) of the “next hop”.  The agent would then call the designated partner agent, make the connection, and we’d have a packet path.  Even if the network were made up of multiple subnet-NBMAs we could run NHRP at each border to find a way across.
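
Stripped of all the protocol detail, the flow was roughly what the sketch below shows: a directory lookup followed by a call.  This is illustrative pseudologic, not real NHRP encoding, and every address and name in it is invented.

```python
# Sketch of the NHRP idea: an edge agent asks a "next-hop server" (a DNS-like
# directory) for the NBMA address behind a destination, then sets up a virtual
# circuit to that address.  Purely illustrative; no real NHRP messages here.

NEXT_HOP_SERVER = {
    # destination prefix -> NBMA address of the far-edge agent
    "10.1.0.0/16": "nbma:edge-A",
    "10.2.0.0/16": "nbma:edge-B",
}

def resolve_next_hop(dest_ip):
    """Very rough stand-in for a next-hop server lookup: match the first two octets."""
    prefix = ".".join(dest_ip.split(".")[:2]) + ".0.0/16"
    return NEXT_HOP_SERVER[prefix]

def send_packet(dest_ip, payload):
    nbma_addr = resolve_next_hop(dest_ip)   # ask the next-hop server for the far edge
    circuit = f"vc-to-{nbma_addr}"          # "call" the designated partner agent
    print(f"delivering {len(payload)} bytes to {dest_ip} over {circuit}")

send_packet("10.2.3.4", b"hello")
```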

No, we’re not going back to ATM here, but consider for a moment the question that SDN should be considering: is an SDN network something like an NBMA?  There are no forwarding paths in the network in its initial state, right?  There’s a “server” that can establish what needs to be done for a given packet at a given node, but no systemic notion of destination.  Could we adopt NHRP principles at the edge of an SDN network to establish pathways for packets based on source/destination?

This probably seems like a lot of work, going to next-hop servers and all.  Remember, though, that we go to DNS servers to decode the URLs that start most web activity.  We also go to an SDN controller to get forwarding instructions, potentially for every new flow.  At the very least, something like NHRP could be used for high-value services, and it could easily be used in carrier Ethernet, VPNs, and so forth.  Could it scale to the Internet?  As easily as SDN could without it, and perhaps more easily.

An NHRP-ish concept could in fact be combined with DNS.  We could get an IP address the way we do now when we decode a URL, but we could also get routing information, almost like the source-route vector that was a part of frame relay and ATM and also a part of MPLS.  Suppose the DNS server returned to us an ordered list of NHRP-ish domains that we had to transit to the destination.  We’d then move the packet to the edge of the first domain, let it get through itself as needed, and then do the same with the rest.

With this sort of model (and again I stress I’m citing this as an example, just something to start discussions) we have a mechanism for inter-SDN-domain linkage.  We also have a way of using SDN and the model to improve security.  The enhanced DNS-and-NHRP machinery could also be used to validate source addresses, so that packets can’t be emitted into the network unless they’re part of the “source tree” their addresses say they should belong to.  You could also quench a misbehaving source at its origin, simply by telling its home domain not to connect it.
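
Here’s a rough sketch of how that combined lookup-and-validate step might work.  Everything in it, the directory, the domain names, and the home-domain check included, is an assumption made up to frame the discussion, not a proposal for real record formats.

```python
# Sketch of the combined DNS-and-NHRP idea: a lookup returns the destination
# address plus an ordered list of domains to transit, and the ingress edge
# checks that a packet's source really belongs to the domain it claims.
# All structures here are assumptions for discussion.

DIRECTORY = {
    "app.example.com": {"addr": "10.9.9.9", "domains": ["metro-east", "core", "metro-west"]},
}
HOME_DOMAIN = {"10.1.1.1": "metro-east"}     # source address -> the domain it is homed in

def lookup(name):
    entry = DIRECTORY[name]
    return entry["addr"], entry["domains"]   # destination plus ordered domain list

def forward(src, name):
    dest, path = lookup(name)
    if HOME_DOMAIN.get(src) != path[0]:      # the "source tree" check at the ingress edge
        print(f"drop: {src} is not valid in domain {path[0]}")
        return
    for domain in path:                      # each domain gets the packet across itself
        print(f"transit {domain}")
    print(f"deliver to {dest}")

forward("10.1.1.1", "app.example.com")
forward("10.6.6.6", "app.example.com")       # spoofed source, quenched at its own edge
```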

This would work for SDN, but it would also work for any tunneling or “connection” protocol as well, at any of the OSI levels.  We could tunnel through something Ethernet-like, tunnel Ethernet through something IP-like, tunnel both through SDN.  Add in the microservices concept that I talked about yesterday to handle control-layer and management-layer interactions, and you could compose a service protocol from bare connectivity, which is what SDN provides.
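
As a toy illustration of that composition idea, the sketch below stacks arbitrary encapsulations over bare forwarding and keeps control and management entirely outside the stack, as microservices.  The layer names and the wrapping scheme are invented for this example.

```python
# Composing a "service protocol" from bare connectivity: a stack of tunnels at
# whatever levels we like, with control- and management-plane behavior supplied
# by microservices rather than by the forwarding layer.  Illustrative only.

def wrap(payload, layers):
    """Encapsulate a payload in each named layer, innermost layer listed first."""
    for layer in layers:
        payload = f"[{layer}|".encode() + payload + b"]"
    return payload

# Ethernet inside an IP-ish tunnel, both carried over bare SDN forwarding.
frame = wrap(b"user-data", ["ethernet", "ip-tunnel", "sdn-flows"])
print(frame)

# Control and management live outside the forwarding stack, as microservices.
control_plane = {"arp": "arp-responder-svc", "topology": "routing-view-svc"}
management_plane = {"snmp": "snmp-agent-svc"}
```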

There are obviously questions about how we’d set up a DNS-like hierarchy for this model, and how we’d integrate it with current IP, but I think you can see that we could route “normally” where we have adaptive discovery and use the new model where we don’t.  There may indeed be scalability issues, but those wouldn’t be any worse than we face with DNS now, or than we’d face with SDN controllers in the future.

The net of all of this is that we’d establish a model of “destination finding” that isn’t dependent on discovery, and that we would be able to apply that method to any technology that has forwarding tables that allow for packet movement and delivery.  That includes “connection-mode” stuff like RSVP/MPLS and OpenFlow SDN.  We can even fit the model to legacy technology.

NHRP has been around a long time, so you might think that all this is in place and ready to be used.  Well, while researching it I found a note that said Cisco was pulling NHRP support from its products.  It seems to me that we should instead be looking at how either NHRP or its principles could support SDN.  Of course, Cisco’s not a big fan of a revolutionary SDN transition.

I know a lot of people besides Cisco are going to tell me this is an awful idea (or at least they’ll think that!) and that we should be using the pure notions of IP.  I don’t have a problem with that assertion.  If the industry wants IP forever, go for it.  The problem is that we’re saying we want “new” technologies and benefits.  If we plan on doing everything the old way, we have to square that attitude with the notion that we’re going to do something revolutionary with SDN.

Can NaaS and Microservices Shape a Generalized SDN/NFV Service Model?

In my blog on Friday of last week, I talked about the pitfalls of not examining the details of current network services when considering how SDN or NFV might be used to implement or replace them.  Some of you have noticed that the blog opens a door to considering network services of all sorts as a kind of hybrid involving two hot topics—NaaS and “microservices”.  I want to use a discussion of those to start a deeper dive into a service model for SDN and NFV that will carry through both today’s blog and the one tomorrow.

Let’s imagine an access connection, something like Carrier Ethernet or a consumer broadband connection.  It’s just been installed and it doesn’t do anything yet.  Pump in a typical packet and it goes into the bit bucket.  But let’s also suppose that this connection has one special IP address, like 10.0.0.1.  If you go to port 80 (HTTP) of that address, you would see a portal, and through that portal you could now set up a service.  Maybe it’s consumer Internet, or maybe a VLAN, or even perhaps both.  Click and you have forwarding as requested, and the ability to receive packets from specified places.  You could also say that you wanted a firewall, DNS, DHCP, a VPN client, or special gaming or business collaboration features.  Whatever you pick is strung out on your access line and now accessible to you.
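
Behind that portal click, something like the following might happen: a catalog lookup that yields a forwarding behavior to program into the network and a chain of features to host along the access line.  The catalog entries and field names are made up purely for illustration.

```python
# Toy "portal back-end": a service order is turned into a connection model to
# program and a chain of feature microservices to host.  Catalog is invented.

CATALOG = {
    "consumer-internet": {"connection": "LAN", "features": ["dns", "dhcp", "firewall"]},
    "business-vlan":     {"connection": "line", "features": ["firewall", "vpn-client"]},
}

def order_service(access_id, service, extras=()):
    """What a click on the portal might translate into behind the scenes."""
    item = CATALOG[service]
    return {
        "access": access_id,
        "forwarding": item["connection"],                  # connection model to program
        "service_chain": item["features"] + list(extras),  # features strung along the line
    }

print(order_service("access-0001", "business-vlan", extras=["collaboration"]))
```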

This sort of thing is where I think that NaaS, SDN, and NFV could get us, with a healthy dose of “microservices” that would be a lot like VNFs but also a lot more.

All technology innovations aside, it’s time we started thinking about access connections as assets that we, and operators, can leverage freely.  They should not be dedicated to a specific service, but to an elastic set of service relationships that might be delivered using any convenient protocol.  A blank slate.  This model would suggest that our access portal could deliver to us a set of network-as-a-service capabilities, but if you look deeper each of these would consist of a connection model and a microservice set.

The connection model is simply a description of the forwarding behavior and site community associated with a NaaS.  This is what would define the addressing mechanism used and how traffic emitted at a given site would be handled (line, LAN, and tree are the classics).  The microservice set would provide higher-layer features, and could be extended to offer even more—which I’ll get to below.

What this would create would be something like an on-ramp to an expressway that’s lined with useful little shops.  You could access only the shops, you could access only the expressway (picking your route), or you could do a bit of both.  I’m not saying this is the only way to do it, but it would create the architectural model of an “enhanced” service.

Management of this could be done through the portal, meaning that we could set this up to deliver web-based management.  We could also deliver SNMP management by simply opening an SNMP microservice, and we could deliver an arbitrary management API as a microservice too, something like TMF MTOSI.

The microservices in this model could be hosted in the cloud, of course, and they could either be deployed per-tenant on demand or be multi-tenant services.  In addition, they could be deployed on the customer premises, either in a private cloud or in CPE that provides service termination.  The model doesn’t care what a microservice is or does, so it blends cloud applications and NFV features and makes it clear that what’s needed for either is increasingly needed for both.
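
Pulling those pieces together, a NaaS descriptor might look something like the sketch below: a connection model plus a set of microservices, with hosting location and tenancy as attributes, and with management itself delivered as just another microservice.  All of the field names here are my own assumptions.

```python
# A possible shape for a NaaS descriptor.  Field names and values are invented.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ConnectionModel:
    forwarding: str                          # "line", "LAN", or "tree"
    sites: List[str] = field(default_factory=list)

@dataclass
class Microservice:
    name: str
    hosting: str                             # "cloud", "premises-cloud", or "cpe"
    multi_tenant: bool = False

@dataclass
class NaaS:
    connection: ConnectionModel
    microservices: List[Microservice] = field(default_factory=list)

vpn = NaaS(
    connection=ConnectionModel(forwarding="LAN", sites=["hq", "branch-1", "branch-2"]),
    microservices=[
        Microservice("firewall", hosting="cpe"),
        Microservice("dns", hosting="cloud", multi_tenant=True),
        Microservice("snmp-agent", hosting="cloud"),   # management as a microservice
    ],
)
print(len(vpn.microservices), "microservices on a", vpn.connection.forwarding, "connection")
```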

There are obviously a lot of ways of looking at this besides the one I’m proposing, but I hope my comments make my central point.  We need to examine the end-game of SDN, NFV, and the cloud.  We need a picture of what networks and services and applications will look like at that point, because without it we’re going to have real trouble in the evolution to SDN and NFV.

Remember my comments on how all this arguably started with an operator vision of transformation?  Well, you can’t get much traction without having some sense of what you’re transforming to.  Part of that is simple merchandizing.  Advocates for SDN and NFV can talk about “starting small” or “early applications” or “basic services”, but transformation isn’t about limited change, it’s about massive change.  To what?

Not to mention the specific benefits.  If NFV or SDN is to improve capex, opex, or agility, it has to spread widely enough for the improvements within its scope to be significant at the business level.  Nobody will bet their job on a migration that saves one percent.

Where I think the big problem with limited-scope thinking comes in is that it hides the need for, and even the value of, systemic strategies.  I talked last week about the protocol issues of current services and their impact on SDN and NFV.  The model of NaaS and microservices that I described here could address those issues.  But what about the problems of SDN and NFV?

Let’s look at SDN as an example.  We have three possible modes of SDN operations.  One is where connectivity doesn’t exist and forwarding tables don’t exist, and the introduction of packets stimulates devices to ask the controller what to do with them, thus building up the routes.  Clearly this would never work on the Internet with some gigantic central server; it would likely take weeks to converge on full connectivity.  Another mode is where the central controller preconfigures routes.  This is fine for “interior” routing, but users appear and disappear at the edge and it’s getting to their addresses that forwarding is all about.  The final mode is adaptive, which gets us to building something we say is SDN but is actually just doing legacy routing/switching a little differently.
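
The contrast between the first two modes is easy to see in a toy sketch where the flow table is just a dictionary and the “controller” is a one-line policy function.  This is nothing like a real OpenFlow controller, but it shows where the scaling burden and the edge-discovery problem each come from.

```python
# Toy contrast of reactive versus proactive SDN setup.  Flow tables are plain
# dictionaries and the "controller" is a trivial policy function.

flow_table = {}                                    # (src, dst) -> output port

def controller_decide(src, dst):
    """Stand-in for controller policy: pick an output port for this pair."""
    return "port-2" if dst.startswith("10.2.") else "port-1"

def reactive_forward(src, dst):
    """Mode 1: an unmatched packet triggers a controller round-trip that builds the route."""
    if (src, dst) not in flow_table:
        flow_table[(src, dst)] = controller_decide(src, dst)
    return flow_table[(src, dst)]

def proactive_install(pairs):
    """Mode 2: the controller preconfigures routes, but edge users must be known in advance."""
    for src, dst in pairs:
        flow_table[(src, dst)] = controller_decide(src, dst)

print(reactive_forward("10.1.1.1", "10.2.2.2"))    # first packet pays the controller round-trip
proactive_install([("10.1.1.1", "10.3.3.3")])
print(flow_table)
```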

I think that future services will be NaaS-like, meaning pure forwarding with no inherent control/management behavior.  I think that control-plane activity will be supported then through microservices, and that microservices will also offer management connections.  I’d guess that many agree with these points, but I’d be happy if someone presented an alternative model.  Happy because it would get us to the discussion of what the ultimate SDN/NFV/cloud network would look and work like, and how we’d use our revolutionary technologies to get to that state.  We need that.