Network Feature Composition, Decomposition, and Microservices

At the TMF event in Nice, Verizon opened yet another discussion, or perhaps I should say “reopened” because the topic came up way back in April 2013 and it was just as divisive then.  It’s the topic of “microservices”, or breaking down virtual functions into very small components.  NetCracker also had some things to say about microservices, and so it’s a good thing to be talking about.

If we harken back to April of 2013, we’re at a point where the NFV ISG had just opened its activity.  There was still plenty of room to discuss scope and architecture, and there was plenty of discussion on both.  This was the meeting where I launched the CloudNFV project, and it was also the meeting where a very specific discussion on “decomposition” came up.

Everyone knows that the purpose of NFV was to compose services from virtual functions.  Anything that composes a whole from some parts will be sensitive to just how granular the parts are.  We know, for example, that if you compose virtual CPE from four or five functional elements (firewall, NAT, etc.) you get some benefits.  If you had a virtual function that consisted of all of these things rolled into one and that was as granular as you got, it’s hard to see how a physical appliance wouldn’t serve better.  Granularity equals agility.

The “decomposition” theme relates to this granularity.  Here, the suggestion was that operators require that virtual functions be decomposed not only into little feature granules, but even further into what today we’d call “microservices”.  There are a lot of common elements in things like firewall, VPN, NAT, and so forth, or so the decomposition camp says.  Why not break things down into smaller elements to allow even totally new stuff to be built from the building blocks of the old?  It carries service composition downward to function composition.

The operators really liked this, and so did some vendors (Connectem introduced it in a preso I heard), but the major vendors really hated it.  They still do, because this sort of decomposition not of services but of functions threatens their ability to promote their own VNFs.  But the fact that buyers and sellers are in conflict here is no surprise.  The question is whether decomposition is practical, and if it is whether microservices are a viable approach.

Virtually all software that’s written today is already decomposed, in that it’s made up of classes or modules or functions or some other internal component set.  My memory of programming techniques goes back to the ‘60s, and I can honestly say that even then there was tremendous pressure from development management to employ modular structures.  Even in programming languages like assembler, or machine language, there were features to support “subroutines” or modular elements that called directly on the computer’s instruction set (for those interested, look up “Branch and Link”).

One might think that this long history of support for modularity would mean that it would be no big thing to decompose functions.  Not necessarily.  Then, as today, the big problem is less dividing software into modules than assembling those modules in any way other than the original way.

Most software that’s composable is really designed to be composed at development time.  There are frequently no convenient means provided to determine what data elements are needed and what format they’re expected to be in.  Worse yet, the flow of control among the components may implicitly depend on efficient coupling—local passing of parameters and execution.  For something to be a “service” or “microservice” in today’s terms, it would have to accept loose coupling through a network connection.  That’s something that adds complexity to the software (how do you know where the component is and whether it’s available?) and also can create enormous performance issues through introduction of network delays into frequently used execution paths.

The point is that it’s an oversimplification to say that everything has to be decomposed and recomposed.  There are plenty of examples of things that should or could not be.  However, there are also examples of vendor intransigence and a desire to lock in customers, and quite a few of the functions that could be deployed for NFV could be decomposed further.  Even more could be designed to be far more modular than they are.  We have to strike a balance somehow.

NetCracker’s concept of making more of NFV and operations modernization about microservices is an example of how that could be done.  If there’s a service whose lifecycle events are so frequent that they are almost data-plane functions, that service has a serious problem no matter how you deploy it.  Generally, management and operations processes have relatively few “events” to handle.  State/event tables are the most common way to represent lifecycle process phases and their response to events, and the intersection of the states and events defines a component, a “microservice” if you like, and one that’s probably not activated often enough that it couldn’t be network-coupled.  I’ve advocated this approach from the first, back to that 2013 meeting of the ISG.

Event-driven OSS/BSS is one way of stating a goal for operations evolution—another is “agile”.  Whatever the name, the goal is to make operations systems respond directly to events rather than imposing a flow as many systems do.  This goal was accepted by the TMF almost a decade ago, but most operations systems don’t achieve it.  A microservice-based process set inside a state/event lifecycle structure would be exactly what the doctor (well, the operator) ordered.

If we want to go further than this, into something composable even when the components have to stay local to each other, then we need to define the composition/execution platform much more rigorously.  An example, for those who want more detail, is the Java Open Service Gateway Initiative (OSGi), which has both a local and remote service capability.  Relatively few network functions now residing in physical network devices conform to this kind of architecture, which means you’d have to rewrite stuff or apply the microservices-and-decomposition model to new functions only.

It’s hard for me to see this stuff and not think of something like CHILL or Erlang or Scala—all of these are specialized languages that could be applied to aspects of virtual-function development.  If you’re going to develop for a compositional deployment that ranges from local to network-coupled, you might want to make the location and binding of components more abstract.  If you want to be able to do this in any old language you may need to define a PaaS in which stuff runs and make binding of components an element of that, so you can adapt to the demands of the application or to how its owners want to deploy it.
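A rough sketch of what “abstract binding” might look like, in Python.  Everything here is hypothetical: the `Platform` class, the `nat_translate` component, and the use of a JSON round trip as a stand-in for a network hop are assumptions I’ve made to keep the example self-contained.  A real platform would send the payload to another host and would have to deal with discovery and availability.

```python
import json
from typing import Callable, Dict

Component = Callable[[dict], dict]

def nat_translate(request: dict) -> dict:
    """Hypothetical feature component: rewrite a private source address."""
    return {**request, "src": "203.0.113.1"}

def local_binding(func: Component) -> Component:
    """Tight coupling: a direct in-process call."""
    return func

def remote_binding(func: Component) -> Component:
    """Loose-coupling stand-in. The JSON round trip mimics what a network
    hop forces on both sides (serializable arguments and results); it does
    not actually leave the process."""
    def call(request: dict) -> dict:
        wire = json.dumps(request)
        reply = json.dumps(func(json.loads(wire)))
        return json.loads(reply)
    return call

class Platform:
    """Callers ask for a component by name; the platform picks the binding."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Component] = {}

    def bind(self, name: str, func: Component, remote: bool = False) -> None:
        wrap = remote_binding if remote else local_binding
        self._bindings[name] = wrap(func)

    def call(self, name: str, request: dict) -> dict:
        return self._bindings[name](request)

platform = Platform()
packet = {"src": "10.0.0.5", "dst": "198.51.100.7"}

platform.bind("nat", nat_translate, remote=False)
local_result = platform.call("nat", packet)

platform.bind("nat", nat_translate, remote=True)  # rebound; caller unchanged
remote_result = platform.call("nat", packet)
```

The caller’s code is the same in both cases, which is the property that would let the owner of the application decide how to deploy it.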

Microservices, composable operations, and “decomposition” of network functions are all good things, but there’s a lot more to this topic than meets the eye.  Software agility at the level that operators like Verizon or vendors like NetCracker want demands different middleware and different programming practices.  The big challenge isn’t going to be accepting the value of this stuff, or even getting “vendor support” of the concept.  It’s going to be finding a way to advance something this broad and complex as a complete architecture and business case.  We’ve not figured that out for something relatively simple, like SDN or NFV.

Vendors Aren’t Driving SDN/NFV Anymore, so What Now?

There is an inescapable conclusion to be drawn from recent industry announcements:  Vendors have lost control of SDN and NFV, which means they’ve lost control of the evolution of networking.  Operators, in a state of self-described frustration with their vendors’ support for transformation goals, have taken matters into their own hands.  I’ve gotten emails over the last ten days from strategists and sales types in the vendor community, and they’re all asking the same question, which is “What now?”  It’s a good question in one sense, and it’s too late to ask it in another—or at least too late to have the full set of choices on the table.  But there are always paths forward, some better than others, so we need to look at them.

In a prior blog I made the point that commoditization of connection services was inevitable, and that it was also inevitable that operators will spend less on capital equipment at L2/L3 than they have in the past.  Accepting this truth, I’ve said, is critical to vendors who have historically depended on these layers for their revenue and profits.

The up-front truth for this blog is that it is no longer possible for vendors to control the SDN and NFV revolution even if they were to step up now and do what should have been done all along.  I’ve noted what should have been done many times and in any case, it’s too late to do it.  Buyers have taken their own path now, and vendors need to fit into the operators’ programs and not try to define their own.  I’m not saying they don’t need to pay attention to the focus on opex, to the need to develop a holistic SDN/NFV business case, only that doing that won’t give them control of the game anymore.

The key to accommodating operator initiatives seems to start with sophisticated service modeling.  All SDN and NFV modeling and the associated APIs and orchestration is derived from the software concept of “DevOps” that defined a way of describing deployment of software elements and their connection into systems we’d call applications.  There have always been two models of DevOps, one that describes the steps to take (called the “prescriptive”) and the other that describes the end-state desired (initially called the “declarative” but increasingly called the “intent model”).  The critical first step vendors need to take in modeling is to adopt declarative/intent modeling.

“How-to” modeling cannot be general—it has to be a process description that naturally depends on what you’re doing and where you’re doing it.  If you describe a system of VNFs in terms of its intent, you can deploy it on any convenient platform.  If you say how to deploy it, you can deploy only on the target upon which your instructions were based.  All the emerging operator architectures make it clear that a wide variety of platforms, including legacy “physical network functions” or PNFs, have to be supported for any feature.  The thing, as they say, speaks for itself (for Latin/legal fans, “res ipsa loquitur”).
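Here’s a small Python sketch of the difference, under assumptions of my own.  The intent fields, the platform names, and the rule that one hosted instance handles 100 Mbps are all invented for illustration.  The intent says what is wanted; each platform supplies its own “how”.

```python
from typing import Callable, Dict, List

# Hypothetical intent: what is wanted, with no reference to how or where.
intent = {"feature": "firewall", "throughput_mbps": 200}

def realize_on_cloud(intent: dict) -> List[str]:
    """Platform-specific recipe for a hosted VNF (illustrative steps only).
    Assumes each instance handles 100 Mbps."""
    instances = -(-intent["throughput_mbps"] // 100)  # ceiling division
    return [f"start {intent['feature']} VM #{i + 1}" for i in range(instances)]

def realize_on_appliance(intent: dict) -> List[str]:
    """Platform-specific recipe for a legacy device (a PNF)."""
    return [
        f"configure {intent['feature']} on appliance",
        f"set rate limit {intent['throughput_mbps']} Mbps",
    ]

# The same intent can be handed to any platform that knows how to realize it.
REALIZERS: Dict[str, Callable[[dict], List[str]]] = {
    "cloud": realize_on_cloud,
    "pnf": realize_on_appliance,
}

def deploy(intent: dict, platform: str) -> List[str]:
    """Decompose the intent using whichever platform was chosen."""
    return REALIZERS[platform](intent)

cloud_steps = deploy(intent, "cloud")
pnf_steps = deploy(intent, "pnf")
```

A prescriptive model would be the equivalent of storing `cloud_steps` as the service description, which is exactly why it couldn’t be deployed on the appliance.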

I personally think an intent model approach would be ideal across the board, meaning everywhere from top to bottom in an implementation.  However, it is essential only at certain key points in the structure of an SDN/NFV package:

  • At the top, where SDN/NFV software interfaces with current OSS/BSS systems.
  • Underneath “End-to-End Orchestration” or EEO, to define the way that infrastructure-based behaviors are collected into functional units.
  • At the “Infrastructure Manager” boundary, to describe how a given behavior is actually deployed and managed for one or more of its hosting options.

Each of these points represents a hand-off that operators are insisting be open, which means that the implementation below has to be represented to the implementation above.  Intent modeling makes that mutual representation practical.

The second point that vendors have to enforce in their implementation is the notion of a VNF PaaS implementation.  All of the APIs that a VNF presents as an interface have to be connected with a logical paired function, and all of the SDN/NFV and management APIs that a VNF would be expected to use have to be offered in a uniform way to the “virtual space” in which VNFs run.  This same requirement exists in a slightly different form for SDN, but in my view it would be met by the support of an intent-model “above” the SDN controller.

This is going to be the most important issue for NFV, I think.  Absent a PaaS-like framework, there is no meaningful portability/onboarding, and no way to contain integration cost and risk.  Commercial VNF vendors are likely to tie up with NFV partners (as they have already) and integrate only with these partners, which opens a risk of each setting licensing terms that operators will find offensive because there’s little or no competition.  Open-source could be totally excluded from the picture.

A “VNF” is a system, a black box that provides a feature or features, asserts an explicit SLA, and contains a range of deployment options that could adapt to conditions by scaling or replacement.  All of this good stuff should happen inside the box, with specific contained APIs to link the functionality with the rest of the service ecosystem.  Absent that, we have no reliable integration, and we are absent that now.

The next point is perhaps the largest problem, and a problem that would have to be solved in order to solve the VNFPaaS challenge.  Management, meaning lifecycle management at all levels, has to be defined explicitly or nothing can be integrated at all—no VNFs, no NFVI, nothing.  The current model is kind of like the software equivalent of the universal constant (“That number which, when multiplied by my answer, yields the correct answer.”)  We have the VNF Manager that might be integrated with each VNF, or it might be centralized, or a combination of both.  What is integrated with a VNF is part of a tenant-service, and what is centralized is part of the management system.  You can’t float between these two environments because it’s not secure or reliable to do so, any more than you can let applications change the operating system.

The really big problem here is that the industry approached all this from the bottom, and you can’t really do management right except from the top.  You manage services against the SLA.  You manage service components against the behavior that you set for them to secure the SLA, and you manage resources to the standards required to make those component-level behaviors work.  Management should be linked to modeling, so that every model layer has appropriate SLAs and management definitions.  That way you have management of the system of functions that make up a service, down to the system of resources that support the functions.
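As an illustration of management linked to modeling, here is a sketch in Python where every layer of a model carries its own target and is checked against it.  The node names, the use of availability as the only SLA parameter, and the assumption that every child is required (so availabilities multiply) are simplifications I’ve chosen for the example.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Optional

@dataclass
class ModelNode:
    name: str
    required_availability: float       # the SLA set for this layer
    measured: Optional[float] = None   # only leaves (resources) report this
    children: List["ModelNode"] = field(default_factory=list)

    def availability(self) -> float:
        """Leaves report a measurement; upper layers derive theirs.
        Multiplying assumes every child is needed (a series dependency)."""
        if not self.children:
            return self.measured
        return prod(child.availability() for child in self.children)

    def violations(self) -> List[str]:
        """Check each layer against its own SLA, from the top down."""
        found = []
        if self.availability() < self.required_availability:
            found.append(self.name)
        for child in self.children:
            found.extend(child.violations())
        return found

# Hypothetical three-layer model: service, functions, resources.
service = ModelNode("vpn-service", 0.99, children=[
    ModelNode("firewall", 0.995, children=[
        ModelNode("host-a", 0.998, measured=0.999),
    ]),
    ModelNode("router", 0.995, children=[
        ModelNode("host-b", 0.998, measured=0.990),
    ]),
])

problems = service.violations()
```

Here a single under-performing resource shows up as a violation at the resource, at the function it supports, and at the service, which is the top-down view of management that a bottom-up approach doesn’t give you.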

The final point for SDN/NFV vendors is to focus strongly on federation, not only across operator boundaries but across implementations of SDN and NFV at the lower level.  “Federation” in my context means supporting an autonomous implementation at some level by representing it as an opaque model to the level above.

A good modeling approach will take you a long way toward federation support of this sort because an intent model makes the “who” and “how” opaque to the higher-level orchestration process.  However, there are a number of commercial relationships possible among operators, and there’s always going to be a number of different approaches to sharing management data.

Accommodating the commercial relationship is an implementation issue with intent modeling.  The decomposition of a model representing a federated lower-level (or partner) element just means activating whatever that lower-level process might be, at any appropriate level.  So you could have a “treaty federation” where billing data didn’t have to be exchanged, or one where the order process in one domain was activated by the orchestration in another.

The management stuff could be more complicated, depending on how good the management model is to start with.  If you presume my preferred approach, which is a repository in which all “raw” management data is collected and from which management APIs present query interfaces, then there’s no real issue in controlling what a partner sees or how it should be interpreted.
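A minimal Python sketch of that repository approach follows.  The class name, the record fields, and the party names are assumptions made for illustration.  Raw data goes in once, and each party’s query interface exposes only the fields it has been granted.

```python
from typing import Dict, List, Set

class ManagementRepository:
    """Holds raw management records; queries return per-party views."""

    def __init__(self) -> None:
        self._records: List[dict] = []
        self._views: Dict[str, Set[str]] = {}

    def collect(self, record: dict) -> None:
        """Store a raw record exactly as the resource reported it."""
        self._records.append(record)

    def grant(self, party: str, fields: Set[str]) -> None:
        """Define which fields a party's query interface may expose."""
        self._views[party] = set(fields)

    def query(self, party: str, service: str) -> List[dict]:
        """Return the party's view of one service; nothing if no grant."""
        allowed = self._views.get(party)
        if not allowed:
            return []
        return [
            {key: value for key, value in record.items() if key in allowed}
            for record in self._records
            if record["service"] == service
        ]

repo = ManagementRepository()
repo.collect({"service": "vpn-42", "host": "server-17", "cpu_load": 0.83, "status": "degraded"})
repo.collect({"service": "vpn-42", "host": "server-18", "cpu_load": 0.20, "status": "ok"})
repo.collect({"service": "vpn-99", "host": "server-17", "cpu_load": 0.83, "status": "ok"})

repo.grant("operations", {"service", "host", "cpu_load", "status"})
repo.grant("partner", {"service", "status"})

partner_view = repo.query("partner", "vpn-42")
operations_view = repo.query("operations", "vpn-42")
```

The partner sees the status of its service but never the hosts or their load, and a party with no grant sees nothing at all.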

In some respects, operator architectures could make it easier on vendors.  If they fit in the architecture they don’t have to offer a complete solution.  If they fit in the architecture they don’t have to sell the entire SDN/NFV ecosystem.  It could create focused procurements and shorter sales cycles.  It certainly will facilitate more limited service-specific applications of SDN and NFV, as long as they can be fit into the operator’s holistic model.  It’s also surely an indication that the SDN/NFV space is maturing, moving from media hype to the real world.  It’s just important to remember that doesn’t mean media hype becomes the real world.  Operator architectures are the proof of that.

The Critical Open-Source VNF: How We Could Still Get There

One of the most logical places for operator interest in open-source software to focus is in the area of virtual network functions (VNFs).  Most of the popular functions are available in at least one open-source implementation, and operators have been grousing over the license terms for commercial VNFs.  It would seem that an open-source model for VNFs would be perfect, but we seem to have barriers to address in making the approach work.

VNFs are the functional key to NFV because they’re the stuff that all the rest of the NFV specifications are aimed at deploying and sustaining.  Despite this, VNFs have in some sense been the poor stepchild of the process.  From the first, everyone has ignored the fundamental truth that defines VNFs—they’re programs.

Virtually all software today is written to run on a specific platform, with hardware and network services provided through application program interfaces (APIs) presented either by an operating system or by what’s called “middleware”, system software that performs a special set of useful functions to simplify development.  In some cases, the platform (and in particular the middleware) is independent of the programming language, and in others it’s tightly integrated.  Open-source software is no exception.

A convenient way to visualize this is to draw a box representing the program/component, and then show a bunch of “plugs” coming out of the box.  These plugs represent the APIs the program uses, APIs that have to be somehow connected to services when it’s run.  Let’s presume these plugs are blue.

When something like NFV comes along, it introduces an implicit need for “new” middleware because it introduces at least a few interfaces that aren’t present in “normal” applications.  If you look at the ETSI diagrams you see some of these reference interfaces.  These new APIs add new plugs to the diagram, and if you envision them in a different color like red, you can see the challenge that NFV poses.  You have to satisfy both the red and blue APIs or the software doesn’t run.

A piece of network software of the sort that could be turned into a virtual function also has implicit external network connections to satisfy.  A typical software component might have several network ports—one for management access, one as an input port and another as an output port.  Each of these ports has an associated protocol—for example, a management port might support either IP SNMP or a web API (Port 80).  Data ports might have IP, Ethernet, or some other network interface (to connect to a tunnel, for example).

Then there’s what we might call “implicit” plugs and sockets.  Virtual functions have a lifecycle process set, meaning that they have to be parameterized, activated, sustained in operation, perhaps scaled in or out—you get the picture.  This lifecycle process set may or may not be recognized by the software.  Scaling, for example, could be done using load balancing and control of software instances even if the software doesn’t know about it.  But something has to know, because the framework has to connect all the elements and work, even when there are many components with many plugs and sockets to deal with.

What this means is that when a piece of open-source software is viewed as a virtual function, it will have to be deployed in such a way that all the plugs from the software align with sockets in the platform it runs on, and all the sockets presented by NFV interfaces line up with some appropriate plug.  How that might happen depends on how the software was developed.
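The alignment test can be sketched in a few lines of Python.  The API names below are hypothetical labels I’ve made up, not ETSI reference points.  The check runs in both directions: plugs the software needs that the platform doesn’t offer, and sockets NFV presents that the software doesn’t fill.

```python
from typing import Dict, Set

def check_fit(needs: Set[str], platform_offers: Set[str],
              nfv_expects: Set[str], provides: Set[str]) -> Dict[str, Set[str]]:
    """Report every plug without a socket and every socket without a plug."""
    return {
        "unsatisfied_plugs": needs - platform_offers,
        "unfilled_sockets": nfv_expects - provides,
    }

# Hypothetical open-source firewall viewed as a candidate virtual function.
software_needs = {"posix_sockets", "dns", "dhcp"}     # the "blue" plugs
software_provides = {"snmp_status"}                   # what it exposes today

platform_offers = {"posix_sockets", "dns"}
nfv_expects = {"lifecycle_control", "snmp_status"}    # the "red" side

gaps = check_fit(software_needs, platform_offers, nfv_expects, software_provides)

# After the platform adds DHCP and something supplies lifecycle control,
# the same check comes back clean.
fixed = check_fit(software_needs, platform_offers | {"dhcp"},
                  nfv_expects, software_provides | {"lifecycle_control"})
```

Anything left in either set is integration work that somebody has to do before the software can run as a VNF.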

If we presume that somebody built an open-source component specifically for NFV, we could presume that the software itself would harmonize all the plugs and sockets for all the features.  The same thing could be true if the software was transplanted from a physical appliance and altered to work as a VNF.  Operators tell me that there is very little truly customized VNF software out there in any form, much less open-source.

The second possibility is to adopt what might be considered a variation on the “VNF-specific VNF Manager (VNFM).”  You start with a virtual function component that provides the feature logic, and you combine it with custom stuff that harmonizes the natural plugs and sockets and connectivity expected by the function with the stuff needed by NFV.  This combination of functional component and management stub then forms the “VNF” that gets deployed.  Operators tell me that most of the VNFs they are offered use this approach, but also that only a very few open-source functions have been so modified.

The final possibility is that you define a generic lifecycle management service that talks to whatever plugs are available from the function component, and makes the necessary connections inside NFV to do deployment and lifecycle management.  I’ve proposed this approach for both the original CloudNFV project and my ExperiaSphere model, but operators tell me that they don’t see any signs of adoption by vendors so far.

All of these options for open-source virtual functions expose two very specific issue sets—deployment (the NFV Orchestrator function) and lifecycle management (VNFM).  For each issue set, current trials and tests have exposed a “most-significant-issue” challenge.

In deployment, the problem is that open-source software’s network connection expectations are quite diverse.  In some cases, the software uses one or more Ethernet ports and in others it expects to run on an IP subnet, sometimes with other components, and nearly always with the aid of things like DNS and DHCP services.  One challenge this presents is that “forwarding graphs” that show the logical flow relationship of a set of VNFs may do little or nothing in describing how the actual network connectivity would have to be set up.

In the lifecycle management case, there are two challenges.  One is to present some coherent management view of the VNF status.  In the ETSI model this is the responsibility of the VNFM, which is often integrated with the VNF, but I don’t think this is workable because the VNF may be instantiated in multiple places because of horizontal scaling.  The other challenge is getting the VNF information on its own resources.  You can’t have a tenant service element accessing real resource management data, particularly if it plans to then change variables to control behavior.

I’ve said in prior blogs that VNF deployment should be viewed as platform-as-a-service (PaaS) cloud deployment, where the platform APIs come from a combination of operating system and middleware tools deployed underneath the VNFs, and connectivity and control management tools deployed alongside.  We have never defined this space properly, which means that there is no consistent way of porting software to become a VNF and no consistent way to onboard it for use.

What’s needed here is a simple plug-and-socket diagram that defines the specific way that VNFs talk to NFV elements, underlying resources, and management systems.  The diagram has to show all of the plugs and sockets, for not only the base configuration of the VNF but also for any horizontally scaled versions, including load-balancers needed.

Open source is not the answer to this problem; like any other software it has to run inside some platform.  In fact, the lack of a platform puts the application of open-source software to VNFs at risk, because adapting the software could demand significant resources, and in the open-source world the commercial interest in covering that risk is diminished.

Operator initiatives like the recent architecture announcements from AT&T and Verizon take a step in the right direction, but they’re not there yet.  I’d love to see these operators step up and define that VNFPaaS framework now, so we can start to think about the enormous opportunity that open-source VNFs could open for them all.

What Operators Think Vendors Should Do To Counter Spending and Transformation Risk

These are the times that try the souls of networking sales management.  Most of you know that I have an ongoing dialog with salespeople in many companies, and that dialog says that network spending overall is under pressure.  Legacy infrastructure investment is slow-rolling because of ROI issues, and vendors who have presented next-gen architectures have failed to make a business case for their deployment.  In SDN/NFV, all the sales people tell me that they are undershooting their goals.  Cisco, Juniper, and even Brocade have reported anemic spending by network operators, and Wall Street isn’t liking the equipment space.  What can be done?  I asked some of my operator contacts to find out.

Business as usual isn’t, or shouldn’t be, on the list of options.  In 2013 when SDN and NFV were getting a lot of early attention, there was a chance of redefining networking in such a way as to preserve a great deal of the legacy equipment model.  That opportunity has passed forever at this point, as both vendor financial performance and operator architecture evolution have shown.  However, vendors should still (as I’ve noted in prior blogs) provide specific support for opex reduction to reduce the pressure on capex.

Capex for connection technology, at Levels 2 and 3 of the “true” OSI model, is expected to decline for as far out as operators have any visibility.  Initially, operators expect to slow-roll spending on these layers and put price pressure on vendors (outside the US, shifting to Huawei is a popular approach).  In the longer term, they expect to move to “gray” and “white” boxes, meaning commodity devices that would increasingly include server-hosted switch/router instances.

Even at Level 1 (optical) operators aren’t expecting to generate a near-term windfall, which comes as no surprise to the optical vendors I’m sure.  My contacts tell me that operators have been prepared to shift spending downward providing that the optical vendors presented a strong architecture to reduce costs higher up.  That means shifting functionality downward to a “virtual wire” layer and perhaps facilitating the virtualization of Levels 2/3.  The operators tell me that optical vendors have not been prepared to define that strong architecture, so optical spending is stuck in the general ROI backwater generated by continuing profit-per-bit pressures.

One clear operator reaction to the problem is “elevation”, as one CFO calls it.  Instead of focusing on infrastructure changes (for which nobody is presenting a credible model), the operators are focusing on plastering a service-and-operations skim coat over the cracking foundations.  This process can be as casual as offering portal-based solutions for customer care, which have to be linked to current operations and management systems, or as sophisticated as end-to-end orchestration to unify legacy technology with the various service-specific flavors of SDN and NFV now gaining favor.

Vendor reactions aren’t quite this clear, but perhaps they should be.  If you ask operators what vendors should do, they present the following points.

First, present a model-driven top-end to their legacy, SDN, and NFV offerings that can be incorporated into an end-to-end orchestration (EEO) element.  The number one issue with operators is preventing siloization of operations and management processes so they can harmonize their “skim coat” solutions with their evolving infrastructure.  Their number two issue is agile service development, and they see the EEO-to-network link as being critical in addressing both agility and silos.  It’s less important at this late point for vendors to promote a “complete” strategy (which in any case collides with the operators’ open vision) than to fit into an EEO scheme.

Operators, especially those who had taken the trouble to articulate their approach to network evolution, have been somewhat surprised by vendors’ lack of enthusiasm in supporting the initiatives.  The need to integrate with EEO is absolute, the value to vendors themselves should be clear (you can fit your stuff in if you conform, and you cannot if you do not), yet there’s no rush to define the necessary models.

The second recommendation of operators is to get VNF and software vendors to promote realistic license practices.  Some operators claim that if you were to apply the VNF license policies to vCPE services to businesses, the cost to operators would be higher than that represented by appliances.  These same operators believe that NFV kingpins have packed their partner programs with vendors who envisioned NFV as being just another dimension of the old gravy train.  They want open-source VNFs now, but they’d accept license terms that didn’t totally contaminate their business case.

Part of this issue seems to arise from the fact that many VNF providers aren’t appliance vendors and have no experience with that side of the market.  If you don’t realize that your customers have been selling firewalls (for example) for decades, you might be forgiven for thinking that their desire to sell firewall VNFs is your chance to make your numbers.  Revenue-sharing, one vendor put it.  Taking advantage, one operator responds.

The third recommendation is to think gray.  Operators see established network equipment vendors refusing to develop commodity switching/routing solutions or OpenFlow switches to protect their network equipment sales.  According to the operators, that will only accelerate the development of credible white-box competitors.  If instead, established vendors brought out lightweight intermediary “gray box” devices with optional proprietary features that would help make the business case or support orderly evolution, they could win acceptance.

Most of the major vendors have dismissed white-box networks, and yet operators have been increasingly committed to them, in no small part because they see vendors’ lack of acceptance as new evidence of manipulative intransigence.  The problem, though, is that operators say that even white-box or hosted-instance vendors present their stuff as one-off alternatives to switches and routers and not as a part of an architectural shift.  Operators say that the shift is the goal, and one-off-ness isn’t an option.

No industry willingly accepts radical transformations to its business model, but when the driver of change comes from outside, when the evidence that change is needed is overwhelming, and when buyers start to take defensive actions as a result of the forces of change, it’s time to make the best of things.  We seem to be at that point now.

For IoT, Forget Network Virtualization and Think “Thing Virtualization!”

How can we best accommodate the notion of virtualization to the application of IoT?  That’s a question that more and more operators and vendors are wrestling with, and it’s a good one.  The answer might be interesting and disruptive—think less about virtualizing the network and more about virtualizing the “things”, the sensors and controllers.

I’m not denying that there are “things” that we might want to access that aren’t currently connected.  I’m not denying that 4G/5G might be a useful way to connect some of them, but I think everything we already know about security and environmental monitoring and process automation proves that connection isn’t really the issue.  You’re probably sitting in the midst of a “thing network” as you read this blog, and it’s based on pedestrian sensor/controller technology that doesn’t put any of the “things” directly on the Internet, or on a 4G network either.

So is the whole IoT thing a colossal media/analyst fraud?  Maybe, but there is still a grain of value in the notion if you look beyond the aspirations of vendors and operators for easy money.  The question is how to empower the market in general with the knowledge of what are now (and are likely to remain) “private things” that are neither online nor accessible in any form to general application development.

I’ve talked about one model that could harness the things, so to speak.  If we were to build a massive set of repositories that held the collected knowledge we extract from our things, we could then run queries/analytics that would let people exploit all this data and still (through the analytics apps) apply the necessary protections to ensure stability, security, and privacy.  In this approach, IoT is a database presented by a series of APIs.  I think this is a good approach, certainly better than just sticking all these sensors and controllers directly on the Internet and hoping everyone would behave.

There’s another approach, though.  Suppose that we want to preserve the literal “Internet of Things” model but recognize that everything that’s on the Internet isn’t necessarily directly and discretely connected.  We could then employ virtualization to create a series of “virtual things” that are constructed from, and related to, the real things.  These could be presented on the Internet through traditional web-like APIs, but the real stuff that supports the virtual presence could be hidden, connected as it is now, and the APIs could still apply policy controls to protect the integrity of the data and the security of the users.

With this model, each “thing” is represented as though it were a kind of website; you could read and write to it and potentially even access it through a web browser.  Like any “website” it could be either on a VPN or on the open Internet, and it could apply encryption and access controls.  In programming terms, it’s a resource accessed with a RESTful API.

Behind each “thingsite” is a process that links it to the real sensor or controller, or set thereof.  This process is similar to that used behind websites to link to transactional applications.  In theory it could operate asynchronously, gathering data and posting it to the thingsite based on policy-determined timing, or it could be triggered by an inquiry to the thingsite.  The process could also be doing database dips, meaning that the thingspace could be a front-end for the repository of thingdata I’ve been talking about.
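
To make the thingsite idea a bit more concrete, here is a minimal Python sketch.  Every name in it (ThingSite, fetch, allowed_fields, read_private_sensors) is hypothetical and invented for illustration; it is not taken from any real IoT product or API.  It only shows the two modes described above, a back-end process posting data on its own schedule and a read that triggers a fresh pull, with a policy filter deciding what the outside world gets to see.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional

Reading = Dict[str, float]


@dataclass
class ThingSite:
    """A virtual 'thing' that fronts one or more real sensors (hypothetical)."""
    fetch: Callable[[], Reading]        # back-end process that reaches the real sensors
    allowed_fields: FrozenSet[str]      # policy: the only fields that may be exposed
    cache: Optional[Reading] = None     # last reading posted by the back end

    def refresh(self) -> None:
        """Asynchronous mode: the back end posts data on a policy-set schedule."""
        self.cache = self.fetch()

    def get(self, on_demand: bool = False) -> Reading:
        """What a read of the thingsite (for example a RESTful GET) would return."""
        if on_demand or self.cache is None:
            self.refresh()              # triggered mode: the inquiry pulls fresh data
        assert self.cache is not None
        # Apply the policy filter before anything leaves the thingsite.
        return {k: v for k, v in self.cache.items() if k in self.allowed_fields}


def read_private_sensors() -> Reading:
    """Stand-in for a gateway into a private sensor network (made-up values)."""
    return {"temperature_c": 21.5, "door_open": 0.0, "occupancy": 3.0}


site = ThingSite(fetch=read_private_sensors,
                 allowed_fields=frozenset({"temperature_c"}))
print(site.get())
```

The real sensors never appear on the network in this sketch; only the filtered view does, which is the point of putting a virtual thing in front of them.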

This model of IoT would preserve the notion of a set of on-the-web sensors and controllers that could be exploited, but it would buffer that notion with the same kind of virtualization that currently keeps tenant networks separate.  If your company wants to expose a set of sensors/controllers to partners, you simply define a thingspace for them, and let the back-end technology populate it with the information you’re willing to share.  They can do whatever they want with the exposed things, and you don’t have to coordinate with them as long as you’re happy with the data that’s being shared.

“Public” things, meaning things that would be available for use without contractual arrangements, are also possible with the model; you simply expose a thingspace directly online and you apply only the policy filters that are required to conform to evolving privacy regulations.  In theory you could even build in security and load-balancing with this model, spawning multiple virtual things that represent the same set of real ones to share the load of mass access.

Since the back-end applications that feed the thingsites would be able to gateway data from a private sensor/controller network based on any technology, you can immediately harness all the stuff that’s already deployed, or at least that part of the current base that its owners are prepared to open up.  You could also construct, with the proper access to either things or thingsites deployed elsewhere, your own “derived thingsites” that represent analytics-based digestions of one or more sensors, or that introduce data from stuff outside the thingspaces—like retail pricing or personal presence.

What about the original sensors-online model?  Well, if you wanted you could augment virtual things with real ones, but I think that eventually somebody is going to get smart and realize that the cost of supporting a complete online presence with policy and security filters for every “thing” is going to kill the opportunity completely.  A better approach would be to have the real things, even new ones, front-ended by virtual thingsites that could handle all the variables of security, policy, and performance.

So will this approach rise up and take over?  Probably not, because so much of technology these days is about creating buzz rather than creating opportunity.  What could happen, and I think will happen eventually, is that the real IoT opportunities will end up migrating to a practical platform, which could be the thingspace concept or the analytics model.  Somebody who manages to figure this out up front could end up making some big bucks.

Can We Build Agile Infrastructure with the Overlay/Underlay Model?

Let us suppose for a moment that the goal of operators is to reduce equipment and operations cost in concert and at the same time increase their ability to provision current services quickly and flexibly, and develop new services just as quickly.  Let us further suppose that they have addressed the higher-level operations/portal implications of this.  What would the ideal network approach be?

Since it’s clear that operators do want exactly what’s presented in the last paragraph, this is a fair question.  Since the answer to the question will dictate infrastructure spending in the future, it’s an important one.  Interestingly, we have an answer for it, and it’s been around for a fair period of time.

If we go back to a point in my last blog, operators need to be able to make changes to costs and revenues without forcing a fork-lift, large-scale, change-out of infrastructure.  There is simply no way to bear the risk of a large transformation and at this point no time to prove out alternative infrastructure technologies to the degree needed to contain that risk.  We have to evolve with some grace into the future.

My conversation with the MEF’s CTO convinced me that their Third Network model has merit, provided that the model embraces something that is strongly hinted at but not featured: the concept of an overlay technology.  If the lower three layers of the OSI model (what the model says is actually in the network) are Levels 1 through 3, then let’s call this overlay layer Level I, or Li for short.

The basic notion for Li is that services would be defined and delivered at this new layer, which would then consume tunnels (“virtual wires”) created at the layers below.  Since services would now be using existing network technology only as a physical layer, you’d be able to change out any or all of that stuff at whatever pace you find optimal because lower-layer implementations are opaque to the higher layers.

Overlay connections are based on a header that’s added to data payloads before they’re encapsulated for handling by the tunnel protocol.  The headers subdivide the traffic at any tunnel-point, and at each such tunnel-point the overlay can either extract the traffic with a given header and deliver it to a user access point, or “cross-connect” it to another tunnel.  It’s in how this is done that the efficiency and value of the Li model is determined.
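
The extract-or-cross-connect choice at a tunnel-point can be sketched in a few lines of Python.  This is an illustration under my own assumptions, not a description of any product: the packet format (an overlay ID plus a payload), the rule table, and the names TunnelPoint, deliver, and cross_connect are all hypothetical.

```python
from typing import Dict, List, Tuple

Packet = Tuple[int, bytes]   # (overlay_id from the overlay header, payload)
Rule = Tuple[str, str]       # ("deliver", access_port) or ("cross_connect", tunnel)


class TunnelPoint:
    """One point where lower-level tunnels terminate (hypothetical sketch)."""

    def __init__(self, rules: Dict[int, Rule]) -> None:
        self.rules = rules
        self.delivered: Dict[str, List[bytes]] = {}    # access port -> payloads
        self.forwarded: Dict[str, List[Packet]] = {}   # tunnel -> packets, header kept

    def handle(self, packet: Packet) -> str:
        overlay_id, payload = packet
        rule = self.rules.get(overlay_id)
        if rule is None:
            return "drop"    # no rule for this overlay ID at this point
        action, target = rule
        if action == "deliver":
            # Extract: strip the overlay header, hand the payload to the user.
            self.delivered.setdefault(target, []).append(payload)
        elif action == "cross_connect":
            # Cross-connect: keep the header and send the packet on another tunnel.
            self.forwarded.setdefault(target, []).append(packet)
        else:
            raise ValueError(f"unknown action {action!r}")
        return action


point = TunnelPoint({
    100: ("deliver", "port-A"),
    200: ("cross_connect", "tunnel-east"),
})
results = [point.handle((100, b"hello")),
           point.handle((200, b"world")),
           point.handle((300, b"?"))]
print(results)
```

Everything interesting about an overlay design is in how that rule table gets populated and how many tunnel-points a packet has to pass through, which is where the tunnel models discussed next come in.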

In the original Nicira overlay-SDN model, a LAN or VLAN or VPN architecture created the tunnel paths, and these connected physical network/IT elements like servers.  The SDN overlay then subdivided access by tenant.  In theory, each server could either extract header-identified traffic for its local users or cross-connect it onward.  This is not unlike how lower OSI layers relate to higher layers; you can pull traffic from a LAN (Level 2) and connect it to another LAN through a WAN connection, via a router.

The current SD-WAN products have a slightly different approach but use the same overlay concept.  Here, a series of connections made at a lower level to the same access point are effectively united by a higher overlay that can ride on any of the low-level options.  This higher layer then presents the user interface.

The general overlay model that might be viewed as the basis for MEF’s Third Network should be able to work with any of the following tunnel-models:

  1. The lower-level tunnels can connect all the way to the access points, creating a virtual mesh. The overlay technology would then provide only service-specific handling and addressing, and each tunnel access point would simply forward a packet on the right tunnel.  This would work for modest-scale virtual networks where a fully scalable forwarding technology (like SDN switches) was used.
  2. The lower-level tunnels connect to some number of aggregation points hosted within the network based on traffic topology. At these points, forwarding rules would cross-connect them.  This is the structural model that would optimize the use of hosted/virtual router instances.
  3. The lower-level tunnels, in addition to one of the above approaches, cross a protocol or administrative boundary where tunnel-to-tunnel connection is not available, and where tunnels from each side must therefore terminate. The Li layer now has to cross-connect the tunnels appropriately just to pass across the boundary.

The issue that can mess up a good overlay strategy could be called “tunnel granularity”.  If you have too little tunnel granularity, then you can’t create tunnels to the access points for an overlay-based service without a lot of tunnel cross-connecting.  Not only does this process increase delay and packet loss risk, the fact that it’s happening for a concentration of users sharing an inadequate number of lower-level tunnels means it might well grow in demand to the point where addressing it with a hosted router instance would be difficult.  You’d like to get your lower-level tunnel mesh as close to serving all the access points as possible.  The MEF has been working to improve Ethernet’s ability to support connected-path multiplicity efficiently, and that’s good.
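
A little arithmetic shows why tunnel granularity is a real tradeoff.  The Python sketch below counts tunnels under two simplifying assumptions of my own: tunnels are bidirectional, and in the aggregated case each access point homes to exactly one aggregation point while the aggregation points are fully meshed.  The function names are hypothetical.

```python
def full_mesh_tunnels(access_points: int) -> int:
    """Bidirectional tunnels needed to connect every pair of access points directly."""
    return access_points * (access_points - 1) // 2


def aggregated_tunnels(access_points: int, aggregation_points: int) -> int:
    """One tunnel from each access point to its aggregation point, plus a full
    mesh among the aggregation points themselves."""
    return access_points + full_mesh_tunnels(aggregation_points)


for n in (10, 100, 1000):
    print(n, full_mesh_tunnels(n), aggregated_tunnels(n, aggregation_points=5))
```

Under these assumptions a thousand access points need 499,500 tunnels for a full mesh but only 1,010 with five aggregation points.  The price of the smaller number is that traffic between access points on different aggregation points gets cross-connected twice, which is exactly the delay and load concern described above.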

Here is where “universal SDN” might be very helpful.  If you think of an OpenFlow-driven concatenation of forwarding table entries as a kind of “naked tunnel”, you see that SDN could create any arbitrary tunnel configuration end to end if desired.  If you combine this with agile optics (ROADMs) then you’d have a highly functional physical layer over which you could overlay any convenient L2/L3 service protocol while largely ignoring issues like topology and even path failures (because they’d be handled or controlled below).

The overlay approach would be easy to apply to mobile infrastructure because it’s already heavily based on tunnels (EPC).  It would also be easy to apply to business virtual network services and to cloud application services.  It’s not as clear that you could adopt an overlay model for the Internet, which suggests that you’d want either to retain standard Internet routing at least in the core and augment it with SDN forwarding, or at least to retain it for services other than content delivery, since content delivery is already supported largely from CDNs.

There’s no shortage of potential vendors to support the model, starting with the classic overlay-SDN Nicira/VMware play and extending to SD-WAN vendors like Talari, Citrix/CloudBridge, Silver Peak, and Riverbed/SteelConnect.  In addition, most virtual routers (software router instances) can interconnect tunnels and so could be used to build an overlay-modeled service framework.  However, vendors have been shy so far in committing to the approach, preferring to sell to enterprises in more limited missions rather than to operators.  Even the SD-WAN vendors whose products could easily frame an overlay model (even within the Third Network approach) haven’t played that capability as a differentiator.

The likely reason for this is that selling SD-WAN to enterprises is working, and selling it as a mainstay for next-gen networking is a Great Unknown, particularly for vendors who don’t call on operator CTOs and don’t participate in emerging-network standards.  Despite the resistance, I think it’s clear that overlay networking could play a major role in next-gen infrastructure, perhaps the dominant one.  It may be that the evolution of the MEF’s Third Network will finally legitimize the approach and address the critical question of overlay/underlay relationships.

How Equipment Vendors Can Counter Cautious Operator Spending

With the exception of Huawei, network equipment vendors are facing tightening spending by operators.  The reason, obviously, is that compression in profit-per-bit that I’ve been talking about—the compression that’s led to operator support for “transformation” and their interest in SDN and NFV.  Since SDN and NFV have not evolved fast enough and far enough to generate the kind of radical improvements in cost and revenue operators had hoped for, their only response is to slow capital spending.  The impact is greatest in wireline, because wireless is too competitive for anyone to skimp on improvements.  Vendors like Juniper with little credible wireless contribution to make suffer, obviously, but nearly every vendor who isn’t a price leader is feeling the pinch.

So what’s to be done?  While I might be (and am) confident that there’s a way out of the compression problem, I’m not the guy who’s going to be stuck with an enormous technical albatross if the method doesn’t pan out.  Operators have long capital cycles and so they’re unusually risk-averse with respect to writing down failures.  They either have to reduce the risk that they’re doing the wrong thing, or they have to do nothing.  More precisely, they have to do nothing that is different and risk-building in terms of architecture, and just build with cheaper components.  Hence, Huawei.

Vendors across the board have failed to deal with this resistance to failure, and some add insult to injury by promoting the OTT notion of “fast fail” as a model the operators need to adopt.  There is no way of fast-failing a trillion-dollar infrastructure.  What operators need, and have always needed, is a set of new-approach hypotheses that link directly to a benefit, and that can be proven out in a modest-scale trial.  The biggest casualty of bottom-up specifications is the potential to fulfill this need.  The early work has no business context in which it can prove either benefits or realization.

But OK, we’re here and it’s now.  AT&T and Verizon have both issued papers describing an architecture model and these should resolve a lot of the issues, right?  Not so fast.

What AT&T has done is set goals, which can define benefits.  What Verizon has done is frame a solution set inside an architecture.  You can do a lot with these two, particularly if you could somehow combine them, but guess what?  Vendors are singularly unimpressed with, perhaps even unhappy with, the two approaches.  “Selling” has become not a fitting of your product plans to the demands of your buyers, but rather a coercion or manipulation of your buyers to accept what you’ve decided to produce.  The reason, of course, is that vendors want to make money and their fondest wish is that operators just suck it up and buy stuff and forget all this newfangledness.

What we need now is recognition that resolving the problem of cost reduction without killing vendor support for your initiatives must lie in focusing on costs other than equipment costs, and in achieving transformation on a large scale without making infrastructure changes on a large scale.  As I’ve said in past blogs, this means focusing on opex reductions that can be achieved by a higher layer of orchestration, one that accommodates both legacy and new SDN/NFV technologies.  That would let you test and realize benefit-based changes without forcing you to commit to major infrastructure upgrades that could only be justified if you were sure they’d work.

The problem getting to that (happy?) goal is a combination of the fact that when you do a bottom-up spec you don’t get to the top till the end of the process (if at all) and that same old issue of vendor self-interest.  Network equipment vendors are reluctant to embrace top-down abstraction-based operations because it anonymizes network equipment and threatens incumbencies.  IT vendors in the SDN/NFV space are similarly reluctant because these top-down approaches don’t sell servers right away.  And everyone is reluctant because, in the main, they don’t have the top-layer tools in place.

One of the most important developments in this area is the emergence of operator-driven initiatives to define holistic SDN/NFV architectures.  Verizon, in particular, has emphasized the notion of a layered orchestration model that would allow a higher-level orchestrator to harmonize not only legacy and emerging network technologies but also multiple vendor-specific implementations.  This overcomes the fact that neither SDN nor NFV standards include modernizing operations practices or incorporating or evolving future networks from legacy deployments.

Another potential solution is the use of a generalized orchestration model, championed by some of the six vendors who have complete NFV solutions (ADVA, Ciena, and HPE in particular).  This approach could in theory be applied two ways—a top-to-bottom orchestration architecture and a selective architecture.  With the former, the vendors’ solutions would be accepted as the only orchestration approach, and this seems to run afoul of the current service-specific SDN/NFV evolution trends.  With the latter, you’d adopt the generalized orchestration model where there’s no competing implementation, and use a stub/adapter to incorporate the competing models by supporting abstractions that represent them.

It’s this last approach that shows the most promise, but vendors have not been enthusiastic in promoting it.  Part of the reason is that most still hope to achieve their own “lock-in” of early NFV deployments, and fear that embracing an open model would hurt them as often as help them.  Part is the fact that you have to implement the stubs to represent “foreign” models, which of course means that there has to be some foreign model structure to represent.  At this point, absent any specific intent-model requirement for NFV or SDN (SDN’s is coming along), that could be challenging.  In particular, it would leave the top-level orchestration vendor at risk to changes made by vendors below.

That problem, unfortunately, can happen in any multi-layer orchestration approach, and that’s why in the end the operator models may be the only hope.  Verizon or another Tier One could compel vendors to open and stabilize their models so that lower-level service-specific implementations would fit inside an end-to-end orchestration model.  Other vendors almost surely could not.

Everything comes back to the point I made about vendor differentiation and model-based abstraction.  If operators think of equipment as simply a realization of a given abstract model, then it’s harder for vendors to differentiate.  Operator-driven models would probably not include special differentiating features from vendors, given operator demands for an open approach.  Vendors need to somehow support open-network goals and retain some opportunity to exploit their own special sauce.

The first-quarter slump in operator spending (which vendors want to believe is just a blip on an otherwise untroubled horizon of spending growth despite ROI compression) argues for taking decisive action.  A vendor could develop an operations-savings approach that would at least mitigate the problem of loss of differentiation.  For example, they could develop their own models to link their lower-level management systems to EEO tools, which could then exploit their own differentiation.  As long as their models only enabled these special capabilities and didn’t mandate them, would operators refuse them?  Probably not, and they might even use the features if they were valuable, even at the cost of openness.

Remember too that Verizon and AT&T are emphasizing a shift to white-box products, which means products non-differentiable at the data plane level.  Verizon has also explored the notion of displacing physical routers with software instances, recognizing that hardware acceleration may be required for the hosting.  I think that widespread use of router instances will also require “virtual-wire” partitioning of traffic at L1/L2 to eliminate large L3 aggregation missions that servers are never likely to be able to support efficiently.

I said early on that if vendors did not find a way to secure significant non-capex benefits through SDN and NFV, operators would re-architect networks to reduce spending on switching and routing, and also achieve opex savings through L2/L3 simplification.  I think that’s happening.  I think that everything happening in the network market today demonstrates a need for vendors to push an operations-savings approach, and to take control of the way their own orchestration and management tools integrate with emerging high-level EEO tools.  In fact, I think that vendors have already lost millions by not having this capability, money operators would have spent on infrastructure had the profit compression pressure been relieved by operations savings.  Not losing any more should be a priority.

Exploring the Operations Implications of the Verizon Model

The issue of operationalizing next gen networks and services is critical for operators, and it’s thus fitting to close this week’s review of the Verizon architecture with comments on OSS/BSS integration.  There are two questions to be answered: can the approach deal with the efficiency/agility goals that will have to be met to justify SDN/NFV, and can it accommodate the political divisions over the future of OSS/BSS?

Every Tier One I know is conflicted on the issue of OSS/BSS.  It’s not that the functions themselves are not needed (you have to sell services to make money) but that the functions are wrapped in an architecture that seems in every respect to be a dinosaur.  OTTs who have adopted modern cloud and web principles seem to be a lot more efficient and agile, and thus there’s a camp within each operator who wants to toss the OSS/BSS and remake the functions along web/cloud lines.  On the other hand, every OSS/BSS expert is going to resist this sort of thing (just like router experts resist transformation to SDN) and in any event, changing out your core business systems is always going to present risks.

The Verizon paper introduces the notion of layered service abstractions, starting at the top with an end-to-end retail-driven vision and ending at the bottom with virtual features/functions/devices.  These layers can be used to assemble a spectrum of network-service-operations relationships, and if these relationships are broad enough they could cover all the bases needed for benefit generation and political consensus-building.  Do they?  Let’s use the extreme cases to see, and let’s assume that we could graph the resulting structural relationship between OSS/BSS, independent EEO, and resources as a small-letter “y” whose left, shorter, branch could connect either up top or down lower.

One extreme case is to consider the OSS/BSS system to end with the high-level retail model.  A service is a set of commercially defined functions represented by SLAs.  Those functions, from an OSS/BSS perspective, are atomic—think of them as “virtual devices” if you like.  OSS/BSS systems deploy them, and customer portals and service lifecycle management processes treat the SLA parameters as representing service and resource behavior.

If you charted this in our “y” model, the left bar would join the main bar very close to the top.  EEO, then, would generate the vision of the service that operations systems saw, reducing the role of OSS/BSS to the commercial management of the service and leaving the issues of deployment and lifecycle management to EEO.

The other extreme case is to consider the OSS/BSS system to be responsible for the lowest-level virtual function/device models.  This would open two options for our conceptual structure.  The first would drive the left bar of the “y” downward to touch the resources, and the second would make the “y” into an “l” with a single branch.

The first approach would say that while EEO should be responsible for the deployment and lifecycle management, the OSS/BSS would see the lowest-level virtual devices.  This approach is friendly to current OSS/BSS models and perhaps to the legacy TMF approach, because it retains contact between “resources” and “operations” in its current form.  We have management systems today that operate, somewhat at least, in parallel with operations, and this would perpetuate that.

The second says that it’s fine to have EEOs but they need to be inside the OSS/BSS.  That component then has a linear relationship with all the model layers of the Verizon architecture.  This, I think, is the essential model for the TMF ZOOM project, though the details of that architecture aren’t fully open to the public at this point.  It would magnify the role of OSS/BSS, obviously, and preference the OSS/BSS vendors.  It also perpetuates the OSS/BSS and its role.

If you look at these extremes, particularly in terms of my “y” topology, you see that the Verizon approach of layers of models opens the opportunity to connect the OSS/BSS in at any of the modeling layers, meaning that you can slide that left bar up and down.  It also lets operators elect to integrate the functions “above the junction” of the left bar with the OSS/BSS, turning the model into an “l”.

All this capability isn’t automatic, though.  To understand the issues of implementation, let’s move from our “y” model to another one, resembling an “H”.

The left vertical bar of our “H” represents the OSS/BSS flow and visibility, and the right the EEO or incremental orchestration view.  Both these can coexist, but to make sure they don’t end up as ships-in-the-night competitors to management, there has to be a bridge between the domains—the crossbar.  The purpose of this is to establish a kind of visibility bridge—at the model layer where this crossbar is provided, there is a set of processes that convert between the two sets of abstractions that drive the OSS/BSS and EEO flows.  What is above the crossbar is invisible to the other side, and what is below it is harmonized—reflecting the likelihood that the two verticals represent different but necessarily correlated views.

Wherever this crossbar exists, the model for the associated layer has to provide both operations-friendly and orchestration-friendly parameters to represent the status, and that has to include lifecycle state if that state has to be coordinated between EEO and OSS/BSS.  Where the bar is set higher, meaning where the model layers represent more functional and less structural abstractions, the same parameters would likely serve both sides and little work is needed.  If you drive the bar lower, then you encounter a point where it’s desirable to have the operations view composed from the EEO view.
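
As a rough illustration of what the crossbar has to do, here is a Python sketch that composes an operations view from an EEO view and also passes one change back the other way.  All of it is assumed for the sake of the example: the parameter names, the lifecycle states, and the state mapping are hypothetical and come from no OSS/BSS or orchestration product.

```python
from typing import Any, Dict

# Hypothetical mapping from orchestration lifecycle states to operations status.
EEO_TO_OPS_STATE = {
    "deploying": "pending",
    "active": "in_service",
    "scaling": "in_service",     # internal detail the operations side need not see
    "healing": "degraded",
    "failed": "out_of_service",
}


def ops_view(eeo_element: Dict[str, Any]) -> Dict[str, Any]:
    """Read direction: compose the operations-friendly view from the EEO view."""
    return {
        "service_id": eeo_element["element_id"],
        "status": EEO_TO_OPS_STATE[eeo_element["lifecycle_state"]],
        "sla_met": eeo_element["latency_ms"] <= eeo_element["sla_latency_ms"],
    }


def apply_ops_change(eeo_element: Dict[str, Any],
                     ops_change: Dict[str, Any]) -> Dict[str, Any]:
    """Write direction: translate an operations-side change into EEO terms."""
    updated = dict(eeo_element)          # leave the caller's copy untouched
    if "sla_latency_ms" in ops_change:
        updated["sla_latency_ms"] = ops_change["sla_latency_ms"]
    return updated


element = {"element_id": "vpn-42", "lifecycle_state": "healing",
           "latency_ms": 18.0, "sla_latency_ms": 20.0}
print(ops_view(element))
tightened = apply_ops_change(element, {"sla_latency_ms": 15.0})
print(ops_view(tightened))
```

The higher the crossbar sits, the closer this mapping gets to a pass-through; the lower it sits, the more of the operations view has to be derived this way.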

In an operator-defined architecture like Verizon’s, there’s no need to support a range of options for positioning the bar because the operator can make a single choice.  For a general architecture, vendors and operators would have to expect that the modeling at every layer provide for the coequal viewing of OSS/BSS and EEO elements, and support the necessary parametric derivations—both in read and write terms—to provide that capability.

The bidirectionality of the bar is important because it illustrates the fact that there are two parallel paths—operations and orchestration—and that these can be harmonized in part by “exporting” functionality from one to the other.  This is the answer to the political dilemma that OSS/BSS modernization seems to pose, because if you use the bar to shunt orchestration into OSS/BSS you “modernize” it, and if you use the bar to move OSS/BSS functions (by making them orchestrable) into the orchestration side, you essentially replace the current OSS/BSS concept.

I don’t see specific evidence of this “bidirectional bar” in the Verizon approach, but I think that all of the six vendors with current full-spectrum NFV capability could provide it.  It will be interesting to see if the emergence of carrier-developed models (like those of AT&T and Verizon) will raise recognition of the multiplicity of possible OSS/BSS-to-orchestration relationships, and create some momentum for a solution that can accommodate more, or even all, the options.

The Implications and Impacts of Verizon’s End-to-End Hierarchical Modeling

It has always been my view that NFV would be better and more efficient if there were a common modeling approach from the top layer of services to the bottom layer of infrastructure.  I still feel that way, but I have serious doubts on whether such a happy situation can now arise.  The service-centric advance to NFV now seems the only path, and that advance almost guarantees a multiplicity of modeling approaches.  They might be harmonized, though, by adopting some of the principles outlined in the Verizon paper I’ve blogged about, and that’s the topic of the day.

A model is an abstraction that represents a complex interior configuration.  In NFV, the decomposition of a model into that internal complexity is the responsibility of the orchestration process.  Everywhere you have a model, you have an orchestrator, and of course if you have different modeling approaches then you have multiple orchestrators.  I’ve always felt that introduced complexity and inefficiency, which is why I favored a single one, but we’ve already noted that’s probably no longer practical.

A generalized NFV architecture should have, and likely would have, contained a single modeling/orchestration implementation, but the ETSI work hasn’t defined all the layers and only a few (six) vendors have a unified architecture to date.  Further, there’s been little (well, let’s be honest, no) progress toward a full-scope NFV business case.  That’s what got us on a service-driven path, and service-driven approaches rarely develop holistic modeling/orchestration visions.  That’s because individual services don’t expose the full set of issues.

There are two pieces of good news in this.  First, most service-driven NFV starts at the bottom and goes no higher than necessary, which isn’t very far.  Second, a proper modeling and orchestration approach at a higher level can envelop and harmonize the layers below even if they’re not based on the same approach.  This is one of the features of Verizon’s End-to-End Orchestration or EEO approach, but it also applies down deeper, and it’s the basis for coercing order from service-specific NFV chaos.  But it’s not without effort and issues.

Let’s suppose we have a “classic ISG” implementation of NFV, which means that we have some sort of model that represents the VNF deployment requirements of a service, which means we have individual VNFs and the “forwarding graph” that somehow connects them.  This combination, represented using something like YANG, represents a specific model at a low level.  It’s the sort of thing you might find in a vCPE business service deployment.

Now let’s suppose we want to incorporate this in a broader service vision, one that for example includes some legacy service elements that the first model/orchestration didn’t support.  We could add a new layer of model/orchestration above the first.  If our first model is M0 we could call this second one M1 to show its relationship.  The M1 model would have to be able to properly decompose requests for legacy provisioning, and it would have to be able to recognize a model element that represented an M0 structure and decompose that structure into a request to pass a low-level model to the appropriate low-level orchestrator.  This is a hierarchical decomposition—one model can reference another as an interior element.

In my example, I assumed that the M1 model had orchestration that would directly process legacy deployments, but you could just as easily have had a second M0 level, this one for legacy, and had the M1 level reference the models of either of the two options below.  Thus, even if you assumed that you had two different ways of implementing NFV (deploying VNFs) you could still envelop them both in a higher-level model, as long as each of the two options below could be identified.  You could either give each of them a different model element, or have the decomposition logic determine which of the two was needed.
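Here is a minimal Python sketch of that layering, offered as an illustration and not as anybody’s actual implementation.  Everything in it is an assumption made for the example: the `M0Model` and `M1Service` classes, the `kind` tag used to tell the lower-level options apart, and the two stub orchestrators that stand in for whatever real M0 processes exist.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class M0Model:
    """A low-level model, opaque to the M1 layer except for its 'kind' tag."""
    kind: str       # hypothetical tag, e.g. "nfv-vnf" or "legacy"
    payload: dict   # whatever the low-level orchestrator understands


@dataclass
class M1Service:
    """A higher-level service model whose interior elements are M0 models."""
    name: str
    elements: List[M0Model] = field(default_factory=list)


def nfv_orchestrator(payload: dict) -> str:
    # Stand-in for an ISG-style VNF deployment process.
    return f"deployed VNFs {payload['vnfs']}"


def legacy_orchestrator(payload: dict) -> str:
    # Stand-in for legacy device provisioning.
    return f"provisioned device {payload['device']}"


# The M1 layer only needs to know which orchestrator handles which kind.
M0_ORCHESTRATORS: Dict[str, Callable[[dict], str]] = {
    "nfv-vnf": nfv_orchestrator,
    "legacy": legacy_orchestrator,
}


def decompose_m1(service: M1Service) -> List[str]:
    """Pass each interior M0 model to the orchestrator identified by its kind."""
    results = []
    for element in service.elements:
        if element.kind not in M0_ORCHESTRATORS:
            raise ValueError(f"no orchestrator for model kind {element.kind!r}")
        results.append(M0_ORCHESTRATORS[element.kind](element.payload))
    return results


vpn = M1Service("business-vpn", [
    M0Model("nfv-vnf", {"vnfs": ["firewall", "nat"]}),
    M0Model("legacy", {"device": "edge-router-1"}),
])
results = decompose_m1(vpn)
```

The sketch uses the “different model element” option, tagging each lower-level model so the M1 layer can tell them apart.  The alternative, having the decomposition logic work out which option is needed, would replace the dictionary lookup with a decision function.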

What this shows is that layered modeling and orchestration can accomplish all sorts of useful stuff, including harmonizing different implementations or addressing the deployment of things that a given model doesn’t include/support.  And it can be carried on to any number of layers, meaning that you could orchestrate a dozen different model layers.  This (sorry to beat a dead horse here!) is another reason I liked the idea of a single model/orchestration approach.  It would have let us decompose a model using recursive processing by a single piece of software.  But, onward!
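To show what “recursive processing by a single piece of software” could mean, here is a small Python sketch.  It assumes, purely for illustration, that every layer uses one common node structure (the hypothetical `ModelNode`), which is exactly the condition a single model/orchestration approach would have guaranteed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ModelNode:
    """One element of a service model; children make it a branch."""
    name: str
    children: List["ModelNode"] = field(default_factory=list)


def decompose(node: ModelNode, depth: int = 0) -> List[Tuple[int, str]]:
    """Walk the hierarchy depth-first and return (depth, name) for each leaf."""
    if not node.children:
        # A leaf is where the model stops being abstract and gets deployed.
        return [(depth, node.name)]
    leaves: List[Tuple[int, str]] = []
    for child in node.children:
        # The same function handles every layer, however many there are.
        leaves.extend(decompose(child, depth + 1))
    return leaves


service = ModelNode("vpn", [
    ModelNode("access", [ModelNode("vcpe")]),
    ModelNode("core"),
])
leaves = decompose(service)
```

Note that the two leaves sit at different depths, so each branch reaches its deployable form at its own level.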

Verizon’s paper calls for two layers of service modeling, one representing the retail view of the service as it might be seen by a customer or the OSS/BSS, and the second representing the input to a connection controller (SDN controller with legacy capability).  I think it would be helpful to generalize this to allow any number of layers, and to recognize that each “leaf/branch” on the tree of a service hierarchy would pass from service-abstract to resource-abstract in its own way at its own time, subject to the higher-layer service/process synchronization needed to ensure pieces get set up in the right order (which Verizon’s paper includes as a requirement).

How about Verizon’s EEO?  The Verizon paper has an interesting point in its section on E2E service descriptors:

Apart from some work in ETSI/NFV, which will be discussed below, there has not been much progress in the industry on standardizing EENSDs. That is not considered an impediment, for the following reasons:

  1. EEO functionality and sophistication will improve over time
  2. Operators can start using EEO solutions in the absence of standard EENSDs

Since an EENSD is essentially a service model, what Verizon is saying is that you could hope that the industry would converge on a standard approach there, but if it didn’t operators could still use proprietary or service-specific EEO strategies.  True, but they could also simply overlay their “EEO” models with a “super-EEO” model and harmonize that way.

I said earlier that this wasn’t necessarily a slam-dunk approach, and the reason should be obvious.  If a given Mx is to superset a lower-level model then the decomposition to that lower-level model and the invocation of its orchestration process has to be incorporated into the modeling/orchestration at the Mx level.  Somebody would have to write the code to do this, and even if we assume that the orchestrator at our Mx level is open-source, there’s still work to be done.  If it isn’t, then only the owner of the software could do the modification unless there was a kind of plug-in hook mechanism provided.

To make this kind of model-accommodation easier, the first requirement is that all modeling approaches provide the documentation (and if needed, licensing) to allow their model to be enveloped in one at a higher layer.  The second requirement is that any modeling layer have that plug-in hook or open-source structure such that it can be expanded to include the decomposition of new lower-level models.
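The plug-in hook in the second requirement might look something like the Python sketch below.  This is an assumption about shape, not a description of any vendor’s product: `ModelLayer`, `register`, and the `vendor-a-vcpe` plug-in are hypothetical names invented for the example.

```python
from typing import Callable, Dict, List

# A decomposer turns one lower-level model into a list of orchestration requests.
Decomposer = Callable[[dict], List[str]]


class ModelLayer:
    """A modeling layer (an 'Mx') that can be extended without changing its code."""

    def __init__(self) -> None:
        self._decomposers: Dict[str, Decomposer] = {}

    def register(self, model_type: str, decomposer: Decomposer) -> None:
        """The plug-in hook: teach this layer about a new lower-level model."""
        if model_type in self._decomposers:
            raise ValueError(f"{model_type!r} is already registered")
        self._decomposers[model_type] = decomposer

    def decompose(self, model_type: str, model: dict) -> List[str]:
        if model_type not in self._decomposers:
            raise LookupError(f"no plug-in registered for {model_type!r}")
        return self._decomposers[model_type](model)


def vcpe_plugin(model: dict) -> List[str]:
    # Hypothetical plug-in, written from the lower-level model's documentation.
    return [f"deploy {vnf} at {model['site']}" for vnf in model["vnfs"]]


layer = ModelLayer()
layer.register("vendor-a-vcpe", vcpe_plugin)
requests = layer.decompose(
    "vendor-a-vcpe", {"site": "branch-7", "vnfs": ["firewall", "nat"]}
)
```

The point of the design is that whoever writes `vcpe_plugin` needs the lower-level model’s documentation (the first requirement) but not the source code of the layer above.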

All of this could be accomplished in two broad ways.  First, any of the six vendors with a comprehensive implementation could focus on “de-siloing” and service harmonization in their development and positioning.  Second, some standards group or open-source activity could address it as an explicit goal.  I think AT&T and Verizon have both made the goal implicit in their announced approaches, but real progress here is going to depend on somebody picking up the standard of harmonization and making a commitment.

A final interesting point is that this approach appears to create an opportunity to offer “modeling-and-orchestration-as-a-service”.  Higher-level models could be linked to cloud service portals, passing off lower-level provisioning and lifecycle management to operators’ own implementations.  This could create a whole new set of NFV opportunities, and competition among model providers could move the whole service-first approach ahead, to the benefit of all.

Lessons from Taking a Service-Inward View of NFV

Getting closer to the buyer and to the dollars is always good advice in positioning a product or service.  For network operators, that means looking at what services they sell, and for network operators reviewing the potential of SDN/NFV, it means looking at how these new technologies can improve their services.  But “services” doesn’t necessarily mean “all services.”  In my last two blogs, I used a combination of operator comments to me on their view of NFV’s value and future, and Verizon’s SDN/NFV architecture paper, to suggest that operators were looking at NFV now mostly in a service-specific sense.  Holistic NFV, then, could arise as an almost-accidental byproduct of the sum of the service-specific deployments.

One question this raises is just how far “holistic NFV” can go, given that early projects might tend to wipe low-hanging benefits off the table.  Another is whether silos of NFV solutions, per service, might dilute the whole holistic notion.  I don’t propose to address either of these at this point, largely because I’ve talked about these problems in prior blogs.  What I want to do instead is look at what “service-driven NFV” might look like.

Service-driven NFV has to start with service objectives, first and foremost.  There is relatively little credibility for capex reduction as an NFV driver overall, and I think less in the case of service-driven NFV.  Few “services” offer broad opportunities to reduce capex, broad enough to impact the bottom line and justify taking some risks.  There are some credible opex benefits that might be attained on a per-service basis, but again the issue of breadth comes in.  In addition, there’s a risk that specialized NFV operations within a single service could create islands of opex practices that would end up being confusing and inefficient.  That means revenue-side, or “service agility” benefits would have to be the key.

That’s a conclusion consistent with both my operator survey and the content of the Verizon paper, I think.  Operators told me they liked mobile services and business services as NFV targets, and their specific comments focused on portal provisioning and customer care, agile deployment of incremental managed service features, etc.  The big focus in Verizon seems to be the same, with the specific adjustment that “mobile” probably means “5G”.  In fact, about half of the Verizon paper, which runs to over 200 pages, is devoted to mobile issues and applications.

Everyone these days thinks “services” mean “portals”, and that’s true at one level.  You need to have self-service user interfaces to improve agility or the operator’s customer service processes are just delays and overhead.  However, a portal is a means of activation and presentation, and that’s all it is.  You still need to have something to process the activations and to generate what you propose to present.

If customers are going to have a portal that provides them both service lifecycle information and the ability to make changes to services or add new ones, then the critical requirement is to have a retail representation of a service that can be decomposed into infrastructure management, including the deployment of virtual functions (NFV) and the creation of ad hoc network connectivity (SDN).  In the Verizon paper this is accomplished through a series of hierarchical models.  There is a model that represents the service as a portal or operations system would offer it, which is the retail vision.  Another model represents the connectivity options available from the underlying infrastructure, and yet another the “abstract devices” that create the connectivity.  The models build up to or decompose from (depending on your perspective) their neighbors.

Implicitly, the service-driven vision of NFV would start with the creation of this model hierarchy.  The retail presentation (analogous to the TMF’s “Product”) would decompose into functional elements (TMF Customer-Facing Services?) and then into the infrastructure connectivity elements (TMF Resource-Facing Services?) that would be built from the abstract devices.  To make this process amenable to portal-driven ordering, you’d need the model hierarchy to define the service based on all of the possible infrastructure options that might be associated with a given order, meaning that the model would have to support selective decomposition based on a combination of the order parameters and the service-versus-network topology-to-infrastructure relationships.  A portal order could then initiate a selective decomposition of the model, ending in a deployment.
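Selective decomposition can be sketched in a few lines of Python.  The example below is hypothetical throughout: the `Option` structure, the `site_has_agile_cpe` order parameter, and the two firewall hosting options are invented to show the mechanism, not taken from the Verizon paper.

```python
from typing import Callable, List, NamedTuple


class Option(NamedTuple):
    """One infrastructure option a retail element could decompose into."""
    name: str
    applies: Callable[[dict], bool]   # test against the order parameters
    elements: List[str]               # lower-level elements to deploy


# Options are listed in order of preference; the last one is the fallback.
FIREWALL_OPTIONS = [
    Option(
        "cpe-hosted",
        lambda order: order.get("site_has_agile_cpe", False),
        ["load firewall VNF on CPE"],
    ),
    Option(
        "cloud-hosted",
        lambda order: True,
        ["deploy firewall VNF in pool", "connect VNF to site"],
    ),
]


def select(options: List[Option], order: dict) -> Option:
    """Return the first option whose condition the order satisfies."""
    for option in options:
        if option.applies(order):
            return option
    raise LookupError("no infrastructure option fits this order")


on_cpe = select(FIREWALL_OPTIONS, {"site_has_agile_cpe": True})
in_cloud = select(FIREWALL_OPTIONS, {"site": "branch-7"})
```

In a real model the conditions would also draw on topology, meaning where the service endpoints sit relative to available infrastructure, and not just on what the customer typed into the portal.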

Automated deployment doesn’t address all the issues, of course.  A user who depends on a portal for service orders and changes is likely to depend on it for service status as well.  Thus, it’s reasonable to assume that the retail model of the service defines parameters on which the service is judged by the buyer—the SLA.  The model would then have to create a connection between these parameters and the actual state of the service, either by providing a specific derivation of SLA statistics from lower-level model statistics, or by doing some sort of status query on an analytics database.
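As a sketch of the first alternative, deriving SLA statistics from lower-level model statistics, consider the Python below.  It rests on simplifying assumptions that I should state plainly: the lower-level elements are taken to be in series, their failures are treated as independent, and the `ElementStats` fields are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, List


@dataclass
class ElementStats:
    """Status reported by one lower-level model element (hypothetical fields)."""
    name: str
    availability: float   # fraction of time the element is up, 0.0 to 1.0
    latency_ms: float     # latency contributed by this element


def derive_sla(elements: List[ElementStats]) -> Dict[str, float]:
    """Roll element statistics up into the retail SLA parameters.

    Assumes elements in series with independent failures, so availabilities
    multiply and latencies add.
    """
    return {
        "availability": prod(e.availability for e in elements),
        "latency_ms": sum(e.latency_ms for e in elements),
    }


sla = derive_sla([
    ElementStats("access", 0.999, 5.0),
    ElementStats("core", 0.9999, 12.0),
])
```

Redundant paths or shared failure modes would need a different derivation, which is one reason some implementations might prefer the analytics-database query instead.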

Both the deployment and lifecycle management activities associated with service-driven NFV pose a risk because a “service” may not expose the full set of requirements for either step, and thus new services might not fit into current models of NFV as operators seek to expand their NFV story.  Put another way, each service could end up being a one-off, sharing little in the way of software with others.  It’s even possible that early services would not develop generally reusable resource pools.

vCPE is a good example of this.  There is no question that the best model for managed service deployment would be an agile CPE device on which feature elements could be loaded.  Every model I’ve run on this suggests that it would always beat a cloud-hosted model of deployment for business service targets, which is where most operators want to use it.  Obviously, though, CPE resources to host VNFs wouldn’t be generally helpful in building a resource pool.  Less obvious is the fact that software to deploy VNFs on CPE wouldn’t have to consider all the issues of general pooled-resource infrastructure.  There’s no selection optimization, no connection of features through the network, and the derivation of management data is much easier (the CPE device knows the status of everything).

A general solution to service-specific deployments via evolving SDN/NFV technology could be created by expanding the OSS/BSS role, but Verizon’s architecture seems to focus on the opposite, which is containing that role.  Operators seem to think that pushing more details of SDN/NFV into the OSS/BSS is a bad move overall, and the TMF has yet to publish an open model that supports the necessary absorption.  At this point, I think that’s a dead issue, which is why I appended the question marks to the TMF references earlier.

A final consideration in a service-driven model of NFV is whether new services might be a major contributor.  Even in mobile NFV, Verizon and other operators seem to think that it would be difficult to drive NFV without a broader mobile change, like 5G, to help bear the cost and justify the disruption.  That suggests that a green-field service would be even more helpful.  IoT is the obvious one, and Nokia has recently suggested it thinks that some medical applications (which could be considered a subset of IoT) could also drive network change.  However, focusing on NFV as a platform for new services would be facilitated if there were a generic NFV model to build on, and getting that model in place may be more than new services can justify.