Will Nokia Get the Most Out of Nuage?

One of the positive things said recently in the SDN space is that Nokia is very committed to Nuage, the SDN unit that Alcatel-Lucent acquired several years ago.  Alcatel-Lucent never played Nuage well in my view, even though it offers a number of highly relevant features, enough to make it my favorite among the SDN plays.  Now, the combination of support from the parent and market opportunity may be playing in its favor.  As is often the case, though, there are still questions.  Just how much “commitment” is Nokia offering here?

There are three broad technical models of SDN.  One is the overlay or Nicira model, in which tunnels created on top of arbitrary infrastructure let you create a connection network whose structure and rules for connectivity and traffic flows are (largely, as we’ll see) independent of the transport underlayment.  The second is the OpenFlow model, where forwarding rules combine to create connectivity at (theoretically) L2 or L3.  The third is a melding of these two models, and Nuage represents this last approach.

Nicira proved that in today’s relatively low cost-per-bit world you can afford to waste some bandwidth on the additional headers needed to create the overlay.  The overlay model is nice because it can span any arbitrary set of L2/L3 switching/routing networks and create what appears to be a unified model.  You can see this approach maturing in the SD-WAN offerings of today.  However, overlay networks are just traffic to the transport underlayment, which means that it’s difficult to provide integrated QoS or traffic management.  For network operators, a pure overlay model is also hard to differentiate from competitor managed service providers, who can ride on your underlayment to offer their services.
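
For a sense of scale, here’s a back-of-the-envelope sketch in Python of the header tax an overlay imposes, assuming a VXLAN-style encapsulation (my assumption for illustration; Nicira-era overlays used STT/GRE variants, but the arithmetic is similar).

# Rough overlay-encapsulation overhead for a VXLAN-style tunnel:
# outer Ethernet + outer IPv4 + UDP + VXLAN headers.
OUTER_ETH = 14   # bytes
OUTER_IP = 20
UDP = 8
VXLAN = 8

overhead = OUTER_ETH + OUTER_IP + UDP + VXLAN   # 50 bytes per frame
inner_frame = 1500                              # typical inner MTU-sized frame
waste_pct = 100.0 * overhead / (inner_frame + overhead)

print(f"Overlay adds {overhead} bytes per frame, about {waste_pct:.1f}% of link capacity")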

The OpenFlow model solves these problems with SDN controllers and centralized traffic and QoS management, but that kind of SDN is a revolution in cost and operating procedures.  Because major vendors don’t want to lose revenue, their SDN support has largely focused on making OpenFlow control work on legacy devices, which offers only limited utility versus the native L2/L3 protocols.  With a little SDN, you don’t get much benefit, and without starting with a little you don’t get to a lot, so this model has had growing pains outside the data center.

Nuage’s approach has been to enhance the overlay model in a number of interesting ways.  One is that APIs provide a means for integrating traffic engineering at the SDN level with transport traffic engineering.  This lets network operators with real, owned assets integrate their stuff vertically to differentiate the services they provide.  Another is to provide API integration with cloud hosting environments of all types (hypervisor and container) and also to support Network Functions Virtualization connectivity.  A third is to virtualize networks all the way to the user in any arbitrary, useful, way.

Truly virtualized networks integrated with data center networking and transport QoS would be a revolution in a number of ways.  A true virtual overlay could be pushed out to any connected user simply by giving that user a device/app that could terminate the overlay network properly.  Armed with that capability, a service could support a worker on a phone as easily as a fixed-Ethernet branch location, and in the latter case could then extend out virtual-worker terminations too.  By linking workers to communities/roles that defined their application access rights, then linking those roles back to the proper data center application-specific VPN, you’d have network security and isolation without firewalls.
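
As a purely hypothetical sketch of that worker/role/VPN linkage (the names and data layout are mine, not anything Nuage publishes), the mapping might look like this in Python:

# Illustration only: a worker's role determines which application-specific
# overlay VPNs that worker's termination gets joined to.  Access control is
# then a property of overlay membership rather than of firewall rules.
ROLE_TO_VPNS = {
    "claims-adjuster": ["claims-app-vpn", "imaging-vpn"],
    "field-sales": ["crm-vpn"],
    "branch-teller": ["core-banking-vpn", "crm-vpn"],
}

def vpns_for_worker(role: str) -> list:
    """Return the overlay VPNs a worker's device/app should terminate."""
    return ROLE_TO_VPNS.get(role, [])

# A worker on a phone and one at a fixed-Ethernet branch get the same answer:
print(vpns_for_worker("field-sales"))   # ['crm-vpn']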

All of this is very consistent with network connections for the cloud, obviously, but it’s also good for NFV, and Nuage enhanced their stuff to support SDN connections for service chaining.  In fact, while Alcatel-Lucent was independent, Nuage seemed to be drifting more to a role of cloud connectivity and NFV and less one of branch networking.  You can see the transition in the Nuage presentations even going back only two years; they were much more “universal SDN” than “cloud/NFV” in the old days.

While neither the old nor the new positioning favored a radical virtual-wire positioning, the fact is that any overlay SDN has the potential to at least complement virtual-wire networks.  For those who don’t recall the blogs on that topic, “virtual wire” networks use tunnels and software instances of switching/routing to create L2/L3 services instead of segmenting transport switches/routers.  If you could create virtual-wire partitioning of optical paths and then add overlay SDN, you could create pretty much everything—even (if you were careful in your choice of vendors and the approach you took) the Internet.

This is interesting because it puts the Nokia/Nuage combination in potential competition with the from-the-optical approaches that could be developing from ADVA and Ciena, and potentially Nokia could also support that bottom-up model.  The most impactful thing that could happen in network infrastructure (and in network equipment sales) would be subducting most of the reliability, availability, and aggregation features out of L2/L3 devices into SDN-groomed virtual wires, then building overlay IP and Ethernet networks using software instances hosted in the cloud, by NFV, or however you’d like to see it done.

The issue in both the branch and virtual-wire models of networking is the same: can you virtualize the connection layer of the network to provide better isolation of services and at the same time reduce opex and capex?  I think the answer is “Yes!” but the issue is complex, and the players who have the most incentive (and in the case of Nokia, the most collateral) to solve the problem often have the most at stake in the status quo as well.

If Nokia were the only player who could do this, I’d be inclined to bet that they’d sit on their hands, Nuage-wise, just like Alcatel-Lucent did.  They have inherited the Alcatel-Lucent portfolio, they have to pay back on their investment in Alcatel-Lucent, and Nokia was never a progressive marketing/positioning company.  If there’s anything that would demand progressive thinking in networking, both the branch office and virtual-wire transformations would top my personal list.  But remember that ADVA and Ciena can push virtual wire, and so can Brocade.  Dell/VMware would be happy to promote overlay SDN and perhaps the branch model, and maybe even virtual-wire as well.  Thus, Nokia may have little choice but to try aggressive marketing on for size.  They can’t suppress change, only refuse to profit from it!

Can We See NFV Emerging from MWC?

Experience teaches a hard school, so they say, but difficulty aside, experience often teaches the only lessons that matter.  We’re starting to get some experience in the NFV space.  It’s not so much direct deployment experience, because little has really happened to drive the kind of NFV deployment that would serve as the model for NFV-based transformation.  It’s marketing experience, as vendors begin to confront, for the first time, buyers’ understanding of the transformation issues.  The results are starting to show in marketing.  NFV will go as vendor positioning takes it, so it’s well worth the time to look at the marketing lessons we’ve learned, especially coming out of MWC.

The most important thing we’ve learned is that operators are realizing that there’s more to NFV than ETSI specifications suggest.  We have two new management/orchestration groups (Open Source MANO and OPEN-O, presumably Open Orchestration) launched primarily by operators and aimed primarily at making the bare-bones ETSI model of NFV into something that can actually be deployed and generate benefits.

Telefonica’s announcement on Unica, the topic of my blog yesterday, wasn’t strictly a part of MWC but it was contemporaneous, and it illustrates that operators have moved beyond having tests and trials that prove that OpenStack can deploy a VNF just like it deploys an application component.  They’re now recognizing that they have to deploy an NFV that’s functionally useful or integrating with it won’t mean anything.

I think that the OSM white paper should be required reading for everyone in the NFV space.  It’s not necessarily that it’s the definitive statement on functionality or approach, but that it demonstrates that operators are recognizing the key things that NFV needs and the ETSI ISG hasn’t provided.  These things include full-scope service orchestration, inclusion of legacy network components, and multiple VIMs per implementation.

This new realization brings with it a new problem for many vendors.  NFV isn’t going to be simply a matter of blowing kisses or throwing money at influencers.  This isn’t easy stuff, and so everyone in the NFV space now has to face the fact that they either need a real position or risk being exposed as an NFV charlatan.

This point raises the second thing we’ve learned, which is that the big-money part of NFV is splitting off from the rest of it, explicitly, for the first time.  While startups could hope to get a big revenue boost from selling the orchestration part of NFV, most established vendors need a bigger cash cow to milk.  NFV’s biggest is NFV Infrastructure (NFVI), which means the data center equipment that NFV will require.  Up to now, most vendors with NFVI aspirations have ballyhooed their general NFV positioning.  No more.

EMC’s Provider Cloud System is an example of an NFV strategy that’s focused on NFVI.  It’s also an example of why it’s critical that the NFV community accept something that Telefonica has made a part of its Unica architecture—multiple VIMs.  If there’s to be a rich community of infrastructure providers for NFV then each of them has to be able to supply the software that interfaces their stuff upward into the management and orchestration process.  Otherwise a few big infrastructure vendors would simply refuse to support other vendors with their VIMs and you’d be back to silos.
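
To illustrate the multi-VIM point, here’s a minimal, hypothetical sketch of what a vendor-neutral VIM contract could look like; the class and method names are my own, not anything from the ETSI specs or from Unica.

# If orchestration talks only to an abstract VIM interface, each
# infrastructure vendor can ship its own implementation and still plug into
# the same MANO layer, which is the "multiple VIMs" argument above.
from abc import ABC, abstractmethod

class VirtualInfrastructureManager(ABC):
    """Illustrative VIM contract; names are assumptions, not a standard."""

    @abstractmethod
    def deploy(self, vnf_descriptor: dict) -> str:
        """Instantiate a VNF on this vendor's infrastructure; return an ID."""

    @abstractmethod
    def teardown(self, instance_id: str) -> None:
        """Release the resources behind a VNF instance."""

    @abstractmethod
    def status(self, instance_id: str) -> dict:
        """Report lifecycle/health state for the instance."""

class VendorOpenStackVIM(VirtualInfrastructureManager):
    """One vendor's implementation; another vendor supplies its own."""
    def deploy(self, vnf_descriptor: dict) -> str:
        return "vendor-instance-1"   # a real VIM would call OpenStack here
    def teardown(self, instance_id: str) -> None:
        pass
    def status(self, instance_id: str) -> dict:
        return {"state": "ACTIVE"}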

Intel may have a lot to do with this NFVI-independence development.  There are a lot of ways to slice up the server deployment that could come out of an optimum NFV success, but Intel chips would probably figure dominantly in most of them.  Thus, Intel has the most to lose if everyone sits on their hands.  For the last year, Intel has been gradually increasing the pressure on vendors to position NFVI effectively, and I think that’s now bearing fruit.

Splitting NFVI out of the NFV fog might result in splitting NFV out into its realistic market elements, particularly if the open-MANO stuff matures as it should.  Orchestration, VNFs, and NFVI are what NFV is about.  Together they’re great, but operators tell me that the togetherness has to come from open architectures and not from monolithic vendor offerings.

For the real NFV vendors this is mostly good because most of them don’t really expect to own the ecosystem.  HPE and Nokia, both of whom had a realistic shot at building a single-vendor NFV world for at least some clients, will find that harder to do at this point.  ADVA and Ciena, both of whom have very strong orchestration and management stories but don’t have all the other NFV pieces, may find themselves being courted by a host of NFVI hopefuls who need the business case, which NFVI alone won’t build.

The third critical lesson relates to this point.  NFV and SDN are technologies critical to optimizing the business case for advances like 5G, but only if they can make their own business case.  I got a lot of email from MWC offering variations on a single theme:  “I’m sick of hearing vacuous crap with the 5G label stuck to it!”  Paraphrasing an operator, you need SDN and NFV to make 5G anything more than an RF upgrade.  You know I agree with that one!  The financial media noted that in the smartphone space, there wasn’t much at MWC that qualified as inspirational or transformational.  Yet we had an NFV announcement.  Interesting.

Networking has changed profoundly in experience and expectation terms under the pressure of the web, content delivery, and mobile broadband.  Given that a network is already a mixture of complex technologies and vendors, we can’t expect that diddling it here or there in grand isolation is going to bring about much in terms of the business model or user QoE.  Standards like SDN and NFV are not a perfect driver of change, but they’re better than swatting at individual pieces of this and that.  Virtualization seems a clearly justified direction for network evolution, and we need to face that in its entirety, across all the pieces of the network, all the organizational fortresses of operators, and all the stakeholders (parasites and symbiotes alike).

We have no good standards-and-consortiums way to do this now.  Even SDN and NFV are too compartmentalized in focus to make their own business case, and to change networking at the fundamental level we’d have to involve so many standards development organizations that we’d collapse into an acronym black hole.  OSM is a proof that operators are moving into a very new and very uncomfortable position in trying to build the framework that defines their own future.  Many of their challenges are imposed by antiquated regulations that make a conference of operators about as illegal as a mob meeting.  We may have to rethink how the industry is regulated and allow operators to do more as a community without running into anti-trust risk.

I’m sort of optimistic coming out of MWC.  It’s not that trade shows are anything less than cynical marketing, or that I believe that collecting a bunch of people in one place makes them smarter (we all know the theory that the IQ of any group of people is equal to the IQ of the dumbest divided by the number of people).  It’s just that opportunism that fails in the end to support a realistic set of benefits to the buyer is doomed, and while all the vendors want to own or control the NFV opportunity, few think it can be stalled forever.  Sensibility just might be rearing its head.

What’s Inside Telefonica’s Latest Unica Award

The headlines say that Telefonica has picked a new integrator for its Unica network program—Ericsson—and that’s true but only sort of.  It’s true that Telefonica rebid the integration deal that HPE won originally, and it’s true that Ericsson won the new deal.  Ericsson was one of the three choices all along, and the one I’d been hearing would be the final pick.  What doesn’t appear to be true is that the deal Ericsson won is much like the original deal that was rebid, and what has changed is a lot more important than who won the changed bid.

If you look at the Telefonica press release, which is quite short, it says that they closed “a new phase in the selection process for UNICA”, which hardly sounds like a big integrator award.  They don’t get to Ericsson or the award until the fourth paragraph, where they say “Ericsson has been awarded as a key partner of this program to provide the infrastructure and software required to launch the UNICA architecture in Telefónica Germany in 2016.”  The press release doesn’t say what Ericsson is doing in “infrastructure” but Light Reading reported it included “Hyperscale Datacenter System 8000, Ericsson Cloud Manager, OpenStack based Cloud Execution Environment, Cloud SDN” and automation tools.

What the press release does say is that HPE and Huawei, the other two finalists in the integrator bid, will “continue to participate in this program as strategic partners for NFV.”  Some insiders tell me that these two firms may even have enough latitude to develop the integrated frameworks they’d proposed to Telefonica, though apparently without receiving integrator payments.

What this looks like to me is a bit of a shift of focus.  I’d heard that Telefonica was learning that the integration task for NFV was profoundly different and more difficult than they’d expected it to be.  I’d heard that they were unwilling to pay for the kind of work they now knew would be needed.  Most importantly, I heard that they were rethinking their approach to control the costs.

The most important point in the integration story isn’t who was selected but that Telefonica is apparently now relying on open source contributions to not only supply components to their network, but also to shift the focus of “integration” to something broader and more suitable to making an NFV business case.  They’re showcasing their Open Source MANO initiative at MWC and they’ve now released the white paper referenced on the OSM site.  I agree with most of the points they make there, and because Telefonica is a big power behind OSM and it’s thus likely to frame the cooperation among vendors and products that they expect Ericsson to integrate, I want to summarize what it says.

OSM is a superset of NFV, designed that way because of what limited the ISG work.  “This initiative did a fine job of focusing on the additional components (to a conventional network) required to enable NFV service. There was a great effort in defining information models, which can in turn drive a lifecycle management function enabling resource orchestration through the NFVO and VIM. The challenge was initially so vast that some aspects were left outside of scope for further study. These included the higher-level issue of end-to-end service orchestration and also the current state of play in the open-source community with respect to cultivating possible management and end to end service orchestration solutions.”

Nobody will be surprised that I agree with this point; the scope limits in the ISG work were something I fought from the first.  What the OSM material is saying is that you need “service orchestration” beyond the narrow problem of orchestrating virtual functions, and that’s clearly true.  What’s critically important here is that a major Tier One and one of the ISG founding members is supporting a broader model of “service orchestration”.  This is what’s been missing all along in making the business case.

The OSM model the white paper presents has some interesting technical principles.  The paper cites them as layering, abstraction, modularity, and simplicity, and it’s hard to disagree with any of them.  The implementation that’s described in the paper adds some color to these principles; the deployment process is orchestrated by JuJu, an application modeling and deployment tool that comes from the cloud world (it’s from Canonical).  JuJu is probably the best of the current cloud DevOps tools, so it seems an inspired choice.

So far, the paper describes a useful and even insightful model of service management, but there are still a couple of issues the white paper doesn’t completely address, IMHO.  One is the VNF Manager (VNFM) concept, which is the ETSI way of addressing the problem of lifecycle management beyond the initial deployment.  The other is exactly what a service orchestrator does, and in particular how services are modeled.

It appears that the OSM approach still supports, if not mandates, a VNF-integrated VNFM component to handle things beyond the general tasks handled by JuJu.  I’m uncomfortable with this approach, as I’ve been all along, because it seems to me that it will both complicate VNF integration and raise the risk of diluting the boundary between services/tenants.  I hope further experimentation within OSM will address these issues.  I’ve got nothing against the notion of having a VNF component do parameterization, but you need to ensure that it doesn’t have to address real resources or you’ve created a security/stability risk.

On the service orchestration side, the logic is contributed by Rift.io.  I’ve had some discussions with this company and reviewed a slide deck, but their deck and website material don’t offer any detail on the service modeling, and in fact imply that they have what’s essentially an open-source implementation of MANO, which OSM agrees doesn’t include service orchestration.  The website, in fact, seems to suggest the company’s focus is on a VNF-PaaS approach, and the diagrams show orchestration as being a contribution from the NFV ISG side, presumably Open MANO from Telefonica.  I’d like to see their service model and details made available, because the material that’s currently distributed doesn’t validate the service-orchestration positioning of the paper or the demo.

I’ve focused on open VNFs so far, but Telefonica isn’t ignoring open resources.  The press release on Telefonica’s site notes in passing a very important advance—the support for multiple VIMs.  The notion that a single VIM would manage all virtual infrastructure created what to my mind was a completely intolerable risk of lock-in, since a major vendor would likely support their own stuff with a VIM and then use some of their killer apps to pull through a VIM solution that wouldn’t work with anyone else.  So we have a big step toward openness at the NFV Infrastructure level.

What all this adds up to in my view is that Telefonica is now looking to OSM to create a new framework for VNF deployment, and because OSM isn’t quite ready to do that and the integration tasks associated with it can’t be defined precisely, they’re holding off on the “integration bid”.  What Ericsson won wasn’t what HPE lost.  Nobody has won that yet, and nobody will likely win it ever, in its original form.  We’re going to see a new NFV model, defined by OSM, at Telefonica.

This doesn’t mean that Ericsson didn’t win anything or HPE lose anything.  Ericsson will have a big NFVI deal in Germany, and Intel will have a bigger one.  Remember my comment in a prior blog that Intel was trying to make sure that we got enough NFV data centers to boost their chip sales?  Well, the Ericsson system is based on Intel’s disaggregated cloud-scale-and-hyperscale architecture.  HPE is losing what might have been a nice early deal, though some people tell me that HPE actually still has a shot to provide servers based on that same Intel architecture (which HPE offers) to Telefonica in Germany.

All the vendors lose one common thing, which is the opportunity to frame NFV at the high level.  The OSM project is the big competitor and the big winner, providing that it can actually deliver something.  I’ve noted the issues I have with the architecture as announced; I think these will have to be resolved quickly because both HPE and Huawei have functional NFV implementations that cover all the space OSM proposes to cover, and more.

In summary and at the highest level, Telefonica is going to step beyond ETSI NFV to attempt to establish an open model for NFV that can make a broad business case.  There is no value in integration for any piece of NFV if it can’t make the business case overall, and thus won’t deploy except in specialized applications.  There’s also no way, in my view, to integrate even within a strong business case unless you adopt a service-model approach like the one I described in my blog yesterday.  OSM would be bigger news if I could be confident that they’re taking all the steps needed and that progress will be timely, but I can’t say that will be the case based on what’s released.  There are still questions, ones I’ve raised in recent blogs on service models and service orchestration.

We have other open-source activities associated with NFV, from OpenNFV (which I think is still stuck in the early stages of defining the software project and hasn’t gotten around to much at the functional level) to OPEN-O, which sort-of-competes with Telefonica’s OSM.  The media has made a lot of the competition here, but the real competition—between a strict ISG model of NFV and a broader model that can make much more of a business case—hasn’t gotten much attention.  I think now that it will.

Modeling for Next-Gen Services: Why You Should Care a Lot

One of the terms you hear more and more in networking is model.  We’ve already had a host of MWC announcements that include it (one of them I’ll blog about specifically tomorrow), and the concept of a model is pivotal in NFV and many cloud management systems.  If you read network rags, then you know that models are probably central to your future.  The question is “what kind of model?” and that question is harder to answer because there are many model types and issues.

The term “model” by itself is too general to be useful; we have network models and model airplanes, after all.  A model is a representation of something, but exactly what it represents and how it might be used has to be conveyed in a qualifying term.

An “information model” is used primarily to describe the relationship between elements, and some people would say that it’s a high-level, conceptual, or abstract model.  A “data model” is the detailed description of something, a set of elements that are contextualized by the information model.  It’s not uncommon to combine the two in a single definition; the TMF’s SID is a “Shared Information/Data” model.  In my view, this sort of model is primarily about the abstract relationship among data elements and an inventory of elements used/needed.

Which means both these terms are probably most useful in describing databases, which is where most of them have come from.  Software creates another set of model issues, related to software structuring techniques that have evolved since the ‘80s.  There are a number of competing technologies and definitions here, so let me try to use some fairly descriptive and general ones.  A piece of software can be called a “component”.  Components are pieces of an application, and in modern software practices components are “objects” (hence the term “object-oriented programming”).  An object is something that presents a specific interface and provides specific features.

The purpose of objectivizing software is to make it easy to change.  If you have an object that presents an interface and features and you change the interior code without changing those interfaces/features, then you have not changed the object because its external properties are the same.  Everything that used it before can use it now.  You can see how valuable this capability would be in software development.  The object is sometimes called a “black box” model because you can see its properties as a box, but not its contents.

If you looked at a collection of objects and the data associated with them, you’d see that the data falls into one of two categories.  Some data is represented as a variable in the interface description of one or more objects.  Other data exists “inside” an object or objects, and is therefore invisible.  You can also see that while it might be useful to collect all the externalized data and represent it as a tabular model or document, that doesn’t do much to describe how the structure of objects would work.  It would be useless to collect knowledge of the internalized data because it’s invisible.
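
A tiny Python sketch of the black-box idea may help (the class and names are purely illustrative): the interface is the externalized data, the internals stay invisible, and the interior code can change without “changing the object.”

# Illustration only: connect() and its parameters are the externalized data;
# _routing_table is internal and invisible to everything outside the object.
class VpnServiceObject:
    """Black-box object: callers see the interface, never the implementation."""

    def __init__(self):
        self._routing_table = {}   # internal data, never exposed

    def connect(self, site_a: str, site_b: str, bandwidth_mbps: int) -> str:
        # The interior code could be MPLS, overlay tunnels, or NFV-hosted
        # routers; as long as this interface holds, callers are unaffected.
        self._routing_table[(site_a, site_b)] = bandwidth_mbps
        return f"{site_a}<->{site_b} at {bandwidth_mbps} Mbps"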

Some internal data becomes externalized, of course.  An object could create a data element for the purpose of interfacing to another object.  In this case, the “new” external data is related to the interface data of the object that did the creating.  So there’s a chain of relationships that links objects through data, a flow of information.  This structure can be visualized in abstract, but it can also serve as a map to how software itself is organized, and how it works.

This picture is really important in understanding service modeling in the modern era.  A good service model should be translatable into a software implementation, so it should likely support software object principles.  This means that a service model is made up of “intent models”, called that because the objects in an intent model are known by what they do, not how they do it.  A good service model should also ensure that the chain of information/relationships maintains object independence.

Suppose I have a top-level object called “service”, and suppose this object has a variable called “hosted-on”.  This seemingly logical connection is really a problem because not all services are hosted on anything, and if you take something that’s a “deep” object variable and propagate it to the top of a model, you make the model “brittle”, meaning that changing something that shouldn’t matter at the top breaks the description there.  This is why it’s a bad idea to describe a service as a flat data/information model—you have to “know” everything at once and you break the inherent property of abstraction and black-box-ism that you need.  Every object in a service model has to be able to get what it needs.  That doesn’t mean it gets it from adjacent objects, or from a single order or descriptor file.

This all adds up to why good service models are hierarchies of objects.  An object can be “decomposed” in a good model into lower-level objects or (at the very bottom) specific resources.  A VPN might be a service object at the top, a “core” VPN object in the middle, and a set of access objects at the edge.  The hierarchy would be service-to-both-core-and-access, a two-level structure.  If you implemented the core VPN with NFV, you might have a series of objects representing the VNFs, locked safely inside the VPN object.  That way you could order a VPN based on MPLS or on NFV, and as long as you provided the externally necessary parameters you’d be able to use the same structure and let the object itself decide how to break down, depending on location, policy, etc.
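
Here’s a minimal sketch of that VPN hierarchy as nested intent-model objects, written as a plain Python structure; the field names are illustrative and not drawn from any particular modeling language.

# Illustration only: the service decomposes into core and access objects,
# and how the core decomposes (MPLS versus NFV-hosted VNFs) stays hidden
# inside the core object, preserving the black-box property.
vpn_service = {
    "object": "VPN-Service",
    "parameters": {"sites": 12, "sla": "gold"},
    "children": [
        {
            "object": "VPN-Core",
            "decompositions": [   # chosen by location/policy at deploy time
                {"type": "MPLS", "children": []},
                {"type": "NFV", "children": [
                    {"object": "vRouter-VNF"},
                    {"object": "vFirewall-VNF"},
                ]},
            ],
        },
        {"object": "VPN-Access", "per_site": True},
    ],
}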

That introduces another important point, which is that each object in a service model has a state that represents where it is in the lifecycle process, and a series of events that impact its operation.  You can signal among adjacent objects with events, but never further because that would violate the black-box rule—you don’t know what’s inside a given object so how can you send a message to something inside one?

What you see in this real service model isn’t just a list of information and parameters; it’s a live system that exchanges information.  In effect, it’s a program.  That’s good because if we want services to be managed by software processes then our models of services had darn well better look like software.  I did a couple of open projects to prove out the approach, and all these principles work and can result in an agile service model.  There are many paths to such a model, but IMHO it’s clear that we do have to take one if we’re to get operations, next-gen services, SDN, and NFV on track.

Where are we with the right model?  Well, you could make the TMF SID work but I don’t think it’s the optimal choice because it doesn’t enforce hierarchy and modularity as much as allow it.  ETSI’s new MANO group promises to have “modeling” but there’s no example of the models on their website yet so I can’t say whether they have the right stuff.  The six vendors who can make an NFV business case all have a service modeling approach that’s generally conformant, but I have deep detail on only three (ADVA from their Overture deal, Ciena from their Cyan Blue Planet deal, and HPE) so I can’t be completely confident the others (Huawei, Nokia, and Oracle) support all the key nuances, though they do support the broad principles.

In the NFV world, we now have two competing sources of orchestration with Open Source MANO and OPEN-O, each sponsored by some major network operators and each at least loosely linked to the ETSI ISG NFV work.  OSM makes a lot of references to models and modeling, which OPEN-O doesn’t, but in my view there is simply no way to do NFV if you don’t have a strong service modeling strategy.  Tomorrow I’ll look at what OSM offers to see if 1) we can see enough to draw conclusions and 2) whether OSM will move the modeling ball forward.  Eventually I think it will, but we remain locked in the question of time.  Operators will need to make changes this year on a large enough scale to promise real cost/revenue differences for 2017.  NFV or no, only modeling is going to get us on track for that.

ADVA and DartPoints Define a Realistic Model for Distributed NFV Data Centers

When ADVA bought Overture Networks, one of the six companies that I believe have a full-spectrum NFV solution capable of making a business case, I was critical of the fact that they seemed to be positioning Overture as a limited extension to the carrier Ethernet business.  Now they’ve made their positioning clearer with an announcement that DartPoints, a supplier of on-premises micro-datacenters, has selected the ADVA Ensemble NFV products as a key element in their service plans.  This goes a long way toward establishing ADVA’s commitment to NFV, and it makes an important point about NFV too.

One of the most difficult challenges that NFV, in its vCPE guise, faces is getting out of the CPE edge device and into the cloud.  The business case for deploying agile CPE is only slightly related to that of NFV, and the technology for NFV is overkill if all your VNFs are going to do is squat in the customer edge device and wait to be changed around.  NFV technology is valuable when VNFs have to be more spread around.

DartPoints is an example of a logical pathway to expand the number of places where VNFs could be hosted.  Instead of requiring operators to jump from the customer premises directly to a cloud data center somewhere in a metro area, DartPoints provides a facility micro-datacenter that’s located in multi-tenant facilities and can host, securely, the virtual functions for a group of tenants.  It’s a very logical idea given that 1) many operators already run fiber to major office buildings and campus locations to provide service to the tenants and 2) most multi-site businesses have at least some, if not all, of their operations in a multi-tenant location.

The application is based on a hardened ADVA server that hosts the functions, the Ensemble Connector software instance that represents each tenant and provides a common deployment and management platform for VNFs, and the Ensemble Orchestrator for deployment and lifecycle management.  Because the hardware/server from ADVA is multi-tenant, the cost is shared across the users of the micro-datacenter, and that makes the VNFs more attractive.

It may be that the most interesting thing about the micro-datacenter concept is the fact that it’s a jump to an edge-distributed multi-tenant cloud for VNF hosting.  One of the problems I’ve identified with NFV progression from the virtual CPE model is that the logical next step is to start building a central resource pool to offload functions from the edge.  That’s obviously the next move from a financial standpoint, but it creates long data paths and it also creates the risk that all your VNFs are now stranded in the center of each metro area, when many NFV applications (mobile services, content delivery, and IoT to name a few) are better served if you host them at the edge.

Could micro-datacenters, located in multi-tenant facilities, help solve this?  For business services, they certainly could, but I think they could help for other services as well.  Operators could build on the deployments of micro-datacenters used to support building tenants to support nearby cell sites, deploy content caches, and do a lot of other critical stuff.  The result could be an NFV deployment that puts the power of VNFs where it has to be to support the largest number of valuable services.

Even if operators don’t see the benefits of micro-datacenters right away, what’s to prevent somebody like DartPoints from taking the next step down the line?  The DartPoints model might be the most critical single thing that’s come along for NFV because it might point to a way for OTT-like deployment of NFV, a way to create multi-service edge hosting points that anyone might then take advantage of in a service sense.  It’s especially interesting since shopping malls and food courts are multi-tenant facilities and could easily justify a micro-datacenter.  From there, the concept could expand both in terms of the number of facilities and the number of supported services.

For ADVA, the critical thing here is that this is a real NFV application and not just VNF-squatting.  We have a very small number of NFV services today that can credibly claim to exercise a large portion of even the functionally limited ETSI specifications.  Here’s one that can exercise all the components needed to make an NFV business case, and by doing so it proves that a full-scale NFV business case can be made, at least in a technical sense.

I don’t know if we’re going to hear much about the mobile and content implications of this announcement even though we’re in the midst of MWC, but I think that the story could be really interesting to MVNOs as well as to the prime mobile operators.  There is nothing in the mobile application of NFV more important than getting those hosting points out there toward the network edge, because without that you have to compromise things like the delivery of content, the utility of virtual RAN strategies, or the agility of IoT-related control of facilities.  All these things need a short delivery path.

I had a chance to chat with ADVA’s business lead on the Ensemble unit, where Overture went within ADVA, and there seems to be a commitment to pursue Ensemble NFV in the most aggressive way.  This is a darn good start because it leads to so many follow-on services and applications, and because it showcases the broad orchestration and management support that Ensemble has had all along.  ADVA might take steps now that would start to separate the NFV pretenders from the players who can really do something.

It doesn’t hurt that ADVA is primarily a fiber player either.  Metro infrastructure for mobile services is mostly fiber deployment of capacity combined with hosted IMS/EPC/RAN elements.  The DartPoints model demonstrates that this can be done by taking advantage of convenient multi-tenant facilities, many of which, by the way, already have mobile antenna systems on their roofs.

I can’t say for sure that ADVA will be as aggressive as they could be with this, but I do think that an emerging model of NFV deployment that takes real and useful steps toward the optimum model of NFV data centers could be very powerful.  I’ll be watching how this plays out.

Can We Achieve Universal Service/Resource Orchestration in the Real World?

In many of my past blogs I’ve talked about the question of operations transformation, and proposed that it be considered co-equal to network infrastructure transformation.  I’ve also noted that most network operators are weighing the question of how to go about operations transformation.  Perhaps because of this (or perhaps because operator views are driven by vendor offerings) vendors are also thinking about the way to get operations into the new age.  What exactly does that mean?  That’s what I’d like to consider here.

Operations systems have always been the top end of the operator business, the part that faces the customer.  This is the “business support system” or BSS part of the picture today, and it’s been largely responsible for things like billing and accounting.  In the past, the customer-facing side of the process, relating largely to orders and order status, was pushed down into the provisioning of services by providing support for human (“craft”, as they say in the operator world) processes.  These were the operations support systems, OSS.  Everything was happy until IP convergence created some cracks in this process, for two reasons.

First, IP services (like all packet services) are non-deterministic and thus require their own management processes (fault, configuration, accounting, performance, and security or FCAPS) to sustain their operation.  These technical network processes were easier to deploy outside of OSS/BSS, and this created a kind of network adapter plugin notion to allow OSS and the NMS/FCAPS processes to coordinate their behavior to suit user needs.

Second, IP ushered in the packet age and multiplied the number of services that could be provided, and the number of functional components inside each.  This encouraged both operators and OSS/BSS vendors to create service-specific higher-layer features, often paralleling some of the OSS/BSS elements and almost always overlapping each other in terms of functions and features.  One of the technical challenges that came out of this IP convergence period was a collision in the basic model of “provisioning”.

In the old TDM days, you provisioned a service by performing a bunch of largely manual tasks that could include running new access connections, installing CPE, and so forth.  These processes were undertaken in a nice orderly flow, a linear progression of steps.  If something broke, you had a tech fix it.  Shared tenancy was non-interfering and networks didn’t take their own steps to fix things.  Again, it was an easy linear flow to imagine.

In the IP world, services are more often coerced from in-place resources, and while the setup process could still be visualized as a flow of steps, the rest of service lifecycle management didn’t fit that model.  Self-healing adaptive networks do all kinds of things on their own and report issues (either ones that they’ve fixed but with some loss of service continuity or ones they could not fix) to the higher layer.  This is an “event model”, and it’s difficult to fit random asynchronous events into a nice linear flow.  This is what gave rise to the drive to make OSS/BSS “event-driven”.

SDN and NFV take things even further because they exacerbate two primary issues that IP introduced.  First, network infrastructure became even smarter and more autonomous than before, with all kinds of lifecycle management processes built in.  Many of these processes were intended to provide service assurance, something that had often been viewed as an OSS function, and many required changes to customer billing and the like, which was always a BSS activity.  This raised the question of whether “services” and “resources” had such a flexible relationship that it would be impossible to create a fixed link between network management and service management even through an adapter.  Does the orchestration and management of resources then have to rise up somehow to be visible to OSS/BSS, or does OSS/BSS somehow have to be orchestrated and managed by resource-oriented processes like NFV MANO?

The easiest way to frame the results of all these changes is to postulate the difference between a traditional operations flow-driven structure and what someone would likely come up with today if there were no incumbent technology to worry about.

Today’s system could be likened to a service-bus workflow, where a work item like an order or a change moves along a pathway from a determined starting point to a determined completion point.  Along the way it would encounter data dropped off by asynchronous tasks, and based on this data it might pause or change course.  This sort of system is used routinely in transaction processing for enterprises, but there it faces a simpler set of asynchronous tasks and there are fewer requirements to create new rules for new services or new market conditions.

A “modern” system would look more like a microservice set that’s coupled to a service data model.  The data model, by providing a set of state/event relationships, would associate processes (which could be network/resource-linked or service-linked) with a specific lifecycle state and a specific event within it.  If you get a CHANGE event in the DEPLOYING state, you change resources but perhaps make no billing adjustment because you aren’t billing yet.  In the OPERATING state you’d have to change resources and presumably also change the billing.
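
A minimal sketch of that state/event coupling, with placeholder process names of my own, might look like this:

# Illustration only: each (state, event) pair names the processes to run.
# A CHANGE in DEPLOYING touches only resources; in OPERATING it also
# triggers a billing adjustment, exactly as described above.
STATE_EVENT_TABLE = {
    ("DEPLOYING", "CHANGE"): ["adjust_resources"],
    ("OPERATING", "CHANGE"): ["adjust_resources", "adjust_billing"],
    ("OPERATING", "FAULT"): ["redeploy_component", "open_trouble_ticket"],
}

def processes_for(model_object: dict, event: str) -> list:
    """Look up the processes to dispatch for this object's current state."""
    return STATE_EVENT_TABLE.get((model_object["state"], event), [])

print(processes_for({"state": "DEPLOYING"}, "CHANGE"))   # ['adjust_resources']
print(processes_for({"state": "OPERATING"}, "CHANGE"))   # ['adjust_resources', 'adjust_billing']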

It sounds like we’re talking about oil and water here, but the differences aren’t as irreconcilable as they’d appear.  The majority of OSS/BSS systems have evolved over time to be highly “componentized” meaning that their functionality has been divided into logical components rather than composed into one big monolithic application.  Workflow/service bus systems use components too (though in a different way) and the TMF proposed to do event coupling to processes (componentized processes) via the same service-oriented architecture (SOA) now used by many enterprise workflow/service-bus transaction processing systems.  It wouldn’t be impossible to simply “compose” the event-to-process linkage using a state-event table in a data model without changing current operations systems much.

Why then hasn’t it been done?  Operators think that OSS/BSS vendors resist the notion of composed event-to-process coupling because it would allow buyers to shop for individual processes instead of entire OSS/BSS systems.  “It’s the classical opposition to best-of-breed or point competition,” one operator told me, meaning OSS/BSS vendors fear that competitors would eat off little pieces of their business when the competitor wouldn’t be credible to provide a total solution.  Others say that it’s impossible to map out who supports such a composed configuration, making operators rely on third-party integrators.

There are other issues to be addressed here too.  Many operations systems today have evolved service silos to a whole new level, to the point that they duplicate processes and even data elements.  If the components of a nice new compositional operations system all expect their own databases to be in a certain state when they run, then the integration complexity explodes, and if different vendors provide the components of such a system, all bets are off.  It’s not that this problem couldn’t be solved too, but that it’s hard to see who has the incentive to solve it.

Well, it was hard until some recent trends offered hope.  Two things might break the logjam here.  One is NFV orchestration and the other is open-source.  And yes, the two might even combine.

NFV orchestration could in the right hands generate a model-driven service architecture with event-to-process component coupling, just to make NFV lifecycles work out at the technical level.  If this framework were suitable for use by operations processes too, then operators could build operations processes into the framework on their own, using components supplied by third-party vendors or (you guessed it!) open source.

It’s also possible that NFV vendors who don’t have a horse in the legacy OSS/BSS race could use a microservice-based, model-driven, service/resource lifecycle process to gain traction.  These days, progress in infrastructure tends to be made not as much by startups as by non-aligned major vendors creeping in from other spaces.  NFV, promising as it does a shift from network appliances to software and hosting, is a perfect opportunity for that sort of thing.  I believe that all of the six vendors who can currently make a full NFV business case could support a completely orchestration-integrated operations/resources model.  Of the six, four have no entrenched position in OSS/BSS and the remaining two are not really OSS/BSS incumbents.

Open source is a way for operators to get functional pieces of operations software without vendors cooperating to make it happen.  If we had, for example, an entire OSS/BSS set up as a set of microservices, you could see how operators would be able to compose a lot of their future service, network, and SDN/NFV operations software from that inventory.  It’s not that farfetched either.  Sun Microsystems launched just such a project, which passed to Oracle and eventually became known as OSS for Java or OSS/J.  The project was taken over by the TMF, and it’s not made a lot of progress there, but the concepts and even a basic structure are available to members.

If the opposite of “hard” is “easy” then neither of these options really moves the ball.  But we should recognize that the real border isn’t the “hard/easy” boundary; it’s the boundary at which something that was highly improbable becomes likely.  It’s opportunity and need that create the pressure to test that boundary.

We have both aplenty.  Next-gen consumer and business services could add over a trillion dollars to somebody’s coffers.  Over half the current network equipment budget could shift to IT over time, and all the differentiation could be sucked out of Levels 2 and 3.  Every OSS/BSS could be rendered obsolete, along with all the NMS/EMS tools.  The new services that might come along could form a bridge between the “pure” OTT model of ad-sponsored experiences or high-price-pressure video and the traditional telco model, a bridge that Google could try to cross in one direction as operators try to cross in the other.

We are going to have virtualization in infrastructure, you can bet on it.  We’re going to have winners and losers, and you can bet on that too.  All we’re doing now is sorting out the details of both.

Mobile Content and the Drive to SDN and NFV

If mobile infrastructure is the target of choice for any aspiring new network technology, then we have to ask why that is in order to decide how new technologies would have to address the future.  Everyone knows the answer at a high level—video streaming to mobile devices is the driver of mobile change.  It’s not as clear just where exactly mobile is being driven.  Since I talked about the general importance of mobile and 5G to SDN and NFV only yesterday, today’s a good day to weave video into the mix.

It’s fashionable to say that OTT streaming is changing video, but the facts are more complicated.  While people viewing at home do consume more OTT video than before, they haven’t changed their TV viewing that much.  What has changed is that smartphones and tablets with cellular or WiFi service allow people to view video when they’re not at home.  And even if this form of viewing isn’t threatening the home-TV model, it’s threatening the advertising dollars that fund it.

As long as we’re on the topic of fashionable speech, we should add that it’s fashionable to say that this is about “Internet” delivery of video, and that’s also a simplification.  Users may access video on the Internet, but most video is delivered through a parallel metro infrastructure that is actually outside the Internet in a technical sense, and even in many areas in a regulatory sense.  The notion that we have to make the Internet faster to support video isn’t supported by facts.  We have to make access as fast as the combined video usage of customers in the area in question (a mobile cell, a central office) and then we have to push video in distributable form closer to the access edge.  That’s what content delivery networks have been about for ages.

The basic notion of CDNs is caching content close to the viewer to reduce the network travel and capacity needed.  It would seem impossible to store a million videos at every edge point, but it’s not necessary to do that.  Videos aren’t viewed at the same rate; there are popular fads and far-fringe content elements.  You could argue that some people would like to have a bit of both, but of course the relevant question isn’t how much viewers want something but how much someone will pay for those viewers to have the opportunity.  That means advertising or on-demand pay-per-view, and those patterns of viewing are predictable.  Thus, you can make caching work.
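
To see why skewed viewing makes caching pay, here’s a back-of-the-envelope sketch that assumes a Zipf-like popularity curve (my assumption purely for illustration; the point in the text is only that popularity is very uneven and predictable).

# Back-of-the-envelope: with Zipf-distributed popularity, an edge cache
# holding only the most popular titles serves a large share of requests.
def zipf_hit_ratio(catalog_size: int, cache_size: int, s: float = 1.0) -> float:
    """Fraction of requests served from a cache holding the top titles."""
    weights = [1.0 / (rank ** s) for rank in range(1, catalog_size + 1)]
    return sum(weights[:cache_size]) / sum(weights)

# Caching 1% of a 1,000,000-title catalog:
print(f"{zipf_hit_ratio(1_000_000, 10_000):.0%} of requests served from the cache")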

Operators know this, and as video popularity increased they went to a strategy called “forward caching”, which pushed cache points closer to the edge.  One of the fundamental questions in mobile network design, for 5G in particular, is just how far “forward” can really be.  We know every cell site can’t have a full video library, but what can be done?

The big challenge in mobile caching is the fact that mobility management is handled through the Evolved Packet Core (EPC) specification, which calls for the use of a tunnel between a packet (Internet) gateway and a serving gateway to deliver packets with a fixed address (the user) to a variable cell site.  Classic CDN/mobile design would define “forward” caching as caching adjacent to the PGW, because that’s where content is expected to originate.  The problem is that as you increase video consumption you increase the value of (and need for) caching even further forward.

Logically, video caching policy is based not on “sites”, meaning cells, but on typical subscriber count.  That’s based on the user population of a given area, so in metro areas with a lot of population you could expect to justify caching easily.  Where?  The smart approach would be to see how cell sites were clustered and how easily fiber could be run to each, from various points.  You could draw out an optimum metro map by looking for the shortest total weighted cost of fiber, considering both distance and the cost of laying the glass.  This would probably set a number of optimum cache points.

This structure, set by video, should then probably frame how we look at mobile delivery of everything, meaning the EPC.  As I said above, cache points for video, if near the edge, would be “inside” the normally mapped location for a PGW, which is where EPC traffic is expected to originate.  Thus, you have three choices.  The first, obvious and unattractive, is to forget forward caching beyond the PGW.  Second, move the PGW forward, which can be done only by duplicating it or making it a kind of virtual hierarchical device.  Third, rethink the whole notion of how you address content from mobile devices.

With virtualization, you could diddle with the mobile structure a little or a lot.  On the “little” side, you could make the cache-centered cluster of cells into the PGW and the SGW.  You’d then feed Internet to each of these points and let mobility management simply aim the cache delivery at the right cell within the cluster.  On the “lot” side, you could construct a virtual address space within the cache site, and let all Internet requests go there, where they’d either be passed upstream to the real PGW or resolved to a “local” host.

This latter approach might be interesting if you look at the way that NFV, contextual services, and cloud computing could be added to the mix.  The cache points are natural places to locate a data center, to provide VNF and cloud hosting for both “network” services and application services.  It would be a perfect place to forward-place IoT processing assets, to shorten the control loop.

It’s not completely clear that all the “virtual EPC” approaches now emerging are tightly integrated with CDN, or which (if any) of these options for forward cache placement they might support.  More significantly, perhaps, it’s not clear whether anyone is proposing to use SDN’s explicit forwarding to replace the tunnel-driven approach of classic EPC.  You could, using OpenFlow, simply tell a switch to forward a user’s packet to a given cell.  If mobility management were coupled to an SDN controller you could eliminate the whole tunnel thing and simply control the forwarding switch.
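
A hypothetical sketch of that idea: the controller keeps a forwarding rule per user address and rewrites it on handover, instead of re-anchoring a tunnel.  The class and rule format are my own illustration, not a specific OpenFlow library or EPC interface.

# Illustration only: traffic for a user's fixed IP is forwarded toward
# whatever cell currently serves that user; a handover becomes a single
# rule update rather than a GTP tunnel move.
class MobilityAwareController:
    def __init__(self):
        self.flow_table = {}   # user_ip -> egress port toward serving cell

    def attach(self, user_ip: str, cell_port: int) -> None:
        self.flow_table[user_ip] = cell_port

    def handover(self, user_ip: str, new_cell_port: int) -> None:
        # A real controller would push an OpenFlow flow-mod to the switch;
        # here we just update the abstract forwarding table.
        self.flow_table[user_ip] = new_cell_port

ctrl = MobilityAwareController()
ctrl.attach("10.20.30.40", cell_port=7)
ctrl.handover("10.20.30.40", new_cell_port=9)   # user moved to another cell
print(ctrl.flow_table)                          # {'10.20.30.40': 9}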

This would also let you converge multiple forwarding sources on the same set of cells, which means that a cache could be quite far forward and still send packets to the correct cell to reach the user who’d requested the content.  This sort of thing could revolutionize the way we do mobile infrastructure, so much so that it would justify a pretty substantial refresh.  That, in turn, could be a major driver for SDN.

For NFV, the neat question is the placement of these switches and the distribution of the control logic (both SDN controllers and mobility management elements) within a metro area.  If all the cache points were mini-clouds, then you could move these functions around to accommodate both user location changes (en masse) and content viewing patterns.

So we have, with content, another potential driver for SDN and NFV, but only providing that we rethink the mobility management process and EPC almost completely.  Here, as in many places in the network, the value of the future is limited by the inertia of the past.  But with mobile services, we have enough push away from that past to give the future a good chance.

Might 5G Be NFV’s “Killer App”?

One of the lessons of the current earnings cycle is that if you’re a network operator you probably see the profitable side of your future in almost purely mobile terms.  For the last ten years, mobile revenues have been strong while wireline has been under pressure.  Mobile infrastructure has benefitted from investment priority during that period, for the obvious reason that “I” follows “R”.  Thus, it’s not surprising that we hear a lot about mobile in general, and 5G in particular.

Hype around 5G is inevitable, in no small part because the media loves these linear generational advances—they’re so easy to write about and there’s usually so much promise in each step.  How much better 5G will be than 4G is harder to say given that many didn’t see all that much difference between 4G and 3G.  Fortunately I’m not proposing to talk about the improvement nuances, just the technology impacts and opportunities.

As I noted in a past blog, the easiest place to apply new technology is where you’re redoing things anyway.  5G doesn’t directly pull through things like SDN or NFV but it does provide a fertile growth medium for them—money.  If 5G changes network traffic or configuration in a significant way, then there’s a decent chance to rethink how networks are built and take advantage of the new ideas during the reinvestment.

For both SDN and NFV, in fact, mobile services offer the best path toward a revolutionary rate of adoption.  A massive 5G rollout (and Ericsson has almost two-dozen early-stage 5G-related deals already) would let either SDN or NFV achieve critical mass without any other drivers at all.  That has implications for the vendor space, obviously, because those vendors with natural positions in the 5G space would be better able to gain traction.

Nokia might be the poster-child for being in the right place at the right time.  The combined Nokia/Alcatel-Lucent entity is strongest in the mobile area.  Nokia got one of the six NFV solutions that could make a broad business case when it acquired Alcatel-Lucent.  It also got the best overall SDN product, and so if we presume a fairly thorough new-technology-driven remake of mobile infrastructure comes out of 5G, Nokia has a great opportunity to use that remake to advance itself to leadership in NFV and in operator use of SDN.

The fly in this ointment of sublime happiness (if you’re Nokia, at any rate) is the fact that the Alcatel and Lucent parts of Alcatel-Lucent never really came together right, and adding Nokia into the mix probably didn’t grease any of the old pathways to cooperation.  There’s also the question of whether Nokia is prepared to be aggressive in promoting next-gen architectures that could very well compete with switching and routing, the two pieces of Alcatel-Lucent that were making the most money.

Nokia, Ericsson, Huawei, Fujitsu, and other major mobile-infrastructure vendors provide some or all of the things needed for mobile infrastructure virtualization.  Obviously, having experience in the space and contact with the buyers gives these vendors some advantage, but not necessarily a decisive one.

At least some vendors think the 5G shift opens the door wider than just a single vendor.  Brocade, for example, is looking at 5G evolution as an opportunity to promote the vEPC strategy it gained from Connectem.  It also hopes, I think, to find a place for virtual routing.  Metaswitch's Project Clearwater has long provided a virtualized IMS.  This week, ASOCS is demonstrating their cloud RAN at MWC, and Juniper and Affirmed are partnering to address the mobile infrastructure opportunity.

So far among vendors not already selling legacy mobile infrastructure, Brocade seems to be making the most direct play for a seat at the 5G table.  They're positioning themselves as a non-aligned solution, meaning that they don't drag in a bunch of RAN and other infrastructure elements when the deal should really be about EPC.  They offer all of the features you'd expect from vEPC, including separation of the data and control planes, horizontal scaling of components, agile deployment to locations where traffic patterns make sense, and so forth.

A fully virtualized 5G infrastructure makes a lot of sense, particularly if the operators in a given market area are under pressure to support roaming at little or no premium or if the operator has aspirations to support MVNOs or IoT.  It would also make sense in mixed mobile/CDN applications, in my view.  Given all these positive things, it’s tempting to see 5G mobile infrastructure as the Big Idea that carries through both SDN and NFV, and it may.  Or not.

It seems almost inevitable that cloud hosting of virtualized mobile-infrastructure components will in fact be part of 5G deployment, but while the cloud is a clear winner, SDN and NFV are more problematic.  The key vendors all have reasons not to make their offerings too dependent on either of these new technologies.  Do you want to force operators to trash current switching/routing?  Even if you don't make the gear, the additional cost of writing it down will hurt your business case.  Do you want to demand NFV as the means of deploying and scaling when, truth be told, all the vendors of virtual mobile infrastructure can deploy without it?

This isn’t just a challenge for SDN or NFV proponents to face.  While you might not need either SDN or NFV to build 5G mobile infrastructure, you’ll darn sure reach a point where you wish you had a good implementation of them both.  This is a classic case of having to balance what you need in the future with what you have to displace or risk in the present.  If we make 5G networks too much like 4G, we’re all too likely to end up with a different RAN and not much different elsewhere, including the services to the users.

If 5G infrastructure is the key opportunity then it may promote more, and more constructive, populism in the NFV and SDN spaces.  Nobody thinks that mobile infrastructure is a single function, and yet most providers of SDN or NFV don’t have full-spectrum mobile stories.  That’s why Juniper aligned with Affirmed, and why Brocade has its own partner program built around its mobile story.  The challenge for these smaller players is that we’ve had NFV partnerships from the first and most of them are just a collection of vendors chanting “NFV!” in the direction of the nearest reporter.  Substantive partnership may be needed to provide a full solution, and that partnership may have to be built around a critical-mass central vendor to provide credibility.

MWC always generates a lot of mobile buzz, and so you could argue that this 5G stuff will pass and something else will end up leading the SDN/NFV charge in a couple weeks.  I don’t think so; remember that I’ve said for a year now that mobile infrastructure was one of the few credible paths to a full NFV deployment.  This is where the bucks will go, and so this is where changes in how they’re spent will be easiest to justify.  What we should be asking now is whether vendors who don’t have mobile assets to position but do have complete NFV solutions will have to think more about how they fit into the mobile deployment of the future, or risk being devalued.

Is NFV’s Slow Progress a Buyer Problem, or a Seller Problem?

I got up this morning and looked in the mirror and found that I wasn’t Bill Gates.  I checked my bank balance and investments and I wasn’t a billionaire.  There was no shiny new Corvette in my garage.  Here’s the question.  Does this mean that we’ve entered into an era of retail disillusionment, or does it mean that I just didn’t manage my own choices optimally?

The reason behind this opening is that I had an email from an NFV sales type (I get a lot of those).  The argument presented boiled down to the statement that we’d entered into the phase of NFV disillusionment, and the underlying problem was that the buyer insisted on making a business case when they weren’t really “businesses” in the classical sense.

I’ve trashed some sacred cows of next-gen networking and in particular SDN and NFV.  It’s not that I’m trying to ruin the technologies or the people who are trying to support them.  I’m trying to save them from themselves.  Sales, to be successful, cannot consist of convincing the buyer that they’re delusional because they want to know how they’ll financially justify the thing you want them to do.  NFV, or SDN, or the cloud, or anything else out there, is not going to be successful because we trick people into adopting it by exaggerating its benefits and hiding its faults.

For my friend in NFV sales, the truth is that operators are businesses and that they have the same financial demands on them that the NFV salesperson’s own company has.  They have to face the Street every quarter, and they have to make decisions that support the interests of their shareholders, because those are the people a company is directly accountable to, not the customer base or the industry or the economy.  So let’s get off the silliness here and focus on doing what we have to do to prove that network change to NFV or SDN or the cloud is actually a good idea.

If we look at the high level, the business case for NFV is going to be made through insightful application of software-driven automation of service and infrastructure processes.  The notion that operations costs can be reduced without radically reducing human interaction is baseless; that’s where most opex costs are.  So NFV isn’t about virtual functions or process hosting or any of that other stuff; those are just features.  It’s about software automation or it’s not going to get adopted.

Software automation comes in two flavors—lifecycle management of the services and lifecycle management of the service resources.  The reason these two are called out separately is that the former has historically involved managing the financial and contractual relationship between operator and customer, and the latter the management of the technology elements that are committed to fulfill that relationship functionally.  Today, we’d say the service lifecycle management stuff is an OSS/BSS function and the resource management stuff is an NMS function.

The mistake that my NFV sales friend makes, and that most NFV companies make, is that they think in terms of resources only—because they sell them.  NFV, they think, is a revolutionary way of building new services.  We’re in an age of revolutions—the Internet and the iPhone and so forth.  The operators are still in neutral while Apple and Google have the pedal to the metal.  Get with it (and buy my stuff).

Resource management accounts for only a fraction of operations costs, about one-sixth to be exact, across all carrier classes as of 2016.  The fundamental truth about NFV, or SDN, or anything else we want to propose as a means of transforming operators' business models, is that there's not enough on the table to make them transformative if you nail your efforts to the ground in the wasteland of network operations benefits or capital cost reductions.  In fact, I could make a transformation business case at the service lifecycle management level, without changing a dollar's worth of network hardware, more easily than by making changes to the network.
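
To see why that ceiling matters, run the arithmetic with some purely hypothetical numbers; the figures below are chosen only to illustrate the point, not drawn from my surveys or model.

```python
# Purely illustrative arithmetic; the cents-per-revenue-dollar figures are
# assumptions for this sketch, not survey or model results.

opex_per_revenue_dollar = 0.30        # assumed total operations cost per revenue dollar
resource_mgmt_share     = 1.0 / 6.0   # the "about one-sixth" cited above
resource_mgmt_cost      = opex_per_revenue_dollar * resource_mgmt_share

savings_if_40pct_cut = resource_mgmt_cost * 0.40
print(f"Addressable resource-management opex: {resource_mgmt_cost:.3f} per revenue dollar")
print(f"A 40% cut there saves only:           {savings_if_40pct_cut:.3f} per revenue dollar")
# Roughly two cents on the revenue dollar -- hard to call that a transformation.
```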

But you can do better by looking at lifecycle management across both services and resources.  You can do better by uniting SDN and NFV into a single technology shift, into a single revolution that is really a cloud revolution.  I said that way back in 2013 when NFV was just getting started, and it's still true today.  I also sat down with a dozen operators and showed them a model-driven approach (what we'd call today an "intent-model" approach) that did exactly that.  One lifecycle to rule them all.
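
For those who want a feel for what "one lifecycle" means in practice, here's a minimal sketch of an intent-modeled hierarchy in which service-level and resource-level elements share the same lifecycle behavior.  The class, element, and state names are invented for illustration; this isn't any vendor's actual model.

```python
# A sketch of the "one lifecycle" idea under intent modeling.  Names are invented.

class IntentElement:
    """An intent-modeled element: a service or a resource behaves the same way."""
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.state = "ordered"

    def set_state(self, new_state):
        # Lifecycle events cascade down the model, so service-level and
        # resource-level elements are managed through one common behavior.
        self.state = new_state
        for child in self.children:
            child.set_state(new_state)

vpn_service = IntentElement("business-vpn", children=[
    IntentElement("access-connection"),
    IntentElement("vpn-core", children=[IntentElement("hosted-router-instance")]),
])

vpn_service.set_state("active")
print(vpn_service.children[1].children[0].state)  # "active" -- resources follow the service
```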

Why don’t we have this kind of thing happening in NFV sales?  The simple answer is that box people are box people.  It’s not the carriers that are locked in the past, it’s the vendors.  They don’t want revolutions that impact their current sales if they’re network equipment vendors.  They don’t want revolutions that they have to sell to levels of operator management they don’t even call on today.  They don’t want long sales cycles when their management is pressing them relentlessly for success, and where some companies are seriously thinking of getting out of the space.

Why not the OSS/BSS people, then?  Part of the problem there is that there’s nothing more high-inertia in all of networking than OSS/BSS.  Our grandparents would probably recognize the concepts that are in play today.  We’re struggling in OSS/BSS to introduce concepts (like being event-driven) that have been a mainstay of enterprise application design for at least fifteen years.  The political processes inside OSS/BSS standardization (the TMF) make the US Congress look like a band of brothers.  All that’s bad enough, but it’s not the only point.

Our problem with NFV gets back to those bicameral lifecycle targets.  There are two pieces to the network of the future, just as there are in the network of the present.  Services and resources.  The interesting thing about them is that we consider our hoped-for next-generation advance as being a resource evolution even when the benefits we're going after are predominantly on the services side.  Look at the TMF, who are driving toward next-gen operations under the umbrella of an NFV strategy when the truth is that NFV needs an operations revolution more than operations needs NFV.

What irks me here, besides the tendency for self-justification, is the fact that we have six vendors out there who actually can do lifecycle orchestration from top to bottom, who can unify the service and resource sides and who can make a business case by drawing on all the cost elements on the table, and add in service agility besides.  These vendors could show buyers right now, today, that an NFV transformation would meet all their goals, and could guide them on the right path.  Yet most of them never even attempt to do that.

And part of this is our fault, “our” meaning the media, consultants, analysts, and yes readers and consumers of industry news.  Are we just junkies for exciting stuff or are we trying to accomplish something here?  If it’s the former, let’s go to the fiction category in Kindle or something.  If it’s the latter, then let’s start pushing back on meaningless drivel about NFV and NFV standards and NFV open-source and ask the real questions about the NFV business case.  That’s what I’m going to do, and if that makes me a contributor to disillusionment in your eyes, I can send you a link to the Kindle fairy tale section.  Happy endings come automatically there.

What DOES Get Us to the Magic Optimum Number of New NFV Data Centers?

More and more people are realizing that the challenge for next-generation networking is basically getting enough of it to matter.  Whether we're talking about replacing switches/routers with white boxes or hosted instances, we aren't going to justify much excitement if we do that for perhaps two or three percent of the operators' capital spending.  There has to be more, and how either SDN or NFV gets to a substantive level of deployment is critical to whether either technology can change networking.  We may still get "SDN" or "NFV", but without something to drive large-scale deployment it's a hobby, not a network revolution.

I said in past blogs that an optimum NFV deployment would result in a hundred thousand new data centers, millions of new servers, and a vast change in service operations and network capex.  Are there pathways to that optimality?  Obviously I believe there are, or I'd not raise the number as a best-case goal.  So today let's look at what it would take to realize something close to optimality.  Remember that our goal is that optimum hundred-thousand new data centers!

In order for there to be a massive number of data centers deployed for NFV, there has to be a massive number of things to run in them.  A hundred thousand data centers globally would mean, roughly, one hundred for each major metro area or roughly 2.5 per current central office location.  Let’s use these numbers and work backward along various justification paths to see what might work.
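
Just to make the implied scale explicit, here's the back-of-envelope arithmetic using only the ratios quoted above:

```python
# Back-of-envelope check of the numbers in the paragraph above.
optimum_data_centers = 100_000
per_metro            = 100     # "one hundred for each major metro area"
per_central_office   = 2.5     # "roughly 2.5 per current central office location"

implied_metro_areas     = optimum_data_centers / per_metro
implied_central_offices = optimum_data_centers / per_central_office
print(implied_metro_areas)       # ~1,000 major metro areas worldwide
print(implied_central_offices)   # ~40,000 current central office locations
```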

Virtual CPE (vCPE) is one option, but the problem is that business customers are too thin a population to justify large-scale operator data center deployment based on virtualization of service-edge features.  There would obviously be plenty of residential customers, but the problem there is that real residential edge devices aren't expensive enough to make displacing them a useful concept in most markets.  The only exception is video, the set-top box.

There are a lot of features associated with operator delivery of video, and many of these features (having to do with video on-demand catalogs, and even DVR if you don't run into regulatory issues) could be cloud-hosted, which means they could justify data centers.  So our first hopeful path is virtualization of advanced video features, which could generate on the order of 40,000 data centers according to my model.  Our tally thus starts with 40,000.

Mobile infrastructure is another favored NFV target.  There are three elements of mobile infrastructure that are already being virtualized—the RAN, the Evolved Packet Core (EPC), and the core IMS and related service-layer elements.  If we were to virtualize the RAN (ASOCS has made some recent announcements in this space and they'll be at MWC with a broad demo), as well as the IMS/EPC structures, my model says we could generate on average 20 data centers per metro area to host all the functions, which is another 20,000 data centers.  That gets us up to 60,000, a bit over half of the optimum number.

And here it lies, unless we go beyond current thinking.  What could generate additional need for hosting?  Here are some candidates, with issues and potentials for each.

Number one is network operator cloud services.  Four or five years ago, network operators were telling me they thought they'd have about twenty-eight cloud data centers per metro area, which could have generated 28,000 data centers in itself.  This was when operators were more excited about the potential for cloud computing than about any other possible new monetization opportunity.  If we could count on cloud services we'd almost be at our optimum number, but there are issues.  Verizon just announced it was exiting the cloud, which, while it doesn't necessarily stall all operator momentum for cloud computing, certainly casts a long shadow.

The simple truth about carrier cloud is that it’s great if you already have NFV deployed and can take advantage of the automated tools and service-layer orchestration that NFV would bring.  It could even pull through NFV providing that operators were willing to bet on the cloud today.  Four years ago, for sure.  Today, probably not.  We can look to operator public cloud services down the line but not up front.

Unless we can use that cloud for something.  If we were to adopt contextual services we could build a cloud mission that creates incremental revenue and doesn’t mean immediately competing with Amazon or Google.  Contextual services are services offered primarily (but not exclusively) to mobile users for the purpose of giving them information in context, meaning integrated with their social, geographic, and activity framework.  It’s harder to model what contextual services could do, but my modeling shows anywhere between eight and twenty data centers per metro area could be justified.  That’s up to 20,000 cloud data centers worldwide, raising our total to 80,000.

The challenge with contextual services is that they've got no PR, no snappy headlines.  On the other hand, we have IoT, which has plenty of both, and in fact the biggest contributor to contextual services would be IoT.  If we combined the two, my model says we generate anywhere from twelve to forty data centers per metro area, which gets us comfortably over the goal.  Allowing for inevitable reuse, my model says that this would hit 100,000.

So we can get to our 100,000 data centers and we’re done?  No, we still have to work in SDN and we have another big opportunity to address.  Suppose we did a virtual-wire grooming on top of agile optics to produce virtual-layer-1 subnets for everything except the Internet itself.  Applications in the clouds, all business services, everything.  We now host L2/L3 switching and routing instances for all these networks, at the virtual edge and/or in the virtual core, and we generate another forty data centers per metro area, which puts us way over.

We aren't really way over, of course.  When you do the math you find that as you add these applications/drivers together the data centers tend to combine in part, so while our total might approach 200,000, the actual optimum number based on traffic and population/user distributions is that magic hundred thousand.
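
For anyone who wants the tally laid out, here's a back-of-envelope version using the upper-bound figures quoted in this post.  The reuse adjustment is a placeholder to show why the raw sum overshoots; it's not my actual model.

```python
# Tallying the drivers discussed above, using the upper-bound figures quoted in
# this post (data centers worldwide, assuming ~1,000 major metro areas).

drivers = {
    "advanced video features":      40_000,
    "virtualized RAN/IMS/EPC":      20_000,
    "operator cloud services":      28_000,
    "contextual services plus IoT": 40_000,   # up to 40 per metro area
    "virtual-wire L2/L3 hosting":   40_000,   # another 40 per metro area
}

raw_total = sum(drivers.values())
print(f"Raw sum of drivers: {raw_total:,}")   # well over the 100,000 target

# Facilities get shared as drivers combine, so the optimum settles around 100,000.
sharing = raw_total / 100_000
print(f"Implied sharing: each facility serves ~{sharing:.1f} drivers' worth of demand")
```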

The order of these drivers has an impact on the pace of NFV success.  Things like cloud computing and business service features can be deployed in a central data center within a metro, then dispersed as needed toward the edge.  This model eventually creates an optimum NFV deployment, but it takes a while because the economy-of-scale benefits of centralized hosting overcome, early on, the reduction in traffic hauling (“hairpinning”) that comes from edge hosting.  Other applications, particularly mobile infrastructure, tend to deploy edge-distributed data centers early on, and these then achieve reasonable economy of scale quickly.  That favors edge distribution of hosting, which enables other applications (like contextual services and in particular IoT) that favor short network paths.
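
A toy cost comparison illustrates that crossover.  The coefficients below are invented purely for the illustration; they're not modeled values.

```python
# A toy comparison of central versus edge hosting costs, to show the tradeoff
# between economy of scale and traffic hauling ("hairpinning").  All numbers invented.

def central_cost(traffic_units):
    # Good economy of scale, but every unit of traffic is hauled to the center
    return 50 + 0.8 * traffic_units + 1.0 * traffic_units   # hosting + hairpin haul

def edge_cost(traffic_units):
    # Poorer economy of scale at small sites, but almost no traffic hauling
    return 120 + 1.0 * traffic_units + 0.1 * traffic_units

for t in (20, 80, 400):
    better = "central" if central_cost(t) < edge_cost(t) else "edge"
    print(f"traffic={t}: central={central_cost(t):.0f}, edge={edge_cost(t):.0f} -> {better} wins")
```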

With the exception of business vCPE and residential non-video CPE, any of these applications would be enough to build sufficient NFV scale and functionality (presuming they’re rationally implemented) to get a strong start.  Even vCPE could play a role in getting functional NFV started, providing that the vCPE story built to a true NFV implementation that could make a broader business case.  So this isn’t hopeless by any means.

So why are we starting to see so many negative signs (and trust me, we’ll see more in the next three or four months)?  The answer is that we’ve been trying to get a full NFV story from the minimalist-est of minimalist starting points.  You can’t get to the right place that way.  At some point we have to pay our NFV dues, if we want NFV to pay.