Red Hat, Dell, HPE, VMware, and the Future of Networking

Competition among vendors always generates interesting changes, and we may be seeing changes more interesting than usual with the announcements from Red Hat and VMware/Dell over the last week or so.  Not only is there the usual tension among competitive products, this time there’s also tension between a vendor that offers open-source software platforms and one that offers both hardware and software.  The announcements have hit a lot of market hot-buttons, and it’s worth looking at each to see what’s likely to happen in the next couple of years.

Red Hat’s Summit 2018 event was obviously a major focus for the company, and the organizing point for its news.  The opening comment in their summary video is evocative: “Everything we’ve done for the last 16 years has been to just get us to this point.”  Certainly open source software has come into its own, but just as certainly it’s really only getting started.  Red Hat is betting on a future where hardware is just the commodity junk you need to run software on.

The key points Red Hat seemed to emphasize in their promotional material and after-show summary were hybrid cloud, Ansible, the union of CoreOS and RHEL, and virtual networking.  The advantage they had in their May positioning (besides the show format) is that their model is inherently partnership-based, and they could let partners do a lot of heavy lifting.  That could also be a disadvantage, as we’ll see.

Dell and VMware are wrestling with that trend, but also recognizing that there has always been an advantage for providers who offer a buyer a total solution, meaning one that includes software and hardware.  Not only does that reduce buyer integration concerns and accusations of finger-pointing among vendors, it allows more sales focus on an account because the full-service vendor gets more money to justify the effort.

Dell and VMware had their own show schedules, all in the same general period, but I think the fact that the two were separate made it harder for both units to get the most out of their messaging.  Of the two, the VMware show seems likely to be the more important because of the “Virtual Cloud Network” story that I’ve already blogged about.

On the enterprise side, the focus of competition was clearly hybrid cloud, for what I think is the blindingly (frustratingly) obvious reason that no enterprise is going to go cloudless or totally to the cloud.  This was never in doubt, as my blogs about the issue have said for years, so what’s really going on now is a race to adopt specific measures to facilitate hybridization and along the way protect your own incumbent position in the market.

This is an easier position for Red Hat to support because they start from a fairly broad base.  OpenShift, which is Red Hat’s container platform, is going to be integrated with the CoreOS stuff that is increasingly popular for large-scale deployments.  Red Hat also supports OpenStack, of course, and it made a point of pushing the Ansible DevOps solution integrated with Red Hat CloudForms, which is a multi-cloud management framework.

At their event, Red Hat announced OpenShift deals with IBM and Microsoft, and they’ve also jumped on Kubernetes and containers.  It’s very clear that Red Hat sees partnerships as a pathway to touching all the right spots in the market very quickly, and it’s a good strategy because it offers Red Hat a position where it matters, when it matters.  The only problem is that it diffuses the marketing through many channels and stories.

Despite the fact that Red Hat has a natural lead in this space, VMware isn’t any slouch here.  They announced a new version of both vSphere and vSAN, with the avowed mission of supporting seemingly everything, everywhere.  It’s not really hype, either, because VMware has deliberately defined umbrella technology to extend from its own virtualization incumbency into the cloud.  That unifies VMware positioning, and that could be important in a market that seems to want direction and confidence most of all.

The other hot spot for both companies is carrier cloud.  What’s at stake here is what my model has consistently predicted would be the largest single source of new data centers through 2040, and the largest source of edge computing deployments period.  Here again, Red Hat seems to have the natural edge.  Operators are falling more in love with open source every day, and who’s that if it’s not Red Hat?  Then there’s the fact that Red Hat was at least engaged in the early NFV work, even supplying the platform for the CloudNFV stuff I launched back in 2013 (Dell, interestingly, provided the hardware and the test/integration lab to run it).

The Red Hat carrier cloud approach is subtle, perhaps too much so.  The focus is more on the “cloud” side than on the carrier side.  Nokia/Nuage and Accenture did a nice presentation on SD-WAN automation, but Red Hat didn’t feature it, and I only saw it because an Accenture contact sent it to me.  It was a strong story in that it hit all the hot-buttons of operators today, and I think Red Hat could have benefited from making it more of a centerpiece of strategy.

One particularly interesting (and subtle) Red Hat initiative is its OpenShift Cloud Functions, a hostable event and functional computing platform that could be incorporated into carrier cloud and hybrid clouds, extending current cloud provider functional computing offerings.  Red Hat asks if this is the “next phase of cloud-native development”, likely knowing that cloud-native edge computing and event processing are keys to most carrier cloud initiatives.  But they need to make their story stronger, more direct.

VMware obviously wants to be strong in the carrier cloud space, but they are also going at the mission with perhaps too much subtlety.  Their centerpiece is the Virtual Cloud Network, which earns a highlight position on their website’s homepage.  As I noted in my blog on the topic last week, this is a solid effort to frame logical application networking from the server/application side.  That’s incredibly important for carrier cloud because every application driver needs it.

Red Hat doesn’t have a home-grown strategy for logical/application networking, which is one reason they should have highlighted the Nuage/Accenture pitch.  VMware could make a lot of hay with Virtual Cloud Network, perhaps even enough to overcome the fact that operators are less likely to see VMware as a natural open platform for carrier cloud than Red Hat.

You might wonder about HPE in this mix, and frankly I wonder too.  In one sense, HPE has the same challenge as Dell, with a hardware business to defend when the world is really mostly about software.  But HPE was a leading player in the early days of NFV.  They fumbled their opportunity and rather than dusting themselves off, they seem to have forgotten about the carrier cloud space.  Today, they have fewer mentions among operator planners than the other two competitors we’re covering in this blog.  They do have a good hybrid cloud management tool, but they don’t seem to be driving toward a unified hybrid solution.

Is there some central truth driving this?  I think so, and it’s unity of virtualization.  You can’t virtualize one thing in the IT universe and hope to come out with an optimum outcome.  It’s all or nothing, and we are heading toward a universal facing of that truth.  Hybrid cloud, multi-cloud, carrier cloud, edge, core, compute, networking, software—everything has to be virtual, and it will be.  Things like Mesos and DC/OS, which represent virtualizing infrastructure in a more general way, would seem to me to be the high ground on one side, and virtual/cloud/logical networking would be the other.  No major player has a solid footing in the former, but VMware seems more determined than others to seize the latter.

Which leads to the competitive question of who’s winning here, and there’s no clear answer yet.  Right now, I’d say that VMware’s Virtual Cloud Network concept is the best productized embodiment of the critical unity-of-virtualization shift among the major vendors.  Red Hat should have nailed something down on its own; what they’d need to do now is create a virtual network ecosystem and try to draw in players besides Nokia/Nuage, who probably wouldn’t want to put all their carrier eggs in the Red Hat basket at this point.  HPE has a greenfield opportunity, but they might be growing only weeds.

The deciding factor in this competition, and perhaps in the whole of the hardware/software debate, might be Dell.  They could create a stronger symbiosis with VMware, but of course doing that could erode their power to play with other software giants, including Red Hat.  If they decide that independence is a stronger play, it’s my view that they’re saying that hardware cannot, in the long run, sustain itself against commoditization pressure.  Software wins.

What does that do for innovation?  Software is more agile, more feature-rich, but we’re also seeing a push toward open source on the software side.  Part of the problem with NFV is the desire of vendors to earn as much from licensing VNFs as they did from the appliances those VNFs were based on.  Operators are already pushing for things like ONAP, partly because such projects embody the goals of NFV better, but also partly because they distrust vendors.  Can we innovate with open source?  Some say Linux is proof we can, but Linux is an open-source implementation of the POSIX APIs.  So is POSIX proof of innovation?  That’s based on UNIX.  The fact that we can trace the roots back so far may tell us that it’s the business model of open source that’s innovative, not necessarily the technology.

In the end, the buyer will decide.  Operators are shifting their focus to open source, but they’re not driving the projects effectively.  Enterprises are still more willing to accept proprietary solutions, particularly for vertical-market software.  Open source, and open hardware, will undergo a trial by fire in the next couple of years, and what emerges will probably define not only whether Red Hat or VMware wins, but what our industry will end up looking like in a decade.

Making NFV as Good, and Cloud-Centric, as It Should Be

I don’t think NFV will ever be what its proponents hope it will be, but I do think it can be better.  Here’s the big question that the Network Functions Virtualization (NFV) initiative has to answer.  This is a cloud project, so why not simply adopt cloud technology?  That didn’t happen, so we have to ask just what makes NFV different from the cloud.  Is it enough to justify an approach only loosely linked with the cloud and virtualization mainstream?  In one case, I think the specs have undershot the cloud’s potential, and in another I think it’s cloudier than it should be.

The stated mission of the NFV Industry Specification Group was to identify specifications that could guide the deployment of hosted virtualized network functions (VNFs) that would replace “physical network functions”, meaning purpose-built devices.  This mission is a mixture of traditional cloud stuff and traditional network stuff.  What we need to do is to look at the issues with VNF deployment and whether they fit reasonably within the cloud model or require some significant extension.

To start with, there is no inherent reason why hosting any feature of a service would be any different than hosting a component of an application, in deployment terms.  The process of deploying and redeploying VNFs should be considered an application of cloud or container software and the associated orchestration tools, like Kubernetes or Marathon.  Thus, if we look at deploying VNFs, it seems we should be simply handing that process off to cloud tools with as little intervention as possible.
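To make the “hand it off to cloud tools” point concrete, here’s a minimal sketch using the standard Kubernetes Python client; the VNF name, container image, namespace, and port are hypothetical placeholders, not anything the ISG has specified.

```python
# Sketch: deploying a hypothetical virtual firewall VNF as an ordinary
# Kubernetes Deployment, using the stock Kubernetes Python client.
from kubernetes import client, config

config.load_kube_config()          # or load_incluster_config() inside a pod
apps = client.AppsV1Api()

vnf = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="vfirewall", labels={"vnf": "vfirewall"}),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"vnf": "vfirewall"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"vnf": "vfirewall"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="vfirewall",
                    image="example.com/vnfs/vfirewall:1.0",  # hypothetical image
                    ports=[client.V1ContainerPort(container_port=8443)],
                )
            ]),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="vnf-demo", body=vnf)
```

Nothing in that sketch is NFV-specific; redeployment and scaling become Kubernetes’ problem, which is exactly where they should be left.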

On the network-related side of deployment, there are two principal differences.  First, applications are not normally data-plane forwarding elements.  That means there are potential hardware and software optimizations that would be appropriate for VNFs and not as likely to be needed (but not impossible) for the cloud.  These can still be accommodated with cloud tools.  Second, nearly all VNFs require operational parameterization and ongoing management, which is fairly rare with applications.  Standard cloud facilities don’t offer these capabilities in the form that VNFs would expect.

The network part of the VNF picture is critical in the sense that networks are cooperative communities of devices that have somewhat broadly controllable individual behavior.  The system has to be functional as a system, and that’s complicated by the fact that making the system functional is partly a matter of adaptive behavior within the system and partly a function of remediation outside it.

Moving up the ladder from basic deployment, the unit of deployment in the cloud is an application, which is a combination of components united by a workflow and serving a community of users.  The presumption in the cloud is that the application is an arbitrary functional collection owned by the user, so there’s no expectation that you’d stamp a bunch of the same application out with a functional cookie cutter.  On the network side, the unit of functionality is the service, and a service is a set of capabilities provided on a per-user basis (leaving the Internet out of the discussion).  It is essential that a given “service” be offered in a specific form and, once ordered, deploy to conform with the specifications, which include a service-level agreement (SLA).

It’s my view that this point is what introduces the real need for “orchestration” in NFV, as something differing from the comparable tools/capabilities in the cloud.  However, the way to approach it is to say that a service is modeled, and that when it is ordered, the model is used to deploy it.  Once deployed, the same model is used for lifecycle management.  Given this, the first step that NFV should have taken was to define a service model approach.  The TMF has a combination of models of various types, called “SID”, and while I don’t think it’s structured to do what’s needed in an optimal way, it is at least a model-driven approach.  The NFV ISG hasn’t defined a specific model, and thus lacks a means of model-driven lifecycle management.  That’s where it falls short of cloud evolution.

One thing that makes commentary on service modeling so important is that operators now recognize that this should have been the foundation of transformation all along.  Another thing is that service modeling is a potential point of convergence for the cloud and network sides of our story.

Properly constructed service models do two important things.  First, they show the dependencies between component features of a service.  That lets faults propagate as they must.  Second, they provide a basis for ongoing lifecycle management by retaining state information per component and acting as a conduit for events into a service process.  These missions were recognized by the TMF’s work, including both SID and NGOSS Contract.  They’re critical for service automation, but if you think a moment you can see that they are likely to become critical for application automation as well.
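As a simplified illustration of those two properties, here’s a sketch of a model node that keeps per-component state and hands fault events up its dependency chain; the component names and state values are invented for the example and don’t reflect SID or NGOSS Contract syntax.

```python
# Sketch of a model node in the NGOSS-Contract spirit: each element of a
# service keeps its own state and passes events to its parent, so a fault in
# a low-level component propagates to the service that depends on it.
class ModelNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.state = "ordered"            # ordered -> active -> fault/degraded
        if parent:
            parent.children.append(self)

    def handle_event(self, event):
        if event == "deployed":
            self.state = "active"
        elif event == "fault":
            self.state = "fault"
            if self.parent:               # fault propagates upward
                self.parent.handle_event("child-fault")
        elif event == "child-fault":
            # a dependency failed; this element must remediate or escalate
            self.state = "degraded"
            if self.parent:
                self.parent.handle_event("child-fault")

# Hypothetical service: a VPN that depends on an access element and a hosted VNF
vpn = ModelNode("vpn-service")
access = ModelNode("access-connection", parent=vpn)
vfw = ModelNode("vfirewall-vnf", parent=vpn)

vfw.handle_event("fault")
print(vpn.state)    # "degraded" -- the fault has propagated as it must
```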

Deployment of complex multi-component applications on virtual infrastructure is a whole new dimension in application deployment management.  The process isn’t exactly like service deployment in that you don’t have to stamp out thousands of instances of an application and manage them all independently, but it still benefits from automation.  DevOps and application orchestration interest proves that.  The cloud is coming to NFV’s rescue, in an orchestration sense.

Where NFV may be too cloudy, so to speak, is in the ever-popular concept of “service chaining”.  This particular notion has been a fixation in the NFV ISG from the first.  The idea is to create a service by chaining linear functions that are hosted independently, and that is a problem for two very specific reasons.

The first reason is that the application of service chaining is almost totally limited to “virtual CPE”, the use of hosted elements to replace customer-edge appliances.  We actually have some interest in vCPE today, but it’s in the form of general-purpose boxes that can edge-host custom-loaded features.  However, this mission is most valuable for business services whose terminating devices are fairly expensive.  That makes it a narrow opportunity.

The second reason is that if you really wanted to chain functions like that in a linear way, you’d almost surely want to construct a single software image with all the functions included.  Software transfer of data between functions isn’t going to require network connections or generate additional complexity.  A vCPE element with three functions has three hosts and two connecting pipes, five elements in all in addition to the input and output.  A single-image solution has none.

For those who say the chain is more reliable, remember that a break in any of the five elements breaks the connection, and that’s more likely to happen than a break in a single-image hosting point, simply because there are more things that have to work.  The three-function chain also poses three times the integration problems, because each of the three functions will have to be converted into a VNF and deployed and managed.  There are more resources to manage, more operational complexity.
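A back-of-the-envelope calculation makes the reliability point; the 99.9% per-element availability figure below is just an assumption for illustration.

```python
# If each of the five serial elements (three hosted functions plus two
# connecting pipes) is 99.9% available, the chain's availability is the
# product of the five, versus a single-image hosting point at 99.9%.
per_element = 0.999
chain = per_element ** 5          # three hosts + two pipes in series
single_image = per_element ** 1   # one hosting point

print(f"chain:        {chain:.5f}")         # ~0.99501 -> roughly 44 hours down/year
print(f"single image: {single_image:.5f}")  # 0.99900 -> roughly 9 hours down/year
```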

In the cloud, the dangers of over-componentizing are widely accepted and written about almost daily.  Service chaining is like the worst of over-componentization; it adds complexity and cost and provides nothing in return.  There was no reason to seize on this early on, and less reason to stay fixated on it today.

The ISG is looking at steps to take after its current mandate expires.  Two good ones come to mind.  The first would be to develop a specific NFV service model, based on a cloud standard like OASIS TOSCA, and fit NFV orchestration and management to that model.  The other would be to step back from service chaining and toward the use of composed multi-VNF images to replace those chains.  If these two things were done, it would better align NFV with cloud thinking, and that should be the goal.

Have Operators Uncovered a New Transformation Path?

Is there a new model for telco transformation?  According to a Fierce Telecom piece, there may be, and the details in the article match very well with what I’ve been reporting from my own contacts with network operators.  The big question may be where this new model will take us, and what vendors, bodies, and concepts it will end up validating.

The leading technologies in transformation, according to popular wisdom, are SDN and NFV.  Operators have been griping about both, in conversations with me and in surveys I’ve done.  SDN, they say, is too transformational in terms of the network technical model.  Central control of forwarding, the classic ONF OpenFlow SDN model, may or may not scale far enough and may or may not handle large-scale faults.  NFV, they say, is simply too complicated to be able to control operations costs, and its capital savings are problematic even without opex complications.

So what works?   BT, in the article, says “If you ask me what’s going to be the most transformational change in our industry it’s not SDN. It’s not NFV. It’s this model-based concept with telemetry and network automation. And yes, SDN and NFV are tools that help us with that, but actually they’re largely unexciting.”

Model-based networking uses a modeling language like OASIS TOSCA to describe a service or service element and to frame how it’s deployed and managed.  Tools that understand the modeling language can then be used for service lifecycle automation.  The use of modeling also provides a framework into which vendors can be told to integrate, as a condition of procurement.
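To make that less abstract, here’s a sketch of what a tool that “understands the model” actually does with it.  The service description below is a TOSCA-flavored structure expressed as Python data, with node names, type strings, and requirements invented for the example; the trivial “orchestrator” just derives a deployment order from the declared dependencies.

```python
# Sketch: a TOSCA-flavored service description (as Python data) and a trivial
# orchestrator that derives deployment order from declared dependencies.
service_model = {
    "vpn_service":    {"type": "nodes.Service",  "requires": ["sdwan_edge", "vfirewall"]},
    "sdwan_edge":     {"type": "nodes.Network",  "requires": ["transport_vlan"]},
    "vfirewall":      {"type": "nodes.Software", "requires": ["edge_host"]},
    "transport_vlan": {"type": "nodes.Network",  "requires": []},
    "edge_host":      {"type": "nodes.Compute",  "requires": []},
}

def deployment_order(model):
    """Topologically sort nodes so every dependency deploys before its parent."""
    ordered, seen = [], set()
    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for dep in model[node]["requires"]:
            visit(dep)
        ordered.append(node)
    for node in model:
        visit(node)
    return ordered

print(deployment_order(service_model))
# ['transport_vlan', 'sdwan_edge', 'edge_host', 'vfirewall', 'vpn_service']
```

The same dependency graph that drives deployment can drive lifecycle automation afterward, which is the point operators are making.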

The article, and other stories operators have told in public, shows that the most critical step in transformation is really the standardization of service models.  That paves the way not only for the adoption of automated lifecycle management, but also for the adoption of SDN and NFV.  That BT is cited in the piece as calling SDN and NFV “unimpressive”, “unexciting” and “still hard” to deploy is also proof that, because SDN and NFV attempted to justify transformation on bottom-up technology shifts without operational support, they now depend on things outside their own control.

This is all about taking an “operations-first” approach, and that makes sense even based on my modeling.  That modeling has shown, for years now, that service automation yields a larger reduction in service provider costs and a larger improvement in agility and efficiency than an SDN/NFV-led approach.  Given that service automation isn’t dependent on lower-level infrastructure upheavals, there’s a much lower implementation cost and higher ROI—20 times that of an SDN/NFV approach, in fact.

The impact of an operations-first approach to transformation could be profound, but it also presents challenges at many levels.  One obvious truth is that network and service operations have historically been supported by different organizations within the operator, the Operations group for the former and the OSS/BSS CIO group for the latter.  Even operators who have established ad hoc teams to try to unify planning and adoption of wider-scope operations automation have struggled.

The new model would also tend to pit vendors against each other.  OSS/BSS players are often not major players in infrastructure—Ericsson, Huawei, and HPE are exceptions.  IP/Ethernet players like Cisco and Juniper are not typically recognized in OSS/BSS, and in fact aren’t always seen as leaders in the new area of service automation.

The standards bodies and open-source groups may also feel the pressure.  With the exception of OASIS TOSCA and perhaps (though not in my own view) Cisco and YANG, there’s not much model focus among major bodies and vendors.  Even ONAP, which is the best of the current open-source stuff, is behind in terms of the modeling and architecture needed to create effective, open, scalable infrastructure and services.  Most work is also confined to either the OSS/BSS or NMS/NOC space, and we really need to transform through virtualization, which demands we bridge that gap.

The tensions created by an operations-centric view of transformation could spread, too.  The article I cite and my own discussions with operators show increased interest in promoting an overlay-service model based on SDN/SD-WAN technology.  This is a kind of network-as-a-service virtualization that has some very specific benefits, and affinities with an operations-first approach.

One obvious benefit of SD-WAN for an operator is the ability to construct services that transit, in part, the infrastructure of another operator, and even to mix technologies under the same service umbrella.  MSPs have built whole business models on managing services across operator boundaries, and operators could hardly miss the fact that they’re losing out on an opportunity.  Furthermore, since an SD-WAN presents service terminations and can monitor user QoS as well as transport SLAs, integrating it with transport management and service models would provide better control over the user experience and lower customer care costs.

Service overlay networks are bad news for incumbents because they put the revenue-generating part of the operators’ business into a layer that they don’t control.  There is a huge difference between the services of networks and services over networks. The former means that what users pay for is a behavior of infrastructure, and the latter means that infrastructure is simply a delivery mechanism.  In effect, it makes traditional connection services into a form of over-the-top services.

What’s most interesting here is that SD-WAN vendors generally don’t have a story in service lifecycle automation, despite the logical connection it has with “service overlays”.  Nuage, a partner in BT’s SD-WAN plans, is quite different from the typical SD-WAN vendor; their roots lie more on the SDN side.  QoS Networks is another outlier here, a company that seems to have started as an MSP with custom SD-WAN tools and integrated operations as a major differentiator.  Will the traditional players catch the wave here?

That’s tough to say, in no small part because many of the traditional SD-WAN players (and even some of the outliers) are venture-financed and so are beholden to the interests of investors who traditionally push for quick exits with little or no additional cost or risk.  Ironically, many SD-WAN players probably have the right assets, or at least have a set of APIs that could be exposed as part of a specific strategy to integrate the service layer SD-WAN creates with service lifecycle automation.  Thus, the real effort would be in positioning the stuff correctly.

Positioning may be the biggest place to look for VC pushback on vendors, though.  The specific problem is that the operator SD-WAN opportunity is seen by VCs as a long slog, and enterprise positioning as an easy lob.  Management and management/lifecycle automation are definitely of more interest to operators than to enterprises, so singing that tune would likely strike VCs as an excursion away from an early and profitable exit.  Truth be told, the operator angle is probably a faster exit opportunity, and perhaps even something necessary for survival.

Where the Evolution of Virtualization is Taking Orchestration

Virtualization has been around for a long time, and sometimes the pace of its advance has seemed dizzying.  The truth is that it’s probably just getting started, and so it’s important to look a little at where we came from and what’s new, to try to get a hint of what’s going to be even newer.

Virtualization started off with the notion that you could utilize servers more efficiently if you created “virtual machines” that were co-hosted on a single server but appeared to both applications and operations processes as real, independent, computers.  Nearly all the early applications of virtualization ended up inside a data center, but these still launched the notion of a virtual resource pool that had an indirect connection with the real server resources.

The next phase of virtualization was the cloud, which used the greater efficiency of virtual hosting to provide a commercial infrastructure-as-a-service offering.  IaaS took virtualization out of the data center, and it also exposed a clearer vision of what virtualization was.  A resource pool in the cloud is an abstraction of hosting that users think is real and that providers know has to be mapped and managed to resources.

Beyond IaaS we have both “hybrid cloud” and “multicloud”, forms of virtual resource pool that span not only servers and data centers but also administrative domains.  This is where we are today in terms of virtualization, attempting to create a unified application-side vision of resources spread across multiple hosting domains.  My enthusiasm for Apache Mesos and the Mesosphere DC/OS stuff was generated by the steps that framework takes to accomplish this goal of unifying resource pools under a common virtual hosting model.

These developments in virtualization have been paralleled, in a sense, by developments in DevOps and orchestration.  Deployment of multi-component applications has been an increasing challenge for operations people because every component is in one sense an independent application, deployed under its own rules, and in another a part of an integrated workflow that has to be sustained as a whole to do its job.  DevOps in its early form presumed that development teams would, while structuring the components of an application, define the necessary deployment (and redeployment) steps.  Most early DevOps tools were really “scripting” tools that recorded operations steps to ensure accuracy.

The cloud changed this too.  One logical mission for public cloud services is to act as a backup or incremental resource in which to add application component instances under load or during a failure.  Scaling and “hot replacement” represent a more complicated mission than deployment or redeployment, and so the notion of “DevOps” as a simple development/operations handoff evolved to what’s now more commonly called “orchestration”, the idea of making a bunch of separate elements sing the same tune in the right key through the entire number.

As orchestration evolved to deal with more complexity, the whole notion of “deployment” proved to be a moving target.  An application running on an elastic, distributed resource pool and taking advantage of scaling and failover is a dynamic system.  The goal isn’t just to deploy it, but to manage its complex lifecycle, a lifecycle in which pieces appear, disappear, and are replaced as needed.  “Application lifecycle management” was already a term used for the development lifecycle, and in the cloud provider and network operator spaces these new challenges emerged as services got more complex, so I prefer the term “service lifecycle management” to describe the process of sustaining the services of application software in a virtual world.

The state of virtualization, stated in terms of current goals, is a combination of creating a true abstraction of hosting that can be spread uniformly over all kinds of resources, and creating a model of service lifecycle management that operates on that abstraction.  We’re only now starting to recognize that these two goals are important, and will inevitably have to be met to make virtualization as useful as everyone wants it to be.  It’s this recognition that will drive the future of virtualization.

It doesn’t take an expert to understand that the goal here is a three-layer structure.  At the top are the applications, components, and hosted features that combine to create a user/worker experience.  At the bottom are the resources needed to run them, including servers, databases, and network connectivity.  Virtualization is the layer in the middle.  This view is useful, because it demonstrates that what we call “orchestration” may be more complicated than we think.

Today, orchestration and DevOps cross that middle layer, meaning that service lifecycle management not only creates a binding to resources at deployment and redeployment, it also links the resource layer explicitly to the application/service processes.  That linkage makes it harder to make these processes truly independent.  Might it be better to keep applications and services above, and resources below?

The problem with this is that it may collide with the notion of universal resources.  If the application has to run on anything, then the orchestration processes have to be able to deal with anything.  Is the specialization needed to accommodate hybrid cloud and multicloud appropriate to what’s supposed to be an application-layer orchestration process?

The complication we face in achieving a true abstraction layer is that the abstraction has to be harmonized in two directions, and the harmonizations are interdependent.  A virtual host has to be linked to an application, but also to a real host.  Some of the dynamism that virtualization is aimed at could be achieved either by managing the abstraction-to-resource part or the abstraction-to-application part.  There are likely two “orchestrations” taking place, and how the work is balanced between them, and how coordination is achieved, is our next big issue.

Another way of thinking about the problem (the right way, IMHO) is saying that there is orchestration modeling taking place on both sides of the boundary, with some specific coordinating behavior between the two.  This could be done via a multi-layer modeling and orchestration function that simply assigns a boundary point.  Above the boundary, we call what’s happening “service” or “application” orchestration, and below we call it “resource orchestration”.  I’ve used this approach in my ExperiaSphere work, both in the original project and the new version.

The advantage of an elastic boundary supported by modeling on both sides is the elasticity.  We could define a particular “virtual resource” with properties that include both scaling and self-healing, in which case most of the complexity is pushed down below the service, into the resource side.  We could also define it more as it’s done today, where a virtual resource/host is something that can fail or be overloaded, and the application/service is then expected to remediate if it does.

Having a hierarchical model of the service/application/resource structure, with an elastic boundary that frames the point where remediation is expected to be applied, lets us either shift to a model of virtualization where resources self-heal, or to a model where application-specific mechanisms respond to faults and overloads.  That seems the best approach to take, in no small part because it reflects how we’ve been evolving virtualization all along.
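A small sketch may help show what that elastic boundary could look like in software; the class, policy, and handler names here are purely illustrative assumptions, not any particular product’s API.

```python
# Sketch: the same fault, remediated on either side of the virtualization
# boundary depending on how the "virtual resource" abstraction is defined.
class VirtualHost:
    def __init__(self, name, self_healing=True):
        self.name = name
        self.self_healing = self_healing    # where is the boundary drawn?
        self.service_handlers = []          # service-layer orchestration hooks

    def on_fault(self, fault):
        if self.self_healing:
            # Resource-side orchestration: redeploy below the boundary,
            # invisible to the service/application model above it.
            print(f"{self.name}: resource layer redeploying after {fault}")
        else:
            # Service-side orchestration: surface the event and let the
            # service/application model decide how to remediate.
            for handler in self.service_handlers:
                handler(self.name, fault)

def service_layer_remediation(host, fault):
    print(f"service model: replacing {host} on an alternate cluster ({fault})")

opaque = VirtualHost("edge-pod-1", self_healing=True)
transparent = VirtualHost("edge-pod-2", self_healing=False)
transparent.service_handlers.append(service_layer_remediation)

opaque.on_fault("server failure")        # handled below the boundary
transparent.on_fault("server failure")   # escalated above the boundary
```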

Have We Forgotten a Key Piece of Service Lifecycle Automation?

We’ve all heard the talk about automation and opex reduction as means of improving both service and revenue per bit.  Part of the implicit goal in increasing operational efficiency is shifting some tasks to an automated form, but a bigger part has to come from shifting customer care responsibility more directly and efficiently to the customer.  That means that a portal is a critical piece of the story.

Customer care, meaning customer technical support and technical sales support, has exceeded network operations costs for the last five years, and those two pieces are growing about 40% faster.  Not only that, customer care overall has a significant impact on customer acquisition and retention costs, which are higher still and growing even faster.  If we could project pre-sale and technical support through a portal, we could not only reduce the staffing requirements (in many operators already expensed through use of a third party) but also improve customer retention.

A couple of years ago, I did a survey of consumer and business attitudes toward the handling of technical problems and questions.  I found, to nobody’s surprise, that about two-thirds of users reported being “less than satisfied” with their experiences, and about a quarter reported themselves “very unsatisfied”.  Business and residential experiences were similar, with consumers tilting toward the more unsatisfied end.  Current support isn’t particularly popular, and offshoring trends are the factor cited most often as the source of the problem.

It’s not just disgruntled customers, either.  It’s also fleeing customers.  A single negative support experience, even one that is considered very unsatisfactory, generates only about a year of angst, which will subside if nothing else happens.  Several experiences, particularly if they’re spaced about six months apart, generate a continuous negative attitude.  If this persists to the point where the service contract is up for renewal, customers of all types tend to look at competitive options, and if there is another choice at a comparable price, those with the “very unsatisfactory” view of support are three times as likely to jump ship as average.  All of this is why operators put customer care high on their list of priorities.

The goal of operators, which I’ve seen both in surveys and contacts and in real consulting, is establishing a portal that represents the totality of their relationship with customers.  The portal offers service order support, including marketing, technical support, and problem resolution.  Most operators also want to have a kind of status indicator, the classic green/yellow/red service state for each service, summarized upward by service type and eventually reaching a customer-level status indicator.  Some want information on periods of maintenance, planned upgrades, etc.

One of the specific challenges that service lifecycle automation runs into is that this customer care stuff is typically seen as an operations-level task, meaning that it’s related to the OSS/BSS systems.  Over the last five years, network operations and service operations have actually separated somewhat, with improvements in the latter creating a situation where the former area doesn’t necessarily have deep visibility into service resources and service state.  SDN and NFV, which today are largely being automated using a “virtual device” model that presents the status of logical features rather than physical elements, seem to be widening the gap.

There are two tasks here to address, then.  One is to create a series of service management APIs that allow non-technical inspection of and intervention into service behavior.  That has to be different from the capabilities offered to customer service and network operations personnel, usually “higher-level” meaning more translated into common language and more filtered against accidental errors.  The other task is to construct views of the underlying service/network data according to the needs of specific users inside and outside the operator organization, and the policies and regulations that govern the space.

My view has always been that the easiest way to satisfy the dual requirements noted above would be to apply the principles of an old (and now-abandoned) IETF draft effort called “i2aex”, which stands for “infrastructure to application exposure”.  The notion of i2aex was to use proxy functions to suck data out of the MIBs for everything and record it in a database.  Queries would then be run on this database to produce management views.  Updates would be pushed through the reverse process, and policy filters would limit what different roles could see and do.  I called this “derived operations” in my 2012-early-2013 presentations, and I incorporated it in the original CloudNFV architecture.
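Here’s a minimal sketch of that “derived operations” idea, assuming nothing beyond the Python standard library: a proxy polls element status into a time-stamped table, and role-specific queries produce the filtered views.  The element names, metrics, and thresholds are invented for the example.

```python
# Sketch of "derived operations": telemetry is polled into a time-stamped
# repository, and management views are queries against that repository rather
# than direct hits on the device MIBs/APIs.
import sqlite3, time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE telemetry (ts REAL, element TEXT, metric TEXT, value REAL)")

def poll_proxy(samples):
    """Stands in for the i2aex-style proxy that reads MIBs on a schedule."""
    now = time.time()
    db.executemany("INSERT INTO telemetry VALUES (?,?,?,?)",
                   [(now, e, m, v) for (e, m, v) in samples])

# One polling cycle (values are invented)
poll_proxy([("access-router-1", "if_errors", 0),
            ("vfirewall-vnf",   "cpu_pct",  91),
            ("edge-host-3",     "cpu_pct",  55)])

# A customer-portal view: only green/yellow/red per element, never raw MIBs.
def customer_view():
    rows = db.execute("""SELECT element, MAX(value) FROM telemetry
                         WHERE metric='cpu_pct' GROUP BY element""")
    return {elem: ("red" if v > 90 else "yellow" if v > 70 else "green")
            for elem, v in rows}

print(customer_view())   # e.g. {'edge-host-3': 'green', 'vfirewall-vnf': 'red'}
```

A thousand customers refreshing that portal view hit the repository, not the network, which is the indirection point made below.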

One of the reasons for this kind of indirection is that customer access to management data in any form poses a risk of overloading the associated APIs.  The classic example is an outage, which causes every customer impacted by the problem to immediately look for status.  That swamps the APIs that provide for monitoring and control, and it prevents the NOC from taking action, creating what’s almost a denial-of-service attack.

There are some open-source management tools that work this way, even if i2aex never really gained traction and acceptance.  Given the proposed role of analytics in network and service management, capacity planning, and other network operations roles, it seems to me that having all the data available in a nice time-stamped repository would be the logical solution to everyone’s problems.  Certainly it would make the portal process easier.

Management repositories like the one i2aex envisioned look like ordinary databases, which means that normal analytic tools and web front-end tools for data digestion, presentation, and even updating would work on them.  Every worker role, every customer role, and even every third-party role could be given a customized view of everything they’re entitled to see; “derived operations” in action.  It could make the presentation of customer care interfaces an easy web development task, fostering greater customization of the GUI depending on factors like the nature of the customer’s support contract, the skill of personnel, the use of third-party integrators or MSPs, and so forth.

Vendors, likely seeing a loss of management differentiation, haven’t been wild about the approach (though it was a Cisco employee who led the i2aex draft).  Even operators have been cool, with some saying they feared the impact of gathering all the telemetry from all the resources and functional elements.  They tend to change their view when you explain that 1) the new approach would eliminate the risk of management denial-of-service issues, 2) the use of analytical tools already implies accessing the same information in order to keep current network state and trends available, and 3) the rate of access to management interfaces is controllable if a single element polls them, which it isn’t if everyone interested just takes their shot whenever it’s convenient.

Portals are useless if they’re as monolithic and rigid in features and functions as the underlying operations systems have been.  You need to have full visibility, but you also need to decide how you’re going to exploit it.  I’m not saying the notion of “derived operations” is the only solution, but it’s a solution that would obviously work.  If vendors have a different approach, they need to describe it.

The Right Way to Model Elements of NFV Infrastructure and Services

As I’m sure regular readers of this blog know, I don’t really like where the NFV ISG is today.  I do like some of the places it’s been.  One place is the notion of a kind of modular virtual infrastructure.  The concept of “virtual infrastructure” and its management (via, to no surprise, a VIM) has evolved within the ISG, but it’s the start of something important, which is a concept of modularity in carrier cloud.  In fact, it raises an interesting question about what “infrastructure” means in the age of the cloud.

Think of a resource pool as a collection of workers.  If every worker in your collection has different, specialized, skills then when you need to assign a task, you probably have one option only.  That’s losing the spirit of a “pool” of resources, right?  On the other hand, if the collection is made up of a bunch of general handy types, then you can use anyone for any task, and you have a useful pool to work with.  That has to be the goal with any pool concept, including carrier cloud.

When the NFV ISG got going, they saw “resources” as the virtual infrastructure, represented by a VIM.  They seem to have evolved a bit in their thinking, accepting that there might be multiple VIMs and even that “VIM” might be a special case of “infrastructure manager” or IM.  If we accept both these principles then we have a starting point on defining what carrier cloud needs.

Cloud computing a la OpenStack, or container computing using Docker Swarm or Kubernetes, defines the notion of resource clusters, which are effectively pools into which you can deploy things.  The presumption is that the resources in a cluster are interchangeable, which means you don’t have to put in a lot of logic to decide what resource to pick.  NFV, of course, presumes that there are factors that determine what resource would fit best.  Those factors are generally things that traditional resource assignment doesn’t look at, and in practice they’re likely to involve the location of the resource, its ownership, its connectivity, and so forth.  It would be convenient, and fairly valid, to say that all these factors could be applied above the cluster process: pick a cluster based on the esoteric criteria, then let traditional software pick a host.

This, I think, opens a model for “virtual infrastructure management” and for creating a modular notion of carrier cloud hosting.  Ideally, a “virtual pod” might be created by combining servers, platform software, and local cluster control for deployment and redeployment.  This pod would be represented by a VIM, and the goal of the VIM would be to present the pod as an intent model for the function of “something-hosting”.  The “something” would represent any specialization or characterization of the capabilities within the pod.

In this approach, the VIM, like any intent-model element, would be a structure, a hierarchy.  That would allow specific requirements to be selected by dissecting the requirements of the application or service overall.  The same VIM would provide a management interface in the form of input parameters and an SLA, and would output status indications against the SLA it accepted.  That combination would make the pod self-contained; attach it as an option to a higher-level intent model element and it would harmonize management and deployment at the same time.
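A sketch of what a “virtual pod” fronted by such a VIM might expose is below; the class name, parameter fields, and SLA fields are illustrative assumptions, and the calls into the pod’s local cluster controller are stubbed out.

```python
# Sketch: a "virtual pod" presented as an intent model for "something-hosting".
# The outside world sees parameters, an SLA, and status; how the pod maps the
# request to OpenStack, Kubernetes, or bare metal stays hidden inside.
class HostingPodVIM:
    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = capabilities        # e.g. {"data-plane", "low-latency"}
        self.deployments = {}

    def deploy(self, function_id, params, sla):
        """Accept an intent ('host this function to this SLA') or refuse it."""
        needed = set(params.get("requires", []))
        if not needed <= self.capabilities:
            raise ValueError(f"{self.name} cannot satisfy {needed}")
        # ...here the pod would drive its local cluster controller...
        self.deployments[function_id] = {"sla": sla, "status": "active"}
        return f"{self.name}/{function_id}"

    def status(self, function_id):
        """Report state against the accepted SLA, not raw resource metrics."""
        d = self.deployments[function_id]
        return {"state": d["status"], "sla_violated": False}

edge_pod = HostingPodVIM("edge-pod-7", {"data-plane", "low-latency"})
handle = edge_pod.deploy("vfirewall", {"requires": ["data-plane"]},
                         sla={"availability": 0.999})
print(handle, edge_pod.status("vfirewall"))
```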

The generalized “IM” notion could work the same way.  Network services, meaning connection services, are obviously part of “infrastructure”, and so they could be presented as resources through an infrastructure manager.  That would be true whether they were created by hosting something on a server, or by using a community of cooperating devices, a traditional IP or Ethernet network.

The approach of building services up from “service infrastructure” could be expanded beyond hosting, even beyond network connections, to envelop all of the functional pieces.  One of the things I realized early in the world of NFV was that there were functional components of services that were not discretely provisioned per-user or per-service, but shared.  IMS and EPC are examples of such functional components.  Every cell user, every call, doesn’t get a unique instance of IMS and EPC.  In effect, these elements are shared resources just like a server pool is, and so they should be represented by an IM and composable as a shared resource.  If we add in the ability to build “foundation services” (as I called them back in 2013) and then compose them into other services, we have a fairly good picture of what virtual infrastructure should be.

A logical approach to virtualized infrastructure is good, but not good enough.  If we really want to frame NFV and carrier cloud in an agile and easily integrated way, we need to think in terms of “containers”.  Yes, I mean that the cloud/Docker/Kubernetes concept of containers would be good, but I also mean a bit more.  A container in the broadest sense is a package that contains an application component and links it with its deployment instructions and parameterization.  Think of this for a moment in the context of virtual network functions (VNFs).  A VNF today tends to be a fairly random piece of software, which is why integration and onboarding are reported to be a pain.  We could fix that by defining a standard VNF container, something that linked to all that variability on the inside but presented a common set of interfaces to the outside world.

An intent model in the real world should be a “class model” that defines the general characteristics of a given feature, function, virtual device, or role.  In infrastructure, the ISG VIM model works as such a class model, but in the VNF world nobody has taken the time to define what kind of intent model structure should exist for a given feature/function.  For example, we probably should have a “superclass” of “data-path-function”, which could then be sub-defined as “firewall” and other categories, and then further defined based on vendor implementation.  If you want to sell a firewall VNF, then, it should be something that implements “firewall”, and management and deployment practices for anything that does so should then work for all the implementations.
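A sketch of the class-model idea follows, with the superclass, interface, and vendor names invented for illustration: any firewall that implements the “firewall” intent class presents the same deployment and management surface, so generic onboarding code works unchanged.

```python
# Sketch: an intent-class hierarchy for data-path VNFs. Management and
# deployment code works against DataPathFunction/Firewall; vendor packages
# only supply the implementation behind that common surface.
from abc import ABC, abstractmethod

class DataPathFunction(ABC):
    """Superclass: anything that forwards packets exposes these intents."""
    @abstractmethod
    def deploy(self, host, params): ...
    @abstractmethod
    def status(self): ...

class Firewall(DataPathFunction):
    """Subclass: adds the intents every firewall must honor."""
    @abstractmethod
    def set_rules(self, rules): ...

class AcmeFirewallVNF(Firewall):
    """A hypothetical vendor implementation packaged in the 'VNF container'."""
    def deploy(self, host, params):
        self.host, self.rules = host, []
        return f"acme-fw@{host}"
    def status(self):
        return {"state": "active", "rules": len(self.rules)}
    def set_rules(self, rules):
        self.rules = list(rules)

def onboard(vnf: Firewall, host):
    """Generic onboarding: works for any implementation of the class."""
    vnf.deploy(host, params={})
    vnf.set_rules(["deny all inbound"])
    return vnf.status()

print(onboard(AcmeFirewallVNF(), "edge-pod-7"))
```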

I’ve talked about refining the intent-model notion by defining class structures for types of functions, and making this abstract notion into a container-like element would mean creating a software implementation of the abstract function.  Every vendor, having a responsibility to frame their own virtual function offering in one or more “intent classes”, would then present a self-integrating element.  Best of all, this would be a totally open approach.

There are some players introducing at least a piece of this approach.  Apstra has intent-based infrastructure, for example, and EnterpriseWeb has a representational modeling approach that facilitates onboarding VNFs through a kind of “virtual container”.  Modeling languages/tools like those based on OASIS TOSCA could also be used to define both the “virtual container” and “IM” pieces of the model, and to structure the hierarchy that I think is essential to a complete implementation.

NFV has evolved under market pressure, but the pace of accepting its early and obvious limitations has been much slower than necessary.  I’ve been a part of these discussions for five years now, and we’re still just coming around to the obvious.  Vendors may be moving faster than bodies, even faster than open-source groups.  Even with the disappointing progress of architectural thinking, we see some examples of new-think across the board.  This can all be done, in short, and if it were, it might not only help deal with the NFV bugaboo of integration but also get NFV thinkers thinking in the right direction about what virtualization means.

The First Application-Side View of Logical Networking Emerges

The first explicit example of how logical networking could change everything just came along.  VMware announced its own approach to that goal with Virtual Cloud Network, and it also demonstrates that SDN can be a player in logical networking as much as SD-WAN.  In fact, the speed at which VMware has jumped into the space could mean that SDN-and-data-center players will take an early lead.

VMware’s approach is based on the theme that “Apps and data are living in a lot of different places on top of virtual infrastructure”, which means that virtualization of resources is a critical driver in changing the perspective of networking to something more “logical”.  The old model, which I call “facility networking” because it networks sites, is being supplanted by a model of a “flexible, programmable, network fabric….”  This is exactly the right way to frame the transformation for a data center vendor; focus on what you’re already doing.

Virtual Cloud Network is a software-hosted fabric that creates a service/application network through NSX-based overlay technology.  It can follow a resource anywhere, including public cloud, multi-cloud, private data center, edge computing, feature/NFV hosting, vendor hosting, device hosting, you name it.  The idea is to have a software-based agent element that can be incorporated in anything that offers data or feature hosting.

Just because VMware is focusing Virtual Cloud Network on what VMware does, which is host stuff and connect data center resources, doesn’t mean it’s not an advance.  They are taking a very bold and important step by making application networks (which are “logical networks” in my terms) explicit.  Everything gets connected by an application network, and because applications are the information hub of everything, that network defines the access to all information, which is what VMware means when they say they offer “intrinsic security”.  Data center connectivity is enhanced with Virtual Cloud Network, including new public cloud partner support, explicit union of “cloud hosting” and “virtual function hosting”, and container and microservice support.

SD-WAN isn’t out of the picture here.  The VeloCloud SD-WAN is a big part of the story, extending VMware’s application networks to branch locations.  SDN has usually been confined to switching applications within the data center.  VMware Workspace One provides what’s effectively a virtual/logical device instance management framework for worker access.  However, these SD-WAN features fall short of managing the users as logical elements of the network; Virtual Cloud Network is still more about the data center.

I don’t think it’s going to stay that way, and I don’t think VMware believes it will either.  This is going to be an expanding ecosystem that will eventually integrate most of what we’d consider policy-based network and application management, access control, service and API registration, and above all role, group, and logical user addressing.

One reason this is true is that Virtual Cloud Network is not very far from the Nokia/Nuage SDN solution in terms of capabilities.  Another is that there are already SD-WAN vendors (128 Technology, for example) that go a very long way toward uniform logical networking, albeit starting more from the user side.  In fact, what we have in the logical networking space today is a clear two-faced solution—some players are facing outward from the data center and focusing on data center incumbency for their sales strategy, and some are looking inward from the branch and the user, which is a more network-centric sales approach.  The two faces almost inevitably converge under pressure to enhance value and create differentiation.

Another convergence-supporting force is the “WAN” side of SD-WAN.  There are two major constituencies in the network space, one the buyer of service and the other the provider.  Both of them are looking for virtual private networks of a different kind than they have today—the old MPLS VPN.  The problem for users is that traditional VPNs have a very high cost, in part because the VPN service is delivered through an expensive Ethernet connection, and in part because the user is typically required to have a fairly high-touch router termination.  The sellers, the network operators, currently spend about two-thirds of their capex and opex on the Ethernet/IP layer, and they want to offer business services that are affordable for small/branch locations.

Some pieces of VMware’s Virtual Cloud Network (in particular VeloCloud) aim at that network-centric target, and at the same time tie in (at least in positioning terms) things like IoT as a promise for further, thinner, deployment of resources.  That lets VMware position what’s essentially application networking as something getting closer and closer to the user, and thus subduct logical user management, including mobility, into application networks.  Clever move.

The VMware cleverness serves notice in two directions.  First, other hosting players (notably HPE) will have to start thinking hard about their own unified, cloud-centric, logical-network approach.  Otherwise, they’ll get left behind.  Second, the network-centric players in both the SDN space and the SD-WAN space will have to quickly expand their own thinking much more in the application direction or risk being pushed out of the unified logical-networking market as it develops.

The biggest question in this two-dimensional competitive challenge is what M&A will come out of it.  Simple partnerships are great to cement an early position in a market that might develop quickly, but they risk having a partner snapped up by a competitor, and at the least they share the wealth too much.  Acquisitions would be the most favorable approach.

VMware has its own challenge, too.  Within its own base, there’s no question that Virtual Cloud Network will look not only credible but perhaps even compelling.  Outside its base, not so much, and VMware hasn’t set the world on fire with its positioning.  They tend to go for the classic boil-the-ocean scattershot of features and capabilities, which is hard to sell to senior management.  They have a lot of good stuff, but they need to sing better (how many vendors have I said that about, I wonder?).

Then there’s NFV.  On the plus side, VMware is the first vendor to position a common platform for the cloud, enterprise networking, and NFV.  Since there is zero chance of NFV amounting to anything at all without this kind of combined positioning, that’s a good thing.  However, NFV is probably in the pitch to cement a role for VMware in carrier cloud, where the company has been a non-starter.  If that’s the goal, then it’s not going to serve them well, because strictly by-the-book ETSI-NFV-ISG-flavored NFV isn’t likely to do anything in terms of real deployment for years to come, if ever.

That creates the really big problem for this VMware announcement.  Carriers are the long-term provider of choice for SD-WAN and logical networking.  Cloud providers are in second place.  The latter don’t care at all about NFV, and the former need something with some early revenue opportunities, which NFV won’t provide.  The VMware Virtual Cloud Network model could work perfectly fine with other drivers of carrier cloud that will mature faster and go further.  That VMware didn’t grab onto them means that their competitors have a better story to tell—if they don’t make the same mistake themselves, of course.

The key point here is that there are going to be a lot of stories on logical networking, from a lot of different slants.  I didn’t expect to see someone like VMware step into it this early, but since they have it’s likely that other vendors will also accelerate their own offerings.  It could be an interesting summer.

The Hardware and Platform Requirements for Edge Computing

Suppose we do see edge computing.  What exactly do we end up seeing?  Is edge hosting just like cloud hosting, or does it tilt a bit toward feature hosting or event processing?  If so, is the architecture needed, both hardware and software, likely to be different?  These are important questions for vendors, but no less important for the operators who are likely to be investing in edge deployment.  We have to answer them starting with the things we can count on most and moving from there to more speculative points.

One thing we know for sure is that the edge is unlikely to be just a small version of the cloud, based on exactly the same technologies.  The network edge isn’t a convenient place to build economy of scale because an edge site sits close to the users and devices it serves, and so it can draw on only a limited area for its workload.  Since proximity to the user/application is the value axiom of edge computing, you clearly can’t expect somebody to backhaul traffic for a hundred miles to get to an edge data center.  The edge cannot compete in resource efficiency with a deeper-hosted metro or regional cloud data center.

Event processing and caching are two application classes that fit an edge-host model.  Both of these could in theory be hosted on general-purpose servers, which means that standard Linux operating system distros and at least most middleware could be used.  Caching and ad targeting are fairly traditional in other ways, resembling something like web hosting, and so I think that standard container and VM software would likely be suitable.

When you move to event processing, the focus shifts.  Lambda or functional applications, microservices, and other event-handling components are typically very small units of code, so small that a VM would be wasted on one unless you ran something inside it to further subdivide the resources and schedule the features efficiently.

The industry standard for microservice hosting is containers, and CoreOS (now Container Linux) is probably the most accepted of the server distros.  There’s a Lambda Linux designed to provide a pathway for logic off of Amazon’s Lambda service, but most operators would probably not want to tune their operating system down to something that specific.  A more general approach is represented by the Linux Foundation Akraino Edge Stack project, supported by Intel, Wind River, and a host of operators, including AT&T.  The key point of the project is the creation of an optimized platform for edge hosting, which includes containers.

The problem with containers is the management of the scheduling process.  Container deployment isn’t fast enough to make it ideal for hosting very short-lived processes.  Remember that event-driven functions are typically expected to load on demand.  One way to improve the handling of events in container systems is to rely on a lot of memory and forget the idea that you have to load on demand.  With that proviso, the Akraino approach seems to be the best overall path to a generalized software platform at the edge.
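To make the keep-it-resident idea concrete, here’s a minimal sketch (Python, with invented module and function names) of an event dispatcher that pre-loads its handlers into memory at startup rather than loading them when an event arrives.  The point isn’t the code, it’s the trade: you spend memory up front to take the cold-start delay out of the event path.

```python
import importlib
import time

class WarmHandlerPool:
    """Keep event handlers resident in memory so an incoming event never
    waits on a container or process cold start (illustrative sketch only)."""

    def __init__(self, handler_modules):
        # Pre-load every handler at startup; this trades memory for latency.
        self.handlers = {
            name: importlib.import_module(module).handle
            for name, module in handler_modules.items()
        }

    def dispatch(self, event_type, payload):
        handler = self.handlers.get(event_type)
        if handler is None:
            # The cold path (loading on demand) is exactly what we want to
            # avoid for short-lived, latency-sensitive events.
            raise KeyError(f"no warm handler for {event_type}")
        start = time.perf_counter()
        result = handler(payload)
        return result, time.perf_counter() - start

# Hypothetical usage, assuming modules like 'sensor_alarm' expose a handle():
# pool = WarmHandlerPool({"alarm": "sensor_alarm", "meter": "meter_reading"})
# result, latency = pool.dispatch("alarm", {"sensor_id": 42})
```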

Another possibility that might be attractive if you can set aside the requirement for generalized software hosting is to forget Linux in favor of an embedded OS (remember QNX?).  This strips out most of the general-purpose elements of an OS and focuses instead on creating a very short path for handling network connections and short-duration tasks.  The problem you can have is that most embedded-control systems aren’t as flexible in terms of what they run.

Could we afford to have multiple hardware systems and software platforms in the edge?  It depends.  Edge hosting isn’t any different from regular cloud hosting in that economy of scale is an issue.  If you have five different edge platforms, each of them will have to be sized for its own maximum design load, which will waste more capacity than if you had three platforms, or even one.  Plus, the more specialized your platform, the more difficult it will be to know how much capacity you need there in the first place.
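A quick back-of-the-envelope illustration of that capacity point, using made-up load numbers: if each platform is sized for its own peak, you provision for the sum of the peaks; if one shared pool handles everything, you provision for the peak of the sum, which is smaller whenever the peaks don’t coincide.

```python
# Illustrative only: invented hourly load profiles for three edge workloads.
video_cache = [40, 35, 30, 60, 90, 120]   # peaks in the evening
iot_events  = [50, 55, 60, 58, 40, 30]    # peaks during the workday
ad_serving  = [20, 25, 30, 45, 70, 80]

# Separate platforms: each must be sized for its own peak load.
separate = max(video_cache) + max(iot_events) + max(ad_serving)

# One shared platform: sized for the peak of the combined load, which is
# lower because the individual peaks don't line up in time.
combined = max(v + i + a for v, i, a in zip(video_cache, iot_events, ad_serving))

print(f"separate platforms: {separate} units; shared pool: {combined} units")
# With these numbers: separate platforms need 260 units, a shared pool 230.
```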

Rather than pick an embedded OS with totally foreign APIs and administration, it might be wise to opt for an embedded Linux distro.  Embedded Linux uses most of the standard POSIX APIs, which means that it’s likely to support a wider range of applications out of the box.  Wind River has a family of Linux products that include embedded systems, and their approach gives you a wider range of stuff you can run.  Red Hat also has an embedded-system version and toolkit that deal with specialized edge requirements pretty well.  Hardware vendors who offer embedded Linux will normally use one of these two.

I think that edge computing on a Linux platform is a given, and that it’s likely that the edge version would be optimized at least for containers and perhaps also for fast-path, low-latency, event handling.  This isn’t quite the same as network-optimized hardware and software because event processing takes place after the message has been received.  Edge computing, even for hosting VNFs (an application that I think exploits edge computing where available but won’t provide a decisive driver), requires fast task switching and minimal delays for inter-process communications (IPC).

An optimized Linux-container model seems the most appropriate software model for edge hosting, given that video delivery, ad delivery, and personalization applications are more likely to drive early deployments than event processing and IoT.  The hardware would likely resemble a modified form of the current multi-core, multi-chip boxes available from players like Dell and HPE, but with a lot more memory to increase the number of containers that could be hosted, reduce the time required to load processes, etc.  We’d also want considerable fast storage, likely solid-state drives, to keep content flowing.  Network performance would also be important because of the need to source large numbers of video streams.

I wonder whether we might end up seeing what could be called “hybrid hosts”, where we had a multi-box, tightly coupled cluster of devices forming a logical (or even real) server.  One device might be nothing more than a P4 flow switch based on a white box, another might be a specialized solid-state cache engine, and the last a compute platform.  How tightly coupled these elements would need to be depends on how fast any out-of-box connections could be made.  If really tight coupling were needed, a multi-chip box with elements for flow switching, caching, and computing might emerge.  This would be the true edge of the future.
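Purely to illustrate how such a cluster might divide its work, here’s a hedged sketch (all names and interfaces invented) of the logical pipeline: the flow-switch element classifies traffic, the cache element answers whatever it can locally, and only the remainder falls through to the compute element.

```python
class FlowSwitch:
    """Stand-in for the P4 white-box element: classifies incoming requests."""
    def classify(self, request):
        return "content" if request.get("type") == "segment" else "compute"

class CacheEngine:
    """Stand-in for the solid-state cache element holding popular content."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def put(self, key, value):
        self.store[key] = value

class ComputeNode:
    """Stand-in for the general-purpose compute element of the cluster."""
    def handle(self, request):
        return f"processed {request}"

class HybridHost:
    """A logical server composed of the three coupled elements."""
    def __init__(self):
        self.switch, self.cache, self.compute = FlowSwitch(), CacheEngine(), ComputeNode()

    def serve(self, request):
        if self.switch.classify(request) == "content":
            hit = self.cache.get(request["key"])
            if hit is not None:
                return hit                           # served entirely by the cache element
            content = self.compute.handle(request)   # miss: fall through to compute
            self.cache.put(request["key"], content)
            return content
        return self.compute.handle(request)

# host = HybridHost()
# host.serve({"type": "segment", "key": "show-episode-3/seg-001"})
```

How useful that split would be in practice depends, as noted above, on how fast the coupling between the elements could be made.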

How fast that unified model would emerge, or whether it would emerge at all, might well depend on the pace of growth of IoT.  Pure event-handling has relatively little need for flow switching, and it probably doesn’t have to hold as much in cache (event processes ready to run) as a video cache would have to hold in content.  If IoT and video/advertising share the driver role, then multiple coupled boxes forming a virtual edge device is the more logical strategy.  The decision will have to be made by the IoT market, which so far has been way more hype than effective strategizing.

Can 5G Really Drive Edge Computing, and If So, Where?

Everyone is fascinated by the relationship between 5G and edge computing.  What’s perhaps the most fascinating thing about it is that we don’t really have much of a notion of how that relationship works, or even if it’s real.  5G is a technological evolution of wireless.  Edge computing is a technological evolution of the cloud.  All technology evolutions have to pass the benefit-sniff test, and at this moment it’s not clear if either 5G or edge computing can do that.  It’s not clear if they’re explicitly symbiotic either.  The good news is that they could be, and that both could be wildly successful if done right.

I used to read “Popular Science” as a boy, and I can remember an article on how nuclear weapons would be used to create huge underground tanks to hold fuel, water, and other stuff.  It was feasible, it was novel and interesting, but obviously it didn’t pass the benefit-sniff test because we don’t do it and never did.  The point is that it’s interesting and even fun to speculate about the way technology, networking included, will change our lives, but to make it real in any given timeframe, we have to make it worthwhile financially.  We have to be sure that 5G and edge computing aren’t building nuclear-crater tanks.

A recent Wall Street report says that “A key factor in any 5G debate is the ability to support low-latency applications”, which is their justification for saying that 5G will re-architect the metro network to increase fiber capacity and reduce hops and will promote edge placement of computing.  The only thing that separates this from nuclear craters is the presumption that low-latency applications do have immediate credible benefits to reap, benefits to offset costs and risks.

Low-latency applications really mean IoT or other event-driven applications.  There’s certainly credibility for these applications, but there is still no real progress on creating meaningful opportunities that would justify a whole new wireless infrastructure.  Can we say what might?  Sure, just like I could have listed possible liquids (or even solids) that could be stored in a nuclear-bomb-built container.  The question isn’t possible application, it’s possible high-ROI application.

That’s the big problem for those who say that 5G will drive NFV or edge computing or IoT or whatever.  5G has applications to those things, which means that 5G could benefit if they happened on their own, and also that each of the things could benefit if 5G happened on its own.  However, linking two suppositional things doesn’t create certainty.  Something has to start the ball rolling, and there’s a big barrier that something has to cross.

The biggest barrier to the unity of 5G and edge computing for low-latency applications is managing the “first cost”.  Do operators spend hundreds of billions of dollars to deploy this wonderful mixture, and then sit back and hope somebody uses it and pays enough to make it revenue- and cash-flow-positive?  We all know that’s not going to happen, so there will have to be a credible service revenue return, and an acceptable risk in terms of first cost.

The logical pathway to achieving that goal depends a bit on geography, but it centers on what’s called “Non-Standalone” or NSA.  This is an overlay of 5G New Radio (NR) on 4G IMS/EPC, which means it doesn’t have any of the 5G Core features like slicing.  What this will do is let a couple of new 5G frequencies be opened, 5G handsets be used, and a transition to 5G begin.  A somewhat similar model applies 5G millimeter wave to extend fiber-to-the-node as a means of providing cheaper broadband in urban and suburban areas.  That justifies new services and frequencies, but it’s not so much an on-ramp to full 5G as an on-ramp to IP streaming in place of linear TV.

The reason I’m linking these two is that as far as edge computing is concerned, 5G is only a potential driver (like pretty much everything else).  There are five others, as I said in a previous blog, but the most compelling in the near term is a combination of the linear-TV to streaming IP video transition, and the monetization of advanced video and advertising features that arise from that transition.  Both the 5G NSA and mm-wave options are possible drivers for the streaming transformation, which makes it more important than 5G alone.

Internet video is more dependent on caches, meaning content delivery networks (CDNs), than it is on Internet bandwidth inside the access network.  Streaming live video has to be managed differently because live viewing creates something that on-demand streaming doesn’t—a high probability of coincident viewing of the same material.  You can also predict, from viewer behavior, just how many users will likely stream a given show and where they’ll be when they do it.  Thus, live streaming requires improved caching technology.  Most operators say that mobile video, as opposed to wireline, is less likely to be real-time streaming, but mobile video still needs caching control to accommodate the movement of the viewer through cells.  This kind of caching differs from traditional commercial CDN caching in that it’s inside the operator network rather than connected at the edge.
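To show what that prediction might look like in practice, here’s a minimal sketch with invented numbers: estimate the concurrent live streams each edge site should expect for a given show, since that figure is what drives cache and egress sizing at the site.

```python
# Invented figures for illustration: subscribers homed to each edge site and
# a historically observed tune-in rate for one live show.
subscribers_per_site = {"metro-edge-1": 40_000, "metro-edge-2": 25_000, "suburb-edge-7": 9_000}
expected_tune_in_rate = 0.12    # assume 12% of subscribers watch this show live
avg_stream_mbps = 6             # assumed top adaptive-bitrate profile

for site, subs in subscribers_per_site.items():
    concurrent = int(subs * expected_tune_in_rate)
    egress_gbps = concurrent * avg_stream_mbps / 1000
    print(f"{site}: ~{concurrent} concurrent streams, ~{egress_gbps:.1f} Gbps of egress")
```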

Streaming video encourages ad customization, and in fact AT&T mentioned the potential of improved ad targeting for revenue gains during its most recent earnings call.  Ad customization means not only better ad caching to avoid disruptions in viewing when switching from program to ad, or between ads, but also logic to serve the optimum ad for the viewer.  Combine this with the program caching issues of the last paragraph and you have the most credible near-term opportunity for edge computing.
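As a sketch of what “serve the optimum ad” could mean at the edge (the scoring logic and data fields here are invented), the sensible bias is to pick the best ad that’s already in the local cache, so the splice into the ad break doesn’t stall, and fall back to an upstream fetch only when nothing cached fits the viewer.

```python
def choose_ad(viewer_profile, cached_ads, catalog):
    """Pick the best ad already cached at the edge; fall back to the wider
    catalog only if nothing cached matches the viewer (illustrative logic)."""
    def score(ad):
        # Invented scoring: overlap between ad targeting tags and viewer interests.
        return len(set(ad["tags"]) & set(viewer_profile["interests"]))

    local = [catalog[ad_id] for ad_id in cached_ads if ad_id in catalog]
    best_local = max(local, key=score, default=None)
    if best_local is not None and score(best_local) > 0:
        return best_local, "served from edge cache, no splice delay"
    # Miss: this is where a fetch from a deeper CDN tier would be triggered.
    best_any = max(catalog.values(), key=score)
    return best_any, "fetch required from an upstream cache"

# catalog = {"ad1": {"tags": ["sports", "trucks"]}, "ad2": {"tags": ["cooking"]}}
# choose_ad({"interests": ["sports"]}, cached_ads=["ad2"], catalog=catalog)
```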

What this means to me is that 5G, which for most reports means “5G Core” features like slicing, isn’t going to be a near-term driver for edge computing because it’s not going to happen in the near term.  Operators will not deploy edge computing now to support a possible 5G Core deployment in 2021 or 2022.  They would deploy it to support enhanced video caching and ad support to accommodate a shift from linear TV to streaming live IP video.

I also believe that IoT-driven 5G deployments, also often cited by supporters of the “5G drives the edge” strategy, are unlikely to impact edge computing in the near term, but have a better shot than 5G Core.  If 5G NR happens, and if there were to be a major onrush of new 5G-connected IoT elements, then you’d have a set of low-latency applications.  That’s two “ifs”, the first of which is credible if we define “near-term” as 2020, and the second of which has that now-familiar business-case and first-cost problem.

Edge computing can only be driven by things like streaming video or IoT.  Of those two, only the streaming video shift has any promise of delivering compelling benefits in the next two years.  5G in the form of the 5G/FTTN hybrid could drive streaming video, and that would then drive edge computing.  If we narrow our scope to say that 5G/FTTN, followed by 5G NSA, could shift live TV decisively into streaming video, then we have created a compelling link to edge computing.  Without the streaming-video presumption, though, 5G is not going to get us to edge computing for at least four years, and even then it’s a guessing game based on whether IoT ever finds a realistic business model.  Till then, we’re digging hypothetical nuclear holes in the ground.