Where is the 5G Competitive Dynamic Going to Take Us?

If 5G is a vast and confusing set of technologies, it’s just as vast and confusing at the competitive level.  If you do a bit of Internet research on 5G you’ll find that a few key vendors seem to be offering all the collateral.  Ericsson tops the list, with Huawei and Nokia behind.  Below them, there’s an array of players and even standards groups that are hoping to ride the coattails of 5G onward to fame and fortune.  If buyers actually have all this to choose from, can we ever hope they’ll make a decision and deploy?  Can buyers tolerate a vendor lock-in with a big player, or can they find a path to building an open 5G network by summing a bunch of small parts?

Operators tell me that these two models of 5G, the one-stop shop and the open federation, are both on the table, and while we don’t see this specific view much in articles and tutorials, it’s what is really going to drive 5G wherever it goes.  What we need to do, then, is fit the news into the models.

5G is a big interactive collection.  Operators could make the whole question of assembling the pieces and having them work together easier by picking one vendor.  Ericsson, who has its own challenges in the revenue and profit area, is banking on 5G to make its numbers for years to come.  One of my contacts there tells me that in business planning sessions, 5G shows up as the big opportunity all the way through 2023.

Nokia is in a similar situation.  While the company has a broader IP network product line than Ericsson, it’s still facing the challenge of convincing buyers who see profit per bit falling that spending more on pushing bits is a smart approach.  One thing that’s been true about 5G for the last three years is that operators have put it into their long-term budget planning without much question.  That means that linking some equipment/software proposal to 5G gets it a seat at the financial table, at least.

Huawei can afford to be a bit more circumspect on 5G prospects because they are profitable and have the broadest product portfolio of any major network vendor.  Still, they see a risk in 5G because if the future does indeed depend on 5G for CFO credibility, then they have to play strongly there.  That’s a challenge for Huawei because they’ve historically focused on being the price leader more than the new-technology driver.

It’s tempting to look at the 5G market as a war between these three players, but here’s an operator comment (suitably sanitized) that demonstrates the first of several levels of complexity.  “5G is really three layers.  There’s the New Radio stuff that’s distinctly 5G.  There’s the virtual or slicing piece, which is a combination of mobility, subscriber management, and virtualization like NFV.  Then there’s everything from backhaul through the whole metro and cloud infrastructure, which isn’t 5G any more than wireline.”

The point this comment makes is that what’s distinctively 5G is really a veneer on a big onion.  Vendors with an ecosystemic 5G solution (like Ericsson, Huawei, and Nokia) have to establish their 5G creds in a thin layer of technology, and then have that pull through a larger chunk of gear.  That reality intersects with our second level of complexity, which is that 5G probably doesn’t deploy all at once.

Almost everything that a user would see as a 5G benefit comes along through the New Radio portion of the standards.  The relationship between 5G and the handsets is largely established by the NR stuff too, and that combination has induced everyone to focus on NR first.  Then there’s the fact that a millimeter-wave portion of 5G NR can be combined with fiber to the node to create an alternative to DSL in wireline broadband delivery.  This 5G/FTTN hybrid could well end up deploying faster than the mobile part of 5G NR.

“I want to win at 5G,” says an ecosystemic vendor.  “5G NR is first to deploy, hence I want to win in 5G NR.  However, the whole 5G ecosystem is the target, so I need a 5G architecture.”  That’s why Ericsson, Huawei, and Nokia are taking such a broad 5G mission target at MWC.  They have to convince buyers that their first step has to consider the rest of the path, the ultimate future.  If they don’t, then an early win may not give the vendors much in the way of longer-term revenue growth.

So, are we in the battle of the 5G ecosystems?  Maybe not, because of the second pathway that 5G could take, which is the open approach.  Suppose the “5G ecosystem” were completely, effectively, defined?  Every part, piece, feature, box, software component established and put in its rightful place?  What we’d end up with is a war for best of breed for every one of the elements of 5G.  Many vendors who specialize in one technology area would have a shot.  Nobody would win it all.

The threat, from the perspective of the “open system 5G” vendors, is that same early-NR deployment.  There are very few credible NR players besides the main ecosystem vendors, so if operators start with NR and pick one of those few credible players, that gives the player a leg up on the whole ecosystem and threatens the open model of 5G.

This is what’s behind the notion of the whole “Open vRAN” initiative, a Cisco idea to break the tri-opoly of those big ecosystem NR players.  According to the press release I’ve linked to here, Cisco thinks that it’s time to collect all the various vRAN concepts into a single initiative and collaborate to develop that initiative and model as a viable NR option.  The NR whole would be greater than the sum of its disconnected parts, in short.

It’s obvious that this approach is hardly compatible with the goals of Ericsson, who is perhaps the most dependent of the ecosystem players on a big 5G success.  It’s also obvious that this could represent a strain on the Cisco/Ericsson relationship, but the big truth is that it means that Cisco doesn’t want to rely on Ericsson’s 5G success to pull Cisco gear into a full 5G deployment.  If the Cisco/Ericsson partnership is seen as a focus issue, the move doesn’t signal Cisco’s withdrawal as much as it signals that Cisco doesn’t think there’s anything real to withdraw from.

The big question here is whether vRAN is a real option or just a competitive spoiler.  The old “horse designed by committee” adage comes to mind, particularly when you consider that open vRAN has a limited real benefit to operators if nothing broadly 5G-supportive comes along behind it.  Remember that current standards initiatives like SDN and NFV are incomplete with respect to a full 5G infrastructure mission, and that it’s difficult to even find an open tutorial on 5G, much less open elements for sale.

On the spoiler side, it’s hardly unlike Cisco to either rain on a market parade if they don’t look good in the space, or work directly to stall competitors’ progress.  If they think the market might develop and don’t think Ericsson will get them enough of a piece, then an open strategy would be wise.  That’s particularly true if we assume that an open RAN or open 5G option would require the same kind of standardization and federation of interests that have been ineffective in making real headway with SDN or NFV.

But perhaps Cisco sees more, and perhaps most of all, they see a future telecom market that could be dominated by mobile-infrastructure incumbents.  If 5G is the way to the hearts of CFOs, then could the operator comments on 5G that I cited suggest that pulling wireline and wireless closer would favor incumbent wireless giants, among whom you could hardly count Cisco?  Opening wireless via vRAN might then be a very wise move indeed.

Is There a New and More Compelling SD-WAN Mission?

One candidate for the so-called “killer app” for SD-WAN is security, a popular but almost always vague killer-app claim made for nearly everything.  In the case of SD-WAN, there are security benefits to be had, but they are part of a larger benefit I’ve noted in past blogs.  SD-WAN could be a critical piece of the largely unrecognized function of virtualization-linked user/resource mapping.

Because everything on an IP network has to have an address, it follows that you can use the addresses of a source and destination to decide whether the two should be allowed to talk.  There are two problems with this nice approach, though.  First, you can’t be sure the address is authoritative, meaning that it really represents the originator of the packet.  Second, in a virtual world, the address of either source or destination can change, so if you set up access rules based on one relationship, they might be invalid a moment later when an instance of the application is redeployed.  SD-WAN could fix, or at least mitigate, both these points, but to see how, we have to dig in a bit.
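
To make the second problem concrete, here’s a minimal sketch in Python, with every address invented for illustration: an access rule keyed on raw IP addresses silently stops matching the moment the application instance is redeployed at a new address, even though the logical relationship hasn’t changed.

  # Hypothetical addresses: an allow-list keyed on raw IP pairs breaks silently
  # when a virtualized destination redeploys to a new address.
  ALLOWED = {("10.0.1.15", "10.0.2.40")}   # client -> app instance, by address

  def permitted(src, dst):
      return (src, dst) in ALLOWED

  print(permitted("10.0.1.15", "10.0.2.40"))  # True: rule matches the current instance
  # The app instance is redeployed and comes up at 10.0.2.77...
  print(permitted("10.0.1.15", "10.0.2.77"))  # False: same logical app, now blocked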

We see IP through the Internet, which we see as a vast “flat” address space where we can reach resources like websites and content.  IP is actually a lot more complicated than that, and in some of my recent blogs I’ve explained a bit about the public/private address spaces and the fact that there’s a basic tension between the “what” something is and the “where” (in a network sense) it can be found.  Virtualization complicates that fundamental tension by scaling things and moving them around.

In the world of resources, the what/where tension relates to things that, because they are resources, have to be addressable from the outside.  Most real IP networks are a combination of resources and resource consumers.  The consumer, user, side of the picture is very different from the resource side in that not only isn’t it necessary to address a user, it’s probably not desirable.

There you sit at your PC.  You’re doing nothing online, and if no background processes are taking place, it’s not doing anything online either.  You open a browser and you become a resource consumer, consuming the resource of a web server that provides you with your browser homepage, likely a search engine.  This relationship demands two IP addresses, one for the web server and one for you.

Long ago, Internet thinkers realized that it wasn’t really necessary for these two IP addresses to work the same way.  The user device doesn’t have to be reached, only responded to.  Servers know who you are because you contacted them.  Thus, most Internet users have a private IP address that’s their location within their home network.  A network address translation (NAT) process replaces your private source address with the home’s public IP address, assigned by your ISP, on a message you send.  The web server responds to that public address, which gets the response to your home gateway, which then reverses the NAT mapping and delivers the response to you.  This same process is often used within branch offices and even workgroups within offices, because it saves on scarce public IPv4 addresses.  Without it, we’d have run out and gone to IPv6 long ago.
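
To make that process a little more tangible, here’s a rough sketch of the NAT bookkeeping, with the public address and ports invented for illustration: outbound traffic gets the home’s public address substituted for the private source, and a small table lets the gateway map the response back to the right device.

  # Hedged sketch of NAT: translate outbound private sources to one public
  # address, remember the mapping, and reverse it for inbound responses.
  PUBLIC_IP = "203.0.113.7"        # the address your ISP assigned (example value)
  nat_table = {}                   # public port -> (private ip, private port)
  next_port = 40000

  def outbound(src_ip, src_port):
      """Rewrite a private source to the public address and record the mapping."""
      global next_port
      public_port = next_port
      next_port += 1
      nat_table[public_port] = (src_ip, src_port)
      return PUBLIC_IP, public_port

  def inbound(dst_port):
      """Map a response to the public address back to the private host."""
      return nat_table[dst_port]

  pub = outbound("192.168.1.23", 51514)   # your PC sends a request
  print(pub)                              # what the web server sees
  print(inbound(pub[1]))                  # what the home gateway restores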

Modern container architectures, as I’ve noted in recent blogs, are also based on assigning private IP addresses within application subnetworks, then exposing the “public ports” in the corporate VPN space.  Inside a virtualized container-hosted resource pool, the actual host can be anywhere.  Again, as I’d noted, that means that you would expose selective interfaces, the ones intended for public access, but you’d have to map them to the proper host and load-balance them if there was scaling to be done.  Some of the internal interfaces, representing scalable components, would also have to be load-balanced.

In theory, we could also look at a movable user model.  A “real” user might access the company VPN via any number of on-ramps, each representing a facility or area in which the user might be operating, and also perhaps including a gateway to a mobile user.  We might envision a translation here—private subnet address maps to “virtual user address”.

The common thread here, a thread SD-WAN might exploit, is the notion of having “local” or private addresses used as on-ramps for either applications or users.  The thing they’re on-ramping to, of course, is another address space.  Most SD-WAN technology builds a uniform VPN address space by doing some sort of overlay on a combination of connection technologies, including MPLS VPNs and the Internet.  Since SD-WAN technology can sit in all business sites and many can also be deployed as a software agent in a cloud, it’s easy for SD-WAN to create an overlay VPN that can touch all users, all applications, all resources.
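
Here’s a hedged sketch of that overlay idea, with site names, prefixes, and transports all made up for illustration: the SD-WAN presents one uniform VPN address space on top, and each edge is free to carry a given destination over MPLS or the Internet underneath.

  # Toy overlay routing table: uniform VPN prefixes on top, mixed transports below.
  import ipaddress

  overlay_routes = {
      "10.99.1.0/24": ("branch-Denver", "mpls"),
      "10.99.2.0/24": ("branch-Austin", "internet"),
      "10.99.3.0/24": ("aws-vpc-agent", "internet"),   # SD-WAN software agent in a cloud
  }

  def next_hop(dst_ip):
      """Find the overlay edge and underlay transport for a VPN destination."""
      for prefix, (site, transport) in overlay_routes.items():
          if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(prefix):
              return site, transport
      raise LookupError("destination not in the overlay VPN")

  print(next_hop("10.99.3.14"))   # ('aws-vpc-agent', 'internet')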

If you added in some logic, you could further enhance the mapping.  Users and application components could have “logical names” and these names could then be used to attach instances of the components to a load-balancer.  Logical names could also map real VPN addresses to instances hosted in private IP subnets or within a cloud provider.  For end users, you could assign users logical names that would link them to a user- or role-specific subnet wherever they enter the network.  This would require a sign-on process, of course.
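
Here’s a minimal, assumption-laden sketch of the logical-name idea: the name is stable, the set of instances or on-ramps behind it changes, and a trivial round-robin picks one per request.  All names and addresses are hypothetical.

  # Logical names map to whatever instances currently implement them.
  from itertools import cycle

  registry = {
      "orders-app": cycle(["10.1.4.11:8080", "10.1.7.23:8080"]),  # two live instances
      "user-jsmith": cycle(["172.20.3.9"]),                       # user's current on-ramp
  }

  def resolve(logical_name):
      """Return the instance/on-ramp that should receive this request."""
      return next(registry[logical_name])

  print(resolve("orders-app"))   # 10.1.4.11:8080
  print(resolve("orders-app"))   # 10.1.7.23:8080 -- same name, different instance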

If SD-WANs know enough to map at the logical level, they’d also know enough to do at least some rough validation of the addresses.  If a component is to be allowed to register for a given “logical” address, it could be required to offer a token to validate its right.  The same would be true, automatically, if you required user sign-ons.  Further, you could validate the virtual or local address being associated with a given logical address to see if that on-ramp was allowed to represent a given user or resource.  It’s not foolproof, but it’s a step forward.
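
A rough sketch of that validation step, assuming a shared secret and logical names that are purely illustrative: a component may register for a logical address only if it presents a token bound to that name.

  # Token-based registration: only holders of a valid token can claim a logical name.
  import hmac, hashlib

  SECRET = b"demo-only-secret"   # placeholder; a real system would manage keys properly

  def issue_token(logical_name):
      return hmac.new(SECRET, logical_name.encode(), hashlib.sha256).hexdigest()

  def register(logical_name, instance_addr, token, table):
      """Add an instance under a logical name only if its token checks out."""
      if hmac.compare_digest(token, issue_token(logical_name)):
          table.setdefault(logical_name, []).append(instance_addr)
          return True
      return False

  table = {}
  print(register("orders-app", "10.1.4.11:8080", issue_token("orders-app"), table))  # True
  print(register("orders-app", "10.66.6.6:8080", "forged-token", table))             # False
  print(table)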

Logical names don’t displace DNS servers and URLs.   Those tools are associated with the public addressing of resources.  The purpose of a logical name is to provide a linkage between a resource or user and that public-link portal point.  We can assume that a component knows what it is, and so can use its logical name to get an association with a private/public mapping, probably through what I’ve been calling a “proxy load balancer”.

In some sense, application logical names could be related to the notion of “roles”, popular in public cloud spaces like Microsoft’s Azure.  A role is a function that conveys access rights and that can be performed by multiple people.  The proper use of roles can facilitate the development of forwarding/firewall rules to limit application access, protecting applications from maverick users and even from maverick components or malware.  Roles would further improve security, and if you extended the notion of “roles” to software components, you could also make sure one software component didn’t try to link itself into a workflow it’s not entitled to participate in.

It should be clear that SD-WAN and SDN have a lot in common, and that SD-WAN vendors could incorporate some of the SDN features fairly easily.  That may be a critical requirement, in fact, if those vendors want to establish some differentiation and also want to be able to address the carrier SDN market.  Nokia/Nuage, who has arguably the best SDN offering out there, is also delivering SD-WAN solutions, and has probably the most carrier interest of all the players in the space.

Even clearer is the fact that while virtualization really depends on address mapping and remapping, we don’t pay nearly enough attention to the requirement.  The first thing any IP-based system should do is define how it manages addresses within itself, and how it makes itself visible through IP addresses to the outside.  SD-WAN could really codify this process, and then go on to provide a more complete implementation than we’ve had so far.  If that happens, we don’t need to look further for killer apps for SD-WAN.

Could/Should NFV Have Been Designed as a Container Application?

Could we have built NFV around containers?  Could we still do that?  That, in my view, is the most important question that NFV proponents should be asking.  We have a lot going on with NFV, but movement (as I’ve said before) is very different from progress.  If there’s a right way to do NFV, we should be assessing that way and promoting the things that are needed to adopt it.  If we’re doing things that don’t promote the right way, we’re worse than spinning our wheels, because we’re distracting the industry from the best course.

If you launch an architecture aimed at replacing purpose-built network devices with cloud-hosted technology (which is what NFV did), the logical place to start your thinking is the nature of that cloud-hosted technology.  What would the ideal hosting framework look like?  The best place to start is with the obvious visible properties that contribute to the business case.

NFV hosting has to be pervasive, meaning that we have to be able to define a lot of connected hosting points and allocate capacity to virtual network functions (VNFs) based on the specific needs of the VNF and the overall topology of service traffic.  We can’t make NFV work if the only hosting resource we have is a hyperconverged data center in some central location, like St. Louis in the US.

We also have to make NFV hosting highly efficient and agile.  The business case for NFV depends on being able to operate a hosted-function-based service for less than we’d spend on a service created by purpose-built devices.  We already know, and so do the operators, that capex savings alone will not justify NFV.

NFV infrastructure has to protect service tenants from misbehavior.  This is a bit of a delicate question, because the extent to which VNFs can misbehave depends on the rigor with which operators qualify them for deployment, and whether operators would allow service users to deploy their own.  Since we can’t answer for all future needs, we have to assume that we can tune the level of tenant isolation in NFV to suit current requirements.

These three baseline requirements point to a virtualization-based hosting ecosystem.  In my last couple of blogs, I’ve suggested that a key requirement for such an ecosystem is a universal, ecosystem-wide, model for deployment and redeployment.  We have to be able to deploy a function where we decide we need it to be, without having to vary our procedures according to where we put it.  We also need to be able to lifecycle-manage it in that spot, again using consistent processes.  This to me means that what NFV calls “NFVI” should be a pool of resources overlaid with a uniform virtualization software foundation.

The NFV ISG has kind of accepted this approach by saying that virtual machines and OpenStack are acceptable, if not preferred, models for NFVI.  Is that best, though?  At the time (2013) when NFV gained structure, VMs were the best thing available.  Now, containers in general, and the DC/OS-and-Mesos model of container deployment in particular, are better.

Containers are more efficient in hosting small units of functionality because they don’t demand that every such unit have its own independent operating system, which VMs do.  This could be a critical benefit because you can host more VNFs on a given platform via containers than via VMs.  It’s particularly important if you follow the implicit model that the NFV ISG calls “service chaining”, where multiple serially connected functions hosted independently (and on different platforms) are connected as “virtual CPE”.
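
For illustration, here’s a toy version of a service chain in that virtual-CPE sense, with stand-in functions: traffic traverses a fixed sequence of independently hosted features, any of which can drop it.

  # A toy "service chain": packets flow through serially connected functions.
  def firewall(pkt):
      return pkt if pkt.get("port") in (80, 443) else None   # drop everything else

  def nat(pkt):
      return {**pkt, "src": "203.0.113.7"} if pkt else None  # rewrite the source address

  def monitor(pkt):
      return pkt                                             # would export telemetry

  CHAIN = [firewall, nat, monitor]

  def traverse(pkt):
      for vnf in CHAIN:
          pkt = vnf(pkt)
          if pkt is None:
              return None          # dropped somewhere in the chain
      return pkt

  print(traverse({"src": "192.168.1.23", "port": 443}))   # passes the whole chain
  print(traverse({"src": "192.168.1.23", "port": 25}))    # blocked by the firewall VNF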

The problem with containers is that they don’t provide the same level of tenant isolation as virtual machines do.  Different VMs on the same host are almost ships in the night, sharing almost nothing except an explicitly bounded slice of the physical server.  Containers share the OS, and while OS partitioning is supposed to keep them separate, it is possible that someone might hack across a boundary.

The DC/OS solution to this is to allow a container system to run on VMs.  In fact, this capability is implicitly supported in containers (a container system could always run in any VM, even in the cloud), but operationalizing this sort of framework demands some special features, which the DC/OS software provides.  The features also support scaling to tens of thousands of nodes, which is part of our requirement for pervasive deployment.  Every node, data center, collection, or whatever you want to call it, is supported in the same way under DC/OS because it runs the same stuff.

One of the issues that I don’t think the NFV ISG ever really faced is that of managing the address space for NFV.  You cannot, consistent with security/isolation requirements, let all of the pieces of an NFV service sit grandly visible on the data plane of the service network.  My company VPN should not contain the addresses of my VNFs, period.  However, I have to address VNFs or I can’t connect them, so what address space do they use?

In container networking, this issue isn’t addressed consistently.  Docker basics don’t do much more than say that an application deploys within a private subnet by default, though you can also deploy onto the “host” network or the customer/service address space, meaning public IP addresses.  We need two things here.  First, we need to decide what address spaces really exist, and second, we need to decide how we’re going to network within and among them.

The logical model for VNF deployment says that each service (consisting, in principle, of multiple VNFs) lives in a private IP address space, and all the VNFs in the service have addresses in that space.  The classic home-network 192.168.1.x space offers about 250 usable addresses, but there are 256 such Class C-sized blocks in the 192.168.0.0/16 range, so that space holds 65 thousand plus addresses.  There are also 16 Class B spaces (172.16.0.0/12) with roughly 65 thousand addresses each, for over a million available addresses, and a single Class A (10.0.0.0/8) with almost 17 million available addresses.  The smart strategy would be to use Class A or B addresses to deploy “services” whose elements needed to be interconnected within a virtual device or intent model (meaning, connected so they’re invisible from the outside).  You would then translate the addresses of interfaces that had to be visible in the service address space to the user’s addresses, and those that had to be visible to operations to an operations space.  The latter space would probably be a Class A.
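
The arithmetic is easy to check with the Python standard library; the three RFC 1918 blocks are the only assumption here.

  # Count the addresses in each RFC 1918 private block.
  import ipaddress

  for block in ("192.168.0.0/16", "172.16.0.0/12", "10.0.0.0/8"):
      net = ipaddress.ip_network(block)
      print(block, "->", net.num_addresses, "addresses")
  # 192.168.0.0/16 -> 65536     (the familiar home-network space)
  # 172.16.0.0/12  -> 1048576   (16 Class B-sized blocks)
  # 10.0.0.0/8     -> 16777216  (one Class A)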

When we deploy an NFV service, then, we’d be hosting the connected VNFs within a private address space, which you remember means that the addresses can be reused among services, since they don’t see the light of day.  Each such service would have a set of interfaces exposed to the “data plane” or service addresses, like the company VPN, and each would have a set of interfaces exposed to the control/operations plane of the NFV infrastructure and management framework, probably that private Class A.  If there’s a user management port that’s reflected from NFV VNFM, it would be translated into the user address space from the NFV address space.  If we refer to THIS FIGURE, some of the “Proxy Load Balancers” (PLBs) exposed by the NFV software from the private address space would go to the service data plane’s address space and some to the NFV/VNFM address space.
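
Here’s a sketch of those three address spaces, with the service name, VNF names, and every address purely hypothetical: each service reuses the same private space internally, and only selected interfaces are translated out to the customer data plane or to the NFV/VNFM operations plane.

  # One NFV service: private addresses inside, selective exposure outside.
  service = {
      "name": "vCPE-customer-17",
      "private_space": "10.0.0.0/24",                 # reused per service, never exposed
      "vnf_ports": {"firewall-wan": "10.0.0.5:443",
                    "vnfm-agent":   "10.0.0.9:8443"},
      "exposed": {
          "firewall-wan": ("service-plane", "198.51.100.17:443"),  # customer VPN side
          "vnfm-agent":   ("ops-plane",     "10.255.17.9:8443"),   # NFV management side
      },
  }

  for vnf, (plane, public) in service["exposed"].items():
      print(f"{service['name']}: {service['vnf_ports'][vnf]} -> {public} ({plane})")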

The “how” in addressing and networking that DC/OS defaults to is the overlay VPN model, which in this case is the Navstar plugin, but you could substitute other models like Juniper’s Contrail and (my favorite) Nokia/Nuage.  The nice thing about this approach is that it reduces the overhead issues that other container networking can add, which improves throughput, as well as managing addressing fully.

Some applications in NFV may either require VMs for security/performance reasons, or have a nature that doesn’t particularly match container benefits.  A good example of the “poor fit” problem is the hosting of the mobile-network features in IMS, MMS, and EPC.  With DC/OS you have the option of deploying on VMs “under” containers, and if you use a good SDN networking tool you can integrate container and VM deployments.

What would this do for NFV?  One thing I think is clear is that a framework like containers and DC/OS lets you visualize deployment and connectivity details explicitly, rather than musing abstractly about function hosting and management without any clear deployment model in mind.  For example, if we know that VNFs are hosted in containers, then we know what we need operationally to deploy and connect them.  When’s the last time you heard of a container application driven by an ETSI descriptor like a VNFD?  Instead of describing abstract stuff, we could have identified just what container deployment features were needed and what specific capabilities they had to provide.  Anything missing could be taken back to the open-source project that launched the container system (DC/OS in my example).  What wasn’t missing could be described and standardized with respect to use.

What’s more, this approach largely eliminates the “interoperability” issues of NFV.  Yes, there are still some practices that need to be harmonized, like how the NFV software exposes management interfaces for services or how local application ports for management (CLI or even Port 80 web interfaces) are used, but the issues are exposed as soon as you map to the presumed container structure, and so you can fix them.

Best of all?  This works.  Mesos and DC/OS have been proven in enormous deployments.  The technology comes from social networking, where scale of service and resources is already enormous.  We’re not inventing new hosting, new orchestration, new management.  There are no large-scale NFV-architected deployments today.

Could this be made to work for NFV?  Almost certainly ONAP could be adapted to deploy in this framework, and to deploy VNFs there too.  The question is whether the current course could be diverted toward a container-based model, given that there is a questionable business case for large-scale NFV deployment in any event.  I’ve said for years now that we should be looking at NFV only as an application of “carrier cloud”, and it is the recognition of the broader business of carrier cloud that will probably have to lead to container adoption.  Perhaps, to NFV success as well.

The “Inside” of Hosting Components and Features: Containers

Let us suppose that the goal of next-gen infrastructure (for operator services, cloud providers, or applications) is full virtualization of both application/service elements and hosting and connection resources.  Most would agree that this is at least a fair statement of the end-game.  The question, then, is what steps should be followed to achieve that goal.  The problem we have now is that everyone is looking at some atomic change and saying it’s progress, when without a full roadmap we don’t know whether “movement” constitutes “progress” at all.

As is often the case, we can learn from the past if we bother to inspect it.  When virtualization came along, it quickly spawned two important things.  First, an overall model for deployment of software components that combined to form a service or application.  Such a model standardizes how deployment and redeployment works, which is essential if the operational processes are to be efficient and error-free.  Second, the notion of a “software-defined network” in the Nicira overlay SDN offering (since acquired by VMware and renamed “NSX”).

If you look at any cooperative system of software components, you find a very clear inside/outside structural model.  There are many components that need to exchange work among themselves, so you need some form of connectivity.  What you don’t need is for the interfaces involved in these internal exchanges to be accessible from the outside.  That would create a risk to security and stability that you’d have to spend money to address.  What emerged, as early as the first OpenStack discussions on network connectivity, was the classic “subnet” model.

The subnet model says that you build your cooperative systems of software components using private IP addresses, defined in RFC 1918 for IPv4.  Private IP addresses can be reused by everyone, because they are not permitted outside the subnet.  You can’t send something out through a gateway to “192.168.1.1”, but you can address that entity on/within the subnet.  Home networks are built this way.

The subnet model means that those interfaces that are supposed to be accessed from outside, from a VPN or the Internet, have to be exposed explicitly by mapping an internal private address to an external public address.  Network Address Translation (NAT) is a common term for this; Amazon calls it “Elastic IP Addresses”.  This, as I noted in my last blog on the topic, is what creates the potential for the “Proxy Load-Balancer” as a kind of bridge between two worlds—the outside world that sees services and applications, and the inside world that sees resources and mappings.  Building the inside world is important, even critical.  We need consistency for efficiency, and that means consistency in how we can address subnet components and efficiency in how we deploy and redeploy, meaning the orchestration practices needed.

An IP subnet in its original form is made up of elements that are connected on a local-area network, meaning that they have Level 2 connectivity with each other.  If we translate this into the world of the cloud and cloud infrastructure, we’d have to make a choice between keeping scaling and redeployment of components confined to places where we had L2 connectivity (likely within a data center) or accepting extension of Level 2 across a WAN.  You either have to stay with the traditional subnetwork model (L2) or you have to provide a higher-level SDN-like networking tool (Nicira/NSX or another) to connect things.

Containers have arisen as a strategy to optimize virtualization by making the hosting resource pool as elastic and generic as possible.  One vendor (Mesosphere) describes this as a pets-versus-cattle approach to hosting.  With bare-metal fixed-resource hosting (pets), you fix something that breaks because it has individual value to you.  With true agile virtual hosting (cattle) you simply toss something that breaks because the herd is made up of indistinguishable substitutes.  The goal of container networking is to facilitate the transformation from pets to cattle.

Containers are ideal for applications and service features alike because they’re low-overhead.  The most popular container architecture is Docker, which is based on the subnet model I mentioned.  That offers the benefit of simplicity, but it complicates deployments where a service or application has to be broadly horizontally integrated with other services/applications.  The most popular Docker orchestration tool, Kubernetes, takes a different tack: it mandates that all containers be reachable from one another by default, but leaves how you accomplish that up to you.

Containers, Docker, and Kubernetes are in my view the foundation for resolving the first point I noted as a requirement for virtualization—a standard hosting framework.  The current leading approach to that, the natural successor to the container crown, is DC/OS.  This is an open-source project based on the Apache Mesos container framework, and designed to work not only on bare metal (where most containers are hosted) but also on virtual machines.

You deploy DC/OS on everything, and you then deploy all your applications and features on DC/OS.  It supports all the popular container models, and probably many you’ve never heard of, because it runs below them and abstracts all kinds of infrastructure to a common view.  DC/OS (in conjunction with Mesos) creates a kind of open virtual host, a layer that links resources to agile containers with orchestration and also introduces coordination across orchestrated frameworks.  With DC/OS, a data center is a virtual server in a very real sense.

One of the nice, smart features of DC/OS and many other container systems is the emphasis on “services” delivered through a lightweight load-balancer.  This element can be used as the boundary between the logical and virtual network worlds, the link between internal private-network connectivity for containers and the public Internet or corporate VPN.  The load-balancer is an ever-visible, ever-addressable symbol of the underlying resource pool and how it’s allocated to support applications.  Rehost something and you simply connect the new host to the same load balancer.  Scale something and the mechanism for work distribution is already there and already being addressed.
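
A toy illustration of that idea, with all addresses invented: the service address the outside world sees never changes, and rehosting or scaling just edits the backend list behind it.

  # The "ever-addressable" load balancer: stable front address, churning backends.
  class ServiceLB:
      def __init__(self, service_addr):
          self.service_addr = service_addr   # what users and firewall rules see, forever
          self.backends = []                 # current container hosts, free to churn
          self._i = 0

      def scale_out(self, new):
          self.backends.append(new)

      def rehost(self, old, new):
          self.backends[self.backends.index(old)] = new

      def route(self):
          backend = self.backends[self._i % len(self.backends)]
          self._i += 1
          return backend

  lb = ServiceLB("vpn:10.200.0.10:80")
  lb.scale_out("10.4.1.5:8080")
  lb.scale_out("10.4.2.8:8080")
  print(lb.route(), lb.route())                  # traffic spreads over both containers
  lb.rehost("10.4.1.5:8080", "10.4.9.3:8080")    # container moved; clients never noticed
  print(lb.route())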

With DC/OS and either Kubernetes or Apache Marathon orchestration, you can deploy and connect everything “inside the boundary” of virtualized hosting.  Outside that boundary in the real world of networking, you still have other technologies to handle, including all the legacy pieces of IP and Ethernet and fiber and microwave.  Thus, DC/OS and similar technologies are all about the software-hosted side of networking, or the application side of corporate IT.  While containers answer all of the hosting problem and part of the networking problem, they really do the latter by reference, by suggestion.  The connectivity is outside DC/OS.

Applications are built from components, and networks from features.  To the extent that either is hosted (all of the former, and the software-defined or virtual-function part of the latter), the DC/OS model is the leading-edge approach.  It’s what NFV should have assumed from the first as the hosting model for VNFs.  For those who (like me) see the future of zero-touch automation as an intent-model-driven process, it’s how software can create intent.  Inside the resource boundary, DC/OS lays out the picture of networks and services, but not connectivity.  What it does do is admit to the need for explicit software-defined connectivity.  It is probably what gets SDN (in some form) out of the data center and into the WAN, because services and applications will cross data center boundaries as they redeploy and scale.

There’s a basic principle at work here.  The goal is to make all IT and virtual-function infrastructure look like a virtual host.  Inside you have to support real server iron, virtual machines, or whatever else suits your basic model of secure tenant isolation.  At the next layer, you need to support any form of containers, and you need to be able to harmonize their operational (deployment and redeployment) models and support any useful mechanism to provide connectivity.  All of this has to be packaged so that it doesn’t take a half-a-stadium of MIT PhDs to run it.  This is the foundation of the future cloud, the future network, the future of applications.

DC/OS isn’t the total solution.  We still need to add on the networking aspect, for example.  It is a step in the right direction, and most of all it’s a demonstration of how far we’ve come from the simplistic and largely useless models of the past, and how far we still have to go.  Others are looking at the same problems, in different ways, and I’ll be exploring some of those and the issues they frame and solve, in later blogs.

Exploring our Virtual Future: Addressing Models in a Virtual World

This blog is going to start what will likely be a long-running series of blogs spread over several months, and digging into the details of next-generation applications, services, and network infrastructures.  There are many dimensions to that problem, and as is often the case, the industry has attacked it from the bottom.  As is also often the case, that’s not the best approach.  The future of both applications and networks is tied to virtualization, and virtualization is about…well…things that aren’t quite real at one level.  Those things, of course, have to be made real or networks become fiction, and so we really need to start our discussions of the future with an exploration of that boundary between the not-quite-real and the real.

I’ve noted in past blogs that the boundary between applications/services and resources is an important if hazy place.  In virtualization, resources are elastic and assigned dynamically, moving and scaling as required.  Since application elements are also dynamic, this means that the boundary between the two is almost tectonic in nature, with the same kind of possible negative consequences when things slip too much.

One particular issue that’s arising at the boundary is that of addressing.  Applications and services are built from components that have to be addressed in order to exchange information.  For decades, we’ve recognized that one of the issues in IP is the fact that an address has two different meanings, and virtualization is pulling that issue to the forefront.  It’s the addressing of application components and resource points that creates that boundary tension I’ve noted above.  If you can’t make addressing work, you can’t make any form of virtualization work either.

Let’s suppose you’re a user of an application we’ll call “A1”.  You would likely access that application through a URL, and we’ll assume the URL has the same name as the application.  That’s clearly a logical address, meaning that you don’t care where A1 is, only that you want to get to it.  When you click on the URL, a domain name server (DNS) decodes the URL name to an IP address, and this is where things go sideways.  The IP address identifies A1 as a network location, because when a packet is sent to A1 it has to go somewhere specific in the network.  We have transitioned, perhaps unknowingly, from a logical address reference (the URL) to a physical network reference (the IP address).
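
You can watch that handoff happen with a couple of lines of Python; the hostname here is just an example, and the lookup needs network access.

  # The logical-to-physical transition: a name becomes a network location.
  import socket

  name = "www.example.com"                      # the "A1"-style logical reference
  addrs = {info[4][0] for info in socket.getaddrinfo(name, 80)}
  print(name, "currently resolves to", addrs)   # the physical network reference(s)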

OK, you might be thinking, so what?  If our IP address is in fact the address of A1, we get packets to it regardless of how we split hairs on what the IP address represents.  And in the old days of fixed hosting of applications, that was indeed fine.  Think now of what happens with virtualization.  Let’s look at a couple of scenarios.

Scenario one is that the host where A1 lived fails, and we redeploy it to another host.  My conception of A1 doesn’t change, but now packets have to go to a different IP address to get to A1.  At the least, I have to update my DNS to reference it correctly.  I may also have to tell my client systems to “poison their DNS cache”, meaning stop using a saved address for A1 and go back to the DNS to get a new one.  I may also have to reflect the change in A1’s address in any firewall or forwarding rules that depended on the old address.  In short, I might have created a bunch of issues by solving a single problem of a failed host.

Scenario two is where so many users are hammering at A1 that I decide to scale it.  I now have two instances of A1, but my second instance will never get used because nobody knows about it.  So, I update the DNS, you say?  Sure, but in traditional DNS behavior that only redirects everything to the second instance.  I add a load balancer?  Sure, but that load balancer now has to be where the DNS points for everyone, and the load balancer then schedules the individual instances of A1.  I have the same issues as my first scenario as far as firewalls and forwarding and DNS updates.  I get the same issues again if I drop one instance and go back to my one-address-for-A1 model.

Not tired yet?  OK, imagine that in either of these scenarios, I’ve used the public cloud as an elastic resource.  I now have to expose a public cloud address of A1 through an elastic address or NAT translation of the IP address, because the cloud provider can’t be using my own IP address space directly.  Now not only does my IP address change, I have an outlier IP address (the one the public cloud exposed) that has to be routed between the cloud provider’s gateway and my own network, and that has to work with firewalls and forwarding rules too.

Let’s stop scenarios and talk instead of what we’d like.  It would be nice if the IP address space of a business contained only logical addresses.  A1 is A1 in IP address terms, whether it’s in or out of a cloud, whether it’s a single instance or a thousand of them, and whether it’s moved physically or not.  I’d like to be able to define an IP address space for my business users and their applications that would reflect access policies by address filtering.  I’d like to do the same on the resource side, so my hosted components could only talk with what they’re supposed to.  This is the challenge of virtualization in any form, meaning the cloud, containers, VMs, SD-WAN, whatever.

Mobile users pose challenges too, in a sense in the opposite direction.  A mobile user that moves among cells will move away from where their IP address is pointing, meaning that traffic would be routed to where they were when the address was assigned, not to where they currently are.  This problem has been addressed in mobile networks through the mobility management system and evolved packet core (MMS and EPC), which use tunnels that can be redirected to take traffic to where the user actually is without changing the address of the user.  This is a remedy few like, and there’s been constant interest in coming up with a better approach, even within 5G.

What approaches have been considered?  Here’s a high-level summary of things that might work, have worked, or maybe somebody hopes would work:

  1. Address translation. Almost everyone today uses “private IP addresses” in their home.  Their home networks assign addresses from a block (192.168.x.x) and these addresses are valid within the home.  If something has to be directed outside, onto the Internet, it’s translated into a “real” public IP address using Network Address Translation (NAT).
  2. DNS coordination. In my example above, a DNS converted “A1” as a URL to an IP address.  If we presumed that every resource addressable on an IP network was always addressed via a DNS, then updating the DNS record would always connect a user to a resource that moved.
  3. Double Mapping. A DNS record could translate to a “logical IP address” that was then translated into a location-specific address.  This has been proposed to solve the mobile user problem, since the user’s “second” address could be changed as the user moved about (see the sketch after this list).
  4. Overlay address space. An overlay network, created by using tunnels to link “virtual routers” riding on a lower-level (presumably IP) network, could be used to create private forwarding rules that could be easily changed to accommodate moving applications and moving users.  Some SD-WAN vendors already do “logical addressing”.  Mobile networks use MMS/EPC, as already noted, for tunnel-based user addressing.
  5. Finally, the IETF has looked at the notion of using one address space (IP, in most cases) over what they call a “non-broadcast multi-access network” or NBMA, which provides a codified way of doing what could be called “who-has-it” routing. A network entry point contacts exit points to see who has a path to the addressed destination.
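
To make the double-mapping item a bit more tangible, here’s a bare-bones sketch with all addresses invented: the logical address stays fixed while the location-specific address behind it is updated as the user or component moves.

  # Double mapping: fixed logical address, changeable location-specific address.
  location_of = {"logical:10.254.0.42": "cell-site-A:100.64.3.17"}   # current mapping

  def send(logical_addr, payload):
      physical = location_of[logical_addr]      # the second lookup, done in the network
      print(f"deliver {payload!r} via {physical}")

  send("logical:10.254.0.42", "hello")
  location_of["logical:10.254.0.42"] = "cell-site-B:100.64.9.80"     # user moved cells
  send("logical:10.254.0.42", "hello again")    # same logical address still works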

All of these approaches look at the problem from the network side, and in my view they all have potential but with the collateral problem of complexity of implementation.  What everyone tells me is that they want to see an architected approach that simplifies the problem, not just a solution that network geeks might be able to adopt.

What would such a solution look like?  The simplest way to resolve the problems I’ve cited here is the “god-host” solution.  Imagine a giant universal server that hosts everything.  Now you have no confusion about addressing or location because everything is in the giant host.  Before you start imagining some movie monster or sinister robot, we can add in the word “virtual” to sanitize the approach.  A “virtual server” that envelops the network itself and directly connects to users would have none of the complexities described in the scenario.  It might adopt some, all, or none of the network strategies outlined above, but nobody would have to worry about that part.

The figure HERE illustrates how such a solution might appear.  The “cloud” houses a bunch of diverse, anonymous resources that might be hosted in a single megadatacenter, a bunch of edge data centers, or any combination thereof.  They might be bare metal, virtual machines, containers, or (again) any combination of the three.  Inside, you might want to collect some of the resources into subnetworks or clusters to reflect a need for inter-component addressing.  NFV pools would be like that.  Outside, you could have any number of different addressing schemes to define users and applications.

The secret sauce here is the boundary layer, created by what the figure calls a “Proxy Load Balancer”.  This is a kind of adapter that, on the user side, presents an address in the user’s address space that represents a logical feature or application.  On the resource side, it provides a load-balanced connection to whatever in-the-cloud resources happen to fulfill that feature/application at the moment.  If we assume that this capability exists, then we can spread a unified application fabric over every user, and a unified resource fabric over every cloud and data center.  Without it, the interdependence between logical and physical addressing will complicate things to the point where there’s no practical way of realizing the full potential of virtualization.

The question is whether we can provide the pieces of this model, and I’m going to look at some strategies to do that.  The first is a set of open-source elements that are collectively known as “DC/OS” for “data center operating system”.  This builds on the exploding popularity of containers, but it also provides what may well be the best solution for resource virtualization that’s currently available.  I’ll try to get to it later this week, or if not the following week.  Other blogs in this series will follow as I gather the necessary information.

Cisco Proves Buyers Need More Justification for Change

You can always learn a lot from Cisco earnings calls.  Not everything that’s said is meaningful, of course.  For example, they always attribute anything good in revenue or earnings to their latest initiative (intent-based networking, in this case), and if they fall short it’s because that initiative hasn’t had time to percolate through the market.  They also always talk about things that are experiencing “exponential growth” (security threats this time) and increasing complexity and proliferation of devices (IoT).  You can usually skip the first couple of minutes of a call.

There are some truths exposed later on, and even hinted in the early fluff.  One is that Cisco’s intent portfolio is actually doing them some good, even though it’s about half PR and half technology.  Cisco’s claim to fame in modern transformation politics has been an explicit and even in-your-face acceptance of the terminology and goals combined with an implicit (often subliminal) rejection of the technology changes being proposed.  Translated, that means saying that the market is right about opex savings, transformation, and zero-touch automation, but wrong if they think that means buying something other than Cisco routers and switches.

It’s not that Cisco’s “Intuitive Network” is gaining them market share as much as that it’s protecting them from the potential impact of new technologies overhanging purchase decisions.  That sounds cynical, but it’s a sound business strategy, particularly when they know that there is little marginal benefit to a wholesale transformation from routing/switching to SDN and NFV.  Specialized value for some situations?  Sure.  Total transformation?  Reckless, which means operators won’t do it anyway, but might waffle on the business case long enough to mess up a quarter.

That’s critical right now, because we’re coming off a period of extreme pressure on capex into a period when some relaxation of CFO policies is likely.  If there’s money to be spent, you don’t want the buyers tied up in technology debates.  They’re going to do what they want, which is to stay the course in technology terms, anyway.

You could argue that if there’s a fifty-fifty split between fluff and substance with Cisco, then where’s the other half?  The fact is that eventually operators will have to reduce total cost of ownership, and intent-modeled zero-touch-automatable infrastructure and service operations is the way to get that.  Even if Cisco wanted to promote capex reduction (which, since it would impact sales, would be idiotic), that pathway won’t save enough.  Cisco has highlighted the true path to network enlightenment.  Listen to Robbins on the earnings call: “The network is also a key enabler for our customers as they increasingly adopt a multi-cloud strategy. They need a unified, automated and scalable environment across their data centers, private clouds, and public clouds.”  Gospel truth.

And Cisco has not yet really delivered it, which is the risk inherent in the Cisco story.  There’s value in Cisco’s approach, but it’s not fully integrated from top to bottom, either for enterprises (where the top is the application) or network operators (where it’s the services being sold).  What Cisco has delivered fully is a resource-layer automation strategy that works.  They could expand that to reach the top of both enterprise and provider pyramids, but they aren’t doing that yet.

One reason why is that old axiom of sales: “Never sell more than the buyer asks for.”  What network buyers want right now is insurance, not an implementation.  Tell me it will be OK.  At some point, of course, the buyer will want something more, and then Cisco will have to be prepared to deliver something.

If you’re a Cisco competitor, the logical thing to do now would be to deliver the full goods, then make the buyer want it now instead of later.  It wouldn’t be all that difficult for any of Cisco’s competitors (Juniper and Nokia in particular) to deliver a full top-to-bottom service and application operations solution.  I think you could cobble together something that would be dazzling after a week of planning, in fact.

One way Cisco is preparing for the execution of the future and stalling competitors at the same time is by exploiting their unique data center position.  Cisco sells servers, and they’ve now decided to sell container solutions.  Again, quoting Robbins: “Further extending our cloud-focused software offers, we recently introduced a Cisco Container Platform to simplify the deployment of cloud native applications and containers with Kubernetes.”  Containers in general, and Kubernetes in particular, combine with the cloud to create a seismic shift in virtualization and the data center.  For two decades, strategic control of the data center in enterprises guaranteed control of networking.  Since 2012 that’s also been true of network operators.

The trick for both Cisco and competitors would be selling this to the market, and there Cisco has the advantage.  They don’t want to drive the revolution, so they don’t need to sell anything, just wait for the demand to build to a critical point (at which time, presumably, they’d just buy a company or two).  Competitors have to not only have a compelling full-scope solution, they have to make it attractive.  That means positioning it in the press, and that’s a problem.

Network writers and editors don’t generally write about software, containers, or Kubernetes.  They don’t write about the challenges of address space and connectivity management.  Even a topic like multi-cloud networking is difficult to get insightful stories on.  It’s hard to do a thoughtful piece on any of this stuff in less than a couple thousand words, in fact, and you’re not going to get that much space easily.  Can Nokia or Juniper, neither of which has been a wizard of articulation, sing pretty enough to get traction for a pre-revolution move on Cisco?  Not unless they change up their approach in a major way.  Read Juniper’s website material on its multi-cloud story, and the media reports of the same development.  You’d never get any sense of the value proposition.

Meanwhile, let’s go back to Cisco’s call.  The first question they got from the Street was whether the interest in Cisco’s intent-based network model was confined to the US.  The response from Robbins was “First of all, we’re incredibly pleased with the early acceptance of this intent-based portfolio. I called out in my opening comments that this Catalyst 9000 is the fastest ramping product in our history, which is pretty impressive. I’d say it’s fairly balanced across the geographic regions….”  This shows that what Cisco is doing is anchoring their future strategy to a current product first, then worrying about the actual question second.  Wise moves, particularly given the lack of wise articulation we see from Cisco’s competitors.

Those competitors are the ones who should be paying attention here.  Cisco turned in decent numbers, and if they did it by staying the course in technology terms, that only proves that the default path for an industry favors its incumbents.  Surprise?  If the other players in network equipment want to see growth in their own market share, or changes in the technology framework for the industry, they’re going to have to do a better job of showing buyers why that would be good for them.

What’s the Real Thing You Should Look For at MWC?

With MWC just around the corner and a flood of 5G stuff inevitable, this is the time to ask two questions.  First, what is really happening in the 5G space?  Second, when can we expect to see a complete 5G story deploy?  Those aren’t easy questions to answer in MWC discussions, particularly when the emphasis at trade shows isn’t likely to be “reality”.

The biggest barrier to a realistic view of 5G is the lack of a concrete definition of what 5G is.  There are really two broad pieces to 5G.  One is the 5G New Radio stuff, and the other is the changes to mobile infrastructure that accompany, but don’t necessarily have to accompany, the 5G NR stuff.  We can roughly map the latter to what’s being called 5G Core.  Originally the two were considered to be joined at the hip, but the 3GPP adopted a new work plan in 2017 called “NSA” for “Non-Stand Alone”.  This quirky name means 5G NR without 5G Core, and that’s the key point in addressing our questions.

Everyone agrees that mobile services need more capacity and more bandwidth per user.  That’s particularly true if operators want to use a combination of fiber-to-the-node (FTTN) and cellular radio to create what look like wireline broadband services.  All of the 5G changes relating to the cellular radio network itself are part of NR.  From NR, you have three possible pathways forward.

Pathway one is to meld millimeter-wave NR with FTTN to support what’s essentially a “local 5G” network that would serve not mobile users but wireline broadband users.  This model of 5G doesn’t need mobility management (fiber nodes aren’t migrating, and fixed “wireline” terminations aren’t either) or any of the handset roaming and registration features.  This form of 5G is already being tested, and there will surely be real if limited deployments even in 2018.

Pathway two is 5G NSA (non-standalone, remember?).  This route says that you upgrade the RAN, where probably 95% of the 5G benefits come from anyway, and continue to use the mobility management and handset registration technologies of 4G, meaning IMS and EPC.  This changes out handsets to use 5G frequencies and radio formats, but leaves the rest of mobile infrastructure pretty much intact.  Since most users won’t toss handsets just to migrate to 5G, and since 4G LTE compatibility is a requirement for 5G anyway, this gives you most of what 5G promises at the lowest possible impact.  Some 5G NSA deployments are certain for 2019, and possible even in late 2018.

Pathway three is true 5G, the combination of 5G NR and Core.  This is the 5G of dreams for vendors, the vision that includes stuff like network slicing, NFV integration, and all sorts of other bells and whistles that someone thinks would be exciting.  This is also the 5G that may never come about at all, and that’s the nub of the issue with 5G discussions.  Do we talk about the “standard” of 5G that always includes both NR and Core, or do we talk about NSA, which includes only NR and which is almost certain to deploy?

In my view, the 5G standard is a combination of an attempt to address a plausible pathway for wireless evolution and the typical overthinking and overpromotion of our networking marketplace.  If you assumed that we were going to see billions of IoT devices connected via the cellular network, and you assumed that we were going to have hundreds of different, independent, virtual cellular networks to support hundreds of different applications, and if you assumed that we were going to demand free roaming among WiFi, satellite, wireline, and wireless calls, then perhaps you’d need full 5G.  Can we make those assumptions?  Not now, but standards have to prepare for the future, and so it’s not unreasonable to talk about 5G in full standards form, as an evolutionary path that has to be justified by tangible market opportunity.  Otherwise it’s pie in the sky.

5G NSA is proof of that.  The vendors involved were supporters of the NSA approach, and surely they would have been more reserved had holding out for full 5G been a realistic option.  The vendors almost certainly realized that if operators were presented with a 5G all-or-nothing choice, they’d have selected the “nothing” option, or forced an NSA-like approach down the line.  Sure, vendors would love a mobile infrastructure revolution, but not if it’s a low-probability outcome in a high-stakes game where real radio-network dollars are at stake.

What this means for 5G is that NSA is the real path, and that all the Core stuff is fluffery that will have to be justified by real opportunities.  Here, the facts are IMHO quite clear; there won’t be any of those real opportunities in the near term.  If there’s a 5G core evolution, it’s probably not coming until after 2022, and even then, we’d have to see some decisive progress on the justification front, not on the standards front.

There are two realistic drivers to a broader 5G deployment, rapid IoT adoption that’s dependent on cellular-linked IoT elements, and a shift to streaming video rather than linear TV.  The first of the two is the most glamorous and least likely, so we’ll look at it first.

Almost all the growth in “IoT” has been through the addition of WiFi-linked sensors in the home.  This has no impact whatsoever on cellular needs or opportunity, and it creates zero additional 5G justification.  What you’d need for 5G stimulus is a bunch of new sensors that are directly connected to the cellular network, and while there are various 5G radio suggestions for this, and there are missions that could credibly consume that configuration, the business case behind them hasn’t been acceptable up to now.

Apart from the “soft” question of public policy and personal privacy that open sensors raise, there’s the simple point of ROI.  Remember, anything that’s directly networked, rather than sitting behind the implicit firewall of home-style NAT, would have to be secured.  The stuff would have to be powered, maintained, and so on.  Private companies installing IoT sensors would have to wonder how they’d pay back the cost.  Would cities be willing to fund initiatives to make all traffic lights smart?  It would depend on how much of a case could be made for a result that would improve traffic conditions and the driving experience, and on how confident politicians were that the results would actually be delivered, because if they weren’t, the next election wouldn’t be pretty.

The video story is complicated, but plausible.  We already know that streaming on-demand video demands effective content delivery networking, places to cache popular material to avoid over-consuming network resources on long delivery paths.  What about live TV?  Imagine a bunch of mobile devices, and also wireline-via-FTTN/5G stuff, streaming live TV.  Would you want to pull each stream from a program source, even one within a metro area?  Would you want to cache the material locally, use multicasting, or what?

How much have you read about live TV streaming as a 5G driver?  I know I’m not exactly bombarded with material, I don’t hear operators clamoring for it, I don’t see vendors pushing solutions.  But if you want to see 5G in anything other than the NSA version, that’s what you should be looking for in Barcelona.  Don’t hold your breath, though.  As I’m sure most of you know, relevance to real issues is not a trade show strong point.

What’s Happening with SD-WAN and How To Win In It

The SD-WAN space is a unique combination of risk and opportunity.  It’s clearly a risk to the traditional VPN and VLAN service models, the operator services that are based on the models, and the vendors whose equipment is used to deliver the services.  It’s an opportunity for startups to rake in some money, for enterprises to save some money, and for operators to create a more infrastructure-agile and future-proof service model.  The question is what side of the risk/reward picture will win out.

Right now, the market is in a state of flux.  Operators are dabbling in SD-WAN, and every startup knows that if network operators themselves become the dominant conduit for SD-WAN products, the winning vendors will look very different from the winners should enterprises dominate.  But “dabbling” is closer to “dominating” alphabetically than in the definitional world.  Not only are operators not truly committed to SD-WAN, they’re really not committed to why they should be committed to it.

SD-WAN is an infrastructure-independent way of delivering virtual network services, because it’s a form of overlay technology.  Vendors differ in how they position their stuff: some use a standard encapsulation approach, some a proprietary one, and some don’t strictly encapsulate but still tunnel.  In the end, every technology that rides on a network service is an overlay.  All overlays have the property of being independent of whoever is doing the overlaying, so to speak.  For there to be differentiation, you have to create some functional bond between the overlay and the underlay, a bond that the network operator can create because they own the network, but that others could not.
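Just to make the overlay/underlay point concrete, here’s a minimal sketch (the header format and field names are invented for illustration, not any vendor’s actual encapsulation) of what an overlay edge does: it wraps the user’s packet in its own header and hands the result to whatever underlay transport happens to be available.

```python
import struct

# Hypothetical overlay header: virtual-network ID, logical source and
# destination endpoint IDs, and a flags byte.  Real SD-WAN products use their
# own (often proprietary) formats; this only shows the overlay/underlay split.
OVERLAY_HEADER = struct.Struct("!IHHB")

def encapsulate(vnet_id: int, src_ep: int, dst_ep: int, payload: bytes) -> bytes:
    """Wrap a user packet in the overlay header; to the underlay (Internet,
    MPLS VPN, LTE, whatever), the result is just opaque payload."""
    return OVERLAY_HEADER.pack(vnet_id, src_ep, dst_ep, 0) + payload

def decapsulate(frame: bytes):
    """At the far-end edge, peel the overlay header off and recover the packet."""
    vnet_id, src_ep, dst_ep, _flags = OVERLAY_HEADER.unpack_from(frame)
    return vnet_id, src_ep, dst_ep, frame[OVERLAY_HEADER.size:]

# The underlay never looks inside the overlay header, which is exactly why the
# overlay is transport-independent, and why differentiation has to come from
# an explicit bond between the layers, not from the encapsulation itself.
frame = encapsulate(vnet_id=42, src_ep=1, dst_ep=2, payload=b"user packet")
print(decapsulate(frame))
```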

Operators aren’t committed to offering any such functional bonding.  I contacted two dozen of them over the last six weeks, and what I found was “interest” in the idea of having a symbiotic relationship between an SD-WAN and the underlying network, and “exploration” of benefits in both a service-technical sense and a business/financial sense.  It almost appears that operator interest in SD-WAN offerings is more predatory than exploitive.  Perhaps by offering SD-WAN they can lance the boil of MSP or enterprise deployments, and in the process weed out a lot of SD-WAN vendors whose initiatives might end up generating some market heat.

It’s really hard to say whether this strategy would work, and almost as hard to say whether operators could establish a meaningful symbiotic strategy if they wanted to.  Some of my friends in the space tell me that operators fall into two camps—those who have to defend territory and those who can benefit significantly from breaking down territory boundaries.  It’s the latter group who have been most interested in SD-WAN, and obviously running an SD-WAN outside your own territory limits how much you can hope for from symbiotic offerings.  The other guy isn’t going to let you tweak his underlay.

What extraterritorial SD-WAN does is let operators create a seamless VPN connectivity map for buyers whose geography is far broader than the operator’s own range, or even the range the operator can cover with federation deals with partner carriers.  However, some operators say that they’d really rather somebody else did this extension, preferably the enterprises themselves and if necessary the MSPs.  The problem they cite is the difficulty in sustaining high connection quality (availability and QoS) with Internet overlays.

Amid all this confusion, it’s not surprising that SD-WAN vendors are themselves a bit at sea.  That’s bad, because it’s clear that there’s going to be a shakeout this year, and absent a clear vision of what the market will value, the risk of being a shake-ee is too high for many.  What might work?

To me, the clear answer is SD-WAN support for composable IP networks.  Market-leading container software Docker imposes one presumptive network model, and market-leading orchestration tool Kubernetes imposes another totally different one.  Microservices and component sharing fit differently into each of these, and so do things like cloudbursting or even in-house scaling.  Public cloud providers have their own addressing rules, and then there’s serverless and event processing.  It’s very easy for an enterprise to get so mired in the simple question of making everything connect like it’s supposed to that they don’t have time for anything else.

One thing in that category that seems a sure winner is a superior system for access control, to apply connectivity rules that govern what IP addresses can connect to what other ones.  Forwarding rules are essential to SD-WAN anyway, and having a system that lets you easily control connection policies makes an SD-WAN strategy functionally superior to most VPNs, where doing the same thing with routers is far from easy.
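Here’s a rough sketch of what I mean (the policy structure is hypothetical, not any product’s actual rule syntax): an SD-WAN edge that’s already making per-packet forwarding decisions can apply allow/deny connectivity rules at the same point.

```python
import ipaddress

# Hypothetical connectivity policy: ordered (source prefix, destination
# prefix, action) rules.  First match wins; anything unmatched is denied.
POLICY = [
    (ipaddress.ip_network("10.1.0.0/16"), ipaddress.ip_network("10.2.0.0/16"), "allow"),
    (ipaddress.ip_network("10.0.0.0/8"), ipaddress.ip_network("192.168.0.0/16"), "deny"),
]

def permitted(src: str, dst: str) -> bool:
    """Return True if the source address may connect to the destination."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for src_net, dst_net, action in POLICY:
        if s in src_net and d in dst_net:
            return action == "allow"
    return False  # default-deny, the posture most enterprises would want

print(permitted("10.1.3.4", "10.2.7.7"))     # True: explicitly allowed
print(permitted("10.9.0.1", "192.168.1.1"))  # False: explicitly denied
```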

Related to this is address mapping/remapping.  SD-WAN is likely to deploy as a VPN, connecting various virtual hosts created with VM or container technology, and also pulling in a public cloud or two or three.  Each of these domains has specific rules or practices for addressing, and getting all of them to harmonize on a single plan is valuable in itself, and essential if you’re also going to control connectivity as I’ve suggested in my first point.
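A minimal illustration of the remapping idea (the address plans here are invented for the example) is a per-domain translation table applied at the SD-WAN edge, so every site and cloud shows up in one harmonized corporate address plan.

```python
import ipaddress

# Hypothetical per-domain mappings: each domain's local prefix is translated
# into a slice of one corporate-wide "harmonized" address plan at the edge.
REMAP = {
    "on-prem-containers": (ipaddress.ip_network("172.17.0.0/16"),
                           ipaddress.ip_network("10.50.0.0/16")),
    "public-cloud-a":     (ipaddress.ip_network("10.0.0.0/16"),
                           ipaddress.ip_network("10.51.0.0/16")),
}

def to_harmonized(domain: str, addr: str) -> str:
    """Translate a domain-local address into the single corporate plan by
    preserving its offset within the domain prefix."""
    local_net, global_net = REMAP[domain]
    offset = int(ipaddress.ip_address(addr)) - int(local_net.network_address)
    return str(ipaddress.ip_address(int(global_net.network_address) + offset))

print(to_harmonized("on-prem-containers", "172.17.3.9"))  # 10.50.3.9
print(to_harmonized("public-cloud-a", "10.0.12.1"))       # 10.51.12.1
```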

The management framework, including the GUI and network mapping features, would be critical for both these capabilities.  Even more critical is a foundational notion, the notion that the challenge of the future (posed by virtualization) is to create connection/address elasticity that corresponds to the resource and component elasticity that modern cloud and application practices give us.  We are building IP networks today based on the same principles that were developed before there was a cloud, or even an Internet, in any real sense.

There are, to be sure, plenty of initiatives in the IETF to modernize IP, but most of them are actually unnecessary and even inappropriate in the near term, because they’d require client software transformation to adopt.  What an SD-WAN box could do is, by sitting at the network-to-user boundary, make the network appear to the user to be what it needs to be, and allow the transport network to be what it must be.

Nobody in the SD-WAN space is in the right place on this, so far.  That means that even if there’s a market shake-out coming, there’s still a chance to grab on and hold on to the critical market-shaping ideas.

The Relationship Between Service Modeling and Management Strategies

Service modeling is important for zero-touch automation, as I said in an earlier blog.  Service modeling, in terms of just how the model is constructed, is also important for operations, service, and network management.  In fact, it sets up a very important management boundary point that could have a lot to do with how we evolve to software-centric networking in the future.

You could argue that the defining principle of the modern software-driven age is virtualization.  The relevant definition is “not physically existing as such but made by software to appear to do so.”  Jumping off from this, software-defined elements are things that appear to exist because software defines a black-box or boundary that looks like something, often something that already exists in a convenient physical form.  A “virtual machine” looks like a real one, and likewise a “virtual router”.

Virtualization creates a very explicit boundary, outside of which it’s what software appears to be that matters, and inside of which is the challenge of ensuring that the software that really is looks like what’s being virtualized.  From the outside, true virtualization would have to expose the same properties in all the functional planes, meaning data plane, control plane, and management plane.  A virtual device is a failure if it’s not managed like the real device it’s modeled on.  Inside, the real resources that are used to deliver the correct virtual behavior at the boundary have to be managed, because whatever is outside cannot see those resources, by definition.

One way to exploit the nature of virtualization and its impact on management is to define infrastructure so that the properties of virtual devices truly map to those of the real thing, then substitute the former for the latter.  We already do that in data centers that rely on virtual machines or containers; the resource management properties are the same as (or similar enough to) the real thing as to permit management practices to continue across the transition.  However, we’ve also created a kind of phantom world inside our virtual devices, a world that can’t be managed by the outside processes at all.

The general solution to this dilemma is the “intent model” approach, which says that a virtual element is responsible for self-management of what’s inside, and presentation of an explicit SLA and management properties to what’s outside.  An older but still valuable subset of this is to manage real resources independently as a pool of resources, on the theory that if you capacity-plan correctly and if your resource pool is operating according to your plan, there can be no violations of SLAs at the virtual element level.
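A skeletal example of that division of labor (the class and field names are mine, not drawn from any standard) shows the shape: the element exposes an SLA and summary state outward, and keeps its remediation logic inside the black box.

```python
from dataclasses import dataclass

@dataclass
class IntentElement:
    """A virtual element that self-manages what's inside the black box and
    exposes only an SLA and summary state to whatever sits outside it."""
    name: str
    sla_latency_ms: float
    measured_latency_ms: float = 0.0
    state: str = "active"  # the only state the outside is allowed to see

    def report(self) -> dict:
        # Outward-facing management view: the SLA and whether it's being met,
        # nothing about which servers, links, or functions are used inside.
        return {"element": self.name,
                "sla_latency_ms": self.sla_latency_ms,
                "in_compliance": self.measured_latency_ms <= self.sla_latency_ms,
                "state": self.state}

    def on_internal_fault(self) -> None:
        # Inside the box, the element tries to heal itself (re-place, re-route,
        # scale out); only if that fails does its visible state change.
        if not self._remediate():
            self.state = "degraded"

    def _remediate(self) -> bool:
        return True  # placeholder for whatever internal automation applies

element = IntentElement("vpn-edge", sla_latency_ms=20.0, measured_latency_ms=12.5)
print(element.report())
```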

The difference between the broad intent-model solution and the resource management solution arises when you consider services or applications that are made up of a bunch of nested layers of intent model.  The lowest layer of modeling is surely the place where actual resources are mapped to intent, but at higher layers, you could expect to see a model decompose into another set of models.  That means that if there are management properties that the high-level model has to support, it has to do that by mapping between the high-level SLA and management interface, and the collection of lower-level SLAs and interfaces.
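One way to picture that middle-layer mapping (again, a sketch under my own naming, not anyone’s standard) is an aggregate model whose status is synthesized from the statuses of the models it decomposes into, whether those are resource-level models or themselves aggregates.

```python
class ModelElement:
    """Any intent-model element exposes the same management face (here just a
    status), whether it maps to resources or to other model elements."""
    def status(self) -> str:
        raise NotImplementedError

class ResourceElement(ModelElement):
    """Lowest layer: decomposes directly onto managed resources."""
    def __init__(self, name: str, healthy: bool = True):
        self.name, self.healthy = name, healthy
    def status(self) -> str:
        return "ok" if self.healthy else "fault"

class AggregateElement(ModelElement):
    """Middle layer: decomposes into other elements, so its management view is
    a mapping over its children's views rather than over resources."""
    def __init__(self, name: str, children):
        self.name, self.children = name, children
    def status(self) -> str:
        return "ok" if all(c.status() == "ok" for c in self.children) else "degraded"

# A service three layers deep: the top-level management view is synthesized,
# layer by layer, from whatever sits underneath it.
service = AggregateElement("vpn-service", [
    AggregateElement("access-segment", [ResourceElement("edge-1"), ResourceElement("edge-2")]),
    ResourceElement("core-transport"),
])
print(service.status())  # "ok" until some lower-level element reports a fault
```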

From a management perspective, then, a complex service or application model actually has three different management layers.  At the top is the layer that manages virtual elements using “real-element practices”.  At the bottom is the resource management layer that manages according to a capacity plan and is largely unaware of anything above, and in the middle is a variable layer that manages the aggregate elements/models that map not to resources but to other elements/models.

The management layering here is important because it illustrates that many of our modern network/service automation strategies have missing elements.  The simple model of two layers, the top of which is based on the “real” device management already in place and the bottom on generalized resource management, won’t work if you have a service hierarchy more than two levels deep.

One solution to that is to make the virtual device bigger, meaning to envelop more resource-directed functions in high-level models.  A VPN that is created by one huge virtual router represents this approach.  The problem is that this creates very brittle models; any change in infrastructure has to be reflected directly in the models that service architects work with.  It’s like writing monolithic software instead of using componentization or microservices—bad practice.  My work on both CloudNFV and ExperiaSphere has demonstrated to me that two-layer service structures are almost certain not to be workable, so that middle layer has to be addressed.

There are two basic ways to approach the management of middle-level elements.  One is to presume that all of the model layers are “virtual devices”, some of which simply don’t correspond to any current real device.  That approach means that you’d define management elements to operate on the middle-layer objects, likely based on device management principles.  The other is to adopt what I’ll call compositional management, meaning adopting the TMF NGOSS Contract approach of a data model mediating events to direct them to the correct (management) processes.

IMHO, the first approach is a literal interpretation of the ETSI NFV VNF Manager model.  In effect, you have traditional EMS processes that are explicitly linked with each of the virtualized components, and that work in harmony with a more global component that presumably offers an ecosystemic view.  This works only as long as a model element always decomposes into at least resources, and perhaps even virtualized functions.  Thus, it seems to me to impose a no-layers approach to virtual services, or at minimum to leave the middle layers unaddressed.

You could extend the management model of the ISG to non-resource-decomposed elements, but in order to do that you’d need to define some explicit management process that gets explicitly deployed, and that then serves as a kind of “MIB synthesizer” that collects lower-level model element management views and decomposes its own management functions down into those lower layers.  This can be done, but it seems to me to have both scalability problems and the problem of needing some very careful standardization, or elements might well become non-portable not because their functionality wasn’t portable but because their management wasn’t.

The second approach is what I’ve been advocating.  A data model that defines event/process relationships can be processed by any “model-handler” because its functionality is defined in the data model itself.  You can define the lifecycle processes in state/event progression terms, and tie them to specific events.  No matter what the level of the model element, the functionality needed to process events through the model is identical.  The operations processes invoked could be common where possible, specialized when needed, and as fully scalable as demand requires.
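Here’s a compact sketch of that compositional idea (the table contents are illustrative, not lifted from the TMF specification): the model element carries a state/event table, and a completely generic handler just looks up which process to run.

```python
# A generic "model-handler": the data model says which process runs for each
# (state, event) pair, so the same handling code works at every model layer.

def deploy(element):
    element["state"] = "deploying"

def activate(element):
    element["state"] = "active"

def remediate(element):
    element["state"] = "remediating"

# Illustrative state/event table; in a real system this would travel with the
# service data model itself rather than being hard-coded.
LIFECYCLE = {
    ("ordered",     "start"):    deploy,
    ("deploying",   "ready"):    activate,
    ("active",      "fault"):    remediate,
    ("remediating", "restored"): activate,
}

def handle_event(element: dict, event: str) -> str:
    process = LIFECYCLE.get((element["state"], event))
    if process:
        process(element)
    return element["state"]

element = {"name": "vpn-segment-1", "state": "ordered"}
for ev in ("start", "ready", "fault", "restored"):
    print(ev, "->", handle_event(element, ev))
```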

You probably can’t model broad resource pools this way, but you don’t need to.  At the critical bottom layer, a small number of intent models with traditional resource management tied to policy-based capacity planning can provide the SLA assurance you need.  This approach could also be made to work for services that didn’t really have independent SLAs, either implicit or explicit.  For all the rest, including most of what IoT and 5G will require, we need the three layers of management and most of all a way to handle that critical new virtualization-and-modeling piece in the middle.  Without the middle, after all, the two ends add up to nothing.

A Hands-On Perspective on the Future of TV Viewing

Streaming video that includes live TV is the biggest threat to the traditional linear model of television.  Depending on just what the operator does with it, that makes it the biggest threat to the cable TV and telco TV franchises, or the biggest opportunity.  I’ve blogged about the overall issues of streaming live TV already, but I’ve recently had a chance to look at the space closely, so I want to share my hands-on perspective.

Beauty, and live TV streaming, are functionally in the eye of the beholder.  Streaming fans tend to come from two different camps.  One represents TV viewers who either are not regularly viewing at home, or are tending to consume time-shifted material more than live material.  The other represents more-or-less traditional TV viewers who want to “cut the cord” and eliminate the charges and contracts associated with traditional linear TV.  The camp you’re in is critical when you look at the various live TV streaming options.

Hulu’s live offering is still in beta, but IMHO it illustrates a concept targeted at the first of my two groups.  The experience is nothing like watching linear TV.  There’s no channel guide, and getting to network live programming is awkward, to say the least, for those who “watch TV”.  On-demand or time-shifted viewing is typically managed by show and not by time, so things like channel guides and the notion of “on now” are fairly meaningless.  For this group, it may make perfect sense to go to a “show” tile to find both VoD and live programming, and to go to a “network” tile for the “on-now” stuff.

What you want to watch is another dimension.  Hulu, Google, and Sling TV all have limited channel repertoires, and I think that’s also appropriate to my first, on-demand-centric group.  If you want to watch something that’s on now, having more channels to choose from gives you a greater chance of finding something you like.  If you’re just looking for something interesting, then it’s the totality of shows and movies, not the number of live channels, that matters.  Amazon Prime is a perfect example; you look for material based on your areas of interest, and it’s on-demand.

DirecTV Now and Playstation Vue are more aligned to my second group of viewers.  Both have linear-TV-like channel guides that let you keep your “on-now” perspective of viewing.  Both also have a larger inventory of channels (and more channel bundles at different price levels).  This means that people accustomed to linear viewing with cable or telco TV have a better chance of getting a package that includes the channels they are used to watching.  Both include local stations in major market areas, so you don’t miss local news and weather or local sporting events with local announcers.

Playstation Vue and DirecTV Now illustrate another difference between the two groups of viewers.  My second, linear-committed viewer group is most likely to watch shows in groups, while the first group is most likely to watch solo.  This is reflected in the fact that Playstation Vue supports more concurrent sessions than DirecTV Now, while the link between Playstation Vue and the Playstation game console reinforces the solo model.

Another important point is DVR.  Virtually all linear systems allow for home recording of TV shows, but because streaming TV doesn’t require a specific “cable box” in which to site DVR functionality, the capability isn’t automatic in streaming services.  However, everyone but DirecTV Now offers cloud DVR (some at additional cost), and DirecTV Now is supposed to get the feature, along with a GUI upgrade, sometime in the spring.

For some, DVR isn’t necessarily a key point.  Most of the streaming live TV services include archives of past shows, usually available quickly and for a month or so.  Many networks let you sign onto their own streaming apps (on Roku, Google Chromecast, Amazon Fire TV, etc.) using TV Everywhere credentials, after which you can access shows within three or four days max of when they aired.  How long material is available will vary, though.  There’s also a PC package called “PlayOn” (also available in a cloud version) that will record not the live shows but the archived shows, which means you won’t lose them after 30 days or so, the usual limit.  With some effort, you can make live streaming TV work almost as well as linear TV, and with lower cost.

The qualifiers I keep using, like “almost”, are important because the second group of cord-cutters includes the generation less likely to be comfortable diddling with technology or dealing with changes.  The GUI is one such change.  Linear TV typically has a fairly architected user interface, with a custom remote that makes it easier to do what you need/want.  With streaming live TV, there’s a need for streaming providers to accommodate a range of different streaming devices, each with their own control and perhaps a different mobile app.  You can get things like voice control of TV that might be impossible with linear, but many people find the change in interface a bit jarring.

The setup for streaming can also be intimidating to non-technical people.  You need some sort of device to stream to.  There are add-on devices from Amazon, Apple, Google, and Roku (for example), and there are “smart TVs” that can directly support streaming, plus you may be able to cast to your TV from your phone or computer, as well as stream directly.  All these work a bit differently, and not all the streaming channels are supported on a given device.  You need to check what you can get before you make a deal, and if possible, you should view your choice in a friend’s home before you buy.

Then there’s the quality of your home network.  You probably need to think seriously about 50Mbps or better Internet service to stream live TV reliably to a typical home.  Some people tell me they can make 25Mbps work, but it depends on how many independent streams you have and whether you’re streaming 4K or a lower resolution.  That means that many DSL connections are going to be limited in terms of streaming support, which is likely why AT&T elected to buy DirecTV and start supporting a satellite delivery alternative to U-Verse TV.
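As a back-of-the-envelope way of seeing why, here’s a little calculator; the per-stream rates are my own ballpark assumptions, roughly in line with common streaming-service recommendations, not measured figures.

```python
# Assumed per-stream bitrates in Mbps: ballpark figures, not guarantees.
STREAM_MBPS = {"sd": 3, "hd": 5, "4k": 20}

def needed_bandwidth(streams, headroom: float = 1.5) -> float:
    """Estimate the Internet capacity a household needs, with headroom for
    other traffic, WiFi overhead, and peak-hour congestion."""
    return sum(STREAM_MBPS[s] for s in streams) * headroom

# Two HD streams plus one 4K stream works out to roughly 45 Mbps with
# headroom, which is why 50 Mbps service is a safer bet than 25 Mbps.
print(needed_bandwidth(["hd", "hd", "4k"]))
```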

In the home, the “standard WiFi” may or may not work.  Modern WiFi uses two frequency bands, 2.4GHz and 5GHz, and there are two classes of 802.11 standards in common use, the older “single-letter” ones like 802.11g and 802.11n and the newer “two-letter” 802.11ac.  The oldest of the former top out at 54 Mbps, and since WiFi capacity is shared, you might find your other WiFi use congests things.  The latter lets you operate at several times those speeds.

Streamers may also face issues with coverage.  A large home can be covered completely with a single well-placed, state-of-the-art WiFi router, but many of the standard WiFi routers that come with broadband Internet service will not cover anywhere near that, and you’d need to have a “mesh” system or repeaters.  The problem with WiFi mesh/repeater technology is that most of it is based on a WiFi relay principle, where WiFi both connects the repeater to distant parts of the home, and also connects the repeater to the master WiFi router.  That means that the repeater has to handle the same traffic twice.  Multi-band WiFi is probably critical for repeater use.

You can probably see the challenge here already.  If there’s no computer/Internet literacy in a household, it’s going to be a long slog to get streaming to work unless you’re lucky and it works out of the box because you got the right gear from your ISP or you have a limited area where you need access.  And if you’ve followed the trend of using networked thermostats, security devices, etc. you may have even more problems, because changing your WiFi will in most cases disconnect all that stuff, and you’ll have to reconnect it.

Where does this leave us?  If there are two different viewing groups with two different sets of needs, then logically we’d really have two different markets to consider.  So far, streaming has penetrated fastest where it offers the most incremental value and the least negative impact on expectations.  If it continues in that channel of the market, it would likely eventually change TV forever, but not in the near term.  If it manages to penetrate the second market group, it transforms TV at the pace of that penetration.  It’s the TV-oriented group, then, that matters most.

Sling TV owes its leading market share to its appeal to my first mobile-driven, on-demand, group.  DirecTV Now has the second-largest market share, and I think the largest potential to impact the second group.  AT&T would have suffered a net loss of video customers were it not for DirecTV Now, and they mentioned the service and plans to upgrade it on their last earnings call.  They have the money to push the service, and for AT&T it offers the attractive capability to ride on competitors’ Internet access and so draw TV revenue from outside the AT&T territory.

That last point is probably the factor that will determine the future of streaming TV and thus the ongoing fate of linear TV.  The success of AT&T with DirecTV Now would certainly encourage competitors to launch their own offerings.  Amazon, who reportedly decided not to get into the streaming live TV business, might reconsider its position.  Google, remember, already has an offering, and they’re a rival of Amazon in virtually every online business.  Comcast and Charter both have basic wireless streaming, though not yet ready to displace traditional linear TV in their own service areas.  There are lots of opportunities to step up the streaming competitive game, and boost streaming live TV overall.

Streaming isn’t going to make TV free, and in fact if it totally displaces linear and induces providers to offer different bundling strategies, it might end up killing some smaller cable networks that get license fees only as part of a larger package.  It will make TV different, and shift the focus of wireline providers from linear video to the Internet.  Since Internet services are lower-margin, the shift could put them all in a profit vise unless they do something—like focus on streaming, and what might be beyond it in terms of OTT video services.  That’s a shift I think is coming sooner than I’d expected.