5G, Edge Computing, and “Carrier Cloud”

How much might carrier cloud impact edge computing?  That question may be the most important one for those interested in the edge to answer, because while there’s been a lot of discussion about “drivers” for edge computing, none of the suggested applications are actually budgeted today.  What’s needed for edge computing to surge is some direct, immediate, financial motivation to deploy it, and a willing player.  Are there any?  Operators are an obvious choice.

I think that one of the signs that there is at least hope of operator-driven life at the edge is the interest of the public cloud providers in promoting carrier cloud partnerships.  Recall that my forecasts for carrier cloud have shown that if all opportunities were realized, carrier cloud could justify a hundred thousand incremental data centers, with over 90% at the edge.  That would obviously dwarf the number of public cloud data centers we have today.  No wonder there’s a stream of stories on how cloud providers are looking for partners among the network operators.

It’s also true that some companies are seeing edge computing arising from telecom, but not associated with carrier cloud at all.  IBM and Samsung are linking their efforts to promote a tie between edge computing and private 5G.  This initiative seems to demonstrate that there’s a broad view that 5G and edge computing are related.  It also shows that there is at least the potential for significant edge activity, enough to be worth chasing.

My original modeling on carrier cloud suggested that the edge opportunity generated by hosting 5G control-plane features would account for only a bit less than 15% of edge opportunity by 2030.  In contrast, personalization and contextualization services (including IoT) account for over 52% of that opportunity.  That means that 5G may be more important to edge computing as an early justifier of deployment than as a long-term sustaining force.

As I noted last week in my blog about Google’s cloud, Google’s horizontal-and-vertical applications partnerships were aimed at moving cloud providers into carrier partnerships beyond hosting 5G features.  Might Google be moving to seize that 52% driver for carrier cloud, and maybe 52% of the edge deployments my model suggests could be justified?  Sure it could, and more than “could”, I’d say “likely”.  The question is whether there’s another way.

The underlying challenge with edge computing is getting an edge deployed to compute with.  By definition, an edge resource is near the point of activity, the user.  Unless you want to visualize a bunch of edge users huddling around a few early edge locations, you need to think about getting a lot of edge resources out there just to create a credible framework on which applications could run.  My model says that to do that in the US, you’d need about 20,000 edge points, which creates a challenge both in finding a place to install all these servers and in getting the money to pay for them.

The nice thing about hosting 5G features is that those features could be consumed in proximity to all of those 20,000 locations. If a 5G technology model dependent on edge hosting were to deploy, funded by the 5G budgets, then we’d have a substantial edge footprint to exploit with other applications.  However, a cursory examination of 5G and Open RAN shows that it’s more likely that the RAN components of 5G would be hosted in white boxes at the radio edge than deeper in.  First, the white-box approach scales costs more in sync with service deployment. AT&T has already proved that.  Second, while the operators who are offering 5G in the region where they also offer wireline services have facilities to host in, other operators don’t, and probably won’t want to buy facilities.

While 5G could directly stimulate edge deployment, then, it seems likely that unless there’s a perception that those edge resources will expand into multiple missions, operators could easily just use white-box devices to host 5G and eliminate the cost and risk of that preemptive 20,000-site deployment.  That gets us back to what comes next, and raises the question of whether Google’s vertical-and-horizontal application-specific approach is the only, or best, answer.

The problem with relying on a specific set of applications to drive edge computing is that it risks “overbuild”.  Consider this example.  Suppose there were no database management systems available.  Every application that needed a database would then have to write one from raw disk I/O upward.  The effort would surely make most applications either prohibitively expensive for users or unprofitable for sellers.  “Middleware” is essential in that it codifies widely needed facilities for reuse, reducing development cost and time.  So, ask the following question: Where is the middleware for that 52% of drivers for edge computing?  Answer: There isn’t any.

IoT as a vast collection of sensors waiting to be queried is as stupid a foundation for development as raw disk I/O is.  We need to define some middleware model for IoT and other personalization/contextualization services.  Given that middleware, we could then hope to jumpstart the deployment of services that would really justify those 20,000 edge data centers, not just provide something that might run on them if they magically deployed.

Personalization and contextualization services relating to edge computing opportunity focus on providing a link between an application user and the user’s real-world and virtual-world environments.  A simple example is a service that refines a search for a product or service to reflect the location of the person doing the searching, or that presents ads based on that same paradigm.  I’ve proposed in the past that we visualize these services as “information fields” asserted by things like shops or intersections, and that applications read and correlate these fields.

My “information fields” approach isn’t likely the only way to solve the middleware problem for edge computing, but it is a model that we could use to demonstrate a middleware option.  Were such a concept in place, applications would be written to consume the fields and not the raw data needed to create the same level of information.  Edge resources could then produce the fields once, for consumption by anything that ran there, and could communicate fields with each other rather than trying to create a wide sensor community.
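To make the “information fields” idea a bit more concrete, here’s a minimal Python sketch of what such middleware might look like.  Everything in it (the field record, the registry, the names) is my own illustration of the concept, not a proposal for any specific implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical "information field" a shop or intersection might assert.
@dataclass
class InformationField:
    source_id: str              # e.g., "intersection-5th-and-main"
    field_type: str             # e.g., "traffic.density" or "retail.offer"
    attributes: Dict[str, float]
    ttl_seconds: int = 30       # fields are ephemeral; they expire and are re-asserted

# A minimal edge-resident registry; applications query fields,
# never the raw sensors behind them.
class FieldRegistry:
    def __init__(self) -> None:
        self._fields: Dict[str, List[InformationField]] = {}

    def assert_field(self, f: InformationField) -> None:
        self._fields.setdefault(f.field_type, []).append(f)

    def query(self, field_type: str, near: str) -> List[InformationField]:
        # Real middleware would correlate location and context; this just
        # filters on a naive source-name match for illustration.
        return [f for f in self._fields.get(field_type, [])
                if near in f.source_id]

# An application reads a correlated field instead of polling sensors.
registry = FieldRegistry()
registry.assert_field(InformationField(
    source_id="intersection-5th-and-main",
    field_type="traffic.density",
    attributes={"vehicles_per_minute": 42.0},
))
print(registry.query("traffic.density", near="5th-and-main"))
```

The point is that the edge produces the field once, and anything that runs there consumes it.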

There are multiple DBMS models today, each with its own specific value, and I suspect that there will be multiple models for personalization/contextualization middleware.  What I’d like to see is a model that could create some 5G features as well as applications, because that model could stimulate edge deployment using 5G infrastructure as a justifier, and still generate something that could be repurposed to support the broader, larger, class of edge opportunity drivers.

It’s hard to say where this model might come from, though.  Google, I think, is working to accelerate edge application growth and generate some real-world PR in the near term.  That means they’d be unlikely to wait until they could frame out personalization/contextualization middleware, then try to convince developers to make their applications support it.  A Google competitor?  An open-source software player like Red Hat or VMware?  A platform player like HPE?  Who knows?

Without something like this, without a middleware or as-a-service framework within which edge applications can be fit, it’s going to be very difficult to build momentum for edge computing.  The biggest risk is that, lacking any such tools, some candidate applications will migrate from “cloud edge” into the IoT ecosystem, as I’ve said will happen with connected car for other reasons.  A simple IoT controller, combined with a proper architecture for converting events into transactions, could significantly reduce the value of an intermediate edge hosting point.  Whether that’s good or bad in the near term depends on whether you have a stake in the edge.  In the long term, I think we’ll lose a lot of value without edge hosting, so an architecture for this would be a big step forward.

What NaaS Model, and What Vendor, Might Lead Transformation?

If Network-as-a-Service (NaaS) is the right approach, what would it look like?  There’s a surprising amount of variability in the approaches being taken, and each option has its own unique positives and negatives.  Not to mention the question of whether NaaS as a concept is the right approach at all, or a practical option for operators now.  Today, we’ll look at NaaS and see what we can pull out of the technology mist.

The essential notion of NaaS is that it presents a user not so much with access to a collective network, but rather access to a network that looks like they need it to look.  Features that are additional to traditional IP networks (cache, DNS, security, etc.) can be integrated into NaaS, and NaaS services can be easily expanded and contracted in both scope and bandwidth.  The value of this is that operators offering NaaS can displace cost and complexity at the user level, making the service more attractive and thus commanding a higher price and profit margin.
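As a rough illustration of what “a network that looks like they need it to look” might mean at the service-order level, here’s a hypothetical sketch; the structure and field names are purely illustrative, not any operator’s actual service model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical NaaS service order: connectivity plus integrated features,
# elastic in both scope (sites/users) and bandwidth.
@dataclass
class NaaSOrder:
    tenant: str
    sites: List[str]
    bandwidth_mbps: int
    features: List[str] = field(default_factory=lambda: ["dns", "security"])

    def scale(self, bandwidth_mbps: Optional[int] = None,
              add_sites: Optional[List[str]] = None) -> None:
        """Expand or contract the service without re-engineering the network."""
        if bandwidth_mbps is not None:
            self.bandwidth_mbps = bandwidth_mbps
        if add_sites:
            self.sites.extend(add_sites)

order = NaaSOrder(tenant="acme", sites=["hq", "branch-1"], bandwidth_mbps=200)
order.scale(bandwidth_mbps=500, add_sites=["home-workers"])
print(order)
```

The operator absorbs the complexity behind an order like that; the user just scales it, which is where the displaced cost and complexity come from.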

The Elements of a Network-as-a-Service Transformation

The figure above shows the range of technology models that could, at least in theory, deliver NaaS.  There are three basic hardware models for the data plane—white-box with chip optimization, hosted, or monolithic proprietary traditional.  The monolithic model presumes hosting the control plane in the same device, where other models could introduce a separate control plane.  In all cases, there are the NaaS features themselves, and on top the interfaces that allow those features to be accessed.

We could map these various boxes to what’s available or proposed today, to tie theory to buyable products.  That’s how we’ll start our exploration.

Traditional monolithic routers will implement control and data planes in a combined way, within a device.  These devices could be supplied with NaaS capabilities by adding something above the box, as we see today with the implementation of 5G UPFs, content delivery networks (CDNs), and even the DNS/DHCP IP addressing features.  We could also integrate “hooks” to these features with the control plane of the boxes, which is based on adaptive topology and status discovery.  We could add an additional functional layer that provided for improved general integration of NaaS features.

SDN in the classic OpenFlow form would build a centralized controller and manage optimized white-box flow switches with it.  Connectivity control is exposed through “northbound APIs” presented by the controller, and any NaaS logic would be hosted above the controller, manipulating connectivity via those APIs.
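Here’s a hedged sketch of what that NaaS-above-the-controller logic could look like, assuming a generic REST-style northbound API.  The URL, endpoint, and payload shape are placeholders of my own, not any specific controller’s actual interface.

```python
import requests  # assumes a controller exposing a generic REST northbound API

CONTROLLER = "https://sdn-controller.example.net/api"  # placeholder URL

def provision_flow(tenant: str, src_prefix: str, dst_prefix: str, priority: int) -> None:
    """NaaS logic hosted above the controller: translate a service-level
    request into a flow policy pushed through the northbound API.  The
    endpoint and payload shape are illustrative, not a real controller's API."""
    payload = {
        "tenant": tenant,
        "match": {"src": src_prefix, "dst": dst_prefix},
        "action": {"forward": True, "priority": priority},
    }
    resp = requests.post(f"{CONTROLLER}/flows", json=payload, timeout=5)
    resp.raise_for_status()

# Example: a "priority access" NaaS feature for a tenant's video traffic.
# provision_flow("acme", "10.1.0.0/16", "203.0.113.0/24", priority=7)
```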

The NFV model would suggest that both data plane and control plane are hosted on general-purpose servers or at least on pooled resources.  The model could support centralized or adaptive control of forwarding tables, and in the adaptive case, it could be separated from data forwarding or integrated with it in a single “router instance”.  NaaS features could be integrated with the control plane and presented through any of the interface options.

What’s the “white-box” model?  At one level, you could consider white boxes to implement either the monolithic classic router, the SDN approach, or the NFV hosted-function approach.  White boxes could also be used in a cluster (as DriveNets has done) to create a complex of devices that act as a single box but provide more resilience and flexibility in deployment and upgrading.  That means that the white-box approach has, in theory, the ability to implement any of the control-plane options, integrate any feature set with the control plane, and provide any sort of interface through which the NaaS services can be accessed.

Having laid out the basic options, let’s dig a bit deeper into the best way to deploy NaaS, and see how our options relate to the optimums.

There seem to be three general approaches to NaaS.  The first is what could be called “connected NaaS”, which is what we have today.  Additional service elements and features are added by making them a destination on the IP network (including VPNs).  This is how 5G proposes to add its own mobility features, how CDNs work, and so forth.  Obviously, this approach fits the monolithic router model that dominates today.

The second NaaS approach is what might be called “overlay NaaS”.  This model adds an additional connection layer above IP, and through that connection layer it offers more specific service and feature control than IP provides by itself.  We have this today in the various overlay SDN models and SD-WAN.  This is the easiest way to augment traditional router networks to support NaaS, but it doesn’t provide any real integration between the overlay and the IP network, which means that its ability to control the connection experience is limited.

The final approach is “integrated NaaS”.  This is a broad model that differs from the others in that there would be explicit tight integration between NaaS services and the IP control plane.  This would mean that, via the integrating API, NaaS could draw on topology and traffic information available to the control plane, and could also influence packet forwarding and even the IP control-packet exchanges with adjacent devices.
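What might that integrating API look like?  Here’s a minimal, purely illustrative sketch of the kind of binding an IP control plane could expose to NaaS features; the method names are my assumptions, not any standard or product interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

# A sketch of the "integrating API" an IP control plane might expose to
# NaaS features.  Method names are assumptions for illustration only.
class ControlPlaneBinding(ABC):
    @abstractmethod
    def topology(self) -> Dict[str, List[str]]:
        """Return the adjacency map the control plane has discovered."""

    @abstractmethod
    def link_utilization(self, node: str, peer: str) -> float:
        """Return current utilization (0.0 to 1.0) on the link node->peer."""

    @abstractmethod
    def set_forwarding_policy(self, prefix: str, next_hop: str, weight: int) -> None:
        """Let a NaaS feature influence packet forwarding for a prefix."""

    @abstractmethod
    def advertise(self, prefix: str, attributes: Dict[str, str]) -> None:
        """Let a NaaS feature shape control-packet exchanges with adjacent devices."""
```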

Connected NaaS isn’t a new capability, but it’s applicable to current infrastructure.  NFV was arguably aimed at creating this model through “service chaining” and “universal CPE”, but even though it uses existing interfaces to routers, it’s not necessarily easy to integrate, as the NFV service-chaining experience illustrates.  However, if the goal is to provide a means for NaaS features, hosted somewhere in the cloud, to influence forwarding behavior, even connected NaaS can work.  All that’s needed is a pathway to forwarding control, asserted via a management interface.

The overlay NaaS model is the logical way to transition from vanilla IP and VPNs to NaaS.  In almost every case, this approach will start with an SDN/SD-WAN facility that creates a universal VPN model that works even from home.  This model creates, in at least some implementations, a kind of closed-user-group capability where users can connect only to resources assigned to them.  Zero-trust security, in other words, or a “secure access service edge” (SASE).  Since security is the most common NaaS feature offered today, this is a powerful starting point.  Some implementations also prioritize traffic, which is another highly valuable feature.
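The closed-user-group idea is easy to visualize with a small sketch.  Assume a hypothetical policy table that assigns users to resources; the overlay admits a session only when both ends share a group.  The names and policy shape below are invented for illustration.

```python
# Minimal sketch of overlay-NaaS session admission: users connect only to
# resources explicitly assigned to them (closed user group / zero trust).
CLOSED_USER_GROUPS = {
    "finance": {"users": {"alice", "bob"}, "resources": {"erp.internal", "ledger.internal"}},
    "field-ops": {"users": {"carol"}, "resources": {"dispatch.internal"}},
}

def admit_session(user: str, resource: str) -> bool:
    """Admit a session only if user and resource share a closed user group."""
    return any(user in g["users"] and resource in g["resources"]
               for g in CLOSED_USER_GROUPS.values())

assert admit_session("alice", "erp.internal")          # permitted
assert not admit_session("carol", "ledger.internal")   # denied: not in that group
```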

Integrated NaaS, the final approach, is where I think things are heading, but each of the major NaaS models (monolithic/traditional, hosted, white-box) is likely to approach it in a different way, so we’ll have to dig into them all, referring to the earlier figure.

If we assumed that we had a white-box model of networking with a separate control plane, likely cloud-hosted, the integrated NaaS model would allow new service features to be dropped right into the control plane implementation, which is a potentially powerful capability.  Even without that, the control plane could assert APIs that would tightly couple features to the control plane, which gives them at least the potential for manipulating forwarding, impacting traffic engineering, and so forth.  This means that NaaS wouldn’t be a subset of what the underlying IP network could do.  Instead, it would be able to control that underlying network, perhaps even in detail.  This white-box-separate-control-plane approach would then be the ultimate NaaS.

Not necessarily by a material margin, though.  The OpenFlow SDN model of a transformed network could do something almost exactly the same, using those northbound APIs.  Even a monolithic router could, with the addition of an interface to “the cloud”, offer tight integration between forwarding and a separate NaaS feature set.  Many routers already support OpenFlow, for example, and any router vendor could directly expose forwarding table control via a secure separate interface or API.

I’ve talked about two specific vendors as candidates for a true NaaS platform, Juniper and DriveNets.  Neither of the two has actually committed to all the pieces needed to fulfill NaaS potential, but as the old military adage says, “Judge an enemy by its capabilities and not its intentions”.  I think both would be capable of implementing NaaS, via different models and requiring different steps to advance their positions.

Juniper’s acquisition of Mist, 128 Technology, and Apstra gives them the foundation of what would arguably be a “NaaS ecosystem.”  One of the negatives of NaaS is support; customization of services creates a greater burden on traditional network operations practices, and Juniper has this covered (potentially) with Mist.  128 Technology adds connectivity control, VPN extension to pretty much anywhere, and zero-trust security and prioritization.  Apstra extends all of this into the data center, and adds in the intent modeling that’s absolutely critical for management of pooled resources and mixed technologies.

DriveNets is the most fascinating startup in the space.  They use a collection of white boxes assembled into a cluster connected by the white-box fabric elements.  They separate the control plane, and they provide (in their management and orchestration) for the deployment of third-party elements into what becomes an extension of the control plane.  They can also couple elements via traditional interfaces and through a wide range of APIs.

Do these two then vie to shape the future of networking and NaaS?  Obviously other direct competitors could emerge if neither of these two moves quickly, but that’s not the only way another competing approach could emerge.

The challenge for both Juniper and DriveNets is that they’re network companies, not cloud companies, and the cloud is inevitably the platform for any “service” beyond simple connectivity.  A big piece of 5G and CDN belongs there, and almost any value-add from IoT.  VMware is a cloud player, a player who could potentially be waiting for the opportunity to arrive.  The company recently promised to “pave the earth” with their Tanzu cloud platform.

The “waiting” point is VMware’s challenge.  They need to move toward NaaS, to embrace where it’s heading.  In the near term, that means generalizing forwarding control so they can connect with white-box data-plane elements, and in the long term it means visualizing what an “IoT service” would really look like.

These three vendors would present three different pathways to NaaS, each having its own strengths and limitations.  I think that the winner is likely to be the player that gets out there first with credible positioning and an early feature set that justifies operator consideration.  That could be any of these three, and it’s even possible that someone else will jump out if all three of my choices take too long to get to the starting line.  Whatever happens, I think NaaS is the only salvation of operator business models, so the operators better hope at least one player does the NaaS deed.

How Do Vendors Respond to Slowing Network Investment?

How will my company win in 2021 and beyond?  This is a question that every networking giant and aspirant is already facing, and will face even more in just a few months.  The global economy is going to start to recover next year, and with that recovery will come a lessening of fears about broad economic impacts on networking.  That leaves the network-specific fears, fears I’ve talked about from the perspective of operators in past blogs.  Now, it’s the vendors’ turn.

The problem with network equipment sales is the same for the enterprise and service provider spaces.  In both cases, return on investment for new network technology has been steadily harder to create.  On the enterprise side, that’s because we’ve not had a transforming new productivity paradigm in two decades, the longest period of drought that the market has faced since the dawn of IT.  For service providers, the problem is the steady slippage in profit per bit, a reflection of the fact that “connection services” are just plumbing in the Internet age.  Lower ROI is the universal problem for vendors, but not all vendors face it the same way.

There are four classes of players in the networking space as we enter our new age.  The first is the incumbent network equipment vendors, the second the hosted function supporters, the third the public cloud providers, and the last the white-box players.  We’ll take a look at issues and options for each.

Our first group is probably under more threat than any player in the space.  Both enterprises and service providers have been under “benefit pressure” for a decade, which means that they’ve struggled to make a business case to raise capital budgets.  For enterprises, the problem has been a lack of a new productivity paradigm, and for service providers a steady decline in profit per bit.  That’s resulted in downward pressure on pricing for network gear, and the explosion of interest in open-model networking of some sort.

The strategy of vendors so far has been to discount where necessary, but with the troubles being faced by price-leader Huawei, the discounting strategy has been less effective, and I think that’s had more of an impact on the purchasing of traditional network gear than the pandemic.  I also think it’s pretty obvious that vendors are not going to be able to raise discounts without destroying their own bottom lines.

That means the network vendors need to do something I’ve told them to do for a decade, which is to help buyers find the business case for more purchasing.  The only near-term opportunity to improve network benefits is to promote a network-as-a-service model.  NaaS lets the network itself assert APIs to provide needed features to applications, making it unnecessary for application-layer or middleware technology to save the day.  This is so important that I’ll say now that NaaS is the near-term goal for all four of our classes of network player.

Among the network vendors, Juniper is the only one who seems to have a NaaS strategy.  The acquisition of Mist, 128 Technology, and (just recently) Apstra, gives Juniper all the pieces needed to create a NaaS story quickly, which would then give Juniper a jump on the market if I’m correct and NaaS is the way to go.  Juniper, not a vendor I’ve picked in the past for strategic leadership, could now be out in front overall.

The group of vendors who seemed most likely to knock off the big equipment giants is our next category, the hosted-function or server-based network players.  This group has a big disadvantage out of the box with its tie to NFV, a technology that didn’t get off to the right start and hasn’t been able to shed its ineffective base model since.

On the surface, NFV should be a great platform for NaaS because it postulates the creation of services by deploying and linking functional elements.  The problem is that NFV got way too deep into per-customer service enhancements, notably security, and never really focused on what I’d call “infrastructure services”, the functional platform technology that creates multi-user services like the stuff NaaS has to be based on.

The major players in this group are software vendors like Red Hat/IBM, HPE, and VMware.  The latter has what I think are the best technology assets to address NaaS, but its positioning and messaging have been very NFV-centric.  There is no chance, in my view, that NFV could possibly create a technically optimal model in the NaaS space before the major network vendors (Juniper, notably) do, which means that all these vendors, including VMware, will have to jump onto a different, better, bandwagon.

To do that, they’ll need a “camel”, a leader-concept that can showcase a shift in positioning and justify it at the same time.  It’s very clear that Open RAN is the camel here.  If the hosted-function players can get a strong Open RAN story that embraces basic technology shifts applicable to the network at large, they can win here, and fairly quickly.  Maybe quickly enough to take the lead and shape the NaaS space.  The 5G User Plane Functions, for example, are perfect hosted elements of a NaaS.

The problem here goes back to NFV.  There is no credible way that servers are going to replace all network devices.  In fact, the current trend is to say that the data plane of current IP elements is going to have to be migrated to white boxes if open-model networking is the goal.  That means that what gets hosted by hosted-function players is the IP control plane, as well as the “control plane” of services like 5G RAN and Core, CDNs, and (eventually) IoT.  Right now, the hosted-feature guys are not engaged in white-box-symbiotic positioning, so they’re disqualifying themselves from this critical relationship, and perhaps from NaaS.

I think VMware is the player to watch in this space.  The future of hosted-control-plane IP and control-plane function hosting for higher-layer services lies in bringing cloud-native to these areas, not NFV.  If VMware can position its broad cloud portfolio, which is more valuable to the NaaS future than their NFV stuff, as the host for the control plane of all current and future services, then they’ll be able to quickly set a standard for the future network.  To do this, they’ll need to embrace white boxes as subordinate elements to the cloud, or they’ll empower the last two groups of vendors.

The public cloud providers have recognized that 5G in general is the mission for server resource pools in the network of the future.  Since network operators rarely capitalize anything but current service opportunities, it’s the mission that could induce operators to start to deploy their own cloud technology, the “carrier cloud” that could generate a hundred thousand new data centers by 2030 if it reaches full potential.  Obviously, cloud providers 1) don’t want operators building that kind of cloud, and 2) want to make money by hosting the service missions that might drive it.  That means, right now, Open RAN, and it’s why many of the cloud providers are focusing on it.

Many but not all.  Google is taking a bold position with its Global Mobile Edge Cloud (GMEC), looking at 5G driver applications rather than 5G infrastructure elements.  Google is betting that other cloud providers won’t win a significant advantage in hosting Open RAN, in no small part because that’s a competitive focus for all the other vendor categories and it’s always wise to avoid betting on a free-for-all to gain advantage.

The problem with Google’s approach is that operators have been really bad at selling anything other than connection services, and most have been stalling and diddling to avoid even considering that move seriously.  They want vendors to do their job, and of course Google may see themselves as doing that by providing some nice horizontal and vertical applications in hosted form.  But once you give operators an application in hosted form, all wrapped up in a nice technical bundle, you just move the objections to the selling side.  First, operators have limited sales capability; they’re order-takers.  Second, they have zero engagement with the line organizations that buy applications.  This means Google is either going to have to do some astonishing hand-holding of operators, or launch a major marketing campaign that operators can piggyback on.

And now our final category, the “white-box” players; this group has three sub-classes.  The first is the white-box players who propose to run open-source switch/router software on white boxes to create open monolithic switches and routers.  The second is the SDN players who see white boxes as the data plane of the future network, and SDN controllers as the control plane.  The ONF, technology driver of this group, has started promoting “programmable networking” as a benefit of this approach, and that’s (so far) the only NaaS-like positioning out there.  The final group is very limited in size; it’s the group of white-box vendors who build up nodes of any size by clustering white boxes.

The advantage that our second sub-class, the SDN gang, has because of the ONF’s programmable-networks story is somewhat dissipated by the fact that the ONF is a specification body, and their insights impact the market only if somebody implements and sells the ONF story.  At the switch level, SDN is commonly used in the data center, but as a general forwarding-plane element or as the data plane of IP, there’s nobody carrying the flag in this space as far as either operators or enterprises are concerned.

Besides the “no-obvious-seller” problem, the biggest problem in securing a position in the control plane of a programmable strategy is the buyer fear of a central controller, a single point of failure whose risk gets bigger as the scope of deployment expands.  The second-biggest problem is that it’s far from clear just how SDN-generated programmability translates into NaaS.

The white-box-cluster group is currently being defined by a single player, DriveNets.  They advocate a whole list of advanced concepts, including separation of the control and data planes, the ability to host third-party service elements in cloud-native form, and the ability to present the cluster as a single element to other routers and to management systems.

This combination would make it easy for DriveNets to deploy a NaaS story, but so far, they’ve not done that.  Their current focus has been on displacing routers in deeper-than-5G missions; they won the AT&T core network, for example.  That shows that the cluster approach works as a router of any size, but it doesn’t validate either 5G or NaaS, so if those are the key elements in the current battle for network transformation, they still have stuff they need to do.

Where, then, does all this lead us?  Juniper, I think, has built the easiest path to NaaS by creating an ecosystem that can deliver personalized, experience-centric, connectivity that can be integrated vertically all the way down to transport.  They also, with Apstra, have an intent-model technology that could simplify management of services and infrastructure.  Play intent modeling right, and you solve how to cloud-host features.  What they would need to add is just an explicit way of linking this to 5G Open RAN and NaaS beyond connectivity.  That could be done almost by positioning alone, but they could also ally themselves with other platform players in the hosted-feature space, and integrate it with intent modeling.

The challengers to Juniper, IMHO, would be VMware and DriveNets.  VMware has cloud creds aplenty, and the mass and market influence to do something profound, providing they also have the vision and determination.  They only have to rise above NFV to do it, too.  DriveNets is technically the closest competitor, but they’re a startup and it’s not yet clear how they’ll expend their technical bandwidth and media collateral.  NaaS and the future of open networking is, after all, more future than current, and so they have to balance the strategic with the tactical.  Can they make the right choices?  If so, they have a shot.

For all the rest of the players, the big question they have to answer is how they’d implement that same beyond-connectivity NaaS, starting with how they could play a role in Open RAN.  That starts with a vision, and it has to be one that links the “control plane” of Open RAN, CDN, and IoT, with the separate control plane of IP, and with white-box generalized forwarding.

Who could win it?  Almost anyone at this point, but I think that it’s very possible that the space will start seeing cohesive positioning by spring, and after that the windows will start to close for more and more of the players.  This is one of those times when nothing short of naked aggression will serve, and that’s never easy for any vendor.  We’ll see who steps up, and then grabs the best chance of shaping the future.

What Questions Impact Open-Model Networking’s Success?

What will open-model networks look like, and how will we build them?  What are the issues operators say are limiting their use?  Operators are surprisingly confused on these issues, largely because they’re confused about higher-level issues that relate to planning and deployment.  I’ve had a chance over the last month to gather operator concerns about open-model networks, how they’re built, and how they’re sold.  Here’s the list of what I’ve learned.

The top item on my “learned” list is that operators are more worried about evolution than revolution, and yet it’s revolution that’s driving planning.  This is forcing vendors to adopt a kind of two-faced sales model.  You have to check the boxes that represent the long-term open-model goals, or you’re not in the game at all.  Once you do check the boxes, though, you have to be able to present the evolutionary path that gets the operator to the goal without undue risk and cost.

Most vendors seem to be putting their focus on one or the other of these two “faces”, and my planning-stage contacts agree almost universally that’s not the best approach.  The operators who have actually executed on some aspect of open-model networking disagree; they drove themselves to the point of a buy decision and so didn’t need vendors to espouse revolution to get their attention.

The biggest challenge this dualism has created isn’t the message itself; most vendors seem to know what their revolutionary value proposition is.  The problem in most cases is how to present it.  There’s little chance that a vendor other than a current incumbent with a lot of strategic influence could cold-call on the senior planners who need the strategic story.  In any event, the sales organization isn’t normally equipped for that level of engagement.  Marketing material would be the answer, presented through an organized program delivered via the website.  Most vendors don’t have the story collateralized for that conduit.

The second issue is that operators are uncertain on the best future open-model approach.  Some believe in the “NFV” or cloud hosting model for future networks while others believe in the white-box approach.  This is the technical-strategy point where operators feel they’ve gotten the least useful information from their vendors.  Some, in fact, say they’re not even sure where their vendors fall on the issue.

Part of the problem here is what we could call the “challenge of competing gradients”.  The deeper you go into networks, meaning the further you are from the user edge, the more traffic you’d expect to see on a given trunk, and terminating at a given node-point.  The more traffic you have there, the less credible hosting functions on cloud resources are for the data plane.  Operators believe they could host lower-volume traffic in the cloud, but not core network traffic.  But the closer you are to the edge and the lower the per-node traffic, the less likely it is that you could justify a resource pool there to host something on.  That resource pool is likely available only deep inside, where traffic limits hosted-function value.  Catch-22.

A pure white-box strategy offers another open-model alternative, but it’s not without its issues.  First, there are no current white-box products large enough to handle all the routing missions as a monolithic element.  You need to somehow collectivize multiple boxes into a single operating entity.  That’s been done by DriveNets, which is likely why they won AT&T’s core, but it’s not widely supported or understood.  Second, white boxes (says one operator) “just don’t seem as modern an approach.”  Operators, particularly the larger ones, are seeing carrier cloud as their ultimate future resource.  Hosted router instances in carrier cloud are thus “modern” (even though they don’t work for large traffic volumes, operators themselves agree).

This second issue is really a reflection of the fact that operators see open-model networking entirely as a cost management strategy, which is my next point.  Despite the fact that they want to think about the future, despite their desire for “modern” approaches, they are really not targeting anything new, just targeting doing old stuff cheaper.  That’s made effective strategic positioning for vendors more difficult because operators don’t know what they want to hear about future services.  It also means that the open-model solution has a primary goal of transparency.  To paraphrase an old buyer comment I got, “The worst kind of project to present is a cheaper-box substitution; the best you can hope for is that nobody will ever see the difference.”  Other, of course, than cost.  That means that sales efforts will tend to bog down in the equivalence problem; is this box really completely equivalent to my old box?

The obvious question is what the alternative to a cost-management-driven open-model transformation would be.  That question is really a two-parter.  The first part is the question of the service targets, and the second the question of the infrastructure needed to deliver to those targets.  There are two credible broad targets—over-the-top services/experiences and enhanced network services—but the two can blur together.

The notion that elastic bandwidth and turbo buttons and the like generate more revenue has been proposed and debunked for decades.  The likelihood is that any new connection services will have to come by absorbing what I’ll call “boundary activities” related to true over-the-top things.  Two clear examples are the user-plane elements supporting mobility and the elements of content delivery.  Both can be classified as examples of network-as-a-service (NaaS), as I’ve noted in prior blogs.  In both cases, quality service and destination are attributes of a higher-than-network relationship, and because the network may not directly address these requirements, there’s a new element introduced to provide what’s needed.  In 5G, for example, it’s the UPF.

What NaaS does in this case is build a kind of overlay service, one that has connection properties that can be controlled directly rather than inherited from device behavior below.  That’s one example of an enhanced network service, and also why the two types of services beyond the basics can blur.

Real over-the-top or higher-layer options also exist, and here the most obvious candidate is IoT.  The vision of IoT-based location services coming about through a zillion startups exploiting free IoT sensors appeals to idealists, but it’s not going to happen unless governments create regulated monopolies and define basic IoT services for a fee.  The more realistic example is that some operator deploys sensors/controllers and then abstracts them through APIs to offer a higher-level, more useful, representation of what’s happening.

Think of this example.  You have a five-mile avenue with cross streets and traffic lights at each corner.  You want to know about traffic progress and density.  You could query five miles worth of sensors and correlate the stuff, or you could ask a “traffic service” for a picture of conditions between the start and end of the avenue.  Operators needn’t get involved in navigation, route planning, food delivery, timing of goods movement, and so forth.  They simply offer the “traffic service” to those who do want those applications.
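A sketch makes the abstraction obvious.  Assume a hypothetical “traffic service” that correlates the raw readings once and answers corridor-level questions; the data and field names below are invented for illustration.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical "traffic service": the operator correlates raw sensor
# readings once, then answers corridor-level questions.
def traffic_summary(readings: List[Dict], start_cross: str, end_cross: str) -> Dict:
    segment = [r for r in readings
               if start_cross <= r["cross_street"] <= end_cross]
    return {
        "from": start_cross,
        "to": end_cross,
        "avg_speed_mph": round(mean(r["speed_mph"] for r in segment), 1),
        "avg_density_vpm": round(mean(r["vehicles_per_min"] for r in segment), 1),
    }

# An application asks one question instead of querying five miles of sensors.
readings = [
    {"cross_street": "01st", "speed_mph": 22, "vehicles_per_min": 40},
    {"cross_street": "02nd", "speed_mph": 17, "vehicles_per_min": 55},
    {"cross_street": "03rd", "speed_mph": 25, "vehicles_per_min": 33},
]
print(traffic_summary(readings, "01st", "03rd"))
```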

Another issue operators raised is simple confusion over popular marketing terms.  Many of the terms used these days are as confusing to operators as they are enlightening.  One in particular, “disaggregated”, is clearly a victim of over-definition by vendors.  If you can take a router out of its box these days, you can bet the vendor is claiming it’s “disaggregated”.  Most operators weren’t confident in their ability to define the term.  Operators who were confident said it meant that software was separated from hardware (almost 64%), that the control and data planes were separated (28%), or that a router instance was composited from multiple white boxes (8%).

This uncertainty over the meaning of the term seems to arise in part from deliberate vendor hype, and in part because vendors are letting the media and analyst community carry the water for messaging.  It’s often the case that the first definition given for a concept, or the first one to get significant media attention, sets the term in stone.  That can lead to significant mischaracterization by buyers.  In at least three cases I’m aware of, a mandate for “disaggregated” was set by management and misunderstood at a lower level.

I think that any credible open-model network strategy has to provide software/hardware disaggregation and control/data-plane separation.  I think any strategy aimed at high-capacity missions, even metro-level aggregation, will have to be composited from multiple white boxes.  Thus, I think operators who picked any of the options for defining the term were partially right, but since only two picked them all, there’s still a lot of education needed.

That sums up the issue with open-model networking.  It’s hard to have a partial revolution, to have a technology impact that’s confined to one of three areas but that’s supposed to deliver benefits across the board.  There is no one reason why open-model networks aren’t exploding; what’s really needed is either a recognition of the “ecosystemic vision” that combines the three definitions of “disaggregated”, or a camel-concept to stick its nose under the tent.

Operators do have a sort of camel-in-waiting.  The place where open-model networks are expected to “start”, in the view of almost all operators at all stages of commitment, is in 5G Open RAN and 5G Core.  Everyone points out that 5G is budgeted and that the momentum for open-model networking has been greatest (and most visible) in 5G Open RAN.  Yes, there are operators who are taking a broader approach, but the approach that’s getting universal attention is the Open RAN area.

AT&T has been involved in Open RAN for years, perhaps longer than any major operator.  They’ve cited their hopes that it would reduce capex for them, and spur competition and innovation, and both these points are critical to where open networks are heading.  It’s very true that RAN, because it’s an edge technology and thus represents a mass deployment (and cost), would have a major impact on capex.  That’s even more true if we assumed the Open RAN camel managed to get its nose under the broader network tent.

The innovation side is harder to pin down.  What kind of competitive innovation would be possible in an open technology?  Does AT&T think competition in Open RAN will be a race to the commodity bottom, or are they seeing a broader impact?  Might Open RAN open up innovation on the design of the network overall, and even the design relating to how higher-level network services are coupled to the network?  That hope would seem to require at least a plan on what those services, and that coupling, would look like.

For prospective vendors, this is an important point because it means that a sales initiative aimed at Open RAN would likely connect without having the strategic groundwork laid, a groundwork that’s proving difficult to put into place.  The first point operators made, remember, was that strategic/tactical dualism and the problems it created.  Such a targeting could create the risk of “tunnel vision” for vendors, aligning their initiatives so specifically to Open RAN that they can’t easily extend to the rest of the network.  Open RAN, for example, is said by most operators to mandate “control and data plane separation”, but it doesn’t; it only mandates that the 5G control plane be separated from the user plane, which unites control and data plane in IP networks.

Will we have camels leading us, or disaggregation Einsteins?  That’s going to depend on which vendors catch on first.

Is Google Trying to Reshape the 5G Market with (gasp!) Applications?

5G will be truly financially successful for operators only if there’s more money collected from its deployment than was collected before it.  Would users pay more for what’s likely to look like the same wireless they’ve had?  Probably not many would, so 5G depends on finding applications that exploit its more subtle differences.  The 5G supporters have so far lagged in getting these applications going, and now a new player—Google—wants to step in.

Back in March, Google announced a 5G cloud strategy that was less an attempt to host 5G features than to promote 5G-specific applications.  Called the Global Mobile Edge Cloud (GMEC), the concept wasn’t as much to eliminate operator-owned “carrier cloud” deployments as to supplement them.  Many 5G operators (and most larger ones) have service areas that are larger than the geography where they’d likely have real estate to use for their own edge hosting.

Google’s approach then was clearly aimed at driving “generalized” application hosting at the edge, especially in the AI/ML area.  A partnership with AT&T on that effort was announced at the same time, and it was interesting from the first because it didn’t commit Google to hosting 5G, as Microsoft has elected to do with Azure.  Google also announced a specialized Anthos for Telecom offering to manage distributed edge resources as cloud components.

On December 8th, Google expanded this early initiative with an announcement they were bringing 200 applications from 30 partners to the edge using Anthos for Telecom.  According to the Google blog just referenced, the target is “Organizations with edge presences—like retailers operating brick-and-mortar stores, transportation companies managing fleets of vehicles, or manufacturers relying on IoT-enabled equipment on shop floors.”  In other words, IoT applications that are distributed to the point where it’s possible (or even likely) that companies couldn’t support their own edge hosting at a reasonable cost.

One model that I think Google wants to support would still include local controllers in the form of “Raspberry Pi” type resources, such as you’d likely find in “industrial IoT”.  The target need would be processing at a slightly deeper point, processing that would require deployment of actual IT elements, including servers or even server farms.  In other cases, such as in transportation applications, there may not be any local controller at all.

The specific application examples Google offers include rich visual experiences at retail locations, AI-based inspection in industrial applications, content delivery (CDN), and retail (shopping, supply chain, video surveillance, and shelf management).  There’s also a good supply of “horizontal” applications in security, monitoring, etc.

There’s reasonable breadth in the early applications, and the total number isn’t overwhelming.  Operators can pick from the list to create a 5G application set, and then focus on the prospects who fit the application profiles.  It looks to me like the primary value in terms of introducing 5G applications will be the vertically targeted ones; there’s good referential value within a vertical that sales organizations can support, and the horizontal applications are a good way to add some additional meat to the deals.

The fact that Google is basing this on Anthos for Telecom means that it’s designed to support container-and-Kubernetes deployments, including Istio service mesh, and could be hosted on premises if needed.  There’s no specific commitment to that model in the material, but I suspect that all these partners would be willing to offer operators direct hosting of their products on carrier cloud, if needed.  Anthos for Telecom would allow hybrid GMEC operation, given that Anthos is a federated-cloud-and-edge model.

This possible dualism between Google’s cloud and carrier cloud is good, because many operators are concerned about a lock-in relationship with public cloud providers, as they are with equipment vendors.  It’s also good that there’s probably not a huge amount of work involved in hybridizing between the two hosting environments, though if Google wants to promote this value proposition, they’ll need to be more explicit in how a customer would manage the transition.

Obviously, the big advantage of GMEC is that it offers a low-first-cost on-ramp to 5G applications, so operators could test the waters and validate an opportunity, grow with customer interest within GMEC, and (probably) migrate to a hybrid model when they have enough business to deploy their own carrier cloud infrastructure.  GMEC would likely stay in place where operators had customer needs but not sufficient to justify their own infrastructure.  Out of their primary region/country, perhaps?

Another advantage of GMEC is that Google is taking care of the integration/onboarding issues, the very issues that have proved so troublesome in NFV.  That, combined with my first point, means operators could expect to be live with 5G application offerings very quickly if they adopted the GMEC model, compared to months (and sometimes failed attempts) for the NFV stuff.

Enterprises may be a recipient of the third GMEC advantage, even though “Anthos for Telecom” seems to aim at service providers.  Some enterprises with far-flung IoT assets have seriously considered carrier 5G and edge computing, while those with more contained geographies are looking at rolling their own 5G.  Many of Google’s applications, which target enterprise services, are obviously a fit for enterprises who want private 5G, and this group could benefit from using GMEC directly.  For these, though, the question will be whether they need a hosted 5G service set as well as applications.

The final advantage is that Google seems to be taking care not to cross into what operators tend to regard as their personal space, the hosting of actual 5G service elements.  While you could obviously elect to host 5G Open RAN or Core on GMEC, Google isn’t pushing an integrated model on every operator.  That’s likely smart because while smaller operators do tend to like a one-stop-shop approach for 5G and applications, the bigger ones are reluctant to commit to public cloud hosting in perpetuity.

I’ve mentioned in prior blogs that operators have viewed Google as less threatening to their own business model than either Amazon or Microsoft, and it may be that Google is taking special pains to ensure that doesn’t change as things focus on 5G.  They understand perhaps that operators pursued a relationship with Google over 12 years ago, and Google ignored it.  They may feel that they have to do something to build back operator trust, though none of my operator contacts remember the earlier, failed, initiative.

Of course, the big question is whether the GMEC initiative will work.  I remember all the hoopla around virtual network function commitments in the early days of NFV, and most of them came to nothing.  The availability of applications isn’t a guarantee that any of them can make a business case for enough operators to be worthwhile, or that if some do, momentum behind 5G applications will grow.  Victory doesn’t come just by waving a flag and yelling “Charge!”

It’s also true that while application commitments in GMEC form aren’t a sufficient condition for 5G success, some sort of applications in some exploitable, tactical, form are surely a necessary condition.  We’ve been postulating all manner of 5G revolutions around completely hypothetical missions, things we can name but not deploy.  Google is frankly an ironic player to be jumping out to lead the charge for 5G applications (network equipment vendors in the space, or the operators, would be more logical), but the absence of a strategy from others has given Google an opening.  Maybe it’s given the 5G space a lifeline, too.

Do We Need to Look for 5G Openness Beyond Open RAN?

Is it time to look beyond Open RAN?  That question may seem strange given all the positive developments, including the story that Mavenir is looking to build radio units, but I think it’s the perfect time.  In fact, it may be inevitable at this point that the focus of open 5G starts to shift.

One might ask why RAN (meaning 5G NR) had to be opened in the first place.  After all, the 3GPP specs define functional elements with standard interfaces.  The reason is that if you’re going to build something based on hosted functionality, you need to define more than just how the functional blocks connect.  If the hosting part of the story is too divergent, then the elements may be able to connect with each other nicely, but won’t run on the same infrastructure.  Goodbye openness.

Open RAN developed a framework for software-based RAN, which obviously competes with hardware or appliance-based RAN, and appliance RAN favors the big incumbent vendors.  Those vendors point out that Open RAN is more work in integration because there’s more moving parts when you add hosting to functionality.  Since the radio network needs radios, that marketing pushback from the giant network vendors is likely behind Mavenir’s decision to offer its own radios.

It’s hard to see how the same problem that drove Open RAN wouldn’t also drive open-model network principles deeper into the 5G core.  The 3GPP defines functional elements there too, and there are just as many issues with hosting them.  Some have suggested that because NFV is cited by the 3GPP as a requirement for function hosting, there’s no need for further work.  To that, I say that NFV needs plenty of work in itself.  Even the latest NFV specifications on how to make NFV cloud-native simply glue cloud-native elements into an NFV framework that wasn’t working particularly well (look at the onboarding and implementation issues for NFV for proof).

It’s pretty likely that Open RAN principles will be applied to 5G Core, to create (perhaps) Open Core, but because of the NFV association in the standards, it’s not quite as simple.  Either we have to undo the NFV association, we have to somehow harmonize NFV with real cloud-native behavior, or we have to forget cloud-native and stick with NFV.  The latter, I think, is a guaranteed failure.  The second will take too much time, which leaves us with the first option.

As profound an issue as this is, it’s not the major challenge of Open Core (to use the name for convenience).  The deeper you go into the mobile network, the more IP-like it becomes.  The role of mobility management dips all the way into metro infrastructure, and if 5G really is going to increase network traffic and introduce the requirement for “slices” with specific QoS, then mobile issues touch everything in the IP network, through to the core.

My view here is that as 5G touches IP, the touch is going to come between the IP control plane and the 5G control plane.  The 5G user plane is an IP network, and 5G (like earlier IMS/EPC specifications) is drawing on the IP network for service features that don’t exist in IP.  For that reason, there are user-plane elements in 5G that are then controlled by the 5G control plane.  The question is whether it would be more efficient to establish the “services” that 5G needs from IP as network-as-a-service features of the IP network.  If that were done, then it would make sense to think about 5G’s control plane as merging with IP’s control plane.
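To illustrate what that merger might mean in practice, here’s a purely hypothetical sketch in which the 5G control plane requests its user-plane behaviors as NaaS features of the IP network rather than steering traffic through separately hosted elements.  None of the calls or names below come from the 3GPP specifications or any vendor API.

```python
# Purely hypothetical: a 5G control-plane function asks the IP network's
# NaaS interface for the user-plane behaviors it needs, rather than
# deploying a separate element as a destination on the IP network.
class IpNaaS:
    def request_feature(self, feature: str, **params) -> str:
        print(f"NaaS feature '{feature}' bound with {params}")
        return f"{feature}-binding-1"

def attach_session(naas: IpNaaS, ue_ip: str, slice_id: str) -> None:
    # Mobility anchoring and slice QoS become features of the IP network
    # itself instead of functions hosted beside it.
    naas.request_feature("mobility-anchor", ue=ue_ip)
    naas.request_feature("slice-qos", ue=ue_ip, slice_name=slice_id, latency_ms=20)

attach_session(IpNaaS(), ue_ip="10.20.30.40", slice_id="iot-slice-1")
```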

If you look at the ONF’s recent Aether story, which is directed at private enterprise 5G, you find what I think are the justifications for Open Core, and also a model to support it.  Not surprisingly, the ONF is building on its own programmable-network model, which is based on SDN, but remember that ONF programmable networks can be said to be the model for producing NaaS.  And it’s notable that Aether, in the ONF material, is divided into an “Aether Edge” and an “Aether Central” pair, with the latter including (you guessed it) the mobile control plane.

It’s also notable, in my view, that Aether’s specific target is enterprise IoT, which happens to be the most stridently promoted public 5G application that would justify network slicing, and a full implementation of 5G Core.  In a way, Aether is almost a case study for what I’ve been calling Open Core, aimed at IoT applications.

Mavenir already offers software-based 5G throughout, but they’ve not yet addressed the potential in marrying NaaS with a combined IP and 5G control plane.  That’s logical given that they don’t make SDN controllers and wouldn’t want to tie their 5G implementation to a massive change in IP infrastructure that they have no product stake in promoting.

Speaking of stakes, though, it’s interesting that the ONF is making such a strong statement on 5G and IoT in a private-5G context.  There are a lot of possible reasons for this, and any one of them could be an indicator of a future revolution in 5G infrastructure design.

The first reason, and the one that’s most obvious, is that the ONF sees private 5G as a topic that’s getting a lot of media attention and just might actually be a major market opportunity.  There are a couple thousand network operators but there are tens of thousands of enterprises.  Not only that, those enterprises have no special loyalty to the big mobile infrastructure vendors, and they have a long history of accepting (even demanding) open source and openness in general.

The second reason is that the ONF may think it’s a bit late to take on established open 5G options in the operator markets.  There are a number of established (if proprietary) vendors, a number of open-model players (including Mavenir), and public cloud provider interest too.  Given that Aether, if it works for enterprises, would also likely work for operators, the ONF might be making an end run.

That circles us back to Mavenir.  If there’s a lot of interest in fielding 5G infrastructure, open or not, then it’s more challenging to sell it as the field gets more crowded.  Integration is always a big concern for open-model solutions, and so having its own radios is a way for Mavenir to address the multi-supplier finger-pointing challenge.  It’s not a durable way, though.  Do they then decide to build white-box switches?  Evolve into being an Ericsson-like vendor?  Or, do they need differentiation?

What makes all this especially complicated is that it’s hard to differentiate in an open-model space without looking like you’re not part of the open-model strategy.  One of the things I think we can look forward to in 2021 is progress on the question of just how open networks are differentiated.  I think the answer is going to come from the disaggregated plays of both startups like DriveNets and incumbents like Cisco and (most recently) Juniper.  Whether the jousting will create a final answer can’t be known right now, but it’s going to be fun to watch.

Is Apstra the Third Leg of Juniper’s Strategic Stool?

Juniper has clearly gone on an acquisition tear.  Just as they’ve completed their acquisition of SD-WAN technology leader 128 Technology, they’ve announced the purchase of Apstra, a player in intent-based data center operations automation and abstraction.  Given rival Cisco’s recent container-centric announcements, is Juniper simply trying to counterpunch, or does Juniper see 128 Technology’s contribution to its branch-and-WAN strategy and Apstra’s contribution to an automated data center as a killer combination?

The price Juniper is paying for Apstra hasn’t been disclosed, but financial experts tell me it’s well below $400 million, and thus lower than what Juniper paid for Mist or 128 Technology.  In none of these cases did Juniper buy for revenue; none of the companies would have been a smart buy on that basis.  Only one of the group, Mist, had great market visibility, so the common element was technology.  Juniper wanted, and wanted badly, some technology goodie their targets had.  We know that for Mist it was AI support and automation, for 128 Technology it was session awareness, and we know how the two tie in with each other.  I think Apstra is another piece of the same story.

The network and the data center combine to create the experiences that empower workers, connect customers and the supply chain, and bind a complex organization into an efficient unit.  Or, at least, they’re supposed to.  The part about creating experiences is true, but the “complex” problems of growing IT dependence and infrastructure complexity are getting in the way of efficiency.

Juniper’s Mist AI concept, acquired with Mist and expanded since then, introduces a model for AI-based automation of experience delivery.  The challenge is to understand what experiences you’re trying to support, and then to influence how the traffic and IT resources associated with them are handled.  As I suggested in my blog yesterday (referenced above), 128 Technology provides the answer to the first of those challenges, and I think Juniper intends Apstra to address the second.

Suppose you knew everything about the way all the critical IT applications of your enterprise were working, that you could identify problems and create solutions automatically, and that you could directly support users inside and outside the company.  That’s the new story Juniper is promoting, I think.  They want Mist AI to see all, control all, and do it based on the best possible foundation: an understanding of what workers, customers, and business partners are doing, and how valuable each of those experiences is to the company.

Apstra was one of the first vendors to recognize the value of intent models, something I’ve blogged about quite a bit.  Their basic theory is simple; you can’t assure data center infrastructure and operations if you don’t know what it’s supposed to be doing.  That’s the “intent” piece.  If you do understand intent, you can then automate the process of fulfilling it.  Apstra has a vendor-independent set of abstractions that let you assemble a data center from implementation-independent elements and make it work toward a specific, defined, business goal.
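
To show what that means in practice, here’s a purely illustrative Python sketch of the declare-observe-reconcile pattern that intent-based operation implies.  It doesn’t reflect Apstra’s actual data model or API; the keys, values, and functions are all invented for the example.

```python
# Illustrative sketch of the intent-model idea: declare what the fabric should look
# like, observe what it actually is, and automate away the difference.
intended = {
    "spine_count": 2,
    "leaf_count": 4,
    "links_per_leaf": 2,   # each leaf should connect to every spine
}

def observe_fabric() -> dict:
    # Stand-in for telemetry collection from the real devices.
    return {"spine_count": 2, "leaf_count": 4, "links_per_leaf": 1}

def reconcile(intent: dict, observed: dict) -> list:
    """Return the remediation actions needed to bring observed state back to intent."""
    actions = []
    for key, want in intent.items():
        have = observed.get(key)
        if have != want:
            actions.append(f"adjust {key}: have {have}, want {want}")
    return actions

print(reconcile(intended, observe_fabric()))
# ['adjust links_per_leaf: have 1, want 2']
```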

You can see how this would fit with the 128 Technology deal.  Because of its session-aware handling of user traffic, 128T knows about the experiences, the user-to-application (and even application-to-application) relationships that combine to define corporate IT.  Mist AI can (presumably; Juniper is still messaging its integrated positioning) support, monitor, and even assure these sessions through the company network.  Add in Apstra, and you can extend that visibility, assurance, monitoring, and support throughout the total scope of the experience, from the application and data center out to the user.

Like the 128 Technology deal, this is way more important to Juniper than just the revenue from the acquired company’s own sales efforts.  End-to-end experience management for business applications, and for public cloud components too, is a very powerful story to tell buyers.  AI-based operations automation and user support is similarly powerful, and together they create something that could well be the smartest thing Juniper ever did.

The thing that makes this smart is the simplicity and clarity of the vision.  Unlike Cisco’s container strategy, which admitted to multiple possible justifications and thus multiple possible paths to realization, there seems to be one single thing—experience control end to end—that’s driving Juniper.  It’s always easier to make something work when there’s one very specific thing you’re trying to do, and it’s easier yet if your acquisitions tie into that single thing in the same way.

That raises the question of whether Juniper might incorporate the Apstra intent-model concept into its own AI-based framework, including 128 Technology.  Intent modeling facilitates management and operations automation not only by setting specific, usage-based goals, but also by breaking complex structures into smaller pieces that are more easily self-operationalized.  Intent models also abstract infrastructure, something that’s an explicit feature of Apstra’s approach, and that can facilitate control of multi-vendor networks.  That, in turn, could make it easier for Juniper to penetrate accounts owned by other vendors (say, Cisco).

The value of this to enterprise buyers is very clear, but there were some other tidbits in Juniper’s announcement that I think warrant some exploration and speculation.  One is the Juniper reference to SONiC, and the other the focus of the announcement on T-Systems.

SONiC is a Microsoft-launched open-source operating system for white-box switching, and Juniper supports SONiC on its own switches, aiming in particular at cloud data centers and the public cloud or “hyperscaler” players.  The decision to support SONiC was easy in a way; the big buyers demanded it, and it also demonstrates that data center switching is increasingly a part of the data center rather than the network.

Cisco and Juniper have both touted their “disaggregation”, more to respond to buyer pressure than because they really wanted to.  Selling naked switch/router hardware in competition with white-box vendors isn’t an attractive play for either, though it’s better than risking losing a deal, or even getting kicked out of a data center, for not supporting it.  There’s been no indication that either company had any strategic vision for the notion.

Till now, perhaps.  Apstra could give Juniper an opportunity to pull white boxes into the Juniper ecosystem without abandoning or contaminating its own monolithic switch positioning.  That would make the Mist AI story a piece of a strategy for countering the white-box movement, something every network equipment vendor really wants and needs to do.

The T-Systems focus illustrates that Apstra has actually done rather well with network operators, and in particular with operators who have started to climb the value chain in terms of services.  There’s a strong managed service opportunity created by the 128 Technology acquisition, and managed network services make a pretty nice and obvious companion element to a software-service story.  In fact, almost any higher-level service an operator decided to launch would do better if it had a managed service foundation, and of course had intent-modeled cloud data center infrastructure too.

Clarity of vision, to quote myself from an earlier point in this blog, is great.  Clarity of vision coupled with timely execution is better.  Combine both with elegance of positioning and it’s the greatest of all.  Just yesterday I pointed out that Juniper still had to execute on a vision of a vertically integrated network based on Contrail, Juniper IP, and 128 Technology.  Now they need to execute on Apstra too.  Because there’s so much potential symbiosis here, that’s far from an impossible task, or even a hugely difficult one, but big router vendors in general, including Juniper, haven’t always been great at strategic positioning.

Opening up the SONiC story may be a particular complication.  SONiC and Apstra really could open up a door for Juniper to enter into almost any data center, but can they do that, and address what they find?  In particular, can they develop a positioning for those higher-level services that T-Systems seems to be coveting?  Can they harmonize SONiC support with their own Junos?  The fact that they highlighted SONiC in their press release suggests they don’t plan on abandoning it, and what one doesn’t abandon, one must live with, especially in a positioning sense.

This all combines to create a very interesting point.  Two mutually supporting acquisitions could be a coincidence, but three is a strategy.  While Cisco is apparently seeing future success lying totally above or outside the network, Juniper thinks that there’s still an opportunity to improve traditional connectivity, the baseline network service for operators and enterprises.  They may also see an opportunity to leverage some of the tools associated with connection service modernization in rising up to a higher, application, level.  If that’s the case, then Cisco might be abandoning the traditional network market to a newly invigorated rival.

A New Age in Virtual Networking?

Sometimes a term gets so entrenched that we take its meaning for granted.  That seems to have happened with “virtual network”, despite the fact that just what the term means and how one might be created has changed radically over the years.  In the last year, I asked almost a hundred enterprise and service provider network planners what a “virtual network” was, and there wasn’t nearly as much agreement as I thought there’d be.

Conceptually, a “virtual network” is to today’s IP network what “virtual machine” is to a bare-metal server.  It looks like a network from the perspective of a user, but it’s hosted on a real network rather than being a direct property of it.  There are many drivers for virtual networks, which probably accounts for the multiplicity of meanings assigned to the term, but there’s one underlying issue that seems to cross over all the boundaries.

Real networks, at least real IP networks, were designed to connect sites rather than people.  They’re also a combination of Level 2 and Level 3 concepts—a “subnet” is presumed to be on the same Level 2 network and the real routing process starts when you exit it, via the default gateway.  The base concept of real IP networks worked fine as long as we didn’t have a lot of individuals and servers we expected to connect.  When we did, we ended up having to gimmick the IP address space to make IPv4 go further, and we created what were arguably the first “virtual networks” to separate tenant users in a shared data center or cloud.
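
Here’s a tiny, purely illustrative Python example of why multi-tenancy pushed us toward virtual networks: two tenants can legitimately use the same private (RFC 1918) prefix, so an address alone no longer identifies anything, and you need a per-tenant network context to interpret it.

```python
# Two tenants, same private prefix: the address only means something inside a
# tenant's own virtual network.  Purely illustrative.
import ipaddress

tenants = {
    "tenant-a": ipaddress.ip_network("10.0.0.0/24"),
    "tenant-b": ipaddress.ip_network("10.0.0.0/24"),   # same prefix, different tenant
}

def lookup(tenant: str, addr: str) -> str:
    net = tenants[tenant]
    assert ipaddress.ip_address(addr) in net
    return f"{addr} is only meaningful inside {tenant}'s virtual network ({net})"

print(lookup("tenant-a", "10.0.0.5"))
print(lookup("tenant-b", "10.0.0.5"))
```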

Another problem that’s grown up in recent years is the classic “what-it-is-where-it-is” question.  IP addresses are linked to a network service access point, which is typically the gateway router for a site.  A user of the network in a different site would have a different address.  In mobile networks, having a smartphone roam to another cell means having it leave the place where its connection was made, so mobility management uses tunnels to follow the user, which is itself a form of virtual network.

The what/where dilemma can also complicate security.  IP networks are permissive in a connection sense, meaning they presume any address can send something to any other, and this has created a whole security industry.  Prior to IP, the dominant enterprise network protocol was IBM’s Systems Network Architecture (SNA), which used a central element (the System Services Control Point, or SSCP) to authorize “sessions” within the network, a session being a relationship between network parties (users) rather than network components.  That security industry, layered onto the huge installed base of IP devices, has made it harder and harder to change IP in any fundamental way, which has again boosted the notion of virtual networking.

Then there’s the big issue, which is “best efforts”.  IP does support traffic engineering (MPLS, for example), but typically within the carrier network and not out to the user endpoints.  A branch office, or even a headquarters location, typically doesn’t have MPLS connectivity.  Traffic from all sources tends to compete equally for resources, which means that if there are resource limitations (and what network doesn’t have them?) you end up with congestion that can impact the CxO planning meeting as easily as someone’s take-a-break streaming video.

There have been proposals to change IP to address almost all these issues, but the installed base of devices and clients, combined with the challenges of standardizing anything in a reasonable time, has limited the effectiveness of these changes, and most are still in the proposal stage.  So, in a practical sense, we could say that virtual networks are the result of the need to create a more controllable connection experience without changing the properties of the IP network that’s providing raw connectivity.

Building a virtual network is different from defining what the term means.  There are two broad models of virtual network currently in play, and I think it’s likely they will represent the future models of virtual networking as well.  One is the software-defined network, where forwarding behavior is controlled by something other than inter-device adaptive exchanges, and where routes can be created on demand between any points.  The other is the overlay network, where a new protocol layer is added on top of IP, and where that layer actually provides connectivity for users based on a different set of rules than IP would use.

The SDN option, which is obviously favored by the ONF, falls into what they call “programmable networks”, meaning that the forwarding rules that lace from device to device to create a route are programmed in explicitly.  Today, the presumption is that this happens from a central (probably redundant) SDN controller.  In the future, it might happen from a separate cloud-hosted control plane.  The advantage of this is that the controller establishes the connectivity, and it can fulfill somewhat the same role the SSCP did in those old-time SNA networks (which, by the way, still operate in some IBM sites).
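
As a minimal sketch of what “programmed in explicitly” means, here’s an illustrative Python fragment in which a controller computes a path and pushes a match/action rule to each device along it.  The device names, match fields, and port logic are invented for the example; a real controller would push the rules through a southbound protocol such as OpenFlow or P4Runtime.

```python
# Illustrative only: a central controller installs explicit forwarding rules hop by hop.
path = ["switch-a", "switch-b", "switch-c"]          # path computed by the controller
match = {"src": "10.0.1.5", "dst": "10.0.9.7"}       # the flow being connected

def install_rule(device: str, match: dict, out_port: int) -> None:
    # Stand-in for the controller's southbound push to the device.
    print(f"{device}: if {match} then forward to port {out_port}")

for hop, device in enumerate(path):
    out_port = hop + 1   # stand-in for topology-derived port selection
    install_rule(device, match, out_port)
```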

As straightforward and attractive as this may sound, it still has its issues.  The first is that because SDN is a network change, it’s only available where operators support it.  That means a global enterprise would almost certainly not be able to use the SDN approach to create a custom connectivity service across its entire geography.  The second is that we have no real experience on whether the SDN concept scales to very large deployments, or on whether we could add enough entries to a flow switch (the SDN router) to accommodate individual sessions.

The overlay network option is already in use, in both general virtual-network applications (VMware’s NSX, Nokia/Nuage, etc.) and in the form of SD-WAN.  Overlay networks (like the mobility management features of mobile networks) take the form of “tunnels” (I’m putting the term in quotes for reasons soon to be clear) and “nodes” where the tunnels terminate and cross-connecting traffic is possible.  This means that connectivity, to the user, is created above IP and you can manage it any way you like.

What you like may not be great, though, when you get to the details.  Overlay virtual networks add another header to the data packets, which has the effect of lowering the link bandwidth available for data.  Header overhead depends on packet size, but for small packets it can reach 50% or more.  In addition, everywhere you terminate an overlay tunnel you need processing power, and the more complex the processing, the more power you need.
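
A quick back-of-the-envelope calculation shows why small packets suffer most.  Assuming a VXLAN-style encapsulation of roughly 50 bytes (outer Ethernet, IP, UDP, and VXLAN headers), the overhead works out like this:

```python
# Overlay overhead versus payload size, under an assumed ~50-byte encapsulation.
ENCAP_BYTES = 50  # assumption; actual overhead varies by overlay technology

for payload in (40, 256, 1400):
    overhead_pct = ENCAP_BYTES / (payload + ENCAP_BYTES) * 100
    print(f"{payload}-byte payload: {overhead_pct:.0f}% of the link goes to overlay headers")
# 40-byte payload (a small voice packet): 56%; 256 bytes: 16%; 1400 bytes: 3%
```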

It’s logical to ask at this point whether we really have an either/or here.  Why couldn’t somebody provide both implementations in parallel?  You could build a virtual overlay network end-to-end everywhere, and you could customize the underlying connectivity the virtual network is overlaid on using SDN.

Now for the reason for all those in-quotes terms I’ve been using, and promising to justify.  Juniper Networks has its own SDN (Contrail), and they just completed their acquisition of what I’ve always said was the best SD-WAN vendor, 128 Technology.  What 128T brings to the table is session awareness, which means that they know the identity of the user and application/data resources, and so can classify traffic flows between the two as “sessions”, and then prioritize resources for the ones that are important.  Because 128T doesn’t use tunnels for a physical overlay (they have a “session overlay” that has minimal overhead), they don’t consume as much bandwidth and their termination overhead is also minimal.
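
Here’s a conceptual Python sketch of what session-aware prioritization means: classify a flow by who and what it connects, rather than by address, and then apply policy.  It’s not 128 Technology’s actual design, just an illustration of the principle, and the roles, applications, and priority classes are invented.

```python
# Conceptual sketch: sessions are classified by user role and application, then
# mapped to a priority class.  All policy entries here are invented examples.
POLICY = {
    ("executive", "video-conference"): "critical",
    ("any",       "streaming-video"):  "best-effort",
}

def classify(user_role: str, application: str) -> str:
    specific = POLICY.get((user_role, application))
    return specific or POLICY.get(("any", application), "best-effort")

print(classify("executive", "video-conference"))  # critical
print(classify("engineer", "streaming-video"))    # best-effort
```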

What Contrail brings is the ability to manipulate a lower-level transport property set so that actual IP connectivity and the SLA are at least somewhat controllable.  With the addition of Juniper’s Mist AI to the picture for user support and problem resolution, you have a pretty interesting, even compelling, story.  You can imagine a network that’s vertically integrated between virtual, experience-and-user-oriented, connectivity and a virtualization layer that’s overlaid on transport IP.  From user, potentially, to core, with full integration and full visibility and support.

If, of course, this is a line Juniper pursues.  The good news is that I think they will, because I think competitors will move on the space quickly, whether Juniper takes a lead with 128T or not.  That means that while Juniper may surrender first-mover opportunities to define the space, they’re going to have to get there eventually.  They might as well make the move now, and get the most possible benefit, because it could be a very significant benefit indeed.

What’s Behind Cisco’s Container-Centric Software Strategy?

Cisco loves containers.  There’s no question that container and software-related acquisitions have dominated Cisco’s recent M&A, but it’s reasonable to wonder what they hope to gain.  Does Cisco think they can become a competitor to cloud software and server companies, are they betting on hosted network elements, or what?  Cisco’s acquisition of Banzai Cloud last month is perhaps a “tell” regarding Cisco’s direction.

Cisco has been in the server business since 2009, with its Unified Computing System (UCS) line.  At first, UCS was pretty much a server story, but since then, and especially within the last couple years, Cisco has been picking up platform software to augment their hardware position.  Given that Cisco always tells whatever story seems likely to gather the most PR momentum, it’s never been really clear where they wanted to go with the stuff.

I think that, early on, Cisco’s foray into servers came out of their data center switching business.  Cisco is, and always has been, a very sales-directed company.  The IT organization tends to be the buyer of data center switching rather than the networking organization, and so Cisco’s salespeople were calling on a new cadre of prospects for their rack switching.  Given that data center switches in general, and top-of-rack systems in particular, connect server farms, the demand for them comes as a result of an expansion in the server area.  Salespeople ran back to Chambers (CEO of Cisco at the time) and suggested Cisco get into the server business.

UCS has generated respectable but not spectacular revenue for Cisco; its growth peaked around 2015, and the largest group of UCS customers was in the software and technology space.  UCS has lost market share over the last several years, according to most analysts’ reports.  That coincides with the sudden growth in cloud computing and containers, and it’s what raises the question of Cisco’s motives.

Cisco might be doing nothing more than aligning UCS with current platform directions.  Users increasingly want to buy hosting platforms, which include both the servers and the necessary operating system and middleware tools.  Even more significant is the user’s focus on the platform software for the hosting value proposition; the servers are just cost centers to be negotiated into “margin marginalization”.  Since Cisco doesn’t want to be in a commodity business, it makes sense to build the value-add.

The Banzai Cloud deal may be the “tell” for this view, as I’ve already suggested.  Banzai Cloud focused on enterprise cloud-native development and deployment.  If Cisco wants to be a hosting-platform player for the enterprise, building on the credibility of the original UCS mission, then jumping out ahead of the current market is critical; there’s too much competition in vanilla containers.  Differentiation would help Cisco sustain margins.

The only problem with this is that IBM/Red Hat and VMware are also jumping into the cloud-native space, and from a position of an established data center vendor.  Their approach is to replace the software platform while being server agnostic, meaning that to compete with them, Cisco would have to either sell software without UCS servers, or displace existing servers to move UCS servers in.  The former means going head-to-head with established vendors, and the latter would be a tough sell to enterprise CFOs.

So, what are they doing?  A second possibility is that Cisco is shifting its focus to a future convergence of network and cloud.  Remember that Cisco’s revenues are overwhelmingly from network equipment, and their best profit margins have been in the router space.  With routers under price pressure, and with network operators and enterprises both looking at open-model networks, Cisco’s core business is under pressure.  Could it be that Cisco is looking to sell servers to buyers who have new network missions that involve servers and network equipment?  Think “carrier cloud”.

Carrier cloud is kind of like the Seven Cities of Gold; everyone “knew” they were out there, but nobody ever found them.  The potential of carrier cloud is enormous, over 100,000 data centers by 2030, containing millions of servers.  It would be, if deployed, the largest single source of new server purchases in the global market.  Best of all, from Cisco’s perspective, carrier cloud is sold to carriers, people Cisco has been selling to for decades.

The problem with this is that operators are far from a strong financial commitment to carrier cloud.  Most of them see applications of “carrier cloud”, but few of them are confident at this point that they can make a business case for them, or even assign a cost to them, given a lack of understanding of just how the applications would work.  NFV was the only carrier cloud driver that operators really understood, and it failed to develop any credibility.  It’s not like Cisco, sales-driven as it is, to spend a lot of sales resources educating a market that they know will be competitive if buyers learn the ropes.

5G and Open RAN might be the next opportunity for carrier cloud, and for Cisco.  Here, the opportunity and the execution on it are developing pretty quickly, there’s funding and budget in place, and there’s clear market momentum.  Cisco could well see an opportunity to grab hold of this next driver, and by doing so gain control over carrier cloud.  They might also be able to use 5G and Open RAN to cement a position in the “separate control plane” model.  Cisco disaggregates IOS and its router hardware, but to make that position real, they need to separate the control plane and extend it, at least in part, to the cloud.

The problem with this is that it’s still a big reach for a company that’s never been a software innovator.  I think it’s more likely that a Cisco competitor would jump on this opportunity first, and Cisco probably sees it the same way, which means they’d be unlikely to invest a lot of resources here at this point.  Fast-follower is their role of choice.

What does that leave?  “Eliminate the impossible, and what’s left, however improbable, must be the answer.”  I think that while none of the possible Cisco motives for container interest are impossible, the one least improbable is the first one, that Cisco is seeking broader data center traction.  A recent Network World piece seems to reinforce Cisco’s interest in the enterprise data center as their primary motivator.

One good reason is the one already cited; Cisco has engagement with the buyer in that space already.  Another good reason is that Cisco thinks the enterprise, or at least the leading-edge players in that space, are likely to move faster than the service providers.  Service provider profit per bit challenges are profound, and it may take years for them to evolve a strategy and fund it.

A final, possibly critical point is that carrier cloud is more “cloud” than “carrier”.  If there is a credible market for carrier cloud in the future, it will involve service features based more on traditional public cloud technology than on network technology.  Thus, a Cisco initiative to address near-term cloud-native opportunity for the enterprise today could pay dividends for carrier cloud initiatives in the future.

Can a Fiber-Centric Strategy Help AT&T?

Is AT&T right about fiber?  The CFO says it’s a “three for one” revenue opportunity, which is why the operator says it’s likely to add to its fiber inventory.  One clear message you can draw from that is that a one-for-one or two-for-one might not have been enough, which leads us, I think, to the real reasons why AT&T is looking for more glass in the ground.

Consumer fiber, meaning FTTH, requires a fairly high demand density to justify, because its “pass cost”, or cost just to bring the fiber to the area of the customer to allow for connection when there’s an order, is high.  Operators put the FTTH pass cost in the over-five-hundred-dollar range, and at that level, there’s way too much area that residential fiber can’t easily reach.

If you can’t make residential fiber broadly cost-effective, perhaps you can gang it with other fiber applications, notably things like 5G backhaul and multi-tenant applications like the strip malls the article talks about.  If you look at everything other than large-site fiber as being an application of passive optical networking, you can see that just getting PON justification in a given area could open that area up to FTTH at a low enough incremental cost to make it profitable.

Of the possible non-residential fiber drivers that could be leveraged, the most interesting could be microcells for 5G and fiber nodes for 5G millimeter wave.  The former mission is valuable in both more rural settings and in high-density retail areas, and the latter in suburban locations with highly variable demand density, where pass costs for FTTH could limit how much of the suburb you could cover with high-speed broadband service.

Every restaurant and retail location knows that customers like WiFi, and restaurants in particular almost have to offer it.  People often use WiFi in restaurants to watch videos and do pretty much the same things they’d do at home, but in a larger concentration.  If you spread a few 5G microcells around a heavily strip-malled area, you could feed them with fiber, getting fiber closer to the residential areas they serve.

From there, you could then consider the 5G/FTTN hybrid model.  By extending strip-mall feeds to a local 5G millimeter-wave node, you could reach residential and even small-business sites up to about a mile away for high-speed broadband delivery.  Each decent-sized strip mall could be a multi-purpose fiber node that supported 5G mobile services as well, enhancing total capacity and service credibility.  In fact, the combination of 5G/FTTN and 5G mobile could be a killer in the suburbs, and of course it facilitates wider-ranging fiber deployment.

Additional cells also improve the chances of open-model networks, and of open-model 5G in particular.  One of the factors operators cite to justify proprietary 5G RAN is that open-model implementations lack support for the massive MIMO that could improve cell capacity.  With more, smaller cells, there’s less pressure to provide very high-capacity cells.  In fact, this may be a major factor in AT&T’s dense-cell strategy; they also have a major commitment to open-model 5G.

Abundant fiber could also improve the classic capacity-versus-complexity tradeoff in network design.  If you have a lot of capacity, you need less traffic management and less complexity at Level 3, which is where most opex is generated.  You can also probably rely more on generic white boxes for your data paths, as long as they can support the capacity you need.  The effect of combining open-model IP networking with higher optical capacity and density is to shift capex toward fiber and reduce opex.

There’s no question in my mind that AT&T is right about using more fiber, creating more 5G nodes.  There is still some question on whether the move can really fully address AT&T’s rather unique issues as a Tier One.  To understand why, and what might help, we have to dig a bit into those issues.

Most Tier One operators evolved from wireline carriers who served populous regions.  AT&T is fairly unique in that its wireline territory has more widely dispersed populations; rival Verizon has seven times the demand density, meaning much more opportunity per unit of geographic area.  In cities and typical suburbs, AT&T and Verizon are comparable, but in more distant suburbs and rural areas, AT&T is far less dense.  When Verizon started with its FiOS plans, it shed some of the localities where there was no chance FiOS could be profitable, to eliminate the problem of having some customers who could get great wireline broadband and others who could not.

Wireless is different, of course, and more so if you factor in the 5G/FTTN hybrid.  Instead of a pass cost of about $500 per home or business for FTTH, your pass cost could drop to less than $100, provided you had reasonable residential density within a one-mile radius of the node.  That would cover about 80% of the thin suburban locations.  Add in mobile-model 5G, with a range of 8 to 20 miles from the tower, and you have the potential to cover the entire territory with acceptable pass costs.
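
To see why the node model changes the economics, here’s a rough, purely illustrative calculation.  The node cost and housing density are assumptions I’ve picked just to do the arithmetic, not sourced figures; only the one-mile reach and the roughly $500 FTTH pass cost come from the discussion above.

```python
# Rough illustration of node-based pass cost.  Dollar figures and density are assumed.
import math

node_cost = 150_000          # assumed cost of one fiber-fed millimeter-wave node
homes_per_sq_mile = 800      # assumed thin-suburban housing density
radius_miles = 1.0           # millimeter-wave reach cited above

homes_passed = homes_per_sq_mile * math.pi * radius_miles ** 2   # about 2,500 homes
pass_cost = node_cost / homes_passed
print(f"{homes_passed:.0f} homes passed, roughly ${pass_cost:.0f} per home")
# Roughly $60 per home under these assumptions, versus about $500 for FTTH.
```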

That’s why AT&T’s decision to drop DSL is smart.  They have too many thinly populated areas to sell off everything where FTTH won’t work, so they have to find something that does work, and the 5G option is their best answer.  In fact, AT&T’s network business salvation lies in focusing on 5G infrastructure for both wireless and wireline, using FTTH only where customer density is very high.  If 5G millimeter wave works out, in fact, they might well be better off not using FTTH anywhere.  Going full 5G would improve their demand-density problem significantly, to the point where their effective density would triple.

That’s not enough for them to be profitable, in the long run, from delivering broadband alone.  Instead of being ahead by seven times, Verizon would be only a bit more than double AT&T’s effective density.  AT&T would still get some kicker from the Time Warner acquisition, but they’ll need new revenue streams.  If they move totally to a 5G model, meaning a pure packet model, they’d be committed to streaming video, which they’ve already failed to capitalize on.  Can they do better, in streaming and elsewhere?

Maybe, because the open-model, separate-control-plane network would also potentially address their new-revenue challenge.  5G has some control-plane cloud-hosting potential (white boxes are still the best approach for the data plane), and future services built on contextual/personalization processing and IoT all depend on mobile access.  If AT&T did intelligent network modeling for this combination of a pure-5G future and new contextual/IoT services, they could get pretty well out in front on generating new revenue, credible and significant new revenue, in fact.

Can they do that?  AT&T has been, perhaps more than any network operator, a driver of open-model networking.  They’ve not always been the most insightful driver, though.  There is a risk that their white-box emphasis will fix their attention on boxes rather than software, that they’ll view software as just a necessary component of white-box networking rather than its real justification.  If they can learn the software lesson, or if a vendor can teach it to them, they’ll have a shot at a future.

The Street is mixed on AT&T today.  Some love their dividend and see them as a safe play, and some say that beyond the dividend-of-the-moment, there may be bad moments ahead.  I think that what AT&T has done so far has not secured their future.  I think fiber enhancement and even open-model networking won’t secure it either.  But I think that these measures have bought them at least two or three years to do the right thing.  It’s just a matter of their identifying it, then executing on it.

There’s not much awareness among operator planners regarding the architecture of a monetizable future service set, or the network/cloud symbiosis needed to create and sustain it.  There’s also, apparently, a serious shortage of cloud software architecture skills in operator organizations.  Finally, operators still see “services” as being the result of combined device behaviors rather than the creation of software functionality.  I think AT&T is working as hard as any operator to deal with all these issues, but they need to get moving; remember their cost management measures will buy them three years at the most.