Juniper Takes Another Metro Step; Is It Enough?

Back in February, Juniper did a couple of blogs on its announcements, raising the question of whether the company was getting ready for some major positioning changes. I noted that in this blog, and suggested that perhaps Juniper was making some major virtual-network changes, ones that could even end up impacting Juniper’s Cloud Metro story and positioning Juniper in the metro space in a decisive way. On March 17th, they announced Apstra integration with CN2, asking “Why are our data center and telco cloud customers so enthusiastic about the CN2 and Apstra integration?” Why? We’ll see.

Juniper’s blog post on this opens some points of interest for prospective metro applications. One thing that general function virtualization demands is the ability to host on pretty much anything, including VMs and bare metal. However, the optimum platform for most virtual functions would be containers, and the optimum orchestration strategy would be Kubernetes. CN2 supports that, and now Apstra can manage the switching fabrics in the data centers where mixed hosting options for virtual functions could be critical.

If you believe in service features, you believe in metro feature hosting. If you believe in edge computing, you believe in metro feature hosting. In fact, there’s not much new or exciting in either telecom or cloud computing that doesn’t absolutely depend on metro feature hosting, which is why I have a hard time understanding why vendors haven’t been all over it for years. Even given the fact that telcos are dragged kicking and screaming into anything, it seems, beyond tin cans and strings, it should be clear that somebody has to grab some arms and legs (and ear plugs) and start pulling. And all the drivers involve hosting, so data center integration with access and aggregation in one direction, and with the core connection in the other, means metro is a big technology junction point.

Which is why I think Juniper may be getting more serious about it. Keep in mind that however much core on-ramp or access aggregation goes on in the metro, it’s the feature injection piece that’s driving the bus, requirements-wise, and even if you’re not going to build features yourself, you have to prep for unified hosting and management. I think Kubernetes and CN2 are two legs of an evolving metro stool, and the Nephio project is the third.

Let me quote the Nephio mission: “Nephio’s goal is to deliver carrier-grade, simple, open, Kubernetes-based cloud native intent automation and common automation templates that materially simplify the deployment and management of multi-vendor cloud infrastructure and network functions across large scale edge deployments. Nephio enables faster onboarding of network functions to production including provisioning of underlying cloud infrastructure with a true cloud native approach, and reduces costs of adoption of cloud and network infrastructure.” Juniper is a Nephio participant.

The point here is that Kubernetes is what makes the cloud go around. CN2 is what extends the cloud’s virtual networking to non-containerized, even non-cloud, server assets, and Nephio lets Kubernetes manage related network assets. The three combine to define the glue that ties all the various technologies in metro into a unified, manageable entity. Not only for telcos, but for any public cloud providers or interconnects that want to play in function hosting and/or edge computing, which means pretty much all of the above.

All of this raises interesting opportunities, questions, and challenges, and not only for Juniper. As I noted in my referenced blog, Juniper has significant technology assets that could be highly relevant in the metro space. So do other vendors, most notably Nokia, who’s the topic of a blog I did earlier this week. Nokia hasn’t done much in the way of explicit metro positioning, but Nokia has a handle on the edge opportunity, and that’s driving the feature bus. Juniper has a position on the network side. The point is that if feature injection is the goal of metro, even Kubernetes-based operations is a long way down the stack from the features we’re trying to inject.

We’re really looking for a harmony of hosting and network to create a standard development platform and encourage feature production. That means it’s not only Nokia that could threaten Juniper’s potential metro evolution; so could other players not even in the networking space. While real threats would only be solidified by a full set of “feature middleware”, there are intermediate things that could be done to both step toward that feature goal and provide some interim solutions to feature-hosting problems, in part by creating some unity between NaaS and PaaS. That’s particularly true if we generalize “feature hosting” to mean “edge computing”.

Tenant management in edge computing, and the related topic of tenant security, is way more critical than in the cloud. That’s because edge computing is really justified only for real-time applications, and real-time applications connect to the real world, where mischief could create a disaster of major proportions. That, I think, means that virtual network control is absolutely critical at the edge, and it has to meld the Kubernetes notion of virtual networks (which is the cloud/container view) and the network view. Unity, stroke one.

The next point is that we can’t have people writing edge apps for bare hosting, in large part because of that security point. Keep in mind, too, that security at the edge means more than keeping out bad actors; it means ensuring that Real Time A doesn’t somehow hog all the resources, bringing Real Times B and so forth to a grinding halt, even if it’s an accident and not malicious. Thus, we need middleware, and probably several perhaps-related-at-some-point middleware stacks. NFV hosting, if it evolves to become relevant, needs a specific middleware kit. Edge computing needs a middleware kit at the general level, with perhaps specialized stack top-offs for classes of applications. Some of this middleware will have to influence connectivity. Unity, stroke two.
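To make that resource-isolation point concrete, here’s a minimal sketch in Python, assuming a simple admission check at an edge site; the names (TenantQuota, EdgeSiteAdmission) are mine, not any real middleware API.

```python
# A toy admission check: each tenant at an edge site gets a hard CPU quota, so one
# real-time workload can't starve the others, whether by accident or by malice.
from dataclasses import dataclass
from typing import Dict

@dataclass
class TenantQuota:
    cpu_limit_millicores: int      # ceiling for this tenant at this site
    cpu_used_millicores: int = 0   # running total of admitted requests

class EdgeSiteAdmission:
    def __init__(self, quotas: Dict[str, TenantQuota]):
        self.quotas = quotas

    def admit(self, tenant: str, requested_millicores: int) -> bool:
        """Admit a workload only if the tenant stays inside its quota."""
        quota = self.quotas.get(tenant)
        if quota is None:
            return False   # unknown tenant: reject outright
        if quota.cpu_used_millicores + requested_millicores > quota.cpu_limit_millicores:
            return False   # would crowd out the other tenants
        quota.cpu_used_millicores += requested_millicores
        return True

site = EdgeSiteAdmission({
    "real-time-a": TenantQuota(cpu_limit_millicores=4000),
    "real-time-b": TenantQuota(cpu_limit_millicores=4000),
})
print(site.admit("real-time-a", 3500))  # True
print(site.admit("real-time-a", 1000))  # False: "Real Time A" can't hog the site
print(site.admit("real-time-b", 2000))  # True: "Real Time B" is still protected
```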

What’s important here is that none of our network vendor contenders has, as far as I know, any activities relating to the creation of this middleware, which would create both a NaaS and a PaaS element for the metro and its interconnected partner feature components. The question is then where this stuff would come from, and there are three possibilities I can see.

First, some vendor in or out of the networking space could step forward with an architecture and a software prototype, and do the job. This would move things the fastest, because a single hand on the reins moves the team best.

Second, some similarly constituted vendor could step forward with an architecture, and launch an open-source project to realize it. Juniper did something like this with IPsphere decades ago. This could also move things along quickly, but the risk is that open participation would lead to the inclusion of players who really wanted to interfere with the project rather than advance it. Cisco was that player in IPsphere; maybe here too?

Third, somebody could launch a standards or standards-like initiative like Nephio or the NFV ISG. This would be the worst possible outcome because it would move at a glacial pace, satisfy nobody because it was trying to satisfy everybody, and stall any other initiatives with the same goal because nobody wants to step on a standards process.

If none of these options look likely, or at least look like they might be effective, what about non-network vendors who might jump in? The most obvious source of these new competitors would be the public cloud providers, and perhaps Google in particular given their connection to both Kubernetes and Nephio. VMware and Red Hat could also play from the software platform side. I’d rate the chances of a player in this non-network group stepping up as being slightly better than that of a network vendor stepping out of their usual comfort zone.

There is, of course, a final implied possibility, which is that nobody does anything. Over the long haul, I think that would lead to a bunch of silo implementations that would fight it out and divert developer and other resources until all but one or two of them were killed off. That would take five or six years, and would delay full realization of the potential of metro, edge computing, and new and revenue-positive telco services. I hope somebody steps up, and I think that they will. I suspect that within a year, we’ll either see a potential answer to the problem of NaaS/PaaS unity at a higher level, or we’ll know we’re in for a half-a-decade slog.

Taking Mobility to the Next Level

On March 16th, I blogged about the question of what a cloud-native telco might look like, and obviously this had to address the evolution of things like NFV and 5G. The question I raised at the end of that blog was whether we had perhaps carried evolution too far in mobile networking and NFV, whether starting again from the top might lead us to a different place. Let’s look a bit harder at that one today.

A big piece of how mobile networks manage their critical property of mobility traces back quite a way. What we have today, in 4G and 5G in some form, is a strategy that connects per-device “tunnels” from the gateway point to the packet network (the Internet in most cases) to the cell site where the device is found. Packets are processed (in 5G, by the UPF) through a set of classification rules (Packet Detection Rules or PDRs in 5G), and the rules match the packet header (the IP address, tunnel headers, etc.) and apply handling policies. Every device has to have at least two rule sets, one for each direction of traffic. The implementation of these rules, handling policies, and routing based on the combination is normally associated with elements of the “user plane” of a mobile network.
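To make the classification step concrete, here’s a minimal Python sketch of PDR-style matching; the field names and rule structure are illustrative only, not the 3GPP encoding.

```python
# Illustrative PDR-style classification: match a packet's header fields against
# per-device rules (one per direction) and return the handling policy to apply.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PacketDetectionRule:
    device_ip: str               # the subscriber's IP address
    direction: str               # "uplink" or "downlink"
    tunnel_id: Optional[int]     # tunnel header to match, if any
    policy: str                  # a QoS/forwarding profile to apply

@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    tunnel_id: Optional[int] = None

def classify(packet: Packet, rules: List[PacketDetectionRule]) -> Optional[str]:
    """Return the handling policy of the first rule the packet matches."""
    for rule in rules:
        device_side = packet.src_ip if rule.direction == "uplink" else packet.dst_ip
        if device_side == rule.device_ip and packet.tunnel_id == rule.tunnel_id:
            return rule.policy
    return None   # no match: drop or apply a default, per operator policy

# Every device needs at least two rules, one per traffic direction.
rules = [
    PacketDetectionRule("10.0.0.7", "uplink", tunnel_id=101, policy="default-bearer"),
    PacketDetectionRule("10.0.0.7", "downlink", tunnel_id=None, policy="default-bearer"),
]
print(classify(Packet(src_ip="10.0.0.7", dst_ip="203.0.113.9", tunnel_id=101), rules))
```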

You could justify having per-device PDRs, but how about per-device tunnels? A packet for any user, emerging from a gateway UPF representing the Internet, could “appear” in the right cell even if it shared the tunnel to that cell with hundreds of other packets from other users. Thus, we could cut down on the number of tunnels by having one tunnel per destination cell. Not only that, if we had detailed control over forwarding rules, we could simply forward the packet to its proper cell based on its own IP address. We could get more forwarding control, too, by thinking a bit about how the user plane of a mobile network would work.
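Here’s a small sketch of the consolidation being described: the gateway keeps a device-to-cell map and one shared tunnel per cell, forwarding each downlink packet by its own destination IP. The data and names are hypothetical.

```python
# One shared tunnel per destination cell; packets are steered by their own IP address.
device_to_cell = {          # learned from mobility management (hypothetical entries)
    "10.0.0.7": "cell-231",
    "10.0.0.8": "cell-231",
    "10.0.0.9": "cell-418",
}
cell_to_tunnel = {          # a single tunnel per cell, shared by every device parked there
    "cell-231": "tunnel-to-cell-231",
    "cell-418": "tunnel-to-cell-418",
}

def forward_downlink(dst_ip: str) -> str:
    """Pick the shared tunnel for the cell currently serving this destination IP."""
    return cell_to_tunnel[device_to_cell[dst_ip]]

# Hundreds of devices in cell-231 would all ride the same tunnel.
print(forward_downlink("10.0.0.7"))  # tunnel-to-cell-231
print(forward_downlink("10.0.0.8"))  # tunnel-to-cell-231
```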

In the 3GPP specifications, user-plane and control-plane elements are represented as boxes with interfaces, and in the past they’ve been implemented as real appliances. In an effort to open things up, recent standards evolution has described these elements as “virtual network functions” (VNFs) and presumed a hosted software instance implemented them. The suggestion about “microservices” or “cloud-native” network traffic handling likely comes out of this point.

Following the standards leads to the creation of mobile infrastructure that’s almost an adjunct to the IP network rather than a part of it. This mobile network touches the IP network in multiple places, and its goal is to allow a user roaming around through cell sites to be connected to the Internet through all this motion, retaining any sessions that were active. We have a completely separate set of “routers” with different routing policies to make this work.

Eliminating mobile-specific appliances in favor of hosted VNFs introduces cloud network overhead to the handling of mobile traffic, which seems at odds with the 5G goal of lowering latency and supporting edge computing for real-time applications. My suggestion, described in two earlier blogs (the most recent on March 16th), was to make the control-plane implementation cloud-native and to base the user plane on a “router with a sidecar” that adds the capability of implementing the PDR-based handling 5G requires. I noted that something SDN-ish could work as the means of connecting the sidecar element to the router.

There has been work done on supporting mobility management via “standard” devices, meaning either routers with some forwarding agility or white-box switches. SDN, as just noted, would offer a means of customizing forwarding behavior to include tunnel routing, and the SDN controller could act as an intermediary PDR controller. There has also been work done on using the data-plane programming language P4 to define mobile PDR handling. That would enable any device that supported P4 to be a PDR handler, though there would still be a need for “sidecar” control to implement the interface between the device and the 5G control plane.
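As a rough sketch of the SDN-controller-as-PDR-intermediary idea, here’s how a downlink PDR might be translated into a generic match/action flow rule; the rule format and the push step are hypothetical, not any real controller’s API.

```python
# Illustrative translation of a PDR-style request into a generic match/action flow
# rule, the kind of job an SDN controller acting as a PDR intermediary would do.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class FlowRule:
    match: Dict[str, str]   # header fields to match
    actions: List[str]      # ordered actions to apply
    priority: int

def pdr_to_flow_rule(device_ip: str, cell_tunnel: str, qos_queue: int) -> FlowRule:
    """Map a downlink PDR (device IP -> serving cell) onto a forwarding rule."""
    return FlowRule(
        match={"ipv4_dst": device_ip},
        actions=[f"set_queue:{qos_queue}", f"push_tunnel:{cell_tunnel}", "output:cell_port"],
        priority=100,
    )

def push_to_device(rule: FlowRule) -> None:
    # Stand-in for the southbound push (OpenFlow, P4Runtime, or similar).
    print(f"install: match={rule.match} actions={rule.actions} priority={rule.priority}")

push_to_device(pdr_to_flow_rule("10.0.0.7", "tunnel-to-cell-231", qos_queue=2))
```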

If VNF-based handling of mobile traffic isn’t the optimum approach, it’s not the only thing that mobile standards lead to that could be sub-optimal. The problem with block diagrams in standards is that they are often difficult to connect with two real-world elements—the location where they might be deployed, and the patterns of traffic within the network overall. In my view, both these real-world elements converge into what I’ve been calling metro.

Metro to me is both a place and a traffic pattern. Network service features are most easily introduced at a point close enough to the edge to permit real-time handling and personalization, but deep enough to serve the number of users needed to create reasonable economy of scale. I submit that this place is the metro center or metro. As it happens, the notion of a metro center is already established because of the importance of content caching. The major CDN providers connect to broadband access networks at specific places, and these are probably the best early candidates for a metro site. That’s because video traffic is the great majority of Internet traffic, even mobile broadband traffic, and so there’s already great connectivity to these points, and a natural concentration of traffic. The majority of content cache sites are also places where real estate to house edge computing is available.

Mobile networks often implement a kind of implicit metro-centricity because it’s at least possible to differentiate between mobile users that remain within a metro area versus those that are roaming more broadly, with the former having direct IP addressing and the latter tunneling. Since most mobile users do tend to stay in a metro area, and since most video consumption on mobile devices involves those generally-in-metro users, traffic efficiency is often high because video traffic doesn’t need to be tunneled.

It might be possible, and valuable, to make metro-centricity a design mandate. If we were to define a kind of “movement zone” around each metro center, with the goal of containing a large proportion of the users there within that zone as they travel locally, we would improve handling and latency. We’d probably cover about three-quarters of the population that way, and over half the geographic area of the US, for example. For the rest of the population and geography we could expect users to be moving in and out of metros, and more tunneling would likely be needed.

Presuming population mobility was a challenge that had to be addressed in sustaining session relationships, we’d end up needing more tunneling because it would be very difficult to reconnect everything when a user moved out of a metro area and required a different mobile gateway to the Internet, or to cached content.

Edge computing and real-time digital twinning via 5G would exacerbate some of these issues, because interruption of message flows in real-time systems can have a catastrophic impact, and attempts to remedy that at the message-protocol level (requiring acknowledgments, buffering and reordering, etc.) would increase latency overall. If we were to assume that there were “social-metaverse” and other “cooperative” applications of edge computing that required broad edge-to-edge coordination, the need to optimize just what edge facility a given mobile user is connected with would increase. It’s also true that real-time edge applications would be likely based on microservices, and that could give impetus to attempts to create microservice-based mobile service elements overall.

I think that mobile standards, like NFV ISG work, have suffered from box fixation, making it difficult to assess how to implement them in a “cloud-native” way. I think that network standards in general have to balance between the optimized use of new technologies and the ability to evolve from existing infrastructure without fork-lifting everything at great cost. The problem is that this combination creates a barrier to motion and a limit to innovation at the same time. At some point, we need to think harder about going forward with network standards, and I think that starts by looking at features and not boxes.

A Conversation with Nokia on Their Enterprise Revenue Goals: It Might Work

When Nokia announced their “rebranding”, I blogged on it to cover their main points and to note that there’d better be something to it beyond a new logo. Shaun McCarthy, Nokia’s President of North America Sales, offered on LinkedIn to give me a briefing, and late last week we had our talk. It was, as you’ll see, insightful and useful.

The most important point Nokia made in their just-before-MWC announcement was their goal of expanding their enterprise business. While McCarthy reinforced Nokia’s commitment to the service provider space, he noted that the CSP market had a 1% CAGR, which isn’t the sort of thing you’d like from your primary market segment if you’re a vendor. In contrast, the enterprise sector has roughly a 9% CAGR, so expanding Nokia’s business there (currently 8% of their revenues) to their goal of 30% makes a lot of sense.

Finding high-growth market segments is good, but only if you can address them. I noted in that early blog that there was plenty of competition in the enterprise space, and that Nokia was going to have to fight to gain a respectable position. I also said that their private wireless push had only limited appeal among enterprises. My latest model run, in fact, said that the total private wireless and “hybrid” wireless (private combined with a public offering, which in 5G might mean a network slice) would amount to only about 8% of enterprise network spending. That’s not a great number, and I indicated that, to get to a 30% enterprise revenues level, they’d have to work out a strategy to increase enterprise revenues significantly over what private wireless can offer.

What I learned on my call is that they already have a strategy, and one that was not only working but had some pretty significant potential for success. We could call it “facilities automation”.

The overwhelming majority of IT spending today is directed at what could be called “office automation”. Workers who sit at computers and access applications in the cloud and data center make up this opportunity, and while this is the market that’s driven enterprise network spending too, it’s also a market that’s slipped largely into sustaining mode. That 30% expansion Nokia wants in their enterprise business would be a challenge in a market where incumbents were strong and growth was limited.

Neither of those are true in the facilities automation space. “Facilities” automation means the automation not of office work but of industrial/manufacturing and other core business processes, the stuff that creates and moves products and coordinates services. McCarthy calls this targeting “non-carpeted” areas. How much opportunity is there? According to my data, a lot. In fact, most of the opportunities that seem ripe for IT investment have more dirt on their floors than carpeting.

The most “empowerable” industries are those with the highest unit value of labor. In the entire spectrum of industries tracked by the government, there are only seven who have median unit values of labor in the “highly empowerable” category, and all but two are in those non-carpeted areas. The industry most represented is the energy sector, then various manufacturing sectors. As it happens, Nokia’s enterprise successes focus on these sectors already, and McCarthy mentioned the energy sector in our talk. And, also as it happens, these are the sectors most interested in private 5G.

The challenge in the facilities automation space is that it depends heavily on domain expertise. Office automation benefits from the fact that, well, an office is an office. Running a business is accounting, orders, shipments, and all sorts of stuff that doesn’t vary much across the verticals spectrum. Automating a factory or warehouse depends on knowing something about the way work is done, and influencing it as it’s being done rather than playing around with records of the money and order flows. Nokia has to depend more on partners, what could broadly be called “value-added resellers” or consulting firms that focus on the spaces being targeted. Their relationship with the Kyndryl spinoff from IBM is a good example of how to get domain expertise, and the partner strategy has worked for Nokia so far and will likely continue to create opportunities to fully realize private wireless potential.

Nokia wouldn’t sneeze at getting even 8% of enterprise network spending, according to McCarthy, but there may be more than that on the table if Nokia can work its way out of the private-5G space and into broader facilities automation, and even into the service provider space. That may be possible because of the connection between private 5G, facilities automation, and edge computing.

Right now, over 95% of edge computing is done on-premises using customer-owned equipment. Private 5G is most useful as a better form of WiFi in manufacturing and related applications, and one reason for that is latency control and connection security for real-time process management running on these customer-owned edge devices. Having a foot in the private wireless door could thus open the larger door of IoT, edge computing, and real-time communications. It’s this secondary push that could represent the pathway to Nokia’s goal of having 30% of their revenue come from the enterprise. And it would be possible without having to face entrenched competitors in the office automation piece of the market.

Best of all, there’s another dimension to facilities automation, one that could be really significant. You can empower office workers in a bunch of ways, and that’s why we see a broad IT and network initiative aimed at them. However, office workers make up only about 30% of the workforce. The problem in getting to the 70% is that you can’t empower them with information, you have to empower them with automation. You have to step into the work itself and influence how it’s done in real time.

I continue to believe that the key to doing this is the digital twin concept. You can’t influence work without moving into real-time systems, and you can’t influence real-time systems without a digital twinning of the system to organize sensor data and to assess and take actions. McCarthy mentioned the “industrial metaverse” in our call, so they may be seeing this potential already, and seeing a role they could play there to boost enterprise sales.

While Nokia’s strategy for facilities automation is dependent on domain specialists at the sales execution and application level, all of the knowledge of the work has to be reflected eventually in implementations based on servers, platform software, and networks. There are major differences between how a farm and a nuclear power plant work, but if you dig downward toward and into the IT piece, you find middleware that creates the underpinning stuff like digital twinning and a metaverse concept based on it, one inclusive of the industrial metaverse but aimed at the broad facilities automation space. Nokia can’t develop custom software for every vertical, but they could develop middleware for a digital-twin metaverse that would support that custom vertical software.
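To illustrate what “generic digital-twin middleware with vertical logic plugged in on top” might look like in the abstract, here’s a minimal Python sketch; the class and the pump-station rule are purely my own illustration, not anything Nokia has described.

```python
# A generic twin holds real-time state from sensors and delegates the "what should
# happen now" decision to vertical-specific logic supplied by a partner or ISV.
from typing import Callable, Dict, List

class DigitalTwin:
    def __init__(self, asset_id: str, policy: Callable[[Dict[str, float]], List[str]]):
        self.asset_id = asset_id
        self.state: Dict[str, float] = {}   # latest view of the real-world asset
        self.policy = policy                # the vertical-specific rules

    def ingest(self, sensor: str, value: float) -> List[str]:
        """Update the twin from a sensor reading and return any actions to take."""
        self.state[sensor] = value
        return self.policy(self.state)

# A hypothetical energy-sector plug-in: relieve pressure before it becomes a problem.
def pump_station_policy(state: Dict[str, float]) -> List[str]:
    if state.get("pressure_bar", 0.0) > 8.0:
        return ["open_relief_valve", "notify_operator"]
    return []

twin = DigitalTwin("pump-station-12", pump_station_policy)
print(twin.ingest("pressure_bar", 8.4))   # ['open_relief_valve', 'notify_operator']
```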

And they could do it with open technology. When enterprises talk with me about Nokia (which is less often than they talk about Cisco or Juniper), the technical differentiator they see most often is “openness”, which surely comes from Nokia’s open-RAN 5G position; that position has gotten them a lot of ink and is often cited as the reason they’re doing better in the service provider space than their competitors. Could Nokia create an open-model digital-twin process? Sure. Would they, given that vendors typically try to lock customers up, not open them up? That’s a question we’ll have to wait to answer.

Another of those “wait-for-it” questions is in the positioning and marketing. If you are a company that necessarily focuses its revenue-generating initiatives through partners, it’s easy to lose your identity. Truth be told, players like Kyndryl aren’t going to work with Nokia alone. Suppose a Nokia partner decides to build something like digital-twin middleware? Facilities automation is a stack of function-to-IT mappings, and you can standardize that stack by starting at the top or at the bottom. Sure, open middleware for digital twinning is a risk, but is it a greater risk than just being made invisible by higher-level players? It might present a benefit, something you could market to become visible and to set up the belief that the evolution you offer to facilities automation is critically valuable. It might also be something that service providers could use to create 5G edge applications, and that would then be a win for Nokia on the service provider side as well.

All this adds up to a real opportunity for Nokia, and even a value for private 5G, as long as you don’t expect it to carry all the revenue-gain water by itself. Exploiting the full opportunity could mean that Nokia’s “rebranding” is really a whole new initiative that could realize their enterprise goals and boost their service provider position too…providing they can get the word out.

I think Nokia’s position in facilities automation is strong, and that it will boost their enterprise sales and justify their new “branding” or strategic initiative. I think there’s a chance that it could be great, and give Nokia a lead position with edge computing and digital twinning for facility automation. Part of that will depend on whether Nokia evolves the capability, but perhaps a bigger part will depend on whether they can get aggressive with their positioning/marketing. Facilities automation needs buy-in from a lot of stakeholders, particularly ones in line organizations that Nokia couldn’t expect to call on even if they had a sales force big enough. Only marketing can reach everyone, and get them onboard, and that’s essential if the full scope of this opportunity is to be realized.

In Search of a Model for the Cloud-Native Telco

They say that beneath every fable, lie, and marketing message there lies a kernel of truth. Is that statement a self-proof or does it give too much credit to falsehood? I can’t assess that, and probably don’t need to, but I do think I could try to apply it to a question I got on a blog I referenced on LinkedIn. Paraphrasing, the question was whether there was a role that cloud-hosted, cloud-native, microservice-based technology could play in telecom. Well, is there? Not surprisingly, the answer is “Yes, but….”

We have to start with an attempt to categorize what “cloud hosting” and “cloud-native” mean, and in particular how they differ. There are really three models of cloud hosting: IaaS virtual machines, containers, and functions/microservices. I’ve presented them in order of their “closeness” to the real hardware, so it would be fair to say that the progression I offered represents increasing overhead that could be expected to be added to message processing.

We also have to ask what “a role” would mean. It doesn’t make a lot of sense to stick a hosted router instance somewhere in a network and call it a transformation. What operators generally want is a broadly useful strategy, one that can replace enough specialized (vendor-specific, proprietary, expensive) devices with cheap hosted instances to make a difference in capex overall. That puts a lot of pressure on the hardware, hardware that’s designed to host applications and not push bits.

Whatever the model, nearly all “cloud hosting” is based on general-purpose CPUs (x86, ARM), and we wouldn’t have switching chips from players like Broadcom as the basis for white-box network devices if general-purpose CPUs were up to the job. It is possible to use general-purpose servers, in or out of the cloud, to host router instances, but operators aren’t all that excited about the idea.

About a decade ago, I had a long meeting with a Tier One operator about reducing capex. Their hope was that router software (remember Vyatta?) running on servers could replace routers, and they bought licenses to test that idea out. What they found was that it was possible to host what were essentially edge router instances on servers, but anything that had to carry transit traffic (metro or core) needed special hardware.

It wasn’t long after that when the NFV “Call for Action” was published and the NFV ISG was launched. From the first, the emphasis was on “appliances” more than on network routers and switches, and many of the early PoCs focused on virtualizing and hosting what were normally CPE functions, like firewalls. This dodged a lot of the performance problem, but even those PoCs ended up turning to a non-cloud hosting model, that of “universal CPE” or uCPE. NFV’s mission there was simply to load software onto an edge device, which frankly made all the standards work overkill. Would this have happened if virtualizing CPE, which was well within server limits, was really transformational? I don’t think so.

Where does this leave the cloud router? Answer: If by cloud router you mean “hosted in the cloud” router, nowhere. There is only one viable “cloud router” in all the world, and it’s DriveNets’ Network Cloud cluster of white boxes, which don’t rely on general-purpose servers in any form. Public-cloud routing is not cost- or performance-effective. Neither, in most cases, are any server-hosted routers. The only operators who haven’t rejected the hosted-in-the-cloud approach are those who haven’t tested it seriously.

So what does this mean for all the talk about “cloud-native”? I blogged about an article that, referencing an analyst report, predicted operators would move nearly half their network traffic to a “cloud-native” hosted framework. I said that there were zero operators telling me anything like that, but I didn’t go too much into the “why” of it. The answer is that issue of cost/performance effectiveness.

But there’s a deeper question, too, one I referenced HERE. The 3GPP 5G work, and the successor expansion of O-RAN, included the notion of hosting VNFs that handled both the “user plane” and the “control plane” of a 5G network. The standards separate those two planes, but the way the features/functions are divided makes the “user plane” of 5G different from a pure IP transport network, the “data plane”. I speculated that the best way to approach the requirements of the UPF might be to think of them as a functional extension to traditional routing. But what about the control plane? That’s the deeper question.

The control plane of a mobile network is largely aimed at managing registration of devices and managing mobility of those devices. Like an IP network’s control plane (yeah, 3GPP reused a term here and that can create some confusion), the control plane of 5G doesn’t carry traffic but rather carries signaling, and signaling is a whole different world in terms of cloud and cloud-native.

5G signaling is made up of internal messages to support the management of mobile networks. A 5G user sitting in a cafe in NYC watching a 4K video could be pushing a lot of bits but generating essentially zero signaling activity, because they’re in the same cell through the entire experience, and once the path to that user in that cell is established, there’s nothing much going on to require a lot of signal exchanges, at least not exchanges that impact 5G service elements (RAN-level messages might be exchanged). No registration, no mobility management. Thus, signaling message traffic is surely way lower than user data traffic, and that means it’s probably well within levels that cloud elements could handle.

In theory, if 5G signaling is a good application for cloud hosting, we could expect to use any of the three hosting models I cited. However, the way that the 5G standards are written creates “functional boxes” that have “functional interfaces” just as real devices would. That seems to favor the use of virtual devices, which in turn would favor hosting in either VM or container form. You could easily write software to play the role of a 5G signaling element and stick it in a VM or container in the cloud (or, of course, in a data center or other resource pool).

What about “cloud-native”? We can now turn to defining it, and the most-accepted (though not universally accepted) definition is that “cloud-native” means “built on microservices”, and “microservices” are stateless nubbins of functionality. It also means, or should mean, the more general “designed to optimally realize cloud benefits”. The question, IMHO, is whether it would be possible to meet both definitions with a 5G signaling application. The answer is “Not if you strictly conform to the 3GPP/O-RAN model”.

This is the same problem that the NFV ISG created for itself a decade ago, with their release of a functional specification. Defining functions as virtual devices steers implementations relentlessly toward that model, and that model is not cloud-native. I did a presentation for the NFV ISG that described a cloud-native implementation, and what it showed was a pool of primitive functions that could be individually invoked, not a collection of virtual devices. The assignment of functions to virtual devices converts cloud-native into a kind of monolith.

In the cloud-native model, the signal messages would be routed to the primitive function (the microservice) designed to handle them. Since microservices are stateless, the presumption would be that (for example) a mobile device would have a “record” associated with it, wherein we stored its state relative to the mobile service. That state record would be accessed by a “master” signal-message microservice to determine where the message would be steered, so we could say that it would contain a state/event table. There might be any number of instances of any or all of the signal-message microservices, and steering to the right one would likely be done through a service mesh. It’s also possible that signal messages would carry state, and thus would be steered only by the service mesh.
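A minimal sketch of that pattern, with purely illustrative names and transitions: a stateless handler fetches the device’s state record, consults a state/event table to pick the target microservice, and writes the new state back.

```python
# Stateless signaling dispatch: per-device state lives in an external store, and a
# state/event table decides which microservice gets each message and what state follows.
STATE_EVENT_TABLE = {
    ("idle",       "registration_request"): ("registration_svc", "registered"),
    ("registered", "handover_request"):     ("mobility_svc",     "registered"),
    ("registered", "detach_request"):       ("registration_svc", "idle"),
}

state_store = {"device-001": "idle"}   # stand-in for a shared key/value store

def handle_signal(device_id: str, event: str) -> str:
    """Steer a signaling message based on stored state, then update that state."""
    current = state_store.get(device_id, "idle")
    entry = STATE_EVENT_TABLE.get((current, event))
    if entry is None:
        return "reject"                    # no transition defined for this state/event
    target_service, next_state = entry
    state_store[device_id] = next_state    # the handler itself keeps no state
    return target_service                  # in practice a service mesh routes the call

print(handle_signal("device-001", "registration_request"))  # registration_svc
print(handle_signal("device-001", "handover_request"))      # mobility_svc
```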

The next obvious question would be how this would tie to the “user plane” where there was a signaling-to-user-plane interplay, like for the UPFs in 5G. This is where you’d need a mechanism for a signal microservice to send a message to a “router sidecar” function that could induce data plane behavior. For example (and only as an example), we could assume that the “router” was an SDN switch and the “router sidecar” that was messaged from the signaling plane microservice was the SDN controller.
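Continuing that example (again, the message format is purely illustrative), the control-plane microservice would hand the user-plane change to a “router sidecar”, which translates it into whatever the underlying data plane understands; an SDN controller call is one option.

```python
# Illustrative signaling-to-user-plane hook: a control-plane microservice asks a
# "router sidecar" to re-point a device's downlink path after a handover.
def build_sidecar_message(device_ip: str, new_cell: str) -> dict:
    """The kind of small message a mobility microservice might emit."""
    return {"op": "update_downlink", "device_ip": device_ip, "cell": new_cell}

def sidecar_apply(message: dict) -> None:
    # The sidecar would translate this into an SDN controller call, a P4 table
    # update, or a vendor API; here we only show the shape of the exchange.
    print(f"data plane update: {message['device_ip']} now served via {message['cell']}")

sidecar_apply(build_sidecar_message("10.0.0.7", "cell-418"))
```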

My point here is that so far, all the work that telco standards groups have done has pushed features into virtual boxes, thus moving them away from being “cloud-native”. If cloud-native is the preferred model for the cloud, that’s a very bad thing. But even for non-communications applications, cloud-native isn’t always a good idea because all the inter-microservice messages add latency and likely add hosting costs as well. It is very possible that virtual devices would be cheaper to deploy in either public cloud or carrier cloud resource pools, and would function better as well, with lower latency. Frankly, I think that’s very likely to be true for 5G control plane features. That, in turn, would mean that we shouldn’t be talking about “cloud-native” as a presumptive goal for everything involved in telecom-in-the-cloud.

A “cloud” is a resource pool, and while a public cloud today is based on x86 or ARM because that’s where the demand is, there’s no reason why we couldn’t build clouds from any resource, including white boxes. One of the interesting points/questions is whether we could build a hybrid white-box cloud by linking an open data plane to a cloud-native control plane via a user plane that could induce UPF behavior from white-box devices. Another is whether the way to harmonize a virtual device and cloud-native is to say that virtual devices are made up of cloud-native elements that are tightly coupled and so can function as a single element. Maybe we need to think a bit more about what “carrier cloud” really means, if we want to get “cloud-native” right there.

The biggest question of all, IMHO, is whether we should be thinking about re-thinking. Is the old 4G LTE Evolved Packet Core tunnel mechanism even the optimum “user plane” model? We’ve evolved both technology and missions for mobile networks. Maybe, with some thought, we could do better, and that’s a topic I’ll look at in a later blog.

In Search of the Rational 6G

Given the fact that most experts recognize that 5G was over-hyped and is very unlikely to measure up to expectations for either its providers or its users, it’s no surprise that there’s interest in pushing its successor. At MWC there was a session on 6G, and this article described some of the views expressed. It’s worthwhile to take a look at the 6G situation, if for no other reason than to lay out what we really need to be doing if we’re to create a useful next-gen mobile wireless technology.

The most insightful thing in the piece is the comment that “The projections are that the likes of you and I will only get 6G into our hot little hands from around 2030 onwards.” A Wikipedia article on 6G notes that “as of January 2023, there is no universally-accepted government or non-government standard for what qualifies as 6G technology.” Given that there was industry activity (the Alliance for Telecommunications Industry Solutions or ATIS) back in 2020, and that there are almost a quarter-billion hits on “6G” in a Google search, it seems that, as we did with 5G, we’re putting the cart before the horse with 6G.

The article mentions some of the early-hype expectations: “100 times the capacity of 5G, with sub-millisecond latencies.” The panel at MWC also talked about energy efficiency, more deployment scenarios including IoT and edge computing, better security, convergence of satellite and terrestrial networks, resilience, applications of AR/VR, better reception of cell signals in areas like tunnels and elevators, manufacturing robots, 3D mapping, metaverse, and so forth. If you combine these expectations with the reality points in the paragraph before, you can already see some interesting disconnects.

The most obvious problem is the classic “6G is a UFO” problem. It’s not landing in your yard for inspection, so you can assign it any properties you like. If we don’t have any agreed-on technology for 6G at this point, how do we know it will be a hundred times as fast as 5G? Can 6G radio waves travel faster than 5G or 4G? Obviously not, so how would 6G alone create sub-millisecond latency?

The next thing that should, at least, be obvious is that a lot of what we’re saying will come along with 6G is stuff we also promised with 5G and clearly have not yet delivered. Why? Because a lot of things we promise for wireless standards are things that wireless standards can limit, but cannot create by themselves. In other words, what many wireless standards (including 5G and the rational proposals for 6G) are doing is advancing communications so that it doesn’t limit applications that are being contemplated, though not yet available. Will all the other barriers to those applications fall? We don’t know.

Those who’ve read my blogs know I’ve been pointing out that in order for many of the things we think 5G would enable to actually happen, we’d need an entire tech ecosystem to build up all the pieces. Edge computing, for example, doesn’t depend on 5G, it depends first and foremost on applications that drive enough value for the user and profit for the provider to create a business. That depends on broadening “provider” to mean not only the service provider, but the providers of critical facilitating technology to the service provider.

So what we’re saying here is that 6G is really aimed at advancing wireless communications so that it wouldn’t limit the growth of these new applications, not so that these new applications will suddenly burst on the scene. In order for the latter thing to happen, we’d have to see the entire application ecosystems emerge. The truth is that the real war for a rational next-gen wireless standard won’t be fought at the network level at all, it will be fought at the application software level. Maybe even more fundamentally, at the application targeting level, because what’s needed is some initiatives to determine where the application of things like low-latency computing could provide monetizable value.

Does this mean 5G and 6G are then so suppositional as to be essentially useless? No, surely not for 5G, and very likely not for 6G either. Why? Because there were, and will be, some learned and justified wireless service requirements mixed in with all the suppositional stuff. 5G hype said that it would increase user bandwidth by roughly ten to twenty times versus 4G LTE, and that it would increase the total number of connected devices supported by ten times. The former hasn’t proved to be true, and in any case there’s little you can do with a mobile device to exploit the higher theoretical speed. The latter is a significant benefit in controlling network costs, and thus service prices. We can expect that there will be 6G features that enhance the network’s economics and help build new business cases, but also things that will happen as a sort of insurance policy to protect the ability to support stuff we’ve not yet really thought of.

There’s a limit to be considered, though, even with respect to something like data capacity. Generally speaking, the amount of information (data) a wireless signal can carry is proportional to the channel bandwidth available, and usable bandwidth grows as you move up in carrier frequency. To carry more, you have to use higher frequencies, which is why millimeter-wave signals can carry more data. The problem is that as frequencies go up, the radio network starts working more like radar, bouncing off things instead of going through them. That means that obstructions block the signals, and if you go high enough in frequency, even trees are a barrier. Some of the 6G hype talks about terabit bandwidth, which could be achieved only by raising frequencies to the point where they’d fail to penetrate almost anything, making them useless for practical cellular networks.
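The underlying relationship is the familiar channel-capacity bound (a simplification, not a link budget): capacity grows with the bandwidth of the channel, and the very wide channels behind terabit claims are only available at very high carrier frequencies.

```latex
C = B \log_2\!\left(1 + \frac{S}{N}\right)
\qquad \text{where } C \text{ is capacity in bits/s, } B \text{ is channel bandwidth in Hz, and } S/N \text{ is the signal-to-noise ratio.}
```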

Why then are we having so many 6G discussions? Because “news” means “novelty”, not “truth.” Every tech publication, including this blog, has to consider the regular and awful question “What do we write about now?” There’s just so much that can be said about a given topic before it’s not “novel” and thus not “news”. More significantly, before people stop reading the stories, stop getting served the ads, and stop indirectly paying the sources. Not only that, there’s probably a thousand readers who might be interested in a claim of greater capacity, and almost none would want to read a story on mobile modulation.

The biggest problem with all of this is that it obscures the real, and useful, mission of keeping network and cellular standards ahead of the applications that would be connected. Absent some attention to the network issues, the network could be a barrier to investment in the very ecosystemic elements it depends on for long-term growth. Maybe we really do need 6G, a rational 6G. Whether we’ll get that may mean waiting until 2030 to see what emerges, I guess.

Banks, Bubbles, and Busts

I doubt that anyone but Wall Street short-sellers was happy about the problems with Silicon Valley Bank and Signature Bank. The broad problem for each of these banks has been discussed, but what about the linkage? Is there a reason why both banks, with different exposure to different risks, failed in a period of less than a week? What does this tell us, and what must we learn, to protect not only tech and startups but economic health overall?

SVB’s problem was that unlike most banks, it didn’t really do a lot of short-term lending. Tech companies stockpiled their cash there, cash from VC funding rounds, IPOs, and operations, and the bank held the cash in bonds rather than lending it out. When the Fed started raising interest rates, the price of bonds dropped because the bonds’ own interest payments were less attractive given the new rates. That meant that the bank’s reserves fell. Had they loaned out more of the money, the higher interest rates set by the Fed could have given them more money, not less. In addition, the tech sector and startups and VCs that largely create it were pressured by the same rise in rates, borrowing less and pulling money from their accounts. When the bank had to sell some assets at a loss, the run started. The perfect storm.
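As a purely illustrative calculation (not SVB’s actual book), here’s the mechanics in Python: a ten-year bond bought at par with a 2% coupon is worth only about 84 cents on the dollar once the market rate for comparable bonds moves to 4%.

```python
# Why rising rates mark down a low-coupon bond: its price is just the present value
# of its fixed payments, discounted at whatever the market now pays.
def bond_price(face: float, coupon_rate: float, market_rate: float, years: int) -> float:
    coupons = sum(face * coupon_rate / (1 + market_rate) ** t for t in range(1, years + 1))
    principal = face / (1 + market_rate) ** years
    return coupons + principal

print(round(bond_price(100, 0.02, 0.02, 10), 2))  # 100.0: priced at par when rates match
print(round(bond_price(100, 0.02, 0.04, 10), 2))  # ~83.78: the same bond after rates rise
```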

Signature Bank’s problem was crypto. Almost a quarter of the bank’s deposits came from the crypto sector, which even at the end of 2022 was showing some warning signs. As a result, the bank announced it would shrink the share of deposits represented by the crypto space in 2023. However, that had apparently not gone far enough when fears of bank problems were raised by the SVB problem’s emergence. The SVB weekend closure meant a stablecoin lost access to what backed it, and suddenly there was no stability in crypto anywhere.

In both cases, the banks suffered a confidence crisis among both depositors and investors. This in turn caused regulators to step in. Bank failures aren’t unusual; there are usually a dozen or so in a given year, but it was hoped that regulations put in place after the 2008 crisis would prevent them in the future. That has worked so far among the major banks, but smaller regional and specialized banks like SVB and Signature have continued to be vulnerable.

Given that the “causes” of the two takeovers were very different, why did we see them almost one on top of the other? Part of the reason is the contagion created by publicity. Depositors who have more in a bank than FDIC insurance would cover were fearful after SVB, the great majority of whose depositors were largely unprotected until March 13th, when the government stepped in with guarantees. Short-sellers jumped on every bank stock to make money, and some investors were fearful and joined them in selling. But while these were also “causes” of the problems, what started the perfect storm? Speculation, meaning bubbles. One thing that’s been true of the financial industry for at least a century is that it loves bubbles, loves to speculate and leverage to gain more.

Crypto is a self-contained bubble; there is no real foundation of value behind it, only the “technical” buy/sell relationship of the market. Even “stablecoins” that are supposed to be pegged to assets like the US dollar are really bubbles. Why would you invest in something that’s pegged to something else? Answer: you hope that the value appreciates despite the peg but that the peg protects the downside. Well, one stablecoin was temporarily caught in the SVB mess, and it may not be out of the woods. That’s because nobody reserves 100%; if enough people bail on a stablecoin, it fails too.

The tech VC world is also, I would argue, a bubble. For decades, the VC community funded multiple startups in the same space to create “buzz”, the media attention that tended to make the space seem hot and induce established firms to buy up the startups or be left out. The current focus is on social-media startups, despite the fact that ad sponsorship is a zero-sum game. What you do is to raise money, spend a bit on cloud resources to host your startup, spend more to promote it, and then try to “flip” it quickly. We’ll see AI playing along these lines shortly, and in great volume.

Banking once was regulated enough to prevent most bubbles, but those regulations were weakened in 2018. The real problem, though, was Wall Street’s appetite for more risk, and its search for new bubble opportunities. Even the dot-com bubble and the credit-default-swap bubble didn’t stimulate addressing the systemic risk of bubbles, just minimal local ways of reducing their impact. As a result, banks were exposed to problems that really existed outside banking…what might be called “inherited bubbles”. While there are banks that want to issue crypto, the majority who are at risk there are at risk because they’ve banked crypto firms’ assets. SVB banked startup assets, tech assets. Both held reserves in bonds, reserves required by law, and at the same time the Fed was taking steps to make the bonds worth less.

The global economy has a bunch of moving parts, but they’re not separate parts. Instead, they all move in a complex way, and so something that distorts one segment will almost inevitably bite you in a segment that seems on the surface to be totally unrelated. Should the Fed have asked what the impact of higher rates would be on bonds and those who hold bond reserves? Surely, but if they did, it didn’t show. They left it to the market, and the market loves bubbles, until they burst.

There are a lot of reasons why it’s said that the US is losing its top spot in tech innovation, but I think the biggest reason is one that’s not talked about at all. It’s bubbles. Startups that are actually aimed at creating a real, new, value have almost disappeared in the VC bubble. The thing about bubbles, including startup bubbles, is that they enrich capital and not labor. You don’t need many real engineers to launch a social-media bubble, far fewer than you’d need to start a company that was going to build a product and employ real people. Bubbles have enriched the wealthy rather than creating jobs for the average worker, and it’s real stuff that advances technology, and pretty much everything else.

Are Operators Really Going to Move Routing to the Cloud?

I’ve worked with telcos for decades, consulting with them, cooperating in international standards bodies, and sharing meals and drinks with many of their executives and planning staffs. There are a lot of good things you can say about them, but even I have to admit that they have a tendency to take root and become trees. That’s one reason why I have to wonder about a story (quoting an analyst study) that says that each big telco will spend $1 billion on cloud network transformation. The other is that I’m hearing nothing from operators to support the statement.

This story suggests a huge boom in cloud usage as part of the telco network, one that will generate enormous investment and necessarily generate even more enormous benefits to justify that spending. This, at a time when enterprises, who have way longer and broader experience with public cloud services than telcos do, have decided that cloud cost overruns are often crippling and some companies (Basecamp, for example) have saved millions moving higher-level features off the public cloud and into the data center. So are telcos behind the enterprises in recognizing that the cloud’s savings are often illusory, or are they ahead of their usual pace in accepting new technologies?

The devil, they say, is in the details, and it’s not just this general prediction I have a problem with. According to the article, “46% of telco network capacity will be entirely cloud native in the next three to five years.” There is simply no way that’s going to happen, unless somebody proposes that we redefine both “cloud-native” and “cloud”. I don’t know a single telco who plans to do anything like that. Network capacity is based on routers and optical paths. A big Tier One in the US told me over a decade ago that they’d looked at hosted router capabilities to save money, not in the cloud but in their own facilities, and had determined that they couldn’t perform as the task required. To host virtual routers in the cloud, as a “virtual network function”, has also been examined and rejected.

I talked to a telco CTO on this point two weeks ago, obviously not relating to the story/study but to the question of transport evolution. They were excited about “convergence” on more optical capacity, but they were not excited about the use of VNFs for traffic. The problem was cost. Cloud providers charge for traffic, and so about the silliest thing you can do in the cloud is push a lot of featureless bits through it. “We ran the numbers on a carrier Ethernet business connection, and the cost of VNFs even providing security was outlandish.”

The comments cast doubt on the benefits of hosting real network elements in the cloud. According to the article, early adopters, defined as companies with “a comprehensive telco cloud strategy with well-defined goals and timelines; advanced players in terms of the proportion of network functions that have been virtualized; and those that expect more than 50 percent of their network capacity be on cloud”, will recover 47% of their investment in three to five years. Let me get this straight. These people are going to go to their CFO with a project that won’t pay back even half its cost within three to five years? I don’t know any CFOs who wouldn’t throw them out of the office with that story.

Finally, we have the notion of a BT project that’s saving them a lot of money with the cloud. The problem is that it’s not using the cloud to push bits, but to host what are really OSS/BSS applications now run on mainframes. Cloud network transformation doesn’t happen because somebody moves record-keeping on customers or assets to the cloud. None of that has anything to do with network capacity. To move network capacity to the cloud, you’d have to move service data plane handling to the cloud, and there is absolutely no operator I’ve talked with that’s looking to do that. Even if, contrary to my CTO comment, virtual functions might support business point-of-connection services, NFV’s support of those is focused on universal CPE (uCPE), meaning open white-box devices on the customer premises and not in the cloud. And the initial NFV “Call for Action” white paper made it clear that routers and big iron traffic-handlers were not targets of NFV.

That doesn’t even get to the point about “cloud-native”. Like a lot of concepts that get good ink, the term is smeared across a bunch of things it rightfully shouldn’t be applied to. I believe that “cloud-native” means “created from stateless microservices”, and that structure is reasonable for GUIs but totally unreasonable for data-plane handling for the simple reason that there’s too much message traffic between components.

The story introduces comments about 5G and Open RAN, which suggest that all this is moving traffic to the cloud, but the majority of virtual-function usage defined in 5G relates to control-plane rather than data-plane behavior, and while O-RAN does define data-plane elements as hosted functions, operators are interested in this more as a means of opening up the implementation to white boxes and local servers, not pushing the functions to the cloud. RAN is an edge function, and you can’t backhaul a data plane to a cloud hosting point.

The story, and apparently the study, also talk about 5G and Open RAN as sources of “potentially lucrative services and use cases”, but where are they? We’ve had 5G discussions for five years, and there have been no new lucrative services, only continued speculation. Yes, there are still operators (and 5G vendors) who believe in these new services and use cases, but there’s also plenty of people who believe in elves and the tooth fairy, and some who think the earth is flat. Wishing won’t make it so.

I think that the main problem here is conflation. People think of telcos as network operators, as though all they operated were networks. A big telco has the same business management issues as any other big company. They have, hire, fire, and pay employees, manage real estate, file tax reports, handle their stock activity, and so forth. All this stuff is lumped into “OSS/BSS”, and while things even relating to the network are only a small part of that, even OSS/BSS gets lumped into “network”. The problem with this particular story/report is that it says that “46% of telco network capacity will be entirely cloud native in the next three to five years.” As I said earlier, there is simply no way that’s going to happen and I don’t know any real telco operations type who believes otherwise.

Another conflation problem is treating virtualization and hosting of features as being the same as public cloud hosting. The NFV ISG didn’t propose to move features to the public cloud. The initial work focused on deploying features on carrier cloud hosting. My own work with the ISG was directed at making virtual functions compatible with the platform software used to build clouds, so as to take advantage of the capabilities being developed there. I’m not suggesting that there was, and is, no value in public cloud hosting of some functions, only that saying that VNFs make up things like 5G doesn’t mean that they’re hosted in public clouds. A “cloud” VNF isn’t necessarily a public cloud VNF, and that connection is made way too often.

I don’t have access to the report the article cites, so I can’t say whether it’s responsible for the claims, whether it has been summarized inaccurately by somebody, or whether the reporter didn’t understand the details. In a sense, I’m surprised, but in another sense maybe not. Is this another example of pushing hype? If so, it would be really disappointing.

What’s the Best Broadband Technology?

The economics, and opportunities, associated with broadband deployment have always been complicated. It’s been almost 20 years since I started using “demand density”, a measure of economic value that could be “passed” by a mile of broadband infrastructure and available to be hooked up and monetized. Since then, I’ve seen the maturing broadband market changing even that dynamic a bit. Overall demand density, over the operators’ footprints, by state, by metro area, may be useful in broad-strokes planning, but we’re now seeing a need for some micro-focusing as we consider what technologies are best in satisfying broadband needs and operator profit goals.

I pointed out in past blogs that within any given service area we tend to have pockets of concentration of demand/opportunity, and pockets where there’s not much going on. In New Jersey, my home state and a state where overall demand density is among the highest in the nation, we have large state-owned land areas with nearly zero residential or business population. But today, in a world where smartphone connectivity is as important as (or even more important than) wireline broadband, even those areas may require broadband connectivity to support the highways that run through them.

It’s also true that all of those pockets of concentration I mentioned aren’t the same. Obviously an important measure of the economic value of a “community” is the total spending that it can generate from both residential and business users. The question is how that can be realized, and that depends on a number of complicated factors.

One such factor is the distribution of residential income. Take two communities, one of twenty thousand residents and one of fifty thousand. Suppose the first community has twice the household income of the second. Is the second automatically the better opportunity because it offers more total income? Not necessarily; at least three factors complicate the raw comparison. First, higher-income communities are usually more spread out, so the “pass cost” to prepare to connect customers will be higher. Second, zoning rules in the first community will likely limit businesses locating there, which reduces the total network opportunity. Finally, households tend to spend a given percentage of disposable income on broadband, and higher-income households have more disposable income as a percentage of total income. Some of these factors favor the smaller, richer community and some favor the larger one, which is why total income alone is a poor guide.
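A toy calculation shows how these factors can offset raw totals. Every number below is a hypothetical assumption chosen for illustration; it isn’t modeling data, and the point is only that the ranking can flip with plausible changes to pass cost or spend.

```python
# Toy comparison of two hypothetical communities; all figures are assumptions.
def annual_opportunity(households, spend_per_hh, business_spend,
                       route_miles, pass_cost_per_mile, amortization_years=10):
    """Addressable annual spend minus amortized cost to pass the area."""
    revenue = households * spend_per_hh + business_spend
    amortized_pass = route_miles * pass_cost_per_mile / amortization_years
    return revenue - amortized_pass

# Community A: smaller, higher-income, more spread out, zoning-limited business.
a = annual_opportunity(households=7_700, spend_per_hh=1_250, business_spend=400_000,
                       route_miles=500, pass_cost_per_mile=60_000)
# Community B: larger, lower-income, denser, with more business activity.
b = annual_opportunity(households=19_000, spend_per_hh=500, business_spend=1_500_000,
                       route_miles=380, pass_cost_per_mile=60_000)

print(f"Community A net annual opportunity: ${a:,.0f}")
print(f"Community B net annual opportunity: ${b:,.0f}")
```

Under these particular assumptions the two end up in the same ballpark, and modest changes to density or per-household spend swing the answer either way, which is the real point.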

Another, growing, factor in opportunity measurement is the value of mobile coverage in the area. Going back to my New Jersey example, a mobile operator might have virtually no “direct” opportunity from residents in an undeveloped state-land area, but if a major highway supporting commuting or leisure travel passes through, or if there’s a popular business or recreational destination in the area, then a lack of mobile broadband there will discredit an operator’s service to everyone who needs it there, regardless of where they live.

One of the things that these factors have done is change the dynamic of fiber broadband deployment. Twenty years ago, Verizon dominated residential fiber broadband in the US because it had a high regional demand density and could therefore afford to push Fios to much of its service footprint. In the years since, AT&T started deploying fiber in its higher-density pockets, and new players have started to offer fiber in areas where the incumbent operator didn’t.

5G Fixed Wireless Access (FWA) is another development related to the intricacies of demand density. An FWA deployment is a single fiber-fed tower (fiber to the node, or FTTN) that supports users within a roughly one-mile radius. Because user connections are made via RF, the pass cost is limited to the cost of the tower and its fiber feed; no trenching of fixed media is required. It’s no wonder that many sources say FWA is the fastest-growing form of broadband in the US.

Then there’s satellite. If you look at mobile coverage maps, you see that there’s a good chunk of most geographies where no mobile service is available, and it’s highly probable that there’s no fixed-media broadband there either. Satellite broadband is often the only broadband available in undeveloped areas, because the “node” is up in space and can serve a very large area without any other infrastructure deployment needed. Even in the US, there’s growing interest in satellite because it’s available everywhere, and it’s likely that some smartphones will support satellite broadband directly in the near future, supplementing their normal mobile broadband connections.

OK, so we really have two “broadband infrastructures” deploying. One is the physical-media “wireline” form that includes fiber, CATV, and some copper loop. The other is the RF form, which includes mobile broadband, satellite broadband, and FWA. The most significant thing happening in broadband is that second form, and in particular the mobile and FWA pieces. The reason is that these are “seeding” broadband infrastructure further into thin areas, making them “thick enough” to justify further buildout.

If you have to push glass to a 5G tower or FWA node, wouldn’t it be possible to branch out from that point with fiber PON? Couldn’t the ability to serve a given thin location be enough to make that location more suitable for residential and business development? The more mobile and FWA we deploy, the more we reduce the incremental pass cost for even fiber to the home or business, because we have a closer feed point to exploit.

I’ve talked with some of the proponents of the “fiber broadband for all” thesis, and this is how the more thoughtful of the group see their goal being achieved. Is it realistic? I’ve tried to model that question, and while I’m not completely confident in the results, for reasons I’ll note below, they’re interesting.

My model says that the current mobile broadband and FWA trends will, within 10 years, create enough “seed” points of fiber feed that almost 70% of US households could be offered PON connections, and that would rise to almost 80% in 20 years. The qualification I’ve noted is that it’s nearly impossible to project economic, political, and technical trends out that far. The biggest caveat my modeling reveals is its dependence on the cost and performance of FWA.

It is credible to assume that table stakes for home broadband in 10 years will be at least 500 Mbps download and 200 Mbps upload. If FWA can achieve that, one risk to the model results goes away. If technology advances could also extend FWA range (subject to topology) to two miles, that would remove another risk but create a new one. At the two-mile range level, FWA that meets basic service goals would be so much cheaper than PON that an FWA provider could undercut a PON/CATV provider on price, even in areas where PON and CATV are now dominant. In other words, we might see a kind of reversal of the seeding effect, one where FWA starts to cannibalize the low-end piece of the fiber space, to the point where fiber growth would likely stop well short of the model predictions.
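Here’s the range arithmetic that drives that risk, as a rough sketch. The node cost, household density, and PON pass cost are assumptions for illustration, and the calculation ignores per-subscriber CPE cost and the shared-capacity limits of a single node.

```python
import math

# Illustrative cost-per-household-passed arithmetic; all constants are assumptions.
FWA_NODE_COST = 250_000        # assumed cost of a fiber-fed tower/node
HOUSEHOLDS_PER_SQ_MILE = 300   # assumed density in a thinner service area
PON_PASS_COST_PER_HH = 900     # assumed fiber pass cost per household

def fwa_cost_per_household_passed(radius_miles: float) -> float:
    households = math.pi * radius_miles ** 2 * HOUSEHOLDS_PER_SQ_MILE
    return FWA_NODE_COST / households

for radius in (1.0, 2.0):
    cost = fwa_cost_per_household_passed(radius)
    print(f"FWA at {radius:.0f}-mile radius: ~${cost:,.0f} per household passed")
print(f"PON (assumed):         ~${PON_PASS_COST_PER_HH:,.0f} per household passed")
```

Doubling the radius quadruples the coverage area, so the per-household node cost drops by roughly a factor of four, which is what would let a longer-reach FWA service undercut wireline even in areas wireline already serves.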

One thing that seems clear, more so for sure than the future of universal FTTH, is that broadband availability based on terrestrial tools is going to expand significantly in the next decade, in the US and in other developed economies. That likely means that satellite broadband will remain confined to under-developed countries and areas, and that terrestrial broadband will continue to improve, limited only by the willingness of operators to invest.

Is VNF Portability a Real Problem for Telcos or Vendors?

Just how difficult is “carrier cloud”? There’s been a lot of talk about how hard it is for operators to deploy their own cloud resources, particularly when many services have a much bigger footprint than the operators’ own real estate holdings. There’s been a lot of talk about how public cloud partnerships are favored by operators, in part because of the footprint problem and in part because most are uncomfortable with their ability to sustain their own cloud resources. Now there’s talk about whether even public cloud hosting could raise unexpected costs. Maybe, but it’s not clear just whose problem that is.

The article cited here is about the question of the portability of network functions, meaning the hosted functions that were created under the specifications of the Network Functions Virtualization ISG of ETSI. According to the article, “Network functions (or NFs) developed for one vendor’s cloud cannot enter another’s without expensive repurposing by their suppliers.” There are three questions this point raises. First, is it true? Second, whose problem is it? Finally, does it matter?

The best we can say regarding the first question is “Maybe”. Any software that runs in a public cloud could be designed to use cloud-specific web service tools, which would make the function non-portable. However, nearly any function could be written to run in a virtual machine without any specialized tools, as a plain IaaS application. Those applications would be portable with a minimum of problems, and in fact that was a goal of the NFV ISG from the first. Subsequent efforts to align with “CNFs”, meaning “Containerized Network Functions”, introduce a bit more risk of specialization to a given cloud, but it’s still possible to move properly designed VNFs between clouds with an acceptable level of effort.
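The design choice that determines portability is easy to sketch. Something like the pattern below, where any cloud-specific service sits behind a thin adapter, keeps the function logic itself portable; the class and method names here are hypothetical and purely for illustration, not any real SDK or NFV interface.

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """Abstraction the VNF codes against; one adapter per hosting environment."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class LocalFileStore(ObjectStore):
    """Portable default: plain filesystem, usable on IaaS, bare metal, or containers."""
    def __init__(self, root: str = "/var/lib/vnf-state"):
        self.root = root
    def put(self, key: str, data: bytes) -> None:
        with open(f"{self.root}/{key}", "wb") as f:
            f.write(data)
    def get(self, key: str) -> bytes:
        with open(f"{self.root}/{key}", "rb") as f:
            return f.read()

class CloudStoreAdapter(ObjectStore):
    """Provider-specific adapter: the only code that needs rework to change clouds."""
    def __init__(self, client):          # client: whatever SDK the provider supplies
        self.client = client
    def put(self, key: str, data: bytes) -> None:
        self.client.upload(key, data)    # hypothetical SDK call
    def get(self, key: str) -> bytes:
        return self.client.download(key) # hypothetical SDK call

def apply_filter_rules(store: ObjectStore) -> bytes:
    """Function logic depends only on the abstraction, so it ports between clouds as-is."""
    return store.get("filter-rules")
```

A VNF written this way moves between clouds by swapping one adapter; a VNF that calls provider services directly throughout its code is the “expensive repurposing” case the article describes.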

The second question has a bit of its own ambiguity. The author of a VNF determines whether it uses non-portable features, which means that a VNF “supplier” could in theory write either largely portable or thoroughly non-portable VNFs. In that sense, it would be the VNF supplier who made the bad decision to render a function non-portable. However, many of these suppliers are beholden to a particular cloud provider through that provider’s development program. Cloud providers would love to have VNFs run on their cloud alone, so in a sense their proprietary goals are at fault. But operators themselves have the final word. Nobody puts a gun to your head and demands you buy a non-portable VNF. If operators, either individually or collectively through their participation in the NFV ISG, demanded that all VNFs forswear all but the most essential non-portable features, and take steps to make it easier to port those that must be used, VNF authors would toe the line because they couldn’t sell otherwise.

But the third question is likely the critical one. Does any of this matter? It’s the hardest question to answer because it’s a multi-part question.

The first and most obvious part is “Does NFV matter at all?” The NFV ISG got started over a decade ago, and quickly aimed well behind the proverbial duck in terms of pooled-resource hosting (despite my determined efforts to bring it into the cloud era from the first). The initial design was off-target, and because most groups like this are reluctant to admit they did a bunch of the wrong stuff, the current state of NFV is still beholden to much of that early effort. Are there operators who care about NFV? Sure, a few care a little. Is NFV going to be the centerpiece of carrier cloud? Doubtful, no matter who hosts it.

The second part of the question is “Does cloud portability matter?” The majority of operators I’ve talked with aren’t all that excited about having to integrate one public cloud provider into network services, and are considerably less excited about the integration of multi-cloud. In the enterprise, the majority of users who purport to be multi-cloud are talking about two or more clouds in totally different areas (a Big Three provider plus Salesforce is the most common combination). Slopping application components between cloud provider buckets isn’t something anyone likes much, and so operators get little encouragement for the idea outside the tech media. So, the answer to this question is “not a whole lot.”

The third piece of the question is “Does cloud portability in carrier cloud actually mean function portability?” The majority of interest by operators in cloud partnerships at the service feature level comes in areas like 5G, where some elements of 5G are provided by the cloud. These elements integrate with the rest of the network through interfaces defined by the 3GPP or O-RAN Alliance, which means that they’re essentially black boxes. Thus, if Cloud Providers A and B both offer a given 5G feature, connected via a standard interface, the feature itself doesn’t need to be portable because the same black box is available from multiple sources.
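A simplified sketch of the black-box point: the operator-side integration code talks only to the standardized interface, so which provider (or cloud) is behind it doesn’t matter. The interface and class names below are hypothetical simplifications loosely modeled on a 5G session-management service, not actual 3GPP or O-RAN APIs.

```python
from typing import Protocol

class SessionManagement(Protocol):
    """Stand-in for a standardized service interface exposed by a hosted 5G function."""
    def create_session(self, imsi: str, dnn: str) -> str: ...

class ProviderA_SMF:
    """The feature hosted in cloud A; internals are invisible to the operator."""
    def create_session(self, imsi: str, dnn: str) -> str:
        return f"session-a-{imsi}-{dnn}"

class ProviderB_SMF:
    """The same feature hosted in cloud B, reached through the same interface."""
    def create_session(self, imsi: str, dnn: str) -> str:
        return f"session-b-{imsi}-{dnn}"

def attach_subscriber(smf: SessionManagement, imsi: str) -> str:
    # Operator integration code: identical regardless of which provider sits behind it.
    return smf.create_session(imsi, dnn="internet")

print(attach_subscriber(ProviderA_SMF(), "001010000000001"))
print(attach_subscriber(ProviderB_SMF(), "001010000000001"))
```

As long as both providers honor the standard interface, nothing on the operator side has to be ported when one black box replaces the other.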

The final piece? “What about features of more advanced services?” The truth is that the hosting of basic network functions isn’t going to move the ball much for operators, or for cloud providers. The big question is whether something bigger, like “facilitating services” or even OTT services like the digital-twin metaverse, might lie ahead. If it does, then broader use of better features could provide operators with the incentive to self-host some things.

The problem is that this question has no clear answer at the moment. Some operators are committed to facilitating services, some are even looking at OTT services, but nobody is doing much more at this point than dabbling. One reason is that there are no real standards for how these new features/functions would work, which means that there’s a risk that they wouldn’t be implemented and interfaced consistently.

That’s the real risk here, not the VNF risk. VNFs were designed to be data-plane functions, and were gradually eased over into standardized control-plane elements for 5G. IMHO, generalized facilitating services or OTT service features would likely be “real” cloud components, unlikely to obey the management and orchestration rules set out for NFV. Still, though, they’d need some standardization of APIs and integration, or providers of the new features/functions would have nothing to write to that ensured they could be linked with operator networks and services. The NFV ISG, in my view, is not the appropriate place to get that standardization done, and until we have such a place, the risk the article describes exists, just in a different form from what’s described.

What Ciena Might (or Might Not) be Showing Us About Optical Convergence

There aren’t all that many enduring questions in networking, but one I recall coming up almost thirty years ago came up again last week. It was “Is it getting cheaper to provide more capacity than to try to optimize it?” There’s a new flavor to this question, though: “Should we be using more optical paths to substitute for MPLS LSPs?” I’ve blogged about the potential benefits of a metro-network model where metro centers are meshed in some way via optical trunks. Might we actually be heading there? Ciena’s quarterly numbers, reported on March 6th, suggest that maybe we are, but not conclusively.

The company’s revenue grew by just over 25%, and their premier product area, “Converged Packet Optical”, jumped from $541 million in fiscal 1Q22 to $736 million in fiscal 1Q23. Routing and switching also gained (from $86 million to $120 million), and platform software and services gained just slightly. All the other areas showed a year-over-year decline, including Blue Planet.

The Blue Planet story might be the most direct indicator of a shift in thinking. Blue Planet is Ciena’s operations automation software framework, and the fact that revenue there declined while the revenue for optical/packet equipment was up significantly suggests that most of the gear is going into sites with Blue Planet already in use, or where it is still not being considered. That could be an indication that optical networks generally are lower-touch than router networks, and require less operations automation. Indeed, that’s one of the value propositions for the “more bits not more bit management” theme.

Another interesting point from their earnings is that the company got about 40% of its revenue from non-telco sources. Does this mean they’re seeing some data center missions, things that might lead to success in the metro of the future? Maybe. While the “webscale” players are the largest segment of the non-telco space, the other two segments (cable and government/enterprise) combine to be larger, and they’re similar to the telco space in terms of requirements.

The question of data centers is critical because data center coupling to the network is essential in injecting new service features. As I’ve noted in the past, these features have to be injected close enough to the user to allow for personalization and a high QoE, but deep enough to benefit from economies of scale. The only place that can happen is “metro”, the major metropolitan-area concentration points. There are, for example, about 250 metro areas in the US. Metro areas also represent a great meshing point, which would expand the optical convergence opportunity.

The key piece of the Ciena story that relates to metro is their “Coherent Routing”, which is a software-managed multi-layer model that combines IP/MPLS at the top layer with an optical base. Metro is the logical place where the layer transition would occur, and where their new 8140 router (announced last week) is targeted. The 8140 is a logical candidate for both metro aggregation and metro service injection, since you could make data center connections to it. You can tie it to business Ethernet connections, to residential broadband (FTTH, cable, etc.) and to mobile and 5G networks, and you can also connect to a metro data center owned by the operator, an interconnect/COLO player, or a cloud provider, as well as to traditional public cloud regional data centers. Obviously it supports connection to the Internet and to CDNs. Ciena shows it interconnected with the 8190 for many of these missions, too.

I think that Ciena has a very good, perhaps even great, metro story, but they have an issue they share with another metro pioneer, Juniper, in that their documentation and positioning isn’t exactly evangelistic. They can do what’s needed, but you won’t find material that shows the full potential and role of metro in the network of the future. If an operator has already recognized just how important the metro is, Ciena’s stuff can connect with their thinking, as Juniper’s can. If the operator hasn’t made the connection yet, it’s not likely that the Ciena material will push them over the threshold into metro-land.

Ciena (and Juniper) aren’t simply ignoring the obvious here. The importance of marketing to operators, and even to cloud providers, is a matter of debate among vendors. These are giant buyers, large enough to justify dedicated sales resources with technical sales support backup. Given that, most network vendors don’t really provide what might be called “educational” or “mission-focused” marketing material, on the theory that their sales relationships can carry that load. However, my contacts with operators suggest that this isn’t a great strategy. Salespeople tend to shy away from educational/evangelistic activities; they’re time-consuming and may not pay off for many quarters, while sales organizations have to focus on making quota in the present. Thus, vendors may be slow to take advantage of strategic superiority when they have it, simply because it isn’t translating into tactical sales success this quarter, even though it might next year.

For Ciena, I think the problem is particularly acute because “packet/optical convergence”, the concept that’s in play to shift focus from bandwidth management to bandwidth creation, is very transport-centric rather than service-evolution-centric. Operators can view that convergence either as a cost-management strategy or as a new-service strategy. The operators’ own biases and legacy push them toward the former, and ultimately it’s the new service approach that has the potential of driving significant spending by operators, and thus revenue for vendors. And convergence tends to focus toward the bottom layer, which of course is less visible and tends to develop only when the layer above it runs out of gas. What happens with convergence if the packet/router players in that top layer step up? They push the optical piece downward, which constrains its growth. For Ciena, the convergence has to take place at the metro level to maximize their equipment sales.

Cisco, who has an even-less-specific metro positioning than Ciena, is nevertheless looking to make its routers the on-ramp to packet/optical convergence through its Acacia acquisition, which gives those routers highly effective optical interfaces. Since the routers can connect with each other efficiently via optical trunks, they let Cisco take a bigger bite of the top piece of the convergence story. The more routers do at the metro level, the more likely it is that Ciena gets stuck in a pure aggregation role, because service injection deeper than the metro is highly unlikely.

Router vendors aren’t the only players who need to understand how critical the metro is; optical players like Ciena may have even more at stake. With a strong metro story, they can make metro a barrier to router expansion and an outpost in the services space. That’s a role operators really need some vendor to play, and so the position is perhaps the strongest out there for expanding vendors’ opportunities. I suspect we’ll see clear signs of just which vendors are going after the space by the end of this year.