Do We Need a New Generation of Abstractions for Virtualization?

Ancient Romans didn’t have computers or networks, but I’d bet that if they did, we’d see “In abstractione simplicitatis” graven on their monuments; “In abstraction, simplicity”. We’ve been taking advantage of that principle, though often implicitly rather than explicitly, since the dawn of computing, and going forward it may well be that our ability to frame the right abstractions will determine how far and fast we advance.

The earliest commercial computers were often programmed in “machine language”, which required considerable skill, took a lot of time, and introduced many errors that cropped up in unexpected ways. Programming languages like FORTRAN and COBOL came along and abstracted the computer; you spoke one of these “high-level” languages to a “compiler” and that translated your instructions into machine language for the computer. We’d have a major software problem these days without this abstraction.

The other problem early computers quickly faced was that of complex but repetitive tasks. You have to read and write files and other “I/O devices”, and this involves complicated things like locating the file, sensing end-of-data or error conditions, and so on. If these tasks were left to the programmers writing applications, there would be hundreds of programmers doing the same things in different ways, and it would waste programming time. Not only that, every condition (like “mount this drive”) that required human coordination would be handled a different way, and operations would be a mess. What fixed this was the concept of the operating system, and without that abstraction we’d have a programming and operations resource problem.

If you look at these examples of early abstractions, you can see that there’s a common element. What we’re trying to do is to isolate software, software development, and software operations from resources. The same thing happens in networking. The venerable OSI model with its seven layers was designed so that for a given layer, the layer below it abstracted everything below into a single set of interfaces. We abstract networks and network operations, separating them from resources. Which, of course, is also what virtualization, containers, and cloud computing do for IT overall.

There’s a challenge here, though. Two, in fact. First, if abstraction is so important in computing and networking, how do we know we have a good one? If we pick a bad strategy for abstraction, we can expect it to fail to deliver on its promise of efficiency. Second, if there’s enough abstraction going on, we may end up with nested abstraction. Stay with Latin a moment and you find “Quis custodiet ipsos custodes”, which means “Who will watch the guards themselves”. Who will abstract the abstractions, and what do they turn into?

Network abstraction and compute abstraction have each evolved, largely independently but joined loosely by the common mission of information technology. We are now, in the world of distributed software components and resource pools, entering a phase where recognizing interdependence is really essential to both these critical technology areas. An application today isn’t a monolith loaded into a compute-monolith, it’s a collection of connected components hosted across a distributed set of systems and connected within and without by a network. What is the abstraction for that?

The cloud is a compute-centric abstraction. You deploy something on virtual resources, and that works in large part because you have a set of abstractions that make those resources efficient, ranging from operating systems and middleware to containers, Kubernetes, and operations tools. The network is almost presumptive in this model; we assume that connectivity is provided using the IP model of subnetworking, and that assumption is integrated into the container/Kubernetes deployment mechanism.

The questions I’ve raised about service lifecycle automation, network infrastructure security, and service modeling all relate in some way to an abstraction model, but we really don’t define exactly what that model is. The fact that we still have open questions about how to operationalize complex services, how to integrate hosted elements with physical devices, and how to build features into a network via hosting, connect them over the same network, or both, is proof that we lack an explicit model of abstraction. We’re seeking it, so far without success.

What are we seeking? I’ve worked with service and application standards initiatives for about thirty years now, and I’ve seen attempts at answering that question in virtually every one. I’ve also seen almost-universal failure, and the reason for the failures is that all approaches tend to be either top-down or bottom-up. Both have had fatal problems.

In a top-down approach, you start with the goals and missions, and from them you dive into how they can be achieved, through what’s essentially the process of analysis, which is a bit like hierarchical decomposition. There are two problems with the top-down approach. First, there’s a risk that it will require wholesale changes in existing technologies because those technologies don’t guide and constrain the early work—it’s above them. Second, the entire process will have to be completed before anything can be delivered, since the bottom-zone where real stuff exists isn’t addressed until the very end.

The bottom-up approach resolves these problems but creates others. Starting at the bottom means you presume the use of specific current technologies, which is good because they already exist and so can be immediately exploited. It’s bad because we’re presuming that those existing technologies are at least suitable, and hopefully optimal, for addressing goals we haven’t even gotten to yet (because they’re at the top and we started at the bottom). It’s also bad because trying to exploit current technology as we climb up from the bottom means we may well be picking low-benefit apples that make further climbing difficult to justify, and we may cement in intermediary practices to support that exploitation, practices we’ll have to undo if we continue upward.

Are we then doomed? Do we have to figure out some way to start in the middle, or at both ends? Well, maybe any or all of these things, but whatever we do, it will likely exploit the modern notion of the intent model.

Back in those early days of programming, as the concept of “modular programming” and “components” emerged, it was common to design programs by creating “shells” for each component that defined the interfaces and (through the name and comments) the basic functionality. This was eventually formalized in most modern programming languages, to require a “definition” or “class” module which was then “implemented” by a second module. Any module that implemented a class was equivalent, so at the class/definition level, the implementation was opaque.
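To make that concrete, here’s a minimal Python sketch of the definition/implementation split; the class and function names are invented for illustration. Callers work against the “definition”, so any implementation that satisfies it is interchangeable and, from the outside, opaque.

```python
from abc import ABC, abstractmethod

# The "definition" module: callers see only this interface.
class FirewallFunction(ABC):
    @abstractmethod
    def apply_policy(self, policy: dict) -> None: ...
    @abstractmethod
    def status(self) -> str: ...

# Two "implementation" modules; either satisfies the definition,
# so at the class/definition level the implementation is opaque.
class ApplianceFirewall(FirewallFunction):
    def apply_policy(self, policy: dict) -> None:
        print(f"pushing {policy} to a physical appliance")
    def status(self) -> str:
        return "appliance: up"

class HostedFirewall(FirewallFunction):
    def apply_policy(self, policy: dict) -> None:
        print(f"configuring {policy} on a hosted instance")
    def status(self) -> str:
        return "hosted: up"

def operate(fw: FirewallFunction) -> None:
    # This caller neither knows nor cares which implementation it has.
    fw.apply_policy({"allow": ["443"]})
    print(fw.status())

operate(ApplianceFirewall())
operate(HostedFirewall())
```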

An intent model works much like this. You specify the interfaces, meaning the inputs and outputs, but the contents of the model are invisible; it’s a “black box”. If you use intent model principles in a top-down design, you can decompose a service into pieces and specify only the stuff that’s visible from the outside. You can then decompose that further, to the point where you “implement” it.

In service modeling, you could imagine that all services could be considered to have three pieces. First, the Service_Access piece that defines how users connect to the service. Second, the Service_Connectivity piece that defines how internal connections are made, and finally the Service_Features piece that defines what functional elements make up the service experience. Each of them could be decomposed; Service_Access might decompose to “Access_Ethernet” and “Access_Consumer_Broadband”. If this process is continued, then an “empty” intent model could be defined as a hierarchy, and only at the bottom would it be necessary to actually send commands or deploy elements.
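That hierarchy can be expressed as a simple data structure. The sketch below is purely illustrative (the service name and the leaf “bindings” are hypothetical): each node is an intent-model black box, and only the leaves actually touch resources.

```python
# Each node is an intent model: either child models (further decomposition)
# or a leaf "binding" that does real resource-facing work.
service_model = {
    "name": "Business_Internet",            # hypothetical service
    "children": {
        "Service_Access": {
            "children": {
                "Access_Ethernet": {"binding": "provision_ethernet_access"},
                "Access_Consumer_Broadband": {"binding": "provision_broadband_access"},
            }
        },
        "Service_Connectivity": {"binding": "provision_vpn"},       # hypothetical binding
        "Service_Features": {"binding": "deploy_feature_hosts"},    # hypothetical binding
    },
}

def decompose(node: dict, name: str, depth: int = 0) -> None:
    """Walk the hierarchy; only at the leaves would commands be sent."""
    indent = "  " * depth
    if "binding" in node:
        print(f"{indent}{name} -> bind to '{node['binding']}' (resource-facing step)")
    for child_name, child in node.get("children", {}).items():
        print(f"{indent}{name} decomposes into {child_name}")
        decompose(child, child_name, depth + 1)

decompose(service_model, service_model["name"])
```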

Done right, this approach maintains the separation between services and resources, and also requires that you define the way that choices between implementations and features are made, without mandating that you actually do anything to solidify the process. The “resource binding” task comes only at the end, and so you’re highly flexible.

The obvious challenge in applying this strategy to service and application lifecycle management is the lack of a toolkit, but in some sense we might actually have one already, called a “programming language”. Since programming languages and techniques have applied an intent-model-like approach for decades, why not build services (for example) like we build applications? I actually did that in the first ExperiaSphere project, which used Java to create service models. It might not be the ideal approach, particularly for highly dynamic services with regular changes, but it could at least serve as a testbed to prove out the concept.

Intent modeling has been around for a long time, first informally and then in a formalized way, but it’s greatly under-utilized in network services and IT. We could go a long way toward easing our way into the future if we took it more seriously.

Hosted-Function and API Security for Service Providers: Not There Yet

Most network operators have their hopes pinned on 5G and IoT, and it’s already clear that a lot of these hopes will be dashed simply because both concepts have been over-hyped. Now, to make matters worse, there’s a growing concern that the implementation of 5G, and support for IoT applications, may have created a significant vulnerability. Even before this story came out, some technologists in the operator community were cautiously sharing their own fears of problems.

One of the network issues we usually ignore is that, in IP networks, the devices are part of the network and exposed as addressed elements, almost like users are. Not only that, it’s often possible to send messages to these network devices. Fortunately, this issue surfaced decades ago and operators have taken steps to ensure that network devices are difficult to attack.

When 5G and Network Functions Virtualization (NFV) came along, they introduced the need to host network functions on a resource pool. That means that the resource pool and functions now generate a potentially wider attack surface, and this is particularly true when function-authoring isn’t highly proceduralized to prevent the introduction of malware or simple vulnerabilities.

At one level, this problem seems related to the problems associated with any componentized application, in the cloud or in the data center, but there is a special problem with network functions because a security fault could expose the network and its users and not just a single application or single user. Cloud providers face a similar risk; if their own infrastructure can be attacked then it might be possible for a badly behaved user to cross over into the infrastructure, and from there obtain information from other users in what they believe to be private application components and databases. However, cloud providers understand the risk and how to manage it, and network operators aren’t sure they do.

Over the last six months, about a third of network operators have commented to me on this problem, and indicated that they’re working out how to audit for risks and how to address them. Obviously this means that they don’t believe they’re completely in control at this point.

Security risks can also be created when network assets are exposed via APIs, either to support partnerships with higher-level service providers as some operators are already planning to do, or to expose them to users and developers. This challenge isn’t unique to IoT, but because IoT has been an ongoing revenue target for many operators, more has been done in the space than in other potential API-based applications.

APIs represent places where a third-party application can call on network facilities, which means that they could be exploited to do something evil if the APIs aren’t fully secured and protected against vulnerabilities that could generate exploits. Many operators will admit that they are used to operating within their own little world, and that the notion of exposing network features in any way is a new one. One operator said that they found eleven potential exploits in their IoT API set when an audit was done. Fortunately there was no indication any were actually exploited.

Even if we dismiss malware and exploits for the moment, there’s still the pedestrian problem of things like denial-of-service attacks. Offer an API through which any sort of message can be introduced, and you offer a portal through which network traffic and “feature load” can be injected. Another operator found that their event portal could be used to introduce enough traffic to swamp their IoT servers, for example. Metering of any portal through which traffic can be passed is critical.
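Metering doesn’t need to be exotic; a per-caller token bucket in front of an exposed portal caps what any one party can inject. This is a generic sketch, not any operator’s actual implementation, and the rate, burst, and API-key values are arbitrary.

```python
import time

class TokenBucket:
    """Simple per-caller rate limiter for an exposed API portal."""
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens based on elapsed time, up to the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# One bucket per API key; callers that exceed the meter get rejected.
buckets: dict[str, TokenBucket] = {}

def handle_request(api_key: str) -> str:
    bucket = buckets.setdefault(api_key, TokenBucket(rate_per_sec=10.0, burst=20))
    return "accepted" if bucket.allow() else "rejected: rate limit exceeded"

for _ in range(25):
    print(handle_request("iot-partner-123"))   # hypothetical caller
```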

With the risks associated with hosted functions and APIs so clear, you’d wonder why there haven’t been any widely publicized problems, or any concerted attention given to the potential for those problems to arise. Part of the reason is that operators tend to think of hosted functions and APIs in service terms, and don’t see them the way that cloud providers look at their own business. You need a resource pool to host something, and that pool should be secured as a resource pool rather than per-application. Otherwise, the security is unlikely to be effective. You need APIs to be viewed as invitations to the party, and guests (as we all know) can misbehave once admitted.

It doesn’t help that application security is still a work in progress. You can almost bet that, about every six months, there will be a new security announcement, and that it will purport to be an essential advance in security practices and levels. Operators aren’t used to this sort of moving-target situation, and each new layer of security features adds complexity, to the point where some experts argue that we’re making things less secure rather than more secure when all the operations issues are considered.

Then there’s the big question, which is whether application security measures are suitable for hosted-function security within networks. Interestingly, virtual networking came along (Nicira, now part of VMware) as a means of reliably separating tenants in multi-tenant virtualized infrastructure. Virtual networking is widely used by cloud providers, but you don’t hear much (if anything) about it in network operator applications. The standards bodies have not only failed to consider it explicitly, they’ve also failed to discuss issues of address space usage, management, and assignment. All these things are critical in multi-tenant infrastructure.

As for API security, there is no question that things like IoT APIs increase the attack surface, but this is very similar to the risk that any interface poses for an operator. You can limit DDoS attacks by isolating the originating port, and you could surely do that with APIs as well. The real risk is the APIs that might expose composable features aimed at those who build OTT services, such as those proposed by AT&T. Low-level features are low-level because they’re tied more directly to network behavior than customer interfaces, even IoT service interfaces, would likely ever be.

What operators really need to be doing here is opening some discussion, in some formal sense and in an established standards body, to address this question of hosting and API security for network operators. What vendors, particularly virtual network vendors, need to be doing is promoting their own solutions to the problem. That would not only raise awareness, it would introduce some strategies to be tested in the broad market, where the final validation of a solution will always have to be provided.

Comcast Goes Live with Xfinity Stream

Cable companies offered TV from the first, and were also among the first to offer broadband Internet. Now there are indicators that they’ll be separating the two. Comcast is about to go live with its Xfinity Stream service, which will be available on Roku and Amazon streaming devices and on some TVs, with other platforms expected down the line. This could have a seismic impact on both broadband and streaming, and it could be a good or bad thing for cable companies and for consumers.

Like the telcos, the cablecos have traditionally had specific territories, geographic areas where they were able to deploy wireline infrastructure. Early cable TV served two roles: a source of programming where antennas didn’t work well, and a means of delivering video that wasn’t affiliated with an over-the-air channel and thus wasn’t available as redistributed linear RF. Over time, cable (via the industry’s CableLabs shared R&D) has been modernizing the CATV delivery format to improve its value as a consumer broadband conduit. CATV has a lower “pass cost” for infrastructure than things like FTTH and better capacity than DSL, but it’s not practical for cablecos to build out a national footprint.

The advent of streaming video, and in particular the growth of live streaming of programming, has impacted the cablecos because these services are available over any broadband connection with reasonable bandwidth. That means that cable TV now competes with streaming TV everywhere cable has service, but also everywhere any broadband is available. From a competitive perspective, that’s obviously not a good thing.

Another problem with cable TV in competition with streaming is the need for a set-top box. While these are usually combined with the normal home WiFi base stations, and while any broadband connection will have that feature these days, set-top boxes are required for every TV and they’re more expensive than streaming sticks. Not only that, more and more TVs come with built-in streaming support, so there’s no need for a device at all. That cuts service costs to users, and reduces the installation and support challenges for operators.

Comcast has seen steady declines in subscriber growth this year, and it’s clear that TV revenues are under threat from streaming providers. Given all of this, it’s no wonder that the company wants to be able to offer TV service over any broadband connection. Eventually, IMHO, it’s likely that they’ll move everything to Xfinity Stream.

The risk, of course, is that this will encourage the disconnect between live TV and consumer broadband that’s created the problem for cablecos in the first place. For at least a period of time (and who knows how long) Comcast will have to deliver linear RF to legacy STBs as well as streaming broadband video, which means a higher infrastructure cost. Their Xfinity Stream will still have to compete with other streaming services, too.

One reason Comcast may have decided to make the move despite potential market issues is the risk created by 5G home broadband. 5G is no revolution in mobile service terms because of limitations in mobile devices’ ability to meaningfully exploit it, but it could be a major revolution in home broadband. Fixed wireless, as it’s often called, has a lower pass cost than even CATV cable, and it’s not difficult to see how a player in the space (like T-Mobile) could encroach on Comcast’s territory and at the same time offer services in places where Comcast doesn’t have CATV at all.

Comcast may be especially vulnerable to this because they’ve tended to work to get deals with subdivision developers for exclusive broadband rights. While these deals can prevent other operators from running wireline cable/fiber in these communities, they can’t block consumer decisions to use fixed wireless if it becomes available. And concentrations of Comcast-or-cableco-committed subdivisions might well encourage some creative antenna-placing by fixed wireless competitors.

The biggest industry question posed by Xfinity Stream is whether Comcast is in fact reacting to fixed wireless. If major consumer broadband providers like the cable companies believe that streaming over fixed wireless is a threat, then it probably is. That would mean that operators overall would be under some competitive pressure from the technology, that T-Mobile is likely to be a more significant (maybe very significant) player in the future, and that streaming technology is likely to supplant all linear RF TV delivery.

Another interesting data point is that Comcast is quietly deploying fiber in some of their home territories, even in places where their CATV plant is fairly modern and capable. You don’t need fiber to deliver linear RF and current-level consumer broadband. They are likely seeing areas where services like Verizon Fios are available or likely to become available as being under competitive threat for broadband speed, including upload speed where CATV is weaker. So far, fiber is apparently being used more for distribution than for home connection, but a shorter CATV span with fewer subscribers on it could result. That could mean better CATV broadband, making the approach a kind of FTTN model.

Google is also expanding its fiber broadband service to five additional states. Google doesn’t target entire states, of course; they typically enter market areas where there isn’t high-speed service available and where there are pockets of higher demand density. Nevertheless, this move suggests that other players will be jumping into the broadband space, and the Google model is for broadband-and-streaming-TV rather than linear RF.

I think that what we’re seeing is a multiplication of broadband delivery options, not a focus on a specific approach. Sure there’s a lot of PR on fiber-to-everyone, but that’s not going to happen. What will happen is fiber to any demand-dense area and other technologies, including and especially fixed wireless for the demand fringe areas. In between we’ll see things like FTTN or FTTC (node and curb, respectively) as cable companies and perhaps even telcos try to reduce the cost to serve users in areas where densities are marginal.

I also think that there is no future in the linear RF model, that streaming video, including streaming of live TV, is going to rule the future. That raises a very significant question, which is whether streaming aggregators like YouTube TV (Google) or Xfinity Stream from Comcast will then come under pressure from content companies who take advantage of the inherent populism of Internet streaming to go directly to the customer with their material. That would be a major shift in market practices, and could have a major impact (positive or negative; it’s too early to tell) on the content industry.

Is Amazon Going to Give Us Real Robots?

Amazon’s decision to buy iRobot, the company that makes the popular Roomba robotic vacuum, has stirred a lot of speculation, ranging from some solid business questions to the expected speculation that this is just the latest round in the effort to have robots take over the world. Are robots poised to overtake AI as the center of our fears of sentient technology, or (gasp!) might they mate and create something truly awful?

Most of you have probably seen commercials that link Alexa command capability to Roomba technology. Those commercials point to the most rational explanation for the deal. The “smart home” is getting smarter not only in terms of which specific gadgets give it smarts, but in its ability to interact with us and accept instructions. Think “the Swarm” and you get an idea of why some find this a bit disquieting; there are surely risks, but also benefits.

Not the least of those benefits being financial. Companies like Amazon recognize that the classic “whole is greater than the sum of its parts” rule applies particularly well to smart homes and other facilities. The value of Alexa is limited if all you can do is turn on a light or play some music. Add to the capabilities and you add to the value, and the combined value grows at an even faster rate. Imagine an intruder detected by Ring security and confronted by a charging Roomba. Well, maybe we need to think deeper than that.

Household robotics have an interesting potential. Vacuums were actually a bit of a low apple, though; the devices don’t need “intelligence” as much as what in animals we’d call synapses. If you’re about to run into something, stop or turn to avoid. They can rely on random movement to cover an area, but they could also be made to work with a simple map, either pre-installed or derived from past movement. If we want home robots to go somewhere, we have to advance what they do, and how they understand the facility where they’re working.

You could advance from vacuuming to a broader cleaning mission, which might mean extending current device capabilities to recognize different floor types and adapt to the current one, perhaps even to mopping or floor dusting. You could include a broader dusting mission for a taller, true robot, provided you could “teach” the robot to avoid knocking things over. You could make a robot bring you something if you could add object identification. All these applications demonstrate that there are three elements to a broader robotic device—mechanical manipulation and movement (M3), situational awareness, and mission awareness. All three would require augmentation from the state-of-the-vacuum baseline, and that creates a number of questions and challenges.

One question is how to distribute the intelligence. I’ve done a little work in robotics, and my conclusion was that the M3 features had to be largely in-device. First and foremost, a robot has to be able to avoid doing something destructive simply because it’s moving. This is also a fair robotic analog to animal/human reflexes. You don’t think about dodging something thrown at you, you simply do it. But situational awareness and mission awareness are things that could well be ceded to some higher-level element.
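One way to picture that split, as a hedged sketch with all names invented: the reflex check lives in the device and can veto motion on its own, while situational and mission awareness are deferred to a higher-level controller.

```python
class HomeController:
    """Hypothetical higher-level element holding situational/mission awareness."""
    def next_goal(self) -> str:
        return "kitchen"  # e.g., derived from a map and the current mission

class RobotDevice:
    """In-device M3 logic: reflexes are local and can veto motion."""
    def __init__(self, controller: HomeController):
        self.controller = controller  # higher-level element (home controller/cloud)

    def reflex_check(self, proximity_cm: float) -> bool:
        # Local and immediate: never depends on a network round trip.
        return proximity_cm > 10.0

    def step(self, proximity_cm: float) -> str:
        if not self.reflex_check(proximity_cm):
            return "stop/turn (local reflex)"
        # Situational and mission awareness are ceded upward.
        goal = self.controller.next_goal()
        return f"move toward {goal}"

robot = RobotDevice(HomeController())
print(robot.step(proximity_cm=50.0))  # clear path: follow the mission
print(robot.step(proximity_cm=5.0))   # obstacle: reflex overrides
```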

Amazon, with Ring security devices, has some of that higher-level stuff already. It makes sense to assume that a smart robotic home would have a home controller, and that the controller would likely be an outgrowth of current smart-home-control elements. Not only would that leverage current technology, it would also facilitate cross-actions of the type we are already used to in home control. Your speakers can control your lights directly, or they can control a home controller. In either case, you’re using cloud-resident voice recognition and command generation.

One technology shift that seems critically important here is the transition from “ordinary” sensors that simply detect objects (ultrasonics/radar) to ones that can actually analyze a visual field and interpret it. Amazon already does this on its Ring devices, which can pick out humans with a fairly high level of reliability, and Google Lens can identify a broad range of things. My presumption is that Amazon will likely advance robotics in this area first, adding video analysis to robotic devices by using augmented Ring technology. That would let a robot map a space in three dimensions, identify pets and people, and even recognize specific people.

The big question is less one of technology direction than of near-term application. Moving from a robot vacuum to anything else is a major step. To expand cleaning duties meaningfully you’d need to give a robot the ability to move up and down at least a few stairs. To do dusting or fetch something, the robot would have to be a lot taller than a Roomba. Any of this means that the consumer price would surely rise significantly, which would reduce the addressable market.

Of course, Amazon may decide to take robots more in a business direction. Its own warehouses and shipping would be a fairly logical place to use enhanced robotics, and they could leverage their own experience into sales to other businesses, and eventually move things over to the consumer space when costs could be kept manageable. But what is the value of acquiring Roomba then?

The big barrier to all of this is safety. Remember the chess-playing robot that broke a boy’s finger? It’s doubtful that Amazon or any US company would release a product that could make that mistake, but one thing the incident shows is that quick movement by a person, particularly one that doesn’t fit an expected pattern, could throw off robot intelligence. The old Three Laws of Robotics may be logical and seem to cover all the bases, but they presuppose robots with almost-or-completely-human intelligence, which early devices surely will not have.

The “robot” in the chess match was really an industrial robotic arm, and behind it was an AI/computer-based chess program. Some stories on the chess-robot incident linked the technology with the use of AI to “play” chess. The difference is that AI chess doesn’t attempt to link the computer directly to moving pieces on the board. When AI or robotic technology crosses over to manipulate things in the real world, there are going to be issues with how those manipulations can be prevented from damaging property or hurting people or animals. Teaching AI to play chess doesn’t teach it to safely move pieces around when the opponent is one of those disorderly biological systems we call a “human”.

It really goes back to M3 and the other two of our three elements. AI chess focuses on mission awareness. Mechanical manipulation and movement have to focus on the interplay between a robot element and the real world, and it’s situational awareness that has to provide the brake on how M3 and mission awareness combine. That’s what is going to have to be transformed if robots are going to do more than crawl around our floors, and Amazon may be signaling that it intends to help robots to their feet and integrate them into our lives. Otherwise, the deal makes little sense.

Will Cisco’s Big Organizational Change Help, or Hurt?

If we have a shape of things to come, maybe now we have an organization of things to come. That’s what Cisco is promoting, at least, with the announcement it would be combining the enterprise networking and mass infrastructure groups. The challenge here is first to decide if the decision was driven, as some say, by the departure of enterprise group EVP and GM Todd Nightingale. Others say that it might be the result of the long-standing competition between Cisco business units and a company drive to stamp it out. Or, just maybe, there’s some actual market driver.

Both enterprises and network operators I chat with have been generally suspicious of the move and cynical about the motivation. Both groups believe that their interests are better served by separating the groups, and both believe that the product requirements for the groups are diverging rather than converging. They also believe that divergence will accelerate in the future, so if buyer attitude is the measure of a move, then Cisco may have read the tea leaves wrong.

Jonathan Davidson was heading the mass infrastructure group, and will be the head of the yet-unnamed new group. He was EVP and GM at rival Juniper in the past, so he understands Cisco’s main competitor, and the mass infrastructure group has played the major role in development of Cisco’s network software and interface assets. Some within Cisco say that he’s seen at Cisco as the stronger of the two executives, which is why Nightingale is believed to have taken the CEO job at Fastly, a serverless edge platform play.

The way the two were “seen at Cisco” is a credible force behind the shift because Cisco has always been viewed as encouraging rivalry among its senior executives, a climate that most think John Chambers fostered and that has continued either with Robbins’ support or at least his acquiescence. A few also point out that Cisco execs have often gone off to start a tech company, to be later bought back in by Cisco, but this has involved startups and not companies already established. And whether Fastly is a worthy acquisition is another unknown we won’t get into here.

Despite skepticism and rivalries, though, there are some solid reasons for the change, and they lay out pretty much as Cisco has described them. Networking is changing, both in terms of available technology and in terms of the balance of technology between operators and users.

Over the last four decades, enterprise networking has undergone a radical shift, from networks that were essentially smaller-scale implementations of operator networking to networks that were based on virtual-private-network services. Enterprises used to deploy their own nodal routers, linking leased lines in a grid that topologically resembled the networks of full-scale telcos. Now, they build networks using access routers and VPN services; there are no trunks and nodes at all. That’s a pretty profound shift.

And, of course, you could argue that was a reason to keep enterprise and operator networking separate. Users are now edge consumers, and operators are node consumers. But there’s another ingredient in the story, which is that enterprises want not just “network services” but managed network services. In managed services, the provider supplies the edge technology as well as the network service. Part of the enterprise shift is due to the fact that enterprises’ business scope now spans multiple operator territories, and operators have been slow to provide reasonable federated multi-provider services.

Not every enterprise is going to deploy managed services, but if CSPs and MSPs are highly successful (which they are), then they become dominant buyers in the edge device space, which means that operators of some sort are dominant buyers in the WAN technology space overall. Score one for consolidation.

Now to the next point, the data center and switching. Over the last decade, enterprises have shifted their application strategies toward “appmod” or application modernization, which is really GUI modernization. The goal was to accommodate the use of a browser and smartphone apps as on-ramps to legacy business applications running in the data center. This accounts for nearly all enterprise cloud spending, and it has reduced the need to change data center apps, or to supply web/app enhancements that ran in the data center and thus required more data center spending. That’s slowed the growth of the data center.

Except, of course, for cloud data centers. Cloud providers are massive buyers of servers, and that means massive buyers of data center network equipment, the switches. Their volume purchases are driving the switch market, and that means that switches are at least equally influenced by the provider side. Score two for consolidation.

Then there’s software. All network devices these days are software-based in terms of functionality. All the big network vendors have their own operating systems that run on their boxes. All have management software that facilitates network operations. Many have other software products that relate to network hosting and feature delivery. Buyer influence on software tends to follow buyer spending on network equipment, which as I’ve noted has been shifting more to providers over time. Certainly it would be foolish for a vendor to think that their network software would fork into a distinct segment for enterprise and another for operators/providers. Score three for consolidation.

Three-zero for consolidation of enterprise and operator-centric network equipment business units? Can we then conclude that Robbins, perhaps having done a pilgrimage to some mystical glade in the mountains, has returned to Cisco with a vision of the True Future? Maybe, but he’s also returned with a kernel of an old bugaboo that could turn on him and on Cisco. It’s called marketing versus engineering.

Marketing and sales is all about inducing prospects to become customers, and customers to become better customers. The classic formula for success here is to give the customers what they want. Engineering is then supposed to create the product to match customer desires. Demand-side product management in action.

But engineering is all about doing cool stuff that advances technical state of the art. Once that’s been done, it’s up to sales/marketing to make the customers want it. Clear-eyed engineers know where technology will take the industry, and build it. They, the buyers, will come. Supply-side product management, in other words.

I’ve seen more companies killed by this marketing/engineering face-off than by any other organizational factor. By creating a single product team, Cisco eliminates a strong marketing/engineering tie because engineering is now serving two different marketing and sales teams. Yes, that’s been true before, but there was an element of market-sector division when we had separate marketing, sales, and product groups. Now one giant, powerful engineering team faces off against fragmented sales/marketing. It could be ugly.

Cisco has always been a sales-driven company, a “fast follower” rather than a tech leader. Could Robbins see the engineering consolidation as a means of creating tech leadership? Could he be failing to see the risk of a sales/marketing face-off of the kind often fatal to other firms in the past? We’ll see.

Any Interest in a Podcast?

A number of those who know me or have read my blog have asked me about doing a podcast. I’ve been looking into the concept, but the problem is that there’s a fair amount of work in prepping one, and also a cost to getting it hosted. I can’t justify that on the same free, no-sponsorship basis I use for my blog. My question to my LinkedIn friends is whether they think some fee or sponsorship basis for a podcast, done at least once per week on general issues in telecom and computing, would be of interest. Your comments are welcome!

Why Services are Different From Applications in Orchestration

I’ve talked about service modeling and lifecycle automation, and about the software that’s needed to make a modeled service effective. It’s time to look at another related issue, which is the issue of just what you’re orchestrating. While it’s popular to talk about “cloud-native” technology for network features, the cloud is an application hosting technology at its core, and networks and network features aren’t the same thing at all, even at the most fundamental level.

I’ve looked at a lot of user applications of cloud computing, containers, and Kubernetes, and one thing that strikes me is that the great majority of them are characterized by one word: unanimity. There is one resource pool, one administration, and one user domain. If you look at the world of network services, you find three distinct domains, the interplay of which frames the reality of any service deployment.

The first of these domains is the service domain. Where is the service to be offered? This question is usually easy to answer but that makes it easy to ignore. The scope of a service target sets the boundaries of the area where resources will necessarily be consumed. That frames the question of the way the other domains interact to set orchestration requirements.

The second domain is the administrative domain. Who controls the resources, meaning who has the right and ability to manipulate and allocate them? Where a service domain crosses multiple administrative domains, the service will require some inter-provider coordination or the use of a generally available resource like the Internet. Administrative domains can fragment resources, and when they do they fragment orchestration. Administrative domains are also themselves fragmented; in a service there’s a single “seller” and potentially multiple “partners”, with the former having service contract/SLA responsibility to the buyer.

The resource domain is the last, and the lowest, of our three domains. Resource domains can be divided by a number of factors in an abstract sense; “network” and “hosting” are examples. In the real world, the meaningful distinction is between control zones: resource collections that can be manipulated through a single API or management application. The big control-zone challenge is that network requirements and hosting requirements are interdependent; you have to connect a hosted element where it’s been placed, so changes in hosting change networking. Similarly, you have to replace a hosted element if the network connection to it is lost and can’t be recovered.

The reason these domains are important is that they impact the way that service lifecycle management can be automated through orchestration. That’s true both for technical reasons (the examples I’ve given above) and for reasons of sales/marketing.

From the very early days of networking, operators separated the “business” side of networking from the technical operations side. In a typical telco, the OSS/BSS systems form the basis for the business side and the NOC/NMS the basis for technical operations. This separation is reflected in the TMF’s vision as well, including the way that the TMF service model or SID divides things into “customer-facing” and “resource-facing” services, and calls the thing that gets sold a “product” rather than a “service”.

I’ve carried the separation through in my own work, separating my ExperiaSphere model into a service and resource domain, but the TMF move is to me the compelling one. The TMF represents the OSS/BSS community, which for operators is the CIO organization. Network operations falls under another unit completely, and so blurring the boundaries between those two domains would mean somehow unifying behavior of groups that have been separated since Bell talked with Watson.

The administrative domain then looms as a perhaps-decisive area to consider. A number of questions emerge from that consideration, some of which relate to whether changing how we define the domain could impact the effectiveness of cross-domain orchestration.

Fundamentally, administrative domains are areas where inter-provider federation is required. This federation could consist of network-to-network interfaces (NNIs), the practice followed in the ’70s with the X.25 and X.75 packet-switching standards. It could also include the kind of service-element contributions envisioned by both the IPsphere Forum and the TMF SDF work fifteen years ago. This sort of thing deals with situations where the “administrations” involved are truly independent organizations, so it would also presumably cover any relationships with fully separate subsidiaries, where regulatory mandates require independent operation.

The obvious question would be whether to combine the administrative interconnect and federation requirements with the control-zone separation mandated in the resource layer because of differences in technology or operations responsibility. If you need to exercise management through a different API channel, how much of a difference does it make whether the two APIs are within the same company jurisdiction?

One difference is the need for “cloaking” of infrastructure. Operators are happy to federate service components or entire services with partners, but not to cede those partners visibility into their infrastructure, or direct control over any part of that infrastructure. I’ve seen this point made consistently for decades, and there’s no reason to believe it’s going away.

Another difference is the extent to which the resources represented by a federated or separated API are abstracted. It’s often true that when an operator shares a truly federated service element with another, that element is represented only functionally and not at the implementation level. In short, where service elements cross administrative domain boundaries, the thing that crosses is almost a service-domain element. However, it has a direct resource relationship, because it exists only to represent a “foreign” resource set. In my ExperiaSphere work, I reflected this by treating a federated/contributed element as the “boundary” object of the resource domain.

I think that may be the optimum way of dealing with those three domains; services and resources become the two operative domains, with the administrative domain represented by a special class of service object.
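A sketch of what that object structure might look like, with every class and name invented here for illustration: the service domain is a tree of service objects, resource bindings form the boundary into local control zones, and a federated contribution is a special service-class object that wraps a partner’s cloaked resources.

```python
class ServiceObject:
    """A node in the service domain; decomposes into children."""
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def deploy(self):
        for child in self.children:
            child.deploy()

class ResourceBinding(ServiceObject):
    """Leaf: the boundary into a local control zone (network or hosting API)."""
    def __init__(self, name, control_zone):
        super().__init__(name)
        self.control_zone = control_zone

    def deploy(self):
        print(f"{self.name}: commit via local control zone '{self.control_zone}'")

class FederatedElement(ServiceObject):
    """Special service-class object representing a partner's contribution.
    The partner's infrastructure stays cloaked; we see only the function."""
    def __init__(self, name, partner):
        super().__init__(name)
        self.partner = partner

    def deploy(self):
        print(f"{self.name}: request function from partner '{self.partner}' (no visibility inside)")

service = ServiceObject("Global_VPN", [            # hypothetical service
    ResourceBinding("Domestic_Core", "mpls-controller"),
    FederatedElement("EU_Segment", "partner-operator-A"),
])
service.deploy()
```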

What happens with resource domain orchestration is the next question. The reason this is important is that the obvious division within the resource domain is the one between hosting and network. If this division of resources means a division in orchestration, then it would be necessary to define a higher orchestration level to envelop both. If we want to avoid this, we need to somehow limit the need for orchestration in one or both of those divisions, and in any other divisions that might arise. There are two ways to do that.

Way number one is to abstract the resource management process by defining specific “services” that are available. We do this today with network connectivity based on presumptive Internet/VPN availability. In a sense, what happens is that the resource division is managed independently of the service domain, with the goal of “exporting” features that can be consumed functionally. This converts our resource divisions into what look like the federated features we used to subduct the administrative domain into the resource domain.

The second way is to incorporate awareness of and control over one resource division in the orchestration capabilities of the other. This approach is easiest to see in connection with hosting’s control over networking. Kubernetes provides for specification of a virtual network technology. If Kubernetes’ networking capabilities were expanded to include enough of the specific function-oriented requirements of networking (address assignment, URL decoding, connection/tunnel creation, etc.) then it would be possible to make connectivity subordinate to hosting. It’s not clear to me at this point whether this approach is practical, though Nephio might explore it as part of their mission.
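This is not Kubernetes code, just a toy illustration of the idea under discussion: a hosting request carries connectivity intent, and the hosting-side orchestration drives the network attachment as a subordinate step of placement. All names and parameters are hypothetical.

```python
# Toy illustration (not Kubernetes itself): the hosting orchestrator drives
# the network control zone as a subordinate step of component placement.

def place_component(component: str, host_pool: list[str]) -> str:
    # Simplistic placement: pick the first available host.
    return host_pool[0]

def connect_component(component: str, host: str, network_intent: dict) -> None:
    # In a real system this would call a network controller API;
    # here we only show that networking follows the hosting decision.
    print(f"attach {component}@{host} to '{network_intent['network']}' "
          f"with address from pool '{network_intent['address_pool']}'")

def deploy(component: str, spec: dict) -> None:
    host = place_component(component, spec["hosts"])
    print(f"hosted {component} on {host}")
    connect_component(component, host, spec["network_intent"])

deploy("session-border-function", {                     # hypothetical function
    "hosts": ["edge-node-1", "edge-node-2"],
    "network_intent": {"network": "user-plane-vnet", "address_pool": "10.10.0.0/24"},
})
```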

I remain convinced that some sort of explicit service/resource model is essential, though. The relationship between hosting and network, OTT and on-net, and multiple standards initiatives and industry groups, are all very dynamic, and most of the initiatives are focused on lower-level orchestration and automation. That leaves a lot of variability and complexity that lives above any new tools and capabilities. I think we’ll eventually need to deal with it, and that it would be smart to think matters over now.

Where Does the MEF’s LSO Model Fit in Lifecycle Automation?

One of my favorite authors on networking in the last century (it sure feels odd writing that!) was Tanenbaum, and I loved the quote in “Computer Networks” where he said that the wonderful thing about standards is that there are so many to choose from. We could say that about standards bodies these days, and one such body is the MEF. Like all organizations, the MEF has evolved to stay alive, and its current iteration offers some guidance on service orchestration.

Originally, the MEF was the “Metro Ethernet Forum” but today it may be best known for its Lifecycle Service Orchestration (LSO) work. LSO is one of those sprawling standards initiatives that, like those of the TMF, are difficult to absorb for any who haven’t lived with them from the first. The current publicity on the work focuses on the LSO Sonata APIs, so let’s start our exploration there.

APIs are ways of exposing functionality through a generalized interface rather than through a human-driven portal or GUI. Obviously, an API set is only as useful as the functionality it exposes, and APIs inherit any limitations that exist in the underlying software; if there are issues with how a given function works within a piece of software or a software system, the APIs will inherit those issues. Like any implementation, APIs can also create their own unique issues if they’re badly designed.

This “functionality” stuff is both the strength and the limitation of LSO Sonata. The APIs it defines are clearly linked to important business functions associated with network operator/service provider operations. That fact means first that there’s a strong incentive to use them, and second that the broad importance of the functionality almost assures that most operations software will be able to map functionality to those APIs. However, the limitation of the exposed functionality derives from its association with high-level OSS/BSS functions. What happens below, particularly at the network level with FCAPS, is out of scope.

Operators seem to like LSO Sonata for the same reason as enterprises like APIs exposed by their core applications; they facilitate the use of cloud front-end technology to broaden the categories of users that can be supported and tune the user experience to each. You need something like LSO Sonata to create OSS/BSS cloud front-ends of the kind that already drive enterprise use of the cloud. For this mission, there’s not much doubt that LSO Sonata is a good fit.

This API mission also fits in the pattern of opex optimization that the operators have pursued for the last decade. A network, like any cooperative system, can be operationally inefficient for two broad reasons. First, the means whereby operations teams exercise service lifecycle management can be sub-optimal. Second, the underlying network technology may itself be operationally deficient; the stuff is just hard to manage. Since capital equipment inertia limits the rate at which technology changes can be introduced, operators have focused on the first reason for inefficiency.

The problem is that the benefits that can be accrued by optimizing the exercise of management rather than optimizing the manageability of the network have largely been wrung out at this point. In addition, operators are finding that not all operations processes can be turned over to users by creating a portal. Some are too complex, too risky to network stability overall. Not only that, things like function hosting and even white-box switching have introduced new levels of technical complexity. All of this is becoming critical for operators, and none of it is within the scope of LSO Sonata.

The MEF does address some lower-level standards, though. One notable example is their “Lean NFV” model, and it’s notable for a positive contribution and also for a negative connotation. We’ll look at both, but start with the latter for reasons soon to be obvious.

NFV, or Network Functions Virtualization, is an initiative launched to replace physical devices with virtual functions hosted on commercial off-the-shelf servers. I’ve blogged about NFV many times, so I won’t go into all the issues I believe came up in association with it. The specific issue here relating to Lean NFV is that much of the focus of the NFV ISG within ETSI ended up on something other than what drove its launch: the “universal CPE” or uCPE model. This proposed an on-premises device that would host edge functionality by loading it with various edge-related features (like a firewall).

To me, that concept was weak from the first. NFV is a high-complexity software structure, and it’s hard to justify using it to support software feature control in an edge device. The other problem is that the uCPE approach puts the whole NFV notion into direct competition with the vendors who would have to provide the virtual-function software in the first place. If you’re a firewall vendor who offers an appliance, why would you want to decompose functionality to let it run in a white box? Given this, I’d argue that focusing a standard on uCPE was a bad idea.

The positive contribution Lean NFV makes is the decomposition of the NFV management elements (specifically the VNFM, VIM, and NFVO) into “microcontrollers”, which I take to be microservices, and the centralization of parameters in a key-value store (KVS) repository. If you are in fact going to do uCPE virtual functions, then you need a standard way of managing them, including their parameters. I also think that all those NFV functions should have been microservices all along, and I believe that’s what the “microcontroller” concept introduced in Lean NFV means.
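To make the KVS idea concrete, here’s a minimal sketch; it is not the Lean NFV specification, and every key and controller name is invented. It simply illustrates microservice-style controllers coordinating through a shared key-value store rather than through a monolithic management stack.

```python
# Minimal sketch of coordination through a key-value store (KVS).
# Keys and controller names are invented; this is not the Lean NFV spec.
kvs: dict[str, str] = {}

def vnf_register(vnf_id: str, params: dict) -> None:
    """A function (or its manager) publishes its parameters to the KVS."""
    for key, value in params.items():
        kvs[f"{vnf_id}/{key}"] = value

def lifecycle_controller() -> None:
    """A 'microcontroller' watches the KVS and reacts to parameter state."""
    for key, value in kvs.items():
        vnf_id, param = key.split("/", 1)
        if param == "state" and value == "needs-config":
            print(f"configuring {vnf_id} from KVS parameters")
            kvs[key] = "configured"

vnf_register("edge-firewall-1", {"state": "needs-config", "mgmt_ip": "192.0.2.10"})
lifecycle_controller()
print(kvs)
```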

The problem in Lean NFV is that while it creates a single API to manage a uCPE deployment (MEF Presto), that API is again a very high-level element, and almost everything in detail below it, including how the microcontrollers are structured, is out of scope. The effect is to harmonize the approach with LSO Sonata, to make uCPE visible almost to the service level, but to do so by abstracting away all the details.

You can’t hide complexity within an abstraction, you can only transfer responsibility for it away to another party, one who may be even less capable of resolving it. In this case, as in other cases we’re seeing in the whole service modeling and orchestration debate, there are just too many of these other parties who’d have to cooperate to make things whole. Even then, we’d be faced with perhaps a dozen different implementations of the same thing, and new layers of orchestration that add complexity.

Bottom-up strategies are always appealing for developers and standards-writers because they start with what we know. That’s not the question; we should be thinking about where they end up and not where they start. All too many times, we never get to the top of the matter, or we get there by skipping over the middle. I’m afraid that’s what the MEF has done.

5G Winners and Losers: What Differentiates Them?

Every new technology creates winners and losers, and 5G is no exception. Light Reading talks about this, specifically in terms of mobile operators, but I think we need to look a bit harder at this issue. 5G is important to operators and vendors alike, after all, and it’s also important to use 5G as an example of how a hotly promoted technology does in the real world these days.

The LR article talks about Verizon and T-Mobile as the exemplars of losing and winning, respectively. The story seems to link the Verizon problem with the slowdown the mobile industry faced after a good 2021. The implication, necessarily, is that T-Mobile somehow avoided that problem, and to me that raises the question of why that should be. After all, Verizon is a big telco with a great home territory and an opportunity to create symbiosis between wireline and wireless services. There has to be more to it.

One obvious truth is that wireless has been, for decades, more profitable than wireline, and T-Mobile is a wireless operator where Verizon is both. While Verizon’s territory has higher demand density than rival AT&T’s, and is in fact comparable to that of EU telcos, the fact remains that return on wireline infrastructure is under considerable pressure. And, unlike AT&T, Verizon hasn’t been on the leading edge of technology modernization for its broadband services overall. It’s my view that this combination has limited the value of Verizon’s dual-model broadband market position.

Another point is that Verizon has tried harder than perhaps any other operator in the US to promote the notion that 5G services are differentiable based on speed. Their push for 5G as something that would matter a lot to consumers set them up to exploit the early 5G hype, which peaked in 2021, and they enjoyed a nice pop based on that exploitation. However, in the world of tech media, every technology is first hyped to the sky and then faces a stark choice. You either have to redefine it so it looks like it’s met its hype-set goals, or you have to turn on it. 5G suffered the latter fate in late 2021 into this year, and so Verizon was vulnerable to the shift. T-Mobile never pushed speed that way, it simply said it had 5G, and that kept it out of the artificial 2021 upside and the real 2022 downside of 5G in the media.

Both T-Mobile and Verizon have 5G home Internet options, and it’s hard to say which of the two is doing better based on released financial data, but the stories I get suggest that T-Mobile is ahead in this space, and for sure they have a broader coverage map (estimated 120 million homes) versus Verizon (20 million homes). The broader availability of T-Mobile helps them justify a more aggressive ad campaign nationally, which of course then helps them sustain their coverage lead. However, Verizon’s home Internet is almost twice as fast, based on actual user experiences. T-Mobile also lacks any wireline broadband option that would compete with their 5G home Internet service, something that may also make their ad campaign more aggressive.

One question this raises is whether a strong 5G-to-the-home option could be the best answer for an operator who wants both mobile and fixed broadband. This question is particularly important for smaller countries and those that depend on tourism. Should they consider true wireline, meaning fiber connectivity, for homes and businesses, or should they go bold and try to do everything with 5G, including millimeter-wave technology for home and business? That move could save them a lot of money.

Another question that may be more pressing for the Tier One operators is what this might mean for business 5G services and 5G Core, including network slicing. Many operators (including Verizon) have hyped the notion of IoT applications of 5G, meaning sensors connected via 5G. That strategy hasn't gotten broad traction (or, frankly, much traction) because the great majority of IoT uses fixed installations for its devices, and WiFi, a purpose-built protocol like Z-Wave or Zigbee, or even wiring will serve at least as well and cost less. Network slicing and private 5G have also been pushed to the business community, with highly publicized but very minimal success. In fact, my contacts tell me that the majority of private 5G actually going in is simply a modernization of private LTE.

Anyone who looked realistically at 5G from the first (as I’ve tried to do) would conclude that it was going to succeed as an orderly evolution of wireless, which is what it really was. There was never any good chance that it would open new markets in the near term, meaning that new stuff wouldn’t drive 5G adoption and that operators couldn’t expect to earn new 5G revenues. The Verizon/T-Mobile comparison in the Light Reading article, to me, demonstrates that operators who didn’t depend much on the hype did better in the long run.

The interesting thing is that there almost certainly are new applications that would require or at least benefit from 5G, and that these applications could boost 5G operator revenues. Why aren’t we seeing anything about this? Two reasons.

First, the media process is always driven by the insatiable desire for clicks and ad eyeballs. Bulls**t has no inertia, unlike real markets, so there's a tendency for the media to jump way out in front of any tech trend because it's a new path to those desired clicks and eyeballs. The slant taken early on is usually the one that's easy and sensational, which is rarely the case with real-world stuff. Thus, when the right application does come along, it sounds pedestrian compared to the hype, so it's not news.

Second, networking is still trying to get over the Problem of the Twentieth Century, which was that we had more stuff to deliver than we had effective delivery mechanisms. It's not that the industry is still trying to solve that problem (we've largely solved it), but that it's still behaving as though the problem existed. When the network was the limiting factor, network technology unlocked a lot of pent-up demand and never had to give a single thought to how to develop an application. Now it does, and the industry still clings to that Field of Dreams mentality: build it and they will come.

These two factors are why hype is destructive to value, and they are both operating in some form or another in pretty much all of tech. We live in an age of ecosystems but we think and plan like we’re in an age of products. No vendor, no operator, can hope to succeed on a large scale without a technology advance on a fairly broad front, but few if any can get their heads and wallets around such an advance. Tech needs to look forward enough, and broadly enough, to secure its own optimum future.

Where is Meta Taking the Metaverse Now?

Meta’s quarter missed across the board. This is its second quarter of issues, and its stock has been declining steadily, to the point where it’s lost about half its value. Obviously this isn’t a good thing for Meta, but the big question is what it might mean to the OTT space, the metaverse, and the tech markets overall.

One essential truth here is that social media may be social, but society is fickle. The whole social media thing is about being in touch with pop culture, which changes rapidly. Not only are the users gadflies, but any successful social platform also has a community of users who are happy to complain about things they don't like, and those complaints become the seed for new platforms that fix the issues. We've had social-media failures before (remember Second Life?) and we'll continue to have them, because that's the nature of social media.

Regulators have no love for social media either. Meta's efforts to use its market capital to buy up players have met with regulatory scorn; the FTC has just sued to block Meta's acquisition of the maker of the Supernatural fitness app. So think about it: your own space is doomed to social death, and you can't use your current gains to buy into adjacent areas. Not a happy picture.

Meta was smart in that it realized this, which is why it jumped so aggressively on the metaverse concept. The problem for Meta is that it's essentially an all-or-nothing bet on a concept that's going to take considerable time, investment, and luck to bring to maturity. Meanwhile, to avoid Street condemnation, Meta has to tell the world what it intends, which means that others (like Microsoft) are free to jump out and do their own thing in competition. At the same time, social media is changing as it always does, and not to Meta's benefit.

How did Meta let things come to this? I think that, like most companies, they've had their eyes on their feet instead of the horizon. To be fair, the regulatory shift that Sarbanes-Oxley represented moved companies' focus from the longer term to this quarter, which sure makes watching your feet look smart. The problem is that this view not only disguised the risk something like COVID represented, it also disguised what a recovery from that risk would mean.

Facebook is a more immersive form of social media than something like TikTok, which Meta admits is a major near-term threat. Meta introduced Reels on Instagram to compete more directly, but if you think about it, they should have been planning an evolution as soon as COVID hit. People sitting at home under quarantine conditions use social media one way, but those same people use it another way when they're back out in the world, in what's been called "relief" behavior.

This too shall pass away, as the old saying goes. Facebook succeeded largely because it created a trend, and now it’s in a position where responding to others’ initiatives is critical. By the time Facebook makes Reels a successful TikTok competitor, what will social media look like? Just a quarter ago, we might have said “the metaverse”, and that still might be true, but the problem is that short-term Wall Street pressure is now causing Meta to short-change its metaverse.

The current Meta advertising on the metaverse focuses not on a social experience but on a personal one. Meta's vision of the metaverse has always depended on virtual reality, which means that its Reality Labs unit (where the metaverse lives) has necessarily been looking at how to make a metaverse look better. A social metaverse needs to look compelling, but it also has to be realistic in the way that avatars representing people can interact. That, as I've noted in past blogs, demands low latency in processing a collective vision of reality in which the collected users (via their avatars) can live and move. Otherwise, avatars will "lag" and any interaction will be jarring rather than representative of real personal interaction. The problem, obviously, is that improving latency enough to make interaction realistic means things like edge computing, meshing of edge sites, low-latency access, and so forth. All of these things are possible, and many might be within Meta's ability to drive forward, but not when its profits are under near-term pressure.
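To make the latency point concrete, here's a minimal back-of-the-envelope sketch in Python. Every figure in it is my own assumption for illustration (a roughly 20 ms motion-to-photon budget that VR developers often cite, fiber propagation of about 5 microseconds per kilometer, and notional per-hop and processing delays), not a measurement of any real deployment; the point is simply that distance to the hosting point eats the budget quickly.

# Illustrative latency-budget sketch; every figure here is an assumption, not a measurement.

FRAME_BUDGET_MS = 20.0  # motion-to-photon budget often cited for comfortable VR interaction

def round_trip_ms(distance_km, hops=6, per_hop_ms=0.5, processing_ms=5.0):
    # Propagation in fiber is roughly 0.005 ms per km, counted in both directions,
    # plus queuing/serialization at each hop and server-side processing of the shared scene.
    propagation = 2 * distance_km * 0.005
    queuing = 2 * hops * per_hop_ms
    return propagation + queuing + processing_ms

for label, km in [("distant regional cloud (~1,500 km)", 1500),
                  ("metro edge site (~50 km)", 50)]:
    rtt = round_trip_ms(km)
    verdict = "fits within" if rtt <= FRAME_BUDGET_MS else "blows through"
    print(f"{label}: ~{rtt:.1f} ms round trip, {verdict} a {FRAME_BUDGET_MS:.0f} ms budget")

Under these assumed numbers, a distant cloud round trip lands in the mid-20s of milliseconds while a metro edge site comes in around 12 ms, which is the arithmetic behind the argument that a realistic social metaverse pushes you toward edge hosting and low-latency meshing.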

A metaverse where kids can learn about dinosaurs, the rain forest, or endangered species is surely helpful from an educational perspective. One where surgeons can train in virtual reality to hone their skills is helpful in improving surgical outcomes. Is either a big money-maker? That's the challenge here, and to face it Meta is making the metaverse into a kind of super video game platform with evolutionary capability.

Many, if not most, of the metaverse applications Meta now seems to be promoting could be delivered via a gaming platform, if it were augmented with high-quality virtual-reality capability. Microsoft's vaunted counter to Meta's metaverse move was the acquisition of Activision Blizzard, which of course is a gaming company. That raises the question of whether Meta can meet any short-term metaverse goals without a gaming franchise of its own to leverage. If not, then the questions are how fast the "evolutionary" capability could be delivered, and what form it could take. The two are clearly related.

Evolving the metaverse to the vision first laid out, the "social metaverse" Meta implied with its announced restrictions on the personal space of avatars, would require edge computing and low-latency meshing of the edge locations, or confining users to locales where low latency could be assured. That would be not only an evolution of the metaverse as it seems to be constituted today, but an evolution in social media, one favoring the creation of virtual communities that mirror users' ability to interact regularly.

Among young people, most social-media interactions are with others they know in the real world, and usually see regularly. The obvious question then is whether a “metaverse” for those people is even interesting. Remember, this is the group that’s made TikTok a success, and they tend to use it for short interactions when they’re either physically apart or want to establish a kind of private back-channel to a shared real-world activity. Think people at a party chatting about another attendee.

This raises two critical questions about Meta's future. First, is its current challenge due to the fact that social media is evolving away from the longer-term immersion that Facebook represented? If it is, then the metaverse is the wrong answer. Second, does Meta already know this, and is it now trying to repurpose its metaverse initiative to fit a different market niche?

It also raises a question for the metaverse community, especially the startups. If the metaverse is just a super-game, then what real opportunities does it open? Startups, because of their VC financing, are notorious for wanting a quick buck, and how many VCs who backed the metaverse model will believe that can be achieved now? Worse yet, how many of those VCs would have backed a startup whose payoff depended on cracking the gaming space? Add to this the fact that some reports are saying that the current situation with tech VCs is similar to a stock market crash, and you have some funding risks to consider.

The biggest risk, of course, is that a big shift by Meta will be broadly interpreted as an indicator that the metaverse is failing as a concept. The truth is that we've not moved far enough along in laying out the ecosystem the metaverse will necessarily become to understand even what will make it up. We don't know what technologies will be critical, or what the ROI on various metaverse applications will be. The real danger is that we may delay answering those questions, and thus delay the realization of what I think will prove to be an important, even critical, concept.