Is Fiber in Our Future After All?

What the heck is going on with fiber to the home? There seem to be a lot of announcements about new fiber deployments. Isn’t there a problem with fiber deployment costs in many areas? Are we headed to universal fiber broadband after all? The answers to these questions relate to the collision between demand and competitive forces on one hand, and hard deployment economics on the other. We’re surely going to see changes, but they’re going to make the broadband picture more complicated, not more homogeneous.

Demand shifts, when they happen, change everything, and streaming video is such a shift. Consumers have long been substituting recorded video for live viewing, and as companies like Amazon, Hulu, and Netflix started offering streaming programming from video libraries, many have discovered that the broader range of material gives them more interesting things to watch. We have more streaming sources than ever today, and companies who had relied on live, linear TV programming are finding that they’re struggling to maintain customers in that space, while their broadband Internet services are expanding.

It’s really streaming that changes the broadband game, and the fiber game. Streaming 4K video works well for most households at 100Mbps, better at 200Mbps for large families, and those speeds are beyond what can be delivered reasonably over traditional copper-loop technology. Cable companies, whose CATV cable has much higher capacity to deliver broadband, are already pressuring the telcos in the broadband space, and fiber to the home (FTTH) is the traditional answer.

FTTH, even in its passive-optical-network form, has a much higher “pass cost” than CATV cable (today, roughly $460 per household for PON versus roughly $185 for CATV), and both these technologies are best suited for urban/suburban deployment. Telcos like Verizon, with a high concentration of urban/suburban users and thus a high “demand density”, have countered cable companies effectively with fiber. Where demand densities are lower (as is the case for AT&T), there’s a risk that going after urban/suburban users alone would offend a large swath of rural customers, and even create regulatory risk. AT&T, staying with our example, has lagged Verizon in fiber deployment, though they’ve been catching up recently.

Streaming video demand, and competition between CATV and fiber, have been increasing telco tolerance for higher fiber pass costs, which in any case have fallen by roughly $150 since the early days of Verizon’s FiOS. The big problem with both CATV and fiber is the need to trench and connect the physical media. You need rights of way, and you need crews with equipment, who have to be very cautious not to cut wires or pipes in crowded suburban rights of way and easements.

Another factor in the fiber game is that suburbs are growing, which is gradually making the suburban areas more opportunity-dense, and improving the return on fiber. It’s a bit too early to take the assertions that COVID will drive a major relocation push seriously, but it is possible that people who accepted city life for its benefits are now more likely to rethink its risks. WFH isn’t going to empty cities, but it may swell suburbs enough to shift the economies of fiber a bit…if only a bit.

The game-changer, potentially, is 5G in general, and millimeter-wave in particular. Feed a node with fiber (yes, FTTN), use 5G to reach out through the air a mile or so to reach users, and you can achieve a low pass cost. Just how low depends on a lot of factors, including topology, but some operator planners tell me that a good average number would be $205 per household. Something like this could deliver high-speed broadband to rural communities not easily served by any broadband alternative. Not only that, the technology could be used by competitors to the wireline incumbent; all you need is a single fiber feed to the town.

“Predatory millimeter-wave” may, ironically, be the decisive technology in fiber deployment. In the first place, it’s a direct consumer of fiber in areas where FTTH is simply never going to happen because of low demand density. Second, it’s the only realistic model for competitive home broadband in areas where a competitor would have to be established from scratch. Finally, it’s good enough to support streaming services, but not as good as fiber, so it’s going to tip the scales on pass-cost tolerance further, encouraging telcos to deploy fiber in suburban areas previously considered marginal.

There is also a chance that the FTTN deployment associated with millimeter-wave 5G will help expand FTTH. Clever planning could encourage symbiosis between the two fiber models. You could create a PON deployment, some sites of which were FTTN nodes rather than homes/businesses, and reach out from the edges of FTTH for a further mile or so. You could also selectively feed PON from a traditional fiber-fed node as well as supporting millimeter-wave.

Where millimeter-wave broadband could really shine is in conjunction with the utilities. Anyone who runs wires or pipes can pull fiber, and many have done that. The question for many who have is how to leverage it effectively, and the 5G/FTTN hybrid could do the trick. Some of my broadband data modeling worked on what I called “access efficiency”, which relates to how dense the rights of way are. In many rural areas, access efficiency is such that a majority of the rural population could be reached from a utility right of way. If we imagined a kind of “linear PON”, with fiber feeding 5G-mm-wave nodes along a right of way, any site within a mile or so of the right-of-way could be reached. If the cost of one of those nodes were reasonable, it could be a better option than fiber-to-the-curb, which then relies on coaxial cable (and MoCA) for the household connection.
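Just to put some (purely hypothetical) numbers on the linear-PON idea, here’s a little back-of-envelope sketch. Every input below (route length, node spacing, node cost, fiber cost per mile, household density, radio reach) is an assumption I’ve made up for illustration, not operator data, so treat the output as a way of thinking about the economics rather than a forecast.

```python
# Rough, illustrative estimate of a "linear PON" along a utility right-of-way:
# fiber feeds 5G mm-wave nodes spaced along the route, each covering roughly a
# one-mile radius. Every number below is an assumption for illustration only.

def linear_pon_estimate(route_miles, node_spacing_miles, node_cost,
                        fiber_cost_per_mile, households_per_sq_mile, reach_miles):
    nodes = int(route_miles / node_spacing_miles) + 1
    # A corridor 2 * reach wide along the route approximates the covered area.
    covered_sq_miles = route_miles * 2 * reach_miles
    households_passed = covered_sq_miles * households_per_sq_mile
    total_cost = nodes * node_cost + route_miles * fiber_cost_per_mile
    return households_passed, total_cost / households_passed

# Hypothetical rural example: 20-mile route, nodes every 2 miles, $40K per node,
# $30K/mile for fiber, 40 households per square mile, 1-mile radio reach.
passed, pass_cost = linear_pon_estimate(20, 2, 40_000, 30_000, 40, 1)
print(f"households passed: {passed:.0f}, pass cost per household: ${pass_cost:.0f}")
```

The point of the exercise isn’t the specific result, it’s that the pass cost falls out of node spacing and household density along the corridor, which is why access efficiency matters so much.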

I think we’re going to see some fiber expansion. We’ll also surely see more people predicting universal fiber, but in the US at least I don’t think there’s any realistic chance of that. In fact, too much focus on impractical fiber strategies could end up hurting technologies that will actually boost fiber deployment overall. 5G/FTTN may not satisfy the fiber-to-everyone school, but it would radically improve broadband access, and increase fiber deployment too. That should be our goal.

Will even 5G/FTTN create universal gigabit broadband in the US and other countries with similar demand density variations? Not likely, but that’s more a public policy question than a technology question. While we’re answering it, people who live in sparsely populated areas are likely to find themselves with fewer, and poorer, broadband choices.

What Would an Innovative Telco Opportunity Look Like?

OK, if telecom innovation is dead, as both the Financial Times and my recent blog agreed it was, what would “innovation” have to look like to restore things? Can public cloud providers play a role in helping telcos innovate, or would the reliance on them just create another form of “disintermediation”? Let’s look at these questions, re-making some past points with some fresh insights, and making some new ones.

People have been telling me for decades that the telcos are motivated by competition and not by opportunity. Given the nature of their business, which has enormous capital costs and relatively low return on capital, most competitive threats emerge from other telcos. At least, if one considers “competition” to mean competing with telcos’ current businesses. This simple point may help explain why telcos have been so reluctant to branch out to better and more innovative services; they don’t value the opportunity those services might generate, and they fear a new host of competitors.

After posting the blog I reference above, I got some pushback from people who believed that telcos could innovate by reselling what are effectively OTT forms of communications services. I don’t believe the revenue opportunities for telcos here are sufficient to put even a bump in their profit-per-bit decline, but that’s not the only problem with the notion. Being a reseller demands all the new skills required for retail service promotion, and telcos’ reluctance to compete in the OTT space is partly due to their desire to avoid proactive selling at the retail level, when they’re used to simply taking orders.

Some OTT position is essential, given that all the credible opportunities that have emerged or seem likely to emerge are related not to connectivity but to how connectivity can be used to deliver new services. In short, the connectivity that’s the product of the telcos is simply a growth medium supporting what could (inelegantly, to be sure) be called OTT bacteria. The telcos are not only reluctant to embrace OTT services; there are positive reasons they can’t. The biggest such reason is that you can’t compete with OTT players when your own competing services have to help cover connectivity-service profit shortfalls that those OTTs don’t have to worry about.

The solution to this problem seems clear to me. If your original product (connectivity) has commoditized, you need a new product. If you don’t want to field an OTT offering, or don’t believe it would work for the reason I just cited, then you need to field another wholesale offering, something beyond connectivity, where you can establish a viable wholesale business before disintermediation takes hold. What? I think there’s only one answer, and that is information.

OTT services are typically experiences of some sort, and experiences are valuable to the extent that they’re personalized. You can’t just shove video at people, you have to give them something they want to watch. Weather isn’t interesting if it’s for somewhere far from where you are or where you might be heading. These are simple examples, but they prove that people want stuff in the context of their lives, which is why I’ve called the sum of the stuff that personalizes experiences contextual services.

There are a lot of things that create contextual services, the most obvious these days being tracking web activity. Ads are personalized by drawing on what you buy and what you search for or visit. This sort of contextual service isn’t easily available to telcos because it relies on information gathered from OTT relationships, the very area telcos don’t want to get into. Telcos need a kind of personalization that OTTs don’t already have an inside track with, and the best place to find it might be location and location-related information.

Anyone who watches crime TV knows that mobile operators can pin down your location from cell tower access, not perhaps to a specific point in space but to a general area. They know when you move and when you’re largely stationary, and they could know when you are following a regular pattern of movement and when you’re not. Best of all, all the stuff they know about you can be combined with what they know about others, and what they know about the static elements of your environment, like buildings, shops, sports venues, and so forth.

Obviously, using data from other people poses privacy issues, but there are plenty of examples of services based on even individual locations already, and an opt-in approach would be acceptable by prevailing industry standards. When we generalize to communities of people, there’s no issue. For example, mass movement of mobile users associated with something like a venue letting out would signal a likely change in traffic conditions on egress routes. That movement can be tracked through the users’ cell changes, something telcos already manage.

Is a shopping area crowded, or relatively open? Are there a lot of people in the park? Those are questions that analysis of current mobile network data could answer, though obviously you’d need to aggregate information from multiple providers. That’s OK, because what we’re looking for, recall, is a wholesale information offering. Telcos could offer an API through which retail providers could gather their information, and aggregate it as needed with information from other telcos.
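To make the wholesale-API notion a bit more concrete, here’s a rough sketch of what an aggregated “area crowding” feed might look like. The data shapes, area names, thresholds, and baseline figures are all hypothetical; a real service would aggregate anonymized data across operators and expose it through a managed, access-controlled API rather than a local function call.

```python
# Hypothetical sketch of a wholesale "area crowding" service built from
# anonymized, aggregated cell-attach counts. Names, thresholds, and the data
# source are illustrative; a real offering would aggregate across operators.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class CellSample:
    cell_id: str
    area: str            # e.g., "riverside-park", "main-street-retail"
    attached_count: int  # devices attached, already anonymized/aggregated

def crowding_by_area(samples: List[CellSample],
                     baseline: Dict[str, int]) -> Dict[str, dict]:
    """Roll cell-level counts up to named areas and compare to a typical baseline."""
    totals: Dict[str, int] = {}
    for s in samples:
        totals[s.area] = totals.get(s.area, 0) + s.attached_count
    report = {}
    for area, count in totals.items():
        ratio = count / max(baseline.get(area, count), 1)
        report[area] = {
            "devices": count,
            "vs_typical": round(ratio, 2),
            "status": "crowded" if ratio > 1.5 else "normal" if ratio > 0.5 else "quiet",
        }
    return report

# A retail provider would consume this through a wholesale API; here we just
# invoke the aggregation directly with made-up numbers.
samples = [CellSample("c1", "riverside-park", 420),
           CellSample("c2", "riverside-park", 310),
           CellSample("c3", "main-street-retail", 150)]
print(crowding_by_area(samples, baseline={"riverside-park": 400,
                                          "main-street-retail": 300}))
```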

They could also aggregate it with IoT data. One of the big problems with the classic IoT public-sensor model is that nobody has any real incentive to deploy public sensors; there’s no ROI because their use isn’t confined to the organization who paid for deployment. Suppose that IoT sensors were deployed by telcos, and that rather than their being public, they were information resources for the telcos to use in constructing other wholesale contextual services. Whatever we believe IoT could tell us could be converted into a service that digests the sensor data and presents it as information, which is what it is.

There’s more that could be done with all this data. Recall from my previous blogs that many advanced IoT opportunities involve creating a “digital twin” of a real-world system. The “real world” here could be the real world in the local sense of a given user. We already have sophisticated mapping and imaging software that can construct a 3D reality of places, and that could be used to provide augmented-reality views of surroundings, with place labels and even advertising notices appended.

The notion of a real-world digital twin is the ultimate piece of the contextual-service puzzle. A telco could obtain, or even contract for, the basic software needed, and of course there’s likely no organization more used to large capital expenditures than a telco, so they could also fund a significant IoT deployment to feed sensor-based services. The sale of wholesale services obtained from these systems could then be a revenue kicker, and it would still allow telcos to stay out of the retail application business for as long as it took for them to gain comfort with that space, which might never happen, of course.

I don’t think it’s likely that telcos will do this sort of thing. I think public cloud providers might well consider it, and might form relationships with telcos to obtain information from mobile software to feed the information services and the real-world digital twin. It’s hard to imagine this kind of thinking from a telco who hasn’t been able to come to grips with cloud-hosting even elements of critical 5G infrastructure, and is seeking public cloud providers instead.

I think this is an example of an opportunity that telcos will be so slow to recognize that they’ll end up losing it to another player. Likely, to their own public cloud “partners”. Instead, they’ll try to convince themselves that some different form of connection services is innovation. They’ll succeed in the convincing part, but what they’ll get won’t be innovative at all, it will be another form of telco business as usual.

Telecom’s Innovation Failure: How Did We Come to This?

Why has a Financial Times article suggested that “Telecoms innovation talk may be nothing but hot air”? It’s probably more important to wonder whether it’s true, of course. Many pundits (myself included) have talked about this issue before, and even some telecom executives have offered their own views. I recently recovered a hoard of my past email exchanges, including those relating to some very significant telecom initiatives that turned out to be nothing. Some real-world experiences may help focus this long-standing topic on something useful, or at least uncover why nothing seems to be useful at all.

My first experience with the issue of telecom innovation came way back in the 1980s. The Modified Final Judgment had broken up the old Bell System, separating telephony in the US into long-distance and “Regional Bell Operating Companies” or RBOCs. The RBOCs established their own collective research arm, Bell Communications Research or Bellcore, and I was at a meeting there, doing some consulting on a project. It was a large and diverse group, and in an introduction, the sponsor of the project made a comment on development of what today would likely have been called an “advanced” or “OTT” service.

At this point, one of the attendees whispered something, and to my surprise, three-quarters of those present got up and walked out. The whisperer, a lawyer for one of the RBOCs, paused and said “I’m sorry, but this discussion is collusion under the new regulations.” That was the end of the meeting, the end of the project, and the end of free and open discussion of advanced service topics.

About three years later, another Bellcore project came up, relating to the creation of specialized communications services for the financial industry. The project had two phases, one relating to fact-finding research into how the industry used communications, and one that would then identify service targets of opportunity. In the meetings relating to the first phase, the project manager made it clear that “communications services” couldn’t touch on service features above the network. I’d been involved with Electronic Data Interchange (EDI) projects at this point, and I wondered if the use of a structured “data language” like EDI could serve. No, I was told, that’s the wrong level. It has to be about connection and not data or it’s an “advanced service” and has to be done via a fully separate subsidiary.

Fast forward to the decade of the 2000s, and I was a sort-of-de-facto-CTO of the IPsphere Forum (IPSF). This body was created to build what would be the first model of a composed, multi-element, multi-stakeholder, service architecture, with “services” made up of “elements”. The group was making great progress, with full support from Tier One operators globally, but one day representatives from the TMF showed up. Some EU Tier One operators had been told by their lawyers that they couldn’t be in the IPSF any longer, because it wasn’t “open” enough to be acceptable under EU telecom regulations. The IPSF had to be absorbed by the TMF, a body that was sufficiently open. It was absorbed, but there was no real IPSF progress after that.

When the Network Functions Virtualization (NFV) initiative came along, it was created as an “Industry Specification Group” or ISG under ETSI. It had significant industry support, much as the IPSF had, but the telecom members seemed content to advance the body’s work through a series of PoCs (Proof of Concepts), and these were put together by vendors.

I’d fought to have the IPSF model be based from the first on prevailing cloud standards, with no new “standards” at all. The telecom people believed their application was unique, demanding extraordinary (“five-nines”) reliability, for example, so they rejected that approach. The NFV ISG is now trying to retrofit cloud-native to NFV, of course.

Cutting through all of this is a truth I came upon by surveying telecom players for four decades. There’s a very stylized way of selling to the telcos. You engage their “Science and Technology” group, headed by the CTO, and they launch a long process of lab trials, field trials, etc. Of all the lab trials launched, only 15% ever result in any meaningful purchase of technology.

So what do my own experiences seem to say about telecom innovation? Let’s review.

First, regulations have hamstrung innovation from the first. At the dawn of “deregulation” and “privatization” in telecom, when innovation was most critical, the regulatory bodies threw every obstacle they could into the game, with the goal of ensuring that the formerly regulated monopolies couldn’t benefit from their past status. They succeeded, and the industry didn’t benefit either.

Second, people who run companies with their hands tied can only learn to punt. Regulations prohibited innovation, so regulations created a culture that innovators would flee, and what was left was a management cadre who accepted and thrived on legacy. The companies who gave us some of the most significant inventions of modern times became just “buyers”, and buyers don’t design products; vendors they buy from do that. Telecom innovation was ceded to vendors, and those vendors represented legacy services and not innovation.

Third, lack of understanding of cloud technology, created by ceding innovation to vendors, hamstrung any telco attempts to define innovative services or exploit innovative technologies. The first two points above have made telcos a less-than-promising place for innovative cloud-native technologists to seek a career. There are plenty of vendors who will pay more, so it’s those vendors (new to the telecom space) employing the innovative cloud-native technologists who would have to promote new technology ideas to the telcos.

Finally, the innovative vendors don’t see the telecom space as their primary market, and some don’t see that space as a market opportunity at all. A telecom sales cycle is, on average, four times as long as the sales cycle to a cloud provider, and two to three times as long as to an enterprise. Telecom sales, 85% of the time, don’t result in any significant revenue for the seller, only some pilot spending. Add to this the fact that there are few (if any) people inside the telco who would understand the innovative, cloud-native thinking required, and you have little or no incentive to push new ideas into the telcos from the vendor side.

Increasingly, telecom is mired in legacy-think, because telcos are staffed with legacy thinkers in order to maximize profit per bit from services whose long-term potential is nothing other than slow decline. Their hope of redemption by the vendors who support and profit from that myopia is way less than “speculative”. The fact is that neither group is going to innovate in the space. That means that the future of network innovation is the cloud, and that’s going to establish a whole new world order of vendors and providers.

More Action in the SD-WAN Space

The SD-WAN space has been percolating for years, and there are some recent signs that it may finally be sticking its head out beyond the old extend-the-VPN-to-small-sites mission. One thing that seems to stand out in two announcements is a managed services connection. As always, though, there’s no shortage of competitive intrigue in the mix, and so the outcome of all of this is still a bit murky.

SD-WAN is the most recent virtual-networking strategy to emerge, but not the first. From the early days of virtualization and the cloud, virtual networks were the go-to strategy for sharing network and hosting infrastructure among multiple tenants, including organizations within a company whose information technology had to be kept separate for security/compliance reasons. Virtual networks create, in some way, an overlay on top of ordinary IP networks, and this overlay strategy allows for connectivity management at a highly refined level, without impacting or depending on features of the underlying IP transport. Connectivity is separated from transport, in short.

Enterprises have so far embraced virtual networking for more mundane reasons. Some have adopted it in their data centers to improve security, and most recently many have used SD-WAN to extend corporate VPNs to locations where MPLS is either not available or not economical. The refined connectivity control that virtual networking and SD-WANs could offer wasn’t the user priority, which is why most SD-WAN products did, and still do, little to enhance connectivity control value-add.

Not all SD-WAN products have been so myopic. When Juniper acquired 128 Technology, they acquired what I’ve said from the first was the best overall SD-WAN and virtual-networking strategy available. The biggest selling point for 128 Technology was and is its session-aware handling of traffic, which means that it can identify user-to-application relationships and assign traffic priorities and even connection permission based on that. Integration of 128T’s SD-WAN with Juniper’s AI platform (Marvis and Mist) provides improved operations automation, and the combination of the two offers both network operators and managed service providers (CSPs and MSPs) a strong basis for a managed SD-WAN offering.

Managed services are quietly getting hot, and especially managed SD-WAN. Part of the reason is that enterprises are generally having issues with acquiring and maintaining the skilled network operations professionals they need, given competition from cloud and network providers and equipment vendors. Another part is that in the small-site locations where SD-WAN is the only VPN option, local technical skills are likely very limited, and a managed service is the only realistic solution. My data shows that CSP and MSP managed SD-WAN services are the largest source of new SD-WAN sites.

Then there’s the whole SASE thing. When you start talking about a service edge device and security in the same breath, it’s hard not to think of SD-WAN and what security it might bring to the table. The session-awareness approach can offer zero-touch security by letting users define what sessions are allowed, and barring everything else. Prioritization for QoE is a natural feature of SASE too.
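To illustrate the zero-touch idea, here’s a minimal sketch of default-deny, session-allow-list admission. It’s the generic concept only, with made-up roles and applications; it is not a representation of 128 Technology’s (or any other vendor’s) actual implementation.

```python
# Minimal sketch of session-aware, default-deny access control: a session is
# permitted only if its (user-role, application) pair appears in an explicit
# allow-list; everything else is blocked. Concept illustration only.
from typing import NamedTuple, Set, Tuple

class Session(NamedTuple):
    user_role: str      # e.g., "branch-teller"
    application: str    # e.g., "core-banking"
    priority: str       # QoE class to apply if the session is admitted

ALLOWED: Set[Tuple[str, str]] = {
    ("branch-teller", "core-banking"),
    ("branch-teller", "email"),
    ("guest-wifi", "internet"),
}

def admit(session: Session) -> bool:
    """Zero-touch policy: admit only explicitly allowed user/application pairs."""
    return (session.user_role, session.application) in ALLOWED

for s in [Session("branch-teller", "core-banking", "high"),
          Session("guest-wifi", "core-banking", "low")]:
    print(s.user_role, "->", s.application,
          "admitted" if admit(s) else "blocked")
```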

Finally, the cloud. The largest source of cloud applications for enterprises is the front-ending of core business applications in the data center with friendly web-centric portals. This is how most companies have provided for user access to their data, including online ordering, and it’s also increasingly how companies want to support worker access, particularly given that many workers now want to use smartphones, tablets, Chromebooks, and other stuff in their work-from-home mode of operation. If the cloud is where these new application on-ramps are located, then workers need to get to the cloud rather than directly access the data center. Even before the Juniper acquisition, 128 Technology had a strong hybrid and multi-cloud story, as good as or better than the rest of the players.

If all of this stuff is important, and if Juniper has acquired an SD-WAN player that can do all of it and well, then it would be truly surprising if there weren’t competitive counterpunches in process, and we had two of them recently.

Extreme Networks is acquiring the Ipanema SD-WAN division of Infovista, with the apparent goal of bringing its network offerings up to address competition in the managed SD-WAN space. Extreme already has its ExtremeCloud managed service portfolio, but the Ipanema SD-WAN has “application intelligence”, allowing it to make decisions on QoE based on the specific importance of applications using the network. It’s also able to support dynamic routing in hybrid and multi-cloud applications. Finally, it pushes a “cloud-native” implementation. I think it’s clear that Extreme will be enhancing the application-intelligence features to extend their utility in security and access control, moving them closer to feature parity with Juniper’s 128 Technology and Marvis/Mist capability.

Cisco sees the handwriting on the wall too, but they may be speaking a slightly different language. Recall that Cisco has its own Cisco Plus network-as-a-service and expensing-versus-capitalizing offering. Cisco also has the most strategic influence over enterprise network buyers of any vendor, in fact almost twice the influence of Juniper and five times that of Extreme. A decision to push SD-WAN features and enhancements through CSP/MSP channels would undermine their own as-a-service plans and reduce the impact of their enterprise account control. I think their positioning of their ThousandEyes deal shows that tension.

The SDxCentral story characterizes this as “Cisco’s WAN-on-Demand Strategy”, which sure sounds like an as-a-service strategy. Is it a coincidence that this is also how Cisco described Cisco Plus…as a NaaS? What the Cisco strategy with ThousandEyes does is improve visibility across all clouds and networks. The WAN-on-Demand stuff is really a Cisco initiative involving a bunch of cloud relationships for SD-WAN routing, for which ThousandEyes can provide visibility and on which Cisco’s management products can then act. It’s not directly comparable to the Juniper or Extreme session/application awareness stuff.

WAN-on-demand does raise the question of whether Cisco is subducting IP and IP networks, promoting an overlay SD-WAN approach to respond to the fact that the cloud is becoming the application front-end of choice, and thus the thing that customers, partners, and employees have to connect with. By promoting cloud interconnect, Cisco is promoting the use of cloud provider backbones, and if all IP networks are just the plumbing underneath a Cisco NaaS vision, then a Cisco managed service strategy for SD-WAN could become the go-to managed-service solution, or so Cisco hopes.

Cisco could obviously make their offerings available to CSPs and MSPs, and could promote a managed-service and SD-WAN vision competitive with other vendors. Or they may have decided that they want to get into the space on their own, and are starting to position the “NaaS” term as a placeholder for their own strategy, a way of avoiding saying that they’re going to offer managed SD-WAN and other network services, a statement that would surely raise the risk of channel conflict with CSPs and MSPs. We’ll have to watch how Cisco positions over the next few months, because service provider fall planning cycles are only roughly a month away.

How Can We Data-Model Commercial Terms, Settlements, and Optimization?

In the two earlier blogs in this series on as-a-service and runtime modeling, I looked at how the lifecycle modeling of services could facilitate the handling of service events and support lifecycle automation. I then looked at expanding the modeling to include the orchestration of event-driven applications, and at how the two models could be integrated to improve application and service utility.

Paying for services (like paying for anything) is a higher priority for the seller than for the buyer, but buyers are interested in accounting for what they’ve been asked to pay for. Sellers are interested in cost accounting, profit management, and integrating third-party elements into their own service offerings. There’s obviously a commercial dimension to be considered here.

Let’s start by recapping (quickly) the approach to unified runtime and lifecycle modeling I advocated in the last two blogs. I suggested that the runtime model and the lifecycle model be connected by creating a parallel lifecycle element that was linked to each runtime element by an event exchange that was confined to a change in the in-service state. The runtime element and the lifecycle element would each be bound up in their own models as well, so there would be two model structures, bound by in-service event exchanges at the level of the runtime elements. If an event-driven application had a dozen component elements, then there would be a dozen parallel lifecycle model elements.
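To make that recap concrete, here’s a small sketch of the parallel structure: one lifecycle element per runtime element, coupled only by an exchange of in-service events in either direction. The class, state, and event names are mine, purely for illustration, and the state/event handling is deliberately trivial.

```python
# Sketch of the parallel runtime/lifecycle structure described above: one
# lifecycle element per runtime element, coupled only by "in-service" events.
# Names and states are illustrative, not from any standard or product.

class LifecycleElement:
    def __init__(self, name):
        self.name = name
        self.state = "ordered"
        self.runtime_twin = None

    def handle_event(self, event):
        # A tiny state/event table; real lifecycle models would be far richer.
        if self.state == "ordered" and event == "deployed":
            self.state = "in-service"
            self.runtime_twin.on_lifecycle_event("in-service")   # the only coupling
        elif self.state == "in-service" and event == "fault":
            self.state = "degraded"
            self.runtime_twin.on_lifecycle_event("out-of-service")

class RuntimeElement:
    def __init__(self, name):
        self.name = name
        self.available = False
        self.lifecycle_twin = None

    def on_lifecycle_event(self, event):
        # The runtime side sees only in-service/out-of-service transitions.
        self.available = (event == "in-service")

    def report_fault(self):
        # Logic-level problems are reflected back as a lifecycle event.
        self.lifecycle_twin.handle_event("fault")

# A dozen-component application would be a dozen such pairs; here, just one.
rt = RuntimeElement("order-validator")
lc = LifecycleElement("order-validator-lifecycle")
rt.lifecycle_twin, lc.runtime_twin = lc, rt
lc.handle_event("deployed")
print(rt.available)   # True: the runtime element sees only the in-service change
```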

Commercial terms, in the TMF approach, are assigned to Customer-Facing Services (CFSs), which to me implies that a service element is always represented by a pairing of a CFS and a Resource-Facing Service (RFS), because obviously it’s the way a service is bound to resources that creates costs. This is logical in many ways, but to me it encourages rigid relationships between the CFS and RFS, and that might have negative consequences when it becomes necessary to replace a broken resource.

When I did my first, TMF-linked, ExperiaSphere project, I took a slightly different approach and suggested that the “resource domain” would advertise “behaviors” that were then bound at service setup to the “service domain”. This was to keep the service models from becoming specific to resources, something generally important but critical if some service components were to be acquired from third parties. Commercial terms, in my approach, were associated with the behaviors, and the dynamic binding could then consider commercial terms in selecting a resource set to fulfill service requirements.

If we could assume that commercial terms would always be associated with the committing of a service component to its corresponding service, we could make commercial terms management a part of lifecycle management, and it could be applied when the service lifecycle enters “in-service”. That would correspond to the way both the TMF and my first-generation ExperiaSphere worked, but it’s not a suitable solution to service elements that are usage-priced.

Usage pricing requires integration with the runtime event/information flows in order to determine usage. It would be highly undesirable to have usage information collected outside those flows, for performance and resource efficiency reasons, so we would have to assume that usage data would be collected by the runtime service element itself, or that it could be collected by counting events/messages directed to the element, as part of the workflow.

It seems to me that if it’s important to offer support for commercial-terms collection and reporting, it would be better to include the collection of the usage data in the “middleware” associated with event/message flows, rather than trying to write it into each of the processes in the flow. The latter allows for too much variation in approach, and redundant logic. The good news here, I think, is that we have a technology available to do the message statistics-gathering, and it’s already in use in related areas. It’s the Envoy sidecar.

In service mesh applications (including the top two, Istio and Linkerd) a sidecar is used to represent a process element in the mesh. Since the sidecar sees the traffic, it can count it, and in service mesh applications the results can be obtained from the service mesh element (Istio, Linkerd) if there is one. If there’s no current sidecar, then I think there’s ample reason to employ Envoy where it’s necessary to monitor usage and cost. We could obtain the data, and integrate it with overall commercial-terms management, using what I called in 2013 “derived operations”.

What derived operations says is that you obtain the status of something not by polling it directly, but by reading it from a database that’s populated by polling. This principle was considered by the IETF briefly (i2aex, or “infrastructure to application exposure”) as a means of providing application coupling to network MIBs, rather than having many applications polling the same MIB. If we assume that we have Envoy available for all our realtime processes, then we could say that either the lifecycle management or commercial terms data model included instructions to run a timer and collect the data (from Envoy or a service mesh) at a predetermined interval and use it to populate the data model.

If we assume that we have the “rate” price for a service element stored in either the commercial terms or lifecycle management data model, and the usage data is also stored there, we have all the information we need to calculate current costs while the service/application is active. If we are setting up the application, and if we can make selections of resources based on cost, we can use the “rate” prices to compare costs and pick what we like. Any references to the cost/price/usage would be made to the data model and not directly to the Envoy sidecar, service mesh, etc. That would decouple this from the specific implementation.
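Pulling the last few paragraphs together, here’s a sketch of the “derived operations” pattern applied to usage pricing: a timer copies message counts from the sidecar or mesh into the data model, and cost questions are answered from the model, never from the infrastructure directly. The poll_usage callable stands in for whatever the real query would be (an Envoy admin stats call or a mesh metrics API); the rate structure and field names are assumptions for illustration.

```python
# Sketch of "derived operations" applied to usage-priced elements: a timer-driven
# poller copies message counts from the sidecar/mesh into the service data model,
# and anything needing cost or usage reads the model, never the sidecar itself.
import threading
import time

class ElementModel:
    """Commercial-terms slice of a lifecycle model element (illustrative)."""
    def __init__(self, name, rate_per_1k_msgs):
        self.name = name
        self.rate_per_1k_msgs = rate_per_1k_msgs   # the "rate" price, stored in the model
        self.messages = 0                          # usage, populated by polling

    def accrued_cost(self):
        # Cost questions are answered from the model, not the live infrastructure.
        return (self.messages / 1000.0) * self.rate_per_1k_msgs

def start_usage_poller(model, poll_usage, interval_sec=60.0):
    """Periodically copy the sidecar's message count into the data model."""
    def tick():
        model.messages = poll_usage()
        t = threading.Timer(interval_sec, tick)
        t.daemon = True
        t.start()
    tick()

# Illustration with a fake usage source; a real one would query the sidecar/mesh.
counter = {"n": 0}
def fake_poll():
    counter["n"] += 2500
    return counter["n"]

element = ElementModel("video-transcoder", rate_per_1k_msgs=0.04)
start_usage_poller(element, fake_poll, interval_sec=0.1)
time.sleep(0.35)
print(f"{element.name}: {element.messages} msgs, cost ${element.accrued_cost():.2f}")
```

Because the rate and the usage both live in the model, the same data can be used during setup to compare candidate resources by cost, without touching the runtime components at all.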

The question is whether you need to have a separate commercial model or if the lifecycle model could serve both purposes. I’m of the view that adding another model dimension would be justified only if there was a clear benefit to be had, and I’m unable to see one so far. I’d determined in ExperiaSphere that the state/event tables should include a timer to be started before the indicated process was run, and that the timer event should then be activated if the timer expired. This also means that there should be a “stop-timer” indicator in the table, so that suitable events could stop the timer. That allows the timer to be used to activate periodic statistical polling and store the result in the data model, so there’s already a timer available to exploit for polling.
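Here’s what a state/event table entry with those timer indicators might look like; the field names are mine, loosely following the ExperiaSphere description above, not any standard, and the dispatcher that would actually arm and cancel timers is left out.

```python
# Sketch of a state/event table with the timer behavior described above: each
# entry names the process to run, an optional timer to start before running it,
# and whether the event cancels a pending timer. Field names are illustrative.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class StateEventEntry:
    process: Callable[[], None]               # process to invoke for (state, event)
    start_timer_sec: Optional[float] = None   # start this timer before running the process
    stop_timer: bool = False                  # this event cancels any pending timer

def activate_service():
    print("service activated")

def poll_statistics():
    print("polling Envoy/service-mesh statistics into the data model")

# The ("in-service", "timer") entry re-arms the timer, giving periodic polling.
STATE_EVENT_TABLE = {
    ("ordered", "deploy"):          StateEventEntry(activate_service, start_timer_sec=60.0),
    ("in-service", "timer"):        StateEventEntry(poll_statistics, start_timer_sec=60.0),
    ("in-service", "decommission"): StateEventEntry(lambda: print("tearing down"),
                                                    stop_timer=True),
}

entry = STATE_EVENT_TABLE[("in-service", "timer")]
entry.process()   # a dispatcher would also start/stop timers per the entry's flags
```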

Given all of this, it’s my view that collecting information associated with the commercial terms of a service/element, and creating price/cost data for either selection of components or for generation of bills or reconciliation, is best considered a function of the lifecycle model. Placed there, it matches the needs of those services that are not usage priced without any special accommodation; cost data is simply part of the model. For usage-priced services, the lifecycle model’s in-service state/event processes can include a timer that would, when it expired, indicate that the Envoy/service-mesh package should be polled for statistics, and they’d then also be stored in the lifecycle model.

All of this appears to be workable, but one big question is whether it’s valuable too. The service mesh technology developing for the cloud, combined with various deployment orchestration tools like Kubernetes, has the ability to manage deployment and lifecycle stages, including scaling, without a model. Edge computing might be supportable using just these tools, or it might not. It seems to me that it depends on what “edge computing” turns out to mean, and how 5G hosting might influence it. If edge services are created by composing “service atoms”, potentially from different sources and in different locations, then homogeneous development practices to exploit service mesh/orchestration tools will be more difficult to apply across the whole service, and the composing process will have to be modeled. If connection services and edge application services are combined, that same truth comes forward. We may not need a combined modeling strategy for networks and hosting everywhere, for everything, but we’re likely to need it in some places, and for some things. That makes it important to know how it could be done.

Where Could “Metro Transformation” Take Us?

We tend to think of transformation as a proactive process, meaning that some technology shift has the effect of transforming things. It’s a nice picture for the industry, and it’s easy to understand, but the biggest transformation in networking may be happening a different way, and we may be missing it.

From their earliest days, networks have been focused on traffic aggregation for efficiency. The “access” network feeds the “metro” network, which feeds the “core” network. This structure makes a lot of sense when you assume that network traffic is created by the sum of exchanges between network users, because it ensures connectivity and a measure of network efficiency at the same time.

The problem now is that we have multiple forces that are changing what networks are really being asked to do. One force is the explosive growth in video content, growth that’s surely going to not only continue but likely accelerate as a result of greater demand for streaming video. Another is the advent of 5G, which relies more on hosted features/functions than prior generations of mobile technology, and 5G is a potential driver for our last force, edge computing. To relate these forces of change to the changes they’re forcing, we need to look at each.

Content delivery changes network dynamics because the majority of content traffic is focused on a relatively limited repertoire of material, and because the traffic levels and QoE requirements associated with video content in particular challenge effectiveness over long-haul pathways. Content delivery networks (CDNs) have for decades provided local hosting of content to enhance performance. A CDN is actually less a “network” than a hosting environment, and “local” will generally mean “within a major metro area” for best performance/economy tradeoffs.

5G, and feature hosting in general, changes the network dynamic too. In the past, “features” of networks were the result of cooperative device behavior. With mobile networks, the “devices” tended to be concentrated in the major metro areas because mobile traffic largely originates there, and the features the devices represent related to the users of the network and the management of their mobility, largely metro functions. 5G also broadens the model of feature hosting, replacing the presumption of COTS servers as the host with a wider variety of devices, some placed where singular devices would be the only economical solution. A “pool” of resources then spans both metro and access networks.

Then there’s edge computing. As a model for hosting, edge computing and 5G’s notion of a pool of resources would coincide, with edge computing embracing even the inclusion of both privately owned resources and “cloud” resources in the same pool, something that 5G operator relationships with public cloud providers also encourages. What’s interesting about edge computing is that both the CDN mission and the 5G/network-feature-hosting missions are potentially subsets of edge computing, meaning that the edge is a unifier in at least some sense.

CDNs have largely insulated the IP core from massive traffic growth associated with content delivery. 5G and edge computing seem to insulate the core from feature-related, cloud-related, future missions as well. All those things now can be expected to live primarily within the metro area, and the “metro network” then becomes much more important, to the point where it’s fair to say that the future of networking overall is the future of metro networking.

The biggest piece of that metro-network future is that the metro becomes a giant virtual data center, meaning that the primary purpose of the metro network will be to connect the pooled resources associated with the broad and unified edge mission. That means that some organized and standardized form of data center networking will be the foundation of metro networking, and that data center interconnect (DCI) will focus on creating as low a latency and as high a capacity as possible for DCI connections. There will obviously be “layers” to the virtual-data-center model, meaning that places with real server farms will be the best-connected, but other outlying locations will also be connected with sufficient capacity to ensure there’s no significant risk of congestion or added latency.

Operationally speaking, this demands a common model for lifecycle management, since any significant variability at the operations practices level would raise the risk of human error and delays in remediation that would, because of the growing focus on metro-hosted features and content, be a major service problem. Finally, there’s a question as to whether there needs to be a framework for building edge applications at the runtime level. Orchestration of independent components of software, particularly for event-driven applications, is a significant technology challenge, and how it’s done can reasonably be expected to impact the network requirements within the metro zone.

Competition is also certain to impact how metro networking evolves. On one hand, the metro is (as I’ve already noted) a giant virtual data center. The value of metro, and the whole reason why metro emerges as the critical opportunity point in all of networking, lies in what’s hosted rather than in how it’s connected. That would create a hosting bias to metro, and a data-center-interconnect bias to metro network planning. If we can build a data center network, could we not extend it via DCI, creating a giant fabric for metro? On the other hand, metro is also the inside of the access network, the major likely point of attack for hackers, and a place where Level 3 practices would surely scale better than Level 2. As a result, we could see a switch model of metro, a router model of metro, and perhaps some other models too.

One interesting possibility is the emergence of SDN as a unifying approach to metro. SDN is already more likely to be found in the data center than anywhere else, and SDN control of forwarding paths eliminates a lot of the issues of large L2 networks. Google, with Andromeda, has also demonstrated that SDN can be used “inside” a BGP emulator edge layer to create an IP core network, so it follows that SDN could be used to build a connectivity layer that could be purposed toward L2 or L3, or both.

Another possibility is that metro could be built out of some kind of extension to the “disaggregated” router/switch model that’s been popularized by DriveNets but is also supported by vendors like RtBrick. Obviously, a disaggregated virtual device is created by uniting multiple physical devices, and it might be possible to adapt that approach to the creation of metro-virtual devices united from multiple physical-device clusters.

The biggest change in metro, of course, is the fusion of network and hosting, which is what’s behind the “metro-as-a-giant-data-center” theme. That fusion not only changes the requirements and options for network-building, it changes the operational goals as well. It’s obvious that somebody who can’t watch video isn’t very interested in whether the CDN caching process is broken, whether the URL is redirecting wrong, or whether the network connection is bad. Edge computing, 5G feature hosting, IoT, and other stuff that will settle in the metro will have similar disregard for the exact problem source, and a big focus on optimizing the remediation. We may be creeping up on something totally new, a notion of “MetroOps”, and if we are, that notion could percolate into the cloud and the data center too.

Extending as-a-Service Modeling to Edge Event-Driven Applications

In the first part of this series, we looked at the possibility of modeling “as-a-service” offerings using a data model, with the goal of deciding whether a common approach to modeling all manner of services could be created. That could facilitate the development of a generalized way of handling edge computing applications, both in terms of lifecycle management and in terms of presenting services to users. It’s the need to represent the service in runtime, not just in lifecycle terms, that we’ll now explore.

We’ll kick this off by returning to software-as-a-service. SaaS has to present a “runtime” interface/API for access. The resource-oriented aaS offerings could be viewed as being related to lifecycle management, and what gets run on a resource-as-a-service offering of any sort is then responsible for making its runtime visible. SaaS has to present the “service” in the way that an application would present it, as a runtime interface. This is interesting and important because many of the 5G applications of edge computing would relate to lifecycle management, while edge computing overall is likely driven by IoT applications where models of an application/service would be modeling runtime execution.

Most applications run in or via as-a-service are represented by an IP address, an API. If the application is loaded by the user onto IaaS or PaaS or onto a container service, then the user’s loading of the application creates the API. In SaaS, the API is provided as the on-ramp, by the cloud provider. In either case, you can see that there is an “administrative” API that represents the platform for non-SaaS, and a runtime API that represents the application or service.

One complication to this can be seen in the “serverless” or function-as-a-service applications. These are almost always event-driven, and because the functions are stateless, there has to be a means of state control provided, which is another way of saying that event flows have to be orchestrated. In AWS Lambda, this is usually done via Step Functions, but as I’ve been saying in my blogs on event-driven systems, the general approach would be to use finite- or hierarchical-state-machine design. That same design approach was used in my ExperiaSphere project to manage events associated with service lifecycle automation.
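As a miniature of the FSM approach to orchestrating stateless functions, consider the sketch below. It isn’t Step Functions or any particular product; it’s just the generic pattern, with made-up states, events, and handlers, and with the orchestrator carrying the state the functions themselves don’t hold.

```python
# Miniature of finite-state-machine orchestration of stateless, event-driven
# functions: the functions keep no state of their own, and the orchestrator
# carries state and context between events. Everything here is illustrative.

def validate_order(ctx):   # a stateless "function": everything arrives in ctx
    ctx["valid"] = ctx.get("amount", 0) > 0
    return "validated" if ctx["valid"] else "rejected"

def charge_card(ctx):
    ctx["charged"] = True
    return "charged"

FSM = {
    # (current_state, event) -> (handler, next_state)
    ("new", "order_received"):   (validate_order, "validating"),
    ("validating", "validated"): (charge_card, "charging"),
    ("validating", "rejected"):  (None, "failed"),
    ("charging", "charged"):     (None, "complete"),
}

def dispatch(state, event, ctx):
    handler, next_state = FSM[(state, event)]
    follow_on_event = handler(ctx) if handler else None
    return next_state, follow_on_event

state, ctx, event = "new", {"amount": 25}, "order_received"
while event:
    state, event = dispatch(state, event, ctx)
print(state, ctx)   # "complete", with context carried by the orchestrator
```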

Given that we can use FSM/HSM for lifecycle automation and for the orchestration of event-driven applications, wouldn’t it be nice if we could somehow tie the processes together? There would seem to be three general ways that could be done.

The first way would be to simply extend our two-API model for SaaS, and say that the administrative API represents the exposed lifecycle automation HSM, and the service API the service HSM. We have two independent models, the juncture of which would be the state of the service from the lifecycle HSM reflected into the service HSM. That means that if the lifecycle HSM says that the service is in the “run” state, then the service HSM is available.

The second approach would be to say that there is a lifecycle HSM linked to each element of the service, each individual component. We’d have a service HSM whose elements were the actual software components being orchestrated, and each of those elements would have its own lifecycle HSM. Those HSMs could still be “reflected” upward to the administrative API so you’d have a service lifecycle view available. This would make the lifecycle state of each component available as an event to that component, and also let component logic conditions generate a lifecycle event, so that logic faults not related to the hosting/running of the components could be reflected in lifecycle automation.

The final approach would be to fully integrate the two HSM sets, so that a single HSM contained both service event flow orchestration events and lifecycle events. The FSM/HSM tables would be integrated, which means that either lifecycle automation or service orchestration could easily influence the other, which might be a benefit. The problem is that if this is to be in any way different from the second approach above, the tables would have to be tightly coupled between lifecycle and service, which would mean there would be a risk of having a “brittle” relationship, one that might require reconfiguration of FSM/HSM process identities and even events if there were a change in how the components of an event-driven service deployed.

Selecting an option here starts easy and then gets complicated (you’re probably not surprised!). The easy part is dismissing the third option as adding more complexity than advantage. The harder part is deciding between the first two, and I propose we assume that the first approach is simpler and more consistent with current practices, which wouldn’t mix runtime and lifecycle processes in any way. We should assume option one, then, unless we can define a good reason for option two.

The difference between our remaining options is the coupling between runtime behavior and lifecycle behavior, which means coupling between “logic conditions” detected by the actual application or service software and lifecycle decisions. Are there situations where such coupling is likely justified?

One such situation would be where load balancing and event steering decisions are made at runtime. Service meshes, load balancers, and so forth are all expected to act to optimize event pathways and component selection among available instances, including scaling. Those functions are also often part of lifecycle processes, where program logic doesn’t include the capabilities or where it’s more logical and efficient to view those things as arising from changes in resource behavior, visible to lifecycle processes.

This seems to be a valid use case for option two, but the idea would work only if the service mesh, API broker, load balancer, or whatever, had the ability to generate standard events into lifecycle management, or vice versa. You could argue that things like service meshes or load balancers should support event exchange with lifecycle processes because they’re a middleware layer that touches resource and application optimization, and it’s hard to separate that from responding to resource conditions that impact QoE or an SLA.

That point is likely to be more an argument for integration between lifecycle management and service-level meshing and load-balancing, than against our second option. One could make a very solid argument for requiring that any form of event communications or scalability needs to have some resource-level awareness. That doesn’t mean what I’ve characterized as “primitive” event awareness, because specific resource links of any sort create a brittle implementation. It might mean that we need to have a set of “standard” lifecycle events, and even lifecycle states, to allow lifecycle management to be integrated with service-layer behaviors.

That’s what I found with ExperiaSphere; it was possible to define both standard service/application element states and events outside the “ControlTalker” elements that actually controlled resources. Since those were defined in the resource layer anyway, they weren’t part of modeling lifecycle automation, only the way it mapped to specific management APIs. I think the case can be made for at least a form of option two, meaning a “connection” between service and lifecycle models at the service/application element level. The best approach seems to be borrowed from option one, though; the service layer can both report and receive an event that signals transition into or out of the “running” state, meaning an SLA violation or not. Each model’s refined state/event structure is hidden from the other, because they don’t need to know it.

In this structure, though, there has to be some dependable way of relating the two models, which clearly has to be at the service/application element level. These would be the lowest level of the runtime service, the components that make up the event flows. Logically, they could also be the lowest service-layer model elements, and so there would be both a service logic and a lifecycle model for these, linked to allow for the exchange of events described above. Service logic modeling of the event flows wouldn’t require any hierarchy; it’s describing flows of real events. The lifecycle model could bring these service-layer bottom elements back through a hierarchy that could represent, for example, functional grouping (access, etc.), then administrative ownership (Operator A, B, etc.), upward to the service.

If we thought of this graphically, we would see a box representing the components of a real-time, event-driven, application. The component, with its own internal state/event process, would have a parallel link to the lifecycle model, which would have lifecycle state/event processes. This same approach could be used to provide a “commercial” connection, for billing and journaling. That’s what we’ll talk about in the last blog in this series.

Extending Data-Modeled Services to Run-Time: Lessons from aaS Part 1

Abstraction in any form requires a form of modeling, a way of representing the not-real that allows it to be mapped to some resource reality and used as though it was. We have two very different but important abstraction goals in play today, one to support the automation of service and application lifecycles and the other to support the execution of applications built from distributed components. Since both of these end up being hosted in part at “the edge” it sure would be nice if we had some convergence of approach for the divergent missions. It may be that the “as-a-service” concept, which has elements of both missions already, can offer us some guidance, so we’ll explore modeling aaS here, in a two-blog series.

Everyone seems to love stuff-as-a-service, where “stuff” is anything from hardware to…well…anything else. As-a-service is an abstraction, a way of representing the important properties of something as though those properties were the “something” itself. When you buy infrastructure- or software-as-a-service, you get something that looks like infrastructure or software, but is actually the right to use the abstract thing as though it was real. For “aaS” to work, you have to be able to manage and use the abstraction in an appropriate way, which usually means in the way that you’d manage and use what the abstraction represents.

There could be multiple ways of doing that, but I think there’s a value in organizing how that would be done, and at the same time perhaps trying to converge the approach with modern intent-model concepts and with data-driven service management of the type the TMF has promoted. Automating service management, including applications management, is an event-driven process. Control-plane network exchanges are also event-driven, which means that most of what’s critical in 5G could be viewed through the lens of events. That’s a big chunk of the expected future of the cloud, distributed services, and telecom.

In the cloud, as-a-service means that the prefix term is offered just as the name suggests, meaning as a service. IaaS represents a hardware element, specifically a server, that can be real or virtual. SaaS represents an application, so while there is surely a provisioning or setup dimension to SaaS use, meaning a lifecycle dimension, the important interfaces are those that expose application functionality, which is the use of the service not the management of the service. PaaS is a set of tools or middleware elements added to basic hosting. The new container offerings are similar specializations.

Applied to hosting, most IaaS represents a virtual machine, which of course is supposed to look and act like a real server, or is it? Actually, IaaS is a step removed from both the real server and the VM. A real server is bare metal, and a virtual machine is a partitioning of bare metal, meaning you have to load an operating system and do some setup. IaaS from most cloud providers already has the OS loaded, and so what you’re really getting is a kind of API, the administrative logon interface to the OS that’s been preloaded.

To avoid using network service model terms, TMF terms, before we’ve validated they could work, I’m going to call the representation of an aaS relationship a “token”. So, suppose we start our adventure in generalizing aaS by saying that in as-a-service relationships, the service is represented primarily by an “administrative token” that includes a network URL through which the service is controlled. You can generalize this a bit further by saying that there’s an account token, to which various service tokens are linked.
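To make that concrete, here's a minimal sketch of what an account token and its linked service tokens might look like (my own illustration; the field names are assumptions, not any defined aaS API):

    from dataclasses import dataclass, field
    from typing import List

    # Hypothetical token structures; field names are illustrative assumptions.

    @dataclass
    class ServiceToken:
        service_type: str    # e.g. "IaaS", "SaaS", "NaaS"
        admin_url: str       # the network URL through which the service is controlled

    @dataclass
    class AccountToken:
        account_id: str
        services: List[ServiceToken] = field(default_factory=list)

    # One account, holding an administrative token for a single IaaS service.
    account = AccountToken("acct-001")
    account.services.append(ServiceToken("IaaS", "https://provider.example/vm-42/admin"))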

Suppose we had a true VM-as-a-service, with no preloaded OS? We could expect to have an administrative token that represented the VM’s “console”, or a configuration API through which the user could load the necessary OS and middleware. That would suggest another layer of hierarchy: a token representing the VM, and another representing the OS admin login.

From this, it appears that we could not only represent any resource-oriented IT-aaS through a series of connected-hierarchical tokens, but also maintain the relationship among the elements of services. We could, for example, envision a third layer of hierarchy to my VMaaS above, representing containers or serverless or even individual applications. Because of the hierarchy, we could also tie issues together across the mixture.

If we were to rehost a container in such a VMaaS configuration, we would “rehost” the token in the new token hierarchy where it now belonged. At the time of the rehosting, we could create a history of places that particular token had been, too. That could facilitate better analysis of performance or fault data down the line, and even be of some help in training machine learning or AI tools aimed at automating lifecycle management.
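Here's a sketch of how rehosting might move a token within the hierarchy while keeping its history (again, my own illustration and naming, not a defined model):

    from datetime import datetime, timezone

    # Hypothetical hierarchical token with rehosting history.
    class HierToken:
        def __init__(self, name, parent=None):
            self.name = name
            self.parent = parent
            self.children = []
            self.history = []                # (former parent, timestamp) pairs
            if parent:
                parent.children.append(self)

        def rehost(self, new_parent):
            # Record where the token was before moving it.
            if self.parent:
                self.history.append((self.parent.name, datetime.now(timezone.utc)))
                self.parent.children.remove(self)
            self.parent = new_parent
            new_parent.children.append(self)

    # A container token moved from one VMaaS token to another.
    vm_a = HierToken("vmaas-a")
    vm_b = HierToken("vmaas-b")
    container = HierToken("container-7", parent=vm_a)
    container.rehost(vm_b)    # history now records the stay on vmaas-a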

What we can take from all of this is that it would be perfectly possible to create a data model to describe, and interact with, those aaS offerings that represent resources. That’s likely because what you do with resources is create “services”, meaning runtime behaviors, and the resources themselves are manipulated to meet the service-level agreements (SLAs), express or implied. That means lifecycle management.

Could the modeling be extended to the runtime services themselves? Since aaS includes runtime services (SaaS, NaaS), we’d have to include runtime model capabilities just to accommodate current practices. Beyond that, edge computing applications like IoT are likely to generate services, projected through APIs, that represent common activities. Why should every IoT application do its own device management, or interpret the meaning of location-related events for itself?

In my early ExperiaSphere project, in the Alpha test in particular, I created a model that represented not only lifecycle behavior but runtime behavior. The application used a search engine API to do a video search, retrieved a URL, and then initiated a NaaS request for delivery. The NaaS request was simulated by a combination of a “dummy” request for a priority ISP connection (with Net Neutrality, of course, there could be no such thing) and a request to a device vendor’s management system for local handling. What the Alpha proved was that you could create a runtime in-use service model, and merge lifecycle behavior into it, providing that you had intent-modeled the service components and had control interfaces available to influence them.
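In very rough terms, the kind of flow the Alpha demonstrated might look like the outline below (the function and object names are hypothetical; this is not the Alpha code, just a sketch of the pattern):

    # Outline of a runtime service flow that also exercises control interfaces.
    # The three arguments are assumed client objects for intent-modeled elements.

    def deliver_video(search_api, naas, device_mgmt, query):
        # Runtime service logic: locate the content.
        results = search_api.video_search(query)
        content_url = results[0].url

        # Lifecycle/control side: influence delivery through control interfaces.
        naas.request_priority_path(content_url)      # the simulated "dummy" request
        device_mgmt.set_local_handling(content_url)  # device vendor management system

        return content_url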

Could that approach work for SaaS and NaaS overall? That’s what we’ll explore in the next blog on our aaS-modeling topic.

The Infrastructure Bill Kicks the Broadband Can…Again

While the bipartisan infrastructure bill isn’t law (or even finished) at this point, we do have some reports on its content. Light Reading offered their take, for example. I downloaded and reviewed the bill, so let’s take a look at it, what it seems to get right, and what it may have missed.

The goal of the broadband piece of the bill should be familiar; it’s all about closing the often-cited “digital divide” that dooms consumers in many rural areas to broadband Internet service capacities far below the national average. However, all legislation is politics in action, and lobbying in action too. What emerged from all the lobbying and politicking were two distinct positions on what broadband should be. One group favored what could be called a “future-proof” vision, where the goal was to provide capacity that’s actually more than most US households have, or even want. Another group favored a broadband model aligned with the technology options and practices of the current market.

Fiber proponents, of course, wanted to see gigabit symmetrical broadband, something that can be delivered over fiber but is problematic with nearly every other technology option. What this approach would have done is expand the digital divide into suburbs and metro areas rather than close it, because cable broadband wouldn’t fit the model. In addition, it would likely disqualify fixed and mobile wireless technology, which is the easiest of all our new options to deploy.

The operators themselves generally favored a more relaxed 100 Mbps download and 20 Mbps upload, which some public advocacy groups feared would make two-way video less useful for remote learning (something we needed and still need) and for remote work. Operators were also leery of mandated low-cost options and of terms that would keep them from cherry-picking high-value areas where revenue could be expected to be better.

The notion of universal fiber is simply not realistic, because fiber costs in areas with low demand density would be so high that only massive subsidies could induce anyone to provide the service. Thus, it’s a win for the practical political and technical realities that the standards were set at 100/20 Mbps, though I think that 100/35 would have been almost as achievable and would offer better future-proofing for remote work and education.

What’s a bit disappointing here is that there’s no specificity with regard to how the broadband speed is measured. As I’ve pointed out, simply measuring the speed the interface is clocked at doesn’t reflect the actual performance of the connection. A company could feed a node or shared cable with (for example) 10 Mbps of capacity, clock the interface at 100/20 Mbps, and appear to comply.
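The arithmetic behind that concern is simple; here's a tiny illustration using the hypothetical numbers above:

    # Illustrative only: clocked interface rate versus the capacity actually behind it.
    interface_clock_mbps = 100      # what the interface is clocked at
    shared_feed_mbps = 10           # capacity actually feeding the shared node
    active_households = 1           # assume a single active user, best case

    deliverable_mbps = min(interface_clock_mbps, shared_feed_mbps / active_households)
    print(f"Clocked: {interface_clock_mbps} Mbps, deliverable: {deliverable_mbps} Mbps")
    # An interface-clock test reports 100 Mbps and "complies"; a sustained
    # throughput test would report something closer to 10 Mbps.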

This would seem to admit mesh WiFi, a technology that was used over a decade ago and that almost universally failed to meet its objectives. The problem is that WiFi range is short, and mesh technology loads the WiFi cells closest to the actual Internet backhaul, so performance is very unlikely to meet remote work or school objectives, yet it could meet the simple interface-speed test.

There is a provision in the bill that states “The term ‘reliable broadband service’ means broadband service that meets performance criteria for service availability, adaptability to changing end-user requirements, length of serviceable life, or other criteria, other than upload and download speeds, as determined by the Assistant Secretary in coordination with the Commission.” Here, the “Commission” is the FCC and the Assistant Secretary is the Assistant Secretary of Commerce for Communications and Information. It would seem that details of service criteria could be provided later on, which could address the interface-versus-real-speed issues.

Mandated low-cost options are required, but it’s not completely clear how they’d work. The big question is whether the same service would have to be offered (100/20 Mbps) at a lower price point. If that’s the case, then the mandate could result in some major operators (telcos and cablecos) selling off areas where the cost mandates would likely apply, which could mean lower overall economies of infrastructure in those areas and a need for more subsidies. Hopefully some clarity will emerge in the rest of the bill’s process.

The “digital redlining” issue is similar in impact. Network operators typically try to manage “first cost”, meaning the cost of deploying infrastructure when revenues are unlikely to build quickly enough to cover it. That would favor using higher-revenue-potential areas to pull through some of the deeper infrastructure needed, infrastructure that lower-revenue areas could then leverage. One thing I think could be hurt by some interpretations of digital redlining is millimeter-wave 5G. You have to be able to feed the nodes where mm-wave originates through fiber, and early and effective node deployments could naturally favor areas where the “revenue per node” is the highest. Those early nodes could then be the source of fiber fan-outs to other nodes, offering service to other areas whose return on investment couldn’t otherwise justify connection.

The bill seems to ignore another issue regarding neighborhoods, one that I’ve seen crop up again and again. In many states, builders of residential subdivisions, condos, and apartments can cut exclusive deals with an ISP. The ISP “pre-wires” the facility for their broadband in return for the exclusivity. That means there’s no competition for broadband Internet in these complexes, and little incentive for the dominant ISP to improve services. Since ISPs seek these deals where they believe consumer spending on broadband is high, the same practice may result in no deals being offered in lower-income complexes, which would mean the neighborhood would have to be wired to offer service. That can create an effective form of redlining.

All of this is to be based on a significant data gathering and mapping of unserved (less than 25/3 Mbps) and underserved (less than 100/20 Mbps) broadband service areas. That process is likely to take time, and as with all legislation, the effect will depend on whether a change in administration results in a change in policy. Time may be important for another reason; there are funds for incentives and subsidies in the bill, but the majority would help initial deployments rather than cover costs down the line. The per-user subsidies aren’t funded for the long term, so there is no assurance that either incentives or subsidies will have a major impact beyond the next election.

As I said at the start of this blog, legislation is politics, and both the overall infrastructure bill and the broadband portion are examples of politics in action. Since the Telecom Act was passed about 25 years ago, I’ve recognized that Congress isn’t interested in getting the right answer, but rather the politically optimum answer. It’s easy to fob off details to the executive branch, particularly to a federal commission like the FCC, but the Telecom Act was mired in shifts in the party in power, legal fights, and changes in interpretation. The biggest problem with the broadband terms in the infrastructure bill is that they follow the path that’s failed in the past, and that I think is likely to fail again.

What Would “Success” for 5G Mean?

One of my recent blogs on 5G generated enough LinkedIn buzz to demonstrate that the question of 5G and hype is important, and that there are different interpretations of what constitutes 5G success. To me, that means I’ve not explained my position as well as I could have, which means I need to take a stab at the issue again, specifically addressing a key point of mine.

My basic position on issues relating to 5G (or any other technology) is that there is a major difference between what you can use a technology for and what justifies the technology. As I said in the referenced blog, there is not now, nor has there ever been, any realistic chance that 5G would not deploy. It’s a logical generational evolution of mobile network technology, designed to accommodate a growing and evolving market. In fact, one of the most important facts about 5G is that it will deploy, which means that having a connection to it offers vendors an inroad into something that’s budgeted, at a time when constraints on network operator spending are an ongoing problem for vendors.

The question with 5G, then, isn’t whether it will happen, but rather what will drive it, and how far the driver(s) will take it. Putting this in very simple terms, we have two polar positions we could cite. The first is that 5G is really nothing more than the evolution of LTE to 5G New Radio (NR), and that little or no real impact can be expected beyond the RAN. This is the “Non Stand-Alone” or NSA vision; 5G rides on what’s an evolved/expanded form of 4G Evolved Packet Core (EPC). The second is that 5G concepts, contained in 5G’s own Core, will end up transforming not only mobile networks but even wireline infrastructure, particularly the access/metro networks. Obviously, we could fall into either extreme position or something in between.

Where we end up on my scale of Radio-to-Everything-Impacted will depend not on what you could do with 5G, but on what incremental benefit to operator profits 5G could create. If 5G offered a lot of really new applications that would justify additional spending on 5G services, and in particular if operators could expect some of those new applications to be services they’d offer and get revenue from, then 5G gets pushed toward the “Everything” impact side of my scale. If 5G could offer a significant improvement in opex overall, then it would be pushed toward “Everything” as far as the scope of improvements justified. If neither happens, then 5G stays close to the “Radio” side of the scale, because there’s no ROI to move the needle.

If 5G does in fact end up meaning little more than a higher-capacity, faster RAN, it doesn’t mean that 5G Core would not deploy, but it would mean that the features of 5G Core that were actually used, and that could differentiate one 5G network (or vendor product) from another, would be of less value and less differentiating. In fact, they might not even be offered as part of a service at all, in which case there would be no chance the market could eventually figure out how to build applications/services that would move my needle toward the “Everything” end of the scale.

My view of the possible drivers to move 5G toward the “Everything” end of the scale has been that they relate to applications of 5G beyond calling, texting, and simple Internet access. That, to me, means that there has to be a set of service features that are valuable to users, deliverable to a community of devices, and profitable for the operators to deploy. I doubt that anyone believes that something that met these requirements could be anything but software-based, and so I believe that exploiting 5G means developing software. Software has to 1) run somewhere, and 2) leverage some easy (low-on-my-scale) property of 5G to exploit low-apple opportunities and get something going.

Software that’s designed to be edge-hosted seems to fit these criteria. One of 5G’s properties is lower latency at the radio-connection level, which is meaningful if you can pair it with low latency in connecting to the hosting point for the software, the edge. Further, 5G itself mandates function hosting, which means that it would presumably justify some deployment of edge hosting resources, and those might be exploitable for other 5G services/features/applications. However, that’s less likely to be true if the software architecture, the middleware if you like, deployed to support 5G hosting doesn’t work well for general feature hosting. 5G can drive its own edge, but it has to be designed to drive a general edge to really move my needle.
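To see why the pairing matters, consider a back-of-the-envelope latency budget (every number here is an illustrative assumption, not a measurement):

    # Hypothetical latency budget: the 5G radio gain only matters if the hosting
    # point is close. All numbers are illustrative assumptions.
    radio_latency_ms = 5          # assumed 5G NR air-interface latency
    edge_transport_ms = 3         # metro edge hosting, a few km of transport
    cloud_transport_ms = 40       # distant regional cloud data center

    print("Radio to edge host:", radio_latency_ms + edge_transport_ms, "ms")    # 8 ms
    print("Radio to cloud host:", radio_latency_ms + cloud_transport_ms, "ms")  # 45 ms
    # The radio improvement is swamped unless the hosting point sits at the edge.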

There’s been no shortage of missions cited as drivers of 5G. Autonomous vehicles are one; robots and robotic surgery are another. All of this reminds me of the old days of ISDN, when “medical imaging” was the killer app (that, it turns out, killed only itself). All these hypothetical 5G applications have two basic problems. First, they require a significant parallel deployment of technology besides 5G, and so have a very complicated business case. Second, it’s difficult to frame a business model for them in any quantity at all.

If anyone believes that self-driving cars would rely on a network-connected driving intelligence to avoid hitting pedestrians or each other, I’d gently suggest they disabuse themselves of that thought. Collision avoidance is an onboard function, as we have already seen, and it’s the low-latency piece of driving. What’s left for the network is more traffic management and route management, which could be handled as public cloud applications.

Robots and robotic surgery fit a similar model, in my view. The latency-critical piece of robotics would surely be onboarded to the robot, as it is today. Would robotic surgery, done by a surgeon distant from the patient, be easily accepted by patients, surgeons, and insurance companies? And even if it were, how many network-connected robotic surgeries would be needed to create a business case for a global network change?

Why have we focused on 5G “drivers” that have little or no objective chance of actually driving 5G anywhere? Part of it is that it’s hard to make news, and get clicks, with dry technology stories; something with user impact is much better. But why focus on user impacts that aren’t real? Partly because what could be real requires a big implementation task that ends up as another of those dry technology stories, and partly because applications that demand big implementation tasks can’t deliver quick impact, and vendors and operators want instant gratification.

How do we get out of this mess? Two possible routes exist. First, network operators could create new services, composing them from edge-hosted features, and target service areas that would be symbiotic with full 5G NR and Core. Second, edge-computing aspirants could frame a software model that would facilitate the development of these applications by OTTs.

The first option, the “carrier cloud” strategy, would be best financially for operators, but the recent relationships between operators and public cloud providers demonstrate that operators aren’t going to drive the bus themselves here. Whether it’s because of a lack of cloud skills or a desire to control “first cost” for carrier cloud, they’re not going to do it, right or wrong though the decision might be.

The second option is the only option by default, then, and it raises two of its own questions. The first is who does the heavy lifting on the software model, and the second is just what capabilities the model includes. The answers to the two questions may end up being tightly coupled.

If we go back to the Internet as an example of a technology revolution created by a new service, we see that until Tim Berners-Lee defined the HTML/HTTP combination that created the World Wide Web in 1990, we had nothing world-shaking. A toolkit opened an information service opportunity. Imagine what would have happened if every website and content source had had to invent its own architecture; we’d need a different client for everything we wanted to access. Unified tools are important.

Relevant tools are also important. Berners-Lee was solving a problem, not creating an abstract framework, and so his solution was relevant as soon as the problem was, which was immediately. The biggest problem with our habit of creating specious drivers for 5G is that it delays considering what real drivers might be, or at least what they might have in common.

Public cloud giants Amazon, Google, and Microsoft have a track record of building “middleware” in the form of web-service APIs, to support both specific application types (IoT) and generalized application requirements (event processing). So do software giants like IBM/Red Hat, Dell, VMware, HPE, and more. Arguably, the offerings of the cloud providers are better today, more cohesive, and of course “the edge” is almost certainly a special case of “the cloud”. There’s a better chance the cloud providers will win this point.

The thing that relates the two questions of “who” and “what” is the fact that we don’t have a solid answer to the “what”. I have proposed that the largest number of edge and/or 5G apps would fit what I call a contextual computing model. Contextual computing says that we have a general need to integrate services into real-world activity, meaning that applications have to model real-world systems and be aware of the context of things. I’ve called this a “digital twin” process. However, I don’t get to define the industry, only to suggest things that could perhaps define it. If we could get some definition of the basic framework of edge applications, we could create tools that took developers closer to the ultimate application missions with less work. Focus innovation on what can be done, not on the details of how to do it.
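As a sketch of what I mean by a digital-twin, contextual model (my own illustration; the class and attribute names are hypothetical, not part of any defined framework):

    from dataclasses import dataclass, field
    from typing import Dict, Tuple

    # Hypothetical digital-twin structure for contextual computing.
    @dataclass
    class DigitalTwin:
        """Software model of a real-world element, kept current by events."""
        twin_id: str
        location: Tuple[float, float] = (0.0, 0.0)
        readings: Dict[str, float] = field(default_factory=dict)

        def apply_event(self, event: dict):
            # Keep the twin synchronized with its real-world counterpart.
            if "location" in event:
                self.location = event["location"]
            self.readings.update(event.get("readings", {}))

    # Applications query the twin for context rather than parsing raw device events.
    truck = DigitalTwin("truck-17")
    truck.apply_event({"location": (40.74, -74.00), "readings": {"temp_c": 4.0}})

The design point is that the twin, not the raw event stream, is what application logic consults, which is exactly the kind of common capability a general edge toolkit could provide.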

And that’s the 5G story, IMHO. 5G proponents can either wait and hope, or try to induce action from a credible player. I always felt that doing something was likely a better choice than hoping others do something, so I’m endorsing the “action” path, and that’s the goal of sharing my 5G thoughts with you.