Looking for the “Why” in Service Lifecycle Automation

What is the goal, the real and ultimate goal, of service lifecycle automation? That may seem like a stupid question, and in more than one dimension. First off, everyone knows that service lifecycle automation is supposed to cut opex. Many know that opex is actually a bigger portion of each revenue dollar operators earn than capex is. Furthermore, many also know that the traditional measure of operator financial health is EBITDA, which stands for “earnings before interest, taxes, depreciation, and amortization”, and that doesn’t include capital purchases in any event but does factor in opex. But how much of the addressable opex have we already addressed? That’s the big question.

Across all the operators, opex represents about 30 cents of each revenue dollar, where capex represents around 21 cents. Since I started keeping tabs on, and modeling for, this sort of thing back in 2016, operators have introduced measures to reduce opex, and some operations costs have in fact declined a bit. Customer support costs are down by over 25% since then, due to increased use of customer support portals. Network operations costs have fallen by about 6% since 2020, but the other components of opex have continued to creep upward. The one that’s gone up the most, and is already the largest, is customer acquisition and retention.

Nearly every mobile operator contact I have will admit that efforts to differentiate mobile services based on any aspect of service quality or reliability have proven ineffective. Instead, operators use handsets to influence buyers, or rely on inertia. The question is whether that’s really the best approach, or even whether it has any staying power in the market. In off-the-record comments, operators will admit that they believe smartphone giveaways or discounts can work to fend off prepay price leaders, but it’s becoming increasingly clear that they’re not broadly effective.

For wireline services, things are similar. Bundling of broadband and live TV is still a big draw, but even in the wireline world there’s still a benefit to differentiating based on local devices (in this case, the wireless/WiFi router). Users should know that having faster WiFi won’t normally have any effect on quality of experience, but they apparently don’t, and some ISPs run ads that strongly imply that their WiFi will indeed deliver better Internet.

If operators want to reduce their acquisition and retention costs, they’ll need to get away from a focus on devices and move to a focus on services in general, and service quality in particular. Operators have picked pretty much the whole crop of low apples in customer care already, and while it’s paid off well enough to fend off a profit-per-bit crisis, it’s also covered up some basic problems. Fixing those might fix opex overall.

One interesting point to consider here is the relationship between capex and opex in both the network and the data center. Right now, operations support for networks and for data centers each runs roughly four-and-a-half cents of each revenue dollar, but the total technology investment in the network is at least twenty times that of the data center. This means that IT operations is a lot more expensive per unit of technology, and that’s a problem that played a role in rendering NFV ineffective. Because NFV didn’t address lifecycle management any differently, a shift to virtual functions would surely have cost more in opex than it saved in capex.
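To make that per-unit comparison concrete, here’s a minimal arithmetic sketch. It assumes, as the paragraph suggests, comparable operations spending on each side and a 20:1 network-to-data-center investment ratio; the figures are the illustrative cents-per-revenue-dollar values from above, not precise operator data.

```python
# Illustrative opex-per-unit-of-technology comparison.
# Assumed figures (cents per revenue dollar), per the discussion above:
network_opex = 4.5            # operations support for network equipment
datacenter_opex = 4.5         # operations support for data center / IT
network_investment = 20.0     # network technology investment (relative units)
datacenter_investment = 1.0   # data center investment ~1/20 of the network's

network_cost_per_unit = network_opex / network_investment          # 0.225
datacenter_cost_per_unit = datacenter_opex / datacenter_investment # 4.5

ratio = datacenter_cost_per_unit / network_cost_per_unit
print(f"IT ops cost per unit of technology is {ratio:.0f}x the network's")
```

With these assumptions, IT operations costs twenty times as much per unit of technology as network operations, which is why moving functions from purpose-built devices into hosting could raise opex faster than it cuts capex.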

This obviously raises questions regarding things like 5G function hosting and “carrier cloud” or edge computing. Operators have between eight and thirteen times the operations cost per unit of technology as public cloud providers. With that opex-per-unit disadvantage, they could never hope to be competitive as a hosting source, and in fact their 5G technology investment could be at risk. Some operators who read my comments on why they were focused on public cloud partnerships for 5G hosting told me that opex cost was their big problem; they couldn’t efficiently manage their own infrastructure.

Right now, of course, the operations costs associated with IT equipment fall largely within the CIO organization, which is responsible for OSS/BSS. For decades, operators have been divided on whether OSS/BSS needed modernization or should simply be tossed out, and that’s the question I think is at the heart of the whole service lifecycle automation debate.

For efficient service lifecycle automation, you need two very specific things. First, you need a unified strategy for operations for both network equipment and hosted-function infrastructure. 5G should have made that point clear. Second, “service lifecycle” means both operations/business support and network/hosted-function support.

I think that the NFV ISG recognized the first point, the need for unified operations, and that recognition was behind the decision to link management of a hosted function with the management tools used for the physical device the function was derived from. However, they left out the issue of managing the hosting itself. A virtual device fails if its virtual function fails, and that failure is equivalent in a sense to a real device failure. But it also fails if what it’s hosted on fails, so management of the resource pool is just as critical.
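The two failure modes can be sketched as a composite status check. This is a hypothetical illustration of the principle, not NFV ISG code, and the type names are invented: a virtual device is healthy only if both the function itself and the resource hosting it are healthy.

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    UP = "up"
    FAILED = "failed"

@dataclass
class VirtualDevice:
    """A hosted function plus the resource it runs on (illustrative model)."""
    function_status: Status   # state of the virtual function itself
    host_status: Status       # state of the server/pool hosting it

    @property
    def status(self) -> Status:
        # The device is "up" only if BOTH the function and its host are up;
        # a host failure is just as fatal as a function failure.
        if self.function_status is Status.UP and self.host_status is Status.UP:
            return Status.UP
        return Status.FAILED

vnf = VirtualDevice(function_status=Status.UP, host_status=Status.FAILED)
print(vnf.status)  # a host failure alone takes the virtual device down
```

A management model that watches only `function_status`, as the device-derived approach implies, is blind to the second condition, which is exactly the gap described above.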

My own ExperiaSphere work relates to the second point above. Lifecycle automation is going to require some form of state/event processing. It’s hard to see how service lifecycle automation could avoid requiring both OSS/BSS and NMS integration, and if it requires both, we’d be making OSS/BSS functions into processes run based on state/event relationships, just like NMS functions. I think this would effectively remake the OSS/BSS.
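As a concrete sketch of what state/event processing means here, lifecycle events can be dispatched through a state/event table, where an OSS/BSS step like billing activation is triggered exactly the same way as an NMS fault handler. This is a minimal illustration under my own assumptions, not ExperiaSphere code, and the state, event, and process names are invented.

```python
# Minimal state/event engine sketch: each (state, event) pair maps to a
# handler process and a next state. NMS-style and OSS/BSS-style steps
# are just entries in the same table.
def activate(svc):      print(f"activating {svc}")       # NMS-style step
def start_billing(svc): print(f"billing on for {svc}")   # OSS/BSS-style step
def remediate(svc):     print(f"remediating {svc}")      # NMS-style step

TRANSITIONS = {
    ("ordered",     "deploy"):   (activate,      "activating"),
    ("activating",  "active"):   (start_billing, "in_service"),
    ("in_service",  "fault"):    (remediate,     "remediating"),
    ("remediating", "restored"): (None,          "in_service"),
}

def handle(state, event, svc):
    """Run the process bound to (state, event) and return the next state."""
    process, next_state = TRANSITIONS[(state, event)]
    if process:
        process(svc)
    return next_state

state = "ordered"
for event in ("deploy", "active", "fault", "restored"):
    state = handle(state, event, "svc-001")
print(state)  # ends back in "in_service"
```

The point of the table is that adding an OSS/BSS process is just another row, not a separate integration project, which is what “remaking the OSS/BSS” into state/event-driven processes would mean in practice.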

A unified view of a service lifecycle, one that integrates service and network operations, hosted functions and network devices, would be the logical way to address lifecycle automation. But would it have any impact on opex? I’d love to say that my modeling said that it would, that it would be the path to opex nirvana. I can’t say that, at least not for the present.

The problem is that consumer broadband is really plumbing, and plumbing is something that’s important only if it breaks; otherwise it’s invisible. You can’t differentiate invisibility, and if acquisition/retention costs are the major component of opex (which they are, by far) then how can you show customers that they should pick, or stay with, you because of that invisible thing?

But, and it’s a big “but”, there are reasons why operators shouldn’t be willing to consign themselves to being the plastic pipe that connects your leopard-skin toilet seat to the public sewer. There are reasons why, if they do accept that, they’d still need to cost-optimize the whole picture. In either event, there is at least some function hosting in their future, and they need to address that so as to reduce their unit opex for server pools.

If we need to do something for lifecycle automation, reflecting those two points, what might that look like and what specific impact on opex might be possible? We’ll look at that in a follow-up blog.