Can ONF Stratum Meet its Full Potential?

Industry and standards bodies in telecom have, in my personal view and from my own experience, a bit of a tarnished history.  Even the fifty-odd operators I’ve interacted with in the last month think that the activities of these groups take way too long, and a very large majority say the results aren’t “transformative”.  Now, however, the ONF may have found its wings, or at least its legs.  I’ve blogged on its Stratum initiative before, but we need to look at it again, now that it’s been open-sourced by the ONF and in the context of the current market.

Stratum is one of two important developments in the open-switch market sector.  DANOS, the other, is a Linux Foundation project derived from an earlier AT&T initiative (dNOS) to create an “open, disaggregated” alternative to traditional proprietary network devices.  Stratum may take the concept further than DANOS, in part because it’s a more generalized (or at least generalizable) approach, and in part because it includes a model to virtualize custom switching chips, making it an easy fit to new open switch silicon designs.

Open switching products are exceptionally interesting to network operators, particularly ones like AT&T who (because of their low demand density) have issues with profitability on network infrastructure.  Using white-box switches with open software, operators could reduce their device costs by as much as 50%, according to operators themselves.  They also believe these devices could reduce opex, by standardizing network operating system features and providing APIs to link with open operations automation tools.

DANOS is important for its role in AT&T’s network.  It’s based on the Vyatta software-hosted router technology that AT&T acquired with the breakup of Brocade.  Many operators were very interested in Vyatta technology, but Brocade seemed unable to leverage that interest once it acquired Vyatta in 2012.  Part of the reason might have been the onrush of interest in NFV, which launched with the Call for Action white paper in the fall of that year.  Part, of course, was likely Brocade’s own failure to do insightful positioning of what they acquired.

As useful as DANOS is, it does have a tight bond with traditional routing.  We are already seeing a transition in data center networking, one moving toward more “programmable” forwarding, and it’s very possible that optical grooming applications, the “packet-layer” stuff I blogged about earlier regarding Cisco and Ciena, might play a role in the future.  If so, there’s a chance that DANOS might be behind the curve of device needs already, and in danger of falling further behind.

Stratum has a very different genesis.  It emerged as an abstraction for network switching, an “open-source thin switch implementation” that’s built to exploit the P4 flow-programming language.  The original goal was to abstract the hardware layer of switches used in SDN with OpenFlow, the original SDN model that was the basis for founding the ONF.  Stratum retains an SDN slant in its positioning, and the ONF seems careful not to be seen as rushing away from its own SDN roots.  That, in fact, may be the big issue they’ll have to address if Stratum is to truly remake networks, and the ONF as well.

If DANOS is a stiletto, then Stratum is a Swiss army knife.  The software in DANOS, the network operating system that makes up the “NOS” in DANOS, is capable of playing a lot of IP-network roles.  Stratum is more generalized in its approach.  The NOS, in the Stratum material, lies above Stratum and interacts with it through APIs, including P4.  The ONF has other projects that fill in the NOS layer, including ONOS, Trellis, and CORD.  These are, respectively, an SDN controller, a leaf/spine fabric for NFV data centers, and an architecture to transform the central office (CO) into a data center.

The strength of Stratum, in my view, has always rested in the P4 flow programming model.  P4, combined with merchant switching silicon, creates the data plane of a network device.  You can use P4 to program flows that are controlled by SDN, by NFV, or by traditional control-plane adaptive routing and switching.  In fact, you could use Stratum with DANOS, to provide a useful hardware abstraction layer.  What this does is separate the capital component of network equipment, the white-box switches, from the protocols and even the missions at the service level.
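To make that separation concrete, here is a minimal Python sketch of the idea.  It is not Stratum’s or P4Runtime’s actual interface; the class, table, and field names are all invented for illustration.  The point is simply that one programming surface can serve two different control “personalities”, an SDN controller pushing explicit flows and a traditional routing stack installing learned routes, with the chip-specific details hidden underneath.

```python
# A minimal conceptual sketch (not Stratum's or P4Runtime's actual API): one
# match-action programming surface that different control "personalities"
# (an SDN controller, a traditional routing stack) can share. Table, field,
# and class names here are invented for illustration.

class ForwardingAbstraction:
    """Hypothetical stand-in for a P4-programmed switch pipeline."""
    def __init__(self):
        self.tables = {}  # table name -> list of (match, action, priority)

    def insert_entry(self, table, match, action, priority=0):
        # Both an SDN controller and an adaptive routing stack would call this;
        # the chip-specific translation would live below this line.
        self.tables.setdefault(table, []).append((match, action, priority))

    def lookup(self, table, packet_fields):
        # Highest-priority exact match wins; purely illustrative, no real LPM.
        hits = [e for e in self.tables.get(table, [])
                if all(packet_fields.get(k) == v for k, v in e[0].items())]
        return max(hits, key=lambda e: e[2])[1] if hits else {"drop": True}


switch = ForwardingAbstraction()
# An "SDN personality" pushes an explicit flow...
switch.insert_entry("ipv4_lpm", {"dst_prefix": "10.1.0.0/16"},
                    {"forward": "port3"}, priority=10)
# ...while a "routing personality" installs a default route it learned
# adaptively, through exactly the same call.
switch.insert_entry("ipv4_lpm", {}, {"forward": "port1"}, priority=1)

print(switch.lookup("ipv4_lpm", {"dst_prefix": "10.1.0.0/16"}))   # forward: port3
print(switch.lookup("ipv4_lpm", {"dst_prefix": "192.0.2.0/24"}))  # forward: port1
```

In the real Stratum architecture, that surface is P4 and its companion interfaces, and the hardware-specific translation is exactly what the project is meant to abstract away.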

You can do traditional switching and routing with Stratum, or you can do OpenFlow SDN, or you could do something that’s not even formally defined.  If the 5G User Plane, for example, could benefit from a custom abstraction-based model as I’ve suggested it could, then Stratum could program the forwarding and expose the APIs needed to make the new model work.  And if something else comes along, the boxes are simply repurposed by changing the layers above Stratum.

This “universality strength” brings a collateral strategic strength, which is that Stratum is not bound to the current vision of “virtual networks”, the one that says you virtualize a router network by virtualizing the routers.  A network of virtual boxes, as I’ve said before, isn’t a virtual network; it’s the same box network it always was.  You need virtual devices, in the form of collectivized hosted features, to create a virtual network, but what unfetters virtual networks from the past is the ability to collectivize those features rather than linking boxes.  Stratum could provide the mechanism to do that at the hardware-abstraction level, and the higher-level models show that the way network services are collectivized can be varied fairly easily.

The big question is whether the ONF understands this, and a second question nearly as important is whether they can unfetter Stratum from the old SDN story they told.  SDN was, in my terms, a way of changing box networks by pulling the control-plane out of the boxes and hosting it in the cloud.  It never proved itself out at scale.  If the ONF insists that Stratum is the box layer of OpenFlow SDN, then it inherits the things that OpenFlow SDN never proved about itself, and limits its own success before it ever really gets started.

The good news is that the ONF does have a sort-of-model in which Stratum fits, and that model is generally suitable for describing most networks and services in the present, as well as the way they seem to be evolving.  They have a central policy process that frames what I’ve called “virtual network abstractions” as well as lower levels to describe functional architecture.  Their only omission lies, I think, in relying on NFV to describe how features are hosted.  It’s taken NFV six years to get where it is, which is not yet where true cloud-think would demand that networks be taken.

What the ONF needs is a compelling vision of the way cloud-think would take networks.  That vision could then be decomposed top-down into layers that could map to the Stratum model already defined.  It could help unite what otherwise might look like a bunch of diverging trends and goals.  It could give the ONF new relevance.

It could also help compete against DANOS, even though the two are not strictly competing technologies.  DANOS is a box model, a way of doing routing without router vendors and switching without switching vendors.  For buyers, that’s worthwhile, but in the end, it commits the buyers to a model of networking defined by the people they’re trying to disintermediate.

I’d like to see AT&T embrace both DANOS as a NOS and Stratum as a P4 engine and foundation for the hardware abstraction process.  P4 joined with the ONF last year, and it’s a Linux Foundation project for open-source administration.  DANOS is also hosted by the Linux Foundation, so it may be that common administration will keep these projects tracking closer.  Formal commitment would be more useful, of course, and for AT&T in particular, the unification of all this stuff could be a step toward creating a model for a future service-and-experience-driven networking approach.  As I said earlier this week, they need that.

Can an “Activist Investor” Manage AT&T Better?

Activist investors are always the bane of company management, and AT&T is surely no exception.  Elliott Management bought a stake in the company, and that sent AT&T’s stock on a rally.  Elliott thinks that AT&T has lowered shareholder value through its aggressive M&A, and management (not surprisingly) disagrees.  The question is who’s right and why, and what might happen if “right” ends up being different for shareholders than for the industry.

Shareholder value is the key goal of any company, for the simple reason that the company is responsible to its shareholders, not to the industry or even its customers.  M&A generally builds debt levels, which are often seen as increased risk.  Thus, what Elliott might be saying is that AT&T’s M&A, particularly the Time Warner deal, is a strategy that’s bad on the fairly limited timeline of investors, whether it’s logical in “the long run” or not.  But it’s possible that AT&T did the wrong thing from any perspective, and also possible it did the right thing.  To figure out which, we have to look at AT&T’s position.

All network operators are historically bit-pushers.  They sell capacity and connectivity, and they do that by building networks.  What we’ve seen over the last two decades is that broadband Internet has generated a thriving market for data connectivity to everyone, but at the same time has depressed the revenue per bit.  The impact of that depends on what I’ve called “demand density”, which is essentially the dollar opportunity passed by a mile of infrastructure.  Operators with high demand densities can expect lower cost per bit, which makes the revenue decline less of an issue.  AT&T has a much lower demand density than rival Verizon, so it has to worry more.
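A back-of-the-envelope calculation shows why this matters.  The numbers below are purely illustrative, not AT&T’s or Verizon’s actuals; the point is that with the same cost per mile of infrastructure, the operator with less dollar opportunity per mile has far less margin left to absorb falling revenue per bit.

```python
# Back-of-the-envelope illustration of demand density vs. infrastructure margin.
# All figures are invented for illustration; they are not operator actuals.

def margin_after_infrastructure(revenue_per_mile, cost_per_mile):
    """Fraction of revenue left after paying for a mile of infrastructure."""
    return 1.0 - (cost_per_mile / revenue_per_mile)

cost_per_mile = 40_000      # assumed annualized cost of a mile of plant
dense_operator = 120_000    # high demand density: dollar opportunity per mile
sparse_operator = 60_000    # low demand density: half the opportunity per mile

for name, revenue in [("dense", dense_operator), ("sparse", sparse_operator)]:
    margin = margin_after_infrastructure(revenue, cost_per_mile)
    print(f"{name}: {margin:.0%} of revenue left after infrastructure cost")
# dense: 67%, sparse: 33% -- the same revenue-per-bit decline pushes the
# sparse operator toward unprofitability much sooner.
```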

It’s no surprise that AT&T’s has-to-worry status means it works harder to establish things like opex automation (ONAP) and open-source and open hardware models.  However, cost management can only take you so far, and so AT&T is also looking at climbing the value chain.  Hence, DirecTV, streaming video, and Time Warner.

A decade ago, it was TV delivery that justified wireline networking.  Operators like AT&T and Verizon got into the TV delivery business for that reason, and so all of AT&T’s moves would make perfect sense if this were 2009 instead of 2019.  It’s not, and so the question is whether the up-the-value-chain approaches AT&T has taken are still valid.

The Internet has messed up a lot of business models, and you can see from current reporting that content and TV are among the recent casualties.  Streaming video works directly on Internet “dialtone”, so you don’t need cable companies, TV services, and so forth.  Not only that, advertisers have been shifting budgets to online services because they get better targeting there, which squeezes TV ad revenue and means there now have to be more commercials per minute of show.  In short, content may not have fallen from its royal “Content is king!” state, but its crown is getting seedy-looking.

So now, if we take stock, do we decide that AT&T’s strategy is bad?  After all, content is circling the drain too.  But is it doomed?  Simple truth: we have to watch something.  Reruns of old shows, movies, and series won’t carry us into the infinite future.  Amazon and Netflix, after all, are producing original content.  Might AT&T unfetter itself (gradually, of course) from a traditional TV-network-ish model and jump into producing content for streaming only?  Could they be looking to emulate Netflix and Amazon more than Comcast?

This is exactly what I think is happening.  AT&T’s automation and open-box strategies will take time to pay off.  In fact, they’ll take time to get right, because they’re not right yet.  In the meantime, AT&T has the opportunity to leverage original I-own-it content, which is worth something.  They can start to shift to the same original-content model that Amazon, Netflix, and Hulu, and now Disney and Apple, are promoting.  They can even offer other content producers a conduit to market, and they can reap some financial benefits from this while they work their magic in network modernization.

But what about demand density?  The answer to that is Internet dialtone services.  If you look at “the network” as the source of all revenue, demand density matters.  If you look at Internet dialtone as the infrastructure of the future, then so what if your own home area has low demand density?  Serve someone else’s area instead, riding on their increasingly unprofitable infrastructure.  There is an issue as to how we provide incentive to keep offering more capacity that’s steadily less profitable, but that’s a public policy problem.  We subsidize rural broadband in the US now; maybe in the future we’ll have to subsidize more than rural broadband.  That question is separate from the Internet dialtone economy, because you know that somehow everyone will get their Internet fix.

AT&T isn’t doing dumb stuff, in short.  It’s not as smart as I’d like it to be, or as smart as it thinks it is, but it’s doing generally the right thing.  In fact, if anything, AT&T should be doing more to exploit Internet dialtone, looking for additional investments it can make in the higher-layer experiences that represent what people really want from the Internet.  And it should do that now, while the value of these higher-layer experiences isn’t yet fully recognized, even by people like Elliott Management.

IoT isn’t about sensors, it’s about services.  Advertising isn’t about serving ads, it’s about contextualizing them.  The future of “the network” is building what it connects users to, not building the connections themselves.  Content, meaning video content, is the obvious on-ramp to services above the network.  AT&T is right to jump into video content, providing that it isn’t mistaking the on-ramp for the Interstate.

The fundamental problem AT&T has, and that all network operators have, is that there will never be a time when bandwidth earns more revenue per bit than it does today.  The best that cost management can hope to do is slow the erosion of profit on connectivity services.  That means you have to look elsewhere for increased profit, and with a thriving OTT industry in place, it’s not hard to figure out that the place to look is up the value chain, toward the experiences users really pay for.  Bits are just the plumbing that delivers them.

The higher-layer services of OTTs are built almost universally on a cloud-native platform.  Thus, you could argue that AT&T’s focus should be taking a cloud-native slant on everything, starting with content delivery but expanding into a service-centric ecosystem that can host future experiences.  This is where we can’t yet tell if AT&T gets it.  Does AT&T understand what their next step is?

Elliott isn’t the one to answer that question.  They’re “activist investors” that many would label “corporate raiders”.  But AT&T may not be able to answer the question either; they’ve not answered it so far.  No revenue strategy is endless.  Companies succeed by reinventing themselves as needed, and AT&T deserves applause for having done more of that reinvention, so far, than most of its carrier counterparts worldwide.  They have to keep it up.

The Hidden Battle for a Hidden Layer in Operator Networks

Sometimes the interplay of news is more newsworthy than the news itself.  Last week we had Ciena’s quarterly report and Cisco’s deal for Acacia, and the two certainly create an interesting combination.  Add in the now-almost-routine comments that telco profit per bit is declining and that IP is the dialtone of the connected world, and you have something really profound.

Connectivity sucks, profit-wise.  There may be a big appetite for bandwidth, but not for paying for bandwidth.  As a result, service bandwidth has climbed faster than service revenue, so price per bit keeps falling, and that’s never going to end.  There will never be a time in the future when bit-pushing is as profitable as it is today, and it may shortly not be profitable at all.  We have to get this point out there because it’s the central truth around which all the other stuff revolves.
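To see how fast that squeeze works, here’s a trivial illustration.  The growth rates are assumptions chosen only to show the shape of the problem, not operator data.

```python
# Illustrative only: if traffic grows much faster than connection revenue,
# revenue per bit falls every year. The growth rates below are assumptions.

traffic_growth = 0.30    # assumed 30% per year traffic growth
revenue_growth = 0.02    # assumed 2% per year connection-revenue growth

traffic, revenue = 1.0, 1.0
for year in range(1, 6):
    traffic *= 1 + traffic_growth
    revenue *= 1 + revenue_growth
    print(f"year {year}: revenue per bit index = {revenue / traffic:.2f}")
# Falls from 1.00 to roughly 0.30 in five years under these assumptions,
# which is why cost management alone can only slow the erosion.
```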

The next truth is that when operators can’t make more money with more bits, they put a lot of pressure on their infrastructure costs.  That pressure can come in the form of a demand for discounts, or perhaps seeking an industry price leader (Huawei comes to mind), or even a consolidation of equipment—many boxes into one superbox.  Any way you look at this, the vendors are in the same box as the buyers are (no pun intended).

There’s no question that optical transport spending is on the rise, because of that big appetite for bandwidth.  Ciena’s sales prove that out, at least for the near term.  There’s also no question that spending more to produce less-profitable bits can’t go on forever.  Thus, operator pressure on optical transport spending will increase, which is also what Ciena’s outlook suggests, and what Street analysts worried about.  The Street also worries about Ciena’s margins, for the good reason that price pressure on sellers is inevitable when buyers question their ability to sustain investment.

Ciena, of course, knows all this, and in fact has known the trend would develop for almost a decade now.  What they’ve tried to do is frame “packet optical” gear as the answer.  Almost all traffic, and probably all traffic growth, is in IP.  If optical transport could rise above Level 1 and create virtual electrical-layer (“packet”) pipes, it could make routing simpler…and cheaper.  That could take some price pressure off Ciena’s products.

It would do that, of course, by putting the pressure on routers, sold by vendors like Cisco.  The prospect of optical vendors getting pricing relief at Cisco’s expense is hardly appealing to Cisco management, so what’s the solution?  Well, if the optical/electrical boundary can be exploited from below, why not exploit it from above?  If most optical traffic is IP, then it originates in routers.  If the routers have good optical interfaces, say from a vendor like Acacia, then routers could displace optical devices.  Turnabout is fair play (or equally competitive play, at the least).

Cisco, figuring that the battle with optical vendors over the “packet optical” boundary was inevitable, also figured it would mean Cisco spending more on Acacia interfaces.  Why not buy the company and keep the money in-house?  That’s the simple justification for the deal.

Cisco (and the router vendors overall) have an advantage in the packet wars.  Optical vendors, perhaps especially Ciena, have failed to make a case for themselves in the packet space, despite attempts from a number of different directions.  The packet layer is a lot more complicated, both topologically and operationally.  The router-optics story is simple: “Be invisible”.  Subduct the whole optical network into a router interface.  That’s surely an easier sales pitch, and it also has the advantage of creating a consistent operations framework.

Then, finally, there’s Huawei.  The ultimate cost-reduction strategy has always been “Beat Huawei up on price.”  Ciena’s greatest successes come in the US, where Huawei isn’t much of a factor.  Its hope for pricing relief in the future is that Huawei will be barred from deals in other geographies too, or that at least there will be a risk in adopting Huawei gear that operators outside the US won’t want to take.  OK, that’s a possibility, but you could also hope that somehow “Buy Ciena!” would be written across the face of the moon.  The point is that you have to define strategies that can succeed on your own initiatives, or you’re hoping to find gold while digging for worms to fish with.

The thing that nobody has addressed here, but that optical vendors need more than anyone else, is an architectural model of the network of the future.  It’s clear that the majority of network investment goes into the access/metro part of the network.  This portion of the network may carry IP traffic as the dominant mission, but it’s not a classical IP network because nearly all of it is focused on getting traffic to a public-network on-ramp.  Think of mobile infrastructure as an example.  Whether we’re talking about the 4G EPC or the 5G UP, the mobile-centric part of a mobile network is more aggregation than routing.  That could mean it’s more “packet” than IP, and that packet optics might be a natural fit.

Even router vendors have some technical challenges.  If a router with an optical interface is to be a true, almost-universal, replacement for a separate optical layer, then it has to be able to do very effective packet-level grooming at the least, meaning it has to have a “Level 2” pipe-like path control process below routing, not one based on MPLS.  It may also have to address agile optics and wavelength cross-connect.  Cisco can probably make some of this happen, if they have time.

It’s the optical vendors, mostly Ciena and just possibly Infinera, who will decide how much time Cisco has.  Neither of these vendors has shown much insight into what a future mobile-like architecture for access/metro would look like if optical devices rose up to assume that Level 2 pipe/packet role.  Could one of them break out?  Anything is possible, but we’re talking about a decade of fumbling by both our optical contenders, and it’s hard to see how we could credibly presume a revolution from an optical underclass (at least in topology terms) that has been idling all that time.

All this, of course, raises the question of just what a future architecture for the “packet” overlay part of transport would look like.  The challenge is that the whole of access/metro is evolving under a variety of pressures, some real (there will be 5G New Radio and more capacity) and some highly speculative (5G core, IoT, edge computing).  Absent a clear picture of what connectivity from the metro network outward to the user looks like, and what it’s expected to do, we can’t really say much in detail.  But I do think that we can say that if we had a true network service abstraction goal for virtual networking, rather than a network-of-virtual-boxes goal, we’d be ahead of the game.

Separating Hype from Reality in Edge Computing

Edge computing is another of those topics on the verge (if not over it) of being over-hyped.  Part of the reason is that we don’t really have a solid definition or taxonomy for the space, and part is that we haven’t really looked at the business case.  I’ve been trying for a year to get some good inputs into this, and I just finished a run through my enterprise contacts to ask for their contribution.  As usual, I then pushed the results into my buyer-decision model, and this blog is about what came out.

Let’s start with taxonomy.  “Edge” computing broadly means putting a smart agent close to the point of action, which users define as being the point where information that requires attention originates, and to which the responses are returned.  In somewhat-technical terms, then, it’s at the near end of the control loop.  In this definition, “close” could mean right alongside or at the point of action, on premises, or back further in the network.

What we should do is consider “cloud” computing to be a collection of distributed hosting resources that are allocated as a pool, not dedicated to an application.  Edge computing is then a subset of cloud computing, where the resources are hosted at the “network edge”, meaning the closest accessible point to the user demarcation.  “Local” computing is a distribution of the edge computing resources to the other side of the access connection, meaning on the customer’s premises.
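For what it’s worth, the taxonomy is simple enough to write down explicitly.  The little Python sketch below is just one possible encoding of the definitions in the last two paragraphs; the attribute and category names are mine, not an industry standard.

```python
# One possible encoding of the cloud/edge/local taxonomy described above.
# The attribute and category names are invented for illustration.

from dataclasses import dataclass

@dataclass
class HostingSite:
    pooled: bool           # allocated from a shared pool rather than dedicated
    on_premises: bool      # customer side of the access connection
    at_network_edge: bool  # closest accessible point to the user demarcation

def classify(site: HostingSite) -> str:
    if not site.pooled:
        return "dedicated hosting (not cloud at all)"
    if site.on_premises:
        return "local computing"
    if site.at_network_edge:
        return "edge computing (a subset of cloud computing)"
    return "centralized cloud computing"

print(classify(HostingSite(pooled=True, on_premises=False, at_network_edge=True)))
# -> edge computing (a subset of cloud computing)
```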

Both users and third parties (like network operators or cloud providers) can build clouds, and thus deploy “edge computing”.  The distinction between an operator/provider cloud or edge and a user cloud or edge is the scope of applications that can draw on the pool of resources.  Since a local computing resource is unlikely to be available to anyone other than the owner of the premises where it’s deployed, that distinction is more meaningful when you get to network-edge hosting.

All edge computing has a disadvantage.  Distributing computing, in this age of super-microprocessors, tends to reduce the utilization of any given computer by shrinking the set of applications suitable for hosting there, and multiple distributed systems are more expensive to support than the same computing power would be if it were centralized.  This tends to reduce economic efficiency overall, so there has to be some compensating benefit to drive edge deployment.
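A simple way to see the economic penalty is to compare the capacity you’d have to deploy at many small edge sites, each sized for its own peak, with the capacity a single pool would need for the same aggregate load.  The sketch below uses invented load numbers purely to show the shape of the effect.

```python
# Illustration of the economic penalty of distribution (numbers are invented):
# each edge site must be sized for its own local peak, while a central pool
# only has to be sized for the aggregate peak, which is smaller than the sum
# of local peaks when the peaks don't coincide.

import random
random.seed(1)

sites, hours = 20, 24
# Hypothetical hourly load per site, each site peaking at a different hour.
load = [[random.uniform(2, 4) + (6 if h == s % 24 else 0) for h in range(hours)]
        for s in range(sites)]

edge_capacity = sum(max(site) for site in load)        # sum of local peaks
central_capacity = max(sum(site[h] for site in load)   # peak of the sum
                       for h in range(hours))

print(f"capacity needed at the edge:  {edge_capacity:.0f} units")
print(f"capacity needed centralized:  {central_capacity:.0f} units")
# The edge figure is substantially higher, so edge deployment needs an
# offsetting benefit (like latency) to make economic sense.
```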

Generally, users see the benefit of edge computing as a reduction in latency, meaning the time between when a signal requiring attention originates and when the response is received and acted upon.  You hear a lot about latency these days because it’s also cited as a benefit of 5G, so it’s worth having a look at the likely latency-sensitive applications to assess their characteristics and credibility.

The one we hear about the most is autonomous vehicles, and in my view this one is the most problematic.  The systemic self-drive application is obviously something that enterprises themselves wouldn’t deploy resources to support, but enterprises do have autonomous vehicle missions in-house.  In fact, the in-house missions are more credible as edge computing drivers than the systemic ones.

The problem with general self-drive cars as an edge computing driver is that every indication says that the self-drive vehicles would make onboard decisions rather than cede control to an external process.  Even minimal process latency could derail self-drive applications (in a hundred milliseconds, a speeding vehicle would move almost ten feet), and were there to be a network or processing outage (or a DDoS attack) you could end up with deaths and injuries.  Central control of “cars” is out.
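The arithmetic behind that parenthetical is worth making explicit, because it scales directly with speed and latency.  Here’s a trivial sketch, with speeds and latencies chosen only for illustration.

```python
# How far a vehicle travels during a given control-loop latency.
# Speeds and latencies are illustrative, not a safety specification.

MPH_TO_FT_PER_SEC = 5280 / 3600  # 1 mph is about 1.47 ft/s

for speed_mph in (30, 65):
    for latency_ms in (10, 100, 250):
        feet = speed_mph * MPH_TO_FT_PER_SEC * (latency_ms / 1000)
        print(f"{speed_mph} mph, {latency_ms} ms latency -> {feet:.1f} ft traveled")
# At 65 mph, 100 ms of latency is already close to ten feet of travel, before
# allowing anything for an outage or an attack on the control path.
```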

What could be “in” is control of material-movement systems whose major challenge is avoiding each other.  You can interleave forklifts rather nicely if you control them all, and that to me sets the requirement for realistic edge support of autonomous vehicles.  If your vehicles are moving about in an area where other such vehicles present the only regular risk of collision, you have a great autonomous vehicle application.

Where are they?  Users say that companies that operate warehouses of any sort, do large-scale material transfer between vehicles, send vehicles to pick things up in a holding yard, and so forth, all have credible applications.  That means manufacturing, retail, and transportation fit the bill.  In my survey (164 users), 22 said they could see some applications of edge-controlled autonomous vehicles.

But where’s the edge?  In nearly all cases (all but 4 in my sample), the users said the requirements were “local”, meaning in-building or within several buildings on a campus, and would be covered by local networking (WiFi).  None of the “local edge” candidates thought they’d want to host the control point on someone else’s systems; they saw themselves with an “appliance” that did the job.  Thus, they didn’t see this edge application as requiring 5G or public cloud hosting at all.  Those who did were envisioning large-scale facilities that were more like junk yards than buildings, where material was likely left out in the weather.  All four of the companies who saw the larger-scale edge mission were transportation companies: rail, ship, trucking, and air.

Another edge mission you hear a lot about is in the medical area.  I recently read a piece talking about how low latency could be critical in handling a patient who was “coding”, meaning sinking into a possibly dangerous state.  The problem, of course, is that this sort of event is already detected locally by the devices hooked to the patient.  Ceding detection to something more centralized, even something on the same floor of the hospital, risks a significant problem if that facility fails or is disconnected, and for little gain, given that the expensive part of the monitoring is the patient’s attached sensor web.

Some healthcare-related users in my study group did believe that where a patient’s condition was monitored by multiple devices for multiple classes of reading, some intelligence to correlate the measurements and make a broader determination of whether things are going well or starting to slip might be helpful.  However, again, they didn’t see this as requiring edge computing support from others, focusing instead on specialized “hardened” local devices on each floor.  They also believed the communication could be handled by WiFi and would not require 5G.

Outside these two focus areas, enterprises didn’t offer me any examples of a latency-critical application.  My own work with enterprises (focused on transportation, medical, and banking/investment) doesn’t show anything different, either.  Thus, while none of this suggests that there is no mission for edge computing, it does suggest that enterprises don’t see a clear mission.  That, in turn, suggests that for the moment at least, edge opportunity is either hype or it’s dependent on systemic changes in services.  The mass market, the consumer, is the only credible driver of the edge.

The reason that’s not a good thing for edge-aspiring vendors is that systemic change in services means exploiting the IoT and contextualization drivers, and both those areas almost mandate participation by network operators.  Operators, of course, are legendary for studying something to death, and then coming up with a solution that wouldn’t have worked even if it had been delivered in a timely way.

What runs at the edge are higher-level services.  They’re digestions and correlations of information collected over an area, fit into a form where personalization-sensitive features can apply them to improve information relevance.  There are navigational services that would promote the value and safety of autonomous vehicles, and these services’ value would increase as the timeliness of their information increased.  To make their data both broadly useful and timely, you’d want to host the services close to where their data was collected, and then publish them at low latency to the prospective users.  This same model would be optimal for contextual or personalization services, and I’ve blogged about the nature of contextual services before.

A service-driven approach to IoT and contextualization would be a powerful driver of edge computing because of its dual collection/publication makeup.  You have to collect sensor data, correlate it, and publish it fast enough to be useful.  Services, of course, also lower the cost of the on-board or in-hand technology needed to assimilate IoT or contextual/personal information fields, so there’s an offsetting benefit to any hosting cost.  In short, these services are the key to edge computing.
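As a rough sketch of what such a service might look like structurally, here’s a minimal collect/correlate/publish loop in Python.  The class, thresholds, and data are all invented; a real edge service would sit behind a proper pub/sub bus and real sensor feeds rather than this in-memory stand-in.

```python
# A sketch of the collect/correlate/publish pattern described above.
# Names, thresholds, and data are invented for illustration.

import time
from collections import defaultdict

class EdgeCorrelationService:
    def __init__(self, staleness_limit_s=2.0):
        self.readings = defaultdict(list)   # area -> [(timestamp, value), ...]
        self.staleness_limit_s = staleness_limit_s
        self.subscribers = []                # callbacks for published digests

    def collect(self, area, value, timestamp=None):
        # Ingest a raw sensor reading collected over the service's area.
        self.readings[area].append((timestamp or time.time(), value))

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish_digest(self, area):
        # Correlate only fresh readings, then fan the digest out at low latency.
        now = time.time()
        fresh = [v for t, v in self.readings[area]
                 if now - t <= self.staleness_limit_s]
        if not fresh:
            return
        digest = {"area": area, "samples": len(fresh),
                  "avg": sum(fresh) / len(fresh)}
        for callback in self.subscribers:
            callback(digest)

svc = EdgeCorrelationService()
svc.subscribe(lambda d: print("published:", d))
svc.collect("dock-4", 12.5)
svc.collect("dock-4", 13.1)
svc.publish_digest("dock-4")
```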

Our biggest problem in edge computing is that we implicitly devalue the only application set that could actually make it relevant.  We talk about “IoT” as a sensor game, envisioning perhaps an autonomous vehicle making a zillion lightning inquiries to sensors in its path and deciding what to do.  Forget it; the cost of that would be enormous, and most vehicular safety features have to be on-board so they aren’t as exposed to failure or attack.  It also raises the question of why someone would deploy (and pay for) those sensors.  Sheer human goodness?

Like so many other technical innovations, edge computing is all too often applied but not justified.  We don’t doubt that there are a lot of things you could do with edge computing, and there’s no shortage of tales regarding what they are and how wonderful they’d be.  That’s not our problem, though, because we don’t yet have edge computing to apply.  What we need is something that justifies its deployment, and as with other technical marvels, we’re coming up a bit short on examples.  The one we find that’s truly credible is just too complicated for many to be willing to wait for or bet on.

So to summarize, there are some enterprise applications for edge computing in certain verticals, but it’s doubtful whether these applications could drive significant edge computing deployment.  Certainly, generalized prospecting for edge opportunity among enterprises would be a very unfulfilling experience for the sales personnel involved.  There are great opportunities for edge computing as a part of cloud computing infrastructure, where applications that digest and present insight from sensor data are added to traditional hosting.  It’s then a question of how we can get those applications promoted, because nobody is going to build out edge computing in the hope it’s eventually justified.

A Broader Look at Operator Cloud Plans

Network operators are going to both offer cloud computing services and adopt them internally, but the question is “How?”  It’s now looking like internal applications for cloud computing are influencing operator cloud planning more than expected.  Last week I talked about AT&T, which is about the only provider I can really talk about in specific detail.  However, I do have general statistics on operator cloud plans, and I’ll share them (and some modeling and opinions) here.

Let’s start with true internally deployed cloud data centers.  All operator classes think that “eventually” they will have elements of some network services running in hosted-feature form.  Thus, all think they’ll have some sort of hosting.  Tier Ones are prepared to say that they could, by harnessing all the “drivers” of carrier cloud I’ve blogged about, eventually build out their own cloud infrastructure with good economy of scale.  You’ll note all the “eventually” qualifiers here; over the last year, Tier Ones have pulled back on specific and optimistic positions on how long carrier-cloud infrastructure will take to justify itself.

Only a few Tier Twos and no Tier Threes seem to have any specific “cloud infrastructure” aspirations.  Those two tiers see “hosting” or “edge computing” as deploying a few selected server racks in a convenient location, rather than building cloud infrastructure.  Of the 100,000 carrier cloud data centers my model shows could be justified by 2030, over 80% come from Tier Ones and nearly all the rest from Tier Twos with contained geographic focus.

Network services with some hosted features aren’t the end of things operators could host in the cloud.  A more recent area of interest is operators’ own business applications.  All operator classes tended to think their own applications and public cloud enterprise applications would need comparable cloud service resources.  Think “application cloud” here, and you get the idea.  Operators think both the opportunity to offer public cloud, and the need to host some of their own applications in the cloud, will develop before they’re likely to have achieved reasonable economies of scale and tolerable first costs for their own data centers.  This is the new insight that’s been driving some changes in operator cloud strategy, so we’ll look at it in more detail.

Let’s start with operators’ own applications.  All the Tier One operators plan to use cloud computing for internal applications.  Today, about two-thirds say their internal applications are “likely” to be hosted on their own clouds, but that number is down from almost 80% just a year ago.  Almost half “would consider” hosting internal applications (what AT&T calls “non-network” applications) in the public cloud under some circumstances, primarily as a backup strategy or to serve thin international service geographies.

Again, things change rapidly as we dip down toward the smaller operators.  About half of all Tier Twos and almost all Tier Three operators say they are “likely” to host at least some of their internal applications in the public cloud rather than on their own clouds.  These operators don’t believe they can achieve reasonable economies with their own clouds, but it’s important to note that these operators are also (in about the same percentage) looking at offering public cloud services through resale agreements with public cloud providers.  Interestingly, most Tier Threes see themselves more as consumers of cloud services than as providers, even in that “eventually” future.

For public cloud services offered to others, the numbers (as already noted) are similar.  About two-thirds of Tier One operators think it’s “likely” they will ultimately provide their own infrastructure for public cloud services to enterprises, and are also likely to offer third parties hosting facilities for SMB SaaS applications.  This doesn’t reflect a view that offering public cloud services would justify a build-out of carrier cloud (even among Tier Ones, only about a quarter seem to believe that), but rather a view that they’ll eventually get carrier cloud infrastructure to support other drivers (which we’ll discuss below).  I’m also seeing that willingness to outsource enterprise cloud services to a cloud provider partner is becoming more credible as time passes, as recent announcements in the space show.  The reason seems to be a fear that the opportunity will develop before the operator can build out efficient internal cloud resources.

Because Tier Two and Three operators are more reactive, they haven’t planned for their own clouds much, and don’t have an internal political position to protect.  They’ve expressed willingness, and even desire, to use public cloud resources extensively.  I saw in consulting activities five years ago that some Latin American Tier Two and Three operators were actively looking at public cloud provider partnerships based on some early sales interest.  However, these operators (and Tier Two and Three operators in general) have found that early sales interest in public cloud services rarely developed into real opportunity.  Almost three-quarters of Tier Twos and over 90% of Tier Threes said the sales cycle for public cloud services was too long, and that they would be happy to “send prospects” to a public cloud provider but didn’t want to have much involvement in presale support, much less post-sale.

Let’s summarize where we are at this point.  Operators think they’ll be using the cloud to host their own applications, and most think they’ll likely be “offering” cloud services either on their own infrastructure or on public cloud partnerships.  The question is whether the need for the cloud will develop in a way so as to justify their own carrier cloud infrastructure, or whether they’ll have such a slow ramp with such diverse requirements that they’ll probably have to rely in the near term on public cloud partnerships.

How “far” is “near-term?”  That depends on the pace at which the demand drivers I’ve been blogging about might develop.  If operators see a lot of near-term opportunity to harness the cloud, that could justify a build decision.  If not, then they’ll likely have to start with a partnership deal with a public cloud provider.  To talk about the result of that, we have to look at the drivers themselves.

Let’s start with NFV.  At the moment, virtual CPE (vCPE) is the only NFV mission that’s getting much specific attention.  Tier Ones think that NFV is helpful for vCPE in business services, and in some very specialized situations, even for residential vCPE.  Most Tier Ones expect vCPE to mean deploying a uCPE appliance at business locations to hold virtual features, and perhaps augmenting that with cloud-hosting of additional features.  Tier Twos and Threes are split on vCPE; about a third say they would like to have cloud-hosted vCPE and two-thirds say they want uCPE on the premises.

I had expected that Tier Two or even Tier Three operators who had a very contained service geography (a city, for example) might be more inclined to cloud-host features, but there wasn’t much of a difference in the view of these geographically focused operators versus those with more widespread prospect bases.  While getting gear to more distant users is recognized as a problem, especially in emerging markets, it’s difficult to find good cloud hosting services or facilities close to dispersed customers, which makes operators wary of a cloud solution.

For the public cloud services driver, as previously noted, attitudes are closely related to NFV/vCPE thinking.  All Tier Ones say they expect they will “eventually” offer public cloud services to business customers, and about three-quarters think that’s true of residential services too.  Tier Two and Three operators have varying views, aligned this time with their service areas.  Metro-focused operators think business cloud services are a strong strategy, with nearly all Tier Twos buying in to the idea, and about two-thirds of Tier Threes.

What about the other drivers, like contextual services, IoT, advertising and video?  Even Tier Ones are telling me that whatever comments are being made about these drivers of carrier cloud, the truth is that it’s “wait and see” in real planning terms.  Nobody really knows what any of these drivers would mean in terms of software architecture or hosting requirements, so nobody is rushing out to build anything to support them.  Some of the drivers are looking doubtful in credibility terms, too.

Most everyone thinks IoT is hyped at this point.  Some Tier Ones are still hopeful that somehow something will happen, but they’re still focusing on the happy future where every device has a 5G radio attached and is associated with someone who’s paying the bill.  IoT processing is still a totally dark area for operators, and while we can’t rule it out (everyone, even operators, follows the money), it isn’t looking like much will develop in the next three or more years.

Advertising and video are a real near-term opportunity, and here we see an opportunistic segmentation of operators according to whether they offer or plan to offer their own streaming services.  Those who do are eager to understand video platforms, ad platforms, and personalization or contextualization.  The others see that as coming with whatever their video source or partner offers.  In the US, you can see that AT&T fits in the roll-your-own-video school and Verizon is happy to partner with somebody.

True contextual services were languishing till late last year, according to operators.  This year, they’re being tied to the whole AI thing, which is good in the sense that it generates some management interest, but bad in the sense that it’s a technology and not an application.  A lot of operators say “we’re going to integrate AI into…” something, but they don’t really have a clear idea what AI will do for that something, nor do they have a clear idea what the business value of the whole thing will be.  In short, AI is anonymizing the contextual services space, when what’s really needed there is an understanding of the enormous value of context in productivity and personalization.  Sure AI could help implement it, but the overall service model needs to be defined before we start trying to write code, AI or otherwise.

From my contacts overall, I’m getting a sense that the space is dividing.  There are a few operators like AT&T who have a clear goal in cloud computing and carrier cloud, and though many of them don’t exactly have the projects aimed in the optimum (or even right) direction, they’re making progress.  The rest seem to be slipping into more inaction.  It’s possible that many had hoped that some generalized standards process, like NFV, would somehow answer all the questions.  That doesn’t seem likely now, and so they’re out of insights, for now.

That’s why the “application cloud” is so interesting.  It may have created confusion for AT&T regarding the way it fits within their overall cloud plan, but it’s also advanced cloud thinking.  It could do that for other operators too.  But one final qualifier: Operators are interested in advanced services in direct proportion to their concern about revenues and profits for basic connection services.  AT&T has a fairly low demand density, and thus a greater risk of marginal profits from connection services.  Whether application-cloud thinking will spread widely, even to operators with good demand densities (like Verizon), is something we can’t yet predict.