Are Network Vendors Jinxed on M&A?

Why do so many vendors mess up acquisitions?  It’s always been a relevant question because, well, vendors always seem to mess them up.  It’s relevant now because of the Ericsson announcement it was acquiring Cradlepoint, and that deal could be a poster child for a number of the issues I’m going to raise.

There are two broad reasons why a tech acquisition makes sense.  The first is that you’re buying revenue or a unique customer base, and the second is that you’re buying a position in a market growing so fast that you can’t wait to develop your own product.  While it might seem obvious, the key for any vendor looking to acquire is first to identify which of these is their motivation, and second to protect that proposed benefit with specific steps.

Buying revenue is a pretty straightforward reason, and if the deal is being shepherded by the CFO it’s likely to make sense.  The key is usually to look at the share-price-to-revenue relationship of both companies.  The ideal situation comes when a company that’s trading at a low share-to-revenue relationship is acquired by a company with a higher ratio.  The deal alone will apply the buyer’s multiple to the seller’s revenue and it’s a win.
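To make the arithmetic concrete, here’s a minimal sketch of the multiple effect; the companies and figures are invented purely for illustration.

```python
# Hypothetical illustration of the price-to-revenue multiple effect in an
# acquisition; every figure here is invented for the example.

buyer_multiple = 5.0      # buyer trades at 5x its annual revenue
seller_multiple = 2.0     # seller trades at 2x its annual revenue
seller_revenue = 400e6    # seller's annual revenue, $400M

# What the market values the seller at on its own...
seller_standalone_value = seller_multiple * seller_revenue

# ...versus what that same revenue is "worth" once it carries the buyer's multiple.
seller_value_at_buyer_multiple = buyer_multiple * seller_revenue

paper_gain = seller_value_at_buyer_multiple - seller_standalone_value
print(f"Standalone value: ${seller_standalone_value / 1e6:.0f}M")
print(f"Value at buyer's multiple: ${seller_value_at_buyer_multiple / 1e6:.0f}M")
print(f"Implied re-rating gain: ${paper_gain / 1e6:.0f}M")
```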

Buying a customer base isn’t nearly as easy to navigate.  There has to be a sense of symbiosis, meaning that you could expect to sell your product into the other company’s base.  You obviously have to be able to keep the base intact, the base has to consist of a reasonable number of real prospects for your products, and you have to be able to deliver the message through the sales channel.  Sometimes the “old” sales channel won’t be able to absorb the “new” message, and if they can’t and you decide to augment or replace them, you may lose the base.

If you want real complexity, a real potential for a mess, though, it’s impossible to beat the situations where someone buys a company to get a position in a critical future technology.  This can fail for a whole bunch of reasons, so let’s just go through them, starting with the ones I’ve seen the most.

The biggest issue I’ve seen by far is the lack of any real vision of the future opportunity that’s supposed to drive the deal.  Network vendors based in the US (not to name any names!) have a tendency to buy companies more for sales objection management than anything else.  “Gosh, I’d have won that deal if we’d had one of those darn widgets,” the salesperson tells the CEO.  The CEO thinks a moment, then says “Well, darn it, we’ll just buy us a widget company!” and does.  Maybe it’s not quite this stupid (at least, not always), but what does seem to be a universal problem is accepting sales input for what should be a strategic decision.

You have to start a future-strategic-position deal with a clear definition of the position you’re depending on.  A market is an ecosystem, so don’t think “IoT drives 5G” as being a statement of a strategic vision.  It’s a hope, unless you can frame out what the market that creates the drive is, who the players for the critical roles will be, and why those players will accept those roles.

You can easily see how believing that “IoT drives 5G” (for example) could lead a 5G player to think about buying into the IoT space, but that would likely be wise only by happy accident.  There are a lot of steps, players, products, and market developments that need to connect those two dots, and unless the buying company is darn sure they can make all those steps happen, the decision is a major risk.

A good question to ask is “is the company I’m buying already seeing revenue growth from the opportunity I think the deal will help me exploit?”  If the answer isn’t a decisive “Yes!” then there are probably some pieces missing in the opportunity ecosystem.  Too many companies mistake media hype for revenue growth here.  The media will run any exciting story.  They will seek out people who can say something that makes it more exciting.  If they ask an analyst how big the market is, and they get a ten-billion-dollar-per-year answer, they’ll tell the next analyst they ask that the bid is ten billion, and ask if they’re willing to raise.  That’s how we get future market estimates that rival the global GDP.

Even if you have a viable opportunity and credible ecosystem, you still have to decide whether you can exploit it.  If you don’t have a product available to support an emerging opportunity so credible and immediate that you can’t wait to build something on your own, then you’re probably not very good at strategizing that market.  Is the company you’re buying any better at it?  Where are the smarts needed going to come from?  Can either company quickly generate a lot of media buzz (get some analysts to help if you can’t!)?  Can you put together a strategy that you can execute on quickly enough?  All these are critical questions.

All of the answers to all the questions may depend on another point, which happens to be another of the reasons why company acquisitions go wrong.  Can you, acquiring a new company, merge the workforces and cultures quickly enough, and with minimal resentment, so that a new working team can be created?  Assume the answer is “No!” unless you’re ready to work hard, and when your whole reason for the merger is to reduce combined headcount, you’re really in it deep.

A decent number of M&As are driven by what’s called “consolidation”.  The theme is that Company A and B are competitors, both with decent market share.  If one buys the other, the resulting company would have the combined revenue of the two, and would be able to cut a bunch of cost, including workforce and real estate.  That’s arguably what brought Alcatel and Lucent together, and that marriage is still in counseling, even after the whole package ended up inside Nokia.  Any time there are going to be job cuts associated with M&A, you can assume that all those who see themselves at risk will start looking around “just in case”.  If they find something, they may not wait to see how the dice fall.  Those who are most likely to move are those with the best skills, who can attract the most favorable offers.  In short, the very ones you needed to keep.

Symbiotic M&A is fairly common in the software space, but even there it poses challenges.  The VMware decision to buy Pivotal, for example, is going to take some positioning, marketing, and strategy finesse to make it work.  In the network space, I think it’s fairly rare to see a good M&A.  For some companies, it never seems to happen.

That’s what Ericsson should be thinking about.  Ericsson has never been a marketing powerhouse.  Like most telco suppliers, they’re not strategic wizards either.  Imagine, given the formula for successful M&A, how difficult it would be if you don’t understand much about strategy and you can’t communicate what you know in any case!  There has to be a real vision of the future behind the deal, behind almost any M&A deal that’s destined to succeed.  Ericsson needs to promote it in every way, or they’ve just tossed a bundle of money away.

Event-Driven OSS/BSS: Useful, Possible…?

Why are operators interested in event-driven OSS/BSS?  Maybe I should qualify that as “some operators” or even “some people in some operators”.  In the last ten days, I’ve heard (for the second time) a sharp dispute within an operator regarding the future of OSS/BSS.  I’ve also heard vendors and analysts wonder what’s really going on.  Here’s my response.

First, it’s important to remember that OSS/BSS systems were originally “batch” operations, meaning that their inputs weren’t even created in real time.  People processed orders and entered them, or field personnel went out and installed or fixed things, and then changed the inventories.  In short, OSS/BSS started off like all IT started, with something like keypunching and card reading.

Over time, what we call “OSS/BSS” today evolved as a transaction processing application set built for the network operator or communications service provider.  Online transaction processing (OLTP) replaced batch processing for the human-to-systems interaction model.  Eventually, as it did in other verticals, this OLTP was front-ended by web portals for customer order and self-service, and even for use by both the operators’ office staff and field personnel.

The recent trends (recent in telco-speak, meaning in the last decade) that have been pushing some people/operators toward change have emerged largely from the network side rather than from the service side.  Physical provisioning used to be the rule.  If you wanted a phone line, somebody came out with a toolbelt and diddled in some strange-shaped can sitting (usually at a crooked angle) somewhere on the lawn.  You still have to get some physical provisioning, of course, but with IP networks the “service” is delivered over access facilities often set in place earlier.  Most of the stuff that has to be handled relates to the network side, now automated as opposed to manually configured.

So now, we have this network management system that actually tweaks all the hidden boxes in various places and creates our service.  You make service changes as box-tweaks, and if there’s a problem a box manager tells the service manager about it, so there can be an update to the customer care portal and so there can be (sometimes) consideration for an SLA violation and escalation or charge-back.

You can glue this new stuff onto the existing transaction-and-front-end-portal stuff, of course, and that’s what most operators have done.  Changing out an OSS/BSS system would be, for most operators, about as stressful as changing out a demand-deposit accounting system for retail banks.  The people who don’t want to see OSS/BSS revolution represent the group of operators where this stress dominates.

There are two issues that are driving some operators and some operator planners to doubt this approach.  First, there’s the long-standing operator fear of lock-in.  Remember how hard changing OSS/BSS systems would be?  For a vendor it’s like buying a bond—regular payments and little risk.  Second, as services created via the NMS get more complex and change more often, the relationship between OSS/BSS and NMS gets tighter, and the OSS/BSS can constrain what could be offered to customers.

When you have users at portals, customer service reps at transaction screens or portals of their own, and network condition changes all pouring into the same essentially monolithic applications, no good can come from it.  Collisions in commands and conditions can bring about truly bad results, and this is why service complexity tends to favor a modernized approach to OSS/BSS.  It’s also why you hear about making things “event-driven”.

But being event-driven opens other doors.  If we go back (as I know I often do!) to the TMF NGOSS Contract model, we find an approach that links network and operations events to processes via a contract data model.  Event “X” in Service State “3” means “Run Service Bravo”.  This has major implications on lock-in, and even on whether there needs to be anything we’d call an OSS/BSS at all.
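As a concrete (and heavily simplified) illustration of that steering idea, here’s a minimal Python sketch; the states, events, and process names are my own inventions for the example, not anything taken from the TMF specifications.

```python
# Minimal sketch of NGOSS-Contract-style event-to-process steering.
# The states, events, and processes below are invented for illustration;
# a real TMF contract data model is far richer than this dictionary.

from dataclasses import dataclass, field

@dataclass
class ServiceContract:
    service_id: str
    state: str = "Ordered"
    data: dict = field(default_factory=dict)   # the shared contract data model

# Stateless process implementations: they operate only on the contract record
# they're handed, so any instance of any process can handle a targeted event.
def activate_service(contract, event):
    contract.data["activated"] = True
    return "Active"                            # next state

def handle_fault(contract, event):
    contract.data["last_fault"] = event
    return "Degraded"

def close_out(contract, event):
    return "Terminated"

# The state/event table: (current state, event type) -> process to run.
STATE_EVENT_TABLE = {
    ("Ordered",  "ActivateRequest"): activate_service,
    ("Active",   "FaultReport"):     handle_fault,
    ("Degraded", "FaultCleared"):    activate_service,
    ("Active",   "CancelOrder"):     close_out,
}

def dispatch(contract, event_type, event):
    """Steer an event to a process based on the contract's current state."""
    process = STATE_EVENT_TABLE.get((contract.state, event_type))
    if process is None:
        return  # not meaningful in this state; a real system would log or queue it
    contract.state = process(contract, event)

# Usage: a portal order and a network fault drive the same contract instance.
c = ServiceContract("svc-001")
dispatch(c, "ActivateRequest", {"source": "customer portal"})
dispatch(c, "FaultReport", {"source": "NMS", "detail": "link down"})
print(c.state, c.data)
```

The point of the table is that the processes never need to know about each other; adding a service behavior means adding a row, not modifying a monolith.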

What the NGOSS Contract does is to “service-ize” operations software.  Instead of a big monolithic chunk of code, you have a bunch of services/microservices that do very specific things.  A software vendor could offer “event services” instead of monolithic OSS/BSS systems.  Some of the event services might actually look a lot like a retail portal or an analytics tool, so some general-purpose “horizontal” software could be included where appropriate.  Operators could mix and match, which may be why vendors really hate this approach.

It may also explain why the whole TMF NGOSS Contract thing didn’t take off when it came along well over a decade ago.  The TMF has recently shown some signs it would like to resurrect the concept and make something of it, but in order for that to happen, the network operator members of the body will have to out-muscle the OSS/BSS vendors.  In most international groups I’ve been involved with, the operators are novices about manipulating group processes, so this is going to be a difficult challenge both for the operators and for the TMF.

What happens here could be important for the OSS/BSS space, providing that the TMF does advance the NGOSS Contract notion and that it’s actually implemented.  Remember that this was advanced once and wasn’t implemented, so we can’t take TMF acceptance as an indication of an actual product change.  If we do get this change, it could be the beginning of a period of rationalization of operations and business support software, which most would probably agree is long overdue.  It might also have other impacts.

The first is that this change might percolate into service provider operations overall.  Network management is even more event-centric than OSS/BSS, and yet the NMS and even zero-touch automation models currently evolving are really as monolithic as OSS/BSS systems.  If operators see the advantages of event-driven OSS/BSS, can they fail to see that extending the principle into network management, and thus to operations overall, would benefit them significantly?

The next question is whether a “contract-data-model” approach, combined with event-to-process steering by the model, could be used in other applications.  Given that operations processes have generally been converging on the same transactional-and-portal model that applications in general have followed, could operations software lead the rest of the space into a contract-driven approach?  If so, it would be a total software revolution.

The use of the contract-data-model approach would guide the path toward the use of microservices in application software.  Almost anything that deals with the real world, including traditional transaction processing, can be re-visualized as event-driven.  The limitation the approach puts on the processes (that they’re stateless, insofar as they operate only on the contract data model itself) could encourage functional decomposition to the appropriate level.  The result would be resilient and scalable because any instance of any given process could handle the events that the state/event relationship targeted to it.

Might this revolutionize the cloud, even create a kind of SaaS-as-microservices future?  Not so fast.  All event-driven systems have an inherent sensitivity to latency, because the in-flight time of data creates a window in which simultaneous events can’t be contextualized.  The same problem occurs (more often) in monolithic software implementations of event systems, where events have to be queued for processing when resources are available, and this loss of context is one reason why that monolithic approach isn’t suitable for event-driven systems, including those of network and service management (which is why I don’t like ONAP).

Apart from the contextual problems, event-driven systems have to manage latency to prevent workflows from accumulating too much of it as they pass through a sea of microservices.  One of the reasons why it’s important to view a network as a series of hierarchical black-box intent models is to control the scope of event flows, so that you don’t end up with excessive response times.  If you want to believe in SaaS-for-everything and event-driven at the same time, you’d have to think carefully about how the contract data models and state/event tables were structured, and plot the event flows and workflows carefully.  Of course, you don’t have to go event-driven to make OSS/BSS a SaaS project; monolithic software can be hosted in the cloud and offered as a service, but that’s another topic.
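To make the scope-containment point concrete, here’s a minimal sketch of hierarchical black-box intent models absorbing events locally and escalating only a summarized state change; the class, the events, and the escalation rule are all invented for illustration.

```python
# Sketch of hierarchical intent models bounding event scope. Everything here
# is illustrative; the point is that each model handles its own events inside
# the black box and passes only a single summary event upward, so event
# traffic (and the latency it accumulates) stays local.

class IntentModel:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.state = "meeting-sla"

    def handle_event(self, event):
        if self.remediate(event):
            return                      # absorbed locally; nothing propagates
        # Only an SLA-affecting failure escalates, as one summary event.
        self.state = "sla-violation"
        if self.parent:
            self.parent.handle_event({"type": "child-sla-violation",
                                      "severity": "major",
                                      "child": self.name})

    def remediate(self, event):
        # Placeholder: a real model would reroute, re-provision, scale, etc.
        return event.get("severity", "minor") == "minor"

service = IntentModel("vpn-service")
access = IntentModel("access-leg", parent=service)
access.handle_event({"type": "port-flap", "severity": "minor"})   # stays local
access.handle_event({"type": "fiber-cut", "severity": "major"})   # escalates once
print(service.state)   # 'sla-violation'
```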

I believe that OSS/BSS systems are inevitably moving from specialized purpose-built software to collections of horizontal tools.  The key, for operators, is to recognize that at the same time there’s an event-driven dimension moving the needle, and if they ignore the latter trend, they may end up with a bunch of connected monoliths instead of a collection of services and microservices, and in the cloud age that would be a very bad outcome.

Why Separating the Control and Data Planes is Important

The separation between control and data planes is a pretty-well-accepted concept, but one complicating factor in the picture is that the term “control plane” isn’t used consistently.  Originally, the “control plane” was the set of protocol messages used to control data exchanges, and so it was (as an example) a reference to the in-band exchanges that are used in IP networks for topology and status updates.  With the advent of more complex services, including 5G, the term “control plane” references the exchanges that manage the mobile network and infrastructure, riding above the “user plane”, which is now the IP network and thus includes the data plane and the old control plane!

About a year ago, I did a presentation to some operators that used a slide that defined four layers of a network.  I put the data plane at the bottom, the control plane next, and then the service plane, which I said was a better name for the layer of signaling that rides above and mediates the IP data flows of the lower two layers.  At the top of it all, I placed the experience plane, which provides the actual experiences that users of a network are seeking—video, websites, etc.

We actually have examples of things in each of these four areas today, and the way the planes relate to each other and to the user is important in understanding the best approach to defining future network architectures and infrastructure.  However, as soon as you start looking at that problem, you confront yet another “plane” that runs vertically along all these horizontal layers.  That’s the virtualization plane, and it contains the signaling that manages whatever hosting and deployment processes are used to realize the functionality of the other planes on a cloud infrastructure.

When we talk about things like 5G, we’re dealing mostly with service functions, and so we’re dealing mostly with things that live in the three lower planes.  My experience plane is relevant to 5G only in the sense that the things in it are part of the public data network that 5G user plane activity is connecting with.  The specs don’t define how we actually achieve virtualization, except through reference to NFV, which actually does a pretty poor job of defining the way we’d want to deploy and manage functional components in any of those planes.

You can see a bit of this, by inference, in my blog about Ericsson’s complaints about Open RAN security.  Since we’re not dealing with the actual mechanisms of virtualization in an effective way, there are indeed potential security issues that emerge.  Not just with Open RAN (as I noted), though.  All of the network planes have the potential for security issues at the functional level (a function could be compromised), and to the extent that they interconnect elements or virtualize them, they also present security risks at the implementation level, meaning an attack on infrastructure or the orchestration and management interfaces.

The special challenge of virtualization, meaning the replacement of fixed appliances like routers or firewalls by hosted features (single features or chains or combinations thereof) that do the same thing, is that we’ve exposed a lot of additional stuff that not only has to be secure, but has to be somehow created and managed without generating a lot of operations burden.

This is why I’ve always liked the notion of “fractal service assembly” or “intent modeling”.  If we presumed that a given feature or function was represented by a functional abstraction linked to an opaque (black-box) implementation, and if we could make the resulting model enforce its own SLA within the black box, then the complexities of implementation are separated from the functional vision of the service.  Since that vision is what users of the service are paying for, that creates a level where service provider and service consumer speak the same (functional) language, and the devilish details are hidden in those black boxes.

One of the advantages of the black-box approach is that creating an abstraction called (for example) “Router-Element” with the same interfaces and management APIs as a real router would let you map those 1:1 to a real router.  Before you say “So what?”, the point is that you could map any software instance of a router, any SDN implementation of a router, or any collection of all of the above that did routing, all to that abstraction.  You now have an evolution strategy, but this isn’t the topic of this blog, so let’s move on.
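Here’s a minimal sketch of what a “Router-Element” abstraction might look like in code; the interface and the three implementations are hypothetical, intended only to show that the layer above never needs to know which implementation it got.

```python
# Sketch of a "Router-Element" intent abstraction: one functional interface,
# several interchangeable implementations. All names are invented for
# illustration; the print statements stand in for real device/controller APIs.

from abc import ABC, abstractmethod

class RouterElement(ABC):
    """Functional abstraction: asserts router-like interfaces, hides the implementation."""

    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None: ...

    @abstractmethod
    def get_status(self) -> dict: ...

class PhysicalRouter(RouterElement):
    def add_route(self, prefix, next_hop):
        print(f"[CLI/NETCONF] route {prefix} via {next_hop}")
    def get_status(self):
        return {"impl": "appliance", "state": "up"}

class HostedRouterInstance(RouterElement):
    def add_route(self, prefix, next_hop):
        print(f"[hosted-instance API] program {prefix} -> {next_hop}")
    def get_status(self):
        return {"impl": "hosted", "state": "up"}

class SdnRoutedDomain(RouterElement):
    def add_route(self, prefix, next_hop):
        print(f"[SDN controller] install flows for {prefix} toward {next_hop}")
    def get_status(self):
        return {"impl": "sdn-domain", "state": "up"}

def build_service(element: RouterElement):
    # The service layer works only against the abstraction.
    element.add_route("10.1.0.0/16", "core-gw-1")
    return element.get_status()

for impl in (PhysicalRouter(), HostedRouterInstance(), SdnRoutedDomain()):
    print(build_service(impl))
```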

The relevant advantage is that we now have, because the implementation of what’s inside the black box is opaque, a unit of functionality that doesn’t assert any of the interfaces or APIs that the implementation (versus the functionality) requires.  We can contain the scope of the internal service and control plane layers of my model.  We can also now define a virtualization-plane process that operates only on the functions.  Inside the black boxes there may be virtualization, but it’s invisible.  This is what fractal service assembly is about.  You divide things up, make each thing opaque and autonomous, and then deal with a network of things and not one of specific technologies.

If the service plane, then, deals with “thing networks”, it becomes independent of the implementation of the things.  We can build services without worrying about implementations.  We can, by asserting proper interfaces from the data and control planes into the service plane, construct service interfaces for 5G’s “control plane” to consume.  Our black-box service plane is then the 5G user plane, and our service plane has a layer that creates the proper 5G interfaces, then defines the 5G processes that consume them.  Those, like our lower level, can be functional black boxes that hide the implementation.

This is the important point here, the conclusion I’m aiming for.  Every layer in our approach is a black box to the layers above, just as the OSI model mandated decades ago.  We have to define the binding interfaces between layers, represent them as intent-modeled interfaces, and then focus on how to implement the inside of those black boxes, both in terms of current technology and in terms of the various visions of the future.  The approaches that can do that will save us, and the others will be simply more marketing pap for the masses.

Is Ericsson Right About Open RAN Security?

Vendors love to rain on open initiatives, so it’s not surprising that Ericsson, perhaps the leading 5G vendor, is now casting clouds and shade on Open RAN.  Specifically, they’re warning about the security risks it might present.  Is this yet another of those cynical vendor ploys, or is there something to the issue?  In particular, are there practices that potential Open RAN users should look for, or look out for?

Open RAN is just what the name suggests, a model for implementing the 3GPP 5G RAN (or New Radio, NR) spec in an open way, rather than a way that encourages monolithic devices.  It’s gained a lot of traction in the media (see HERE and HERE in just the last week), and a lot of operators are also paying attention.  That, of course, is likely to arouse vendor concerns.

The fundamental objection Ericsson makes in the blog I referenced above is that Open RAN creates an “expanded threat surface.”  The key quote is “The introduction of new and additional touch points in Open RAN architecture, along with the decoupling of hardware and software, has the potential to expand the threat and attack surface of the network in numerous ways….”

I have a major problem with this, frankly.  What it sounds like is a plea to abandon software/hardware systems in favor of proprietary appliances.  Every 5G network, or any other kind of network, is controlled at the service and network levels by software running on servers, which sounds like decoupling of hardware and software to me.  Not to mention the fact that there’s probably not a single initiative in all of networking whose purpose is to bring software and hardware together in an appliance.

But let’s look at the specifics, and to do that, we have to use a reference for an Open RAN architecture, so I’m providing this figure from the O-RAN Software Community.  One specific point is the hardware/software decoupling, which is IMHO total nonsense, so we won’t pursue that further.

The first specific complaint Ericsson makes is that “New interfaces increase threat surface – for example, open fronthaul, A1, E2, etc.”  On the surface, this might seem at least possibly true, but I think that doesn’t survive examination.

The A1 interface, as the figure shows, links the Near-Real-Time RAN Intelligent Controller (RIC) element with orchestration and service management.  Every device in any network is going to have a management interface.  Every software element in an application or virtual-function-based service (remember that 5G is supposed to be based on NFV) has orchestration and management.  To claim this is an attribute specific to Open RAN is silly.  If your implementation of 5G doesn’t protect management and control-plane interfaces, you have a problem whether you’re Open RAN or not.

The E2 interface(s) are similar; they are part of the 5G “Control Plane” and should be totally separated from the user plane.  Who’s attacking there?  They’d have to be part of the 5G infrastructure, and again, if you have no protection for the components of your own infrastructure, there’s no hope for security in any form.

“Management interfaces may not be secured to industry best practices?”  Sure, that’s true, and it’s true for all the management and control-plane interfaces in any compliant 5G implementation.  One reason to separate data and control planes is to prevent security problems.  5G mandates that.  Sorry, but this is silly too.

The final point Ericsson makes is “(not exclusive to Open RAN):  adherence to Open Source best practices.”  The parenthetical qualifier says it all.  If this is a blog about Open RAN security problems, why include something that’s not specific to Open RAN?  In any event, open-source best practices aren’t necessarily best for network function virtualization in any form; they tend to be oriented toward IT and applications.

It is true that there are orchestration (DevOps), administration, and security measures appropriate to the design, development, and deployment of the Open RAN elements.  There are the same issues with VNFs, or management system components, or any other piece of software that’s used in a network service application.  Again, the solution to any Open RAN issues would be to address the issues for everything that’s software-implemented in 5G, NFV, OSS/BSS, NMS, ZTA, and all the rest.  We don’t even know at this point whether cloud solutions to deployment like Kubernetes, or for event movement like a service mesh (Istio) could fully address the requirements of a network implementation.  Do we think that this means that we should turn back the clock to the network equivalent of the Old Stone Age and commit to a future of bearskins, stone knives, and network appliances?

So, is this a cynical play or a real definition of issues?  Both.  On the one hand, there is nothing in Ericsson’s blog that IMHO points to an Open RAN-specific problem with security.  What they’re saying is that if you’re stupid enough to fail to secure your own control and management interfaces, Open RAN will add to the things you’ve been stupid about.  True, but in a security sense, not relevant, which brings us to the other hand.  3GPP, NFV, SDN, and just about every other initiative in the telco space for the last decade, has played fast and loose with the question of how you isolate the critical control- and management-plane functions.  If they’re even addressable in the data plane, it’s a problem, period.  Even if they’re not, there needs to be a barrier between the components that could be provided as part of a service, and the components that are part of the service infrastructure.

I’ve complained about the lack of address space discipline in many of my blogs, relating most often to NFV (Ericsson was a member of the NFV ISG and they didn’t stand up in support).  There’s a general tendency among telco people to ignore the issue, and I think it’s been ignored in the 3GPP too.  Any service control plane or “Level 3 control plane” (the two are different; in 3GPP 5G terms, the “user plane” is the Level 3 data plane and control plane and the “control plane” is a higher-level 5G layer) should be separate from the data/user plane.  That can be accomplished by maintaining them in a separate address space (one of the RFC 1918 private IP address ranges) or by defining a virtual private network or SD-WAN.
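As a small illustration of that address-space discipline (the endpoint names and addresses below are invented), a sanity check that control- and management-plane endpoints sit in RFC 1918 private space might look like this:

```python
# Minimal sketch of address-space discipline: verify that control- and
# management-plane endpoints live in RFC 1918 private space, separated from
# the user/data plane. The endpoint list is invented for illustration.

import ipaddress

RFC1918 = [ipaddress.ip_network(n) for n in
           ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

control_plane_endpoints = {
    "near-rt-ric-a1": "10.20.0.15",
    "e2-termination": "10.20.1.7",
    "mgmt-portal":    "203.0.113.40",   # oops: publicly routable
}

def is_private(addr: str) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)

for name, addr in control_plane_endpoints.items():
    if not is_private(addr):
        print(f"WARNING: {name} ({addr}) is reachable from user-plane address space; "
              f"isolate it via private addressing, a VPN, or SD-WAN.")
```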

That same thing holds true for software elements and their deployment.  If you have a lazy process of development and certification for security and compliance, nobody has to attack you from the outside, they can simply sell a piece of malware into your framework and let it eat everything from the inside out.  A new generation of parasitism?  That’s not what we should be doing, but Open RAN doesn’t create the risk.

It may be true that the Open RAN stuff adds interfaces and APIs and even components to the threat surface, but there are more than enough interfaces there already to constitute a threat, and I don’t think that Open RAN increases the threat level in any meaningful way.  It’s surely true that if one were to apply address-space or virtual-network separation for the control layers of 5G, however it’s implemented, the result would protect all those interfaces/APIs and defend the threat surface universally, as long as you didn’t for some inexplicable reason keep some of the interfaces outside the scope of your remedy.

There are two lessons to be learned from Ericsson’s paper, I think.  First, these are troubling times for network operators and network equipment vendors, and in those situations, we have to be particularly wary of any statements or documents that clearly serve vendors’ interests, even if they’re couched in service-to-the-industry terms.  Vendors have always had blinders on with respect to things that touch their bottom lines, just as pretty much everyone else has always had.  If this is a surprise to anyone, welcome to the real world.

The second lesson is more important.  In the network operator or telco space, we are seeing a movement to the cloud that’s guided by bodies and people who don’t really understand the cloud.  That’s perhaps understandable too, because the technologists in both the vendor and operator spaces have focused on network-building as a task of assembling and coordinating boxes.  Once you make an explicit commitment to virtualization and functions, you are entering a whole new world.  Add in an explicit commitment to the cloud, and it’s immeasurably worse.  This is a field for specialists, and those specialists rarely work for telcos or their vendors.

At some point, the networking community is going to have to make a choice.  They can keep blundering around in the pseudo-cloud world they’ve created, a world that bears little resemblance to the real cloud world.  They can sit back and wait for the real cloud world to solve the problems of virtualized networking, either as a mission or by accident.  Or…they can learn enough about the real world of the cloud to guide themselves to their own best outcome.  The real message of the Ericsson complaint about Open RAN security is that it’s time to make that choice, or it will be made by default, and not in the optimum way.

I don’t want to beat Ericsson up unfairly on this; every single major network vendor engages in at least a subtle marketing sabotage campaign on threatening initiatives.  I’ve seen real arguments between vendors and operators at standards meetings, around points that would threaten a vendor’s incumbency in the name of openness.  I also want to note that the actual Ericsson blog is more a qualified list of concerns than the kind of overt attack on Open RAN that many stories on it described.  Still, if the goal of the piece was to call attention to the risks of virtual infrastructure, it should have been done in a more general and constructive way.

Oracle’s Quarter and the Future of SaaS

The biggest players in a space always set the tone, but they don’t always tell the story.  Oracle last week turned in a good quarter, and they’re in many ways an outlier in the public cloud space.  They don’t figure in most media cloud discussions as they rank number five in most public cloud reports, but they do represent a fairly unique blend of traditional (IaaS) and SaaS cloud services, and they do better in the latter.  They also represent a company that’s made the transition from selling software to selling cloud services pretty well.  They’re also the (apparent, but with unspecified terms) winner in the TikTok battle, which I’ll leave out of this blog pending more detail.

Financially speaking, Oracle reported a successful quarter, not only beating estimates but also beating the same quarter last year, pre-COVID.  They also guided higher for the next quarter.  The bad news (at least somewhat bad) is that their latest quarter was below both of the prior quarters, so they did have some exposure to the COVID problem.  They may have a better exposure to the recovery, though, and it’s the market factors that create that exposure that make them interesting.

The impact of COVID on enterprise IT is still evolving, but it’s pretty clear that the virus and lockdown have influenced budget planning.  Today, over two-thirds of enterprises tell me that they plan to shift more spending to IT-as-an-expense, away from traditional capital-centric IT.  They like the idea of being able to grow and shrink spending to match their expected revenues, and they realize that a hit like the one they got from COVID could be weathered more easily were they more agile.

Some software vendors have addressed this trend, visible even before COVID, by shifting software to subscription licensing rather than one-time payment plus maintenance, and by basing the license in some way on usage rather than on something static like company size.  However, the most obvious way to address a need to shift from capital-IT to expense-IT is to use the cloud.  But….

…there is a big difference in the agility of cloud services.  Traditional cloud services will largely displace data center equipment and/or data center incremental investment.  Since cloud hosting tends to consume either newly developed software or software that’s moved from somewhere else, it doesn’t impact software costs (unless you’re unlucky enough to have a strange software license that might actually charge more for cloud hosting).  The kind of cloud service that best fits the goal of shifting IT costs from capital to expense is SaaS, which happens to be the big focus of Oracle.

Big, but not exclusive.  Oracle also has an IaaS cloud offering, though their market share in IaaS (which I estimate at about 6%) is less than their share of SaaS (which is about 12% if you include Microsoft Office 365 as a SaaS offering, or over 28% if you don’t).  This year, I’ve noticed that Oracle is cropping up in comments enterprises make to me about cloud planning, which it didn’t do much in 2019.  Companies who were willing to talk about the reason had an interesting tale to tell.

Shifting to a SaaS model isn’t easy for enterprises, for the obvious reason that whatever applications account for the largest part of their IT budgets aren’t provided in SaaS form.  They have to change applications to move to SaaS, and that sort of change can create major pushback from line departments whose users see different interfaces and often different process flows.  That’s problem enough, but there’s another problem rarely talked about in SaaS transformations.

Moving applications to SaaS form, to save money, has to displace IT resources, which means that the data center is likely to have less overall capacity.  Moving applications to SaaS may also have an impact on how the applications integrate with other applications that haven’t been moved, and perhaps can’t be moved yet.  Some users are telling me that Oracle’s SaaS/IaaS combination, combined with Oracle’s features and skills at integrating the two inside Oracle’s cloud, facilitates their shift to SaaS.  It might then be that Oracle’s approach will gain them not only SaaS market share but IaaS market share as well.

Oracle cited some indication of this on their call.  After citing analyst firms’ notice of their SaaS offering, Ellison says that “…what’s interesting is that those same analysts are beginning to take notice of the technical quality and customer satisfaction associated with Oracle’s Cloud infrastructure as a service business.”  It’s likely that the analysts, like me, were hearing from users about their strategies for increasing SaaS adoption.

Oracle, of course, had the advantage of having applications they could move to the cloud in SaaS form, and it’s actually fairly easy to spawn a user-specific application instance on top of an IaaS service and frame the offering as SaaS.  Over time, you could then redo critical parts of the application to facilitate more efficient use of the compute platform.

The ability to create a symbiosis between a SaaS and IaaS strategy is helpful, but you need SaaS to drive the bus here.  The big question in both software applications and cloud computing is whether the COVID challenge will result in a major shift toward SaaS, which would imply a major shift away from enterprise custom development in favor of packaged software.  It’s not rational to assume there’s much money in framing an enterprise’s own software as SaaS; it has to be a vertical or horizontal package.

Enterprises don’t write nearly as much of their own software today as they did in the past.  I can easily remember a period when the larger enterprises wrote most of their own applications, and when even a “fourth-generation language” to facilitate development (what we’d call “low-code” today) was a revolution.  A prerequisite for acceptance of SaaS is an acceptance of canned applications to replace self-developed stuff.

That’s not something that develops instantly, and we have to remember that we’ve had SaaS for some time.  There have been local successes in SaaS, mostly in the very applications that Oracle and Salesforce compete in, but broader SaaS has to come from broader and more vertically focused applications.  That’s something Oracle may be thinking about, but I think that class of SaaS, and the full realization of Oracle’s ambitions in the cloud, may take some time.

Ciena Gets Some Mojo

Ciena beat its estimates in both EPS and revenue, which raises the question of whether operators are moving more to a “capacity-biased” model of network architecture even without a complete picture of how such a model could be optimized, or how it could contribute to improved profits in the long run.

An interesting jumping-off point for this is a quote from Gary Smith, the CEO: “…the networks today are more ready than before for a step function in capacity to support demand.”  Ciena attributes a lot of its success to the faster-than-usual adoption of its latest product, WaveLogic 5, which is an 800G solution.  Capacity is not only the philosophical basis for networking; the quest for capacity is arguably the fundamental driver of network infrastructure spending.

Transport is essential, but it’s the bottom layer of a stack that converts raw optical capacity into connectivity and services.  Why, then, is Ciena beating on revenue and EPS when vendors higher in that stack, closer to the real buyer of the capacity, are reporting misses?  There are a number of reasons, some pedestrian and some interesting.

One pedestrian reason is that you have to establish transport in order to offer services.  Anything that introduces new geographies, new facilities, is going to have to be a termination point for optical transport.  That gives players like Ciena an early chance to pick a seat at the infrastructure-project table.

Another non-strategic point is that, as Smith says, WaveLogic 5 is in a class by itself in optical transport, so the head-to-head competitive posturing that higher-level vendors have to deal with is much reduced, or absent, for Ciena.  If you can get an 800G product that’s market-proven, and you’re seeing the usual industry trend toward higher and higher bandwidth demand, why not swing for the seats (to stay with our “seats” theme), equipment-wise?

Getting a bit more strategic, Ciena’s ability to supply the top-end transport gear combines with the natural desire of buyers to have a single-vendor transport architecture, to give Ciena a good shot at account control in the transport area.  Remember that priority seating guarantee that WaveLogic 5 could offer?  Ciena can use it to save seats for other Ciena products, and the bigger piece of the deal you can cover, the more indispensable you become.  It also lets you afford to keep dedicated account resources in play, further improving your control of accounts.

More strategic yet is the opportunity to use account control and customer credibility to climb out of pure optical into the packet layer.  Ciena reported “…several new wins in Q3 for our Packet business. We now have more than a dozen customers for our Adaptive IP solutions, including Telefónica UK and SK Telink, which we announced in Q3.”  Network budgets for carrier infrastructure are, to a degree, a single food source in an ecosystem full of predators.  One way to maximize your own position in that situation is to steal food from others, and Adaptive IP can steal lower-level (Ethernet-like) connectivity budget from the router vendors.

Adaptive IP is also a bridge to the Blue Planet operations/orchestration software.  Ciena had a mobile-operator Blue Planet win in Q3, demonstrating that it can use its transport position to bridge upward into network operations and automation.  This, IMHO, is an important step toward delivering on the positioning story of network-as-a-service (NaaS), which is all about creating agile transport to steal functionality from higher layers, and router vendors.

I’ve been critical of Ciena’s ability to deliver on its Adaptive IP and Blue Planet stories, but it seems like they’re doing a bit better at that.  Part of the improvement, according to operators, is the result of operator concerns that old network-building paradigms are hurting profit per bit.  Part, according to some operators I’ve talked with, is due to the fact that Ciena is doing a bit better in positioning their assets.  Their story isn’t perfect, particularly for situations where a prospective buyer gets the story first from website collateral, but it’s improved.

This is important to Ciena, because their revenue and EPS beats can’t disguise the fact that overall optical transport spending is under considerable pressure because of the coronavirus and lockdown.  As Smith says, buyers are finding “they can run their networks a little hotter.”  That defers investment in infrastructure, but of course all bad things, like all good things, “must end someday.”  What Ciena has to be thinking about now is what the rest of the vendors in the network stack are thinking about, which is “What happens when things get back to normal?”

Transport is inevitable.  Are operators and other transport consumers offering a temporary priority to this layer because it is so fundamental, and will they then overspend, relatively, in this network layer and at this point in time?  If so, normalcy won’t necessarily mean a big uptick for the optical layer.  The other higher-layer vendors might then see their numbers go up, and with that see themselves better positioned at the infrastructure-spending table.

Ciena now needs to manage the transition to normalcy.  They have to enhance their packet-optical and Blue Planet stories further, spread the net of marketing wide to open a dialog with those who aren’t currently engaged, and with the higher-layer planners of the buyers they already have.  Smith, on the earnings call, acknowledges that Ciena may have seen the uptick toward normalcy earlier than higher-layer vendors did, simply because those vendors were at a higher layer and networks have to build upward.  They have to expect the other layers will see the uptick eventually, and they have to compete for eyeball share then, as well as now.

What Ciena still lacks is that net of marketing.  It’s not surprising that companies who sell to a relatively small number of enormous customers through gigantic deals would rely more on sales, but in periods of change, lack of a strong marketing/positioning strategy puts a lot of burden on the sales force, and it’s particularly dangerous when you have to engage with new companies, or new people in the same company.

If there’s an agile packet optics function that somehow lives between transport and IP, and if that layer can work with transport to simplify IP, then it has value to the buyer.  If that layer can be definitively and tightly coupled to transport, then vendors like Ciena reap that value in the form of sales.  If the layer floats without an anchor, then the more aggressive vendors are likely to grab it up, and nobody’s likely to think an optical vendor is aggressive, either in technical evolution or marketing effectiveness.

Transformation isn’t confined to optics or transport.  You can see the router vendors contending with the pressure to define a new infrastructure model, and that same pressure will impact the transport layer eventually.  That it’s impacting other layers now means that people Ciena doesn’t ordinarily engage with have already gotten those seats I’ve mentioned time and again in this blog.  Their views will inevitably impact Ciena’s deals, and so Ciena needs to spread its message to them, and clarify their role in that future infrastructure model for all.

The Many Dimensions of and Questions on VMware’s Telco Strategy

VMware came out with an announcement of their 5G Telco Cloud Platform, the latest step in the developing battle for all those carrier cloud applications.  Their press release is a bit difficult to decode, but it’s worth taking the effort because the company is presenting what’s perhaps the last chance of network operators to take control of their cloud destiny.  The biggest question, though, is whether they want to do that.  As I said in yesterday’s blog, operators now seem determined to avoid carrier cloud.  Does this impact VMware, then?

Almost every network operator (“carrier”) has a transformation vision that includes, and is often centered on, a shift from specialized appliances to some form of hosted-feature framework for their network infrastructure.  This has come to be known as “carrier cloud” because all of the approaches involve some form of cloud computing.  I’ve modeled the opportunity for carrier cloud since 2013, and the “optimum” deployment (the one that returns the best ROI) would create about a hundred thousand new data centers by 2030, the great majority qualifying as “edge” data centers.

In 2019, as I’ve noted in prior blogs, enterprises had an epiphany about their use of the cloud, recognizing that most of their cloud benefits would be derived from a split-application model that hosted the GUI-centric front-end piece in public cloud facilities while the transaction processing back-end piece stayed in the data center.  The carriers had their own epiphany that same year—they realized they didn’t really know how to go about building carrier cloud and the cost implications were frightening.

The result of this realization/fear combination was that operators suddenly started seeing public cloud providers as partners in carrier cloud applications, a way of dipping their toes into the waters of the cloud without risking drowning.  The cloud providers were more than happy to take advantage of the interest, and all of them now have carrier cloud programs, focusing primarily on 5G Core feature hosting as the early application driver.

The problem with this, from the operators’ perspective, is that the cloud provider solutions are complete enough to put operators at risk for vendor lock-in, and also create (in their minds, at least) the risk that they’d get committed to a public cloud transition strategy, only to find they can’t transition because they never developed a path in the self-hosting direction.  That pair of concerns is what VMware seems to be focusing on.

The 5G Telco Cloud Platform is a platform, not a 5G-as-a-service approach.  Its primary virtue is that it can host any cloud-open solution to a carrier feature-hosting mission, meaning that it forms a kind of carrier-cloud middleware.  You can deploy it on your data center and in one or more public clouds, and the carrier cloud solution is then non-specific to partners or to the extent to which the operator wants to commit to public versus private hosting of carrier cloud.  In short, it removes both of those risks.

Why, then, is this approach not sweeping the markets and dominating the news?  The press release was dated September 1st, and it was picked up in Light Reading’s news feed, but not on their homepage as a feature story.  I think a part of that is that the VMware carrier-cloud position was covered earlier by some media in connection with the Dish adoption of the platform.  Another reason, I think, is that the positioning in the release doesn’t capture the reality of the situation.

Let me offer this quote as an example: “As CSPs evolve from NFV networks to cloud native and containerized networks, VMware is evolving its VMware vCloud NFV solution to Telco Cloud Infrastructure, providing CSPs a consistent and unified platform delivering consistent operations for both Virtual Network Functions (VNFs) and Cloud Native Network Functions (CNFs) across telco networks. Telco Cloud Infrastructure is designed to optimize the delivery of network services with telco centric enhancements, supporting distributed cloud deployments, and providing scalability and performance for millions of consumer and enterprise users. These telco centric enhancements enable CSPs to gain web-scale speed and agility while maintaining carrier-grade performance, resiliency, and quality.”

This is arguably the key paragraph, but what the heck does it mean?  First, telcos are not evolving from NFV networks to cloud-native.  They have no statistically significant commitment to NFV anywhere but stuck on the premises inside universal (white-box) CPE.  Second, there’s nothing less interesting to the media than a story about how you’re evolving your solution, which is what the release says.  Revolution is news, evolution takes too long.  Finally, all the descriptive terms used are pablum; they’re table stakes for anything.

And here’s the best part; the link in the release for more information is broken (that appears to be a problem with Business Wire, not VMware; you can get to the target by typing in the text, or clicking HERE).  From there, you can eventually get to the detail, which is really about the NFV solution that the new approach is evolving from.  Still, in that detail, you can dig out some things that are important, perhaps not to the media but to the buyer.

Here’s why I think this is important, regardless of positioning.  The buyer matters.  What operators actually need at this point is a way to hedge their bets.  They really want to be in the clouds, but they’re more than wary, they’re frightened.  Right now, the cloud looks like this giant foggy mess that obscures more than it resolves.  Operators have jousted with cloud concepts for a decade and had absolutely no success.  They’re still among those who believe that you make something cloud-native by stuffing it in a container and perhaps using Kubernetes to orchestrate it.  “CNF” stands for “Containerized Network Function”, not “CNNF” for “cloud-native network function”.  If you think the cloud is fog, after all, what do you think of cloud-native?  Properly applied, VMware could give them that.

“Properly applied” is a critical qualification here.  The CNF versus CNNF thing is one of my concerns with the VMware story.  The operator use of “CNF” to mean container-network-function is well-established.  That VMware uses it in the quote I provided above raises concerns that they’re redefining “container” to mean “cloud-native”, and sticking with the old NFV evolution story.  More on that in a minute.

Carrier cloud middleware could be a revolution.  You can deploy it in any public cloud, or even all of them, and so you have that toe-in-the-stream option that seems popular right now.  You can’t be locked into a cloud provider because it runs on them all.  You can’t be held hostage in the public cloud forever, seeing your profits reduced by public cloud profits on the services you consume, because you can move the stuff in-house.

Microsoft, as an example, is providing a 5G solution that includes 5G functions.  How portable that will be remains to be seen.  VMware is partnering with various players (as their Dish deal shows) to create 5G solutions that are portable.  This approach could give VMware a direct path to the hearts and minds of the carriers who are looking at virtual-network technology to transform what they have, and build what they’re now getting into.  They’ve just got to sing better.

NFV’s goals are fine, but there’s no evolving from it in a carrier cloud sense because it’s only broadly used, as I’ve said, in CPE.  Yes, the operators want to shift to cloud-native, but they need more than containers to do that.  VMware needs a true cloud-native vision, and they need to explain it clearly and make it compelling.  Then, they have to be prepared to do a fair amount of executive education in their buyer community.

There are some technical shifts needed too.  VNFs are old; not only obsolete but never really helpful in carrier cloud.  Containerizing, meaning CNFs versus VNFs, is only a transitional step, a pathway to a true cloud-native (CNNF, in my terms) goal.  The transformation to CNNFs has to be accompanied by a shift from NFV-specific orchestration and management to cloud orchestration (Kubernetes) and management.  VMware has the platform in place to support the strategy, but they need to develop and then advocate the strategy or nobody will care.

The beauty of VMware’s partner approach to the network functions/features themselves is that if VMware is prepared to advance its platform to support CNNFs, CNFs, and VNFs, it can find partners in each of the areas and promote the cloud-native transformation in a way that acknowledges reality (the CNNF approach is the only cloud-native approach) but also blows the necessary political kisses at NFV proponents who want to justify all the time and effort spent on NFV and its evolution.  Politicians know they have to kiss all babies they’re presented with; vendors marketing to as diverse an interest group as the network operators should do the same.

But what exactly is a CNNF?  That’s perhaps the key question for VMware, because it’s difficult to see how a model of decomposed cloud-native features could be created without in-parallel conceptualizing of the way it would be hosted and the middleware needed to support it.  Obviously, because VMware is supporting an “embedded” or edge-ready version of Kubernetes, they see Kubernetes as a piece of the story.  How much real experience do we have with Kubernetes in cloud-native deployments?  Is most of the heavy lifting done in a service mesh layer instead?  Lots of questions.

This is the big barrier for VMware because they have to answer these questions or they have no end-game to present to the operators.  That their positioning doesn’t depend on operators deploying their own cloud is great, but it’s critical that operators have a vision of what they are deploying, as much as where.  Defining the next step isn’t enough.  Evolution as a long battle for local feature supremacy is an unattractive picture to present to a telco CFO.  Better to show progress toward a goal, and so that CNNF end-game play is where VMware needs to focus.

The challenge in focusing there is that while there’s no question that VMware’s platform can support CNNFs, there are a lot of questions regarding what a CNNF is, and how one is architected into a service.  A critical first step in that, as I’ve said many times, is recognizing that the control and data planes of an IP network have to be separated in order to optimize cloud-native deployment.  The data plane is a simple forwarding process, linked pretty tightly to trunk locations.  The control plane is where cloud-native implementation could shine, but is it separated in VMware’s vision?  Because VMware is talking platform and not service implementation, we don’t know.  That could make it hard for VMware to promote a CNNF-based approach to infrastructure, and without that, their challenges with their 5G Telco Cloud Platform could be significant.
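To illustrate the separation being argued for here, the sketch below is purely conceptual; a real data plane would be a white-box or ASIC forwarder rather than Python, and the control plane would be a set of scalable cloud-native services rather than one class. The point is the division of labor: the forwarder holds only a forwarding table, and all route logic lives in a separately hosted, separately scalable control-plane service.

```python
# Conceptual sketch of control/data-plane separation. All classes and values
# are invented for illustration.

class DataPlaneForwarder:
    """Simple forwarding element, pinned to a trunk location; no route logic."""
    def __init__(self, location):
        self.location = location
        self.fib = {}                       # prefix -> next hop

    def install(self, prefix, next_hop):    # called by the control plane
        self.fib[prefix] = next_hop

    def forward(self, dest_prefix):
        return self.fib.get(dest_prefix, "drop")

class ControlPlaneService:
    """Cloud-hosted and horizontally scalable: computes routes, pushes FIB updates."""
    def __init__(self, forwarders):
        self.forwarders = forwarders

    def on_topology_event(self, prefix, best_next_hop_by_location):
        # Route computation lives here; forwarders just receive the results.
        for fwd in self.forwarders:
            fwd.install(prefix, best_next_hop_by_location[fwd.location])

edge = DataPlaneForwarder("edge-site-1")
core = DataPlaneForwarder("core-site-1")
cp = ControlPlaneService([edge, core])
cp.on_topology_event("10.9.0.0/16", {"edge-site-1": "core-site-1",
                                     "core-site-1": "peer-gw"})
print(edge.forward("10.9.0.0/16"), core.forward("10.9.0.0/16"))
```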

Operators Continue to Back Away from their Own Clouds

The service providers themselves may be giving carrier cloud its death blow, not tactically but strategically.  In the last two months, operators worldwide have been shifting their thinking and planning decisively away from large-scale data center deployments.  Carrier cloud deployments, which my model said could have generated a hundred thousand new data centers by 2030, now look like they won’t happen.  And it’s not just that carrier cloud will be temporarily outsourced to public cloud providers.  It’s G-O-N-E.

In mid-September, operators will begin (with various levels of formality) their normal fall technology planning cycle, which will take till mid-November and guide spending plans for the years beyond.  Over 85% of them now say that they don’t want to “make any significant investment in data centers”.  That doesn’t mean they won’t have them (they do already), but that they are not looking to create services and features that will require large-scale in-house hosting.

The current market dynamic was spawned by operators deciding that, rather than building clouds of their own to offer cloud computing, they’d partner with the cloud providers.  Then the operators started to show interest in hosting 5G features, and all three major public cloud providers are now in a push (Google, most recently) to provide not only minimal hosting but also the 5G software itself.  When that pathway opened for them, operators insisted it was just a transitional approach, a way of scaling costs as 5G deployed.  Now?

Now, they’ve been easing away from their own clouds, obviously.  OSS/BSS systems, their own “enterprise applications,” were the next thing to be ceded by many operators to public cloud hosting.  Hey, enterprises think the public cloud is the next big thing for their applications, so why should service providers be different?  The answer, of course, is that service providers had expected to deploy their own clouds and somehow lost the will…or the justification.

There were two reasons why operators said they weren’t interested in having data centers anymore, and they were roughly equally cited.  The first was that they lacked the skills to build and sustain cloud computing infrastructure, and were doubtful that they’d learn those skills by consuming the infrastructure from a third party.  The other was that they doubted they would ever really have the applications to justify their own carrier cloud infrastructure.  In either case, it boils down to the fact that they don’t want to get into the hosting business.

Part of the problem here is that back in 2013, when I first modeled the carrier cloud space, operators believed that they would be deploying data center resources to host NFV.  Modeling their input on the topic, I came up with an NFV-driven expansion of a thousand data centers worldwide by now.  In point of fact, my operator contacts say we have no data centers we can attribute to NFV.  Without that pre-justification of data centers, the next application would have to bear the entire first cost.

5G, the chronologically next of the drivers, started off in planners’ minds as a pure RAN upgrade—the 5G Non-Stand-Alone or NSA version that ran 5G New Radio over 4G LTE infrastructure.  That was a reasonable evolutionary approach, but the operators came to believe that the competitive 5G market would force them to deploy 5G Core almost from the first.  Had the operators started off with carrier cloud using NFV as the driver, they could have hoped for another three or four thousand 5G-justified data centers by this point.  They started late, and didn’t have the pre-deployed data centers, so they’re behind on this too.

The rest of the application drivers for carrier cloud, the largest drivers, are all now seriously compromised.  IoT, video advertising and personalization services, and location/contextualization-based services are all over-the-top services that operators have historically not offered and are culturally uncomfortable with.  Does anyone think an operator would build out cloud infrastructure on a large scale to prepare for any of them?  They don’t believe it themselves, not anymore, and that’s the critical point.

If you need some specific evidence of this point, consider that AT&T is, according to the WSJ, looking to sell off its Xandr digital advertising unit.  This unit would have been a logical way to exploit new personalization/contextualization features that might be created or facilitated by virtual network infrastructure.  If you had even the faintest thought of future engagements in personalization/contextualization, would you kill off the easiest way to monetize your efforts?  I think not.  Recall, also, that AT&T is a leader in looking to public cloud providers to outsource its carrier cloud missions.

If you’re a software or equipment vendor, this is a disappointing outcome, but frankly one that those very players brought on themselves.  Selling a new technology to a buyer is more than taking an order on a different order pad.  Vendors in the data center and cloud technology space just couldn’t engage the buyer effectively, largely because they didn’t speak the same language.  The fact that all these data center drivers will either go unrealized or be realized on public cloud infrastructure is a serious hit to the vendors who could have built those hundred thousand data centers.

This is also going to have a major impact on the transformation of the network, the shift from routers and devices to software-centric network-building.  When there was a carrier cloud to host on, it was logical to presume that the network of the future would be built largely on commercial servers.  Now, it’s almost certain that it will be built on white boxes and different elements of disaggregated software.

There’s always been a good chance that the to-me-mandatory control- and data-plane separation requirements of software-based network infrastructure would demand a special data plane “server”, a resource pool dedicated to fast packet handling.  The control plane could, in theory, still have been hosted on carrier cloud, but if there’s no carrier cloud and the only alternative is to host the whole network control plane on a third-party provider, control-plane white-box deployment starts to make a lot of sense.

The question is how this comes about.  You can take a router software load and run it in the cloud, in which case your control and data planes are not separated.  You can also take traditional router software and run it in the cloud for control-plane handling alone, letting it then communicate with a local white-box data plane for the packet-handling.  Or you can build true cloud-native control-plane software, in which case whether you run it on a white box, your own server, your own cloud, or a cloud provider’s cloud wouldn’t matter much.  That could facilitate the evolution of the control plane into the binding element between legacy connection services and new over-the-top or higher-layer services.
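
A minimal way to picture that third option: write the control-plane logic against an abstract data-plane binding, so the same code runs unchanged on a white box, an operator’s own server, or a public cloud.  Everything named here is an illustrative assumption; the real binding would be gRPC or something similar.

    def control_plane_cycle(compute_routes, send_fib):
        """compute_routes() returns {prefix: next_hop}; send_fib ships it to a data plane."""
        send_fib(compute_routes())

    # Two bindings the same logic could use, unchanged:
    def to_local_white_box(fib):
        ...   # write directly to a co-located forwarding ASIC via its driver

    def to_remote_white_box(fib):
        ...   # push over the network from control-plane software hosted in a public cloud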

Is the network of the future a data plane of white boxes, joined to a control plane that spans both dedicated white boxes and some sort of cloud, even the public cloud?  Does that cloud-centric piece then expand functionally to envelop traditional control and management functions, new services that grow out of the information drawn from current services, and things we’ve never seen or heard of?  Do operator services and services of over-the-top players somehow intermingle functionally in this control-plane-sandbox of the future?  I think that might very well happen, and I also think it might happen even without a specific will to bring it about.

This might also frame out some of the details of edge computing.  5G already has a near-real-time segment in its control plane, which to me implies that we’re starting to see network/control-plane technology divide into layers based on the latency tolerance of what runs there.  If we’re able to assign things to an appropriate layer dynamically, we can see how something like a mobile-edge node could host 5G features and also higher-layer application and service features that had similar latency requirements.  If we had a fairly well distributed edge, we might even see how failover or scaling could be accomplished, by knowing what facilities exist nearby that could conform to the latency-layer specifications of the component involved.  This might even end up influencing how we build normal applications in the cloud.
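
Here’s a toy Python illustration of that latency-layer idea: components declare a latency budget, and a placement function picks the deepest (and presumably largest and cheapest) tier that still meets it.  The tiers and numbers are invented for the example.

    TIERS = [
        {"name": "mobile-edge",    "rtt_ms": 5,  "capacity": 10},
        {"name": "metro",          "rtt_ms": 15, "capacity": 100},
        {"name": "regional-cloud", "rtt_ms": 40, "capacity": 1000},
    ]

    def place(latency_budget_ms, tiers=TIERS):
        """Pick the largest tier whose round-trip time still fits the component's budget."""
        candidates = [t for t in tiers if t["rtt_ms"] <= latency_budget_ms]
        if not candidates:
            raise ValueError("no tier meets the latency budget")
        return max(candidates, key=lambda t: t["capacity"])

    # A near-real-time 5G function with a ~10ms budget lands on "mobile-edge";
    # an analytics feature with a 100ms budget lands on "regional-cloud".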

One question this all raises is whether the operators are in any position to supply the right infrastructure and platform architecture for carrier cloud.  A more important question, since I think the answer to the first question is obvious, is whether the operators are in any position to define how network features/functions are hosted in carrier cloud.  Should they let the cloud providers run with that, redefining things like NFV and zero-touch automation?  NFV, at least, was supposed to identify existing specs, not create new ones.  Might the trend toward public cloud hosting of 5G end up helping carrier cloud, and even helping operators transform, more than operators themselves could have?  I think that’s a distinct possibility.

Another question is whether the operators really think they can host all network features in a public cloud.  NFV hosted virtual devices, so it didn’t present network and latency issues greatly different from current networks.  If you start thinking cloud-native, if you start thinking even about IoT and optimum 5G applications, you have to ask whether some local hosting isn’t going to be needed.  We might well end up without “carrier cloud” but with a real “carrier edge” instead, and that could still generate a boatload of new data center opportunities.  We might also see specialized hosting in the form of white-box implementations of network transport features, things that benefit from their own chipsets.

The cloud is a petri dish, in a real sense.  Stuff lands in it and grows.  The goal of vendors, cloud providers, and the operators themselves must be to fill the dish with the right growth medium, the technical architecture (yes, that word again!) that can do what’s needed now and support what blows in from the outside.  I think that natural market forces just might be enough to align everyone with that mission, and so it’s going to be a matter of defining the relationships in that control-plane-cloud.  Who does that?  Probably, who gets there first.

What’s the Real Role of Virtual Network Infrastructure in New Services?

Does a true virtual network infrastructure promote new services?  To make the question easier, can it promote new services better than traditional infrastructure?  You hear that claim all the time, but the frequency with which a given statement is made doesn’t relate to its truth these days.  Rather than try to synthesize the truth by collecting all possible lies, let’s look at some details and draw some (hopefully) technically meaningful conclusions.

The opening piece of the puzzle this question raises is also the most complicated—what exactly is a new service?  Operators tend to answer this by example—it’s things like improved or elastic QoS, wideband voice, and all the other stuff that’s really nothing more than a feature extension of current services.  All this sort of stuff has been tried, and none of it has changed the downward trajectory of profit per bit.

Analysts and writers have tended to answer the question in a totally different way.  “New services” are services that 1) operators don’t currently offer, and 2) that have some credibility in terms of revenue potential.  These would normally be part of that hazy category called “over-the-top” (OTT) services, because they are not extensions to connection services.  This is, in a sense at least, the right answer, but we have to qualify that “rightness”.

We have a growing industry of companies, the “Internet” companies, that have offered OTT services for decades.  Web hosting, email, videoconference and web conference, and even cloud computing, are examples of this traditional level of OTT.  Operators could get into these spaces, but only by competing against entrenched incumbents.  Given the competitive expertise of the average operator, that’s a non-starter.

What remains of the OTT space after we wash out the Internet establishment are services that for some reason haven’t been considered attractive.  In the past, I’ve identified these services as falling into three categories.  The first is personalization for advertising and video-related services, the second is “contextualization” for location- or activity-centric services, and the last is IoT.  All these services have a few attributes that make them unattractive to Internet startups, but perhaps valuable to operators, and it’s these attributes that would have to link somehow with virtual-infrastructure implementations of traditional service features if the virtual network of the future is really a path to new services.

The first of these attributes is that information obtained or obtainable from legacy services forms the basis for the service.  The movement of a person, a group of people, or an anonymous crowd is one example.  Past and current calling/messaging behavior is a second example.  Location, motion, and pattern of places visited is a third.  All of these are things we can know from today’s networks or their connected devices.

The second attribute is that this critical information is subject to significant regulation for security and privacy.  What you’ve searched for or purchased online is, for many, a potential privacy violation waiting to happen.  Imagine extending it to who you’re talking with, where you’ve stopped in your stroll, and where you are at this moment.  This sort of thing would require explicit permission, and most Internet companies do everything short of fraud (well, most of the time, short) to avoid posing the question “Will you accept sharing this?”

The third attribute is that the service is likely useful only to the extent that it’s available pervasively.  A good example is contextual services that rely on location and behavior.  If they work within one block or one town, they don’t provide enough utility to an average user to encourage them to pay.

Which brings about the final attribute: there is credible reason to believe users would pay directly for the service.  Ad sponsorship has one under-appreciated limit, which is that the global ad spend has for years been largely static in the $600 billion range.  Everything can’t be ad-sponsored; there’s not enough new money on the table, so new stuff would cannibalize the advertising revenue of older things.

All this leads us to now address the opening question, and I think many of you can see a lot of the handwriting on the wall.  There are three pathways for virtual network infrastructure to facilitate the development of new services.  First, the new infrastructure could do a better job of obtaining and publishing the information needed for new services.  Second, the new infrastructure could create a better framework for delivering the services, perhaps by tighter coupling with the infrastructure in a cloud-native way.  Finally, the new infrastructure might be built on cloud “web service” features, a kind of function-PaaS, that could also be used in constructing the new services.

As regulated entities, operators understand privacy and compliance.  They actually hold all the information that anyone would need; it’s just a matter of getting it exposed.  Further, if we had strong APIs to provide a linkage between a higher-level service and the RAN, registration, mobility, and transport usage of cells and customers, that data would be useful even without personalization.
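
To make the “strong APIs” notion concrete, here’s a small hypothetical sketch using Flask (my choice of framework, not anything an operator actually exposes) of read-only endpoints that surface cell-level information the network already knows, with no personal identifiers involved:

    from flask import Flask, jsonify

    app = Flask(__name__)

    # In a real system these would be fed by RAN/registration/mobility events;
    # here they're static placeholders so the example runs.
    CELL_LOAD = {"cell-1234": {"attached_devices": 812, "prb_utilization": 0.63}}
    CELL_FLOW = {"cell-1234": {"arrivals_per_min": 95, "departures_per_min": 88}}

    @app.route("/cells/<cell_id>/load")
    def cell_load(cell_id):
        return jsonify(CELL_LOAD.get(cell_id, {}))

    @app.route("/cells/<cell_id>/movement")
    def cell_movement(cell_id):
        return jsonify(CELL_FLOW.get(cell_id, {}))

    if __name__ == "__main__":
        app.run(port=8080)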

Operators also bill for services now, so having to deliver services for bucks would be no paradigm shift for them.  They have to make all manner of regulatory disclosures with respect to information, and they’d have a conduit to the user to obtain permission for data use.  The beauty of having the operators take this data and convert it into something that could then spawn personalization or contextualization is that the raw data wouldn’t have to be available through the operators’ services at all.  Third-party apps couldn’t compromise what they don’t have.

How does virtual network infrastructure contribute to these four points?  “Virtual” network infrastructure means at the minimum that network features are cloud-hosted, and if we want to maximize benefits, it means that the implementation is cloud-native.  As I’ve noted in many blogs, this doesn’t mean that all the elements of a service are running on commercial servers.  I think it’s likely that data-plane features would still be hosted on white boxes that were specialized via silicon to the forwarding mission.  It’s going to come down to the control plane.

Most of what a network “knows” it knows via control-plane exchanges.  It’s possible to access control-plane data even from boxes, via a specialized interface.  In a virtual network implementation, the access would be via API and presumably be supported in the same way that control-plane messaging was supported.  I think most developers would agree that the latter would at least enable a cleaner and more efficient interface, and it would (as I’ve noted before) also enable this control-plane-hosting framework to become a sort of DMZ between the network and what’s over top of, or supplementing in a feature sense, the network.

Let’s look at those four points with this control-plane-unity concept in mind.  First, if the control plane is indeed the place from which most network information would be derived, then certainly having the mechanism to tightly couple to the control plane would maximize information flow.  We can say, I think, that this first point is a vote in favor of virtual-network support for new services.

The second of our four points is the management of critical information.  In our control-plane-unity model, the service provider could introduce microservices that would consolidate and anonymize the information, so that if the information is exposed either to a higher-layer business unit or to another company (wholesale service to an OTT), the information has lost its personalized link, or the degree of personalization can be controlled by a party who is already regulated.  That means our second point is also addressed.
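
A minimal sketch of such a microservice, under my own assumptions about the scheme: subscriber identities are replaced with salted one-way aliases, and per-cell counts below an anonymity threshold are suppressed before anything leaves the operator’s domain.

    import hashlib
    from collections import Counter

    SALT = b"operator-managed-rotating-secret"   # illustrative; held by the regulated party

    def pseudonymize(subscriber_id: str) -> str:
        """One-way alias so downstream consumers never see the real identity."""
        return hashlib.sha256(SALT + subscriber_id.encode()).hexdigest()[:16]

    def consolidate(mobility_events, k_threshold=20):
        """Aggregate per-cell counts and drop any cell too small to stay anonymous."""
        counts = Counter(event["cell_id"] for event in mobility_events)
        return {cell: n for cell, n in counts.items() if n >= k_threshold}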

Point four (I know you think I’ve missed one, but bear with me!) is the question of willingness to pay.  This one is a bit more complicated, because of course users want everything free.  The reason why free-ness is difficult for these new services is that personalization to the extent of the individual is what makes focused advertising valuable.  It is possible to anonymize people in information services, of course, but unless there’s some great global repository of alias-versus-real mappings, every source of information would necessarily pick its own anonymizing strategy, and no broad knowledge of behavior could be provided.  Some work is obviously needed here.

In the meantime, of course, there’s always the chance that people would pay.  We pay for streaming video today (in most cases), so there’s at least credible reason to believe that a service could be offered for a fee if the service’s perceived value was high enough.  Operators couldn’t make this value guarantee unless they either offered the retail service themselves, or at least devised a retail service model that they could build lower-layer elements into.  More work is needed here too.

Point number three is the hardest to address.  It’s difficult to build a service that has a very narrow geographic scope, particularly if that service is aimed in large part at mobile users.  No new network technology is going to get fork-lifted into place, after the old has been fork-lifted into the trash.  A gradual introduction of virtual-network technology defeats virtual-network-specialized service offerings by excess localization.

The best solution here is to focus more on 5G, not only on 5G infrastructure but on the areas of metro (in particular) networking that 5G would likely impact.  If an entire city is virtual-network-ized, then the credibility of new services driven by virtual-network technology is higher, because a large population of users is within the service area for a long time.

The ideal approach is to play on the basic concept of virtualization, which is abstraction.  Some of the control-plane information that could be made available to higher-layer applications/services via APIs could also be extracted from existing networks, either from the devices themselves or from management systems, appliances, or applications (like the IMS/EPC implementations, which could be either software- or device-based).  If an abstraction of service-information APIs can be mapped to both old infrastructure (with likely some loss of precision) and new (with greater likely scope and better ability to expand the scope to new information types), then we could build new services that would work everywhere, but work better where virtualization of infrastructure had proceeded significantly.
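
As a sketch of that mapping (every class name here is invented for illustration), the same service-information abstraction could be backed by a legacy management-system adapter or by a cloud-native control-plane adapter, with new services written only against the abstraction:

    from abc import ABC, abstractmethod

    class ServiceInfoSource(ABC):
        @abstractmethod
        def cell_occupancy(self, cell_id: str) -> int:
            """How many devices are currently attached to a cell."""

    class LegacyManagementAdapter(ServiceInfoSource):
        """Maps the abstraction onto existing EMS/IMS/EPC data; coarser and less timely."""
        def cell_occupancy(self, cell_id):
            ...   # e.g., poll counters the management system already keeps

    class CloudNativeAdapter(ServiceInfoSource):
        """Maps the same abstraction onto control-plane APIs; finer and extensible."""
        def cell_occupancy(self, cell_id):
            ...   # e.g., maintain a live count from registration/mobility events

    def occupancy_report(source: ServiceInfoSource, cells):
        return {c: source.cell_occupancy(c) for c in cells}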

The conclusion to my opening question isn’t as satisfying as I’d like it to be, frankly.  New virtual-network architecture implementations could offer a better platform for new services, but there are barriers to getting those architectures in place and realizing the benefits fully.  The biggest problem, though, may be that operators haven’t been comfortable with the kind of new services we’re talking about.  Thus, the irony is that the biggest question we might be facing is whether, without a strong new-services commitment by operators, we can hope to ever fully realize virtual-network infrastructure.

Why I’m Obsessed with Architectures

If you ever wondered why there were so many songs about rainbows, you might be among those who wonder why Tom Nolle talks so much about architectures.  Somebody actually raised that point for me in a joking way, but it occurred to me that not everyone shares my own background and biases, the factors that make me think so much in architecture terms.  What I propose to do is explain why I believe that architectures are so critical in the evolution of networking and computing, and why we don’t have them when we need them.

In the early days of software development, applications were both monolithic and small, constrained by the fact that the biggest computers that a company would have in the early 1960s had less memory than a smartwatch has today.  You could afford to think in terms of baby-steps of processing, of chains of applications linked by databases.  As we moved into the late ‘60s and had real-time or “interactive” systems and computers with more capacity (remember, the first mainframes came along in 1965), we had to think about multi-component applications, and that was arguably the dawn of modern “architecture” thinking.

An architecture defines three sets of relationships.  The first is the way that application functionality is divided, which sets the “style” of the application.  The second is how the application’s components utilize both compute and operating-system-and-middleware resources, and the third is how the resources relate to each other and are coordinated and managed.  These three are obviously interdependent, and a good architect will know where and how to start, but will likely do some iterating as they explore each of the dimensions.

The role of architecture came along very early in the evolution of software practices, back in the days when much of application development was done in assembler language, so the programming languages themselves didn’t impose any structure (functions, procedures, “blocks” of code).  Everyone quickly realized that writing an application as a big chunk of code that wandered here and there based on testing variables and making decisions (derisively called “spaghetti code” in those days) created something almost unmaintainable.  Early programming classes often taught methods of structuring things for efficient development and maintainability.

Another reason why we started having architectures and architects was that it makes sense to build big, complex, systems using a team approach.  The systems are essentially the integration of related elements that serve a common goal—what I tend to call an “ecosystem”.  The challenge, of course, is to get each independent element to play its role, and that starts by assigning roles and ensuring each element conforms.  That’s what an architecture does.

The 3GPP specifications start as architectures.  They take a functional requirement set, like device connectivity and mobility, and divide it into pieces—registration of devices, mobility management, and packet steering to moving elements.  They define how the pieces relate to each other—the interfaces.  In the 3GPP case, they largely ignore the platforms because they assume the mobile ecosystem is made up of autonomous boxes whose interfaces define both their relationships and their functionality.  It doesn’t really matter how they’re implemented.

Applications also start as architectures these days, but an application architecture has to start with a processing model.  Some applications are “batch”, meaning they process data that’s stored in a repository.  Others are “transactional”, meaning that they process things that follow an input-process-update-result flow.  Still others are “event-driven” meaning that they process individual signals of real-world conditions.  Because applications are software and utilize APIs, and because the hosting, operating system, and middleware choices are best standardized for efficiency, the resource-relationship role of an architect is critical for applications—and for anything that’s software-centric.

Suppose we were to give four development teams one step in our input-process-update-result flow and let them do their thing optimally based on its individual requirements.  We might have a super-great GUI that couldn’t pass data or receive it.  That’s why architectures are essential; they create a functional collective from a bunch of individual things, and they do it by creating a model into which the individual things must fit, thereby ensuring they know about how they’re supposed to relate to each other.

You can see, from this, two challenges in architecture that have contaminated our network transformation goals.  The first is that network people are, by their history, box people.  They think in terms of functional distribution done by boxing and connecting.  When you apply that kind of thinking to software-and-cloud network infrastructure, you create “soft-box networking”, which doesn’t optimize the software transformation because it’s constrained.  The second is that if the ecosystem you’re trying to create is really large, and if it’s divided up into autonomous projects, there’s usually no overarching architecture picture at all.

NFV suffered from both these problems.  The NFV “end-to-end architecture” was a box architecture applied to a software mission.  The architecture guides the implementation, and in the case of NFV what some said was supposed to be only a “functional diagram” became an application blueprint.  Then the NFV ISG declared the surrounding pieces of telco networking, like operations, to be out of scope.  That meant that the new implementation was encouraged to simply connect to the rest of the ecosystem in the same way as earlier networks did, which meant the new stuff had to look and behave like the old—no good path to transformation comes from that approach.

Anyone who follows NFV knows about two problems now being cited—onboarding and resource requirements for function hosting.  The NFV community is trying to make it easier to convert “physical network functions” or appliance code into “virtual network functions”, but the reason it’s hard is that the NFV specs didn’t define an architecture whose goals included making it easy.  The community is also struggling with the different resource requirements of VNFs because there was never an architecture that defined a hardware-abstraction layer for VNF hosting.

Even open-source projects in middleware and cloud computing can suffer from these problems.  Vendors like Red Hat struggle to create stable platform software for user deployment because some middleware tools require certain versions of other tools, and often there’s no common ground easily achieved when there’s a lot of tools to integrate.  We also normally have multiple implementations of the same feature set, like service mesh, that are largely or totally incompatible because there’s no higher-level architecture to define integration details.

What happens often in open-source to minimize this problem is that an architecture-by-consensus emerges.  Linux, containers, Kubernetes, and serverless evolved to be symbiotic, and future developments are only going to expand and enhance the model.  This takes time, though, and for the network transformation mission we always talk about, time has largely run out.  We have to do something to get things moving, and ensure they don’t run off in a hundred directions.

Networks are large, cooperative, systems, and because of that they need an architecture to define component roles and relationships.  Networks based on software elements also need the component-to-resource and resource-to-resource relationships.  One solid and evolving way of reducing the issues in the latter two areas is the notion of an “abstraction layer”, a definition of an abstract resource everything consumes, and that is then mapped to real resources in real infrastructure.  We should demand that every implementation of a modern software-based network contain this (and we don’t).
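
To picture that abstraction layer (again, every name here is hypothetical), network functions would consume an abstract “deploy” request, and per-infrastructure plugins would map it to whatever is really underneath:

    # Map an abstract deployment request onto whatever infrastructure is really there.
    def deploy_on_container_pool(image, cpu, memory_mb):
        ...   # translate into a container deployment with resource requests

    def deploy_on_white_box(image, cpu, memory_mb):
        ...   # translate into the box's own loader or agent protocol

    INFRASTRUCTURE_PLUGINS = {
        "container-pool": deploy_on_container_pool,
        "white-box": deploy_on_white_box,
    }

    def host_function(target_kind, image, cpu=2.0, memory_mb=4096):
        """Network functions call this abstraction; they never see what's underneath."""
        return INFRASTRUCTURE_PLUGINS[target_kind](image, cpu, memory_mb)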

But who produces the architecture?  That’s the challenge we’ve had demonstrated in almost every networking project involving service provider infrastructure for as long as I’ve been involved in the space (which goes back to international packet networks in the ‘70s).  Network people do boxes and interfaces, and since it’s network people who define the functional goals of a network project, their bias then contaminates the software design.  Open-source is great to fill in the architecture, but not so great at defining it, since there are so many projects contending for the mission.

This is why we need to make some assumptions if we’re ever to get our transforming architecture right.  The assumptions should be that all network functions are hosted in containers, that cloud software techniques will be adopted, at least at the architecture level, and that the architecture of the cloud be considered the baseline for the architecture of the network, not the previous architecture of the network.  Running in place is no longer an option, if we want the network of the future to actually network us in the future.

There are a lot of songs about rainbows because they’re viscerally inspiring.  I’m always singing about architectures because, in a software-driven age, we literally cannot move forward without them.  Large, complex, systems can never accidentally converge on compatible thinking and implementation.  That’s what architectures do, and why they’re important—even in networks.