Context and “Information Fields”

This is the second piece of my series on contextualization, and it focuses on the implementation of the “information fields” concept that’s one of two key elements in my contextualization model.  If you missed the first, it’s available HERE.  The third blog in the series will cover the other key element, the trusted agent, and the final blog will cover the application of AI to this model.

“Information fields” is the term I’ve been using to describe contextual inputs available to services, applications, and users to add relevance to the interaction between services and users.  The goal is to make a service or application look as much like a virtual “ghost” companion as possible.  In order for a personal agent to be truly personal and truly capable of acting as an agent, it must share context, and information fields are my basis for obtaining that context.

Why “information fields?”  The answer is the First Law of Contextualization: protect personal privacy.  The greatest risk in contextualization is having it reverse-engineered to lead back to the requestor.  Contextual information has to be kept separate both from the actual contextualizing agent and from other contextual sources, or this First Law is at risk.  If we visualize the personal agent process as a ghost companion moving through life alongside us, we can visualize information fields as something our ghost can see in the netherworld that we cannot.

A “contextual input” is something that contributes to contextualization.  Some of that is directly related to the user and the user’s mobile device, such as the “movement vector” (location, direction, speed) or the mode of transportation.  Other contextual input relates to the user’s surroundings, interactions, and even historical behavior.  Geography is a contextual input, including the identification of structures, points of interest, and so forth.  So are the weather, the crowd/traffic levels, time and date, and even retail or service offerings being presented.  Things that we could see or sense should, in a perfect contextualizing world, generate information fields so our ghost companion can also “see” them.

There are two obvious questions about information fields.  The first is who provides them, and the second, how they’re accessed and used.  For these questions, we’ll have to address all three of our Laws of Contextualization.

Always look to the past when trying to plan for the future.  We are actually already “contextualizing” when we serve ads based on cookies that provide information on things like past web searches.  What happens in these cases is that a web server and web page will obtain the cookies and from them make a decision on things like what ads to serve.  Obviously something like this could be made to work in broader contextualization applications, but it violates some or even all of our Three Laws.

The big problem with the current system is its violation of privacy, our First Law.  The client system becomes a repository for contextual information, and applications/services obtain that information from it with relatively little control.  We have to assume that truly advanced contextual services would present too much risk if this sort of implementation were adopted, so we need to look at other options.

There’s also a value-chain problem here, with both our Second and Third Laws.  Our Second Law was that the value to the consumer of contextualized services has to cover the cost.  The consumer in the current model doesn’t see any clear cost/benefit relationship, and in fact in nearly all cases doesn’t know explicitly what the benefit or cost is.  We buy into an “Internet ad” ecosystem, which clearly benefits some providers of elements of that ecosystem, but which makes cost and benefit to us invisible.  Yes, it may be there, but is the trade-off fair?

The Third Law may be the biggest near-term problem.  For information fields to work, there has to be a benefit to publishing them.  Every stakeholder in contextualization needs to have a motivation to play, and in the current system the information, once stored in the client system, is available for exploitation.  There are probably contextual resources that could be made available in this form (retail offers come to mind), but a lot of the “background” context needs some commercial stimulus to boost availability.  Even the retail offer data and similar contextual information could be problematic if not properly managed, because retailers might use competitor information to diddle pricing.

There seem to be three ingredients associated with information field success.  The first is a trusted information field catalog, the second explicit information field coding, and the third a mechanism to compensate information field providers for their information. 

A trusted information field catalog lets contextual processes find what they’re looking for by going to a single point.  Think of it as a kind of logical DNS, and it could in fact likely be implemented much as DNS is today.  However, the catalog has to be strictly controlled to ensure that accessing an information field doesn’t leave a trail back to the user that pirates can follow, and that the information resources in the catalog are from trusted sources.  Transactions with the catalog then have to be secured, with HTTPS at the minimum.

The explicit coding of information fields mandates a taxonomy of information that lets a contextual process ask for what’s available based on specific characteristics.  This could be integrated with the catalog, or maintained as a separate database (an RDBMS, for example).  The goal is to allow a contextual process to ask for retail offerings in electronics in the 1500 block of Z Street, for example, and to receive a list of one or more information fields that can provide the information.
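To make the catalog-plus-taxonomy idea a bit more concrete, here’s a minimal sketch (in Python) of what a catalog lookup might look like.  Everything here, from the entry structure to the taxonomy strings and the endpoint URLs, is my own hypothetical illustration, not a proposed standard; a real catalog would sit behind a secured (HTTPS at minimum) interface and would never record who asked.

```python
# A minimal sketch of a trusted information-field catalog lookup.
# The names, taxonomy terms, and endpoints are hypothetical illustrations,
# not a proposed standard.
from dataclasses import dataclass

@dataclass
class FieldEntry:
    field_id: str        # opaque handle, not traceable to a provider's customers
    category: str        # taxonomy term, e.g. "retail.offers.electronics"
    area: str            # coarse geographic key, e.g. "z-street-1500-block"
    endpoint: str        # where the field can be queried (HTTPS at minimum)

CATALOG = [
    FieldEntry("f-001", "retail.offers.electronics", "z-street-1500-block",
               "https://fields.example/f-001"),
    FieldEntry("f-002", "weather.current", "z-street-1500-block",
               "https://fields.example/f-002"),
]

def find_fields(category: str, area: str) -> list[FieldEntry]:
    """Return every cataloged information field matching a taxonomy category
    and a coarse area; the caller's identity is never part of the query,
    which is the point of the First Law."""
    return [f for f in CATALOG if f.category == category and f.area == area]

if __name__ == "__main__":
    for field in find_fields("retail.offers.electronics", "z-street-1500-block"):
        print(field.field_id, field.endpoint)
```

The point of the sketch is the shape of the query: a taxonomy term plus a coarse area, with nothing in it that ties back to the requestor.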

Compensation for stakeholders is essential or there won’t be enough of them.  My estimate is that there are already over a billion information sources that could be tapped for information field deployment, without any new sensor deployments required.  Some of this data is in municipal hands but most lies in corporate databases.  For some (retailers, for example) the benefit of providing the information as an information field lies in improved sales, and so no explicit compensation is needed.  For others, a small but credible revenue stream would be enough to tip the scales, and for some a clear revenue source would be needed to justify the investment needed to frame new information services.  The key is to be able to settle fees without having a chain of payments that ends up identifying our end customer, and thus violating our First Law.

This is probably the most difficult of our information-field challenges.  Traditional ad sponsorship, which people like because it doesn’t require that they pay anything, has already surrendered enough privacy to prompt regulatory inquiries and action in both the EU and the US.  True, thorough contextualization will raise the risks significantly, which to me means that another avenue to “settlement” among the parties will be required.  That, of course, will mean that something has to do the settling, and that something has to be “trusted” in a regulatory sense.  The trusted agent is thus our next topic.

An Introduction to Contextualization

It is my view that contextualization is the most important issue in networking.  If the history of advances in IT can be linked, as I believe, to bringing technology closer to consumers/workers, then the ultimate step is to make tech a participant in our lives.  We’re seeing early initiatives in this direction with things like personal assistants, but from the very first “assistance” has tended to run up on the reef of context.

The most popular early question asked of these assistants was “What’s that?”, which reflects a very basic truth; an effective assistant is an almost-human but invisible companion.  It sees what we see, knows what we know, which means it shares our context.  It’s that context-sharing, in fact, that makes it valuable.  However, it’s a lot easier to say we’re going to create a ghost-like companion than it is to make that companion real.

Contextualization to create ghost companions who are literally partners in our lives presents some major technical and social challenges.  The most obvious, and usually most problematic, is the problem of protecting privacy.  Something that knows everything we know is both a companion and a potential thief and predator.  If “the network”, meaning a set of network-connected applications and database resources, knows everything about us, we can bet that most of that everything will end up being hacked or revealed through the fine print of user terms of service.  A major breach of confidentiality would have such dire legal consequences that the risk alone would deter most companies from exploiting contextualization fully.

Context is important for our ghost companion because to be useful rather than distracting, it has to mimic our behavior.  Anyone who’s walked around town with a young child while trying to complete an adult mission knows the challenges that a behavioral disconnect can create.  To make an application or service mimic behavior means it has to understand what drives it, and that’s what I mean by context.

We have context today, but in nearly all cases we obtain it by having it communicated to our “ghost companion” directly.  That’s bad because a companion should share our context, not be informed of it.  Anything less than sharing means that our ghost is too ephemeral to be truly useful to us.  It needs a foot in both worlds, so to speak.

What are the technical ingredients of contextualization?  The best answer lies with our own senses.  Our visual sense is by far the richest of all our senses, meaning that it conveys the most information in the least time.  We must start any formula for context with the notion that our ghost companion must see what we see.  That doesn’t necessarily mean that our ghost has to have virtual eyes and full visual recognition intelligence, but it does mean that it has to be able to infer or construct at least a loose picture of what’s around us, what we would see if we scanned our surroundings.

The second key contextual ingredient is mission.  We behave differently when walking down a street to the office, versus walking down the same street to go to lunch, versus walking down the street while shopping.  Even something like shopping creates different missions; we might be looking for “a gift” or we might be looking for a specific pair of running shoes or a computer.

The third contextual ingredient is recent stimulus.  A text, email, phone call, or other interaction with the outside is often a behavioral trigger.  It can, for example, create a change in mission.  These stimuli are appeals to our senses, intrusions from the outside that could change what we do, drive what we try to do.

The final ingredient in context is history.  We can understand the future only through the analytic filter of the past, because people are individuals with their own way of looking at things and their own balance of logical versus emotional responses.  We may, in the future, face the spooky possibility that a service could read our thoughts, but until we cross that bridge the best way to interpret the other contextual elements is by reviewing how we reacted to them before.  Do I jump puddles or walk around them?
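To pull these four ingredients together, here’s a minimal sketch (in Python) of what the context our ghost companion works from might look like as a data structure.  The field names and groupings are purely my own illustration of the concept, not a specification.

```python
# A minimal sketch of the four contextual ingredients as a data structure.
# Field names and groupings are illustrative only; nothing here is a spec.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Surroundings:          # a loose picture of what the user would see
    location: tuple          # (latitude, longitude)
    heading_deg: float       # direction of travel
    speed_mps: float         # speed, which also hints at mode of travel
    nearby: list = field(default_factory=list)   # points of interest, structures

@dataclass
class Stimulus:              # a recent text, call, email, or similar trigger
    kind: str
    received: datetime
    summary: str

@dataclass
class Context:
    surroundings: Surroundings
    mission: str             # e.g. "commute", "lunch", "shopping:running-shoes"
    recent_stimuli: list = field(default_factory=list)   # Stimulus items
    history: list = field(default_factory=list)          # prior (context, action) pairs

if __name__ == "__main__":
    ctx = Context(
        surroundings=Surroundings(location=(40.7577, -73.9857),
                                  heading_deg=90.0, speed_mps=1.4,
                                  nearby=["coffee shop", "electronics store"]),
        mission="lunch",
        recent_stimuli=[Stimulus("text", datetime.now(), "meet at noon?")],
    )
    print(ctx.mission, len(ctx.surroundings.nearby))
```

What the layout shows is how little of the structure comes from the user explicitly telling us anything; most of it is sensed, inferred, or remembered.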

We already have the means of acquiring all of these contextual elements, at least in the sense that we have the information available.  The real work in contextualization is to do the acquiring and exploitation of contextualizing information within three constraints, which I’ll call the “Three Laws of Contextualization”.

The First Law is that the consumer’s privacy must be protected, at least to the point where contextualization doesn’t add to risks to privacy already routinely faced.  In my own view, it should be possible to create a framework for contextualization that could, down the line, absorb some of today’s ad placement functions, as a way of responding to likely regulatory intervention.

The Second Law is that contextualization must create enough value to the end consumer to cover any costs they’re expected to bear.  Those costs could be direct fees or indirect costs associated with the surrender of personal information.  If the latter is the case, the trade-off of cost/benefit must be clear.

The Third Law is that every stakeholder in the contextualization process must have benefits that cover their ROI expectation when compared to costs/risks.  We can’t assign somebody an unprofitable role in contextualization just because we don’t know how to make the role profitable.  A way must be found.

If protection of user privacy is indeed the prime issue (which it clearly is), then contextualization has to frame contextual data somewhere under explicit user control.  We cannot allow services and applications to collect and correlate context on their own, because that would necessarily give the owner/provider of that process access to user information, which would blatantly violate privacy concerns.  That leaves only two possible solutions: have the user’s own device do the contextualizing by absorbing outside information on demand, or have a “trusted agent process” hosted somewhere perform contextualization on behalf of the user.

Either of these two mechanisms would, in my view, depend on the notion that contextual information was present as what I’ve previously called an “information field”.  An information field is published knowledge about something specific, and it’s emitted by everything that we would consider to be a part of our environment.  Information fields live in our ghost’s netherworld, and by sensing those fields our ghost companion can create context without having us force-feed it.  None of the field providers would see any of the other requests for information or their results.  This would limit the ability of an information provider to use contextual information in such a way as to identify the requestor, or at least identify them enough to pose a risk.
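As a sketch of that isolation property, assuming a hypothetical trusted-agent process and hypothetical field endpoints: the agent (or the user’s own device) queries each information field separately, and only the agent ever sees the assembled picture.

```python
# A minimal sketch of the isolation property: requests fan out to information
# fields one at a time, so no provider ever sees another provider's request
# or result, and correlation happens only inside the agent.  The field
# objects and their query() method are hypothetical stand-ins.

class InformationField:
    def __init__(self, name, answer_fn):
        self.name = name
        self._answer_fn = answer_fn

    def query(self, request):
        # A real field would receive only an anonymized, single-purpose
        # request over a secure channel, never the user's identity.
        return self._answer_fn(request)

def contextualize(fields, request):
    """Collect answers field by field; only the agent sees the whole picture."""
    picture = {}
    for f in fields:
        picture[f.name] = f.query(request)
    return picture

if __name__ == "__main__":
    fields = [
        InformationField("weather", lambda req: "light rain"),
        InformationField("traffic", lambda req: "heavy"),
    ]
    print(contextualize(fields, {"area": "z-street-1500-block"}))
```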

An example of this relates to location.  Misuse of information to facilitate stalking is one of the major public policy concerns regarding “IoT” or related services.  If a given user, while on a given street, were to ask for nearby retail locations providing a given product, and if that user then looked at websites, and if that user had a cookie identifying themselves as perhaps a young female, having all that information would be enough to make it possible to intercept the consumer.

One point this should make clear is that the information used to “contextualize” advertising could not be collected in such a way as to personally identify the consumer.  A contextual analysis of a user’s behavior might suggest ads, but the suggestion of what ads to show should be made by the user’s device or trusted agent, not by the ad source.  This could have significant implications for contextual services offered by websites that inherently know the user’s identity, such as social-media sites.

There are clearly a lot of dimensions to contextualization, more than I’m going to attempt to cover in a single piece.  What I plan instead is to do three more, one on the information fields and their implementation, one on the trusted agent, and a final piece on how AI could fit into things.  I can’t predict the exact timing of these pieces because I’ll treat other key industry news as it comes (as usual), but I’m hoping that over the next couple weeks we can complete them all.  As always, I’ll be posting the availability of each piece on LinkedIn, and that’s the forum I’d encourage anyone to use in commenting or making suggestions.

Inside the Most Critical Insight of Service Lifecycle Automation

I mentioned John Reilly, a TMF guru, in a prior blog, with a sad note at his passing.  And speaking of notes, I took the time over the weekend to read my notes on our conversations.  I was struck again by John’s insights, particularly when a lot of his points related to issues not yet raised and technologies not yet popular.  I’m writing this blog both as a tribute to John’s insights and in the hope that I can make the industry understand just how impressive…and important…those insights were.

It’s been my view from the first that John’s greatest insight came with his proposal known as “NGOSS Contract”.  NGOSS is “Next-Generation OSS”, of course, and John’s work was an attempt to create a model for the OSS of the future that was, as we’d say, “event-driven”.  Event-driven software responds in real time to events, which are signals of condition changes.  In the early part of this century, the OSS/BSS industry was viewing events as little more than transactions.  One came along, you queued it, and when you popped it off the queue in your own good time, you processed the event.  When you were finished, you popped the queue again.

John’s problem with this approach was context.  Events are real-time signals, implying real-time responses.  Yes, queuing things up in the traditional way compromises the timing of real-time, but the big problem John saw was that a modern service was made up of a bunch of loosely coupled functional systems—access, metro, core, vendor domains, device differences, subnet models, and so forth.  All these systems were doing their cooperative thing, but doing it in an asynchronous way.  Each functional system had to be considered independent at one level.  If something seen in one of the functional systems required attention, it was logical to assume an event would be generated.  What this meant was that there might be two or three or more independent sets of changes happening in a network, each of which would impact multiple services and would require handling.  How would these asynchronous needs be addressed by conventional software that simply queued events and processed them in order of arrival?

If asynchronicity and independence were one issue, the opposite condition was another.  A service made up of a series of functional systems has to coordinate the way those systems are deployed and managed, which means that we have to be able to synchronize functional system conditions at the appropriate times and places to ensure the service works overall.

There was also a challenge, or opportunity, relating to concurrency.  If functional systems were independent but coordinated pieces of a service, it made sense to process the lifecycle events for functional systems independently and concurrently.  Why feed them all into the queue for a big monolithic process?  What was needed was a mechanism to allow functional systems to be handled independently and then to provide a means of coordinating those times/places where they came together.  That would make the implementation much more scalable.

The answer, to John, started with the SID, the TMF service data model.  A service was represented as a contract, and the contract defined the service and a set of functional sub-services with a to-the-customer (“Customer-Facing”) and to-the-resource (“Resource-Facing”) specification.  Since there was a definition for each of the sub-services that made up a service, the SID could store the context of each and could also store the way the sub-service contexts related to the service context.

The applicable concept was a familiar one in protocol design—the state/event table.  A communications protocol has a number of specific states, like “Initializing”, “Communicating”, “Recovering” and so forth.  It also has a number of specifically recognized events, so it would be possible to build a kind of spreadsheet with the states being the rows and the events the columns.  Within each state/event intersect “cell”, we could define the way we expected that event to be handled in that state.  That’s the way context can be maintained, because as long as there is a single source of truth about the service—the SID’s Contract—then all the asynchronous behaviors of our functional sub-services can be coordinated through their state/event tables.

Of course, writing a spreadsheet won’t do much for event processing.  John’s solution was to say that the state/event cells would contain the process link to handle the combination.  This was an evolution to the SID, a shift from a passive role as “data” to an active role in event-to-process steering.  It was essential if OSS/BSS were to transition to a next-generation event-driven model, the “NGOSS” or “Next-Generation OSS”.  This created a new contract model, the “NGOSS Contract.”
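Here’s a minimal sketch (in Python) of the state/event-to-process steering idea.  The states, events, and handler functions are illustrative stand-ins of my own, not TMF SID definitions; the point is only that the “cell” in the table names the process that handles the combination.

```python
# A minimal sketch of state/event-to-process steering in the spirit of
# NGOSS Contract.  States, events, and handlers are illustrative stand-ins.

def start_deployment(subservice, event):
    print(f"{subservice['name']}: deploying in response to {event}")
    subservice["state"] = "Activating"

def handle_fault(subservice, event):
    print(f"{subservice['name']}: recovering from {event}")
    subservice["state"] = "Recovering"

def ignore(subservice, event):
    pass  # event not meaningful in this state

# The "spreadsheet": rows are states, columns are events, cells are processes.
STATE_EVENT_TABLE = {
    ("Ordered",    "Activate"): start_deployment,
    ("Activating", "Fault"):    handle_fault,
    ("Active",     "Fault"):    handle_fault,
    ("Active",     "Activate"): ignore,
}

def dispatch(subservice, event):
    """Steer an event to the process named in the contract's state/event cell."""
    handler = STATE_EVENT_TABLE.get((subservice["state"], event), ignore)
    handler(subservice, event)

if __name__ == "__main__":
    access = {"name": "access-subservice", "state": "Ordered"}
    dispatch(access, "Activate")   # Ordered + Activate -> start_deployment
    dispatch(access, "Fault")      # Activating + Fault -> handle_fault
```

In a real implementation the table would live in the contract itself, so any process instance that could read the contract could do the steering.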

Rather than try to frame this in a new software development model (for which there would be no tools or support in place), John wanted to use the software development model emerging at the time, which was Service Oriented Architecture or SOA.  An event, then, would be steered through the state/event table to the SOA service process that handled it.  That process would have the NGOSS Contract available, and from that all the information needed to correctly handle the event.

Suppose you got two events at the same time?  If they were associated with the same functional sub-service, they’d have to be handled sequentially, but if the state of that sub-service was changed by the first event (which it likely would be) then the second event would be treated according to the new state.  If the second event was for a different functional sub-service, its handling could be done in parallel with that of the first event.  Just as with microservices today, SOA services were presumed to be scalable, so you could spin up a new process to handle an event and let it die off (or keep it around for later) when the processing of the event was completed.
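And here’s a small sketch of the concurrency rule implied above: events for the same functional sub-service are serialized, while events for different sub-services run in parallel.  The one-worker-executor-per-sub-service scheme is simply my own way of illustrating the idea, not anything defined by the TMF.

```python
# A minimal sketch of the concurrency rule: serialize events within a
# functional sub-service, run different sub-services in parallel.
from concurrent.futures import ThreadPoolExecutor
from collections import defaultdict

# One single-threaded executor per sub-service keeps its events in order;
# different sub-services get different executors and so run concurrently.
_executors = defaultdict(lambda: ThreadPoolExecutor(max_workers=1))

def submit_event(subservice_name, event, handler):
    """Queue an event behind any earlier events for the same sub-service."""
    return _executors[subservice_name].submit(handler, subservice_name, event)

if __name__ == "__main__":
    def handle(name, event):
        print(f"{name}: handling {event}")

    submit_event("access", "Fault", handle)   # serialized with other "access" events
    submit_event("core", "Activate", handle)  # runs in parallel with "access"
    for ex in _executors.values():
        ex.shutdown(wait=True)
```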

I built on John’s thinking in my work on the TMF’s Service Delivery Framework (SDF) project.  At the request of five operators in the group, I did a quick Java project (the first ExperiaSphere project) to demonstrate that this approach could be used in SDF.  I called the software framework that processed an NGOSS Contract a “Service Factory”.  In my implementation, the Contract had all the information needed for event processing, so you could spin up a copy of a suitable factory (one that could “build” your contract) anywhere and let it process the event as it came.  Only the contract itself was singular; everything else was “serverless” and scalable in today’s terms.  This wasn’t explicit in John’s work or in my discussions with him, but it was implied.

What John had come up with was the first (as far as I know) model for distributed orchestration of scalable process components.  Think of it as the prototype for things like Amazon’s Step Functions, still a decade in the future when John did his work.  Yes, a service lifecycle automation process was what John was thinking about, but the principle could be applied to any system that consists of autonomous functional elements that have their own independent state and also obey one or more layers of collective state.  A series of functional subsystems could be assembled into a higher-layer subsystem, represented with its own model and collectivizing the state of what’s below.

You can manage any application like this.  You can represent any stateful process, including things like distributed application testing, as such a model and such a series of coupled processes.  It was a dazzling insight, one that, had the TMF and the industry truly caught on, could have changed the course of virtualization, NFV, OSS/BSS, and a bunch of other things.

And it still could.  We’ve wasted an appalling amount of time and expended all too many dollars, but the market still needs the concept.  Distributed state as the sum of arbitrary collections of sub-states is still the hardest thing the cloud demands to get right, and the most critical piece in service lifecycle automation.  If we’re ever going to use software, even AI, to manage services and reduce costs, John’s insights will be critical.

Is the CNTT Backing Into a Useful Vision of Telecom Virtualization?

I did not attend the meeting of the Common NFVI Telco Task Force (CNTT), but I spent a good bit of yesterday sorting through emails from those who did attend.  Without revealing any confidences, it’s pretty clear to me that, like many telecom virtualization debates, the CNTT stuff is demonstrating that we are trying to impose a low-level approach on something when we haven’t agreed on the high-level approach, or perhaps even on the thing we’re approaching.

To start with, most of you will recognize that I’ve been blogging about both the CNTT and the issues that are being raised by the first meeting.  I didn’t talk specifically about the CNTT model because it wasn’t public until this meeting, and I don’t use private information as the basis for blogs.  However, my own views, as contributed to clients and others, are mine to share and I’ve been doing that.  Some of what’s covered here reprises those views, adding in the specific CNTT context now publicly available.

So let’s get going.  One operator contact of mine described the discussion as the “battle of the box natives versus the cloud natives”.  NFV, according to both this operator and my own experience in the ISG, focused itself from the first on creating virtual analogs for physical devices.  That approach was perhaps reasonable given the fact that the original “Call for Action” industry paper was focused on replacing appliances with hosted virtual functions.  The discussion of a common NFVI is hardly the place to debate whether this was a smart approach, but because it’s the thing we’re debating now, it’s the place where all discussions on how NFV is going necessarily have to find a home.

The “cloud natives” part of this reflects the truth that a small but growing number of operators think that the original approach of NFV was wrong, and that virtual functions should really be something more “functional” than “device” focused.  This would take a cloud-native approach to implementing the functions, and that in turn would strongly suggest (if not mandate) a cloud-centric model for hosting and deployment.

According to my operator friends, this device-native-versus-cloud-native stuff is embodied in a basic point, which is the contrast between “virtual network functions” (VNFs) and “cloud native functions” (CNFs).  VNFs are what NFV is explicitly about, and thus what we’d expect NFVI to be hosting.  A CNF, according to a rather large and actually nice presentation of the point by Cisco HERE, is actually a cloud component that happens to be pressed into a networking mission.

The CNTT meeting wasn’t supposed to be addressing this issue set at all; it was aimed at creating a specific set of NFVI missions, to which specific classes of hosting resources could be assigned.  However, one step in doing that was to define what VNF missions were out there, and another to use those missions and their hosting requirements to size the resource classes in NFVI’s pool.  It was this task, according to many operators, that exposed some interesting information.

If you look at the CNTT paper, you find a description of VNF Requirements and Analysis where it says that the list of “network functions” it provides covers almost 95% of the telco workload.  What’s interesting about the list is that almost everything on it has nothing to do with virtual CPE or service chaining, and in fact is not a per-subscriber deployment but rather a deployment of a functional element shared at the service level across service customers.

It’s pretty obvious that we’re seeing a kind of emergence here, a vision (even if it’s only an implicit vision) of the CNF (which some operators say stands for “Cloud-Native Function” and some say for “Cloud-Native Network Function”) as a specific service element.  The emergence is driven by the obvious fact that most of the VNF missions don’t relate to most of the VNF work that’s been done.  Instead, they relate better to the CNF view.  The question that my operator contacts say is being raised in the CNTT is whether CNFs are a subset of, a supplement to, or a replacement for the VNF.  My view is that no matter which of these three is intended, the fact is that CNFs will replace VNFs.

The great majority of the listed features fall into the “control plane” or “management plane” categories, as I’ve discussed them in the past.  CNTT uses this same separation, and it’s pretty clear that functions of this type that are expected to be persistent and support multiple users (callers, for example) in parallel are actually cloud applications and not components of user-specific services.  What CNTT’s categorization does is make it clear that the most useful “VNFs” are actually “CNFs”.

The bad news is that the CNTT doesn’t actually talk about CNFs, and this is where the operators I’ve talked with break from the sense of the meeting.  These operators believe that what the CNTT has done is create two distinct classes of virtual functions.  The VNF class of functions are per-user in nature, focus on replacement of a physical device with a virtual instance, and are primarily designed to be part of a chain of features in the data plane.  The CNF class of functions are multi-user, cloud-scalable, and service-feature oriented.

This simpler classification is more useful than the complex tiers the CNTT came up with.  It reflects reality, which is a good starting point for any classification system.  It also admits to the role of “NFV” and VNFs while at the same time making it clear that most of the stuff operators actually intend to do with hosted functions isn’t NFV at all, and wouldn’t be served optimally by the application of NFV orchestration and management principles.

This classification would be helpful in specifying what you really need in NFVI too, whether it’s hosting VNFs or CNFs.  VNFs require data-plane optimization, and so would likely be implemented on virtual machines running on servers with optimized hardware for network throughput, hardware like custom NICs.  CNFs are much more like generalized application components, designed as microservices and expected to scale dynamically under load to accommodate more (or fewer) users.  You need some degree of VNF-to-NFVI specialization, but I don’t think that most of the NFVI categories the CNTT has proposed really make sense for containerized microservices, meaning CNFs.

CNFs have become the conceptual framework for virtual functions built on cloud principles rather than on NFV principles.  They run in containers, they’re orchestrated by Kubernetes, and they’re designed to be fully scalable.  When the literati among the telecom industry’s cloud-native advocates use the “cloud-native” term, it’s CNFs they’re really describing.

Which is why CNFs will ultimately eat everything else, meaning NFV and VNFs as formally described.  VNFs and NFV’s software framework are the wrong way to do virtual functions overall, and at best only an OK way to do the limited data-plane-centric subset of virtual functions the ISG focused on.  However, you can do almost every VNF mission using CNFs, and it’s clear from the CNTT’s own VNF classification that most VNF missions aren’t even in the VNF wheelhouse to start with.

One could use the CNTT document as the basis for advocating an effort to converge the VNF concept with the CNF concept, but I’m not going to propose that for the simple reason that it would likely create a couple years of infighting.  The better path would be to let nature take its course; the strategy that can be implemented best and fastest will win, and CNFs are surely that.  That operators raised important questions in a comparatively unimportant forum shows that at least some are on the right track.

The CNTT is lost between two worlds, as my operator friend suggested.  It has one foot in the box and the other in the cloud, and nobody has legs long enough to span that distance without losing their balance or splitting themselves down the middle.  I think the industry should be pleased that something useful has been raised in the CNTT debates, even if the thing raised was inappropriate to the group, and even if it’s debatable whether the group was a useful place to raise any high-level issues at all.  Count your blessings, NFV; there are precious few to enumerate.

It’s hard for an industry to toss five years of work, and five years of not-insignificant spending, and admit they did the wrong thing.  It’s much easier to simply do the right thing under a different name, and perhaps eventually merge the two terms to cover the retreat from past mistakes.  That’s what I believe is happening, starting now, starting with the CNTT efforts.

What’s a good place to start?  There are three critical steps that, if not taken, will hamper everything else.  The first is that the industry needs to adopt the model-driven event-to-process steering that the TMF proposed with NGOSS Contract over a decade ago.  The second is that we need to address data-plane handling in CNF, meaning containers.  The third is that we need a model that lets us encapsulate current infrastructure and management APIs in the same model structure as we use to define service deployments in CNF/VNF form.  Missing any of these things will prevent us from fully realizing service lifecycle automation.  If we miss them all…it could be ugly.
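On the third step, here’s a minimal sketch of what “encapsulating current infrastructure” might mean in practice: wrap an existing management API behind the same event-handling interface the CNF/VNF model elements expose, so the service model steers events the same way regardless of what sits underneath.  The class and method names below are hypothetical, not any real EMS or orchestration API.

```python
# A minimal sketch of wrapping an existing management API so it presents the
# same event-handling interface as a CNF/VNF model element.  LegacyEmsClient
# and its method names are hypothetical stand-ins for a real management API.

class ModelElement:
    """Common interface every element of the service model exposes."""
    def handle(self, event):
        raise NotImplementedError

class CnfElement(ModelElement):
    def handle(self, event):
        print(f"CNF element: scaling/redeploying in response to {event}")

class LegacyEmsClient:           # stand-in for a real, existing management API
    def send_command(self, cmd):
        print(f"legacy EMS: executing {cmd}")

class LegacyElement(ModelElement):
    """Encapsulates current infrastructure behind the same model interface."""
    EVENT_TO_COMMAND = {"Fault": "restore-path", "Activate": "provision-port"}

    def __init__(self, ems):
        self.ems = ems

    def handle(self, event):
        self.ems.send_command(self.EVENT_TO_COMMAND.get(event, "no-op"))

if __name__ == "__main__":
    service_model = [CnfElement(), LegacyElement(LegacyEmsClient())]
    for element in service_model:
        element.handle("Fault")   # one steering mechanism, two kinds of element
```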

Fixing What’s Wrong with our IoT and 5G Goals

Can we capture innovation and reality at the same time?  I said in an earlier blog that past IT spending waves had come along because we moved IT closer to the worker.  That simple process has driven every one of our IT revolutions, raising the rate of IT spending growth by 40% over the five-decade average.  It stopped in about 2001.  Can we get it going again?  I submit that if we can, that would be a possible way to advance things like IoT and 5G.  I also submit that making the good things happen (again) will mean thinking cloud-native.

Moving “closer to the worker”, or to the consumer, is arguably stepping closer to real-time intervention in experiences.  We started in the ‘50s by recording transactions, and moved to distributed computing and desktop computing as a means of intervening in work.  Could we actually make IT a part of a work and/or life experience?  What’s needed, and how do we systematize it?

When IT becomes part of the experience, the critical thing needed is relevance.  In traditional systems, workers seek information and consumers seek entertainment or something like it.  Getting closer, being a part of the experience, means introducing what’s needed as the need arises, not on request.  Obviously, something introduced that’s not actually relevant to the worker/consumer goal is distracting rather than empowering.  Relevance is one of those things we recognize when we see it but can’t quite define, though.

I submit that the critical ingredient in relevance is context.  Information presented in context is relevant, so contextualization is something our hypothetical next wave of IT really needs.  What creates context, though?  We do, in most transactional systems, because we provide it explicitly.  We go to the “edit your account information” screen because that’s what we want to do, and by going there we set context.  That simple mechanism locks us into the prior IT wave, though.  To get to the next one, we need to be able to use information to anticipate what a worker/consumer would ask for.  When a doctor puts out a hand and an assistant slaps the correct instrument into it, with nothing being said, we are seeing proper contextualization.  Our system has to be like a good assistant.

Context, in an IT sense, has to come from a combination of things.  First and foremost, it has to reflect the stimulus that’s acting on the subject.  IoT is vital to contextualization because it presumes the availability of sensor-based information to gather knowledge about the real world.  Combine that information with knowledge of where the subject is in that real world, and you have a picture of what’s likely stimulating the subject.

A subject on the corner of 46th and the Avenue of the Americas has a specific set of stimuli based on things like location, time of day, and weather.  That’s clearly not enough.  At the least, we’d need to know what direction our subject was traveling in.  Give it a moment’s thought and you realize we’d also have to know whether the subject is on foot or in a vehicle, the speed of travel, whether the subject is a driver or passenger, etc.

And that’s still not quite enough.  There are a lot of things that might be going on with our subject even within a specific set of these conditions.  It would help if we knew whether the subject made this particular journey regularly, and if so where the subject was typically heading and what they ended up doing when they got there.  Remember, we don’t want the subject to have to set context, so we have to rely on interpreting things based on prior behavior.

We could propose a simplified starting point for our quest for relevance here.  Context equals subject location and travel vector in combination with knowledge of geography and local conditions and behavioral history.  This is important for a number of reasons, not the least being that we’ve really not called out stuff like IoT sensors in a direct way.  Contextualization isn’t a direct consumer of IoT, it’s a consumer of information that IoT could provide.
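To make that “formula” slightly more concrete, here’s a minimal sketch (in Python) of combining location, travel vector, local conditions, and behavioral history into a guess at the subject’s current mission.  The inputs and the scoring are purely illustrative; the one part I’d stand behind is that prior behavior does most of the work.

```python
# A minimal sketch of the "context formula": combine location, travel vector,
# time, and behavioral history into a guess at the subject's current mission.
# The inputs and scoring are purely illustrative.
from datetime import datetime

def infer_mission(location, heading_deg, speed_mps, weekday, hour, history):
    """history maps (weekday, hour, location) to the mission most often
    observed there before; prior behavior is the strongest single clue.
    heading_deg would matter in a real system (toward the office vs. away),
    but this toy scoring ignores it."""
    prior = history.get((weekday, hour, location))
    if prior:
        return prior
    if speed_mps > 10:                       # vehicle speeds: likely commuting
        return "commute"
    if 11 <= hour <= 14:                     # midday on foot: likely lunch
        return "lunch"
    return "errand"

if __name__ == "__main__":
    now = datetime(2019, 7, 29, 12, 30)
    history = {("Mon", 12, "46th-and-6th"): "lunch"}
    print(infer_mission("46th-and-6th", 90.0, 1.4, "Mon", now.hour, history))
```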

The obvious questions, then, are what a direct consumer of IoT would be, and what it is that contextualization directly consumes.  There are actually a couple of possible models here, and the models are where implementation approaches come in.

The presumptive model for IoT is that the sensor data (and probably controllers as well) are “on the Internet”.  This approach has little to recommend it in terms of financial feasibility or public policy compliance, because it’s very difficult to see how the investment needed to deploy and sustain IoT could be recovered if everyone just grabbed what they wanted, or how data could be secured.  Most cloud provider planners and network operator planners tell me this approach won’t work, but we still don’t seem to be able to shake it off.

A more likely model is the “information utility” model.  With this approach, people who have IoT sensors (or the funds to deploy and maintain them) could package their sensor data into a consumable and protectable form and sell it.  That information would then be transformed into retail services by a higher-level player.  In effect, the information utility players would be something like ISPs or CDN providers, offering a feature that’s integrated into something else.

Information can be presented at many levels, of course, and so this model could evolve into one where some of the information was directly consumable at the retail level.  It’s very hard to model how all this might evolve, but it appears that the “low-level” information utility providers would most likely be sensor owners who want to exploit what they have with minimal need to provide technology enhancements or generate retail sales.  The higher-level players would likely be OTT players with a retail service vision driving them, and making an investment in specific IoT sensors and information to give them an edge in that service set.

The information utility model suggests that, since retail services would be constructed from information services that were by definition “utility” and presumably somewhat competitive, we could expect to see information utilities vie for retail service attention by improving information quality and cost.  The model also promotes the exposure of sensor/IoT information from companies who have conventional sensor/control systems whose elements are on a private network or even directly wired.

This suggests that IoT is less a network of sensors than a network of information, which doesn’t really require any new sensors or sensor/5G technology at all, but does require some model for service discovery and sharing on a large scale.  In fact, the information utility model is a poster child for the service mesh technologies that are increasingly a part of cloud-native.  Microservices, linked by a mesh that can facilitate discovery, scalability, resiliency, and more, are the real heart of an IoT system.

Our problem with IoT is that we’re trying to validate a mission for a facilitation, rather than a true service mission.  That’s our 5G problem too.  Until we recognize that “demand” for something means consumer, retail-level demand, and until we build services to address that kind of demand, we’re going to see technology innovations under-realize their potential.

More Signals the Hybrid Cloud is Driving Change

The hybrid cloud might be coming into its own.  For over a year, it’s been clear that enterprises are finally getting their acts together on hybrid cloud, and cloud providers were starting to position more for enterprise adoption.  This quarter, we’ve seen two players with the credentials to further expand the hybrid-cloud universe show signs that they intend to do just that.  We’re even seeing signs that it’s paying off for them.

Just as a quick stage-setting, let’s look at hybrid cloud.  Since the dawn of cloud computing, we’ve been fighting a misperception of how enterprises would use the cloud.  The initial focus of “migration” of legacy apps to the cloud was never realistic because it doesn’t allow for full exploitation of cloud benefits.  You need to write cloud-native apps, but even the recognition that this is essential has been difficult to secure in the market.  Developers need a framework in which to write and deploy stuff, and enterprises as late as last December weren’t seeing one.  With two players pushing frameworks, that’s going to change.

IBM closed on its Red Hat deal, and in addition announced its Kabanero cloud-native model of development, based on a whole series of existing and new open-source projects.  The Street is a bit preoccupied with Red Hat as simply a revenue stream rather than a provider of new addressable market opportunity, so we don’t have objective measures of IBM’s potential with Red Hat included, and of course Kabanero is only a week old.  The important point is that IBM’s WebSphere is an established enterprise development framework, and you need cloud-native development to build the new model of the hybrid cloud.  I think the Kabanero model will eventually unite with Red Hat’s offerings, and give IBM both a broader opportunity base and a direct pathway to hybrid exploitation.

Microsoft is, according to my rough survey, the leader in enterprise hybrid cloud today, and it beat Street estimates of cloud growth in the most recent quarter.  There are a lot of reasons for Microsoft’s success, but the big top-level one is that Microsoft has positioned Azure for hybrid cloud from the first.  They have an application-and-server presence in the market, a strong developer community, and a set of tools that were designed to create a seamless hosting framework across both Azure and Microsoft’s Windows Server platforms.

When Microsoft CEO Satya Nadella was introduced on their earnings call last week, his very first comments on technology were directed at Microsoft’s development credentials: “In a world where every company is a software company, developers will play an increasingly vital role in value creation across every organization….”  If anyone doesn’t see that the future of the cloud is in cloud-native development after reading that transcript, they’re delusional.

Microsoft’s big AT&T deal is another important point in their favor.  For a telco, a relationship with a cloud provider is most likely to mean a desire to build a business in public cloud services to enterprises, something symbiotic with their networking business with the same customer base.  Microsoft can bring a lot to the table in that space, and Microsoft is also viewed as less of a threat to longer-term telco business goals than either Amazon or Google.

IBM had some interesting comments on their call too.  “It is clear that the next chapter of cloud will be about shifting mission critical work to the cloud and optimizing everything from supply chains to core banking systems. This requires a hybrid multi-cloud open approach to provide portability, management consistency and security for these enterprise workloads. We’ve been building hybrid cloud capabilities across our business to address this opportunity and to prepare for this moment….”  Then there’s this: “We continue to see good hybrid cloud growth this quarter as clients leverage our reliable and scalable IBM Cloud Private solution, built on open source frameworks like Containers and Kubernetes.”

Telcos like IBM too: “GBS is working with another leading telco provider to advise on their digital transformation journey to the cloud. As part of this, IBM will develop and manage a center of excellence that will power an enterprise-wide Red Hat Ansible implementation for hybrid and multi-cloud platforms. This will enable them to transform their product and technology organization to support an agile DevOps culture for its developer teams, while moving its application portfolio to the cloud to reduce complexity and accelerate delivery in time to value.”

There’s surely some positive news for both IBM and Microsoft, but also some questions.  For IBM, it’s hard not to wonder whether the company really has a broad hybrid cloud strategy.  Their comments on their earnings call, unlike Microsoft’s, didn’t show the commitment to cloud-native development that I’d have hoped IBM would articulate, given Red Hat and Kabanero.  For Microsoft, the risk of a player like IBM or Google defining a cloud-native deployment model is considerable, but they don’t seem to be pushing a cohesive model of their own.

The impact of all of this on networking also raises some questions.  I’ve noted in past blogs that the telcos were on the fence with respect to “carrier cloud”.  There is no question that were all the drivers of carrier cloud exploited, it could be the largest incremental investment in cloud data centers in the global market.  However, there’s also no question that the telco community is among the least-prepared of all markets for cloud and cloud-native.  They’d love to dip some toes in before they commit to a leap, and working with a public cloud provider they don’t see as a long-term threat could be a smart move.

Google has its own hybrid cloud aspirations, as its Anthos platform announcement has demonstrated.  Anthos is explicitly a hybrid cloud story, one also based on containers and Kubernetes and leveraging other Google technologies like the Istio service mesh.  In many respects it’s similar to IBM’s Kabanero, and that’s important because it might be an indication that the framework for building cloud-native applications is going to emerge in a way not specific to public cloud at all, much less to a single cloud provider.

Like Amazon.  They’re a big wild card in all of this, and at present Amazon emphasizes the web services that can be used to support hybrid cloud more than a single platform/architecture to frame hybrid and cloud-native development.  Amazon’s challenges in the hybrid space are formidable, despite what the media and Wall Street tend to say.  Yes, Amazon is the cloud leader, but the majority of their revenue is not from enterprise hybrid cloud, it’s from content and social-media startups.  Amazon has no real presence in the data center or engagement with IT organizations.  Further, the last thing they’d want is an industry pushing a universal, portable, cloud-native middleware and toolkit model.  Market leaders don’t like the idea of facilitating open migration among cloud providers.

“Don’t like” isn’t the same thing as “will never support”, of course.  Microsoft’s Azure is growing faster than AWS, and if I’m right and the big public cloud opportunity for the next decade is enterprise hybrid cloud, then Amazon can’t surrender it.  Amazon is no slouch in technology, and the need for a strong architecture and development model for hybrid cloud is so critical to the cloud, and to software evolution overall, that there will be plenty of benefits to drive competitive efforts.

We’ve been seduced by terminology, in a sense.  “Migration” to the cloud implies that all that needs to be done is to relocate stuff.  That’s never been true.  The cloud is a different hosting model, a different networking model, and it needs a different development model.  We’ve ignored that truth and by doing so we’ve slowed the adoption of cloud technology by limiting the benefits available to justify its use.  What’s happening now—finally—is that we’re framing that new cloud-native model.  It doesn’t change the business rules of technology transformation, so benefits are still necessary to drive change.  It does provide a path to those benefits that more and more developers will be able to follow.

A final point here is in order.  Microsoft’s statement “In a world where every company is a software company….” joins, in my view, the “The network is the computer” quote of Sun’s John Gage as one of the two most profound and predictive comments of our age.  Gage’s comment in 1984 should have been the starting point for the cloud era, but somehow it caught on as a slogan and not as a transformational mission.  And as Nadella said, today every company is a software company, because software is the glue that binds technology to human experience.

Shedding the Hype and Realizing the Future

According to a Reuters article the other day, blockchain is a “shiny mirage”.  The administrator of LinkedIn’s Carrier Ethernet group asked in a post on 5G, “Who (which customer) exactly is clamoring for this?  And, how does it make sense, when 4G itself hasn’t been adequately monetized yet?”  Why is it that over the last 30 years, over 85% of telecom projects have failed to generate any significant deployment?  In many cases, the ones that have failed the most spectacularly have been the ones that led the news cycle.

We clearly have a problem in our industry, and many would argue it’s not confined to technology.  I pointed out a couple weeks ago that as technology gets more complicated, it becomes more difficult for buyers to acquire the skills needed to make even a basic assessment of value.  Without such an assessment, it’s hard to get a project going, and in particular hard to get one going the right way.  I’ve also pointed out that in the age of click bait, the value of something lies in its ability to generate ad revenue, which has more to do with its novelty than its objective business case.  I’ve also pointed out that we have a history of doing things the wrong way, propagating the limits of the past into the future to reduce everything’s potential to drive incremental benefit.  Which of these is the real problem, or is it all of them?

I’ve lived through the IT revolution, in a very real sense.  I can still read Hollerith card punch code and I could struggle through paper tape.  I remember when a top-end IBM PC had two 128KB floppy drives and 128KB of RAM, and when even “mainframes” had only 160KB, when network connections at 1200 bps were fast, and when everyone wrote programs in assembler (machine) language.  Our industry has generated some profound changes, some dazzling innovations.

It’s that incredible past that makes all the over-hype and under-shoot experiences of today so difficult to understand.  We had three waves of IT investment explosion in my career, each of which corresponded to a revolution in computing.  The last one ended in 2001, and nothing has come along since.  I submit that the period around 2001 is where we need to look to understand what’s happening now in the industry, what changed for the worse.

The thing that had been propelling successive waves of IT spending growth (averaging 40% above the “average” rate of growth over five decades) was the simple paradigm of moving IT closer to the worker to improve productivity.  I did a presentation on this to Wall Street in the early 2000s, and the biggest brokerage firm flew its strategists in from all over the world to hear it.  Moving from punch-card batch to personal computer real-time raised productivity, and that justified IT spending growth.  And by the late ‘90s, we’d given more computing power to a worker, on the desktop, than we had in the data centers four decades earlier.  And that was where our problems started, not with the power but with the question “What now?”

By 1998, we had started a NASDAQ boom when the objective tech spending couldn’t justify it. The third wave of IT spending boom was, by then, on the wane.  Many of you remember that this was when we started to hear all the high-flying networking stories, the era of wholesale bandwidth and how Internet users were worth fifty thousand dollars each in valuing your stock.  If you can’t drive stocks with the truth, then fables will serve.  This, of course, resulted in the NASDAQ dump, which decisively ended the last of our waves of tech-spending boom.

It also brought in Sarbanes-Oxley, designed to keep Street analysts from creating fiction to boost stock bubbles.  The problem was that SOX meant the industry that had previously been pushing us into the future with revolutions now found that nothing beyond the current quarter did anything for their stocks.  Strategy gave way to tactics, to sales.  Forget the future because we can’t wait for it.

This is when media started to change, before the online pub era ever really got going.  Instead of having subscriptions we had ad-sponsored publications.  Ad sponsorship means sales-focused, because what the companies pushing the ad dollars wanted, remember, was sales in their current quarter.  In the year 2000, the most influential source of information for tech buyers, outside the experience of a trusted peer, was the content of five key tech publications.  Five years later, after those publications had been absorbed into the ad-and-sales swamp, the influence score of publications overall had declined by almost a third.

At about this time, we saw a major change in IT spending.  Until 2005, the annual tech spending of enterprises was roughly 50% budgetary modernization and incremental improvement and 50% new-project-new-benefits.  In 2005, the new-project contribution began to decline, and by 2012 it was only 20% of total IT spending.  This project stuff had driven our past waves of IT spending growth, and we no longer had it.

Online information resources, and dependence on them, were already growing in 2005, and by 2010 online pubs dominated the industry.  The difference with online publications is that you can see what’s interesting to readers right down to the article.  Knowing what gets clicked on and who’s clicking lets you target ads, but it also lets you target material.  People used to “get” a magazine; now they got a list of URLs and clicked on what was interesting.  Thus, you get more clicking if you make everything interesting, and once a click has been made you’ve reaped the ad value no matter whether the story is 200 words or 2,000, or whether it’s even read.  Remember the joke story “Man Bites Dog?”  “News”, as I’ve often said, means “novelty”, not “truth”.

I could see the change in ad-sales mindset in my own consulting practice.  Startups were once my major customer base, because they wanted to know what strategy to follow to make them a success.  By 2010, the goal of “consulting” was to get consultants to say a company’s strategies were a success instead, regardless of what they were doing.  I had a vendor VP say in a meeting “Why should I do what you say just to get the market to accept my products when I can buy reports telling them to accept them?”

This is why blockchain was overhyped, why NFV was overhyped, why 5G was overhyped.  It’s also why part of the overhype excess was promotion beyond what market reality could support, while another part was a failure to realize what could have been.  That failure was caused in large part by the failure to assess and address reality, and that in turn by the conviction that reality would have taken longer than an earnings quarter in any event, so why bother with it.

We see the news others pay for us to see.  Should we be surprised that it serves the interest of the payer and not the reader?  We buy things that can make the numbers of vendors for the current quarter, so should we be surprised that revolutions all seem to be falling short?  Operators are always telling me that vendors are “dragging their feet” to address operator problems.  True, but that’s because they’re hurrying to address their own, and they’ll keep doing that until operators take charge in a meaningful way, not just beat their breasts and cry.

We've had progress in the industry, both before that critical year-2000 point and after.  Before 2000, it was created by a culture of innovation; now it's created by a few insightful innovators like Steve Jobs, so good they could break out of the stifling cotton-ball of near-term sales and clicks on URLs.  But waiting for a Steve Jobs to come along and drive something forward is like counting on winning the lottery as a personal finance plan.  Innovation has to be systematized or you can't depend on it.

There is one area where things are still working, and working well.  The cloud, and its associated development tools and techniques, are thriving.  Kubernetes and containers are rapidly becoming the de facto architecture for application deployment in the cloud, which makes them the de facto model for application development.  It's not that Kubernetes has been immune from hype, or that the open-source community has a spotless record of doing the right thing while ignoring the cynical market pressures everyone else seems to be succumbing to.  It's that a market collective has grown up around the cloud, and that collective is, on balance, doing the right thing, even if it's doing other things too.  Not as fast, perhaps, as a clever vendor might have, but certainly fast enough to meet market needs.  We are getting a cloud-native framework for the future, despite all the problems I've cited here, through our collective.

Could this work in other areas, other hype magnets like 5G or IoT?  The concept of collectivism would certainly be applicable, but there’s a barrier here in the buy-in price.  Cloud-native doesn’t have what operators would call a high first cost, a big investment needed just to get something going at a credible scale.  IoT and 5G, at least in the form we’ve been discussing them, do.  So, we need to discuss them in different forms.  We need to creep into a collective for both by addressing not the things themselves, but some things that could drive 5G and IoT.  That’s going to be the topic of some future blogs.

IBM Takes a Big Step Toward Cloud-Native

Everyone talks about cloud-native, but like the weather, nobody seems to be able to do much about it.  In fact, the interest in the concept often seems so abstract that you could argue “cloud-native” is a state of mind.  It’s not, of course, but clearly something is missing between the raising of the interest profile, even at the CIO level, and realizing the goal in real deployments.

If you’ve followed my blog, you know that it’s been my view that applications for the cloud have to be developed within an ecosystem of middleware and tools that create a kind of virtual operating system.  Applications are then written for it, and because they obey rules that were designed to impose cloud-native principles and exploit the cloud’s unique capabilities, they’re “cloud-native” out of the box.  We’ve known for a year or more that containers and Kubernetes were the foundation of this virtual operating system, the “Kubernetes ecosystem” I’ve mentioned from time to time.  We’ve also known that more was needed, and IBM is now stepping up with what might be the most organized attempt to define that virtual operating system that’s come along.  It’s called “Kabanero”.

Out of curiosity, I tried to see if the name meant anything clever.  It sounds Spanish, but Google Translate says it's Maori for "diagnosis".  I don't think IBM had that in mind, so I'll leave it to IBM's PR people to come up with the meaning, if there is one.  Meanwhile, let's look at the concept itself.

Kabanero is a Kubernetes ecosystem designed to facilitate the development of containerized, cloud-native applications.  Besides Kubernetes, it includes Knative, Istio, and Tekton, plus three new open-source projects (Codewind, Appsody, and Razee).  For those who don't know the details of the first three, Knative is a scale-to-zero "serverless" Kubernetes extension, Istio is a service mesh, and Tekton is a Kubernetes-centric CI/CD kit.  Of the new elements, Codewind is a container-friendly IDE extension for popular development environments like Eclipse, Appsody is a framework for creating and managing runtime stacks for containerized applications and a repository for popular stacks, and Razee is a multi-cluster, at-scale Kubernetes distribution/deployment framework.

All of this is hard for non-developers (and even some developers) to get their heads around.  Remember, though, that the goal of Kabanero is to provide a complete virtual operating system that makes the cloud look like a computer, with special features like scalability and resiliency.  It enforces practices on developers to exploit those features, and it operationalizes the results to reduce the inconsistencies in deployment and application operations that would inevitably creep in if every developer were to approach containers and cloud-native in their own way.
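To make that concrete, here's a minimal sketch (my own illustration, not anything drawn from Kabanero or its documentation) of the kind of service such a virtual operating system assumes: stateless, configured from the environment, exposing a health endpoint for the orchestrator, and shutting down gracefully so a Knative-style platform can scale it to zero and back without dropping work.

    package main

    import (
        "context"
        "log"
        "net/http"
        "os"
        "os/signal"
        "syscall"
        "time"
    )

    func main() {
        // The platform, not the developer, decides where the service listens.
        port := os.Getenv("PORT")
        if port == "" {
            port = "8080"
        }

        mux := http.NewServeMux()
        // A health endpoint the orchestrator can probe before routing traffic.
        mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
            w.WriteHeader(http.StatusOK)
        })
        // The business logic is stateless; any instance can answer any request.
        mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("hello from a disposable instance\n"))
        })

        srv := &http.Server{Addr: ":" + port, Handler: mux}
        go func() {
            if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
                log.Fatal(err)
            }
        }()

        // Exit cleanly on SIGTERM so the platform can scale instances down (or to zero).
        stop := make(chan os.Signal, 1)
        signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
        <-stop
        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
        defer cancel()
        srv.Shutdown(ctx)
    }

The point isn't the dozen lines of Go; it's that when every service follows the same conventions, the ecosystem's tooling can deploy, scale, mesh, and monitor all of them without per-application customization.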

This is a major step forward for containerized apps and cloud-native, because of that creeping disorder and inconsistency.  If we go back to the old days of computers, we'd come to a time when there was no standard disk I/O, no middleware or operating system tools.  Imagine how difficult it would be today to keep data in harmony across a major interconnected enterprise if every application that needed relational database capability created its own model.  Not only would writing applications have been incredibly complex, running them properly would have been an almost-insurmountable challenge.

RDBMS would have languished without a common model.  Cloud-native would do the same, so what Kabanero promises to do is unleash cloud-native, creating a pathway to the goal rather than just a PR slogan.  This raises two questions.  First, what will unleashed cloud-native actually do, and mean?  Second, why is IBM making such a big move here?

The biggest fallacy in cloud computing was the notion of "migrating to the cloud".  Yes, there are some applications that under-utilize server resources enough that public resource pools could save money for their users overall.  Yes, there are some users who can't maintain their own data center resource pools properly for lack of access to the skilled workforce involved.  Both of these are limited opportunities.  The benefits of the cloud require that you write applications for it, not move applications to it.

What we’ve seen so far in cloud computing is a combination of two things—enterprises who are able to build new application front-ends using cloud principles, and startups who needed cloud properties from the first and pulled together the tools needed to build real cloud-native behaviors.  Kabanero could mainstream cloud-native.

The first thing that will do is revolutionize application development.  Most of the code run today was designed on principles that were already old when the cloud came along.  Cloud-native will require that we redo that, rethink all the application architecture, design, and operations principles we’ve lived with for decades.  How revolutionary is this?  Recall my criticisms of operator transformation projects like NFV and ONAP.  These projects needed cloud-native thinking, and they’re either in big trouble or will be because they didn’t have it.  But going back to redo all that stuff in cloud-native form will be very difficult at this point.

Application development here means more than just software architects and programmers churning out code.  The tools needed for development in a cloud-native world are different, as the components of Kabanero show.  Application lifecycle management, continuous integration and delivery, service discovery, scalability, resiliency, DevOps, GitOps, and just about everything else we know today will be impacted by cloud-native.

The second thing this will do is revolutionize the cloud.  Cloud providers today have tended to create their own framework for cloud-native, on the simple theory that this would differentiate them during cloud consideration and lock users in once code was developed.  A uniform framework for cloud-native will eventually force cloud providers to adopt it.  Don’t believe that?  Look how quickly everyone’s managed container services have become managed Kubernetes services.

Is this bad for cloud providers, then?  In a sense, perhaps, because it means not only that differentiation among public cloud providers will be more difficult, but also that competitors who’ve not made the top tier (like, obviously, IBM) will have a shot.  But the total value of incremental cloud services that cloud-native could drive approaches a trillion dollars per year, so a little jostling for the market won’t offset the fact that there’s going to be a lot more market to jostle for.

The third result of cloud-native is the death of the machine.  Virtualization all too often has locked us into a historical backwater instead of freeing us.  NFV created virtual devices to build networks with, and so built the same kind of networks we had all along.  Virtual machines were just as operationally complex and as inflexible as real machines.  Containers are different.  They’re an abstraction of what you run, not what you run on.  If the cloud world is virtual, containers are the closest thing we have to creating it, because they don’t constrain it.

Forget VMs, IaaS.  The future is containers and container hosting.  That’s the net message of Kabanero, and that’s what IBM is clearly banking on.

This could actually be a very smart move for IBM, one I must admit I hadn't expected.  In a tactical sense, Kabanero links an open, pan-provider framework for cloud-native to IBM's still-strong-and-with-Red-Hat-stronger position in development.  If IBM makes Kabanero work, they erode the benefit of being one of the Big Three in the cloud, because anyone who can host Kabanero can win the largest piece of the cloud pie.

In a strategic sense, this could be even bigger.  I learned programming on an IBM computer, and for most of my career in software development, architecting, and management, I looked at IBM as the thought leader.  Up to about a decade ago, they were the unchallenged leader in strategic influence on enterprise IT.  It made them a boatload of money, and they threw it away.

Kabanero could get it back for IBM.  Add in the Red Hat dimension, which further improves the base of businesses that IBM could expect to influence with Kabanero, and you have what might be a really compelling story.  Will it be, though?  IBM has done great things and dumb things, and for sure the trend over the last decade has been toward dumbness.  They could still mess up both Red Hat and Kabanero, and I think if they do, they’ve probably lost forever.  They could also get it right, and if that happens, we might see a revival of those old IBM notepads embossed with the single word “Think”.  For sure, it’s about time somebody started doing that, and for that start alone I applaud IBM’s efforts.

What Went Wrong with NFV: The Operator View

Most would agree NFV has not met expectations.  I’ve blogged about the things I believe are the issues, and the feedback I got from operators on those blogs has included operators’ own views on the subject.  They’re not always congruent with my own (in some cases they’re almost contrary), but they are always interesting, given that operators are the ones who will have to make NFV a success.

What is the number one problem with NFV?  By a narrow margin, the first choice of the operators who contacted me is that VNF licensing fees are much higher than expected, which means the cost of a service based on hosted virtual functions is higher, and operator profits are lower.

On the surface, you can see both sides of this issue.  On the one hand, operators expected that if device vendors spun out their software features as one or more virtual functions, the function licensing would be significantly cheaper than buying the boxes would have been.  That seems logical.  On the other hand, the device vendors say that the cost and value of their product is more tied into things like the R&D for the software and support of software-hosted features than it is in the boxes, which in many cases are simply OEMed in some form.

Still, this is the operator complaint where I part company.  You don't build a successful service model by setting the prices for others' participation.  If operators thought that device vendors would make sweetheart deals for their functionality, they had absolutely no justification for that thinking.  From the very first, it was clear that vendors would price their VNFs to protect their revenue.  It was clear to anyone who thought about it that open-source software should be prioritized as a source of VNFs.  Such a move would give operators a guarantee of a lower cost point, and would also provide a free-market incentive for device vendors to manage their VNF pricing.  The notion never really took off, but a major initiative to be more inclusive of open-source could still be mounted, and could still benefit NFV overall.

The close second in the area of operator problems with NFV is that onboarding VNFs has proven to be much more difficult than expected, amounting to a one-off for every VNF targeted.  This problem has many roots, which makes it more difficult to address at this late date.

In my response to the Industry Call for Action paper released in late 2012, I said that one of the critical points in NFV implementation was to define what I called an “NFV Platform-as-a-Service” definition, a framework of APIs into which all VNFs would integrate.  This move would enable operators to create, contract for, or require as a condition of use, a standardized “adapter” that would expose all control, parametric, and management APIs and data in a common way.  There have been a few strategies to address this approach, but the ETSI NFV ISG has not defined the NFVPaaS.
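To illustrate what I mean (and this is purely my own hypothetical sketch, not anything the ISG or the Call for Action defined), an NFVPaaS adapter could be as simple as a common contract every onboarded VNF has to expose, so lifecycle and management tooling never has to be rebuilt per function:

    package nfvpaas

    import "context"

    // VNFAdapter is a hypothetical common surface for any onboarded VNF,
    // covering lifecycle control, parametric configuration, and management
    // data, all exposed the same way regardless of vendor.
    type VNFAdapter interface {
        // Lifecycle control
        Deploy(ctx context.Context, params map[string]string) error
        Scale(ctx context.Context, instances int) error
        Terminate(ctx context.Context) error

        // Parametric configuration
        SetParameter(ctx context.Context, name, value string) error
        GetParameter(ctx context.Context, name string) (string, error)

        // Management and telemetry
        Health(ctx context.Context) (HealthStatus, error)
        Metrics(ctx context.Context) (map[string]float64, error)
    }

    // HealthStatus is the uniform health report the adapter returns.
    type HealthStatus struct {
        Ready   bool
        Message string
    }

With something like this in place, "onboarding" a VNF becomes writing (or requiring the vendor to supply) one adapter, rather than mounting a one-off integration project per function.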

Part of the reason is that operators, in an effort to minimize their role in a consultative sale of VNF-based functionality, chose to emphasize well-known vendors, and to prioritize their integration.  The combination of “use what we know” and “do things fast” encouraged a customized approach, something that the early vendor-led proof-of-concept program accentuated.  With no common goal, these separate PoCs didn’t develop a common model for integration.

Issue number three from the operators was that most NFV applications are really not about "cloud-hosting" functions at all (as the Call for Action paper said NFV would be), but rather are simply a shift from a proprietary device to "universal CPE".  This is true, but in my view it's really the fault of the operators in the ISG and not the specs or the vendors.

The real issue here is the box-centricity of operators themselves.  Operators, for a variety of (not very good) reasons, fixated on NFV as the substitution of virtual functions for physical devices on a 1:1 basis.  That was an early, conscious decision made by committee heads who were uniformly operator personnel.  Part of the reason was that operators wanted to be able to do NFV management like they've always done management, meaning as an element-to-network-to-service progression.  The "elements" used to be physical devices, so they are now "virtual" devices.  Another reason is that operators built networks from boxes for literally decades and just couldn't see any other way.  Look at "service chaining", which implies that the whole of VNF connectivity is the emulation of physical interfaces and cables between boxes.

A broader vision for NFV, including a vision where cloud-hosting of more general functions was the goal, would have exposed a bunch of shortcuts in management and service modeling that would have slowed progress.  To avoid those, the ISG elected to focus work where there were fewer exposed issues, which is how we got to uCPE.  Those issues have still not been addressed, and so it’s going to be a lot of work and take a lot of time to fix this one.

The next operator issue is, in my own view, related to the last one and even to some of the others as well.  NFV benefits have proven difficult to obtain at the pace and level needed to justify the investment in the new technology.  This is the problem that CFO teams tend to focus on, but of course NFV is still largely driven by the CTO organization, so it falls down the list in number of mentions.

By the fall of 2013, it was clear to nearly every operator that capex reduction, the justification for NFV cited in the 2012 white paper, would not be sufficient to drive NFV adoption on a large scale.  In a meeting I had with most of the operators who signed that white paper, the sentiment was “If we want a 25% reduction in capex, we’ll just beat Huawei up on price!”  Clearly, the group was recognizing that significant opex benefits were needed to augment the modest capex reductions expected.

As it turned out, even those modest capex reductions were problematic.  First, as already noted, VNF licensing costs were higher than expected.  Second, it’s more operationally complex to host a function and keep it running in the cloud than to have it live in a box somewhere, which means opex for VNFs might be higher, offsetting some capex reductions.  This truth was reflected at least a bit in the second industry paper, released later that fall.
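To see how quickly the arithmetic turns sour, here's a back-of-the-envelope sketch with purely illustrative numbers of my own (not operator data): even the "beat Huawei up on price" benchmark of 25% capex savings nearly vanishes once opex rises modestly.

    package main

    import "fmt"

    func main() {
        // Hypothetical annual cost per service, in arbitrary units.
        capex := 100.0
        opex := 150.0

        capexSaving := capex * 0.25 // the 25% benchmark cited above
        opexIncrease := opex * 0.15 // assumed extra cost of hosting and operating VNFs

        net := capexSaving - opexIncrease
        fmt.Printf("net annual saving: %.1f of %.1f total spend (%.1f%%)\n",
            net, capex+opex, 100*net/(capex+opex))
        // Prints: net annual saving: 2.5 of 250.0 total spend (1.0%)
    }

The exact numbers don't matter; the point is that without real opex relief, hosted functions can end up costing roughly what the boxes did.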

By that time, though, the ISG had agreed to make service lifecycle management out of scope for NFV efforts, hamstringing any innovative way of dealing with what was clearly more a cloud-related problem than a box-management problem.  The end-to-end (E2E) functional diagram of NFV's architecture had also been approved, and while it was supposed to be a "functional" model, it was interpreted literally by almost every one of the open-source NFV software initiatives.

I found the last of the operators' widely held negative impressions of NFV the most interesting.  NFV is too complex.  Darn straight it is, because virtually every decision that could have been made to simplify it was made in another direction.  You start with a low-level architecture, and then you start testing it with use cases?  That's what happened, and how likely is it that an architecture built without regard for what it was supposed to support would end up supporting what was needed?  So you glue on little pieces to fix tactically what you should have prevented strategically.

This is still the most important issue, though, because it’s one we’re still facing with zero-touch automation and ONAP, with 5G Core, and with future standards and specifications.  We’ve had clear evidence, clear even to operators, that the NFV process went awry.  I’m not harping on that to lay blame, but to prevent the same kind of process thinking from going wrong in the next thing we try to do.  You can’t retrofit past initiatives, but you can realign current ones.

Is Cisco’s Acacia Deal their Best Move?

Cisco's decision to buy Acacia has drawn praise and some pans from Street analysts worried about what the deal will do to Acacia's sales to Cisco's competitors.  There are reasons for the deal, I think, that don't seem to be coming out in the media, and in some ways it seems to buck a trend toward software and virtualization.  Obviously something is going on in networking, and obviously everyone isn't seeing their proper course of action in the same way.  What is happening, and what should be done?

The big problem that network vendors face is the cost pressure on network buyers.  The network operators have complained about lower profit per bit, and new services like 5G will only exacerbate the problem since they promise a lot more bandwidth for little or no increase in price.  In the enterprise space, we’ve been in a fifteen-year period of decline in the number of new projects that drive new benefits and justify new network capacity.  As a result, the enterprises are trying to create better business connectivity at little or no increase in cost.  All this “little-or-no” stuff rolls downhill to impact sales.

Any time an ecosystem is under pressure, there’s more competition and predators (vendors in this case) will try to shut others out of the food chain.  If you look at long-haul IP networking, it’s obvious that having an optical layer dedicated to transport, connecting to an electrical layer for connectivity and service delivery, is a complex structure.  If you had optical interfaces for switches and routers that could support long-haul connectivity, you could drive glass directly and cut out the middleman.

That’s probably what Cisco has in mind with Acacia.  They’ve been a supplier of optical stuff for Cisco, and owning them offers Cisco both improved margins on the boxes sold for long-haul missions and the opportunity to influence the development directions and priorities of the company.  How this will impact the Cisco competitors that already deal with Acacia is another matter, but I don’t think the short-term impact will offset the benefits to Cisco.

What might be wrong with the deal is that it’s inherently consolidative.  A growing ecosystem is better for vendors than constant fights over the diminishing fruits of a shrinking one.  Having more beneficial bit-pushing missions would be great for Cisco, and of course great for everyone else.  I’ve long believed that Cisco has shied away from market-education-based advances in the use of networking on the theory that raising all boats is philanthropy and not business strategy.

Not to mention the basic truth that market education is hard, and getting harder.  In a recent post on the Carrier Ethernet LinkedIn Group, Vishal Sharma, the group's leader, asked why we were getting all these stories about the faster video download speeds of 5G, when it was far from clear that any significant population of users wanted to download videos at all rather than simply stream them.  My answer was that all online technical publications these days are ad-sponsored, and a story like "The 5G You'll Never Know You Have but Might Pay More For" is probably not going to play a big role in revenue planning.  Vendors want to push stuff by making everyone believe they need it, but they don't want to take the time to show them the value either.  No vendor support, no publication support, no education.

That doesn't mean that some vendors aren't looking at options other than consolidation.  Virtual networking, virtualization, and cloud computing all have a connectivity dimension.  This new model of connectivity, one that doesn't bind services to the basic features of network devices that long predate today's hot connectivity missions, is surely a part of the future.  Not only that, it's getting socialized enough that a lot of the heavy lifting of buyer education is either being done or being swept aside by enthusiasm.

Juniper, a long-time Cisco rival, had its own announcement, extending the Juke toolkit it acquired with HTBASE to multi-cloud Kubernetes container orchestration.  Containers are increasingly seen as the foundation for cloud-native development, and Kubernetes and its growing ecosystem are the go-to orchestration framework at the center of any container deployment.

Cisco, Juniper, and other network vendors have also gotten into virtual networking directly.  Most now have SD-WAN strategies, but it’s the SD-WAN space that demonstrates the peril of virtual networking for network vendors.  It’s actually easier for a cloud vendor, such as VMware, to make a splash in virtual networking and SD-WAN, for two reasons.  The obvious is that it doesn’t undercut their own “real” network sales.  Nokia always struggled with Nuage positioning to avoid overhanging its router business.  The not-so-obvious is that buyers look more to cloud software players than network players for solutions.

The real driver of virtual networking overall is the cloud.  Even SD-WAN, a subset of virtual networking aimed at creating a universal IP VPN technology that includes cloud-hosted applications, is seen less as a network technology than as a cloud technology by many buyers.  Those who see otherwise tend to be looking for nothing more than basic small-site extension of MPLS VPNs, so there’s little differentiation to be had.

Virtual networking is actually the hottest thing in networking overall, if one judges “hotness” by the potential for revolutionizing both how networks are built and what benefit they bring to those who build them.  The problem is that virtual networking is linked to network-as-a-service, and the combination of the two requires even more education than either topic would alone.  In the era of click bait, it’s hard to get anyone willing to take the time to help buyers understand enough to build the business case.

My modeling has consistently shown that new cloud-native applications could address an additional $900 billion per year in revenues.  Reaping that requires the cloud, of course, and the cloud requires a virtual hosting model, a virtual networking model, and a software architecture and middleware kit to support facile and confident development.  Where the cloud community has been leaping ahead, relative to networking for example, is that it’s been able to foster a series of vibrant open-source projects to provide all the necessary stuff.

Juniper, here at least, has perhaps the best approach of the network vendors.  “Ride the coat-tails of the cloud” may sound a bit pedestrian in a strategic sense, but it could get the job done.  The risk for Juniper is that their positioning, which has given a whole new meaning to that “pedestrian” characterization, will fall short of an actual ride.  You don’t win by missing the jump or being dusted off opportunity’s coat-tails after all.  Juniper’s link of Juke to multi-cloud sells the technology short.

At least Juniper knows it needs to find coat-tails to ride, which is more than some can say.  Nokia is still the champion of virtual-network under-realization.  Nuage has been my candidate for best SDN strategy of the age, and Nokia still isn't really taking advantage of it.  In the enterprise space, VMware's NSX has five times Nuage's name recognition in my limited survey.  Even among operators, Nokia's position is at best a tie with VMware's.

How about Cisco, whose Acacia deal is what started this piece off?  Well, Cisco has never wanted to be a thought leader, only a thought manipulator.  I don’t mean that as cynically as it may sound; Cisco’s goals are tactical because 1) it fits a sales-driven company to think that way, and 2) you don’t need to be strategic if your competitors are bumbling about in the strategy sense.  Cisco’s risk is Nokia’s or Juniper’s success, either because they suddenly get some smarts or because they stumble onto the path to those cloud coat-tails.  Cisco will need to watch how things unfold with the cloud to ensure they’re ready if the right strategic steps suddenly become clear enough to stimulate action.