2019: It’s About Facing Reality

How different is January first from December thirty-first?  Not much, in most ways, and of course January first is already behind us.  Still, businesses think in terms of quarters and years, so it’s fair to look ahead at our annual transition point to see what important things are coming down the pipe.  That’s particularly true this year, because we’re reaching critical mass in the most important technology transition of all.

Let me start by referring to an interesting Red Hat survey blog on businesses’ IT priorities.  What I find interesting here is less the absolute numbers, or even the way the priorities line up for this year and last, than the pattern of focus year over year.  I think you could read the chart as saying that we started to face reality in 2018.  Look at the 2017 priorities: cloud and security.  These are the tech equivalent of mom’s apple pie; they’re the kind of response you’d expect someone to offer off the cuff.  Now look at the top priority for 2018: operations automation.  That’s reality.

Well, it’s at least a realistic goal.  The challenge for operations automation, which I’ve been calling lifecycle management, is that it’s easy to love it and hard to do it.  Translating the goal into realistic technical steps is critical to fulfilling it, and we’ve so far not been able to do that.  I think we’re starting to see, even if it’s still through some fog, the shape of the concept that’s key to our achieving IT goals…virtualization.

There’s implicit evidence that nearly everyone agrees virtualization is the key to the future, tech-wise.  The problem hasn’t been accepting the primacy of virtualization as the issue of our time, but realizing it in a way that harvests all the reasons for putting it on top in the first place.  Virtualization of both applications and resources provides the mechanism for generalizing tools, which is the only path that can lead to a good place in terms of lifecycle management for either applications or services.

That means having the right abstraction and the right realization of it.  The abstraction part of virtualization creates what appears to a user as a “real” something, and the realization implements the abstraction to make it usable.  A “virtual machine” or VM has the properties of a real server, and we host VMs on a real machine via the intermediary tool of a hypervisor.  Anything that implements the abstraction fully is equivalent to everything else that does so, but the “fully” part is important.
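To make that split concrete, here’s a minimal sketch in Python (the class and method names are mine, purely illustrative): the abstract class is the “machine” the user sees, and any realization that implements it fully is interchangeable with any other.

```python
from abc import ABC, abstractmethod

class VirtualMachine(ABC):
    """The abstraction: the properties a 'machine' presents to its user."""
    @abstractmethod
    def start(self) -> None: ...
    @abstractmethod
    def stop(self) -> None: ...
    @abstractmethod
    def attach_volume(self, volume_id: str) -> None: ...

class HypervisorVM(VirtualMachine):
    """One realization, hosted on a real machine via a hypervisor.
    Any other realization that implements the abstraction fully is,
    to the user, equivalent to this one."""
    def start(self) -> None:
        print("booting guest through the hypervisor")
    def stop(self) -> None:
        print("halting guest")
    def attach_volume(self, volume_id: str) -> None:
        print(f"mapping volume {volume_id} into the guest")
```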

The nice thing about virtual machines is that we know what the abstraction is—a machine.  That makes realization fairly straightforward too, because there’s a single accepted model of features and capabilities that an implementation of a virtual machine has to meet.  Even with VMs, though, there are some vague aspects to the abstraction that could bite users.  For example, are VMs resilient?  If so, there should be some specific parameters relating to how fast a replacement is offered and whether what’s run on the old is somehow moved to the new.  Users of VMs today know that implementations vary in those important areas.

The abstraction that NFV is based on, the “virtual network function” or VNF, is a good example of excessive fuzz.  The best we can say for a definition is that a VNF is the virtual form of a physical network function (PNF), meaning a device or appliance.  The obvious problem is that there are tens of thousands of different PNFs.  Obviously an abstract VNF representing a “router” is different from one representing a “firewall”, so what’s really needed with NFV (and in general with virtualization) is the notion of a class hierarchy that defines detailed things as subsets of something general.

We might, for example, have a “Class VNF”.  We might define that Class to include “Control-Plane-Functions”, “Data-Plane-Functions”, and “Management-Plane-Functions”.  We might then subdivide “DPFs” into functional groups, one of which would be “routers” and another “firewalls”.  The advantage of this is that the properties that routers and firewalls share could be defined in a higher class in the hierarchy, and so would be shared automatically and standardized.  Further, everything that purported to be a “Firewall” would be expected to present the same features.
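Here’s a sketch of what such a hierarchy might look like (the class names and methods are hypothetical, not drawn from any NFV specification).  Shared properties live in the parent classes, and anything that claims to be a “Firewall” has to present the firewall features.

```python
from abc import ABC, abstractmethod

class VNF(ABC):
    """Root class: properties every virtual function shares."""
    @abstractmethod
    def deploy(self, host: str) -> None: ...
    @abstractmethod
    def health(self) -> str: ...

class DataPlaneFunction(VNF):
    """Shared by everything that forwards traffic."""
    @abstractmethod
    def forward(self, packet: bytes) -> bytes: ...

class Router(DataPlaneFunction):
    """Concrete vendor routers would subclass this and inherit the rest."""
    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None: ...

class Firewall(DataPlaneFunction):
    """Every 'Firewall' presents the same features, whoever built it."""
    @abstractmethod
    def add_rule(self, rule: str) -> None: ...
```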

The primary challenge in realizing an abstraction is making sure your implementation matches the features of the “class hierarchy” the abstraction defines.  That’s not always easy, not always done, but it’s at least broadly understood.  The challenge in realizing virtualization comes in how we visualize the entire implementation and map it to hardware.  If there is no systemic vision of “infrastructure”, then implementations of abstractions tend to become specialized and brittle (they break when you bend them, meaning when you make changes).

In modern virtualization abstractions, the goal is to utilize a collection of resources that are suited to fulfill parts of our implementation—a “resource pool”.  That resource pool is essentially a distributed platform for realization, and to make it useful it’s important that everything needed to utilize the distributed elements is standardized so everything uses it consistently.
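As a toy illustration, assuming nothing about any particular orchestrator: if every host in the pool is described the same standard way, a placement function that understands only that description can use any of them interchangeably.

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    free_cpu: float      # available vCPUs
    free_mem_gb: float   # available memory

def place(cpu: float, mem_gb: float, pool: list[Host]) -> Host:
    """Pick the first host with enough headroom; a stand-in for whatever
    scheduling policy a real orchestrator would apply."""
    for host in pool:
        if host.free_cpu >= cpu and host.free_mem_gb >= mem_gb:
            host.free_cpu -= cpu
            host.free_mem_gb -= mem_gb
            return host
    raise RuntimeError("no capacity left in the pool")
```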

What’s happened in 2018 in cloud computing is that we’ve taken some specific steps to do what’s necessary.  We’ve created a standard framework into which applications deploy, which we can call “containers”.  We’ve settled (pretty much) on a way of deploying application components into containers in an orderly and consistent way, which we call “orchestration” and for which we’re increasingly focused on Kubernetes.
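For a sense of what “orderly and consistent” means in practice, here’s a minimal sketch using the official Kubernetes Python client (the names, image, and replica count are arbitrary examples, and it assumes a working kubeconfig): you declare the desired state and the orchestrator makes the container hosts match it.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig is already set up

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # Kubernetes keeps three copies running
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="web", image="nginx:1.25")]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```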

What we realized in 2018 was that this basic container/Kubernetes model wasn’t enough.  We need to be able to generalize deployment not only across different container hosts, but also between data centers, in a hybrid cloud, and in multi-cloud.  There are a variety of tools that have emerged to fulfill this extension mission, just as there’ve been a variety of container orchestration tools.  We settled on one of the latter in 2018—Kubernetes.  We should expect to settle on a cloud-extension tool in 2019.  Stackpoint?  Rancher?  We’ll see.

There’s less disorder in the next extension to the basic 2018 model.  When you distribute resources you have to distribute work, which is a lot more than simple load balancing.  Google, which launched Kubernetes, has started to integrate it with Istio, a service mesh that manages how work flows among distributed components, and that shows a lot of promise in framing the way that pools of resources can share work.

The missing ingredient in this is virtual networking.  The Internet and IP have become the standard mechanism for connectivity.  IP is highly integrated into most software, which means that we have to support its basic features.  The Internet is the perfect fabric for building OTT applications, which is why we’ve built them using it, but for a given application its general, universal connectivity is a problem.  What we’ve known for a decade is that we need “network-as-a-service” (NaaS), meaning the ability to spin up an IP network extemporaneously but have it behave as though it were a true private network.
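As a thought experiment (none of this corresponds to a real product API), a NaaS facade could look like a closed membership list layered over open IP connectivity: only declared members of the spun-up network can talk to each other, even though the transport underneath is shared.

```python
from dataclasses import dataclass, field

@dataclass
class NetworkAsAService:
    """A hypothetical on-demand private network over shared IP."""
    name: str
    members: set[str] = field(default_factory=set)  # endpoint identities

    def join(self, endpoint: str) -> None:
        self.members.add(endpoint)

    def leave(self, endpoint: str) -> None:
        self.members.discard(endpoint)

    def may_connect(self, src: str, dst: str) -> bool:
        # Behaves like a true private network: members talk only to members.
        return src in self.members and dst in self.members

# Spin up a NaaS instance "extemporaneously" for one application.
payroll_net = NetworkAsAService("payroll")
payroll_net.join("app-server-1")
payroll_net.join("db-server-1")
assert payroll_net.may_connect("app-server-1", "db-server-1")
assert not payroll_net.may_connect("app-server-1", "random-internet-host")
```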

We have “virtual network” technology in the data center, and Kubernetes in fact has a plug-in capability to provide for it.  However, this is generally focused on what could be called “subnet connectivity”, the subnet being where application components live in a private address space.  We also have, with SD-WAN, technology to virtualize networking in the wide area, meaning within a corporate VPN.  There are some commercial products that offer these two in combination, and you could of course assemble your own solution by picking a product from both spaces.  What’s lacking is the architectural vision of NaaS.

Should a network service spin up to connect the elements of an application or service, in the data center, in the cloud, in the real world?  How can it be made to handle things like web servers that are inherently part of multiple user services or applications?  These are questions that illustrate the essential mission trade-off of NaaS.  The more usefully specific it is, the more difficult it is to harmonize NaaS with the behavior of current web, network, and information technologies.  I can see, and surely many can see, how you could address trade-offs like the one I’ve shown here, but there’s no consensus and very little awareness of the problem.

That wasn’t always the case.  Earlier network technologies recognized the notion of a closed user group, a subset of the total possible connective community that, for a time at least, communicated only within itself.  These are fairly easy to set up in connection-oriented protocols like frame relay or ATM because you can control connection by controlling the signaling.  In datagram or connectionless IP networks, the presumption of open connectivity is present at the network level and you have to interdict relationships you don’t like.  Since source address validation isn’t provided in IP, that can be problematic.

While classic SDN (OpenFlow) isn’t an overlay technology, the first data center networking technology (Nicira, later bought by VMware to become NSX) is.  Since SD-WAN is also an overlay technology, it’s fair to wonder (as I did in an earlier blog when VMware bought VeloCloud and created “NSX SD-WAN”) whether somebody might unite the two with not only a product (which VMware and Nokia/Nuage both have) but also with effective positioning.  In point of fact, any SD-WAN vendor could provide the necessary link, and if classic network principles for containers (meaning Kubernetes) are adopted you can really “gateway” between the data center and the WAN without excessive complexity…if you know why you want to do it.

Why haven’t we fitted all the pieces together on this already?  We’ve had the capability to do the things necessary for four or five years now, and nobody has promoted it effectively.  That may be because everyone is stuck in local thinking—keep your focus on what you expected to do.  It may be because vendors fear that a broad NaaS mission would be too complicated to sell.  Finally, it may be because the SD-WAN space is the logical place for the solution to fit, and vendors there are uniformly poor at marketing and positioning.  Without a compelling story the media can pick up on, market-driven changes in networking and virtualization are unlikely.

Competitive-driven changes, then?  The Oracle deal for Talari is, as many have pointed out, the first acquisition of an SD-WAN vendor by a cloud provider.  Of course, Oracle’s cloud isn’t exactly setting the market on fire, but that might be the motivation Oracle needs.  A strong offering in SD-WAN, combined with strong features in containers and orchestration, could give Oracle something to sing about—at least for as long as it took Amazon, Google, and Microsoft to move.  As I noted in a prior blog, the cloud providers may end up being the source of virtual networking advances from now on, as well as the source of full “cloud-native” capability.

That may be the missing link here.  Virtualization has progressed recently because there was a concept, containers, that offered both tactical and strategic benefits.  The realization of the former is leading us closer to the latter, influenced in large part by the cloud.  That realization is driven at the technical level, meaning it doesn’t really require market socialization to happen.  I’m leaning toward the view that it’s going to be some sort of NaaS extension to the container/virtualization model that ends up delivering the full spectrum of virtualization features we need.  That might come from a cloud provider or from a premises IT player looking for hybrid cloud.  Wherever it comes from, I think we’re going to see it emerging in 2019.

That brings us to the last step, the one that closes the loop on the Red Hat blog entry I opened with.  To automate the relationship between abstract applications/services and real resources, you need modeling.  A model, meaning a data-driven description of the relationships of abstract elements and the recipes for realizing them on various resource alternatives, is the key to automated lifecycle management.
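A minimal sketch of what such a model might contain (the structure and field names are mine, not any standard’s): each element either decomposes into subordinate elements or carries recipes for realizing it on a particular class of resource.

```python
from dataclasses import dataclass, field

@dataclass
class ModelElement:
    """One node in a data-driven service/application model."""
    name: str
    children: list["ModelElement"] = field(default_factory=list)
    recipes: dict[str, str] = field(default_factory=dict)  # resource class -> recipe

# A hypothetical VPN service: the "access" and "core" pieces each know
# how to realize themselves on different resource alternatives.
vpn_service = ModelElement("vpn-service", children=[
    ModelElement("access", recipes={"sd-wan": "deploy-edge"}),
    ModelElement("core", recipes={"mpls": "provision-path", "cloud": "build-overlay"}),
])
```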

Models are at the top of the whole process, so starting there might have been nice, but the reality is that you can’t do service or application lifecycle automation without the framework of virtualization fully developed.  Since that framework is logically compartmentalized, we really have to expect the modeling to retrospectively absorb what’s being standardized below.  It’s also true that modeling is only important insofar as the models include some standard pieces.  The details will matter eventually, because without model portability among users it’s difficult to see how we could ever fully harmonize lifecycle management.

We need models to do two things.  First, to represent abstractions either as the decomposition of a primary model into subordinate ones, or as the steps to realize the model on infrastructure.  Second, to steer events to processes based on the state of the target model element(s) in the structure.  The need for “model specificity” lies in the fact that neither of these can be done using a generalized tool; we would always need to decompose the specific model structure we used.
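The second point, event steering, is easy to show in miniature (the states, events, and process names below are invented for illustration): the state of the model element plus the incoming event selects the lifecycle process to run.

```python
# Hypothetical state/event table for one model element type.
STATE_EVENT_TABLE = {
    ("ordered", "activate"): "deploy",
    ("deploying", "deploy_complete"): "mark_active",
    ("active", "fault"): "redeploy",
    ("active", "change_order"): "modify",
}

def steer(state: str, event: str) -> str:
    """Return the lifecycle process this event should be steered to."""
    try:
        return STATE_EVENT_TABLE[(state, event)]
    except KeyError as exc:
        raise ValueError(f"no process for event '{event}' in state '{state}'") from exc
```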

All this ties together because true infrastructure virtualization, which has to virtualize everything we’d call “infrastructure”, can define virtual networking within itself, exposing only the address sockets that will connect to corporate VPNs or the Internet.  SD-WAN can virtualize the former, and the latter is something we’ll have to think about in 2019.  Should we presume that Internet relationships are transient closed-user-group relationships?  There would be benefits, and of course scalability costs.  Security already costs, so perhaps a trade-off point can be found.

Maybe December of 2018 wasn’t as different from January 2018 as many had hoped—including me.  Maybe that will be true in 2019 as well, but I don’t think so.  Happy New Year!