Optimization, Virtualization, and Orchestration

What makes virtualization, whether it be IT or network, work? The best definition of virtualization, IMHO, is that it’s a technology set that creates a behavioral abstraction of infrastructure, one that behaves the way the real infrastructure would. To make that true, you need an abstraction and a realization, the latter being a mapping of a virtual representation of a service (hosting, connectivity) to real infrastructure.

If we accept this, then we could accept that a pure virtual world, where everything consumed a virtual service or feature, would require some intense realization. Further, if we assume that our virtual service is a high-level service that (like nearly every service these days) involves layers of resources that have to be virtualized, from servers to networking, we could assume that the process of optimizing our realization would require that we consider multiple resource types at once. The best cloud configuration is the one that creates the best price/performance when all the costs and capabilities, including hosting hardware, software, and network, are considered in realizing our abstraction.

Consider this issue. We have a company with a hundred branch offices. The company runs applications that each office must access, and it expects to run those applications at least in part (the front-end piece, for sure) in the cloud or at the edge. When it’s time to deploy an instance of an application, where’s the best place to realize the abstraction that the cloud represents? It depends on the location of the users, the location of the hosting options, the network connectivity available, the cost of all the resources…you get the picture. Is it possible to pick optimum hosting first, then optimum networking to serve it? Some of the time, perhaps even most of the time. Not always. In some cases, the best place to run the application will depend on the cost of getting to that location as well as the cost of running there. In other cases, the overall QoE will vary with the network capabilities of locations whose costs may be similar.
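
To make that concrete, here’s a toy sketch (all the sites, costs, and latencies are invented) of why picking hosting first and networking second can give the wrong answer: the cheapest place to run isn’t the cheapest place to reach, and a QoE constraint can take a site out of the running entirely.

```python
# A minimal sketch, with hypothetical sites, costs, and latencies, of weighing
# hosting and network together rather than sequentially.

SITES = {
    "site-a": {"hosting_cost": 100.0},   # cheap to run on
    "site-b": {"hosting_cost": 140.0},   # pricier, but better connected
}

# Per-branch monthly network cost and expected latency (ms) to each site.
BRANCH_NETWORK = {
    "site-a": {"cost_per_branch": 3.0, "latency_ms": 95},
    "site-b": {"cost_per_branch": 1.5, "latency_ms": 30},
}

def best_site(branch_count, max_latency_ms):
    candidates = []
    for site, host in SITES.items():
        net = BRANCH_NETWORK[site]
        if net["latency_ms"] > max_latency_ms:
            continue  # fails the QoE constraint no matter how cheap it is
        total = host["hosting_cost"] + net["cost_per_branch"] * branch_count
        candidates.append((total, site))
    return min(candidates) if candidates else None

print(best_site(branch_count=100, max_latency_ms=60))
# With 100 branches, site-b wins even though its hosting cost is higher,
# and site-a is excluded anyway by the latency (QoE) constraint.
```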

We have “orchestration” today, the task of deploying components on resource pools. One of the implicit assumptions of most orchestration is the concept of “resource equivalence” within the pools, meaning that you can pick a resource from the pool without much regard for its location or specific technical details. But even today, that concept is under pressure because resource pools may be distributed geographically and be served by different levels of connectivity. There’s every reason to believe that things like edge computing will put that principle of equivalence under fatal pressure.
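
Here’s a hypothetical illustration of that equivalence assumption and the pressure on it: a naive scheduler grabs any pool member, while an edge-aware one has to filter on exactly the attributes (location, connectivity) that equivalence says it can ignore.

```python
# A sketch of the "resource equivalence" assumption versus attribute-aware
# selection; the pool contents are invented.

POOL = [
    {"id": "h1", "region": "us-east", "access_latency_ms": 80},
    {"id": "h2", "region": "eu-west", "access_latency_ms": 25},
    {"id": "h3", "region": "eu-west", "access_latency_ms": 40},
]

def pick_equivalent(pool):
    # Classic assumption: any free resource will do, so take the first one.
    return pool[0]

def pick_constrained(pool, region, max_latency_ms):
    # Edge reality: location and connectivity matter, so equivalence no longer holds.
    eligible = [r for r in pool
                if r["region"] == region and r["access_latency_ms"] <= max_latency_ms]
    return min(eligible, key=lambda r: r["access_latency_ms"]) if eligible else None

print(pick_equivalent(POOL)["id"])                   # h1
print(pick_constrained(POOL, "eu-west", 30)["id"])   # h2
```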

The “ideal” model for orchestration would be one where a deployment or redeployment was requested by providing a set of parameters that locate the users, establish cost goals, and define the QoE requirements. From that, the software would find the best place, taking into account all the factors that the parameters described. Further, the software would then create the component/connection/user relationships needed to make the application accessible. It’s possible, sort of, to do much of this today, but only by taking some points for granted. Some of those points are likely to require more attention down the line.
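
As a sketch of what that “ideal” interface might look like (the names and fields are mine, not any real orchestrator’s API), the request carries the user locations, the cost goal, and the QoE target, and the result carries both the placement and the relationships that were created:

```python
# A hypothetical request/result shape for the "ideal" orchestration described
# above; the decision logic is a placeholder.

from dataclasses import dataclass, field

@dataclass
class DeploymentRequest:
    application: str
    user_sites: list          # where the users are (e.g. branch locations)
    max_monthly_cost: float   # cost goal
    max_latency_ms: int       # QoE requirement

@dataclass
class DeploymentResult:
    hosting_point: str
    components: list = field(default_factory=list)
    connections: list = field(default_factory=list)   # (from, to) pairs

def deploy(request: DeploymentRequest) -> DeploymentResult:
    # Placeholder placement choice; the point is the shape of what goes in and out.
    hosting_point = "edge-metro-1"
    result = DeploymentResult(hosting_point=hosting_point,
                              components=[f"{request.application}-frontend"])
    result.connections = [(site, hosting_point) for site in request.user_sites]
    return result

print(deploy(DeploymentRequest("crm", ["branch-01", "branch-02"], 500.0, 50)))
```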

I think that for this kind of orchestration to work, we need to presume that there’s a model, and software that decomposes it. This is essential because a service is made up of multiple interdependent things and somehow both the things and the dependencies have to be expressed. I did some extensive tutorial presentations on a model-based approach in ExperiaSphere, and I’ve also blogged about it many times, so I won’t repeat all that here.
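
For readers who haven’t seen those materials, here’s a minimal illustration of the idea (the model content is invented): the service is a tree of elements, and the decomposition software walks it down to the bottom-level elements that need resource bindings.

```python
# A toy model-and-decomposer sketch; element and behavior names are hypothetical.

SERVICE_MODEL = {
    "name": "branch-app-service",
    "children": [
        {"name": "app-hosting", "behavior": "host-container"},
        {"name": "branch-access",
         "children": [
             {"name": "vpn", "behavior": "ip-vpn"},
             {"name": "security", "behavior": "firewall-policy"},
         ]},
    ],
}

def decompose(element, depth=0):
    # Walk the model, emitting the bottom-level elements that need resource bindings.
    indent = "  " * depth
    if "behavior" in element:
        print(f"{indent}{element['name']} -> bind behavior '{element['behavior']}'")
    else:
        print(f"{indent}{element['name']} (decomposes into {len(element['children'])} parts)")
        for child in element["children"]:
            decompose(child, depth + 1)

decompose(SERVICE_MODEL)
```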

One key element of the approach I described is that there are separate “service” and “resource” domains, meaning that one set of models describes a service as a set of functional elements, and another set describes how resource “behaviors” are bound to the bottom layer of those service elements. The goal was to make service definitions independent of implementation details, and to permit late binding of suitable resource behaviors to services. If a service model element’s bound resource behavior (as advertised by the resource owner/manager) broke, another compatible resource behavior could be bound to replace it.
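
A toy sketch of that late binding, with invented behavior names and advertisers: the service element asks for a behavior, any matching advertisement can satisfy it, and when the bound resource breaks, another compatible advertisement is bound in its place.

```python
# Late binding between service and resource domains; all data here is hypothetical.

# Resource domain: behaviors advertised by resource owners/managers.
ADVERTISED = [
    {"behavior": "host-container", "owner": "cloud-a", "healthy": True},
    {"behavior": "host-container", "owner": "cloud-b", "healthy": True},
    {"behavior": "ip-vpn", "owner": "operator-x", "healthy": True},
]

def bind(behavior, exclude=()):
    # Pick any healthy advertisement of the requested behavior.
    for ad in ADVERTISED:
        if ad["behavior"] == behavior and ad["healthy"] and ad["owner"] not in exclude:
            return ad
    return None

# Service domain: the element only knows the behavior it needs, not who supplies it.
binding = bind("host-container")
print("bound to", binding["owner"])

# The bound resource breaks; rebind to another compatible advertisement.
binding["healthy"] = False
binding = bind("host-container", exclude=(binding["owner"],))
print("rebound to", binding["owner"])
```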

This could offer a way for “complex” orchestration to work above cloud- and network-specific orchestration. The service models could, based on their parameters, select the optimum placement and connection model, and then pass the appropriate parameters to the cloud/network orchestration tool to actually do the necessary deploying and connecting. It would be unnecessary then for the existing cloud/network orchestration tools to become aware of the service-to-resource constraints and optimizations.
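
Here’s roughly how that layering might look (the orchestrator interfaces are stand-ins, not real tool APIs): the service layer makes the optimization decision and hands the lower layers nothing but resolved parameters.

```python
# A sketch of service-layer orchestration sitting above cloud/network
# orchestration; the functions and cost figures are illustrative only.

def cloud_orchestrator_deploy(image, site):
    print(f"[cloud] deploy {image} at {site}")

def network_orchestrator_connect(endpoints, site):
    print(f"[network] connect {endpoints} to {site}")

def service_layer_deploy(image, endpoints, candidate_sites, score):
    # All service-to-resource optimization happens here, above the tools.
    site = min(candidate_sites, key=score)
    cloud_orchestrator_deploy(image, site)        # told where, not why
    network_orchestrator_connect(endpoints, site)

service_layer_deploy(
    image="crm-frontend",
    endpoints=["branch-01", "branch-02"],
    candidate_sites=["metro-east", "metro-west"],
    score=lambda s: {"metro-east": 290.0, "metro-west": 340.0}[s],  # hypothetical totals
)
```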

The potential problem with this approach is that the higher orchestration layer would have to be able to relate its specific requirements to a specific resource request. For example, if a server in a given city was best, the higher-level orchestrator would have to “know” that first, and second be able to tell the cloud orchestrator to deploy to that city. To pick that city, it would have to know both the hosting capabilities and the network capabilities there. This means that what I’ve called “behaviors”, advertised resource capabilities, would have to be published so that the higher-layer orchestrator could use them. These behaviors, which are themselves products of lower-level orchestration, would then have to drive that lower-level orchestration to fulfill their promise.
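
One hypothetical shape for that publication is a simple catalog of behaviors per city, covering both hosting and network attributes, that the higher-layer orchestrator reads before it asks anyone to deploy anything; the catalog format and the numbers here are invented.

```python
# A sketch of published behaviors driving a higher-layer placement choice.

BEHAVIOR_CATALOG = [
    {"city": "chicago", "kind": "hosting", "gpu": False, "cost_index": 1.0},
    {"city": "chicago", "kind": "network", "latency_ms": 22, "cost_index": 0.9},
    {"city": "denver",  "kind": "hosting", "gpu": True,  "cost_index": 1.3},
    {"city": "denver",  "kind": "network", "latency_ms": 48, "cost_index": 0.7},
]

def capabilities_for(city):
    return {b["kind"]: b for b in BEHAVIOR_CATALOG if b["city"] == city}

def pick_city(cities, max_latency_ms):
    viable = []
    for city in cities:
        caps = capabilities_for(city)
        if caps["network"]["latency_ms"] <= max_latency_ms:
            viable.append((caps["hosting"]["cost_index"] + caps["network"]["cost_index"], city))
    return min(viable)[1] if viable else None

city = pick_city(["chicago", "denver"], max_latency_ms=30)
print("deploy via lower-level orchestration in", city)   # chicago
```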

That exposes what I think is the biggest issue for the complex orchestration of fully virtualized services—advertising capabilities. If that isn’t done, then there’s no way for the higher layer to make decisions or the lower-layer processes to carry them out faithfully. The challenge is to frame “connectivity” in a way that allows it to be related to the necessary endpoints, costs included.
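
One possible framing, purely illustrative, is to advertise connectivity as endpoint-pair attributes (cost and latency between a user site and a hosting point) rather than as an undifferentiated “network”, so that a placement decision can ask what it would cost, and how well it would perform, to reach a given point from the endpoints that matter.

```python
# A hypothetical connectivity advertisement keyed by endpoint pair; the pairs
# and numbers are invented.

CONNECTIVITY_AD = {
    ("branch-01", "metro-east"): {"cost": 1.5, "latency_ms": 18},
    ("branch-01", "metro-west"): {"cost": 1.1, "latency_ms": 55},
    ("branch-02", "metro-east"): {"cost": 1.6, "latency_ms": 25},
    ("branch-02", "metro-west"): {"cost": 1.0, "latency_ms": 60},
}

def reachability(site, branches, max_latency_ms):
    """Total connection cost from the branches to a site, or None if any
    branch misses the latency target."""
    total = 0.0
    for branch in branches:
        attrs = CONNECTIVITY_AD.get((branch, site))
        if attrs is None or attrs["latency_ms"] > max_latency_ms:
            return None
        total += attrs["cost"]
    return total

print(reachability("metro-east", ["branch-01", "branch-02"], 30))  # 3.1
print(reachability("metro-west", ["branch-01", "branch-02"], 30))  # None
```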

Deploying connectivity is also an issue. It’s the “behaviors” that bind things together, but if network and hosting are interdependent, how do we express the interdependence? If the higher-layer orchestration selects an optimal hosting point based on advertised behaviors, how does that decision also create connectivity? If the network can be presumed to be fully connective by default, then no connection provisioning is required. But if there’s any requirement for explicit connection, or for the placement or removal of barriers imposed for security reasons, then it’s necessary to know where things have been put in order to carry out those steps.
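
A minimal sketch of that point, with illustrative step names: the placement decision record itself is what has to drive the explicit connection and security steps, which is why it can’t be made and then forgotten.

```python
# Deriving network steps from a placement decision; step names are hypothetical.

def realize(placement):
    """Turn a placement decision into the network steps it implies."""
    steps = []
    for branch in placement["user_sites"]:
        steps.append(("connect", branch, placement["hosting_point"]))
        steps.append(("open-firewall", branch, placement["hosting_point"]))
    return steps

placement = {"hosting_point": "metro-east", "user_sites": ["branch-01", "branch-02"]}
for step in realize(placement):
    print(step)
# If the network were fully connective by default, only the security steps would
# remain; either way, they can't be generated without knowing where things went.
```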

It’s possible that these issues will favor the gradual incorporation of network services into the cloud, simply because a single provider of hosting and connectivity can resolve the issues without the need to develop international standards and practices, things that take all too long to develop at best. It’s also possible that a vendor or, better yet, an open-source body might consider all these factors and advance a solution. If I had to bet on how this might develop, I’d put my money on Google in the cloud, or on Red Hat/IBM or VMware among vendors.

Something is needed, because the value of virtualization can’t be achieved without the ability to orchestrate everything. A virtual resource in one space, like hosting, can’t be nailed to the ground by a fixed resource relationship in another, like networking, and optimization demands that we consider all costs, not just a few.