What’s the difference between “orchestration” and “lifecycle automation” in applications and services? I guess it’s not surprising that there’s confusion here, given that we conflate terms in tech all the time, partly out of a desire to reduce technical detail and shrink the size of stories in tech media. Then there’s the desire of vendors to jump on a popular concept, even if it means stretching what they’re actually doing. At any rate, I’ve been getting a number of comments and questions on these terms, and the distinction could be very important.
The terms “orchestration” and “orchestrator” were originally used to describe things associated with writing and performing music. An “orchestra” plays music, an “orchestrator” prepares music to be played, and “orchestration” is what the orchestrator does. In software and services, the terms refer to the central coordination of a series of related tasks or steps so that they can be performed automatically rather than through explicit human control.
In computing, we’ve had this concept for longer than we’ve had popular use of the term. The old “Job Control Language” or JCL of IBM mainframes is a form of orchestration, allowing a sequence of programs to be run as a “batch”. However, the notion of orchestration came into its own with multi-component software and “DevOps” or developer/operations tools. The goal was to automate deployment and redeployment of applications, both to make the processes easier and to reduce errors.
While things like DevOps were emerging on the compute side, network operators and enterprises were also looking at tools to automate routine network tasks. The TMF (TM Forum) has a long history of framing standards and practices designed to generalize the complex task of securing cooperative behavior among network devices.
Containerization, which creates a “container” that includes application components and the configuration information needed to deploy/redeploy them on generalized resources, was an important step in sort-of-linking these initiatives, because it brings its own deployment tooling (Kubernetes) and introduces configuration and parameterization that in many ways mirror what happens with network devices. Google’s Nephio may be a true bridge between Kubernetes and the network.
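To make that configuration-and-parameterization point concrete, here’s a minimal sketch (Python, standard library only) of the kind of declarative deployment record a containerized component carries. The structure follows the shape of a Kubernetes Deployment manifest, but the component name, image, replica count, and port are invented for illustration.

```python
import json

# An illustrative record in the shape Kubernetes uses for a Deployment.
# The image, replica count, and port below are placeholders, not a real app.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "example-component"},
    "spec": {
        "replicas": 3,  # desired state: the orchestrator converges toward this
        "selector": {"matchLabels": {"app": "example-component"}},
        "template": {
            "metadata": {"labels": {"app": "example-component"}},
            "spec": {
                "containers": [{
                    "name": "example-component",
                    "image": "registry.example.com/example:1.0",
                    "ports": [{"containerPort": 8080}],
                }]
            },
        },
    },
}

print(json.dumps(deployment, indent=2))
```

The point of the record is that deployment knowledge lives in data, not in a human operator’s head, which is exactly the property that makes redeployment automatable.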
But what about “lifecycle automation?” The critical concept in lifecycle automation is that there is a “lifecycle”, meaning that services and applications progress through a series of operating states, some of which are goals in normal operation and some of which represent fault modes. That progress is signaled by events that indicate a change in behavior. Events drive state changes, and in a manual network operations center (NOC), an operations team would be notified of these events and act on them, either to continue the progression they represent (if they’re “normal” events) or to restore normal operation (if they signal a fault). The goal of lifecycle automation is to create software that can handle these events and take the appropriate actions.
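To make the state/event idea concrete, here’s a minimal sketch of a lifecycle handler: a table mapping (current state, event) pairs to an action and a next state, which is essentially what a NOC team does by hand. The specific states, events, and actions are my own illustrative assumptions, not drawn from any standard.

```python
# A toy lifecycle automaton: (state, event) -> (action, next_state).
# States, events, and actions here are illustrative assumptions only.
LIFECYCLE = {
    ("ordered",   "deploy_requested"):    ("deploy",       "deploying"),
    ("deploying", "deploy_complete"):     ("activate",     "active"),
    ("active",    "fault_reported"):      ("remediate",    "degraded"),
    ("degraded",  "fault_cleared"):       ("resume",       "active"),
    ("degraded",  "remediation_failed"):  ("escalate",     "failed"),
    ("active",    "teardown_requested"):  ("decommission", "retired"),
}

def handle_event(state: str, event: str) -> tuple[str, str]:
    """Return (action_to_take, next_state) for an incoming event."""
    try:
        return LIFECYCLE[(state, event)]
    except KeyError:
        # An event with no defined transition is itself a fault signal.
        return ("alert_operator", state)

# Walk a normal deployment, then a fault and its recovery.
state = "ordered"
for event in ["deploy_requested", "deploy_complete",
              "fault_reported", "fault_cleared"]:
    action, state = handle_event(state, event)
    print(f"event={event!r} -> action={action!r}, state={state!r}")
```

Note that “normal” events and fault events flow through the same table; the automation doesn’t care which kind it’s handling, only that a transition is defined.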
You can see from this set of definitions that there would appear to be an overlap between the two categories. Orchestration in the world of Kubernetes does allow for redeployment in case of failure, load balancing, and the creation of new instances or the withdrawal of old ones. Some DevOps tools, like Puppet and Ansible, are “goal-state” technologies that have even more of a state/event mindset. Should we be saying that the concepts of orchestration and lifecycle automation are congruent, and that implementations simply vary in the scope of their event-handling? There are other qualifications we could consider.
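The “goal-state” mindset is easy to show in miniature: compare observed state to desired state and act only on the difference. This is a hedged sketch of the general pattern such tools follow, not any product’s actual code, and the component inventories are invented.

```python
# A miniature goal-state reconciliation loop: observe, diff, converge.
# The "desired" and "observed" inventories are invented for illustration.
desired = {"web": 3, "worker": 2}             # component -> desired instances
observed = {"web": 2, "worker": 3, "old": 1}  # what is actually running

def reconcile(desired: dict, observed: dict) -> list[str]:
    """Emit the actions needed to drive observed state to the goal state."""
    actions = []
    for name, want in desired.items():
        have = observed.get(name, 0)
        if have < want:
            actions.append(f"start {want - have} instance(s) of {name}")
        elif have > want:
            actions.append(f"stop {have - want} instance(s) of {name}")
    for name in observed:
        if name not in desired:
            actions.append(f"retire all instances of {name}")
    return actions

for step in reconcile(desired, observed):
    print(step)
```

A failure, in this view, is just another divergence from the goal state, which is why goal-state tools edge naturally toward lifecycle behavior.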
One is “vertical scope”. Orchestration in the software world is fairly one-dimensional: you are deploying applications. Lifecycle automation often also takes in deploying infrastructure, configuring devices that have a fixed function, and so forth. Lifecycle automation aims at the entire lifecycle, including everything needed to support it; orchestration is generally applied within a single layer.
Another possible difference is in autonomy, and this is actually a point the vertical-scope comments above raise. There really are layers of technology needed to make an application or service work. Do some or all of these layers manage themselves independently of the application or service, meaning that they are autonomous? Two examples are server resource pool management and network management. Do we “orchestrate” software deployment by presuming that if something in either of these platform technologies fails, we’ll simply deploy again and presumably get a different resource allocated? In that case, we never have to tear down other components or put them in different places. With lifecycle automation, we may have to accept that a failure means not only replacing what was impacted but even reconfiguring the software structure to optimize things overall.
Which raises yet another possible difference: interdependence. Lifecycle automation should take into account the fact that changes to an application or service configuration could require re-optimization. Take out one element and you can’t necessarily replace it 1:1; you have to reconsider the overall configuration in light of the service-level agreements, explicit or implicit, in the application or service.
Orchestration, IMHO, truly is a subset of lifecycle automation, one designed for a subset of the conditions that application and/or service deployment and operation could present. It’s designed for a simplified set of conditions, in fact. The broader problem, the real problem, needs a broader solution.
That’s what I think hierarchical intent modeling can provide. In this approach, each model element represents a functional component that may draw on subordinate components in turn. If one of those subordinates breaks or changes, it’s incumbent on the higher layer to determine whether its SLA is still met. If it is not, that layer reports to its superior element, allowing that element to signal a reconfiguration that may or may not require tearing down what was there before. If the SLA can still be met, then our object can decide whether it needs to re-optimize its own subordinate structures or whether it can simply replace the thing that changed or broke.
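Here’s a compact sketch of that idea, under my own assumptions about how the SLA check and escalation would work; the two-host example and the “minimum healthy subordinates” SLA are purely illustrative.

```python
# A toy hierarchical intent model. Every element owns an SLA check over its
# subordinates; the SLA rule and repair policy here are illustrative only.
class IntentElement:
    def __init__(self, name, subordinates=None, min_healthy=1):
        self.name = name
        self.parent = None
        self.subordinates = subordinates or []
        for sub in self.subordinates:
            sub.parent = self
        self.min_healthy = min_healthy  # stand-in for a real SLA
        self.healthy = True

    def sla_met(self) -> bool:
        if not self.subordinates:
            return self.healthy
        return sum(s.sla_met() for s in self.subordinates) >= self.min_healthy

    def on_subordinate_change(self, sub):
        if self.sla_met():
            # SLA intact: repair locally, replacing just the affected piece.
            print(f"{self.name}: SLA still met, replacing {sub.name} in place")
            sub.healthy = True
        elif self.parent:
            # SLA broken: escalate so the superior can reconfigure broadly.
            print(f"{self.name}: SLA broken, escalating to {self.parent.name}")
            self.parent.on_subordinate_change(self)
        else:
            print(f"{self.name}: SLA broken at the top; reconfiguration needed")

# Losing one host is absorbed locally; losing both forces escalation.
host_a, host_b = IntentElement("host-a"), IntentElement("host-b")
service = IntentElement("service", [host_a, host_b], min_healthy=1)
host_a.healthy = False
service.on_subordinate_change(host_a)  # SLA still met -> local replacement
host_a.healthy = host_b.healthy = False
service.on_subordinate_change(host_b)  # SLA broken -> top-level reconfiguration
```

The design choice that matters is that every element makes its decision locally, so faults are handled at the lowest layer that can still meet its SLA.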
The question, then, is whether simple orchestration is ever workable in a modern world. The answer is “it depends”: if we could assume that an application/service could call on a pool of equivalent resources, and that those resources were not interdependent in terms of assignment, then we could say “Yes!”
I’m not sure how often we could presume those conditions would exist, and I think the chances will shrink over time as we create more complex applications based on more complex commercial relationships with resource providers. However, I also believe that neither the networking community nor the cloud community has addressed these points explicitly in its designs. That suggests we might yet build network services and cloud architectures that would simplify lifecycle automation to the point where basic orchestration would suffice. It’s a race, but perhaps between contestants who don’t know they’re competing, and whoever figures things out first may gain a durable advantage in the market.