Where is Cloud-Native NOT a Good Idea?

Cloud-native technology is important to everyone, and critical to many, but there’s already a trend toward seeing cloud-native as sweeping everything else from the tech world.  I think it’s certain that every enterprise will end up adopting cloud-native applications or application components, and that at least three quarters of all applications will have cloud-native elements.  Still, there is no universal constant (that number which, when multiplied by your answer, yields the correct answer), and there’s no universal development paradigm either.

Are there things for which cloud-native is absolutely not the right answer?  Yes.  Are there things that most believe to be universal cloud-native benefits, and aren’t?  Yes.  Are there myths about cloud-native that, even where it’s a good fit, could still lead companies astray?  There sure are, and we’ll look at all these points in this blog.

To briefly reprise a point for those who haven’t followed this topic, in my blog or elsewhere, “cloud-native” is a model of application development where logic is divided into small “microservices” that are individually resilient and scalable.  This model, combined with cloud hosting, lets an application heal itself and adapt to changing workloads or even feature requirements.  Because the microservices are loosely coupled and written based on abstraction-centric “intent-modeled” design, they can be changed easily to accommodate new requirements and the changes can be introduced quickly and without major production impacts.

This definition is important because it opens a discussion on the issues that my earlier questions are intended to raise.  Thus, let’s look at the definition and apply it to the real world of application development.

You can envision a cloud-native application as a swirling set of logic elements that appear, grow, shrink, and are replaced as needed.  Work, from the user perspective, is accomplished by enlisting a bunch of these logic elements to complete a task.  There’s almost always an “orchestration” or “step function” process involved that sequences the cloud-native microservices, stringing them out in what we’d call a “workflow”.  From this vision alone, you can infer one of the truths of cloud-native, which is that cloud-native is a framework for interactive applications or application elements, elements that are essentially handling units of work that are often called “events”.  Cloud-native is not batch processing.

An interactive application is one that has a close relationship with the behavior of humans or external systems.  Somebody hits a key, and (eventually) expects to get a response.  Somebody presses a button and (eventually) expects a gate to open.  The parenthetical here is offered to illustrate that the reaction is expected but not expected instantaneously.  Human think time is such that there is a tolerance for delay, and there’s a natural pace to the external system (how fast will a human move to the next task, or how quickly will the next truck present itself at the gate?) that means that delays in the processing, up to a point, won’t impact the application overall.

Contrast this with a batch application.  Here the data is already there, already stored, and the application gets a data element (a “record”) when it’s ready, processes it, perhaps outputs it, and then goes back to get another.  The difference is that with this batch model, any delay that’s introduced in the input-process-output sequence will accumulate, since it will lengthen the time between successive readings of the inputs, and that will lengthen the runtime of the application.

If I break a batch application into multiple network-connected microservices, I’ll add a transit delay for each of the connections, and those delays will accumulate.  Add ten milliseconds in network delay per record, and in a million records I accumulate ten thousand seconds of delay, or about 167 minutes.  Yet ten milliseconds of network delay would almost surely be unnoticeable in an interactive application.
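The arithmetic above is easy to reproduce.  What follows is only a minimal sketch, assuming records are processed strictly one after another and that every connection adds the same fixed delay; the function name and the hop-count parameter are mine, invented for illustration.

```python
def added_runtime_seconds(records, hops, delay_ms_per_hop):
    """Extra wall-clock time when each record crosses `hops` network
    connections in sequence, each adding `delay_ms_per_hop` milliseconds.
    Assumes strictly sequential processing (no overlap between records)."""
    return records * hops * delay_ms_per_hop / 1000.0


# The batch case from the text: a million records, one 10 ms connection each.
batch_penalty = added_runtime_seconds(1_000_000, 1, 10)
print(f"batch: {batch_penalty:.0f} s, about {batch_penalty / 60:.0f} minutes")

# The interactive case: a single request sees the delay once.
interactive_penalty = added_runtime_seconds(1, 1, 10)
print(f"interactive: {interactive_penalty} s per request")
```

Raise the hop count to model a record that passes through several microservices in a row, and the batch penalty grows in direct proportion, which is the argument for fewer, bigger components when working on stored data.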

Right here, then, we have an example of things we shouldn’t be looking at as “cloud-native” candidates.  Anything that works on stored data is almost surely not a cloud-native application candidate.  That doesn’t mean it couldn’t be broken up into components, only that the componentization should be limited to reduce the workflow-connection latency introduced by the process.  That would mean bigger microservices, which is at least a bit oxymoronic.  Not all componentized applications are microservice applications, and so not all are cloud-native candidates.

A microservice is a small, scalable unit of processing.  That means it’s probably made up of simple logic, not highly complex iterative calculations.  Given that, cloud-native applications have simple logic components that can be scaled, not so much to support one user’s needs but to support the collective needs of vast numbers of users.  It’s volume, not complexity of logic, that characterizes them.  That’s why interactive/event-based applications are good candidates.

Even interactive or event-driven applications aren’t always ideal candidates for a pure cloud-native implementation.  One key question is just how work is steered through microservices.  An ideal cloud-native application is one where the application can be represented as a system with finite states (the so-called “finite state machine”), and where each possible event, in each of the states, identifies a specific process for handling.  In this model, the system’s data model and state/event table orchestrate the event-to-process relationships.
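A state/event table can be sketched in a few lines.  This is only an illustration under my own assumptions: the gate states, the event names, and the handler functions are all hypothetical, borrowed loosely from the button-and-gate example earlier.  The point it tries to show is that the current state lives in the data record, not in the handler, so any instance of a handler could process the next event.

```python
# Hypothetical gate-control example; every name here is invented for illustration.

def open_gate(record):
    record["actions"].append("open_gate")
    return "open"          # next state


def close_gate(record):
    record["actions"].append("close_gate")
    return "closed"        # next state


# (current state, event) -> the process that handles it
STATE_EVENT_TABLE = {
    ("closed", "button_pressed"): open_gate,
    ("open", "truck_cleared"): close_gate,
}


def handle_event(record, event):
    """Look up the handler for this state/event pair and apply it.
    Events with no entry for the current state are ignored."""
    handler = STATE_EVENT_TABLE.get((record["state"], event))
    if handler is None:
        return record
    record["state"] = handler(record)
    return record


gate = {"state": "closed", "actions": []}
for event in ["button_pressed", "button_pressed", "truck_cleared"]:
    handle_event(gate, event)
print(gate)
```

The second button press is ignored because the table has no entry for that event in the “open” state.  Nothing in a handler remembers anything between events, which is what lets each one be scaled or replaced independently.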

Where the path of work depends on the work previously done, meaning the results of earlier microservices in the workflow, it may be necessary to visualize the application as a “main portion” that’s at least somewhat conventional or monolithic in structure, and which then invokes the necessary microservices as they’re identified.  This is a model that could well come about if you translated traditional transaction processing into a microservice-and-cloud-native form.
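Here is a rough sketch of that “main portion” idea, assuming a made-up order-handling transaction.  The three service functions are hypothetical stand-ins written as local calls; in a real deployment each would be a network request to a separately hosted microservice.

```python
# Hypothetical stand-ins for microservices; names and rules are illustrative only.

def check_credit(order):
    return {"approved": order["amount"] <= 1000}


def reserve_stock(order):
    return {"reserved": True}


def request_manual_review(order):
    return {"queued": True}


def process_transaction(order):
    """Conventional 'main portion': it decides which microservice to invoke
    next based on what the previous one returned."""
    steps = ["check_credit"]
    credit = check_credit(order)
    if credit["approved"]:
        reserve_stock(order)
        steps.append("reserve_stock")
    else:
        request_manual_review(order)
        steps.append("request_manual_review")
    return steps


print(process_transaction({"amount": 500}))
print(process_transaction({"amount": 5000}))
```

The branching logic sits in one conventional piece of code and not in a state/event table, which is why this form tends to look like traditional transaction processing with microservices hung off it.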

If we apply this to the service provider world, what you end up with is the realization that there are three different kinds of “processing” activity that could be expected.  We have data-plane connection-service activity, we have control-plane activity related to things like mobility management, and we have management activity that’s really related to the software that’s automating the service processes overall.

Data-plane connectivity is probably not something easily translated to a cloud-native form.  Distributing instances of a data-plane process dynamically to handle load requires dealing with the issue of packet sequencing and distributed state control to ensure that you don’t end up mixing threads of different conversations (sometimes called “tail-ending”).  Some of the distributed state problem relates to the handling of control packets, which, while suitable for cloud-native microservice handling, may present complications if the control packets have to impact data flows.
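One common way to keep a conversation’s packets together is to pin each flow to an instance by hashing a flow identifier.  The sketch below is my own illustration of that general technique, not something this blog prescribes; the flow-ID strings and the simple modulo scheme are assumptions.  It also hints at why dynamic scaling is awkward here: change the instance count and the mapping changes with it.

```python
import zlib


def pick_instance(flow_id, instance_count):
    """Map a flow to an instance index.  The same flow always lands on the
    same instance as long as instance_count does not change."""
    return zlib.crc32(flow_id.encode("utf-8")) % instance_count


# Hypothetical flow identifiers, invented for illustration.
flows = [f"flow-{n}" for n in range(100)]

before = {f: pick_instance(f, 3) for f in flows}   # three instances
after = {f: pick_instance(f, 4) for f in flows}    # scaled out to four

moved = sum(1 for f in flows if before[f] != after[f])
print(f"{moved} of {len(flows)} flows changed instance after scaling from 3 to 4")
```

Any flow that changes instance mid-conversation needs its state moved or shared, and packets already in flight can arrive out of order.  That is the kind of bookkeeping that simple, stateless microservices are supposed to avoid.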

Control packet handling that relates to things like mobility management is an ideal cloud-native application.  An individual user generates a very limited volume of these packets, and the processing needed per user/packet is limited.  Load is created not by process complexity but by sheer volume, and so we’re dealing with simple, highly scalable processes.  A perfect cloud-native model.

Management processes are, in my view, the big opportunity for cloud-native handling.  Service lifecycle automation is (or should be) an inherent state/event process.  Services have a finite number of states through which they cycle during operation, and a finite number of events that represent conditions to be handled.  Since lifecycle automation is logically the first thing that happens with a service, before we have any data plane or control plane interactions, it would make sense to start cloud-native with lifecycle automation.  Which, obviously, we’ve not done.

This little sequence of data-plane to management-plane also illustrates a point I made in a prior blog, which is that we need to spend a bit of time looking at just what (if anything) cloud-native could do for the data-plane features of network services.  Remember, not everything has to be cloud-native, but everything should be considered a candidate.  We don’t need or want to force-feed cloud-native into the data plane, but we don’t want to miss an opportunity to rethink how connection services are handled via hosted functions either.