Today I want to take up the remaining issue with edge-centric and functional programming for event processing, both for IoT and NFV. That issue is control of distributed state and stateless processes. Barring changes in the landscape, this will be the last of my series of blogs on this topic.
As always, we need to start with an example from the real and familiar world. Let’s assume that we’re building a car in five different factories, in five different places in the world. Each of these factories has manufacturing processes that generate local events, and our goal is to end up with a car that can be sold, is safe, and is profitable. What we have to do is to somehow get those factories to cooperate.
There are a few things that clearly won’t work. One is to just let all the factories do their own thing. If that were done you might end up with five different copies of some parts and no copies of others, and there would be little hope that they’d fit. Thus, we have to have some master plan for production that imposes specific missions on each factory. The second thing we can’t have is to drive our production line with events that have to move thousands of miles to get to a central event control point, and have a response return. We could move through several stages of production during the turnaround. These two issues frame our challenge.
What happens in the real world is that each of our five factories is given a mission (functional requirements) and a timeline (an SLA). The presumption we have in manufacturing is that every producing point builds stuff according to that combination of things, and every other factory can rely on that. Within a given factory, the production processes, including how the factory handles events like materials shortages or stoppages in the line, are triggered by local events. These events are invisible to the central coordination of the master plan; that process is only interested in the mission—the output—of the factories and their meeting the schedule. A broad process is divided into pieces that are individually coordinated and then combined based on a central plan.
If we replace our factory processes with hosted functional processes, meaning Lambdas or microservices, and we replace factory conditions with specific IoT-like generated events, we have a picture of what has to happen in distributed event processing. We have to presume that events are part of some system of function creation. That system has a presumptive response time: the total time it takes for an event to be analyzed and a reaction generated. The event-response exchange defines a control loop, whose length is determined by what we’re doing. Things that happen fast require short control loops, and that means we have to be able to host supporting processes close to where the events are generated.
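To make the control-loop point concrete, here’s a minimal Python sketch of the hosting decision. The tier names and latency numbers are invented for illustration; the point is only that the loop budget, not convenience, dictates where a process can live.

```python
# Hypothetical hosting tiers, ordered nearest-first. Round-trip times
# are illustrative assumptions, not measured figures.
TIERS = [
    ("on-premises edge", 0.002),   # ~2 ms round trip
    ("metro edge",       0.010),
    ("regional cloud",   0.040),
    ("central cloud",    0.120),
]

def pick_hosting_tier(loop_budget_s, process_time_s):
    """Return the farthest tier whose round-trip latency plus
    processing time still fits within the control-loop budget."""
    best = None
    for name, rtt in TIERS:
        if rtt + process_time_s <= loop_budget_s:
            best = name   # tiers are nearest-first; keep the farthest fit
    return best

# A fast industrial event (25 ms budget, 5 ms to process) cannot leave
# the metro area; a central-cloud handler would miss the loop entirely.
print(pick_hosting_tier(0.025, 0.005))   # -> "metro edge"
```

The same arithmetic, run in reverse, tells you which events a given hosting point is even eligible to handle.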
In both NFV and IoT we’ve tended to presume that the events generated by functions (including their associated resources) are coupled directly to service-specific processes. The function of NFV management is presumptively centralized, and if IoT is all about putting sensors on the Internet, then it’s all about having applications that directly engage with events. If our car-building exercise is an accurate reflection of the NFV/IoT world, this isn’t practical, because we either create long control loops to a centralized process or create disconnected functions that don’t add up to stable, profitable activity.
The path to solution here has been around for a decade; it’s hidden inside a combination of the TMF’s Shared Information and Data model (SID) and the Next-Generation OSS Contract (NGOSS Contract). SID divides what we’d call a “service” into elements. Each of these elements could correspond to our “factories” in the auto example. If there’s a blueprint for a car that shows how the various assemblies like power train, passenger compartment, etc. fit, then there would be a blueprint for how each of these assemblies was constructed. The “master blueprint” doesn’t need the details of each of these sub-blueprints; the sub-blueprints only need to conform to a common specification. With a blueprint at any level, we can employ NGOSS Contract principles to steer local events to their associated processes.
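As a rough illustration of that hierarchy (the structure and names here are mine, not the TMF’s), a parent element references its children only through the specification they expose, never through their private blueprints:

```python
# Illustrative sketch of a SID-style service decomposition. Field names
# are assumptions for this example, not TMF-defined terms.
from dataclasses import dataclass, field

@dataclass
class Element:
    name: str
    spec: dict                                       # exposed: mission + SLA
    blueprint: dict = field(default_factory=dict)    # private "how"; opaque to parents
    children: list = field(default_factory=list)

car = Element("car", spec={"mission": "sellable car", "sla": "30 days"}, children=[
    Element("power_train", spec={"mission": "power train", "sla": "10 days"}),
    Element("passenger_compartment", spec={"mission": "cabin", "sla": "12 days"}),
])

def visible_view(element):
    """What a parent may see of its children: specs only, no blueprints."""
    return {c.name: c.spec for c in element.children}

print(visible_view(car))
```

The master blueprint composes the `spec` dictionaries; each child’s `blueprint` stays inside the child, which is exactly the opacity the auto example demands.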
What this says is that breaking up services or IoT processes into a hierarchy isn’t just for convenience in modeling deployment, it’s a requirement in making event processing work. With this model, you don’t have to send events around the world, only through the local process system. But what, and where, is that local process system?
The answer here is intent modeling. A local process system is an intent-modeled black-box “factory” that produces something specific (functional behavior) under specific guarantees (an SLA). Every NFV service or IoT application would be made up of some number of intent models, and hidden inside them would be a state/event engine that linked local events to local processes, with “local” here meaning “within the domain”. If one of these black boxes has to signal the thing above that uses it, it signals through its own event set. A factory full of sensors might be aggregated into a single event that reports “factory condition.”
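A minimal sketch of the kind of state/event engine an intent model might hide, including the aggregation of raw local events into a single condition report for the layer above (the table entries and event names are invented for illustration):

```python
# Hypothetical intent-model internals: a state/event table routes local
# events to local processes; the parent sees only an aggregated condition.
class IntentModel:
    def __init__(self, name, notify_parent):
        self.name = name
        self.state = "operating"
        self.notify_parent = notify_parent
        # (current state, event) -> (local process, next state)
        self.table = {
            ("operating", "shortage"):  (self.reorder, "degraded"),
            ("degraded",  "restocked"): (self.resume,  "operating"),
        }

    def handle(self, event):
        process, next_state = self.table.get((self.state, event),
                                             (None, self.state))
        if process:
            process()           # runs locally, invisible above
        self.state = next_state
        # the layer above gets one aggregated "factory condition" event
        self.notify_parent({"element": self.name, "condition": self.state})

    def reorder(self): pass     # local process details stay in the black box
    def resume(self):  pass

reports = []
factory = IntentModel("power_train", reports.append)
factory.handle("shortage")
print(reports[-1])   # -> {'element': 'power_train', 'condition': 'degraded'}
```

Note that the parent never sees “shortage”; it sees only that its factory is degraded, which is the whole point of the black box.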
From this, you can see that not only isn’t it necessary to build a single model of a service or an IoT application that describes everything, it’s not even desirable. The top-level description should only reference the intent models of the layer below—just like in the OSI Reference Model for network protocols, you never dip into how the layer below does something, only the services it exposes. Services and applications are composed not from the details of every local event-handling process, but from the functional elements that collect these processes into units of utility.
The “factory” analogy is critical here. Every element, every intent model, is a factory. It has its own blueprint for how it does its thing, and nothing outside it has any reason to know or care what that blueprint is. It should not be exposed because exposing it would let something else reference the “how” rather than the “what”, creating a brittle implementation that any change in technology would break.
This brings us to the “where”, both in a model-topology sense and in a geographic sense. If what we’re after is a set of utility processes that process local events, then we could in theory define the factories based on geography, or administration, or functionality, or a combination of those things. We can have multiple factories that produce the same utility process, perhaps in a different way or in a different place.
To make this work, you need to have a standard approach to intent modeling so that a “factory abstraction” at a higher level can map to any suitable “factory instance” below. That means standardized APIs to communicate the intent and SLA, and a standard way to exchange events/responses. Strictly speaking you don’t need to standardize what happens inside the factory. However, if you also standardize the state/event structure that creates the implementation (linking local events to local processes in a standardized way), then every intent model at every level looks the same, and processes that are used in one could also be used in others that required the same behavior.
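What that standardization might amount to is a single small interface that every factory, at every level, implements. The method names below are my assumptions, not a standard; the point is that one contract makes all conforming implementations interchangeable:

```python
# Hedged sketch of a uniform "factory" contract. Method names and the
# status payload are invented for illustration.
from abc import ABC, abstractmethod

class Factory(ABC):
    @abstractmethod
    def set_intent(self, mission: dict, sla: dict) -> None: ...
    @abstractmethod
    def post_event(self, event: dict) -> None: ...
    @abstractmethod
    def status(self) -> dict: ...    # aggregated condition for the layer above

class VpnFactory(Factory):
    def set_intent(self, mission, sla):
        self.mission, self.sla = mission, sla
    def post_event(self, event):
        pass   # steered into this factory's private state/event table
    def status(self):
        return {"condition": "operating", "sla_met": True}

# Any conforming implementation can stand behind the same abstraction.
f: Factory = VpnFactory()
f.set_intent({"service": "vpn"}, {"latency_ms": 50})
print(f.status())
```

A `VlanFactory` with a completely different interior would plug into the same slot, which is what lets a higher layer defer the choice.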
If a high-level structure, a service or application, needs to reference one of our utility processes, it would represent it as an intent model and leave the decoding to the implementation. If that structure wanted to specify a specific factory it could, or it could leave the decision on what factory to use (Pittsburgh or Miami, VPN or VLAN) to a lower-level abstraction that might make the selection based on the available technology or the geography of the service.
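The deferral of that choice can itself be sketched. Here the candidate list, attributes, and scoring rule are all invented; the only point carried over from the text is that the lower-level abstraction, not the service, decides between Pittsburgh and Miami, or VPN and VLAN:

```python
# Hypothetical factory catalog and selection rule, for illustration only.
CANDIDATES = [
    {"site": "Pittsburgh", "tech": "VPN",  "price": 100, "distance_km": 40},
    {"site": "Miami",      "tech": "VLAN", "price": 80,  "distance_km": 1900},
]

def choose_factory(required_tech=None, max_distance_km=None):
    """Filter by whatever the higher layer actually constrained,
    then pick the cheapest fit; None if nothing qualifies."""
    fits = [c for c in CANDIDATES
            if (required_tech is None or c["tech"] == required_tech)
            and (max_distance_km is None or c["distance_km"] <= max_distance_km)]
    return min(fits, key=lambda c: c["price"], default=None)

print(choose_factory()["site"])                     # unconstrained: price wins -> Miami
print(choose_factory(max_distance_km=500)["site"])  # geography binds -> Pittsburgh
```

The service layer never names a site unless it wants to; it states constraints and lets the abstraction decode them.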
If you presume this approach, then every element of a service is an abstraction first and an implementation second. Higher-layer users see only the abstraction, and all who provide implementations must build to that abstraction as their “product specification”. There’s no difference whether an abstraction is realized internally or externally, or with legacy or new technology. There’s no difference, other than perhaps connectivity, optimality, or price, in where it’s implemented. Within a given functional capability set, you pick factories, or instantiate them, based on optimality.
In the IoT space, there could also be abstractions created based on geography or on functionality. I used the example of driving a couple blogs back; you could envision traffic-or-street-related IoT as a series of locales that collected events and offered common route and status services. A self-drive car or an auto GPS user might exercise a local domain’s services in the abstract from a distance, but shift to a lower-level service as they approached the actual geography. That suggests that you might want to allow an abstraction to offer selective exposure of lower-level abstractions.
It’s harder to lay out a specific structure of what a state/event model might look like for IoT, but I think the easiest way to approach it is to say that IoT is a service that can be decomposed, and that the decomposition process will balance issues like control loop length and geographic hosting efficiency to decide just where to put things, which frames how to abstract them optimally. However, I think that the goal is always to create a model approach that lets you model an intersection, a route, a city, a country, a fleet of vehicles, or whatever using the same approach, the same tools, the same APIs and event and process conventions.
Even a self-driving car should, in my view, have a model that lives in the vehicle and receives and generates events. That’s something we’ve not talked about, and I think we’re missing an opportunity. Such an approach would let you define behavior when the vehicle has no access to IoT sensors outside it, but also how it could integrate the “services” of city, route, and intersection models to create a safe and optimal experience for the passengers.
This raises the very interesting question of whether the vehicle itself, as something capable of being directed and changing speeds, should also be modeled. A standard model for a vehicle would facilitate open development of autonomous vehicle systems and also cooperative navigation between vehicles and street-and-traffic IoT. It shouldn’t be difficult; there are only a half-dozen controls a driver can manipulate, and they tend to fall into two groups: switches with specific states (like on/off), and “dials” that let you set a value within a specified range.
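Taking that two-group observation literally, a toy vehicle model falls out almost immediately. The specific controls and ranges below are illustrative assumptions, not a proposed standard:

```python
# Toy vehicle control model: every control is either a Switch with
# discrete states or a Dial with a bounded range. Names and ranges
# are invented for illustration.
class Switch:
    def __init__(self, states, initial):
        self.states, self.value = states, initial
    def set(self, value):
        if value not in self.states:
            raise ValueError(f"invalid state {value!r}")
        self.value = value

class Dial:
    def __init__(self, lo, hi, initial):
        self.lo, self.hi, self.value = lo, hi, initial
    def set(self, value):
        # clamp into the permitted range rather than fail
        self.value = max(self.lo, min(self.hi, value))

vehicle = {
    "headlights": Switch({"off", "low", "high"}, "off"),
    "gear":       Switch({"park", "reverse", "drive"}, "park"),
    "throttle":   Dial(0.0, 1.0, 0.0),
    "steering":   Dial(-540.0, 540.0, 0.0),
}

vehicle["gear"].set("drive")
vehicle["throttle"].set(1.4)        # out of range, clamped to 1.0
print(vehicle["throttle"].value)    # -> 1.0
```

A model this simple is exactly what makes cooperative navigation plausible: an external system can reason about any conforming vehicle with two control types.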
With proper factory support, both IoT and NFV can distribute state/event systems and processes to take advantage of function scaling and healing without risk of losing state control and the ability to correlate the context of local things into a master context. That combination is essential to get the most from either of these advances—in fact, it may be the key to making the “advances” really advance anything at all.