The Role of As-a-Service in Event Processing, and Its Impact on the Network

We seem to be in an “everything-as-a-service” age, or at least in an age where somebody is asserting that everything might be made available that way.  Not everything is a service, though.  Modern applications tend to divide into processes stimulated by a simple event, and processes that introduce context into event-handling.  We have to be able to host both kinds of processes, host them in the right place, and also consider the connection needs of these processes (and not just those of the “users” of the network) when we build the networks of the future.

The purpose of an as-a-service model is to eliminate (or at least reduce) the need for specialized hardware, software, and information by using a pool of resources whose results are delivered on demand.  The more specialized, and presumably expensive, a resource is, the more valuable the “aaS” delivery model could be.  The value could come from the cost of the resource itself, or from the fact that the analytic processes needed to create it would be expensive to replicate everywhere the results are needed.

You can easily envision an as-a-service model for something like calculating the orbit of something, or the path of an errant asteroid in space, but the average business or consumer doesn’t need those things.  They might need to have air traffic control automated, and there are obvious advantages to having a single central arbiter of airspace, at least for a metro area.  On the other hand, you don’t want that single arbiter to be located where the access delay might be long enough for a jet to get into trouble while a solution to a traffic issue was (so to speak) in flight to it.

Which might happen.  The most obvious issue that impacts the utility of the “aaS” option is the economy the resource pool can offer.  This is obviously related not only to the cost of the resource, but also to how likely it is to be used, by how many users, and in what geography.  It’s also related to the “control loop”, or the allowable time between a request for the resource in service form and the delivery of a result.  I’d argue that the control loop issue is paramount, because if we could magically suspend any latency between request and response, we could serve a very large area with a single pool and make the “aaS” model totally compelling.

The limiting factor in control loop length is the speed of light in fiber, which is about 120 thousand miles per second, or 120 miles per millisecond.  If we wanted to ensure a control loop no more than 50 milliseconds long, and if we presumed 20 milliseconds for a lightweight fulfillment process, we’re left with 30 milliseconds for a round trip in fiber, which at 120 miles per millisecond works out to about 3,600 fiber miles in total, or a one-way distance of about 1,800 miles.  A shorter control loop requirement would obviously shorten the distance our request/response loop could travel.  So would any latency introduced by network handling.  As a practical matter, most IoT experts tell me that process control likely can’t be managed effectively at more than metro distances, because there’s both a short control loop requirement and a lot of handling in the typical access and metro network.
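To make that arithmetic concrete, here’s a minimal sketch of the budget calculation, assuming the 120-miles-per-millisecond fiber figure and the 20-millisecond fulfillment allowance from above; the function and its parameters are my own illustration, not part of any real tool.

```python
# Control-loop budget arithmetic, using the assumptions above:
# light in fiber travels roughly 120 miles per millisecond, and a
# lightweight fulfillment process consumes about 20 ms of the budget.

FIBER_MILES_PER_MS = 120

def max_one_way_miles(control_loop_ms, fulfillment_ms=20, handling_ms=0):
    """Longest one-way fiber distance that fits within the control-loop budget."""
    transit_budget_ms = control_loop_ms - fulfillment_ms - handling_ms
    if transit_budget_ms <= 0:
        return 0.0
    # The remaining budget covers a round trip, so halve it for one-way reach.
    return (transit_budget_ms / 2) * FIBER_MILES_PER_MS

print(max_one_way_miles(50))   # 1800.0 miles, as in the text
print(max_one_way_miles(25))   # 300.0 miles, roughly metro/regional reach
```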

Still, once you’ve paid the price for access/metro handling and have your request for a resource/service on fiber, you can haul it a long way for an additional millisecond or two.  Twenty milliseconds could get you to a data center in the middle of the US from almost anywhere else in the country, and back again.  That is, in my view, the determining factor in the as-a-service opportunity.  You can’t do everything as a service with that long a control loop, which means that event-driven processes in the cloud or as a part of a carrier service will have to be staged in part to resources nearer the edge.  But with proper software and network design you can do a lot, and the staging that’s needed for resource hosting is probably the driver behind most network changes over the next decade or so.
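That staging decision can be expressed very simply in software.  The sketch below is hypothetical; the tier names and round-trip latency figures are assumptions chosen to roughly match the numbers discussed above, not measurements of any real network.

```python
# Hypothetical placement helper: pick the deepest (most economical) hosting
# tier whose round-trip latency still fits a process's control-loop budget.
# Tier names and latencies are illustrative assumptions.

HOSTING_TIERS = [
    ("edge", 2),       # local/on-premises hosting, ~2 ms round trip (assumed)
    ("metro", 5),      # metro data center (assumed)
    ("regional", 12),  # state/regional data center (assumed)
    ("central", 25),   # mid-country central data center (assumed)
]

def place_process(control_loop_ms, fulfillment_ms):
    """Return the deepest hosting tier whose round trip still fits the budget."""
    budget = control_loop_ms - fulfillment_ms
    candidates = [name for name, rtt in HOSTING_TIERS if rtt <= budget]
    return candidates[-1] if candidates else None  # None: can't be offered aaS

print(place_process(50, 20))   # 'central'
print(place_process(15, 5))    # 'metro'
```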

One obvious truth is that if electrical handling adds a lot to the delay budget, you want to minimize it.  Networks of old were built as an electrical hierarchy to aggregate traffic for efficient handling.  If fiber is cheap enough, no such aggregation is needed.  If we could mesh hosting points with fiber connections, then we could make seldom-used (and therefore not widely distributed) features available in service form without blowing our control loop budget.

In a given metro area, it would make sense to mesh as many edge hosting points as possible with low-latency fiber paths (wavelengths on DWDM are fine as long as you can do the mux/demux without a lot of wasted time).  I’d envision a typical as-a-service metro host network as a redundant fiber path from a central metro data center to each edge point, with optical add/drop to get you from any edge point to any other with just a couple of milliseconds of insertion delay.  Now you can put resources to support any as-a-service element pretty much anywhere in the metro area, and everything ties back over a low-latency path to a metro (and therefore nearby) data center for hosting processes that don’t require as short a control loop.  You could carry this forward to state/regional and central data centers too.
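Here’s an illustrative model of latency across that kind of metro mesh.  The per-node add/drop insertion delay and the fiber distances are assumptions, chosen only to show how the “couple of milliseconds” figure might break down.

```python
# Illustrative latency model for the metro mesh described above: each edge
# point reaches the metro data center (or another edge point) over fiber,
# paying an optical add/drop insertion delay at each intermediate node.
# Distances and delays below are assumed values for illustration.

FIBER_MILES_PER_MS = 120
ADD_DROP_MS = 0.5          # assumed insertion delay per add/drop node

def path_latency_ms(fiber_miles, add_drop_nodes):
    """One-way latency over a metro fiber path with optical add/drop hops."""
    return fiber_miles / FIBER_MILES_PER_MS + add_drop_nodes * ADD_DROP_MS

# Edge point to the central metro data center, 30 fiber miles, 2 add/drops:
print(path_latency_ms(30, 2))    # 0.25 + 1.0 = 1.25 ms one way

# Edge point to another edge point across the ring, 60 miles, 4 add/drops:
print(path_latency_ms(60, 4))    # 0.5 + 2.0 = 2.5 ms one way
```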

All this hosting organization is useless if the software isn’t organized that way, and it’s not enough to use “functional” techniques to make that happen.  If the context of an event-driven system has to be determined by real-time correlation of all relevant conditions, then you end up with everything at the edge, and everything has to have its own master process for coordination.  That doesn’t scale, nor does it take advantage of short-loop/long-loop segregation of processes.  Most good event-driven applications will combine local conditions and analytic intelligence to express context in terms of “operating modes”, discrete contexts that establish their own specific rules for event-handling.  This is fundamental to state/event processing, but it also lets you divide up process responsibility efficiently.
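A minimal sketch of that operating-mode idea, assuming a simple (mode, event) rule table, might look like this; the mode and event names are hypothetical.

```python
# Each operating mode is a discrete context with its own event-handling
# rules, set by deeper analytic processes rather than computed locally
# from every raw condition.  Mode and event names here are hypothetical.

from typing import Callable, Dict, Tuple

Handler = Callable[[dict], None]

class ModalEventProcessor:
    def __init__(self, initial_mode: str):
        self.mode = initial_mode
        # (mode, event) -> handler; events with no entry are ignored in that mode
        self.rules: Dict[Tuple[str, str], Handler] = {}

    def on(self, mode: str, event: str, handler: Handler):
        self.rules[(mode, event)] = handler

    def set_mode(self, mode: str):
        # Typically driven by a longer-loop analytic process, not local logic.
        self.mode = mode

    def handle(self, event: str, data: dict):
        handler = self.rules.get((self.mode, event))
        if handler:
            handler(data)

proc = ModalEventProcessor("normal")
proc.on("normal", "sensor_reading", lambda d: print("log", d))
proc.on("alarm", "sensor_reading", lambda d: print("escalate", d))
proc.handle("sensor_reading", {"value": 42})   # logged locally
proc.set_mode("alarm")                         # mode change pushed from deeper analytics
proc.handle("sensor_reading", {"value": 42})   # same event, now escalated
```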

Take the classic controlled-car problem.  You need something in the car itself to respond to short-loop conditions like something appearing in front of you.  You need longer-loop processes to guide you along a determined route to your destination.  You can use a long-loop process to figure out the best path, then send that path to the car along with a set of conditions that would indicate the path is no longer valid.  That’s setting a preferred state and some rules for selecting alternate states.  You can also send alerts to cars if something is detected (a traffic jam caused by an accident, for example) well ahead, and include a new route.  We have this sort of thing in auto GPSs today; they can receive broadcast traffic alerts.  We need an expanded version in any event-driven system so we can divide the tasks that need local hosting from those that can be more efficiently handled deeper in the cloud.
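As a sketch of how that split might be expressed, consider a long-loop planner that sends the car a route plus a list of conditions that invalidate it, while the short-loop process in the car only matches local events against those conditions.  The message fields and event names here are assumptions for illustration.

```python
# Hypothetical controlled-car split: a long-loop planner supplies a preferred
# route and the conditions that invalidate it; the short-loop process in the
# car handles immediate hazards itself and otherwise just checks incoming
# events against the invalidation list.

from dataclasses import dataclass, field
from typing import List

@dataclass
class RoutePlan:
    waypoints: List[str]
    invalid_if: List[str] = field(default_factory=list)  # e.g. broadcast alerts

class CarController:
    def __init__(self, plan: RoutePlan):
        self.plan = plan
        self.state = "follow_route"

    def on_event(self, event: str):
        if event == "obstacle_ahead":
            self.state = "emergency_stop"      # short-loop, handled locally
        elif event in self.plan.invalid_if:
            self.state = "request_replan"      # hand back to the long-loop planner

plan = RoutePlan(["exit_12", "route_1_north"], invalid_if=["accident_on_route_1"])
car = CarController(plan)
car.on_event("accident_on_route_1")
print(car.state)   # request_replan
```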

We also need to be thinking about securing all this.  An as-a-service framework is subject to hacking as much as a sensor would be, though it’s likely easier to secure.  There is one unique risk, though: impersonation.  If you have an event-driven system sensitive to external messages, you have a system that can be doctored by spoofing.  Since event processing is about flows, we need to understand how to secure all the flows to prevent impersonation.
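One common way to protect a flow against impersonation is to authenticate every message with a keyed MAC.  The sketch below uses a pre-shared key purely for illustration; a real deployment would also need key management and replay protection (timestamps or sequence numbers), which I’m omitting here.

```python
# Minimal sketch of authenticating an event flow with an HMAC so a spoofed
# or tampered message can be rejected.  The shared key and event fields are
# assumptions for illustration only.

import hmac, hashlib, json

SHARED_KEY = b"example-key-distributed-out-of-band"   # assumed pre-shared key

def sign_event(event: dict) -> dict:
    payload = json.dumps(event, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return {"event": event, "mac": tag}

def verify_event(message: dict) -> bool:
    payload = json.dumps(message["event"], sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["mac"])

msg = sign_event({"type": "traffic_alert", "road": "route_1"})
print(verify_event(msg))                       # True
msg["event"]["road"] = "spoofed_road"          # tampering/impersonation attempt
print(verify_event(msg))                       # False
```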

As-a-service is critical for future cloud applications, but particularly so for event-driven systems.  By presenting information requirements derived from analytics, not just those triggered by a simple event, as “services”, we can simplify applications and divide tasks so that we use more expensive edge resources more efficiently.  To get the most from this model, we’ll need to rethink how we network our edge, and how we build applications to use those services optimally.