What Would Edge-Hosting Mean to Infrastructure and Software Design?

If computing in general and carrier cloud in particular is going to become more edge-focused over time, then it’s time to ask just what infrastructure features will be favored by the move.  Even today we see variations in server architecture and the balance of compute and I/O support needed.  It is very likely that there will be even more variations emerging as a host of applications compete to dominate the cloud’s infrastructure needs.  What are the factors, and what will the result be?  I’m going to have to ask you to bear with me, because understanding the very important issues here means going way beyond 140-character Tweets.

It’s always a challenge to try to predict how something that’s not even started will turn out in the long term.  Carrier cloud today is almost pre-infancy; nearly all carrier IT spending is dedicated to traditional OSS/BSS, and what little is really cloud-building or even cloud-ready is so small that it’s not likely representative of broader, later, commitment.  Fortunately, we have some insight we can draw from the IT world, insight that’s particularly relevant given the fact that things like microservices are already a major driver of change in IT, and are of increasing interest in the carrier cloud.  To get to these insights we need to look a bit at the leading edge of cloud software development.

Microservices are in many ways a bridge between traditional componentized applications (including those based on the Service Oriented Architecture of almost two decades ago) and the “bleeding edge” of computing architecture, the functional programming or Lambda function wave.  A Lambda function is a software element that processes an input and produces an output without relying on stored internal state—it behaves the same way regardless of the context of its use.  What makes this nice is that because nothing is ever saved inside a Lambda function, you can give a piece of work to any copy of the function and get exactly the same result.  I’m going to talk a lot about Lambda functions in this blog, so to save typing I’m going to call them “Lambdas”, with apologies to the people who use the term (without capitalizing) to mean “wavelength”.

In the broader development context, this kind of behavior is known as “stateless” behavior, because there are no “states”, no differences in function outcome depending on the sequence of events or messages being processed.  Stateless behavior is mandatory for Lambdas, and also highly recommended if not mandated for microservices.  Stateless components are great because you can replace them, scale them, or use any convenient copy of them with no impact and no cross-talk.  The downside is that many processes aren’t stateless at all—think of taking money out of the bank if you need an easy example.  What you have left depends on what you’ve put in or taken out before.
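
To make the distinction concrete, here’s a minimal Python sketch (the names and the event format are mine, purely for illustration).  The first function is stateless in the Lambda sense; the second is the bank-account case, where the outcome depends on history:

```python
# A stateless function: the output depends only on the input, so any copy,
# anywhere, returns the same result for the same event.
def convert_reading(event):
    return {"fahrenheit": event["celsius"] * 9 / 5 + 32}

print(convert_reading({"celsius": 100}))   # {'fahrenheit': 212.0}

# A stateful process: the outcome depends on what happened before.  Two
# copies of this object would immediately disagree about the balance,
# which is why it can't be freely replicated or relocated.
class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount   # what you have left depends on history
        return self.balance
```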

The reason for this little definitional exercise is that both Amazon and Microsoft have promoted Lambda programming as a pathway to event-driven IT, and the same is being proposed for microservices.  In Amazon’s case, they linked it with distributing functions out of the cloud and into an edge element (Greengrass).  Event-driven can mean a lot of things, but it’s an almost-automatic requirement for what are called “control loop” applications, where something is reported and the report triggers a process to handle it.  IoT is clearly a control-loop application, but there are others even today, which is why Amazon and Microsoft have focused on cloud support for Lambda functions.  You can write a little piece of logic to do something and just fire it off into the network somewhere it can meet the events it supports.  You don’t commit machine image resources or anything else.
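
As a rough illustration of the “little piece of logic” idea, here’s a sketch in the style of AWS Lambda’s Python handler convention.  The event fields and the response action are hypothetical; a real deployment would define its own event schema:

```python
# A hedged sketch of an event-driven "control loop" function.  The handler
# signature follows AWS's Python convention; everything inside is invented.
def lambda_handler(event, context):
    # Something is reported...
    reading = event["temperature"]

    # ...and the report triggers a process to handle it.  No state is kept,
    # so the platform can run this copy, or any other, wherever the event lands.
    if reading > event["threshold"]:
        return {"sensor": event["sensor_id"], "action": "shut_down"}
    return {"sensor": event["sensor_id"], "action": "none"}

# Local test invocation (the context argument is unused here).
print(lambda_handler({"sensor_id": "s-9", "temperature": 82, "threshold": 75}, None))
```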

If IoT and carrier cloud are going to focus on being event-driven, it follows that they would likely become at least Lambda-like: based on stateless microservices that are pushed toward the edge to shorten the control loop, while traditional transactional processes stay deeper in the compute structure.  Applications, then, could be visualized as a cloud of Lambdas floating around, collectively supporting a smaller number of stateful, repository-oriented central applications.  The latter will almost surely look like any large server complex dedicated to online transaction processing (OLTP).  What about the former?

The Lambda vision is one of functional units that have no specific place to live, remember.  It’s a vision of migrating capabilities so they can be assembled along the natural path of work, at a place that’s consistent with their mission.  If they’re to be used in event-handling, this process of marshaling Lambdas can’t take too long, which means that you’d probably have a special system analyzing Lambda demand and caching functions, almost like video is cached today.  You’d probably not want to send a Lambda somewhere on demand so much as either have it ready or load it quickly from a local resource.  Once it’s where it needs to be, it’s simply used when the appropriate event shows up.
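
Nobody has standardized what such a Lambda-caching system would look like, but a toy sketch can show the shape of the idea.  Everything here (the class names, the store, the eviction rule) is invented for illustration:

```python
class FlashStore:
    """Stand-in for fast local storage holding unstaged function images."""
    def __init__(self, images):
        self.images = images

    def load(self, name):
        return self.images[name]     # in reality, a fast flash/SSD read

class LambdaCache:
    """Keeps 'hot' functions staged in memory, like video in a CDN cache."""
    def __init__(self, store, capacity=100):
        self.store = store
        self.capacity = capacity
        self.hot = {}                # functions already staged in RAM

    def handle(self, name, event):
        if name not in self.hot:
            if len(self.hot) >= self.capacity:
                self.hot.pop(next(iter(self.hot)))   # evict oldest-staged
            self.hot[name] = self.store.load(name)   # quick local insert
        return self.hot[name](event)                 # use it where it sits

cache = LambdaCache(FlashStore({"double": lambda e: e * 2}))
print(cache.handle("double", 21))    # 42
```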

This should make it obvious that running a bunch of Lambdas is different from running applications.  You don’t need a lot of disk I/O for most such missions, unless the storage is for non-volatile referential data rather than a dynamic database.  What you really want is powerful compute capabilities, a lot of RAM capacity to hold functions-in-waiting, and probably flash disk storage so you can quickly insert a function that you need, but hadn’t staged for use.  Network I/O would be very valuable too, because restrictions on network capacity would limit your ability to steer events to a convenient Lambda location.

How Lambda and application hosting balance each other, requirements-wise, depends on how far you are from the edge.  At the very edge, the network is more personalized, so the opportunity to host “general-use Lambdas” is limited.  As you go deeper, the natural convergence of network routes along physical facilities generates places where traffic combines and Lambda missions could reasonably be expected to be shared across multiple users.

This builds a model of “networking” that is very different from what we have now, perhaps more like that of a CDN than like that of the Internet.  We have a request for event-processing, which is an implied request for a Lambda stream.  We wouldn’t direct the request to a fixed point (any more than we direct a video request that way), but would rather assign it to the on-ramp of a pathway along which we had (or could easily have) the right Lambdas assembled.
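
A hypothetical steering function shows the CDN-like flavor of this: given the Lambdas an event stream needs, pick the hosting point that already has the most of them staged, breaking ties by proximity.  The scoring rule is invented for illustration:

```python
# CDN-style steering for event processing: no fixed destination, just the
# best on-ramp given what's already staged.  All names are illustrative.
def pick_on_ramp(needed_lambdas, nodes):
    def score(node):
        staged = len(needed_lambdas & node["staged"])   # already in place
        return (staged, -node["latency_ms"])            # staged first, then near
    return max(nodes, key=score)

nodes = [
    {"name": "edge-a",  "latency_ms": 4,  "staged": {"parse", "filter"}},
    {"name": "metro-b", "latency_ms": 12, "staged": {"parse", "filter", "correlate"}},
]
print(pick_on_ramp({"parse", "filter", "correlate"}, nodes)["name"])  # metro-b
```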

I noted earlier in this blog that there were similarities between Lambdas and microservices.  My last paragraph shows that there is also at least one difference between them, at least in popular usage.  The general model for microservices is based on extending componentization and facilitating the use of common functions in program design.  A set of services, as independent components, supports a set of applications.  Fully exploiting the Lambda concept would mean that there really isn’t a “program” to design at all.  Instead there’s a kind of ongoing formula that’s developed based on the source of an event, its ultimate destination, and perhaps the recent process steps taken by other Lambdas.  This model is the ultimate in event-driven behavior, and thus the ultimate in distributed computing and edge computing.

There’s another difference between microservices and Lambdas, more subtle and perhaps not always accepted by proponents of the technologies.  Both are preferably “stateless” as I noted, but in microservices it’s acceptable to use “back-end” state control to remove state/context from the microservices themselves.  With Lambdas, this is deprecated because in theory different copies of the same Lambda might try to alter state at the same time.  It would be better for “state” or context to be carried as a token along with the request.
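
Here’s a small sketch of the token approach, with field names that are purely illustrative: each stateless step reads the context it needs from the token and returns an updated copy, so no back-end store is shared and any copy of any step can do the work:

```python
# State rides along with the request as a token, not in a shared store.
def step_validate(event, token):
    token = dict(token, validated=True)          # new token, no shared state
    return event, token

def step_enrich(event, token):
    if not token.get("validated"):
        raise ValueError("out-of-order processing")
    token = dict(token, region="metro-7")        # context travels with the work
    return event, token

event, token = {"sensor": 42, "value": 9.1}, {"source": "gateway-3"}
for step in (step_validate, step_enrich):        # any copy of each step works
    event, token = step(event, token)
print(token)   # {'source': 'gateway-3', 'validated': True, 'region': 'metro-7'}
```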

We don’t yet really have a framework to describe it, though.  Here’s an event, pushed out by some unspecified endpoint.  In traditional programming, something is looking for it, or it’s being posted somewhere explicitly.  Maybe it’s publish-and-subscribe.  However, in a pure Lambda model, something Out There is pushing Lambdas out along the path of the event.  What path is that?  How does the Something know what Lambdas are needed or where to put them?

If you applied the concepts of state/event programming to Lambda control, you could say that when an event appears it is associated with some number of state/event tables, tables that represent contexts that need to process that event.  The movement of the event through Lambdas could be represented as the changing of states.  Instead of the traditional notion of an event arriving at a process via a state/event table, we have a process arriving at the event for the same reason.  But it’s still necessary to know what process is supposed to arrive.  Does the process now handling an event use “state” information that’s appended to it to identify the next process down the line?  If so, how does the current process know where the next one has been dispatched, and how does the dispatcher know to anticipate the need?  You can see this needs a lot of new thinking.
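
State/event tables themselves are well-trodden ground, and a minimal sketch shows what a dispatcher would consult; the open question above is who stages the handler before the event arrives.  The states, events, and handlers here are invented for illustration:

```python
# A classic state/event table: (state, event) pairs map to a handler that
# does the work and names the next state.  In the Lambda model, the same
# table could instead tell a dispatcher which function to stage next.
def on_report(ev):   return "ASSESSING"
def on_fault(ev):    return "REPAIRING"
def on_cleared(ev):  return "IDLE"

TABLE = {
    ("IDLE",      "report"):  on_report,
    ("ASSESSING", "fault"):   on_fault,
    ("ASSESSING", "cleared"): on_cleared,
    ("REPAIRING", "cleared"): on_cleared,
}

def dispatch(state, event):
    handler = TABLE[(state, event["type"])]   # unknown pairs raise KeyError
    return handler(event)                     # handler returns the next state

state = "IDLE"
for ev in ({"type": "report"}, {"type": "fault"}, {"type": "cleared"}):
    state = dispatch(state, ev)
print(state)   # IDLE
```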

IoT will really need this kind of edge-focused, Lambda-migrating, thinking.  Even making OSS/BSS “event-driven” could benefit from it.  Right now, as far as I can see, all the good work is being done in the abstract with functional programming, or behind the scenes of web-focused, cloud-hosted startups that have probably stimulated both Amazon and Microsoft to offer Lambda capabilities in their clouds.  It will be hard to make IoT the first real use case for this—it’s a very big bite—but maybe that’s what has to happen.

A Slightly Early MWC Retrospective

The iconic MWC conference is now pretty much history.  The big announcements have been made, the attendees have largely exhausted themselves (the exhibitors certainly have!), and it’s time to take stock and decide whether anything important was really said and shown.  In terms of point announcements, it’s rare for something huge to come out at an event like MWC—too much crosstalk.  The buzz of the show is another matter; we can pick out some important points by looking across all the announcements and demonstrations to detect shifts and non-shifts.

The most important thing that I take away from MWC is that there is an enormous gap between 5G expectation and the current state of the technology.  The goal of 5G is service- and infrastructure-shaking, and the reality of 5G at the moment struggles to be a major shift in the RAN.  Part of the reason for this gap is the (usual) slow progress of the specifications, but another part is the fact that standards groups have a habit of grabbing the low apples or focusing on the most visible questions.

5G RAN improvements are important, but operators I talk with have consistently said that their biggest priority was to standardize the metro and access models for wireless and wireline, and to support wireless 5G extensions of fiber networks.  Without these capabilities, many operators said that it would be difficult to justify 5G versus enhanced 4G.  Ironically, the early “5G trials” have all focused on RAN and on modest adjustments to 4G, like supporting 5G frequencies, to “prove out” the technology.  Some operators have been public in their rejection of this approach, but that’s what’s been happening.

One public approach to pre-standard 5G even retains the Evolved Packet Core, which most operators told me was something that they wanted (as a number-one or number-two priority) to eliminate.  Clearly the focus of many 5G proponents is to move the process ahead even if there’s less utility in what’s produced.  That also was a criticism that’s been made in public.

The next point is that we have not yet abandoned our short-sighted and stupid vision of IoT as being all about wireless connections.  There were plenty of examples of this, but two figured particularly in the overall stream of hype.  The first is a broadening of the notion that this is all about RF, which makes IoT all about connections.  The second is the almost hypnotic attraction to “connected car” as the prototypical IoT application.

I’m almost tired of saying that getting devices connected is the least of our IoT worries, but it is.  The majority of IoT applications will almost certainly use devices that not only aren’t directly on the Internet, but don’t even use Internet-related technology for connections.  Home control today relies on technologies that aren’t related to Ethernet, IP, or the Internet.  Only the home controller is an Internet device, and this model of connectivity is likely to dominate for a long time to come.  If we insist that all our sensors and controllers be IP devices that are Internet-connected, we’re building a barrier to adoption that will take unnecessary years to jump.

The connected car is another potential trap.  Most of what a connected car will do is offer WiFi to consumer mobile devices that passengers and drivers (the latter, hopefully, not while moving) are using in the vehicle.  Yes, there are other features, but the value proposition is really more like a moving WiFi hotspot than a real IoT mission.  There’s always pressure to pick something that’s actually happening and then broaden your definition of revolutionary technology to envelop it, justifying your hype.  That’s not helpful when there are real questions and issues that are not addressed by the billboard-technology example, but will have to be addressed for the market to develop.

The first positive point from the show is that both network operators and equipment vendors realize that mobile broadband personalization is the only relevant demand driver.  Wireline broadband for both consumers and businesses is really just a matter of wringing as much profit as possible out of something that’s already marginal at best.  If there is new revenue to be had for operators, that revenue is going to come from the exploitation of mobile broadband in both enterprise and consumer markets.

There’s a sad side even to this happiness, though.  For all that the explosion of interest in MWC demonstrates the victory of mobile broadband, and that many who exhibit (and probably even more who attend) are there for things not directly related to cellular networks, we’re still missing a lot of the key points that justify the mobile focus.

A mobile device is a direct extension of the user, a kind of technological third leg or second head.  It brings the knowledge and entertainment base of the Internet and the power of cloud computing right into the hands of everybody.  The best way to look at IT evolution since the ‘50s is that each new wave brought processing closer to people.  Mobile broadband fuses the two.

Also in my view a positive was the talk from FCC Chairman Ajit Pai, where he said what shouldn’t really have surprised anyone—that the FCC planned a “lighter touch” under the new administration.  The FCC had already taken steps that indicated it would retreat from the very activist position taken by the body under the previous Chairman (Wheeler), but Pai voted against the neutrality ruling and his comments at MWC suggest he has specific moves in mind.  Reinforcing the “lighter touch” was the comment (referencing neutrality) that “It has become evident that the FCC made a mistake.  Our new approach injected tremendous uncertainty into the broadband market. And uncertainty is the enemy of growth.”

Net neutrality is important, insofar as it protects OTT competitors from operators cutting favorable deals with their own subsidiaries.  The current rules, though, were not enough to prevent AT&T from offering outside-data-plan video to its TV customers.  On the other hand, the extension of the rules that Wheeler promoted has made the relationship between subsidiaries and ISPs confusing to say the least, and it has probably limited operators’ willingness to pursue initiatives that would have promoted broadband infrastructure investment.

I have to agree with Pai here.  I think that the FCC in the last term overstepped simple neutrality goals and took a stand on the broadband business that favored one party—the OTTs—over the other, to a degree the FCC had never done before.  A dynamic broadband market—the kind that MWC and 5G propose to support—demands a symbiosis and not an artificial financial boundary.  Through almost my whole consulting career I’ve supported the notion of Internet settlement, and I still support it.  I think it’s time to take some careful, guarded, steps toward trying it out.

How Could We Accelerate the Pace of New Edge-Deployed Data Centers?

There should be no question that I am a big fan of edge computing, and so I’m happy that Equinix is pushing storage to the edge (according to one story yesterday) or that Vapor IO supports micro-data-centers at the wireless edge.  I just wish we had more demand focus to explain our interest in the supply.  There are plenty of things happening that might drive a legitimate migration of hosting/storage to the network edge, and I can’t help but feel we’d do a better job with deployment if there was a specific business case behind the talk.

Carrier cloud is the definitive network-distributed IT model, and one of the most significant questions for server vendors who have aspirations there is just how the early carrier cloud data centers will be distributed.  A central model of hosting, even a metro-central one, would tend to delay the deployment of new servers.  An edge-centric hosting model would build a lot of places where servers could be added, and encourage operators to quickly support any missions where edge hosting offered a benefit.  So, in the balance, where are we with this?

Where you host something is a balance between economy of scale, economy of transmission, and propagation issues.  Everyone knows that a pool of servers offers lower unit cost than single servers do, and most also know that the cost per bit of transmission tends to fall to a point, then rise again as you pass the current level of economical fiber transport.  Most everyone knows that traversing a network introduces both latency (delay) and packet loss that grow with the distance involved.  The question is how these things combine to impact specific features or applications.

Latency is an exercise in physics: the speed of light and of electrons, plus the delay introduced by queuing and handling in network devices.  The only way to reduce it is to shorten the path, which means pushing stuff to the edge.  Arguably, the only reason to edge-host something is latency (though we’ll explore that point later), and most applications run today aren’t enormously latency-sensitive.  Telemetry and control, which involve handling an event and sending a response, are often critically sensitive to latency in M2M applications.

That means that IoT is probably the obvious reason to think about edge-hosting something.  The example of self-driving cars is trite here, but effective.  You can imagine what would happen if a vehicle was controlled by something half-a-continent away.  You can easily get a half-second control loop, which would mean almost fifty feet of travel at highway speed.
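
The arithmetic is easy to check (assuming 65 mph as “highway speed”):

```python
# Rough check of the control-loop arithmetic above.
speed_mph = 65
feet_per_second = speed_mph * 5280 / 3600     # about 95 ft/s
loop_seconds = 0.5                            # the half-second control loop
print(feet_per_second * loop_seconds)         # roughly 48 feet of travel
```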

Responses to human queries, particularly voice-driven personal assistants, are also delay sensitive.  I saw a test run a couple years ago that demonstrated that people got frustrated if their queries were delayed more than about two seconds, resulting in their repeating the question and creating a context disconnect with the system.  Since you have to factor in actual “think time” to a response, a short control loop would be helpful here, but you can accommodate longer delay by having your assistant say “Working on that….”

Content delivery in any form is an example of an application where latency per se isn’t a huge issue, but it raises another important point—resource consumption or “economy of transmission”.  If you were to serve (as a decades-old commercial suggested you could) all the movies ever made from a single point, the problem you’d hit is that multiple views of the same movie would quickly explode demands for capacity.  You also expose the stream to extreme variability in network performance and packet loss, which can destroy QoE.  Caching in content delivery networks is a response to both these factors, and CDNs represent the most common example of “edge hosting” we see today.

Let’s look at the reason we have CDNs to explore the broader question of edge-hosting economies versus more centralized hosting.  Most user video viewing hits a relatively contained universe of titles, for a variety of reasons.  The cost of storing these titles in multiple places close to the network edge, thus reducing network resource consumption and the risk of performance issues, is minimal.  What makes it so is the fact that so many content views hit a small universe of content.  If we imagine for a moment that every user watched their own unique movie, you’d see that content caching would quickly become unwieldy.  Unique resources, then, are better candidates for “deep hosting” if all other traffic and scale economies are equal.
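
A quick sketch makes the point about concentrated viewing.  If title popularity is Zipf-like (a standard modeling assumption, not a measured figure), caching a small fraction of titles captures most requests; if every view were unique, it would capture almost none:

```python
# Popularity assumed Zipf-distributed over 100,000 titles (an illustrative
# assumption, not real viewing data).
weights = [1 / rank for rank in range(1, 100_001)]
total = sum(weights)
cached_share = sum(weights[:1000]) / total          # cache the top 1,000
print(f"Top 1% of titles draw {cached_share:.0%} of Zipf-like views")
print(f"...but only {1000 / 100_000:.0%} of views if every title were unique")
```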

That brings us to scale.  I’ve mentioned in many blogs that economies of scale don’t follow an exponential or even linear curve, but an Erlang C curve.  That means that when you get to a certain size data center, further efficiency gains from additional servers are minimal.  For an average collection of applications I modeled for a client, you reached 95% optimality at about 800 servers, and there are conditions under which less than half that would achieve 90% efficiency.  That means that supersized cloud data centers aren’t necessary.  Given that, my models have always said that by around 2023, operators would have reached the point where there was little benefit to augmenting centralized data centers, and would move to edge hosting.  The biggest growth in new data centers occurs in the model between 2030 and 2035, where the number literally doubles.  If I were a vendor, I would want to accelerate that shift to the edge.
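
To see the flattening for yourself, here’s a sketch using the standard Erlang C formula (computed via the numerically stable Erlang B recursion).  The 5% wait-probability target and the pool sizes are illustrative, not the parameters of my model, but the shape of the curve is the point: achievable utilization climbs steeply for small pools, and gains become marginal well before hyperscale:

```python
def erlang_c(servers, load):
    """P(a job waits) for an M/M/N queue, via the Erlang B recursion
    and the standard B-to-C conversion.  'load' is offered Erlangs."""
    if load >= servers:
        return 1.0
    b = 1.0
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)            # Erlang B recursion
    return servers * b / (servers - load * (1 - b))

def max_utilization(servers, wait_target=0.05):
    """Highest per-server utilization keeping P(wait) under the target,
    found by bisection on the offered load."""
    lo, hi = 0.0, float(servers)
    for _ in range(60):
        mid = (lo + hi) / 2
        if erlang_c(servers, mid) <= wait_target:
            lo = mid
        else:
            hi = mid
    return lo / servers

for pool in (10, 50, 200, 800, 3200):
    print(pool, round(max_utilization(pool), 3))   # utilization flattens
```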

Centralization of resources is necessary for unique resources.  Edge hosting is necessary where short control loops are essential to application performance.  If you model the processes, you find that up to about 2020, carrier cloud is driven more by caching and video/ad considerations than anything else, and that tends to encourage a migration of processing toward the edge.  From 2020 to about 2023, enhanced mobile service features begin to introduce more data center applications that are naturally central or metro-scoped, and beyond 2023 you have things like IoT that magnify the need for edge caching again.

Video, then, is seeding the initial data center edge locations for operators.  Metro-focused applications will probably use a mixture of space in these existing edge hosting points and new and more metro-central resources.  The natural explosion in the number of data centers will occur when the newer short-control-loop stuff emerges, perhaps four to five years from now.  It would be hard to advance something like this; the market change is pretty profound.

Presuming this is all true, the current emphasis on caching of data is smart and edge hosting of processing features may be premature.  What could accelerate the need for edge hosting?  This is where something like NFV could be a game-changer, providing a mission for edge-centric hosting before broad-market changes in M2M and IoT emerge and building many more early data centers.  If you were to double the influence of NFV, for example, in the period up to 2020, you would add an additional thousand edge data centers worldwide.

NFV will never drive carrier cloud, but what it could do is to promote edge-placement of many more data centers between now and 2020, shifting the balance of function distribution in the 2020-2023 period toward the edge, simply because the resources are there.  That could accelerate the number of hosting points (and slightly increase the number of servers) through 2025, and it would be a big windfall for vendors.

IT vendors looking at the carrier cloud market should examine the question of how this early NFV success could be motivated by specific benefits, and what specific steps in standardization, operationalization, or whatever might be essential in supporting that motivation.  There are few applications that could add as much to the data center count, realistically, in the next three years.