Is There a Link Between Anthos Federation and Operators’ Service Federation?

I want to take up federation again, extending the federation approach of Google’s Anthos to the service provider space, where federation is surely needed.  I’ve said many times that the cloud initiatives we’re seeing are addressing many of the things that operator standards for advanced services and hosted features should have addressed.  The question now is whether Anthos-like federation is an example of that; whether Anthos could be an indicator that the cloud is going to address one of the critical issues that operator standards groups have failed to cover fully.

My first experience with operator-centric network federation in a formal sense came on the IPsphere project a decade ago.  IPsphere defined a hierarchy of service-as-a-set-of-elements.  Services were essentially a commercial envelope and a unifier of the relationship of the elements within.  Elements could be provided by the service owner, or they could be outsourced to others—presumably operators in IPsphere terms.  Thus, services required federation as a standard feature.

The presumption inherent in IPsphere was that each “Element provider” was an autonomous entity that was unwilling to expose the details of its infrastructure to others.  An Element was a set of properties that could be relied upon, from the perspective of the Service, and a recipe for fulfilling those properties, from the perspective of the Element provider.  The Element ran in the Element provider’s infrastructure and delivered what it promised, exposing status information and interfaces as described.  In virtually all cases, the Element had to be connected in some way with the data plane of the Service owner, and perhaps other Element owners too, to create a cohesive service.  This is the framework I proposed in my ExperiaSphere project.
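To make the outside/inside split concrete, here is a minimal Python sketch of an Element as a set of advertised properties plus a private recipe.  It is an illustration only, not IPsphere’s or ExperiaSphere’s actual data model; every name in it (Element, Service, PartnerCo, the property keys) is a hypothetical I’ve made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Element:
    """One federated piece of a service (hypothetical model)."""
    name: str
    provider: str
    # Outside view: the properties the Service owner may rely on.
    advertised: Dict[str, str]
    # Inside view: the provider's private recipe; hidden from repr().
    recipe: Callable[[], Dict[str, str]] = field(repr=False)

    def fulfill(self) -> Dict[str, str]:
        """Run the private recipe, exposing only status, never internals."""
        internal = self.recipe()
        return {"element": self.name, "status": internal.get("status", "unknown")}


@dataclass
class Service:
    """Commercial envelope that unifies a set of Elements."""
    owner: str
    elements: List[Element]

    def activate(self) -> List[Dict[str, str]]:
        # The Service sees only what each Element chooses to expose.
        return [element.fulfill() for element in self.elements]


def partner_access_recipe() -> Dict[str, str]:
    # Provider-internal detail that the Service owner never sees.
    return {"status": "active", "internal_node": "edge-router-17"}


def owner_core_recipe() -> Dict[str, str]:
    return {"status": "active", "internal_node": "core-3"}


access = Element("access-east", "PartnerCo", {"bandwidth": "100Mbps"}, partner_access_recipe)
core = Element("core-vpn", "OwnerCo", {"sites": "10"}, owner_core_recipe)
vpn = Service("OwnerCo", [access, core])
statuses = vpn.activate()
```

The point of the sketch is that the Service composes Elements purely through what they advertise and report.  How a partner fulfills its Element stays behind the recipe.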

This structure is not unlike that which Anthos describes.  Each Kubernetes domain in Anthos is autonomous, and there are gateways (implicit or explicit) that provide data plane linkage among the domains.  There’s also a set of properties, in the form of high-level policies, that define how everything is supposed to work at the collective level.  These feed the domain-specific policies, and vice versa, so the general notion of policies rather than scripts links the collective level and the domain levels.

Sometimes you can go astray by staying up in the stratosphere.  The basic model of domains linked with policies and gateways seems appropriate for the kind of federated relationship of autonomous providers that IPsphere envisioned, but while a 50-thousand-foot architecture diagram of the two approaches would look similar, the details are quite different, and those differences are what we need to look at in more detail.

Anthos is about application deployment.  IPsphere, and most network operator federation needs, are service-related.  Anthos lets you connect the pieces of a distributed application, and if we draw the obvious analogy that application components are the same as hosted service features, then Anthos is really about a form of federation that operators (in the IPsphere work) had dismissed.  They didn’t want to expose their infrastructure for others to deploy on, which is what Anthos federation would enable.  They wanted each operator to deploy on its own infrastructure, and expose the functional result of that deployment.  Don’t host on me, cooperate with me at the service level.

If we look at this point carefully, we can see that operator-centric federation, or service federation, is actually a layer above Anthos.  You can’t address it with general hosting tools; you have to first assign application/service functional missions to domains, and then advertise those missions on the outside while fulfilling them on the inside.  A VPN could be federated as a service, but if you wanted to do one using Anthos, you’d first have to assign roles in the service to the federated partners, describe those roles in the outside-properties, inside-recipe form that IPsphere used for Elements, and then use Anthos to fulfill the recipes by deploying components.

Could you build a service like a VPN that way?  In the present form of Anthos, probably not.  Anthos is about a tenth as fast as native component interconnection, due to the latency that the service mesh and its associated features introduce.  That’s too much of a performance hit to be incorporated into a data-plane network connection service.  You could use Anthos and service meshes for control-plane stuff (within IMS/EPC, for example) but probably not for the data plane.  At least, not until we work through some of the performance issues, which is almost surely going to happen with Anthos, and with service mesh technology in general.

The opposite question is also interesting.  Could Anthos find another layer useful?  An application is like a service.  Could some new application-layer piece be introduced that would be able to decompose functional needs into domain assignments, then use Anthos to deploy the pieces of the application according to that functional assignment map?

Anthos’ main components are Kubernetes and Istio, for container orchestration and service mesh, respectively.  Neither is a walk in the park to learn and use.  Could we make them easier to use with a layer above?  Could introducing some service-federation thinking into the cloud benefit the cloud, and so become an outcome we could actually hope for?  I think that’s possible.

Anthos takes a stab, via policy coordination, at managing the collective resource pool, but that’s done at the level of hosting and discovery of microservices, not at the function-to-domain assignment level.  That means that users have to decide where stuff belongs, if that decision is based not on simple metrics like available resources or cost, but on application and user requirements.  Another higher-level dimension that could map an application across multiple hosting domains (multiple public clouds and the data center or data centers) would surely help users.

This might even be the way to get artificial intelligence into the game.  Functional or user-level assignment like this requires balancing a lot of variables, including the performance impact of where you put something, the regulatory implications of having your service pieces in a specific place, and of course cost.  Then there are the differences in the tools and features available in all the possible domains to consider.  It’s a complex problem of the kind AI would be perfect for.
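To show the shape of that balancing problem, here is a deliberately simple Python sketch.  It is not AI, and it is not anything Anthos provides; it’s a plain weighted score standing in for whatever smarter engine might eventually do the job.  It treats regulatory and feature requirements as hard filters, then trades latency against cost.  All of the names, domains, and numbers are invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Domain:
    """A candidate hosting domain (hypothetical attributes)."""
    name: str
    region: str
    latency_ms: float
    cost_per_hour: float
    features: FrozenSet[str]


@dataclass(frozen=True)
class Requirement:
    """What one application function needs from its hosting domain."""
    function: str
    allowed_regions: FrozenSet[str]   # regulatory constraint (hard)
    needed_features: FrozenSet[str]   # tool/feature constraint (hard)
    latency_weight: float = 1.0       # soft trade-off weights
    cost_weight: float = 1.0


def place(req: Requirement, domains: List[Domain]) -> Optional[Domain]:
    """Pick the lowest-scoring eligible domain, or None if none qualifies."""
    eligible = [
        d for d in domains
        if d.region in req.allowed_regions and req.needed_features <= d.features
    ]
    if not eligible:
        return None
    return min(
        eligible,
        key=lambda d: req.latency_weight * d.latency_ms
        + req.cost_weight * d.cost_per_hour,
    )


domains = [
    Domain("cloud-a", "us", 40.0, 2.0, frozenset({"gpu", "mesh"})),
    Domain("cloud-b", "eu", 25.0, 3.0, frozenset({"mesh"})),
    Domain("datacenter", "eu", 10.0, 8.0, frozenset({"mesh"})),
]

# Cost-sensitive function that must stay in the EU.
billing = Requirement("billing", frozenset({"eu"}), frozenset({"mesh"}), cost_weight=5.0)
# Latency-sensitive function under the same regulatory limit.
session = Requirement("session", frozenset({"eu"}), frozenset({"mesh"}))
# Needs a feature that no EU domain offers.
training = Requirement("training", frozenset({"eu"}), frozenset({"gpu"}))

billing_choice = place(billing, domains)
session_choice = place(session, domains)
training_choice = place(training, domains)
```

With these made-up numbers, the cost-weighted function lands in cloud-b, the latency-sensitive one lands in the data center, and the third has no legal home at all.  Even three domains and three functions produce different answers depending on the weights, which hints at why a real version of this would need something smarter than a hand-tuned formula.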

And (no surprise) there’s more.  If we had this kind of AI-driven higher-level element that mapped function to deployment, why couldn’t it feed the policies of the Anthos layer?  How much of the complexity of container deployment and management could be erased by having something that mapped goals and intent to deployment and connection?  A lot, I suspect.

What this all shows, in my view, is first that what operators think of as “federation” is a higher layer than Kubernetes/Anthos federation, and second that the software for that higher layer isn’t unique to the network operator application of service federation.  The higher layer’s general solution is AI-based coupling of application/user goals and intent to deployment and connection policies.  That means that the cloud community will likely get there as it moves to make Kubernetes, Istio, and Anthos more accessible and useful.  So, if operators want input into the only processes likely to actually advance their own needs, they’ll need to hustle.