Federation, Virtual Network Operators, and 5G Slicing, and Their Relationship to SDN/NFV

Every network operator I’ve surveyed has some sort of wholesale/retail relationship with other operators.  Most fit into two categories—a relationship that extends geographic scope or one that incorporates one operator’s service inside another (like backhaul or MVNO).  Given this, it is natural to assume that services built on SDN and/or NFV would have to be covered by these same sorts of deals.  The question is how, and how the need to support these relationships could impact the basic architecture of SDN or NFV.  It’s important because the 5G specifications are going to make “slicing” into a new standard mechanism for virtual networking.

To avoid listing a host of relationships to describe what we’re going to talk about here, I’m going to adopt a term that has been used often (but not exclusively) in the market—federation.  For purposes of this blog, federation is a relationship of “incorporation”, meaning that one operator incorporates services or service elements from another in its own offerings.  We should note, though, that operators are a kind of special case of “administrative domains”, and that federation or sharing-and-incorporation capabilities could also be valuable or essential across business units of the same operator, across different management domains, etc.

We used to have federation all the time, based on intercarrier gateways.  Telcos have always interconnected calls with each other, and early data standards included specific gateway protocols; the venerable and now rarely used packet standard X.25 relied on the X.75 gateway standard for federation.  All of this could be called "service-level" federation, where a common service was concatenated across domains.  Federation today happens at different levels, and creates different issues at each.

There is one common, and giant, issue in federation today, though, and it's visibility.  If a "service" spans multiple domains, then how does anyone know what the end-to-end state of the service is?  The logical answer is that they have a management console that lets them look, but for that to work all the federated operators have to provide visibility into their infrastructure, to the extent needed to get the state data.  Management visibility is like the power to tax: if you can invoke it, it can lead to destruction.  No operator wants others to see how its network works, and it's worse if you expect partners to remediate problems, because a partner actually exercising management control could trigger destabilizing events.

The presumption we could make to resolve this issue is that service modeling, done right, would create a path to a solution.  A service model that's made up of elements, each being an intent model that asserts a service-level agreement, would let an operator share the model based only on the agreed/exposed SLA.  If we presumed that the "parameters" of that element were exposed at the top and derived compatibly by whatever was inside, then we could say that what you'd see or be able to do with the deployed implementation of any service element would be fixed by the exposure.  In this approach, federation would be the sharing of intent-modeled elements.
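As a minimal sketch of this idea (all class and parameter names here are my own invention, not from any standard), a composite intent element could derive its exposed SLA from whatever is composed inside it, so a federation partner sees only the top-level parameters:

```python
# Illustrative sketch: a composite intent element derives its exposed SLA
# from its internal children; the children themselves stay hidden.
class Element:
    def __init__(self, name, latency_ms, availability):
        self.name = name
        self.latency_ms = latency_ms
        self.availability = availability

class Composite:
    """An intent-modeled element: exposes a derived SLA, hides its internals."""
    def __init__(self, name, children):
        self.name = name
        self._children = children   # never visible to a federation partner

    def exposed_sla(self):
        # Serial composition: latencies add, availabilities multiply.
        latency = sum(c.latency_ms for c in self._children)
        availability = 1.0
        for c in self._children:
            availability *= c.availability
        return {"latency_ms": latency, "availability": round(availability, 6)}

access = Element("access", 5, 0.999)
core = Element("core", 10, 0.9999)
service = Composite("vpn-segment", [access, core])
print(service.exposed_sla())   # only the derived, contract-level SLA
```

The point is that what a partner can see or do is fixed by the exposure, not by the implementation inside.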

This is a big step to solving our problems, but not a complete solution.  Most operators would want to have different visibility for their own network operations and those of a partner.  If I wholesale Service X to you, then your network ops people see the parameters I’ve exposed in the relationship, but I’d like my own to see more.  How would that work?

One possibility is that of a management viewer.  Every intent model, in my thinking, would expose a management port or API, and there’s no reason why it couldn’t expose multiple ones.  So, a given element intent model would have one set of general SLA and parametric variables, but you’d get them through a viewer API, based on your credentials.  Now partners would see only a subset of the full list, and you as the owner of the element could define what was exposed.
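A toy version of such a viewer might look like this (the role names and state variables are hypothetical, purely to show the credential-based subsetting):

```python
# Sketch of a "management viewer": one element, multiple views, selected by
# the caller's credentials. Variable and role names are invented.
FULL_STATE = {"latency_ms": 12, "availability": 0.9999,
              "cpu_load": 0.41, "queue_depth": 88}

# The element owner defines which variables each credential class may read.
VIEWS = {
    "owner-ops":   set(FULL_STATE),                 # everything
    "partner-ops": {"latency_ms", "availability"},  # contracted SLA only
}

def management_view(credential):
    """Return the subset of element state this credential may see."""
    allowed = VIEWS.get(credential, set())
    return {k: v for k, v in FULL_STATE.items() if k in allowed}

print(management_view("partner-ops"))  # SLA variables only
print(management_view("owner-ops"))    # full internal state
```

An unknown credential sees nothing at all, which is the safe default for federation.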

Another possibility is an alias element.  We have a “real” service element, which decomposes however it has to into a stream of stuff toward the resources.  We have another element of the same kind, which is a fork upward from this real element.  Internal services compose the real element, but you expose the alias element in federation, and this element contains all the stuff that creates and thus limits the management visibility and span of control.
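The alias approach can be sketched as a wrapper that clamps both visibility and span of control (again, the names and operations here are illustrative assumptions, not a defined API):

```python
# Hypothetical sketch of an "alias element": the real element decomposes
# toward resources; the alias is what gets exposed in federation.
class RealElement:
    def __init__(self):
        self.state = {"latency_ms": 12, "hop_count": 7}

    def restart(self):              # full span of control, internal use only
        return "restarted"

    def read(self):
        return dict(self.state)

class AliasElement:
    """Federation-facing fork of a real element."""
    VISIBLE = {"latency_ms"}        # visibility limit
                                    # span-of-control limit: no restart exposed
    def __init__(self, real):
        self._real = real

    def read(self):
        return {k: v for k, v in self._real.read().items() if k in self.VISIBLE}

    def restart(self):
        raise PermissionError("operation not exposed in federation")

real = RealElement()
alias = AliasElement(real)
print(alias.read())                 # partner sees latency only
```

Internally you compose the real element; externally you expose only the alias.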

The issues of visibility can be addressed, but there remain two other federation curve balls to catch.  One is “nesting” and the other is “foundation services”.

Nesting is the creation of a service framework within which other services are built.  A simple example is the provisioning of trunks or virtual wires, to be used by higher-layer Ethernet or IP networks.  You might think this is a non-issue, and in some ways it might be, but the problem that can arise goes back to management and control.  Virtual resources that create an underlayment have to be made visible in the higher layer, but more importantly the higher layer has to be constrained to use those resources.

Suppose we spawn a virtual wire, and we expect that wire to be used exclusively for a given service built at L2 or L3.  The wire is not a general resource, so we can't add it to a pool of resources.  The implication is that a "layer" or "slice" creates a private resource pool, but for that to be true we either have to run pool-specific resource allocation processes on that pool (they don't know about the rest of the world of resources, nor does the rest of the world see them) or we have to define resource classes and selection policies that guarantee exclusivity.  Since the latter would mix management jurisdictions, the former approach is best, and it's clearly federation.  We're going to need something like this for 5G slicing.  The "slice-domain" would define a series of private pools, and each of the "slice-inhabitants" would then be able to run processes to utilize them.
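In miniature, a slice-private pool might look like this (slice and wire names are invented for illustration; exclusivity follows simply from the partitions being disjoint):

```python
# Sketch: the slice domain carves disjoint private pools; each slice
# inhabitant allocates only from its own pool and never sees the rest.
class PrivatePool:
    def __init__(self, slice_id, resources):
        self.slice_id = slice_id
        self.free = set(resources)
        self.used = set()

    def allocate(self):
        """Allocate one resource from this slice's private pool only."""
        if not self.free:
            raise RuntimeError(f"slice {self.slice_id}: pool exhausted")
        resource = self.free.pop()
        self.used.add(resource)
        return resource

# The slice domain partitions the wires between two slices.
pools = {
    "slice-A": PrivatePool("slice-A", ["wire-1", "wire-2"]),
    "slice-B": PrivatePool("slice-B", ["wire-3"]),
}
wire = pools["slice-A"].allocate()
print(wire)     # always one of slice-A's own wires
```

Slice-B's pool is untouched by anything slice-A does, which is exactly the jurisdictional separation the text calls for.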

The key point in any layered service is address space management.  Any service that's being deployed knows its resources because that's what it's being deployed on.  However, that simple truth isn't always explicit; you almost never hear about address spaces in NFV, for example.  Address spaces, in short, are resources as much as wires are.  We have to be explicit in address management at every layer, so that we don't partition resources in a way that creates collisions if two layers eventually have to harmonize on a common address space, like the Internet.  We can assign RFC 1918 addresses, for example, to subnets without regard for duplication across federated domains because they were designed to be used that way, with NAT used to link them to a universal address space like the Internet.  We can't assign Internet-public addresses that way unless we're willing to say that our parallel IP layers or domains never connect with each other or with another domain—we'd risk collision in assignment.
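The collision check this implies is easy to make concrete with Python's standard ipaddress module (the operator names and address blocks below are illustrative):

```python
# Sketch: RFC 1918 subnets may duplicate across federated domains because
# NAT isolates them, but public assignments must never overlap.
import ipaddress

private = {
    "operator-A": ipaddress.ip_network("10.0.0.0/16"),   # behind NAT: fine
    "operator-B": ipaddress.ip_network("10.0.0.0/16"),   # duplicate: also fine
}

public = {
    "operator-A": ipaddress.ip_network("203.0.113.0/25"),
    "operator-B": ipaddress.ip_network("203.0.113.0/25"),  # a real collision
}

def collisions(assignments):
    """Return pairs of domains whose address assignments overlap."""
    items = list(assignments.items())
    return [(a, b) for i, (a, net_a) in enumerate(items)
            for b, net_b in items[i + 1:] if net_a.overlaps(net_b)]

print(collisions(public))   # the duplicated public block is flagged
```

The same overlap in the private map is harmless precisely because each domain sits behind its own NAT boundary.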

The other issue I noted was what I called "foundation services".  We have tended to think of NFV in particular as being made up of per-customer-and-service-instance hosted virtual functions.  Some functions are unlikely to be economical or even logical in that form.  IMS, meaning cellphone registration and billing management, is probably a service shared across all virtual network operators on a given infrastructure.  As IoT develops, there will be many services designed to provide information about the environment, drawn from sensors or analysis based on sensor data.  Making this information available in multiple address spaces would require a kind of "reverse NAT", where selected outside addresses are gated into a subnet for use by a service instance.  How does that get done?
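One way to picture the "reverse NAT" idea, purely as a hypothetical sketch (the service names, addresses, and gating API are all invented), is a per-tenant gate that maps a shared outside service into a local alias inside the tenant's private subnet:

```python
# Sketch: a shared foundation service lives at one outside address; each
# service instance's private subnet gates it in under a local alias.
FOUNDATION_SERVICES = {"ims-registration": "198.51.100.10"}

class SubnetGate:
    """Maps selected outside services into a private address space."""
    def __init__(self, subnet_prefix):
        self.prefix = subnet_prefix
        self._map = {}              # local alias -> outside address

    def gate_in(self, service, local_host):
        """Admit one foundation service under a local alias address."""
        outside = FOUNDATION_SERVICES[service]
        alias = f"{self.prefix}.{local_host}"
        self._map[alias] = outside
        return alias

    def resolve(self, alias):
        return self._map[alias]

# Two tenants gate the same shared service into their own address spaces.
tenant_a = SubnetGate("10.1.0")
tenant_b = SubnetGate("10.2.0")
alias_a = tenant_a.gate_in("ims-registration", 5)
alias_b = tenant_b.gate_in("ims-registration", 5)
print(alias_a, "->", tenant_a.resolve(alias_a))
```

Both tenants reach one shared instance through addresses local to their own subnets, which is the inverse of ordinary NAT's inside-to-outside mapping.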

How do we do any of this in SDN, in NFV?  Good question, but one we can’t really answer confidently today.  As we evolve to real deployments and in particular start dealing (as we must) with federation and slicing, we’re going to have to have the answers.