The Challenge of the Service Data Plane

Given that there’s an essential relationship between features and functions hosted…well…wherever and the connection network, it’s important to talk about the connection relationships involved, and in particular about how service composition and federation relate to the data plane. This is an issue that most cloud and network discussions have dodged, perhaps conveniently, and that has to come to an end.

When we talk about network services, the Internet dominates thinking if not always conversation. A global network with universal addressing and connectivity sure makes it easy to deal with connection features. Even when we’re not depending on universal connectivity, to the point of actively preventing it, we still tend to think of subsets of connectivity rather than truly different connection networks. IP VPNs create what’s effectively a closed subnetwork, but it’s still an IP network and still likely connected to the Internet, so its privacy has to be explicitly protected.

When I was involved in some of the early operator-supported initiatives on federated services, meaning services made up of feature components contributed by multiple providers, the issues with this simplistic model became clear. If you wanted to connect a VPN or a VLAN between operators in order to create a pan-provider service, you needed to understand just how the interconnection could be done while supporting the SLA promised at the service level, preserving the privacy of the operators involved, and getting traffic from one virtual network, of whatever sort, to another.

Going back five decades or so, we find that early “packet network” standards raised and addressed this issue. The X.25 packet-switching standard, for example, defined a user/network interface (UNI) and referenced the X.75 network-to-network interface (NNI). IP networking uses BGP to let networks advertise gateway points between them, which allows routes to be optimized across network boundaries. The emerging IETF work on IETF network slices presumes that there would be a similar gateway point where one implementation of the specification met and connected with another. The recent IETF work is interesting in that it addresses not only “connection” between networks, but also the exchange of information relating to how “slice traffic” is to be recognized and what SLA is to be applied.
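To make that concrete, here’s a minimal sketch, in Python, of the kind of information two administrations might have to exchange about a slice interconnect: an identifier, the header criteria used to recognize the slice’s traffic, and the SLA to be applied. The structure and field names are my own illustration, not taken from the IETF drafts.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TrafficMatch:
    """How a gateway recognizes traffic that belongs to the slice; unused
    fields are left as None for a given interconnect."""
    src_prefix: Optional[str] = None      # e.g. "10.1.0.0/16"
    dst_prefix: Optional[str] = None
    dscp: Optional[int] = None            # DiffServ code point
    vlan_id: Optional[int] = None
    mpls_label: Optional[int] = None

@dataclass
class SLA:
    """What the accepting network agrees to deliver for matched traffic."""
    max_latency_ms: float
    max_loss_pct: float
    committed_mbps: float

@dataclass
class InterconnectAgreement:
    """One slice/service interconnect at one NNI gateway point."""
    slice_id: str
    gateway: str                          # identifier of the NNI point
    sla: SLA
    matches: List[TrafficMatch] = field(default_factory=list)

# Hypothetical example: one slice handed off at one gateway.
agreement = InterconnectAgreement(
    slice_id="slice-42",
    gateway="nni-operator-a-to-b-1",
    sla=SLA(max_latency_ms=10.0, max_loss_pct=0.1, committed_mbps=500.0),
    matches=[TrafficMatch(dst_prefix="203.0.113.0/24", dscp=46)],
)
print(agreement)
```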

Contrasting the early and current work on federation is useful because a couple of threads run through all of it. One is that federation is based on a contract, an SLA, between the administrations involved. Another is that federation/interconnection takes place at specific points, points that are aware of the contract and can interconnect the traffic as needed, applying the SLA as they do.

The obvious central pillar in all of this, behind both these threads, is the identification of the traffic that a given service interconnect represents. In connection-oriented services like X.25, the identification is specific because it’s made at the time the connection is established. In connectionless services like IP, you have to provide header information that allows the service traffic to be recognized. However, there’s some current and potential future overlap to be considered.
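As a simple illustration of what “recognition” means in a connectionless network, the sketch below classifies a packet into a service purely from header information; if nothing matches, the packet is just ordinary traffic. The table entries and service names are hypothetical, using documentation address ranges.

```python
import ipaddress
from typing import Optional

# Hypothetical classification table: header criteria mapped to a service ID.
# In a connectionless network, this is how a gateway tells which interconnect
# (and which SLA) a packet belongs to.
SERVICE_TABLE = [
    # (dst_prefix, required_dscp_or_None, service_id)
    (ipaddress.ip_network("203.0.113.0/24"), 46, "slice-ultra-low-latency"),
    (ipaddress.ip_network("198.51.100.0/24"), None, "vpn-customer-17"),
]

def classify(dst_addr: str, dscp: int) -> Optional[str]:
    """Return the service a packet belongs to, or None for ordinary traffic."""
    addr = ipaddress.ip_address(dst_addr)
    for prefix, required_dscp, service_id in SERVICE_TABLE:
        if addr in prefix and (required_dscp is None or dscp == required_dscp):
            return service_id
    return None

print(classify("203.0.113.9", 46))   # -> "slice-ultra-low-latency"
print(classify("192.0.2.1", 0))      # -> None
```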

You could signal a service interconnect explicitly, even over an IP network. MPLS LSPs can be set up in a connection-oriented way, and that means that even within IP you could employ connection behavior to signal service interconnections. That’s helpful because explicit service connection setup facilitates the integration of network connections with deployment of service elements. In addition, recognizing an explicit link between a service set and an interconnection means that if adaptive changes to the network occur, you can reestablish the relationship between service and connection even if different NNI points are used. This is a “signaling” model.
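Here’s a toy sketch of that signaling model, with invented names throughout: the service explicitly requests its interconnect, the binding is recorded, and when the network adapts the same service can be re-bound to a different NNI point without losing the association.

```python
from dataclasses import dataclass

@dataclass
class ServiceBinding:
    """Records which NNI point currently carries a given service interconnect."""
    service_id: str
    nni_point: str

class SignalingAgent:
    """Sketch of the 'signaling' model: the service-to-connection link is
    explicit, so it can be re-signaled when the network reroutes."""
    def __init__(self):
        self.bindings = {}

    def setup(self, service_id: str, nni_point: str) -> ServiceBinding:
        # In a real network this would trigger something like an MPLS LSP
        # setup; here we just record the association.
        binding = ServiceBinding(service_id, nni_point)
        self.bindings[service_id] = binding
        return binding

    def on_topology_change(self, service_id: str, new_nni_point: str):
        # Because the link is explicit, the service keeps its SLA association
        # even though a different gateway point is now used.
        self.bindings[service_id].nni_point = new_nni_point

agent = SignalingAgent()
agent.setup("slice-ultra-low-latency", "nni-frankfurt-1")
agent.on_topology_change("slice-ultra-low-latency", "nni-frankfurt-2")
print(agent.bindings)
```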

You can also do this through “provisioning” rather than “signaling”. BGP policies control routes between AS “administrations”, and you can control internal routing via an interior gateway protocol (IS-IS or OSPF). If you combined that route control with the provisioning of a packet filter at the essential gateway points, you could apply service SLAs there and do the proper interconnecting. With SDN, of course, you could simply route explicitly, and the central SDN controller would then have to take responsibility for meeting the SLA in the routes it enforces.
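The provisioning alternative might look something like the sketch below, again with made-up names: the classification and SLA treatment are pushed to every relevant gateway ahead of time, rather than being signaled per service instance.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FilterRule:
    """A packet filter installed at a gateway, mapping matched traffic to a
    queue or policy that realizes the service SLA."""
    match_prefix: str
    dscp: int
    queue: str           # e.g. "low-latency", "best-effort"

class Gateway:
    def __init__(self, name: str):
        self.name = name
        self.rules: List[FilterRule] = []

    def provision(self, rule: FilterRule):
        # In practice this would be a NETCONF/CLI/controller push;
        # here it's just recorded locally.
        self.rules.append(rule)

def provision_service(gateways: List[Gateway], rule: FilterRule):
    """The 'provisioning' model: install the same classification and SLA
    treatment at every gateway the service's routes can traverse."""
    for gw in gateways:
        gw.provision(rule)

edge = [Gateway("as65001-gw1"), Gateway("as65001-gw2")]
provision_service(edge, FilterRule("203.0.113.0/24", 46, "low-latency"))
print([(g.name, g.rules) for g in edge])
```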

The obvious challenge here is that if there are multiple ways of doing something, you have to address the situation where two different mechanisms are in play and have to be united in a single service. We now have an expanded set of NNI functions, which means that gateway nodes would have to be characterized by the details of what they could interconnect. The more different models we accept, the harder it is to manage all the possible interconnections, and the less likely it is that all of them would be supported everywhere.
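One way to picture that is as a capability-matching exercise at each NNI: a pair of gateways can only federate a service using a mechanism both ends support. A trivial sketch, with invented capability labels:

```python
def compatible_mechanisms(gw_a: set, gw_b: set) -> set:
    """Return the interconnect mechanisms both gateway nodes support.
    An empty result means the two networks can't federate this service
    at this NNI point without some form of interworking."""
    return gw_a & gw_b

# Hypothetical capability sets advertised by two gateway nodes.
gw_a = {"mpls-lsp", "vlan-handoff", "sdn-explicit-route"}
gw_b = {"vlan-handoff", "bgp-policy"}

print(compatible_mechanisms(gw_a, gw_b))   # -> {'vlan-handoff'}
```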

This leads us to the issue of how service interconnects are specified in the first place. Today, most cloud and container deployments depend on a virtual network of some sort; in Kubernetes you can specify the virtual network to be used. However, Kubernetes really isn’t designed to “set up” connections and routes; in the great majority of cases the assignment of a user or pod to a virtual network creates the connectivity at the low level, and from there you could in theory expose selected low-level elements to a higher-level VPN or two.
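For example, in a cluster running a Multus-style meta-CNI, a pod selects a secondary virtual network with an annotation; the sketch below shows such a manifest as a Python dict, with the network name and image purely illustrative. Note what it doesn’t do: nothing here sets up routes, NNI handoffs, or an SLA; the network attachment simply exists.

```python
# A pod manifest expressed as a Python dict, attaching the pod to a named
# secondary network. Assumes a Multus-style meta-CNI and a separately created
# NetworkAttachmentDefinition called "vpn-a" (both names illustrative).
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "service-element-1",
        "annotations": {
            # Annotation key used by Multus to select the attachment.
            "k8s.v1.cni.cncf.io/networks": "vpn-a",
        },
    },
    "spec": {
        "containers": [
            {"name": "feature", "image": "example.org/feature:1.0"},
        ],
    },
}
```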

It seems to me that something like the IETF network slice concept will eventually demand that we be able to “orchestrate” network connections in parallel with the orchestration of the lifecycle of hosted features, functions, and applications. That would mean either providing the mechanism to control connectivity within orchestration tools like Kubernetes, or providing a higher-layer tool that would take responsibility for orchestrating the service or application, and would then call upon lower-level tools to do deployment and redeployment of hosted elements, and connection of those elements via the network.
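A minimal sketch of what that higher-layer tool might look like, assuming entirely hypothetical interfaces: the service orchestrator owns the lifecycle and delegates to a deployment orchestrator (for hosting) and a connection orchestrator (for the network), so the two stay in step.

```python
from abc import ABC, abstractmethod

class DeploymentOrchestrator(ABC):
    """Lower-level tool that places hosted elements (e.g. Kubernetes)."""
    @abstractmethod
    def deploy(self, element: str) -> str: ...

class ConnectionOrchestrator(ABC):
    """Lower-level tool that creates and changes network connections."""
    @abstractmethod
    def connect(self, endpoint_a: str, endpoint_b: str, sla: dict) -> None: ...

class ServiceOrchestrator:
    """Higher-layer tool: owns the service lifecycle and calls both
    lower-level orchestrators so hosting and connectivity stay in sync."""
    def __init__(self, deployer: DeploymentOrchestrator,
                 connector: ConnectionOrchestrator):
        self.deployer = deployer
        self.connector = connector

    def instantiate(self, elements: list, sla: dict):
        endpoints = [self.deployer.deploy(e) for e in elements]
        for a, b in zip(endpoints, endpoints[1:]):
            self.connector.connect(a, b, sla)
        return endpoints

# Stand-in implementations, just to show the flow end to end.
class K8sDeployer(DeploymentOrchestrator):
    def deploy(self, element: str) -> str:
        return f"endpoint-for-{element}"      # would call the cluster API

class NetworkConnector(ConnectionOrchestrator):
    def connect(self, endpoint_a: str, endpoint_b: str, sla: dict) -> None:
        print(f"connect {endpoint_a} <-> {endpoint_b} with SLA {sla}")

svc = ServiceOrchestrator(K8sDeployer(), NetworkConnector())
svc.instantiate(["feature-a", "feature-b"], {"max_latency_ms": 10})
```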

Control of connectivity within Kubernetes seems, on the surface, likely to require a major extension; it might prove a bit easier with DevOps tools like Ansible, but even there I think there’d be a lot of work to do. The question is whether there would be any interest in doing it, given that traditional application networking probably doesn’t require much more than what’s already supported. Absent a cloud-centric (or at least container-centric) approach, it seems we’d likely have to build a higher-layer model to unify awareness of server resources with awareness of network flows.

This may be the biggest challenge in accommodating telecom needs in cloud or virtualization software. Telecom services are connection services at their core, so it’s difficult to leave connectivity out of lifecycle automation and still fulfill the telecom mission. That may prove to be one of the most interesting and challenging issues that new initiatives like Nephio will have to address.