Does Virtualization, in or outside 5G, pose a Security/Compliance Risk?

In yesterday’s blog, I looked at the operations considerations that arise when you build a “network” by hosting feature/function instances on infrastructure.  I pointed out that the process creates two explicit layers, “functional” and “infrastructure”, and one implicit one, the layer representing the binding between the two that actually creates the service.  Having covered the operations impact there, today we’re looking at the security/governance impact.

Some aspects of 5G security are considered within the specifications themselves.  These relate to the “traditional” security issues of protecting the exposed interfaces identified, so they’re necessary but not necessarily sufficient.  The implementation of 5G, just like the implementation of any new network model that hosts features on a pool of resources instead of fixing them within devices, creates a new set of interfaces.  These are neither normally exposed nor considered in the 5G specs, so we have to dig into them by digging into the model of a virtual-feature-hosting infrastructure.

Just as we can visualize two “layers” of our virtual network, we can divide connectivity by those two layers.  Since we’re presuming that the functional layer represents the equivalent of the device network, the connectivity in that layer would be similar to (or identical with) what we’d see in a device network.  The infrastructure layer, on the other hand, represents the connectivity in the pool of resources we can commit to function hosting.

In the functional layer of the network, we have the service domain network, the connectivity that’s explicitly visible to service users.  Think data plane, control plane, and management plane for device networks and you get the idea.  If we have a “functional element” in our functional layer, and that element exposes interfaces that are visible within the service, then those connections are part of the service domain.  Part of this network supports connectivity to the management/lifecycle processes at the service level.

The infrastructure domain network is the network that connects the pool elements and their associated management/lifecycle tools.  Generally speaking, this network should be totally isolated from the service domain network, because the infrastructure layer is shared by all users and services, and so exposing it to an individual user/service would constitute a major security/governance breach.  This is our on-ramp to our discussion today.

The greatest barrier to intrusion in any network is access control.  If you can’t address the network or something on it, then you can’t attack it.  The corollary is that the more addressable things there are on a network, the more points of attack there are.  Because virtual hosting creates an infrastructure layer with its own network, that network creates attack vectors that have to be blocked.

The best starting point for security and governance is to assign a private IP address to anything in the infrastructure network.  These addresses (defined for IPv4 in RFC 1918 and for IPv6 in RFC 4193) are not routed on public IP networks like the Internet, so something that has a private IP address has to be gated (NATed) onto a public IP network, meaning the address has to be translated.  Anything that is not translated cannot be addressed from outside, which lets us separate the service domain and infrastructure domain networks.
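To make the addressing rule concrete, here is a minimal sketch of an audit that flags infrastructure elements whose addresses fall outside the RFC 1918 and RFC 4193 ranges.  The function names and the inventory format (a mapping of element name to address) are my own assumptions for illustration, not part of any 5G or orchestration tool.

```python
import ipaddress

# Private ranges: RFC 1918 (IPv4) and RFC 4193 unique local addresses (IPv6).
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("fc00::/7"),
]


def is_private_infra_address(address: str) -> bool:
    """Return True if the address lies in an RFC 1918 or RFC 4193 range."""
    addr = ipaddress.ip_address(address)
    return any(
        addr.version == net.version and addr in net for net in PRIVATE_RANGES
    )


def find_publicly_addressable(inventory: dict) -> list:
    """Return the names of elements (hypothetical inventory) that are
    NOT on a private address and so could be reached without translation."""
    return sorted(
        name
        for name, address in inventory.items()
        if not is_private_infra_address(address)
    )


if __name__ == "__main__":
    # Hypothetical infrastructure inventory: element name -> address.
    inventory = {
        "orchestrator": "10.0.0.5",
        "host-a": "203.0.113.7",
        "host-b": "fd12:3456:789a::1",
    }
    print(find_publicly_addressable(inventory))
```

A check like this only confirms the addressing plan; it says nothing about whether a NAT or gateway has been configured to expose a private address anyway.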

Of course, in theory almost anything can be made immune to hacking, and we know that theories have a way of not being applied.  In this case, there are two common loopholes.  First, anything that lives on both network domains can shunt things between them.  Second, if something can be planted on a private IP network, it can then relay or introduce traffic there.
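The first loophole is something you can at least look for.  As a rough sketch, assuming you keep an inventory of interfaces tagged by the domain network they attach to (the `Interface` record and the domain labels here are hypothetical), you could list every component that sits on both domains:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """One network attachment of a component (hypothetical record format)."""
    component: str
    domain: str  # assumed labels: "service" or "infrastructure"


def find_bridging_components(interfaces):
    """Return components that have interfaces on BOTH domain networks."""
    domains_seen = {}
    for iface in interfaces:
        domains_seen.setdefault(iface.component, set()).add(iface.domain)
    both = {"service", "infrastructure"}
    return sorted(name for name, seen in domains_seen.items() if both <= seen)


if __name__ == "__main__":
    inventory = [
        Interface("firewall-vnf", "service"),
        Interface("firewall-vnf", "infrastructure"),
        Interface("orchestrator", "infrastructure"),
        Interface("router-vnf", "service"),
    ]
    print(find_bridging_components(inventory))
```

Anything this turns up is a candidate for closer review, not proof of a breach.  The second loophole, planted software, is not something an inventory scan like this would catch.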

The obvious response to the security/governance issues related to virtual functions is meticulous control over the software at all levels.  Of paramount importance is the software that manages the deployment, the “orchestration” tools that might reside in a variety of places.  Anything that has the right to deploy onto infrastructure can deploy malware.

Even virtual functions themselves can be an issue.  At the very least, malicious functions could disrupt a service.  They might also be able to disrupt the host, attacking via holes in container or VM security.  But in some cases, the risk could be higher, because a virtual function could live in two worlds.  If the function represents a piece of data/control-plane functionality, it necessarily lives on the service domain network.  If it has to be addressed/manipulated by the infrastructure lifecycle components, it might also live on (be visible on) the infrastructure domain network.  As such, it could become a badly behaved portal between the two, one that could admit all manner of bad things.

There are some strategies that could limit this kind of risk.  The obvious one is that any component of lifecycle management must be treated as highly secure, with its source authenticated and its operations validated in a lab before deployment.  Having a guarantor to sue if things go wrong is always a good strategy too.  A less obvious but potentially valuable requirement is that any virtual function/feature element likely to be introduced with less certification should never be allowed to present an interface directly onto the infrastructure domain network.  If orchestration and lifecycle management need access to this sort of software, it has to be via an intermediary element, something as simple as an API proxy or gateway or as complex as an abstraction tool designed to harmonize all functions of a given class to a common set of interfaces.
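The intermediary idea can be illustrated with a very small gateway that exposes only a fixed, common set of lifecycle operations and refuses everything else.  This is a sketch under my own assumptions: the class name, the operation names, and the function ID are all hypothetical, and a real proxy would also need authentication and auditing.

```python
from typing import Callable, Dict

# Assumed common lifecycle interface for every function of a given class.
ALLOWED_OPERATIONS = {"start", "stop", "health"}


class FunctionGateway:
    """Intermediary between lifecycle management and virtual functions.

    Lifecycle tools call the gateway; the functions themselves never
    present an interface on the infrastructure domain network.
    """

    def __init__(self) -> None:
        self._handlers: Dict[str, Dict[str, Callable[[], str]]] = {}

    def register(self, function_id: str, handlers: Dict[str, Callable[[], str]]) -> None:
        """Register a function's handlers, rejecting any non-standard operation."""
        unknown = set(handlers) - ALLOWED_OPERATIONS
        if unknown:
            raise ValueError(f"non-standard operations rejected: {sorted(unknown)}")
        self._handlers[function_id] = dict(handlers)

    def invoke(self, function_id: str, operation: str) -> str:
        """Forward an allowed operation to the named function."""
        if operation not in ALLOWED_OPERATIONS:
            raise PermissionError(f"operation not permitted: {operation}")
        try:
            handler = self._handlers[function_id][operation]
        except KeyError:
            raise LookupError(f"{function_id} does not support {operation}") from None
        return handler()


if __name__ == "__main__":
    gateway = FunctionGateway()
    gateway.register("function-1", {"health": lambda: "ok"})
    print(gateway.invoke("function-1", "health"))
```

The point of the design is that the allowed set is fixed by the operator, not by the function being hosted, so a less-certified function cannot widen its own exposure.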

5G complicates the picture because of network slicing, which partitions an operator’s 5G network into what are supposed to be independent subnetworks.  These are essentially tenants of the main network, in the same sense that users of cloud computing are tenants of the cloud provider’s infrastructure.

I think the best approach with 5G is to assume that each network slice will have its own “infrastructure layer” and domain network, separate from the main 5G Core resource network.  If all slices were truly ships in the night, then application of the layer-domain principles above should provide a pathway to securing each slice.  However, it seems certain that some elements of 5G software will have to coordinate across, and thus themselves cross, boundaries.
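One simple sanity check on the “separate domain network per slice” assumption is to confirm that the address ranges assigned to each slice’s infrastructure network don’t overlap.  The sketch below assumes the slice networks share a routing or management context where an overlap would be ambiguous; the slice names and CIDR blocks are made up for illustration.

```python
import ipaddress
from itertools import combinations


def overlapping_slices(slice_networks):
    """Return pairs of slice names whose infrastructure ranges overlap.

    slice_networks maps a slice name to a CIDR string (hypothetical format).
    """
    nets = {
        name: ipaddress.ip_network(cidr) for name, cidr in slice_networks.items()
    }
    clashes = []
    for (name_a, net_a), (name_b, net_b) in combinations(sorted(nets.items()), 2):
        # Only compare ranges of the same IP version.
        if net_a.version == net_b.version and net_a.overlaps(net_b):
            clashes.append((name_a, name_b))
    return clashes


if __name__ == "__main__":
    plan = {
        "core": "10.0.0.0/16",
        "slice-a": "10.1.0.0/16",
        "slice-b": "10.1.128.0/17",
    }
    print(overlapping_slices(plan))
```

A clean result here is necessary but far from sufficient; non-overlapping ranges don’t by themselves prove that the slices are isolated from one another.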

There is no doubt that virtualization of network features, to create a secure framework, must include practices to standardize and tightly control interactions among the software components.  The “standardize” part is particularly important, because the more latitude is offered in the way that software is introduced into a virtual framework, the more difficult it will be to determine whether the software is behaving properly.  “No boundaries” isn’t a formula for successful child care, and it’s not for successful virtualization either.

Logically speaking, all of the slices in 5G networks should interact with infrastructure via a virtualization/abstraction layer.  Given that, I think we should assume that such an abstraction layer would be advisable in all cases.  I don’t mean just “virtual machines” or “containers” here, but the ecosystem of tools and APIs that represent resources as they’d be expected to be used by consuming processes within a functional layer.  Similarly, each slice should contain a functional layer whose interactions with the 5G core functions (registration for example) are abstracted and presented by an API that is visible to the network operator as well as the slice operator—a NAT gateway or proxy.

It may be that 5G, like so much in our modern world, risks being a victim of a form of virtualization pushed by standards bodies that aren’t able to do much beyond spelling the word.  We should have realized by now that when you promote virtual functionality in networking, you have to make sure you abstract everything that the virtual world touches.  A coupling to “real” stuff is inefficient, and in security and compliance terms it’s also very dangerous.