The whole topic of SDNs has been fuzzy almost from the outset. Hopefully by now you know my view: SDN is a transformation of networks whose value is largely created by the transition of IT to a cloud model. In short, SDN is the network side of the cloud revolution.
Because “cloud” means “resource pool” to many, it’s not surprising that there’s been a major SDN focus on the data center virtualization missions that Nicira, recently purchased by VMware, also addresses. But “SDN” isn’t “data center software defined networking”, so we need to think about its application in the wider world. That raises the issues of security and management, issues that aren’t typically on the table in SDN discussions.
Since nobody thinks that software will directly control networks (imagine every software application grabbing an API and setting up connections; it chills the heart of any professional), the presumption is that some software process will act on behalf of applications at large to do that. On the resource side of the SDN picture it’s clear that things like the DevOps stuff I’ve talked about and the Quantum network-as-a-service model of OpenStack can manage the aspects of network definition that relate to application provisioning. The question is who manages the user stuff, and the answer can only be the security system, because user-to-software connections are about rights.
An easy model for an SDN-based security framework is the “branch on-ramp” model. In this model the DevOps processes that provision applications in the cloud also extend a set of application-specific virtual networks (OpenFlow pipes, if you like) to each worker location. The workers, when they sign on to the network, are authenticated by the security system and assigned application rights. This assignment opens forwarding rules that link them with the networks for the applications they’re allowed to use. Non-authenticated workers have no rights, and workers have no access to application virtual networks their credentials don’t validate them for.
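To make that concrete, here’s a minimal Python sketch of the sign-on flow, with everything in it invented for illustration: the rights table, the network names, and the rule format are stand-ins for whatever the real security system and SDN controller would provide.

```python
# Hypothetical sketch of the "branch on-ramp" sign-on flow; the tables and
# rule format are invented stand-ins for a real security system and controller.

APP_NETWORKS = {"crm": "vnet-crm", "erp": "vnet-erp"}      # provisioned by DevOps
USER_RIGHTS = {"alice": {"crm"}, "bob": {"crm", "erp"}}    # held by the security system

def on_worker_sign_on(user, branch_port):
    """Translate a successful authentication into per-application forwarding rules."""
    rights = USER_RIGHTS.get(user, set())
    rules = []
    for app in sorted(rights):
        rules.append({
            "match": {"in_port": branch_port, "dst_network": APP_NETWORKS[app]},
            "action": "forward",   # open a pipe only toward permitted app networks
        })
    return rules                   # an unauthenticated worker gets no rules at all

print(on_worker_sign_on("alice", branch_port=7))    # one rule, toward vnet-crm
print(on_worker_sign_on("mallory", branch_port=9))  # no rights, empty list
```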
At a deeper level this process might look like worker-class networks meeting application networks. You can envision a series of virtual networks in each branch location, each network representing a class of worker there and linked to the appropriate applications. A worker is credentialed into the appropriate class network and thus receives application access. There are variations possible here, but the basic idea is that security processes control the “provisioning” of the access network in the same way that DevOps processes control the provisioning of the resource network.
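A rough sketch of that variation (again with invented class, network, and application names) shows how little is really needed: the worker is admitted to a class network, and the class network, not the individual worker, carries the application linkage.

```python
# Sketch of the worker-class variation; classes, networks, and apps are invented.

CLASS_NETWORKS = {"sales": "vnet-branch-sales", "finance": "vnet-branch-finance"}
CLASS_APPS = {"sales": {"crm"}, "finance": {"crm", "erp"}}

def credential_worker(user, worker_class):
    """Admission to a class network implies access to that class's applications."""
    return {
        "user": user,
        "branch_network": CLASS_NETWORKS[worker_class],
        "applications": CLASS_APPS[worker_class],
    }

print(credential_worker("alice", "finance"))
```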
The notion of two different virtualization models (branch on one side, data center plus application on the other) meeting through policies is clearly the end-game if you follow this approach. Workplace virtual networks are separated by worker classification and data center networks by application. The security systems then provide what’s essentially a firewall linkage between the two. If the mobile worker is strongly authenticated, this model would offer a high level of security.
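The security linkage itself could be as simple as a policy table consulted at the boundary; the sketch below assumes made-up network names and a boolean “strongly authenticated” flag.

```python
# Sketch of the policy "firewall" joining the two virtualization domains;
# the permitted (class network, application network) pairs are illustrative.

LINK_POLICY = {
    ("vnet-branch-sales", "vnet-crm"),
    ("vnet-branch-finance", "vnet-crm"),
    ("vnet-branch-finance", "vnet-erp"),
}

def permit_flow(src_class_net, dst_app_net, strongly_authenticated):
    """The security system decides whether a branch network may reach an app network."""
    return strongly_authenticated and (src_class_net, dst_app_net) in LINK_POLICY

print(permit_flow("vnet-branch-sales", "vnet-erp", True))    # False: not in policy
print(permit_flow("vnet-branch-finance", "vnet-erp", True))  # True
```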
What about management? In many cloud applications, the resource network will be contained within a data center and be subject to direct control. It’s still likely that there will have to be some management link to the SDN processes to set up and restore paths, of course, and this would be a link to what I’ve always called the “Topologizer”, the element of the higher-layer SDN processes that understands the mapping between virtual network services and real networks. That doesn’t answer the question of how things like fault correlation work, though. A network problem in our example earlier would likely be seen by workers as an application access failure, and to fix virtual networks you have to push through the abstraction to the real stuff.
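What I have in mind with the “Topologizer” can be sketched as little more than a two-way map between virtual services and the real elements that carry them; the class and element names below are hypothetical.

```python
# Sketch of the "Topologizer" role: a two-way map between virtual network
# services and the real resources that carry them. All names are invented.

class Topologizer:
    def __init__(self):
        self.virtual_to_real = {}   # virtual network id -> set of real elements

    def bind(self, virtual_net, real_elements):
        self.virtual_to_real[virtual_net] = set(real_elements)

    def real_path(self, virtual_net):
        """What actually carries this virtual service (for path setup/restoration)?"""
        return self.virtual_to_real.get(virtual_net, set())

    def impacted_services(self, real_element):
        """The inverse question that fault correlation needs answered."""
        return {v for v, reals in self.virtual_to_real.items() if real_element in reals}

topo = Topologizer()
topo.bind("vnet-crm", {"switch-3", "switch-9", "dc-router-1"})
print(topo.impacted_services("switch-9"))   # {'vnet-crm'}
```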
One way to mitigate management issues is to presume that the DevOps processes that provision SDNs also record their resource assignments, so that the dependency of the application virtual network structure on a specific set of network devices or services is recognized. This could be used to address failures reported both from above and from below. In the former case, the user’s application-specific problem is sent to a DevOps-driven task that finds the network dependencies and determines their current state. In the latter case, a change in network state is sent to all DevOps-created virtual networks to inform them that anything depending on the indicated resource is in a fault state.
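Here’s a hedged sketch of that dependency-recording idea, with invented tables standing in for what DevOps would actually record; it shows both directions, a user-reported problem drilled down to device state and a device fault pushed up to the virtual networks that depend on it.

```python
# Sketch of dependency recording and two-way fault handling; the dependency
# and state tables are hypothetical stand-ins for what DevOps would record.

DEPENDENCIES = {                     # captured when the virtual networks were provisioned
    "vnet-crm": {"switch-3", "dc-router-1"},
    "vnet-erp": {"switch-9", "dc-router-1"},
}
DEVICE_STATE = {"switch-3": "up", "switch-9": "down", "dc-router-1": "up"}

def diagnose_from_above(app_network):
    """A user-reported problem: check everything the application network depends on."""
    return {device: DEVICE_STATE[device] for device in DEPENDENCIES[app_network]}

def notify_from_below(failed_device):
    """A device-reported problem: flag every virtual network that depends on it."""
    return [net for net, deps in DEPENDENCIES.items() if failed_device in deps]

print(diagnose_from_above("vnet-erp"))   # reveals switch-9 is down
print(notify_from_below("dc-router-1"))  # both virtual networks are affected
```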
Another strategy, which could be used independently or in concert with the one above, is to create “services” for the virtual network layer not above the network but in it, by managing network assets directly (using traditional virtual service models, OpenFlow-based models, or both). If management processes in the network create these services, then the virtual networks can be managed by those processes, since the relationship between the virtual and the real must be known by the things that create the relationship in the first place.
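A small sketch of that option, with a purely hypothetical service factory, makes the point: because the creating process recorded the binding at creation time, it never has to rediscover which services a failed asset affects.

```python
# Sketch of building services inside the network; the factory and asset names
# are invented, and real creation would drive virtual-service or OpenFlow models.

class InNetworkServiceFactory:
    def __init__(self):
        self.bindings = {}   # service id -> real network assets used to build it

    def create_service(self, service_id, assets):
        self.bindings[service_id] = list(assets)   # the binding is known at creation
        return service_id

    def affected_services(self, failed_asset):
        """Because the factory built the services, it already knows what's affected."""
        return [s for s, assets in self.bindings.items() if failed_asset in assets]

factory = InNetworkServiceFactory()
factory.create_service("branch-12-to-crm", ["edge-7", "core-2"])
print(factory.affected_services("core-2"))   # ['branch-12-to-crm']
```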
Yet a third option is to forget traditional network management completely and focus on service management. Assure the outcome, in other words. If something breaks you’d presumably get a direct hardware report to act on for repair, but for analysis of trends and congestion or other “subjective” failure issues you rely on either telemetry at the service level or a report by the user. Then you go to the DevOps notion of fault correlation to determine what you need to do.
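A service-level check of that kind might look like the sketch below, where the targets and measurements are made up; a violation is what would trigger the DevOps-style fault correlation.

```python
# Sketch of service-level (outcome) assurance; targets and measurements are
# made up, and a violation is what would kick off DevOps-style fault correlation.

SERVICE_TARGETS = {"vnet-crm": {"max_latency_ms": 50, "min_availability": 0.999}}

def check_service(service_id, measured_latency_ms, measured_availability):
    """Compare service-level telemetry (or a user report) against the target."""
    target = SERVICE_TARGETS[service_id]
    violations = []
    if measured_latency_ms > target["max_latency_ms"]:
        violations.append("latency")
    if measured_availability < target["min_availability"]:
        violations.append("availability")
    return violations

print(check_service("vnet-crm", measured_latency_ms=72, measured_availability=0.9995))
# ['latency'] -> hand off to fault correlation to find out what to do
```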
I’m not arguing that these approaches are the only way to link security to SDNs or to manage SDNs, but I do think that unless the kind of issues I’ve raised here are addressed, we’ll be under-shooting SDN benefits and risking down-the-line operationalization issues as well. And all of this should demonstrate that the key to SDN isn’t in OpenFlow or the controller, which only implement SDN policies that are created above, but in the stuff that IS above, the stuff that we’re not paying enough attention to at this critical period in SDN evolution.