Policies, Services, Zones, and Black Boxes

I blogged earlier about policy management and its role, but since then I’ve had a number of interesting discussions with operators and users that bring more light to the topic.  Some of the most interesting relate to the relationship between how you define a policy and what the specific utility of policy management would be.  To no one’s surprise I’m sure, there’s more than one perspective out there.  To the surprise of some, many of the open questions on policy management are replicated in the worlds of SDN and NFV, even if you consider the latter two in a no-policy-management implementation context.

According to the classic definition, a “policy” is a statement of intent, and so you’d probably be accurate if you thought of policies as statements of goals/objectives rather than of methods.  “I want to drive to LA” might be viewed as a high-level policy, for example, as something that constrains a function we can call “Route”.  It’s not prescriptive on how that goal might be realized.

Shifting gears to networking, we’re recognizing that there are many cases where a collection of technology creates a kind of “zone”, something that offers “service” to outsiders at edge interfaces and imposes some set of cooperative behaviors within to fulfill the needs of its services.  IP networks have zones and gateways, for example.  In SDN in OpenFlow form, or in any distributed-policy model, you could envision this zone process creating something like a classic organization chart with “atomic” zones of actual control at the bottom and coordination and interconnection zones building up toward the master-level control.

Given that where a service transcends multiple zones there would have to be some organized coordination end to end, it’s certainly convenient to visualize this as a policy management exchange.  In fact, that visualization is also useful for NFV.  You could see my route example as a service created across multiple providers or metro areas, where a high-level process picks the general path among the next-level elements, and so forth down the line.

The concept of my route to LA can be viewed as a policy-driven process, but it can also be viewed as what I’ll call a service-driven process.  If each of the metro areas or whatever’s just below my highest “route” awareness offers a “service” of getting me between edge points, you can stitch my LA path by binding those edge points.  The metro processes are responsible for getting me between edges in some mysterious (to the higher level) way.

Is there a difference between policy-driven and service-driven, then?  Well, this is where I think things can become murky.  IMHO, service-driven models require that each element at every layer in the hierarchy advertise a “service” between endpoints.  The highest layer has a conception of “route” and that conception is pushed down by successively decomposing the abstract elements we’re routing through.  Get me out of NJ?  That involves the “NJ” element.  That element might then have “South” or “North” Jersey sub-elements.  The primary constraint, always applied by the higher level to the lower when a service is decomposed, is the notion of endpoint and the notion of service, meaning things like driving speed, road type, etc.
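To make the decomposition idea concrete, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the `Element` class, the `decompose` function, the boundary-naming scheme, and the particular elements chosen.  It only shows the pattern, where each level hands its children nothing but endpoints and service constraints.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A hypothetical black-box element that offers a service between edges."""
    name: str
    children: list = field(default_factory=list)


def decompose(element, entry, exit_, constraints):
    """Return the leaf-level legs as (name, entry, exit, constraints) tuples.

    Each level passes only endpoints and constraints to its children; it
    never looks inside a child to see how the child does its job.
    """
    if not element.children:
        return [(element.name, entry, exit_, constraints)]
    legs = []
    current = entry
    for i, child in enumerate(element.children):
        is_last = i == len(element.children) - 1
        # Name the hand-off point between this child and the next one.
        nxt = exit_ if is_last else f"{child.name}/{element.children[i + 1].name}"
        legs.extend(decompose(child, current, nxt, constraints))
        current = nxt
    return legs


# Hypothetical hierarchy: the top-level route sees only "NJ" and what follows.
trip = Element("Route", [
    Element("NJ", [Element("North Jersey"), Element("South Jersey")]),
    Element("West of NJ"),
])

for leg in decompose(trip, "home", "LA", {"road_type": "highway"}):
    print(leg)
```

The top-level call never mentions North or South Jersey; those names surface only when the “NJ” element decomposes itself.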

If you look at policy models, they could work in a similar way, but what is missing from most is an explicit notion of “route” or “service” because most policy models are meant to overlay a forwarding-table routing process.  The policies relate most often to handling rules that would make up an SLA.  This makes sense given the definition of “policy” that I cited; it’s not reflective of specifics but rather of goals.  “Traffic of this type should be afforded priority” is a good policy; “traffic of this type should go via this route to this destination” isn’t what most policies are about.  If you applied policies in this sense to my trip to LA, what you get is a rule something like “passage west with a given priority should take this route over that one”.  Policy models are most often applied to systems that have internal routing capability, meaning that the connectivity service is in place.  You don’t need to tell a router to route.  Service-based models establish the service, which is why SDN is more service-based.

If you look at the service-driven and policy-driven approaches, you can see that both have a common element, which is the notion of a domain abstraction as a black box.  We also saw something like this in the old days of SNA and ATM routing, where a domain received an entry on the route stack that described an “abstract service” meaning a destination and service quality.  The domain edge popped that off, calculated its own specific route, and pushed that onto the stack instead.  The higher level in this example, and in our service-driven and policy-driven approaches, doesn’t know the details of how to get through the next-lower-level structure and it doesn’t even know that structures below that next level exist.  Why?  Because it won’t scale otherwise.
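The pop-and-push behavior at a domain edge can be sketched in a few lines of Python.  This is not the actual SNA or ATM encoding; the `AbstractService` type, the `expand_at_edge` function, and the node names are all hypothetical, chosen only to show how the abstract entry is swapped for a domain-private route.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractService:
    """What a higher level asks of a domain: where to exit and how well."""
    destination: str  # exit edge of the domain, as named by the higher level
    quality: str      # service quality label, e.g. "fast"


def expand_at_edge(route_stack, domain_routes):
    """Pop the abstract entry on top of the stack and push the domain's hops.

    The top of the stack is the end of the list.  `domain_routes` is the
    domain's private table; the higher level never sees it.
    """
    entry = route_stack.pop()
    if not isinstance(entry, AbstractService):
        raise TypeError("top of stack is not an abstract service entry")
    hops = domain_routes[(entry.destination, entry.quality)]
    # Push in reverse so the first hop to take ends up on top of the stack.
    route_stack.extend(reversed(hops))
    return route_stack


# Entries for later domains sit lower in the stack and are left untouched.
stack = [AbstractService("LA", "fast"), AbstractService("west-edge", "fast")]
private_routes = {("west-edge", "fast"): ["node-1", "node-2", "west-edge"]}
expand_at_edge(stack, private_routes)
print(stack)
```

The entry for the later domain stays abstract until its own edge expands it, which is what keeps the detail out of the higher level.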

SDN control is a good example of a need for hierarchy.  You obviously cannot scale a single central SDN controller to handle all of the Internet at a detail level.  You could subdivide the Internet into smaller zones and then collect those control zones into second-layer superzones, and so forth.  If you decided that you could manage, say, a hundred “route abstractions” per controller (or whatever number you like), you would group devices till you get roughly a hundred, then group those centurion zones upward in the same quantity.  In two levels you have ten thousand total devices, in three a million, in four a hundred million, and you’re at ten billion in five levels.
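The growth here is just the per-controller fan-out raised to the number of levels.  A minimal sketch of that arithmetic, assuming a uniform fan-out at every level (the function names are mine, for illustration only):

```python
def devices_covered(fan_out: int, levels: int) -> int:
    """Devices reachable when every controller manages `fan_out` abstractions."""
    return fan_out ** levels


def levels_needed(fan_out: int, devices: int) -> int:
    """Smallest number of levels whose coverage reaches `devices`."""
    levels = 0
    capacity = 1
    while capacity < devices:
        capacity *= fan_out
        levels += 1
    return levels


# Coverage of a hierarchy with a hundred abstractions per controller.
for depth in range(1, 6):
    print(depth, devices_covered(100, depth))
```

A smaller fan-out simply costs more levels; the point is that depth grows only logarithmically with the number of devices.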

In NFV you can illustrate another value of hierarchy.  Suppose you have a “service” defined as a collection of four functions.  Each of those functions can be deployed anywhere the service is offered, and the infrastructure would probably not be homogeneous through all those locations.  So imagine instead a hierarchy of functions at the service level, linking down to a hierarchy at the location level.  A service “function” binds to a location-specific implementation of that “function” whose further decomposition depends on the equipment available or the policies in place.  Or both.
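One way to picture that binding is a lookup from an abstract function name to whatever a given location uses to realize it.  The sketch below is an assumption from top to bottom: the four function names, the location names, and the implementation labels are invented, and a real NFV model would carry far more than a string per implementation.

```python
# Service level: the service names four abstract functions and nothing more.
SERVICE_FUNCTIONS = ["firewall", "nat", "dpi", "load_balancer"]

# Location level: each location maps a function to its own implementation.
LOCATION_CATALOG = {
    "metro-east": {
        "firewall": "vm:fw-image",
        "nat": "vm:nat-image",
        "dpi": "appliance:dpi-box",
        "load_balancer": "vm:lb-image",
    },
    "metro-west": {
        "firewall": "appliance:fw-box",
        "nat": "container:nat",
        "dpi": "container:dpi",
        "load_balancer": "container:lb",
    },
}


def bind(service_functions, location, catalog):
    """Bind each abstract function to the location-specific implementation."""
    implementations = catalog[location]
    missing = [f for f in service_functions if f not in implementations]
    if missing:
        raise LookupError(f"{location} cannot host: {missing}")
    return {f: implementations[f] for f in service_functions}


for location in LOCATION_CATALOG:
    print(location, bind(SERVICE_FUNCTIONS, location, LOCATION_CATALOG))
```

The service definition is identical for both locations; only the lower layer of the hierarchy differs, and each implementation could be decomposed again without the service level knowing.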

Abstraction is another important attribute of zone-based control.  A black box is opaque from the outside, so a zone should not be advertising its internal elements and asking higher-level functions to control them.  If you allow high-level elements to exercise control over what happens many layers down, you have to communicate topology and status from the bottom to the top, for every possible bottom.  Thus, we can say that policy management, SDN control, and NFV object modeling all have to support the notion of a hierarchy of abstract objects that insulate the higher-level structures from the details of the lower layers.  I think this principle is so important that it’s a litmus test for effective implementation of any of those three technologies.

You might wonder how this notion relates to my “structural versus functional” modeling, a prior topic of mine on NFV implementation.  The answer is that it’s orthogonal to that point.  In both structural and functional models I think you need a hierarchy for the same reason that componentization of software doesn’t stop when you create a process object and an I/O object.  Service agility, meaning fast and easy creation of services, depends on being able to assemble opaque objects to do what you want.  You can’t impose a requirement to link low-level details or you’ve created a system integration task for every service you create.  That’s true whether we’re talking about policies, SDN, or NFV.

Policy-based and service-based control zones seem to me to be inevitable, and which of them is best will depend on whether you have intrinsic connectivity support within a zone.  If you do then you need only constrain its use.  If you don’t then you’ll have to explicitly establish routes.  But to me this isn’t the important feature; black boxes are opaque so you don’t know how somebody did their job inside one.  What is important is that you stay with the notion of manageable control hierarchies and ensure that your abstraction, your black-box creation, is effective in limiting how much information has to be promulgated to higher levels.  If you don’t do that you build a strategy that won’t scale and it won’t matter what technology you’re using.