The FCC’s Neutrality Order: It Could be Worse

The FCC is converging on a net neutrality order, bringing “clarity” to the broadband regulation process.  In fact, the process has been murky since the Telecom Act, nothing in the current order (as we know it from the FCC’s released “fact sheet”) un-murks much, and in any event this is something only the courts can hope to make final and clear.  Everyone will hate the new order, of course, and it doesn’t do what it might have done, but it’s not as bad as it might have been.

The high-level position the FCC is taking shouldn’t surprise anyone who knows what the FCC is and what it can and can’t do.  The question that’s been raised time and time again, most recently and decisively by the DC Court of Appeals on hearing the FCC’s previous neutrality order, is jurisdiction.  The position the Court took was that the FCC, having previously declared that the Internet was not a common-carrier service, cannot now issue regulations on it that are permitted only for services of that type.  And that is that; the FCC has no jurisdiction under its own rule.  Fortunately for the FCC, it is not bound by its own precedents, so it can reverse a prior ruling in a stroke, which is what Wheeler is going to propose.  Broadband, wireline and wireless alike, would be reclassified under Title II.

What this does not mean (as TV commercials and some comments have suggested it does) is higher fees and taxes imposed on broadband providers and paid by consumers.  The FCC has said all along that if it made the Title II decision it would then exercise its statutory authority to “forbear” from applying many of the Title II rules to broadband.  This would include fees and taxes but also requirements for wholesaling.

The combination of these rules generates as harmless a result as anything is in this political age.  Yes, the FCC can change its mind.  Yes, the courts could rule this is invalid too (I doubt they will).  Doomsayers notwithstanding, the legal foundation of this order is as good as we can get without going back to Congress, a fate I’d be reluctant to wish on the industry.  The biggest real issue neutrality had was the FCC’s jurisdiction to address violations, and the order fixes that without generating much unexpected disruption for operators.

So what is the FCC going to do with its authority?  In many cases the answer is at least semi-harmless.  It boils down to “no fast lanes” and “equal regulation for all broadband”.  The former bars paid prioritization of traffic.  The latter is the biggest change; the FCC is applying all its rules to wireline and wireless alike.  There will be gnashing of teeth on this, of course, but the truth is that everyone knew we were heading in that direction.  Deeper in the fact sheet the FCC released are some interesting points, though.  These could have significant consequences depending on just what the FCC does with the authority the order would give it.

At the top of my list is the fact that the FCC would hear complaints on ISP interconnect/peering.  This is consistent with the fact that the FCC has jurisdiction over common-carrier peering and tariffs, so at one level it’s not a surprise.  The question is how this new authority would be used, given that we’ve just had a flap between ISPs and content providers like Netflix, resulting in the latter paying a carriage charge to some access providers.

Video traffic is disruptive to network performance, because it demands fairly high capacity and at least stable if not low latency.  It swamps everything else and so if you let it run unfettered through the network it can congest things enough for other services to be impacted.  The FCC’s new order permits throttling traffic for management of stability and to prevent undue impact on other services.  If the FCC were to say that Netflix doesn’t have to pay for video carriage, operators could either suck it up and invest in capacity, further lowering their profit on broadband, or let things congest and try in some way to manage the consequences.

The FCC would, under the new order, have the power to impose settlement—rational settlement—on the Internet.  That could be the biggest advance in the Internet’s history but for one thing, that politically inspired and frankly stupid ban on paid prioritization.  With a combination of settlement and paid QoS, the Internet could become a limitless resource.  With paid prioritization off the table completely, we might see settlement but it won’t do much more than preserve the status quo, under which operators are already seeing critical revenue/price convergence.

I’m not sure whether the details of the order will shed light on this point.  In the past, the FCC has looked askance at provider-pays prioritization, but not at plans where the consumer pays.  The fact sheet doesn’t seem to make any distinction but the order might read differently.  We’ll have to see when it’s released.

The other interesting point in the fact sheet is that the FCC intends to ensure that special IP non-Internet services (including VoIP and IPTV) don’t undermine the neutrality rules, the presumed risk being that operators would move services into that category to avoid regulation.  This sort of thing, if it went far enough, could create a kind of growing “Undernet” that would absorb applications and services by offering things like paid prioritization.

The devil again will be in the enforcement details.  There’s a fine line between IPTV and streaming services on the Undernet.  The FCC could lean too far toward regulation and make IPTV seem risky, or too far away and encourage bypass of the Internet.  Will the order make its bias clear, or will we have to wait until somebody is ordered to do, or not do, something?

Waiting is the name of the game, regulation-wise, of course.  This order will be appealed for sure.  Some in Congress will propose to preempt it, reverse it, or toss it out.  We probably will have years of “uncertainty”, but the good news is that we’ll probably know, shortly after the order comes out, whether there is a reasonable risk that any of the reversing/undoing will succeed.

I believe that the order as summarized in the fact sheet is better than we’d likely get if Congress intervened.  The original Republican stance of a very light touch has been amended of late to include support for “no fast lane”, and that creates the classical problem of a part-regulated market.  A light touch is neither clearly hands-off nor a real push, and all the current Republican position really seems to do is give the FCC authority to do its regulating on neutrality without Title II applying to ISPs (in fact, barring it).  I think that the details of that would confound Congress, as telecom regulation always has.  The FCC was created to apply expert judgment to the details of regulating a technical industry, and we need to let it do that.

The thing that’s clear is that “no fast lanes” has become a flag-waving slogan for everyone, and fast lanes might have been the best thing ever for the Internet.  No matter what consumers think/want or regulators do, you can’t legislate or regulate an unprofitable operation for very long, and we’ve closed down the clearest avenue to profit for ISPs.  Not only that, we’ve closed down the only avenue that would have made generating bits a better business.  Cisco and others should be wearing black armbands on this one because it decisively shifts the focus of networking out of L2/L3 and either upward into the cloud or downward into optical transport on the cheap.

The administration should have stayed out of this; by making a statement on neutrality the President made it a political issue, and in today’s world that’s the same as saying we’ve foreclosed rational thought.  We can only hope the FCC will enforce its policies with less political bending and weaving than it’s exhibited in setting the policies in the first place.

The Role of “VNFaaS”

The cloud and NFV have a lot in common.  Most NFV is expected to be hosted in the cloud, and many of the elements of NFV seem very “cloud-like”.  These obvious similarities have been explored extensively, so I’m not going to bother with them.  Are there any other cloud/NFV parallels, perhaps a very important one?  Could be.

NFV is all about services, and the cloud is all about “as-a-service”, but which service model?  Cloud computing in IaaS form is hosted virtualization, and so despite the hype it’s hardly revolutionary.  What makes the cloud a revolution in a multi-dimensional way is SaaS, software-as-a-service.  SaaS displaces more costs than IaaS and requires less technical skill on the part of the adopter.  With IaaS alone, it will be hard to get the cloud to 9% of IT spending, while with SaaS and nothing more you could get to 24%.  With “platform services” that create cloud-specific developer frameworks, you could go a lot higher.

NFV is a form of the cloud.  It’s fair to say that current conceptions of function hosting justified by capex reductions are the NFV equivalent of IaaS, perhaps doomed to the same low level of penetration of provider infrastructure spending.  It’s fair to ask whether there’s any role for SaaS-like behavior in NFV, perhaps Virtual-Network-Function-as-a-Service, or VNFaaS.

In traditional NFV terms we create services by a very IaaS-like process.  Certainly for some services that’s a reasonable approach.  Could we create services by assembling “web services” or SaaS APIs?  If a set of VNFs can be composed, why couldn’t we compose a web service that offered the same functionality?  We have content and web and email servers that support a bunch of independent users, so it’s logical to assume that we could create web services to support multiple VNF-like experiences too.

At the high level, it’s clear that VNFaaS elements would probably have to be multi-tenant, which means that the per-tenant traffic load would have to be limited.  A consumer-level firewall might be enough to tax the concept, so what we’d be talking about is representing services of a more transactional nature, the sort of thing we already deliver through RESTful APIs.  We’d have to be able to separate users through means other than virtualization, of course, but that’s true of web and mail servers today and it’s done successfully.  So we can say that for at least a range of functions, VNFaaS would be practical.
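To make that concrete, here’s a minimal sketch of what a transactional, multi-tenant VNFaaS element might look like, assuming it’s delivered through a RESTful API.  The service name, tenant identifiers, and per-tenant policy table are all hypothetical; the point is that tenant separation is done by an identifier carried in each request rather than by per-tenant virtualization.

```python
# Hypothetical sketch of a multi-tenant, transactional VNFaaS element.
# Tenants are separated by an identifier in each request, not by giving each
# tenant its own VM; the shared process holds per-tenant policy.
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

TENANT_POLICIES = {                     # per-tenant configuration in the shared service
    "tenant-a": {"blocked_ports": [23, 3389]},
    "tenant-b": {"blocked_ports": [23]},
}

class VnfaasPolicyCheck(BaseHTTPRequestHandler):
    """Answers 'would this flow be allowed?' as a single transaction."""
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        policy = TENANT_POLICIES.get(body.get("tenant"), {"blocked_ports": []})
        allowed = body.get("dest_port") not in policy["blocked_ports"]
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"allowed": allowed}).encode())

if __name__ == "__main__":
    HTTPServer(("", 8080), VnfaasPolicyCheck).serve_forever()
```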

From a service model creation perspective, I’d argue that VNFaaS argues strongly for my often-touted notion of functional orchestration.  A VNFaaS firewall is a “Firewall”, and so is one based on a dedicated VNF or on a real box.  We decompose the functional abstraction differently for each of these implementation choices.  So service modeling requirements for VNFaaS aren’t really new or different; the concept just validates function/structure separation as a requirement (one that sadly isn’t often recognized).
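As a sketch of that function/structure separation (the model layout and names here are my own, not drawn from any NFV specification), the functional object “Firewall” stays constant in the service model while its decomposition varies with the implementation chosen at a given site.

```python
# Hypothetical sketch of functional orchestration: one functional abstraction,
# several interchangeable structural decompositions chosen at deployment time.
FIREWALL_DECOMPOSITIONS = {
    "physical": {"deploy": "configure the existing CPE box",      "dedicated": True},
    "vnf":      {"deploy": "spin up a VM and load the VNF image", "dedicated": True},
    "vnfaas":   {"deploy": "bind the tenant to a shared service", "dedicated": False},
}

def decompose(function_name, implementation, site):
    """Resolve the functional object 'Firewall' into a structural recipe."""
    if function_name != "Firewall":
        raise ValueError("this sketch only models 'Firewall'")
    recipe = FIREWALL_DECOMPOSITIONS[implementation]
    return {"function": function_name, "site": site, **recipe}

# The service model references only "Firewall"; the choice below can differ per site.
print(decompose("Firewall", "vnfaas", site="metro-east"))
```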

Managing a VNFaaS element would be something like managing any web service, meaning that you’d either have to provide an “out-of-band” management interface that let you ask a system “What’s the status of VNFaaS-Firewall?” or send the web service for the element a management query as a transaction.  This, IMHO, argues in favor of another of my favorite concepts, “derived operations”, where management views are synthesized by running a query against a big-data repository where VNFaaS elements and other resources have their status stored.  That way, the fact that a service component had to be managed in what would be, in hardware-device terms, a peculiar way wouldn’t matter.
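A rough sketch of that “derived operations” idea follows; the repository layout and field names are assumptions.  Status records written by resources and by shared VNFaaS elements land in one repository, and a per-tenant management view is synthesized by query rather than by polling a device-like agent.

```python
# Hypothetical sketch of "derived operations": a management view is a query
# against a shared status repository, not a live poll of a device.
STATUS_REPOSITORY = [
    # records written asynchronously by elements and the resources hosting them
    {"object": "VNFaaS-Firewall", "tenant": "tenant-a", "state": "up"},
    {"object": "VNFaaS-Firewall", "tenant": "tenant-b", "state": "degraded"},
    {"object": "host-vm-1138",    "tenant": None,       "state": "up"},
]

def derived_view(object_name, tenant):
    """Synthesize a per-tenant view of a multi-tenant element from stored status."""
    records = [r for r in STATUS_REPOSITORY
               if r["object"] == object_name and r["tenant"] in (tenant, None)]
    status = "up" if all(r["state"] == "up" for r in records) else "degraded"
    return {"object": object_name, "tenant": tenant, "status": status}

print(derived_view("VNFaaS-Firewall", "tenant-b"))   # degraded for that tenant only
```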

What we can say here is that VNFaaS could work technically.  However, could it add value?  Remember, SaaS is a kind of populist concept; the masses rise up and do their own applications, defying the tyranny of internal IT.  Don’t tread on me.  It’s hard to see how NFV composition becomes pap for the masses, even if we define “masses” to mean only enterprises with IT staffs.  The fact is that most network services are going to be made up of elements in the data plane, which means that the web-service, multi-tenant-application model may not be ideal.  There are other applications, though, where the concept of VNFaaS could make sense.

A lot of things in network service are transactional in nature and not continuous flows.  DNS comes to mind.  IMS offers another example of a transactional service set, and it also demonstrates that it’s probably necessary to be able to model VNFaaS elements if only to allow something like IMS/HSS to be represented as an element in other “services”.  You can’t deploy a DNS or an IMS every time somebody sets up a service or makes a call.  Content delivery is a mixture of flows and transactions.  And it’s these examples that just might demonstrate where VNFaaS could be heading.
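A small sketch of what modeling those shared elements might look like (the structure and names are my assumptions): a multi-tenant element like DNS or IMS is referenced by many service models but instantiated only once, so the model distinguishes “bind to an existing element” from “deploy a new one”.

```python
# Hypothetical sketch: shared, multi-tenant elements (DNS, IMS) are referenced by
# per-customer services rather than redeployed every time a service is created.
SHARED_ELEMENTS = {"DNS": "dns-cluster-east", "IMS": "ims-core-1"}   # already running

def build_service(customer, functions):
    plan = []
    for f in functions:
        if f in SHARED_ELEMENTS:
            plan.append({"function": f, "action": "bind",   "target": SHARED_ELEMENTS[f]})
        else:
            plan.append({"function": f, "action": "deploy", "target": f"{customer}-{f.lower()}"})
    return plan

print(build_service("cust-42", ["Firewall", "DNS", "IMS"]))
```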

“Services” today are data-path-centric because they’re persistent relationships between IT sites or users.  If we presumed that mobile users gradually moved us from being facilities-and-plans-centric to being context-and-event-centric, we could presume that a “service” would be less about data and more about answers, decisions.  A zillion exchanges make a data path, but one exchange might be a transaction.  That means that as we move toward mobile/behavioral services, contextual services, we may be moving toward VNFaaS, to multi-tenant elements represented by objects but deployed for long-term use.

Mobile services are less provisioned than event-orchestrated.  The focus of services shifts from the service model to a contextual model representing the user.  We coerce services by channeling events based on context, drawing from an inventory of stuff that looks a lot like VNFaaS.  We build “networks” not to support our exchanges but to support this transfer of context and events.

If this is true, and it’s hard for me to see how it couldn’t be, then we’re heading away from fixed data paths and service relationships and toward extemporaneous decision-support services.  That is a lot more dynamic than anything we have now, which would mean that the notion of service agility and the management of agile, dynamic, multi-tenant processes is going to be more important than the management of data paths.  VNFs deployed in single-tenant service relationships have a lot of connections because there are a lot of them.  VNFaaS elements, as multi-tenant service points, have to talk to other process centers, but only edge/agent processes have to talk to humans, which shrinks the number of connections, in a “service” sense, considerably.  The network of the future is more hosting and less bits, not just because bits are less profitable but because we’re after decisions and contextual event exchanges—transactions.

This starts to look more and more like a convergence of “network services” and “cloud services”.  Could it be that VNFaaS and SaaS have a common role to play because NFV and the cloud are converging and making them two sides of the same coin?  I think that’s the really profound truth of our time, NFV-wise.  NFV is an accommodation of cloud computing to two things—flows of information and increasing levels of dynamism.  In our mobile future we may see both services and applications become transactional and dynamic, and we may see “flows” developing out of aggregated relationships among multi-tenant service/application components.  It may be inevitable that whatever NFV does for services, it does for the cloud as well.

Is NFV’s Virtual Network Function Manager the Wrong Approach?

I’ve noted before that the weak link in NFV is operations, or management if you prefer.  A big part of the problem, IMHO, is the need for the ISG to contain its efforts to meet its schedule for completing its Phase One work.  Another issue is the fact that the body didn’t approach NFV from the top down.  Management is a problem because so much of NFV’s near- and long-term value proposition depends on efficient operations.  Service agility means accelerating the service lifecycle—management.  Capex reductions are useful only if you don’t add on so much additional opex due to increased deployment complexity that you swamp the savings.

I’m not the only one who feels there’s a major issue here.  Last spring operators told me that they didn’t have confidence that they could make the business case for NFV and that management was the issue.  Some of their concerns are percolating into visibility in the industry now, and so I think we should do what the ISG didn’t and look at NFV management top-down.

To me, there are two simple high-level principles in play.  First, NFV infrastructure must, at the minimum, fit into current network operations and management practices.  Otherwise it will not be possible to replace physical network functions with virtual functions without changing operations, and that will stall early attempts to prove out benefits.  Second, to the extent that NFV is expected to deliver either service agility or operations efficiency benefits, it must provide improved operations practices that deliver sufficient efficiency impact overall.

If we step down from the first of these, we can see that the immediate consequence of harmony with existing practices is management equivalence between VNFs and PNFs.  I think this requirement was accepted by operators and vendors alike, and their response was the notion of the VNF Manager.  If you could collect the management data from the VNFs you could present it to a management system in the same form a PNF would have presented it.  Thus, if you bind a VNFM element into a set of VNFs, you can fulfill my first requirement.

Sadly, that’s not the case.  The problem here is that virtualization itself creates a set of challenges, foremost of which is the fact that a PNF is in a box with local, fixed, hardware assets.  The associated management elements of the PNF know their own hardware state because it’s locally available to be tested.  If we translate that functionality to VNF form, we run the functions in a connected set of virtual machines grabbed ad hoc from a resource pool.  How does the VNF learn what we grabbed, or how to interpret the status of things like VMs, hypervisors, data path accelerators, and vSwitches that were never part of the native hardware?  The fact is that the biggest management problem for NFV isn’t how to present VNF status to management systems, it’s how to determine the state of the resources.
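To illustrate the gap, here’s a rough sketch in which a VNF’s “device” status has to be derived from whatever hosting resources the deployment happened to grab; the binding table, resource names, and status values are all invented.

```python
# Hypothetical sketch: a PNF can test its own hardware, but a VNF's health has
# to be derived from the resources it was bound to at deployment time.
DEPLOYMENT_BINDING = {
    "vnf-edge-router-17": ["vm-204", "vm-209", "ovs-port-31"],   # grabbed ad hoc
}
RESOURCE_STATUS = {"vm-204": "up", "vm-209": "up", "ovs-port-31": "congested"}

def virtual_device_status(vnf_name):
    """Roll hosting-resource states up into a device-like status for the VNF."""
    states = [RESOURCE_STATUS.get(r, "unknown")
              for r in DEPLOYMENT_BINDING.get(vnf_name, [])]
    if all(s == "up" for s in states):
        return "up"
    return "down" if "down" in states else "degraded"

print(virtual_device_status("vnf-edge-router-17"))   # -> "degraded"
```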

The problem of linking resource management to services has created a response, of course.  When vendors talk about things like “policy management” for NFV, what they are often saying is that their architecture decouples resources from services explicitly.  I won’t worry about how a slew of servers and VMs look to a management system that expects to manage a physical CPE gateway because I’ll manage the resources independent of the service and never report a fault.  Great, but think of what happens when a customer calls to report their service connection is down, and your CSRs say “Gee, on the average we have 99.999% uptime on our virtual devices so you must be mistaken.  Suck it up and send your payment, please!”

There are services like consumer broadband Internet that can be managed like this, because that’s how they’re managed already.  It is not how business services are managed, not how CPE is managed, not how elements of mobile infrastructure are managed.  For them, I contend that the current approach fails to meet the first requirement.

And guess what.  The first requirement only gets you in the game, preventing NFV from being more costly and less agile than what we have now.  We are asking for improved operations efficiency, and that raises two new realities.  First, you can’t make meaningful alterations to opex by diddling with one little piece of a service.  Just like you can’t alter the driving time from NYC to LA by changing one traffic light’s timing.  Second, you can’t make meaningful alterations to even a piece of opex if you don’t do anything different there.  We have decoupled operations and network processes today and if we want service automation we have to make operations event-driven.

Event-driven doesn’t mean that you simply componentize stuff so you can run it when an event occurs.  Event-driven processes need events, but they also need state, context.  A service ordered and not yet fulfilled is in (we could say) the “Unactivated” state.  Activate it and it transitions to “Activating” and then becomes “Ready”.  A fault in the “Activating” process has to be remedied but there’s no customer impact yet, so no operations processes like billing are impacted.  In the “Ready” state the same fault has to do something different—fail over, invoke escalation, offer a billing credit…you get the picture.

What is really needed for NFV is data-modeled operations where you define a service as a set of functional or structural objects, assign each object a set of states and define a set of outside events for each.  You then simply identify the processes that are to be run when you encounter a given event in a given state.  Those processes can be internal to NFV, they can be specialized for the VNF, they can be standard operations processes or management processes.
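Here’s a minimal sketch of that data-modeled, state/event approach; the states, events, and process names are illustrative, not drawn from any standard.  The key point is that the same event invokes different processes depending on the state the service object is in.

```python
# Hypothetical sketch of state/event-driven service operations.
STATE_EVENT_TABLE = {
    ("Unactivated", "activate"): ("Activating", "deploy_resources"),
    ("Activating",  "fault"):    ("Activating", "retry_deployment"),       # no customer impact yet
    ("Activating",  "ready"):    ("Ready",      "start_billing"),
    ("Ready",       "fault"):    ("FailingOver","fail_over_and_credit"),   # SLA processes engaged
}

def handle_event(service_object, event):
    """Look up the (state, event) pair and run the operations process bound to it."""
    state = service_object["state"]
    next_state, process = STATE_EVENT_TABLE[(state, event)]
    print(f"{service_object['name']}: {state} + {event} -> {next_state}, run {process}")
    service_object["state"] = next_state

svc = {"name": "vpn-serv-001", "state": "Unactivated"}
handle_event(svc, "activate")
handle_event(svc, "fault")    # remedied quietly while Activating
handle_event(svc, "ready")
handle_event(svc, "fault")    # now triggers failover and a billing credit
```

The processes named in the table could be NFV-internal, VNF-specific, or standard operations and management processes; the model, not each VNF vendor, decides which one runs.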

State/event is the only way that NFV management can work, and it makes zero sense to assume that every VNF vendor would invent their own state/event handling.  It makes no sense that every vendor would invent their own way of accessing the status of underlying resources on which their VNFs were hosted, or that operators would let VNF-specific processes control shared-tenancy elements like servers and switches directly.  We can, with a single management strategy, fulfill both the resource-status-and-service coupling needs of NFV (my first requirement) and the operations efficiency gains (my second).  But we can’t do it the way it’s being looked at today.

This shouldn’t be about whether we have a VNF-specific model of management or a general model.  We need a state-event model of management that lets us introduce both “common” management processes and VNF-specific processes as needed.  Without that it’s going to be darn hard to meet NFV’s operations objectives, gain any service agility, or even sustain capex reductions.  All of NFV hinges on management, and that is the simple truth.  It’s also true that we’re not doing management right in NFV yet, at least not in a standards-defined way, and that poses a big risk for NFV deployment and success.

Policies, Services, Zones, and Black Boxes

I blogged earlier about policy management and its role, but since then I’ve had a number of interesting discussions with operators and users that bring more light to the topic.  Some of the most interesting relate to the relationship between how you define a policy and what the specific utility of policy management would be.  To no one’s surprise I’m sure, there’s more than one perspective out there.  To the surprise of some, many of the open questions on policy management are replicated in the worlds of SDN and NFV, even if you consider the latter two in a no-policy-management implementation context.

According to classic definition, a “policy” is a statement of intent, and so you’d probably be accurate if you thought of policies as statements of goals/objectives rather than of methods.  “I want to drive to LA” might be viewed as a high-level policy, for example, as something that constrains a function we can call “Route”.  It’s not prescriptive about how that goal might be realized.

Shifting gears to networking, we’re recognizing that there are many cases where a collection of technology creates a kind of “zone”, something that offers “service” to outsiders at edge interfaces and imposes some set of cooperative behaviors within to fulfill the needs of its services.  IP networks have zones and gateways, for example.  In SDN in OpenFlow form, or in any distributed-policy model, you could envision this zone process creating something like a classic organization chart, with “atomic” zones of actual control at the bottom and coordination and interconnection zones building up toward the master-level control.

Given that where a service spans multiple zones there would have to be some organized coordination end to end, it’s certainly convenient to visualize this as a policy management exchange.  In fact, that visualization is also useful for NFV.  You could see my route example as a service created across multiple providers or metro areas, where a high-level process picks the general path among the next-level elements, and so forth down the line.

The concept of my route to LA can be viewed as a policy-driven process, but it can also be viewed as what I’ll call a service-driven process.  If each of the metro areas or whatever’s just below my highest “route” awareness offers a “service” of getting me between edge points, you can stitch my LA path by binding those edge points.  The metro processes are responsible for getting me between edges in some mysterious (to the higher level) way.

Is there a difference between policy-driven and service-driven, then?  Well, this is where I think things can become murky.  IMHO, service-driven models require that each element at every layer in the hierarchy advertise a “service” between endpoints.  The highest layer has a conception of “route”, and that conception is pushed down by successively decomposing the abstract elements we’re routing through.  Get me out of NJ?  That involves the “NJ” element.  That element might then have “South” or “North” Jersey sub-elements.  The primary constraints, always applied by the higher level to the lower when a service is decomposed, are the notion of endpoint and the notion of service, meaning things like driving speed, road type, etc.
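A rough sketch of that decomposition, using the geography from the example (the structure, the intermediate elements, and the border-naming are my assumptions): each level sees only “get me between these edges under these constraints”, and opaque elements decompose themselves recursively.

```python
# Hypothetical sketch of service-driven, hierarchical decomposition.
# A higher layer only ever asks "get me between these edges under these constraints".
ELEMENT_CHILDREN = {
    "Route-to-LA": ["NJ", "PA", "OH", "CA"],           # the highest-level conception of "route"
    "NJ":          ["North-Jersey", "South-Jersey"],   # opaque to the level above
}

def decompose(element, entry_edge, exit_edge, constraints):
    children = ELEMENT_CHILDREN.get(element)
    if not children:
        # an atomic zone realizes the service internally, in its own mysterious way
        return [{"zone": element, "from": entry_edge, "to": exit_edge, **constraints}]
    edges = ([entry_edge]
             + [f"{a}/{b} border" for a, b in zip(children, children[1:])]
             + [exit_edge])
    plan = []
    for child, e_in, e_out in zip(children, edges, edges[1:]):
        plan += decompose(child, e_in, e_out, constraints)
    return plan

print(decompose("Route-to-LA", "home", "LA", {"road_type": "interstate"}))
```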

If you look at policy models, they could work in a similar way, but what is missing from most is an explicit notion of “route” or “service”, because most policy models are meant to overlay a forwarding-table routing process.  The policies relate most often to handling rules that would make up an SLA.  This makes sense given the definition of “policy” that I cited; it’s not reflective of specifics but rather of goals.  “Traffic of this type should be afforded priority” is a good policy; “traffic of this type should go via this route to this destination” isn’t what most policies are about.  If you applied policies in this sense to my trip to LA, what you’d get is a rule something like “passage west with a given priority should take this route over that one”.  Policy models are most often applied to systems that have internal routing capability, meaning that the connectivity service is already in place.  You don’t need to tell a router to route.  Service-based models establish the service, which is why SDN is more service-based.
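A small illustration of the distinction, with both structures invented for the purpose: a policy constrains handling in a network that already knows how to route, while a service request asks that connectivity be established in the first place.

```python
# Hypothetical contrast between a policy statement and a service request.
policy_rule = {
    "match":  {"traffic_class": "video"},
    "action": {"priority": "high"},        # a goal: how traffic should be treated
}

service_request = {
    "endpoints":   ["newark-edge", "la-edge"],
    "constraints": {"max_latency_ms": 50, "availability": "99.99%"},   # what to establish
}
```

The policy presumes something underneath that already forwards traffic; the service request is what a service-based, SDN-style model would decompose into actual paths.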

If you look at the service-driven and policy-driven approaches, you can see that both have a common element, which is the notion of a domain abstraction as a black box.  We also saw something like this in the old days of SNA and ATM routing, where a domain received an entry on the route stack that described an “abstract service”, meaning a destination and service quality.  The domain edge popped that off, calculated its own specific route, and pushed that onto the stack instead.  The higher level in this example, and in our service-driven and policy-driven approaches, doesn’t know the details of how to get through the next-lower-level structure, and it doesn’t even know that structures below that next level exist.  Why?  Because it won’t scale otherwise.
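That stack behavior can be sketched in a few lines; the domain names, service-quality label, and internal hops are invented.  The caller pushes an abstract “destination plus quality” entry, and the domain edge replaces it with hops only it knows about.

```python
# Hypothetical sketch of the SNA/ATM-style route stack described above.
INTERNAL_ROUTES = {
    ("domain-B", "exit-east", "gold"): ["B1", "B4", "B7"],   # known only inside domain B
}

def domain_edge_expand(route_stack, domain):
    abstract = route_stack.pop()                 # (destination, service quality) from above
    hops = INTERNAL_ROUTES[(domain, *abstract)]
    route_stack.extend(reversed(hops))           # concrete hops replace the abstraction;
    return route_stack                           # the next hop to traverse (B1) is popped first

stack = [("exit-east", "gold")]                  # all the higher level ever provided
print(domain_edge_expand(stack, "domain-B"))     # -> ['B7', 'B4', 'B1']
```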

SDN control is a good example of a need for hierarchy.  You obviously cannot scale a single central SDN controller to handle all of the Internet at a detail level.  You could subdivide the Internet into smaller zones and then collect those control zones into second-layer superzones, and so forth.  If you decided that you could manage, say, a hundred “route abstractions” per controller (or whatever number you like), you group devices till you get roughly a hundred per zone, then group those centurion zones upward in the same quantity.  Two levels cover ten thousand devices, three a million, four a hundred million, and you’re at ten billion with five levels.
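The arithmetic, with the hundred-per-controller figure taken as a working assumption rather than a design rule, is simply fan-out raised to the number of levels:

```python
# Coverage of a controller hierarchy: fanout ** levels devices at the bottom.
fanout = 100                          # assumed "route abstractions" per controller
for levels in range(2, 6):
    print(f"{levels} levels -> {fanout ** levels:,} devices")
# 2 levels -> 10,000 ... 5 levels -> 10,000,000,000
```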

In NFV you can illustrate another value of hierarchy.  Suppose you have a “service” defined as a collection of four functions.  Each of those functions can be deployed anywhere the service is offered, and the infrastructure would probably not be homogeneous through all those locations.  So imagine instead a hierarchy of functions at the service level, linking down to a hierarchy at the location level.  A service “function” binds to a location-specific implementation of that “function”, whose further decomposition depends on the equipment available or the policies in place.  Or both.

Abstraction is another important attribute of zone-based control.  A black box is opaque from the outside, so a zone should not be advertising its internal elements and asking higher-level functions to control them.  If you allow high-level elements to exercise control over what happens many layers down, you have to communicate topology and status from the bottom to the top, for every possible bottom.  Thus, we can say that policy management, SDN control, and NFV object modeling all have to support the notion of a hierarchy of abstract objects that insulate the higher-level structures from the details of the lower layers.  I think this principle is so important that it’s a litmus test for effective implementation of any of those three technologies.

You might wonder how this notion relates to my “structural versus functional” modeling, a prior topic of mine on NFV implementation.  The answer is that it’s orthogonal to that point.  In both structural and functional models I think you need a hierarchy for the same reason that componentization of software doesn’t stop when you create a process object and an I/O object.  Service agility, meaning fast and easy creation of services, depends on being able to assemble opaque objects to do what you want.  You can’t impose a requirement to link low-level details or you’ve created a system integration task for every service you create.  That’s true whether we’re talking about policies, SDN, or NFV.

Policy-based and service-based control zones seem to me to be inevitable, and which of them is best will depend on whether you have intrinsic connectivity support within a zone.  If you do, then you need only constrain its use.  If you don’t, then you’ll have to explicitly establish routes.  But to me this isn’t the important feature; black boxes are opaque, so you don’t know how somebody did their job inside one.  What is important is that you stay with the notion of manageable control hierarchies and ensure that your abstraction—your black-box creation—is effective in limiting how much information has to be promulgated to higher levels.  If you don’t do that, you build a strategy that won’t scale, and it won’t matter what technology you’re using.