In Search of a Paradigm for Virtual Testing and Monitoring

Virtualization changes a lot of stuff in IT and in networking, for two principal reasons.  One is that it breaks the traditional ties between functionality (in the form of software) and resources (both servers and associated connection-network elements).  The other is that it creates resource relationships that don’t map to physical links or paths.  The end result of virtualization is something highly flexible and agile, but also significantly more complicated.

When SDN and NFV came along, one of the things I marveled at was the way that test and monitoring players approached it.  The big question they asked me was “What new protocols are going to be used?”  as if you could understand NFV by intercepting the MANO-to-VIM interface.  The real question was how you could gain some understanding of network behavior when all the network elements and pathways were agile, virtual.

Back in the summer of 2013, when I was Chief Architect for the CloudNFV initiative, I prepared a document on a model for testing/monitoring as a service.  The approach was aimed at leveraging the concept of “derived operations”, the primary outgrowth of the original ExperiaSphere project and the associated TMF presentations, to answer the real question, which was “How do you test/monitor a virtual network?”  There was never a partner for that phase and so the document was never released, but I think the basic principles are valid, and they serve as a primer on at least one way of approaching the problem.

Like ExperiaSphere, CloudNFV was based on “repository-based management”, where all management data was collected in a repository and delivered through management proxies and queries against that database, in whatever form was helpful.  A server or switch, for example, would have its MIB polled by an agent that would then store the data (including a time-stamp) in the repository.  When somebody wanted to look at switch state, they’d query the repository and get the relevant information.
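To make the idea concrete, here’s a minimal Python sketch of repository-based management.  The names (repository, poll_and_store, latest_state) and the fake switch MIB are my own inventions, not anything from the CloudNFV document; the point is simply that polling writes time-stamped records and every management view is a query against them.

    import time

    # A toy management repository: every poll of an element's MIB is stored as a
    # time-stamped record, and all management views are queries against the store.
    repository = []

    def poll_and_store(element_id, read_mib):
        """Poll an element's MIB (here just a callable) and record the result."""
        record = {"element": element_id, "timestamp": time.time(), "mib": read_mib()}
        repository.append(record)
        return record

    def latest_state(element_id):
        """A 'management view' is a query, not a direct touch of the device."""
        candidates = [r for r in repository if r["element"] == element_id]
        return max(candidates, key=lambda r: r["timestamp"], default=None)

    # Example: a fake switch MIB poller standing in for a real SNMP agent.
    poll_and_store("switch-01", lambda: {"ifInOctets": 1234567, "ifOperStatus": "up"})
    print(latest_state("switch-01")["mib"]["ifOperStatus"])   # -> up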

What made this “derived” operations was the idea that a service model described a set of objects that represented functionality—atomic, like a device or VNFC, or collective, like a subnetwork.  Each object in the model could describe a set of management variables whose values derived from subordinate object variables using any expression that was useful.  In this way, the critical pieces of a service model—the “nodes”—could be managed as though they were real, which is good because in a virtual world, the abstraction (the service model) is the “realest” thing there is.
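Here’s a hypothetical illustration of that derivation in Python.  The ModelObject class, the variable names, and the min/sum expressions are mine, chosen only to show the shape of the idea: a collective object’s management variables are computed from its subordinates through whatever expressions are useful.

    # Hypothetical sketch: a service-model object whose management variables are
    # expressions over the variables of its subordinate objects.
    class ModelObject:
        def __init__(self, name, variables=None, children=None, derivations=None):
            self.name = name
            self.variables = variables or {}      # "real" status contributed by polling
            self.children = children or []        # subordinate objects
            self.derivations = derivations or {}  # variable name -> expression over children

        def get(self, var):
            if var in self.derivations:
                return self.derivations[var](self.children)
            return self.variables.get(var)

    # Two VNFC-level objects with real (polled) status...
    vnfc_a = ModelObject("vnfc-a", {"availability": 1.0, "latency_ms": 4})
    vnfc_b = ModelObject("vnfc-b", {"availability": 0.0, "latency_ms": 9})

    # ...and a collective "subnetwork" object whose state is derived from them.
    subnet = ModelObject(
        "access-subnet",
        children=[vnfc_a, vnfc_b],
        derivations={
            "availability": lambda kids: min(k.get("availability") for k in kids),
            "latency_ms":   lambda kids: sum(k.get("latency_ms") for k in kids),
        },
    )
    print(subnet.get("availability"), subnet.get("latency_ms"))  # -> 0.0 13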

The real solution to monitoring virtual networks is to take advantage of this concept.  With derived operations, a “probe” that can report on traffic conditions or other state information is simply a contributor to the repository like anything else that has real status.  You “read” a probe by doing a query.  The trick lies in knowing what probe to read, and I think the solution to that problem exposes some interesting points about NFV management in general.

When an abstract “function” is assigned to a real resource, we call that “deployment” and we call the collective decision set that deploys stuff in NFV “orchestration”.  It follows that orchestration builds resource bindings, and that at the time of deployment we “know” where the abstraction’s resources are—because we just put them there.  The core concept of derived operations is to record the bindings when you create them.  We know, then, that a given object has “real” management relationships with certain resources.

Monitoring is a little different, or it could be.  One approach to monitoring would be to build probes into service descriptions.  If we have places where we can read traffic using RMON or DPI or something, we can exercise those capabilities like they were any other “function”.  A probe can be what (or one of the things that) a service object actually deploys.  A subnet can include a probe, or a tunnel, or a machine image.  Modeled with the service, the probe contributes management data like anything else.  What we’d be doing if we used this model is similar to routing traffic through a conventional probe point.

The thing is, you could do even more.  In a virtual world, why not virtual probes?  We could scatter probes through real infrastructure or designate points where a probe could be loaded.  When somebody wanted to look at traffic, they’d do the virtual equivalent of attaching a data line monitor to a real connection.

To make virtual probes work, we need to understand probe-to-service relationships, because in a virtual world we can’t allow service users to see foundation resources directly, or they’d see others’ traffic.  So what we’d have to do is follow the resource bindings to find real probe points we could see, and then use a “probe viewer” that was limited to querying the repository for traffic data associated with the service involved.

One of the things that’s helpful in making this work is the notion of modeling resources in a way similar to that used for modeling services.  An operator’s resource pool is an object that “advertises” bindings to service objects, each binding representing some functional element of a service for which the resource pool has a recipe for deployment and management.  When a service is created, the service object “asks” for a binding from the resource model, and gets the binding that matches functionality and other policy constraints, like location.  That’s how, in the best of all possible worlds, we can deploy a 20-site VPN with firewall and DHCP support when some sites can use hosted VNF service chains and others have or need real CPE.  The service architect can’t be asked to know that stuff, but the deployment process has to reflect it.  The service/resource model binding is where the physical constraints of infrastructure match the functional constraints of services.
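A toy version of that binding step might look like the sketch below.  The advertised recipes, the city names, and the bind function are invented for illustration; what matters is that the resource model advertises what it can do and where, the service object asks for a match, and the chosen binding is recorded so it can be followed later (including by monitoring).

    # Hypothetical sketch: a resource model "advertises" deployment recipes, and a
    # service object asks for a binding that matches function and policy (location).
    ADVERTISED_BINDINGS = [
        {"function": "firewall", "how": "vnf-service-chain", "locations": {"city-a", "city-c"}},
        {"function": "firewall", "how": "real-cpe",          "locations": {"city-b"}},
    ]

    RECORDED_BINDINGS = []   # filled in at deployment; this is what monitoring follows later

    def bind(function, location):
        for offer in ADVERTISED_BINDINGS:
            if offer["function"] == function and location in offer["locations"]:
                binding = {"function": function, "location": location, "how": offer["how"]}
                RECORDED_BINDINGS.append(binding)   # record the binding when you create it
                return binding
        raise LookupError("no recipe for " + function + " at " + location)

    print(bind("firewall", "city-a")["how"])   # -> vnf-service-chain
    print(bind("firewall", "city-b")["how"])   # -> real-cpe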

And monitoring, as it happens.  Infrastructure can “advertise” monitoring and even test-data injection points, and a service object or a monitoring-and-testing-as-a-service offering could then bind to the correct probe point.  IMHO, this is how you have to make testing and monitoring work in a virtual world.  I think the fact that vendors aren’t supporting this kind of model is in no small part due to the fact that we’ve not codified “derived operations” and repository-based management data delivery, so the mechanisms (those resource bindings and the management derivation expressions) aren’t available to exploit.

I think that this whole virtual-monitoring and monitoring-as-a-service thing proves an important point, which is that if you start something off with a high-level vision and work down to implementation in a logical way, then everything that has to be done can be done logically.  That’s going to be important to NFV and SDN networks in the future, because network operators and users are not going to forego the tools they depend on today just because they’ve moved to a virtual world.

Fixing the Conflated-and-Find-Out Interpretation of MANO/VIM

I blogged recently about the importance of creating NFV services based on an agile markup-like model rather than based on static, explicit data models.  My digging through NFV PoCs and implementations has opened up other issues that can also impact the success of an NFV deployment, and I want to address two of them today.  I’m pairing them up because they both relate to the critical Management/Orchestration or MANO element.

The essential concept of NFV is that a “service” somehow described in a data model is converted into a set of cooperating committed resources through the MANO element.  One point I noted in the earlier blog is that if this data model is highly service-specific, then the logic of MANO necessarily has to accommodate all the possible services or those services are ruled out.  That, in turn, would mean that MANO could become enormously complicated and unwieldy.  This is a serious issue but it’s not the only one.

MANO acts through an Infrastructure Manager, which in ETSI is limited to managing Virtual Infrastructure and so is called a VIM.  The VIM represents “resources” and MANO the service models to be created.  If you look at the typical implementations of NFV you find that MANO is expected to drive specific aspects of VNF deployment and parameterization, meaning that MANO uses the VIM almost like OpenStack would use Neutron or Nova.  In fact, I think this model was adopted for the relationship either explicitly or unconsciously, and that’s problematic.

The first problem that’s created by this approach is what I’ll call the conflation problem.  A software architect approaching a service deployment problem would almost certainly divide the problem into two groupings—definition of the “functions” part of virtual functions and descriptions/recipes on how to virtualize them.  The former would view a VNF implementation of “firewall” and a legacy implementation of the same thing as equivalent, not to mention two VNF implementations based on different software.  The latter would realize the function on the available resources.

If you take this approach, then VIMs essentially advertise recipes and constraints on when (and where) they can be used.  MANO has to “bind” a recipe to a function, but once a recipe is identified it’s up to the VIM/chef to cook the dish.
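Here’s a rough Python sketch of that division of labor, with names of my own choosing.  MANO only checks that a VIM advertises a recipe for the function and hands over an abstract request; how the toy VIM realizes it—its “kitchen”—is invisible to MANO.

    # Hypothetical sketch of the "recipe" division of labor: MANO binds a function to
    # a recipe the VIM advertises, then hands off; only the VIM knows the
    # infrastructure details needed to realize the function.
    class ToyVIM:
        def __init__(self, recipes):
            self._recipes = recipes                      # recipe name -> internal realizer

        def advertised_recipes(self):
            return list(self._recipes)

        def deploy(self, recipe, **abstract_params):     # the common "input" from MANO
            if recipe not in self._recipes:
                return {"status": "unsupported", "recipe": recipe}
            return {"status": "deployed", "detail": self._recipes[recipe](abstract_params)}

    # The VIM's internals (OpenStack calls, servers, VMs) stay behind this interface.
    vim = ToyVIM({"virtual-router": lambda p: "router sized for " + p.get("throughput", "?")})

    def mano_bind_and_deploy(function, vim, **params):
        if function not in vim.advertised_recipes():
            raise LookupError("no VIM recipe for " + function)
        return vim.deploy(function, **params)

    print(mano_bind_and_deploy("virtual-router", vim, throughput="10G"))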

In a conflated model, MANO has to deploy something directly through the VIM, understanding tenant VMs, servers, and parameters.  The obvious effect of this is to make MANO a lot more complicated because it now has to know about the details of infrastructure.  That also means that the service model has to have that level of detail, which as I’ve pointed out in the past means that services could easily become brittle if infrastructure changes underneath.

The second issue that the current MANO/VIM approach creates is the remember-versus-find-out dichotomy.  If MANO has to know about tenant VMs and move a VIM through a deployment process, then (as somebody pointed out in response to my earlier blog on this) MANO has to be stateful.  A service that deploys half a dozen virtual machines and VNFCs has a half-dozen “threads” of activity going at any point in time.  For a VNF that is a combination of VNFCs to be “ready”, each VNFC has to be assigned a VM, loaded, parameterized, and connected.  MANO then becomes a huge state/event application that has to know all about the state progression of everything down below, and has to guide that progression.  And not only that, it has to do that for every service—perhaps many at one time.

Somebody has to know something.  You either have to remember where you are in a complex deployment or constantly ask what state things are in.  Even if you accept that as an option, you’d not know what state you should be in unless you remembered stuff.  Who then does the remembering?  In the original ExperiaSphere project, I demonstrated (to the TMF among others) that you could build a software “factory” for a given service by assembling Java objects.  Each service built with the factory could be described with a data model based on the service object structure, and any suitable factory could be given a data model for a compatible service at any stage of lifecycle progression and could process events for it.  In other words, a data model could remember everything about a service so that an event or condition in the lifecycle could be handled by any copy of a process.  In this situation, the orchestration isn’t complicated or stateful; the service model that describes the service remembers everything needed because it’s all recorded.
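A minimal sketch of that principle, assuming a plain dictionary stands in for the service data model: the record carries its own lifecycle state, and the event handler is a pure function, so any copy of the process could pick up the next event for the service.

    # Hypothetical sketch of "the model remembers": the service record carries its own
    # lifecycle state, so any stateless copy of the handler can process the next event.
    def handle_event(service_record, event):
        """Pure function: (model, event) -> updated model.  No process-local state."""
        transitions = {
            ("deploying", "deployed"):  "operating",
            ("operating", "fault"):     "failed",
            ("failed",    "recovered"): "operating",
        }
        new_state = transitions.get((service_record["state"], event))
        if new_state is None:
            return service_record                      # event not meaningful in this state
        return {**service_record, "state": new_state}

    service = {"id": "vpn-1001", "state": "deploying"}    # everything needed lives in the record
    for event in ["deployed", "fault", "recovered"]:
        service = handle_event(service, event)            # any process copy could run this step
    print(service["state"])                                # -> operating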

There are other issues with the “finding-out” process.  Worldwide, few operators build services without some partner contributions somewhere in the process.  Most services for enterprises span multiple operators, and so one operator acts as the prime contractor.  With today’s conflated-and-find-out model of MANO/VIM, a considerable amount of information has to be sent from a partner back to the prime contractor, and the prime contractor is actually committing resources (via a VIM) from the partner.  Most operators won’t provide that kind of direct visibility and control even to partners.  If we look at a US service model where a service might include access (now Title II or common-carrier regulated) and information (unregulated), separate subsidiaries at arm’s length have to provide the pieces.  Is a highly centralized and integrated MANO/VIM suitable for that?

I’m also of the view that the conflated-find-out approach to MANO contributes to the management uncertainty.  Any rational service management system has to be based on a state/event process.  If I am in the operating state and I get a report of a failure, I do something to initiate recovery and I enter the “failed” state until I get a report that the failure has been corrected.  In a service with a half-dozen or more interdependent elements, that can best be handled through finite-state machine (state/event) processing.  But however you think you handle it, it should be clear that the process of fixing something and of deploying something are integral, and that MANO and VNFM should not be separated at all.  Both, in fact, should exist as processes that are invoked by a service model as its objects interdependently progress through their lifecycle state/event transitions.

If you’re going to run MANO and VNFM processes based on state/event transitions, then why not integrate external NMS and OSS/BSS processes that way?  We’re wasting enormous numbers of cycles trying to figure out how to integrate operations tasks when if we do MANO/VNFM right the answer falls right out of the basic approach with no additional work or complexity.

Same with horizontal integration across legacy elements.  If a “function” is virtualized to a real device instead of to a VNF, and if we incorporate management processes that map VNF and host state on one hand, and legacy device state on the other, to a common set of conditions, then we can integrate management status across any mix of technology, which is pretty important in the evolution of NFV.

If we accept the notion that the ETSI ISG’s output is a functional specification, then these issues can be addressed readily by simply adopting a model-based description of management and orchestration.  That’s another mission for OPNFV, or for vendors who are willing to look beyond the limited scope of PoCs and examine the question of how their model could serve a future with millions of customers and services.

Parcel Delivery Teaches NFV a Lesson

Here’s a riddle for you.  What do FedEx and NFV have in common?  Answer:  Maybe nothing, and that’s a problem.  A review of some NFV trials and implementations, and even some work in the NFV ISG, is demonstrating that we’re not always getting the “agility” we need, and for a single common reason.

I had to ship something yesterday, so I packed it up in a box I had handy and took it to the shipper.  I didn’t have a specialized box for this item.  When I got there, they took a measurement, weighed it, asked me for my insurance needs, and then labeled it and charged me.  Suppose that instead of this, shippers had a specific process with specific boxing and handling for every single type of item you’d ship.  Nobody would be able to afford shipping anything.

How is this related to NFV?  Well, logically speaking what we’d like to have in service creation for NFV is a simple process of providing a couple of parameters that define the service—like weight and measurements on a package—and from those invoke a standard process set.  If you look at how most NFV trials and implementations are defined, though, you have a highly specialized process to drive deployment and management.

Let me give an example.  Suppose we have a business service that’s based in part on some form of virtual CPE for DNS, DHCP, NAT, firewall, VPN, etc.  In some cases we host all the functions in the cloud and in others on premises.  Obviously part of deployment is to launch the necessary feature software as virtual network functions, parameterize them, and then activate the service.  The parameters needed by a given VNF and what’s needed to deploy it will vary depending on the software.  But this can’t be reflected in how the service is created or we’re shipping red hats in red boxes and white hats in white boxes.  Specialization will kill agility and efficiency.

What NFV needs is data-driven processes but also process-independent data.  The parameter string needed to set up VNF “A” doesn’t have to be defined as a set of fields in a data model.  In fact, it shouldn’t be, because any software guy knows that if you have a specific data structure for a specific function, the function has to be specialized to the structure.  VNF “A” has to understand its parameters, but the only thing NFV software has to do is get the variables the user can set, and then send everything to the VNF.
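Here’s a deliberately trivial Python sketch of that pass-through, with invented descriptor fields: the NFV-side code only knows how to ask the user for the variables the VNF declares and then forward the whole bundle; it never interprets the parameters themselves.

    # Hypothetical sketch of "process-independent data": NFV software doesn't model VNF
    # "A"'s parameters as named fields; it collects the user-settable variables the VNF
    # declares and passes the whole bundle through untouched.
    def collect_and_forward(vnf_descriptor, ask_user, send_to_vnf):
        settings = {name: ask_user(name) for name in vnf_descriptor["user_settable"]}
        send_to_vnf({**vnf_descriptor.get("fixed", {}), **settings})   # opaque to NFV software

    vnf_a = {"user_settable": ["wan_port", "dns_server"], "fixed": {"image": "fw-1.2"}}
    answers = {"wan_port": "ge-0/0/1", "dns_server": "192.0.2.53"}
    collect_and_forward(vnf_a, ask_user=lambda name: answers[name], send_to_vnf=print)
    # -> {'image': 'fw-1.2', 'wan_port': 'ge-0/0/1', 'dns_server': '192.0.2.53'}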

The biggest reason why this important point is getting missed is that we are conflating two totally different notions of orchestration into one.  Any respectable process for building services or software works on a functional-black-box level.  If you want “routing” you insert something that provides the properties you’re looking for.  When that insertion has been made at a given point, you then have to instantiate the behavior in some way—by deploying a VNF that does virtual routing or by parameterizing something real that’s already there.  The assembly of functions like routing to make a service is one step, a step that an operator’s service architect would take today but that in the future might be supported even by a customer service portal.  The next step, marshaling the resources to make the function available, is another step and it has to be separated.

In an agile NFV world, we build services like we build web pages.  We have high-level navigation and we have lower-level navigation.  People building sites manipulate generic templates, and these templates build pages with specific content as needed.  Just like we don’t have packages and shipping customized for every item, we don’t have a web template for every page, just for every different kind of page.  Functional, in short.  We navigate by shifting among functions, so we are performing what is in effect “functional” orchestration of the pages.  When we hit a page we have to display it by decoding its instructions.  That’s “structural” orchestration.  Neither the web browser nor even the page-building software has to know the difference between content pieces, only between different kinds of content handling.

I’ve been listening to a lot of discussions on how we’re going to support a given VNF in NFV.  Most often these discussions are including a definition of all of the data elements needed.  Do we think we can go through this for every new feature or service and still be agile and efficient?  What would the Internet be like if every time a news article changed, we had to redefine the data model and change all the browsers in the world to handle the new data elements?

NFV has to start with the idea that you model services and you also model service instantiation and management.  You don’t write a program to do a VPN and change it to add a firewall or NAT.  You author a template to define a VPN, and you combine that with a “Firewall” or “NAT” template to add those features.  For each of these “functional templates” you have a series of “structural” ones that tell you, for a particular point in the network, how that function is to be realized.  NFV doesn’t have to know about the parameters or the specific data elements, only how to process the templates, just like a browser would.  Think of the functional templates as the web page and the structural ones as CSS element definitions.  You need both, but you separate them.
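A toy rendering of that split might look like this (the feature names, sites, and recipe strings are mine): one functional template defines the service everywhere, and the structural templates decide, per location, how each function is realized.

    # Hypothetical sketch: a "functional" template names features only; "structural"
    # templates say how each feature is realized at a given place.
    functional_template = ["VPN", "Firewall", "NAT"]          # what the service is

    structural_templates = {                                   # how to realize it, per site
        ("Firewall", "city-a"): {"realize": "legacy-cpe", "recipe": "configure-device"},
        ("Firewall", "city-b"): {"realize": "vCPE",       "recipe": "deploy-vnf-chain"},
        ("VPN",      "*"):      {"realize": "ip-vpn",     "recipe": "provision-mpls"},
        ("NAT",      "*"):      {"realize": "vCPE",       "recipe": "deploy-vnf-chain"},
    }

    def realize(feature, site):
        return structural_templates.get((feature, site)) or structural_templates[(feature, "*")]

    # One functional definition serves both cities; only the structural side differs.
    for site in ("city-a", "city-b"):
        print(site, [realize(f, site)["realize"] for f in functional_template])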

I’d love to be able to offer a list here of the NFV implementations that follow this approach, but I couldn’t create a comprehensive list because this level of detail is simply not offered by most vendors.  But as far as I can determine from talking with operators, most vendors are providing only those structural templates.  If we have only structural definitions then we have to retroject what should be “infrastructure details” into services, because we have to define a service in part based on where we expect to deploy it.  If we have a legacy CPE element in City A and virtual CPE in City B, we’d have to define two different services and pick one based on the infrastructure of the city we’re deploying in.  Does this sound agile?  Especially considering the fact that if we then deploy virtual CPE in City A, we now have to change all the service definitions there.

Then there’s management.  How do we manage our “CPE” or “virtual CPE”?  Do we have to define a different data model for every edge router, for every implementation of a virtual router?  If we change a user from one to the other, do all the management practices change, both for the NOC and for the user?

This is silly, people.  Not only that, it’s unworkable.  We built the web around a markup language.  We have service markup languages now; the Unified Service Description Language (USDL) is an example.  We have template-based approaches to modeling structures and functions, in TOSCA and elsewhere.  We need to have these in NFV too, which means that we have to work somewhere (perhaps in OPNFV) on getting that structure in place, and we have to start demanding that vendors explain how their “NFV” actually works.  Otherwise we should assume it doesn’t.

How Buyers See New Network Choices

Networking is changing, in part because of demand-side forces and in part because of technologies.  The question is whether technology changes alone can have an impact, and for that one I went to some buyers to get answers on how they viewed some of the most popular new technology options of our time.  The results are interesting.

One of the most interesting and exciting (at least to some) of the SDN stories is the “white box” concept.  Give an enterprise or a service provider an open-source SDN controller, OpenFlow, and a bunch of “white box” generic/commodity switches and you have the network of the future.  Since “news” means “novelty” rather than “truth” it’s easy to see why this angle would generate a lot of editorial comment.  The questions are first, “Is it true?” and second, “What would it actually mean?”

The white-box craze is underpinned by two precepts.  First, that an open-source controller could create services using white-box switches that would replicate IP or Ethernet services of today.  Second, that those white-box switches would offer sufficiently lower total cost of ownership versus traditional solutions to induce buyers to make the switch.  The concept could deliver, but it’s not a sure thing.

Buyers tell me that in the data center the white-box concept isn’t hard to prove out.  Any of the open-source controllers tried by enterprises and operators were able to deliver Ethernet switching services using white-box foundation switches.  This was true for data centers ranging from several dozen to as many as several thousand servers.

However, buyers were mixed on whether the savings were sufficient.  Operators said that their TCO advantage averaged 18%, which they said was less than needed to make a compelling white-box business case if there was already an installed base of legacy devices.  Most said it was sufficient to justify white-box SDN in new builds.  Enterprises reported TCO benefits that varied widely, from as little as 9% to as much as 31%.  The problem for enterprises was that they had little expectation of new builds, and most set a “risk premium” of about 25% on major technology changes.  Thus most enterprises indicated that they couldn’t make the business case in the data center.

Outside the data center it was even more negative.  Only 8% of operators’ projects outside the data center were able to even match the data center’s 18% TCO benefit, and operators expressed concerns that white-box technology was “unproven” (by a 2:1 margin) or offered too low a level of performance (by 60:40) to be useful at all, savings notwithstanding.

Interestingly, virtual switching/routing fares a lot better outside the data center.  Almost 70% of operators thought that virtual switching/routing could, if hosted on optimal servers, deliver at least a 20% TCO benefit relative to legacy devices.  For enterprises the number was just over 75%.  Inside the data center, both operators and enterprises believed vSwitch technology could substantially reduce their need to augment data center switching (offering nearly 40% savings in TCO), but they didn’t see it displacing current switches or eliminating the need for new switches if new servers were added.  The consensus was that vSwitches were good for VMs, not for servers.

Operators believe that agile optics can supplement vSwitch technology and selected white-box deployments to displace as much as 70% of L2/L3 spending by 2025.  This suggests that white-box SDN and virtual switching/routing are best employed to supplement optical advances.  They see white-box data centers emerging more from NFV deployments, interestingly, than from directly driven SDN opportunities.  The reason seems to be that they believe NFV will generate a lot of new but smaller data centers, where white-box and virtual technology is seen as suitable in performance.

Buyers are in general not particularly enthusiastic about white-box support or vendor credibility.  Three out of four enterprises and almost 90% of operators think their legacy vendors are more trustworthy and offer more credible support.  Virtually 100% of both groups think that they would want “more contractual assurances” from white-box vendors to counter their concerns about reputation and historicity.

What about white-box devices from legacy vendors?  Almost half of both buyer groups think that will “never happen,” meaning no chance for at least five years.  Everyone saw legacy vendors entering the white-box space in earnest only when there was no option other than to lose business to others.  Nobody saw them as being leaders, though almost all buyers say that they can get SDN control for legacy devices from their current vendors.

Another option that generates mixed reviews is the overlay SDN model popularized by Nicira (now part of VMware).  While nearly all network operators and two-thirds of enterprises see benefits in overlay-based SDN, they’re struggling to assign an economic value to their sentiment.  The most experienced/sophisticated buyers in both groups (what I call the “literati”) believe in combining virtual-overlay technology with white-box basic physical switching in the LAN and agile optics in the WAN.  They say that the potential benefits are not promoted by vendors, however.

Interestingly, both network operators and enterprises are more hopeful about the Open Compute switching model than about white-box products based on SDN.  Almost 80% of enterprises say they would purchase OCP switches from “any reputable vendor” and almost 70% say they would buy commodity versions of these products.  Operators run slightly lower in both categories.  The difference, say buyers, is that OCP is a “legacy switch in a commodity form” where white-box SDN switches are based on a “new and less proven” technology combination.

What I get from all of this is that buyers need a more holistic statement of a new switch/routing paradigm than they’re getting.  It would seem that a combination of white-box physical switching and overlay SDN might be very attractive, but in the main buyers don’t see that being offered as a combination and they see do-it-yourself integration of two less-than-proven technologies as unattractive.  They’d love to see a major computer vendor (HP or IBM) field that combination; they’re not convinced that network giants will do that, and they’re still a little leery of startups, though less so than they’d been in the past.

The lesson is that there’s no such thing as a “point revolution”.  We have to expect rather significant and widespread change if we’re going to see much change at all, and users need a lot of reassurance about new technologies…and new partners.

A Deep Look at a Disappointing Neutrality Order

The FCC finally released its neutrality order, causing such a run on the website that it crashed the document delivery portion.  Generally, the order is consistent with the preliminary statement on its contents that was released earlier, but now that the full text is available it’s possible to pin down some of the issues I had to hedge on before.

First, the reference.  The official document is FCC 15-24, “REPORT AND ORDER ON REMAND, DECLARATORY RULING, AND ORDER,” issued March 12, 2015.  Not surprisingly in our current politicized age, it was passed on a 3:2 partisan vote.  It’s 400 pages long in PDF form, so be prepared for a lot of reading if you intend to browse it fully.

This order was necessitated by the fact that the previous 2010 order was largely set aside by the DC Court of Appeals.  The problem the FCC had stemmed from the Telecom Act of 1996, which never mentioned the Internet at all and was woefully inadequate to serve as guidance in what was the dawn of the broadband era.  I won’t rehash all the past points, but in summary we spent about seven years trying to come up with a compromise reading of the Act that would let broadband investment continue but at the same time provide some regulatory clarity on the Internet itself.  The formula the FCC arrived at was that the Internet was “an information service with a telecommunications component.”  That exempted it from common-carrier regulation, which is defined by Title II of the Communications Act.

When in 2010 the FCC tried to address some of the emerging neutrality issues, they were trapped by their own pronouncement.  If the ISPs were common carriers there was no question the FCC could do what it wanted, but the FCC had said they were not.  The order of 2010 was largely an attempt to salvage jurisdiction from that mess, and it failed—that’s what the Court of Appeals said.  So the fact is that unless you wanted no neutrality order at all, the FCC had no option but Title II regulation.  Fortunately for the FCC, it is not bound legally by its own precedent, which means it can simply change its mind.  It did.

The essence of the 2015 order is simple.  The FCC declares the ISPs to be common carriers with respect to broadband Internet service, making them subject to Title II.  They then exercise the once-famous-now-forgotten provision of the Telecom Act, Section 706, which allows the FCC to “forbear” from applying provisions of the act to assure the availability of Internet services to all.  In this basic sense, the order is following the recipe that the DC Court of Appeals offered in its opinion on the 2010 order, and so this part of the order is fairly bulletproof.

What the FCC proposes to do with the authority it has under Title II is a bit more complicated.  At a high level, the goal of the order is to draw what the FCC calls a “bright line”, a kind of classic line-in-the-sand that would tell everyone where they can’t go.  The basic principles of that bright line are:

  • No blocking of lawful traffic, services, devices, or applications.
  • No throttling of said traffic, except for clear network management purposes.
  • No paid prioritization.

Unlike the order of 2010, the FCC applies these rules to both wireless and wireline.  They exempt services based on IP or otherwise that are separate from the Internet, including VoIP, IPTV, and hosting and business data services.  I interpret the exemptions as including cloud computing services as well.  The key point is that an exempt service is one that does not provide access to the Internet overall, and uses facilities that are separate from those of broadband Internet access.

The last point is important to note.  Broadband Internet access is a Title II service.  The Internet itself is not.  However, the FCC does reserve for itself with this order the right to intervene on interconnect issues, though it declines to do that at present.  The order says that regulators lack a history of dealing with Internet interconnect issues, and that the Commission is not comfortable with prescriptions without further data and experience.  Thus, the order neither affirms nor rules out paid settlement among ISPs of the Netflix-Comcast type.

A point that cuts across all of these other issues is that of transparency.  The FCC wants broadband Internet providers to say what they mean and then do what they say.  My interpretation of this means for example that a mobile provider can’t offer “unlimited” data and then limit it by blocking or throttling or by adding hidden charges based on incremental usage.

To me, the order has one critical impact, perhaps not what the FCC intended.  Operators want to make a favorable return on investment.  If they don’t have a pathway to that through paid prioritization, then it is unlikely that Internet as a service will ever be truly profitable to them.  The best they could hope for would be to earn enough to cover the losses by selling other over-the-top services.  That’s a problem because the OTTs themselves wouldn’t have those losses to cover, and so could likely undercut operators on price.  Thus, the operators may look to “special services” instead, and I think that works against everything the FCC says it wants.

The order gives the distinct impression that the FCC believes the distinguishing point about the Internet is its ubiquity.  A “special service” has the defining criterion of not giving access to all of the Internet.  You can use IP and deliver some specific thing, not Internet access, and call it a special service, immune from regulations.  Certainly the universality of Internet access is a valid criterion, but in an investment sense the fact is that most paying services travel very short distances—less than 40 miles—and involve largely content delivery or (in a growing amount) cloud computing.  Does the order allow operators to separate out the profitable stuff—even encourage them to?  Already it’s clear that profitable services are largely special services, and the prohibition on paid prioritization guarantees that will be truer in the future.

Video is delivered both on- and off-Internet today, but channelized viewing is a special service.  Most for-fee operator VoIP is also a special service.  Business data services are special services.  Were there paid QoS on the Internet, might there be pressure to move these special services back onto the Internet?  Might the FCC even be able to take the position that they should be combined?  As it is, I see no chance of that happening, and in fact every chance that operators will look to special services, off the Internet, to insure they get reasonable returns.  Home monitoring, cloud computing, IoT, everything we talk about as being a future Internet application could, without paid prioritization, end up off the Internet not on it.

We might, with paid prioritization and a chance for Internet profit, see VC investment in the Internet as a network instead of in other things that increase traffic and complicate the ISP business model.  Certainly we’d give traditional L2/L3 devices a new lease on life.  The order, if it stands, is likely to put an end to those chances and accelerate the evolution toward virtual L2/L3 and minimization of “access” investment.

Will it stand?  The FCC has broad powers on Title II services; do they have the power to say that some commercially viable options cannot be presented, or that operators have to provide services with limits on their features?  I don’t know the answer to that one, but I suspect that there will be pressure now for Congress to step in.  In this day and age that’s a doubtful benefit, but there’s plenty of doubt in what we have now.

The problem here is that we don’t have a real middle ground in play.  Compromise, even when it’s politically possible, is easier to achieve if there is a position between the extremes.  With neutrality we’ve largely killed off moderation, leaving the best position one none of the partisan advocacy groups occupy.  There is then no constituency on which to build a compromise because a middle-ground view simply offends all the players.

“Internet” is an information network on top of a telecommunications service.  We have to treat the latter like all such services, meaning we have to regulate it and apply rules to settlement and interconnect.  We have to include QoS (where have we had a commercial service without SLAs?).  I think that Chairman Wheeler was on the right track with neutrality before the Administration intervened.  Sadly, we can’t take that back, and sadly Congressional intervention will only create the other extreme polar view.  Now, I guess, we’ll have to wait until some symptoms develop and rational views can prevail, or so we can hope.

Alcatel-Lucent Offers a Bottom-Up Metro Vision

While vendors are typically pretty coy about public pronouncements on the direction that networking will take, they often telegraph their visions through their product positioning.  I think Alcatel-Lucent did just that with its announcement of its metro-optical extensions to its Photonic Service Switch family.  Touting the new offerings as being aimed at aggregation, cloud, and content missions, Alcatel-Lucent is taking aim at the market area that for many reasons is likely to enjoy the most growth and provide the best opportunities for vendors.  It’s just shooting from cover.

Networking isn’t a homogeneous market.  Infrastructure return varies, obviously, by service type so wireless is generally more profitable than wireline, business more profitable than residential, high-level more profitable than lower-level.  Operators will spend more where profits are greater, so there’s an emphasis on finding ways to exploit higher return potential.  Unfortunately, the universality of IP and the fact that broadband Internet is merging wireline and wireless to a degree work against service-based targeting.  Another dimension of difference would be helpful, and we have it with metro.

I’ve noted in past blogs that even today, about 80% of all profitable traffic for operators travels less than 40 miles, meaning that it stays in the metro area where it originates.  Cloud computing, NFV-based services, and content services will combine to raise that percentage through the next five years.  If NFV achieves optimum deployment, the number of NFV data center interconnects alone would be the largest source of cloud-connect services.  Mobile EPC transformation to an SDN/optical model and the injection of SDN-based facilitation of WiFi offload and integration are another enormous opportunity.

Aside from profit-and-service-driven changes, it’s obvious that networking is gradually shifting focus from L2/L3 down to the optical layer, as virtualization changes how we build high-level connectivity and virtual switch/routers displace traditional hardware.  It’s also obvious that the primary driver of these changes is the need to deliver lower-cost bit-pushing services in the face of steadily declining revenue per bit.

Given that one of Alcatel-Lucent’s “Shift” focus points was routing, the company isn’t likely to stand up and tout all of this directly.  Instead of preaching L2/L3 revolution from above, they’re quietly developing more capable optical-layer stuff and applying it where it makes the most sense, which is in the metro area.  The strategy aims to allow operators to invest in the future without linking that investment to displacement of (or reduction in purchasing of) legacy switches and routers.  Unlike Juniper, who tied its own PTX announcement to IP/MPLS, Alcatel-Lucent stays carefully neutral with its approach, which doesn’t commit operators to metro IP devices.

One of the omissions in Alcatel-Lucent’s positioning was, I think, negative for the company overall.  They did not offer specific linkage between their PSS metro family and SDN/NFV, though Alcatel-Lucent has highly credible solutions in both these areas.  Operators don’t want “service layer” activities or applications directly provisioning optical transport, even in the metro, but they do want service/application changes to influence transport configuration.  There is a significant but largely ignored question of how this comes about.  The core of it is the extent to which optical provisioning and management (even based on SDN) are linked to service events (even if SDN controls them).  Do you change transport configuration in response to service orders or in response to traffic when it’s observed, or maybe neither, or both?  Juniper, who has less strategic SDN positioning and no NFV to speak of, goes further in asserting integration.

I’m inclined to attribute the contrast here to my point on IP specificity.  Juniper’s approach is an IP “supercore” and Alcatel-Lucent’s is agile optical metro.  Because of its product portfolio and roots, Juniper seems determined to solve future metro problems in IP device terms, while Alcatel-Lucent, I think, is trying to prepare for a future where spending on both switches and routers will inevitably decline (without predicting that and scaring all their switch and router customers!).  Alcatel-Lucent can presume “continuity” of policy; transport networks today are traffic-engineered largely independently of service networks.  Juniper, by touting service protocols extending down into transport, has to take a different tack.

I’d hope that Alcatel-Lucent takes a position on vertical management integration in metro networks, even if they don’t have to do so right away.  First, I think it would be to their competitive advantage overall.  Every operator understands where networking is heading; vendors can’t hide the truth by not speaking it.  On the other hand, vendors who do speak it have the advantage of gaining credibility.  Alcatel-Lucent’s Nuage is unique in its ability to support what you could call “virtual parallel IP” configurations where application-specific or service-specific cloud networks link with users over the WAN.  They also have a solid NFV approach and decent OSS/BSS integration.  All of this would let them present an elastic approach to vertical integration of networks—one that lets either management conditions (traffic congestion, failures) or service changes (new service orders pending that would demand an adjustment at the optical layer) drive the bus.

With a story like this, Alcatel-Lucent could solve a problem, which is their lack of a significant server or data center switch position.  It’s hard to be convincing as a cloud player if you aren’t a server giant, and the same is true with NFV.  You also face the risk of getting involved in a very expensive and protracted selling cycle to, in the end, see most of the spending go to somebody else.  A cloud is a server set, and so is NFV.  Data center switching is helpful, and I like Alcatel-Lucent’s “Pod” switch approach but it would be far stronger were it bound into an interconnect strategy and a Nuage SDN positioning, not to mention operations/management.  That would build Alcatel-Lucent’s mass in a deal and increase their return on sales effort and their credibility to the buyer.

Most helpful, perhaps, is that strong vertical integration in a metro solution would let Alcatel-Lucent mark some territory at Cisco’s expense.  Cisco isn’t an optical giant, doesn’t like optics taking over any part of IP’s mission, doesn’t like purist OpenFlow SDN, NFV…you get the picture.  By focusing benefits on strategies Cisco is inclined to avoid supporting, Alcatel-Lucent makes it harder for Cisco to engage.  De-positioning the market leader is always a good strategy, and it won’t hurt Alcatel-Lucent against rival Juniper either.

I wonder whether one reason Alcatel-Lucent might not have taken a strong vertical integration slant on their story is their well-known insularity in product groups.  My recommended approach would cut across four different units, which may well approach the cooperation vanishing point even today.  But with a new head of its cloud, SDN, and NFV effort (Bhaskar Gorti), Alcatel-Lucent may be able to bind up what has traditionally been separate for them.  This might be a good time to try it.

More Signposts Along the Path to an IT-Centric Network Future

I always think it’s interesting when multiple news items combine (or conflict) in a way that exposes issues and market conditions.  We have that this week with the Cisco/Microsoft cloud partnership, new-model servers from HP, a management change at Verizon, and Juniper’s router announcements.  All of these create a picture of a seismic shift in networking.

The Cisco/Microsoft partnership is a Nexus 9000/ACI switching system with a Windows Azure Pack (a Microsoft product) to provide a hybrid cloud integration of Microsoft Azure with Windows Server technology in the data center.  The software from Microsoft has been around a while, and I don’t frankly think that there’s any specific need for a Nexus or ACI to create a hybrid cloud since that was the mission of the software from the first.  However, Microsoft has unusual traction in the hybrid space because Azure is a PaaS cloud that offers easy integration with premises Windows Server and middleware tools.  Cisco, I think, wants to take advantage of Microsoft’s hybrid traction and develop its UCS servers as a preferred strategy for hosting the premises part of the hybrid cloud.

This is interesting because it may be the first network-vendor partnership driven by hybrid cloud opportunity.  Cisco is banking on Microsoft to leverage the fact that Azure and Windows Server combine to create a kind of natural hybrid, and that this will in turn drive earlier deployment of Azure hybrids than might be the case with other hybrid cloud models.  That would give Cisco street cred in the hybrid cloud space.  The IT strategy drives the network.

One reason for Cisco’s interest is the HP announcement.  HP has a number of server lines, but Cloudline is an Open Compute compatible architecture that’s designed for high-density cloud deployments, and would also be a darn effective platform for NFV.  HP has a cloud, it has cloud software for private clouds, and a strong server position (number one).  If HP were to leverage all its assets for the cloud, and if it could pull hybrid cloud opportunity through from both the public cloud provider side (through a hybrid-targeted Cloudline positioning) and from the IT side (through its traditional channels) then Cisco might see its growth in UCS sales nipped in the bud.

A Microsoft cloud alliance won’t help Cisco with NFV, though, and that might be its greatest vulnerability to HP competition in particular.  Even before Cloudline, HP had what I think is the best of the major-vendor NFV approaches.  Add in hyperscale data centers and you could get even more, and my model still says that NFV will generate more data centers in the next decade than any other application, perhaps sell more servers.  I’d be watching to see if Cisco does something on the NFV side now, to cover that major hole.

NFV’s importance is, I think, illustrated by the Verizon management change.  CTO Melone is retiring, and the office of the CTO will then fall under Verizon’s CIO.  Think about that!  It used to be that the CTO, Operations, and CMO were the big names.  The only people who called on the CIO were OSS/BSS vendors.  Now, I think, Verizon is signaling a power shift.  CIOs are the only telco players who know software and servers, and software and servers are the only things that matter for the future.

Globally, CIOs have been getting more involved with NFV, but now I think it’s fair to say they may be moving toward the driver’s seat.  That’s a dynamic that will require some thinking, because of the point I just made on what CIOs have historically been involved with.  OSS/BSS vendors have more engagement with CIOs, but OSS/BSS issues have taken a back seat from the very first meetings of the ETSI ISG.  Might this shift impact the vendor engagement?  It won’t hurt HP because they have a strong operations story, and obviously Ericsson and Alcatel-Lucent do as well, but Cisco will have to do a lot more if operations is given a major role.  Of course, everyone will have to address OSS/BSS integration more effectively than they have if the guy who buys the OSS/BSS systems is leading the NFV charge.

Speaking of network vendors, we have Juniper.  Juniper has no servers, and they don’t have a strong software or operations position either.  They can’t be leaders in NFV because they don’t have the central MANO/VNFM components.  I think they represent what might be the last bastion of pure networking.  Cisco, Alcatel-Lucent, Ericsson, Huawei all need more bucks and more opportunity growth than switching and routing can hope to provide.  All of them, as contenders for leader status in network equipment, will have to expand their TAM.  Juniper is likely hoping that with the rush to servers and software, there will be opportunity remaining in the network layers.

Will there be?  Truth be told it won’t matter for Juniper because there are no options left.  They can’t be broader players now; time has run out.  The union of IP and optics, at least part of the focus of their announcements, is inevitable and it will inevitably cap the growth of IP and Ethernet alike, working with virtual routing and switching driven by SDN and NFV at the technical level and by operators’ relentless pressure to reduce capex and opex.  It’s hard to see how a switch/router company only recently converted to the value of agile optics can win against players like Alcatel-Lucent or Ciena or Infinera or Adva, all of whom have arguably better SDN and NFV stories.

There are other data points to support my thesis that we’re moving toward the “server/software” age of networking.  Ciena has already announced an NFV strategy, and now so has Adva.  Alcatel-Lucent’s CEO said that once they’re done “shifting” they will likely focus more on services.  Logical, given that professional services are almost inevitably more important as the rather vast issues of the cloud and SDN and NFV start driving the bus.  Few vendors will field comprehensive solutions and operators want those.  They’ll accept consortium insurance where specific vendor solutions just aren’t available from enough players to give the operators a comfortable competitive choice.

All of these points demonstrate the angst facing network vendors, but adding to that is the fact that Huawei is running away with the market, racking up 20% growth when almost all the competition is losing year-over-year.  It’s Huawei that in my view renders the pure networking position untenable for competitors; everyone else will lose on price and network equipment differentiation is now almost impossible.  For five years now, vendors have played Huawei’s game, focusing their attention on reducing costs when the price leader in the market is sharpening their blade.  It may be too late to change that attitude, though Cisco at least is certainly trying.

We have a true revolution here.  It’s not the platitudes we read about; it’s the relentless march of commoditization driven by that compression of revenue/cost curves.  It’s the shift from monolithic, specialized network hardware to hosted software with greater agility.  We are moving to an IT-driven future for networking and there is no going back now.

What OPNFV Needs to Address

OPNFV, the Linux Foundation open-source project for NFV, is getting ready to release its first preliminary code.  Everyone, including me, is rooting for a success in creating first a reference implementation of NFV that’s functionally complete and second an open-source framework for that implementation.  My concern is as it’s always been; do we know that “reference implementation” is “functionally complete?”  I’d like to offer some comments to OPNFV and its members on what is needed, which I’ll call “principles”.

First and foremost, I think operators globally would agree that NFV should embrace any and all hosting resources.  We are advancing server technology with low-power and edge-deployed technology, and we’re optimizing virtualization with things like containers.  It’s less desirable to standardize on a platform than to standardize a way to accommodate whatever platforms operators find helpful.  The key to achieving this is what I’ll call an open infrastructure management model, and it has four requirements:

  1. The implementation must support multiple VIMs, with the VIM to be specified by the model used to drive management and orchestration (MANO). All VIMs must accept a common “input” from MANO to describe what is to be deployed so that all VIMs are portable across all implementations of MANO.
  2. Resources to be used for hosting/connecting VNFs as part of NFV Infrastructure (NFVI) must be represented by a VIM that supports the common input structure described above.
  3. If it is determined that some hosting options may not be available for all NFVI choices, then the VIM must be capable of returning a response to MANO indicating that a deployment request cannot be executed because one or more features are unsupported by the specified VIM.
  4. Operators will need NFV to control legacy connection resources, either within the NFVI data centers or in the non-NFV portion of a service. This means that there should be “network managers” (to use a term that’s been suggested in the ETSI ISG) that look in most ways like VIMs but support connection requests rather than both connection and hosting.  I suggest that, given the similarity, the concept of an Infrastructure Manager with two (current) subclasses—VIM and Network Manager—is appropriate.

We should be viewing these IMs as “black boxes”, with defined interfaces but flexibility in assigning the processes internally.  What goes on inside an IM is, I think, opaque to the process of standardization.  Yes, we need one or more reference implementations, but if the goal of an IM is to represent an arbitrary set of controllable resources, we have to give the IM latitude in how it achieves that goal.
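To illustrate the shape of such an interface (not to prescribe it), here’s a Python sketch with invented names: a common abstract base for Infrastructure Managers, a toy VIM subclass that can reject a request whose hosting option it doesn’t support (principle 3 above), and a toy Network Manager subclass that handles only connections (principle 4).

    # Hypothetical sketch of the "black box" IM idea: one common MANO-facing input,
    # with VIMs and Network Managers as subclasses of a shared Infrastructure Manager.
    from abc import ABC, abstractmethod

    class InfrastructureManager(ABC):
        @abstractmethod
        def execute(self, request: dict) -> dict:
            """Common input from MANO: a model-derived request; returns a status dict."""

    class ToyVIM(InfrastructureManager):
        SUPPORTED = {"vm", "container"}
        def execute(self, request):
            if request.get("host_on") not in self.SUPPORTED:
                return {"status": "rejected", "unsupported": [request.get("host_on")]}
            return {"status": "deployed", "where": request["host_on"] + "-pool"}

    class ToyNetworkManager(InfrastructureManager):
        def execute(self, request):
            return {"status": "connected", "model": request.get("connection", "LINE")}

    print(ToyVIM().execute({"host_on": "bare-metal"}))        # -> rejected / unsupported
    print(ToyNetworkManager().execute({"connection": "LAN"})) # -> connected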

With these requirements at the bottom layer, we can now move upward to management and orchestration.  Here, I believe that it’s important to recognize that the ISG’s work has defined MANO and VNFM separately, but if you read through the material it’s fairly clear that the two are features of a common implementation.  At one time (and even now, for some members) the ETSI ISG used the term “package” to describe a deployed unit.  A package might be a complete service or service element or just a convenient piece of one.  For my second principle, I think that OPNFV has to recognize that MANO and VNFM operate collectively on packages, and that the definition of a package must provide not only how the package is deployed on resources but how the management connections are made.  I also think that “packages” are fractal, meaning that you can create a package of packages or a package of VNFs/VNFCs.

The question at the MANO/VNFM level is how the package is modeled.  It seems to be possible, based on my experience at least, to model any package by defining a connection model and then identifying nodes that are so connected.  We have LINEs, LANs, and TREEs as generally accepted connection models.  A package might then be two NODEs and a LINE, or some number of NODEs on a LAN or in a TREE.  With the right modeling approach, though, we could define another model, “CHAIN,” that would be a linear list of nodes.  Thus, a connection model could represent any generally useful relationship between nodes.
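Here’s a hypothetical Python sketch of that package notion, using invented package and node names: a package is just a connection model plus a list of nodes, and because a node can itself be a package, the structure is fractal.

    # Hypothetical sketch: a package is a connection model plus nodes; packages are
    # fractal, so a node can itself be another package.
    def package(name, connection, nodes):
        assert connection in {"LINE", "LAN", "TREE", "CHAIN"}
        return {"package": name, "connection": connection, "nodes": nodes}

    access  = package("access-leg", "LINE", ["site-router", "pe-edge"])
    vcpe    = package("vcpe-chain", "CHAIN", ["firewall-vnf", "nat-vnf", "dhcp-vnf"])
    service = package("branch-service", "LAN", [access, vcpe, "hq-subnet"])  # package of packages

    def flatten(pkg):
        """Walk the fractal structure and list the atomic nodes."""
        for node in pkg["nodes"]:
            if isinstance(node, dict):
                yield from flatten(node)
            else:
                yield node

    print(list(flatten(service)))
    # -> ['site-router', 'pe-edge', 'firewall-vnf', 'nat-vnf', 'dhcp-vnf', 'hq-subnet']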

There are a lot of ways to define this kind of model.  Some vendors already use XML, which I’ve also used on a project.  Others prefer TOSCA or YANG.  I think it would be helpful to have a dialog to determine whether operators think that their services would be defined hierarchically as a package tree using a single standard modeling semantic, or whether they’d be happy to use anything that works.  I suspect that the answer might lie in whether operators thought service definitions could/should be shared among operators.

If a standard model approach is suitable, then I think that models could be the input to IMs.  If it’s desired to support multiple model options, then IMs will need some standard API to receive parameters from MANO.  Otherwise IMs would not be portable across MANO implementations.

Going back to the VNFM issue, I believe in the concept I’ve called “derived operations” where each package defines its own MIB and the relationship between that MIB and subordinate package or resource MIBs.  I still think this is the way to go because it moves management derivation into the model rather than requiring “manager” elements.  I’m willing to be shown that other ways will work, but my third principle is that OPNFV has to define and provide a reference implementation for a rational management vision.
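
A minimal sketch of what “management derivation in the model” could mean in practice: the package carries expressions that compute its own MIB variables from subordinate package or resource MIBs, so no separate “manager” element is needed.  The variable names and the derivation expressions here are purely illustrative.

```python
from typing import Callable, Dict, List

MIB = Dict[str, float]   # a flat name->value view of management data for one package or resource

class DerivedMIB:
    """A package-level MIB whose variables are expressions over subordinate MIBs."""
    def __init__(self, derivations: Dict[str, Callable[[List[MIB]], float]]):
        self.derivations = derivations

    def evaluate(self, subordinates: List[MIB]) -> MIB:
        return {name: expr(subordinates) for name, expr in self.derivations.items()}

# Illustrative derivations for a chain-style package:
chain_mib = DerivedMIB({
    # the chain is only as available as its least-available member
    "availability": lambda subs: min(m["availability"] for m in subs),
    # end-to-end latency is (roughly) the sum of member latencies
    "latency_ms":   lambda subs: sum(m["latency_ms"] for m in subs),
})

subordinate_mibs = [
    {"availability": 0.9999, "latency_ms": 2.0},   # e.g. a firewall VNFC
    {"availability": 0.9995, "latency_ms": 3.5},   # e.g. an IDS VNFC
]
print(chain_mib.evaluate(subordinate_mibs))        # {'availability': 0.9995, 'latency_ms': 5.5}
```

Because the derivation lives with the package definition, whoever queries the package’s “MIB” gets values computed on demand from whatever the repository holds for its subordinates, which is the essence of what I mean by moving derivation into the model.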

A related point is lifecycle management, a responsibility of the VNFM in the ETSI spec.  There is simply no way to get a complicated system to progress in an orderly way from service order to operations without recognizing operating states by package and events to signal changes.  Principle number four is that OPNFV has to provide a means of describing lifecycle processes in terms of package state/event progressions.
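
To show what a package state/event progression might amount to, here’s a small hand-rolled table; the states and events are examples of my own, not definitions from the ETSI lifecycle material.

```python
# States a package might pass through, and the events that move it between them.
# (state, event) -> next state; any pairing not listed is invalid for that state.
LIFECYCLE = {
    ("ordered",     "deploy"):        "deploying",
    ("deploying",   "deploy_done"):   "active",
    ("deploying",   "deploy_failed"): "failed",
    ("active",      "fault"):         "degraded",
    ("degraded",    "repair_done"):   "active",
    ("active",      "teardown"):      "terminating",
    ("degraded",    "teardown"):      "terminating",
    ("terminating", "teardown_done"): "terminated",
}

def step(state: str, event: str) -> str:
    try:
        return LIFECYCLE[(state, event)]
    except KeyError:
        raise ValueError(f"event '{event}' is not valid in state '{state}'")

# An orderly progression from service order to operations, with a fault along the way:
state = "ordered"
for event in ["deploy", "deploy_done", "fault", "repair_done"]:
    state = step(state, event)
print(state)   # -> active
```

The point is that the progression could be carried as data in the model rather than buried in a manager element; the lifecycle processes would simply react to events against the table.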

The final principle is simple.  Operators build services today in a variety of ways: they may start with opportunities and dive downward to realization, or they may build upward from capabilities to services.  The “service lifecycle” to an operator is more than the deployment-to-operations progression; it’s the concept-to-realization progression.  OPNFV has to demonstrate that any service lifecycle process from conception to delivery can be supported by its reference implementation.  That means we have to define not only the baseline models themselves but also the tools that will be used to build and validate them.

I think that all of this is possible, and some at least seems to be consistent with the direction that the ETSI ISG is taking for their second-phase activity.  I also think that all of this, in some form at least, is necessary.  A reference implementation that actually does what’s needed is enormously useful, but one that fails to support the goals of NFV will be far worse than no implementation at all.  It could freeze progress for so long that operators are forced to look elsewhere for solutions to their revenue/cost per bit problems.  We may not have the time for that.

Do We Watch Watch or Look Through Glass?

Apple announced the details on its Apple Watch, which some will call a revolution and others (including me) will yawn at.  It’s the first truly new product from Apple in about five years, the latest darling of the wearable technology niche, loved by Apple fans for sure.  The question is whether it will really amount to anything other than a status symbol.  It’s a valid question because Apple Watch isn’t the ideal wearable no matter what your Apple fan status might be.

For many, the form factor alone is going to be hard to accept.  There are two sizes, 38mm and 42mm, which equate to roughly an inch and a half or an inch and two-thirds.  A good-sized chronograph roughly equals the former, but the square face looks bigger.  It’s certainly something to be noticed, which to Apple fans may be a good thing.  Conspicuousness probably won’t sell a lot of watches, though.  There has to be utility, and that may be harder to come by, because obviously for many tasks a watch face presents a pretty minimalist GUI. Yes, you could wave a watch at a pay terminal and buy something (if Apple Pay gets cleaned up).  Yes, you could read the time and perhaps some SMS or an email notice.  The thing is, you can do that with your phone.  Some will pay a minimum of three hundred fifty bucks to wave a watch instead of a phone, but I don’t think that will start a revolution.

All wearable technology is essentially an extension of mobile broadband.  While it might work standalone, it’s really designed to work with a mobile phone (probably) or tablet (possibly), which means you have to value it based on what it can “input” into the mobile/behavioral ecosystem or what it can output from it.  The Watch can be a tickler to tell you to get your phone out, and it can let you do some basic things without taking out your phone.

Probably the most interesting application for Apple Watch is biometric monitoring, which could be used to track fitness goals and monitor yourself during exercise.  Even here, though, it’s possible to do that stuff in other ways.  Judging from what I see on the street, there aren’t too many people in gyms or exercising anyway.  I took a two-and-a-half mile walk yesterday and didn’t see anyone who wasn’t in a car.  More intense health care apps are for the future only, and then only if issues with FDA approval can be dodged.

Why then is Apple doing this?  It’s most likely a matter of niche marketing.  Apple fans value social interaction, cool-ness, leading-edge stuff.  You can sell them stuff that most of the population won’t buy.  There’s nothing wrong with that approach, unless perhaps you believe some of the heady numbers.  Five hundred million units by 2020 is one estimate I saw, and a Street guy who’s a bull on Apple thinks a couple billion in that timeframe is reasonable.  Milking your base with add-on products is Marketing 101, but you have to be wary of expectations.  Total smartwatch sales so far have been less than two million units, and many of the things you can do with Apple Watch can be done with earlier releases from other vendors, at lower cost.

Competition there is.  Questions on the value proposition are there too.  But Apple’s big risk isn’t competitors in the smartwatch space, or even lack of interest in smartwatches.  It’s the possibility that Google might still do something with Glass.

The most powerful of all our senses is sight, and in my view that means that the logical way to supplement any mobile device is through augmented-reality glasses.  Yes, I know that’s what Google Glass was/is supposed to be and yes, I know that the story is that Glass is gas, so to speak.  The truth, I think, is that Glass got away from Google and they simply were not ready to capitalize on what happened when it came out.  That doesn’t mean it’s not the best approach.

Google says that Glass is only exiting one phase and preparing for a new one.  A number of news stories have made that same point in more detail, claiming that the current Glass strategy was only a proof of concept, a kind of field trial.  It’s hard not to see this as an opportunity for Google, though.  What better way to kick sand in Apple’s face than to launch a really useful wearable?  I think even the hardest-core Apple aficionado would agree that the king of the wearable concept is still augmented-reality glasses.

But will Google really push Glass?  Google has a habit of trying to launch a revolution on the cheap, hoping partners will do the heavy lifting and take the risks.  Remember Google Wave?  It was another skunk-works project that had great potential, but Google never really invested in it.  Some now believe that Android may belong in that same category, a concept that Google should have productized and pushed rather than simply launching and blowing kisses at.  It’s hard to see how Glass could succeed without support from Android devices, so would that mean Google would have to get more serious about Android?

The Google MVNO rumor might be an interesting data point.  If Google wants to stay with its advertising roots, it’s hard to see how a Glass/Android pairing could promote advertising effectively unless the associated Android phone were on an MVNO service that could then be partially ad-sponsored.  The thing is, before the story took hold, Google seemed to be trying more to control expectations than to promote the concept.  That doesn’t sound like a company prepping a new Glass.

They should, because augmented reality could be great for ads.  You can picture a Glass wearer walking down the street and viewing the ads of stores along the route.  It’s hard to get that same effect by looking at a watch.  Selling ad space on a billboard an inch and a half square is definitely an uphill battle.  Augmented reality is also great for travelers, for gamers, for a host of large markets that cut across current vendor/technology boundaries.  It seems to me those would be better places to go.

The potentially positive thing with Apple Watch is that it could extend the concept of personalization and it could help further integrate mobile broadband into day-to-day behavior.  The biggest impact of mobile broadband on traffic is video viewing, but the biggest impact overall is on behavior.  We are remaking ourselves through the agency of our gadgets, and Watch might boost that.  Here, Apple is surely hoping that developers will innovate around the basics to develop something useful.

Useful but not compelling.  The thing I’m not clear on is why Apple would push in that direction, which I think would magnify the value of the cloud by creating in-cloud agents, when Apple seems unwilling to take a lead in the cloud itself.  Without Apple leadership, the cloud is unlikely to be a place where developers elect to go, and as long as Apple stays in the background, cloud-wise, it is at risk of being preempted by Google, Glass or no.  In the end, broadband devices are appliances to help us live, and the watch can be only a subordinate appliance.  The master intelligence has to be in the cloud, and if Apple wants Watch to succeed, Apple has to be there too.

Will TV Viewing Habits Change Metro Architecture?

According to a couple of recent surveys, TV viewing is dropping in the 18-34-year-old age group.  Some are already predicting that this will mean the end of broadcast TV, cable, and pretty much the media World as We Know It.  Certainly there are major changes coming, but the future is more complicated than the “New overtakes the Old” model.  It’s really dependent on what we could call lifestyle phases, and of course it’s complicated.  To make things worse, video could impact metro infrastructure planning as much as NFV could, and it’s also perhaps the service most at risk of being impacted by regulatory policy.  It’s another of those industry complications, perhaps one of the most important.

Let’s start with video and viewing changes, particularly those driven by mobile broadband.  “Independence” is what most young people crave.  They start to grow up, become more socially aware, and link with peer groups that eventually influence them more than their parents do.  When a parent says “Let’s watch TV” to their kids, the kids hear “Stay where I can watch you!”  That’s not an attractive option, and so they avoid TV because they’re avoiding supervision.  This was true fifty years ago and it’s still true.

Kids roaming the streets or hanging out in Starbucks don’t have a TV there to watch, and mobile broadband and even tablets and WiFi have given them an alternative entertainment model, which is streaming video.  So perhaps ten years ago, we started to see youth viewing behavior shift because technology opened a new viewing option that fit their supervision-avoidance goal.

Few people will watch a full hour-long TV show, much less a movie, on a mobile device.  The mobile experience has to fit into the life of people on the move, so shorter clips like music videos or YouTube’s proverbial stupid pet tricks caught on.  When things like Facebook and Twitter came along, they reinforced the peer-group community sense, and they also provided a way of sharing viewing experiences through a link.

Given all this, it’s hardly surprising that youth has embraced streaming.  So what changes that?  The same thing that changes “youth”, which is “aging”.  Lifestyles march on with time.  The teen goes to school, gets a job and a place to live, enters a partner relationship, and perhaps has kids of his/her own.

Fast forward ten years.  Same “kid” now doesn’t have to leave “home” to avoid supervision, but they still hang out with friends and they still remember their streaming habits.  Stupid pet tricks seem a bit more stupid, and a lot of social-media chatter can interfere with keying down after a hard day at the office.  Sitting and “watching TV” seems more appealing.  My own research says that there’s a jump in TV viewing that aligns with independent living.

Another jump happens two or three years later when the “kid” enters a stable partner relationship.  Now that partner makes up a bigger part of life, the home is a better place to spend time together, and financial responsibilities are rising and creating more work and more keying down.  There’s another jump in TV viewing associated with this step.

And even more if you add children to the mix.  Kids don’t start being “independent” for the first twelve years or so on the average.  While they are at home, the partner “kids” now have to entertain them, to build a set of shared experiences that we would call “family life”.  Their TV viewing soars at this point, and while we don’t have full data on how mobile-video-exposed kids behave as senior citizens yet, it appears that it may stay high for the remainder of their lives.

These lifecycle changes drive viewing changes, and this is why Nielsen and others say that TV viewing overall is increasing even as it’s declining as a percentage of viewing by people between 18 and 34.  If you add to this mix the fact that at any stage of life you can find yourself sitting in a waiting room or on a plane and be bored to death (and who shows in-flight movies anymore?), you see that mobile viewing of video is here to stay…sort of.

The big problem that TV faces now isn’t “streaming” per se, it’s “on-demand” in its broadest sense—time-shifted viewing.  Across all age groups we’re seeing people get more and more of their “TV” in non-broadcast form.  Competition among the networks encourages them to pile into key slots with alternative shows while other slots are occupied by the TV equivalent of stupid pet tricks.  There are too many commercials and reruns.  Finally, we’re seeing streaming to TV become mainstream, which means that even stay-at-homes can stream video instead of watching “what’s on”.

I’ve been trying to model this whole media/video mess with uncertain results, largely because there are a huge number of variables.  Obviously network television creates most of the original content, so were we to dispense with it we’d have to fund content development some other way.  Obviously cable networks could dispense with “cable” and go directly to customers online, and more importantly directly to their TVs.  The key for them would be monetizing this shift, and we’re only now getting some data from “on-demand” cable programming regarding the advertising potential of that type of delivery.  I’m told that revenue realization per hundred views from streaming or on-demand content is less than a third of that from channelized real-time viewing.

I think all of this will get resolved, and be resolved in favor of streaming/on-demand in the long run.  It’s the nature of the current financial markets to value only the current quarter, which means that media companies will self-destruct the future to make a buck in the present.  My model suggests that about 14% of current video can sustain itself in scheduled-viewing broadcast form, but that ignores the really big question—delivery.

If I’m right that only 14% of video can sustain broadcast delivery then it would be crazy for the cable companies to allocate the capacity for all the stuff we have now, a view that most of the cable planners hold privately already.  However, the traffic implications of streaming delivery and the impact on content delivery networks and metro architecture would be profound.

My model suggests that you end up with what I’ll call simultaneity classes.  At the top of the heap are original content productions that are released on a schedule whether they’re time-shifted in viewing or not and that command a considerable audience.  This includes the 14% that could sustain broadcast delivery and just a bit more—say 18% of content.  These would likely be cached in edge locations because a lot of people would want them.  There’s another roughly 30% that would likely be metro-cached in any significant population center, which leaves about 52% that are more sparsely viewed and would probably be handled as content from Amazon or Netflix is handled today.

The top 14% of content would likely account for about two-thirds of views, and the next 30% for 24% of views, leaving 10% for all the rest.  Thus it would be this first category of viewing, widely seen by lots of people, that would have the biggest impact on network design.  Obviously all of these categories would require streaming or “personalized delivery”, which means that the total traffic volume to be handled could be significant even if everyone were watching substantially the same shows.
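
For what it’s worth, the arithmetic behind that last point can be sketched in a few lines.  The content and viewing shares come from my model above; the subscriber count, concurrency, and per-stream bit rate are purely hypothetical placeholders, there only to show the shape of the load if everything were delivered as unicast streams.

```python
# Simultaneity classes from the model: share of content titles and share of views.
classes = {
    "edge-cached":  {"content": 0.18, "views": 0.66},
    "metro-cached": {"content": 0.30, "views": 0.24},
    "deep/origin":  {"content": 0.52, "views": 0.10},
}

# Hypothetical metro area: 500,000 subscribers, 30% watching concurrently in prime time,
# 5 Mbps per unicast stream.  These numbers are placeholders, not measurements.
concurrent_streams = 500_000 * 0.30
mbps_per_stream = 5

for name, c in classes.items():
    streams = concurrent_streams * c["views"]
    gbps = streams * mbps_per_stream / 1000
    print(f"{name:13s} {streams:8.0f} streams  ~{gbps:4.0f} Gbps if delivered as unicast")
```

Even the most widely cached class dominates the load under those assumptions, which is why the delivery question that follows matters so much.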

“Could” may well be the important qualifier here.  In theory you could multicast video over IP, and while that wouldn’t support traditional on-demand programming there’s no reason it couldn’t be used with prime-time material that’s going to be released at a particular time/date.  I suspect that as on-demand consumption increases, in fact, there will be more attention paid to classifying material according to whether it’s going to be multicast or not.  The most popular material might well be multicast at its release and perhaps even at a couple of additional times, just to control traffic loads.

The impact of on-demand on networking would focus on the serving/central office for wireline service, and on locations where you’d likely find SGWs today for mobile services (clusters of cells).  The goal of operators will be to push caches forward to these locations to avoid having to carry multiple copies of the same videos (time-shifted) to users over a lot of metro infrastructure.  So the on-demand trend will tend to encourage forward caching, which in turn would likely encourage at least mini-data-center deployments in larger numbers.

What makes this a bit harder to predict is the neutrality momentum.  The more “neutral” the Internet is, the less operators can hope to earn from investing in it.  It seems likely that the new order (announced but not yet released) will retain previous exemptions for “interior” elements like CDNs.  That would pose some interesting challenges because current streaming giants like Amazon and Netflix don’t forward-cache in most networks.  Do operators let them use forward caches, charge for the use of them, or what?

There’s even a broader question, which is whether operators take a path like that of AT&T (and in a sense Verizon) and deploy an IP-non-Internet video model.  For the FCC to say that AT&T had to share U-verse would be a major blow to customers and shareholders, but if they don’t say that then they are essentially sanctioning the bypassing of the Internet for content in some form.  The only question would be whether bypassing would be permitted for more than just content.

On-demand video is yet another trend acting to reshape networking, particularly in the metro sense.  Its complicated relationship with neutrality regulations means it’s hard to predict what would happen even if consumer video trends themselves were predictable.  Depending on how video shakes out, how NFV shakes out, and how cloud computing develops, we could see major changes in metro spending, which means major spending changes overall.  If video joins forces with NFV and the cloud, then changes could come very quickly indeed.