What Does Domain 2.0 Have to Do to Succeed?

I’ve looked at the vendor side of network transformation in a couple of recent blogs, focusing on interpreting Wall Street views of how the seller side of our industry will fare.  It may be fitting that my blog for the last day of 2013 focuses on what’s arguably the clearest buyer-side statement that something needs to change for 2014 and beyond.  AT&T’s Domain 2.0 model is a bold attempt to gather information (it’s an RFI) about the next generation of network equipment and how well it will fit into a lower-capex model.  Bold, yes.  Achievable, optimal?  We’ll see, but remember this is just my analysis!

Insiders tell me that Domain 2.0 is aimed at creating a much more agile model of infrastructure (yes, we’ve heard that term before) as well as a model that can contain both opex and capex.  While some Street research (the MKM report cited in Light Reading for example) focuses on the capex impact of the AT&T initiative for the obvious reason that’s what moves network equipment stocks, the story is much broader.  My inside person says it’s about the cloud as a platform for services, next-gen operations practices to not only stabilize but drive down opex even as service complexity rises, and optimum use of cloud resources to host network features, including support for both SDN and NFV.  You can see from this list of goals that AT&T is looking way beyond white-box switches.

Another thing that insiders say is that Domain 2.0 recognizes that where regulatory issues aren’t driving a different model, the smart approach is to spend more proportionally on bandwidth and less proportionally on bandwidth optimization.  That’s why Ciena makes out better and Cisco likely makes out worse.  Networks are built increasingly from edge devices with optical onramp capability, coupled with agile optics.

Where SDN comes into the mix in the WAN is in providing a mechanism for creating and sustaining this model, which is sort of what some people mean when they say “flattening the network”.  It’s not as much about eliminating OSI layers as it is about eliminating physical devices so that the total complexity of the network is reduced.  According to these sources, Cisco isn’t the only router vendor to be at risk—everyone who’s not a pure-play agile-optics vendor might have to look often over their shoulder.

Data center networking is also on the agenda, mostly because the new cloud-and-NFV model demands a lot of network agility in the data center.  There will be, obviously, a major increase in the amount of v-switching consumed, but it’s not yet clear whether this is all incremental to the current data center switching infrastructure, a result of increased virtualization (which uses vSwitch technology, obviously).  However, my sources say that they are very interested in low-cost data center switching models based on SDN.

It seems likely to me that a combination of an SDN-metro strategy based on the optics-plus-light-edge model and an SDN data center strategy would be self-reinforcing.  Absent one or the other of these, it’s harder to see how a complete SDN transition could occur.  To me, that means that it will be hard for a smaller vendor with a limited portfolio to get both right.  Could a white-box player?  My sources in AT&T say that they’d love white boxes from giants like IBM or HP or Intel or Dell, but they’re skeptical about whether smaller players would be credible in as critical a mission.  They are even more skeptical about whether smaller players might be able to field credible SDN software.  A giant IT player is the best answer, so they say.

The role of NFV here is harder to define.  If you presume “cloud” is a goal and “SDN” is a goal, then you either have to make NFV a fusion of these things to gather enough critical executive attention, or you have to say that NFV is really going to be about something totally different from the cloud/SDN combination.  It’s not clear to me where AT&T sits on this topic, but it’s possible that they see NFV as the path toward gaining that next-gen operations model we talked about.

NFV does define a Management and Orchestration (MANO) function.  It’s tempting to say that this activity could become the framework of our new-age operations vision.  The challenge here is that next-gen operations is not the ETSI NFV ISG’s mandate.  It is possible that working through a strategy to operationalize virtual-function-based services could create a framework with broader capabilities, but it would require a significant shift in ISG policy.  The ISG, to ensure it gets its own work done, has been reluctant to step outside virtual functions into the broader area, and next-gen operations demands a complete edge-to-edge model, not just a model of virtual functions.

Might our model come from the TMF?  Support for that view inside AT&T is divided at best, which mirrors the views we’ve gotten from Tier Ones globally in our surveys.  The problem here, I think, is less that the TMF doesn’t have an approach (I happen to think that GB942 is as close to the Rosetta Stone of next-gen management as you’ll find anywhere) than that TMF material doesn’t explain their approach particularly well.  TMF material seems aimed more at the old-line TMF types, and the front lines of the NGN push inside AT&T or elsewhere lack representation from this group for obvious reasons.  NGN isn’t about tradition.

The future of network operations could be derived from NFV activities, or from TMF’s, or from something that embodies both, or neither.  Here again, it would be my expectation that advances in operations practices would have to come out of integration activity associated with lab trials and proof-of-concept (TMF Catalyst) testing.  As a fallen software guy, I believe you can develop software only from software architectures, and standards and specs from either the NFV ISG or the TMF aren’t architectures.  I also think this has to be viewed as a top-down problem; all virtualization including “management virtualization” has to start with an abstract model (of a service, end-to-end, in this case) and move downward to how that model is linked with the real world to drive real operations work.  The biggest advance we could see in next-gen networking for 2014 would come if Domain 2.0 identifies such a model.

I wish you all a happy and prosperous New Year, I’m grateful for your support in reading my blog, and I hope to continue to earn your interest in 2014.


A Financial-Conference View of SDN and NFV

I’ve blogged in the past on how the Street views various companies or even the networking industry, and today there’s been an interesting report by equity research firm Cowen and Company about how SDN and NFV will impact the market.  This is the outcome of the company’s Trend-Spotting Conference and it includes some industry panel commentary and also audience surveys.  As usual, my goal here is to review their views against what I’ve gotten in my surveys of enterprises and operators.

The top-line comment is that our revolutionary technology changes are all going to have an adverse impact on the industry, with attendees saying it will commoditize networking within five years (by an 80% margin).  The vision is that white-box devices will suck the heart out of the networking market, and here I think the view of the audience is short-sighted.  Yes, we are going to see commoditization but no, SDN and NFV or even Huawei aren’t the real driver.

Any time you have a market without reasonable feature differentiation you get price differentiation and commoditization, and that is clearly where both networking and IT are already.  Buyers of all sorts tell me that what they want is a low price from a recognized player, period.  But despite this, buyers also say that they have to be able to “trust” their supplier, to “integrate” their current purchases into extant networks and operations practices, and to “protect their investment” as further changes develop.  When you add in these factors to see who’s exercising strategic influence, it’s not the newcomers who have white-box products or revolutionary software layers, it’s the incumbents.  So my conclusion here is that five years is too fast to expect the kind of radical result people were talking about at Cowen’s conference.

The second major point from the conference is that this is all about agility and not about cost savings, which on the face of it shows that simple commoditization is too simple.  When you dig through the comments in the report you get the picture of a market that’s coping with changes in information technology, changes that necessarily change the mission of networking.  They see that change as focusing on the data center, where they want networks and virtualization to work together in a nice way.

I think, based on my surveys, that the “agility” angle is a common evolution of early simplistic benefit models for SDN and NFV alike.  People start by thinking they’re saving money on equipment, realize the savings won’t likely be enough, and then begin to think about other possible benefits.  But whatever the motivation, if it’s true (as I believe it is) that neither SDN nor NFV will pay off enough in equipment savings, then we have to start looking for how the additional savings will be achieved.

The report says that there are two camps with respect to impact on existing vendors; one says that the paradigm shift in networking could help Cisco and similar firms and the other that it would hurt.  Obviously, everything has this kind of polar outcome choice, so that’s not much insight, but I think the reason for the two camps is the agility angle.  If there are real benefits that can be achieved through SDN and NFV deployment, then those new benefits could justify new spending, not reduce existing spending.  So you can say that we have a group who believe that vendors will figure out how to drive new benefits, and a group who thinks the vendors will fall into commoditization.  That’s likely true.

In my view, the “new benefits” group will indeed win out, but the real question is “when?” and not “if.”  Going back to the five-year timeline, the industry has to either develop something new and useful to drive up spending or face continued budget-based pressure to cut it.  No matter how those cuts are achieved, they’ll be tough to add back later on once they happen.  So I do think that there’s a five-year timeline to worry about—an even shorter one, in fact.  If we can’t identify how networking generates new benefits in the next three years, then those benefits won’t get us back to a golden age.  They’ll simply let us hang on to today’s relatively stagnant market picture.

When you look at the details of the Cowen report you see things like “rapid provisioning” as key goals, things that make the network responsive.  To me, that makes it clear that the revolution we’re looking for has little or nothing to do with traditional SDN concepts like “centralization” or even with NFV concepts like hosting efficiencies of COTS.  What it really is about is operationalization.  We have networked for ages based on averages and now we want to have more specific application/service control over things, without incurring higher costs because we’re touching things at a more detailed level.  No more “Shakespearian diagnosis” of network problems; “Something is rotten in Denmark.”  We need to know what’s rotten and where in Denmark it’s happening.

The issue here, of course, is that the classical models of SDN and NFV have nothing specific to do with operationalization.  You can argue strongly that neither SDN nor NFV brings with it any changes in management model (you can even argue they don’t even bring a management model).  The interesting thing is that when I’ve gotten into detailed discussions with operators (about NFV in particular) what I hear from them is that same agility point.  That means that the operators do really see a need for a higher-level benefit set, and they’re hoping to achieve it.

Hope springs eternal, but results are harder to come by.  The reason is simple; nobody really understands this stuff very well.  Among buyers, SDN literacy is satisfactory only for Tier One and Two operators and NFV literacy is unsatisfactory everywhere.  In the Cowen audience polls NFV wasn’t really even on the radar, and yet if you are going to transform network agility it’s hard to see how you do that without hosted processes.

Vendors are, of course, the real problem.  My straw poll of vendors suggests that their literacy on SDN and NFV isn’t much better than buyers’ literacy.  Why?  Because they’re really not trying to sell it.  Startup vendors want to create a “Chicken-Little-Says-the-Sky-is-Falling” story in the press so they can be flipped.  Major vendors want to create a “Chicken-Little-is-Right-but-It’s-Still-in-the-Future” mindset to keep current paradigms in control to drive near-term revenue.  The Cowen panel said that the SDN world was getting “cloudy” in the sense of becoming obscure.  Do we really think that happens for any reason other than that people want it to?  That’s why we won’t see revolutionary change in five years, and that may be why networking will in fact lose its mojo, just as IT is now losing it.  But there’s still time, folks.  Just face reality.

Hockey Stick or Hockey Puck?

We hear all the time about how new technologies like the cloud, SDN, and NFV are poised for “hockey-stick” growth.  Most of the time that’s not true, of course.  In an industry with three to eight-year capital cycles it’s pretty darn hard for something to achieve really explosive growth because of the dampening impact of transitions.  But there are other factors too, and as we’re coming to a new year, this is a good time to look at just what does inhibit technology growth.  What is it that’s making our hockey stick look more like a hockey puck; flat on cold ice?

Most everyone knows that companies look at return on investment in making project decisions, but in fact ROI isn’t always the measure.  Tech spending tends to divide into a “budget” and a “project” category, with the former being the money allocated to sustain current capabilities and the latter the money allocated to expanding tech into new areas.  Most CIOs tell me that budget spending is typically aimed at getting a little more for a little less rather than at achieving a specific ROI target, largely because companies don’t require formal justification to keep doing what they’ve done all along.  Budget spending thus tends to focus on cost reduction, and project spending is generally accretive to IT spending overall because it buys new stuff justified by new benefits.

The challenge we have with things like the cloud, SDN, and NFV is that they are all cost-side technologies in how they’re presented.  It’s true that you can justify a “project” aimed at cutting costs through technology substitution, but buyers of all types have been telling me in surveys for decades that this kind of project is a politically difficult sell unless the cost savings are truly extravagant.  The problem is that “status quo for a little less cost” doesn’t overcome the risk that the new technology or the transition to it will prove really disruptive.  Our waves of IT growth in the past have come from projects that created new benefits.  I’ve blogged about this before so I won’t go into more detail here.  Just remember that cost-side projects, if anything, create downward trends in spending overall.

Another factor that inhibits the growth of new technology is buyer literacy.  You don’t have to be an automotive engineer to buy a car, but being able to drive is an asset.  The point is that a buyer has to be able to exercise a technical choice so as to reap the expected benefit, and that takes some understanding of what they’re doing.  We’ve measured buyer literacy for decades, and found that it generally takes literacy rates of about 30% in order to create a “natural market” where buyers are able to drive things without a lot of pushing and help from others.  Right now, buyer literacy in the cloud has exceeded that threshold, and operator literacy for SDN also exceeds it.  None of the other “hockey-stick” market areas have the required level of buyer literacy.

With SDN, buyers’ problems relate primarily to the identification of costs and benefits.  Even enterprise technologists have a hard time drawing a picture of a complete SDN network.  A bit less than half could accurately describe SDN application in a data center, but of this group the majority admit that the benefits there aren’t sufficient to drive deployment.  When you get down to the details, you find that the users simply don’t understand what SDN will do differently.  One user in our fall survey said “I have a guy who comes in and tells me that SDN will save on switching and routing, but then he tells me that we’d use the same switches and routers we already have, just run OpenFlow for them.”  You can see the problem here, and isn’t that a fairly accurate picture of the net of SDN sales pitches today?

With NFV the situation is similar, but among operators there’s a lot more proactivity associated with finding the right answers.  Operators started off with a widely accepted benefit paradigm (capex reduction) and a clear path to it (replace middlebox products with COTS-hosted apps).  They then found out that 1) there weren’t that many middleboxes to replace, 2) they could achieve similar savings by pushing vendors on price for legacy gear, and 3) they couldn’t be sure what the operations cost of an NFV deployment would be.  Now they’re shifting their benefit focus from capex to opex to service velocity to service creation and OTT competition.  The problem is that they don’t have a clear idea of how these evolving benefits will be achieved.  NFV by itself targets only hostable functions and few operators believe that these would make up more than 20% of their total infrastructure.  How do you achieve massive improvements in operations or services with that small a bite?
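To see why that small a bite is a problem, here is a rough back-of-the-envelope sketch in Python.  It assumes, purely for illustration, that operations cost is spread in proportion to infrastructure share, which is my simplification and not something operators have told me.  The 20% figure is the upper bound cited above; the saving percentages are hypothetical.

```python
# Upper bound on the hostable share of infrastructure, from the text.
hostable_share = 0.20

def overall_saving(saving_on_hostable):
    """Overall saving if only the hostable share improves.

    Assumes cost is proportional to infrastructure share (an illustrative
    simplification, not a measured relationship).
    """
    return hostable_share * saving_on_hostable

# Hypothetical saving levels on the hostable portion only.
for s in (0.25, 0.50, 1.00):
    print(f"{s:.0%} saving on hostable functions -> {overall_saving(s):.0%} overall")
```

Under that assumption, even eliminating the cost of the hostable portion entirely caps the overall improvement at 20%, which is why the benefit case has to reach beyond hosted functions.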

The vendors themselves present an inhibiting impact on adoption of the new technologies.  Companies today rely more than ever on their suppliers for professional services or technology education.  If the new technologies are designed to reduce costs (and remember that’s the classic mission for the cloud, SDN, and NFV) then why would a vendor push customers into them?  Yes, you could argue that a startup would be able to step in and present the new options, but 1) the VCs are all funding new social-network stupid-pet-tricks stuff and won’t be bothered with real tech investment and 2) the buyers will say “I can’t bet my network on some company with fifty million in valuation; I need a public company worth billions.”  So we now have to look for a large public-company startup (that’s why Dell could be such an interesting player; a company that’s gone private like that is exactly what buyers would like and it can take a long-term view most public companies can’t).

The point here is that we’re on track to achieve significant cloud growth, not the pathetic dabbling we have now, in about 2016, with SDN and NFV trailing that.  All these dates could be accelerated by optimal market activity, but it’s going to be up to vendors to initiate that.  What we need to watch in 2014 isn’t who has the best technology, but who has the best buyer education process.  The current “take-root-and-become-a-tree” mindset of vendors will favor the bold; they’ll be the only ones moving at all.

Do We Need a “Management-NGN” Model?

It’s pretty clear to me from comments I’ve gotten in my blog that there are a lot of questions out there on the topic of management or operations.  When we talk about things like SDN and NFV we can talk about management practices in a general way, but we can really only talk “operations” if we can integrate management practices with business practices.  So today let’s take a look at the topic and see if we can sort out a framework for evaluating the needs of new network technologies.

You can group services into two broad classes—assured and best-efforts.  Assured services are those that guarantee some level of service quality and availability and best-efforts are services that…well…just expect the operator to do their best to get something from point “A” to point “B”.

Best-efforts may not have a hard standard of quality or availability, but it’s not “no-effort”.  An ISP who had significant numbers of delivery failures could never survive in a competitive market, so independent of the service classes above there’s still a general need to manage network behavior so as to get most stuff to its destination.

Here we can also say there are two broad approaches, provisioned QoS and engineered QoS.  The former means that a service is assigned resources according to the service guarantees made, and sustaining those guarantees means validating whether the committed resources are meeting them.  The latter means that we play a statistical game.  We engineer the infrastructure to a given probability of packet loss or mean delay or failure rate/duration, and we understand that most of the time we’ll be OK.  Sometimes we won’t.  In all cases, though, what we’ve done is to calculate the service quality based on resource conditions.  We assure resources against the parameters that we define as “good-enough-effort” for a given service, not the services themselves.

Where we have provisioned QoS, we assure services specifically and we do resource substitution or adaptation based on service events.  Where we have engineered QoS, we build a network to operate within statistical boundaries, and we respond to conditions with what?  Generally, with policy management.  Policies play two broad roles: they ensure that the resources are operating within their designed limits through traffic admission controls, and they respond to internal conditions by applying predefined broad remedies.
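The two policy roles can be sketched in a few lines of Python.  This is a minimal illustration under my own assumptions, not a description of any real policy controller: the `Link` class, the 70% design limit, and the remedy names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Link:
    capacity_mbps: float
    design_limit: float          # fraction of capacity the network was engineered for (assumed)
    committed_mbps: float = 0.0

def admit(link, demand_mbps):
    """Role 1: admission control keeps the resource inside its designed limit."""
    projected = (link.committed_mbps + demand_mbps) / link.capacity_mbps
    if projected <= link.design_limit:
        link.committed_mbps += demand_mbps
        return True
    return False

# Role 2: predefined broad remedies keyed on an internal condition (hypothetical names).
REMEDIES = {
    "congestion": "throttle best-efforts traffic",
    "link_down": "reroute to standby path",
}

def respond(condition):
    """Return the predefined remedy, or fall back to a human."""
    return REMEDIES.get(condition, "raise alarm for manual handling")

link = Link(capacity_mbps=1000, design_limit=0.7)
print(admit(link, 600))      # fits within the 70% design limit
print(admit(link, 200))      # would push the link to 80%, so it is refused
print(respond("congestion"))
```

Note that neither function knows anything about an individual service guarantee; both act on resource conditions, which is the essence of the engineered approach.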

So can we apply this to SDN and NFV?  Let’s see.

In the case of SDN, the immediate problem we have is in the definition of SDN.  Does SDN mean “software defined” in the sense that network behavior is centrally software-managed without local device adaptation (the original notion) or do we mean it’s “software controlled” meaning that software can control network behavior more precisely?

If SDN is the former, then we have a bicameral model of service behavior.  We have real devices (whether they’re hosted/virtual or physical) that have to be running and working in order for traffic to be passed from place to place, and we have central route/forwarding control that has to be right or packets fly off to the wrong places or into the bit bucket.  The advent of central forwarding control means that we know the rules in one place, the routes in one place, but it doesn’t mean we’re less dependent than before on retrieving and delivering status information.  In fact, one of the biggest issues in centralized SDN is how you secure management connectivity at all times without adaptive behavior to provide default paths.  Without management connectivity you don’t know what’s going on, and can’t do anything about it.

In the software-controlled model, we presumably still have adaptive behavior and thus have default routing.  Arguably this model means, in management terms, that we are providing more service-centricity while sustaining essentially an adaptive, policy-managed resource set.  It’s my view that any software-controlled SDN models are really models that will ultimately rely on better (more granular in terms of services and service grades) policy control.  This model relies more on predictive analytics for one simple reason; if you can’t figure out exactly how a given service will impact resources and thus other services, you can’t reliably provide for software control at the service and application level (which is the only meaningful place to provide it).  So we do traffic engineering on service/application flows based on understanding network conditions and how they’ll change with the introduction of the new service/flow.

In the central-control model, we can still use predictive analytics but we also have to provide the baseline assembly of route status and traffic patterns that come from our two previously mentioned sources.  However, we’ll use this information not to manipulate routes or queues but rather in creating/connecting paths for specific services or flows.  We may also use policy management, but more in the form of automated service responses to resource events.  There are plenty of early SDN examples of predefining failure modes to create fast transitions from a normal state to a failure-response state when something goes wrong.
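The idea of predefining failure modes can be shown with a small Python sketch.  The topology, node names, and table layout are hypothetical; the point is only that the failure-response path is looked up rather than computed when the failure happens.

```python
# Normal-state path and precomputed failure-response paths for one flow.
# Node names and topology are hypothetical.
NORMAL_PATH = ["A", "B", "D"]
FAILOVER_PATHS = {
    ("A", "B"): ["A", "C", "D"],
    ("B", "D"): ["A", "C", "D"],
}

def links_of(path):
    """Return the set of undirected links a path traverses."""
    return {tuple(sorted(pair)) for pair in zip(path, path[1:])}

def path_for(failed_link=None):
    """Pick the active path: normal state, or the predefined response to a failure."""
    if failed_link is None:
        return NORMAL_PATH
    key = tuple(sorted(failed_link))
    if key not in links_of(NORMAL_PATH):
        return NORMAL_PATH           # the failure does not touch this flow
    return FAILOVER_PATHS[key]       # no recomputation at failure time

print(path_for())                    # normal state
print(path_for(("B", "A")))          # a link on the path fails
print(path_for(("C", "D")))          # an unrelated link fails
```

A real controller would also have to push the new forwarding rules to the switches, which is where the management-connectivity issue raised earlier comes back in.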

I think it’s clear from SDN that we do have a role for analytics and also a role for policy management in the models.  How about NFV?

NFV replaces boxes with software/hosting pairs, much as centralized SDN would replace switching/routing with OpenFlow switches and central software.  You would have to manage NFV elements just as much (though not necessarily in the same way) as we’d managed real devices in the same mission before NFV came along.  We can’t manage the resources on which we host virtual functions and assume that the software is perfect, any more than we could presume in SDN that just because all our OpenFlow switches were running fine, we had coherent forwarding rules to deliver traffic or operating software centrally or in the devices.  If the services are provisioned per-user, though, then service management in a virtual world doesn’t necessarily change much because a service event can trigger a resource-assignment-remediation response.

But NFV adds a problem dimension in that a virtual device may have properties that real ones didn’t.  Scalability and repositioning are examples of this.  So I have an additional dimension of things that might get done, and there’s arguably a policy-management and even analytics link to doing them.  Scale-in/out and redeployment on command are management functions that can be automated responses to events.  Is that policy management?
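Whatever we call it, the mechanics of scaling and redeployment as automated responses to events might look like the following Python sketch.  Everything here is an assumption made for illustration: the event format, the thresholds, and the `VirtualFunction` class are hypothetical and not drawn from the ISG’s work.

```python
from dataclasses import dataclass

@dataclass
class VirtualFunction:
    name: str
    instances: int = 1
    host: str = "host-1"

# Illustrative thresholds, not recommendations.
SCALE_OUT_LOAD = 0.80
SCALE_IN_LOAD = 0.30

def handle_event(vf, event):
    """Apply an automated response to a management event; return the action taken."""
    kind = event["type"]
    if kind == "load_report":
        if event["load"] > SCALE_OUT_LOAD:
            vf.instances += 1
            return "scale-out"
        if event["load"] < SCALE_IN_LOAD and vf.instances > 1:
            vf.instances -= 1
            return "scale-in"
        return "no-op"
    if kind == "host_failure" and event["host"] == vf.host:
        vf.host = event["spare"]     # reposition the function on a spare host
        return "redeploy"
    return "no-op"

vf = VirtualFunction("firewall")
for event in (
    {"type": "load_report", "load": 0.92},
    {"type": "host_failure", "host": "host-1", "spare": "host-2"},
    {"type": "load_report", "load": 0.10},
):
    print(handle_event(vf, event), vf)
```

The structure is the same condition-to-remedy mapping that policy management uses for physical resources; what is new is that the remedies change the shape and location of the "device" itself.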

To me, this is the core of the whole story of SDN/NFV management.  We are really talking about service automation here, even today before we have either SDN or NFV much in the picture.  Today, most service automation is either applied at the OSS/BSS provisioning level or exercised through policy management (admission control, grade-of-service assignment).  In the mobile world we have those tools.  The place where we might be going wrong is in assuming that the mobile world gives us the only example of how we implement policy management.  I think the best thing to do here is to say that we’re heading for a future of service automation and that this future automation strategy will draw on traffic data, past and present, and device status (also past and present) in some measure to make machine decisions about how to handle conditions in real time.  If we apply service automation to the same kinds of services we provide today’s PCC-based policy management or path computation functions to, we could well have the same implementation.  But all future NFV and SDN management missions don’t map to this basic model, and that’s why we have to try to expand our conception of management when we expand into SDN and NFV.

A Realist’s Guide to NFV

Generally, the realities of online journalism will tend to exaggerate everything.  A development, to be press-worthy, has to be either the single-handed savior of Western Culture, or the last bastion of International Communism.  NFV has gone through this gamut, and it seems like a lot of the recent stories are skeptical if not negative.  My number one prediction for 2014 is that the hype on NFV will explode, and even more reality will be swept under the rug of generating buzz and URL clicks.  So let’s look at where we really are.

First, the ETSI ISG’s work won’t complete till January 2015 so we’re a year from being done.  However, the framework of the architecture is in place at this point and so we know what NFV will look like as an implementation.  Vendors (and even operators) who really don’t want to do NFV may use the incomplete ETSI ISG process as an excuse, but it’s lame.  Those who are taking itty bitty NFV steps are doing that because they want to.

Second, the ISG’s work, in a formal specification sense, will increasingly be linked to formal proof-of-concept work that aligns implementation reality with specifications.  I’ve said many times that a five-hundred-page spec won’t write code and can’t be deployed on anything.  At some point NFV has to become software, and in fact two levels of it (I’ll get to that next).  I can’t overstate how critical it is for implementations of NFV to be played with now, so there’s time to adapt the specifications to reflect optimum implementation practices.  I said that before this process ever got started, back in October 2012, and it’s just as true today.  We need prototypes.

Third, there are really two parallel NFV issue sets.  “Network Functions Virtualization” equals “Network Functions” plus “Functions Virtualization”.  Network functions are the things that will be run, and functions virtualization is the process of deploying and managing them.  Both are software, and we need to understand that the requirements for the two are interdependent.

For example, there are literally tens of thousands of open-source packages out there that might be harnessed as network functions, but they were not written to be deployed through NFV.  If we can adapt how they work now to NFV principles we have a rich inventory of stuff to start NFV off right.  If we can’t, then not only do we have to write virtual functions, we have to agree on the specialized interfaces and programming practices we decide are necessary.

Fourth, NFV isn’t enough.  When all this started, operators were thinking that they could deploy custom appliances as virtual functions and save capex based on the difference in cost between a proprietary appliance and a hosted function.  Well, look at home gateways as an example.  You can pick up one for forty bucks in Office Depot.  Mine has worked for at least five years, so the amortized capital cost (less cost of money and accounting mumbo jumbo) is eight dollars a year.  Do you think an operator could deploy and host something like that for seventy cents a month?
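The arithmetic behind that question is simple enough to check in Python.  The forty-dollar price and five-year life come from the example above; cost of money and accounting treatment are ignored, as they are in the text.

```python
# Figures from the home-gateway example: a $40 device lasting about five years.
purchase_price = 40.00       # dollars
service_life_years = 5

annual_cost = purchase_price / service_life_years
monthly_cost = annual_cost / 12

print(f"Amortized cost per year:  ${annual_cost:.2f}")
print(f"Amortized cost per month: ${monthly_cost:.2f}")
```

That works out to eight dollars a year, or a bit under seventy cents a month, and it is the full budget a hosted replacement would have for compute, storage, power, and operations per customer.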

Operators I’ve talked with and surveyed now agree that the value of NFV will come from its ability to reduce opex and to accelerate service velocity.  But NFV in the original strict host-appliance-logic-in-VMs form doesn’t address enough of the total service infrastructure to have massive impacts on either one.  What’s important is that NFV illustrates the need for virtualization of all the service lifecycle processes in order to be cost-efficient, and the way that’s done could revolutionize service management overall, even extending beyond “functions virtualization” into legacy infrastructure.

The next point, related, is that NFV is about managing virtualization in all dimensions.  Virtual functions, hosted in the cloud, and morphed into virtual service experiences is a lot of virtualization to manage.  Several years ago, the TMF actually framed the technical dimensions of the solution with their integration model (GB942).  The core insight there was that a complete abstract model of the service (the “contract”) has to represent the component and resource relationships in deployment, operations, and management.  Virtualization is a combination of creating valuable abstractions that are generalized to be flexible and agile, and then instantiating them optimally when they’re needed.  That doesn’t mean that NFV is about specific modeling or graphing, it’s about creating a useful abstraction in abstract.  How we then describe it in machine terms is an implementation finesse.
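To make "abstraction first, instantiation second" a little more tangible, here is a minimal Python sketch.  It is not the GB942 model and not anything the ISG has specified; the service tree, element names, and placement rule are all hypothetical.  It only shows an abstract service model being walked top-down and bound to resources at instantiation time.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Element:
    """An abstract service component; 'binding' is chosen only at instantiation."""
    name: str
    children: List["Element"] = field(default_factory=list)
    binding: Optional[str] = None

def instantiate(element, choose_resource):
    """Walk the model top-down, binding each leaf to a concrete resource."""
    if not element.children:
        element.binding = choose_resource(element.name)
    for child in element.children:
        instantiate(child, choose_resource)
    return element

def bindings(element):
    """Collect leaf bindings so management can relate resources back to the service."""
    if not element.children:
        return {element.name: element.binding}
    result = {}
    for child in element.children:
        result.update(bindings(child))
    return result

# The abstract model says nothing about where anything runs.
service = Element("business-vpn", [
    Element("access", [Element("firewall"), Element("edge-router")]),
    Element("core-transport"),
])

# Hypothetical placement rule: hostable functions go to the cloud, the rest stay on devices.
HOSTABLE = {"firewall"}
instantiate(service, lambda n: "cloud-vm" if n in HOSTABLE else "legacy-device")
print(bindings(service))
```

The same model covers both the hosted firewall and the legacy elements, which is the sense in which this kind of abstraction could extend beyond virtual functions.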

My next point is that NFV is an ecosystem.  You hear every day about this company or that who has “demonstrated NFV” and 99% or more of this is exaggeration.  You don’t demonstrate a race car by showing somebody a nut and bolt.  Every component of NFV already exists somewhere in some form, so trotting one out doesn’t make you an NFV player.  The ISG has published end-to-end specifications in early form, and those identify the functional elements of “real” NFV.  That which can address all these elements is “NFV” and that which cannot is simply a collection of some of the stuff that is included in “real” NFV.  We don’t need some of it, we need all of it.

The final point is openness.  The original target of NFV was replacement of purpose-built, proprietary, boxes by open virtual functions.  Operators don’t want to go from an age where vendors held them up for high-margin middle-boxes to an age where the same vendors hold them up for high-margin virtual functions.  What makes something “open” is the ability to freely substitute components at key places without having to re-integrate or re-program everything.  The ISG can do a lot to assure openness within its scope, but if most of the value of NFV comes from operations efficiencies and service agility that spreads out to cover all of the service management process, then all of that has to be open for openness to matter.

NFV is making good progress.  Yes, there are critical issues to be faced but they’re not the ones being talked about.  Most critical is the proof-of-concept stuff; we will never get this to happen without running prototype implementations and that is the absolute truth.  So if you hear an NFV story from a vendor, ask if they’re supporting a PoC.  If you hear speculation about implementation issues or the value of NFV, ask if the person doing the speculating is working on a prototype to develop their points.  And pay no attention to people who aren’t really involved in this, because they don’t know what’s real.  That’s my advice for what’s going to be a hype-ridden 2014.

In 2014, it’s “Battleground Metro”

This is always the time of year when you’re supposed to look into the crystal ball and talk about the near-term future.  OK, one of the areas we need to be watching in 2014 is metro.  I’ve noted in prior blogs that metro-area networks are critical because the majority of profitable traffic both originates and terminates in the same metro area, and that percentage is on the rise rather than declining.  I also want to point out that if you believe in “carrier cloud” you have to believe in a radical increase in metro volumes, and that NFV is the crowning factor in the “metro-ization” of infrastructure planning.  To understand where networking is going in 2014, where metro is going, we need to look at these factors.

Most of the valuable content we see today comes not “over the Internet” in a topological sense but rather from a content delivery network cache point that’s actually inside the metro network of the viewer.  This is because the high-value content is highly reused, which means it makes sense to keep a copy in all the metro caches where it can be delivered quickly and with a lower drain in network resources.  That’s not going to change, and as Amazon Prime or Roku or Google or Apple TV take hold, we’re going to see even more local video delivery.  Video is the primary source of traffic growth.  That means that the majority of the capacity growth we should expect in the coming years will have to be growth in metro capacity—user to cache.

Then we have the cloud.  Some, perhaps, might propose that cloud-created services could originate in some giant data center in the center of the country or (as Amazon’s model defines) regional centers.  That’s true in the early low-penetration days of the cloud, but if the cloud is successful on a large scale then there’s no further economy of scale in centralizing the large data center demand.  You regionalize, then metro-ize it, for the same reason you cache video.  It makes no sense to haul traffic a thousand miles to get to a data center when you have the cloud opportunity in any respectable metro to justify a local data center.

And more than one, almost surely, in any populated metro.  The Erlang efficiency of a data center plateaus as it grows, to the point where you may as well subdivide it further to improve resiliency and also to improve transport utilization.  Again, I point out that we routinely “forward cache” popular content so there are multiple copies in any given metro area, at different points.  Thus, as cloud grows we would expect to see a growing number of data centers in each metro area, linked by their own “cloud intranet”.
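The plateau can be illustrated with the classic Erlang B loss formula.  This is only a sketch under stated assumptions: I'm treating a data center as a pool of identical "servers" that blocks a request when every one is busy, and the 1% blocking target is my own illustrative choice, not an operator figure.

```python
def erlang_b(offered_load: float, servers: int) -> float:
    """Blocking probability from the standard Erlang B recursion."""
    blocking = 1.0
    for m in range(1, servers + 1):
        blocking = offered_load * blocking / (m + offered_load * blocking)
    return blocking


def max_utilization(servers: int, target_blocking: float = 0.01) -> float:
    """Highest per-server utilization that keeps blocking at or below the target."""
    low, high = 0.0, 2.0 * servers  # blocking at 2x capacity is far above any sane target
    for _ in range(60):             # bisection on the offered load
        mid = (low + high) / 2
        if erlang_b(mid, servers) <= target_blocking:
            low = mid
        else:
            high = mid
    carried_load = low * (1 - erlang_b(low, servers))
    return carried_load / servers


for size in (10, 100, 1000, 10000):
    print(f"{size:>6} servers: utilization {max_utilization(size):.3f}")
```

Each tenfold increase in size buys a smaller utilization gain than the one before it, and that is the point at which splitting a big data center into several smaller ones costs very little efficiency.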

Where NFV comes in is twofold.  First, NFV postulates the substitution of hosted virtual functions for physical middle-box appliances.  Some studies suggest that there are as many middle-boxes as access devices; my own model says that about 28% of network devices fit into the middle-box category, but that these represent 35% of network capex.  This would clearly increase the data center demand in the metro, and since middle-box services tend to be best offered near the point of user attachment (you want a firewall close to the user, for example) it makes sense to assume that NFV would encourage distribution of hosting closer to the edge.  This is why operators have told me they would be installing servers “everywhere we have real estate”.

Second, and in my view most importantly, NFV is the architecture to create OTT-like new services for operators.  It makes no sense for operators to adopt the OTT model of basic web-hosting for their services; they can differentiate themselves on integration of services and on quality of experience if they can automate their own operations practices to deliver this stuff at a good price.  And remember, even without an FCC-hinted reform of neutrality rules, the operator cloud would be immune from neutrality rules.

What, you say?  Yes, even in the original neutrality order, things like cloud and content services are exempt from neutrality within their infrastructure boundaries.  You have to deliver content using neutral principles, but inside the CDN you can prioritize or do whatever you like.  That means that as we push “the cloud” close to the edge for cloud service and NFV reasons, we create a model of service where neutrality is an inch deep.  You have to obey it from the user’s home/business to the edge of the cloud, but no further.  You can move all kinds of stuff around inside the cloud, any way you like, as non-neutral as pleases your business model.

That’s the key metro point, I think.  Even if operators can’t immediately benefit from metro-izing their planning for profit reasons, they are experts at planning for regulatory optimality.  Build a metro cloud, dear carrier, and you have nothing further to worry about with neutrality.  And how many OTTs will build metro clouds?  The ultimate differentiator is within the operators’ grasp, and they’re driven to it by reduced costs (optimal use of data centers and transport), improved revenues (new services built on cloud and NFV), and regulations (you don’t share with anyone and you can charge for QoS inside the cloud).

This is the risk that infrastructure evolution poses for vendors.  It’s not SDN sucking the adaptive heart out of Ethernet or IP, but metro sucking all the dollars into a part of the network where “transport” means nothing more than connecting data centers with superfat (likely fiber) pipes.  If all the traffic is staying so close you can climb a tall tree and see its destination, how much core routing do you need?  In fact, it becomes easy for SDN principles to take hold in a metro-dominated world because the fact is that you have a very simple connection mission to manage—everyone goes to the nearest cloud edge and is subducted into the magma of NFV.

This isn’t going to happen in 2014, obviously.  It’s going to start in 2014, though, and we’ll probably see who gets it and who doesn’t by the end of next year.  That will determine who we’ll see around, alive, and prospering beyond 2015.  Vendors, standards bodies, network operators, everybody.  The metro could be the real revolution, and revolutions are…well…Revolutionary.

Wandl and Cariden: Is There a Real Value?

There have been some LinkedIn discussions regarding my blog on Juniper’s Wandl buy, and some comments have opened the broader question of whether policy management and path computation might be a factor in the deal, even to the point where they might be incorporated into Juniper’s Contrail SDN controller as a result of the buy.  So the questions are 1) whether there’s a link between traffic analytics and path, policy, or SDN control, 2) whether this might be behind the Juniper buy, and 3) if so, whether it’s a good enough reason.  To get to these answers we’ll have to excurse a bit into Cisco’s Cariden deal as well.

At a very high level, there are two views of how networking will evolve.  In one view, current adaptive network protocols and devices are enhanced to make networks more responsive to business and application needs.  Think of it as “adapting with boundaries”.  In the other, we step away from the notion of adaptive network behavior in favor of more explicit and centralized control.  You might have expected that we would argue this was the “legacy versus SDN” choice, but both sides call their approach “SDN” so we’re kind of deprived of that easy out.

There are also two views of what drives “evolution” of networks.  In one view, it’s driven by traffic.  The goal of networking, under this view, is to handle traffic efficiently so as to reduce the overall cost of both network equipment and network operations.  In another view, evolution is driven by services, meaning differentiable things that can be sold.  This particular dipole could be called the “Internet versus managed services” or “dumb versus smart”, but everyone salutes the Internet and nobody wants to be dumb so that characterization isn’t helpful either.

We can also say that there are two visions of where we care about any of the other visions.  One says that the “core” is the heart of the network and that big iron and bit-pushing are what it all comes down to.  The other says that the great majority of profit today, and credible future profit, would involve services with traffic topologies that span a single metro.  They say that revenue per bit in the core is the worst in the industry, so why invest there.  Their opponents say that’s why you have to invest there.  Users don’t care because they don’t understand how we build networks anyway.

What everyone sort of agrees on is that you have to optimize something, somewhere and that’s likely to demand that you understand what’s going on.  So we can presume that there would be a kind of marriage of data-gathering and analytics that would present a vision of operations, likely now and in the past, and would be able to support decisions of some sort based on that vision.  We could apply that knowledge in a number of ways, and in fact we could use it to support any of the poles of any of the visions we’ve cited here.

Traffic engineering in the core of an IP network is usually accomplished using MPLS, through a process of path computation.  That means that you can use traffic knowledge to drive PCE processes that would optimize handling in an IP core, and Wandl or Cariden can certainly be used that way.  You could also use the traffic knowledge to drive a wider scope of policy management processes in the network, even perhaps out to the edge, to increase the span of control.  So Wandl could be used for that too.
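To make the path-computation idea concrete, here is a rough sketch of constrained shortest-path selection: drop any link that lacks headroom for a new demand, then run an ordinary shortest-path search on what remains.  The topology, costs, and load numbers are invented for illustration, and nothing here represents the Wandl or Cariden products or any real PCE interface; the measured loads simply stand in for what a traffic analytics tool would supply.

```python
import heapq


def cspf(links, source, target, demand):
    """Constrained shortest path: prune links without headroom, then run Dijkstra.

    links is a list of (node_a, node_b, cost, capacity, measured_load) tuples,
    treated as bidirectional.  Returns (total_cost, path) or None.
    """
    # Step 1: keep only links whose spare capacity can carry the demand.
    graph = {}
    for a, b, cost, capacity, load in links:
        if capacity - load >= demand:
            graph.setdefault(a, []).append((b, cost))
            graph.setdefault(b, []).append((a, cost))

    # Step 2: plain Dijkstra over the pruned topology.
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == target:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (total + cost, neighbor, path + [neighbor]))
    return None


# Hypothetical four-node core; capacity and load in Mbps.
links = [
    ("A", "B", 10, 1000, 900),  # cheap but nearly full
    ("B", "D", 10, 1000, 200),
    ("A", "C", 15, 1000, 300),
    ("C", "D", 15, 1000, 400),
]

print(cspf(links, "A", "D", demand=50))   # small demand still fits the cheap path
print(cspf(links, "A", "D", demand=200))  # larger demand is steered around the busy link
```

The only thing that changes the answer between the two calls is the measured load on the A-B link, which is exactly the knowledge the analytics tool contributes.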

In a centralized SDN model where we replace traditional adaptive IP with something else, the presumption is that the central control process is what manages paths, so it’s that process that needs the knowledge of traffic.  Since Wandl (and of course Cariden and other Wandl-competitors) can provide traffic knowledge, it could be used to couple that knowledge to an SDN controller.  In fact, if we presumed that SDN devices, lacking adaptive behavior, might also lack some of the traffic knowledge that’s necessary to do adaptive routing, you could argue that SDN might need traffic knowledge more than legacy networks would.

What all of this adds up to is that you can in fact say that traffic analytics are useful whether you’re thinking SDN or legacy, core or metro.  So we answer our first question.

Could Juniper have thought this was a good reason to buy Wandl?  Here, of course, we’re getting into something very subjective and somewhat unknowable, which is how any vendor thinks about something.  Certainly it might be useful to have multi-vendor traffic/network analytics given that virtually every carrier network is multi-vendor.  And $60 million isn’t a lot of money these days; Apple bought social-media analytics company Topsy for more than three times that much.

My problem is that this is the kind of “so cheap I can’t refuse it” justification, which isn’t enough.  The best strategy for Juniper would seem to be to provide a data-gathering interface to any analytics platform out there rather than to seize on a single one and hope it’s adopted universally by operators.  They could do that with Wandl, with whom they already have a partnership.  So just “getting” the data doesn’t seem to be enough of a reason to buy them.

We could say that Juniper bought Wandl because Cisco bought Cariden, but while that’s possible in one sense it’s illogical in another.  Cisco needs to minimize the SDN revolution at all costs because they’re the networking establishment.  We’re in a market that thinks that analytics, even if you don’t do anything different with the results, is a revolution.  So add analytics to POIPN (Plain Old IP Networking) and you have SDN?  As I’ve said before, you can effect software control without centralization, OpenFlow, or any of that stuff.  That’s been Cisco’s angle.

The problem is that it can’t be Juniper’s.  You can’t become a come-from-behind winner in traditional networking by aping every move of the market leader, unless you want to be a price leader.  Guess who’s the price leader?  Huawei.  Juniper needs to take SDN further, take networking further, if it wants to gain market share on Cisco.

We can’t say whether Juniper believes that Wandl will suddenly make it a market leader, but if they do they’re delusional.  More is needed, and so we’ve answered our last two questions.  But that obviously raises another, which is what “more” might be.

To me, both SDN and NFV signal that operators believe that there’s a fundamental problem with the notion of uncontrollable, adaptive, networking.  That problem isn’t just “utilization” either, because fiber improvements are making it possible to lower the cost of transport bit-pushing.  The problem is that this model of networking is associated with commodity connectivity and that’s never going to be profitable enough for deregulated operators to survive on.  I’ve heard all the presentations about how the price and cost lines on bit-pushing are about to cross.  They’re right.  You can slow the process a little by fiddling with utilization or fiddling with capital costs, but in the long run even opex enhancements aren’t enough.  You have to sell more, which means you have to create networks that are subservient to services.  That starts at the top, where the services are, which is why the cloud and NFV are important.

I think Cisco knows this, and they’re just waiting for the right moment to jump off the old bits and onto the new.  Cariden is useful to them while they’re in titular denial.  The question is whether Juniper sees the future too, and Wandl doesn’t answer that question.

Reading Oracle’s Quarter

Oracle’s numbers came out, and they were good after a run of misses.  One of the obvious questions raised by Oracle’s success is whether it’s Oracle’s success or a sign of secular recovery in software or in tech.  Let’s look at some of the signals to find out.

This tends to be Oracle’s best quarter; they did well at this point last year as well.  Our surveys and model say that companies are tending to push tech spending to the second half to counter economic uncertainties, and also that they want to get any upgrades done before the holiday season.  Software sales were up 5%, new software licenses up 1%, and the company reported very strong growth in their cloud portfolio and in “engineered systems”, the appliance space.  Hardware, meaning servers, continued to be a disappointment.

One thing I think comes out of the Oracle call loud and clear is that systems/servers are a very tough business to be in.  If you think about the notion of network functions virtualization, which is to shift features from network appliances to commercial-off-the-shelf servers (COTS), you realize that the notion of COTS is what’s fundamental here.  COTS could as easily stand for “Commodity off-the-shelf”.  Software, particularly open-source Linux, has tended to commoditize server platforms because it limits differentiation in everything but price.  This puts companies like HP and IBM, who have large hardware exposure, in a tight position.  They need to have strong software stories, but the promotion of software could arguably be driving a nail in the hardware coffin.

One solution to this, the one Oracle is promoting, is the “engineered system” or appliance.  By sticking software and hardware into a single package you create something that’s feature-differentiable.  Clearly this is working for Oracle at this point, and I think that’s largely because the transition to virtualization at the hardware level (including the cloud) demands thinking of database as a service, which promotes the notion of an appliance as the source.  We’re only at the beginning of the cloud revolution (we’ve addressed less than 10% of the near-term market potential) so Oracle can ride this horse for some time yet.

In the longer run, though, you have to look at networking and NFV/SDN for inspiration.  If operators are trying to convert network appliances into combinations of software and COTS, then database or transaction appliances are just circling the drain a bit higher in the bowl.  Probably by 2018 it will be difficult to promote “cloud appliances” and that’s largely because network operators are creating the framework to couple software features to hardware in a more automated, elastic, and manageable way with NFV.  It’s not going to happen quickly (certainly not in the next year, probably not the next two) but it will happen.  When it does, it will gradually percolate out of operator infrastructure and into the cloud and the enterprise.

A second point from the Oracle call is the supremacy of “intrinsic margins”.  The market that you need to own is the one with the highest profits.  In the cloud, that’s the top of the service food chain—SaaS.  Oracle has always liked the SaaS space because SaaS displaces the most cost and because Oracle thought (correctly, I think) that they could gain market share on competitors like SAP who were reluctant to go all-in on a hosted-software model.  If you can offer your software as a service, buyers don’t care about the underlying platform and hardware at all, which means that you can use your own stuff there to augment your profits and that you tend to commoditize PaaS and IaaS offerings of others.

Oracle’s weakness in the cloud, potentially speaking, is that the real future of the cloud isn’t even SaaS as much as it is cloud-specific applications based on platform services.  Amazon has the right idea in cloud evolution from the IaaS model.  You augment basic cloud with all kinds of neat web-service additions that create in effect a cloud-resident virtual OS.  People write to that to get the special cloud features you’ve included, and the next thing you know we don’t write apps for hardware any more—but for the cloud.  If Amazon continues this push and Oracle doesn’t counter it, they risk having software developers rush to Amazon and create SaaS offerings there that transcend anything that Oracle or anyone else could create using traditional IT elements.

Oracle has a combined point of weakness in the network.  It’s a weakness in appliances because NFV is an anti-appliance move and Oracle’s not particularly an NFV player despite the hype that Tekelec and Acme make it one.  Having virtual functions is having software components; what makes NFV is the orchestration and management.  Networking is also a weakness because arch-rival Cisco is a network giant and a competitor to Oracle in the cloud, and both IBM and HP have either their own network products or OEM products there.  The cloud is as much networking as it is IT, so if Oracle wants a secure bastion there they have to create a position in the space, preferably one that denigrates competitive incumbencies.  Both SDN and NFV would do that, and so Oracle needs to do a lot in those spaces.

Insiders at Oracle tell me the company really seems clueless about how to exploit their Acme and Tekelec acquisitions.  Some say that they need to focus them on cloud UC.  Some say they need to get a broader NFV and SDN story, but that latter group (likely the ones who have the right answer) really has no easy strategic path forward because nobody in Oracle seems to be thinking much about either topic beyond slideware.

So here’s my net.  Oracle can’t be said to be signaling a general shift in IT.  We need more and broader data points and the others in the space did not provide them.  Oracle can be said to be signaling that a differentiable, margin-generating, story is better than one that presents commodity hardware and hopes for the best.  You can augment your sales effort (as Oracle has done) if you have something to say besides “we’re cheaper”.  But Oracle is still captive to broad commoditization trends, and it has to address them now while its margins will still support the investment.

Will FCC Experiment Delays Hurt Networking?

AT&T has joined Verizon in spinning off some rural phone lines/customers that didn’t present a profit/revenue profile that matched their future requirements.  This trend, and others, is what has motivated the FCC to take up the issue of “transition”, and the outcome of the FCC’s review could have major impacts.  I noted the FCC’s intentions in a prior blog, and some of those intentions have now been made into concrete steps.  We just don’t know where those steps might be taking us.

The FCC had established a Technology Transitions Task Force to evaluate the impact of technology changes, obviously, but the fact is that it’s the business changes that demand attention.  Over the roughly twenty years since the Telecom Act of 1996 created our current framework of regulations, we’ve seen the price per bit of capacity drop by 50% per year.  That has made low-bandwidth services like voice something that could be offered over the Internet for literally nothing, and you can’t sell stuff in competition with things that are free.
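To see why a steady decline of that kind ends at "literally nothing", here is a quick compounding sketch.  It takes the 50% annual figure at face value and assumes a constant rate, which real pricing never follows exactly; the 17-year span is my own count from 1996 to 2013.

```python
def relative_price(annual_decline: float, years: int) -> float:
    """Fraction of the starting price left after compounding a yearly decline."""
    return (1.0 - annual_decline) ** years


for years in (5, 10, 17):
    remaining = relative_price(0.50, years)
    print(f"after {years:>2} years: {remaining:.6%} of the original price per bit")
```

At that pace, the capacity needed for a voice call costs a tiny fraction of a percent of what it did when the Act was passed, so there is very little left to charge for.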

The immediate result of the declining capacity cost has been to shift expensive calls off the bill, either by shifting them online or to unlimited-call plans.  Both these have capped and eventually pushed down ARPU on voice services.  At the same time the cost of providing baseline voice services has grown because of higher labor costs and because rural geographies are increasingly seeing growth of off-grid vacation homes and other low-density development that’s inherently more expensive to serve with wireline.

The TTTF report, issued December 12th, recommended experiments on the way that consumers could be served by advanced technology options.  What this is really about is answering a single question—what can we do to transition TDM voice to something less expensive without creating any consumer risks along the way.  The option that operators would like is to transition it to mobile service because that could eliminate the whole issue of wireline voice, and move users to something that’s (at this point at least) still profitable.  Another option is to decide how you could provide VoIP to users who don’t have Internet access today because they don’t want it or can’t afford it.

Experiments will provide some comfort in saying that wireline voice and wireline voice carrier-of-last-resort obligations can be eliminated, and the fact is that they have to be.  We made a decision in 1996 to deregulate the industry, to eliminate the regulatory-monopoly model that was (some said) limiting innovation but (some said) was also protecting the investment in infrastructure.  Now we have a big chunk of telco budgets spent providing a service that almost everyone thinks is a dinosaur—POTS.

The larger question that the FCC has to answer relates to the rest of the transitional issues, and I don’t mean the simple wireline/wireless or copper/fiber issues.  Those aren’t issues at all, they’re simply market shifts that are really out of policy-makers’ control.  The real issues relate to whether you can have an industry that’s half-regulated.  The Telecom Act didn’t repeal the Communications Act of 1934, it simply amended it.  There are still universal service obligations, E911 obligations, wholesale obligations, and other silly holdovers.  Most recently we initiated net neutrality regulations that go beyond guarantees of access to guarantee neutral network behavior under all kinds of conditions, guarantees that have kept QoS and settlement out of the Internet and limited the incentive to invest.  In the US and in Europe, operators are looking for other ways to make profits, and if they can’t do that in their home countries and core industries, they step outside one or both.  In Europe, many operators have their premier service experiments running in some other country, even continent.

But I think that the real issue for the FCC isn’t whether we can regulate an industry we deregulated at least in a titular sense.  We can’t and keep it healthy.  We need TTTFs not because technology changes are unusual and need special actions to deal with, but because we have created a rigid framework for an industry in one area and demanded it be flexible in another.  We have to regulate or deregulate.  I think that many in the FCC know this, and know that the experiments that are going to be run in 2014 are really aimed at defining the boundaries of “minimum regulation”, the level needed to protect consumers while letting the technology of the future drive the network of the future.

What is sad is that it’s taken us this long.  When we look back on FCC policies and FCC chairmen, I don’t think that the last Chairman we had will get high marks.  Genachowski was an activist at a time when networking needed a minimalist, and worse, his activism was biased strongly against those who build the networks and in favor of those who just build things that generate traffic, as the OTTs do.  We need both business types to protect consumers, and we should never have taken regulatory positions that ended up favoring one type, particularly when without infrastructure there are no services of any sort.

My fear is that this whole “experiment” thing will be another delay.  We already see the signs, in moves by players like AT&T and Verizon to spin off lines that aren’t going to be profitable, that the industry is developing serious structural problems.  Vendors like Cisco say that “macro” conditions are hurting their sales, but one of those conditions is that operators aren’t able to earn the same return on investment as OTTs do, and yet both operators and OTTs sell stock in the same markets and have the same shareholder concerns about growth.  We could have changed this industry three or four years ago by simply doing the right thing—either creating a true regulated monopoly again and regulating the outcomes we wanted, or letting the market do what comes naturally.  Markets divided against themselves can’t stand either, and we could fritter away the health of the industry taking baby steps now.

Does Juniper Have a Magic Wandl?

Juniper announced it was buying its long-time network monitoring and planning partner, Wandl.  Obviously this could indicate a strategic shift on Juniper’s part, particularly when you marry it with the departure of the “Microsoft camp” (CEO Kevin Johnson and software head Bob Muglia).  It could also mean nothing much, and it’s too soon to call this one definitively.

Wandl is a company that provides planning support for network operations by analyzing traffic data and network conditions.  Most people would put this in the “analytics” category rather than the “operations” category because the products don’t actively control network behavior in real time.  Operators in our survey put the company most often in the category of a capacity or network planning tool, which is an element of OSS/BSS but not the key element (service creation, order management, and service management in real time are the keys).

That doesn’t mean that Wandl couldn’t become part of a next-gen OSS (NGOSS) strategy, and this is where the Juniper decision to buy them gets interesting.  On the surface, Wandl as a partner contributes no less to Juniper’s portfolio than Wandl as a subsidiary element of Juniper.  There’s no reason why its sudden acquisition would generate a change in strategy, but it could be that the acquisition signals a change, or a competitive countermove.

If Juniper, hypothetically, were going to launch a major NGOSS-linked initiative that would make Wandl the centerpiece of a highly critical and profitable strategy, Juniper would likely immediately have two thoughts.  Thought One:  We are about to make these Wandl guys rich with the sweat off our backs marketing this NGOSS stuff, and why not buy them and keep all the money for ourselves?  Thought Two:  We are about to make these Wandl guys the center of a major Juniper strategy and Cisco just said they’re going to do software M&A!

All of this presupposes that Juniper is about to launch an aggressive NGOSS-linked strategy, and I don’t think that’s very likely.  OSS/BSS is an enormous undertaking, something that only a few giant players (Ericsson, Amdocs, and the like) have gained any traction in marketing.  Juniper has never had a horse in that race and has just been blowing kisses at the TMF, whose stuff is the standards foundation for the space.  If Juniper were going to do something with OSS/BSS on a large scale, I’d have thought they’d be advertising that in dialogs with operators, engagement in the standards processes, and so forth.  I don’t see any of that.  I also think that such a move would be nearly impossible to bring to fruition in the near term (fast enough to help their numbers in 2014) and it would be a giant project for a company that never got its software or even network management story straight in the past.

The other possibility is that Juniper has a reliance on Wandl for some of its capacity and network planning applications, and simply fears that another network vendor will grab them up.  Wandl is involved with equipment/networks from all the router/switch vendors and so it could very well be seen as an asset in a multi-vendor world.  But the problem with this story is that if Cisco really saw Wandl as critical they could certainly have outbid Juniper.

So where does this leave us?  One possibility is that Wandl is linked to some Juniper plan in the NFV arena.  NFV and SDN could (separately or together) create a kind of network/operations shim layer between classical OSS/BSS and the evolving network infrastructure.  For example, if you used this layer to make the whole network look like a single agile virtual device, you could express that device simply to current management systems and take up all of the special tasks associated with the next-gen network inside your virtual device—including management/operations (CloudNFV supports but doesn’t demand this model, for example).  This is kind of where I think Juniper must be headed with the Wandl deal, if there’s any strategic justification for it at all (a point I’ll return to in a minute).

It would be much easier for a network player like Juniper (or its rival Cisco, or even Alcatel-Lucent) to create an NGN shim layer than to become a true OSS/BSS player.  It’s not an impossible stretch to make a true NMS that’s SDN and virtualization-aware into such a layer, and it’s hard to imagine how you can stay in networking without a management model that can do something with SDN and function virtualization.  Otherwise you’re relying on a competitor to take the higher-level management role.

This would be a pretty interesting development if it were really behind the Wandl deal, but it’s hard to see where the idea would have originated inside Juniper.  They’ve never had a strong operations story, their software story never gelled under Johnson/Muglia, and those two (particularly the latter) have made a rather quick departure.  Who takes over?  Logically it would be the Mad Scientist who was the founding father of this supposed Juniper shim-layer story, or somebody brought in.  Will some executive from Wandl be that somebody?  Wandl’s key people aren’t even listed on its website, nor are their names provided by the major sources of executive and board information (the company is privately held).  I don’t know enough about who’s there to know who might be a candidate for a bigger role.  David Wang is the CEO, though, and if he suddenly takes a bigger role we’d know this was the angle for the deal.

The other possibility is that this is a small move.  Juniper, like other companies, sometimes buys partners just because it sees the purchase as a sound financial step (the earnings are good) or because there’s a small risk that a competitor would make the same small-step decision (Huawei is also a Wandl partner, for example).  Juniper has not made many (if any) successful strategic acquisitions, so it would be a stretch for them to make one now.

But if there was ever a time for it, this is the time.  The new CEO starts at Juniper in just a couple of weeks, and he’s going to have to hit the ground running.