Can a Pact Save the OSS/BSS from NFV Oblivion?

Ericsson, Huawei, and NSN have entered into a somewhat-historic agreement to break down interoperability barriers for OSS/BSS systems.  OSSii is essentially an agreement to share interface specifications, not to cooperate in broader development or to set new standards for OSS/BSS interoperability (that remains largely under the TM Forum).  The promise is to facilitate rollouts of new services by making it easier to transform OSS processes.

OK, call me suspicious but I’m not much of a believer in corporate sweetness and light, so I think we need to be looking at just why this sort of thing might be happening, and I don’t think you need to look very far.  Think SDN and NFV and the cloud.

OSS/BSS systems evolved from what could be called “craft-support systems”, meaning they grew up in a day when a lot of provisioning was running wires and connecting them to racks of gear.  As we moved toward using converged networks driven by management interfaces, we moved away from this model and started to see some tension arising between a “provisioned” view of the world (where virtual humans ran virtual cables between virtual connections but kept the processes of service creation largely intact—if virtual) and a policy-managed view in which you set goals for infrastructure overall and forget about provisioning services.  Over time, recognizing that services are what get sold, we had to fight to establish a service context for infrastructure that really had to be less service-aware in order to be profitable.

The challenge now is that we are clearly heading toward an age where “service” means something delivered through the network as much or more than something produced by it.  Cloud-hosting of stuff, including consumer monitoring, content delivery, and computing services for business, creates a whole new picture of deployment and application lifecycle management, a picture that is very software-centric.  Most of you know that the cloud has deployment interfaces for basic tasks and DevOps tools for complicated ones.  These interfaces and software-based ALM processes actually bypass a lot of what the TMF and the OSS/BSS community have worked to establish.  There might have been hope among some optimists/purists in the space that somehow provisioning the network of the future would still be OSS/BSS-driven.  Not happening.  The TMF is way too slow, the market way too fast, and the tools of the cloud way too far ahead.

NFV is probably the big factor in this, because if we’re going to host service features in the cloud (as we are, even if NFV as a body isn’t universally prepared to make that leap of faith) we’re building the whole of networking as a cloud application.  OSS/BSS will likely play a role in this, but just what that role might be is not clear at this point.  Focusing as they are on virtualizing network functions, the NFV activity isn’t looking at the source of orders or the conduit for operations practices.  The kids in the TMF will have to learn to play on their own, and so will the vendors.

Ah, the vendors.  I didn’t forget about them.  What’s happening, I think, is that the players who make a lot of money on OSS/BSS in product or integration terms are now looking at the high costs and high inertia of the processes and saying “This isn’t going to be acceptable to operators”.  They believe that if they stay the course and make every OSS/BSS project into something that’s more complicated than writing a computer operating system from scratch, the cloud and NFV revolution will pass them by, and NFV principles applied to network operations will drive even network path provisioning along SDN lines, leaving nothing for OSS/BSS to do and no money available to buy it or integrate it.  So you conspire (OK, you join hands in the spirit of public good) and try to constrain one of the costly and high-inertia issues—the OSS/BSS interface integration tasks.

The question, of course, is whether this is going to work—for the first big three or for anyone else who joins later on or even for the TMF.  The answer probably lies in how NFV and SDN work.  The TMF is a glacier in the tortoise versus hare race, and that’s not a problem unique to the TMF.  Standards processes have always been glacial, and they’ve been getting worse over time as vendors try to manipulate them and as old-line standards people fight to control the things they’ve always controlled.  NFV and SDN are what most would call “standards processes” even though the former says it will only identify standards and the latter is so far so narrow in its focus as to be market-irrelevant.  You don’t win a race, even with a glacier, by adopting the same practices that slowed your opponent down.

I am of the view that there is a high-level harmony between NFV, SDN, and the TMF models.  I believe that if that harmony were to be accepted by the TMF guys it would likely be accepted by the SDN and NFV people too, and we’d have a realistic framework for OSS/BSS integration with the cloud, with SDN, and with NFV.  But I have my doubts about whether the TMF will in fact accept the optimum approach, or at least accept it fast enough to influence deployment.  By the end of this year there will be clarity on a complete NFV deployment model, spawned by people who take NFV goals and apply them along cloud/SDN lines rather than along TMF/OSS/BSS lines.  If that happens, then only the happy accident of convergence on an early implementation by both NFV and TMF will save the role of the OSS/BSS in the network of the future.

Might the FCC Spur SDN/NFV?

Anyone who’s read my blogs knows I’m no fan of current FCC Chairman Genachowski.  I’ve always believed his role in the VC-and-startup world has made him too much a fan of Internet novelty and not enough a fan of reality.  In particular, I’ve been critical of the neutrality position he’s championed, which includes the preservation of bill-and-keep and a de facto prohibition against content-provider-pays models.  Now he’s going, and guess who’s taking over?  Many say it will be a representative of the other side, the network operators.  If that’s true, it may actually help new trends like SDN and NFV.

The newly nominated FCC Chairman is Tom Wheeler, who formerly headed both the NCTA and CTIA and is not seen as an unbridled fan of the venture world.  His nomination has stirred concern among the Internet-as-usual advocates, and I personally hope that he sponsors a revisiting of the whole settlement-and-payment process.  Right now, IMHO, investment in the Internet as a network is hampered by the low return on investment, and that’s largely because the bill-and-keep model focuses on getting users onto a given ISP (so that ISP can bill and keep) rather than on building connectivity among ISPs and in particular encouraging QoS peering.  We’re not going to have Internet QoS without making ISPs settle among themselves; otherwise only the guy who owns the customer gets any reward for providing it.

We may see an opportunity for the FCC to rethink things coming just as Wheeler takes the helm.  ESPN reportedly wants to do a deal with AT&T, Verizon, or both to subsidize mobile access to ESPN content, so it won’t count toward the users’ data caps.  Such a deal would be just what Genachowski doesn’t want (which is likely why it won’t be requested as long as he’s in the game), but Wheeler could use this to set boundaries on the settlement issue.

The FCC is a Federal commission, and so under US law it’s kind of a court of fact.  Unlike courts, though, the FCC is not bound by its own precedent, which means a newly constituted FCC could just change its mind.  That’s happened from time to time in the past, but usually when a change of political power brings about the shift in the FCC.  This time we’re just moving from one Democratic appointee to another.  Still, the difference in Wheeler’s background is hard to overlook; it could be a signal that the Administration recognizes its policies have worked against the consumer, and against the Internet as well.

It’s fair to ask what the consequences of the move might be.  The immediate effect would be to promote mobile video from the large players, providing more revenue to mobile operators.  An additional revenue stream in the metro would almost certainly result in major remaking of CDNs.  Operators aren’t going to build capacity where they can shortstop transport.  They’ll push caching further forward, and the immediate impact of that will be to increase the demand for a more agile mobile delivery architecture, something to replace or augment the Evolved Packet Core.  EPC is going to be pressured even without a reversal of content-provider-pays policies; small cells and WiFi both complicate the simple picture of mobility that the 3GPP codified years ago.  Mobility, IMS, and EPC are big operator targets for SDN and NFV.

A broader change in policy is possible too, as I said.  If the FCC went all-in on settlement, it would likely generate QoS peering among ISPs, and that would likely accelerate the transition from provisioned VPNs to Internet VPNs for long-haul branch connectivity in particular.  It would accelerate the use of cloud computing because cloud providers could pay for QoS or even pay for network services under cloud contracts.  In short, it could very well push a sane model of venture investment rather than pushing continued attempts to re-invent social networking.  To me, the sooner the better.  We’ve had a venture market funding riotous consumption without any regard for production, and it’s time we changed the rules here.  I don’t think anything will happen immediately, but as I’ve said the FCC is not bound by its own precedent, and they could signal a shift with the ESPN deal (if it’s introduced) that would shake the industry—maybe even enough to shake it into normalcy.


Putting the Software into Networking

It’s “recap Friday” again, and there are a lot of little items that (as usual) add up to some significant industry trends.  Facing important truths about something you depend on is never easy, and it looks like the industry as a whole and some companies in particular are now going to have to do that.

Networking equipment numbers have generally been soft in this earnings season (Cisco doesn’t report till next week), and I think this frames the central issue of the industry—you can’t expect buyers to spend more to generate less ROI.  Vendors have historically believed that connectivity and traffic-handling were divine mandates that enterprises and operators simply had to meet at any cost.  It was up to the buyer to make the business case.  Well, we can see where that’s been getting us.

Not surprisingly, the need to improve ROI has driven both enterprises and network operators to look at cost management as a partial solution.  This has driven some vendors to take a completely cost-side view of the future of the industry—Juniper is probably the best example with their TCO stories.  The problem with that is that the least costly thing to do is to not buy anything.  I’m looking at an old D-Link hub I bought a decade ago for probably twenty bucks, and it’s still churning out bits.  Are we saying that consumer network equipment is more reliable, has a longer useful life, than carrier gear?

Or, like most vendors, you can add some enhanced features to your products and tout them as a justification for churning the network.  This idea also fails the test of history.  Novell, once the darling of PC networking, got dumped because they didn’t recognize that at some point buyers didn’t need more bells and whistles added to file and printer sharing.  Same here in networking.

The vendors who need to be the most concerned here?  Likely the giant equipment vendors who are tied to a portfolio of low-margin equipment kept afloat by the single bright light of mobile services.  4G LTE competition among operators has given some vendors a new lease on life, but we just had what I think was the first truly insightful IMS open-source announcement from Metaswitch.  How long will it take before you can buy IMS/EPC components in open-source form and then (via the operators’ NFV initiative) deploy them on commodity servers?

And why are commodity servers important?  Because the quarterly numbers are demonstrating that proprietary servers are just not selling.  We are seeing in servers, as in networking, the fact that one box is pretty much like another—they’re samples on the paint card of life.  Thus, price competition is all that’s left.  IBM, Oracle, HP…everyone is seeing power go up and price per unit of power fall.

Why this is happening goes back to that ROI picture.  Software is the value of technology, not the boxes.  Software has low inertia, can be replicated endlessly at low incremental cost, and is the component that matches technology products to human behavior.  Hardware runs it, connects it, but doesn’t make it valuable—it’s the other way around.  The server, and the network, is a software platform.  That’s what SDN and NFV are really about: bringing both servers and networks into a functional union to take the next step in supporting what will have to be a software-driven revolution.

The problem is that this sort of revolution is hard for people.  I happen to have been a software architect in my life prior to consulting, and so I’m comfortable with software.  I’ve talked just this year with hundreds of good technologists on the network side who have no intuitive understanding whatsoever of software.  I’ve talked to software executives who have absolutely no understanding of networks.  We have market forces driving us to a unity of concept with precious few people to do the unifying.  That’s true with vendors, with network operators, with standards-writers.

When Cisco announces its numbers and makes its comments on the quarter next week, what I’m going to be looking for is an indication that they have a plan for the software side of the networking revolution.  They have taken steps like their ONE stuff, but that’s not all that different from the API position Alcatel-Lucent took, and may be un-taking as we speak.  What is necessary isn’t to expose network assets, it’s to leverage them and to use software to build a bridge to utility.  We want more network spending?  Then create more network value.  It’s not Cisco’s “Internet of things” or Juniper’s TCO.  The former says that the solution to voice revenue was more mouths, not what those mouths can say that’s useful.  The latter says that the solution is to acknowledge that there will never be a new network application, so we’ll sit on the shrinking island until global warming drowns us.  So watch Cisco this quarter and through this summer.  They have the best assets for the new fusion of networking and IT, and if they can’t make a go of it, things are going to go hard for the industry in 2014.

Is Virtualization Reality Even More Elusive than Virtual Reality?

Software, in defining networks, shouldn’t be expected to cull through a lot of details on network operation.  But yes it should.  SDN will be the agent of making applications a part of the network itself.  No, that’s NFV’s role, or maybe it’s nobody’s role because it’s not even a goal.  If you listen to the views of the future of networking, you get…views in a plural sense.  We have no consensus on a topic that’s critical to the way we build applications, networks, data centers, maybe even client devices.  We have no real idea what the demands of the network of the future might be, or what applications will demand of it.

I’ve come to believe that the issues we face in networking are really emerging out of the collision of two notions: elastic relationships between application components and their resources, and the notion that functionality once resident in special-purpose devices could be exported to run on servers somewhere.  Either of these things changes the rules; both change everything.

When we build networks from discrete devices, we connect them through well-known interfaces, much the way a kid builds castles with interlocking blocks.  Each box represents a unit of functionality and cooperation is induced across those standard connections.  Give the same kid a big heap of cement and you’re not likely to get as structured a castle.  The degrees of freedom created by the new tools overwhelm the builder’s natural abilities to organize them in a useful way.

When we pull functionality out of devices, we don’t eliminate the need to organize that functionality into cooperating systems we’d call a “service”.  In fact, we make it harder.  First, there was a lot of intra-device flow that was invisible inside the package and now is not only visible but has to be connected somehow.  Worse, these connections are different from those well-known interfaces because they represent functional exchanges that have no accepted standards to support them.  And they should never be exposed to the user at all, lest the user build a bad image of a cow instead of a good castle by sticking the stuff together in the wrong way.  Virtualizing network functions requires that we organize and orchestrate functionality at a deeper level than ever, and still operationalize the result at least as efficiently as we’ve always done with devices.  Any operator or enterprise network planner knows that operationalization complexity and cost tend to rise based on the square of the number of components, and with virtualizing functions we explode the number of components.
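
To see why that square matters, here’s a toy model (the component counts are illustrative, not drawn from any operator data): if every managed component can potentially interact with every other, the number of pairwise relationships an operations system must track grows quadratically.

```python
# Toy model: pairwise management relationships among n managed components.
# All numbers are illustrative, not operator data.

def pairwise_relationships(n: int) -> int:
    """Distinct component pairs an operations system might have to track."""
    return n * (n - 1) // 2

# A service built from 5 discrete boxes vs. the same service decomposed
# into 25 hosted virtual functions:
for components in (5, 25):
    print(components, "components ->", pairwise_relationships(components), "relationships")
```

Five boxes yield 10 relationships; decompose them into 25 virtual functions and you’re tracking 300, a thirtyfold jump from a fivefold increase in parts.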

And just when you’ve reconciled yourself to the fact that this may well suck, you face the second issue, which is that the cloud notion says that these components are not only divorced from those nice self-organizing devices, they’re scattered about in the cloud in a totally unpredictable way.  I might have a gadget with three functional components, and that creates three network applications to run.  Where?  Anywhere.  So now these things have to find one another at run time, and they have to be sited with some notion of optimizing traffic flow and availability when we deploy them.  We have to recognize a problem with one of these things as being a problem with the collective functionality we unloaded from the device, even though the three components are totally separated and may not even know about one another in any true sense.  Just because I pump out data downstream doesn’t mean I know if anyone is home, functionally, out there.

Over the years, as network equipment evolved from being service-specific to being “converged”, we developed a set of practices, protocols, and tools to deal with the new multiplicity of mission.  We began to gradually view service management as something more complicated than aggregated network or device management.  We began to recognize that providing somebody a management view into a virtual network was different than such a view into a real one.  We’re now confronting a similar problem but at a much larger scale, and with a timeline to address it that’s compressed by market interest, market hype, and vendor competition.

That’s the bad news.  The good news is that I believe that the pieces of the holistic vision of cloud, SDN, and NFV that we need to have today are all available out there.  We don’t have a pile of disorderly network cement, but rather a pile of Legos mixed with Lincoln Logs, Tinkertoy parts, and more.  We can build what we need if we organize what we have, and that seems to be the problem of the day.  The first step to solving it is to start at the top to define what applications need from networks, to frame the overall goals of the services themselves.  We’ve kind of done that with the cloud, which may be responsible for why it’s advanced fairly fast.  We’ve not done it with SDN (OpenFlow is hardly the top of the food chain) and we’re not doing it with NFV, which focuses on decomposing devices and not composing services.  We’re groping the classical elephant here, and while we may find trees and snakes and boulders in our exploration we’d better not try to pick fruits or milk something for venom or quarry some stones for our garden.  Holistic visions matter, sometimes.  This is one of them.

Taking the On-Ramp to the Virtual Age of Networking

Light Reading made an interesting point yesterday in commenting about this year’s Interop show.  I’ve been to Interop in the past, and it’s always been the bastion of Big Network Iron.  Now it may be about to show its softer side, if you believe the advance comments on keynotes and vendor announcements.  Trade shows drive hype more than they drive the industry, but a virtual change in Interop could presage something big if it really develops.  But could it be that virtual players will threaten real ones?  That’s not so certain.

As I noted in a blog I did on Metaswitch’s virtual IMS (an interview with LR on that development was also on their site), virtualization as a mechanism for radically reducing the unit cost of functionality is still a hope more than a predictable outcome.  There are many examples of purpose-built gear for residential and light business use where hardware prices of fifty bucks can be sustained.  It’s hard to imagine hosted functionality of the same level being priced much less, and it’s completely unknown whether such a framework can be operationalized.  But while we can’t expect that just transplanting network functions into the IT world will save a zillion bucks, we can presume it would do some interesting things, and these might have the effect of empowering some players and disemboweling others.

The functions that would be easiest to switch to hosted form would be the higher-layer functions involving things like security, VPNs, or application performance management.  Many vendors, including Cisco and Juniper, hoped to pull these functions into smarter edge devices to increase their value and differentiability.  If they’re hosted on servers instead, the result isn’t a reduction in the number of edge routers sold, but a reduction in the price per router.  We might see a “unit of edge functionality” that would have cost ten grand in the vendors’ smart-edge model costing nine in the new model, but two of that nine might then go to hosting and software, which would reduce vendor sales by 30% unless they got the hosting/software deal.
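
Spelling out the arithmetic in that example (the dollar figures are the illustrative ones from the paragraph above, nothing more):

```python
# Illustrative pricing, in thousands of dollars, from the example above.
smart_edge_price = 10.0  # "unit of edge functionality" in the smart-edge model
new_model_price = 9.0    # same functionality priced in the hosted model
hosting_share = 2.0      # portion of the new price going to hosting/software

# What the router vendor actually books if someone else gets the hosting deal:
router_vendor_revenue = new_model_price - hosting_share  # 7.0

decline = (smart_edge_price - router_vendor_revenue) / smart_edge_price
print(f"Router vendor revenue falls by {decline:.0%}")  # prints "Router vendor revenue falls by 30%"
```

The buyer’s cost drops only 10%, but the incumbent vendor’s take drops 30% unless they capture the hosting/software piece themselves, which is exactly the squeeze described above.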

More complex higher-layer functionality like content delivery (CDN) and the components of mobile infrastructure (IMS, EPC) could create even more impact.  Both CDN and IMS/EPC are sold in package form today, and there are open-source and cloud-based commercial versions of all of the stuff already.  A major thrust toward open-source IMS, driven in my view more by the potential efficiencies of a properly designed/componentized software framework than by the fact that the software is “free”, would undermine Alcatel-Lucent, Ericsson, and NSN who rely a lot on IMS/EPC supremacy to pull through their overall mobile/metro solutions.

In the data center, I don’t think that SDN or virtualization is as much a threat as an opportunity.  No matter how many vSwitches you deploy, you don’t switch squat unless you have something real underneath that can carry traffic.  From the first, Nicira’s papers on the topic always made it clear that you might actually need to oversupply your data center network with bandwidth since virtual overlay traffic can’t be managed on a per-switch, per-trunk basis for optimized flows.  But what you can do in the data center is create a whole new model of how applications are deployed, a model that would transform our notion of security, application performance management, load balancing, and more stuff.  Each of these transformations is an opportunity for a far-seeing vendor, legacy or emerging.  If these transformations could be pulled out of the data center and extended to the branch, they would transform both carrier Ethernet networks and enterprise networks.  HP and Dell would be potentially big winners, or losers, in this changing of the guard.

Then there’s metro.  Metro will be the beneficiary of more incremental dollars than anywhere else, and more dollars means more changes could be made to create a transformation or revolution.  I pointed out before that we could add tens of thousands of new data centers to host network features.  These are green fields, open to any new architecture, and that’s more opportunity than all the enterprises on the planet will generate.  Metro is where all this virtualization will come home, literally, because whatever we know about services, one thing that’s clear is that you have to couple features to them at the customer edge where it’s easy—not in the core where it’s hard.

The key point in all of this, though, is the “R” in the “ROI”.  As I said, simple cost management, even improvements in operations costs, isn’t going to revolutionize the network because it has too slow an uptake in a modernization cycle.  You need some prodigious benefits, which means significant revenue, to create enough return to justify a major transformation.  IaaS isn’t going to cut it.  SDN isn’t going to cut it.  This is an applications-and-services revolution, one that gets funded only if we figure out how to transform the services of the service providers and the productivity of the enterprises.  So it’s not really Interop we need to be looking at to read networking’s tea leaves, it’s some yet-to-be-launched show that talks about what Apple and Google might talk about, but in carrier and enterprise terms.  Services, in the end, are what we perceive to be in our hand, and what we can do with those things.

A Tale of APIs, Executives, Teeth, and Bridges

Today we have the network sequitur and the network non sequitur.  In the former category I place the Alcatel-Lucent comments on the demise (or surrender) of their API efforts, and in the latter I place the Cisco guy leaving for Big Switch.  And amid all of this, I see tooth fairies and trolls.

Some people believe in SDN.  Some believe in NFV.  Some are holding out for the tooth fairy or trolls or ghosts.  For those with one of the first two belief sets, or others relating to improved network functionality, it’s hard to escape the fundamental truth that if software is going to control something, it’s going to do the controlling through APIs.  Further, it’s inescapable that if you start with those high-level APIs that would allow software to influence the network’s behavior, and you build downward from them, you’ll eventually create a cohesive set of control elements that translate what software wants to what the network does.  Top-down design, in short.
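
As a sketch of what “top-down” might mean in code (every class and method name here is hypothetical, invented for illustration and not from any vendor’s API): software states its intent at the top, and each layer translates downward toward device behavior.

```python
# Hypothetical top-down service control: the application expresses intent,
# and a controller layer translates it toward the network. Names are invented.

from dataclasses import dataclass

@dataclass
class ServiceIntent:
    """What the application asks for, with no knowledge of devices."""
    endpoints: list
    bandwidth_mbps: int
    max_latency_ms: int

class NetworkController:
    """Translates intent into device-level actions (stubbed here)."""
    def realize(self, intent: ServiceIntent) -> list:
        # A real controller would compute paths and push flow rules;
        # here we just emit a plan entry per endpoint pair.
        plan = []
        for i, a in enumerate(intent.endpoints):
            for b in intent.endpoints[i + 1:]:
                plan.append(f"path {a}<->{b} @ {intent.bandwidth_mbps}Mbps")
        return plan

controller = NetworkController()
print(controller.realize(ServiceIntent(["NYC", "LA"], 100, 50)))
```

The point of the shape, not the stub: the `ServiceIntent` layer is designed first, and everything below it exists to satisfy it, which is the opposite of exposing existing control mechanisms and hoping software finds them useful.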

Look at Alcatel-Lucent’s notion of APIs and you run into the concept of “exposure”, and IMHO where Alcatel-Lucent went wrong is at that very early and critical point.  Exposure is a bottom-up concept, something that makes the current mechanisms for control accessible rather than defining the goals.  If you progress from what you have toward something you don’t quite grasp, you likely go wrong along the way.  You have to get from NYC to LA by making a series of turns, but if you don’t have the context of the destination in mind all along, you’ll make the wrong ones and end up in the wrong place, as Alcatel-Lucent did.

Perhaps, as Alcatel-Lucent believes, APIs aren’t needed where the company focused its initiatives, but that means the focus was wrong and not the need.  What Alcatel-Lucent needs to do right now is to take up the NFV process as their baseline for progress.  NFV is thinking a lot about how to deploy and manage virtual functions but so far relatively little about where these things come from.  Hint:  The tooth fairy and trolls will not figure prominently in providing them!  Building services from virtual components starts with building services, and building them in a way that allows software processes to drive everything along the path to delivery.  Otherwise the operations costs for virtual functions will explode as the number of choices and elements explodes.  Any good software guy would look at the process and whip out a blank sheet on which to start drawing an object model.  Where’s that happening?  Nowhere I see, so Alcatel-Lucent can be on the ground floor, still.

On the Cisco-Big-Switch switcheroo on an executive, we’re trying to make news out of what is far more likely to be a simple career decision.  If you’re an SDN guy, why not join an SDN startup and collect a zillion dollars on the flip, or at least try to?  You can always go back if it doesn’t work out, or go to a competitor.  An old friend told me that in their late 40s or 50s, networking executives/managers get to the point where they have to roll the startup dice or forget that game forever.  But since having somebody leave establishment Cisco for revolutionary Big Switch looks like the revolution is working, it’s a great tale.  So were the tooth fairy and trolls, but that doesn’t make them real.  Stick a tooth under your pillow or look under the next bridge you see, and you’re unlikely to become a believer in fairy tales.

There’s a connection here besides the negative one.  The cloud is creating a dynamic application and resource model that demands a dynamic connection model, a different model from the static site networking or experience networking that businesses or consumers (respectively) now expect.  SDN can’t make a go of itself by providing a mechanism to do something that software can’t get its head around.  We have to start by looking at how that dynamic connection model is instantiated on dynamic infrastructure to support something stable and manageable and cost-effective enough to be commercially viable.  Big Switch isn’t doing that, not for the cloud as a whole.  If they don’t, or if somebody doesn’t gratuitously do it for them, then Cisco is going to win and Big Switch is going to lose.

Same with NFV.  We can have a great architecture to deploy virtual functions but we have to connect them into cooperating systems and manage their behavior, and most of all we have to do this while addressing the agile applications that absolutely have to come along and become the revenue kicker for telcos that makes all this work worthwhile.  And doing that means getting out that sheet of paper and drawing that object model, not drawing trolls and tooth fairies.

Juniper’s Contrail Story: Left on Base?

Juniper has released an SDN Controller based on its Contrail acquisition, and the early state of the material makes it difficult to judge just how much of an advance the JunosV Contrail product is, for the industry or for Juniper.  I want to stress that I was not briefed on the announcement and so have had no opportunity to get any more detail than was presented in the press release, blog, and a white paper.  The latter was a re-release of an earlier one, so it didn’t contribute much to my understanding.  If we did baseball here, Juniper left a guy on base at the end of the first inning.

Those who read my blogs know that my biggest gripe with SDN is a lack of an architected higher layer to receive network telemetry and service requests and synthesize routes based on a combination of service needs and topology.  My second-biggest gripe is a data-center-only focus, something that doesn’t extend SDN to the branch.  Behind the first gripe is my conviction that you need to have a tie between SDN and the IP world that goes beyond providing a default gateway, and behind the second is my conviction that the IP world has to include all of the enterprise network or the strategy loses enterprise relevance.

I can’t say that Juniper hasn’t addressed these points, but I can’t say that they have.  There is nothing in the material that’s explicit in either area, and a search of their website didn’t provide anything more than the press release.  Juniper does include something I think is important for carriers and the cloud, and even for NFV—federation across provider or public/private boundaries for NaaS.

The best way to approach enterprise clouds is to consider them a federation, because nearly all enterprises will adopt a public cloud service and also retain data center IT in some form.  If we presumed that an enterprise was a private cloud user, the hybridization of the data center with public cloud providers would be almost a given in enterprise cloud adoption.  For cloud providers, the need to support enterprises with global geographies and different software and platform needs would seem to dictate a federation strategy—either among providers or across them based on an enterprise federation vision.  Juniper promises a standards-based approach to federation.

Cloud federation at the enterprise level, meaning adopted by the enterprise and extended to public providers without specific cooperation on their part, would be a matter of providing something like OpenStack APIs (Quantum, Nova, etc.) across multiple management interfaces, and the ability to recognize jurisdictional boundaries in Quantum to know which interface to grab for a given resource.  Juniper does mention OpenStack in their material, so it’s entirely possible that this is what they have in mind.
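To make that jurisdictional-dispatch idea concrete, here's a minimal sketch, with invented endpoint URLs and field names, of how a federation layer might map a resource's administrative zone to the management interface that owns it:

```python
# Hedged sketch: enterprise-level federation as a dispatcher that maps a
# resource's jurisdiction to the right management endpoint.  All endpoint
# URLs and the "jurisdiction" field are illustrative assumptions, not any
# real OpenStack or vendor API.

JURISDICTIONS = {
    "on-prem": "https://quantum.corp.example/v2",
    "provider-a": "https://cloud-a.example/quantum",
    "provider-b": "https://cloud-b.example/quantum",
}

def endpoint_for(resource: dict) -> str:
    """Pick the management interface that owns this resource."""
    zone = resource.get("jurisdiction", "on-prem")
    try:
        return JURISDICTIONS[zone]
    except KeyError:
        raise ValueError(f"no federated endpoint for {zone!r}")
```

The point of the sketch is that the caller keeps issuing one set of OpenStack-style calls; only the dispatch layer knows there are multiple providers behind them.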

At the provider level, it’s hard to say exactly what federation would entail because it would depend on the nature of the cloud and network service being offered by the various providers.  There are three general cloud service models for IT (IaaS, PaaS, and SaaS) and a Quantum-based evolution of NaaS models as well.  In theory, you could federate all of these, and I think that would be a good idea and a strong industry position for Juniper to take.

Facilitating network federation is probably not much of an issue; physical interconnect would be sufficient.  The question is what virtual network structures are used to underpin application services.  Most of the prevailing cloud architectures use a virtual network overlay product set (OVS and related tunneling strategies) to create flexible, segmented, application-specific VPNs.  Extending these across a provider boundary could be done in a variety of ways, including creating a boundary device that links the VPNs or providing something to harmonize Quantum administration of virtual networks across providers (as I noted above).  Other formal approaches to exchanging route information would also be possible if we went beyond the virtual level to actual OpenFlow SDNs.  I think that some mechanism for SDN-to-SDN route exchange would be smart, and again something Juniper might well do—and do well.
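What would SDN-to-SDN route exchange amount to?  At its simplest, each domain advertises the prefixes it can reach and the peer's controller folds those in behind a boundary next-hop.  Here's a purely illustrative sketch of that merge step (the function and names are mine, not any vendor's):

```python
# Sketch of the SDN-to-SDN route exchange idea: two domains advertise the
# prefixes they can reach, and each controller merges the peer's routes
# behind a boundary next-hop.  Purely illustrative; not a real protocol.

def merge_routes(local: dict, peer: dict, boundary: str) -> dict:
    """Return local routes plus peer prefixes reachable via the boundary."""
    merged = dict(local)
    for prefix in peer:
        # Local knowledge wins; peer prefixes are reached via the gateway.
        merged.setdefault(prefix, boundary)
    return merged

domain_a = {"10.1.0.0/16": "direct"}
domain_b = {"10.2.0.0/16": "direct"}
rib = merge_routes(domain_a, domain_b, boundary="gw-ab")
```

A real implementation would have to carry service-level attributes alongside prefixes, but the structural question, who summarizes and who trusts whom at the boundary, is exactly the federation question the text raises.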

I just don’t know if they did it.  There was nothing on how federation was done or the boundaries of the capabilities.  Juniper isn’t alone in saying little in their SDN announcements, of course.  Beyond avowing support for SDN, we don’t really know what Juniper’s competitors have done.  The whole topic is so under-articulated that I expect our spring survey will show that buyers can’t even align their goals with current products.  We have a fairly good idea of how SDN and OpenFlow can support data center segmentation and multi-tenancy for cloud providers, but we know little beyond that.  We have less on NFV, but here it’s because the work of the body hasn’t identified a full set of relevant standards.  Juniper has only one mention of NFV on their website according to our search, and it’s not related to their current Contrail announcement, but they have made NFV presentations in the past.

I think federation could be a good hook for Juniper and SDN, but to make it one they have to embrace an NFV story to cover the buyer-side issues and they have to outline just what federation approach they’re taking in order to validate the utility of their federation claim.  It may be these things will come along in a later announcement; they’re not there now.

An API-Side View of Networking Revolutions

If you look at the developments in the SDN, NFV, and cloud areas, you find that there’s a lot of discussion about “services” and “APIs”.  One of the things I realized when I was reviewing some material from Intel/Aepona and Cisco was that there’s a bit of a disconnect in terms of how “services” and “APIs” are positioned, both with respect to each other and with respect to the trio of drivers we read about—SDN, NFV, and the cloud.

The term “service” is fairly well understood at the commercial level; a network service is something we buy from a service provider and it connects us to stuff.  At the technical level, a service is a cooperative relationship among devices linked by transmission facilities, for the purpose of delivering traffic.  One thing you can see straightaway (as my British friends would say) is that services can be hierarchical, meaning that they can be made up of component elements.  These components are commercially “wholesale” elements, and technically there’s no clearly accepted name for them.  We’ll call them “components”.  Things like IMS or EPC or a CDN are components.
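That hierarchy is easy to picture as a tree.  Here's a minimal sketch, with invented class names, of a retail service as the top of a component hierarchy:

```python
# Illustrative model of the service/component hierarchy described above.
# The class and service names are assumptions for the example, not from
# any TMF or 3GPP information model.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Component:
    """A wholesale building block of a service (e.g. IMS, EPC, a CDN)."""
    name: str
    children: List["Component"] = field(default_factory=list)

    def flatten(self) -> List[str]:
        """List this component and everything it is composed of."""
        names = [self.name]
        for child in self.children:
            names.extend(child.flatten())
        return names

# A retail "service" is just the root of a component tree.
mobile_voice = Component("MobileVoice", [
    Component("IMS"),
    Component("EPC", [Component("MobilityManagement")]),
])
```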

An API is an interface through which a software application gains access to a “service”, which in most cases is really one of our “components”.  Like services, APIs are a bit hierarchical; there are “high-level” APIs that essentially mimic what we can do with appliances (phones, for example) to request service, and lower-level ones that move toward the mission of organizing network elements in that cooperative behavior I mentioned.  High-level APIs are simple to do and don’t pose any more risk than the use of phones or other user devices would pose.  Low-level APIs could in theory put the network into a bad state, steal resources from others, and create security issues.
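The risk split between the two API tiers can be sketched in a few lines.  This is a hypothetical illustration, not any real controller's API: the high-level call is no more dangerous than a handset, while the low-level call touches forwarding state and so has to be gated:

```python
# Hypothetical sketch: a high-level API mimics what a phone can request,
# while a low-level API touches forwarding state and must be guarded.
# Method names and the role check are invented for illustration.

class NetworkAPI:
    def __init__(self):
        self.flow_table = []

    # High-level: poses no more risk than a handset making the request.
    def request_call(self, caller: str, callee: str) -> dict:
        return {"session": f"{caller}->{callee}", "status": "offered"}

    # Low-level: could misprogram the network, so gate it on a role check.
    def install_flow(self, match: dict, action: str, role: str) -> bool:
        if role != "operator":
            raise PermissionError("low-level APIs are not exposed to end users")
        self.flow_table.append((match, action))
        return True
```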

Cloud networking means the creation of a resource network to hold application resources and the connection of that network to the user community.  In OpenStack, that’s the Quantum interface.  Using Quantum I can spin up a virtual network and then (using Nova) add VMs to it.  Quantum lets me define a means for connecting this structure to the user—a default gateway for example.  So you could assume that everyone who’s talking about virtual networks and virtual network services would be talking Quantum, right?  If only that were true!
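Here's the create-network-then-add-VMs sequence as a minimal sketch, using an invented in-memory stand-in rather than the real python-quantumclient or python-novaclient bindings:

```python
# Hedged sketch of the Quantum/Nova workflow described above, using a
# hypothetical in-memory "cloud" rather than the real OpenStack clients.

class MiniCloud:
    def __init__(self):
        self.networks = {}

    # "Quantum" role: create a virtual network, optionally with a gateway
    # that connects the resource network to the user community.
    def create_network(self, name: str, gateway: str = None) -> dict:
        net = {"name": name, "gateway": gateway, "vms": []}
        self.networks[name] = net
        return net

    # "Nova" role: boot a VM and attach it to the virtual network.
    def boot_vm(self, net_name: str, vm_name: str) -> None:
        self.networks[net_name]["vms"].append(vm_name)

cloud = MiniCloud()
cloud.create_network("app-net", gateway="10.0.0.1")
cloud.boot_vm("app-net", "web-1")
cloud.boot_vm("app-net", "web-2")
```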

Let’s look at SDN from Quantum’s perspective.  If I want to build a virtual network, I need to specify the connection points and service levels.  I can generate a BuildSDN(endPointList, QoS) command and have my Quantum interface build the result, right?  Well, maybe.  The first problem is that there are a number of connection topologies—LINEs between endpoints, a LAN on which all the endpoints are connected, or a TREE for multicast distribution.  Quantum most often assumes a virtual LAN subnet, but there are use cases in both SDN and NFV (“service chaining”) that imply the network connection is a set of paths or lines.  I can fix this by adding a parameter to my BuildSDN, the connectionTopology.  The second problem is that OpenFlow doesn’t know anything about connection topologies even if I specify one; it only knows individual forwarding table entries.  Something has to organize the SDN request into a set of forwarding table changes so the Controller can send them to the devices.  If you look at the SDN stories of most vendors, you find they’re silent about how that happens.  So we have a wonderful BuildSDN API and nothing to connect it to.
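The missing piece, turning a topology request into individual forwarding relationships, can be sketched in a few lines.  The function and parameter names follow my hypothetical BuildSDN example above, not any real controller:

```python
from itertools import combinations

# Illustrative only: expand a BuildSDN-style request into point-to-point
# forwarding pairs, the kind of intermediate result a controller must
# derive before it can emit individual OpenFlow table entries.

def expand_topology(endpoints, topology):
    if topology == "LINE":
        # A chain of paths between consecutive endpoints.
        return list(zip(endpoints, endpoints[1:]))
    if topology == "LAN":
        # Any-to-any connectivity, one pair per endpoint combination.
        return list(combinations(endpoints, 2))
    if topology == "TREE":
        # Root-to-leaf distribution for multicast.
        root, *leaves = endpoints
        return [(root, leaf) for leaf in leaves]
    raise ValueError(f"unknown topology: {topology}")
```

Each pair would still have to be mapped onto a route through real switches and then onto per-device flow entries; that mapping is exactly the layer the vendor stories stay silent about.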

NFV faces similar challenges.  The body wants to identify current standards for various roles, but API standards are rare.  Most of the network standards we have are really interfaces between devices and not APIs that link software components.  How does an interface translate to an API?  Do we have to connect two virtual functions with a communications path just because the devices they came out of were connected that way?  Suppose they were running on the same platform?  And what kind of “API” would we like to see something like IMS or EPC expose?  How would those APIs relate to SDN APIs if the activity we were supporting could have elements of both?

The logical way to plan for the future of network services is to consider what’s driving the changes.  That’s the cloud.  Quantum is the interface that defines “network-as-a-service” to the cloud, and so it’s the gold standard of high-level deployment APIs.  We need to be looking at SDN and NFV through Quantum-colored glasses to establish how their own APIs will be derived from the highest-level Quantum models.

But that’s only part of the story.  What does an SDN service look like?  Is it just an Ethernet or IP service pushed through SDN devices, or is it a service with properties that can be generated in virtualized, centralized infrastructure and couldn’t have been offered before?  If we take the former view, we cripple the SDN benefit case.  Are all NFV’s virtual functions simply models of existing devices, connected in the same way and offering the same features?  If so, we’re crippling the benefits of NFV.  Deploying stuff is essential for consuming it, but deployment of a flexible framework lets us compose cooperative relationships among functional elements in many different ways, some of which might even result in a whole new way of presenting services to end users.  Can we say that any of our current activities are thinking about that?

Cisco is publishing a rich set of intermediate APIs under its ONE banner.  Intel, with the acquisition of Aepona, is entering the world of intermediate-level APIs as a means of exposing provider assets.  We need a roadmap now, from both vendors.  We need to understand how these new and wonderful APIs fit in the context of Quantum, and through that context into both SDN and NFV.  Absent that map, we have no way to navigate the value proposition of APIs…or of SDN, NFV, or cloud networking.

Revolution or Pillow Fight?

This has been a pretty active week in terms of happenings of real relevance to the future of networking.  We’ve also had some background stuff going on, things that don’t rise to the level of being part of the revolution for various reasons.  Taken as a whole, they may be a signpost into how the revolution is proceeding at the tactical level, though.

HP, Brocade, and Arista all announced high-capacity data-center switches that supported OpenFlow and all of them made considerable hay on their SDN credentials with the products.  It’s clear that SDN compatibility has street creds with the media, but less clear just how much buyers pay attention to it.  The actual number of data centers that need this sort of support is limited but the players hope “the cloud” will fix that.

It’s not that I’m against an evolution to data center SDN, but I’m in favor of our actually having a value proposition to drive it.  New technology walks a fine line between creating a new paradigm and offering bathroom reading, and what puts it decisively in one camp or the other is the benefit case that can be harnessed to justify deployment.  Yes, cloud data centers will need massive scale.  But as of last fall, the overwhelming majority of users did not believe they were building one (we’ll be looking at their spring views in July).  So the moral is that all this good SDN switching stuff has to drive not only SDN in the data center, but also cloud data center deployment, to get onto the value map.  Which means SDN players should be a lot more cloud-literate than they are.

Alcatel-Lucent posted its quarterly results, which were certainly disappointing to them but not completely surprising to many Street analysts.  Like Juniper’s results, Alcatel-Lucent’s had elements of good and bad, but the problem was that they didn’t show much progress toward a turnaround in sales (off about 22% sequentially).  Investors are getting leery of companies that can sustain profits only by cutting costs; all you need is a ruler to extend the lines, and you cross the zero axis in a couple of years.  Negative costs might be a concept of much greater value to the industry than SDN.  Sharpen your pencils, MBAs!  You have a future in technology after all.

The thing is, Alcatel-Lucent has what I believe to be the best SDN story of the lot.  It’s not as much a matter of technology as a matter of scope; they have a vision of SDN that truly goes end-to-end in the network.  If you focus on the cloud data center, you miss the cloud itself, the users who have to be the benefit drivers.  Alcatel-Lucent has captured the broadest SDN footprint, and their only problem is (as I’ve said) a substandard job of positioning what they’ve done.

I just had a conversation with a network operator on the Alcatel-Lucent SDN story, and they didn’t know it was end to end.  That, my friends, is a serious problem, and if Alcatel-Lucent wants to turn itself around it has to learn to be an effective seller and singer of technology anthems, and not just somebody who pushes geeks into a room until something new emerges, like fusion inside a star.  Some of the financial pubs are calling Alcatel-Lucent dead, and while I don’t think it’s true now or even necessarily so in the long pull, I do think that poor positioning will be fatal if it’s not corrected.  The one thing you cannot do in this industry and survive is fail to exploit your own value propositions.

HP, according to the media coverage of its switch launch, did the “fabric of the cloud”.  Wrong.  They did the fabric of the data center, and they hope that somehow the cloud is going to drive enough additional data centers to a total capacity where fabric is needed.  This, from a company that ought to be the poster child of cloud value propositions.  Look at all of the SDN and NFV stories we’re hearing out there and you find the same sad kind of reductio ad absurdum: “I want to focus on my own contribution and needs, not those of the buyers.  Thus, I will hope that they figure out their value proposition on their own, ‘cause I darn sure can’t figure it out!”

In SDN, in NFV, and even in the cloud, we have a very clear set of value propositions.  The problem is that nobody wants to take the time and trouble to tell buyers about them.  Are we, as an industry, so specialized in our work that nobody sees the big picture anymore?  If that’s true, then there’s a problem that’s bigger than Alcatel-Lucent or HP, because we need the broadest possible benefit case to justify a revolutionary change, which is what SDN and NFV are supposed to bring us.  When the benefits shrink, the revolution becomes two kids in a pillow fight.  It’s time to step back from the minutiae and start from the top, where we should have started all along.

A Cloud IMS Solution

Virtualizing network functionality is the aim of a surprisingly large number of initiatives these days, for a surprisingly large number of reasons.  In NFV, operators focused on cost savings versus custom devices in their seminal paper on the initiative, but in SDN the focus is on improving network operations and stability.  Other operators have been looking to define a framework for hosting service features that would be both agile and operationalizable.

One of the most interesting topics in the virtualize-my-functions game is IMS.  Mobile services monetization, in the advanced form I’ve called “mobile/behavioral symbiosis”, has been the second-most-important goal on the operators’ list (after content), but it’s also the area where they’ve made the least progress.  Some, at least, say the reason is IMS itself, which remains a key element in mobile services.

In theory, IMS is mostly about voice services, which many would say are strategically unimportant.  The problem is that it’s hard to offer wireless services that don’t include voice and interconnection with the public voice network.  It’s also true that as we evolve to 4G, things like mobility management (part of the Evolved Packet Core, or EPC) are more often linked to the deployment of IMS than run on their own.  I don’t even know of any plans to run EPC without the rest of IMS.  Finally, if we presume that we might eventually get wireless services from WiFi roaming, with some service continuity among WiFi hotspots or roam-over to cellular service when you’re not in one, we’ll likely need something like IMS.

The problem is that IMS isn’t known for its agility.  Words like “ossified” or “glacial” are more likely to be used to describe IMS, in fact.  People have been talking about changing that, and today I think there are a half-dozen initiatives to make IMS itself cloud-compatible.  Most are players who had IMS elements supported on fixed platforms or appliances and are now moving them to the cloud.  One vendor who’s taken a different approach is Metaswitch, who’s promoting a cloud IMS that was “cloud” from the first, and it’s an interesting study in some of the issues in cloud-based virtualization of network features.  Their cloud IMS is called “Clearwater”.

Metaswitch starts where most people will have to start, which is the structural framework of IMS defined by 3GPP.  You need to support the standard interfaces at the usual spots or you can’t work with the rest of your own infrastructure, much less with partners.  You also need to rethink the structure of the software itself, because cloud-based virtual components (as I noted in a previous blog) can’t be made reliable by conventional means.  So Metaswitch talks (if you let them) about N + M redundancy and scalable distributable state and automatic scale-out and RESTful interfaces.  Underneath all this terminology is that point about the fundamental difference of virtual, cloud-hosted, functionality.  You can’t take a non-cloud system and cloud host it and get the same thing you started with in reliability, availability, and manageability.  Skype and Google Voice didn’t implement a virtual bunch of Class 4s and Class 5s.  You have to transport functionality and not functions.

While Clearwater supports the externalized IMS interfaces, its intercomponent communication is web-like, and it manages state in a web-friendly way.  That means that if something breaks or is replicated, the system can manage a load-balancing process or a fail-over without losing calls or data.  The web manages this through “client-maintains-state”, which is what RESTful interfaces expect.  You can make that stable by providing state storage and recovery via a database, something I did on an open-source project out of the TMF years ago.  This is why Metaswitch can claim a very high call rate and a very low cost per user per year; you can use generic resources and grow capacity through multiplicity, using web-proven techniques.  It’s telco through web-colored glasses.
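A toy sketch of that pattern, with invented class names: registration state lives in a shared store rather than in the process that handled the call, so any stateless replica can pick up where another left off:

```python
# Sketch of the "client-maintains-state plus database recovery" pattern
# the text describes.  Class names and the store are assumptions for
# illustration; this is not Clearwater's actual design.

class StateStore:
    """Stand-in for a shared database of registration state."""
    def __init__(self):
        self._rows = {}
    def put(self, user, binding):
        self._rows[user] = binding
    def get(self, user):
        return self._rows.get(user)

class Registrar:
    """A stateless worker; kill one and another can take over."""
    def __init__(self, store: StateStore):
        self.store = store
    def register(self, user: str, contact: str) -> None:
        self.store.put(user, contact)
    def lookup(self, user: str) -> str:
        return self.store.get(user)

store = StateStore()
# Two replicas share the store; a request can land on either one.
r1, r2 = Registrar(store), Registrar(store)
r1.register("alice@example.com", "sip:10.0.0.5")
failover_result = r2.lookup("alice@example.com")
```

Because no Registrar instance holds anything a request depends on, load balancing and fail-over reduce to re-reading the store, which is the web-proven trick behind the scalability claims.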

Does Metaswitch answer all the questions about cloud IMS?  I don’t think even they would say that.  The fact is that we have, as an industry, absolutely no experience in the effective deployment and operationalization of cloud-hosted network services.  Metaswitch tells me that they’re involved in a number of tests that will help determine just what needs to be done and what the best way to do it might be.  Some of this activity may contribute further understanding to initiatives like SDN and NFV, because the central-control notion of IMS makes it easier to adapt IMS to an SDN framework and cloud-anything is an NFV target.

Because Clearwater is planned for release early this month under the GPL, there are some delicate questions regarding how you could deploy it in situations where commercial interfaces or components might also be involved.  My personal concern here is that the value of cloud IMS might be diminished if it can’t be tightly coupled with elements that are already deployed outside GPL, or that are developed with more permissive licenses like Apache.  I’d recommend that the Metaswitch people look into this and make some decisions; perhaps GPL isn’t the best way to do this.

The most important aspect of IMS evolution is an area where this licensing may hit home.  While Clearwater includes native WebRTC support, integrating IMS into a web world is likely to mean writing services that are web-coupled and also IMS-coupled.  Even implementations of SDN and NFV might run afoul of licensing issues in creating composite functionality, and if the goal of bringing IMS into the cloud is in part driven by a goal of bringing it into the web era, the licensing could be a big issue.

This is a good thing, though, not only because we need network functions virtualized and deployable in the cloud for the network of the future, but because we need a model for how to do it.  IMS is one of the most complex functional assemblies in networking that we’re likely to encounter in the near term, so if we can figure out how to deploy IMS in the cloud, we can deploy pretty much anything there.  Which is good, because many operators plan to deploy everything there.