A Deeper Look at IBM, Pre-Earnings

As we advance toward IBM’s earnings call on January 21st, and with additional news now out that IBM is again trying to sell off its x86 server business, it’s clear that we’re heading for one of those watershed moments.  I’ve been associated with IBM products in some way for almost fifty years now, and I’ve watched the company move with seemingly little effort through major product transitions.  The one that’s coming now may be the most “major” of all.  For the first time in my memory, there is a real question of whether IBM can make it through without serious damage.

There are a lot of silly and short-sighted views of what’s wrong with IBM, most of which focus on the fact that the company still depends on its mainframe and UNIX revenues.  It’s true that depending on “legacy” anything is a risk, but network equipment vendors have all ridden their current revenue horses for decades too.  You don’t throw out product families just because you’ve had them around for a while; you wait until it’s clear that you can’t evolve them.  Yes, IBM is losing revenue on these legacy boxes, but the x86 stuff is doing worse, which is why IBM is interested in dumping those families.

That’s the question that now faces IBM.  Companies have enormous investment in their current IBM systems, but if there is a change of paradigm on the horizon then there’s obviously a good chance that those current products won’t be suitable in the new age.  Some customers will evolve out of the lines, and if UNIX and mainframes are old-hat to the new buyers, there will be nothing to replace them.  So the question is, is there a paradigm change on the horizon?

Maybe, I think.  The Street thinks that the cloud is going to hurt IBM, presumably because it’s going to commoditize hardware.  Of course, what you find in “the cloud” is the x86 stuff that IBM wants to get rid of, so wouldn’t the cloud be more of an impact on whoever buys IBM’s COTS business?  No, it’s not that simple.  The cloud’s impact on IBM arises because it might impact how applications are written, it might be that paradigm shift.  If it is, though, it could help IBM as easily as hurt them.

If we follow the “hurt” track first, this is how it goes.  The cloud presents a new virtual-OS model based on what I’ve been calling “platform services” and targets it at worker productivity (what I’ve called “point-of-activity empowerment”).  These shifts induce enterprises to spend on new software, and that software is based on commodity technology not the high-margin mainframes and UNIX stuff.  As a result, IBM bleeds off customers in these two hardware areas, and can’t hope to replace them with new customers.  They contract sharply.

The critical presumptions in this scenario are the paradigm shift and the notion that the new paradigm is supported immediately on COTS platforms where IBM can’t be competitive by its own admission.  I think the paradigm shift is real but not imminent, and that means that IBM could address the second point, the commoditization of the new cloud platform.

There is an incredible amount of new work needed to support platform-services-based point-of-activity empowerment.  IBM is probably one of the few companies on the planet that could actually afford to drive something like that.  Suppose that IBM built up its new cloud vision (they’re spending $1.2 billion on expansion, remember?) around platform services and mobile broadband empowerment.  Suppose they supported these concepts first and strongest on their higher-margin platforms.  Might that not drive enough growth there for the revenue and profit trends for these systems to turn around?  And think about this a moment.  The premise behind “platform services” is that high-value cloud-friendly application elements are presented as web services to any OS and middleware combination.  It’s PaaS without the fixed software base.
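
To make the “presented as web services to any OS and middleware combination” idea concrete, here is a minimal sketch of what consuming such a platform service might look like.  The endpoint, payload, and function names are hypothetical illustrations, not any real IBM or cloud API.

    # Hypothetical sketch: a "platform service" exposed as a plain web service,
    # callable from any client stack (any OS, any middleware) with nothing but
    # HTTP and JSON -- no fixed PaaS software base required.
    import json
    import urllib.request

    PLATFORM_SERVICE_URL = "https://cloud.example.com/platform/context-lookup"  # placeholder

    def call_platform_service(worker_id, activity):
        """Ask a hosted platform service for point-of-activity data."""
        payload = json.dumps({"worker": worker_id, "activity": activity}).encode("utf-8")
        request = urllib.request.Request(
            PLATFORM_SERVICE_URL,
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))

    # e.g. result = call_platform_service("worker-42", "site-inspection")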

IBM could win with this, and here’s the scenario.  They field their PSPOAE (Platform Services Point of Activity Empowerment) offering based on their cloud and their high-end enterprise system software.  There’s a significant business benefit to be had from the POAE part, and that gain drives new investment (R begets I).  It also makes IBM a powerhouse in the cloud, and even lets IBM build software margins on top of x86.  IBM turns around by 2016 and becomes a powerhouse again.

The thing that’s wrong with this picture isn’t technical, it’s promotional.  IBM had the strategic moxie to bring this about for all of the time I’ve surveyed enterprises except for the last three years.  From that point, they’ve lost strategic influence because they’ve lost their marketing touch, becoming too much of a sales engine.  That’s the real reason why they can’t make money on x86.  Low-end platforms are not sold, they are marketed.  So IBM would have to turn around its positioning/marketing almost instantly, certainly in 2014, to promote something as massive as PSPOAE.  The “old” IBM could have, would have, done that, but the old IBM would never have let their brand slide.

Waiting in the wings for IBM to fail is…who?  HP or Dell, who have made money on COTS and thus don’t have the risk of obsolete platforms?  Cisco, who wants to be the next IBM and so surely would try to benefit from a decline in the “current IBM?”  No, I think not.  It’s Amazon.

Amazon now holds the IT cards, if they choose to play them.  They are already the premier provider of platform services.  They have brand credibility, they have a tolerance for low margins, they have shareholders who are prepared to take a risk for a big payoff.  If Amazon were to frame a point-of-activity-empowerment model within AWS, they’d end any hope of IBM coming through this latest transition as the premier IT company.  They’d also end any hope Cisco has for taking over from IBM, and they’d consign Dell and HP to providers of commodity iron.  Is this the kind of risk Amazon would take?  Can IBM, or Cisco or Dell or HP be bold enough to counter it?  That’s what we should be watching for as 2014 unfolds.

Tale of Three News Items

We never get much tech news on a Friday, but we do sometimes get business news about tech and today’s no exception.  We have Intel’s earnings report, IBM’s new cloud investment, and also a comment by Juniper’s new CEO on the company’s directions.  Let’s take a look at each and see what (if anything) we can learn.

Intel pretty much hit forecasts, its slight undershoot in profit coming from a larger reserve against taxes.  It seems pretty clear from their numbers that the PC business is more complicated than doomsayers expected.  In fact, there was significant strength at the high end of the PC chip line (i5 and i7) and desktop-configuration sales were up over 10%.  So what’s happening?

The answer is that we’ve had three factors influencing the PC market all along.  The one that gets the most ink is the tablet-and-smartphone overhang, and it was supposed to kill the PC.  Clearly reports of the death were, as they say, greatly exaggerated.  The second factor was the economic collapse of 2008, and we’re now in a cautious but real recovery, particularly in the US and Europe.  The final factor is the issue of new benefits to drive new empowerment—productivity-driven spending gains have been negligible since 2002.  That one is still real.

Where we are with PCs versus tablets or phones (in my view) is that consumers did indeed reduce their dependence on PCs when smartphones and tablets came along, and those who saw personal technology as an on-ramp to social networking found that the new stuff was better than the PC.  That remains true today, but remember that the impact of tablets and smartphones was essentially to reduce PC TAM.  Once you pull out the social-driven buyers, the remaining PC market isn’t much impacted by tablets or smartphones.  Thus, we have already seen most of the overhang effect from that.  Analysts who projected the overhang curve with a ruler or French curve overstated the length of the effect, which is probably the big reason why they were surprised by Intel’s numbers.

On the second point, businesses that have been holding back on refreshing aging “traditional” workstations and (to a lesser degree) laptops have now opened up as the economy recovers.  Those buyers are the drivers of the desktop and high-end chip sales gains Intel reported.  It’s this sort of refresh that drives top-end PC chip sales; consumers tend to buy models based on lower-performance chips for cost reasons.  This factor, plus the overshoot in estimating the tablet overhang effect, explains everything about Intel’s quarter in my view.

The future is another matter, and that’s where our third market factor comes in.  We see from the numbers that there are really two markets for PCs, the largely business-based buyers who demand power and accept that a price has to be paid, and the consumers who demand price and don’t compromise on it for features.  It doesn’t take a rocket scientist to realize that the latter market is going to commoditize and that Intel’s brand will be less meaningful there over time.  Chromebooks and tablets with keyboards will erode the bottom tier of the consumer PC market over time, but it’s really the commoditization that’s the story.  If Intel wants more sales and profit, they have to sustain the business PC user.

That may be a challenge, because the real opportunity for driving new benefits to businesses is in point-of-activity empowerment based on mobile broadband.  Unless Intel can work with PC makers to figure out how to couple personal computers, rather than tablets, to this mission, that empowerment trend will drive business investment in personal technology toward mobile devices and cut the growth of high-end PCs.  That’s what will hurt the PC business, and Intel.

IBM’s decision to spend another $1.2 billion on the cloud is heralded by some as an indication that Big Blue sees the need to turn around.  They may very well see that (it would be hard to miss, after all) but the problem is that it’s not spending money but making strategic investment that matters.

IBM’s current cloud strategy (SoftLayer) is a kind of left-handed PaaS approach, something that pushes IBM middleware through the cloud.  From what enterprises have told me in surveys, the approach has some utility for failover and cloudbursting missions for IBM-based applications, but it’s not breaking new ground in terms of supporting cloud-specific applications.  That’s a problem.

The only reason to take something that’s running today in your data center and stick it into the cloud is that the cloud is cheaper.  Once you push past the limited opportunity presented by hybrid elasticity of resources, the only thing that drives further cloud usage is lower costs.  If you’re targeting your own IT customer base (as IBM is) for your cloud opportunity, then either you’re winning less than you’re losing, or the buyer is making a price-based move that increases their costs rather than decreasing them.  Need I say how unlikely the latter is?

IBM needs to spend a rough billion on the expanded “web-service-middleware” approach that Amazon is following and that I’ve called platform services.  Oracle’s deal with Verizon shows that most of the players in the industry now realize that the best way to promote the cloud is to add cloud-enabled services to IaaS by providing virtual interfaces (web services).  Let the buyer write in whatever they like, but incentivize them to use platform services by making those services really useful in building cloud-specific apps.  IBM, of all people, should be able to do this.

Juniper’s partner conference was new CEO Shaygan Kheradpir’s opportunity to speak to the market, particularly given the activist-hedge-fund attack on Juniper’s strategy.  The result was mixed, in my view.  On the hopeful side, he says he wants to make Juniper “dramatically better”, and that he wants Juniper not to get bogged down selling hardware and technology and focus instead on crafting solutions buyers value.  He talks about “High IQ” networks, though, and that sounds too much like a new tag line replacing older tag lines with much the same meaning.  He talks about networks that can “never go down”, which suggests that network intelligence has to focus on sustaining connectivity and availability, not building toward those “solutions” buyers value.  A reliable network is a solution for a carrier, perhaps, but it’s simply an application requirement for an enterprise.  The guy who controls the application value proposition, then, still controls the buyer, and that’s not Juniper today.

In this “Elliott aftermath” period, Juniper is not signaling decisively on whether it will boost share price by “artificial financial means” as I’d describe stock buybacks, dividends, and cost cuts, or whether it will actually open new and better opportunities for itself.  What is in Juniper’s DNA, if you go back to deep ancestry, is innovative designs and technology.  What is in it recently is hype, myopia, and taking the easy way out.  Which DNA will Juniper draw on?  I don’t know yet.  I don’t know if Shaygan Kheradpir does either.

Enterprise Networking and its Consumer-Like Future

There seem to be a lot of forces battering the enterprise networking space this year, and we’re only halfway through January.  What makes things difficult is that many of these forces are kind of under the radar, at least in terms of the extent to which their impacts are obvious.  They’re also embryonic; there are still questions about just how far and how fast they’ll push us.  But push they will, so I want to look at some of the macro trends they’re creating and where we might end up by the end of 2014.

One obvious thing is that we’re continuing a long-standing trend away from “networking-as-we-knew-it”, which was the connection of sites using static resources and company-owned devices.  What we have today is the notion of service connectivity not site connectivity.  Service connectivity means that we project higher-level services through delivery mechanisms that are loosely equivalent to private networks, but only loosely.  Arguably, for the enterprise, the service connectivity trend is the most important trend of all because it transforms what people spend money on—the most basic driver of technology policy.

Mobility, both in terms of road-warrior in-the-hotel stuff and in the more modern sense of mobile broadband, is the biggest new driver of service connectivity.  Workers all sitting at their desks or standing at terminals makes for a wonderfully orderly picture of technology, but it’s obsolete.  There are places where this happy situation still prevails—retail banking for example—but even there, newer practices are stealing importance from the model.  How many people bank online instead?  We are clearly heading for a time when “going to the bank” is as anachronistic as “going to the store” is already becoming.

The cloud also boosts the notion of service connectivity by converting business applications into a network-delivered utility.  We are “Internet-izing” our view of how workers relate to company information and information processing, and this view encourages us to farm out or distribute resources.  Combine it with mobility and you create an enterprise where people are wherever they happen to be, supported by whatever they happen to need, based on technology located wherever it’s convenient.  Build a private network to support that?  Never happen.

But what does service connectivity look like?  The answer goes back to the Internet.  Most people see the Internet not as a network at all but as a collection of resources.  If they have any network-centric vision, it’s that the Internet is their access connection, and that is exactly what’s happening with the enterprise.  In days of old, you bought connection services that tended to map 1:1 with access connections.  That’s not going to work now, not only because “services” now don’t have any specific connectivity map but also because it’s contrary to the emerging business model of the operator.

How many times have we read that network operators want more “agile” services?  We want, they say, to cut provisioning time from weeks to seconds.  Great, and I understand why that is.  If technology is to be optimally useful it has to be available when it’s needed.  However, if we stay with the presumed 1:1 service/access mapping, then we have physical provisioning to do every time we want to deliver something new.  That’s not going to happen in seconds no matter how powerful we make service automation.  So we have to shift policies.  If you have consumer broadband over fiber or CATV, you likely can go online and in a very short period upspeed your service.  Why?  Because the physical media doesn’t have to change for the new speed, and because the access rate is determined by an agile edge interface that can be dialed up simply.

We don’t see this a lot today but we will see it more often in the future, largely because of regulatory changes.  Regulators like the FCC are actively planning to support the migration away from TDM voice.  If we don’t have TDM voice, we won’t likely have much TDM at all, and the loss of TDM is the removal of the greatest barrier to access consolidation at the business site level.  If we move everything to a packet trunk using whatever technology looks good, we can then dial the access bandwidth up as needed (even make it variable over the span of days, to accommodate business factors) and deliver a bunch of services in parallel over the same pipe.  The enterprise, not just branch offices but even headquarters, is moving to having a service access connection that’s not linked to any specific connectivity at all—just to services.
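
To make the “dial the access bandwidth up as needed” idea concrete, here is a sketch of what a customer-side self-service call might look like.  The API, endpoint, and fields are invented for illustration; they don’t correspond to any real operator interface.

    # Hypothetical illustration of a self-service bandwidth change on a packet
    # access trunk: only the logical rate at the agile edge changes, not the
    # physical media.  The endpoint and fields are invented.
    import json
    import urllib.request

    def set_access_bandwidth(site_id, mbps, api_base="https://operator.example.com/selfservice"):
        """Request a new access rate for a site; could just as well be scheduled per day."""
        body = json.dumps({"site": site_id, "access_mbps": mbps}).encode("utf-8")
        request = urllib.request.Request(
            api_base + "/access-rate",
            data=body,
            method="PUT",
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))

    # e.g. raise headquarters access for month-end processing:
    # set_access_bandwidth("hq-01", 500)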

This has potentially profound implications for business services, particularly things like carrier Ethernet.  We can surely use an Ethernet pipe for our universal business access conduit to services, but first we don’t have to, and second a lot of the features of Ethernet won’t matter.  What does end-to-end management mean when you’re using Ethernet only across the last mile?  Nothing, obviously.  In fact, management of “networking” is meaningless in this kind of world because you’re really consuming services that might include connection-based delivery but probably include more higher-level stuff—stuff that’s really what you want in the first place.  Manage what you care about; if the “service” of connectivity breaks, it matters only because it’s no longer connecting you to your experiences, and fixing it is the operator’s problem.  So traditional enterprise network management becomes less meaningful.

Which drives managed services.  If enterprises have fewer network devices of their own to manage, there’s less incentive to manage anything at all.  If you add to that the fact that the “services” workers are consuming are composed of a bunch of interfaces that are fulfilled across a mixture of company facilities and cloud facilities, you realize that service continuity is beyond the ken of the enterprise anyway.  Why not just let the operator manage the little bit of networking that’s stuck on your premises?

You can see where we’re headed here.  The whole networking world is changing.  The change won’t be complete at the end of this year, but I think the change will be visible by then, impacting both the enterprise buyer and their vendors.

And in the “More News” Category…

We have some additional news on a couple of topics I’ve been blogging on, one of which could have a significant impact on the state of the networking market and the other of which could be a signpost of progress in network technology evolution.  A Court of Appeals has overturned the FCC’s neutrality order, and Juniper is under activist-investor pressure that will almost certainly impact how it trades simple financial measures against major strategic initiatives.

In the neutrality debate, what we got was what many (myself included) had speculated would happen.  The FCC has considerable authority in regulating common-carrier providers, but much more limited authority in regulating the behavior of “information services”.  The DC Court of Appeals’ problem with the neutrality order was less that the FCC had no authority to regulate than that the FCC had elected to classify ISPs and the Internet as information services, then attempted to apply common-carrier regulatory power.

The whole net neutrality debate has long since left the path of reason, becoming almost political in its trading of dire predictions and diatribes.  It’s my view that there is virtually no chance the ISPs would interfere with “lawful content”, given market and media pressure.  That doesn’t mean that porn or spam couldn’t be stopped, but Congress has the right to do whatever it likes in those areas whatever the FCC does or thinks, because the FCC operates under the rules Congress sets, period.  What’s really open to question here, and the only thing that likely is, is the controversial question of who pays for premium handling.

Neutrality rules say that the consumer can pay for premium handling but not the provider, meaning that Netflix (for example) couldn’t pay to have video expedited.  This rule was stupid from the first for a number of reasons.  First, it says that traffic policy depends on which end of the flow you’re looking from.  Second, every content provider uses CDNs to deliver its most valuable content, which means that big companies already have an “advantage” because they can afford to pay for better QoE.  What we did with this rule was codify investment in something other than transport, which is bad public policy.  A network moves traffic, and if you make it harder to do that or make traffic-moving less valuable, you hurt the business of networking.

The question is “What now?”  The new FCC Chairman, Wheeler, has no ties to the VC community that the old neutrality position favored (the prior Chairman, Genachowski, had such ties), and he had already suggested he didn’t see a problem with supplier-pays.  The two Republican commissioners didn’t like the neutrality order, so if Wheeler doesn’t push to appeal the ruling, the order will die.  Even if the FCC does appeal, I’ve never believed the order would stand legal review, and I think the Appeals ruling will be upheld.  So we may be done with the notion of consumer-pays in favor of anybody-pays.

This might lead to something truly revolutionary and truly good, which is settlement on the Internet.  The bill-and-keep model has distorted the business of networking by making it virtually impossible to create end-to-end services that offer anything other than best-efforts connections.  Without the neutrality order, ISPs could shift to a model that’s already in use in some places, which is a paid-peering framework where traffic volumes and QoS commitments determine peering cost.  It would likely create a flow of revenue from content providers (mostly video, where QoS matters the most) to ISPs, and it could help to raise revenue per bit and encourage infrastructure investment.  However…I don’t think that something like this will help enough at this point to reverse the trends away from “the network” and into “the service”, meaning the computer-and-software structure that generates experiences.  We could have, with enlightened policy, transformed the Internet five or ten years ago.  We’ve already codified the other, OTT-centric, model with long-term investment, and so we’re never going to have what we might have had.

That’s a fair opening for the Juniper situation.  Network equipment vendors today face (like it or not) profound business changes driven by the declining revenue per bit of their customers in the carrier space.  Commoditization pressure and competition from Huawei have combined to drive down revenue and profit, and it’s not going to be halted by the overturning of the neutrality order.  Slowed, perhaps, but not halted.  So every network vendor faces two choices—be commoditized or find other market niches.  Juniper has always had the most innovative technology at the network level, IMHO, so they were the obvious candidate to do something revolutionary there.  But when they announced a new CEO last year, the Street immediately clamored for a financial-driven policy that would raise share price by cutting costs and reducing M&A.  I said then that if Juniper took that route it would produce happy hedge funds for a year or so, and then fall into what could be an irreversible decline.

The activist hedge fund intervention that happened this week seems to me the final straw here.  Juniper is being pressured to do what the Street wants, which at one level is just what they should be doing—a public company is responsible to its shareholders.  However, there was a better way, and perhaps still is.

Elliott, the activist hedge fund, says that Juniper should have been more competitive with a player like Alcatel-Lucent in optics.  Crap.  Optical margins are low, and even if Juniper wanted to get into that space now, the margins would be truly in the toilet by the time they established themselves.  Alcatel-Lucent’s “Shift” strategy is arguably getting that company into the spaces that Elliott says Juniper should be getting out of.  Juniper should have gotten more deeply into optics a decade ago, when it made sense.  Which means that what it should be doing now is what’s going to be the obvious choice in 2024.

Which is the service layer.  The debate over whether Juniper should have jumped on OpenDaylight or bought Contrail is moot in two ways—it’s already been resolved by the fact that they did buy Contrail, and it’s the wrong debate to begin with.  The controller doesn’t matter; it’s just a higher level of plumbing.  What matters in the service layer is the creation of services, the architecture that couples features to transport/connection.  There was plenty of time to grab supremacy in this space; there were specific opportunities Juniper had to embrace the service layer and build network value for its customers and profits for itself.  I watched them turn their back on at least one such opportunity.  Even now, it might not be too late, but the Elliott move will make it very hard for the new CEO to balance between a totally different near-term strategy that would require a truly insightful SDN/NFV overlay on either Contrail or OpenDaylight and a hungry activist investor who wants all the focus to be on cutting staff, cutting M&A, paying a dividend, buying back stock, and contracting into the networking equivalent of a white dwarf.  By the end of Q2, Juniper either does the right thing or there will be no right thing left to do.

Looking for Optimization in All the Wrong Places

We’re all familiar with the notion of “layered networks” but one of the things that’s sometimes overlooked when considering these age-old concepts is that multiple layers often beget multiple connection topologies, service policies, and so forth.  In today’s world we’re not thinking of layers as much in terms of protocol layers as in terms of service model layers, but the issue of layer-specific topologies and policies is still as much a bone of contention as ever.  And it may be complicated by the fact that we’re likely optimizing the wrong thing these days.

Virtualization follows the concept of abstraction and instantiation.  When you apply it to networks, you start by visualizing a given service as being fulfilled by a black box that connects inputs and outputs.  That box is given a mission by the layer above it, and it fulfills that mission in many cases with its own virtualization process.  Inside a black box is another black box; it’s like the bear going over the mountain.
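
Here is a minimal sketch of the box-in-a-box idea, with invented class names: each black box is handed a mission by the layer above and is free to fulfill it by delegating to black boxes of its own.

    # Illustrative sketch of nested abstraction; the class names are hypothetical.
    from abc import ABC, abstractmethod

    class BlackBox(ABC):
        @abstractmethod
        def realize(self, endpoints):
            """Fulfill the mission handed down by the layer above."""

    class VpnService(BlackBox):
        """The service-level box: decomposes its mission and hands the pieces down."""
        def __init__(self, inner_boxes):
            self.inner_boxes = inner_boxes   # e.g. per-metro connection domains

        def realize(self, endpoints):
            # It neither knows nor cares how each inner box does its job.
            half = len(endpoints) // 2
            for box, subset in zip(self.inner_boxes, [endpoints[:half], endpoints[half:]]):
                box.realize(subset)

    class MetroDomain(BlackBox):
        """An inner box, which could itself wrap further black boxes."""
        def realize(self, endpoints):
            print("building metro connectivity for", endpoints)

    # VpnService([MetroDomain(), MetroDomain()]).realize(["siteA", "siteB", "siteC", "siteD"])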

The issue of nesting black boxes or supporting hierarchies of service models comes up today mostly in two places.  One is at the operations level, where the “services” of the network are often abstracted to simplify how much work has to be done integrating OSS/BSS processes with individual devices.  The other is in SDN services, where intermediary layers insulate generalized control logic from specific topologies or even implementations.

Multi-layer abstraction is a useful tool.  If you look at the operations example, you can see that if an OSS/BSS has to understand how to route packets via OpenFlow, you’re exploding the complexity of the high-level operations software to the point where it may be difficult to make it work, and you run the risk of having what might be called “service-level” decisions create structures at the connection level that don’t even make sense.  You also create a situation where a change in service topology that happens to involve using a different part of the network could end up changing a bunch of high-level operations tasks because that part of the network used a different implementation of SDN or a controller with different northbound APIs.

OpenStack’s network API, Neutron, is a good example of multi-layer abstraction in action.  You supply Neutron with connection models—you say, for example, “create a subnet”—and Neutron does that without bothering you with the details.  Underneath Neutron is often an OpenFlow controller that may have its own abstractions to help it translate a request into specific paths.  And above Neutron is the whole cloud thing, with its top-level picture of a cloud service as being a set of hosts that become visible on an IP network like the Internet.
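
From the consumer’s side, that abstraction is just a couple of calls.  Here is a sketch using the OpenStack SDK; the cloud name and addressing are placeholders, and whatever plugin or controller sits underneath Neutron stays invisible at this level.

    # Sketch of consuming Neutron's abstraction through the OpenStack SDK.
    # "mycloud" and the CIDR are placeholders.
    import openstack

    conn = openstack.connect(cloud="mycloud")

    network = conn.network.create_network(name="demo-net")
    subnet = conn.network.create_subnet(
        network_id=network.id,
        name="demo-subnet",
        ip_version=4,
        cidr="10.0.0.0/24",
    )
    print("subnet created:", subnet.id)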

Of course, everything that’s logical and easy isn’t always good, and that’s the case with this nested-abstraction box-in-a-box thing.  The problem you have in nesting abstractions is that it becomes more complicated to apply policies that guide the total instantiation when you’re hiding details from one layer to another.  Let’s look at an example.  You have a cloud with three data centers and three independent network vendors represented, so you have three network enclaves, each with a data center in it.  You want to host a multi-component application in the cloud, and a single high-level model abstraction might be our IP subnet.  However, the “best” data center and server might depend on characteristics of the individual network enclaves and how they happen to be interconnected, loaded, etc.  If your high-level process says “subnet” to the next layer, which then divides the request among the resource pools there, how do you know that you picked the best option?  The problem is that the networks know their own internal optimality but not that of the hosting layers they connect, and not of each other.

It would be possible for a higher layer to obtain a complete map of the potential resource commitments of the layer below it, and to aggregate a single picture of the resource pool to allow for centralized and optimized decisions.  If you really had centralized SDN control over an entire network domain end-to-end that is an automatic attribute, but realistically SDN is going to be a bunch of interconnected domains just like IP is, and in the real world we’re going to have server resources and network resources that have to be considered/optimized in parallel.  Is that even possible?

Not completely, and so what we’re really trying to figure out in this onrushing cloud, SDN, and NFV era is how much it matters.  The benefit of optimization depends on the cost of inefficiency, and that depends on how different the resource cost of various network paths or hosting points might be.  If there are five data centers with good capacity in a metro area, for example, and if we assume that network bandwidth within that metro area is fairly evenly distributed, you could probably presume that the best place to host something is the center closest to the point of connection; it diverts traffic less and produces lower latency and risk of disruption.
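
In code, that “good enough” placement heuristic is almost trivial, which is part of the point.  A sketch, with invented center names and latencies:

    # Sketch of the simple placement heuristic: if capacity and metro bandwidth
    # are roughly even, just pick the data center closest to the connection point.
    def pick_data_center(latency_ms_by_center, has_capacity):
        """Choose the closest center that has capacity; no global optimization."""
        candidates = {dc: ms for dc, ms in latency_ms_by_center.items() if has_capacity(dc)}
        return min(candidates, key=candidates.get)

    # latencies = {"metro-a": 2.1, "metro-b": 3.4, "metro-c": 1.8, "metro-d": 5.0, "metro-e": 2.9}
    # pick_data_center(latencies, has_capacity=lambda dc: True)   # -> "metro-c"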

But how much work is being done trying to get the “best” answer when the difference between it and every other answer is way inside the realm of statistical significance?  How much complexity might we generate in a network, in a cloud, by trying to gild the lily in terms of optimization—how much opex could we build up to overwhelm our marginal resource cost savings?  Operators and enterprises alike are increasingly oversupplying capacity where unit cost is low (inside a data center, where fiber trunks are available, etc.) to reduce operations costs.  Given that, is it sensible to try to get the “best” resource assignment?  I asked, rhetorically, several years ago what the lowest-cost route would be in a network with zero unit bandwidth cost.  The answer is “There isn’t any” because all routes would cost the same—nothing.  We’re not there yet, but we have to start thinking in terms of how we deal with opex-optimized networks and not resource-optimized networks.

Might This Be the Start of Something Big for SDN?

If you review all of the industry news of the last week looking for “secret trends”, it’s hard to avoid the conclusion that a lot of the longer-term trends that everyone has ignored for almost a decade are now coming home to roost.  As a result, we may in fact be in for some changes that could actually end up creating a viable SDN story—somewhere.

The Street consensus on the industry for 2014 isn’t exactly rosy.  Generally, they believe that the only areas of network spending that will be strong are wireless RAN and the data center.  The first of these traditionally benefits only a few vendors, and the second is historically the lowest-margin piece of all of networking, and also an area where SDN competition could be stronger and come on faster.

There could be worse news down the line, because both of these areas are susceptible to further commoditization from a price leader like Huawei.  We see Huawei’s influence in wireless exploding (everywhere but the US), and that means that even a build-out of wireless will likely not generate a lot of extra profit for any of the major players.  On the data center side it’s a little more complicated but the result is the same.  Granted, Huawei is not a player in the enterprise space within the US today, but this is where something like SDN could come along.  SDN could encourage somebody to become a white-box data center player.

Somebody like IBM, for example.  IBM exited the network business long ago, selling its assets to Cisco in fact.  Then over time they got back in through OEM deals.  Today, it’s pretty clear that IBM has to do something more strategic and ecosystemic to rebuild its image.  IBM has lost a lot of strategic influence over the last decade, and that’s especially bad given that this was the decade when buyers were faced with massive changes and challenges and wanted help from trusted partners more than ever.  The Street thinks that IBM’s conservative strategy for building up earnings and sustaining its share price isn’t going to work (in fact, isn’t working even now) and that they’ll have to start getting more into something like “the cloud”.

IaaS is a waste of time, though; the margins there are already thin and will only get worse, which means that IBM would have to do two things.  First, they’d have to offer a kind of “cloud-in-a-box” strategy that would necessarily get them more involved in data center networking, and second, they’d have to do some M&A to build up platform services that would add software features to IaaS.  I think that the logical way to approach both is to consider a radical commitment to Open Daylight.

Nobody makes money selling open source, so there has to be ecosystemic value in promoting it.  So far, IBM’s commitment to Open Daylight has been confined to stamping the logo here and there and perhaps funding some developers.  IBM needs to productize it, because Open Daylight could offer IBM two critical assets: a data center story that hurts its competitors and helps IBM, and a platform services story.

If IBM promoted a strong central control story based on Open Daylight and targeting the data center, they would have the option of then actually generating white-box products there, OEMing them from smaller players, or simply promoting a white-box data center ecosystem built around the IBM/Open Daylight brand.  Any of these options would seriously impact the revenues and profits of rivals like Cisco and Juniper, and since IBM isn’t directly in the networking business they’d not impact IBM in any negative way.  On the positive side, it would give IBM a strong SDN-based product position.

Which would help the platform-services point.  IBM has a ton of software and software skills, and it would be very easy for IBM to build up an Amazon-like web-services expansion to the basic IaaS cloud based on those assets.  That would not only give IBM a specific and differentiated cloud tool offering, it would be the basis for a public cloud service offering that could sustain better margins.

All of this might be a factor behind a story that Network World ran last week regarding Juniper.  Juniper has its own SDN controller in Contrail (which has both a proprietary and an open-source version), and recently there have been a number of high-profile defections from Juniper’s Open Daylight team.  Could it be that Juniper wants IBM to push Contrail instead of Open Daylight (no chance of that, in my view), or could it be that suddenly Open Daylight people have improved opportunities elsewhere because something big is about to happen in Open Daylight that Contrail/Juniper isn’t going to match?

I know that there’s a lot of cynicism about Open Daylight given that IBM and Cisco are primary backers, but it’s my view that the project is doing more useful stuff in the SDN space than anything else, including the ONF.  Specifications are simply not going to move the ball fast enough in today’s world—not for SDN, not for NFV, not for the cloud.  You need implementation and Open Daylight is creating a community that is building stuff north of OpenFlow where everything valuable absolutely has to live, or it’s not going to live at all.

But Open Daylight poses a problem too, because these ecosystems tend to have support to the extent that they’re not perceived as being the purview of major vendors like IBM.  If IBM makes a grab to brand Open Daylight they’d certainly start losing support of others, or they might see Cisco fighting with them for control.  So that raises yet another possibility, which is that Juniper might see this as an opportunity for itself.  If Open Daylight is pulled in a dozen directions and fractured by internal discord, might Contrail take hold?

No, not directly.  This isn’t about OpenFlow at all, it’s about whatever is above it.  To make any SDN strategy a success you need a specific notion of higher-layer services and how you’ll build them.  If IBM can offer that they can win an Open Daylight branding war with Cisco or anyone else.  If Juniper can provide that critical notion, they could make Contrail a contender.  Right now, neither of the two companies is on a winning streak in positioning terms.  Neither is the other Open Daylight giant, Cisco.  Who will come out of the funk first?  That’s the question.

Can We Modernize OSS/BSS?

There have been a number of articles recently about the evolution of OSS/BSS, and certainly there’s pressure to drive evolution, based on a number of outside forces.  Caroline Chappell, Heavy Reading analyst, has been a particularly good source of insightful questions and comments on the topic, and I think it’s time to go a bit into the transformation of OSS/BSS.  Which, obviously, has to start with where it is now.

It’s convenient to look at operations systems through the lens of the TeleManagement Forum standards.  The TMF stuff can be divided into data models and processes, and most people would recognize the first as being “the SID” and the second as being “eTOM”.  While these two things are defined independently, there’s a presumption in how they’ve been described and implemented that is (I think) critical to the issue of OSS/BSS evolution.  eTOM describes a linear workflow, implemented in most cases through SOA and workflow engines like an enterprise service bus (ESB).  There is a presumptive process flow, in short, and the flow needs data that is drawn from the SID.  This follows the structure of most software today: a process model described one way, linked in a general way to a supporting data model that is independently (but hopefully symbiotically) described.
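
To make the contrast with what follows clearer, here is a sketch of that presumptive, linear workflow; this is not TMF code, just the shape of the model, with illustrative step names.

    # The shape of the workflow model: a fixed, linear sequence of processes,
    # each reading and writing a shared data model (the role the SID plays in
    # the TMF framework).  Step names are illustrative.
    ORDER_WORKFLOW = [
        "validate_order",
        "check_credit",
        "allocate_resources",
        "activate_service",
        "start_billing",
    ]

    def run_workflow(service_record, process_library):
        """Push the service record through each step in the presumptive order."""
        for step in ORDER_WORKFLOW:
            process_library[step](service_record)
        return service_record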

There are several reasons for this “workflow” model of operations.  One is that OSS/BSS evolved as a means of coordinating manual and automated processes in synchrony, so you had to reflect a specific sequence of tasks.  Another is that as packet networks evolved, operations systems tended to cede what I’ll call “functional management” to lower-layer EMS/NMS activities.  That meant that the higher-level operations processes were more about ordering and billing and less about fulfillment.

Now, there’s pressure for that to change from a number of sources.  Start with the transitioning from “provisioned” services to self-service, which presumes that the user is able to almost dial up a service feature as needed.  Another is the desire for a higher level of service automation to reduce opex, and a third is a need to improve composability and agility of services through the use of software features, which changes the nature of what we mean by “creating” a service to something more like the cloud DevOps process.  SDN and NFV are both drivers of change because they change the nature of how services are built, and in general the dynamism introduced to both resources and services by the notion of “virtualization” puts a lot of stress on a system designed to support humans connecting boxes, which was of course the root of OSS/BSS.  These are combining to push OSS/BSS toward what is often called (and was called, in a Light Reading piece) a more “event-driven” model.

OK, how then do we get there?  We have to start somewhere, and a good place is whether there’s been any useful standards work.  A point I’ve made here before, and that has generated some questions from my readers, is that TMF GB942 offers a model for next-gen service orchestration.  Some have been skeptical that the TMF could have done anything like this—it’s not a body widely known for its forward-looking insights.  Some wonder how GB942 could have anticipated current market needs when it’s probably five years old.  Most just wonder how it could work, so let me take a stab at applying GB942 principles to OSS/BSS modernization as it’s being driven today.

When GB942 came along, the authors had something radically different in mind.  They proposed to integrate data, process, and service events in a single structure.  Events would be handed off to processes through the intermediation of the service contract, a data model.  If we were to put this into modern terms, we’d say that GB942 proposed to augment basic contract data in a parametric sense with service metadata that described the policies for event handling, the links between services and resources, etc.  While I certainly take pride of authorship for the framework of CloudNFV, I’ve always said that it was inspired by GB942 (which it is).

Down under the covers, the problem that GB942 is trying to solve is the problem of context or state.  A flow of processes means work is moved sequentially from one to another—like a chain of in- and out-boxes.  When you shift to event processing you have to be able to establish context before you can process an event.  That’s what GB942 is about—you insert a data/metadata model to describe process relationships, and the data part of the model carries your parameters and state/context information.

This isn’t all that hard, because protocol handlers do it every day.  All good protocol handlers are built on what’s called “state-event” logic.  Visualizing this in service terms, a service starts off in, let’s say, the “Orderable” state.  When a “ServiceOrder” event arrives in this state, you want to run a management process which we could call “OrderTheService”.  When this has been done, the service might enter the “ActivationPending” state if it needed to become live at a future time or the “Active” state if it’s immediately available.  If we got a “TerminateService” in the “Active” or “ActivationPending” state we’d kill the order and dismantle any service-to-resource commitments, but if we got it in the “Orderable” state it would be an error.  You get the picture, I’m sure.  The set of actions that should be taken (including “PostError”) for any combination of state and event can be expressed in a table, and that table can be included in service metadata.
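
Here is a sketch of that table and its dispatcher, using the states and events from the example above; the handler names and the “ActivationTime” event are illustrative additions.

    # Sketch of a state/event table for the lifecycle described above.
    STATE_EVENT_TABLE = {
        ("Orderable",         "ServiceOrder"):     "OrderTheService",
        ("ActivationPending", "ActivationTime"):   "ActivateService",
        ("ActivationPending", "TerminateService"): "TearDownService",
        ("Active",            "TerminateService"): "TearDownService",
        # Any state/event combination not listed is an error.
    }

    def handle_event(service, event, processes):
        """Dispatch an event against the service's current state via the table."""
        process_name = STATE_EVENT_TABLE.get((service["state"], event))
        if process_name is None:
            processes["PostError"](service, event)   # e.g. Terminate while "Orderable"
            return
        # Each process returns the next lifecycle state; OrderTheService, for
        # example, would return "ActivationPending" or "Active" depending on
        # whether the service goes live now or at a future time.
        service["state"] = processes[process_name](service)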

You can hardly have customer service reps (or worse yet, customers) writing state/event tables, so the metadata-model approach demands that someone knowledgeable puts one together.  Most services today are built up from component elements, and each such element would have its own little micro-definition.  An architect would create this, and would also assemble the micro-services into full retail offerings.  At every phase of this process, that architect could define a state/event table that relates how a service evolves from that ready-to-order state to having been sold and perhaps eventually cancelled.  That definition, created for each service template, would carry over into the service contract and drive the lifecycle processes for as long as the services lived.
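
What that architect-built template might carry is sketched below; the structure and names are purely hypothetical, but the point is that every micro-definition, and the retail offering itself, carries its own state/event table as metadata.

    # Hypothetical sketch of a service template assembled by an architect.
    SERVICE_TEMPLATE = {
        "name": "BusinessInternetPlus",
        "state_event_table": "retail-lifecycle-v1",   # drives the contract as a whole
        "elements": [
            {"name": "AccessPipe",  "state_event_table": "access-lifecycle-v2"},
            {"name": "FirewallVNF", "state_event_table": "hosted-function-lifecycle-v1"},
        ],
    }
    # When the service is sold, this template is copied into the contract's
    # metadata and drives lifecycle handling for as long as the service lives.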

Connecting this all to the real world is the last complication.  Obviously there has to be an event interface (in and out), and obviously there has to be a way of recording resource commitments to the service so that events can be linked through service descriptions to the right resources.  (In the CloudNFV activity these linkages are created by the Management Visualizer and the Service Model Handler, respectively; whatever name you give them the functionality is needed.)  You also need to have a pretty flexible and agile data/process coupling system (in CloudNFV this is provided by EnterpriseWeb) or there’s a lot of custom development to worry about, not to mention performance and scalability/availability.

In theory, you can make any componentized operations software system into something that’s event-driven by proceeding along these lines.  All you have to do is to break the workflow presumptions and insert the metadata model of the state/event handling and related components.  I’m not saying this is a walk in the park, but it takes relatively little time if you have the right tools and apply some resources.  So the good news is that if you follow GB942 principles with the enhancements I’ve described here, you can modernize operations processes without tossing out everything you’ve done and starting over; you’re just orchestrating the processes based on events and not flowing work sequentially along presumptive (provisioning) lines.  We’re not hearing much about this right now, but I think that will change in 2014.  The calls for operations change are becoming shouts, and eventually somebody will wake up to the opportunities, which are immense.

Clouds, Brown Paper Bags, and Networking’s Future

I’ve blogged a lot about declining revenue per bit and commoditization of hardware in networking.  The same thing is happening in IT, driven by the same mass-market and consumerization forces.  You can’t sell a lot of something that’s expensive; Ford made the automobile real by making it cheap.  So arguably networking and IT are getting real now, and that means that perhaps we have to look at the impact.  As I’ve noted before, you can see some signs of the future of networking in current IT trends.  Look further and you see clouds and brown paper bags.

In the old days, Microsoft would not have made hardware, and Intel would not have made systems or devices.  It’s pretty clear that’s changing and the reason is that as prices fall, profits pegged to a percentage of price will obviously fall too.  Declining margins will nearly always have the effect of forcing intermediaries out of the food chain.  You want to keep all the money.

In networking, this is going to put pressure on distribution channels over time.  We can already see some of that in the fact that network vendors are working with larger and larger partners on average, moving away from VARs and populist channels in favor of presenting products directly through mass-market retail.  A small business can get everything it needs in networking from Staples or Office Depot, and in my survey of mid-sized businesses this fall, they reported that their off-the-shelf investment in networking had almost doubled versus 2012.

The other impact of commoditization is that it quickly runs out of its own value proposition.  If gear is really expensive then getting it for ten percent less really means something.  As prices fall, the discounts are less significant in dollar terms, but more important the component of cost that network equipment represents is less important overall.  Five years ago, businesses said that capital cost of equipment was 62% of their total cost of ownership, and today it’s 48%.

From the perspective of network services, commoditization is hurting operators more every day and increasing their disintermediation problem.  If the industry’s services are driven by people riding on a stream of bits that’s getting cheaper (in marginal cost terms) every year, then those people are gaining additional benefit while the operators carrying those bits are losing.  This is the basic force behind operators’ need for “transformation”.  First and foremost, they need to be able to get into the “service” business again.  That’s important because their network vendors are doubly impacted—they’re increasingly expected to help their customers, and they are also increasingly at risk because their own core business of bit-pushing is less the strategic focus.

Translation to services isn’t as easy as it sounds.  If I’m an operator and I’m selling stuff over my network instead of from my network, am I not killing my ability to differentiate what I spend most of my capex doing—which is pushing bits and making connections?  Look at what’s happening in the enterprise.  Ten years ago, the CIO was likely to report directly to the CEO, while today they’re more likely to report to the CFO.  Twenty years ago, the head of enterprise networking had influence nearly equal to that of the head of IT, and today both these positions say that networking has about a fifth the influence of the IT counterpart.  Think of the political impact of this shift, and how it might impact service providers—people who are essentially all networking types.

That poses, to me, what should be the critical question about the evolution of networking, which is whether we can envision new services from, rather than just on, the network of the future.  It should be obvious that if anyone is going to empower the network as a source of goodness and not just a brown paper bag to carry it in, that someone will have to be a network equipment vendor.  IT or software giants have everything to gain by encouraging commoditization of networking—it’s more money on the table for them.

Are those vendors empowering the network?  At some levels you can argue that they are.  Cisco’s application-centricity model would at least expose network features to exploitation by applications, but that’s not enough because it gets us back to being about carriage and not about performing some useful task.  What is needed for networks to be valuable is for more of what we think of as the cloud to be subsumed into the network.  Right now, the cloud to most people is just VM hosting—IaaS.  If the cloud is adding platform services (to use my term) to build a virtual OS on which future applications can be built, the mechanism for that addition could just as easily be something like NFV as something like OpenStack.  Or it could if network people pushed as hard as cloud people, meaning IT people, are pushing.

It’s important for our industry to think about this now, because (returning to commoditization) the rewards of success are greatest when the benefits created from success are the greatest.  Revolutions may be hard to sell, but they bring about big changes and create big winners (and of course losers).  If we wait for the right moment to do something in networking when things like OpenStack are advancing functionality in the network space (Neutron) two to four times per year, they’ll move the ball so far that all networking will be able to do is dodge it.

Commoditization at the bottom creates revolutionary potential in the middle, in the adjacent zone.  I think that the evolution of the cloud is taking it “downward” toward the network and the evolution of the network must necessarily take it upward to respond to its own commoditization pressures.  That boundary zone is where the two collide, and in that zone we will likely find the opportunities and conflicts and players and risk-takers who will shape where networking, the cloud, and IT all go as their own safe core markets commoditize.  That battle cannot be avoided; it’s driven by economic forces.  It can be won, though, and networking needs to win it to avoid becoming a brown paper bag.

Two Good Tech Stories Gone Bad

There are a couple of recent news items that involve a big player, and while the focus of the news is different I think there is a common theme to be found.  One is a little impromptu talk reported on Beet.tv from Cisco’s John Chambers at CES, and the other is Oracle’s acquisition of SDN vendor Corente.  Both mix a healthy dose of interest and hype, so we need to look at them a bit to extract the pearls of wisdom—and there are some.

I have to confess that I’m finding the progress of Cisco’s descriptions of the Internet’s evolution entertaining.  We started with the “Internet”, then the “Internet of Things”, and now the “Internet of Everything.”  Next, perhaps, is the “Internet of Stephen Hawking Multidimensional Space-Time.”  Underneath the hyperbole, though, Chambers makes a valid point in his little talk.  We are coming to a stage where what’s important on the Internet isn’t “information” but the fusion of information into contextual relevance.  Mobile devices have turned people from being Internet researchers into being Internet-driven almost completely.

The problem I have in calling this a useful insight is that I don’t think Chambers then makes use of it.  He jumps into NDS and GUIs, skipping over the fact that what is really needed for context-fusing is a combination of a rich vision of what “context” means and a means of filtering information flows through that vision.  It’s a made-for-the-cloud problem, and Cisco purports to be driving to cloud leadership, so it would be nice to have it lead here.

Cisco could easily build a story of context-driven information and event filtering; they have nearly all the platform tools they’d need, the underlying servers and network elements, and the right combination of enterprise and operator customers to push solutions to.  The kind of network that Chambers is indirectly describing is one where “the cloud” pushes all the way to the edge, and where users and apps grab onto a context agent in the cloud to shop, or eat, or talk, or think, (or whatever) through.  In this model, you actually consume a lot of capacity because you want to get all of the information and all the context resources connected to these context agents through low-latency pipes so the user gets what they want quickly.  By the way, these pipes, being inside the cloud, are immune from neutrality rules even under the current regulations.

There’s really no good reason not to push this vision, either.  It’s not going to overhang current Cisco products, it’s not going to demand new channels or new buyer relationships.  It supports the operator goals of monetization better than Cisco has supported them to date, and it could be applied to enhance worker productivity and thus drive up the benefit case that funds network investment on the business side.  We seem to have an example of knee-jerk evangelism here; Cisco is so used to spinning a yarn then putting a box in front of the listener that they can’t put a concept there instead—even a good one.

Oracle’s acquisition of Corente has its own elements of theater.  The description of Corente as an SDN offering is a stretch, IMHO.  It’s a tunnel-overlay system created by linking edge devices with hosting platforms, for the purposes of delivering an application set.  It’s sort-of-cloud, sort-of-NFV, sort-of-SDN, but it’s not a massive piece of any of the three.  It could in fact be a useful tool in cloud hybridization, as any overlay virtual network could be that runs end to end.  It’s not clear to me how well it integrates with carrier infrastructure, how it could influence network behavior or take advantage of specific network features.

There is sense to this; an application overlay or virtual network that manages connectivity separately from transport is a notion I’ve always liked, and in fact I’ve said that the two-layer model of SDN (connectivity and transport) is the right approach.  It’s logical in my view to think of the future network as an underlying transport process that serves a series of service- and application-specific connection networks.  Connectivity and connection rights must be managed by the applications or at the application level, but you don’t want applications messing with transport policies or behaviors.  Thus Oracle has a good point, if you could convince them to make it rather than just calling the product “software-defined WAN virtualization”.  That seems to me little more than taking tunnels and wrapping them in the Holy Mantle of SDN.

And that’s what it appears to be.  At the technical level you have to offer some way to connect my two layers of SDN so you’re not riding best-effort transport forever, and at the positioning level you have to make it clear that’s what you’re doing.  Corente didn’t do that on their site and Oracle didn’t do it in the announcement.  Other vendors, notably Alcatel-Lucent, have articulated end-to-end visions that tie down into infrastructure, which is a much better approach (particularly for network operators who need to make their networks valuable).
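
For what it’s worth, here is a rough sketch of the two-layer separation I’m describing: applications manage connection networks, a distinct transport layer owns transport policy, and the only tie between them is a declared transport class.  Every name in it is invented for illustration; this is not Oracle’s, Corente’s, or anyone else’s API.

```python
# Rough sketch of the two-layer SDN model: applications manage connectivity,
# a separate transport layer maps those connections onto network resources.
# All names are illustrative only.

class TransportLayer:
    """Owns transport policy; applications never touch this directly."""
    CLASSES = {"best-effort", "assured", "low-latency"}

    def provision(self, endpoint_a, endpoint_b, transport_class):
        if transport_class not in self.CLASSES:
            raise ValueError(f"unknown transport class {transport_class!r}")
        # In a real network this would bind the tunnel to underlying
        # transport resources (MPLS LSPs, optical paths, SDN flows, etc.).
        return {"a": endpoint_a, "b": endpoint_b, "class": transport_class}

class ConnectionNetwork:
    """Application-level connectivity: who may talk to whom."""
    def __init__(self, name, transport, transport_class="best-effort"):
        self.name = name
        self.transport = transport
        self.transport_class = transport_class
        self.links = []

    def connect(self, a, b):
        # The application decides connectivity; the transport class it rides
        # on is a service attribute, not something the app programs itself.
        self.links.append(self.transport.provision(a, b, self.transport_class))

# A CRM application's overlay rides "assured" transport without ever
# touching transport policy.
crm_net = ConnectionNetwork("crm-overlay", TransportLayer(), "assured")
crm_net.connect("branch-17", "datacenter-east")
```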

This is one of those times when you could speculate on the reasons why Cisco or Oracle would do something smart but do it stupidly.  One, they did an opportunistic thing for opportunistic reasons, and by sheer chance it happened to have relevance to real problems; since they didn’t care about those problems, they didn’t catch the relevance.  Two, they did a smart thing for smart reasons and just don’t understand how to articulate anything that requires more than third-grade-level writing and thinking.  Three, we have a media/market process incapable of digesting anything other than “Dick and Jane do telecom”.  Probably all of the above, which is why I’m not going to spend much time trying to pick from that list of cynical possibilities.

Ultimately, the smarts will out.  Again, it’s not constructive to predict whether startups will jump in (somehow making the VCs believe this is all part of social networking and then working under the table, perhaps) or whether some major vendor will hire a CEO who actually understands market requirements and has the confidence and organizational drive of a Marine Gunnery Sergeant.  The question is how much vendors lose in the time it takes for either to happen.

Why Test-Data-as-a-Service is Important to NFV

Yesterday, the CloudNFV project (of which I am Chief Architect) announced a new Integration Partner, Shenick Network Systems.  Shenick is a premier provider of IP test and measurement capabilities, and I’m blogging about this not because of my connection with CloudNFV but because the announcement illustrates some important points about NFV and next-gen networking in general.

No matter how successful NFV eventually is, it will never completely displace either dedicated network hardware or statically hosted information/processing resources.  In the near term we’ll certainly have a long period of adaptation in which NFV gradually penetrates the areas where it’s suitable, and in that period we’ll have to live and interwork with legacy components.  Further, NFV is still a network technology, and it will still have to accommodate the testing, measurement, and monitoring functions used today for network validation and diagnostics.  Thus it’s important for an NFV model to accommodate both legacy elements and these operations tools.

One way to do that is by making testing and measurement an actual element of a service.  In most cases, test data injection and measurement will take place at points specified by an operations specialist and under conditions where that specialist has determined there’s a need and an acceptable level of risk to network operations overall.  So at the service level, we can say that a service model should be able to define a point of testing/monitoring as an interface, and connect a testing, measurement, or monitoring function to that interface as needed.
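
As a sketch of what that might look like, here is a toy service model in which a test point is just another declared interface, with nothing attached until an operations specialist decides a test is warranted.  The structure is purely illustrative and assumes nothing about the CloudNFV or ETSI data models.

```python
# Toy service model with a test/monitor point as a first-class interface.
# Field names are illustrative only.

service_model = {
    "service": "business-vpn-42",
    "components": [
        {"name": "vpn-core", "type": "connectivity"},
        {"name": "firewall", "type": "function"},
    ],
    "interfaces": [
        {"name": "customer-access", "role": "user"},
        # A test point is just another interface; nothing connects to it
        # until an operations specialist decides a test is warranted.
        {"name": "tap-firewall-wan", "role": "test-point", "attached": None},
    ],
}

def attach_test_function(model, interface_name, test_function_ref):
    """Connect a testing/monitoring function to a declared test point."""
    for intf in model["interfaces"]:
        if intf["name"] == interface_name and intf["role"] == "test-point":
            intf["attached"] = test_function_ref
            return True
    return False

attach_test_function(service_model, "tap-firewall-wan", "probe-07")
```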

The question is how that function is itself represented.  I blogged yesterday about the value of platform services in the cloud, services that were presented through a web-service interface and could be accessed by an application.  It makes sense to assume that if there are a number of points in a network where testing/monitoring/measurement facilities exist, we should be able to link them to an interface as a platform service.  This interface could then be “orchestrated” to connect with the correct point of testing defined in the service model, as needed.

Of course, there’s another possibility, which is that there is no static point where testing and measurement are available.  Shenick TeraVM is a hostable software testing and measurement tool; you can host it in specific places on bare metal or VMs, but you can also cloud-host it.  That means it would be nice if, in addition to supporting a static testing and measurement location linked with service test points, you could spawn a dynamic copy of TeraVM and run it somewhere proximate to the point where you’re connecting into the service under test.
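
Here’s a hypothetical sketch of that orchestration choice: bind to a pre-deployed probe if one is proximate to the service test point, otherwise spawn a dynamic instance nearby.  The site names, probe registry, and distance function are all invented for illustration; this is not how any particular orchestrator works today.

```python
# Sketch of the choice between a static platform-service probe and a
# dynamically instantiated one. All names and values are illustrative.

PREDEPLOYED_PROBES = {
    "pop-nyc": "probe-nyc-01",
    "pop-chi": "probe-chi-01",
}

def hops_between(site_a, site_b):
    # Placeholder for a real topology/latency query.
    return 0 if site_a == site_b else 5

def bind_test_capability(test_point_site, max_hops=2):
    # Prefer a static platform-service probe if one is close enough.
    for site, probe in PREDEPLOYED_PROBES.items():
        if hops_between(site, test_point_site) <= max_hops:
            return {"mode": "static", "probe": probe}
    # Otherwise instantiate a test/measurement image on demand near the
    # test point (the dynamic, VNF-style model).
    return {"mode": "dynamic", "probe": f"spawned-near-{test_point_site}"}

print(bind_test_capability("pop-nyc"))   # binds to the static probe
print(bind_test_capability("pop-sfo"))   # spawns a dynamic instance
```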

What Shenick is bringing to CloudNFV (and to NFV overall) is the ability to do both of these things, to support testing and measurement as a static set of platform-service points and also as a virtual function that can be composed into a service at specific points and activated on demand.  The initial application is the static model (because CloudNFV runs today in Dell’s lab, where dynamism of location isn’t that relevant), but Shenick is committed to evolving a dynamic support model based on VNFs.

We need a way to connect testing, measurement, and monitoring into services because operations personnel rely on them today.  What’s interesting about this Shenick approach is that it’s also a kind of platform-services poster child for NFV.  There are plenty of other things that are deployed statically but linked dynamically to services.  Take IMS, for example; you don’t build an IMS instance for every cellular user or phone call, after all.  The same is true of CDN services.  But if we have to build a service, in the user’s sense, that references something already deployed and presented through a web-service or other interface, we have to be able to model that reference when we build the service.  That’s true whether we’re talking about CloudNFV, NFV a la ETSI ISG, or even traditional networking.  Absent modeling, we can’t have effective service automation.

In NFV, a virtual network function is a deployable unit of functionality.  If that function represents something like a firewall that is also (and was traditionally) a discrete device, then it follows that the way a service model defines the VNF would be similar to the way it might define the physical device.  Why not make the two interchangeable, then?  Why not say that a “service model component” defines a unit of functionality and not necessarily just one that is a machine image deployed on a virtual machine?  We could then model both legacy versions of a network function and the corresponding VNFs in the same way.  But we could also use that same mechanism to model the “platform services” like testing and measurement that Shenick provides, which of course is what they’re doing.  Testing and measurement is a function, and whether it’s static or hosted in the cloud, it’s the same function.  Yes, we may have to do something different to get it running depending on whether it’s pre-deployed or instantiated on demand, but that’s a difference that can be accommodated in parameters, not one requiring a whole new model.
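
Here’s a minimal sketch of that idea: the same component schema describes a physical firewall, a firewall VNF, and a pre-deployed test-and-measurement platform service, with only the deployment parameters differing.  The schema, field names, and endpoint are assumptions of mine, not the CloudNFV or ETSI model.

```python
# Sketch of a "unit of functionality" modeled uniformly across a legacy
# device, a VNF image, and a platform service. Schema is illustrative only.

firewall_as_device = {
    "function": "firewall",
    "interfaces": ["lan", "wan", "management"],
    "deployment": {"mode": "legacy-device", "address": "10.0.0.1"},
}

firewall_as_vnf = {
    "function": "firewall",
    "interfaces": ["lan", "wan", "management"],
    "deployment": {"mode": "vnf", "image": "fw-image-3.2", "host": "any"},
}

test_probe_as_platform_service = {
    "function": "test-and-measure",
    "interfaces": ["tap", "results"],
    "deployment": {"mode": "platform-service",
                   "endpoint": "https://probes.example.net/teravm"},
}

def deploy(component):
    """Only the deployment step varies; the service model treats all three alike."""
    dep = component["deployment"]
    if dep["mode"] == "legacy-device":
        return f"configure existing device at {dep['address']}"
    if dep["mode"] == "vnf":
        return f"instantiate {dep['image']} and connect it into the service"
    if dep["mode"] == "platform-service":
        return f"bind the service to {dep['endpoint']}"
    raise ValueError(f"unknown deployment mode {dep['mode']!r}")

for c in (firewall_as_device, firewall_as_vnf, test_probe_as_platform_service):
    print(deploy(c))
```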

I think the Shenick lesson here is important for both these reasons.  We need to expect NFV to support not only building services but testing them as well, which means that testing and measurement have to be included in service composition and implemented either through static references or via VNFs.  We also need to broaden our perception of service modeling for NFV somehow, to embrace the mixture of things that will absolutely have to be mixed in any realistic vision of a network service.

Both SDN and NFV present a challenge that I think the Shenick announcement brings to light.  Network services require the coordination of a bunch of elements and processes.  Changing one of them, with something like SDN or NFV, certainly requires that the appropriate bodies focus on what they’re changing, but we can’t lose sight of the ecosystem as we try to address the organisms.