Subtle SDN/NFV Data Points

We’re seeing more evidence of major changes in the networking industry, this time from the vendor side of the picture.  Obviously one impact of a sudden shift in network operator business models would be a collateral shift in spending that would impact vendors depending on their product portfolios.  I think some of those impacts can be seen.

NSN, who’s been a kind of on-again-off-again stepchild of Nokia and Siemens (and is now “on”), is rumored to be up for consideration for a complete Siemens takeover.  NSN has also articulated “pillars” on which it will frame its future business model.  Sadly, most of these are rather vapid (increase capacity, improve QoS) and those that have potential (cloud-enable operators) are vague.

The challenge for NSN is that their recent decision to focus on mobile may be putting them in the position of having shoes too small for their feet.  To be sure, mobile is a big deal for operators because margins are higher there, but ARPU for mobile users has already flattened and in some cases is heading into decline.  WiFi offload is actually revaluing metro, and so is CDN.  That means that a pure-RAN-and-IMS push may not be enough skin in the game.  And IMS, remember, is perhaps the number one target of operators for NFV.  You can’t have an open-source radio, but you could take everything else in IMS, including the evolved packet core (EPC), and turn it into cheap hosted functions.  That would have a decidedly chilling impact on three of the big mobile suppliers (Alcatel-Lucent, Ericsson, and NSN), but most of all on NSN simply because they don’t have anything else to play with.

The SDN and NFV future also presents risks to the vendors, obviously, though they’re much harder to assess.  Operators have expressed the conviction that future networks will be rich in both, but the fact is that operators really don’t have any solid strategies for making that happen.  In our spring survey results, now largely in for the network operators, we found that operators indicated they did not believe they would have made “significant strides” in deploying either SDN (in the network) or NFV by the end of 2014.

Alcatel-Lucent may be the centerpoint of the whole evolution-of-the-network thing.  They have a broad network asset base, a strong incumbency, good professional services, and probably more assets in the SDN space than their competitors.  They’re working hard to build a cloud position, which could give them an SDN position.  Their stock has been rising on the expectations of the Street that they’re less disordered than some of their competition.  Philosophically their biggest rival is Cisco, who has sacrificed broad carrier-product engagement for value-driven carrier-cloud IT engagement.  The question is whether Cisco can drive a cloud strategy for operators in an industry that is still unable to articulate one as a consensus framework or standard.  Many say that Cisco is one reason that’s happened; they’ve been dragging their feet on issues like SDN and NFV.  If that’s the case, then the Cisco/Alcatel-Lucent dynamic is the focus of a major Cisco gamble.  Our models say that carrier spending on network infrastructure will climb nicely through 2015 and then tip downward, never to recover through 2018, the limit of our forecast.  If that’s true, Alcatel-Lucent could benefit enormously in the near term, and if Cisco can’t drive an IT shift to offset their lack of many of the basic elements of infrastructure (RAN comes to mind), then they risk watching a major rival reinvent itself with the cash earned from this spending surge.

Another vendor betting on change is Cyan.  The company has launched what’s essentially an SDN co-op with a bunch of other, largely smaller, players.  The titular focus of all of this is an SDN-metro deployment that would combine optical and electrical layers and link better to the cloud and to central network operations-based apps.  I like the idea a lot, but I’m not convinced that there’s enough substance here.  The trouble with SDN these days is that all you need to do is spell out the acronym and the media thinks you have a story.  No details on how things actually work are required.  The press likely wouldn’t carry them anyway, but I’ve noticed that they don’t even go on the website.  If you have an ambition to SDN-ize metro you have to open the kimono to the degree needed to explain exactly how you expect to do it, and I can’t get that from what Cyan has said.

We see a lot of SDN and NFV coming down the line based on briefings we’ve received and what we’ve gotten from operators who have also been briefed.  Most of it seems to be relying more on professional services and customization than on standard products and open architectures.  Since only major vendors can offer the professional-services path, it may be that the major drivers of our big network revolutions will prove to be the least revolutionary of the players.  That may compromise the goals of both technologies.  Operators have fallen into this trap before, trusting innovation to those with the most to lose by innovating.  I hope they don’t do it again.

Two Good-News Items

There’s been some potential progress on a couple of fronts in the cloud, SDN, and NFV space (a space I’m arguing will converge to become the framework of a “supercloud”).  One is the introduction of Red Hat’s OpenShift PaaS framework as a commercial offering, and the other a proposal to converge two different OpenFlow controller frameworks into one that would provide for a “service abstraction layer”.

Since I think the cloud is the driver of pretty much everything else, we’ll start there, and with its most popular service.  IaaS is a bare-bones model of the cloud, one that provides little more than VM hosting and thus really doesn’t change the economics of IT all that much.  I’ve been arguing that cloud services have to evolve to a higher level, either to accommodate developers who want to write cloud-specific apps or to broaden the segment of current costs that the cloud would displace.  In either case, we need something that’s more PaaS-like.  Developing cloud-specific apps means developing to a cloud-aware virtual OS, which would look like PaaS.  Displacing more cost means moving up the ladder of infrastructure from bare (if virtual) metal to a platform that at least envelops some operations tools.

Red Hat’s OpenShift is the latter, a platform offering that’s designed to address what’s already the major cloud application—web-related application processing.  You can develop for OpenShift in all the common web languages, including Java, and the development and application lifecycle processes are supported throughout, right to the point of running.  OpenShift currently runs as a platform overlay on EC2, which puts IaaS services in what I think is their proper perspective, the bottom of the food chain.

OpenShift isn’t the end of the PaaS evolution, IMHO.  I think Red Hat will be a player in deploying further PaaS tools to create a more operationalizable cloud that’s aimed at the web-related application set.  The question is whether they’ll take any steps to advance to the other PaaS mission, the creation of a virtual cloud OS.  In some ways, Java and perhaps some of the other popular web-linked languages are suitable candidates for distributed package development.  What’s needed is a definition of platform services which, added to either IaaS or a development-enhanced PaaS, would create a virtual cloud OS.  We actually have a lot of that today, particularly if you add in the OSGi stuff.

OSGi stands for Open Services Gateway initiative, and it’s an activity that implements what many would call a Service Abstraction Layer (SAL), which is the other piece of my potential good news.  SAL is an architecture that offers a Java-and-web-friendly way of creating and using Java components in a highly dynamic way.  It’s been around for perhaps a decade, and with concepts like SDN and NFV it could come into vogue big time.  In the OpenDaylight context, the concept seems to have this original meaning and another besides: a notion that the SAL would provide a common semantic for talking to control elements or drivers (“plugins”) that would then let these relate to the rest of OpenDaylight in a consistent way.  You could argue that the models in Quantum are an abstraction, for example.
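For readers who haven’t touched OSGi, the core idea is that Java components (“bundles”) publish and discover services at runtime rather than being wired together at compile time.  Here’s a minimal sketch of that pattern; the TopologyReader service and its implementation are names I’ve invented purely for illustration (this is not OpenDaylight code), and the bundle would run inside any standard OSGi framework such as Equinox or Felix:

```java
// Hypothetical illustration of OSGi's dynamic service model; not OpenDaylight code.
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.osgi.framework.ServiceRegistration;

// A trivial service contract a controller module might expose (invented for this example).
interface TopologyReader {
    int nodeCount();
}

class SimpleTopologyReader implements TopologyReader {
    public int nodeCount() { return 0; }  // placeholder implementation
}

public class TopologyActivator implements BundleActivator {
    private ServiceRegistration<TopologyReader> registration;

    @Override
    public void start(BundleContext context) {
        // Publish the service at runtime; other bundles can look it up and bind to it
        // without compile-time coupling, which is what makes the component set "dynamic".
        registration = context.registerService(TopologyReader.class, new SimpleTopologyReader(), null);
    }

    @Override
    public void stop(BundleContext context) {
        // Withdrawing the registration notifies any consumers tracking the service.
        registration.unregister();
    }
}
```

That publish-and-discover dynamism is exactly what makes the OSGi model attractive when the set of plugins and applications in a controller is expected to change over time.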

One of the potential advantages of SAL in OpenFlow Controller missions is creating an organized way to modularize and enhance the controller by providing an easy exposure of low-level features and the binding of those features upward to applications.  Without some common set of abstractions, it’s doubtful whether a controller could use all of the possible southbound protocols in the same way, and variability in how they had to be used would then percolate upward and make it harder to build applications without making them specific to the control mechanism at the bottom.  Thus, in theory, this SAL thing could bring some structure to how the famous northbound APIs would work, and it could also allow for the easy support of multiple lower-level control protocols rather than just OpenFlow.

The proposal is to make OpenDaylight a three-layer controller architecture with the low layer being the “drivers” for the control protocols, the middle layer being “network services” (kind of the platform services of SDN), and the higher layer being the applications.  SAL would fit between drivers and network services and would create a dual OpenFlow/OVSDB interface.  Presumably it would be easy to add other control frameworks/languages.  This sort of structure seems to play on the benefits of the Cisco-contributed code, which (obviously) can be enhanced to support the onePK API.
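To make the layering concrete, here’s a hypothetical sketch of what a service abstraction layer amounts to.  None of this is OpenDaylight’s actual code and every name in it is invented; the point is simply that applications express a protocol-neutral intent and registered drivers (OpenFlow, OVSDB, or something else) do the protocol-specific work:

```java
// Hypothetical SAL sketch: invented names, illustrative only.
import java.util.ArrayList;
import java.util.List;

// A protocol-neutral description of a forwarding intent.
class FlowIntent {
    final String nodeId;
    final String match;   // e.g. "ip,dst=10.0.0.0/24" in some neutral grammar
    final String action;  // e.g. "output:3"
    FlowIntent(String nodeId, String match, String action) {
        this.nodeId = nodeId; this.match = match; this.action = action;
    }
}

// Southbound "driver" contract: each control protocol implements this once.
interface SouthboundDriver {
    boolean canControl(String nodeId);
    void program(FlowIntent intent);
}

class OpenFlowDriver implements SouthboundDriver {
    public boolean canControl(String nodeId) { return nodeId.startsWith("of:"); }
    public void program(FlowIntent intent) {
        // Translate the neutral intent into OpenFlow flow-mod messages here.
        System.out.println("OpenFlow flow-mod for " + intent.nodeId);
    }
}

class OvsdbDriver implements SouthboundDriver {
    public boolean canControl(String nodeId) { return nodeId.startsWith("ovsdb:"); }
    public void program(FlowIntent intent) {
        // Translate the neutral intent into OVSDB table updates here.
        System.out.println("OVSDB update for " + intent.nodeId);
    }
}

// The abstraction layer: network services and applications call this and
// never see which protocol actually touched the device.
class ServiceAbstractionLayer {
    private final List<SouthboundDriver> drivers = new ArrayList<>();
    void registerDriver(SouthboundDriver d) { drivers.add(d); }
    void apply(FlowIntent intent) {
        for (SouthboundDriver d : drivers) {
            if (d.canControl(intent.nodeId)) { d.program(intent); return; }
        }
        throw new IllegalStateException("no driver for " + intent.nodeId);
    }
}

public class SalDemo {
    public static void main(String[] args) {
        ServiceAbstractionLayer sal = new ServiceAbstractionLayer();
        sal.registerDriver(new OpenFlowDriver());
        sal.registerDriver(new OvsdbDriver());
        // An "application" expresses intent once; the SAL picks the control path.
        sal.apply(new FlowIntent("of:switch-1", "ip,dst=10.0.0.0/24", "output:3"));
    }
}
```

The structural payoff is that adding a third control protocol means adding one driver class, while everything north of the SAL stays untouched.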

The biggest impact of this change, if it happens, would be to make OpenDaylight more of an SDN controller and less of an OpenFlow controller.  It doesn’t destroy OpenFlow but it would make it easier to develop control alternatives that might make more sense.  Recall that OpenFlow was really designed to be an easy way of communicating ASIC-level forwarding rules to hardware, something that is probably not really necessary as SDN evolves.  Why?  Because likely there would be control logic in the devices that could perform a compilation of forwarding rules from any reasonably expressed structure.  That would make it easier to apply SDN principles to protocols/layers that were opaque from a normal packet-header perspective.  Like optics.

All this makes one wonder just what Cisco has in mind here with OpenDaylight.  As I commented earlier, the controller code seems quite good, and these new concepts seem to advance the controller toward the point where it might actually provide the basic level of utility needed to drive an OpenFlow “purist” network model.  But I want to point out that routing and switching are still “applications” to the controller, and most of what we take for granted in networking is thus still up in the Frozen Northbound…APIs, that is!

Apple Swings for the Stars and Misses the Cloud

When your stock dips at the point in your big annual event where it’s clear that you have nothing else to say, you’re not in a happy place.  That’s Apple’s problem for today.  There’s no question that the Apple aficionados liked the changes to iOS and OS X, and perhaps liked the new notion of an Apple radio too.  Stars will surely be promoting the new look of iOS on TV shows.  The investors weren’t quite so happy.  I wasn’t either, because Apple again turned its back on the cloud.

Yes, I heard the iWork/iCloud thing.  It was Google Drive or SkyDrive a year or so too late.  Apple is still sending a very clear signal that its future may not be a cool device every six months, but rather software designed to buff up its current developer posture.  It’s not the cloud, and that’s a problem for Apple under the best of conditions.  We’re far from those conditions now.

In case you’ve missed my view on this issue, let me state it again.  In a mobile revolution there is absolutely no place for a strategy that isn’t grounded in cloud computing.  Appliances, to be relevant in terms of moving the ball, are going to have to get smaller and more specialized.  Small, specialized stuff can’t differentiate itself based on a GUI or internal application support.  Think of how differentiated an iWatch could be in either area.  The inevitable trend in appliances is toward the role of visual agent, which is even more limited than the functional-agent role that some cloud process might play.  A gadget with simple capabilities can do profound things if you link it to the power of the cloud.  Absent that linkage it’s just a toy, and so Apple’s reluctance to be a real cloud player makes it into a toy player instead, which won’t sustain its reputation or stock price over time.

Mobile users value context, the ability of the tech gadget they have to provide stuff that’s relevant to what they’re actually doing at the moment.  Contextual value is critical because mobile devices are integrated into our routine behavior, literally as part of our lives.  We use them not to research or to plan, but to decide, to do.  There’s an immediacy and a demand for immediate relevance that exists for mobile, and thus for all personal gadgets—because those gadgets are either mobile with us or they’re irrelevant.  Apple of all companies, the company of the cool and empowered, should have seen this all along.  They don’t seem to see it now.

The big beneficiary of Apple’s mistake with the cloud is likely to be not Google but Microsoft.  RIM’s disintegration put its slot in the smartphone space up for grabs.  One could argue that Blackberry enthusiasts could have defected to Apple or Google at any time, though arguably Apple’s lack of cloud insight would have made at least that migration path harder.  In any event, those that stayed are now fair game for Microsoft.  Microsoft has better corporate street cred than Apple or Google at this point, and if Apple refuses to exercise its cloud options optimally then Microsoft may have an unfettered run at number three in smartphones and tablets.

The “Why?” of this is probably pretty easy to see.  Apple is the company who has resisted any kind of resident sub-platform (even Flash) that would let developers port applications between iPhones and competitive devices.  Apple is the company who wants only in-iTunes and not in-app sales.  They’ve fought to control all the financial tendrils of their commercial organism, so it’s not surprising that they’d resist the notion of pulling functionality out of devices and cloud-hosting it.  Such functionality would be anything but limited to Apple platforms, and so it would weaken Apple’s desire to brand its ecosystem and not share it.

The cloud, then, is about sharing.  Web services are open to all, and so they inherently diminish the power of the device, the user platform.  For somebody like Microsoft whose platforms are at high risk in the current era, rolling the dice on the cloud makes a lot of sense.  Apple, king of the device hill, saw it differently.  So will other vendors, in both the IT and the network space, and like Apple they put themselves at risk.

IT, when combined with a more agile and dynamic vision of application componentization and deployment, could enrich companies by enhancing worker productivity.  Microsoft and all the other IT giants probably fear that a radical cloud vision could unseat them, which it could.  HP, Oracle, SAP, and other software/hardware giants are desperately trying to control the revolution.  Somebody could embrace it.  Apple could have articulated such a vision.

We are in the most innovative industry of all time, an industry that gave meaning to the notion of “Internet years” as a measure of the pace of progress.  It’s time to recapture that.  Defense is what killed the incumbents of the past, not innovation.

Servers, Clouds, NFVs, and Apples

The notion of hosting centralized network functionality appeals even to enterprises, and operators positively salivate over it.  There is a potential issue, though, and that’s the performance of the servers that do the hosting.  Servers weren’t really designed for high-speed data switching, and when you add hypervisors and virtualization to the mix you get something that’s potentially a deal-breaker.  Fortunately there are answers available, and we’re starting to see more of them as function virtualization gains traction.

In any activity where a server hosts a virtual switch/router the network I/O performance of the device will be critical.  Where a server hosts multiple VMs that include data-plane processing (firewalls, etc.) the same issue arises, perhaps even more so.  If a given server hosts a virtual switch (OVS for example, or one of the other open or commercial equivalents) to support adjacent VMs hosting virtual functions, the whole mixture can be compromised.  So you have to make the handling of at least data-plane packets a heck of a lot better than the standard hardware/software mechanisms would tend to do.
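To put some rough numbers on that (my own back-of-envelope math, not anyone’s benchmark): a 10G Ethernet port carrying minimum-size 64-byte frames runs at about 14.88 million packets per second, because each frame occupies 84 bytes on the wire once you add the preamble and inter-frame gap (10,000,000,000 / (84 x 8) is roughly 14.88 million).  A conventional kernel networking stack on a general-purpose server typically forwards something on the order of a million packets per second per core, so without a fast-path design you can burn a lot of cores and still not fill the pipe.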

The first player in this space that we became aware of was 6WIND, who has a network software stack for Linux that’s designed to harness the full capabilities of acceleration frameworks like Intel’s DPDK and to improve performance even when there’s no special hardware available.  The software divides into three segments, one of which is fast-path processing of the data plane.  If you dedicate enough cores to the process, 6WINDGate (as the software is known) can drive the adapters to their throughput limit.

6WIND’s website is quite good, with links that explain how the software manages high-performance network connections in SDN and NFV, and they were the first we saw assert specific support for NFV.  They’re also planning a seminar in conjunction with open-source giant Red Hat at noon, Eastern Time, on June 25th; you can get details and register online.  Red Hat is an important supporter of OpenStack, the cloud framework I like best for NFV, and their big Red Hat Summit is this week so I expect they’ll have some interesting announcements.  Because of their strength in Linux, credibility with operators, and experience in OpenStack, it’s pretty obvious that Red Hat is going to be a player in NFV.

And that could be important for a bunch of reasons, not the least of which is exposed by a JP Morgan report out today.  The report cuts the expected IT spending growth by half and cites particular weakness in the hardware space.  A big part of the hardware problem is PCs, but servers and storage are “sluggish” according to the report, and the reason cited is the long-expected plateau in virtualization.  I’ve noted for almost two years now that applications designed to run in a multi-tasking environment don’t need to be virtualized to use resources efficiently.  Obviously many such applications already exist (most mission-critical apps are designed for multi-tasking) and obviously software types would design applications for multi-tasking use where possible.  Anyway, virtualization is running out of gas in the enterprise, which makes it dependent on the cloud.

Even the cloud isn’t boosting spending, though, and the likely reason is that public cloud justification is a lot harder than people think and private cloud justification is harder still.  That’s what makes NFV important.  Our models have consistently suggested that metro infrastructure could be the largest incremental source of new data centers globally, and the largest consumer of servers.  Which is why it’s important to understand what servers need in order to host virtual functions.

The cloud figures in another way this morning—in Apple’s WWDC.  Setting the stage a bit for the conference is Google’s purchase of Waze, a mapping and GPS navigation player.  Some think that Google is as interested (or more) in keeping Waze out of other hands as in doing anything with it themselves, but the deal underscores the value of “social GPS” and the fact that the smartphone and tablet market is changing into a software and cloud market.  This is the transition Apple has dreaded the most and has supported the least, and even at this week’s WWDC the Apple fans are looking for the next cool device rather than for a conspicuous and determined move to the cloud.

If we see the cloud as IT economy of scale, we see nothing.  The cloud is the network-IT architecture of the future, the framework for applications and services that have to draw on the breadth of the Internet and its connected information to provide us with navigation, social connections, buying decisions, and life support in general.  It’s not only the “next big thing”, it’s likely the next-only-thing.  You get it, you support it, or you are putting yourself on a path to marginalization.

I know 6WIND and Red Hat get it.  I’m pretty sure Google does as well, but I’m not at all sure that either Apple or rival Microsoft has taken its training wheels off on the road to the cloud.  Apple still, in its heart of hearts, wants to be a purveyor of cool devices.  I bet more Apple engineers work on iWatches than on iClouds.  Microsoft wants Windows to come back as the platform of the future, and more people are probably working on Metro than on Azure.  This isn’t going to lead to a happy outcome for any of those old giants, so look for a change in direction—this week, from Apple—or look for trouble ahead.

The Clay Feet of the Virtual Revolution

UBS just published a brief on their SDN conference, where a number of vendors made presentations on the state of SDN, its issues and benefits, their own strategies, etc.  If you read through the material you get an enormous range of visions (not surprisingly) of what SDN actually is, a consistent view that somehow SDN is right around the corner (not surprising either), and a very clear and virtually universal point.  That point is that SDN and operations are highly interdependent.

The truth is that operationalization (the term I use to describe the engineering of a technology to allow it to be deployed and managed in the most cost-effective way) is the baseline problem for pretty much everything that’s important today—cloud, SDN, and NFV.  The operational challenge of the future is created by the fact that the complexity of cooperative systems generally rises with the number of relationships, which often approaches the square of the number of elements.  If you take a device and break it into a couple of virtual functions and a small dumbed-down data-plane element, you end up with something that could be cheaper in capital costs but will be considerably more expensive to operate, unless you do something proactive to make things easier.
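A little arithmetic (purely illustrative, my numbers) shows how fast that bites.  A set of n elements has up to n(n-1)/2 pairwise relationships, so 100 appliances imply at most 4,950 of them.  Break each appliance into three virtual pieces and you have 300 elements and up to 44,850 potential relationships, roughly nine times as many things to correlate, monitor, and troubleshoot, for exactly the same functionality.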

Enterprise and network operator management processes have evolved over time as networks have moved from being run through largely static provisioning to being run by managing (or at least bounding) adaptive behavior.  You could visualize this as a shift from managing services to managing class of service.  Arguably the state-of-the-management-art is to have the network generate a reasonable but not infinite service set, assign application/user relationships to one of these services when they’re created, and manage the infrastructure to meet the grade-of-service stipulations for each of the services in your set.  Get the boat to the other side of the river and you get the passengers there too; you don’t have to route them independently.

While class-of-service management makes a lot of sense, it’s still had its challenges and it’s facing new ones.  First, there is a sense that new technologies like SDN will achieve their benefits more through personalization, meaning that the ability of the network to adapt to user-level service visions is a fundamental value proposition.  Second, there is the question of how services work when they’re made up not of physical boxes in static locations, but of virtual elements that are in theory put anywhere that’s convenient, and even broken up and replicated for performance or availability reasons.

I’ve gone through these issues with both operators and enterprises in surveys, and with the former in face-to-face discussions.  It’s my view that neither operators nor the enterprises who really understand the challenges (what I call the “literati”) buy into the notion of personalizing network services.  The SDN or NFV of the future isn’t going to turn perhaps four classes of service into perhaps four thousand, or even four hundred.  Not only do buyers think this won’t scale, they think that the difference among the grades becomes inconsequential very quickly as you multiply the options.  So that leaves the issue of dealing with virtual elements.

What seems inevitable in virtual-element networks, including elastic applications of cloud computing and NFV as well as SDN, is the notion of derived operations.  You start derived operations with two principles, and you cement them into reality with a third.

The first of the two starting principles is that infrastructure will always have to be policy-managed and isolated from connection dynamism.  There is a very good reason to use OpenFlow to control overlay virtual SDN connectivity—virtual routers and tunnels—because you can shape the connection network that builds the customer-facing notion of application network services easily and flexibly that way.  There’s no value to trying to extend application control directly to lower layers.  You may use OpenFlow down there, but you use it in support of traffic policies and not to couple applications to lower OSI layers directly.  Too many disasters happen, operationally speaking, if you do that.  We need class-of-service management even when it’s class-of-virtual-service.

The second of the two starting points is that a completely elastic and virtual-element-compatible management model is too complicated for operations personnel to absorb.  We could give a NOC a screen that showed the topology of a virtual network like one built with NFV, or a view of their virtual hosts for multi-component applications.  What would they do with it?  In any large-scale operation, the result would be like watching ants crawl around the mouth of an anthill.  It would be disorder to the point where absorbing the current state, much less distinguishing correct from incorrect operation, would be impossible.  Add in the notion that class-of-service management and automatic fault remediation would be fixing things or changing things as they arise, and you have visual chaos.

The unifying principle?  It’s the notion of derived operations.  We have to view complex virtual-element systems as things that we build from convenient subsystems.  Infrastructure might be a single subsystem or a collection of them, divided by resource type, geography, etc.  Within each of the subsystems we have a set of autonomous processes that cooperate to make sure the subsystem fits its role by supporting its mission.  People can absorb the subsystem detail because it’s more contained in both scope and technology.

There’s a second dimension to this, and that’s the idea that these subsystems can be created not only by subsetting real technology based on real factors like geography, but also by grouping the stuff in any meaningful way.  The best example of this is service management.  A service is simply a cooperative behavior induced by a management process.  Whether services represent real or virtual resources, the truth is that managing them by managing the resources that are cooperating hasn’t worked well from the first.  Box X might or might not be involved in a given service, so whether it’s working or not tells you nothing by itself.  Boxes Y and Z may be involved, along with the path that connects them, but looking at the state of any of the boxes or paths won’t by itself communicate whether the service is working properly.  You need to synthesize management views, to create at the simplest level the notion of a service black box that presents a management interface as though it were a “god box” that did everything the service included.  The state of that god box is the derived state of what’s inside it.  You orchestrate management state just the way you provision services, and that’s true both for services using real devices and services made up of virtual elements.  The latter is just a bit more complicated, more dynamic, but it still depends on orchestrated management.
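Here’s a toy sketch of what I mean by synthesizing a management view.  The class names are invented and the worst-of-members rollup is deliberately simplistic; real derived operations would orchestrate state the way services are provisioned, but the “black box” idea comes through:

```java
// Hypothetical "derived operations" sketch: invented names, illustrative only.
import java.util.ArrayList;
import java.util.List;

enum OperState { NORMAL, DEGRADED, FAILED }

// Anything manageable exposes a state; real boxes and virtual elements alike.
interface Managed {
    OperState state();
}

class VirtualElement implements Managed {
    private final String name;
    private OperState state = OperState.NORMAL;
    VirtualElement(String name) { this.name = name; }
    void setState(OperState s) { state = s; }
    public OperState state() { return state; }
    public String toString() { return name; }
}

// The service "black box": its state is derived from whatever members happen to
// be cooperating at the moment, not read from any one device.
class DerivedService implements Managed {
    private final List<Managed> members = new ArrayList<>();
    void include(Managed m) { members.add(m); }
    public OperState state() {
        boolean degraded = false;
        for (Managed m : members) {
            if (m.state() == OperState.FAILED) return OperState.FAILED;
            if (m.state() == OperState.DEGRADED) degraded = true;
        }
        return degraded ? OperState.DEGRADED : OperState.NORMAL;
    }
}

public class DerivedOpsDemo {
    public static void main(String[] args) {
        VirtualElement firewall = new VirtualElement("vFirewall");
        VirtualElement router = new VirtualElement("vRouter");
        DerivedService vpnService = new DerivedService();
        vpnService.include(firewall);
        vpnService.include(router);
        firewall.setState(OperState.DEGRADED);
        // The NOC sees one synthesized state, not a crawl of individual elements.
        System.out.println("Service state: " + vpnService.state());  // DEGRADED
    }
}
```

Swap the membership underneath and the NOC’s view doesn’t change, which is the whole point of deriving the state rather than reading it off the boxes.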

So the lesson of the SDN conference?  We still don’t understand the nature of software-defined or virtual networks at the most important level of all—the management level.  Till we do, we’re going to be groping for business cases and service profits with less success than we should be, and often with no success at all.

Big Switch and OpenDaylight: Perfect Apart

Big Switch has announced its defection from OpenDaylight, and the move has been greeted with all the usual “I-told-you-so’s”.  According to the majority who’ve commented so far, this is because Cisco contributed its open-source controller as a deliberate move to crush the business model of Big Switch.  Well, I don’t know about that.

First, anyone who thought (or still thinks) that they can be successful based solely on an OpenFlow controller is delusional.  There’s a ton of stuff out there that’s open-source and free already (Big Switch has a free version).  The problem with OpenFlow is and has always been those pesky northbound APIs, the stuff that links OpenFlow with all the network and service intelligence that nobody wanted to bother sticking in…including Big Switch.  They needed, from the first, to focus on what they’re focusing on now—northbound applications.  They didn’t, and if they thought that joining OpenDaylight and contributing a free controller would get people up-sold to their commercial version, they were unrealistic.

But this doesn’t mean that Cisco isn’t playing its own opportunistic game.  Did somebody out there actually think that OpenDaylight was a happy band of market/tech socialists contributing all of their abilities to match all the market’s needs?  Everyone in the process is out for something, which has been true of every one of these groups in the past and will be true of all the future ones too.

In any case, Cisco isn’t gunning for Big Switch (why bother?); they’re gunning for VMware.  Everyone in the vendor space knows darn well that virtualization, the cloud, and SDN are going to combine to fuel NFV and a displacement of network devices by hosted software.  VMware is in a darn good position to be a major player in that space, though I think their smarts with NFV and even the broader missions of SDN have yet to be demonstrated.  In the enterprise, where Cisco can’t afford to take chances, VMware hurts, and an industry group like OpenDaylight erodes VMware’s advantage by standardizing and open-sourcing some of the key components.

And it’s not junky code.  I took a look at the OpenDaylight controller code, with the help of a friend who’s one of the hundred-odd engineers working on it.  It’s nice, modular, and not particularly Cisco-centric.  Yes, you can add a onePK plugin pretty easily, but that would be true of any plugin.  Why gripe about modularity and expandability?

I still have major reservations about OpenDaylight, and to be sure Cisco’s willingness to back it creates one of them.  But as I said before about the venture, whatever else it is, it’s an SDN activity that’s actually developing an implementation.  It could just do something useful, even northbound, at a time when standards activities are contemplating the whichness of why, their navels, or both.

Meanwhile, I had a chance to get a look at Brocade’s SDN strategy, and I was surprised by how good it was and how well they articulated it at a technical level.  Their Vyatta buy is paying off by providing them a flexible virtual router that’s just been picked by Rackspace for virtual hosting support, and they’re also represented in my survey of strategic vendors for network operators for the first time since the survey started.  No, they’re not threatening any of the leaders, but they will surely make the playoffs even if they don’t gain further as the responses come back.

The problem Brocade has is the same one many have, but with a complication.  Market positioning and website articulation are awful, press attention is non-existent, and they don’t kiss analyst rings so they don’t get any love there either.  And since Brocade is very much a channel-sales company, outreach is the only way to get a strategy change going.  If they can fix that problem they have a chance of riding SDN and NFV to some decent carrier attention, and especially of seizing some control of the nascent enterprise SDN opportunity.

With Brocade as with so many SDN advocates, though, it’s going to be NFV that differentiates them or tosses their last chance in the trash.  Brocade hasn’t articulated its NFV position but they are engaged in the activity and they’ve been quietly doing stuff that has gotten them carrier attention; it’s NFV in fact that carriers cite as their primary reason for being more interested in Brocade as a strategic partner.

SDN is going to be a big part of NFV implementation; in fact it’s likely that NFV will drive SDN consumption more than multi-tenancy in the cloud would.  Every NFV service will likely have to keep its virtual functions partitioned, and if the service is instantiated per user, as business services likely would be, then there will be many partitions.  Furthermore, interior traffic in virtual services of any sort is IMHO exempt from neutrality rules, which would allow operators even in the US to prioritize their intra-NFV traffic.  Net-net, this could be a very significant time for SDN, and NFV, and even OpenDaylight.  And, yes, for Brocade.

IBM Starts the Engine of the New Cloud

My biggest gripe about cloud coverage in the media is that we’re almost never talking about what’s really going on, but rather simply pushing the most newsworthy claim.  One result of this is that when something happens in the cloud space, even something that’s inherently newsworthy, we have no context of reality into which it can be placed.  So, we miss the important details.  So it is with IBM’s acquisition of SoftLayer.

Let’s get something straight from the first.  IaaS is not a particularly good business.  Why do you suppose that a mass-market online retail firm (Amazon) is the IaaS leader?  Because they can tolerate low margins, given that their core business already commits them to low margins.  IBM isn’t exactly a low-margin firm, and so it would be crazy for them to spend billions to get into a business that would draw down their gross margins by being successful.

So why, then, would IBM do the deal?  The answer is what’s important about the deal in the first place.  What IBM is telling us is that software and hardware and all of IT is becoming the cloud.  The cloud is the architecture for which businesses will write or buy future applications.  The cloud is the platform for which operators will develop or purchase elements of service functionality.  The cloud is what will do everything that our basic mobile devices don’t do, but that we still want done.  If iPhones and iPads are our portal to information, it’s the cloud that is that information.

IBM, as an IT company, is confronting this long-term truth via a short-term requirement from its buyers—“cloudbursting”.  Businesses have long told me that elastic use of public capacity to offload extra work or back up data centers in the event of a failure is the most credible application for cloud computing, and the only one that has any significant financial strength behind it.  So what they’re saying is 1) every profitable business cloud is a hybrid cloud, 2) every piece of software deployed henceforth will have to be cloudbursting-ready, and 3) if you want to sell anyone IT from now on, you’d better want to sell them hybrid-cloud IT.

SoftLayer is the public cloud part of IBM’s hybrid cloud plans.  SoftLayer gives them geographic scope and economy of scale in offering the hybrid services that IBM software must embrace or IBM won’t be able to sell it any more.  SoftLayer is also the IaaS underlayment to what IBM will eventually offer, what every cloud provider will eventually offer, which is platform-as-a-service.

Anyone who has ever written large-scale software systems knows that you can’t just take a piece of software, make a second copy in the cloud, and expect the two to share the workload or provide mutual backup.  Everyone who’s written systems designed for elastic expansion and fail-over knows that there are system services needed to make these processes work, and that you can’t build an operationally scalable system by letting every app developer invent their own mechanisms.  So if you hybridize a cloud, you first create a common platform of system services that apps can draw on, and then you extend it over both the public cloud and the data center.  That, IMHO, is what IBM now intends to do with SoftLayer.
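To illustrate what I mean by a common platform of system services (this is purely my own hypothetical sketch, not anything IBM or SoftLayer has described), imagine a single shared placement service that every application calls instead of inventing its own burst logic:

```java
// Hypothetical cloudbursting "system service": invented names and thresholds.
interface CapacityPool {
    String name();
    double utilization();          // 0.0 to 1.0
    void deploy(String component);
}

class SimplePool implements CapacityPool {
    private final String name;
    private double utilization;
    SimplePool(String name, double utilization) { this.name = name; this.utilization = utilization; }
    public String name() { return name; }
    public double utilization() { return utilization; }
    public void deploy(String component) { utilization += 0.05; }  // toy accounting
}

// One placement policy, shared by every application, is what keeps the
// hybrid operationally scalable.
class Cloudburster {
    private final CapacityPool dataCenter;
    private final CapacityPool publicCloud;
    private final double burstThreshold;

    Cloudburster(CapacityPool dataCenter, CapacityPool publicCloud, double burstThreshold) {
        this.dataCenter = dataCenter;
        this.publicCloud = publicCloud;
        this.burstThreshold = burstThreshold;
    }

    void place(String component) {
        CapacityPool target =
            dataCenter.utilization() < burstThreshold ? dataCenter : publicCloud;
        target.deploy(component);
        System.out.println(component + " placed on " + target.name());
    }
}

public class HybridDemo {
    public static void main(String[] args) {
        Cloudburster burster = new Cloudburster(
            new SimplePool("data-center", 0.85),
            new SimplePool("public-cloud", 0.20),
            0.80);
        burster.place("order-processing-worker");  // bursts to the public cloud
    }
}
```

Whether the real platform service handles placement, failover, data access, or all three, the principle is the same: the mechanism lives in the platform, not in every application.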

This isn’t going to be an easy thing for IBM to do; they are essentially inventing an architecture on which future cloud-specific apps will be hosted.  So the stakes are clearly very high, and that raises the question of how a trend that’s this important to IBM is likely to impact the other players in the space.

To start with, I think it’s clear that everyone who wants to sell cloud services to business will have to start thinking about having a cloud ecosystem that ranges from private software to public services.  HP and Oracle already have this, and likely the risk of these two companies’ getting ahead of IBM was a major driver in IBM’s decision.  Amazon doesn’t really have private software and Cisco, for example, doesn’t have public cloud.  There will be pressure on both these players to do something astounding, and do that quickly.  In other words, think M&A.

The second thing I think is clear is that when you start to think about failover and cloudbursting and hybridization, and you build a “platform” of services to handle that, it’s likely that your platform will extend to handle other areas of cloud hybridization that aren’t directly linked to the cloudburst/fail-over process.  Database support is a good example.  Network support is another good example.  Deployment and management (“operationalization”) are likely the best example.  In short, we’re going to invent pretty much that “cloud operating system” that I’ve been saying all along would be needed to get the most from the cloud.

From a networking perspective, the question is how this is going to happen.  The boundary between networking and IT is soft here; I estimate that almost half the total functionality required for this cloud-of-the-future could be put into either camp.  If networking were to grab the lion’s share of this up-for-grabs functionality, then networking and network vendors would have a very big stake in the cloud of the future and would likely gain market share and strength overall.  If networking hangs shyly back (or, let’s face it, hangs stupidly back) then the boundary between IT and networking will be pushed further down, leaving more functionality for guys like IBM and less for guys like Alcatel-Lucent or Cisco or whoever.  Which, of course, is something else driving IBM’s moves.

It seems to me that our old friend NFV is square in the sights of this change.  The fact is that every single thing that NFV requires to deploy service elements on hosted platforms is the same as what’s required to deploy application elements in a hybrid cloud.  So NFV could, if it plays its cards right, define a big chunk of that cloud-of-the-future while all the IT guys are still getting their heads around the process.  Even IBM may have less a plan than a goal here, and thus there may be some time for networking to get ahead.  Since I’m a networking guy (even though I’m a software architect by background), I’d sure like to see that.

Microsoft’s Path to Reorg? Redefine “SaaS”

I’ve blogged for a couple days now on the evolution of the service providers into software-driven services versus bit-driven services.  News is now floating about that Microsoft is going to transform itself from a software company into a devices and services company.  So you may wonder how these two things go together, and what it might mean for Microsoft and the market overall.

IMHO, the real question is likely less whether Microsoft is reorganizing somewhat as rumored than what “services and devices” might mean.  Obviously Microsoft is going to chase the tablet and phone markets but they’re already doing that.  Gaming is an area where they have some persistent success so that’s clearly going to stay as a high priority.  Do they think they’ll have their own glasses and watches, or maybe belt-buckles or rings?  The problem with the “device” part of the speculative reorg is that it would seem to put Microsoft in direct competition with Apple for cool new stuff at the very moment when Apple may be running out of that very thing.  So I think that we have to presume that what Microsoft may have in mind is first the services part and second device-symbiotic services, a partnership between devices and services.

The cloud as a platform has been distorted by early market developments.  We have successfully penetrated a heck of a lot less than 1% of the total cloud opportunity space and yet some are already saying that Amazon has a lock on things.  No it doesn’t.  The cloud isn’t IaaS, it’s SaaS where the first “S” stands for both “Software” and “Services”.  What the cloud is building toward is being the experience hosting point, and the network is evolving to be the experience delivery technology.  The “experiences” here can be content, social, or behavioral stuff aimed at consumers or it can be productivity-enhancing stuff aimed at the business market.  So this vision of cloud services would in fact fit the Microsoft rumors nicely.  This new SaaS space, unlike IaaS, needs platform software to deploy things efficiently, operationalize them at little incremental cost, and launch new experiences when a market whim shows even a vague sign of becoming exploitable.  That’s true whether the service provider is a telco trying to transform its business model or a software giant trying to do the same.

I’m describing a PaaS platform on which Microsoft would try to create a new version of its old Wintel developer partnership, and of course Microsoft already has Azure, which is PaaS, so you might think that’s a starting point.  It might be, but Azure wasn’t designed to do what Amazon now does with EC2 applications, and that’s to provide direct support for a mobile device experience from the cloud.  Microsoft would need to make Azure into something that’s more a platform for the experience dimension of services than just the business software dimension.  So Microsoft, to remake itself, will have to remake its cloud.  In doing that, it will have to do a whole lot of optimizing around the consumer space, where applications are highly transient in their execution, and where the device is most likely to launch a cloud process and wait for the result rather than gather all the info and figure things out for itself.

This concept of agency, I think, is the key to Microsoft’s reorg success.  Apple, bastion of device coolness, isn’t likely to go to a model of device that’s a camel’s nose under the cloud tent.  Neither is Google, who needs enough profit in the device space to keep its current partners loyal to Android.  But will Microsoft actually try to promote a handset that does less than current Apple or Android models?  They wouldn’t really have to; they’d only have to accept that there would likely be a drive to simplify the handset if the agent concept were promoted by a big player.  Microsoft would then have to make the agent device more capable in other missions, perhaps offline activities, to justify a higher-value device.

Some operators are already very interested in this sort of thing, with Mozilla’s Firefox OS being an example.  Telecom Italia and others have looked at this featurephone-plus model of service delivery already, and most operators would like to see handsets get cheaper, which is why the availability of an agent-model device would likely put more commoditization pressure on handsets.  But clearly operators would be willing to continue to deploy smarter and more expensive devices if that’s what consumers wanted.  So Microsoft, in pushing an agent-model service relationship with devices, would have to make consumers value add-on features or see its own handset market commoditize—including Windows Phone, which is only now gaining ground.

Windows Phone and Windows RT are Microsoft’s biggest assets and liabilities at the same time.  To make its reorg work, Microsoft will need to drive Phone and RT to the service-agent model, which puts them (and the rest of the mobile device market) at risk.  But so what?  Microsoft isn’t the leader in tablets or phones, so the other guy has more to lose anyway.  If Microsoft were to be bold here, they could invade Apple/Google territory while taking on relatively little risk themselves by comparison.

Lessons from the CTR

We’ve all no doubt heard (read) the rumors of Cisco’s next core router, the “CTR”, and read that it’s designed to be able to support the “lean core” model.  What we haven’t heard is that such a mission is essentially a validation of my long-standing point that network connectivity—bit-pushing—isn’t profitable enough anymore, and that radical changes are going to happen to reverse the slipping ROI on network devices.

Somebody did a university study a decade ago that pointed out that the revenue per bit was very different at different places in the network.  Out in the services layer, at the customer point of connection, some services were generating thousands or tens of thousands of times the revenue that the same bits generated inside the Internet core.  What this proves is that in terms of willingness to pay, the context of bits (the services) is the critical factor.  Where context is necessarily lost (in the core, every bit is pretty much like any other), value is lost.

For a couple of decades, operators have recognized that the traditional model of the network wasn’t going to offer them reasonable ROI on their deeper assets, and they looked at things like optical meshing, agile optics, and flattening of OSI layers to address the problem that equipment costs were high in the very place where bit-value was low.  If you look at Cisco’s CTR, and at other newer “core” products from competitors, you can see that these products are more and more designed to be lean handlers of routes or even electro-optical transition devices.  Where bits are the least valuable, you need lean handling.  Conversely, where lean handling is acknowledged as a need, bit value is acknowledged as a problem.

This point about variable value per bit also shows why new services are so important.  When an operator provides a bit-transport service to the user, the value of that bit is still hundreds of times less (by that study) than the bit would be valued if you stuck it inside a service like mobile voice or SMS or even TV viewing.  The experience that users pay for is what commands the margin, and if operators offer only bits then they invite others (the OTTs) to create the experiences.  That’s what’s been happening up to now, and what the operators want to stop.

If we start with the retail services, the experiences, we can see how something like SDN can be highly valuable in creating an agile, optimized, operationalized framework for creating and delivering the good stuff.  I can write a stunning SDN justification by starting with OpenStack Quantum and working my way down to the infrastructure.  Because Quantum is linked to the deployment of application components (SaaS, which is profitable) or service virtual components (NFV, which operators expect to be at least less costly than custom hardware), I can validate a business benefit to drive my investment by following Quantum’s trail downward.
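Here’s what the top of that trail looks like: a minimal sketch of a request to Quantum’s v2.0 API asking for a tenant network.  The endpoint, port, environment variable, and class name are my own assumptions, and a Keystone token has to be obtained first, but the point is that an application-level request like this is what ultimately pulls connectivity (and potentially SDN behavior) out of the infrastructure below:

```java
// Sketch of a Quantum v2.0 "create network" request; endpoint and token handling are assumptions.
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class QuantumNetworkRequest {
    public static void main(String[] args) throws Exception {
        // Assumed Quantum endpoint; the token must already have been issued by Keystone.
        URL url = new URL("http://controller.example.com:9696/v2.0/networks");
        String token = System.getenv("OS_TOKEN");

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        if (token != null) {
            conn.setRequestProperty("X-Auth-Token", token);
        }
        conn.setDoOutput(true);

        // The application-level ask: "give my components a network."
        String body = "{\"network\": {\"name\": \"app-tier-net\", \"admin_state_up\": true}}";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }

        // Whatever plugin sits behind Quantum (overlay, OpenFlow, vendor SDN) now has
        // to realize this abstract network on real or virtual infrastructure.
        System.out.println("Quantum responded: " + conn.getResponseCode());
    }
}
```

Follow that request down through the Quantum plugin and you land on exactly the kind of infrastructure decisions SDN is supposed to make, which is where the business case comes from.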

If I don’t do that, then I have to presume that SDN simply changes retail transport/connection service in some way.  What way?  Do we expect that SDN would create some new service interface and protocol?  I don’t think anyone is proposing that.  Do we expect that SDN will make the experience of routing or switching different—and by “experience” I mean what’s seen looking into the UNI from the user side?  Don’t we still have to deliver packets to where they’re addressed?  Sure, we could in theory have a user’s capacity expand and contract dynamically with load, or over time.  So you’re saying we can’t do that now with current protocols?  I respectfully disagree.  Furthermore, if the operator has to size the network for the peak capacity people could use or risk not being able to deliver what’s needed, how does this elasticity change their cost picture?  If it doesn’t, how do they price it more favorably than just giving the user the maximum capacity?

We can also look at the question of whether NFV might be shooting at the wrong duck by targeting cost reduction rather than new services.  The problem that network operators have with “new services” is that even if they’re a lot more profitable than the old connection/transport services, the old services are still there and dragging down the bottom line.  If an operator offered a content service that earned 40% gross margins and was losing 10% on the basic network transport service associated with the delivery of that content, they’d still be earning 30%, which is better than losing 10%.  However, their OTT competitors with the same 40% gross margin and no network losses to cover would be earning the whole forty.  Or, more likely, they’d be pricing their content service lower and undercutting the operator.  The network has to be profitable; you can’t carry its losses up to the higher layers to be paid off there, or you won’t be able to compete with others who don’t have those losses in the first place.  So NFV and other initiatives like lean-core or agile optical networking have to happen no matter what happens at the service layer.

Do you want to know what the biggest danger is?  Us.  We don’t want to believe that the world of unlimited usage and expanding capacity and exploding service sets, all delivered at steadily lower costs, is going to end.  Because we don’t want to believe it, we listen to people who say that the future won’t include the loss of all the goodies.  We form an inertial barrier far greater than the cost inertia of long-lived telco devices, the politics of huge telco bureaucracies, or the intransigence of old-line standards types.  Our industry, our lives, are the sum of the decisions we make, and we need to make some good ones now.  Facing reality doesn’t assure that, but it’s at least a good start.