Why We’re Not in the Age of Network Software Yet

Gartner released some IT data it will be discussing further at a financial analyst conference next week.  The information suggests that IT spending is growing more slowly than previously estimated, and it probably won’t surprise you to learn that software growth is well ahead of hardware growth.  The company says that “Price pressure based on increased competition, lack of product differentiation and the increased availability of viable alternative solutions has had a dampening effect on the short term IT spending outlook.”  They expect “normal” spending to resume in 2015-2018 (a recovery I don’t expect).

I think that networking can learn something from IT here.  I now use a laptop computer that probably cost less than $500 as a “desktop” system.  It has a quad-core processor and plenty of storage and memory and it actually outperforms my old desktop system.  I’m sure the manufacturer is happy I made the substitution, but the fact is that part of my motive was that I was getting more for less.  That “for less” part is the point about price competition and lack of differentiation Gartner cited.  If you can’t find something else that’s quantifiably better to you, you make a price-based choice.  Bits, being hard to differentiate, have commoditized.

We hear a lot about networks shifting to software for differentiation and commoditization avoidance, but we’re missing the point here.  “Software” per se is not the issue.  If you consume a software-produced bit versus a hardware-produced bit, it’s still a bit.  Software is more valuable not because what you do with it is better, but because you can do more differentiating things more easily.  My PC is a brick, whether the old or new one, without software to run on it, and it’s the software that creates the value proposition.  The software isn’t the value proposition, it’s just a vehicle.

In network terms, that means that SDN and NFV are not game-changers in and of themselves.  How much service innovation are we hearing about for either?  I don’t mean “service innovation” as a path toward cheaper bits or lower operations costs, I mean innovation of the kind that enterprise software can offer.

To create differentiating features, we need three things.  First, we need something that’s valuable to the buyer that can be targeted.  Utility is the first test of differentiability.  Second, we need to be able to target that something quickly enough to satisfy buyer needs while the need still exists.  That’s agility.  Finally, we need to be able to do the Golden Something at a price the user will pay and with a level of profit that justifies our making the attempt.  That’s efficiency.

Why has network software not met these tests?  It clearly has not, because we’re not seeing any explosion in “network software”.  I think the answer is simple; we don’t have “network software” today.  We have software used in the network, as part of hardware platforms, to push bits.  We should think of network software as serviceware, as the stuff that you write services in.  I’m not talking about a new programming language to succeed Java or Visual C++, but rather a new model.  We have to be able to address utility, agility, and efficiency, and that means coming up with some way of “writing services” that does the job.  Sure, that will involve “software” in the traditional sense, just as a computer chip involves addition or data movement, but it has to go beyond the basics or it’s a commodity right out of the box.

To my way of thinking the biggest advance we’ve made in the network services space, from the perspective of platform innovation, is the notion of modeling.  We are seeing glimmers of a future where you build a service by dragging and dropping elements at the GUI level.  Yes, it’s software that will create the GUI and other software that will translate the model you create into a functioning cooperative system of resources, but the architecture here is new; the modeling “language” we use becomes the serviceware of the future.
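The modeling idea is easier to see in miniature.  Here’s a hedged Python sketch (all names and structures are hypothetical illustrations, not any standard’s schema) in which the service is a declarative model and generic software walks the model to commit resources:

```python
# A minimal sketch of the "serviceware" idea: the service is a declarative
# model (here just nested dicts), and generic software translates the model
# into resource commitments.  Names are illustrative, not a real schema.

SERVICE_MODEL = {
    "name": "business-vpn",
    "elements": [
        {"type": "vpn-core", "realization": "legacy-mpls"},
        {"type": "firewall", "realization": "vnf", "image": "fw-v2"},
        {"type": "access", "realization": "sdn-path", "endpoints": 2},
    ],
}

def deploy(model):
    """Translate each model element into a (pretend) resource commitment."""
    commitments = []
    for element in model["elements"]:
        # A real translator would dispatch to per-realization handlers:
        # NFV deployment, SDN path setup, legacy provisioning, and so on.
        commitments.append(
            f'{model["name"]}/{element["type"]} via {element["realization"]}')
    return commitments

print(deploy(SERVICE_MODEL))
```

The point of the architecture is that the differentiation lives in the model; the software that interprets it stays generic.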

You’re probably thinking at this point about just what modeling language that might be.  Even among the “literati” of network operators we have relatively little understanding of service modeling languages.  Two-thirds of operators think of modeling of services as being modeling of the networks, and obviously that’s not the case.  The network isn’t going to create the serviceware of the future because services are on the network not in it.  We have to look to the cloud, and there we have an array of possibilities.

There’s a bunch of stuff out there, including OpenStack itself, that focuses on modeling applications.  Everyone who has used OpenStack knows that Neutron includes models, but when you look at these models they’re still models of the networks on which applications live, not of the applications themselves.  We do have some emerging stuff that shows signs of relevance, like the OASIS TOSCA (Topology and Orchestration Specification for Cloud Applications) model, but all of it shares one fatal flaw, the flaw of NFV and SDN, and even the cloud.  The flaw of focus.  All our model revolutions are talking about modeling service infrastructure for deployment and management, not modeling service logic.  We’re using advanced tools to make the stuff we do more efficient, but not making it better.

I worked on the TMF’s SDF activity years ago, and one of the issues that came up very quickly was the difference between the service logic plane and the service management plane.  Making something we assemble with a model useful, agile, and efficient mandates both logic and management, and we’re still stuck on the management-model side.  What do assembled components of a service do?  How do we create useful functions by combining functional components?  I can provision five components into a glorious ecosystem with NFV and connect them using SDN, but those components move the ball on differentiability at the service level through what they do, not through how I assembled them.

Bodies like OASIS, the NFV ISG, the ONF, and all of the other groups already running or starting up in the network service layer space should look at the logic/management point carefully, because if they don’t they’re not doing network software at all, they’re doing softened hardware.

An Open Letter on Open NFV and SDN

We continue to have a lot of discussion around the use of open-source technology by network operators in NFV and SDN applications.  Obviously, the majority of the SDN spectrum is already covered by open source (OpenDaylight for example) and there’s no shortage of open source in NFV either, with OpenStack, Linux, and so forth.  Why is there a need for a formal group, under the Linux Foundation, then?  If there is a need, is the process likely to meet it?

If you looked at either SDN or NFV holistically you’d find that there are three basic pieces to each.  There’s underlying infrastructure (in SDN, white-box switches and in NFV server pools), there’s applications that control behavior (VNFs in NFV and those “northbound apps” in SDN) and there’s some central binding/control logic—orchestration in NFV and the Controller in SDN.  We have some open-source contribution to all of these areas already.

The problem, as operators report it to me, is that their business case for SDN or NFV demands a fairly comprehensive implementation of a whole service and operations ecosystem that can accommodate these new technology options.  You hear all the time about how we have to integrate SDN with network management or NFV with OSS/BSS, but it’s more complicated than that.  The management tools and practices of a pre-virtual world don’t map optimally to the new cloud-SDN-NFV tools.  That lack of optimality means that you can’t make as good a business case, which means less gets deployed.

In my early spring survey of operators, the consensus of C-level executives was that their trials for SDN and NFV were proving technology choices but not proving the business case.  That means that all this work is at risk for being rendered useless, because nobody cares if something works technically if it can’t create enough value to deploy.  I think that one big driver to the open-source movement is the frustration of operators with the fact that they don’t have a clear set of SDN or NFV ecosystem options in place, that they can’t do a useful trial for lack of functional scope.

For the last decade, operators have faced business transformation pressures, and they clearly face them even more starkly these days.  Normally they’d have expected a competitive equipment market to come to their rescue, producing new gear to take them in new directions.  Since 1999, though, the vendor community has been induced by Wall Street to focus on selling the next quarter’s quota, and has simply not been interested in rolling the dice on some transformed future.  So, say the operators, we have nothing from them that builds the ecosystem up to the top, where the benefits collect.  Vendors all hope somebody else will do it, and operators think that if vendors won’t, then open-source is likely the only choice.  They can’t band together in any other forum without generating anti-trust complaints.

OK, so we do have a problem.  The question then is whether the project under the Linux Foundation can solve it.  Some of that is beyond our ken right now because we don’t know what they’re going to do, but there are some disquieting signs in my view.

Sign number one is that the activities we already have in open-source networking, and the proposal for the NFV one, are based on what I’ll call the “ONF model”, meaning that they’re a group with “membership levels”, the top levels of which will require a significant annual investment and the bottom levels of which may not even admit smaller tech companies who can’t pay ten to fifty grand per year to play.  So who funds these “Platinum” levels?  Vendors, big ones, the same ones who haven’t been delivering commercial solutions to the transformation dilemma their customers face.  Why do it now, then?  Wouldn’t these people just hunker down as they’ve always done to protect their current revenue model?

Sign number two is that the early target areas for open-source NFV are in the areas of NFVI, the infrastructure.  Did we learn nothing from SDN?  What good is a protocol to drive switch behavior in the absence of any way to model target behaviors and present them to users?  We’ve proved we can build IP network areas and Ethernet data centers using SDN, but we’ve had those things long before SDN came along, so the benefit is hardly revolutionary.

A realistic target for NFV open-source has to start with a set of goals, and here are my candidates:

  • First, you require that the architecture admit all current open-source network functions running on a cloud-capable platform, without changes to the functions.  That lets open-source into the VNF opportunity space.
  • Second, you require that there be a boundary between infrastructure and services that is abstracted, so that it can support both legacy technology (devices or collective services presented through a provisioning interface like that of an NMS) and the new SDN/NFV infrastructure behind the same abstraction.
  • Third, you focus on finding open-source tools that can be enhanced to get you where you need to be, not on developing something from scratch.  We’ve spent nearly two years now on NFV standards.  Do the operators want to spend two more on open-source NFV development?

But we’ve still not targeted the problem, the benefits.  So here’s a stark truth for you.  Every single document I’ve seen that presents a credible survey of operator interest ranks service agility as the top goal for both SDN and NFV, and ranks operations efficiency as either goal two or three (capital savings is ranked higher in some).  What generates agile services?  It’s not just agile components, it’s an agile component assembly process.  That means an orchestration or service-building model that can quickly define complex services, services that initially and maybe forever will be a mixture of legacy stuff and new SDN/NFV stuff.  Where is that structure?  Everyone who talks “orchestration” and “management” is taking all of the new virtual things and sticking them into the old operations model, a model that everyone agrees is too monolithic even now.

We’re supposed to be talking about these issues next week.  So my challenge to the operators, to the Linux Foundation, is show that something can be done to drive NFV and SDN deployment, not just prove that deployment is possible.  There are a lot of things that can be done that won’t be, because nobody can make the case.  Prove that SDN and NFV don’t have to be among that group of losers.

Why Not Try SETTING Internet Policy Explicitly?

Public policy is often set in the courts, and we have both a current and an impending example of court action on our networking industry.  As is always the case, the issues may be decided in law but not necessarily in the eyes of those with interest in the matters.

The Supreme Court dealt what most observers think will be a death blow to Aereo, the company that used little subscriber-specific antennas to receive TV broadcasts for re-streaming over the Internet.  The Justices ruled the company’s model to be a violation of copyright, which I’ve said from the first was my view as well.  Had the decision gone the other way, I think over-the-air TV and even content production would have been hurt.  There might have been another jump in Internet video traffic and further risk to Internet performance.

Aereo demonstrates that putting something on the Internet doesn’t make it free, or legal.  It’s just one piece of a long-standing battle between the forces of rampant consumerism (“I want what I want and I want to name the price, which will likely be zero”) and what global regulatory authorities have long called the “health of the industry”.  It won’t be the last battle either.

There’s a healthy dose of self-centeredness in a business model like this; you have to believe many in Aereo knew in their hearts that this wasn’t likely to work, and the argument that it should comes down to saying that if you are clever in how you infringe on copyright you can beat the rap.  What’s at least a related question is whether it’s reasonable to build a business on delivery of a service over a medium that has cost but not price—the unlimited-usage Internet model.  The Supreme Court isn’t ruling on that now, but the FCC and the DC Court of Appeals have been dancing on the issue—net neutrality—and eventually it’s likely to end up in the Supreme Court too.

Where things stand now is that the Courts have told the FCC that most of the stuff they wrote into the original neutrality order would be fine if the FCC hadn’t previously said that ISPs weren’t common carriers.  The regulatory authority is there, but only for that category of provider.  That leaves the FCC with the choice of either declaring the ISPs to be common carriers and subjecting them to all the telecom regulations, or abandoning some of the neutrality policies.

The FCC’s straw position here is that there’s nothing wrong with “fast lanes” or with ISPs charging content providers for carriage.  That position isn’t popular in many venues, but the fact is that the FCC can’t do anything else unless either it regulates ISPs as common carriers or Congress amends the Telecom Act.  There’s fear that ISPs would start charging every content provider for carriage, and the content providers would then pass it along in their prices.  There’s fear that the inability of startups to pay for carriage would strangle innovation.  There’s fear that without some check, ISPs will simply abandon unlimited usage, a trend that was emerging just a year or so ago.

The fundamental problem here is that we’re talking about the Internet, which is not only a best-efforts service with no specific standard of service quality, but also a network where the QoE depends on a bunch of things that the customer doesn’t even know about.  How do we deal with this variability in setting policy?

Let me cite a personal example here.  I changed computers recently, and with the new system I’d noticed I was having some annoying delays in loading pages and picking up emails under some conditions.  Since the old system was still hooked up, I could compare the two and see that the old one did better, which is hardly what you’d expect when you get a new system.  So I looked at the Internet setup on the two systems, and what I found was that on the old one I’d overridden my ISP’s (Verizon FiOS) DNS with Google’s and on the new I’d forgotten to make that change.  My performance shot up as soon as I switched to Google DNS.  I’d had problems with Verizon’s DNS before, and that’s why I’d gone to Google in the first place.  So your ISP can regulate your traffic flow, in theory, simply by delaying your DNS decodes.  I’m not saying Verizon did that, only that my experience proves that it would be possible.  Can you imagine writing an FCC order that would deal with that sort of thing and getting it past the Supreme Court?
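If you want to reproduce that kind of comparison, the measurement itself is simple.  Below is a hedged Python sketch of a timing harness; the two resolvers are stand-in stubs (a real test would wrap actual DNS queries, and results will vary with your network), so the sleeps just simulate a laggy versus a snappy DNS server:

```python
import time

def time_lookup(resolver, name, repeats=5):
    """Average seconds per call for a resolver function on one name."""
    start = time.perf_counter()
    for _ in range(repeats):
        resolver(name)
    return (time.perf_counter() - start) / repeats

# Stand-in stubs; a real comparison would wrap actual DNS queries instead.
def laggy_resolver(name):
    time.sleep(0.02)    # simulate a slow DNS response
    return "203.0.113.10"

def snappy_resolver(name):
    time.sleep(0.002)   # simulate a fast DNS response
    return "203.0.113.10"

slow = time_lookup(laggy_resolver, "example.com")
fast = time_lookup(snappy_resolver, "example.com")
print(f"laggy: {slow:.3f}s per lookup, snappy: {fast:.3f}s per lookup")
```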

The business relationships involved in Internet service are also often opaque.  Most Internet users don’t realize that content providers pay for CDN services so their stuff is cached closer to the user and can be delivered with a better QoE.  In most cases it would probably be up to the CDN provider to pay access ISPs for carriage, which some already do.  So here’s the question: if it’s OK for content providers to pay for CDN services without violating neutrality, why is it not OK for them to pay ISPs?  Should all ISPs do their own CDN services and refuse to support the big public CDNs, or should the ISPs charge the big CDN guys?  In the end, if we want good video delivery we have to be willing to pay somebody to get it, as long as there’s capital investment needed to ensure it.  Should we let content players pay ISPs for carriage and forget CDNs completely, cutting out the middleman?  You can see the point; it’s not easy here either.

There is always a risk in trying to set policies to guide an industry in any direction but where natural market forces will take it.  Obviously there is a risk that too little neutrality, too much copyright protection, would stifle innovation.  There’s also a risk that too much neutrality and too little copyright protection would erode investment by limiting the chance of profitable returns.  I happen to agree with the Aereo decision and with the current (apparent) turn in neutrality policy, but I do have concerns with how we’re going about it.  Rather than have Internet issues turn on legal interpretations, we need to have legislation that balances the factors effectively and explicitly.  Given the Washington dysfunction, though, that may be too much to hope for.

Forget IoT; Think “Cloud of Things”

The “Internet of Things” is one of those concepts that starts with at least a grain of truth and gets enveloped in the inevitable wave of hype.  We seem, as an industry, to be incapable of addressing anything that’s not characterized by a hockey stick growth estimate and an ever-expanding-and-less-precise definition.  The problem is that you can have something that’s terribly overblown but still important, and the fanfare forces you to either accept a borderless problem set or ignore the whole thing (well, “things” in this case).  The fact is that IoT trends could be significant, just not everything they’re assumed to be.

We have to start any IoT discussion by defining some realistic boundaries.  We have millions of “networked devices” today that provide sensor information in things like home control, and that number is growing at a predictable rate.  However, virtually none of these devices is on the Internet and there’s no good reason to put them there.  Short-range home wiring and RF technologies like X10, ZigBee, and Insteon all provide homeowners with a way to trigger alarms when somebody opens a door or window, when a freezer gets too warm, or when water accumulates somewhere.  We should start by defining IoT as applications where something is actually Internet-hosted.

We do have examples of IoT in home control.  Most of the residential monitoring products have offered the option to call out to a monitoring center, and in the Internet age we’ve added the option to text or email, and also the option to review and control the systems from a PC, phone, or tablet.  This is the space Apple has been looking at.  What’s on the Internet isn’t the sensors but the control system, so you go from millions of devices down to perhaps a tenth of that.

There are also some sensor applications that do require, or benefit from, more conventional Internet-like addressing, though.  Where short-range RF won’t work (because ranges aren’t short) people have looked at cellular technology.  Some automotive and process control applications do involve putting something on the Internet, though it’s easy to skid from the reality of smarter vehicles to the exaggeration that every vehicle will be online or will interact with online traffic sensors in a couple years.

In general, control networks should not be “on the Internet” in a direct sense; you should have specific measures in place to make sure they aren’t addressable except by the controllers that work with the data.  It’s these controllers, as I’ve said, that are going to be expanding.  It’s significant but it’s not a network of a bezillion new things.  Also in general, traffic between the controllers and higher-level applications or users won’t be a big thing, certainly not enough to generate any major blip in an Internet traffic pattern set by video use.  I had a chance to look at the industrial control traffic of a big plant, and what was generated in the way of “Internet traffic” even if we define VPNs as the Internet was, over the period of a month, less than that generated by one YouTube viewer in a day.

So where is the substance here, if anywhere?  The big thing is that we do have a number of credible applications of IoT that could generate network and IT changes.  Most of them involve the management of things that move, from trains and trucks and ships to cars and even bikes.  Most, as far as my surveys of enterprises can validate, are really applications of what many feel to be a boring RFID technique.  The nice thing about RFID is that you can have thousands of tags that yield information when pinged by a much smaller number of sensors, and so the cost per object is much lower.

RFID today tends to be a short-range specialized technology, like being able to ping a box as it passes on a conveyor or is delivered to a home, or even “taking a ticket” on a transportation system by pinging it.  Still, you can see that if you tagged something and could read its location when it passed a sensor, there’s an opportunity in what could be called a “sensor service”, where somebody pays to have sensors in key places and sells or licenses the data to companies who couldn’t afford to cover the same geography with dedicated sensors.

Traffic from an RFID sensor network could be greater than that of a standard control network, but still a pimple on the video growth curve.  Cisco won’t sell more routers to carry IoT traffic (though the story earns them more media attention, which is likely why they tell it).  But RFID sensor networks show us the real issues with IoT.

Issue one is security and privacy.  Sensors that are widely distributed and collectively analyzed can always track something that’s tagged, and that means that packages, devices, clothing, and other stuff that you carry (even stuff like a handout that you’re given and absent-mindedly stick into a pocket/bag) could be used to track you.  How would they know it was “you”?  By correlating the track of a tag back to a point where that track intersects a transaction or activity that establishes identity.  Buy something and walk out of the store and not only might Big Brother be watching you, you might be carrying him along.

Issue number two is access.  Suppose we had a million RFID sensors out there, spewing information every time someone passed.  We could in fact generate billions of data points per minute.  Do we believe that people, even authorized people, are looking at this data in real time?  The complexity of the event processing would be daunting.  The best way to think about these true IoT-like applications is that they are feeding a big-data repository and a set of analytic processes.  Not only does that tame the challenges of delivering sensor data to potentially thousands of users in real time, it can be used to reduce security and privacy concerns.  If track data is a day old, it’s still useful for legitimate profiling of movements and (where it’s legal) even people, but it doesn’t present the same level of personal risk as it would if the data could track someone to where they are now.

The cloud is likely the thing that will make IoT real or unreal.  A cloud-based process to collect data in convenient places like Hadoop clusters, and to provide for queries on that data subject to policy constraints, would be easier to regulate and could easily be offered as a SaaS service.  So when you hear about the “Internet of Things”, think of the “Cloud of Things” instead.

Security, Compliance, and Reality: Different in a Virtual World

There’s been a lot of news recently about “security” or “governance” in the cloud, SDN, or NFV.  It’s certainly fair to ask how new technologies are going to support long-standing requirements in these areas, but I wonder whether we’re not imposing not only old practices but old rules on a set of very new technologies, all based on the principle of “virtualization” that breaks so many boundaries.

In a virtual world we have services or capabilities represented by abstractions (“virtual machine”) that can be instantiated on real resources on demand, by assigning from a pool of such resources.  Breaking the barrier of fixed resources can solve the problem of under-utilization and higher costs.  We’ve done this with services for ages—“virtual circuits” or “virtual private networks” appear to be circuits or private networks but are created on a pool of shared facilities.
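The pattern is mechanical enough to sketch.  Here’s a minimal, illustrative Python model of it (the names are mine, not any product’s): an abstraction is instantiated on demand by committing a real resource from a shared pool, and releasing it returns the resource for reuse:

```python
# Minimal sketch of the virtualization pattern: abstractions are backed
# on demand by real resources drawn from a shared pool.  Illustrative only.
class ResourcePool:
    def __init__(self, hosts):
        self.free = list(hosts)
        self.assigned = {}

    def instantiate(self, vm_name):
        """Commit a real host to back the named abstraction, if one is free."""
        if not self.free:
            return None              # pool exhausted
        host = self.free.pop(0)
        self.assigned[vm_name] = host
        return host

    def release(self, vm_name):
        """Return the backing host to the pool for reuse."""
        self.free.append(self.assigned.pop(vm_name))

pool = ResourcePool(["server-1", "server-2"])
print(pool.instantiate("vm-a"))   # server-1
print(pool.instantiate("vm-b"))   # server-2
print(pool.instantiate("vm-c"))   # None: nothing free
pool.release("vm-a")
print(pool.instantiate("vm-c"))   # server-1, reclaimed from the pool
```

Sharing is what delivers the utilization gain, and sharing is also where the security questions below come from.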

When we think about security or governance in our new technologies we should start right there, with virtualization.  It demands a pattern, an abstraction.  It demands a pool of suitable resources, and a mechanism for resource commitment and ongoing management.  Every one of these things is an element of virtualization, and every one is a risk.

To start with, who created these abstractions?  They’re recipes in a sense, so we’d want them to come from a trusted cook.  Same with the ingredients, the tableware, and so on.  To mix metaphors, think of a virtual environment as a bank vault; you’ve got to be sure about what’s getting in.

Authenticity is the first requirement of virtualization security, and that means that everything in a virtual world has to be on-boarded explicitly, and only by trusted personnel through trusted processes.  The logical assumption here is that when something is on-boarded, it’s also credentialed, so that as it’s presented for use it’s collaterally recertified.  We think of on-boarding applications or virtual functions or even devices as a process of wrapping them in the proper technical package, but it’s got to be more than that.

Credentialing something isn’t necessarily easy; if you take a device like a router or a software load, you can’t just slap a badge on it.  In most cases, the thing you’re certifying is the functional aspect of the device or software, which means that you have to develop a hash key that can tell you whether the thing you’re adding is the version it represents itself to be and hasn’t been changed or diddled elsewhere.
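The core of that check is just a digest comparison.  Here’s a hedged Python sketch (real systems would add signed manifests and key management on top of this, and the artifact bytes are illustrative):

```python
import hashlib

def fingerprint(artifact_bytes):
    """Digest computed once, at on-boarding, by a trusted process."""
    return hashlib.sha256(artifact_bytes).hexdigest()

def verify(artifact_bytes, expected_digest):
    """True only if the artifact is bit-for-bit what was on-boarded."""
    return hashlib.sha256(artifact_bytes).hexdigest() == expected_digest

onboarded = b"router-firmware-v1.2"        # illustrative artifact content
trusted_digest = fingerprint(onboarded)

print(verify(b"router-firmware-v1.2", trusted_digest))          # True
print(verify(b"router-firmware-v1.2-diddled", trusted_digest))  # False
```

Rechecking the digest every time the element is presented for deployment is what turns a one-time on-boarding step into ongoing recertification.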

The second issue we face is related to the processes that manage virtual environments.  Not only do you have to credential the resources, the recipes, and the components, you have to credential the software elements that drive the virtual management and deployment.  And you have to certify the things you’re running as “payload” like cloud components or VNFs.  If you have a flawed process it can circumvent authentication of the elements of your infrastructure.

A corollary to this is that you have to be very wary of application components or virtual functions having direct control over resources or the ability to load other components directly.  You can check the credentials of something if it’s introduced through your deployment processes or spun up by a management system, but how do you ensure that every application that loads another copy of itself or of a neighbor component will properly check?

One of the things this raises is the need to protect what I’ll call interior networks, the network connections among an application’s components or a service’s features.  If I put the addresses of all the elements of a virtual application or service into the user’s address space, why can’t users address those elements directly, and through that compromise not only their own application but perhaps even the infrastructure?  We don’t hear much about the idea that a multi-component application or a set of virtual network functions should be isolated completely from the service network except at the points where these components/functions are actually connected to users.  We should.

Virtual environments also require tighter control over resources that exchange topology and reachability data.  All too many network devices trust adjacent connections, but we all know that false route advertising is a problem for the Internet.  We have to accept that in virtual structures, we can’t trust anyone without verification or the whole trust and certification process will crumble around us.

You’ll note that I’ve not mentioned things like secure links, etc.  Encryption and security in the traditional sense of protecting payloads from interception, or firewalls to limit what addresses can access something, are all fine but they’re also things we understand at the high level.  What we need to be doing is focusing not on how to make today’s end-to-end security measures work in an SDN or NFV or cloud world, but on why the differences in those worlds might make them NOT work.  For example, can you trash central SDN control by presenting a lot of packets for which there are no forwarding rules?  It could create a controller denial of service attack.  Yet how do you adapt traditional security measures to prevent this, particularly if you expect that packets will normally be presented “outside the rules” and require central handling authorization?
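One plausible mitigation, sketched in Python (the class and figures are illustrative, not any controller’s API): meter table-miss packets per source before they reach the controller, so a host spraying unmatched packets exhausts its own budget rather than the controller’s cycles:

```python
# Sketch of a defense against the table-miss flood described above: a
# per-source budget in front of the controller's packet-in path.
# Names and parameters are illustrative, not a real controller's API.
from collections import defaultdict

class MissRateLimiter:
    def __init__(self, budget_per_window):
        self.budget = budget_per_window
        self.counts = defaultdict(int)

    def admit(self, source):
        """Should this table-miss packet be forwarded to the controller?"""
        self.counts[source] += 1
        return self.counts[source] <= self.budget

    def new_window(self):
        self.counts.clear()    # called on a timer in a real switch agent

limiter = MissRateLimiter(budget_per_window=3)
decisions = [limiter.admit("10.0.0.99") for _ in range(5)]
print(decisions)  # first three misses reach the controller, the rest drop
```

Note the tension this illustrates: if “outside the rules” packets are the normal case in your design, any such limiter is throttling legitimate traffic too.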

Even basic things like SLAs and MIBs, which can be part of a compliance program because they define application QoE and therefore worker productivity and response, can be difficult in a virtual world.  A real firewall has explicit performance and availability numbers associated with it, and repair or replacement can help define an explicit MTTR.  If we make that firewall virtual, then we don’t even know whether the components of it at a given moment in time are the same as the moment before.  In fact, we might have three or four firewall components behind a load-balancer.  Calculating the MTBF and MTTR in these situations is non-trivial, and knowing when something has changed may be outside the range of expected management responses.  Does your firewall tell you when it reroutes flows among its logical elements?  How then do you know when a virtual one does?
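The arithmetic itself is worth seeing.  Assuming independent failures, availability is MTBF / (MTBF + MTTR), and N interchangeable components behind a load-balancer (any one of which suffices) give a combined availability of one minus the product of the individual unavailabilities.  The figures below are illustrative; the hard part in practice is that the real component set changes underneath you:

```python
# Working the load-balanced virtual firewall case as arithmetic, under the
# simplifying assumption of independent failures.  Figures are illustrative.

def availability(mtbf_hours, mttr_hours):
    return mtbf_hours / (mtbf_hours + mttr_hours)

def parallel_availability(avails):
    """Combined availability when any one of the components suffices."""
    unavail = 1.0
    for a in avails:
        unavail *= (1.0 - a)
    return 1.0 - unavail

# One physical firewall appliance: MTBF 10,000 h, MTTR 4 h.
single = availability(10000, 4)

# Three virtual instances, each individually less reliable (MTBF 2,000 h,
# MTTR 1 h), but redundant behind the load-balancer.
virtual = parallel_availability([availability(2000, 1)] * 3)

print(f"single appliance:    {single:.6f}")
print(f"3 virtual instances: {virtual:.6f}")
```

The redundant virtual configuration can beat the appliance on paper, but only if your management system actually knows how many instances are healthy at any moment, which is exactly the visibility problem described above.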

We’re exploring our revolutionary strategies in a very limited way right now, and that’s letting a lot of users and providers dodge general issues that will inevitably arise if we broaden our use of this new stuff.  If we don’t want to stall everything short of the goal, then we have to look a bit closer at how we’re getting to that goal.

Can Oracle Turn Micros into a Cloud Play?

We hear this morning that Oracle is going to buy Micros Systems, a major provider of software to retail/hospitality users.  What should we think?  Well, everyone knows the “Law of Large Numbers”, and most know that the term is applied to so many things it’s fair to say there’s a “large number” of the laws.  I want to contribute a couple of new ones in analyzing this deal.

First is the Law of Mass-Market Numbers, which says that you can create a mass market only from “average” people by definition.  Our thoughts about the cloud violate common sense because they focus on IaaS, a cloud service option that’s unlikely to be consumed by the “average” buyer, meaning the SMB.  The fact is that the mass market for cloud services has to be one for SaaS because that’s something that can be consumed without a strong technical support foundation.

Oracle’s cloud strategy may have evolved through canny planning or been forced on them by tardiness, but whichever was the case they are likely on the right track.  What you need to be a cloud success, most of all, is something you can sell as SaaS, and Micros brings that to the table.  You have a large prospect base, widely distributed, and it’s a market that can be weaned into the cloud easily from the premises framework that now dominates.

Retail/hospitality isn’t the only example of this model, either.  There are dozens of verticals with a homogeneous set of core-system requirements of their own, verticals that could be credibly addressed by a package that has already won decent market share as conventionally licensed software.  Oracle could jump from Micros into other spaces, and all the easier if everyone keeps their head in the IaaS gutter instead of in the real clouds.

Of course, there’s still the question of how you create an effective SaaS strategy.  If you look around in the market today, you can see that we have three different “roles” emerging regarding the SaaS cloud, and Oracle and every other player will ultimately have to bet on one.

The first role is that of the software-overlay supplier.  Micros/Oracle clearly fits in this role; they can offer the application that when cloud-hosted becomes SaaS.  By taking this position in the cloud, Oracle could promote SaaS to cloud operators and at the same time promote cloud migration to end users via IaaS or PaaS.  If you have a really strong software position you have a hope of obtaining decent margins, but if you don’t offer the underlying cloud infrastructure yourself, you’re forced to rely on and share revenue with those who are prepared to do that.  Some of those “others” might field ineffective infrastructure that could taint the SaaS application’s reputation.

The second possible role is that of the full-spectrum SaaS provider, the company who builds out a cloud and populates it with their own applications to create SaaS.  This role has the obvious advantage of capturing all the cloud revenue and eliminating dependence on others in creating the retail offering or sustaining it at a credible level of QoE.  It has the obvious disadvantage of introducing a very low-margin component, the IaaS foundation.

This is where our second “law of large numbers” comes in, the “Law of Respectable Infrastructure Scale”.  While it’s not true that doubling the size of your resource pool doubles your economy of scale (efficiency follows an Erlang curve, so it converges on a plateau), it is true that you need enough capacity to absorb customers, support cloudbursting, provide for geographic diversity, and drive server utilization as close to 100% as you can.  With this second SaaS model, you’re likely to face a very significant positioning investment (what carriers call “first cost”) just to get your service coverage to where it needs to be.  That delays the realization of real profits from the model.  Even when you reach full efficiency, you may find that you’ve made two-thirds of your total capital investment at something yielding less than your target ROI, which means you’re dragging down your overall ROI.
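The Erlang plateau is easy to illustrate numerically.  The sketch below uses made-up figures (a 1% blocking target, not anything from my model): the per-server utilization you can sustain at fixed blocking climbs steeply for small pools and then flattens toward a ceiling.

```python
def erlang_b(servers, offered_load):
    """Erlang B blocking probability, via the standard iterative recursion."""
    b = 1.0
    for k in range(1, servers + 1):
        b = (offered_load * b) / (k + offered_load * b)
    return b

def max_utilization(servers, target_blocking=0.01):
    """Largest per-server load keeping blocking under target, by bisection."""
    lo, hi = 0.0, float(servers)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if erlang_b(servers, mid) <= target_blocking:
            lo = mid
        else:
            hi = mid
    return lo / servers

# Utilization rises with pool size, but with sharply diminishing returns.
for n in (10, 100, 1000):
    print(n, round(max_utilization(n), 3))
```

Roughly: a 10-server pool can only run in the mid-40-percent range at 1% blocking, a 100-server pool in the mid-80s, a 1,000-server pool in the 90s.  Each order of magnitude of capital buys a smaller efficiency gain, which is the “first cost” trap the paragraph describes.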

But the third option is even worse; you can become a hardware cloud provider, someone who offers basic IaaS or some form of augmented PaaS to independent software vendors, who then share revenue with you in some way when their software is converted to SaaS.  Here you have the same ROI issues that the second model presented, without any good way of sweetening the pot with add-on retail software.  You end up having to support the business goals of the SaaS software vendors of the first model, but at a lower ROI than they have!

And yet it gets worse, because of the third “large-numbers” law:  “The operational cost of a system of components is proportional to the number of components (N) raised to a power greater than 1.”  That means that the opex of large clouds designed to support large numbers of dynamic application components and users will increase faster than linearly.  My model says that cloud opex will become a larger factor in cloud cost than component capital costs well before maximum efficiency levels are reached.  Building a truly effective cloud is more a management problem than a matter of assembling a resource pool.  And building a multi-layer cloud profit model with hardware, OS/middleware, and application software players will be even more complicated to operationalize.
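With invented coefficients and an invented exponent (an illustration, not my model’s actual figures), you can solve directly for the scale at which supra-linear opex overtakes linear capex:

```python
def opex_crossover(capex_per_node, opex_coeff, exponent):
    """Pool size N at which total opex (opex_coeff * N**exponent) equals
    total capex (capex_per_node * N); beyond this, opex dominates.
    Derived by solving opex_coeff * N**exponent = capex_per_node * N."""
    return (capex_per_node / opex_coeff) ** (1.0 / (exponent - 1.0))

# Illustrative numbers only: $100/node capex, $10 base opex, exponent 1.2.
print(opex_crossover(100.0, 10.0, 1.2))

# Opex share of total cost grows steadily with scale.
for n in (10, 1000, 100000):
    opex, capex = 10.0 * n ** 1.2, 100.0 * n
    print(n, round(opex / (opex + capex), 3))
```

The point isn’t the specific crossover; it’s that for any exponent above 1 there is always a crossover, so “get bigger” is not a complete strategy.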

I think there’s an important lesson here.  We are going to have to do a lot more work in operationalizing cloud infrastructure and services.  We are going to need a whole new spectrum of automated tools, tools that don’t just expose management interfaces but actually provide automated responses.  It’s facile to say that we have to move to “policy-based” operations, but at the very least we have to move to operations driven more by policy than today’s are.  So, for Oracle, the key to their Micros success may be what they’re willing to do to automate the cloud they hope to deploy it in.

Oracle’s Cloudy Cloud Position

Oracle reported their quarterly numbers, and they’re always a good player to analyze to understand market trends and issues.  Oracle is like IBM in that both companies are respected incumbents with a very strong sales-versus-marketing bias in their way of doing business.  They’re unlike IBM in that they have a much stronger software bent and arguably are more focused on the cloud revenue model (despite the comments years ago by Larry Ellison).  This IBM-likeness opens some other avenues for comment, as it happens.

The quarterly numbers yelled “transition” and not “success”.  License revenue missed expectations but cloud revenue growth was impressive.  Hardware sales were up for the second quarter in a row, but that space still doesn’t command respect on Wall Street.  Oracle shows that even in the software age, a software company isn’t an automatic win.

The thing that stands out in Oracle’s earnings call is their focus on the cloud.  It’s clear that Oracle believes that its SaaS and emerging PaaS businesses are transformational to its revenue model, for the obvious reason that they substitute spread-out services revenue for up-front license revenue.  We all knew that one cloud benefit on the user side was the service-versus-capital/license shift, so it’s not surprising that vendors see the other side of it.  What Oracle shows is that it creates the potential for short-term pain for longer-term gain, and the Street isn’t into that kind of tradeoff.  You can see that in the fact that Oracle is off nearly 7% pre-market today.

The cloud is also a strong driver for the transformation of Oracle’s hardware business to their “engineered systems” or appliance model.  There is a general industry trend toward packaged functionality because buyers don’t have the staff to perform complex integration tasks and because the cloud encourages function-as-a-service deployment, which appliances are truly good at providing.  In fact, I think that Oracle has a good thing going for it in “FaaS” as a bridge between traditional platform as a service and what I’ve been calling “platform services”, the web services that augment applications by exposing valuable features as URLs.

Oracle’s PaaS seems to be coalescing into a platform services collection; only their Java Cloud Service could be considered PaaS in the strict old sense of Microsoft’s Azure.  Since I think that platform services are the best approach to the cloud (it’s too late for old-line OS-and-middleware PaaS), I think this positions Oracle better than anyone other than Amazon in the cloud services space, and Amazon isn’t making private cloud software available, which Oracle is.  Oracle may have waited a long time for their cloud model, but it seems to be a very good one.  It’s not perfect, though, and that’s where the risks (and the similarity to IBM) come in.

My modeling has long shown that the largest potential source of new data centers in this decade is the carrier cloud, which would make service providers the most important vertical.  Oracle’s website treats the service provider as a kind of “horizontal vertical” under “Communications” so this most important industry of all has to dig through a couple of layers to find specific product detail aimed at its needs.  Worse yet, the detail Oracle provides for operators focuses on fairly pedestrian (though recognized) issues like customer experience management and billing.

We are embarking on a revolution in operations created by virtualization and the cloud—yes, that very cloud Oracle is making into a focus.  Taken from Oracle’s perspective, the carrier cloud should be their hottest target because it’s middleware-centric (Oracle is a big middleware player), it favors large credible vendors (which Oracle is), and it favors companies who can invest some R&D with delayed gratification (Oracle’s cloud drive commits them to this anyway).  Oracle needs to be taking the lead in carrier cloud, which is mostly about what I’ve called “universal MANO”, the management and orchestration of hosted components of services or applications.

Oracle has great credentials in VoIP evolution, but that’s not something operators want as much as something that’s being forced on them.  Many tell me there are serious questions in their minds on whether explicit voice services are even a good idea; letting Skype win is credible to them if voice ROI will be very low.  They have great signaling credentials, but other than IMS it’s hard to identify evolutionary service trends that demand conventional signaling (Diameter) at all.  Why not develop some credentials in MANO?  Their top NFV offering isn’t MANO, it’s session control and VoIP, which arguably are only hostable functions and not NFV at all, absent a focused MANO companion element.  On their call, Oracle never mentioned networking, much less SDN or NFV.

What Oracle has said in formal presentations about NFV seems to focus its aspirations on making NFV into a new agile IMS platform.  NFV, which is about agile and manageable function hosting, is most valuable when it’s applied to dynamic service elements.  IMS is a long-term multi-tenant application that arguably doesn’t even need the cloud, much less NFV.  Further, IMS is something that RAN providers from the network equipment side have an advantage in; they’ve got to be in the deal because you need radios to make mobile services work.  It’s not clear you need all of IMS, and Metaswitch’s Project Clearwater offers all you’d likely need other than registration.  It’s open source, so it’s a tough competitor.

Where does IBM come in?  I’m getting there.  If Oracle wants its PaaS to be optimal it has to recognize that platform services tend to be market-target-specific, and that those for the carrier cloud space (the biggest opportunity) demand MANO, SDN management, and other things that are at least being stimulated out of NFV.  It also has to position its assets while there’s still time.  Carriers have a long sales cycle but a good concept can get entrenched there and make a boatload of cash for those who provide it.  The MANO opportunity is addressable now and highly differentiable relative to competitors.  To make it Oracle’s own, though, Oracle has to focus its marketing collateral and website on that space.  Obviously, it also has to cover the space adequately.

IBM is also the most significant possible entrant into the NFV/MANO space.  They have all the tools needed, and their model for orchestration appears to be very similar to the one I believe is the optimum one (in fact it seems rooted in the same standards my ExperiaSphere initiative is based on, at least in modeling services).  IBM Research, the source of most of IBM’s insight in the space, is now a contributor to an ETSI PoC.  Suppose IBM gets this right, very right?  That would make future earnings calls for Oracle a lot more problematic.

Wind River’s Ecosystemic Solution to NFV and Orchestration

I blogged just yesterday about the possibility that the competitive dynamic in the orchestration space would be changed by Cisco’s Tail-f deal.  Since then we’ve had another announcement in the space, this one from Wind River.  The company’s NFV approach has a lot of good about it, a singular issue I’d like to see them deal with, and a high probability of further defining the NFV and orchestration space.

I’ve always liked Wind River, partly because they are an Intel company and it seems like the combination of open-source software and chips is a match made in heaven.  I also like their Carrier Grade Communications Server concept (so much that it’s a part of my ExperiaSphere model, as the recommended platform for hosting).  They’ve been active in the NFV space from the first, so they’re not just dabbling in the space to ride the hype wave.  Now, they’re addressing the critical area “above” the NFV Infrastructure platform, which is good.

The strategy Wind River has selected is one of an ecosystem of partners, which they call their Titanium Cloud ecosystem.  The program, whose announced initial partners are Brocade, GENBAND, Metaswitch Networks, Nakina Systems, and Overture Networks, is focused on building upward from CGCS to address the needs of the cloud, SDN, and NFV more fully.  Two of these companies were members of the CloudNFV project (Metaswitch and Overture) and they represent both the VNF side of NFV (Metaswitch) and the orchestration and infrastructure side (Overture).

I’m not a big fan of ecosystem approaches generally, as readers likely know.  What makes this one different is that you could build NFV completely from the ecosystem and even include the mobile/IMS stuff that’s the most popular example of an early NFV opportunity.  You can’t say that something is fluff when it’s functionally complete, and from that perspective Wind River’s Titanium could jump out into a space with few current inhabitants—the “NFV product” space.  The operative word here is “could”, and that’s the thing I’d like to see Wind River address more directly.

The most significant question I have about Titanium is “Who sells it?”  It appears that each of the vendors involved could sell their own stuff, and that would likely include Wind River’s CGCS, but all of these vendors have a different slant on NFV and, beyond that, provide different pieces of the high-level puzzle.  Any vendor in Titanium who wanted to sell a complete ecosystemic solution would have to integrate it.  None would have the credibility of Intel behind them.

A second question is the details of that assembly and integration.  I built a project from multiple vendors to implement NFV so I know what’s involved.  You have to promote some vision, some overall architecture, that helps get all the pieces assembled in an orderly way, or what you have is just a marketing convocation, a cheering section.

Integrating the pieces from Titanium into an offering wouldn’t be difficult, I think, but it wouldn’t be trivial either.  Some of the partners are competitors at least in part, and as more partners are added it’s going to be harder to ensure that the sum of the parts adds up to anything useful.  It will also raise the question of who stands behind the summation.  Operators, to be confident about something as revolutionary as SDN, NFV, or the cloud, have to be confident that they have a trusted partner and a clear architecture.  Wind River will have to address that to actually realize the functional completeness they “could” provide.

That could be complicated, because Intel may be riding a large number of horses in the NFV and orchestration race.  Tail-f has been an Intel Network Builder partner; they’re still on the site.  So is an Intel company building an ecosystem that would be competitive with a Network Builder?  Does Intel hope that Cisco will adopt Wind River CGCS, and does that hope extend to either Cisco becoming a part of Titanium or Titanium being used as part of Cisco’s strategy for MANO and/or NFV?  I don’t think Tail-f is a complete solution but you could put one together from Titanium partners, as I’ve said, and that would give Cisco a better story.

Since the Titanium players could (that word again!) field what could be the most complete strategy for NFV and MANO commercially available, they certainly put pressure on Alcatel-Lucent, Dell, HP, and Red Hat.  Alcatel-Lucent could face a formidable challenge to the early credibility lead CloudBand has secured among operators.  HP could find itself facing functionality essentially comparable to OpenNFV from someone who can actually deliver it all.  Dell and Red Hat now see competitors with a far better story than their own.  Unless they want to abandon a carrier vertical that could be the largest source of new data centers in the remainder of this decade, they’ll have to step up.

One player who might actually be happy with this is Ericsson.  If you’re a company who wants to make money on professional services, then having a credible ecosystem rather than a credible product leading the charge in NFV and MANO is a good thing because of that nagging question of who integrates and stands behind the ecosystem.  Ericsson may have an answer for that—they do.

Thus, Titanium may elevate the question of professional services for integration versus single-source as much as, or more than, it elevates the NFV and orchestration dialog.  Do operators want best-of-breed components and an integrator, or a single source?  In my surveys they’re on the fence on that point; about 40% say one, 40% the other, and 20% say they can’t pick at this point.  Operator attitude might not matter much if nobody offers a single-source solution.  Getting that single-source solution will be complicated by the fact that the cloud, SDN, and NFV are all evolving in their own separate spaces (even though they rely on common notions like MANO).  If somebody can make an ecosystemic approach real, they could drive things forward in all three of our revolutions, to both the market’s benefit and their own.

How Cisco’s Tail-f Deal Helps…Helps its Competition

Cisco isn’t the largest networking company, but they’re likely the most famous.  They’re associated with the IP revolution, the Internet, and the way that a little company can suddenly become a giant and make a lot of people rich along the way.  “Be the next Cisco” entered the lexicon of startup success, in fact.  So when they make an acquisition that’s reportedly in the new hot space of “orchestration” people take notice.

Cisco certainly needs to have an orchestration strategy.  I’ve blogged before that Cisco’s success likely depends in large part on riding the cloud effectively to IT success for UCS.  They can do that without orchestration only if someone else provides it, and that’s because without MANO you can’t improve service agility and operations efficiency in NFV applications or effectively deploy highly agile and componentized applications in cloud computing.  Of course, if somebody else is providing MANO then 1) Cisco has to put its aspirations on hold until it arrives because cloud/NFV spending won’t peak without it, or 2) Cisco has to bet that capex control alone will promote the cloud.  It can’t wait for the former and the latter would put it into a commodity market.

The problem is that Tail-f, who has from the first been promoting YANG and NETCONF, isn’t IMHO much related to orchestration at all.  YANG and NETCONF are a modeling language and a configuration protocol, respectively, that together model a network and exchange parameter data between a management system and a device.  Tail-f has promoted them for SDN, for NFV, for the cloud, and Cisco has bought into it largely, I think, because Cisco has always tended to do M&A based on sales recommendations and not on strategic value.

Networks are made up of abstract functions that are realized by committing sets of cooperating resources.  More modern concepts of cloud orchestration (TOSCA for example) reflect this model and can be applied to both cloud-hosted functional elements and devices.  The principle here is to develop a hierarchy that will envelop current management and provisioning tools, and emerging stuff like SDN’s “northbound APIs” in a common abstraction set so that you can ask for a VPN without caring how it’s set up.

In three successive projects and two formal standards activities, I’ve worked on abstraction-based encapsulation of current and emerging software and hardware, and I think the results have proved out in every case.  ExperiaSphere will prove shortly that you can use the correct standards and open source tools to orchestrate at the functional level and accommodate both legacy devices and emerging technologies.  CloudNFV, with similar functional capabilities but without the open-source requirement, just won a TMF award for innovation, and innovation is what we need to face the future, not a new way of reorganizing stuff we can already do.

That Tail-f has been involved in a lot of trials is a testament to their engagement with telcos and the fact that the telcos are wrong-headed in the main on the topics of SDN and NFV.  You don’t think about software abstractions if you’re a switch/router guy, and so you’re easy prey for a story that converging on a single control mechanism at the bottom will somehow make SDN or NFV happen faster.  Any software guy will tell you that nothing good comes from bottom-up design.  It’s possible that Cisco is seeing Tail-f as a way of harmonizing other equipment into Cisco’s own evolution-not-revolution SDN and NFV approach.

If you want the services of the future SDN or NFV world to somehow be delivered by the same switches and routers as always, then it’s possible that having some mechanism to coerce the behavior you need from the other vendors you’d likely find in a buyer network would be valuable.  If Cisco is really committed to policy control of networks, then the networks would be divided into little administrative zones subordinate to a policy controller that would induce the necessary behavior from the devices below.  That could be useful to Cisco, but it’s not orchestration, and even policy hierarchies are hierarchies.  Cisco is introducing features to a market, validating them through its own demonstrated interest, and then failing to nail them down as issues Cisco can own.

That would be a major risk for Cisco, because their move here is almost certain to create a lot of competitive counterpunching, and in the media where having a good story is more important than some lab trial wins.  We’ve seen competitive strategies in orchestration that are going about this the right way, based on a software-centric hierarchical model.  Any of these could be used to generate some real market interest, and thus create a threat to Cisco.  Of particular concern is HP, who has on paper a super strategy but still has to fill it out.  Might they now, given Cisco’s move, take a bit of time and money and get their whole offering together?  If so they could present a pretty exciting alternative to what’s been totally pedestrian Tail-f positioning and totally self-serving Cisco MANO stories.

HP is also dangerous at the technical level, because as an IT company they don’t particularly want to change the world down at the network device level.  At least, not any faster than buyers can see the value in doing so.  Every salesperson knows that you don’t propose a project that costs your buyer a ton of money when you’re not even going to get that money.  Every IT company who looks at SDN or NFV will naturally want to add servers and software to current network configurations as an overlay, to organize and optimize and agile-ize what’s already there.  The benefits then flow from the very first.  They don’t want to diddle with parameter flows in the basement, which Cisco now seems to have committed to doing.

Tom’s First Rule of Positioning:  Don’t validate a market with a step that doesn’t seize ownership of that market.  You’re pointing everyone in the right direction and then heading out down a different path.  Cisco may have started the orchestration land rush with a move that ties them to a tree, to watch as others race off to the finish line.

What Amazon’s History Shows Us About their Future

The future of networking is being defined literally as we speak, but not by the companies who’d like to define it, or even the ones we’d think would be doing the defining.  We’re seeing a passing of innovation from the traditional network players to the OTT players, and from traditional network vendors to a collection of software types, some of which work for OTTs and others of which work for nobody anyone has ever heard of.

One company that everyone has heard of and that’s increasingly driving the revolution is Amazon.  In the news recently with its Prime music service, Amazon has been a kind of universal spoiler, a company who has managed to step on a lot of other people’s dreams.  Now, on the edge of their possible entry into the smartphone space, we have to ask “What is it that makes them so powerful?”  Are their people more insightful, smarter?  What could it be?  I think Amazon has a lot of smart, insightful people, but so do others.  Their secret lies in positioning and simple business metrics.

In a race to the bottom, a world of commoditization, the guy with the lowest internal rate of return (IRR) will always win.  Generally speaking a company can invest profitably above its IRR, so a low IRR means that projects that would be ROI poison for others could be a profit haven for you.  Public utilities or former ones, like the network operators, have historically low IRRs and so they fit this happy model.  Amazon, as a mass-market retailer, also has a low IRR.  Cisco, Apple, Google…all the competitors that seem to have at least as much glamour as Amazon or more…have higher IRRs.  They fight Amazon with one financial arm tied behind their back.
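Reading the post’s “IRR” as the minimum return a firm will accept on a project, the arithmetic is easy to sketch with invented cash flows: the same project is a winner at a mass-market retailer’s low hurdle rate and a loser at a glamour tech firm’s high one.

```python
def npv(rate, cashflows):
    """Net present value at a given discount rate; cashflows[0] is the
    upfront outlay (negative), later entries are annual returns."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))

# A commodity-style project (illustrative): $100 out, $25 a year for 5 years,
# an internal return of roughly 8%.
project = [-100.0] + [25.0] * 5

print(round(npv(0.06, project), 2))  # at a 6% hurdle: positive NPV, do it
print(round(npv(0.15, project), 2))  # at a 15% hurdle: negative NPV, walk away
```

That one sign flip is the whole “one financial arm tied behind their back” point: the commoditizing businesses that are poison to Cisco, Apple, or Google clear Amazon’s bar comfortably.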

But if common carriers also have low IRRs, then what distinguishes them from Amazon?  The answer is that Amazon has a tech culture bred on efficiency while the operators have one bred on guaranteed rate of return.  Amazon learned early on that to be profitable they had to wring every last penny of margin from something, and so they’ve looked at technology not as some sort of divine mandate but as a simple tool.  You adopt what’s useful, in the way that’s the most useful.

But that’s not all.  Amazon learned the simplest lesson of all with its Prime service, the lesson of ARPU.  You’ll succeed even if you can’t grow your customer base as long as you can increase the average revenue per user.  Amazon Prime is like an annuity.  If Amazon can make Prime a durable value, they get slow but steady revenue growth from what it’s now clear will be regular price increases for Prime.  That’s in addition to their retail margins.  People are paying Amazon to be customers.  I don’t think an ad-sponsored mobile service would be a logical move for them, and I’m not convinced that even ad-sponsored hardware would be.  Their edge is that people pay, and where the money is, there is the future.

There’s been criticism of Prime Music as there has been of Prime Video.  OK, I agree that if you’re on the leading edge of music or video you’re not going to sate yourself by consuming Prime stuff (as an old-timer I’m not looking for the latest, so I don’t see the problem myself, but I understand those who do).  That’s not the point.  Make Prime valuable enough with incremental new stuff and you can get users to tolerate price creep.  Which is ARPU creep for Amazon.  Where in networking, besides them, do you find a company confidently predicting rising ARPU?

But there’s more.  In an industry where everyone seems to be joining or creating standards groups, Amazon remains aloof.  Their implied message: screw standards.  All standards do is encourage commoditization and empower the weaker players.  If you have the market power to go your own way, you do that.  The Amazon cloud strategy is a good example.  They’ll accept standards that help them and ignore the rest.  This, while their service provider customers are spinning their wheels on consensus-building.  Innovation trumps standardization any time in the real world.

In the cloud, arguably the biggest opportunity of the current age, Amazon is balancing forces others don’t even seem to see.  They have to know, better than anyone, that pure IaaS is an economy of scale race to zero margins.  Many companies would, as Microsoft has done, note that higher cloud service models are better for the user and the provider alike.  But Amazon is incumbent in IaaS.  Their solution was to create something above IaaS but not PaaS or SaaS.  “Platform services” is what I’ve called it, but Amazon hasn’t even bothered to name it because they know they can demonstrate value and so don’t need market hype.  Platform services are extensions to IaaS that draw on the Internet web service model; they create little islands of utility that any application can access, and so they’re agnostic on the xaaS debate.  But they add value, people will pay for them, and they extend Amazon’s IaaS incumbency into the future.  Salesforce and Microsoft may hold higher ground, but it’s in the middle of an open field of fire.  Where do they go without coming down and facing the bullets?

Amazon will likely field a handset and become an MVNO.  I think they’re even plotting their Internet of Things strategy.  Why?  Because they can.  They have brand, they can make investments with low rates of return, they can build and leverage a growing customer base.  They won’t do drone deliveries, but they got a lot of press with the idea and that was likely their goal.  Rumors get you PR, but solid business offerings get you money.  Ad sponsorship is something even Google and Facebook will have to augment for growth.  Amazon is where both these giants would like to be.

Why is Amazon winning?  Because they’re a better technology company?  No, because they’re not a technology company at all.  They are a mass-market retailer in an industry that is commoditizing.  What better thing could you be?  Apple wants cool people; they’ll never be the mass market.  Google wants ad payment; that’s a less-than-zero-sum game.  Amazon just wants everyone’s money and that’s the best thing of all to want, because it’s something you can strategize to get.