Can We Modernize OSS/BSS?

There have been a number of articles recently about the evolution of OSS/BSS, and certainly there’s pressure to drive evolution, based on a number of outside forces.  Caroline Chappell, Heavy Reading analyst, has been a particularly good source of insightful questions and comments on the topic, and I think it’s time to go a bit into the transformation of OSS/BSS.  Which, obviously, has to start with where it is now.

It’s convenient to look at operations systems through the lens of the TeleManagement Forum standards.  The TMF stuff can be divided into data models and processes, and most people would recognize the first as being “the SID” and the second as being “eTOM”.  While these two things are defined independently, there’s a presumption in how they’ve been described and implemented that is (I think) critical to the issue of OSS/BSS evolution.  In most cases, eTOM describes a linear workflow, implemented through SOA components linked by a workflow engine or an enterprise service bus (ESB).  There is a presumptive process flow, in short, and the flow needs data that is drawn from the SID.  This follows the structure of most software today: a process model described one way, linked in a general way to a supporting data model that’s independently (but hopefully symbiotically) described.

There are several reasons for this “workflow” model of operations.  One is that OSS/BSS evolved as a means of coordinating manual and automated processes in synchrony, so you had to reflect a specific sequence of tasks.  Another is that as packet networks evolved, operations systems tended to cede what I’ll call “functional management” to lower-layer EMS/NMS activities.  That meant that the higher-level operations processes were more about ordering and billing and less about fulfillment.

Now, there’s pressure for that to change from a number of sources.  Start with the transition from “provisioned” services to self-service, which presumes that the user is able to almost dial up a service feature as needed.  Another is the desire for a higher level of service automation to reduce opex, and a third is a need to improve composability and agility of services through the use of software features, which changes the nature of what we mean by “creating” a service to something more like the cloud DevOps process.  SDN and NFV are both drivers of change because they change the nature of how services are built, and in general the dynamism introduced to both resources and services by the notion of “virtualization” puts a lot of stress on a system designed to support humans connecting boxes, which was of course the root of OSS/BSS.  These are combining to push OSS/BSS toward what is often called (and was called, in a Light Reading piece) a more “event-driven” model.

OK, how then do we get there?  We have to start somewhere, and a good place is whether there’s been any useful standards work.  A point I’ve made here before, and that has generated some questions from my readers, is that TMF GB942 offers a model for next-gen service orchestration.  Some have been skeptical that the TMF could have done anything like this—it’s not a body widely known for its forward-looking insights.  Some wonder how GB942 could have anticipated current market needs when it’s probably five years old.  Most just wonder how it could work, so let me take a stab at applying GB942 principles to OSS/BSS modernization as it’s being driven today.

When GB942 came along, the authors had something radically different in mind.  They proposed to integrate data, process, and service events in a single structure.  Events would be handed off to processes through the intermediation of the service contract, a data model.  If we were to put this into modern terms, we’d say that GB942 proposed to augment basic contract data in a parametric sense with service metadata that described the policies for event handling, the links between services and resources, etc.  While I certainly take pride of authorship for the framework of CloudNFV, I’ve always said that it was inspired by GB942 (which it is).

Down under the covers, the problem that GB942 is trying to solve is the problem of context or state.  A flow of processes means work is moved sequentially from one to another—like a chain of in- and out-boxes.  When you shift to event processing you have to be able to establish context before you can process an event.  That’s what GB942 is about—you insert a data/metadata model to describe process relationships, and the data part of the model carries your parameters and state/context information.

This isn’t all that hard, because protocol handlers do it every day.  All good protocol handlers are built on what’s called “state-event” logic.  Visualizing this in service terms, a service starts off in, let’s say, the “Orderable” state.  When a “ServiceOrder” event arrives in this state, you want to run a management process which we could call “OrderTheService”.  When this has been done, the service might enter the “ActivationPending” state if it needed to become live at a future time, or the “Active” state if it’s immediately available.  If we got a “TerminateService” in the “Active” or “ActivationPending” state we’d kill the order and dismantle any service-to-resource commitments, but if we got it in the “Orderable” state it would be an error.  You get the picture, I’m sure.  The set of actions that should be taken (including “PostError”) for any combination of state and event can be expressed in a table, and that table can be included in service metadata.
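
To make that concrete, here’s a minimal sketch in Python of how such a state/event table might be expressed and dispatched.  The state, event, and process names are illustrative only; they’re my own, not drawn from GB942 or any product.

```python
# A minimal sketch of state/event-driven service lifecycle handling.
# All state, event, and process names here are illustrative, not taken
# from GB942 or any product.

def order_the_service(contract, event):
    print("Running OrderTheService for", contract["id"])
    contract["state"] = "ActivationPending" if event.get("future") else "Active"

def terminate_service(contract, event):
    print("Dismantling resource commitments for", contract["id"])
    contract["state"] = "Terminated"

def post_error(contract, event):
    print("Error:", event["type"], "is not valid in state", contract["state"])

# The state/event table itself: (state, event) -> management process.
# In the metadata-model approach, this table travels with the service contract.
STATE_EVENT_TABLE = {
    ("Orderable", "ServiceOrder"): order_the_service,
    ("Active", "TerminateService"): terminate_service,
    ("ActivationPending", "TerminateService"): terminate_service,
}

def handle_event(contract, event):
    process = STATE_EVENT_TABLE.get((contract["state"], event["type"]), post_error)
    process(contract, event)

contract = {"id": "svc-001", "state": "Orderable"}
handle_event(contract, {"type": "ServiceOrder"})      # runs OrderTheService
handle_event(contract, {"type": "TerminateService"})  # dismantles the service
handle_event(contract, {"type": "ServiceOrder"})      # error: wrong state
```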

You can hardly have customer service reps (or worse yet, customers) writing state/event tables, so the metadata-model approach demands that someone knowledgeable puts one together.  Most services today are built up from component elements, and each such element would have its own little micro-definition.  An architect would create this, and would also assemble the micro-services into full retail offerings.  At every phase of this process, that architect could define a state/event table that relates how a service evolves from that ready-to-order state to having been sold and perhaps eventually cancelled.  That definition, created for each service template, would carry over into the service contract and drive the lifecycle processes for as long as the services lived.
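
As a rough illustration of what such an architect-built template might look like (the structure and field names here are my own invention, not the SID or any CloudNFV schema), each element micro-definition carries its own lifecycle metadata, and instantiating a contract simply copies that metadata into the live service record:

```python
# Illustrative only: a service template composed of element "micro-definitions,"
# each carrying its own state/event metadata.  The field names are invented.
vpn_service_template = {
    "name": "BusinessVPN",
    "state_event_table": "retail-lifecycle-v1",      # ordering/billing lifecycle
    "elements": [
        {"name": "AccessComponent",
         "state_event_table": "access-lifecycle-v1",
         "parameters": {"bandwidth_mbps": 50}},
        {"name": "FirewallComponent",
         "state_event_table": "vnf-lifecycle-v1",     # hosted-function lifecycle
         "parameters": {"policy_set": "default"}},
    ],
}

def instantiate_contract(template, customer_id):
    """Copy template metadata into a live contract; every element starts in
    the Orderable state and keeps its own state/event table for its lifetime."""
    return {
        "customer": customer_id,
        "service": template["name"],
        "state": "Orderable",
        "state_event_table": template["state_event_table"],
        "elements": [
            {"name": e["name"],
             "state": "Orderable",
             "state_event_table": e["state_event_table"],
             "parameters": dict(e["parameters"])}
            for e in template["elements"]
        ],
    }

print(instantiate_contract(vpn_service_template, "cust-42"))
```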

Connecting this all to the real world is the last complication.  Obviously there has to be an event interface (in and out), and obviously there has to be a way of recording resource commitments to the service so that events can be linked through service descriptions to the right resources.  (In the CloudNFV activity these linkages are created by the Management Visualizer and the Service Model Handler, respectively; whatever name you give them the functionality is needed.)  You also need to have a pretty flexible and agile data/process coupling system (in CloudNFV this is provided by EnterpriseWeb) or there’s a lot of custom development to worry about, not to mention performance and scalability/availability.

In theory, you can make any componentized operations software system into something that’s event-driven by proceeding along these lines.  All you have to do is to break the workflow presumptions and insert the metadata model of the state/event handling and related components.  I’m not saying this is a walk in the park, but it takes relatively little time if you have the right tools and apply some resources.  So the good news is that if you follow GB942 principles with the enhancements I’ve described here, you can modernize operations processes without tossing out everything you’ve done and starting over; you’re just orchestrating the processes based on events and not flowing work sequentially along presumptive (provisioning) lines.  We’re not hearing much about this right now, but I think that will change in 2014.  The calls for operations change are becoming shouts, and eventually somebody will wake up to the opportunities, which are immense.

Clouds, Brown Paper Bags, and Networking’s Future

I’ve blogged a lot about declining revenue per bit and commoditization of hardware in networking.  The same thing is happening in IT, driven by the same mass-market and consumerization forces.  You can’t sell a lot of something that’s expensive; Ford made the automobile real by making it cheap.  So arguably networking and IT are getting real now, and that means that perhaps we have to look at the impact.  As I’ve noted before, you can see some signs of the future of networking in current IT trends.  Look further and you see clouds and brown paper bags.

In the old days, Microsoft would not have made hardware, and Intel would not have made systems or devices.  It’s pretty clear that’s changing and the reason is that as prices fall, profits pegged to a percentage of price will obviously fall too.  Declining margins will nearly always have the effect of forcing intermediaries out of the food chain.  You want to keep all the money.

In networking, this is going to put pressure on distribution channels over time.  We can already see some of that in the fact that the network vendors are working with larger and larger partners on the average, moving away from VARs and populist networking channels in favor of presenting products directly through mass-market retail.  A small business can get everything it needs in networking from Staples or Office Depot, and in my survey of mid-sized businesses this fall, they reported that their off-the-shelf investment in networking had almost doubled versus 2012.

The other impact of commoditization is that it quickly runs out of its own value proposition.  If gear is really expensive then getting it for ten percent less really means something.  As prices fall, the discounts are less significant in dollar terms, and more importantly, the share of total cost that network equipment represents shrinks as well.  Five years ago, businesses said that capital cost of equipment was 62% of their total cost of ownership, and today it’s 48%.

From the perspective of network services, commoditization is hurting operators more every day and increasing their disintermediation problem.  If the industry’s services are driven by people riding on a stream of bits that’s getting cheaper (in marginal cost terms) every year, then those people are gaining additional benefit while the operators carrying those bits are losing.  This is the basic force behind operators’ need for “transformation”.  First and foremost, they need to be able to get into the “service” business again.  That’s important because their network vendors are doubly impacted—they’re increasingly expected to help their customers, and they are also increasingly at risk because their own core business of bit-pushing is less the strategic focus.

Translation to services isn’t as easy as it sounds.  If I’m an operator and I’m selling stuff over my network instead of from my network, am I not killing my ability to differentiate what I spend most of my capex doing—which is pushing bits and making connections?  Look at what’s happening in the enterprise.  Ten years ago, the CIO was likely to report directly to the CEO, while today they’re more likely to report to the CFO.  Twenty years ago, the head of enterprise networking had influence nearly equal to that of the head of IT, and today both these positions say that networking has about a fifth the influence of the IT counterpart.  Think of the political impact of this shift, and how it might impact service providers—people who are essentially all networking types.

That poses, to me, what should be the critical question about the evolution of networking, which is whether we can envision new services from rather than just on the network of the future.  It should be obvious that if anyone is going to empower the network as a source of goodness and not just a brown paper bag to carry it in, that someone will have to be a network equipment vendor.  IT or software giants have everything to gain by encouraging commoditization of networking—it’s more money on the table for them.

Are those vendors empowering the network?  At some levels you can argue that they are.  Cisco’s application-centricity model would at least expose network features to exploitation by applications, but that’s not enough because it gets us back to being about carriage and not about performing some useful task.  What is needed for networks to be valuable is for more of what we think of as the cloud to be subsumed into the network.  Right now, the cloud to most people is just VM hosting—IaaS.  If the cloud is adding platform services (to use my term) to build a virtual OS on which future applications can be built, the mechanism for that addition could just as easily be something like NFV as something like OpenStack.  Or it could if network people pushed as hard as cloud people, meaning IT people, are pushing.

It’s important for our industry to think about this now, because (returning to commoditization) the rewards of success are greatest when the benefits created from success are the greatest.  Revolutions may be hard to sell, but they bring about big changes and create big winners (and of course losers).  If we wait for the right moment to do something in networking when things like OpenStack are advancing functionality in the network space (Neutron) two to four times per year, they’ll move the ball so far that all networking will be able to do is dodge it.

Commoditization at the bottom creates revolutionary potential in the middle, in the adjacent zone.  I think that the evolution of the cloud is taking it “downward” toward the network and the evolution of the network must necessarily take it upward to respond to its own commoditization pressures.  That boundary zone is where the two collide, and in that zone we will likely find the opportunities and conflicts and players and risk-takers who will shape where networking, the cloud, and IT all go as their own safe core markets commoditize.  That battle cannot be avoided; it’s driven by economic forces.  It can be won, though, and networking needs to win it to avoid becoming a brown paper bag.

Two Good Tech Stories Gone Bad

There are a couple of recent news items that involve a big player, and while the focus of the news is different I think there is a common theme to be found.  One is a little impromptu talk reported on Beet.tv from Cisco’s John Chambers at CES, and the other is Oracle’s acquisition of SDN vendor Corente.  Both mix a healthy dose of interest and hype, so we need to look at them a bit to extract the pearls of wisdom—and there are some.

I have to confess that I’m finding the progress of Cisco’s descriptions of the Internet’s evolution entertaining.  We started with the “Internet,” then the “Internet of Things,” and now the “Internet of Everything.”  Next, perhaps, is the “Internet of Stephen Hawking Multidimensional Space-Time.”  Underneath the hyperbole, though, Chambers makes a valid point in his little talk.  We are coming to a stage where what’s important on the Internet isn’t “information” but the fusion of information into contextual relevance.  Mobile devices have turned people from being Internet researchers into being Internet-driven almost completely.

The problem I have in calling this a useful insight is that I don’t think Chambers then makes use of it.  He jumps into NDS and GUIs, skipping over the fact that what is really needed for context-fusing is a combination of a rich vision of what “context” means and a means of filtering information flows through that vision.  It’s a made-for-the-cloud problem, and Cisco purports to be driving to cloud leadership, so it would be nice to have it lead here.

Cisco could easily build a story of context-driven information and event filtering; they have nearly all the platform tools they’d need, the underlying servers and network elements, and the right combination of enterprise and operator customers to push solutions to.  The kind of network that Chambers is indirectly describing is one where “the cloud” pushes all the way to the edge, and where users and apps grab onto a context agent in the cloud to shop, eat, talk, or think (or whatever) through.  In this model, you actually consume a lot of capacity because you want all of the information and all the context resources connected to these context agents through low-latency pipes so the user gets what they want quickly.  By the way, these pipes, being inside the cloud, are immune from neutrality rules even under the current regulations.

There’s really no good reason not to push this vision, either.  It’s not going to overhang current Cisco products, it’s not going to demand new channels or new buyer relationships.  It supports the operator goals of monetization better than Cisco has supported them to date, and it could be applied to enhance worker productivity and thus drive up the benefit case that funds network investment on the business side.  We seem to have an example of knee-jerk evangelism here; Cisco is so used to spinning a yarn then putting a box in front of the listener that they can’t put a concept there instead—even a good one.

Oracle’s acquisition of Corente has its own elements of theater.  The description of Corente as an SDN offering is a stretch, IMHO.  It’s a tunnel-overlay system created by linking edge devices with hosting platforms, for the purposes of delivering an application set.  It’s sort-of-cloud, sort-of-NFV, sort-of-SDN, but it’s not a massive piece of any of the three.  It could in fact be a useful tool in cloud hybridization, as any overlay virtual network could be that runs end to end.  It’s not clear to me how well it integrates with carrier infrastructure, how it could influence network behavior or take advantage of specific network features.

There is sense to this; an application overlay or virtual network that manages connectivity separate from transport is a notion I’ve always liked, and in fact I’ve said that the two-layer model of SDN (connectivity and transport) is the right approach.  It’s logical in my view to think of the future network as an underlying transport process that serves a series of service- and application-specific connection networks.  Connectivity and connection rights must be managed by the applications or at the application level, but you don’t want applications messing with transport policies or behaviors.  Thus, Oracle has a good point—if you could convince them to make it rather than saying it’s “software-defined WAN virtualization”.  That seems to me little more than taking tunnels and wrapping them in the Holy Mantle of SDN.

And that’s what it appears to be.  At the technical level you have to offer some way to connect my two layers of SDN so you’re not riding best-efforts forever, and at the high positioning level you have to make it clear that’s what you’re doing.  Corente didn’t do that on their site and Oracle didn’t do it in the announcement.  Other vendors like Alcatel-Lucent have articulated end-to-end visions that tie down into infrastructure, which is a much better approach (particularly for network operators who need to make their networks valuable).

This is one of those times when you could speculate on the reasons why Cisco or Oracle would do something smart but do it stupidly.  One, they did an opportunistic thing for opportunistic reasons and by sheer chance it happened to have relevance to real problems.  Since they didn’t care about those problems, they didn’t catch the relevance.  Two, they did a smart thing for smart reasons and just don’t understand how to articulate anything that requires more than third-grade-level writing and thinking.  Three, we have a media/market process incapable of digesting anything other than “Dick and Jane do telecom”.  Probably all of the above, which is why I’m not going to spend much time trying to pick from that list of cynical possibilities.

Ultimately, the smarts will out.  Again, it’s not constructive to predict whether startups will jump in (somehow making the VCs believe this is all part of social networking and then working under the table, perhaps) or whether some major vendor will hire a CEO who actually does understand market requirements and has the confidence and organizational drive of a Marine Gunnery Sergeant.  The question is how much vendors lose in the time it takes for this to happen.

Why Test-Data-as-a-Service is Important to NFV

Yesterday, the CloudNFV project (of which I am Chief Architect) announced a new Integration Partner, Shenick Network Systems.  Shenick is a premier provider of IP test and measurement capabilities, and I’m blogging about this not because of my connection with CloudNFV but because the announcement illustrates some important points about NFV and next-gen networking in general.

No matter how successful NFV is eventually, it will never completely displace either dedicated network hardware or static hosted information/processing resources.  In the near term, we’ll certainly have a long period of adaptation in which NFV will gradually penetrate those areas where it is suitable, and in that period we’ll have to live and interwork with legacy components.  Further, NFV is still a network technology and it will still have to accommodate testing/measurement/monitoring functions that are used today for network validation and diagnostics.  Thus, it’s important for an NFV model to accommodate both.

One way to do that is by making testing and measurement an actual element of a service.  In most cases, test data injection and measurement will take place at points specified by an operations specialist and under conditions where that specialist has determined there’s a need and an acceptable level of risk to network operations overall.  So at the service level, we can say that a service model should be able to define a point of testing/monitoring as an interface, and connect a testing, measurement, or monitoring function to that interface as needed.

The question is how that function is itself represented.  I blogged yesterday about the value of platform services in the cloud, services that were presented through a web-service interface and could be accessed by an application.  It makes sense to assume that if there are a number of points in a network where testing/monitoring/measurement facilities exist, we should be able to link them to an interface as a platform service.  This interface could then be “orchestrated” to connect with the correct point of testing defined in the service model, as needed.
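
Here’s a minimal sketch of the idea, assuming a hypothetical service model and a hypothetical test/measurement endpoint (none of these names come from CloudNFV, Shenick, or the ETSI specs): the model declares a point of testing as an interface, and binding simply attaches a pre-deployed platform service to it.

```python
# Illustrative sketch: a service model that declares a point of testing as an
# interface and binds it to a pre-deployed test/measurement platform service.
# The model structure, endpoint, and field names are invented for illustration.

service_model = {
    "service": "BusinessInternet",
    "interfaces": [
        {"name": "customer-access", "role": "delivery"},
        {"name": "core-handoff-probe", "role": "test-point"},   # point of testing
    ],
}

platform_services = {
    # Static, already-deployed test/measurement points, keyed by an identifier.
    "metro-east-probe": {"api": "https://tm.example.net/metro-east",
                         "type": "static"},
}

def bind_test_point(model, interface_name, platform_service_id):
    """Connect a declared test-point interface to a testing platform service."""
    for iface in model["interfaces"]:
        if iface["name"] == interface_name and iface["role"] == "test-point":
            iface["binding"] = platform_services[platform_service_id]
            return iface
    raise ValueError("no such test-point interface: " + interface_name)

print(bind_test_point(service_model, "core-handoff-probe", "metro-east-probe"))
```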

Of course, there’s another possibility, which is that there is no static point where the testing and measurement is available.  Shenick TeraVM is a hostable software testing and measurement tool, and while you can host it in specific places on bare metal or VMs, you can also cloud-host it.  That means it would be nice if in addition to supporting a static location for testing and measurement to be linked with service test points, you could spawn a dynamic copy of TeraVM and run it somewhere proximate to the point where you’re connecting into the service under test.

What Shenick is bringing to CloudNFV (and to NFV overall) is the ability to do both these things, to support testing and measurement as a static set of platform service points and also as a virtual function that can be composed into a service at specific points and activated on demand.  The initial application is the static model (because CloudNFV runs today in Dell’s lab, so dynamism in terms of location isn’t that relevant) but Shenick is committed to evolving a dynamic support model based on VNFs.

We need a way to connect testing, measurement, and monitoring into services because operations personnel rely on them today.  What’s interesting about this Shenick approach is that it is also a kind of platform-services poster-child for NFV.  There are plenty of other things that are deployed in a static way but linked dynamically to services.  Take IMS, for example.  You don’t build an IMS instance for every cellular user or phone call, after all.  Same with CDN services.  But if we have to build a service, in a user sense, that references something that’s already there and presented as a web-service interface or some other interface, we then have to be able to model that when we build services.  That’s true whether we’re talking about CloudNFV, NFV a la ETSI ISG, or even traditional networking.  Absent modeling we can’t have effective service automation.

In NFV, a virtual network function is a deployable unit of functionality.  If that function represents something like a firewall that is also (and was traditionally) a discrete device, then it follows that the way a service model defines the VNF would be similar to the way it might define the physical device.  Why not make the two interchangeable, then?  Why not say that a “service model component” defines a unit of functionality and not necessarily just one that is a machine image deployed on a virtual machine?  We could then model both legacy versions of a network function and the corresponding VNFs in the same way.  But we could also use that same mechanism to model the “platform services” like testing and measurement that Shenick provides, which of course is what they’re doing.  Testing and measurement is a function, and whether it’s static or hosted in the cloud, it’s the same function.  Yes, we may have to do something different to get it running depending on whether it’s pre-deployed or instantiated on demand, but that’s a difference that can be accommodated in parameters, not one requiring a whole new model.
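
A sketch of what that interchangeability might look like, with hypothetical deployment handlers and field names of my own (this isn’t the CloudNFV or ETSI model, just an illustration of the principle): the functional definition stays the same, and only a deployment parameter selects how the function is realized.

```python
# Illustrative: one component model for a network function, whether it is a
# physical box, a pre-deployed platform service, or a VNF deployed on demand.
# The handlers and field names are hypothetical.

def use_physical_device(component):
    return "configured existing device at " + component["address"]

def use_platform_service(component):
    return "linked to pre-deployed service at " + component["api"]

def deploy_vnf(component):
    return "instantiated image " + component["image"] + " near the test point"

DEPLOYMENT_HANDLERS = {
    "physical": use_physical_device,
    "platform-service": use_platform_service,
    "vnf": deploy_vnf,
}

def activate(component):
    # The function is the same; only the deployment parameters differ.
    return DEPLOYMENT_HANDLERS[component["deployment"]](component)

components = [
    {"function": "firewall", "deployment": "physical", "address": "10.0.0.1"},
    {"function": "test-measurement", "deployment": "platform-service",
     "api": "https://tm.example.net/metro-east"},
    {"function": "test-measurement", "deployment": "vnf", "image": "tm-image"},
]

for c in components:
    print(c["function"] + ":", activate(c))
```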

I think the Shenick lesson here is important for both these reasons.  We need to expect NFV to support not only building services but testing them as well, which means that testing and measurement have to be included in service composition and implemented either through static references or via VNFs.  We also need to broaden our perception of service modeling for NFV somehow, to embrace the mixture of things that will absolutely have to be mixed in any realistic vision of a network service.

Both SDN and NFV present a challenge that I think the Shenick announcement brings to light.  Network services require the coordination of a bunch of elements and processes.  Changing one of them, with something like SDN or NFV, certainly requires that the appropriate bodies focus on what they’re changing, but we can’t lose sight of the ecosystem as we try to address the organisms.

The Cloud’s Great Unifier: Platform Services

I’ve blogged through all of 2013 that of our three “revolutions” of cloud, NFV, and SDN, it’s the cloud that’s the senior partner.  The transformational aspects of SDN and NFV depend to a very large degree on our harnessing cloud capabilities to host network features.  We’re also hearing that Amazon is on a tear, with Credit Suisse estimating their cloud revenues at $3.5 billion and their growth rate at 60% or more.  From this, you might think that our cloud senior partner is going to be a runaway success and drag the other stuff along.  Well, maybe.

The challenge for the cloud today is simple: myopia.  If you believe that IaaS by itself, the hosting of applications that could have been run on dedicated servers or virtualized ones, is a revolution, then you’re likely someone who’s avoiding real stimulation.  There is simply not enough value in IaaS-cloud versus virtualization, or versus the kind of shared hosting we do every day for web pages or email.  So the cloud has to be more to be a revolution, and it has to do more than IaaS to support SDN or NFV.  What?

Amazon has been quietly answering that question with what I’ve called platform services.  A platform service is a web service presented through a network interface via an API, and designed to effectively extend the set of APIs that operating systems and middleware normally provide.  It differs from platform-as-a-service in that a cloud architecture based on platform services says in effect “I don’t care what you do to satisfy traditional operating system or local-middleware APIs; make your own choice and build a machine image.  But in your application, do cloud-specific things using platform services that are cloud-optimized.”
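
As a minimal sketch of the distinction (the cache endpoint here is fictional, invented purely to illustrate the pattern), the same capability can be satisfied by a local middleware call or consumed as a cloud-optimized platform service behind a web API:

```python
# Illustrative contrast between a local middleware-style call and the same
# capability consumed as a platform service over a web API.  The endpoint
# URL is fictional; no real provider API is implied.
import json
import urllib.request

def cache_put_local(cache, key, value):
    # Traditional approach: the application owns and manages its own cache.
    cache[key] = value

def cache_put_platform(key, value,
                       endpoint="https://cache.platform.example.net/v1/items"):
    # Platform-service approach: a distributed, cloud-optimized cache sits
    # behind a network API, and the application simply calls it.
    body = json.dumps({"key": key, "value": value}).encode("utf-8")
    request = urllib.request.Request(endpoint, data=body, method="PUT",
                                     headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(request)

# Only the local form is exercised here, since the platform endpoint is fictional.
local_cache = {}
cache_put_local(local_cache, "user-42", {"plan": "gold"})
print(local_cache)
```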

Amazon has been adding platform services to EC2, ranging from caching and HTML acceleration to flow optimization and desktop-as-a-service.  All of this stuff lets you build “cloud applications” that would have been impractical/impossible without the cloud, which is IMHO the only way you can generate a lot of true cloud migration.  As large as Credit Suisse’s estimate for Amazon’s cloud revenue stream is, it represents only about a quarter’s worth of server revenue for HP or IBM.  We’re going to have to do a lot better than that to create a revolution, and that’s exactly what platform services could do.

Could already be doing, in fact.  Some of my sources tell me that almost 60% of Amazon’s cloud revenue comes from applications that have a significant platform services component, which means that it may depend on a more-than-IaaS model of cloud computing.  Furthermore, those sources tell me that at least three-quarters of Amazon’s cloud revenue growth can be attributed to these platform-service-consuming applications.

For all of the importance of platform services, we don’t really know too much about them except by example.  Obviously you don’t want to make every operating system or middleware service into a platform service because it wouldn’t add any value.  Disk operations are an example of something that could be extended as a platform service, and many cloud providers offer distributed data operations.  Caching is necessarily a platform service because you have to distribute caches to make them useful.  Amazon’s flow-related offerings (AppStream and Kinesis) are clearly platform services because they draw on specific cloud elasticity features to work.  But there are many other kinds of platform services that could be defined as well.

One example is in the area of information services.  An application doesn’t know much about a user other than what the user is doing with that application, but network applications increasingly depend on understanding the context of the user at the time.  Location-based services could be considered platform services because they rely on knowledge of where a user is at the moment, knowledge that could come from a variety of sources.  Other services related to the user’s current context/presence could also be viewed as platform services.  Is the user on a call?  Is the user stationary or moving, or in a restaurant or theater?  You can see that this information could be valuable to applications that wanted to apply policies based on what their user was up to, and we’re not going to write all of that stuff into every application individually.
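
A tiny, hypothetical sketch of that policy idea (the context fields and the presence-service stand-in are invented; no real API is implied) shows how little of this an application would need to write itself if context came from a platform service:

```python
# Illustrative only: an application applying a policy based on user context
# fetched from a hypothetical presence/location platform service, rather than
# implementing location and presence logic itself.

def get_user_context(user_id):
    # Stand-in for a web-service call to a context/presence platform service.
    return {"user": user_id, "on_call": True, "moving": False, "venue": "theater"}

def should_push_notification(context):
    # Policy: don't interrupt a user who is on a call, or in a theater or restaurant.
    if context["on_call"] or context["venue"] in ("theater", "restaurant"):
        return False
    return True

ctx = get_user_context("user-123")
print("push notification?", should_push_notification(ctx))
```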

The information services linkage demonstrates that platform services can extend to the API-based services already offered by many social networking, search, and network operator firms today.  That’s not a bad thing; most revolutions build on the past rather than demanding that everything change radically.  What we’re doing with the cloud is making it possible to define, or compose, applications by assembling information and IT resources from a variety of sources, in an application component set that runs where and when it’s convenient—and runs how it’s most profitable overall.

If you look at SDN and NFV in this light, you can see how the cloud can be a supporting revolution to both.  It’s far easier to understand how an application could define network services if we assume a set of platform services that expose control of the network and apply policies based on what the user will pay for, how the network’s stability and security can be protected, and what mix of resources will earn the best return with suitable performance and stability.  It’s easier to visualize what a practical NFV service would look like if we assumed that the virtual functions might themselves define platform services and that static features defined as platform services might get composed into a retail offering alongside virtual functions.

A final interesting point about the platform services side of the cloud revolution is that it’s really the unifier of “the Internet” and “the cloud” because as information services offered as platform services evolve, the notion of a web application and a cloud application converge.  At the API level, they’re increasingly the same; the difference between them would likely be primarily whether the hosting of the components had to be very dynamic because of shifts in usage or geography, or could be static as they would be today with a dedicated server or VM in some ISP or OTT data center.

So when you think of the cloud in 2014, don’t get caught in the IaaS trap.  That’s so 2013.

Traveling the Road Less Taken

I’ve been on a couple of trips recently that featured, at some point or other, the classical multi-armed signpost telling me how far I was from New York, Beijing, Rio, or whatever.  And, while there’s never been a road junction at one of these signs that could take me to all the destinations, you can surely get there from here.  That the converse is true isn’t necessarily obvious, but you can get to any given place from a lot of starting points.  That’s one reason our current network-evolutionary trends are sometimes confusing.  What we need to do is travel the road less taken, I think.

What operators want in networking is what every business wants: higher revenues and profits.  The challenge they’ve faced in that regard is that communications services aren’t infinitely expandable or consumable.  People have historically spent a fairly constant piece of their disposable income on communications, and businesses have done likewise.  That means that average revenue per user (ARPU) tends to flatten over time, making the industry dependent on subscriber growth for revenue gains.

On the profit side, the challenge we’ve faced for literally decades is that service prices have been in decline but costs have not declined along the same curve.  Stu Elby and other operator executives have been showing the charts that illustrate the future cost/price collision, the point where infrastructure isn’t profitable.

So what operators want is to find other stuff to sell, and other ways of producing that stuff that offers lower cost points.  Simple, right?

SDN and NFV are technology-shift solutions that are arguably aimed primarily at the cost side but that could also offer new service benefits.  There are some indications that you can raise network utilization with SDN or improve the economics of high-touch features with NFV, but the biggest benefit of both would be to shed proprietary fetters and empower commodity devices.  This notion got, and still gets, a lot of media attention even though operators themselves are saying that they could likely get the same cost reductions by “beating up Huawei on price”.

Another strategy that has a lot of merit from the operator perspective is a shift away from multi-layered hierarchical networks to something very simple.  An extreme example is an optical network with a ring of electrical on-ramps around the edge.  You take advantage of the declining price of an optical bit to create a network that doesn’t “aggregate traffic” to save money because aggregation likely costs more in device and operations costs than it saves.  This is of particular interest to operators in the metro networks, and I’ve been blogging all this year that metro is where the profitable traffic is (and will be) found.  This approach is sometimes called “flattening” the network because it tends to eliminate some OSI layers, but the real goal is simplification, not flattening layers.  Less to manage means less management cost.

It’s this point that creates the traffic jam on our road to a profitable destination.  SDN and NFV both have a tendency to increase the number of elements in a network.  Pre- and post-SDN, do you have fewer boxes or just different ones?  The latter, I think.  In the case of NFV, do I end up taking a fifty-dollar box and converting it into a couple of hosted functions that require deployment of cloud stacks, hypervisors, servers, data-path acceleration, vSwitches, and so forth?  If so, how much is it going to cost to operationalize all of this new stuff?

What SDN and NFV should have taught us is that we’re at a fork in our converging set of roads—a place where we have to decide whether we’re going to remake operationalization while we remake networking, or try to do network management and service management largely as we’ve done in the past.  The path of change is often called simply service automation, but that covers a lot of diverse ground.  How do you automate a service?  You make the service lifecycle processes event-driven and capable of acting based on complex policies rather than based on human judgment.  You reduce or eliminate autonomous behaviors that are non-deterministic, like the way router networks adapt to failures.  And you do this not only for the new SDN-ish or NFV-ish elements of networks but for the legacy stuff.  That’s not going away completely, no matter what some may think, and particularly not in the near term.
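
As a small, purely hypothetical illustration of lifecycle processes acting on complex policies rather than human judgment (the element types, conditions, and actions are all invented), a fault event might be resolved deterministically by a policy table before anyone opens a ticket:

```python
# Illustrative only: a lifecycle fault event resolved by policy rather than
# human judgment.  The element types, conditions, and actions are invented.

POLICIES = [
    # (condition, action) pairs evaluated in order; the first match wins.
    (lambda evt: evt["element_type"] == "vnf" and evt["severity"] <= 2,
     "redeploy-function-elsewhere"),
    (lambda evt: evt["element_type"] == "sdn-path" and evt["spare_capacity_pct"] > 20,
     "recompute-and-push-new-path"),
    (lambda evt: True,                      # fallback: only now involve a human
     "open-trouble-ticket"),
]

def handle_fault(event):
    for condition, action in POLICIES:
        if condition(event):
            return action

print(handle_fault({"element_type": "vnf", "severity": 1}))
print(handle_fault({"element_type": "sdn-path", "spare_capacity_pct": 35}))
print(handle_fault({"element_type": "legacy-router", "severity": 1}))
```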

But that’s only one of my forks; what’s the other?  It’s aggressive device commoditization.  You can’t build unprofitable networks, so you’ll skin the needed margins from something.  If it’s not opex it’s capex.  You start by favoring price-leader vendors (does Huawei come to mind here?) and you then drive down cost further by offloading features and further dumbing down the devices (SDN and NFV).  This path flattens metro networks, emphasizes capacity generation over capacity management, and generally picks the pockets of the network vendors.

But this is the path vendors are taking.  I’ve been involved in five projects aimed at service automation in the last eight years, and none of them have received any helpful vendor support.  Vendors have blown kisses at operationalization by using acronyms like “TCO” that cover vague assertions that their gear is somewhat lower-cost to operate.  The problem is that “operations” means everything from creating a service from piece-parts in a lab, through rolling it out and selling it, to managing its availability and performance during its billable life.  The network vendors have declared most of this out of bounds for them, so they’ve not fixed operationalization and have given operators no choice but to seek cheaper devices as their path to their profit goals.  The OSS/BSS vendors have declared all the low-level network stuff out-of-scope too, and nothing is bringing the two groups together in a useful way.  Why, otherwise, would it take two to eight weeks to sell a VPN service?

We have the solution to network profit and vendor profit staring us in the face, and it’s the same solution for both.  We need to rethink the notion of “services” and the notion of “infrastructure” to match a cloud model.  That has to include the creation of an agile platform for hosting features and a cost-effective platform for pushing bits around, but it has to focus on an architecture that can optimize everything in the virtual-resource and virtual-service worlds at the same time, for diverse sets of missions and trade-offs.  Who makes this?  Nobody, today.  Somebody, soon, I think, and that’s the company I’m going to watch.

Some Big Network Vendors and What They Need in 2014

Well, it’s 2014 and so it’s a good time to match up the looking-ahead and looking-back parts of analyzing our industry.  One good way to do that is to look at how the network vendors’ stocks have fared in the last year.  If you go to any of the financial sites and run a chart (as I did), you’ll find that the major players divide themselves into three groups based on how much their stocks have risen since last New Year’s Day.

The “bottom” group, Cisco, Ericsson, and Juniper, are up less than the proverbial “Q’s”, the NASDAQ 100 index.  The “middle” group is one company, Ciena, who beat the NASDAQ but didn’t exactly explode, and the two inhabitants of the “exploding” category are Alcatel-Lucent, whose stock was up by over 200% versus last New Year’s Day, and Nokia, which was up over 100%.  The topic for my blog this first working day of 2014 is what these companies need to do in the coming year, to achieve success or secure it as appropriate.

The companies in our “Exploding” group are interesting because both Alcatel-Lucent and Nokia have low price/sales ratios, meaning that their stock price is low relative to their revenues.  You can argue that this means the Street is expecting some radical change in operating practices.  In my view, one of the companies is too broad to produce that change easily, and the other too narrow.

Alcatel-Lucent has a lot of strong areas, and a lot of weak ones.  Net-net, it has so many areas that it’s almost impossible for it to be successful somewhere without the same trend that drives that success creating a weakness somewhere else.  The company needs to do two things: build an ecosystem and not a bunch of products, and cut what doesn’t fit.  There’s too much cost in premier assets like Bell Labs for Alcatel-Lucent to easily shrink its portfolio, so it needs to be able to create a broader value.  Given that it’s a product-silo player, that’s not going to be easy.  CloudBand, Alcatel-Lucent’s cloud flagship, seems to be morphing into something that could lead the cloud, SDN, and NFV initiatives within the company, but for that to happen the product guys have to let a broader story get told.

Nokia has the opposite problem; its focus on wireless and 4G in particular makes it a one-trick pony that is simply too vulnerable to competition.  Wireless is currently the best of bad spaces; we know wireline margins are declining but wireless isn’t great, it’s just not as bad.  It’s obvious that operators on a cost-cutting mission are going to hurt higher-margin vendors in every space they target, and Huawei is an enormously effective cost-based competitor in wireless.  For Nokia, it’s got to be the service layer at this point.

In the middle group we find Ciena, another player with a low price/sales ratio, in this case likely because they have very low operating margins.  Optical-layer stuff tends to generate low margins, but right now the industry is working on a way of building networks that spends more on optical transport and less on aggregation at higher layers.  That would benefit Ciena, but they need to be on top of that trend, and right now they’re only riding a bit of the wave.

In the bottom group, we have Cisco, Ericsson, and Juniper, and in my view this group is driven by two different trends just as the top group was.  Cisco and Juniper are in a space that’s been performing well historically and is now facing what’s likely to be continued downward movement of margins and gross opportunity—switching/routing.  Ericsson is attempting a transition to professional services as a driver, banking IMHO on the long-term commoditization of networking to create a need for such services as price pressure squeezes out any hope of “included” support.

Cisco and Juniper both have high price/sales, and Juniper also has the highest current operating margins and P/E multiple of the major network vendors.  The Street thinks they can pull something out, but the question is whether they can be “affirmative” and do something that’s actually revenue-and-profit-accretive, or whether they’ll have to cut costs or shed business units to make shareholders happy.

Cisco needs to make the cloud a success, plain and simple.  It’s the only networking giant that has the credentials as a true cloud player, and so it can gain considerable differentiation from making what’s unique about Cisco into a unique asset.  To do that, it has to become not an IT player but a software player.  The cloud is about software, not hardware.  Cisco has an OpenStack distro that’s practically unknown; it needs to be building its own cloud identity by building on that asset.  It can build networking into OpenStack much more effectively than other players, more effectively than the current Neutron effort.  OpenDaylight might be a good way to do that, but it’s hardly differentiated.  Cisco needs to think about what it could do that would be Cisco’s alone, and the answer is northbound applications.  SDN and the cloud link there, and that’s where Cisco needs to be.

Juniper’s big problem is expectation.  If you valued Juniper at Cisco’s multiples the company stock would be in the toilet, but unless you can generate some credible reason to believe it has better potential three years down the road than Cisco, its current multiple is hard to justify.  It will never sell enough switches and routers to make up that multiple difference, especially when its operating margins are high and so it’s vulnerable to commoditization.  It tried to be a software company under the prior regime, and it didn’t work.  What’s left?

The only possible answer for Juniper is network functions virtualization.  Kevin Johnson, the former CEO, was a lover of TCO stories.  There is a major need for operational efficiency in the network today, something that the operators are already seeing, and something causing them to shift their NFV focus away from capex alone.  Juniper can morph a TCO story into a full-bore operationalization story through NFV, and also use that as an on-ramp to a more effective SDN position and even a service-layer story.  But it has to get moving because every other group in our three-zone market could also view NFV as a way of moving forward.

NFV builds ecosystems like Alcatel-Lucent needs.  NFV can create a true service-layer architecture, as Nokia must.  It can define the cloud for Cisco, frame professional services and integration opportunity for Ericsson, and help Juniper get some meat on its TCO bones.  But NFV to do any of these things will have to be more than most proponents see it today, more than any vendor is positioning it to be.  So while it’s possible for all of our vendors to get to the promised land in their own way, it’s also possible that how they position their NFV strategies will be the “tell” that shows whether they’ll end 2014 up…or down.

What Does Domain 2.0 Have to Do to Succeed?

I’ve looked at the vendor side of network transformation in a couple of recent blogs, focusing on interpreting Wall Street views of how the seller side of our industry will fare.  It may be fitting that my blog for the last day of 2013 focuses on what’s arguably the clearest buyer-side statement that something needs to change for 2014 and beyond.  AT&T’s Domain 2.0 model is a bold attempt to gather information (it’s an RFI) about the next generation of network equipment and how well it will fit into a lower-capex model.  Bold, yes.  Achievable, optimal?  We’ll see, but remember this is just my analysis!

Insiders tell me that Domain 2.0 is aimed at creating a much more agile model of infrastructure (yes, we’ve heard that term before) as well as a model that can contain both opex and capex.  While some Street research (the MKM report cited in Light Reading for example) focuses on the capex impact of the AT&T initiative for the obvious reason that’s what moves network equipment stocks, the story is much broader.  My inside person says it’s about the cloud as a platform for services, next-gen operations practices to not only stabilize but drive down opex even as service complexity rises, and optimum use of cloud resources to host network features, including support for both SDN and NFV.  You can see from this list of goals that AT&T is looking way beyond white-box switches.

Another thing that insiders say is that Domain 2.0 recognizes that where regulatory issues aren’t driving a different model, the smart approach is to spend more proportionally on bandwidth and less proportionally on bandwidth optimization.  That’s why Ciena makes out better and Cisco likely makes out worse.  Networks are built increasingly from edge devices with optical onramp capability, coupled with agile optics.

Where SDN comes into the mix in the WAN is in providing a mechanism for creating and sustaining this model, which is sort of what some people mean when they say “flattening the network”.  It’s not as much about eliminating OSI layers as it is about eliminating physical devices so that the total complexity of the network is reduced.  According to these sources, Cisco isn’t the only router vendor to be at risk—everyone who’s not a pure-play agile-optics vendor might have to look often over their shoulder.

Data center networking is also on the agenda, mostly because the new cloud-and-NFV model demands a lot of network agility in the data center.  There will be, obviously, a major increase in the amount of v-switching consumed, but it’s not yet clear whether this is all incremental to the current data center switching infrastructure or simply a result of increased virtualization (which uses vSwitch technology).  However, my sources say that they are very interested in low-cost data center switching models based on SDN.

It seems likely to me that a combination of an SDN-metro strategy based on the optics-plus-light-edge model and an SDN data center strategy would be self-reinforcing.  Absent one or the other of these, it’s harder to see how a complete SDN transition could occur.  To me, that means that it will be hard for a smaller vendor with a limited portfolio to get both right.  Could a white-box player?  My sources in AT&T say that they’d love white boxes from giants like IBM or HP or Intel or Dell, but they’re skeptical about whether smaller players would be credible in as critical a mission.  They are even more skeptical about whether smaller players might be able to field credible SDN software.  A giant IT player is the best answer, so they say.

The role of NFV here is harder to define.  If you presume “cloud” is a goal and “SDN” is a goal, then you either have to make NFV a fusion of these things to gather enough critical executive attention, or you have to say that NFV is really going to be about something totally different from the cloud/SDN combination.  It’s not clear to me where AT&T sits on this topic, but it’s possible that they see NFV as the path toward gaining that next-gen operations model we talked about.

NFV does define a Management and Orchestration (MANO) function.  It’s tempting to say that this activity could become the framework of our new-age operations vision.  The challenge here is that next-gen operations is not the ETSI NFV ISG’s mandate.  It is possible that working through a strategy to operationalize virtual-function-based services could create a framework with broader capabilities, but it would require a significant shift in ISG policy.  The ISG, to ensure it gets its own work done, has been reluctant to step outside virtual functions into the broader area, and next-gen operations demands a complete edge-to-edge model, not just a model of virtual functions.

Might our model come from the TMF?  Support for that view inside AT&T is divided at best, which mirrors the views we’ve gotten from Tier Ones globally in our surveys.  The problem here, I think, is less that the TMF doesn’t have an approach (I happen to think that GB942 is as close to the Rosetta Stone of next-gen management as you’ll find anywhere) than that TMF material doesn’t explain that approach particularly well.  TMF material seems aimed more at the old-line TMF types, and the front lines of the NGN push inside AT&T or elsewhere lack representation from this group for obvious reasons.  NGN isn’t about tradition.

The future of network operations could be derived from NFV activities, or from TMF’s, or from something that embodies both, or neither.  Here again, it would be my expectation that advances in operations practices would have to come out of integration activity associated with lab trials and proof-of-concept (TMF Catalyst) testing.  As a fallen software guy, I believe you can develop software only from software architectures, and standards and specs from either the NFV ISG or the TMF aren’t architectures.  I also think this has to be viewed as a top-down problem; all virtualization including “management virtualization” has to start with an abstract model (of a service, end-to-end, in this case) and move downward to how that model is linked with the real world to drive real operations work.  The biggest advance we could see in next-gen networking for 2014 would come if Domain 2.0 identifies such a model.
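
A minimal sketch of that top-down idea, with element names and binding types that are purely my own invention, might look like this: the abstract end-to-end service model comes first, and bindings map each abstract element downward onto whatever the real world provides, whether a legacy device, an SDN domain, or a hosted function.

```python
# Illustrative sketch of the top-down approach: start with an abstract,
# end-to-end service model and bind it downward to real resources and
# operations processes.  All names and binding types are invented.

abstract_service = {
    "name": "ConsumerBroadband",
    "elements": ["Access", "Aggregation", "InternetGateway"],
    "bindings": {},   # filled in as the abstraction is mapped to the real world
}

def bind(service, element, implementation):
    """Attach a concrete implementation (legacy device, SDN domain, VNF, or an
    operations process) to an abstract element of the model."""
    if element not in service["elements"]:
        raise ValueError("unknown abstract element: " + element)
    service["bindings"][element] = implementation

bind(abstract_service, "Access", {"type": "legacy-device", "id": "olt-17"})
bind(abstract_service, "Aggregation", {"type": "sdn-domain", "controller": "metro-1"})
bind(abstract_service, "InternetGateway", {"type": "vnf", "image": "vgw-image"})
print(abstract_service["bindings"])
```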

I wish you all a happy and prosperous New Year, I’m grateful for your support in reading my blog, and I hope to continue to earn your interest in 2014.

A Financial-Conference View of SDN and NFV

I’ve blogged in the past on how the Street views various companies or even the networking industry, and today there’s been an interesting report by equity research firm Cowen and Company about how SDN and NFV will impact the market.  This is the outcome of the company’s Trend-Spotting Conference and it includes some industry panel commentary and also audience surveys.  As usual, my goal here is to review their views against what I’ve gotten in my surveys of enterprises and operators.

The top-line comment is that our revolutionary technology changes are all going to have an adverse impact on the industry, with attendees saying (by an 80% margin) that these changes will commoditize networking within five years.  The vision is that white-box devices will suck the heart out of the networking market, and here I think the view of the audience is short-sighted.  Yes, we are going to see commoditization but no, SDN and NFV or even Huawei aren’t the real driver.

Any time you have a market without reasonable feature differentiation you get price differentiation and commoditization, and that is clearly where both networking and IT are already.  Buyers of all sorts tell me that what they want is a low price from a recognized player, period.  But despite this, buyers also say that they have to be able to “trust” their supplier, to “integrate” their current purchases into extant networks and operations practices, and to “protect their investment” as further changes develop.  When you add in these factors to see who’s exercising strategic influence, it’s not the newcomers who have white-box products or revolutionary software layers, it’s the incumbents.  So my conclusion here is that five years is too fast to expect the kind of radical result people were talking about with Cowan’s conference.

The second major point from the conference is that this is all about agility and not about cost savings, which on the face of it shows that simple commoditization is too simple.  When you dig through the comments in the report you get the picture of a market that’s coping with changes in information technology, changes that necessarily change the mission of networking.  They see that change as focusing on the data center, where they want networks and virtualization to work together in a nice way.

I think, based on my surveys, that the “agility” angle is a common evolution of early simplistic benefit models for SDN and NFV alike.  People start by thinking they’re saving money on equipment, realize the savings won’t likely be enough, and then begin to think about other possible benefits.  But whatever the motivation, if it’s true (as I believe it is) that neither SDN nor NFV will pay off enough in equipment savings, then we have to start looking for how the additional savings will be achieved.

The report says that there are two camps with respect to impact on existing vendors; one says that the paradigm shift in networking could help Cisco and similar firms and the other that it would hurt.  Obviously, everything has this kind of polar outcome choice, so that’s not much insight, but I think the reason for the two camps is the agility angle.  If there are real benefits that can be achieved through SDN and NFV deployment, then those new benefits could justify new spending, not reduce existing spending.  So you can say that we have a group who believe that vendors will figure out how to drive new benefits, and a group who thinks the vendors will fall into commoditization.  That’s likely true.

In my view, the “new benefits” group will indeed win out, but the real question is “when?” and not “if.”  Going back to the five-year timeline, the industry has to either develop something new and useful to drive up spending or face continued budget-based pressure to cut it.  No matter how those cuts are achieved, they’ll be tough to add back later on once they happen.  So I do think that there’s a five-year timeline to worry about—an even shorter one, in fact.  If we can’t identify how networking generates new benefits in the next three years, then any benefits we find after that won’t get us back to a golden age.  They’ll simply let us hang on to today’s relatively stagnant market picture.

When you look at the details of the Cowen report you see things like “rapid provisioning” listed as key goals, things that make the network responsive.  To me, that makes it clear that the revolution we’re looking for has little or nothing to do with traditional SDN concepts like “centralization,” or even with NFV concepts like the hosting efficiencies of COTS servers.  What it’s really about is operationalization.  We have networked for ages based on averages, and now we want more specific application and service control over things, without incurring higher costs because we’re touching things at a more detailed level.  No more Shakespearean diagnosis of network problems, “Something is rotten in the state of Denmark”; we need to know what’s rotten and where in Denmark it’s happening.

The issue here, of course, is that the classical models of SDN and NFV have nothing specific to do with operationalization.  You can argue strongly that neither SDN nor NFV brings with it any change in management model (you can even argue that neither brings a management model at all).  The interesting thing is that when I’ve gotten into detailed discussions with operators, about NFV in particular, what I hear from them is that same agility point.  That means operators really do see a need for a higher-level benefit set, and they’re hoping to achieve it.

Hope springs eternal, but results are harder to come by.  The reason is simple: nobody really understands this stuff very well.  Among buyers, SDN literacy is satisfactory only among Tier One and Tier Two operators, and NFV literacy is unsatisfactory everywhere.  In the Cowen audience polls NFV wasn’t really even on the radar, and yet if you’re going to transform network agility it’s hard to see how you do that without hosted processes.

Vendors are, of course, the real problem.  My straw poll of vendors suggests that their literacy on SDN and NFV isn’t much better than buyers’.  Why?  Because they’re really not trying to sell either one.  Startup vendors want to create a “Chicken-Little-Says-the-Sky-is-Falling” story in the press so they can be flipped.  Major vendors want to create a “Chicken-Little-is-Right-but-It’s-Still-in-the-Future” mindset that keeps current paradigms in control and drives near-term revenue.  The Cowen panel said that the SDN world was getting “cloudy” in the sense of becoming obscure.  Do we really think that happens for any reason other than that people want it to?  That’s why we won’t see revolutionary change in five years, and that may be why networking will in fact lose its mojo, just as IT is losing it now.  But there’s still time, folks.  Just face reality.

Hockey Stick or Hockey Puck?

We hear all the time about how new technologies like the cloud, SDN, and NFV are poised for “hockey-stick” growth.  Most of the time that’s not true, of course.  In an industry with three- to eight-year capital cycles it’s pretty darn hard for anything to achieve really explosive growth, because of the dampening effect of transitions.  But there are other factors too, and as we come to a new year, this is a good time to look at just what does inhibit technology growth.  What is it that’s making our hockey stick look more like a hockey puck: flat on cold ice?

Most everyone knows that companies look at return on investment in making project decisions, but in fact ROI isn’t always the measure.  Tech spending tends to divide into a “budget” category and a “project” category, with the former being the money allocated to sustain current capabilities and the latter the money allocated to extend tech into new areas.  Most CIOs tell me that budget spending is typically aimed at getting a little more for a little less rather than at hitting a specific ROI target, largely because companies don’t require formal justification to keep doing what they’ve done all along.  Budget spending thus tends to focus on cost reduction, while project spending is generally accretive to IT spending overall because it buys new stuff justified by new benefits.

The challenge we have with things like the cloud, SDN, and NFV is that they are all cost-side technologies in how they’re presented.  It’s true that you can justify a “project” aimed at cutting costs through technology substitution, but buyers of all types have been telling me in surveys for decades that this kind of project is a politically difficult sell unless the cost savings are truly extravagant.  The problem is that “status quo for a little less cost” doesn’t overcome the risk that the new technology or the transition to it will prove really disruptive.  Our waves of IT growth in the past have come from projects that created new benefits.  I’ve blogged about this before so I won’t go into more detail here.  Just remember that cost-side projects, if anything, create downward trends in spending overall.

Another factor that inhibits the growth of new technology is buyer literacy.  You don’t have to be an automotive engineer to buy a car, but being able to drive is an asset.  The point is that a buyer has to be able to exercise a technical choice so as to reap the expected benefit, and that takes some understanding of what they’re doing.  We’ve measured buyer literacy for decades, and we’ve found that it generally takes a literacy rate of about 30% to create a “natural market,” one where buyers are able to drive things without a lot of pushing and help from others.  Right now, buyer literacy in the cloud has exceeded that threshold, and operator literacy for SDN also exceeds it.  None of the other “hockey-stick” market areas have the required level of buyer literacy.

With SDN, buyers’ problems relate primarily to identifying costs and benefits.  Even enterprise technologists have a hard time drawing a picture of a complete SDN network.  A bit less than half could accurately describe an SDN application in a data center, but of this group the majority admit that the benefits there aren’t sufficient to drive deployment.  When you get down to the details, you find that users simply don’t understand what SDN will do differently.  One user in our fall survey said, “I have a guy who comes in and tells me that SDN will save on switching and routing, but then he tells me that we’d use the same switches and routers we already have, just run OpenFlow for them.”  You can see the problem here, and isn’t that a fairly accurate picture of the net of SDN sales pitches today?

With NFV the situation is similar, but among operators there’s a lot more proactivity about finding the right answers.  Operators started off with a widely accepted benefit paradigm (capex reduction) and a clear path to it (replace middlebox products with COTS-hosted functions).  They then found out that 1) there weren’t that many middleboxes to replace, 2) they could achieve similar savings by pushing vendors on price for legacy gear, and 3) they couldn’t be sure what the operations cost of an NFV deployment would be.  Now they’re shifting their benefit focus from capex to opex, to service velocity, to service creation and OTT competition.  The problem is that they don’t have a clear idea of how these evolving benefits will be achieved.  NFV by itself targets only hostable functions, and few operators believe those would make up more than 20% of their total infrastructure.  How do you achieve massive improvements in operations or services with that small a bite?
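
To make the “small bite” problem concrete, here’s a minimal back-of-the-envelope sketch in Python.  The 20% addressable share comes from the operator estimates above; the per-slice savings rates are purely illustrative assumptions on my part, not survey numbers:

    # Back-of-the-envelope leverage math: if NFV can only touch a small slice of
    # infrastructure, even aggressive savings on that slice barely move the total.
    # The 20% addressable share reflects the operator estimates cited above; the
    # savings rates below are illustrative assumptions, not survey data.
    addressable_share = 0.20  # fraction of infrastructure NFV could plausibly host

    for savings_on_slice in (0.10, 0.25, 0.40):
        total_impact = addressable_share * savings_on_slice
        print(f"{savings_on_slice:.0%} savings on the hostable slice -> "
              f"{total_impact:.1%} cut in total infrastructure cost")

Even a 40% saving on the hostable 20% trims only 8% off the total, which is exactly why operators are being pushed toward opex and agility benefits instead.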

The vendors themselves also inhibit adoption of the new technologies.  Companies today rely more than ever on their suppliers for professional services and technology education.  If the new technologies are designed to reduce costs (and remember, that’s the classic mission for the cloud, SDN, and NFV), then why would a vendor push customers into them?  Yes, you could argue that a startup could step in and present the new options, but 1) the VCs are all funding new social-network stupid-pet-tricks stuff and won’t be bothered with real tech investment, and 2) the buyers will say “I can’t bet my network on some company with fifty million in valuation; I need a public company worth billions.”  So we now have to look for a big company that can behave like a startup (that’s why Dell could be such an interesting player: a company that’s gone private like that is exactly what buyers would like, and it can take a long-term view most public companies can’t).

The point here is that we’re on track to achieve significant cloud growth, as opposed to the pathetic dabbling we have now, in about 2016, with SDN and NFV trailing that.  All these dates could be accelerated by the right market activity, but it’s going to be up to vendors to initiate it.  What we need to watch in 2014 isn’t who has the best technology, but who has the best buyer education process.  With the current “take-root-and-become-a-tree” mindset among vendors, the bold will be favored; they’ll be the only ones moving at all.