Will OSS/BSS Love or OSS/BSS Hate Win?

SDxCentral cites an important truth in an article and report (the latter from the MEF and the Rayno Report), which is that there’s a lot of dissatisfaction with OSS/BSS out there.  It’s more complicated than that, though.  I mentioned in a prior blog my own experience at a major Tier One, where I got an “I want to preserve the OSS/BSS” and an “I want to trash OSS/BSS and start over” comment set from two people sitting next to each other.  Operations people (the CIO and their reports) are largely in the former group, and the CTO and marketing teams in the latter, to no one’s surprise.  What may be surprising is that it’s not clear which approach would really be best.

Nobody disagrees that virtualization has put pressure on existing systems, for reasons I’ve gone over in detail in prior blogs.  The heart of the problem is that we’re substituting networks built in two steps for networks built in one.  Where in the past we’d coerce services from cooperating devices, the virtual network of the future has to first deploy functions and then link them into services.  The essential question is how interdependent the steps in our new two-step process have to be.

What I’ve called the “virtual device model” says that the two steps can be independent.  You build network services as before, by coercing service behaviors from functional communities.  The fact that those functional communities are themselves built is taken care of outside the realm of OSS/BSS and even NMS.  Why do network management on server farms, after all?

Nearly all the OSS/BSS deployed today supports the virtual-device model.  Services are provisioned and changed through management interfaces exposed by network hardware.  These same interfaces can report faults or congestion, so any commercial impact (SLA violations etc.) can be addressed this way.  OSS/BSS vendors make a nice piece of change selling adapters that link their products to the underlying network equipment.

What the report/article calls “lifecycle service orchestration” or LSO is what I’ve been calling “service lifecycle management” or SLM.  Whatever term you like, the goal is to automate the deployment of functions and their connection into something, which might be a virtual device or a service that is itself presented through a management API.  Since the deployment of hosted features, the connecting of those pieces, and the threading of user connections to the result is a complicated multi-stage process, the term “orchestration” is a good way to describe it.
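
To make that multi-stage process concrete, here’s a minimal Python sketch of what an orchestrator of this kind might do.  Every name in it (Function, Service, orchestrate, the host list) is my own illustration, not anything drawn from an LSO or SLM specification.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Function:
    name: str
    host: Optional[str] = None   # filled in when the function is deployed

@dataclass
class Service:
    functions: list
    links: list = field(default_factory=list)

def orchestrate(service: Service, hosts: list) -> None:
    # Stage 1: deploy each hosted function onto a resource-pool host.
    for i, f in enumerate(service.functions):
        f.host = hosts[i % len(hosts)]
    # Stage 2: connect the deployed functions into a chain.
    for a, b in zip(service.functions, service.functions[1:]):
        service.links.append((a.name, b.name))
    # Stage 3: thread the user's access connection to the result.
    service.links.insert(0, ("user-access", service.functions[0].name))

svc = Service([Function("vFirewall"), Function("vRouter")])
orchestrate(svc, ["server-1", "server-2"])
print(svc.links)   # [('user-access', 'vFirewall'), ('vFirewall', 'vRouter')]
```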

Where I think “LSO” and “SLM” diverge is actually the place where the virtual device model becomes unwieldy.  I can deploy something through orchestration, and I can even change its configuration, but I still have to address the question of just what’s happening to the functionality of the service itself.  How do we relate conditions in a server resource pool to conditions at the service level?  How does someone in operations, like a customer service rep (CSR), gain access to the state of the service, and how does a portal relate it to the customer?  Functionally my service looks like discrete boxes (routers, firewalls, etc.) but it isn’t.

The virtual device model says that when I deploy virtual functionality, I also deploy or commit collateral management processes.  Their responsibility is to translate between the “functional management” state that the virtual device has to present upward toward the user and the “structural management” of real resources that anyone who wants to really see what’s going on, or to fix it, has to be able to see.  The biggest problem with this is that every deployed function has to obtain enough information on its resources to be able to represent its state.  That risks overwhelming the real management interfaces of the resource pool, and it also risks that service functions with access to collective resource management interfaces could (deliberately or by accident) do something broadly destructive.
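
To illustrate that translation, here’s a hedged sketch of the kind of collateral management process the model implies: a per-device proxy that reads structural (resource) state and presents a functional (device) view upward.  The telemetry dictionary and all the names are assumptions for illustration; a real pool would expose a VIM or telemetry interface instead.

```python
RESOURCE_STATE = {           # stand-in for a resource pool's telemetry
    "vm-101": {"cpu": 0.42, "up": True},
    "vm-102": {"cpu": 0.97, "up": True},
}

class VirtualDeviceProxy:
    def __init__(self, name, resources):
        self.name = name
        self.resources = resources

    def functional_state(self):
        # Translate many structural readings into one device-level view.
        readings = [RESOURCE_STATE[r] for r in self.resources]
        return {
            "device": self.name,
            "operational": all(r["up"] for r in readings),
            # Note: every proxy polling the shared pool like this is
            # exactly the scaling/security risk described above.
            "degraded": any(r["cpu"] > 0.9 for r in readings),
        }

proxy = VirtualDeviceProxy("vRouter-7", ["vm-101", "vm-102"])
print(proxy.functional_state())
# {'device': 'vRouter-7', 'operational': True, 'degraded': True}
```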

Another less obvious risk is that we’ve made lifecycle management a two-level process whose levels are not really visible to each other.  We might be recovering something down inside a virtual device when the best approach would be to reroute to a different function that’s already deployed elsewhere.  If we make a routing decision at a high level, we might really want to instantiate new processes inside virtual devices.  How does this knowledge get passed between independent lifecycle management processes, independent orchestrations?
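
Purely as an illustration of that knowledge-passing problem, here’s one way the two levels might coordinate: the inner, per-device handler either recovers locally or escalates so the service level can reroute.  The threshold, event fields, and action names are my assumptions, not anything from a standard.

```python
def device_level_recover(fault):
    """Inner orchestration: try local recovery inside the virtual device."""
    if fault["severity"] < 0.5:
        return {"action": "redeploy-component", "escalate": False}
    # Local recovery won't help; hand the decision upward.
    return {"action": "none", "escalate": True}

def service_level_recover(fault):
    """Outer orchestration: reroute to a function already deployed elsewhere."""
    result = device_level_recover(fault)
    if result["escalate"]:
        return "reroute-to-standby-function"
    return result["action"]

print(service_level_recover({"severity": 0.3}))  # redeploy-component
print(service_level_recover({"severity": 0.8}))  # reroute-to-standby-function
```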

There’s a really subtle risk here, too, which is that of promiscuous exchanges of status between components of virtual devices, effectively a kind of back-channel.  If a vendor builds their own management tools, could those tools communicate outside the standard flow of state/event information?  Could we dip down from a functional level to a resource level?  If any of that happens, we’re exposing implementation details upward into what’s supposed to be a function-driven process, and that means other solutions can’t be freely substituted.

From an OSS/BSS perspective, the real problem is that it’s very likely that every vendor who offers a virtual function will offer a different lifecycle management strategy for it.  There is then no consistency in how a virtual device looks as it’s being commissioned and as it evolves through its operating life.  There may be no consistency in terms of what a given “functional command” to the virtual device would do in terms of changing the state of the device.  Thus, it’s very difficult to do effective customer support or drive the operations tasks associated with the commercial deployment of a service.

The alternative to the virtual device model is the managed model approach.  With this approach, a service is built as a hierarchical set of functions that start with basic resources and grow upward to commercial visibility.  Each level of the hierarchy can assert management variables whose value is derived from what it expects its components to be.  If you say that 1) each function-model can send stuff only to its direct neighbors and 2) each function-model can define its own local state and event-handling, then you can build a service from functions, not devices, and the only place where you have to worry about mapping to real-world stuff is at the very bottom, where “nuclear functions” have to be deployed.
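
Here’s a minimal sketch of those two rules in Python.  The class, the state values, and the event names are illustrative assumptions, not a real modeling language.

```python
class FunctionModel:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.state = "operational"

    def handle_event(self, event):
        # Rule 2: state/event handling is local to this function-model.
        if event == "child-fault":
            self.state = self.derive_state()

    def derive_state(self):
        # Rule 1: a node sees only its direct children, never the
        # resources (or vendor tools) hiding beneath them.
        if all(c.state == "operational" for c in self.children):
            return "operational"
        return "degraded"

# "Nuclear functions" at the bottom map to real resources; everything
# above them is composed purely from function states.
access = FunctionModel("AccessFunction")
firewall = FunctionModel("FirewallFunction")
service = FunctionModel("VPN-Service", [access, firewall])

firewall.state = "failed"            # set by the bottom-level resource mapping
service.handle_event("child-fault")  # the event comes from a direct neighbor
print(service.state)                 # degraded
```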

The difference in these approaches is that in the virtual device approach, we secure lifecycle management by deploying lifecycle management processes as part of the features.  Nothing will ever interwork properly that way.  The second approach says that you define lifecycle management steps in a hierarchical model that builds upward from the resources to manageable functions, all under model control and without specialized processes.  If you want to substitute a feature, the only thing that has to be controlled is the initial mapping of the resource MIBs to the lowest-level functional model, the level that represents the behavior of that resource.
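
A small sketch of that single point of control: two vendors’ MIB readings mapped onto the same lowest-level functional state.  The ifOperStatus/ifInErrors names echo SNMP’s IF-MIB, but the second vendor’s variables and both adapters are invented for illustration.

```python
VENDOR_A_MIB = {"ifOperStatus": 1, "ifInErrors": 0}      # SNMP-style reading
VENDOR_B_MIB = {"link_state": "UP", "error_count": 0}    # a different vendor

def map_vendor_a(mib):
    return {"operational": mib["ifOperStatus"] == 1,
            "errors": mib["ifInErrors"]}

def map_vendor_b(mib):
    return {"operational": mib["link_state"] == "UP",
            "errors": mib["error_count"]}

# Both adapters yield the same functional-model state, so the layers
# above never see which implementation was used.
print(map_vendor_a(VENDOR_A_MIB) == map_vendor_b(VENDOR_B_MIB))  # True
```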

This relates to the OSS/BSS evolution in that with the virtual device model, you really can’t make the OSS/BSS evolve because it doesn’t know anything about the network evolution.  This satisfies the “leave-my-OSS/BSS-alone” school because there’s no other choice.  With the “virtual function” model, you can take the entire suite of operations processes (defined in the TMF’s enhanced Telecom Operations Map, or eTOM) and make them into microservices run at the state/event intersection of each function in the service model.  You can build a data model that totally defines service automation, in other words.
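
As a sketch of what “microservices at the state/event intersection” could mean, consider a simple table that maps (state, event) pairs to an operations process and a next state.  The process names echo eTOM-style activities, but the table itself is an illustrative assumption.

```python
STATE_EVENT_TABLE = {
    ("ordered",  "activate"): ("service_activation", "active"),
    ("active",   "fault"):    ("problem_handling",   "degraded"),
    ("degraded", "repaired"): ("sla_reporting",      "active"),
}

def dispatch(state, event):
    """Look up the operations microservice to run and the next state."""
    process, next_state = STATE_EVENT_TABLE[(state, event)]
    print(f"running {process} for event '{event}' in state '{state}'")
    return process, next_state

state = "ordered"
for event in ["activate", "fault", "repaired"]:
    _, state = dispatch(state, event)
print(state)  # active
```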

There is definitely a split in opinion on the role that OSS/BSS should play in the future.  The biggest barrier to OSS-centric visions for orchestration is the feeling that the OSS/BSS needs a face-lift, quite apart from new technologies like SDN and NFV.  The biggest positive for the OSS/BSS vendor community is that there doesn’t seem to be an understanding of how you’d “modernize” an OSS/BSS in the first place.

The TMF outlined the very model-coupled-event approach I’ve described here about eight years ago, in the NGOSS Contract/GB942 work.  They didn’t even present that approach to the ISG in the recent meeting of SDOs convened to harmonize on service modeling.  That may be the big problem facing the OSS/BSS community.  If they can’t adopt what was the seminal standard on event-based service automation, what would it take to move the ball?  Unless some vendor steps up to implement that second approach, though, the OSS/BSS incumbents will be safe.