Event-Driven Operations, OSS/BSS Evolution, and Virtualization

All of the discussions of service modeling and management or operations integration that I’ve recently had raise the question of OSS/BSS modernization.  This is a topic that’s as contentious as that of infrastructure evolution, but it involves a different set of players in both the buyer and seller organizations.  Since operations practices and costs will have to figure in any realistic vision of transformation, we have to consider this side of the picture.

Fundamentally, OSS/BSS systems today are transactional in nature, meaning that they are built around workflows that are initiated by things like service orders and changes.  This structure makes sense given that in the past, OSS/BSS systems were about orders and billing, and frankly it would make sense in the future if we assumed that the role of OSS/BSS systems was to remain as it has been.

A bit less than a decade ago, the OSS/BSS community and the TeleManagement Forum (TMF) started to consider the expansion of the OSS/BSS role to become more involved in the automation of service lifecycle management.  A number of initiatives were launched out of this interest, including the Service Delivery Framework (SDF) stuff and the critical GB942 “NGOSS Contract” work.  The latter introduced the concept of using a contract data model to steer events to operations processes, and so it was arguably the framework for “making OSS/BSS event-driven”, a stated industry goal.

Since that time, there has been a continuous tension between those who saw OSS/BSS evolving as infrastructure changed, and those who saw OSS/BSS architectures as totally obsolete and in need of revision, in many cases regardless of what happens at the infrastructure level.  My own modeling of operations efficiency gains has shown that operators could gain more from automating operations practices without changing infrastructure than by letting infrastructure change drive the bus.  So which option is best: evolve or revolt?  You have to start by asking what an “event-driven” OSS/BSS would look like.

An event-driven OSS/BSS would be a collection of microservices representing operations tasks, placed into a state/event structure in a service data model and driven by service and resource events that occurred during the service lifecycle.  This approach collides with current reality in a number of ways.
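To make that concrete, here’s a minimal Python sketch of the GB942-style structure I’m describing, with purely hypothetical names: the service data model carries both the current state and the state/event table, and stateless operations processes are selected by a (state, event) lookup.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# A process takes the model and the event, and returns the next state.
Process = Callable[["ServiceModel", str], str]

def activate(model: "ServiceModel", event: str) -> str:
    model.context["activated"] = True
    return "ACTIVE"

def remediate(model: "ServiceModel", event: str) -> str:
    model.context["fault_count"] = model.context.get("fault_count", 0) + 1
    return "ACTIVE"

@dataclass
class ServiceModel:
    state: str = "ORDERED"
    context: Dict[str, object] = field(default_factory=dict)
    # The state/event table lives in the data model, not in the code.
    table: Dict[Tuple[str, str], Process] = field(default_factory=dict)

    def handle(self, event: str) -> None:
        process = self.table.get((self.state, event))
        if process is None:
            return  # event not significant in this state
        self.state = process(self, event)

svc = ServiceModel(table={
    ("ORDERED", "activate"): activate,
    ("ACTIVE", "fault"): remediate,
})
svc.handle("activate")
svc.handle("fault")
print(svc.state, svc.context)  # ACTIVE {'activated': True, 'fault_count': 1}
```

The point of the sketch is that the steering logic is data, so the operations processes themselves can be simple, replaceable microservices.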

First, a good event-driven structure has stateless service components like microservices.  That means that the software is designed like a web server—you send it something and it operates on that something without regard for what might have gone before.  Today, most OSS/BSS software is stateful, meaning that it processes transactions with some memory of the contextual relationship between that transaction and other steps, and that memory is maintained within the process itself.  In event-driven systems, context is based on “state”, and state is part of the data model.  Thus, you’d have to restructure the software.
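The contrast can be shown in a few lines of Python (illustrative only, not real OSS/BSS code):

```python
# Stateful (today's OSS/BSS style): the process itself remembers context,
# so transactions must keep coming back to the same running instance.
class StatefulOrderHandler:
    def __init__(self):
        self.step = 0  # memory held inside the process
    def handle(self, transaction: str) -> str:
        self.step += 1
        return f"step {self.step}: {transaction}"

# Stateless (web-server style): all context arrives with the request,
# carried in the data model, so any replica can handle any event.
def stateless_handler(model_state: dict, transaction: str) -> dict:
    step = model_state.get("step", 0) + 1
    return {**model_state, "step": step, "last": transaction}

h = StatefulOrderHandler()
print(h.handle("order"))                  # step 1: order
print(stateless_handler({}, "order"))     # {'step': 1, 'last': 'order'}
```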

The second issue with event-driven structures is the data model.  Data models are critical in event-driven systems not only to maintain the state/event tables but also to collect the “contextual variables” that have to be passed among the software elements.  The standard data model for OSS/BSS is the venerable TMF SID, and while on the surface the structure of the SID seems to lend itself to the whole event-driven thing, the problem is that modernization of the SID hasn’t followed software-practices-driven thinking.  Attempting to preserve the structure has overridden logical handling of new things, which means that a lot of what seems intuitively good turns out not to work.  I’ve had years of exposure to SID and I still get trapped in “logical” assumptions about what can be done that turn out to work completely differently than logic would dictate.

The third issue in event-driven operations is the events.  Few operators would put OSS/BSS systems into mainstream network management.  Most of the network equipment today, particularly at Levels 2 and 3, is focused on adaptive behavior that corrects faults autonomously.  Virtualization adds another layer of uncertainty by making service-to-resource relationships potentially more complicated (horizontal scaling is a good example).  What events are supposed to be included in event-driven operations, and how do we actually get them connected to a service contract?

All of these issues are coming to the fore in things like cloud computing, SDN, and NFV—meaning “virtualization” is driving change.  That means that there are infrastructure-driven standards and practices that are doing most of the heavy lifting in service lifecycle automation, in the name of “orchestration”.

You could orchestrate infrastructure events and virtualization-based remediation of problems below traditional operations.  You could create an event-driven OSS/BSS by extending orchestration into operations.  You could do both levels of orchestration independently, or in a single model.  An embarrassment of riches in terms of choice usually means nobody makes a choice quickly, which is where we seem to be now.

I’m of the view that operations could be integrated into a unified model with infrastructure orchestration.  I’m also of the view that you could have two layered state/event systems, one for “service events” and one for “infrastructure events”.  Either could address the issues of event-driven operations, in theory.  In practice, I think that separation is the answer.
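Here’s a sketch of what that separation could look like, assuming the two-layer split I’ve described (all names hypothetical): the infrastructure layer absorbs resource events and emits a service event upward only when remediation below fails.

```python
from typing import Optional

def infrastructure_layer(event: str, remediation_ok: bool) -> Optional[str]:
    """Handle a resource event; return a service event only on escalation."""
    if event == "resource_fault":
        if remediation_ok:
            return None  # fixed below the line; OSS/BSS never sees it
        return "sla_violation"  # escalate as a service event
    return None

def service_layer(service_event: str) -> str:
    """OSS/BSS-side handling of the escalated service event."""
    return {"sla_violation": "open_trouble_ticket"}.get(service_event, "ignore")

escalated = infrastructure_layer("resource_fault", remediation_ok=False)
if escalated:
    print(service_layer(escalated))  # open_trouble_ticket
```

The design point is that each layer has its own state/event system, and the only coupling is the small set of events that cross the boundary.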

Operators seem to be focusing on a model of orchestration where infrastructure events are handled in “management and orchestration” (MANO, in NFV ISG terms) below OSS/BSS.  This probably reflects in part the current organizational separation of CIO and COO, but it also probably preserves current practices and processes better.  In my top-down-modeling blog earlier this week, I proposed that we formalize the separation of OSS/BSS and MANO by presuming that the network of the future is built on network functions (NFs) that bridge the two worlds.  Services and infrastructure unite at the NF, and each has its own view; the views are related through the NF logic.
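A minimal sketch of that NF intermediary idea, with illustrative names not drawn from any standard: one object with a resource-facing side that translates infrastructure events, and a service-facing side that exposes only the abstract function state.

```python
from typing import Optional

class NetworkFunction:
    """Bridges the infrastructure and service domains for one function."""
    def __init__(self, name: str):
        self.name = name
        self.operational = True

    # Resource-facing: raw infrastructure events arrive here and are
    # translated into abstract service events.
    def on_resource_event(self, event: str) -> Optional[str]:
        if event == "vnf_down":
            self.operational = False
            return "function_unavailable"
        if event == "vnf_restored":
            self.operational = True
            return "function_restored"
        return None  # not service-significant

    # Service-facing: OSS/BSS sees only the abstract function state,
    # never the resources realizing it.
    def service_view(self) -> dict:
        return {"function": self.name, "operational": self.operational}

nf = NetworkFunction("vFirewall")
print(nf.on_resource_event("vnf_down"))  # function_unavailable
print(nf.service_view())                 # {'function': 'vFirewall', 'operational': False}
```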

Some OSS/BSS vendors seem to be heading in this direction—Netcracker for one.  An OSS/BSS vendor can still provide orchestration software (maybe they should, given the disorder in the vendor space) and if they do, they can define that critical intermediary-abstraction thing I called an NF and define how it exchanges events between the two domains.  If you dig through material from these OSS/BSS leaders, that’s the structure I can at least infer from the diagrams, though they don’t say it directly.

This is a critical time for OSS/BSS and all the people, practices, and standards that go along with it.  As network infrastructure moves to virtualization, it moves to the notion of abstract or virtual devices and intent models and other logical stuff that’s not part of OSS/BSS standards today and very possibly never will be.  OSS/BSS types have always told me that you can’t easily stick an intent model into the TMF SID; if that’s true then SID has a problem because an intent model is a highly flexible abstraction that should fit pretty much anywhere.
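To illustrate why an intent model should “fit pretty much anywhere”: it’s simply an abstraction that exposes goals (SLA-style properties) and hides its realization.  A toy version in Python, with names that are illustrative rather than SID classes:

```python
class IntentModel:
    """Exposes intent (goals) upward; hides realization downward."""
    def __init__(self, intent: dict):
        self.intent = intent       # e.g. {"latency_ms": 20, "availability": 0.999}
        self.realization = None    # opaque to anything above this model

    def realize(self, implementation: object) -> None:
        # Any device, VNF, or nested sub-model can fulfill the intent.
        self.realization = implementation

    def conforms(self, measured: dict) -> bool:
        # Latency-style metrics must stay at or below intent; others at or above.
        for key, target in self.intent.items():
            value = measured.get(key)
            if value is None:
                return False
            if key.endswith("_ms"):
                if value > target:
                    return False
            elif value < target:
                return False
        return True

vpn = IntentModel({"latency_ms": 20, "availability": 0.999})
print(vpn.conforms({"latency_ms": 15, "availability": 0.9995}))  # True
print(vpn.conforms({"latency_ms": 30, "availability": 0.9995}))  # False
```

Because nothing above the model depends on what’s inside it, an abstraction like this can slot in anywhere in a hierarchy, which is exactly why its awkward fit with the SID is telling.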

An NF intermediary wouldn’t mean that OSS/BSS can’t evolve or can’t support events, but it would take the event heavy lifting associated with resource management out of the mix, and it would maintain the notion that OSS/BSS and network management and infrastructure are independent/interdependent and not co-dependent.  Perhaps that’s what we need to think about when we talk about operations’ progress into the future.