A Transformed Service Infrastructure from Portal to Resources

Transformation, for the network operators, is a long-standing if somewhat vague goal.  It means, to most, getting beyond the straitjacket of revenue dependence on connection services and moving higher on the food chain.  Yet, for all the aspirations, the fact is that operators are still looking more at somehow revitalizing connection services than at transforming much of anything.  The reasons for this have been debated at length, including here in my blog, so I don’t want to dig further into them.  Instead I want to look at the technology elements that real transformation would require.

I’ve said in the past that there were two primary pieces to transformation technology—a portal system that exposes service status and ordering directly to customers, and a service lifecycle management system that could automate the fulfillment of not only today’s connection services and their successors, but also those elusive higher-layer services that operators in their hearts know they need.  This two-piece model is valid, but perhaps insufficient to guide vendor/product selection.  I want to dig further.

We do have long-standing validation of the basic two-piece approach.  Jorge Cardoso did a visionary project that combined LinkedUSDL, OpenTOSCA, and SugarCRM to produce a transformed service delivery framework.  It had those two pieces, TOSCA orchestration of service lifecycle management and SugarCRM portal technology, bound together by LinkedUSDL.  This was a research project, a proof of concept, and it would need a bit of generalizing before it could become a commercial framework capable of supporting transformation.

While there are two major functional elements in the transformative model we’ve been talking about, each of these elements is made up of distinct pieces.  To really address transformation, we have to make all these pieces fit, and make sure that each performs its own mission as independently as possible, to prevent “brittle” or siloed implementations.  That’s possible, but not easy.

The front-end of any portal, we know from decades of experience, should be a web-like process, based on RESTful APIs and designed to deliver information to any kind of device—from a supercomputer data center to a smartphone.  This web front-end hosts what we could call the “retail APIs”, meaning the APIs that support customer, partner, and customer-service processes.  To the greatest extent possible, these should be as general as a web server is, because most of the changes we’re going to see in new service applications will focus on this layer.
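
To make this concrete, here’s a minimal sketch of a retail-API front-end.  I’m using Python and Flask purely for illustration, and the endpoints and the cloud-layer stub are my own inventions, not any operator’s actual API; the point is that this layer only routes and formats, pushing all the logic behind it:

```python
# A minimal sketch of a "retail API" front-end.  Flask is used only as
# an example web framework; endpoint paths and the cloud-layer stub are
# hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)

class CloudLayerStub:
    """Stand-in for the cloud-support layer described below; in a real
    system this would be a scalable set of stateless workers."""
    def get_status(self, service_id):
        return {"service": service_id, "state": "active"}

    def submit_order(self, order):
        return {"order": order, "state": "accepted"}

cloud = CloudLayerStub()

@app.route("/services/<service_id>/status", methods=["GET"])
def service_status(service_id):
    # The front-end only routes and formats; no business logic here.
    return jsonify(cloud.get_status(service_id))

@app.route("/services", methods=["POST"])
def order_service():
    # Editing, validation, and rights checks belong in the cloud layer.
    return jsonify(cloud.submit_order(request.get_json())), 202

if __name__ == "__main__":
    app.run()
```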

Behind the web-process piece of the portal is what we could call the cloud-support layer.  You want the front-end to be totally about managing the user interface, so any editing, validation, and rights brokerage should be pulled into something like a cloud process.  I’m calling this “cloud” to mean that the components here should be designed for scaling and replacement, either by being stateless or by using back-end (database or data model) state control.  This is particularly important for portal functions that are inquiries (management status requests) rather than service orders or updates.  That’s because historically there are more of these status-related requests, because users expect quick responses to them, and because there are no long-cycle database updates at the end of the flow to limit how much QoE improvement scalable front-end processes can deliver.
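
Here’s a rough illustration of that point.  The store and handler names below are hypothetical; what matters is that any replica of the worker can answer a status inquiry, because no state lives in the process itself:

```python
# Sketch of a stateless cloud-support worker.  STORE stands in for a
# shared database or data model; all names are illustrative.

STORE = {"svc-001": {"state": "active", "bandwidth": "100M"}}

def handle_status_inquiry(service_id):
    # Read-only path: safe to scale out or replace freely, since every
    # bit of state is in the shared store.
    record = STORE.get(service_id)
    if record is None:
        return {"error": "unknown service"}
    return {"service": service_id, **record}

def handle_order_update(service_id, changes):
    # Write path: still stateless in-process, but it ends in a
    # long-cycle store update, which is why it scales less freely.
    record = STORE.setdefault(service_id, {"state": "pending"})
    record.update(changes)
    return {"service": service_id, **record}

print(handle_status_inquiry("svc-001"))
```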

For the entire portal flow, from web-side to cloud-side, it’s more important to have an agile architecture than to have a specific product.  You should be able to adapt any web-based process to be a portal front-end, and you should be wary of selecting a cloud layer that’s too specific in terms of what it does, because the demands of future services are difficult to predict.  It’s also important to remember that the greatest innovations in terms of creating responsive and resilient clouds—microservices and functional (Lambda) computing—are only now rolling out, and most front-end products won’t include them.  Think architecture here!

The “back end” of the portal process is the linkage into the service lifecycle management system, and how this linkage works will depend, of course, on how the service lifecycle management process has been automated.  My own recommendation has always been that it be based on service modeling and state/event processing, which means that the linkage with the portal is made by teeing up a service model (either a template for a new service or an instance representing an existing one) and generating an event.  This is a useful model even for obtaining current service status; a “service event” could propagate through the service model and record the state and parameters of each element.
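
As a sketch of the idea (the model structure and the event here are my own illustration, not any standard), a status event could simply walk the model tree and record what it finds:

```python
# Sketch of a "service event" propagating through a service model.
# The model layout and element names are invented for illustration.

service_model = {
    "name": "vpn-service", "state": "active", "params": {"sites": 3},
    "children": [
        {"name": "access-leg", "state": "active", "params": {}, "children": []},
        {"name": "core-vpn", "state": "degraded", "params": {}, "children": []},
    ],
}

def propagate_status_event(element, report):
    # Record this element's state/parameters, then push the event down.
    report.append((element["name"], element["state"], element["params"]))
    for child in element["children"]:
        propagate_status_event(child, report)

report = []
propagate_status_event(service_model, report)
for name, state, params in report:
    print(f"{name}: {state} {params}")
```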

If a service model is properly defined (see my Service Lifecycle Management 101 blog series), then any instance of a process can handle it, which means the structure can scale as needed to handle the work.  This is important because it does little good to have a front-end portal process that’s elastic and resilient and then hook it to a single-thread provisioning system.  In effect, the very front end of the service lifecycle management system is inherently cloud-ready, which is exactly how it should be.
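
Here’s a small sketch of why that works.  Because state lives in the model store and the state/event table is shared, any of a set of identical workers can pick up any event; none of the names below reflect a real product:

```python
# Sketch of scalable state/event lifecycle processing.  The model
# store, event names, and state/event table are all illustrative.
import queue
import threading

events = queue.Queue()
MODEL_DB = {"svc-001": {"state": "ordering"}}  # shared model store

# State/event table: (current state, event) -> (action, next state)
TABLE = {
    ("ordering", "activate"): ("deploy resources", "active"),
    ("active", "fault"): ("run remediation", "degraded"),
}

def worker():
    while True:
        service_id, event = events.get()
        model = MODEL_DB[service_id]  # state is in the model, not the worker
        action, next_state = TABLE[(model["state"], event)]
        print(f"{service_id}: {action}")
        model["state"] = next_state
        events.task_done()

# Any number of identical workers can share the event load.
for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()

events.put(("svc-001", "activate"))
events.join()
```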

As you dive deeper into service lifecycle management, though, you inevitably hit the border of resource-bound processes.  You can have a thousand different service lifecycle flows in the service layer of the model, where the state and parameters of a given service are always recorded in the data model itself.  Deeper in, you hit the point where resources set their own intrinsic state.  I can allocate something only if it’s not already allocated to the maximum, which means that parallel processes that keep their state in a service model are now constrained by “real” state.

The problem of “serializing” requests to prevent collisions in the allocation of inherently stateful resource elements is greatest where specific resources are being assigned.  As an example, you can have a “cloud process” that commands something be deployed on a resource pool, and that process may well be parallel-ready because the pool is sized to prevent collision of requests.  But at some point, a general request to “Host My Stuff” will have to be made specific to a server/VM/container, and at that point you have to serialize.
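
A toy example of that serialization point, with hypothetical names throughout: the parallel part of “Host My Stuff” ends exactly where a specific server has to be picked, and a lock (or an equivalent single-threaded allocator) has to take over:

```python
# Sketch of the serialization point in resource allocation.  Server
# names, capacities, and the request flow are invented for illustration.
import threading

servers = {"srv-a": 2, "srv-b": 1}  # remaining capacity per server
allocation_lock = threading.Lock()  # the unavoidable serialization point

def host_my_stuff(request_id):
    # Everything before this point can run in parallel; the pool is
    # sized to absorb contention.  The final assignment cannot be.
    with allocation_lock:
        for name, free in servers.items():
            if free > 0:
                servers[name] = free - 1
                return f"{request_id} -> {name}"
        return f"{request_id} -> rejected (pool full)"

threads = [threading.Thread(target=lambda i=i: print(host_my_stuff(f"req-{i}")))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```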

The only good solution to this problem is to divide the administration of pooled resources so that resource-specific processes like loading an app into a VM are distributed among dozens of agents (running, for example, OpenStack) rather than concentrated in a single agent that supports the pool at large.  That means decomposing resource-layer requests to the level of “administrative zone” first, then within the zone to the specific server.
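
Here’s a sketch of that decomposition, again with invented zone names.  Each zone agent serializes only its own allocations, so contention shrinks as zones are added:

```python
# Sketch of zone-based decomposition of resource allocation.  Zone
# names, capacities, and the dispatch rule are illustrative only.
import threading

class ZoneAgent:
    """One agent (for example, an OpenStack instance) per administrative
    zone; each serializes only its own zone's allocations."""
    def __init__(self, name, capacity):
        self.name, self.free = name, capacity
        self.lock = threading.Lock()

    def allocate(self, request_id):
        with self.lock:  # contention is limited to this zone
            if self.free > 0:
                self.free -= 1
                return f"{request_id} -> zone {self.name}"
            return f"{request_id} -> zone {self.name} full"

zones = [ZoneAgent("east", 10), ZoneAgent("west", 10), ZoneAgent("central", 10)]

def dispatch(request_id):
    # Decompose the request to a zone first, then allocate within it.
    zone = zones[hash(request_id) % len(zones)]
    return zone.allocate(request_id)

print(dispatch("req-42"))
```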

I’ve seen about 20 operator visions on the structure of retail portals that feed service lifecycle management, and I have to honestly say that I’ve not yet seen one that addresses all of these issues.  Most of the operators are saying that they’re only doing early prototypes, and that they’ll evolve to a more sophisticated approach down the line.  Perhaps, but if you base your early prototypes on a model that can’t do the sophisticated stuff, your evolution relies on that model evolving in the right direction.  If it doesn’t, you’re stuck in limited functionality, or you start over.

The thing is, all of this can be done right today.  There are no voids in the inventory of functional capabilities needed to do exactly what I’ve described here.  If you pick the right stuff up front, then your evolution to a full-blown, transformed system will be seamless.  I argue that it won’t take significantly longer to start off right today, and it darn sure will take a lot longer to turn around a project that can’t be taken to a full level of transformation because the early pieces don’t fit the ultimate mission.