Creating a Role for Standards in a Software-Defined Age

Suppose that we were able to find a role for standards in a software-defined world.  What would that role be?  I think it’s obvious that traditional standards processes are simply not effective in the current age.  They take way too long, and they end up defining an architecture that has to be ignored if the end product is to be useful.  So should we scrap them, or should we give them another chance?  What follows are my personal views on what’s wrong and what might be put right.

I’ve been involved with standards for decades, and one thing that characterizes the efforts I’ve seen is that they seem to take forever.  The process is very formalized, and participation demands that you follow the procedures.  In many cases, those procedures are time-consuming enough that it’s impossible for anyone to participate casually; you’d have to dedicate an almost-full-time effort.  That, of course, means that standards groups tend to be populated by people paid to be involved.  A group like that tends to build a bit of insularity.  Standards people are standards people.  I’m not knocking them, just saying they are a very distinct group.

The insularity issue is a factor in today’s networking space, because we’re facing radical changes in the business of networking, which logically should demand radical changes in the technology.  In the recent standards work I’ve been involved with at various levels, the need for radical changes worked against the mindset of many of those involved.  Can a process that has to be agile be defined using a glacial methodology?  I doubt it.

The biggest problem, though, is the inspiration that necessarily underlies any great, sweeping, revolutionary concept.  A large group of people can work on an idea, but it’s much harder for a large group to come up with an idea.  Consensus and innovation are strange bedfellows.  The standards efforts I’ve seen that were founded on something truly innovative were driven by the insights of just a few people.  Rarely more than two, often only one.  Those few set out a straw-man concept, which was then collectively refined.

Even if you get a brilliant insight, you can still trip over other factors.  One of the seminal concepts in zero-touch automation is the notion of data-model-driven event steering to processes based on state/event correlation.  The TMF called this “NGOSS Contract” and it was the brainchild (in my view) of a single person, about a decade ago.  It was almost never implemented, and I probably talked about it more than the TMF did.  A great idea, a foundation for revolutionary changes in operations automation, and the body most associated with operations (the TMF) didn’t move their own ball.

This point opens the “what could be done” topic, because what happened in this case (and in another I know of) is that bureaucracy killed even a good insight.  The reason, in my view, is that you really have to approach software with a top-down transformation from goals to execution, based on an architecture (the initial insight).  That’s better done with an open-source project, because a good software architect would work just like that.  The trick is giving the architect the right starting point.

Which, as the TMF proved with NGOSS Contract, can be done by a standards group, but not in the usual way.  What the group should do is appoint an “architect team” of a couple of people with solid background both in the topic and in software design.  This team would then develop an architecture, a technical approach.  The body would then either designate a current open source group (Apache, Eclipse, whatever) to pursue the high-level software architecture (with contributions from qualified people within the standards group) or spawn a new group from their members and perhaps some key invitees from the outside.

The next point to deal with is the bureaucracy, the process.  Phone calls in the middle of the night are not a way to get the best minds in the industry to focus on your issues (or much of anything, truth be told).  Standards groups should be built around chat/message collaboration, not real-time stuff.  They should also limit the frequency of scheduled interactions, working instead to build consensus (as the IETF does, at least to a degree) based on successive contributions.  In any event, the idea is to move quickly out of “standards” and into “open source” to build what’s needed.

There are a lot of very smart people with very strong qualifications to participate in standards processes, but they’re not being paid to do that and so can’t devote endless time to the activities.  Building collaborative systems into the standards process, using Wikis (as software groups almost always do) to develop thinking, and using message boards for discussion rather than phone calls, lowers barriers to participation and makes it more likely that people with good stuff to contribute will be able to do that.

The goal of the process changes should be to align standards work with the practices of open-source software development.  Remember, the plan is to have standards groups generate only the high-level architecture and gather requirements, then hand things off to developers to have them build the platform as a series of logical steps.  Most of the work would be done during this phase, so the standards would be functional requirements and architectural guidelines only.

The obvious question is whether this kind of approach is realistic, and I have to admit that it’s probably not going to happen except under pressure.  As I’ve said, I’ve been involved in a lot of standards activities, and they are populated by hard-working, honorable, and specialized people.  That specialization means that they are likely to push back against major changes.  That’s particularly true of industry bodies that have paid staffs and collect dues to cover their costs.  Through the roughly 20 years I’ve been involved with standards, despite the radical shift toward software-defined everything, the old processes and traditional practices remain largely intact.

Pressure may come, though.  Standards groups are, in my personal opinion, largely ineffective in today’s world.  I can’t think of a single activity that shouldn’t have been done differently.  At some point, companies are going to stop contributing resources to standards bodies, and stop paying attention to the results they achieve.  To forestall this, those bodies will have to make changes in their processes.

We should expect that.  We can’t run things in the Internet age with processes that would have been at home (and often were) in the heyday of the Bell System.  Software’s injection into networking is proof of a profound shift in technology, and only a profound shift in practices is going to help us keep pace with the real world.

Facing a Broadband Future

Everyone probably agrees that broadband Internet is the revolution of our time.  Whether we’re talking about wireline or wireless, even business versus consumer, the explosive growth of broadband access to the Internet has transformed much of telecom.  What we used to think of as the prime service (voice calling) is now an afterthought OTT service.  If broadband has driven so many changes in the recent past, what might it do in the future, and how might it change itself along the way?

The first problem broadband has to face is delivery.  Most countries built out their telecom infrastructure with regulated monopolies, or even with network services as an arm of the government (“postal, telephone, and telegraph” or PTT).  Today, few countries have regulated monopolies to supply network services, which means that service providers have to be profitable.  That poses a major challenge in the era of broadband.

Twisted-pair telephone connections are far from ideal for delivering broadband services.  Long copper loops don’t work well with digital broadband even at distances of a couple of miles, and if there are intermediate nodes fed with digital trunk lines that divide into those loops, the intermediate structure has to be modernized to even do digital subscriber line (DSL) delivery.  Plus, despite advances in DSL technology, it’s very difficult to get beyond 20 Mbps.  In many countries, including the US, broadband services to the consumer already reach a gigabit per second.

The US demonstrates the challenge of old infrastructure versus new services, and video in particular.  Telcos and cable companies compete today for broadband customers, but cable companies had to deliver channelized video from the start, and so they deployed “community access television” or CATV coaxial cable, capable of much higher data rates than unshielded copper twisted pair.  This meant that cable companies had an advantage in the broadband race.  Telcos had to deploy fiber to the home to get equivalent capacity, and that could be difficult.

Why?  “Demand density”.  The potential profitability of a base of network service consumers is directly related to their economic power per square mile (or kilometer, or whatever measure you like, as long as it’s consistent).  Where demand density is high, a mile of broadband feed passes a lot of dollars, and so returns decently on investment.  Why do you think Verizon jumped into FTTH with FiOS and AT&T didn’t?  Verizon has a demand density that’s almost ten times that of AT&T.  Demand density is also why you can’t get FiOS everywhere, and why Verizon sold off customers and lines in areas it thought wouldn’t be suitable for FiOS.
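
Just to make the demand-density arithmetic concrete, here’s a small sketch of the calculation.  The revenue, territory, and route-mile figures below are purely illustrative placeholders, not actual operator data; only the rough order-of-magnitude gap matters.

```python
# Illustrative sketch of the demand-density arithmetic discussed above.
# All figures are hypothetical placeholders, not actual operator data.

def demand_density(total_serviceable_revenue, service_area_sq_miles):
    """Economic power per square mile of service territory."""
    return total_serviceable_revenue / service_area_sq_miles

def dollars_passed_per_mile(total_serviceable_revenue, route_miles_of_plant):
    """Dollars a mile of broadband feed 'passes', on average."""
    return total_serviceable_revenue / route_miles_of_plant

# A hypothetical dense-urban operator versus a sprawling, rural-heavy one.
dense_density = demand_density(50e9, 100_000)      # $50B over 100k square miles
sparse_density = demand_density(60e9, 1_000_000)   # $60B over 1M square miles
print(f"Demand density ratio: {dense_density / sparse_density:.1f}x")

# The same gap shows up as dollars passed per mile of access plant.
print(f"Dense:  ${dollars_passed_per_mile(50e9, 500_000):,.0f} per route mile")
print(f"Sparse: ${dollars_passed_per_mile(60e9, 2_500_000):,.0f} per route mile")
```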

Both cable and FTTH have relied on the revenues generated by linear TV services, and we are clearly past the golden age of that opportunity.  Both telcos and cable companies are struggling to even keep the customers they have.  What that means is that if deploying broadband in the past was difficult to justify from an ROI perspective, it’s only going to get harder.

One impact of that shift is that there’s much more focus on mobile broadband today.  Mobile infrastructure uses RF for the last mile, so it’s less expensive to deploy as long as you don’t need enormous per-user capacity.  The push to 4G LTE and now to 5G is largely driven by the fact that older radio access network (RAN) technologies didn’t support enough aggregate customers and bandwidth per cell, and so would have driven deployment costs too high.  Because RAN modernization is the hot piece of 5G, there’s a real chance that the transitional model of 5G, the so-called “Non-Stand-Alone” or NSA version, which focuses on New Radio (NR), will be the only thing that really gets going.

Another impact of demand density is the 5G millimeter-wave hybrid with fiber to the node (5G/FTTN).  If running fiber to the home is too expensive, how about running it to a neighborhood and using ultra-short-wavelength 5G to carry nearly a gig per user the rest of the way?  However, 5G/FTTN isn’t for carrying linear TV, only streaming, which means that the competition to get faster broadband to the user will inevitably favor streaming TV delivery.

The reason, of course, why fiber to the home is too expensive is that the return on fiber at anything other than very high demand density is too low.  The reason it’s too low is that broadband Internet is a complicated ecosystem, particularly with video content.  We’ve seen proof of that in the interest of network providers in buying content providers—AT&T and Time Warner are the recent example.  But Wall Street says that even the content providers have been facing profit pressure.  We see more commercials, more low-cost programs (reality shows) being produced, and shorter seasons.  All of that encourages consumers to stream alternative video, and many stay with streaming eventually.

If a content producer can’t get bought out by a network operator, they have to think about creating their own direct pathway to the consumer, meaning their own streaming portal.  Many do that, to a degree, already.  That forces the operators who own content to compete with content providers who simply ride over the top (OTT) on their networks.  That, in turn, means that if you are a network operator, you can’t afford to run your network at low (or negative) margins and make it up in video content profits.  Your competitors won’t have the network cost as a boat anchor.

Content producers individually going direct to consumer through streaming portals would result in a rather expensive competitive overbuild, so what would likely happen is a combination of consolidation among content producers who aren’t bought by network operators, and growth in the service of providing content streaming as a service to producers.  Content delivery networks (CDNs) like Akamai might well reap a lot of this, but network operators could also supply caching as a service.

They’d have the facilities, because if operators are going to own content, supply OTTs with delivery pathways (reluctantly, of course), or both, they’ll want to maximize their delivery efficiency, especially for scheduled content like live TV.  Forward caching of live events prevents wasteful simultaneous delivery of the same material to a bunch of customers.  Owning caching would also be the natural pathway to providing ad insertion; you cache the ads too.

From a traffic perspective, content is everything.  Content is also increasingly cached close to the edge, and that means that most traffic will go only from an edge-cache position into the access network.  It’s almost like the “old Internet” of websites connected globally stays the same underneath this enormous pool of cache/access bandwidth.

From a network infrastructure perspective, these factors add up to a network that’s rich in the access area, feeding a set of edge (meaning local office) hosting/caching facilities.  Vendors who have strong edge positions, particularly wireless RAN offerings, may be advantaged by the shifts, as would optical players whose offerings will deliver more bandwidth to the edge, which requires more fiber.

Conventional switching and routing may not do so well, comparatively.  Metro would also grow, but not as fast, and the core side would have the least pressure.  In fact, because consumer networking would be so driven to be an edge-to-cache relationship, it might well mean that business services would be a greater driver of traditional wide-area network evolution.  Business sites, after all, are widely distributed.  Business networking demands are not as content-centric, and so will change more slowly.  All that “Internet-index” stuff on growth, in other words, may end up creating very shallow network impact.

Broadband changed our world, and broadband obviously changes the network that supplies it.  The interesting thing is that broadband changes behavior, which changes the perceived value of services, which changes the specific ways we’ll consume broadband.  The result is a kind of feedback loop, and it’s going to be transformational for vendors and operators alike.  The next ten years should be very interesting.

Who Will Orchestrate the Orchestrators?

You probably know how fond I am of classic quotations.  One I’ve always particularly liked is in Latin:  “Quis custodiet ipsos custodes”, which translates freely to “who will watch the guards themselves?”  We might be heading for a knock-off of that in zero-touch automation (ZTA).  How do you like “who will orchestrate the orchestrators themselves?”

The issue here is actually related to another old saw:  “You can’t see the forest for the trees”, which reflects on the problem of tunnel vision.  Focus on details and you lose the big picture.  Logically, ZTA means you have to orchestrate every aspect of service lifecycle, which you might expect would mean a single approach with that goal.  Not so.  With ZTA, we have two powerful drivers of tree-focus, one the desire of vendors to get something out to sell quickly, and the other the tendency of standards bodies to create little enclaves of activity, one for each body.  Thus, we are likely to end up with an embarrassment of ZTA riches, in quantity but not in quality.

Despite the trends, there’s considerable question as to whether ZTA is meaningful if it’s applied only to a tiny piece of the service lifecycle process.  We already have plenty of tools that automate a few routine tasks to the point where human intervention isn’t needed.  Every data center or network operations script does that, every system that responds to any event does it too.  The problem is, and has been, that a service lifecycle is a complex sequence of things that have to be done, and to get one or two of them done without a human intervening only means the human has to figure out where the automated process left off, then pick things up.  That’s likely to be a lot of touching.

We actually sort-of-have a solution to the problem of tree-focus in ZTA, which gets back to orchestrating the orchestrators.  It’s intent modeling, if properly applied.  An intent model is a black-box abstraction of a component of something, like a service.  The intent model represents the “something’s” intent, meaning what it does.  Its inputs and outputs are defined, but the interior of the black box that contains the implementation is up to the implementer.  Any two implementations that satisfy the intent are equivalent.

What makes this important for orchestration and ZTA is that you could have a bunch of dissimilar orchestration or ZTA models living inside an intent model, and they’d be just like each other and anything else.  Tiny functional pieces could be orchestrated as “trees” while a higher-level, intent-modeled hierarchy could orchestrate the forest.
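
To see why that matters, here’s a minimal sketch (in Python, with class and method names I’ve invented purely for illustration) of how two dissimilar implementations could sit inside the same intent-model “black box” and look identical to a higher-level, forest-scale orchestrator.

```python
# Minimal sketch of the intent-model "black box": two dissimilar
# implementations satisfy the same externally visible intent, so the
# higher-level orchestrator treats them interchangeably.
# All names here are illustrative, not drawn from any standard or product.

from abc import ABC, abstractmethod

class VpnIntent(ABC):
    """The 'outside' of the black box: what a VPN element must do."""

    @abstractmethod
    def deploy(self, sites): ...

    @abstractmethod
    def status(self): ...

class MplsVpnImplementation(VpnIntent):
    """One possible 'inside': provisioning through a legacy management system."""
    def deploy(self, sites):
        print(f"Issuing MPLS provisioning orders for {sites}")
    def status(self):
        return "up"

class SdWanOverlayImplementation(VpnIntent):
    """Another 'inside': an overlay built by a completely different orchestrator."""
    def deploy(self, sites):
        print(f"Spinning up overlay tunnels for {sites}")
    def status(self):
        return "up"

def forest_orchestrator(element: VpnIntent, sites):
    # The forest-level orchestrator neither knows nor cares which tree it got.
    element.deploy(sites)
    assert element.status() == "up"

forest_orchestrator(MplsVpnImplementation(), ["HQ", "Branch-1"])
forest_orchestrator(SdWanOverlayImplementation(), ["HQ", "Branch-2"])
```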

Most of the modeling work now being done isn’t working this way, of course.  There’s a tendency to focus on what happens inside one of those functional black boxes.  That may build a bigger tree, but it still doesn’t look at the service as a holistic forest.  Perhaps the reason for this bias is that vendor and even operator attention has focused on low-apple, easily justified, automation projects rather than on creating a broadly useful/efficient approach.

Part of the issue is also jurisdictional, meaning that many standards bodies see themselves as focused on a specific task.  ETSI’s own recent ZTA efforts are, in my own view, slaved to the NFV activity that declared its own operations model out of scope.  In other words, NFV plus ZTA equals an actual solution.  Except of course that very little networking or hosting is NFV.  Those bodies that have broad missions for modeling, like the OMG, tend to be regarded as abstract pie-in-the-sky kinds of people.  Sad, given that abstraction is actually what we need to have here.

Call me a radical, a heretic, but I think the biggest problem with intent modeling is the dogma of modeling languages.  I happen to be a fan of OASIS TOSCA as a vehicle for defining intent models, but I’ve also used XML to do the same thing and it’s fine.  Intent modeling is a principle, and while there may be benefits associated with any particular modeling language over others, as long as one fulfills the requirements of expressing the “outside” of the box and identifying the implementation automation or orchestration tool used to fulfill intent on the inside, it should be fine.  We need to be looking at what the outside/inside requirements are in abstract first, and then worrying about the language to express them in.  The majority of people I’ve talked with in both vendors’ and operators’ organizations admit that they don’t see that kind of discussion happening today.

A set of inside/outside requirements, stated in plain language, could provide the basis for judging how well a given modeling language would work in conveying the necessary information.  You could do a trial expression of various services and models using candidate languages, and the result would provide a means of comparing them.  If we don’t have that plain-language start, how do we know what we’re asking a model to model?

Here’s an example.  Take a “model” of a firewall function.  We could express the interfaces as three options: Port, Trunk, and Management.  For each of these we could express the interface type (IP Data, CLI Text, SNMP, whatever), and for structured APIs like SNMP we could express the model reference for the structure.  If we liked, we could provide a Rule interface and structure what we expected a firewall rule to look like.  If we have that, we could then translate each of our “from-the-outside” interfaces to a modeling language.  We could also then say that if we had an inside rule that “realized” the function through a call to another orchestration process, we’d have an implementation of “firewall”.
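
Here’s how that “from-the-outside” description might be written down as plain data before anyone argues about TOSCA versus XML.  This is only a sketch; the field names, the model reference, and the realization handler are all made up for illustration.

```python
# A plain-data sketch of the "from-the-outside" firewall description above,
# captured before committing to TOSCA, XML, or any other modeling language.
# Every field name and value here is illustrative only.

firewall_intent = {
    "function": "firewall",
    "interfaces": {
        "Port":       {"type": "IP Data"},
        "Trunk":      {"type": "IP Data"},
        "Management": {"type": "SNMP",
                       "model_reference": "urn:example:firewall-mib"},  # hypothetical
    },
    # An optional structured interface for pushing rules into the black box.
    "Rule": {
        "fields": ["action", "protocol", "source", "destination", "port_range"],
        "example": {"action": "deny", "protocol": "tcp", "source": "any",
                    "destination": "10.0.0.0/24", "port_range": "23"},
    },
    # The "inside": whatever orchestration process realizes the intent.
    "realization": {"handler": "uCPE-deploy-chain"},  # hypothetical reference
}
```

Any modeling language that can carry this information faithfully is a candidate for expressing the model; any that can’t isn’t.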

The implementation might be a further decomposition into a set of lower-level models.  It might be the orchestration of a uCPE deployment, or an automatic order generation to ship a physical device to the user.  Thus, we could use this approach to tie together disparate orchestration/ZTA initiatives to create a complete service lifecycle automation strategy.  Orchestrating the orchestrators then becomes a path to ZTA, and also a set of requirements for modeling languages that might be helpful in ZTA overall.  Those that can’t describe this process can’t do multi-tree forests.

Another critical piece of this is the contextual orchestration of service lifecycle tasks.  Rarely does one thing not impact others in a complex service, which means that an intent model has to have “states” and “events”, and these have to be organized to coordinate activities across various pieces of the service during changes in state, due to service change orders from one side and faults and reconfigurations from the other.  The modeling language also has to be able to express states and provide a means of linking the state/event combination with a target process set.  As I’ve said many times before, this is what the TMF’s old “NGOSS Contract” approach did (a TMF insider said in a comment on another of my blogs that the TMF might revive NGOSS Contract, which was rarely implemented).
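
A minimal sketch of what that data-model-driven steering looks like in practice follows, with states, events, and process names I’ve made up for illustration.  The point is only that the model, not hard-coded logic, decides which process gets each event.

```python
# Sketch of NGOSS-Contract-style state/event steering: the service data model
# carries a state, and each (state, event) pair names the process to run and
# the state to move to.  States, events, and processes are illustrative.

from typing import Callable

def order_resources(model): print("decomposing order, activating resources")
def confirm_active(model):  print("marking service active, notifying OSS/BSS")
def reroute(model):         print("attempting automatic reconfiguration")
def escalate(model):        print("opening a trouble ticket for a human")

# (current state, event) -> (process to run, next state)
STATE_EVENT_TABLE: dict[tuple[str, str], tuple[Callable, str]] = {
    ("Ordered",    "activate"):       (order_resources, "Activating"),
    ("Activating", "resources-up"):   (confirm_active,  "Active"),
    ("Active",     "fault"):          (reroute,         "Degraded"),
    ("Degraded",   "resources-up"):   (confirm_active,  "Active"),
    ("Degraded",   "reroute-failed"): (escalate,        "Failed"),
}

def handle_event(service_model: dict, event: str):
    """Steer the event to whatever process the data model calls for."""
    process, next_state = STATE_EVENT_TABLE[(service_model["state"], event)]
    process(service_model)
    service_model["state"] = next_state

svc = {"name": "vpn-123", "state": "Ordered"}
for ev in ["activate", "resources-up", "fault", "resources-up"]:
    handle_event(svc, ev)
print(svc)  # ends in the "Active" state
```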

You can do state/event descriptions in abstract too, using text in a table.  That means that we could go a long way toward defining a good intent-based lifecycle automation system without getting jammed up in the specific modeling language.  In fact, I think it means that we should.  This whole process of intent modeling is getting jammed up in a religious war over models, delayed by a bunch of people waiting to develop modeling language approaches.  First things first; let’s decide what kind of forest we want before we start planting trees.

Is Everyone Missing the Boat on SD-WAN?

What’s in a name?  Too much, in many cases, because we tend to use a single very popular name (one with good media visibility) to cover a product space that may have little in the way of uniform features or even missions.  We clearly have that problem with SD-WAN.  Scott Raynovich did a nice piece in Fierce Telecom on why vendors and service providers “missed the boat” on SD-WAN, and it shows indirectly how the same technology name can cover a lot of missions.

The main point of the piece is that because SD-WAN is an alternative to higher-profit MPLS VPN technology (that service providers can charge for and that equipment vendors can sell more to support), fear inhibits support of SD-WAN by either class of player.  I think that’s very true, but I think it’s also true that there could be more to SD-WAN than MPLS replacement, and that’s getting overlooked.

Basic SD-WAN uses the Internet to create a small VPN among thinly connected sites, and then connects that to the company’s MPLS VPN.  Some vendors promote a total replacement of MPLS.  That’s an easy value proposition to sell—buy this and spend less, or connect more.  It’s also something that could be done using half-a-dozen classic tunneling approaches, using open-source software.  Durable SD-WAN value demands more features.
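
Here’s a toy sketch of that basic model: an overlay that reaches thinly connected sites over Internet tunnels and bridges them to the sites that already sit on the MPLS VPN.  The site names and the path-selection rule are illustrative only, not any vendor’s behavior.

```python
# Toy sketch of "basic SD-WAN": one uniform overlay VPN, with each leg carried
# by whatever underlay the two endpoints share.  Everything here is illustrative.

SITES = {
    "HQ":        {"underlays": ["mpls", "internet"]},
    "Branch-12": {"underlays": ["mpls", "internet"]},
    "Kiosk-7":   {"underlays": ["internet"]},   # too small to justify an MPLS port
}

def pick_underlay(a, b):
    """Prefer MPLS when both ends have it; otherwise tunnel over the Internet."""
    common = set(SITES[a]["underlays"]) & set(SITES[b]["underlays"])
    return "mpls" if "mpls" in common else "internet"

# The overlay presents one VPN regardless of what carries each leg.
for site in ("Branch-12", "Kiosk-7"):
    print(f"HQ <-> {site}: tunnel over {pick_underlay('HQ', site)}")
```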

The cloud, in various forms, is already driving a broader vision of SD-WAN features.  If you’re going to host something in the cloud or use a cloud-hosted SaaS, you may need to incorporate what you’re using into your VPN.  Cloud providers charge for MPLS VPN connections (some don’t even support them), and they certainly won’t let you stuff SD-WAN boxes into their data centers.  As a result, some vendors have added cloud-hosted SD-WAN agent elements to their offerings, which extend the SD-WAN into the cloud and support at least some configurations of cloud applications.

The cloud is the tip of the iceberg.  Virtualization, as the general process of creating an elastic abstraction between application hosting and actual server resources, creates a much more complicated problem than the cloud.  Here we have the notion of a component of software migrating around a pool of resources to accommodate failures or overloads.  The problem with that is that we reach applications by IP addresses that represent where something is, not what it is in a logical sense.

What I’ve called “logical networking” is the next step in SD-WAN evolution; the need to support the cloud is the “camel’s nose under the tent” here.  Inevitably, we end up with more features to support the agility of virtualization, and inevitably that leads us to logical networking.
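
To show what addressing “what it is” rather than “where it is” means in practice, here’s a minimal sketch of a logical directory that tracks a component’s current location as virtualization moves it around.  The names, addresses, and methods are placeholders of my own, not any product’s API.

```python
# Minimal sketch of "logical networking": sessions are addressed to what a
# component is, while a directory tracks where it currently happens to live.
# Names and addresses below are placeholders.

class LogicalDirectory:
    def __init__(self):
        self._where = {}  # logical name -> current network location

    def register(self, logical_name, location):
        """Called whenever the hosting layer places (or moves) a component."""
        self._where[logical_name] = location

    def resolve(self, logical_name):
        return self._where[logical_name]

directory = LogicalDirectory()
directory.register("orders-service", "10.1.4.22:8443")   # initial placement
print("connect to", directory.resolve("orders-service"))

# The scheduler relocates the component to another server or cloud...
directory.register("orders-service", "192.0.2.75:8443")

# ...and users still address the logical name, not the stale location.
print("connect to", directory.resolve("orders-service"))
```

Real logical networking would have to do this at session level and at scale, of course, but the what-versus-where separation is the heart of it.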

What these statements show is that there are really three different broad models of SD-WAN, but we really don’t recognize the differences.  In the process of consulting, blogging, and doing an SD-WAN tutorial (available at the end of July 2018), I’ve looked at about two dozen SD-WAN vendors.  What I find is that the great majority cluster around the first and most basic mission, or the second and simpler cloud-extension mission.  Very few even promise to support broader virtualization and logical networking.  And yet we talk about “SD-WAN” as though there were some specific set of features that the name implied.  There is no such thing even now.

Recently there has been some discussion about “standardizing” SD-WAN, but it’s my view that these comments suffer from the same name-overloading that the term SD-WAN itself does.  SD-WAN is a bewildering collection of naming, addressing, connection awareness, policy management, network management, security, compliance/governance, orchestration and “network as a service” (NaaS) concepts.  The “standardization” so far seems to focus on a tiny piece or two of this—common encapsulation of data within the SD-WAN, common interfacing to MPLS VPNs or the Internet, common hosting models for the cloud.  The higher-level features that will dominate the SD-WAN market down the line are simply not on the radar of the standardizing people, because these features are way out of their domain of influence.

NFV is another example of the overloading of a term creating confusion.  Much of what people think is happening in the NFV space has nothing to do with the NFV specifications.  We’re really talking about “white box CPE”, a generalized device into which we could inject network features as needed.  This, as an alternative to a bunch of specialized appliances strung out in a sequence at each service termination.  Do we need to service-chain stuff inside the same box?  Hardly.  Is the ability to do that a proof point in NFV adoption?  Hardly.  Is there therefore a fragmented notion of NFV and the market for it?  Surely.

Everyone thinks the cloud and virtualization are transformational in IT and the Internet, yet we’ve barely thought about the ramifications of virtualization in the area of agile connectivity.  We have here the unusual picture of a market rushing forward to define what might be the most fundamental requirement for optimizing our future hosting and networking model, and we’re really still stuck in bearskins and stone knives (to quote an oft-used phrase).  The problems we’ve had in overloading other technology names (like NFV) could pale into insignificance compared with what could happen here, because SD-WAN is really, vitally, important and most of the stuff that makes it that isn’t available in the offerings today.

I remember the first personal computers aimed at business users.  They were as big as a convection oven, difficult to use and develop for, and most of all lacking any clear mission or major sponsor.  What changed everything was the IBM PC, and not because it was a technical marvel (my system was the second delivered in New Jersey, had 128 kb of RAM (yes, that’s kilobytes) and two floppy drives, and was “so complex” that the local store had to get IBM specialists to help configure it).  It changed everything because it had a business mission and a credible sponsor.

IBM didn’t invent the PC, and in fact their technology at both the hardware and software level was much the same as we had already in the older business PCs.  What they provided was business air cover for buyers, and a vision of workers empowered by something on their desks rather than just something living in a distant data center tended by technological geek acolytes.  This is what the big players could have provided for the SD-WAN space.

This is where Scott’s point is right on, or perhaps even a bit understated.  The big players have missed the boat, but the boat wasn’t launched by the price war with MPLS VPNs.  You could forgive vendors for not rushing to create a network technology that ultimately undermines their revenue stream, which is all the basic model of SD-WAN could be expected to do.

Not all that it can do, though.  What both vendors and network operators have missed is the opportunity to frame SD-WAN and logical networking as the virtual network, the NaaS, of the future.  And, of course, the opportunity to set themselves up as the kingpin of that space.  So here are the questions.  Are we going to have a populist revolution here, a smaller player that rises up to strike at the mighty?  Or are we going to see some major M&A when the real requirements of the market, the next generation of SD-WAN, are realized?

An effective SD-WAN overlay based on logical networking would relegate current enterprise network technology to the role of plumbing; all the features would live up in the SD-WAN layer.  That’s where vendor fear comes in, but is some of that fear a result of taking such a short-sighted view of SD-WAN features?  If the total network is more valuable, could it not justify more spending?  If a vendor had both the underlay traditional stuff and the SD-WAN stuff, the combination could be a gain, not a loss.  If somebody else gets the SD-WAN stuff, then loss is what you’re left with.

Scott summed it up nicely: “This [the cloud and SaaS] is a strong value proposition that will continue to attract investment from enterprises to build next-generation SD-WANs. Providers of legacy WAN hardware and services should adapt as fast as possible, or they risk being left behind.”