Interpreting Some New Data on Video Consumption

I did a blog earlier about the impact of smartphones on the mobile opportunity and also the way the devices have changed our culture, particularly for the young.  Nielsen has released some interesting data on video behavior that relates to smartphone and online video use, and I want to recap what the report shows.

There’s no question that online video is exploding in popularity, but it’s proven difficult to say just what that means for video overall, meaning to predict what will happen to traditional channelized entertainment over time.  Some past surveys have suggested that people haven’t really shifted from TV, simply supplemented it, and others seem to show that people are substituting online entertainment for TV even if the TV is on.

One reason it’s difficult to get consensus on this is that the surveys often target different segments of the overall market, making their conclusions hard to reconcile.  That’s where the current Nielsen material may help.  One interesting aspect of the most recent report (for 3Q16) is the data on device use for working mothers versus at-home mothers.  The “family dimension” of TV viewing has been noted in the past: people’s dependence on TV grows once they start a family.  Women are the foundation of family life in most homes, and they are also the adult group most directly impacted by youth behavior.  If video habits socialize, there’s no easier path than from teen to mother.

Let’s start with what the survey targets can do, meaning what devices they have available to them.  Across the dozen entertainment-related devices surveyed, there was never more than a 2% difference in device use depending on whether a woman worked or stayed at home.  This, combined with other data on things like smartphone adoption overall, suggests that people’s entertainment tools are defined more by their peers than by anything else.  That’s an incredibly interesting, perhaps revolutionary fact for those who want to understand the future of online video versus channelized TV.

In the past as in the present, the Nielsen data doesn’t show any significant dip in TV viewership, though the use of online video has been rising significantly.  Their interpretation, and mine, is that what’s happening is not a shift from traditional TV viewing to online, but rather an increased use of mobile devices to deliver entertainment in video form at times when traditional TV isn’t available.  Obviously, this would play to the TV networks, stations, and marketers, but a similar view has been expressed by other research, including research from the financial industry.

So people are watching smartphones when they can’t watch TV, right?  It’s also been widely believed that TV viewing was more likely to be associated with “family time”, meaning that if you were at home with the family you likely watched TV.  Further, if you had children under the teen years, you were more likely to watch it.  On the surface, the Nielsen data threatens these assumptions because it shows that for mothers at least, being in the home versus in the workforce isn’t changing their appliance ownership levels much at all.  Thus, even if we didn’t see a dramatic difference in TV versus online today, the seeds to create one are sown by the devices both groups have available.

There’s more.  Other Nielsen data shows that the time spent per device between at-home and at-work moms isn’t as different as it has been in the past.  It’s not surprising that at-home moms spend more time with entertainment, but the difference in TV viewing is only about an hour a week (more for at-home moms) and for smartphones about the same.  Stay-at-home moms spend more time proximate to the TV and so spend more time using it in some way (broadcast viewing, DVR, game system, online-to-TV) than working mothers.

It appears, again from both the Nielsen data and other surveys, that social media is the primary driver of smartphone adoption and use.  Across the broad market, the dominant device used for social media is the smartphone, which accounts for about two-thirds of total usage.  But here again the difference in social-media usage between at-home mothers and working mothers is minimal, with at-home moms using slightly more social media in all categories except (interestingly) tablets, where working moms’ social media usage on the device was almost double that of the at-home moms.  The smartphone isn’t being driven by being at home versus at work, nor is social media interaction dependent on that metric.  Everyone, it seems, likes continuous social contact, and it’s this craving that puts the instrument of online video in everyone’s hands.

Moving to the broad market “everyone”, Nielsen shows that TV viewing, PC internet use, listening to the radio, and similar entertainment activities showed no significant shifts over the last two years.  The same is true with the use of tablets.  In contrast, smartphone use grew significantly, from a total-market average of 58 minutes per day to 2.2 hours per day in 3Q16.  This, to me, is the second very significant data point because it shows that what’s dominating changes in consumer entertainment behavior isn’t the “best” device for the experience but the best for convenience of use.  People entertain themselves with what’s with them all day, every day.
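To put that smartphone growth in proportion, here is a quick back-of-the-envelope calculation.  Only the 58-minute and 2.2-hour figures come from the report as quoted; the variable names are mine and purely illustrative.

```python
# Daily smartphone use, total-market average, as quoted in the text.
before_minutes = 58          # two years earlier
after_minutes = 2.2 * 60     # 2.2 hours per day in 3Q16, converted to minutes

# How many times larger the later figure is than the earlier one.
growth_factor = after_minutes / before_minutes

# The same change expressed as a percentage increase.
percent_increase = (growth_factor - 1) * 100

print(f"Growth factor: {growth_factor:.2f}x")
print(f"Increase: {percent_increase:.0f}%")
```

That works out to a bit more than a doubling, roughly 2.3 times the earlier figure, during a period when the other categories were essentially flat.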

What age group dominates video viewing on smartphones?  It’s not the teens according to Nielsen, it’s the up-and-coming adults, 25-34 years old.  The 35-49 age group is a close second.  This is the third very significant data point in the broadcast-TV-versus-online discussion.  It shows that it isn’t children who are most likely to step beyond traditional TV, it’s young adults.

So where does all this data lead?  First, if you combine it with overall smartphone penetration globally, it shows that if the smartphone is the instrument of reengineering behavior, then we’re already equipped.  In advanced economies, smartphone penetration equals or exceeds that of the US.  In emerging economies, where the greatest growth in smartphone adoption is found, we’re likely to get to similar penetration rates within five years.  Smartphone adoption is also fairly consistent across racial and economic divisions in most markets.  They’re here.

The second point is that full exploitation of smartphones has not yet been reached except for the teen demographic group in advanced economies.  Even in the market group that has the most teen contact (at-home moms), we don’t see the same social media or video behavior we see from that supersaturated teen segment.  However, socialization is happening.

The third point is that the greatest point of socialization in content viewing isn’t the teen group at all, but young adults.  Some data beyond Nielsen suggests that it’s actually males in the 25-34 age group, which led one Street analyst to describe it to me as being “bromance driven”, meaning male socialization.  However, I also think it may be because this segment of the market is the most physically mobile, and thus more likely to be dependent on mobile-device-delivered entertainment.

We have the instrument of online entertainment already almost universally deployed.  We have social-media interactions that can suggest viewing options and promote online video.  We have evidence of socialization of dependence on online video, not only within an age group but between age groups, through normal intergenerational contacts.  The question is whether these forces will combine to make online video the dominant media play.

Here I think the key data point is the percentage of time-shifted TV viewing.  Nielsen shows us that about 60% of viewers use time-shifted video, which says that video consumers are decisively rejecting the “what’s on?” paradigm of the past, at least some of the time.  But so far, only about 15% of viewing hours are spent on time-shifted programming, and smartphone video in the home accounts for far less.  To me, what this says is that the smartphone social-behavioral shift in video has already had its greatest impact, filling in time when normal viewing isn’t possible.  I can’t yet find convincing data to show it’s really displacing TV, but for sure it’s prevented TV viewing hours from growing.
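The gap between the 60% and 15% figures is easier to see with a little arithmetic.  This is only a sketch: it assumes (my assumption, not something the data states) that people who time-shift watch the same total hours as everyone else, and the variable names are illustrative.

```python
# Figures quoted in the text.
share_of_viewers_time_shifting = 0.60   # viewers who use time-shifted video at all
share_of_hours_time_shifted = 0.15      # portion of all viewing hours that is time-shifted

# Assumption (mine): time-shifters and non-shifters watch the same total hours,
# so the time-shifted hours are spread over just the 60% who time-shift.
time_shifted_share_for_shifters = (
    share_of_hours_time_shifted / share_of_viewers_time_shifting
)
scheduled_share_for_shifters = 1 - time_shifted_share_for_shifters

print(f"Time-shifted share of a time-shifter's viewing: {time_shifted_share_for_shifters:.0%}")
print(f"Scheduled share of a time-shifter's viewing: {scheduled_share_for_shifters:.0%}")
```

Under that assumption, even the people who do time-shift still spend about three-quarters of their viewing on scheduled programming, which is the “at least some of the time” qualifier in numeric form.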

Perhaps the most significant interpretation of all this data is that mobile video is going to be a major issue for the mobile operators.  We can already see that M&A in the media space seems to be favoring the creation of vertical partnerships between ISPs and content companies.  If video availability (and immunity from data-plan costs) becomes a major selling point for mobile services, operators are going to have to respond, and that is IMHO going to make mobile video the big target for things like 5G.

How Do SDN/NFV Vendors Divide by Category? Differently!

I blogged recently about the three categories of operators, categories divided by the operators’ thinking on the path to transformation and the role of things like SDN and NFV.  It’s only fair to do the same thing now with the network vendors, who also happen to divide into three categories.  Where the operators divide based on how they’ll successfully pursue transformation, the vendors divide based on where they’re placing their own technology bets, which interestingly isn’t necessarily where their buyers seem to be looking to invest.

Vendors, of course, are about products where operators are about ROI.  For at least two decades, vendors have supported network technology revolutions to the extent that they aided in engagement, that they opened doors for dialog.  The current technology revolutions are SDN and NFV, and so it’s easiest to classify vendors by looking at how these technologies play in their strategies.

The first vendor grouping is the formalists.  This group of vendors assumed, and largely still assumes, that SDN and NFV standards and related PoC and trial activity will drive transformation and generate sales for them.  It’s fair to say that most vendors started in this group, but at this point probably two-thirds have moved out of it because of a lack of sales success.

The problem with sales for this group is primarily due to the limited scope of the formal standards activity.  That’s particularly true for NFV, which at first limited its own activity scope so much that an end-to-end service was more out of scope than in.  The lack of scope not only makes trial and PoC work more difficult to align with deployment reality, it forecloses the standards’ support of a complete business case.  Both the SDN (ONF and ODL) and NFV groups have been working to broaden their scope, but obviously recent efforts here couldn’t pay back quickly.  Thus, 2017 isn’t going to be much better for this group.

Pace of progress is the second issue this group has faced.  Both SDN and NFV are around five years old, and we’re still struggling to get SDN out of the data center and NFV out of a little box on the premises that we call “virtual CPE”.  Vendors have quarterly sales quotas, and the big SDN/NFV formalist companies’ sales force started complaining to me about how operators were slow on the uptake in 2014, when virtually nothing had been proven about either technology.

Even those who aren’t motivated to change groups by the sales issues will likely be motivated by the shift in SDN/NFV momentum toward open-source projects, away from standards bodies.  I believe that OpenDaylight is doing more for SDN than the ONF, and that the various open-source projects related to NFV and now jockeying for positioning are going to eclipse the work of the NFV ISG.  Open source tends to undermine vendor differentiation, which means that early deals don’t have the profit margin needed to justify all the legwork needed to close them.

The second grouping is the shape-shifters.  The members of this group are formalist expats, driven out by a lack of sales success and the recent negative publicity on SDN/NFV adoption, but not committed to any specific alternative course.  The view of this group seems to be that the market has stalled temporarily, but that everything will come out right in the end.  In the meantime, keep your name in the press.

Some good concepts are being opened up by this group, despite the somewhat cynical foundation for their behavior.  CORD (Central Office Rearchitected as a Data center) is an idea I’ve blogged favorably about in the past.  It frames what a CO should look like after transformation rather than talking about narrow-scope, multiple-standards initiatives that might get us to that end-game.  That’s helpful because operators don’t like to stumble into a CO framework; they’d rather plan one.  It’s unhelpful because it still begs the question of why we’re taking the journey to CORD to begin with.

Perhaps the best outcome we’ve had so far that relates to the migration between formalism and shape-shifting is the attention that AT&T’s ECOMP is getting.  ECOMP isn’t a vendor strategy; it’s in fact a kind of anti-vendor strategy in that it came out of AT&T’s Domain 2.0 program aimed at containing vendor influence and preventing lock-in.  ECOMP is the glue that binds the D2 domains, and as such it’s a natural mediator of technology differences and evolution.  It’s ECOMP that seems to form the centerpiece of the final vendor grouping.

This group still faces two specific challenges.  One is a way to generate provable benefits in the near term, benefits needed to justify the risk that new technology adoption necessarily brings.  The other is a credible pathway to a new vision of infrastructure.  We have to jump-start SDN/NFV, and then we have to get it to a new, good, place.  Operators tell me that even this second group of vendors is still relying on “everybody’s doing it” to justify their solutions.  Everybody, clearly, is not.

The final group is the specialists.  From the first there were vendors who didn’t see themselves as providing a full-spectrum SDN/NFV or transformation solution.  They just wanted to be a part of the game, and make money on it.  In the NFV space, we have a large community of VNF providers hoping for somebody to manage and deploy them, and a bunch of NFV Infrastructure (NFVI) players hoping to host them.  These vendors were all at risk should anyone field a total NFV solution, because such a vendor wouldn’t be likely to make a place for them.  Now, with it less likely every day that any vendor will promote a full-spectrum transformation strategy, and with increased operator interest in the vendor-neutral ECOMP model, the specialization approach doesn’t look as risky.

To say that there won’t be vendors, or vendor domination, of a “neutral” ECOMP may miss the mark, though.  In fact, ECOMP opens the door for another vendor community altogether, one that’s been largely on the sidelines so far—the OSS/BSS providers.  Of the seven operators I’m aware of (beyond AT&T and those announced already), five are looking at ECOMP in no small part because of an OSS/BSS vendor’s recommendations.  One such vendor, in fact—Amdocs.

A perfect transformation storm would start with service lifecycle automation that reduces “process opex” by at least a third for relatively little investment.  That same automation could then facilitate the introduction of SDN and NFV where it made sense, and also prepare operators for new revenue models, including hosted-experience OTT models.  It would base network transformation on overlay models like SD-WAN, something to immunize services and customers from technology shifts at the infrastructure level and to separate service and resource lifecycle management convincingly to let the former evolve toward portals and software automation and the latter toward sustaining planned levels of capacity and performance.

All of this could be done below the OSS/BSS, and could in fact have been a part of NFV all along—the critical orchestration requirement (MANO) was part of NFV from almost Day One.  The issue of scope, the decision to initially exclude legacy elements and OSS/BSS, opened the door for those who wanted to solve the problems elsewhere, or who had to—like AT&T.

What may strike you about the difference between the operator and vendor categories is that operators had two groupings that were on the right track, and the truth is that vendors don’t have any.  All three vendor groups are depending on some future shift that the vendors themselves seem unwilling or unable to drive.  It’s almost like the vendors have given up, and are just singing occasionally to the media to avoid losing face.  For network vendors, that would mean accepting the status quo, which is always a tempting strategy for an incumbent.  For non-network vendors relying on SDN/NFV transformation to expand their role (especially the server vendors), it’s accepting no role at all.

It’s hard to believe that somebody isn’t going to get sensible in this mix, particularly given that the operators have two tracks that will lead eventually to real transformation, real infrastructure change.  It’s just hard to say who’s going to lead the charge from the vendor side, and that means that the opportunity is still wide open.  Takers, anyone?

“Caesar’s View” of the Network Operator Transformation Space

I remember (with little enthusiasm, frankly) my high-school Latin experience.  One fact I can still draw from those days is that “All Gaul is divided into three parts.”  Well, so are all operators.  Regulations, markets, technology and vendor commitments, and management and business goals all tend to make each operator an entity unto itself, but there’s only one market underneath it all.  This market is coalescing operators into three categories, and those categories will probably define how networking technology evolves through the next five years.

The first of these categories is the Johnny One-Notes.  We think of operators as vast businesses with a wide variation in service offerings and customer targets, but a large number of operators don’t really fit that model.  Managed service providers target business services.  Mobile operators target mobile broadband services.  The limited service targeting means that their investment and operations practices are tightly focused, and that the technology model used at the infrastructure level is contained.

If you buy a single class of device and offer the same basic service to all, you can sometimes base transformation on a specialized change in technology.  MSPs, for example, have been leaders in NFV adoption because the service-chain model of NFV has dominated NFV planning work, and because virtual CPE (vCPE) is the simplest model for virtual function hosting—a generalized box on the customer premises and not a cloud data center.

SDN and NFV specifications and the basic implementations of a controller (like OpenDaylight) or orchestration (OPEN-O or OSM) are often as much as this group of operators needs.  In truth, many of them don’t need SDN or NFV at all in a strict sense; they need an overlay concept like SD-WAN or the MEF’s Third Network and more agile CPE or some cloud-hosted (but not VNF-icized) features.

For this group, technology transformation is as much about breaking vendor monopolistic pricing power as it is about new technology strategies.  SDN/NFV vendors have taken advantage of the limited scope of operators’ needs in this category to promote early adoption, but these early steps don’t necessarily lead to any broadly useful solution.  The problem is that there are plenty of operators with broad needs who are still in this category in their own minds, still going for SDN/NFV one piece at a time with no systemic vision.  That’s never going to work.

The second category of operator is the broad-based multi-service operator, a category that gets a lot of attention because it’s where most of the Tier One operators are found.  This group of operators will almost always provide ISP services, and they have a mix of business and consumer customers, mobile and wireline.  Their capex drives the equipment market and determines the financial results of the major equipment vendors.

Operators in this category are facing the profit-per-bit compression I’ve blogged about multiple times, and the fact is that neither SDN nor NFV (separately or in combination) has proved to be effective in alleviating the problem.  As a result, they continue to put pressure on their capital budgets (they’ll do so again in 2017), and look for relief from their problem in technology changes.

Where those changes could come from is their big question.  This group drove both SDN and NFV, but the focus of both these activities was fairly narrow from the first and the pace of progress was slow.  As a result, the strict boundaries of the standards-based activities and the solutions derived from those activities proved too little, too late.  This group of operators has therefore been looking beyond the standards, to open-source activities and in particular to AT&T’s ECOMP.

The big issue for this group is opex control, and the problem is that while ECOMP has a broader scope than SDN or NFV and the standards-specific open-source projects, it still may not offer enough operations integration without customization.  It also may induce operators to look at a broad SDN/NFV/automation advance in synchrony, when for most operators prioritizing software automation of the service lifecycle and leaving infrastructure alone would offer a better return and less risk.

There’s a lot of work being done with ECOMP, though, and especially in OSS/BSS integration.  Right now, then, this operator community has the best chance of progress in 2017.

Cost management isn’t the final answer, even for our middle group.  For some, it’s not even sufficient in the near term, and for that reason our last group is the revenue aspirants.  For this group, the key question is what technology changes could drive significant revenue gains and take the pressure off cost management for infrastructure and perhaps even operations.

Everyone loves revenue best, but revenue gains are the most problematic of all the profit-control measures.  The biggest reason for that is that operators have tended to look for revenue gains from traditional connection services, through things like tweaking elasticity features or perhaps adding in managed service features.  They not only miss the best revenue opportunities this way, they focus on “opportunities in name only”.

Consumers want the lowest possible cost, period.  There’s essentially nothing that you can trade against cost for wireline broadband, and for wireless broadband the only thing that could possibly match service cost is geographic coverage.  Even that doesn’t matter for most users as long as you get the home area covered.  In any event, at least unless and until neutrality rules change, there aren’t any features you can charge for.

I’ve surveyed enterprises for literally decades, and I can tell you that even for enterprises, there is no connection service feature you could offer that cost doesn’t trump.  Put another way, enterprises would be happy to buy an elastic-bandwidth service if at the end of the year it cost them less than traditional non-elastic services did.  They’d be happy to get managed firewall and other connection-layer services too, as long as it didn’t cost them much of anything over their current on-prem product-based solutions.  Except for the managed service prospects whose issue is less monthly service cost than sustaining their own operations team to support products (which is really another revenue opportunity completely) connection-service revenue gains are a myth.

It’s harder to say how credible the “managed service” or “vCPE” model is.  Yes, it is true that users don’t want to spend a lot of money on network equipment.  They also don’t want to spend money on network services.  Enterprises typically have a professional networking staff that can support CPE, and in any case, you could argue that most of the “support” of things like a firewall lies in the realm of configuring the feature to admit and block the right stuff, which the user has to do pretty much whatever the ownership model of the “firewall” happens to be.

The real revenue opportunities for operators are just where operators kind of knew they’d be all along, where the OTTs have been working.  That means things that are hosted experiences, mediated experiences, contextual experiences, personalization, etc.  Many of the operators, and more of their people, are uncomfortable with a shift like that.  Perhaps that’s why relatively few have been pushing for technology changes to support the OTT-like space.  Some, though, have started to do just that.

The area that’s getting the most current attention is the application services space, meaning cloud services that deliver an application experience.  It’s not that this offers the best opportunity, but that cloud deployment is a small jump from NFV (see my blog of January 5th if you want to hear more on that).  The biggest actual opportunities are in the personalization and socialization of the video experience, IoT and contextual features for social networking and worker productivity, and optimization of mobile services for social, personal, and geographic features.  All of those require some vision of what the service is, which is why they lag cloud applications in near-term interest.  Operators still don’t see themselves as market evangelists.

The lesson I think should be learned from these operator categories is that nobody is really looking at transformation in a holistic way.  If you focus on a narrow model of SDN and NFV as the first category does, you’re risking either a failure of your business case for lack of sufficient benefits, or a new set of silos as you chase other narrow areas to add to your mix.  If you focus on the opex approach, you’ll embrace cost management but limit or eliminate revenue growth, and if you focus on revenue-generating new services you’ll risk suffocating them in inefficient operations.

We can’t transform three ways, even if there are three motivational groups singing their own separate songs.  The network of the future eventually serves the market of the future, and that market is unifying, not dividing.

The Foundation of the Biggest Transformation Revenue Opportunity of All

The future of networking, like the future of practically everything else, is largely dictated by return on investment, meaning the balance of costs and revenues for all the players.  I think 2017 is going to be a pivotal year, in no small part because network operators have predicted this will be the year where their revenue/cost-per-bit curves cross over.  Probably it won’t happen as they predicted—exactly—because they’ve cut costs more than usual, but the reckoning is coming.

I’ve talked a lot about the cost side of the picture, about the fact that opex reduction is the only credible strategy on the table.  I’ve also noted that the service automation steps that would be available for opex reduction could also improve the operators’ ability to earn new revenues.  However, despite all the interest in things like elastic bandwidth and managed services, the big revenue opportunities have to come not by using technical features to change the pricing model, but by doing new stuff that operators can charge for.  Since supply doesn’t invent demand but only exploits it, we have to look at the factors that will influence demand in 2017 and beyond to get a picture of revenue trends.  That’s what I’m going to do in this series of blogs.

I’ll start with the biggest demand factor of all, the personalization of broadband.  Consumer broadband and corporate networking have historically been about networking sites or fixed locations.  Workers and consumers alike have come to their networked devices to perform functions, and this doesn’t “personalize” the service, it “devicizes” the people.  Mobile broadband changes all of that, by letting us carry a connection with us everywhere.  The impact that’s created on both networking and human behavior is profound, more so than the impact of “fixed” Internet services.

Social media has been the biggest single factor in broadband personalization.  A smartphone and broadband connection lets people communicate almost as well as if they were face to face, and the result of this has been an explosion in social exchanges.  It’s made every friend into an almost-constant companion, and every acquaintance into a friend.

Among the “youth” (under 35) category, social-media use has hit the 90% level among Internet users.  As these people age, they carry most of their social-media habits into the older categories, and people currently in those categories are “socialized” by contact with youth, spreading the impact.  By the end of 2017 my model says we’ll hit 80% social-media use across the entire base of Internet users.  Among smartphone users, social-media use by youth is 100% within statistical limits, and 86% of the total population of Internet users.

You don’t turn your back on a companion or friend easily, so people don’t surrender social-media access willingly.  About half of smartphone users say they use social media at mealtime (even in restaurants), at events, and even while “watching TV”.  Youth, and parents of teen children, use it under more conditions than the rest of the market, but the data clearly shows that even in the over-65 category, social-media use is growing quickly.

If you establish your personal “society” through your mobile device and don’t want to be parted from it, then you also turn to the device more often because it’s familiar.  We can see this in content consumption.  Music is more likely heard on a smartphone than on a home stereo, and smartphones are the preferred means of viewing video among the young.  This has happened in no small part because of the new social behavior pattern that’s been generated.  You have to have your smartphone in hand to stay in touch, so it makes sense to view/listen using it.

What this means from a service opportunity perspective should be clear.  There is nothing much you can shoot for in terms of incremental revenue opportunity that’s not dependent on social broadband behavior.  Personalization and socialization are everything, even today.  Think about what it will be like in ten years, and ten years is what operators should be looking at in terms of their planning horizon.

Personalization has two primary elements.  First, it means relating to the user in a natural way.  We already know that means speech recognition and speech generation, but we need a mobile device to sound the way we want.  Second, it means understanding the context of the user, because smartphones are not phones, they’re virtual people.

We’re making significant advances in computer speech, as you can see just by looking at how personal agents (Siri, “Hey, Google”, Alexa, and so forth) are evolving beyond basic commands toward being almost conversational.  It’s not HAL from “2001” yet, but we’re clearly heading in that direction.  I did an impromptu survey of “youth”, asking whether they would like their phone to be able to carry on a basic conversation when someone called that they didn’t want to talk to.  Almost half were ready for this even today, and nearly all thought it would be a natural development within five years.

Contextualization is more complicated because it involves going beyond conversational commands to keeping a conversational context, and because it extends context from voice to location, even sight.  We are highly reliant on context, and if you look at the social conversations supported over a device link, you see that one of the major limitations is the lack of a shared context.

In terms of vision, it’s easy to find examples.  I see something, but you’re not in the same place and so I have to “show” you by panning my phone (I had a conversation like that just recently).  I expect to see integration of forward and rear-facing video to move toward creating a visually sharable context—if you keep one camera facing you the other is facing toward what you see if you look over/around the phone.

The ultimate in visual context would have to come from a headset or some camera (generally) aimed where our eyes are focused.  That lets a device “see” where we’re looking, and if that vision is combined with a geographic position we could interpret what the objects in view were.  “What’s that tall building?” would then make sense as a question.

Conversational context is also something we see regularly.  How many times do we get jammed up because one person in a conversation has changed topics and the other doesn’t catch the shift?  We could ease toward our goal by having a device “remember” past commands and notifications, then assume there was a relationship between those and the current situation.  If a notice pops up, the next thing I do is likely related to it.  If I’ve said to turn on a specific light, then “turn off” probably refers to that same light.
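A minimal sketch of that "remember and assume a relationship" idea might look like the following.  The class and method names are hypothetical, and real agents would need far richer context than a single remembered target; this only shows the light-switch case.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConversationContext:
    """Remembers the most recent subject so later commands can omit it."""
    last_target: Optional[str] = None
    history: list = field(default_factory=list)

    def note_event(self, target: str) -> None:
        # A command or notification makes its subject the presumptive
        # target of whatever the user says next.
        self.last_target = target
        self.history.append(target)

    def resolve(self, action: str, target: Optional[str] = None) -> tuple:
        # An explicit target wins; otherwise fall back to the remembered one.
        if target is None:
            target = self.last_target
        if target is None:
            raise ValueError(f"no context to interpret {action!r}")
        self.note_event(target)
        return (action, target)


ctx = ConversationContext()
first = ctx.resolve("turn on", "kitchen light")
second = ctx.resolve("turn off")  # no target given, so context supplies it
```

Here `second` resolves to the kitchen light because nothing else has intervened; a notification calling `note_event` in between would shift the presumed target.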

Geographic context is simple and not-simple at the same time.  Since nearly all smartphones have GPS, we know where a user is on the earth, and with a little more processing we can figure out what’s nearby in terms of major features or even happenings.  I gave an example of a “What’s that tall building?” query, one that links vision and geographic location.  You could extend this by adding in notifications of specific events, perhaps from sensors (IoT) or even crowdsourced.  Now we could perhaps ask “What’s going on up there?” and have a process interpret a collection of smartphone users at a given point and the nearby facilities, and get an answer “Starbucks is giving away a free latte a block ahead on the right!”
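The "what's nearby" step can be sketched with a great-circle distance check against a list of known features.  The feature names and coordinates below are made up for illustration, and the sketch ignores the direction the user is looking, which a real "What's that tall building?" answer would also need.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    r = 6371.0  # mean Earth radius
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def nearest_feature(lat, lon, features, max_km=1.0):
    """Return the name of the closest feature within max_km, or None."""
    candidates = [
        (haversine_km(lat, lon, f_lat, f_lon), name)
        for name, f_lat, f_lon in features
    ]
    in_range = [c for c in candidates if c[0] <= max_km]
    return min(in_range)[1] if in_range else None


# Hypothetical features: (name, latitude, longitude)
FEATURES = [
    ("Tower A", 37.7790, -122.4150),
    ("Cafe B", 37.7800, -122.4100),
]
answer = nearest_feature(37.7785, -122.4148, FEATURES)
```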

Context leads to our other track to revenue—socialization.  If we assumed that the majority of our relationships were mediated in part through our devices, we could assume that a service could know about those relationships and use that knowledge to add relevance to its capabilities.  The ability to link a notice to a verbal comment is the simplest example; an SMS or email or call is a specific stimulus that could justify a presumptive link to the next verbal command.  Socialization can run deeper though; call or message handling can be made to “learn” who gets priority, and regular interactions show a deeper relationship than occasional ones.

Socialization, like geographic context, has to be mapped in a sense.  Everyone is a member of a number of “groups”, each of which establishes a social context.  Most of us have, for example, a “family” group and an “office” group.  We probably also have a number of “friend” groups, some that include people we regularly see, some perhaps divided by gender or age or a common interest.  You could consider social mapping as a three-step process: first, identify the groups by mapping regular interactions; second, place the device owner in a group based on current behavior; and finally, use the group data to make social-context decisions.
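The three steps could be sketched roughly as follows.  Everything here is an assumption for illustration: the function names, the idea that each interaction already carries a group label, and the simple "most overlap wins" rule for placing the owner in a group.

```python
from collections import Counter, defaultdict


def identify_groups(interactions, min_count=2):
    """Step 1: map regular interactions into groups.

    `interactions` is a list of (contact, label) pairs, where `label` is a
    hypothetical tag such as "family" or "office".  Contacts seen fewer
    than `min_count` times under a label count as occasional and are skipped.
    """
    counts = Counter(interactions)
    groups = defaultdict(set)
    for (contact, label), n in counts.items():
        if n >= min_count:
            groups[label].add(contact)
    return dict(groups)


def current_group(groups, recent_contacts):
    """Step 2: place the owner in the group that best matches recent behavior."""
    recent = set(recent_contacts)
    overlap = {label: len(members & recent) for label, members in groups.items()}
    best = max(overlap, key=overlap.get, default=None)
    return best if best is not None and overlap[best] > 0 else None


def should_interrupt(groups, active_group, caller):
    """Step 3: one social-context decision, whether a caller gets priority."""
    return active_group is not None and caller in groups.get(active_group, set())


log = [("ann", "family"), ("ann", "family"), ("bob", "office"),
       ("bob", "office"), ("cy", "office"), ("cy", "office"), ("dee", "office")]
groups = identify_groups(log)
active = current_group(groups, ["bob", "cy"])
```

With this log, the owner is placed in the "office" group, so a call from "bob" would get priority while one from "ann" would not.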

The importance of socialization, combined with the fact that both social relationships and geographic context are based on “maps”, raises the question of whether we should be looking at geographic and social contexts in terms of groups.  Group membership, after all, could be defined simply as sharing a geography—“the people on Market Street” is an example.  This could be very important because it could be a way of dealing with other “contexts” beyond the social, and doing so more efficiently than a one-off by user.  If we can develop the context for an area of a street, or for a shopping center, and represent it as a group a user joins by moving there, we simplify contextual analysis.  After all, anyone looking up from Market and 8th would see the same thing.

Social and group management were features of the original ExperiaSphere project, under the topic of the social-management tools called SocioPath™, and there are presentations on the concept HERE.  The details of the approach are too much to present in a blog, but if you’d like me to blog on the concepts at a high level, please comment to that effect on LinkedIn.

It’s my view that the socialization, personalization, and contextualization of broadband services is the secret sauce for future revenue gains—both by operators and OTTs.  The fact that these services are available to both camps means that neither can just sit back and hope the other will ignore the future revenues that could be gained.  However, we all know that OTTs have been eager to exploit novelty and operators always fall into the trap of believing that “new revenues” are new models of old revenues.  How many operators will make the jump?  The answer to that may determine how many optimize their future profits.

Addressing the “Other Half” of Opex Reduction

One of the most important elements in service provider transformation is hardly ever talked about.  It’s the portal that would provide users and operator personnel with access to information.  Since service automation is the critical goal of transformation, and since there clearly has to be a different approach when communicating with an automated lifecycle process, the portal is critical to the business case for transformation, including SDN and NFV.  What is a portal, and how should it work?

In a simple sense, a portal (in network management terms) is a GUI that provides a controllable view of services, resources, or both.  Not everyone accepts this general vision; the media has tended to describe a portal as a means for customer (or customer-service-rep) interaction.  Operators who think the portal concept through take a broader view, and that’s the one that I’ll take here.

Management systems offer APIs that let applications connect, and popular APIs (like SNMP) have viewers associated with them.  Portals are generally more like application front-ends found in business applications, meaning that they are designed to get (and often put) information through APIs but to allow for considerable customization in terms of what the user interaction and even the user device should be.  We see this approach taken regularly in supporting mobile devices (BYOD).

The front-end simile here is useful because it lets us take a step toward a functional model of a portal.  Let’s say that there is, for any given portal, a data element directory.  This directory lists all of the data elements (fields, if you like) that are available and the identity of the API that provides that field.  The data element directory for a portal is derived in a logical sense from a master data element directory that shows all possible elements, and the derivation depends on the nature of the portal and perhaps on the credentials of the user.

From an implementation sense, I’d suggest that the data element directory for a portal is a kind of local wish list that’s associated with the portal instance.  The portal instance connects to the management system through an API, and that API uses the portal information (including user credentials) to “shop” the variables.  Anything the user asks for and the credentials don’t support is simply filled with the null value.  This back-end piece of the portal implementation is also responsible for deciding if the portal/user is allowed to change any of the values, and perhaps defines simple edits on the new values.
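To make the "wish list" idea concrete, here is a minimal sketch.  The directory entries, role names, API names, and canned values are all hypothetical; the only behavior taken from the description above is that anything the credentials don't support comes back as a null value.

```python
# Master directory: element name -> (API that supplies it, minimum role to read it)
MASTER_DIRECTORY = {
    "service.sla_state": ("orchestration_api", "customer"),
    "service.monthly_fee": ("billing_api", "customer"),
    "resource.host_load": ("nms_api", "operations"),
}

# Hypothetical credential levels, lowest to highest.
ROLE_RANK = {"customer": 0, "csr": 1, "operations": 2}


def fetch(api_name, element):
    """Stand-in for a real management API call; returns canned values."""
    canned = {
        "service.sla_state": "met",
        "service.monthly_fee": 99.0,
        "resource.host_load": 0.72,
    }
    return canned[element]


def shop(wish_list, role):
    """Fill a portal's wish list, using None where credentials fall short."""
    result = {}
    for element in wish_list:
        entry = MASTER_DIRECTORY.get(element)
        if entry is None or ROLE_RANK[role] < ROLE_RANK[entry[1]]:
            result[element] = None  # unknown or not permitted
        else:
            result[element] = fetch(entry[0], element)
    return result


customer_view = shop(["service.sla_state", "resource.host_load"], "customer")
ops_view = shop(["service.sla_state", "resource.host_load"], "operations")
```

The same wish list produces a partial view for a customer and a full view for operations, which is the point of deriving each portal's directory from the master one.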

If application or mobile-backend-as-a-service (MBaaS) tools and concepts could build the GUI of transformation portals, what do you do for the rest?  These tools also offer a general model of application interfaces through APIs or through a service bus, and the service bus approach is probably the most general solution as the next step.

In a service-bus approach, the portals drop work on the bus and publish-and-subscribe or transaction routing processes then direct the work to the proper process elements.  In some products, you can do some transformation on the data structures when you push something to the bus or pop something off.  The model then would be to link the back-end process APIs to the bus through a small custom component that would be written based on the requirements of the service bus and the API, but would be simple in nearly all cases.
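A toy in-process version of that arrangement is sketched below.  It is not modeled on any particular service-bus product; the class, topic name, and the back-end function are hypothetical, and the "small custom component" is reduced to a single field-renaming function.

```python
from collections import defaultdict


class ServiceBus:
    """Minimal publish-and-subscribe bus with an optional transform per subscriber."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler, transform=None):
        # `transform` models the data-structure conversion some products
        # allow when work is popped off the bus.
        self._subscribers[topic].append((handler, transform))

    def publish(self, topic, message):
        results = []
        for handler, transform in self._subscribers[topic]:
            payload = transform(message) if transform else message
            results.append(handler(payload))
        return results


def oss_create_order(order):
    """Hypothetical back-end process API exposed by an OSS/BSS."""
    return f"order accepted for {order['customer_id']}"


def portal_to_oss(message):
    """Small custom adapter: rename portal fields to what the back end expects."""
    return {"customer_id": message["user"], "item": message["service"]}


bus = ServiceBus()
bus.subscribe("service.order", oss_create_order, transform=portal_to_oss)
replies = bus.publish("service.order", {"user": "acme", "service": "vpn"})
```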

There are two sources of those back-end-process APIs.  One is the exposed API set from the OSS/BSS/NMS framework, and the other is whatever would be exposed by the SDN/NFV orchestration process, including (and perhaps most importantly) the modeling process.

The easiest way to expose SDN/NFV/orchestration/modeling information would be to follow the approach that was first proposed in the IETF (in the now-expired i2aex submission) and later carried into my CloudNFV and ExperiaSphere work.  All management information from any “prime” or “analytics-derived” source is pushed into a time-stamped repository accessed by query.  A generalized repository viewer element then runs a query against the repository to derive results.  That means the data available to the viewer is based on the query, and that by constructing a proper query you could view anything from the current status of the SLA in a single service model element to the state of resources overall.
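As a rough illustration of the query-driven viewer, the sketch below uses an in-memory SQLite table as the time-stamped repository.  The schema, element names, and sample values are assumptions made for the example, not anything defined by i2aex, CloudNFV, or ExperiaSphere.

```python
import sqlite3

repo = sqlite3.connect(":memory:")
repo.execute(
    "CREATE TABLE mgmt (ts INTEGER, source TEXT, element TEXT, metric TEXT, value REAL)"
)

# Both "prime" and "analytics-derived" data land in the same repository.
records = [
    (100, "prime", "vpn-42/access", "latency_ms", 12.0),
    (100, "prime", "host-7", "cpu_pct", 55.0),
    (160, "prime", "vpn-42/access", "latency_ms", 31.0),
    (160, "analytics", "host-7", "cpu_pct", 90.0),
]
repo.executemany("INSERT INTO mgmt VALUES (?, ?, ?, ?, ?)", records)


def view(query, params=()):
    """Generalized viewer: what you see is defined entirely by the query."""
    return repo.execute(query, params).fetchall()


# Latest value of every metric for elements matching a name pattern.
LATEST = """
SELECT element, metric, value FROM mgmt AS m
WHERE element LIKE ?
  AND ts = (SELECT MAX(ts) FROM mgmt
            WHERE element = m.element AND metric = m.metric)
ORDER BY element
"""

sla_view = view(LATEST, ("vpn-42/%",))    # one service model element
resource_view = view(LATEST, ("host-%",))  # resources overall
```

The viewer code never changes; only the query and its parameters decide whether the result is a single service element's status or a resource-wide picture.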

It’s my view that the need to provide a useful and agile portal is yet another argument for the repository model of management data.  I see no way of avoiding storing all management data in a repository, if you intend to use analytics to review capacity plans or to assess evolving problems.  If you’re storing it anyway, then why not generalize all service and resource management to build off the query-driven viewer approach?  You could then create a portal to show anything you wanted to show, and you could generalize the portal approach from the limited customer-and-CSR mission to the broader mission of supporting any operations task undertaken anywhere.

There’s even more here.  Many of the operations tasks, including billing, can be visualized as queries off our common repository, creating what is in effect a whole series of databases that are derived from a common repository, one that’s shared by service lifecycle management.  The elusive goal of total integration of management and operations might then be on its way to being achieved.

That would be good, because achieving the unification goal is probably critical to fully exploiting the operations efficiency benefits of SDN, NFV, and OSS/BSS modernization.  Service lifecycle management automation doesn’t address all the cost issues (it addresses perhaps half the addressable cost reductions).  To address the rest, you need a very flexible portal approach, and you need to have both service automation and portals working off the same page, data-wise.  We could do that by basing service automation on a management repository, and basing portals on data queries from that same repository.

The net here is that operations efficiency has to automate the service lifecycle process, but also transform the relationship between customers, customer service, operations, and the management and operations tools.  Otherwise we move into the future with only part of the opex challenge resolved.  Portals, meaning query-driven portals into a common management/status repository, can be the way to approach that second half of the opex challenge, and at the same time harmonize everything to a common repository.  What more could you ask?

How do NFV and Cloud Computing Services Fit?

Virtual functions and NFV are all about virtualizing network features, right?  Well, perhaps not.  There is increased interest in looking at application components as virtual functions, and this trend might be critical in justifying the carrier cloud, and even in supporting NFV’s narrower goal.  It might not be too much of a stretch for current thinking, but it does raise some significant points about how NFV and carrier cloud could or should develop.  By doing so, it may also help identify revenue-generating paths for NFV down the line.

The explicit goal of NFV is to host what were previously “physical network functions” on virtualized servers, meaning in the cloud.  While there is no specific requirement for the PNF limitations implied in the material, the great majority of NFV work has focused on hosting data-plane elements, meaning things that sit on the data path and in some way forward packets.  This includes things like firewalls, and the application is the essence of the whole “service chaining” requirement in NFV, which lays out an ordered path through a sequence of functions.  Applications in the cloud, in contrast, are destinations and not part of a data flow (at least not today).

NFV also presumes single-tenant/single-service deployments of functions.  There is no specific provision to deploy something for reuse by others, though some vendors (Metaswitch, notably) have defined virtual functions that are not all data-plane, and that could also include multi-tenant features with some diddling.  Cloud applications, of course, are usually shared among at least a group of workers.

These points are important if you consider the idea of broadening NFV to include cloud applications.  Yes, there are surely single-user applications—desktop or device applications—but the majority of what users would call “applications” are inherently multi-tenant.  They also show that while NFV is conceptually a superset of the cloud, designed to exploit the cloud’s capabilities, the target applications of NFV create a subset of the cloud’s range.  The cloud, for example, is capable of deploying both data-plane and other functions, and of deploying single- or multi-tenant applications.

The distinction between “cloud” and “NFV” is created not by the resources but by the deployment and lifecycle management models.  Those are created by the software layer—NFV MANO, for example, or a cloud DevOps tool.  If you assume that operators have deployed “carrier cloud”, meaning a general set of cloud computing resource pools, then in theory you could host both virtual functions and cloud applications on them as the classic ships in the night.  If operators want to integrate them somehow, it could get more complicated, but before we get to the complications and remedies, let’s look at the “Why?”

There are three broad reasons why an operator would want to integrate cloud applications and NFV services:  1) NFV services were to be used by the cloud applications, to connect with users or resources, 2) the operator wanted to utilize the same capabilities for failover, scaling, etc. for NFV and for cloud applications, or 3) the operator wanted to use NFV orchestration and management for cloud applications as well.

The first case is probably the simplest.  Clearly it would be possible to connect an NFV service to a cloud application if we assumed that the cloud application was represented by a static address and didn’t move around.  The problem arises in even a simple case of cloud failover; the application is moved, and so the NFV connection also has to move, or rather has to be rerouted to the new location.  You could handle this in NFV by treating the cloud failover as an endpoint failure, a kind of remove-plus-add process if the NFV service didn’t have an explicit “change-endpoint” capability.
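The remove-plus-add fallback can be sketched in a few lines.  The `NfvService` class and its methods are invented for illustration and don't correspond to any NFV specification or product API.

```python
class NfvService:
    """Toy connection service that tracks a set of endpoint addresses."""

    def __init__(self, endpoints):
        self.endpoints = set(endpoints)

    def remove_endpoint(self, address):
        self.endpoints.discard(address)

    def add_endpoint(self, address):
        self.endpoints.add(address)


def on_cloud_failover(service, old_address, new_address):
    """Reroute the service after the cloud moves an application."""
    # Prefer an explicit change-endpoint capability if the service has one...
    change = getattr(service, "change_endpoint", None)
    if callable(change):
        change(old_address, new_address)
    else:
        # ...otherwise treat the move as an endpoint failure: remove, then add.
        service.remove_endpoint(old_address)
        service.add_endpoint(new_address)


svc = NfvService({"10.0.0.5", "10.0.1.9"})
on_cloud_failover(svc, "10.0.0.5", "10.0.2.7")
```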

The second case is a can of worms at three levels.  First, if you were to be scaling or failing over an application component, the decision might be made either at the cloud/application level or at the NFV level.  You can’t let the processes collide.  Second, it’s hard to see how something like the horizontal scaling of an application service could be done without some specific coordination between the application and feature/network deployments and changes, on a more dynamic basis.  Third (and worst), it might well be that the “best” option for scaling or moving something because of a cloud condition would have to be made in part based on network considerations.

It’s possible to imagine event-driven coordination across both NFV and cloud elements.  If you had a policy management system to control where things were deployed, it could be possible to use that system to position the application and functional assets optimally in combination, and then communicate the results to both cloud and NFV tools for the deployment and/or redeployment steps.

You can see how this sort of back-and-forth between two independent deployment frameworks could get complicated, and it could be worse than just complicated if both systems had jurisdiction over overlapping parts of the resource pool—and they do.

If you’re going to share hosts and data center networks between cloud applications and virtual functions, then keeping deployment systems separate for the two invites collision even if the two systems aren’t working on the same problem or deployment at the same time.  You can’t have two chefs cutting up the same chicken.  If you assumed that all the shared resources were really controlled by the same singular underlying control element, used by both the cloud and NFV, it might work.  Think about OpenDaylight or even OpenStack.  However, the performance of such an arrangement might be problematic, and those singular control elements are also single points of failure.

OK, so if it’s difficult to imagine how the cloud and NFV would be able to coordinate even to the point of sharing a resource pool, then why not use NFV to deploy applications or their components?  In theory, NFV could provide enhanced features and capabilities to application components, and certainly it could be extremely valuable in carrier applications (which I firmly believe are coming, and in significant numbers) that blend traditional cloud application features and network services.  But can an application component be a VNF?

Sort of, but perhaps the biggest question for NFV will be what application components as VNFs might expose in the way of new requirements.

The biggest issue in deploying cloud applications with NFV is that of multi-tenancy.  Could you use NFV, today and with little or no modification, to deploy an application?  You can in at least some cases, as Metaswitch has proved.  If you look at an application in the cloud, the most likely model is one of an “elastic subnet”, an IP subnetwork that’s distributed ad hoc across the cloud as components scale and move.  This subnet is gatewayed to a VPN or to the Internet.  This is also the kind of model that some NFV implementations of network features (IMS/EPC, for example) would likely follow, but the focus of NFV has been on that service-chain vCPE model.  For things like scaling, you’d scale within this subnet and it could in theory be controlled by an internal element—part of the VNF in NFV terms.

Inside the subnet, you don’t chain services, you simply provide mutual access (Metaswitch’s open IMS implementation is an example).  However, the nature of the interaction between components has to be kept in mind when deploying and redeploying, lest you move two things apart that really need to be kept close.  If you do this, and it’s possible in at least some NFV implementations, then you could deploy the component hosting side of your subnet.  Which means, then, that you could use NFV to deploy cloud elements, and you could provide some VNF-specific features to exercise things like horizontal scaling where required.  The rules for this would have to be defined, though.

Outside the subnet, things could be more complicated.  In an application deployment, there is always that multi-tenant theme; you are really setting up an application gateway point (or points) that provide access to a networked community.  Network subnet meets networked community; that’s actually an OpenStack Neutron model, but it’s not necessarily an NFV model because we really haven’t defined specific models for NFV other than (implicitly) the service chaining model.

The key element to make this kind of integration is the gateway, a portal between an application/service subnet and the access network to which users are connected.  Gateways are also explicitly required for federated services, and they are inherently multi-tenant so you need to think about what deploying a gateway means in NFV.  Not every service gets its own, but how do you know whether there is one, and if so where it is?  The essential presumption so far is that gateways in IP networks are simply BGP transitions that are opaque to services, but that may not be true, and we need to think about how NFV would, for example, change BGP lists.

Management is the next issue.  I’ve said many times before that NFV’s VNF management process is broken, and the introduction of applications/components as VNFs would only exacerbate current issues.  In a nutshell, the problem is that adapting a virtual function for management is almost certainly a one-off task.  I’ve recommended that the approach be changed to create a standard VNFM API with “plugins” that would adapt that standard API to the requirements of a given VNF.  This would be workable for application components, but it would be more difficult.
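One way to picture the standard-API-plus-plugins approach is sketched below.  The interface, the firewall plugin, and its key/value "native" interface are all hypothetical; nothing here is taken from the ETSI NFV specifications.

```python
from abc import ABC, abstractmethod


class VnfmPlugin(ABC):
    """The standard management API every plugin must present."""

    @abstractmethod
    def get_status(self) -> dict: ...

    @abstractmethod
    def set_parameter(self, name: str, value) -> None: ...


class FirewallPlugin(VnfmPlugin):
    """Adapts a hypothetical firewall VNF whose native interface is key/value text."""

    def __init__(self):
        self._native = {"fw.state": "up", "fw.max_sessions": "1000"}

    def get_status(self) -> dict:
        return {"operational": self._native["fw.state"] == "up"}

    def set_parameter(self, name: str, value) -> None:
        self._native[f"fw.{name}"] = str(value)


class Vnfm:
    """Manager that only ever speaks the standard API."""

    def __init__(self):
        self._plugins = {}

    def register(self, vnf_id: str, plugin: VnfmPlugin) -> None:
        self._plugins[vnf_id] = plugin

    def status(self, vnf_id: str) -> dict:
        return self._plugins[vnf_id].get_status()

    def configure(self, vnf_id: str, name: str, value) -> None:
        self._plugins[vnf_id].set_parameter(name, value)


vnfm = Vnfm()
fw = FirewallPlugin()
vnfm.register("fw-1", fw)
vnfm.configure("fw-1", "max_sessions", 2000)
fw_status = vnfm.status("fw-1")
```

The one-off work moves into the plugin, so the manager itself doesn't change per VNF.  An application component with several management interfaces would need a plugin that fans out to all of them, which is where the extra difficulty shows up.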

Most VNFs, as representations of former network devices, would likely have very few (usually only one) management interfaces.  Applications often not only have multiple management interfaces, but have in-line (meaning from their data flow) mechanisms and even back-end processes.  Can you “add a user” to an application by making a network connection and not changing the application’s own access control database?

This problem is totally solvable, and with the same approach to a management plug-in.  If we assumed that all configuration data, parameterization, and management variables were collected in a repository (as the now-expired i2aex proposal within the IETF mandated), and if we further assumed a set of management agents that queried this repository (or updated it), and linked to a stub that accessed the management interface of the component, you could deploy and redeploy applications.

Perhaps the biggest question is whether any or all of this should really be “done in NFV”.  Remember that we have the cloud and DevOps tools already, widely adopted for this mission.  Should we be thinking about building a specific relationship between cloud DevOps and NFV “servops?”  Probably.  The most attractive way to integrate applications and virtual functions could well be in orchestration, and TOSCA can orchestrate cloud deployment (that’s what it was designed for, after all).  That wouldn’t totally resolve the possibility of resource collision between application and virtual function deployment, nor would it totally resolve management coordination, but it’s a first step, and likely a very important one.

Application integration with NFV opens a lot of issues, and the Metaswitch example illustrates that the cloud issues raised aren’t unique to cloud computing.  Many VNFs, including IMS/EPC and CDNs, are multi-tenant and look more like cloud/subnet applications.  Addressing these issues in NFV, or addressing application integration, could broaden NFV’s scope and make it more valuable in the early services that will drive carrier cloud.

What to Expect from the Networking Industry in 2017

I almost hate to do a blog on what 2017 holds for networking, and for network operators in particular; that kind of analysis is almost a cliché.  In this case, though, it might be important to look ahead to the coming year because it could include some fairly cataclysmic events and changes.  At least I can promise you that this won’t be your usual year-ahead Pablum!

The big news in 2017 was actually telegraphed with the news that Huawei’s sales grew almost a third in 2016, making the company the big winner in the network infrastructure game.  The reason for that is my opening point—2017 is the year when network operators stop preparing for a return-on-infrastructure crunch and start making hard decisions for the long term.

Most operators have drawn charts showing revenue per bit collapsing under the Internet pricing model of all-you-can-eat, and cost per bit declining a bit less dramatically.  As a result, there’s a crossover point, and the date for this is usually placed sometime in 2017.  This impending financial milestone is what has been pushing SDN, NFV, carrier cloud, and a bunch of other things.  But now, it’s upon the operators and so they have to be prepared to do something that bears fruit rather quickly.

The new pressure on operators to do something will be manifest in their efforts in the SDN and NFV space.  Expect to see an avalanche of operators looking at AT&T’s ECOMP, because it represents a broad and adoptable model for software automation overall.  Most operators now realize that “SDN” and “NFV” in the narrow scope of the standards aren’t going to do them enough good, or fast enough.  They’re now looking beyond the standards, and probably they’ll never come back.  Open Source will replace standards forever, starting in 2017.

I can tell you that there are a few operators who already know that SDN and NFV aren’t enough, and aren’t fast enough.  Those operators know that somehow software automation, effective use of portals, or both are going to be their touchstones for 2017.  The rub for them is the “somehow” part; most have a goal but not a specific path to reaching it.  Even architectures like AT&T’s ECOMP, which at least admits full-scope software automation, don’t yet address all the key points there.  Nobody has done much on the portal side, despite the fact that some university projects in that area first appeared three or four years ago.  In 2017 we’ll see some vendors step up on software automation, both in the OSS/BSS space and among the network equipment vendor community.  We might also see some portal progress, but even if we don’t, software automation impetus will be enough to support real progress in opex reduction by the end of 2017.  Implementations started then will mature and offer our first real “success stories” in 2018.

Which, of course, is a bit late to prevent the critical crossover operators expect.  The next news item is good news for operators, though.  There is a general trend among regulators worldwide to accept that the pro-OTT bias of regulations under the banner of “net neutrality” has threatened investment in infrastructure.  Huawei’s surge is a direct result of operator pressure on capex to lower that cost-per-bit curve, and when network vendors talk about systemic conditions or market conditions, what they’re really talking about is that critical point of revenue/cost convergence.  If regulatory changes occur, they might let operators alleviate the pressure without further capex reductions.

The two issues that are critical are settlement and paid prioritization.  The first of the Obama Administration’s FCC Chairmen (Genachowski) was aware that these could be used to create a special advantage to large players, but was willing to accept measures in both areas with the proviso that the FCC would watch developments.  His successor (Wheeler, the current Chairman) pulled back and banned both.  Republican Commissioners say they’ll reverse that decision, and if they do it could improve operator return on infrastructure enough to fend off further capex pressure until 2020.

I think that they may do all of that, and certainly they’ll do something this year.  That means that network infrastructure might have a couple of years of relief from the pricing pressure once the impact of regulatory change takes hold.  But not decisive relief, because the fact is that regulations are politics under the covers, and nobody can be sure where the political winds will blow in 2017 and beyond.

Despite regulatory relief and opex improvements, network vendors will be under tremendous pressure in 2017.  Specialists in SDN and NFV will be under even more, because relief from the profit-per-bit risk won’t come instantly and won’t be fully proven and accepted even when it does start to appear.  Further M&A among network vendors is likely, focusing on consolidation in the middle-tier players and on adding critical technology elements among the higher tiers.  We’ll also see management/organizational changes in most of the network vendors as they work to adapt to market conditions that will be difficult no matter what happens with regulations and technology.

For vendors, the timing of possible regulatory relief and a possible introduction of enough opex savings to mitigate capex pressure creates a risk of its own.  Do you sit in the rain and hope for a break in the clouds, or seek shelter?  The logical, smart thing to do at this point would be to recognize that the shifts underway in software automation will transform networking, and that things like SDN and NFV will eventually change what software automation doesn’t.

That’s right; SDN and NFV don’t go away, they simply become the “two” of a one-two punch at cost and agility issues.  Complete software automation could reduce opex by at least 60% and improve service agility, but it doesn’t create much new revenue because it only tweaks the business-service pricing model a bit.  To do more, you need to add in cloud services, and SDN and NFV will not drive the cloud but rather will be driven by it.

Over the next week or so, I’m going to expand on some of the critical points in the evolution of networking that we can expect to see exposed and addressed (if not resolved) this year.  In the meantime, I want to wish you all a Happy New Year!

Looking Deeper into Nokia’s Deepfield Deal

Like most players in the network space, Nokia is eyeing SDN and NFV with a mixture of hope and fear.  I’d guess for Nokia it may be a bit more of the latter, because major changes in the market could upset the merger of Nokia and Alcatel-Lucent, the latter being an example of almost perpetual merger trauma itself.  Now, Nokia has announced…drum roll…a new acquisition, Deepfield, to improve their network and service automation.  The obvious question is whether this makes any sense in the SDN/NFV space.  A less obvious question is whether it makes sense without SDN or NFV.

Virtualization creates a natural disruption in network and service management because the actual resources being used don’t look like the virtual elements that the resources are assigned to support.  A virtual router is router software running in a data center, on servers, using hypervisors, vSwitches, real data center switches, and a bunch of other stuff that no router user would expect to see.  Because of this disconnect, there’s a real debate going on over just how you manage virtual-based networks.  The disagreement lies in just how the virtual and real get coordinated.

If you looked at a hypothetical configuration of a totally virtualized (using NFV and SDN) IP VPN service, you’d see so much IT that it would look like application hosting.  Imagine what happens, then, when you have a “server fail” event.  Do you tell the router management system the user has connected that you have a server failure?  Hardly.  Broadly, your realistic options are to try to relate a “real” resource failure to a virtual condition, or to just fix everything underneath the virtual covers and never report a failure unless it’s so dire that causal factors are moot.

To put the latter option more elegantly, one approach to virtualization management is to manage the virtual elements as intent models with SLAs.  You simply make things happen inside the black box to meet the SLAs, and everyone is happy.  However, managing this way has its own either/or choice: do you manage explicitly or probabilistically?

Explicit management means knowing what the “virtual to real” bindings are, and relating the specific resource conditions associated with a given service to the state of the service.  You can do this for very high-value stuff, but it’s clearly difficult and expensive.  The alternative is to play the numbers, and that (in case you were wondering if I’d gotten totally off-point) is where big data, analytics, and Deepfield come in.

Probabilistic network management is based on the idea that if you have a capacity plan that defines a grade of service, and if you populate your network with resources to meet the goals of that plan, then any operation that stays within your plan’s expected boundaries meets the SLAs with an acceptable probability.  Somewhere, out there, you say, are the resources you need, and you know that because you planned for the need.

This only works, of course, if you didn’t mess up the plan, and if your resources don’t get messed up themselves.  Deciding whether a massive, adaptive, multi-tenant, multi-application network or service is running right, or just how it’s wrong if it isn’t, is a complex question, so you need to look at a bunch of metrics and do some heavy analytics.  The more you can see and analyze, the more likely you are to obtain a correct current state of the network and services.  If you have a decent baseline of normal or acceptable states, that gets you a much higher probability that your wing-and-a-prayer SLA is actually being met when you think it is.
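To make the baseline idea concrete, here’s a minimal sketch of a check that asks whether a current reading falls inside the range the plan expected.  Everything in it is an assumption for illustration: the function name, the utilization samples, and the three-standard-deviation threshold are hypothetical, and a real analytics system would look at many metrics together rather than one.

```python
from statistics import mean, stdev

def within_plan(baseline, current, max_sigma=3.0):
    """Return True if `current` lies within max_sigma standard
    deviations of the mean of the baseline samples."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: only an identical reading is "normal".
        return current == mu
    return abs(current - mu) / sigma <= max_sigma

# Hypothetical link-utilization samples (percent) from normal operation.
baseline_utilization = [41.0, 44.5, 39.8, 43.2, 42.1, 40.7, 45.0, 42.9]

print(within_plan(baseline_utilization, 43.5))  # a reading close to normal
print(within_plan(baseline_utilization, 78.0))  # a reading far outside the plan
```

A reading that stays inside the boundary doesn’t prove the SLA is being met; it only says that nothing observed contradicts the plan, which is the whole bet of the probabilistic approach.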

Many people in the industry, and particularly in the telco space, think explicit management is the right answer.  That’s why we hear so much about “five-nines” today.  The fact is that almost none of the broadband service we consume today can be assured at that level, and almost none of it is explicitly managed.  Routers and Ethernet switches don’t deliver by-the-second SLAs, and in fact the move to SDN was largely justified by the desire to impose some traffic management (besides what MPLS offers) on the picture.  In consumption terms, consumer broadband is swamping everything else already, it’s only going to get worse, and consumers will trade service quality for price to a great degree.  Thus, it’s my view that probabilistic management is going to win.

That doesn’t mean that all you need to manage networks is big data, though.  While probabilistic management based on resource state is the basis for a reasonable management strategy for both SDN and NFV, there’s still a potential gap that could kill you, which I’ll call the “anti-butterfly-wings” gap.

You know the old saw that if a butterfly’s wings flap in Japan, it can create a cascade impact that alters weather in New York.  That might be true, but we also could say that a typhoon in Japan might cause no weather change at all in nearby Korea.  The point is that a network resource pool is vast, and if something is buggered in a given area of the pool there’s a good chance that nothing much is impacted.  You can’t cry “Wolf!” in a service management sense just because something somewhere broke.

That’s where Deepfield might help.  Their approach adds endpoint, application, or service awareness to the mass of resource data that you’d have with any big-data network statistics app.  That means that faults can be contextualized by user, service/application, etc.  The result isn’t as precise as explicit management, but it’s certainly enough to drive an SLA as good as or better than what’s currently available in IP or Ethernet.

The interesting thing about this approach, which Nokia’s move might popularize, is the notion of a kind of “resource-push” management model.  Instead of having the service layer keep track of SLAs and draw on resource state to get the necessary data, the resource layer could push management state to the services based on context.  At the least, it could let service-layer processes know that something might be amiss.

At the most, it opens a model of management where you can prioritize the remedial processes you’re invoking based on the likely impact on customers, services, or applications.  That would be enormously helpful in preventing unnecessary cascades of management events arising from a common condition; you could notify services in priority order to sync their own responses.  More important, you could initiate remedies immediately at the resource level, and perhaps not report a service fault at all.
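As a rough illustration of priority-ordered notification, the sketch below maps a failed resource to the services that depend on it and sorts them by likely impact.  This is not Deepfield’s or Nokia’s method; the resource-to-service map, the service names, and the scoring rule (customers affected times an SLA weight) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ImpactedService:
    name: str
    customers_affected: int
    sla_weight: float  # hypothetical: higher means a stricter SLA

def contextualize(fault_resource, bindings):
    """Return the services the resource-to-service map ties to a failed resource."""
    return bindings.get(fault_resource, [])

def notification_order(services):
    """Sort impacted services so the highest likely impact comes first."""
    return sorted(
        services,
        key=lambda s: s.customers_affected * s.sla_weight,
        reverse=True,
    )

# Hypothetical context data of the kind an analytics layer might hold.
bindings = {
    "server-17": [
        ImpactedService("consumer-broadband", 1200, 1.0),
        ImpactedService("enterprise-vpn", 40, 50.0),
        ImpactedService("iot-telemetry", 300, 0.5),
    ],
    "server-22": [],  # a failure here touches no service, so nobody is told
}

for svc in notification_order(contextualize("server-17", bindings)):
    print(svc.name)
```

The empty entry for the second server is the “anti-butterfly-wings” case: something broke, but nothing in the service layer needs to hear about it.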

That’s the key point.  Successful service management in SDN and NFV is successful not because it correctly, or at least logically, reflects the fault to the user.  It’s successful because no faults are reported because no SLA violations occur.  It will be interesting to see how Nokia plays this, though.  Most M&A done these days fails because the buyer under-plays its new asset.

Federation, Virtual Network Operators, and 5G Slicing, and Their Relationship to SDN/NFV

Every network operator I’ve surveyed has some sort of wholesale/retail relationship with other operators.  Most fit into two categories—a relationship that extends geographic scope or one that incorporates one operator’s service inside another (like backhaul or MVNO).  Given this, it is natural to assume that services built on SDN and/or NFV would have to be covered by these same sorts of deals.  The question is how, and how the need to support these relationships could impact the basic architecture of SDN or NFV.  It’s important because the 5G specifications are going to make “slicing” into a new standard mechanism for virtual networking.

To avoid listing a host of relationships to describe what we’re going to talk about here, I’m going to adopt a term that has been used often (but not exclusively) in the market—federation.  For purposes of this blog, federation is a relationship of “incorporation”, meaning that one operator incorporates services or service elements from another in its own offerings.  We should note, though, that operators are a kind of special case of “administrative domains”, and that federation or sharing-and-incorporation capabilities could also be valuable or essential across business units of the same operator, across different management domains, etc.

We used to have federation all the time, based on intercarrier gateways.  Telcos interconnect calls with one another, and early data standards included specific gateway protocols; the venerable and now-hardly-used packet standard X.25 used the X.75 gateway standard for federation.  All of this could be called “service-level” federation, where a common service was concatenated across domains.  Federation today happens at different levels, and creates different issues at each.

There is one common, and giant, issue in federation today, though, and it’s visibility.  If a “service” spans multiple domains, then how does anyone know what the end-to-end state of the service is?  The logical answer is that they have a management console that lets them look, but for that to work all the federated operators have to provide visibility into their infrastructure, to the extent needed to get the state data.  Management visibility is like taxes: if you can invoke it, it can lead to destruction.  No operator wants others to see how their networks work, and it’s worse if you expect to remediate problems, because a partner actually exercising management control could lead to destabilizing events.

The presumption we could make to resolve this issue is that service modeling, done right, would create a path to solution.  A service model that’s made up of elements, each being an intent model that asserts a service-level agreement, would let an operator share the model based only on the agreed/exposed SLA.  If we presumed that the “parameters” of that element were exposed at the top and derived compatibly by whatever was inside, then we could say that what you’d see or be able to do with the deployed implementation of any service element would be fixed by the exposure.  In this approach, federation would be the sharing of intent-modeled elements.

This is a big step to solving our problems, but not a complete solution.  Most operators would want to have different visibility for their own network operations and those of a partner.  If I wholesale Service X to you, then your network ops people see the parameters I’ve exposed in the relationship, but I’d like my own to see more.  How would that work?

One possibility is that of a management viewer.  Every intent model, in my thinking, would expose a management port or API, and there’s no reason why it couldn’t expose multiple ones.  So, a given element intent model would have one set of general SLA and parametric variables, but you’d get them through a viewer API, based on your credentials.  Now partners would see only a subset of the full list, and you as the owner of the element could define what was exposed.
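A minimal sketch of the viewer idea might look like the following.  The class, the role names, and the parameter names are hypothetical, and a real implementation would authenticate credentials rather than accept a role string; the point is only that one element holds the full parameter set and each viewer gets a filtered subset defined by the owner.

```python
class IntentElement:
    """An intent-modeled service element with per-role management views."""

    def __init__(self, name, parameters, views):
        self.name = name
        self._parameters = dict(parameters)
        # Map each role to the set of parameter names it may see.
        self._views = {role: set(keys) for role, keys in views.items()}

    def view(self, role):
        """Return only the parameters exposed to this role.
        An unknown role sees nothing."""
        allowed = self._views.get(role, set())
        return {k: v for k, v in self._parameters.items() if k in allowed}

# Hypothetical wholesale element with its full set of management variables.
service_x = IntentElement(
    name="service-x",
    parameters={
        "sla_state": "met",
        "latency_ms": 18,
        "host_cluster": "dc-east-4",
        "active_vnf_instances": 6,
    },
    views={
        "owner": ["sla_state", "latency_ms", "host_cluster", "active_vnf_instances"],
        "partner": ["sla_state", "latency_ms"],
    },
)

print(service_x.view("partner"))
print(service_x.view("owner"))
```

The partner sees the SLA state and latency but nothing about where or how the element is hosted, which is the infrastructure detail an operator would least want to share.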

Another possibility is an alias element.  We have a “real” service element, which decomposes however it has to into a stream of stuff toward the resources.  We have another element of the same kind, which is a fork upward from this real element.  Internal services compose the real element, but you expose the alias element in federation, and this element contains all the stuff that creates and thus limits the management visibility and span of control.

The issues of visibility can be addressed, but there remain two other federation curve balls to catch.  One is “nesting” and the other is “foundation services”.

Nesting is the creation of a service framework within which other services are built.  A simple example is the provisioning of trunks or virtual wires, to be used by higher-layer Ethernet or IP networks.  You might think this is a non-issue, and in some ways it might be, but the problem that can arise goes back to management and control.  Virtual resources that create an underlayment have to be made visible in the higher layer, but more importantly the higher layer has to be constrained to use those resources.

Suppose we spawn a virtual wire, and we expect that wire to be used exclusively for a given service built at L2 or L3.  The wire is not a general resource, so we can’t add it to a pool of resources.  The implications are that a “layer” or “slice” creates a private resource pool, but for that to be true we either have to run resource allocation processes on that pool that are pool-specific (they don’t know about the rest of the world of resources, nor does the rest of the world see them) or we have to define resource classes and selection policies that guarantee exclusivity.  Since the latter would mix management jurisdictions, the former approach is best, and it’s clearly federation.  We’re going to need something like this for 5G slicing.  The “slice-domain” would define a series of private pools, and each of the “slice-inhabitants” would then be able to run processes to utilize them.
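Here is one way the pool-specific approach could be sketched, under the assumption that each slice owns an allocator that knows only its own resources.  The class, the slice names, and the wire identifiers are hypothetical and not drawn from any 5G or NFV specification.

```python
class SlicePool:
    """A private resource pool whose allocator sees only its own resources."""

    def __init__(self, slice_name, resources):
        self.slice_name = slice_name
        self._free = set(resources)
        self._assigned = {}  # resource -> service using it

    def allocate(self, service):
        """Assign one free resource from this pool to a service."""
        if not self._free:
            raise RuntimeError(f"pool for {self.slice_name} is exhausted")
        resource = sorted(self._free)[0]  # deterministic choice for the example
        self._free.remove(resource)
        self._assigned[resource] = service
        return resource

    def owner_of(self, resource):
        """Return the service holding a resource, or None if this pool has not assigned it."""
        return self._assigned.get(resource)

# Two hypothetical slices, each with its own virtual wires.
slice_a = SlicePool("slice-a", ["vwire-a1", "vwire-a2"])
slice_b = SlicePool("slice-b", ["vwire-b1"])

wire = slice_a.allocate("l3-vpn-cust-7")
print(wire, slice_a.owner_of(wire), slice_b.owner_of(wire))
```

Because the second pool has no record of the first pool’s wires, exclusivity falls out of the structure instead of depending on selection policies that span management jurisdictions.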

The key point in any layered service is address space management.  Any service that’s being deployed knows its resources because that’s what it’s being deployed on.  However, that simple truth isn’t always explicit; you almost never hear about address spaces in NFV, for example.  Address spaces, in short, are resources as much as wires are.  We have to be explicit in address management at every layer, so that we don’t partition resources in a way that creates collisions if two layers eventually have to harmonize on a common address space, like the Internet.  We can assign RFC 1918 addresses, for example, to subnets without regard for duplication across federated domains because they were designed to be used that way, with NAT applied to link them to a universal address space like the Internet.  We can’t assign Internet-public addresses that way unless we’re willing to say that our parallel IP layers or domains never connect with each other or with another domain; otherwise we’d risk collision in assignment.
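The difference between the two cases can be checked mechanically.  The sketch below uses Python’s standard `ipaddress` module to flag overlapping subnets between domains while tolerating duplication when both subnets fall inside the RFC 1918 blocks.  The domain names are hypothetical, and the “public” prefixes are taken from the documentation range 203.0.113.0/24 as stand-ins for real Internet-public assignments.

```python
import ipaddress
from itertools import combinations

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_rfc1918(net):
    """True if the subnet lies entirely inside one RFC 1918 block."""
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)

def public_collisions(domains):
    """Return overlapping subnet pairs across domains, ignoring pairs
    where both subnets are RFC 1918 (duplication is expected there)."""
    collisions = []
    for (name_a, nets_a), (name_b, nets_b) in combinations(domains.items(), 2):
        for a in nets_a:
            for b in nets_b:
                if a.overlaps(b) and not (is_rfc1918(a) and is_rfc1918(b)):
                    collisions.append((name_a, name_b, str(a), str(b)))
    return collisions

# Hypothetical address plans for two federated domains.
domains = {
    "operator-a": [ipaddress.ip_network("10.1.0.0/16"),
                   ipaddress.ip_network("203.0.113.0/24")],
    "operator-b": [ipaddress.ip_network("10.1.0.0/16"),
                   ipaddress.ip_network("203.0.113.128/25")],
}

for collision in public_collisions(domains):
    print(collision)
```

Both domains reuse 10.1.0.0/16 without complaint, but the overlapping non-private prefixes are reported, since those would collide the moment the two domains shared a common address space.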

The other issue I noted was what I called “foundation services”.  We have tended to think of NFV in particular as being made up of per-customer-and-service-instance hosted virtual functions.  Some functions are unlikely to be economical or even logical in that form.  IMS, meaning cellphone registration and billing management, is probably a service shared across all virtual network operators on a given infrastructure.  As IoT develops, there will be many services designed to provide information about the environment, drawn from sensors or analysis based on sensor data.  To make this information available in multiple address spaces means we’d need a kind of “reverse NAT”, where some outside addresses are gated into a subnet for use by a service instance.  How does that get done?

How do we do any of this in SDN, in NFV?  Good question, but one we can’t really answer confidently today.  As we evolve to real deployments and in particular start dealing (as we must) with federation and slicing, we’re going to have to have the answers.