If the Data Center Drives Networking Overall, What’s the Destination?

For decades, enterprises have told me that the data center drives the network.  The vendors they said had the greatest influence on their overall strategic network planning were those with the greatest strategic influence in the data center.  For the last ten years, the data center has been evolving to meet the requirements of componentized applications, virtualization, and cloud hosting.  In recent Wall Street reports, analysts have commented on this trend and cited it as the basis for software-defined networking (SDN) growth.  What do data center users, particularly cloud providers and network operators, think?

Both large enterprises and service providers (including telecom, cable, and cloud providers) have long noted that the trend in software was toward componentized applications.  These applications generated the same “vertical” app-to-user traffic as monolithic applications, but they also generated “horizontal” or inter-application traffic through their component workflows.  Since a unit of work passes through the vertical direction only about twice on average (once in, once out) but could pass through four or five components in the horizontal direction, it was clear we were heading toward a time when horizontal traffic could outstrip vertical traffic.
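A rough back-of-the-envelope calculation shows why the balance tips; the workflow depth here is an assumed number for illustration, not a measurement:

```python
# Rough illustration (assumed numbers, not measurements): a unit of work
# crosses the data center boundary twice (request in, response out), but a
# componentized workflow generates a handoff for every component boundary
# it crosses internally.

vertical_passes = 2          # in and out of the data center per unit of work
components_in_workflow = 5   # assumed depth of the componentized workflow
horizontal_hops = components_in_workflow - 1   # component-to-component handoffs

ratio = horizontal_hops / vertical_passes
print(f"Horizontal-to-vertical traffic ratio: {ratio:.1f}x")
# With a five-component workflow, horizontal traffic is already double the
# vertical traffic, before counting retries, replication, or storage access.
```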

Both classes of data center users also realized that resiliency and scalability mechanisms were far more dependable, and far less costly in resources and performance, if they were confined within a single data center and created no WAN connections.  “Local” component workflows are much better than workflows that involve remotely connected resources.  The expected horizontal traffic would therefore grow within data centers more than between them, and the natures of data center traffic and data-center-interconnect traffic would diverge over time.

Data center switching concepts have lagged this horizontal-emphasis transformation.  Historically, there have been two considerations associated with building data center switching systems.  The first is that the capacity of a given switch is limited by backplane speed, and the second is that Ethernet bridging protocols have historically limited a switch to a single trunk path to another switch.  About ten years ago, some network vendors (including Juniper, whose announcement I attended) began to offer fabric solutions.  Fabrics also developed in the storage market (remember InfiniBand?).  By traditional definition, a fabric is a switch that can connect every port to every other port without blocking and without major differences in transit performance.

You don’t necessarily have to use a monolithic structure to create a fabric.  The combination of top-of-rack and transport switches can do nearly as well, as long as you address the issue of multiple pathways between switches.  There are protocol enhancements to Ethernet bridging that do that, but an increasingly popular approach is to use SDN.  What SDN does is allow the creation of ad-hoc Layer 2 virtual networks.  Combined with a low-latency, non-blocking switching system in a data center, that lets tenant and application networks be created, repaired, and scaled dynamically, with minimal concern about whether the resulting workflows will be efficient.  Finally, you can use hosting policies that reflect horizontal traffic levels to ensure you don’t overload switch-to-switch trunks.  This is a workable way to extend basic switching, but obviously a non-blocking approach will make it easier to find optimal places to host capacity.
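As a rough illustration of what such a hosting policy might look like, here is a minimal Python sketch; the data structures, thresholds, and load estimates are invented for the example and aren’t drawn from any particular SDN controller or orchestrator:

```python
# Minimal sketch (hypothetical data structures, not any vendor's API) of a
# hosting policy that weighs horizontal traffic when placing a component.
# Candidates behind the same switch as the component's peers score best, and
# placements that would push an inter-switch trunk past a utilization
# threshold are rejected outright.

from dataclasses import dataclass

@dataclass
class Candidate:
    server: str
    switch: str           # top-of-rack switch the server hangs off
    free_capacity: float  # normalized spare compute

def place(component_peers, peer_switch, candidates, trunk_load, trunk_limit=0.8):
    """Pick a server for a component whose peers sit behind peer_switch."""
    best, best_score = None, float("-inf")
    for c in candidates:
        crosses_trunk = c.switch != peer_switch
        # Assumed incremental trunk load if the component is placed remotely.
        new_trunk_load = trunk_load + (0.1 * component_peers if crosses_trunk else 0)
        if new_trunk_load > trunk_limit:
            continue  # would overload the switch-to-switch trunk; skip
        score = c.free_capacity - (1.0 if crosses_trunk else 0.0)
        if score > best_score:
            best, best_score = c, score
    return best

candidates = [Candidate("srv-a1", "tor-1", 0.4), Candidate("srv-b7", "tor-2", 0.9)]
print(place(component_peers=3, peer_switch="tor-1",
            candidates=candidates, trunk_load=0.6))
```

In this toy case the remote server is skipped because it would push the trunk past its limit, even though it has more spare capacity; a non-blocking fabric would remove that constraint entirely.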

The big question for data-center-centric network planners is the one raised by edge computing.  The efficiency of a data center as a resource pool is roughly described by an Erlang C curve, which shows that while adding servers doesn’t produce a uniform increase in efficiency, there is a critical mass of resources needed to give a reasonable hope of fulfilling a hosting request promptly.  Edge computing would naturally distribute resources, and in its early days it might be difficult to assign enough servers to a given edge data center to reach reasonable Erlang efficiency.  If you can’t host a given component in the right place, then hosting it in the next-best place could mean a significant increase in the cost and delay associated with the horizontal connections to that component.
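For readers who want to see the effect, here is a small Python illustration of the standard Erlang C queuing formula; the server counts and the 70-percent utilization figure are assumptions chosen to show the trend, not data from any real data center:

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Probability a request must wait for a resource (Erlang C),
    computed via the numerically stable Erlang B recursion.
    offered_load is in Erlangs (arrival rate x average hold time)."""
    if offered_load >= servers:
        return 1.0
    b = 1.0  # Erlang B blocking with zero servers
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return servers * b / (servers - offered_load * (1 - b))

# Same 70% per-server utilization, very different pool sizes.
for n in (10, 50, 200):
    print(f"{n:>4} servers: P(request must wait) = {erlang_c(n, 0.7 * n):.3f}")
```

At the same per-server utilization, the small pool makes a requester far more likely to wait than the large one, which is the “critical mass” problem a lightly equipped edge data center would face.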

One implication for networking overall is that a move to edge computing would be most effective if it were accompanied by a parallel improvement in edge transport connectivity: data center interconnect using high-capacity pipes.  You don’t necessarily have to mesh every edge point; it would probably be enough to connect each edge to its two nearest neighbors, giving each site a virtual resource pool of three edge data centers.  Data center interconnect (DCI) facilities aimed at creating this kind of trio-modeled collection would go a long way toward minimizing the efficiency and availability risks associated with smaller data centers.
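Here is a minimal sketch of how that trio grouping might be derived; the site names and coordinates are purely hypothetical:

```python
# Minimal sketch (hypothetical site data) of the "trio" DCI model: each edge
# data center is linked to its two nearest neighbors, so every site can draw
# on a virtual pool of three sites' worth of servers.

import math

sites = {                 # assumed metro edge locations (x, y in km)
    "edge-1": (0, 0), "edge-2": (12, 3), "edge-3": (5, 14),
    "edge-4": (30, 2), "edge-5": (28, 15),
}

def nearest_neighbors(name, k=2):
    """Return the k sites closest to the named site."""
    origin = sites[name]
    others = ((math.dist(origin, pos), other)
              for other, pos in sites.items() if other != name)
    return [other for _, other in sorted(others)[:k]]

for name in sites:
    a, b = nearest_neighbors(name)
    print(f"{name}: pooled with {a} and {b}")
```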

Another implication for networking is that if edge data centers, like other cloud data centers, are multi-tenant, then DCI connections will have to extend the multi-tenant mechanisms across data center boundaries.  There’s nothing inherently wrong with doing that, provided the trunks are efficient, but it would increase the effective size of the virtual switch and the need for reliable switch control, at both the real-device and virtual-device levels.
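A simple sketch shows the kind of consistency check that stretching tenant segments across a DCI would demand; the tenant names and segment IDs here are invented, and real overlays (VXLAN VNIs, for example) would carry far more state than this:

```python
# Hedged sketch (invented tables, not a specific controller's API): when a
# tenant's Layer 2 segment is stretched across a DCI trunk, both sites must
# agree on the tenant-to-segment mapping, or isolation breaks down.

site_a_segments = {"tenant-red": 10101, "tenant-blue": 10102}
site_b_segments = {"tenant-red": 10101, "tenant-blue": 10202}  # mismatch

def check_dci_mapping(a, b):
    """Report tenants whose segment IDs differ across the interconnect."""
    return [(t, a[t], b[t]) for t in a.keys() & b.keys() if a[t] != b[t]]

for tenant, seg_a, seg_b in check_dci_mapping(site_a_segments, site_b_segments):
    print(f"{tenant}: segment {seg_a} at site A vs {seg_b} at site B -- "
          "needs translation or renumbering before the segment can be stretched")
```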

Span of control for SDN switching has always been recognized as a potential issue, for a number of reasons.  Chief among them is the scalability of central controllers, something almost all enterprises and operators are wary about.  How many events can a controller handle, and how many events might a common failure generate?  This could be addressed through the use of federated SDN controllers, but that approach is very new.  We don’t fully understand how federating SDN could affect things like setting up virtual tenants across a DCI.  Is there a risk of collision, or would federation add excessive delay?
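One way a federation might handle the collision risk is a reserve-then-commit handshake across domain controllers; the sketch below is hypothetical and isn’t modeled on any existing controller’s API, but it shows why collision avoidance and added setup delay trade off against each other:

```python
# Hypothetical federation sketch: a cross-DCI tenant segment is first reserved
# at every domain controller, then committed only if all reservations succeed.
# Collisions are caught, but at the cost of an extra round trip of setup delay.

class DomainController:
    def __init__(self, name):
        self.name = name
        self.reserved = set()
        self.committed = set()

    def reserve(self, segment_id) -> bool:
        if segment_id in self.reserved or segment_id in self.committed:
            return False          # collision: another request holds this segment
        self.reserved.add(segment_id)
        return True

    def commit(self, segment_id):
        self.reserved.discard(segment_id)
        self.committed.add(segment_id)

    def release(self, segment_id):
        self.reserved.discard(segment_id)

def setup_cross_dci_segment(controllers, segment_id) -> bool:
    granted = []
    for c in controllers:
        if c.reserve(segment_id):
            granted.append(c)
        else:
            for g in granted:      # back out cleanly on collision
                g.release(segment_id)
            return False
    for c in granted:
        c.commit(segment_id)
    return True

domains = [DomainController("edge-1"), DomainController("edge-2")]
print(setup_cross_dci_segment(domains, 10101))   # True: segment set up end to end
print(setup_cross_dci_segment(domains, 10101))   # False: second attempt collides
```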

Does data center virtualization stop at the data center boundary, or extend to partner data centers?  Does it extend outward even from there, to things like cell sites and content sources?  Those are the questions that are now going to be asked, and the answers may determine the future of SDN and the model for metro networking.  There are powerful inertial forces established in networks by legacy technology investment.  Open-box operating systems and open switch/routing software could create a more agile future without risking all of that incumbent investment.  The P4 forwarding language could then support modernization and a whole new model, one that could include some SDN.  Or SDN might get its federation act together and leverage what it’s already won: the data center.  That’s still where the strategic balance of power in networking is found.