Let us suppose that the goal of next-gen infrastructure (for operator services, cloud providers, or applications) is full virtualization of both application/service elements and hosting and connection resources. Most would agree that this is at least a fair statement of the end-game. The question, then, is what steps should be followed to achieve that goal. The problem we have now is with that word "steps": everyone is looking at some atomic change and saying it's progress, when without a full roadmap we don't know whether "movement" constitutes "progress" at all.
As is often the case, we can learn from the past if we bother to inspect it. When virtualization came along, it quickly spawned two important things. First, an overall model for deploying the software components that combine to form a service or application. Such a model standardizes how deployment and redeployment work, which is essential if the operational processes are to be efficient and error-free. Second, the notion of a "software-defined network," introduced in Nicira's overlay SDN offering (since acquired by VMware, with the product renamed "NSX").
If you look at any cooperative system of software components, you find a very clear inside/outside structural model. There are many components that need to exchange work among themselves, so you need some form of connectivity. What you don't need is for the interfaces involved in these internal exchanges to be accessible from the outside. That would create a security and stability risk that you'd have to spend money and effort to address. What emerged, as early as the first OpenStack discussions on network connectivity, was the classic "subnet" model.
The subnet model says that you build your cooperative systems of software components using the private IP addresses defined (for IPv4) in RFC 1918. Private IP addresses can be reused by everyone, because they are not routable outside the subnet. A gateway can't send something out to "192.168.1.1", but you can address that entity from within the subnet. Home networks are built this way.
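To make the RFC 1918 distinction concrete, here's a minimal sketch in Python (purely illustrative; the addresses are examples, not anything from a real deployment) showing how the private ranges differ from routable public space:

```python
import ipaddress

# RFC 1918 reserves 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 for private use.
# These addresses can be reused inside every subnet because they never route
# across the public Internet; 8.8.8.8 is included as a public counter-example.
for addr in ["192.168.1.1", "10.0.0.7", "172.16.5.20", "8.8.8.8"]:
    ip = ipaddress.ip_address(addr)
    scope = "private (subnet-only)" if ip.is_private else "public (routable)"
    print(f"{addr}: {scope}")
```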
The subnet model means that the interfaces that are supposed to be accessed from outside, from a VPN or the Internet, have to be exposed explicitly by mapping an internal private address to an external public address. Network Address Translation (NAT) is a common term for this; Amazon calls it "Elastic IP Addresses". This, as I noted in my last blog on the topic, is what creates the potential for the "Proxy Load-Balancer" as a kind of bridge between two worlds: the outside world that sees services and applications, and the inside world that sees resources and mappings. Building the inside world is important, even critical. We need consistency in how we address subnet components and efficiency in how we deploy and redeploy, which means getting the orchestration practices right.
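As a conceptual sketch only (this is not how a real NAT or Elastic IP implementation works, and the addresses and ports are invented for illustration), the exposure step amounts to maintaining a small mapping from the few public endpoints you choose to publish to the private addresses behind them:

```python
# Hypothetical exposure table: public (address, port) -> private (address, port).
# Anything without an entry simply cannot be addressed from outside the subnet.
exposure_map = {
    ("203.0.113.10", 443): ("192.168.1.20", 8443),  # API front end, exposed
    ("203.0.113.10", 80):  ("192.168.1.20", 8080),  # web front end, exposed
    # internal components (192.168.1.30, 192.168.1.40, ...) have no entries
}

def route_inbound(public_ip: str, public_port: int):
    """Return the internal target for an exposed endpoint, or None if it's hidden."""
    return exposure_map.get((public_ip, public_port))

print(route_inbound("203.0.113.10", 443))  # ('192.168.1.20', 8443)
print(route_inbound("203.0.113.10", 22))   # None: never exposed, so unreachable
```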
An IP subnet in its original form is made up of elements connected on a local-area network, meaning that they have Level 2 connectivity with each other. If we translate this into the world of the cloud and cloud infrastructure, we'd have to choose between keeping scaling and redeployment of components confined to places where we have L2 connectivity (likely within a data center) and accepting extension of Level 2 across a WAN. You either stay with the traditional subnetwork model (L2) or you provide a higher-level SDN-like networking tool (Nicira/NSX or another) to connect things.
Containers have arisen as a strategy to optimize virtualization by making the hosting resource pool as elastic and generic as possible. One vendor (Mesosphere) describes this as a pets-versus-cattle approach to hosting. With bare-metal fixed-resource hosting (pets), you fix something that breaks because it has individual value to you. With true agile virtual hosting (cattle) you simply toss something that breaks because the herd is made up of indistinguishable substitutes. The goal of container networking is to facilitate the transformation from pets to cattle.
Containers are ideal for applications and service features alike because they're low-overhead. The most popular container architecture is Docker, which is based on the subnet model I mentioned. That offers the benefit of simplicity, but it complicates deployments where a service or application has to be broadly horizontally integrated with other services/applications. The most popular Docker orchestration tool, Kubernetes, takes a different approach. It mandates that all containers be accessible to each other by default, but how you do that is left to you.
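A toy sketch of the two connectivity assumptions (again Python, purely illustrative; the subnets and host names are invented): Docker's default bridge gives each host its own private subnet, so cross-host container traffic needs explicit exposure or an overlay, while Kubernetes simply assumes universal container-to-container reachability and leaves the mechanism to you:

```python
from ipaddress import ip_network

# Docker's default bridge model: the same private range is recreated on every
# host, so container addresses only mean something locally.
docker_bridge_subnets = {
    "host-a": ip_network("172.17.0.0/16"),
    "host-b": ip_network("172.17.0.0/16"),
}

def docker_default_reachable(src_host: str, dst_host: str) -> bool:
    # Without published ports or an overlay network, only containers that share
    # a host (and thus a bridge subnet) can reach each other directly.
    return src_host == dst_host

def kubernetes_reachable(src_pod_ip: str, dst_pod_ip: str) -> bool:
    # Kubernetes mandates flat pod-to-pod reachability by default; how that flat
    # network is built is left to the deployment.
    return True

print(docker_default_reachable("host-a", "host-b"))  # False: needs extra plumbing
print(kubernetes_reachable("10.1.1.5", "10.2.3.8"))  # True by design
```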
Containers, Docker, and Kubernetes are in my view the foundation for resolving the first point I noted as a requirement for virtualization—a standard hosting framework. It’s my view that the current leading approach to that, the natural successor to the container crown, is DC/OS. This is an open-source project based on the Apache Mesos container framework, and designed to work not only on bare metal (where most containers are hosted) but also on virtual machines.
You deploy DC/OS on everything, and you then deploy all your applications and features on DC/OS. It supports all the popular container models, and probably many you’ve never heard of, because it runs below them and abstracts all kinds of infrastructure to a common view. DC/OS (in conjunction with Mesos) creates a kind of open virtual host, a layer that links resources to agile containers with orchestration and also introduces coordination across orchestrated frameworks. With DC/OS, a data center is a virtual server in a very real sense.
One of the nice, smart features of DC/OS and many other container systems is the emphasis on "services" delivered through a lightweight load-balancer. This element can be used as the boundary between the logical and virtual network worlds, the link between internal private-network connectivity for containers and the public Internet or corporate VPN. The load-balancer is an ever-visible, ever-addressable symbol of the underlying resource pool and how it's allocated to support applications. Rehost something and you simply connect the new host to the same load-balancer. Scale something and the mechanism for work distribution is already there and already being addressed.
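Here's a minimal sketch of that idea (Python, and an assumption-laden toy rather than DC/OS's actual load-balancer): the service keeps one stable virtual address, while the private container addresses behind it can be replaced or multiplied without the client ever noticing:

```python
import itertools

class ServiceFrontEnd:
    """Toy front end: one stable address, an elastic pool of private backends."""

    def __init__(self, virtual_address, backends):
        self.virtual_address = virtual_address   # the one address clients ever see
        self.backends = list(backends)           # private, replaceable container addresses
        self._cycle = itertools.cycle(self.backends)

    def pick_backend(self):
        """Round-robin work distribution across the current pool."""
        return next(self._cycle)

    def rehost(self, old, new):
        """Replace a failed or moved container; clients never notice."""
        self.backends[self.backends.index(old)] = new
        self._cycle = itertools.cycle(self.backends)

    def scale_out(self, new):
        """Add capacity; the distribution mechanism is already in place."""
        self.backends.append(new)
        self._cycle = itertools.cycle(self.backends)

svc = ServiceFrontEnd("10.0.0.100:80", ["192.168.1.21:8080", "192.168.1.22:8080"])
print(svc.pick_backend())                        # first backend in the rotation
svc.rehost("192.168.1.22:8080", "192.168.1.23:8080")
svc.scale_out("192.168.1.24:8080")
print([svc.pick_backend() for _ in range(3)])    # rotation now covers the new pool
```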
With DC/OS and either Kubernetes or Apache Marathon orchestration, you can deploy and connect everything “inside the boundary” of virtualized hosting. Outside that boundary in the real world of networking, you still have other technologies to handle, including all the legacy pieces of IP and Ethernet and fiber and microwave. Thus, DC/OS and similar technologies are all about the software-hosted side of networking, or the application side of corporate IT. While containers answer all of the hosting problem and part of the networking problem, they really do the latter by reference, by suggestion. The connectivity is outside DC/OS.
Applications are built from components, and networks from features. To the extent that either is hosted (all of the former, and the software-defined or virtual-function part of the latter), the DC/OS model is the leading-edge approach. It's what NFV should have assumed from the first as the hosting model for VNFs. For those who (like me) see the future of zero-touch automation as an intent-model-driven process, it's how software can create intent. Inside the resource boundary, DC/OS lays out the picture of networks and services, but not connectivity. What it does do is admit to the need for explicit software-defined connectivity. That is probably what gets SDN (in some form) out of the data center and into the WAN, because services and applications will cross data center boundaries as they redeploy and scale.
There's a basic principle at work here. The goal is to make all IT and virtual-function infrastructure look like a virtual host. Inside, you have to support real server iron, virtual machines, or whatever else suits your basic model of secure tenant isolation. At the next layer, you need to support any form of containers, harmonize their operational (deployment and redeployment) models, and support any useful mechanism for providing connectivity. All of this has to be packaged so that it doesn't take half a stadium of MIT PhDs to run it. This is the foundation of the future cloud, the future network, the future of applications.
DC/OS isn't the total solution; we still need to add the networking aspect, for example. But it is a step in the right direction, and most of all it's a demonstration of how far we've come from the simplistic and largely useless models of the past, and how far we still have to go. Others are looking at the same problems in different ways, and I'll be exploring some of those, and the issues they frame and solve, in later blogs.