Can We Fit Testing into Zero-Touch Automation?

Test equipment was a mainstay in networking when I first got into it 40 years ago or so, but the role of testing in modern networking has become increasingly hard to define.  Part of that is because a lot of “testing” was never about networks at all, and part is because what a network is has been shifting over time.  There is a role for testing today, but it’s very different from the role during its glory days, and not everyone is going to like it.

I’ve used test equipment extensively for years, and when I reviewed the specific projects I’d used it on, I was surprised to find that 88% of them focused on debugging end-devices or software, 8% on a mixture of those and network issues, and only 4% on purely network issues.  In one memorable case, I used test equipment to analyze the exchanges between IBM mainframes and financial terminals so I could implement IBM’s SNA on a non-IBM minicomputer.  In another, I used the equipment to help NASA emulate an obsolete terminal on a PC.  Even in the “pure network issues” situations, what I was doing was looking for the impact of network faults on the protocols being used.

Test equipment, meaning either devices or software probes inserted into protocol streams, has always been more about the protocols than the stream, so to speak.  More than anything else, according to enterprises, the decline in dependence on testing relates to the perceived limited value of protocol analysis.  Protocols are the language of exchange in data systems, and since the data systems themselves increasingly record any unusual conditions, the value of looking at protocols is declining.  Meanwhile, the knowledge required to interpret what you see when you look at a protocol streaming on a path is increasing.

Traditional “data line monitoring” testing is particularly challenging given that somebody has to go to a location and stick a device into the connection.  Soft probes, including things like RMON, have gained favor because a “network” today is a series of nodes and trunks, and testing the data interfaces with a device would mean running between nodes to look at the situation overall.  While you’re in transit, everything could be changing.

Other modern network advances haven’t helped either.  IP networking, and virtual networks, make even soft probes a challenge.  IP traffic by nature is connectionless, which means that data involved in a “session” where persistent information is exchanged may be routed along a variety of paths depending on conditions.  Virtual networking, including all forms of tunneling and overlay encapsulation, can make it difficult to trace where something is supposed to be going, or even to identify exactly what messages in a given place are actually part of what you’re trying to watch.

Back in the early days of NFV, I did a presentation on how test probes could be incorporated into an NFV deployment.  This was an advanced aspect of my work with the NFV ISG on a PoC, and when I dropped out of the activity (which was generating no revenue and taking much of my time), nothing further came of it.  The presentation, now removed, pointed out that in the real world, you needed to integrate testing with management and modeling or you had no way of establishing context and might find it difficult or impossible to even find what you wanted to probe.

The virtual world, particularly where services are composed from hosted functions, physical elements, and trunks, lends itself to probe insertion through deep packet inspection.  My approach was to support the use of service models to define “tap points” and “probes” as a part of a service, to be inserted and activated when needed.  Since I’ve always believed that both “services” and “resources” could be managed, and therefore could be probed and tested, this capability was aimed at both the service layer of the model and the resource layer.  The model provided the correlation between the testing probe and the service and resource management context, which, as I’ve said, is critical in today’s networking.  Since the DPI probe connected to a specified process (a “management visualizer”), it could either drive a GUI for display or drive an AI element to generate events that would then trigger other process actions as part of service lifecycle management.
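
To make that concrete, here’s a minimal sketch in Python of what a model-resident tap point and probe might look like.  The names (TapPoint, Probe, ServiceModel, the visualizer callback) are illustrative assumptions of mine, not anything from the presentation itself:

```python
# Minimal sketch, all names hypothetical: a service model that carries tap
# points and probes as first-class elements, so a probe can be activated in
# context and feed a "management visualizer" process at either layer.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TapPoint:
    name: str        # e.g. "vFW-egress"
    layer: str       # "service" or "resource"
    attachment: str  # where the DPI element sits: a port, vSwitch, or trunk


@dataclass
class Probe:
    tap: TapPoint
    visualizer: Callable[[dict], None]  # a GUI feed or an AI event generator
    active: bool = False

    def activate(self) -> None:
        self.active = True

    def on_packet(self, packet_summary: dict) -> None:
        # The model supplies the correlation: every observation is tagged
        # with the service/resource context of its tap point before it is
        # handed to the visualizer.
        if self.active:
            self.visualizer({"tap": self.tap.name,
                             "layer": self.tap.layer,
                             **packet_summary})


@dataclass
class ServiceModel:
    name: str
    taps: List[TapPoint] = field(default_factory=list)
    probes: List[Probe] = field(default_factory=list)
```

The point of the sketch is that the probe never has to know whether its visualizer is a human-facing GUI or an AI element turning observations into lifecycle events; that binding is just another part of the model.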

I also suggested that test data generators (TDGs) be considered part of testing functions, supported by their own modeled-in processes at appropriate points.  The combination of TDGs and a set of probes could define a “testing service,” modeled separately and referenced in the service model of the actual service.  That would allow for automatic data generation and testing, which I think is also a requirement for modern testing.
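
Continuing the sketch above (the names are still my own invention), a handful of TDGs and probes could be bundled into a separately modeled testing service that the real service model simply references:

```python
# Continuing the earlier sketch (names still hypothetical): TDGs and probes
# are bundled into a separately modeled "testing service" that the actual
# service model only references, so data generation and testing can run
# automatically when the lifecycle process calls for it.

from typing import Optional


@dataclass
class TestDataGenerator:
    inject_at: TapPoint
    pattern: str  # e.g. "baseline-HTTP-mix"

    def start(self, send: Callable[[dict], None]) -> None:
        send({"tap": self.inject_at.name, "pattern": self.pattern})


@dataclass
class TestingService:
    name: str
    generators: List[TestDataGenerator]
    probes: List[Probe]

    def run(self, send: Callable[[dict], None]) -> None:
        for probe in self.probes:
            probe.activate()
        for generator in self.generators:
            generator.start(send)


@dataclass
class ComposedService(ServiceModel):
    testing_ref: Optional[TestingService] = None  # referenced, not embedded
```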

I still believe, five years after that presentation was prepared, that the only way for testing to work in a modern world is to contextualize it with service lifecycle automation and the associated models.  The problem we see in testing today, the reason for the decline in its perceived value, is that it doesn’t work that way; it doesn’t include the model-driven mechanisms for introducing context, coupling testing to management, and automating the testing process fully.

I also think that testing has to be seen as part of the service automation process I’ve described in earlier blogs.  Testing, AI, and simulation all play a role in commissioning network resources and validating service composition and status.  The craze today is automating everything, so it shouldn’t be surprising that an inherently manual testing process is hard to fit in.

That’s the point that test equipment advocates need to address first.  Any technology has to fit into the thrust of business process changes or it runs against the buyer’s committed path.  First, we don’t want decision support; we want to be told what to do.  Then, we don’t want to be told what to do; we want it done automatically.  Test equipment used to be about protocol visualization, which proved ineffective because too few people could do it.  Then it became protocol analysis, decoding the meaning of sequences.  Now it’s event automation: you identify one of those sequences and respond without human interpretation or intervention.
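
Here’s a toy illustration of that shift, with made-up event names and remediation actions.  The essential behavior is that a recognized sequence maps directly to an automated response, and only an unrecognized one reaches a person:

```python
# Illustrative only, with invented event names: the shift from "decode it for
# a person" to "recognize the sequence and act".  Known sequences map straight
# to automated handlers; anything unrecognized is escalated, not ignored.

from typing import Callable, Dict, List, Tuple

SIGNATURES: Dict[Tuple[str, ...], Callable[[str], None]] = {
    ("RETRANSMIT", "RETRANSMIT", "RETRANSMIT"):
        lambda svc: print(f"{svc}: retransmit storm, reroute around congested path"),
    ("LINK_DOWN", "LINK_UP", "LINK_DOWN"):
        lambda svc: print(f"{svc}: flapping trunk, quarantine and re-provision"),
}


def handle(service: str, recent_events: List[str]) -> None:
    """Match the tail of the event stream against known sequences and act."""
    for pattern, action in SIGNATURES.items():
        if tuple(recent_events[-len(pattern):]) == pattern:
            action(service)  # automated response, no human in the loop
            return
    escalate(service, recent_events)  # unknown sequence: punt to a person


def escalate(service: str, events: List[str]) -> None:
    print(f"{service}: unrecognized sequence {events}, escalating to an operator")
```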

You can’t do this without context, which is why expert people (who are pretty good at contextualizing out of their own knowledge) were always needed for testing.  Contextualization of testing using models, simulation, and AI provides an automated substitute for that human expertise.  Yes, there will always be some number of things an automated process has to punt to a higher (meaning human) level, but a proper system would learn from that (the machine-learning piece of AI) and do the right thing the next time.
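
Extending the event-automation sketch above, the crudest possible stand-in for that learning loop is to fold the operator’s resolution back into the signature table; it isn’t machine learning, but it shows the feedback path a learning component would fill:

```python
# Still hypothetical, extending the event-automation sketch: when an operator
# resolves an escalated case, record the resolution so the automated process
# does the right thing the next time the same sequence appears.

def record_resolution(pattern: Tuple[str, ...],
                      action: Callable[[str], None]) -> None:
    """Turn a human-supplied remediation into an automatic one."""
    SIGNATURES[pattern] = action


# After an operator handles ("BGP_RESET", "BGP_RESET") manually once:
record_resolution(("BGP_RESET", "BGP_RESET"),
                  lambda svc: print(f"{svc}: damp the BGP session and notify the peer NOC"))
```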

In an implementation sense, this means a kind of layered vision of testing, one that applies AI and human interaction to change the basic model that sets the context overall.  It doesn’t eliminate the older elements of probes and protocols, but it gradually shifts the burden from people to processes.  That’s what the market needs, and what successful testing strategies will eventually provide.