Is Security a Good Application for AI?

What, exactly, is the relationship between AI and security? We hear a lot about AI these days, and though my own data doesn’t come close to the level of AI adoption some surveys claim to find, it does show that the number of enterprises with actual AI applications has tripled since 2021, which represents pretty decent growth. Security, of course, has been a priority for much longer, but has also been a recent target for AI enhancement. That seems to be a work in progress, but the potential there is real, and so is enterprise interest.

Enterprises tell me that there are two ways in which AI could enhance security. One is the obvious application, where AI is used to detect intrusions directly. The other is less obvious but, at least for the present, a bit more credible to enterprises: the use of AI to reduce the configuration and parameterization errors that open security holes. Interest in the latter comes in part because that error-management mission has other benefits.

When Cloudflare had its human-error BGP problem that took down part of the Internet, and Rogers in Canada admitted to a “maintenance” error that took most customers and services offline, enterprises told me that they had their own network configuration problems that impacted their own applications. They also told me that deployment and redeployment automation, whether through DevOps tools (Ansible, Chef, Puppet) or container orchestration (Kubernetes), created problems of its own. A surprising number commented that while these errors sometimes created failure modes that impacted availability, their real concern was that they might well also impact security, and that those impacts might not be easily recognized.

One CIO told me that after the Cloudflare incident, the CEO became concerned that the same kinds of errors might have created accidental holes in security, and asked for a review. To the surprise of IT, the review found a half-dozen examples of what they called “loopholes” in their security plan, created by misconfiguration of one or more applications, virtualization platforms, or networks. There were indications that one or perhaps two had already been exploited, but none had been caught.

This enterprise has undertaken what some jokingly call the “Big Brother” project, because we all know the phrase “Big Brother is watching you!” They want AI to play that role, looking at the hosting and network platform setup for their applications and continually verifying that changes, tweaks, and recoveries don’t create those pesky loopholes.

While this enterprise hasn’t yet made the leap from a security-focused AI plan to a governance/compliance AI plan, it seems pretty likely that they will. Most organizations tell me they don’t believe their compliance plans address all of their application and network changes, or ensure that those changes can’t break compliance without leaving a clear fingerprint to trigger management attention.

Enterprises have at least a vague view of how they’d like this to work, as the “Big Brother” term suggests. They believe that AI could play a role in continuously matching what we might call “platform and application configuration” policies for security (and eventually, I think, compliance) against actual conditions. They acknowledge that this will almost surely mean formally authoring those policies so that automated analysis can be conducted, but they believe the information on current conditions is already available. Most think that machine learning could ease the process of establishing which conditions actually break policies, but some believe that at least a bit of expert input would speed things up.
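To make that policy-matching idea concrete, here’s a minimal sketch of what formally authored policies checked against actual conditions might look like. The policy format, the audit helper, and the configuration snapshot are all hypothetical, invented for illustration; they’re not any vendor’s real schema or API.

```python
# Hypothetical sketch: formally authored security policies checked against
# an observed configuration snapshot. The policy format and snapshot
# structure are illustrative assumptions, not a real product's schema.

# A "policy" is a named rule plus a predicate over the observed configuration.
POLICIES = [
    {
        "name": "no-public-ssh",
        "description": "No firewall rule may expose port 22 to 0.0.0.0/0",
        "check": lambda config: not any(
            rule["port"] == 22 and rule["source"] == "0.0.0.0/0"
            for rule in config["firewall_rules"]
        ),
    },
    {
        "name": "tls-required",
        "description": "All externally reachable services must terminate TLS",
        "check": lambda config: all(
            svc["tls"] for svc in config["services"] if svc["external"]
        ),
    },
]

def audit(config_snapshot):
    """Return the policies the current configuration violates."""
    return [p for p in POLICIES if not p["check"](config_snapshot)]

# Example snapshot, as it might be pulled from orchestration or inventory tooling.
snapshot = {
    "firewall_rules": [
        {"port": 443, "source": "0.0.0.0/0"},
        {"port": 22, "source": "0.0.0.0/0"},   # loophole left by a redeployment
    ],
    "services": [
        {"name": "web", "external": True, "tls": True},
    ],
}

for violation in audit(snapshot):
    print(f"POLICY VIOLATION: {violation['name']} - {violation['description']}")
```

The AI angle would sit on top of something like this, with machine learning suggesting which conditions tend to precede violations or proposing candidate rules, rather than relying entirely on hand-authored checks.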

Would a single architecture help this along? The limited input I’ve gotten on that question suggests that it would, but nobody believes that such an architecture exists or says that their vendors are currently pitching one as a future capability. Very few believe such an architecture is on the radar overall, in fact.

How about the “detect the breach” mission for AI in security enhancement? This seems like a no-brainer to enterprises; they believe that AI could be used to learn “normal” application and network behavior and then flag “abnormal” states. Juniper is apparently seen as a vendor making progress in this area, but enterprises think those initiatives are aimed more at fault management than at security management.

Enterprises that have had security breaches tell me that in almost every case, postmortem analysis shows there was a detectable change in traffic patterns, application behavior, or other visible metrics, but that operations staff either didn’t notice the difference or didn’t interpret it correctly. Most said that while they could identify the link between a breach and a change in behavior after the fact, they were unsure the correlation was strong enough to trigger human intervention, which is where they hope AI could play a role.

Most enterprises seem to think that this is a task AI could undertake successfully, but they’re more divided on just how to go about it and what to do with any AI-generated alerts. Among my contacts, the presumption that a vendor would offer a canned product somehow pre-loaded with rules to identify breach-related abnormalities has declined over time, in favor of a machine-learning approach. However, there are concerns about both of these models of AI application.

The problem enterprises have with pre-loaded analysis frameworks, of course, is that they may not reflect the actual conditions created by a company’s own configuration and security problems. Enterprises have their own network traffic patterns, driven by their own shifts in application usage. Would it be possible to establish rules that were helpful in identifying abnormal shifts when “normal” patterns are so diverse?

Machine learning is a solution most enterprises say could be made to work, but it’s the “made” part that concerns them. First, for ML to identify abnormal patterns you need traffic monitoring granular enough to distinguish application and user traffic. Most enterprises don’t have that, and many are concerned that this sort of monitoring might create security and compliance risks in itself. Second, you need someone to review each new pattern to determine whether it’s “abnormal”, and enterprises fear that this could require so much time from network and application operations teams that it could disrupt operations.
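A minimal sketch of that machine-learning approach might look like the following: learn a baseline from observed traffic metrics, score new observations against it, and queue anything flagged as abnormal for human review rather than acting on it directly. The features, the contamination setting, and the review queue are illustrative assumptions; a real deployment would need the per-application traffic monitoring most enterprises say they lack.

```python
# Hypothetical sketch: learn "normal" traffic behavior, flag anomalies,
# and queue them for human review instead of acting automatically.
import numpy as np
from sklearn.ensemble import IsolationForest

# Each row: [requests/sec, mean bytes per flow, distinct destination count].
# In reality this would come from per-application traffic monitoring;
# here it's synthetic data standing in for a learned baseline.
baseline = np.random.default_rng(0).normal(
    loc=[500, 12_000, 40], scale=[50, 1_500, 5], size=(1_000, 3)
)

model = IsolationForest(contamination=0.01, random_state=0).fit(baseline)

review_queue = []

def score_observation(obs):
    """Score one observation; queue it for review if the model calls it abnormal."""
    label = model.predict([obs])[0]          # 1 = inlier, -1 = outlier
    if label == -1:
        review_queue.append({"metrics": obs, "disposition": "pending human review"})
    return label

# A typical measurement looks normal; exfiltration-like traffic does not.
score_observation([480, 11_800, 38])         # ordinary traffic
score_observation([510, 95_000, 400])        # huge flows, many destinations

for item in review_queue:
    print("Needs review:", item["metrics"])
```

The review queue is exactly where the operations-burden concern lands: every flagged pattern still needs someone to decide whether it’s genuinely abnormal or just a new normal.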

The final issue, where both objections seem to converge, is what to do with the alerts. Do you activate some remedial process (one enterprise called this a “lockdown”) in response to an alert, or notify somebody? Enterprises think that “eventually” they would be confident enough in an AI big-brother oversight system to let it take action, but they admit it would take six months or a year to get there, and that in the meantime they’d have to rely on human reaction to an alert. That could increase the operations burden again, and also create a “risk window” that could reduce the overall value of the AI solution.
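One way to picture that “eventually” transition is a disposition layer that starts by notifying humans and only graduates an alert class to automatic lockdown once operators have confirmed enough of its alerts as real. The thresholds and actions below are purely illustrative assumptions, not a recommendation.

```python
# Hypothetical sketch: route AI alerts to either a human or an automatic
# "lockdown" action, based on how often that alert class has been confirmed
# correct by operators. Thresholds and actions are illustrative assumptions.
from collections import defaultdict

CONFIRMATIONS_REQUIRED = 50      # confirmed-true alerts before auto-remediation
MIN_PRECISION = 0.95             # confirmed-true / total reviewed, per alert class

history = defaultdict(lambda: {"confirmed": 0, "reviewed": 0})

def record_review(alert_class, was_real):
    """Operators feed back whether a reviewed alert was a real security event."""
    history[alert_class]["reviewed"] += 1
    if was_real:
        history[alert_class]["confirmed"] += 1

def dispose(alert_class):
    """Decide whether this class of alert has earned automatic remediation."""
    h = history[alert_class]
    precision = h["confirmed"] / h["reviewed"] if h["reviewed"] else 0.0
    if h["confirmed"] >= CONFIRMATIONS_REQUIRED and precision >= MIN_PRECISION:
        return "lockdown"        # e.g. quarantine the workload, revoke the route
    return "notify"              # page the operations team; the risk window stays human

# Early on everything is "notify"; trust is earned per alert class over time.
print(dispose("unexpected-egress"))          # notify
for _ in range(60):
    record_review("unexpected-egress", was_real=True)
print(dispose("unexpected-egress"))          # lockdown
```

The risk window in this sketch is the period when everything still routes to “notify”; how fast it shrinks depends on how quickly operators can feed confirmations back.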

I think that security could emerge as a driver of AI oversight of network operations practices and of network connection and traffic behavior. The barriers to that emergence aren’t very different from the barriers to AI adoption in netops overall, so security may be adding to AI’s adoption benefits without really adding to its operations burdens or risks, compared with AI driven by operations oversight alone. If that’s true, then a discussion of AI’s value in security could benefit vendors and enterprises alike.