By John Matthews, ExtraHop | 28 November 2019
Artificial Intelligence (AI) might be the most popular term in cybersecurity. And why not? A piece of technology that learns and thinks like a human brain, but better. Who wouldn't want that?
The mere phrase conjures up images of supercomputers, sentient robots, and rogue operating systems. The promises of AI are great, and cybersecurity has seized upon the concept with enthusiasm. However, in an industry where jargon reigns, it is easy to lose sight of a concept's original meaning, or for it to morph into something larger than it is. That lack of precision can lead to poor security decisions and confusion about where and how to deploy security resources.
A survey by CapGemini earlier this year revealed just how much the cybersecurity industry expects of Artificial Intelligence, with over 60 percent of respondents proclaiming they cannot "identify critical threats" without AI technology.
While the excitement around AI can't be overstated, the term has suffered from overuse, and the fervor has stretched its meaning rather far.
A report from MMC Ventures showed that 40 percent of European tech startups that claim to use Artificial Intelligence in fact do not. The report also found that companies listed as AI companies received 15 to 50 percent more funding than those that weren't, further incentivizing companies to stretch the meaning of "AI" to impress investors.
Artificial Intelligence has many definitions, but a good working definition came from Andrew Moore, the former dean of Carnegie Mellon University's School of Computer Science. Simply: "Artificial Intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence."
Machine Learning, on the other hand, offers something more modest. Machine Learning is a computer system that can continually learn from new experiences and take on new data autonomously. From there, it can automate many of the more mundane tasks in IT and continue to improve upon them without much further input from its operator.
In cybersecurity, this often expresses itself in vendors telling customers that they're offering Artificial Intelligence when they're really offering Machine Learning: a tremendously effective technology, but ultimately just one component of Artificial Intelligence.
The importance of the distinction should not be understated: Machine Learning is a subset of Artificial Intelligence. Conflating or interchanging the two terms creates confusion in the market, turning into promises that can't be delivered upon.
One of the most effective uses of Machine Learning in cybersecurity is in Network Detection and Response (NDR). Machine Learning allows organizations to make better use of their data to monitor their own networks and spot anomalous behavior. In many places, that job is still done manually by humans who have to pore over thousands of false-positive alerts a day, dulling their alertness. Machine Learning enables organizations to reduce false positives by learning baseline behavior to understand what is normal, and alerting only on what is a real risk. This means that security teams can respond to threats faster and with more certainty.
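As an illustration only (not any particular vendor's implementation), the baseline idea can be sketched with a simple statistical model: learn what "normal" looks like from observed samples, then alert only on large deviations. The traffic metric, sample values, and threshold below are hypothetical.

```python
import statistics

def build_baseline(samples):
    """Learn a simple baseline for a metric: its mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean, stdev = baseline
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold

# Hypothetical outbound bytes-per-minute samples for one host during normal operation
normal_traffic = [980, 1020, 1005, 995, 1010, 990, 1000, 1015]
baseline = build_baseline(normal_traffic)

print(is_anomalous(1008, baseline))   # ordinary fluctuation: not flagged
print(is_anomalous(50000, baseline))  # exfiltration-like spike: flagged
```

Real NDR systems model far richer behavior (per-host, per-protocol, time-of-day), but the principle is the same: the alert criterion comes from the environment's own data rather than a generic rule.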
Many NDR solutions are currently rule- or signature-based. Rule-based systems police their environments by matching observed behavior on the network against known signals of suspicious behavior.
However, they typically require constant
maintenance and updates. They can only respond to threats that have been seen
before and incorporated into a current knowledge base, precluding them from
detecting new attacks and tactics. Furthermore, rules are fairly simple for
attackers to evade. A minor tweak to an existing attack tactic or piece of
malware is often enough to make existing rules ineffective against the attack.
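The fragility described above can be seen in a toy example. Suppose a detection rule (hypothetical, for illustration) flags encoded PowerShell invocations seen in past attacks; an attacker who simply swaps in an alias for the binary slips past it.

```python
import re

# Hypothetical detection rule: flag encoded PowerShell command lines,
# a pattern associated with known download-cradle attacks.
RULE = re.compile(r"powershell.*-enc", re.IGNORECASE)

def rule_matches(command_line):
    """Return True if the command line matches the known-bad pattern."""
    return bool(RULE.search(command_line))

print(rule_matches("powershell -enc SQBFAFgA"))       # matches the rule
print(rule_matches("pwsh -EncodedCommand SQBFAFgA"))  # same attack, evades the rule
```

The second command line performs the same action, but because the rule encodes one specific spelling of the behavior, a trivial tweak defeats it until a human writes and ships an updated rule.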
Signature-based systems detect malware from a catalogue of known, publicly reported signatures using file hashes. But problems still arise when new, unknown malware families arrive on the scene or adversaries merely change signatures. This approach has no hope against modern polymorphic malware, which mutates its code with each infection so that no two samples share a hash.
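A short sketch shows why hash-based matching is so brittle: changing a single byte of a file produces an entirely different hash, so the catalogue no longer matches. The payload bytes and database below are invented for illustration.

```python
import hashlib

# Hypothetical signature database of known-bad SHA-256 file hashes
KNOWN_MALWARE_HASHES = {
    hashlib.sha256(b"malicious payload v1").hexdigest(),
}

def is_known_malware(file_bytes):
    """Match a file against the catalogue by its SHA-256 hash."""
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_MALWARE_HASHES

print(is_known_malware(b"malicious payload v1"))  # exact match: detected
print(is_known_malware(b"malicious payload v2"))  # one byte changed: missed
```

Polymorphic malware automates exactly this trick, recompiling or re-encoding itself on every infection, which is why a static hash catalogue always lags behind.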
The problem with both of these approaches lies
in an asymmetry between cyber attackers and their victims. Attackers can
develop new attack tactics and new malware variants much more quickly than
defenders can develop new defenses. Attackers only need one of their attempts
to work, whereas defenders need to defend against every existing tactic while
always anticipating never-before-seen attacks.
Here’s where Machine Learning’s edge is
sharpest. By learning from real-time observed behavior in the environment that
needs to be defended, a ML system can improve upon itself much more effectively
than a manually updated signature database, and can refine its results over
time without human intervention.
An NDR solution with Machine Learning can learn what suspicious behavior looks like in the context of a specific environment, rather than relying on generic known-bad behavior signals and indicators of compromise. Furthermore, it can adapt to new tactics, and better respond to the ever-changing threat landscape.
AI often means whatever people want it to mean. It can be a powerful marketing term, but as more technologies built on this slippery foundation fail to deliver on their promises, customers' trust in the term will wane rapidly. Machine Learning can be a powerful arrow in the security operations quiver, but buyers should learn to recognize the telltale signs of AI snake oil, and should verify that any vendor claiming AI or Machine Learning can deliver the value they promise.
John Matthews is Chief Information Officer at ExtraHop, a company specializing in network traffic analysis tools.