Machine learning is the buzzy new thing in endpoint protection (and in practically every other industry), but is it truly a game changer when it comes to your security?
In this post we’ll explain the role machine learning plays in today’s endpoint protection landscape and what it means for your business.
Why endpoint protection companies are turning to machine learning
The primary job of endpoint protection is to block attacks. Endpoint protection software gets better at blocking attacks as it gets better at identifying and stopping the malware that accounts for most threats to endpoint systems.
In the past, this identification relied on recognizing specific malicious executables and packages. This technique isn’t workable in the current environment, where malware is commonly programmed to generate new variants of itself to avoid detection.
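To make that limitation concrete, here is a minimal sketch. It assumes a "signature" is nothing more than the SHA-256 hash of a known sample, which is a simplification of what real products do, and the sample bytes and function names are invented for illustration.

```python
import hashlib


def signature(data: bytes) -> str:
    """Return the SHA-256 hex digest, used here as a stand-in 'signature'."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical sample bytes; not real malware.
known_sample = b"payload-v1: connect, download, encrypt"
KNOWN_SIGNATURES = {signature(known_sample)}


def is_known_malware(data: bytes) -> bool:
    """Exact-match lookup: only flags content seen before, byte for byte."""
    return signature(data) in KNOWN_SIGNATURES


# A variant that would behave the same but differs by a single byte.
variant = known_sample.replace(b"v1", b"v2")

print(is_known_malware(known_sample))  # True
print(is_known_malware(variant))       # False
```

Changing one byte is enough to slip past the exact match, which is why a list of known samples cannot keep up with self-modifying malware.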
Today, endpoint protection software must be able to identify previously unseen malware types and variants. Meeting this challenge requires extrapolating common characteristics from a variety of samples of known malware and using them to identify new malware in the wild.
To do this well, the software needs the ability to analyze and find meaning in massive amounts of data from a wide variety of malware samples. Analysis on this scale is not easy to accomplish, and this is where machine learning comes in.
Machine learning gives programs the ability to interpret trends and relationships within datasets without being told what to look for.
Without machine learning, algorithms for data analysis are limited by human inputs and intuition. Human analysts must decide what questions to ask, what relationships to look for, what data to use, and how that analysis will be performed, step-by-step. Each decision the analyst makes prescribes a particular path. To gain new insights, the analyst must start over again with new questions or new data, but again, the analyst will start with their own hypothesis, testing to see if it holds.
Let’s consider what this means for endpoint protection. Without machine learning, security researchers have to reverse engineer malware samples one by one, attempting to draw their own conclusions on common attributes and behaviors. A single piece of malware might have thousands of attributes and behaviors to dissect, and the challenge isn’t just identifying them — it’s understanding how they work in combination and comparing them across a wide enough variety of malware samples to attain meaningful results. It is nearly impossible to do a comprehensive job.
Because computers can recognize relationships in data and draw and test inferences much more quickly than humans can, machine learning offers a more efficient approach. Replacing manual analysis with machine learning means that we can extrapolate the common attributes and behaviors of malware from a wide variety of samples far faster than reverse engineering them one by one.
Machine learning is only as good as its data
Machine learning is a tool for better analysis, and in endpoint security it is only as useful as the data it analyzes. Improvements to protection depend on rigorously training the model on data with high fidelity to the real world.
The most common application of machine learning in today’s endpoint protection landscape is to analyze file attributes during scans to predict whether a file is malicious. This is an effective approach for identifying and blocking file-based malware, but it doesn’t help when there are no file attributes to analyze.
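As a rough illustration of attribute-based prediction, the sketch below uses a nearest-centroid classifier, a deliberately simple stand-in for whatever model a real product trains. The three attributes, their values and the labels are all invented for illustration; none of them come from an actual product or dataset.

```python
from math import dist
from statistics import mean

# Hypothetical file attributes, each scaled to the range 0..1:
# (byte entropy, share of suspicious imports, packed flag)
TRAINING = {
    "malicious": [(0.95, 0.80, 1.0), (0.90, 0.70, 1.0), (0.85, 0.90, 0.0)],
    "benign": [(0.50, 0.10, 0.0), (0.60, 0.05, 0.0), (0.55, 0.20, 0.0)],
}


def centroid(rows):
    """Average each attribute column across the samples of one label."""
    return tuple(mean(column) for column in zip(*rows))


def train(training):
    """'Training' here is just computing one centroid per label."""
    return {label: centroid(rows) for label, rows in training.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the file's attributes."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(TRAINING)

# A file the model has never seen, described only by its attributes.
unseen_file = (0.92, 0.75, 1.0)
print(predict(model, unseen_file))  # malicious
```

The prediction needs an attribute vector as input. If an attack never puts a file on disk, there is nothing to build that vector from.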
To circumvent protection that relies on the attributes of files, hackers are now coming up with increasingly creative ways to deliver “fileless” attacks. What constitutes a fileless attack warrants another discussion in itself, but for the purposes of this post we’ll think of it as an attack that doesn’t require or leave a file on disk.
With no file on disk, there is nothing to scan, and no protection that works by scanning will be able to recognize and stop these attacks. This applies to traditional signature-based antivirus technologies as well as more cutting-edge solutions that use machine learning as described above.
The weakness in these cases isn’t the machine learning; it’s the data.
Applying the most sophisticated machine learning in the world to file-attribute data won’t help you stop an attack that doesn’t have any file attributes to analyze.
Fortunately, it’s possible to identify and stop fileless attacks by analyzing a different type of data: activity on the system. Malware doesn’t have to be a file, but it must attempt malicious actions eventually. A new category of endpoint protection solutions called Runtime Malware Defense (RMD) is tackling this problem by analyzing system activity in real time to identify malware based on its behavior.
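The idea of judging activity rather than files can be sketched as follows. This is not how any particular RMD product works: the event names, weights and threshold are hypothetical, and the weights are hard-coded here where a real system would presumably learn them from labeled activity data.

```python
# Hypothetical event names and weights, chosen only for illustration.
SUSPICION_WEIGHTS = {
    "office_app_spawns_shell": 0.4,
    "encoded_command_line": 0.3,
    "code_injected_into_process": 0.5,
    "mass_file_encryption": 0.9,
}
BLOCK_THRESHOLD = 0.8  # assumed cut-off for stopping the activity


def score_activity(events):
    """Sum the weights of the distinct suspicious events observed so far."""
    return sum(SUSPICION_WEIGHTS.get(event, 0.0) for event in set(events))


def should_block(events):
    """Block once the accumulated suspicion reaches the threshold."""
    return score_activity(events) >= BLOCK_THRESHOLD


# No file is scanned in either case; only observed activity is scored.
fileless_attack = [
    "office_app_spawns_shell",
    "encoded_command_line",
    "code_injected_into_process",
]
normal_use = ["file_opened", "file_saved"]

print(should_block(fileless_attack))  # True
print(should_block(normal_use))       # False
```

No single event in the attack sequence crosses the threshold on its own; the decision comes from the combination of behaviors.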
The adoption of machine learning is driving security forward on multiple fronts, but the most effective endpoint protection solutions will be the ones that apply it to the right mix of data.
As attacks progress, they reveal more and more data that can be used to predict their malicious nature. This data comes in two forms:
- Artifacts: Discrete objects such as files on disk or a phishing email containing a malicious attachment
- Behaviors: The actions the malware takes to gain access and infect the system
Protection that makes use of machine learning to analyze both artifacts and behaviors across stages of an attack will be the most effective at identifying and blocking attacks.
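One way to picture combining the two kinds of data is a noisy-OR of two scores, sketched below. This is an assumption made for illustration, not a description of any product: it treats the artifact and behavior scores as independent probabilities, and the function and parameter names are hypothetical.

```python
from typing import Optional


def combined_risk(artifact_score: Optional[float],
                  behavior_score: Optional[float]) -> float:
    """Noisy-OR: probability that at least one signal indicates malware.

    A missing signal (None), such as the artifact score of a fileless
    attack, simply contributes nothing.
    """
    probability_clean = 1.0
    for score in (artifact_score, behavior_score):
        if score is not None:
            probability_clean *= 1.0 - score
    return 1.0 - probability_clean


# File-based attack: both signals are available and reinforce each other.
print(combined_risk(0.5, 0.5))

# Fileless attack: no artifact to score, so behavior carries the verdict.
print(combined_risk(None, 0.9))
```

With only an artifact score, the fileless case would return zero risk; adding the behavior score is what closes that gap.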
What businesses can do now to improve endpoint protection
The application of machine learning in endpoint protection is still in its infancy, with most products focused on applying machine learning to file analysis in the pre-execution phase of an attack. Pre-execution protection has been dominated by traditional antivirus for too long, and the use of machine learning to analyze files on disk is certainly an improvement over signature matching.
Still, given current cybersecurity risks, investing only in additional pre-execution protection will continue to leave a large blind spot for the growing number of “fileless” attacks.
To maximize the security return on their next dollar invested, businesses should add Runtime Malware Defense that identifies and blocks malware based on both artifacts and activity. Runtime Malware Defense works downstream from existing pre-execution protection and provides the last line of defense against attacks that pre-execution protection simply can’t see.
Find out more about how RMD works in our Complete Guide to Runtime Malware Defense.