Sqrrl Blog

Sep 12, 2016 3:41:22 PM

The Applicability of Graphs for Information Security Combatants

This post by Henrik Johansen originally appeared on Medium. Henrik is an IT Security professional at a Danish public sector entity called Region Syddanmark.

I have been tweeting a lot lately about Graphs and how they can be utilised in the context of Information Security. Since this topic seems to interest a few people, I thought a more thorough explanation would make sense. Think of this as the “why” and “what” more than the “how”.


Before I bore you to death with this rant I would like to give you a few words of caution; everything below is highly biased and relies on a few key observations …

“The relations between entities are often more interesting than the individual entities themselves in the context of modern Information Security”
“Maliciousness is often determined by something that is not normal in the given context”
“The creativity, ingenuity and brain power of the humans involved in Information Security matters more than the individual tools you buy from Infosec vendors”

If you strongly disagree with what is stated above this post probably isn’t for you :)

Graphs can form the basis for a really neat platform that can enable, facilitate and accelerate both traditional DFIR workflows and threat hunting scenarios. Before we start with some concrete examples we need to take a small detour into our Graph and the data that it contains.

Our current Graph is structured around the content of our Active Directory. The relevant objects (User, Group, Computer) are regularly pulled out and created (or updated) in our Graph. The basic relationships look like this:

(Computer) -> MemberOf -> (Group)
(User) -> MemberOf -> (Group)
(Group) -> MemberOf -> (Group)
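As a minimal sketch of that structure (plain Python rather than whatever graph database the author runs, with made-up node names), the MemberOf edges can be held in an adjacency map and followed transitively to find every group a user or computer effectively belongs to:

```python
from collections import defaultdict

# Hypothetical sample data; real edges would come from an Active Directory export.
MEMBER_OF = [
    ("user:alice", "group:helpdesk"),
    ("computer:ws-042", "group:workstations"),
    ("group:helpdesk", "group:local-admins"),
    ("group:local-admins", "group:privileged"),
]


def build_graph(edges):
    """Return an adjacency map: node -> set of groups it is a direct member of."""
    graph = defaultdict(set)
    for member, group in edges:
        graph[member].add(group)
    return graph


def effective_groups(graph, node):
    """Follow MemberOf edges transitively and return every group reached."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for group in graph.get(current, ()):
            if group not in seen:  # also protects against membership cycles
                seen.add(group)
                stack.append(group)
    return seen


graph = build_graph(MEMBER_OF)
alice_groups = effective_groups(graph, "user:alice")
print(sorted(alice_groups))
```

Nested group membership is exactly the kind of question that is awkward in a flat list and trivial once the data is a graph.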

Besides data from Active Directory we enrich this basic structure with other information to form a multidimensional, contextual super Graph.

User logins, process executions, network connections, DNS queries, IDS alerts, etc. are all added to this Graph in near-realtime through our log management platform. Think of this as a constantly evolving, materialized view of what is happening and has happened in your infrastructure. You can explore single or multiple dimensions of this Graph either visually or by writing queries that answer specific questions or search for specific structures.

The individual pieces of your Graph (nodes and relationships) can also be contextualized by adding information to them in the form of key:value pairs. Is a User a local admin? Put that in a property. The geolocation of IP addresses? Put it in a property. The duration or number of bytes of a particular network connection? Yeah, you can probably guess what I am about to say.
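To make the key:value idea concrete, here is a small sketch in Python. The class names, property names (`local_admin`, `country`, `bytes`, `duration_s`) and sample values are all assumptions for illustration, not the author's schema:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                       # e.g. "User", "Computer", "IP"
    name: str
    properties: dict = field(default_factory=dict)   # free-form key:value context


@dataclass
class Relationship:
    source: Node
    kind: str                                        # e.g. "ConnectedTo"
    target: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample entities with context attached as properties.
user = Node("User", "alice", {"local_admin": True})
host = Node("Computer", "ws-042")
ip = Node("IP", "203.0.113.7", {"country": "NL"})
conn = Relationship(host, "ConnectedTo", ip, {"bytes": 48213, "duration_s": 12.5})


def large_transfers(relationships, min_bytes):
    """Return network connections whose 'bytes' property is at least min_bytes."""
    return [
        r for r in relationships
        if r.kind == "ConnectedTo" and r.properties.get("bytes", 0) >= min_bytes
    ]


print(large_transfers([conn], min_bytes=10_000))
```

Because properties live on the nodes and relationships themselves, a filter like this needs no lookup in a second tool.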

There is no limit to the information that you can add to your Graph. Want to see what registry key a process has modified? Want to see who has accessed a particular directory or file? Put that in your Graph.

No more pivoting between different tools or browser tabs — you have a singular source of truth that can provide insights at speed and scale because all the parts you care about are related. A precomputed chain of things, events and context.

It is pretty easy to evolve a Graph as time passes, so there really is no excuse for not starting that journey now and figuring it out as you go.

Let’s see some actual examples of how a Graph can be utilised …

DFIR Workflow — the IDS alert

This is probably something you’re all familiar with — your IDS throws an alert from an internal machine and you need to deal with that alert.

ET MALWARE User-Agent (Internet Explorer)

Is this legit traffic or malicious? Well, it depends on the circumstances that are related to this alert: the machine this connection came from, the process that initiated this connection, its name, the user context that this process ran under, its parent process, location on disk, time of day, etc. All of that data can be pulled out of the Graph as a structure that can be visualised or even processed by logic for automated decision making.

Let’s say this was determined to be a malicious process — what impact does this threat pose to your organisation?

What does this machine have access to (applications, data, fileshares, etc.)? Which users have accessed that machine lately or during the time of infection? What other systems do those users have access to and what privilege levels do they have where? What do those machines have access to?

These are only a few of the questions that come to mind — all of them can be answered using a Graph. And they can be answered quickly :)

You can build a complete reverse attack tree and simulate the most likely path an attacker would take during lateral movement with a few clicks of your mouse or a few queries.

All from within the same Graph.
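The impact questions in this section boil down to a graph traversal. The sketch below is one possible way to express it in Python, assuming hypothetical relationship names (`LoggedOnTo`, `AdminOf`, `HasAccessTo`) and sample data. It walks outward from the infected machine: users who logged on there are treated as exposed, then whatever those users can reach, and so on, up to a hop limit.

```python
from collections import defaultdict, deque

# Hypothetical edges; relationship names are illustrative, not the author's schema.
EDGES = [
    ("user:alice", "LoggedOnTo", "computer:ws-042"),
    ("user:bob", "LoggedOnTo", "computer:ws-042"),
    ("user:alice", "AdminOf", "computer:srv-db01"),
    ("user:bob", "HasAccessTo", "share:finance"),
    ("user:carol", "LoggedOnTo", "computer:srv-db01"),
    ("user:carol", "AdminOf", "computer:dc01"),
]


def blast_radius(edges, infected_machine, max_hops=3):
    """Breadth-first walk from an infected machine; returns node -> hop distance."""
    users_on = defaultdict(set)      # machine -> users who logged on to it
    reachable_by = defaultdict(set)  # user -> resources that user can reach
    for source, kind, target in edges:
        if kind == "LoggedOnTo":
            users_on[target].add(source)
        else:  # AdminOf / HasAccessTo
            reachable_by[source].add(target)

    distance = {infected_machine: 0}
    queue = deque([infected_machine])
    while queue:
        node = queue.popleft()
        if distance[node] >= max_hops:
            continue  # do not expand beyond the hop limit
        neighbours = users_on.get(node, set()) | reachable_by.get(node, set())
        for neighbour in sorted(neighbours):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


for node, hops in blast_radius(EDGES, "computer:ws-042").items():
    print(hops, node)
```

The hop distance gives a rough ordering of what to look at first. A real Graph would also weight the edges by privilege level and recency of the login, which this sketch leaves out.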

Threat Hunting using Graphs

I really think that observation #2 (maliciousness is often determined by something that is not normal in the given context) is a prime candidate for hunting, and here a Graph can help, too.

I like to start my hunting activities by formulating a hypothesis (i.e. having a hunch) and then querying my Graph to validate my idea.

A hypothesis looks something like this:

In many cases rarity is an excellent indicator for deviations from what is considered “normal”, so I would like to look at parent-child relationships in process executions across our entire fleet, because the majority of processes on Windows should only ever have the same parent process name.

This could also be done for the user context, location on disk, etc.
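A rough Python sketch of the parent-child hypothesis, assuming process execution events have already been pulled out of the Graph as (machine, parent, child) tuples. The sample events and the 25% threshold are invented for illustration:

```python
from collections import Counter, defaultdict

# Hypothetical process execution events: (machine, parent process, child process).
EVENTS = [
    ("ws-001", "services.exe", "svchost.exe"),
    ("ws-002", "services.exe", "svchost.exe"),
    ("ws-003", "services.exe", "svchost.exe"),
    ("ws-004", "services.exe", "svchost.exe"),
    ("ws-005", "winword.exe", "svchost.exe"),
    ("ws-001", "explorer.exe", "winword.exe"),
    ("ws-002", "explorer.exe", "winword.exe"),
]


def rare_parents(events, max_share=0.25):
    """Return (child, parent, count) where the parent accounts for at most
    max_share of all observed executions of that child."""
    parents_by_child = defaultdict(Counter)
    for _machine, parent, child in events:
        parents_by_child[child][parent] += 1

    findings = []
    for child, parents in parents_by_child.items():
        total = sum(parents.values())
        for parent, count in parents.items():
            if count / total <= max_share:
                findings.append((child, parent, count))
    return sorted(findings)


print(rare_parents(EVENTS))
```

Note that no list of "bad" process names goes in. The odd pairing stands out purely because it is rare relative to the rest of the fleet.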

Another example is DNS query distribution by domain name…

I would like to look at external DNS query distribution (domain names) per source machine because domains that have been looked up by only a few machines are suspicious.

The same process is applicable to other stuff: HTTP host headers, domain names in SSL/TLS certificates, process names, etc.
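The DNS hypothesis can be sketched the same way. The snippet below assumes query logs reduced to (source machine, domain) pairs and uses made-up sample data. It counts distinct source machines per domain and keeps the domains seen from very few of them:

```python
from collections import defaultdict

# Hypothetical DNS query log entries: (source machine, queried domain).
QUERIES = [
    ("ws-001", "example.com"),
    ("ws-002", "example.com"),
    ("ws-003", "Example.com."),
    ("ws-001", "example.org"),
    ("ws-002", "example.org"),
    ("ws-003", "rare-domain.example"),
    ("ws-003", "rare-domain.example"),
]


def rare_domains(queries, max_machines=1):
    """Return domain -> sorted source machines, for domains that were looked up
    by at most max_machines distinct machines."""
    machines_by_domain = defaultdict(set)
    for machine, domain in queries:
        # Normalise case and the trailing root dot so variants collapse together.
        machines_by_domain[domain.lower().rstrip(".")].add(machine)
    return {
        domain: sorted(machines)
        for domain, machines in machines_by_domain.items()
        if len(machines) <= max_machines
    }


print(rare_domains(QUERIES))
```

Counting distinct machines rather than raw queries matters here: one noisy host repeating the same lookup should not make a domain look popular.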

All of the above is determined by structure alone; you do not need to feed in a list of potentially suspicious process names, domains, disk locations, etc.

A few last words

We all know that our workload as defenders is only going to increase; the complexity and operational speed of our adversaries keep growing, and the stuff we are trying to protect is also increasing in complexity at faster and faster rates.

Graphs are a way to address this effectively.

“Defenders think in lists. Attackers think in graphs. As long as this is true, attackers win”

The impact that a multidimensional, contextual super Graph can have on Information Security efforts is quite honestly a quantum leap forward and, if you ask me, an actual requirement for the modern defender.
