Name: Josh Liburdi
Years hunting: 3
Favorite datasets: Bro, memory artifacts, file metadata
Favorite hunting techniques: Stack Counting, baselining, data visualization
Favorite tools: Bro, LaikaBoss, Volatility, Sqrrl
Who are you?
My name is Josh Liburdi, and I’m a Security Technologist at Sqrrl. I previously worked at CrowdStrike on the Professional Services team and at General Electric on the GE-CIRT.
Why do you hunt and what is your experience hunting?
I hunt because I know attackers are getting into and moving around networks unseen, often evading traditional prevention and detection systems. I also hunt because it’s highly engaging and fun-- for me, it’s an intellectually satisfying activity that rewards creativity and learning.
I’ve been threat hunting for the past three years. I started with threat hunting by doing exploratory data analysis for the purpose of creating automated threat detection. The process of doing that is similar to hunting-- it involves looking through large data sets and finding new ways to identify attackers. That wasn’t a required daily task, but I found it very engaging and made it a part of my daily routine.
How would you define Threat Hunting?
I keep it simple: I refer to it as manual threat detection that is driven by people, not computers. The complement of threat hunting is something most people in the security operations space are already familiar with-- automated threat detection that is driven by computers, like an intrusion detection system (IDS) or a Security Information and Event Management (SIEM) tool.
What projects and organizations are you involved with right now?
Aside from working as Security Technologist at Sqrrl, I try to embed myself in communities for open source projects that I enjoy using. Over the past few years I’ve done a lot of work with Bro, an open source network security monitoring tool that is really useful for hunting. I’ve contributed a dozen or so scripts to the community and was the original developer for the remote desktop protocol (RDP) analyzer that’s now included with the tool by default.
Within the past year I’ve really gravitated towards file metadata tools like Lockheed Martin’s LaikaBoss and Emerson Electric’s File Scanning Framework, and have written and shared modules for those tools. My focus in the open source community tends to shift to whichever tool I find most interesting and most useful at the time.
Which of the hunts you’ve carried out was the most interesting or challenging?
The most interesting hunts to me are the ones where you uncover a significant set of previously unidentified activity. Late last year I became very interested in hunting for phishing emails. Throughout that process I came up with some new (to me) ways of finding them-- most were related to examining file attachments and looking for odd artifacts, like Zip files that only contained a single file-- and as a result found some large phishing campaigns that our automated threat detection systems didn’t catch.
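The single-file Zip heuristic described above can be sketched in a few lines. This is a minimal illustration, not the tooling used in the original hunt; the function name and sample data are hypothetical:

```python
import io
import zipfile

def is_single_file_zip(data: bytes) -> bool:
    """Return True if a Zip archive contains exactly one file entry.

    A single-file archive is a weak signal on its own, but stacked
    across many email attachments it can surface phishing campaigns.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            # Ignore directory entries; count only real file members.
            members = [i for i in zf.infolist() if not i.is_dir()]
            return len(members) == 1
    except zipfile.BadZipFile:
        # Not a valid Zip at all -- handle separately if needed.
        return False

# Build a tiny in-memory Zip with one member to demonstrate.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("invoice.js", "// payload")
print(is_single_file_zip(buf.getvalue()))  # True
```

In practice a check like this would run over attachments extracted by a file analysis framework such as LaikaBoss, with the rare "true" results reviewed by an analyst.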
What hunting techniques, tools, and datasets do you use most frequently?
I commonly use some combination of stacking, baselining, and data visualization when hunting. Stacking and baselining are typically used to achieve the same goal, which is identifying outliers / anomalies. I find data visualization really useful for understanding how networks are laid out and how systems interact-- which user was on which system, which systems connected to other systems, which systems requested specific resources on the internet, etc. For me, these techniques are most useful when combined with threat intelligence or friendly intelligence. For example, if I know an attacker has a tendency to place attack tools in a specific directory, then I focus on hunting for activity in that directory across multiple systems; if I know an attacker is likely to target a specific user or class of users, then I focus on hunting the systems and accounts related to those users.
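Stack counting itself is simple: count occurrences of a field value across many systems and investigate the rare values in the long tail. A minimal sketch, using hypothetical process-launch records:

```python
from collections import Counter

# Hypothetical (system, binary path) records; in practice these would
# come from endpoint data such as Windows event logs or CrowdResponse.
events = [
    ("host1", r"C:\Windows\System32\svchost.exe"),
    ("host2", r"C:\Windows\System32\svchost.exe"),
    ("host3", r"C:\Windows\System32\svchost.exe"),
    ("host4", r"C:\Windows\Temp\svchost.exe"),  # odd location
]

# Stack (frequency-count) on the binary path across all systems.
stack = Counter(path for _, path in events)

# Rare values are the investigation candidates: the long tail of a
# stack is where outliers live, so sort by ascending count.
for path, count in sorted(stack.items(), key=lambda kv: kv[1]):
    print(f"{count:>5}  {path}")
```

Here the single svchost.exe launched from a Temp directory stands out against the common System32 path -- exactly the kind of anomaly stacking is meant to surface.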
I separate tools into two categories: tools that provide data and tools that are used to analyze data. For tools that provide data, I really like using free / open source tools. Some of my favorites are Bro, LaikaBoss, Volatility, Yelp’s OSXCollector, and CrowdStrike’s CrowdResponse. Analysis tools are more hit or miss. If I’m doing simple search / retrieve / filter activities from a single data source, then command line tools like awk, sed, and grep are fine, and even something like a spreadsheet program is useful when handling CSV files. For data visualization and data correlation, I use Sqrrl Enterprise-- I find the free / open source alternatives to be unwieldy and they don’t always have the functionality I’m looking for.
When it comes to datasets, I usually try to have both network data and endpoint data available and my preference is to have both organized into a single place so I can easily correlate between the two. To generate these datasets, the tools previously mentioned are what I prefer to use-- Bro for network data, LaikaBoss for file data, Volatility for memory artifacts, OSXCollector for Mac endpoint data, and CrowdResponse for Windows endpoint data. I also think it’s important to “live off the land” and use whatever data is available, so long as it fulfills a need-- firewall logs, proxy logs, and Windows event logs are datasets commonly found in enterprise networks and they can be very useful for hunting. The particular tools / datasets that I like to use might not be the most appropriate for everyone, but using the network and endpoint data domains as a guide can help analysts understand what datasets they might need.
What general advice do you have for new Threat Hunters?
Don’t get frustrated and don’t be discouraged. Threat hunting doesn’t always end with you finding the latest and greatest threat actor in your environment-- sometimes you find misconfigured servers, sometimes you find users doing odd things, and sometimes you find nothing.
What hunting procedure would you recommend to someone new to Threat Hunting?
I recommend starting with a data set that you know well-- for the sake of an example, data sources that contain network connection data tend to be prevalent and commonly understood. Look in those data sources (e.g., firewall logs, NetFlow) for outbound connections on uncommon service ports that are transferring a significant number of bytes. Related to that, if you use Bro, then look in conn.log for outbound connections on common service ports (e.g., HTTP, FTP) and look for the lack of an identified service (e.g., a lack of the HTTP service over TCP port 80).
What parts of a hunt could you see as being most successfully automated or assisted by a machine?
Like most activities, any task you find yourself performing over and over is probably best suited for automation. Two tasks that come to mind are pulling internal data to establish context and pulling external data to establish context. Analysts use multiple data sources to build context and many of those data sources exist in different tools and services-- automating these collection tasks would mean less time spent retrieving data and more time spent investigating data.
What types of friendly intelligence are most useful for a hunter to have in an investigation?
Asset management data, data that describes users and their roles in the organization, and information regarding servers that store critical files (e.g., a list of servers that store “crown jewel” data) are useful to have.