Remote Forensics is the New Black

Like pretty much everything else in information security, forensics is constantly evolving. One area of special interest for practitioners is doing forensics on remote computers, not that the idea is entirely new.

The use case is self-explanatory to those working in the field, but for beginners I'll give a brief introduction.

When a case lands on your desk and lights up as something interesting, what do you do? Your first step is probably the network logs, searching for network indicators of compromise. Having found something interesting on some of the clients, let's say ten in this case, you decide to put some more effort into explaining the nature of the network traffic. None of the clients is on-site, and several of them are even at locations with 1 Mbps upload speeds.

The next phase would probably be to search open sources, perhaps confirming that something fishy is going on. Now you'd like to examine some of the clients' logs for hashes and strings you found in open sources, and the traditional way to go is acquiring disk and memory images. Or is it? That could easily take weeks for ten clients. In this case you are lucky: you have a tool for performing remote forensics at hand, a major roll-out for your organization after a larger breach.

What's new in remote forensics is that the tools are beginning to mature, and on that note I would like to introduce the two products I find most relevant to the purpose:

  • GRR (Google Rapid Response)
  • Mandiant IR (MIR)

Actually, I haven't put the latter option to the test (MIR supports OpenIOC, which is a huge advantage), but I have been taking GRR for a spin for some time now. There are also other tools which may be of interest to you, such as Sourcefire FireAMP, which I've heard performs well for endpoint protection. I've chosen to leave that out of this post since it is a different concept. Unsurprisingly, the following will use GRR as a basis.

For this post there are two prerequisites, which I highly recommend following to get a feel for GRR:

  • Set up a GRR server. In this post I've used the current beta, 3.0-2, running all services on the same machine, including the web server and the client roll-in interface. There is an install script for the beloved Ubuntu here, but I couldn't easily get it working on other systems. One exception is Debian, which only needed minor changes. If you have difficulties with the latter, please give me a heads-up.
  • Sacrifice one client to be monitored (as far as I can tell it won't brick a production system either, though). You will find binaries after packing the clients in the GRR server setup. See the screenshot below for details. The client will automatically report in to the server.

You can find the binaries by browsing from the home screen in the GRR web GUI. Download and install the one of choice.


A word of warning before you read the rest of this post: the GRR website is a little messy and not entirely intuitive. I found, after a lot of searching, that the best way to go about it is reading the code usage examples in the web GUI, especially when it comes to what Google has named flows. Flows are little plugins in GRR that may, for instance, let you task GRR with fetching a file from a specific path. Look for the call spec; it may look like this:
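
For example (the client id and path here are hypothetical, and the exact pathspec fields may vary between GRR versions):

```python
flow.GRRFlow.StartFlow(client_id="C.1234567890abcdef",
                       flow_name="GetFile",
                       pathspec=rdfvalue.PathSpec(
                           path="/etc/passwd",
                           pathtype=rdfvalue.PathSpec.PathType.OS))
```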

Notice the call spec. This can be transferred directly to the iPython console.

Before I started off I watched a couple of presentations that Google has delivered at LISA. I think you should too if you'd like to see where GRR is going and why it came to be. The one here gives a thorough introduction to how Google makes sure they are able to respond to breaches in their infrastructure.

I would also like to recommend a presentation by Greg Castle at BlackHat for reference, GRR: Find All The Badness. For usage and examples, Marley Jaffe at Champlain College has put up a great paper, JaffeCapstonePaperFOR410. Have a look at the exercises at the end of it.

First of all, what is great about GRR is that it supports the most relevant platforms: Linux, Windows and OS X. These are also the platforms fully supported at Google, so expect development to keep a practical perspective on them.

Besides being relevant, GRR is fully open source and extensible. It's written in Python, with all the niceness that comes with that. GRR has direct memory access through custom-built drivers. You will find support for Volatility in there; well, Google forked it into a new project named Rekall, which is better suited to operating at scale. Either way, it provides support for plugins such as Yara.
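
To give a flavor of what the Yara support buys you, a rule like the following (the strings here are entirely made up, for illustration only) could be swept across client memory:

```
rule suspicious_nohup
{
    strings:
        $cmd = "nohup ./dropper" ascii
        $ip  = "198.51.100.23" ascii
    condition:
        any of them
}
```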

If you, like me, got introduced to forensics through academia, you will like that GRR builds on Sleuthkit, through pytsk, for disk forensics (and you may actually choose which layer you'd like to stay on). When you've retrieved an item, I just love that it gets placed in a virtual file system in GRR with complete versioning.
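
The versioning idea can be illustrated with a toy model (this is not GRR's implementation, just the concept): every fetch appends a new version of the item instead of overwriting the old one, so you can always go back.

```python
import time


class VersionedVFS:
    """Toy virtual file system: each path keeps every version ever stored."""

    def __init__(self):
        self._store = {}  # path -> list of (timestamp, data)

    def write(self, path, data):
        # Append a new version rather than overwriting the previous one
        self._store.setdefault(path, []).append((time.time(), data))

    def read(self, path, version=-1):
        """Return a given version; the default -1 is the most recent."""
        return self._store[path][version][1]

    def versions(self, path):
        return len(self._store.get(path, []))


vfs = VersionedVFS()
vfs.write("/fs/os/home/someone/nohup.out", b"first fetch")
vfs.write("/fs/os/home/someone/nohup.out", b"second fetch")

print(vfs.versions("/fs/os/home/someone/nohup.out"))   # 2
print(vfs.read("/fs/os/home/someone/nohup.out"))       # b'second fetch'
print(vfs.read("/fs/os/home/someone/nohup.out", 0))    # b'first fetch'
```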

The virtual file system, where everything you've retrieved from or queried the client about is stored, with versioning for your pleasure
In addition to a capable console application, GRR provides a good web GUI which gives you an intuitive way of browsing through everything you can do in the console. I think the console is where Google would like you to live, though.

And so I ended up in grr_console, which is a purpose-built iPython shell, writing scripts to do what I needed. Remember the call spec that I mentioned initially? Here is where it comes into play. Below you see an example using the GetFile call spec (notice that the pathspec in the flow statement says OS; this might as well have been REGISTRY or TSK):

# Create an ACL token; GRR wants a username and a reason for auditing
token = access_control.ACLToken(username="someone", reason="Why")

path = "/home/someone/nohup.out"
flows = []

# Launch a GetFile flow against every client matching the hostname search
for client in SearchClients('host:Webserver'):
  client_id = client[0].client_id
  o = flow.GRRFlow.StartFlow(client_id=str(client_id),
                             flow_name="GetFile",
                             pathspec=rdfvalue.PathSpec(
                                 path=path,
                                 pathtype=rdfvalue.PathSpec.PathType.OS),
                             token=token)
  # Keep the client id together with its flow so we can find the file later
  flows.append((client_id, o))

# Poll until every flow has finished, then read each file out of that
# client's virtual file system
files = []
while len(flows) > 0:
  for client_id, o in list(flows):  # iterate over a copy: we remove entries
    f = aff4.FACTORY.Open(o, token=token)
    if not f.GetRunner().IsRunning():
      fd = aff4.FACTORY.Open(str(client_id) + "/fs/os%s" % path, token=token)
      files.append(str(fd.Read(10000)))
      flows.remove((client_id, o))
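
Stripped of the GRR specifics, the start-then-poll pattern above can be sketched like this (DummyFlow is a stand-in for a GRR flow object, purely for illustration; in practice you would also back off between polls):

```python
import time


class DummyFlow:
    """Stand-in for a remote flow: reports running for a couple of polls."""

    def __init__(self, client_id, result):
        self.client_id = client_id
        self.result = result
        self._polls_left = 2

    def is_running(self):
        self._polls_left -= 1
        return self._polls_left > 0


flows = [DummyFlow("C.1", b"log-a"), DummyFlow("C.2", b"log-b")]

files = []
while flows:
    for f in list(flows):           # iterate over a copy so removal is safe
        if not f.is_running():
            files.append(f.result)  # in GRR: aff4.FACTORY.Open(...).Read(...)
            flows.remove(f)
    if flows:
        time.sleep(0)               # in practice: sleep between polls

print(files)  # [b'log-a', b'log-b']
```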

If you are interested in Mandiant IR (MIR) and its concept, I'd like to recommend another YouTube video, by Douglas Wilson, which is quite awesome as well.

Tommy

Tommy is an analyst and incident handler with more than seven years of experience from government and private industry. He holds an M.Sc. in Digital Forensics and a B.Tech. in information security.