Pushing Indicator Data To Splunk Through The REST API

While writing a plugin for another system, I figured I'd give Splunk a shot as an indicator database. This post shows a quick example of how that can look. The plugin is named "unsafe" and is a way to mark something in an article as malicious.

Also note that this post is a concept of how to integrate existing systems with Splunk. In this case you won't have to poll for the data, but can expect it to be delivered instantly when something happens or something evil is submitted by an analyst.

Plugin code looks like this:

from db import plugins_db,config,splunk

class unsafe:
    def common(self,title,body,id):
        return

    def html(self,title,body,id):
        self.common(title,body,id)
        # getboolean, since a plain get() returns a string and "false" would be truthy
        if config().getboolean('splunk','enabled'):
            # Push the indicator to Splunk as a pipe-delimited event
            sourcetype,index=splunk()
            index.submit("%s|%s|%s"%(title,body,id), sourcetype=sourcetype)

        return "<span style='background-color:yellow;color:red;'>[%s, unsafe]</span>"%(body)

    def latex(self,title,body,id):
        self.common(title,body,id)
        return "\\textcolor{red}{%s}"%body

When processed, the text will look like this: [1.2.3.4, unsafe] (if you were wondering). As you probably noticed, there are a couple of calls through the Splunk API in there as well. That part is what this post is about, and we will also step through how to create the index where the final, searchable indicator data will reside. The process is quite simple, and about what you would expect:

1. Connect to the Splunk service
2. Create the index if it doesn't exist
3. Add the event to the index

The connect function is basically what connects us to Splunk. Below, we more or less open a connection to Splunk, look up the index (creating it first if needed), and return the sourcetype together with the index object:

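import os
import sys
import configparser

from splunklib.client import connect

def config():
    # Load config.cfg from the directory of the running script
    cfg = configparser.ConfigParser()
    cfg.read(os.path.join(os.path.dirname(sys.argv[0]), 'config.cfg'))
    return cfg

def splunk():
    cfg = config()
    service = connect(
        host=cfg.get('splunk', 'host'),
        port=cfg.getint('splunk', 'port'),
        username=cfg.get('splunk', 'username'),
        password=cfg.get('splunk', 'password'))

    # The sourcetype doubles as the index name
    sourcetype = cfg.get('splunk', 'sourcetype')

    # Create the index if it doesn't exist
    if sourcetype not in service.indexes:
        service.indexes.create(sourcetype)

    # Get the index
    index = service.indexes[sourcetype]

    return (sourcetype, index)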

Now that you've created the connection helper, the rest is a breeze, and is done by the two simple lines you saw in the script at the beginning of the post:

sourcetype,index=splunk()
index.submit("%s|%s|%s"%(title,body,id), sourcetype=sourcetype)

The above is a bit suboptimal if you insert a lot of data at once, since it will create a new connection each time. Anyway, that was all the coding you needed to do. The submitted event will look like this:

unsafe|1.2.3.4, just a test|article-id (plugin name, indicator and description and article ID for reference).
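
On the point about a new connection per event: one way around it is to call the helper once and reuse what it returns. The following is only a sketch of that idea. `IndicatorSink` and `FakeIndex` are made-up names, and the fake index stands in for a real Splunk index so the example runs without a server. A long-lived process would also have to cope with the Splunk session expiring, which this sketch ignores.

```python
class IndicatorSink:
    """Submit many events over one cached (sourcetype, index) pair."""

    def __init__(self, connect_fn):
        # connect_fn is expected to behave like the post's splunk() helper:
        # no arguments, returns (sourcetype, index).
        self._connect_fn = connect_fn
        self._handle = None

    def submit(self, title, body, id):
        # Connect lazily, and only for the first event
        if self._handle is None:
            self._handle = self._connect_fn()
        sourcetype, index = self._handle
        index.submit("%s|%s|%s" % (title, body, id), sourcetype=sourcetype)


class FakeIndex:
    """Stand-in for a Splunk index (assumption: only submit() is needed)."""

    def __init__(self):
        self.events = []

    def submit(self, event, sourcetype=None):
        self.events.append((sourcetype, event))


connect_calls = []
fake_index = FakeIndex()


def fake_splunk():
    # Counts how often a "connection" is made
    connect_calls.append(1)
    return ("indicators", fake_index)


sink = IndicatorSink(fake_splunk)
sink.submit("unsafe", "1.2.3.4, just a test", "article-1")
sink.submit("unsafe", "5.6.7.8, another test", "article-2")
print(len(connect_calls), len(fake_index.events))
```

In the real plugin you would pass the `splunk` function itself to `IndicatorSink` and keep one sink around for the whole run.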

You will also need to create a user with the appropriate permissions (the one in the configuration file I'm not showing here) in the Splunk console. In Splunk 6, go to Settings -> Access Controls -> Add new user, and tick the "Create a role for this user" box. Then go back and into Roles. There you'll need to give the role permission to edit indexes, so it is allowed to create one, and to edit TCP, so it can submit events (indexes_edit and edit_tcp).
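
Since the configuration file isn't shown, here is a guess at its shape, based only on the section and key names the code above reads. All values are placeholders, including the user name and index name. 8089 is the default Splunk management port, so adjust it if your setup differs.

```python
import configparser

# Placeholder values; only the section and key names come from the code above.
SAMPLE_CFG = """
[splunk]
enabled = true
host = localhost
port = 8089
username = indicator_writer
password = changeme
sourcetype = indicators
"""

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE_CFG)

settings = {
    "enabled": cfg.getboolean("splunk", "enabled"),
    "host": cfg.get("splunk", "host"),
    "port": cfg.getint("splunk", "port"),
    "username": cfg.get("splunk", "username"),
    "sourcetype": cfg.get("splunk", "sourcetype"),
}
print(settings)
```

Note that a plain `get` returns a string, so a value like `false` would still be truthy in an `if`. `getboolean` avoids that.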

Once the plugin code has run, you have a functional and searchable indicator set in Splunk. You'll notice that the fields in the events aren't extracted, though. To specify the fields you will need to create a field extraction (in Splunk 6 this should be under Settings -> Fields -> Field extractions).

For the scenario above, the regex field should contain something like this: (?i)(?P<plugin>[^\|]+)\|(?i)(?P<indicator>[^,]+),(?i)(?P<description>[^\|]+)\|(?i)(?P<article_id>[^\|]+)
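
If you want to sanity-check the extraction before pasting it into Splunk, you can run it against the sample event locally. This is only a rough check with Python's `re` module, which is not the same engine Splunk uses, though the named-group syntax is shared. The repeated `(?i)` flags are dropped here: the pattern contains no literal letters, and recent Python versions reject inline global flags that aren't at the start of the pattern.

```python
import re

# Same named groups as the extraction above, minus the (?i) flags
FIELD_RE = re.compile(
    r"(?P<plugin>[^\|]+)\|"
    r"(?P<indicator>[^,]+),"
    r"(?P<description>[^\|]+)\|"
    r"(?P<article_id>[^\|]+)"
)

event = "unsafe|1.2.3.4, just a test|article-id"
fields = FIELD_RE.match(event).groupdict()
print(fields)
```

The description comes out with a leading space (" just a test"), because the sample event has a space after the comma.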

There you go, the finished result should look something like the following:

In summary, I find Splunk quite nice to work with when you aren't doing the really heavy lifting. Note that everything goes via HTTP as far as I could find in the SDK, so submitting a lot of events this way will probably turn out a bit slow due to the overhead.

For more regex training, refer to this tutorial and the Regexpal tester.