Splunk Getting Extreme Part One

This is going to be the first in a series of posts on Splunk Extreme Search (XS), something that is for folks with Splunk Enterprise Security (ES). At this time ES is the only way to get Extreme Search; it comes as an included supporting add-on. You can get the Extreme Search Visualization (XSV) app from Splunkbase, but it does not have all the commands needed to fully use XS.

Extreme Search is a tool for avoiding searches that rely on static thresholds. The Splunk documentation walks through this with a simple authentication failure example: http://docs.splunk.com/Documentation/ES/4.5.1/User/ExtremeSearchExample

I like to explain XS this way: you are still filtering search results on a range of values. The difference is that rather than hard coding a single threshold for all data, like failureCount > 6, you build a “context” profiled per “class” such as src, user, or app, so you can filter on a word like “anomalous”, “medium”, or “extreme”. Those terms are range mapped to values calculated from your existing data. This is far better than simple thresholds.
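As a quick sketch of the difference (the index, fields, context name failures_by_src, and container auth_failures here are hypothetical, but the context would be built per src the same way we build one below), a static threshold search looks like:

index=auth action=failure | stats count as failureCount by src | where failureCount > 6

while the Extreme Search version filters on a word instead:

index=auth action=failure | stats count as failureCount by src | xswhere failureCount from failures_by_src by src in auth_failures is anomalous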

Contexts are nothing but fancy lookup tables. They actually get created in the desired app context’s lookup folder.

To use XS we need to make the “context” and ideally freshen it up on some time interval using a scheduled search. Then we have our search that uses that context to filter for what we are looking for.
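For example, a minimal sketch of that refresh as a savedsearches.conf stanza, assuming we save the full context generating search we build below (the stanza name, schedule, and time range are placeholders to adjust for your environment):

[Refresh eps_by_series_5m context]
enableSched = 1
cron_schedule = 30 1 * * *
dispatch.earliest_time = -7d@d
dispatch.latest_time = @d
search = index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_hour, date_wday | stats avg(eps) as average, stdev(eps) as stddev, count by series, date_hour, date_wday | eval min=(average-3*stddev-3), max=(average+3*stddev+3), anomalous_normal=(average-2*stddev-1), normal_anomalous=(average+2*stddev+1) | xsCreateADContext name=eps_by_series_5m app=search container=splunk_metrics scope=app terms="anomalous,normal,anomalous" notes="eps by host by 5m" uom="eps" class="series, date_wday, date_hour"

You can just as easily schedule it through Splunk Web under Settings > Searches, reports, and alerts.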

Construction of a context generating search:

  1. We need a search that gets the data we want, bucketed into time chunks that are meaningful to us.

  2. Next we generate the statistics that XS needs to build our context lookup table from the data.

  3. We calculate the depth of our context by working with values such as min, max, and what are called cross over points. We will talk more about those shortly.

  4. We add on the context create/update statement.

Scenario:

This example needs both the XS and XSV apps installed. XSV adds a command called xsCreateADContext that we will need; the name stands for Extreme Search Create Anomaly Driven Context. From a Splunk perspective, all these XS commands are just custom search commands.

We are interested in events per second (EPS) spikes beyond “normal” for a sending host. We will take advantage of Splunk’s own internal metrics logs to do this.

Context Generation:

This search will give us the metrics events bucketed into 5 minute chunks, keeping the peak eps in each bucket for a host (the series field) by day of week and hour of day.

index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_hour, date_wday

Next we expand that to generate the overall statistics.

index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_hour, date_wday | stats avg(eps) as average, stdev(eps) as stddev, count by series, date_hour, date_wday

[Screenshot: statistics results showing average, stddev, and count by series, date_hour, date_wday]

We want to find EPS values that are anomalous relative to their normal levels. We will be using xsCreateADContext from the XSV app. That command needs the fields min, max, anomalous_normal, and normal_anomalous.

index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_hour, date_wday | stats avg(eps) as average, stdev(eps) as stddev, count by series, date_hour, date_wday | eval min=(average-3*stddev-3), max=(average+3*stddev+3), anomalous_normal=(average-2*stddev-1), normal_anomalous=(average+2*stddev+1)

[Screenshot: results with the min, max, anomalous_normal, and normal_anomalous fields added]

Last we add the command to create the context file.

index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_hour, date_wday | stats avg(eps) as average, stdev(eps) as stddev, count by series, date_hour, date_wday | eval min=(average-3*stddev-3), max=(average+3*stddev+3), anomalous_normal=(average-2*stddev-1), normal_anomalous=(average+2*stddev+1) | xsCreateADContext name=eps_by_series_5m app=search container=splunk_metrics scope=app terms="anomalous,normal,anomalous" notes="eps by host by 5m" uom="eps" class="series, date_wday, date_hour"

[Screenshot: results of the context creating search]

Fields and Depth:

Min: We calculate min to be the average EPS minus 3 times the standard deviation, minus 3. We have to subtract off that last 3 in case the standard deviation is zero; if we did not, we would get a min=max situation when it was zero. XS has to have ranges to work with.

Max: We calculate max to be the average EPS plus 3 times the standard deviation, plus 3. We have to add on that last 3 in case the standard deviation is zero; if we did not, we would get a min=max situation when it was zero. XS has to have ranges to work with.

Anomalous_Normal: This is the cross over point between the low (left side) anomalous region and the normal region. So it is similar to calculating Min, but we pull it in some from Min by using only 2 times the standard deviation and subtracting 1 to handle the standard deviation being zero.

Normal_Anomalous: This is the cross over point between the normal region and the high (right side) anomalous region. So it is similar to calculating Max, but we pull it in some from Max by using only 2 times the standard deviation and adding 1 to handle the standard deviation being zero.
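To make the zero standard deviation case concrete, take a series whose average is 10 eps with a standard deviation of 0. Without the extra constants, min, max, and both cross over points would all collapse to 10 and XS would have no range to work with. With them we get a usable spread:

min = 10 - 3*0 - 3 = 7
anomalous_normal = 10 - 2*0 - 1 = 9
normal_anomalous = 10 + 2*0 + 1 = 11
max = 10 + 3*0 + 3 = 13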

In my experience so far, the computation of min, max, and the cross over points is an experiment. In large volume authentication data I have used 5 times the standard deviation for min/max and 3 times for the cross over points. What you use will take some trial and error to fit your data and environment. But you have to create a spread, or none of your results will have any depth, and then you might as well search for all raw events rather than looking for “abnormal” conditions.
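For example, that wider spread just swaps the multipliers in the eval of the context generating search (keeping the same small constant offsets to guard against a zero standard deviation):

| eval min=(average-5*stddev-3), max=(average+5*stddev+3), anomalous_normal=(average-3*stddev-1), normal_anomalous=(average+3*stddev+1)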

Breaking down the xsCreateADContext command:

Name: this is the name of our data context. In this case we called it eps_by_series_5m to indicate it is events per second by the series field values in 5 minute buckets.

App: this is the app context we want our stuff to exist in within Splunk. In this case we have it make the context file in the search app’s lookup folder.

Container: this is the name of the csv file that is created in the lookup folder location. The trick to remember here is that the entire csv “container” is loaded into RAM when Splunk uses it. So you want to consider giving contexts with very large numbers of rows their own containers rather than putting multiple named contexts into the same file.

Scope: this sets how access permissions are scoped. Normally I just keep it to the app I am making the context in by using the word “app”.

Terms: Since we are making an AD context we need to set “anomalous,normal,anomalous”. You can understand why when you look at the graphic below. We are saying that the low left side maps to the word anomalous, the middle range is normal values, and the high right side is again anomalous. This is important because when we use this context to search we will say something like “eps is anomalous”, which will match any values in the ranges to the left or right of “normal”. This is what I meant by range mapping values to words.

Notes and uom: the notes and units of measure fields are just optional. They only matter when you look at the contexts in something like the XSV app GUI.

Class: this is critical, as it says we are profiling the values BY series, date_wday, and date_hour. This is exactly the same as the split by fields in a stats command in Splunk.

In this chart notice how the light blue middle region is “normal” and to the left and right we have the “anomalous” zones. This helps you visualize what areas you will match when you make compatibility statements like “is normal”, “is anomalous”, or “is above normal”.

[Screenshot: context chart showing the low anomalous, normal, and high anomalous regions]

Using our Context:

The key to using the context is making sure we search for data with the same time bucketing and split by fields. Otherwise the context value model won’t line up with our real data very well.

There are several XS commands we should be familiar with before using the context.

  1. xsFindBestConcept: this takes our search results and compares them to our context, giving us a guide on what “term” would best describe each result row if we wanted the filter to return it.

  2. xsgetwherecix: this shows us all the results without filtering them, but adds the CIX, or compatibility fit value, based on the compatibility statement we make, e.g. “is anomalous”.

  3. xswhere: this is the filtering command we will actually use when we are done.

xsFindBestConcept:

index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_wday, date_hour | xsFindBestConcept eps from eps_by_series_5m by series, date_wday, date_hour in splunk_metrics

[Screenshot: xsFindBestConcept results showing the best matching term per row]

xsgetwherecix:

index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_wday, date_hour | xsgetwherecix eps from eps_by_series_5m by series, date_wday, date_hour in splunk_metrics is anomalous

[Screenshot: xsgetwherecix results showing the CIX value per row]

xswhere:

index=_internal source=*metrics.log group=per_host_thruput | bucket _time span=5m | stats max(eps) as eps by _time, series, date_wday, date_hour | xswhere eps from eps_by_series_5m by series, date_wday, date_hour in splunk_metrics is anomalous

[Screenshot: xswhere results showing only the anomalous rows]

There, out of our data for yesterday, only two hours were classified as “anomalous.” We did not have to hard code specific limiting values. Our own existing data helped create an XS context, and we then applied that context to data going forward.

Next in our series we will start going through different use cases related to security. We will also cover other types of contexts beyond just “anomalous.”
