Splunk Getting Extreme Part Four

Let’s revisit our EPS Splunk Metrics. This time we are going to use type=domain and do something else a little different. We are going to make a non classed context and apply it directly to the raw event data.

The Question:

The question we want is, what systems are generating metrics events well above low AND we want to know what concept term they fall in?

We also want get the original raw events precise in time. That is technically a different question than we asked in part one of this blog series. There we made more of a canary that asked when did a given host go over normal for it’s activity levels with no relation to the whole environment in a particular bucket of time.

Context Gen:

We want to make a context that is not setup for a class. Note we don’t even use a time bucketing step. The search just is set to run across the previous 30 days which is typically the retention period of Splunk index=_internal logs.

The reason we are doing it this way is we are wanting to find events that are something like high, extreme etc for our entire environment. We don’t care about trending per source system (series). We get count as the distinct count of source systems (series), then the min and max values for EPS for all sources.

index= _internal source=*metrics.log earliest=-30d latest=now group=per_host_thruput | stats dc(series) as count, min(eps) as min, max(eps) as max | xscreateddcontext name=eps container=splunk_metrics app=search scope=app type=domain terms="minimal,low,medium,high,extreme" notes="events per second" uom=“eps”


First we see if we have any extreme events in the past 30 days.

index= _internal source=*metrics.log group=per_host_thruput | xswhere eps from eps in splunk_metrics is extreme

I get one event, the largest catch up of web log imports.

11-11-2016 11:11:54.003 -0600 INFO Metrics - group=per_host_thruput, series="www.georgestarcher.com", kbps=2641.248883, eps=7144.764212, kb=81644.455078, ev=220854, avg_age=1172745.126151, max_age=2283054

Next let’s get fancier. We want to know events very above low and have XS tell us what concept term those events best fit. This is a handy way to get the word for the term it fits

index= _internal source=*metrics.log group=per_host_thruput | xswhere eps from eps in splunk_metrics is very above low | xsFindBestConcept eps from eps in splunk_metrics | table _time, series, eps, kbps, BestConcept


The point is that you can use XS to build a context profile for raw data values then apply them back to the raw events. Raw events, if you can keep the number of matches low, make great ES notable events because they have the most of the original data. Using stats and tstats boils down the fields. That requires us to pass through values as we we saw in Part Three to make the results more robust.