Splunk and Geo Location

It is important to ensure the MaxMind database behind Splunk’s iplocation command is as up to date as possible. Long ago I made a skeleton app to download the database into place and take advantage of a Splunk configuration option to point at it.

MaxMind changed how you can download the free databases in 2019, as detailed in their blog post. I have updated the TA located in my Git repo to hold and use an updated, downloaded copy of the database.

Here is what you need to do to use an updated DB in your environment with this TA.

1. You cannot auto download the database without a paid license key. Regardless of how you obtain the mmdb file, you need to update it on ALL search heads and indexers so that iplocation has the most current information you can obtain.

2. You will need to follow their instructions for setting up an account and download the database to a central location. If you mass download it from a large number of servers, you could get blocked for appearing to be a DDoS against MaxMind.

3. Use your organization’s configuration automation tools to distribute it to ALL the Search Heads and Indexers, placing the file at TA-geoip/bin/GeoLite2-City.mmdb (see the limits.conf sketch after this list).

4. If you deploy this configuration container app and do not place the mmdb in bin, Splunk will simply fall back to the installation’s default copy, which will most likely contain very out of date geo information.
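For reference, the configuration option the TA carries is the iplocation db_path setting in limits.conf. A minimal sketch, assuming the app folder is named TA-geoip and a default Linux install path; adjust for your environment:

[iplocation]
db_path = /opt/splunk/etc/apps/TA-geoip/bin/GeoLite2-City.mmdb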

It is important to keep the database updated, especially if you are using things like Country or Improbable Travel criteria for generating Splunk Enterprise Security Notable Events.

Splunk Getting Extreme Part Nine

I ran a series on Splunk Enterprise’s Extreme Search over a year ago. It is time to add a discussion about alfacut. I had not realized the exact behavior of alfacut at the time of that blog series. 

Alfacut:

What is alfacut in XS? Extreme Search generates a score called WhereCIX, a measurement ranging from 0.0 to 1.0 of how compatible what you are measuring is with the Extreme Search statement you made in the xswhere command. The xswhere command uses an alfacut (the minimum WhereCIX score) of 0.2 by default.

Here is where I rant. You CAN put alfacut=0.7 in your xswhere command. This is like thinking you are cutting with shears when you are actually cutting with safety scissors. It doesn’t work the way you would expect when you use compound xswhere statements (AND, OR).

If you specify the alfacut in the xswhere command, it applies the limit to BOTH sides of the compound individually. It does NOT apply to the combined score, which is what you would expect as a user looking at the output after the xswhere command. The WhereCIX displayed in the output of xswhere is the compatibility score of the entire compound statement. If we want to filter on the combined score, we simply have to know this is how things work and add our filter in the next step of the SPL pipeline.

XSWhere Compound Statements:

How you define your interesting question for ES Notables matters. Let’s take Excessive Failures.

The default Excessive Failures is not even Extreme Search based:
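Paraphrasing the shipped correlation search from memory (check your ES version for the exact SPL), it is roughly:

| datamodel Authentication Failed_Authentication search
| stats values(Authentication.tag) as tag, dc(Authentication.user) as user_count, dc(Authentication.dest) as dest_count, count by Authentication.app, Authentication.src
| `drop_dm_object_name("Authentication")`
| where count>=6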

This says we consider interesting a failure count greater than or equal to 6 by src and app.

That is going to be VERY noisy and have little context based on existing behavior. A static limit also gives bad actors a limbo bar to glide under.

What we really want to ask is: “Give me extreme non normal failure counts by source and app.” This would squash down common repetitive abuse.

We need a compound statement against anomalous and magnitude contexts.

The SA-AccessProtection app has an existing XS Context Generator named: Access – Authentication Failures By Source – Context Gen

It looks like this:
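Paraphrasing from memory (the exact SPL in your SA-AccessProtection version may differ), the shape is:

| tstats `summariesonly` count as failures from datamodel=Authentication.Failed_Authentication by _time, Authentication.src span=1h
| stats median(failures) as median, min(failures) as min, count
| eval max = median * 2
| xsUpdateDDContext app="SA-AccessProtection" name=failures_by_src_count_1h container=authentication scope=app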

The issue is that counting failures by src alone is misleading: the failure patterns for different apps, such as ssh or mywebapp, may differ even when they come from the same source, and you do not want one app’s pattern hiding the other’s.

Magnitude Context Gen:
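A hedged sketch of what mine looks like; the context name, terms, and notes here are my own choices, and the exact xsCreateDDContext arguments should be checked against the XS docs for your version:

| tstats `summariesonly` count as failures from datamodel=Authentication.Failed_Authentication by _time, Authentication.src, Authentication.app span=1h
| `drop_dm_object_name("Authentication")`
| stats median(failures) as median, min(failures) as min, count by src, app
| eval max = median * 2
| xsCreateDDContext app="SA-AccessProtection" name=failures_by_src_app_count_1h container=authentication scope=app terms="minimal,low,medium,high,extreme" notes="auth failures by src and app" uom="failures" class="src,app"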

That gives us the stats for a source and the specific app it is abusing, such as sshd. I used xsCreateDDContext instead of xsUpdateDDContext because the first time we run it we need to create a new context file rather than update an existing one.

We need more than this, though. Maybe there is a misbehaving script, and we do not want ES Notable Events for simply extreme failure counts but for extreme failure counts that are also not normal.

To use the Anomalous XS commands you need the extra XS visualization app installed from Splunkbase: https://splunkbase.splunk.com/app/2855/

Anomalous Context Gen:

Combining Contexts:

Our NEW xswhere:
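A sketch, reusing the hypothetical magnitude context above plus a hypothetical anomalous context named failures_by_src_app_anomalous_1h (the concept names depend on how you defined each context):

| xswhere failures from failures_by_src_app_count_1h in authentication is extreme and failures from failures_by_src_app_anomalous_1h in authentication is anomalous alfacut=0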

Notice we specify the alfacut in the xswhere command as zero to force everything to come back and let our following statement be the limiter:

| where WhereCIX>=.9

That gives us a limit using the compatibility score of the entire compound statement, and the result will be much less noisy. If you have a compound statement and a WhereCIX of 0.5, you typically have a one-sided match, meaning it was perfectly compatible with one side of your statement but not at all with the other. This is common when you combine anomalous and magnitude contexts like this: yes, it was super noisy, but that was totally normal for it, so it was NOT above normal even though it WAS extreme.

One possible new Excessive Failed Logins search could look something like this:
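A sketch only, reusing the hypothetical contexts above; tune the time window and the WhereCIX threshold to your data:

| tstats `summariesonly` count as failures from datamodel=Authentication.Failed_Authentication by Authentication.src, Authentication.app
| `drop_dm_object_name("Authentication")`
| xswhere failures from failures_by_src_app_count_1h in authentication is extreme and failures from failures_by_src_app_anomalous_1h in authentication is anomalous alfacut=0
| where WhereCIX>=0.9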

Splunk – Metrics and a Poor Man’s mcollect

Strangely, the metrics feature in Splunk v7 is missing a command you would think it would have: mcollect, a command to take search results and collect them into a metric store index. That would be similar to, but much better than, tracking things like license or index size over time using the normal stats and collect commands together.

Duane and I were hanging out and decided to make a poor man’s mcollect.

  1. Make a metric index, mine is called testmetrics
  2. Using inputs.conf in the app context of your choice, set up a batch input
  3. Make your search to format results compatible with the metrics_csv sourcetype
  4. Graph your mstats for fun!

Inputs.conf Example:
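Something like the following, assuming outputcsv writes to its default location under $SPLUNK_HOME/var/run/splunk/csv:

[batch://$SPLUNK_HOME/var/run/splunk/csv/metrics_testing*.csv]
move_policy = sinkhole
sourcetype = metrics_csv
index = testmetrics
disabled = 0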

Note we used a naming convention of metrics_testing* so that we can easily target only the csv files we will shortly be exporting with outputcsv: start with the word metrics, then a second word related to the index we are going to put the data into, then a wildcard. You can see we use the sinkhole move policy to ensure the file is removed after indexing, which avoids filling the disk if you do a lot of metrics this way.

Note that indexing data into a metrics index is counted as 150 bytes per metric event against your Splunk License.

Metrics Gen Search:
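A sketch of the idea, using the REST indexes endpoint for size and the columns the metrics_csv sourcetype expects (metric_timestamp, metric_name, _value, plus any dimension fields); the metric_name here is my own choice:

| rest /services/data/indexes
| stats sum(currentDBSizeMB) as _value by title
| rename title as index
| eval metric_name="index.size.mb", metric_timestamp=now()
| table metric_timestamp, metric_name, _value, index
| outputcsv [| makeresults | eval search="metrics_testing_" . strftime(now(), "%Y%m%d%H%M%S") | return $search]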

In this search we gather the total size per index and reformat the fields to match the metrics_csv format.

Then we use an Evil subsearch trick to generate a filename with a timestamp to outputcsv to.

You could easily schedule this search to collect your stats hourly, daily, etc., and adapt all of this for license usage.

Viewing your New Stats:

The graphing search:
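Something along these lines, assuming the metric_name used above:

| mstats avg(_value) as size_mb WHERE index=testmetrics AND metric_name="index.size.mb" span=1h BY index
| timechart span=1h max(size_mb) by index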

And a sample from our nice new graph!


Splunk – Metrics: Getting Data In with my HEC Class

If you are using Splunk v7 you may be looking into the new Metrics store. This is a specialized index time summary index.

If you want to play with it, you can use my Splunk HTTP Collector Python Class. Set up a HEC token and Metrics index as outlined in the Splunk Docs. Look at the section “Get metrics in from clients over HTTP or HTTPS”.

Then write a little Python to make a HEC connection object, then set the payload to match the requirements from the Metrics docs.

Here is an example block of code from modifying my example.py from my class git repo.
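Something like this, paraphrased rather than copied from example.py; the class and method names follow my repo, and the token, host, metric name, value, and dimensions below are all placeholders:

from splunk_http_event_collector import http_event_collector

# Placeholder HEC token and host
http_event_collector_key = "YOUR-HEC-TOKEN"
http_event_collector_host = "splunk.example.com"

testevent = http_event_collector(http_event_collector_key, http_event_collector_host)

# Build the metrics payload. "event" must be the literal word "metric",
# and the measurement plus its dimensions go into "fields".
payload = {}
payload.update({"index": "testmetrics"})
payload.update({"sourcetype": "mymetricdata"})
payload.update({"source": "metrics_test"})
payload.update({"host": "myhost"})
payload.update({"event": "metric"})
payload.update({"fields": {"metric_name": "cpu.user.percent",
                           "_value": 42.7,
                           "region": "us-east",
                           "role": "webserver"}})

testevent.sendEvent(payload)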

A couple of comments. First, note the “event” part of the payload, which normally holds your event dict, is just the word “metric”. This is required by Splunk for Metrics via HEC. Next, put your metric payload into the “fields” part of the HEC payload. This is a dict that HEC turns into index time extractions. Normally you want to minimize HEC index time extractions; however, Metrics is solely index time. You have to set your metric_name and _value for that measurement in the fields dict. Additionally, any other fields under the fields dict become the dimension fields the Metrics docs talk about.

Enjoy exploring the metrics store in Splunk.

Splunk – Enterprise Security Enhancing Incident Review

I see folks ask a lot about adding fields not originally in a notable to that notable in Incident Review in Splunk ES. The initial idea people have is that if they make an adaptive response action, it will magically go get more data and have it show up for that notable event. The seed of the idea is there, but it does not work on its own.

Let’s go backwards through this.

Follow the ES Docs on adding a field to display in incident review *IF* it is in the notable and has a value. http://docs.splunk.com/Documentation/ES/5.0.0/Admin/CustomizeIR

That means normally the field has to already be present in the notable to be displayed. Bear with me. This is the first step, and we can actually make fields show up that are NOT already in the notable event. Let’s say we added the field support_ticket to Incident Review Settings.

Next, a magic trick that has nothing to do with adaptive responses. You CAN make new fields that are not in the notable show up if you have added them above AND you use a lookup. This is what most folks do not know.

Prior to ES 4.6 there was a macro that drove the content display for Incident Review called “Incident Review – Main”. This became a search in v4.6.

The default search looks like:

We can cleanly insert our own lookup by making a macro for it and shimming it into the Incident Review – Main search. Just make sure your lookup and macro permissions are set so everyone in ES can read them.

Presume we have a lookup that somehow has event_id to support_ticket mapped. Since we have event_id for a notable event the lookup can return the field support_ticket.

We define a macro for the lookup command: get_support_ticket
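For example, in macros.conf, assuming a hypothetical KVStore lookup named support_ticket_lookup keyed on event_id:

[get_support_ticket]
definition = lookup support_ticket_lookup event_id OUTPUTNEW support_ticket
iseval = 0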

We shim that macro into the Incident Review – Main search:

Note I put it after the suppression filter.

Now, magically, if a notable’s event_id value is added to the lookup, the next time someone views that notable in Incident Review they will see the new field support_ticket. If you are clever, you can add a workflow action on the support_ticket field to open that ticket via URL in the ticketing system.

The above is helpful already. Think of having a lookup of your cloud account IDs for AWS, Azure, etc. with all sorts of contact and context information. Have the field cloud_account_id show in your notables, and the above auto lookup can surface more context for your SOC team.

Now you can go further by writing a custom alert action using the Splunk Add-on Builder with Adaptive Response support. The code can get data from an external system based on data in the notable event. You could then index that returned data and have a search run across it to maintain a lookup table using outputlookup. Alternatively, you can update KVStore based lookups directly via the Splunk API. If you are willing to dig into the code, you can see something similar in the modular alert of my TA-SyncKVStore app for Splunk.

Combining all that would let an adaptive response fetch data, store it in Splunk and have it display automatically on existing Notable Events within Incident Review.

Good Luck!


Splunk – Null Thinking

It takes GOOD data hygiene to become a mature Splunk environment. This starts with field parsing and mapping fields to the Common Information Model.

There is one thing some people overlook with their data hygiene and Splunk: when NOT to include a field in the data payload indexed into Splunk. This happens a lot when the logs are internally developed and sent in with the Splunk HTTP Event Collector.

Sample JSON event fields:
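For example, a hypothetical authentication event (only the killer field matters here):

{"action": "success", "src": "10.1.2.3", "user": "jsmith", "killer": ""}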

Notice how the field killer has an empty value?

Why does this matter?

1. It consumes license for the string of the field name in all the events. This can be really bad at volume.
2. It throws off all the Splunk auto statistics for field vs event coverage.
3. It makes it hard to do certain search techniques.

License Impact:
If you have empty fields in the event payload of all your events, this can add up. It impacts not only the network traffic to get the data to Splunk but also license usage and storage within Splunk. Splunk does compress indexed data, but at very large volumes empty fields across your events still add up.

Event Coverage:

Splunk makes it handy to click on an “interesting” field in the search interface to see basic stats about the field and its values. If you have an empty-value field present, you can end up with 100% coverage; the field exists in all events from a search perspective. This is misleading if you want to know what percentage of events have actually useful values.


NULL Usage:

Null is a STATE, not a value. Something is either NULL or it is NOT NULL. This says nothing about the content of the field value.

If you search like the example below, expecting isnull to find the events where killer has no useful value, you will miss every event where killer arrived as an empty string.
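A minimal illustration, with a hypothetical index name:

index=mydata | where isnull(killer)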

Empty strings will break easy-to-use null-based functions like coalesce, isnull, and isnotnull.

If you do not send in empty-string fields, you can easily find events that are missing the field when you expected it to be there. You can use the event coverage mentioned above or a simple search like the one below.
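For example, again with a hypothetical index name:

index=mydata NOT killer=*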

Lookups and KVStore:

Another place where making sure you have popped off empty-string fields matters is when updating KVStore lookup tables via the Splunk REST API. If you leave the field on the data payload, you get a row in your lookup with an empty string for that column.

If you are programmatically maintaining a lookup table that you use as an event filter, this can break the common search pattern below. This is the pattern you use when you simply want events whose field values are, or are not, in a lookup table, regardless of the other information in the table.
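The pattern, sketched with a hypothetical lookup of known bad sources:

index=mydata [| inputlookup known_bad_sources | fields src]

If a row in known_bad_sources has an empty src value, the expanded subsearch includes an src="" term and the filter stops behaving the way you intended.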

Summary:

Save bandwidth, reduce field count, and make null testing easy by dropping all fields that have empty values from your events before indexing.

A quick python example:

Let’s say we have some API that returns this JSON dict for an event we want to index.
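For illustration (field names other than test and killer are made up):

data = {
    "action": "success",
    "src": "10.1.2.3",
    "user": "jsmith",
    "test": None,
    "killer": ""
}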

This Python example gives us a stripped-down dict we can use instead.
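A minimal sketch of the idea; the original post’s version may differ:

# Drop keys whose values are None or empty strings before indexing.
clean_data = {k: v for k, v in data.items() if v is not None and v != ""}
print(clean_data)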

This gives us:
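(the output of the hypothetical example above)

{'action': 'success', 'src': '10.1.2.3', 'user': 'jsmith'}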

You can see the None and empty-string fields, test and killer, are removed. Now we have a nice JSON dict to use as our event payload to index. Real handy if you are using my Splunk HTTP Event Collector class for data ingestion.

Splunk New Technology Add-ons: SyncKVStore and SendToHEC

I recently updated and replaced some older repositories from my GitHub account that were hand-made modular alerts to send search results to other Splunk instances. The first one sends the search results to a Splunk HTTP Event Collector receiver. The second one came from our Splunk 2016 .conf talk on KVStore; it was useful for sending search results (typically an inputlookup of a table) to a remote KVStore lookup table.

TA-Send_to_HEC

You can find the updated Send To HEC TA on Splunkbase: TA-Send_to_HEC or in my GitHub repository: TA-Send_to_HEC.

This is useful for taking search results and sending them to another Splunk instance using HEC. If you choose JSON mode, it will send the results as a JSON payload of all the fields after stripping any hidden fields (hidden fields start with an underscore). RAW mode is a new option which takes the _raw field and sends ONLY that field to the remote HEC receiver.

TA-SyncKVStore

This has been completely redone. I have submitted it to Splunkbase, but for the moment you can get it from my GitHub repository: TA-SyncKVStore

Originally it only sent search results to a remote KVStore. Now it also has two modular inputs. The first pulls a remote KVStore collection (table) and puts it into a local KVStore collection. The second pulls the remote KVStore collection but indexes it locally in JSON format. It will strip the hidden fields before forming the JSON payload to index. You are responsible for making sure all the appropriate and matching KVStore collections exist.

If you look in the code you will notice an unusual hybrid of the Splunk SDK for Python to handle KVStore actions and my own Python class for batch saving the data to the collection. I could not get the batch_save method from the SDK to work at all. My own class already existed and was threaded for performance from my old version of the modular input, so I used the SDK to clear data if you want the replace option and my own code for saving the new or updated data.

I rebuilt both of these TAs using the awesome Splunk Add-on Builder. This makes it easy in the SyncKVStore TA to store the credentials in Splunk’s internal encrypted storage. One update to the previous post on credential storage: the Add-on Builder was recently updated and now provides much better management of multiple credentials, with a “global account” pull-down selector you can use in your inputs and alert actions.