Splunk Stored Encrypted Credentials

I wrote about automating control of other systems from Splunk back in 2014. Things are very different now in terms of the frameworks and SDKs Splunk supports. I have been looking to update some of the existing stuff in my git repo using the Splunk Add-on Builder. It handles a lot of the work for you when integrating with Splunk.

We now have modular alerts, which are the evolution of the alert script stuff we were doing in 2014. Splunk also now has modular inputs, the old style custom search commands, and the new style custom search commands. In all cases, you may want to use credentials for a system that you do not want hard coded or left unencrypted in the code.

The Storage Passwords REST Endpoint

You will typically find two blog posts when you look into storing passwords in Splunk: mine from 2014, and the one from Splunk in 2011, which I referenced in my detailed post with code. Both posts mention a critical point: the access permissions needed to make it work.

Knowledge objects in Splunk run as the user that owns them. I am talking about the Splunk application user context, not the OS account you start Splunk under. If I run a search, save it as an alert, and then attach an alert action, the code that executes in the alert action has my Splunk user permissions. That is, the permissions of the owner of the search that triggered it at the time.

This is a critical point because you used to need a user capability known as ‘admin_all_objects’. Yes, that is as godlike as it sounds. It is normally assigned to the admin user role. That changed with Splunk 6.5.0. There is a new capability you can assign to a Splunk user role called ‘list_storage_passwords’. This lets your user account fetch from the storage passwords endpoint without being full admin over Splunk. It still suffers one downside: access is all or nothing. If you have this permission you can pull ALL encrypted stored passwords. Still, it is an improvement. Yes, it can be misused by Splunk users with the permission if they figure out how to pull the entire storage directly. You have to decide who your adversary is: the known Splunk user who could pull it out, or an attacker or red team person who finds credentials stored in scripts, either directly on the system or in a code repository. I vote for using the storage as the better of the two choices.

Stored Credentials:

Where are they actually stored? On that point I am not going to bother with old versions of Splunk. You should be life cycle maintaining your deployment so I am going to just talk about 6.5.0+.

You need to have a username, the password, a realm and which app context you want to put it in. Realm? Yeah that is a fancy name for what is this credential for because you might actually have five different accounts named admin. How do you know which is the admin you want for a given use? Let’s say I have the username gstarcher on the service adafruit.io. I want to store that credential so I can send IOT data to my account there. I also have an account named gstarcher on another service and I want Splunk to be able to talk to both services using different alerts or inputs or whatever. So I use the realm to say adafruitio, gstarcher, password to define that credential. I might have the other be like ifttt, gstarcher, apikey. I can tell them apart because of the realm.
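To make the realm idea concrete, here is a minimal sketch in plain Python. It does not talk to Splunk at all; the Credential tuple, the STORE list and the secret values are made up for illustration. It only shows why realm plus username together identify one credential when the username repeats.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for stored credentials. The realms and
# username mirror the example in the text; the secrets are placeholders.
Credential = namedtuple("Credential", ["realm", "username", "secret"])

STORE = [
    Credential("adafruitio", "gstarcher", "password-for-adafruit"),
    Credential("ifttt", "gstarcher", "apikey-for-ifttt"),
]


def find_credential(store, realm, username):
    """Return the credential matching both realm and username, else None."""
    for cred in store:
        if cred.realm == realm and cred.username == username:
            return cred
    return None


# Same username, different realm, different secret.
print(find_credential(STORE, "ifttt", "gstarcher").secret)
```

Matching on username alone would be ambiguous here, which is the whole reason the realm exists.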

Wait, what about app context? If you have been around Splunk long you know that all configurations and knowledge objects exist within “applications”, aka their app context. If you make a new credential via the API and do not tell the command what application you want it stored under, then it will use the one your user defaults to. That is most often the Search & Reporting app, aka search. That means if you look in $SPLUNK_HOME/etc/apps/search/local/passwords.conf you will find the credentials you stored.

Example passwords.conf entry:

Do you notice it is encrypted? Yeah, it will be encrypted ONLY if you add the password using the API calls. If you do it by hand in the .conf text file then it will remain unencrypted. Even after a Splunk restart. This is odd behavior considering it uses splunk.secret to auto encrypt passwords in files like server.conf on a restart. So don’t do that.
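As a rough sketch of what such an entry looks like and how you might check whether it was encrypted, the snippet below parses an illustrative stanza with Python's configparser. It assumes the `[credential:<realm>:<username>:]` stanza naming from the passwords.conf spec and assumes encrypted values start with a `$<digit>$` prefix. The password value is a made-up placeholder, not real Splunk ciphertext.

```python
import configparser

# Illustrative stanza only: realm and username come from the example in
# the text, and the password value is a placeholder, not real ciphertext.
SAMPLE = """
[credential:adafruitio:gstarcher:]
password = $1$PLACEHOLDERnotRealCiphertext
"""


def list_credentials(conf_text):
    """Return (realm, username, looks_encrypted) for each credential stanza."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    found = []
    for section in parser.sections():
        # Naive split: does not handle realms or usernames containing colons.
        parts = section.split(":")
        if len(parts) < 3 or parts[0] != "credential":
            continue
        realm, username = parts[1], parts[2]
        value = parser.get(section, "password", fallback="")
        # Assumption: encrypted values carry a "$<digit>$" style prefix.
        looks_encrypted = value.startswith("$") and value[2:3] == "$"
        found.append((realm, username, looks_encrypted))
    return found


print(list_credentials(SAMPLE))
```

A hand-typed clear text password would come back with the last field set to False, which is a cheap way to audit for entries that bypassed the API.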

How is it encrypted? It is encrypted using the splunk.secret private key for the Splunk install itself on that particular system. You can find that in $SPLUNK_HOME/etc/auth. That is why you tightly control who has access to your Splunk system at the OS level. Audit it, make alerts on SSH into it, etc. This file is needed as the software must have a way to know its own private key to decrypt things. Duane and I once wrote something in 30 minutes on a Saturday to decrypt passwords if you have the splunk.secret and conf files with encrypted passwords. So protect the private key.

Let me say this again. The app context ONLY matters in where the password lands from a passwords.conf perspective. The actual storage_passwords REST endpoint has no care in the world about app permissions for the user. It only checks if you have the capability list_storage_passwords. It will happily return every stored password to a GET call. It will ONLY filter results if you set the app name when you make the API connection back to the Splunk REST interface. If you don’t specify the app as a filter it will return ALL credentials stored. Other than that, it is up to you to use username and realm to grab just the credential you need in your code. Don’t like that? Then please, please log a Splunk support ticket of type Enhancement Request against the core Splunk product asking for it to be updated to be more granular and respect app context permissions. Be sure to give a nice paragraph of your particular use case. That helps their developer stories.
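Here is a minimal sketch of that filtering, assuming the Splunk Python SDK (splunklib) is importable and you already hold a session key. The function names and the FAKE list are mine, for illustration only. The SDK import is done lazily so the selection logic can be exercised without a Splunk instance.

```python
from types import SimpleNamespace


def pick_password(credentials, realm, username):
    """Select one clear password out of whatever the endpoint returned."""
    for cred in credentials:
        if cred.realm == realm and cred.username == username:
            return cred.clear_password
    return None


def fetch_credential(session_key, realm, username, app=None):
    """Connect back to Splunk and return the matching clear password."""
    # Imported lazily; assumes splunklib is bundled in the app's bin folder.
    import splunklib.client as client

    # Passing app scopes the connection; leaving it as None does not filter.
    service = client.connect(token=session_key, app=app)
    return pick_password(service.storage_passwords, realm, username)


# Stand-in objects shaped like the SDK's entries (realm, username,
# clear_password), so the selection can be shown without Splunk.
FAKE = [
    SimpleNamespace(realm="adafruitio", username="gstarcher", clear_password="a"),
    SimpleNamespace(realm="ifttt", username="gstarcher", clear_password="b"),
]
print(pick_password(FAKE, "ifttt", "gstarcher"))
```

Even with the app set, the realm and username check in your own code is what actually narrows things down to a single credential.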

Splunk Add-on Builder:

There are two ways the Splunk Add-on Builder handles “password” fields. First, if you place a password field in the Alert Actions Inputs panel for your alert, the Splunk GUI will obscure the password. The problem is that it is NOT encrypted. Let’s say you made this alert action. You attach your new alert action to a search. The password gets stored unencrypted in savedsearches.conf of the app where the search is saved.

The Add-on Builder provides an alternative solution that does encrypt credentials. You have to use the Add-on Setup Parameters panel and check the Add Account box. This lets you build a setup page you can enter credentials in for the TA. Those credentials will be stored in passwords.conf for the TA’s app context. There is one other issue. Currently the Add-on Builder internal libraries hard code the realm to be the app name. That is not great if you are making an Adaptive Response for Splunk Enterprise Security and want to reference credentials stored using the ES Credential Manager GUI. If you are making a TA that will never have multiple credentials that share the same username then this is still ok.

Patterns for Retrieval:

This is where everyone has the hardest time. Finding code examples on actually getting your credential back out. And it varies based on what you are making. So I am going to show an example for each type. Adapting it is up to you.

Splunklib Python SDK:

You will need to include the splunklib folder from the Splunk Python SDK in your app’s bin folder for the newer non-Intersplunk style patterns. Yeah I know, why should you have to keep putting copies of the SDK in an app on a full install of Splunk that already should have it? Well, there are reasons. I don’t get them all, but it has to do with design decisions and issues on paths, static vs dynamic linking concepts, etc. All best left to the Splunk dev teams. Splunk admins hate the result of larger application bundles, but it is what it is.

Adding a Cred Script:

This is just a quick script that assumes it is in a folder and the splunklib is a folder level up which is why the sys.path.append is what it is for this example. This is handy if you are a business with a central password control system. You could use this as a template on how to reach into Splunk to keep credentials Splunk needs in sync with the centrally managed credential.
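A minimal sketch of what such a script could look like, assuming the SDK's storage_passwords collection with its create(password, username, realm) and delete(username, realm) calls. The connection details, realm, username and password below are hypothetical placeholders.

```python
import sys


def store_credential(service, realm, username, password):
    """Create the credential, replacing any existing realm/username pair."""
    passwords = service.storage_passwords
    for cred in passwords:
        if cred.realm == realm and cred.username == username:
            # Remove the old entry so create() does not collide with it.
            passwords.delete(username, realm)
            break
    return passwords.create(password, username, realm)


def main():
    # splunklib is assumed to sit one folder level up from this script.
    sys.path.append("..")
    import splunklib.client as client

    # Hypothetical connection details; replace with your own.
    service = client.connect(host="localhost", port=8089,
                             username="admin", password="changeme",
                             app="search")
    store_credential(service, "adafruitio", "gstarcher", "new-password")
```

Call main() from your sync job. Because the credential goes in through the API, it lands encrypted in passwords.conf for the app given on the connection.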

Modular Alert: Manual Style

The trick is always how you get the session_key to work with. Traditional modular alerts send the information into the executed script via stdin. So here we grab stdin, parse it as JSON and pull off our session_key. Using that we can make a simple connection back to Splunk with the session_key and fetch the credential for the realm/username that are assumed to be set up in the modular alert configuration, which is also sent in that payload of information.
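The stdin handling can be sketched roughly like this. It assumes the JSON payload carries a top level session_key and a configuration object, and that the alert action defines parameters named realm and username (those two names are hypothetical). The SAMPLE string is a made-up payload for demonstration.

```python
import json
import sys


def parse_alert_payload(raw):
    """Pull the session key and the configured realm/username from the payload."""
    payload = json.loads(raw)
    config = payload.get("configuration", {})
    # "realm" and "username" are hypothetical alert parameter names.
    return payload["session_key"], config.get("realm"), config.get("username")


def main():
    # Splunk runs the alert script with --execute and sends JSON on stdin.
    if len(sys.argv) > 1 and sys.argv[1] == "--execute":
        session_key, realm, username = parse_alert_payload(sys.stdin.read())
        # Assumes splunklib is bundled in the app's bin folder.
        import splunklib.client as client
        service = client.connect(token=session_key)
        for cred in service.storage_passwords:
            if cred.realm == realm and cred.username == username:
                return cred.clear_password
    return None


# Made-up payload, trimmed to the fields used above.
SAMPLE = ('{"session_key": "abc123", '
          '"configuration": {"realm": "adafruitio", "username": "gstarcher"}}')
print(parse_alert_payload(SAMPLE))
```

The loop with the if on realm and username is the "credential for if" pattern: iterate everything the endpoint returns and keep only the one you asked for.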

Add-on Builder: Alert Action: Fetch realm other than app

Again it comes down to how do you obtain the session_key of the user that fires the knowledge object. The app builder has this great helper object and session_key is just a method hanging off it. We do not even have to grab stdin and parse it.

Add-on builder: Alert Action: App as realm

Just call their existing method. You only specify the username because the realm is hardcoded to the app name.

Custom Search Command: Old InterSplunk Style

In an old style custom search command the easiest pattern is to leverage the Intersplunk library to grab the sent “settings”, which includes the sessionKey field. After we have that we are back to our normal Splunk SDK client pattern. You can see we are just returning all credentials. You could use arguments on your custom search command to pass in the desired realm and username and borrow the credential for if pattern from the modular alert above. This assumes you have put the splunklib from the Splunk Python SDK in the bin folder of the app where your command exists. Also you must set passauth=true in the commands.conf where you define your search command.
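A hedged sketch of that old style pattern, assuming it runs under Splunk's own Python (where splunk.Intersplunk is importable) with splunklib in the app's bin folder. The helper name is mine, and the sketch deliberately outputs only realm and username rather than the clear passwords.

```python
def session_key_from_settings(settings):
    """Return the sessionKey Splunk passed in, or raise if it is missing."""
    key = settings.get("sessionKey")
    if not key:
        raise RuntimeError("No sessionKey; is passauth = true set in commands.conf?")
    return key


def main():
    # Both imports only resolve inside Splunk's Python with splunklib in bin.
    import splunk.Intersplunk as intersplunk
    import splunklib.client as client

    # getOrganizedResults hands back the incoming results plus the settings
    # dict, which includes sessionKey when passauth is enabled.
    results, dummy_results, settings = intersplunk.getOrganizedResults()
    service = client.connect(token=session_key_from_settings(settings))

    rows = [{"realm": cred.realm or "", "username": cred.username}
            for cred in service.storage_passwords]
    intersplunk.outputResults(rows)
```

Failing loudly when sessionKey is absent saves a lot of head scratching, since a missing passauth setting otherwise just looks like an authentication error on the connect.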

Custom Search Command: New v2 Chunked Protocol Style

The new v2 chunked style of search command gives us an already authenticated session connection via the self object. Here we don’t even need to find and handle the session_key. We just call self.service.storage_passwords to get all the credentials and leverage our usual SDK pattern to get the credential we want. The pattern does not show it, but you could pass realm and username in via arguments on your custom search command. You could then use the credential for if pattern from the modular alert example up above to grab just the desired credential.

Modular Input: Manual Style

I honestly recommend using the Add-on Builder these days. But if you want to use credentials with a manually built input, Splunk has documentation here: http://dev.splunk.com/view/SP-CAAAE9B#creds . Keep in mind you have to set up which username to send a session_key for by specifying the name in passAuth in the inputs.conf definition.

Modular Input: Add-on Builder

This works the same as our alert actions because of the helper object and the wrapping App Builder does for us. See Above on the other Add-on Builder examples. It is much easier to use and could be made to use the gui and named user creds.

Splunk Getting Extreme Part Three

We covered an example of an Anomalous Driven (AD) context in part one and how to use tstats in part two. Now we will cover a traditional Domain type context example using Authentication data and tstats.

In XS commands DD means Data Driven context. Here we will cover a use case using xsCreateDDContext of the type Domain. Using type=domain means we are going to need a count, a max, and a min. The terms we will use are minimal, low, medium, high, and extreme. This will let us find certain levels of activity without worrying about what “normal” is vs “anomalous” as we saw in part one.

Extreme Search Commands:


The Create method (xsCreateDDContext here) tells Extreme Search to create the container, or to populate or update all the classes if the container already exists. You have to use this if the container does not already exist.


The Update method (xsUpdateDDContext here) functions exactly like xsCreate except that it will NOT work if the container does not exist. It will return an error and stop.


The delete command will delete a SPECIFIC class, or “all” if no class is specified, from a context in a container. There is no XS command to actually remove the contents from the container. Deleting against a context/container without a class leaves all the class data, but searching against the context will act as if it does not exist. The deletion without a class removes the default class lines. From there XS commands act as if the context is gone, though most of the class data remains. This means the file exists with most of its file size intact. There is not even an XS command to remove an entire container. We can still cheat from within Splunk. Normally, you should NEVER touch the context files via the outputlookup command as it will often corrupt the file contents. But if we want to empty a container file we can just overwrite the csv file with empty contents. The CSV file name will be in the format: containername.context.csv

If we had made a context with:
| xsupdateddcontext name=mytest container=mytestContainer app=search scope=app class=src terms="minimal,low,medium,high,extreme"

We can nuke the contents of the file using the search:
| makeresults | outputlookup mytestContainer.context.csv

We can now populate that container with either xsCreate or xsUpdate. xsUpdate will work since the container file exists. This trick can be handy to reset a container and cull out accumulated data because the file has grown very large over time with use or if you accidentally fed too much data into it.

Let’s talk about that for a minute. What is too large? XS has to read in the entire CSV into memory when it uses it. That has the obvious implications. A data set of 10 rows with the normal 5 domain terms of “minimal,low,medium,high,extreme” gives us 56 lines in the csv. 10 data items plus a default data item = 11 * 5 = 55 plus a header row = 56. Generally, if you are going to have 10K data items going into a context I would make one container for it and not share that container with any other contexts. That way you are not reading in a lot of large data into memory you are not using with your XS commands like for xswhere filtering.
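The line count arithmetic above can be written as a tiny helper. This is just the same formula in Python, assuming one row per term for every data item plus the default item, and a single header row. The function name is mine.

```python
def context_csv_lines(data_items, terms=5):
    """Lines in the context CSV: (items + default item) * terms + header."""
    return (data_items + 1) * terms + 1


print(context_csv_lines(10))     # 56, matching the example above
print(context_csv_lines(10000))  # 50006 lines for a 10K item context
```

Every one of those lines gets read into memory each time a command such as xswhere touches any context in that container, which is why big contexts deserve their own container.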

One other thing to consider. The data size of this file is important in the Splunk data bundle replication. It is a csv file in the lookups folder and gets distributed with all the other data. If you made a context so large the CSV was 1.5GB in size you could negatively impact your search bundle replication and be in for the fun that brings.


The xsFindBestConcept command comes from the Extreme Search Visualization app. It lets you run data against your context and have it tell you what concept terms best match each result. This command has to work pretty hard, so if your data going in is large it may take a few minutes to come back.

| tstats summariesonly=true dc(Authentication.user) as userCount from datamodel=Authentication where (nodename=Authentication.Failed_Authentication sourcetype=linux_secure) by _time, Authentication.src, Authentication.app span=1d | rename Authentication.* AS * | xsFindBestConcept userCount FROM users_by_src_1d IN auth_failures BY "src,app"


The xsGetWhereCIX command acts like xswhere but does not actually filter results. It just displays ALL results that went in and what their CIX compatibility value is for the statement you used.

| tstats summariesonly=true count as failures, dc(Authentication.user) AS userCount from datamodel=Authentication where nodename=Authentication.Failed_Authentication by _time Authentication.src, Authentication.app span=1d | eval avgFailures=failures/userCount | rename Authentication.* AS * | xsgetwherecix avgFailures from failures_by_src_1d by "src,app" in auth_failures is extreme

Min and Max:

XS for the type=domain needs count, and min/max values with depth. This means where min/max are never equal. The fun part is HOW you get a min and max is up to you. You will see examples that just use the min() and max() functions. Other examples will get min() and make max the median()*someValue. You often have to experiment for what fits your data and gives you an acceptable result. We touched on this value spreading in part one of Getting Extreme.

Here are a couple of different patterns, though you can do it any way you like.

  1. stats min(count) as min, max(count) as max … | eval max=if(min=max,min+5,max) | eval max=if(max-min<5,min+5,max)
  2. stats min(count) as min, median(count) as median, avg(count) as average … | eval median=if(average-median<5,median+5,average) | eval max=median*2
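To see what the first pattern does to a few values, here is the same logic as a small Python function. It is only a sketch of the eval statements in pattern 1, with the padding of 5 pulled out as a parameter. The function name is mine.

```python
def spread_max(min_value, max_value, pad=5):
    """Mirror pattern 1: force max to sit at least `pad` above min."""
    # Covers both evals: min == max is just the case where the gap is 0.
    if max_value - min_value < pad:
        return min_value + pad
    return max_value


print(spread_max(3, 3))   # equal values get spread to 8
print(spread_max(3, 6))   # a gap under 5 is also widened to 8
print(spread_max(1, 10))  # already wide enough, stays 10
```

The padding value is a judgment call. With fail2ban capping tries at around 6, a pad of 5 is a big stretch of the scale, so experiment with what fits your data.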

If you don’t get min/max spread out you will see a message like the following when trying to generate your context.

xsCreateDDContext-W-121: For a domain context failures_by_src_1d with class, min must be less than max, skipping

Use Case: Authentication Abusive Source IPs

Question: We will define our question as: what are the source IPs that are abusing our system via authentication failures, by src and application type? We want to know by average failures/number of user accounts tried per day. We also want to know if it is simply an extreme number of user accounts failed regardless of the number of failures per day. Yeah, normally I would do by hour or a shorter period. The test data I have is from a Raspberry Pi exposed to the Internet. The RPi is sending to Splunk using the UF for Raspberry Pi. That RPi is also running fail2ban, so it limits the number of failures a source can cause before it is banned for a while. This means we will work with a scale that typically maxes out at 6 tries.

Avg Failures/userCount by src by day

Here we divide the number of failures by the number of users. This gives us a ball park number of failures for a user account from a given source. We could put user into the class but that would then make our trend too specific of being tied to a distinct src, app,user. We want more a threshold of failures per user per source in a day.

Context Gen:

| tstats summariesonly=true count as failures, dc(Authentication.user) AS userCount from datamodel=Authentication where nodename=Authentication.Failed_Authentication by _time Authentication.src, Authentication.app span=1d | eval avgFailures=failures/userCount | stats count, avg(avgFailures) as average, min(avgFailures) as min, max(avgFailures) as max by Authentication.src, Authentication.app | rename Authentication.* AS * | eval max=if(min=max,min+5,max) | xsCreateDDContext name=failures_by_src_1d app=search container=auth_failures scope=app type=domain terms="minimal,low,medium,high,extreme" notes="login failures by src by day" uom="failures" class="src,app"


Here we use the context to filter our data and find the extreme sources.

| tstats summariesonly=true count as failures, dc(Authentication.user) AS userCount from datamodel=Authentication where nodename=Authentication.Failed_Authentication by _time Authentication.src, Authentication.app span=1d | eval avgFailures=failures/userCount | rename Authentication.* AS * | xswhere avgFailures from failures_by_src_1d by "src,app" in auth_failures is extreme | iplocation prefix=src src | rename src_City AS src_city, src_Country AS src_country, src_Region as src_region, src_lon AS src_long | lookup dnslookup clientip AS src OUTPUT clienthost AS src_dns

Distinct User Count by src by day

Here we are going to trend the distinct number of users tried per source without regard to the number of actual failures.

Context Gen:

| tstats summariesonly=true dc(Authentication.user) as userCount from datamodel=Authentication where (nodename=Authentication.Failed_Authentication sourcetype=linux_secure) by _time, Authentication.src, Authentication.app span=1d | stats min(userCount) as min, max(userCount) as max, count by Authentication.src, Authentication.app | rename Authentication.* as * | eval max=if(min=max,min+5,max) | xsCreateDDContext name=users_by_src_1d app=search container=auth_failures scope=app type=domain terms="minimal,low,medium,high,extreme" notes="user count failures by src by day" uom="users" class="src,app"


Here we use the context to filter our data and find the sources with user counts above medium.

| tstats summariesonly=true dc(Authentication.user) as userCount from datamodel=Authentication where (nodename=Authentication.Failed_Authentication sourcetype=linux_secure) by _time, Authentication.src, Authentication.app span=1d | rename Authentication.* AS * | xswhere userCount from users_by_src_1d in auth_failures by "src,app" is above medium

Merge to get the most abusive sources by app

We can actually merge both of these searches together. This lets us run one search over a given time period, reducing our Splunk resource usage and giving us results that match either or both of our conditions.

Combined Search:

This search is bucketing the time range it runs across into days then compares to our contexts that were generated with day period as a target. Normally for an ES notable search you would not bucket time with the “by” and “span” portions as you would be only running the search over something like the previous day each day.

| tstats summariesonly=true count AS failures, dc(Authentication.user) as userCount, values(Authentication.user) as targetedUsers, values(Authentication.tag) as tag, values(sourcetype) as orig_sourcetype, values(source) as source, values(host) as host from datamodel=Authentication where (nodename=Authentication.Failed_Authentication sourcetype=linux_secure) by _time, Authentication.src, Authentication.app span=1d | eval avgFailures=failures/userCount | rename Authentication.* AS * | xswhere avgFailures from failures_by_src_1d by "src,app" in auth_failures is extreme OR userCount from users_by_src_1d in auth_failures by "src,app" is above medium | iplocation prefix=src src | rename src_City AS src_city, src_Country AS src_country, src_Region as src_region, src_lon AS src_long | lookup dnslookup clientip AS src OUTPUT clienthost AS src_dns

The thing to note about the CIX value is anything that is greater than 0.5 means it matched both our contexts to some degree. The 1.0 matched them both solidly. If the CIX is 0.5 or less it means it matches only one of the contexts to some degree. Notice, I used “is extreme” on one test and “is above medium” on the other. You can adjust the statements to fit your use case and data.

Bonus Comments:

You will notice in the searches above I added some iplocation and dnslookup commands. I also used the values and extra eval functions to add to the field value content of the results. This is something you want to do when making Enterprise Security notables. It helps give your security analysts data-rich notables that they might be able to triage without ever drilling down into the original event data.

To restrict an Active Directory Group to a single VPN Tunnel Group


Let’s say you have Cisco ACS up and running. It is already successfully talking to your Active Directory installation. You also already have an existing VPN Client remote configuration where the group policy name is “GP_VPN_ITNET” and the tunnel group name is “TG_VPN_ITNET”.

Now you have an Active Directory group called “RG_VPN_ITNET” and want to ensure that the only VPN remote access profile that group can use is the existing remote configuration.


Setting up SSH Alerts to iPhone

This is sort of a follow up to my SSH screencast series for remote access to your Mac.  Maybe you are paranoid like me and want to know when a connection has been made to your mac, when a wrong user name has been tried or even a failure to login on a good username.  You also want to know this no matter where you are.

I was inspired by the script written by Whitson Gordon, over at Macworld on automating turning off your wireless Airport interface.  Note what I have below has only been tested on my Snow Leopard setup.  I leave it up to you if you are on Leopard or even Tiger.  BTW update your system if you are as far back as Tiger. C’mon join the modern world.

You will have to have Growl installed, also install growlnotify and last you need a Growl to push notification service like Prowl.  Then have the Prowl app on your iPhone or iPad.

Read on for the scripts and how to get it all working.


Cisco – AAA Exclude Console Port for Local Backup access

Man. Today I was putting a core 4507R switch onto our Tacacs AAA controls. The main IT admin for that site got all fussy about “what if my Tacacs account is locked out and it’s an emergency?” He did not like the answer, “Well, call the corporate helpdesk to have it unlocked.” So I had to figure out how to make only the console port ignore Tacacs AAA and use the local login database instead. Here is what I had to add to the aaa commands.

  1. Create a local user account under global config mode.
    username local-MYNAMEHERE privilege 15 password MYPASSWORDHERE
  2. Next under global config mode
    aaa authentication login console local
    aaa authorization exec console local
    aaa authorization commands 0 console local
    aaa authorization commands 1 console local
    aaa authorization commands 15 console local
    aaa authorization console
  3. Then under the console line interface
    authorization commands 0 console
    authorization commands 1 console
    authorization commands 15 console
    authorization exec console
    login authentication console

When does 2 = 1 ?

Talk about the wrong way to make a piece of software. I was helping a friend get iGet working with his Mac. We did not want to leave SSH running on port 22. It was getting hit with all sorts of brute force user guessing attacks. Here are some examples:

Sep 4 09:29:27 sshd[13637]: Invalid user admin
Sep 4 09:29:37 sshd[13641]: Invalid user stud
Sep 4 09:29:45 sshd[13643]: Invalid user trash
Sep 4 09:29:51 sshd[13645]: Invalid user aaron
Sep 4 09:29:56 sshd[13647]: Invalid user gt05
Sep 4 09:30:00 sshd[13649]: Invalid user william
Sep 4 09:30:03 sshd[13651]: Invalid user stephanie
Sep 4 09:30:40 sshd[13664]: Invalid user gary from
Sep 3 16:51:06 sshd[10423]: Invalid user nagios
Sep 3 16:51:07 sshd[10425]: Invalid user backuppc
Sep 3 16:51:09 sshd[10427]: Invalid user wolfgang
Sep 3 16:51:10 sshd[10430]: Invalid user vmware
Sep 3 16:51:13 sshd[10432]: Invalid user stats
Sep 3 16:51:14 sshd[10434]: Invalid user kor
Sep 3 16:51:15 sshd[10436]: Invalid user wei
Sep 3 16:51:16 sshd[10438]: Invalid user cvsuser

Also we wanted to set up public key authentication instead of passwords. So we used his Apple AirPort Extreme to map an external port, say 3622, to 22 on his Mac in his home network. Then we whipped up a public-private key pair. “ssh-keygen -t rsa” was good enough to do that. We of course put a good strong passphrase on it.

Now things like iTerm and Cyberduck on the Mac worked great with his new setup, both the port and the private key. But he has this thing called iGet. It claims key support, but it did not work with key authentication if the private key had a passphrase. So we had to whip up a second keypair just for that program and append the new public key to the authorized_keys file in his .ssh folder. The worst part is the vendor tried to say that using keys is not inherently more secure than a password because iGet just uses SSH to start the connection then takes over with its own protocol. How stupid. And such a bad attitude. Cyberduck is free and it works way better on key support. Sure, iGet has some neat features like access to the remote machine’s Spotlight. But kiss that advantage goodbye once Leopard comes out. Personally, I would not give iGet my money for their product with that attitude and poor support for private keys with passphrases. By default keys get dumped right into your .ssh folder on a Mac. If there is no passphrase and someone somehow runs code that lets them grab the entire folder contents, they would have your access into machines via SSH. At least if it has a passphrase they still have to brute force the key just to use it.

After all, it is not two factor authentication if it is just a key without a passphrase. It’s one factor: something you have. Adding something you know (the passphrase) greatly improves the security, so in the world of iGet 2=1.

Paypal + Verisign PiP Token

I came back from Birmingham today to find my Paypal security token arrived yesterday. It came with a nice instruction card on how to add it to your Paypal account. Only problem: it’s wrong. I tried going to the Token link and logging in. It did not take me straight to adding the key. I had to go into my account page, then click the link to the security key. From there it is straightforward to add it.

As an added bonus.  If you use Verisign PiP for OpenID you can add the same token to logon authorization there as well.   Then all you do is go to your account details.  Click the Add Credential.  It wants two fields to be filled in.  The top one is the code on the back of your Paypal security token.  The long number over the barcode, the token’s serial number.  The second field is a keycode from the token.  So flip it back over and press the button.  Enter the displayed six digit number into the second field and submit.

Now when you log into Paypal or Verisign PiP you have to know your logon name, password AND the six digit number on your token at the time you sign in.  There is a slight difference in how you enter it though.  On Paypal, append the six digits to your password when you type in your password.  On Verisign logon as normal and it will prompt you for the six digit code after you submit your name and password.

That is it.  Now you have two factor authentication.  Your password (something you know) and  the code the security token provides (something you have).  Without the token your accounts cannot be used.