Splunk Importance of Indexes

I see a lot of folks new to Splunk having to work to mature their deployments because they did not tackle indexes early on. Indexes are how you control access to data and its retention period.

Consider a “traditional” starting Splunk deployment by a security group. You get the IT group to install the universal forwarder and send you logs. Up front they aren’t interested in more than making you go away so they can work the next support ticket. Later, they find out how much access to their own logs in Splunk can help operations succeed. By then everything is mixed together: your IDS, mail logs and web logs, including a lot they don’t need to see.

Splunk puts data into the index named “main” by default. Everyone with a login to Splunk can see this index. Once data is in an index, there is no simple move command to shift it into a new one.

It gets to be a bigger mess when you start installing apps. Some, like the *nix app, put everything into an index called “os”.

Naming Convention

You should set up different indexes as early as possible in a new deployment. Above all, use a naming convention. Sticking with the default retention period is OK. It’s six years, so you have time to shrink it later.
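As a quick check on that six year figure, here is a small sketch of the arithmetic. It assumes the stock frozenTimePeriodInSecs value of 188697600 seconds from indexes.conf; the helper names are made up for illustration.

```python
# Stock frozenTimePeriodInSecs from indexes.conf, in seconds (assumed default).
DEFAULT_FROZEN_SECS = 188_697_600
SECONDS_PER_DAY = 86_400


def retention_days(frozen_secs: int) -> float:
    """Convert a frozenTimePeriodInSecs value into days of retention."""
    return frozen_secs / SECONDS_PER_DAY


def frozen_secs_for_days(days: int) -> int:
    """Work out the frozenTimePeriodInSecs value for a retention target in days."""
    return days * SECONDS_PER_DAY


if __name__ == "__main__":
    days = retention_days(DEFAULT_FROZEN_SECS)
    print(f"default retention: {days:.0f} days, about {days / 365:.2f} years")
    print(f"90 day retention: {frozen_secs_for_days(90)} seconds")
```

The same conversion is handy later on when you do decide to shrink retention for a particular index.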

I follow this naming convention.
* os_windows_groupname
* os_linux_groupname
* os_windows_groupname_secondgroupname

  1. I use underscores in index names.
  2. This type of index is for OS related logs so it starts with os_.
  3. The first and often only groupname is the IT or organizational group that owns the systems and provides the logs.
  4. Optionally, you may have a system where developers and IT admins need to share log access. That is where I add _secondgroupname to the index name and send events for just those systems to this index.
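The convention above can be sketched as a small helper that builds and checks index names. This is only an illustration: the group names (netops, webdev) are hypothetical, and restricting group names to lowercase letters and digits is my own assumption so the underscores stay unambiguous.

```python
import re
from typing import Optional

# os_<platform>_<groupname>[_<secondgroupname>], per the convention above.
# Limiting groups to [a-z0-9] is an assumption, not a Splunk requirement.
INDEX_NAME = re.compile(r"^os_(windows|linux)_[a-z0-9]+(_[a-z0-9]+)?$")


def build_index_name(platform: str, group: str,
                     second_group: Optional[str] = None) -> str:
    """Build an index name and reject anything that breaks the convention."""
    parts = ["os", platform.lower(), group.lower()]
    if second_group:
        parts.append(second_group.lower())
    name = "_".join(parts)
    if not INDEX_NAME.match(name):
        raise ValueError(f"{name!r} does not follow the naming convention")
    return name


if __name__ == "__main__":
    print(build_index_name("linux", "netops"))
    print(build_index_name("windows", "netops", "webdev"))
```

A check like this is useful in whatever script or review step you use before adding a new index, so a stray name never makes it into production.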

Why do I follow this convention?

As mentioned, indexes in Splunk are the control mechanism for both access and data retention. User roles are granted access per index, and the retention period is set per index as well.

Searching with wildcards. Using this scheme you can set up a dashboard that leverages searches like:
index=os_linux_* sudo

If you save that search or build it into a dashboard, then a group with access to the dashboard sees only its own logs that match. The next group sees only theirs with the same dashboard. As security staff with permissions to all the indexes, you get to see ALL events. This also works well for eventtyping. Since eventtypes are defined by searches, you can ensure an eventtype for certain Windows events runs only across the Windows indexes, but across ALL of them via the wildcard.
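To make the effect concrete, here is a toy model of how one wildcard search resolves differently per role. It is not how Splunk implements role permissions; the index list, role names and allowed-index lists are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical indexes and role permissions, for illustration only.
ALL_INDEXES = ["main", "os_linux_netops", "os_linux_webdev", "os_windows_netops"]
ROLE_INDEXES = {
    "netops": ["os_linux_netops", "os_windows_netops"],
    "webdev": ["os_linux_webdev"],
    "security": ["os_*"],
}


def searchable_indexes(role: str, pattern: str) -> list:
    """Indexes that match the search wildcard AND that the role may search."""
    allowed = ROLE_INDEXES[role]
    return sorted(
        ix for ix in ALL_INDEXES
        if fnmatchcase(ix, pattern)
        and any(fnmatchcase(ix, allowed_pattern) for allowed_pattern in allowed)
    )


if __name__ == "__main__":
    for role in ROLE_INDEXES:
        print(role, searchable_indexes(role, "os_linux_*"))
```

The same os_linux_* pattern yields one index for each IT group and every Linux index for the security role, which is why a single shared dashboard works for everyone.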

The downside shows up when you are new to Splunk and not using the default index. There is a tendency to install some given Splunk app and expect it to just show data. Often these apps are coded to search only the default indexes or their own. You will have to dig into their code and find where to replace the app searches, etc. with your wildcard naming scheme to get it wired up. It is still worth the effort and saves you a lot of pain as your deployment matures.

For more about indexing be sure to read through the Splunk manual on Managing Indexes and Clusters.


Splunk Adding Simple Deployment Client Monitoring

This will only help you if you are using Deployment Server. This is an Enterprise server role so it won’t work if you are on the free license.

Sure, you can install the Deployment Monitor application. In fact, if you use Deployment Server (DS), I recommend that you use the app. But we want to be able to quickly see whether our IT admins have set up any new Splunk forwarders without telling us. So we will add some panels to our personal admin app and dashboard.

I like to whitelist log collection configs by assigning apps manually to the forwarders. However, I make all systems that phone home to my deployment server pick up the output app for my organization. This app just tells the forwarders how to talk to the indexers. It has nothing to do with what logs are picked up and what indexes they are sent to.

Keep in mind naming schemes. You should name your applications with your org name at the start of them to make them easy to spot.

In my serverclass.conf I have a stanza that assigns an application called “org_all_forwarder” to all forwarders (excluding my indexers) that talk to the DS. This app tells the Splunk Universal Forwarders how to send to the indexers. Nothing else is in this app.

We also assign a second app, “org_all_deploymentclient”, which contains the configuration for reporting to the DS. We won’t get into what is in these apps. This post is about a dashboard showing which forwarders are pulling down applications.

So to detect new forwarders I just need to see systems that pick up the all forwarder app and nothing else. That means I have not assigned any other applications to them.

We made a personal admin dashboard in the previous blog post on license summarization. Let’s add two panels to our personal admin dashboard that we will review daily. The data is in the _internal index for 28 days. That is the default retention period for that index. This doesn’t matter since we are only watching a week back.

Go into the application, MY-ADMIN, and the dashboard, My-Daily-Admin.

Create the Unassigned Forwarders Panel

  1. Click Edit->Edit Panels
  2. Click Add Panel
  3. Choose a title of “Unassigned Forwarders (past 7 days)”
  4. Paste the following into the search field
    index=_internal sourcetype=splunkd DeployedApplication Downloaded | rex "deployment\?name=(.+?):(?<ds_class>.+?):(?<ds_app>.+?)\s" | table _time, host, ds_class, ds_app | lookup dnsLookup hostname AS host | transaction host | search ds_class=org_all_forwarder | eval classCount=mvcount(ds_class) | where classCount=1 | table _time, host, ds_class, ds_app
  5. Change the time range to last 7 days and click Add Panel to save it.
  6. I like to leave this one as a statistics table visualization.

Create the Recent Forwarders Panel

  1. Click Edit->Edit Panels
  2. Click Add Panel
  3. Choose a title of “Recent Forwarders (past 7 days)”
  4. Paste the following into the search field
    index=_internal sourcetype=splunkd DeployedApplication Downloaded | rex "deployment\?name=(.+?):(?<ds_class>.+?):(?<ds_app>.+?)\s" | table _time, host, ds_class, ds_app | lookup dnsLookup hostname AS host | transaction host | search ds_class=org_all_forwarder | eval classCount=mvcount(ds_class) | where classCount>1 | table _time, host, ds_class, ds_app
  5. Change the time range to last 7 days and click Add Panel to save it.
  6. I like to leave this one as a statistics table visualization.
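The two panel searches differ only in the final where clause (classCount=1 versus classCount>1). If it helps to see that logic outside of Splunk, here is a rough Python sketch of it. The sample events are made up and only mimic the name=prefix:class:app shape the rex expects; the second server class and app names are hypothetical, and real splunkd log lines will look different.

```python
import re
from collections import defaultdict

# Python spelling of the rex above: (?P<name>...) instead of (?<name>...).
DEPLOY_RE = re.compile(
    r"deployment\?name=(.+?):(?P<ds_class>.+?):(?P<ds_app>.+?)\s"
)
BASE_CLASS = "org_all_forwarder"

# Made-up (host, raw event) pairs for illustration only.
SAMPLE_EVENTS = [
    ("web01", "Downloaded url=/deployment?name=default:org_all_forwarder:org_all_forwarder to file"),
    ("db01", "Downloaded url=/deployment?name=default:org_all_forwarder:org_all_forwarder to file"),
    ("db01", "Downloaded url=/deployment?name=default:os_linux_dba:org_linux_inputs to file"),
]


def classify(events):
    """Split hosts into (unassigned, assigned) by distinct server classes seen."""
    classes_by_host = defaultdict(set)
    for host, raw in events:
        match = DEPLOY_RE.search(raw)
        if match:
            classes_by_host[host].add(match.group("ds_class"))

    unassigned, assigned = [], []
    for host, classes in sorted(classes_by_host.items()):
        if BASE_CLASS not in classes:
            continue  # mirrors: search ds_class=org_all_forwarder
        if len(classes) == 1:
            unassigned.append(host)  # mirrors: where classCount=1
        else:
            assigned.append(host)    # mirrors: where classCount>1
    return unassigned, assigned


if __name__ == "__main__":
    print(classify(SAMPLE_EVENTS))
```

Grouping by host plays the role of transaction host here, and the set of classes stands in for the multivalue ds_class field that mvcount counts.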

There you go. Two more panels for your daily admin review.


Splunk New Certification Tracks

Splunk updated their entire product certification process for those who need to manage and administer Splunk. Previously, getting certified in Splunk was a game of collecting the Pokemon cards of each training course’s certificate of completion. That had a major downside for those of us experienced in Splunk. We could never get our employers to fund taking classes for material we knew well.

The process now involves an actual online exam. It is FREE. The courses can give you a very good foundation in the topics and prepare you for the exam. As with most certification exams the training and self study will cover the skill sets much deeper than the exam material alone can cover. I always recommend training when you can swing it as you never know what you do not know about a topic.

Splunk Certified Knowledge Manager

This certification covers operating and managing the various knowledge objects within the Splunk application. This is more about helping the users have a solid, consistent experience in using Splunk. It is not about the back end administration of the servers themselves.

The courses behind this certification are:
* Using Splunk
* Searching and Reporting
* Creating Splunk Knowledge Objects

Splunk Certified Admin

This certification is all about the technical administration of all aspects of Splunk: everything from licensing and deployment management to indexing. This is for you if you want to be the wizard behind the curtain.

There is just one course behind this certification. It is the combination of the old Admin and Advanced Admin courses.
* Splunk Administration

It does require that you have passed the Certified Knowledge Manager as the prerequisite.

Taking the Exams

Most of the folks I know have some experience with Splunk. For those people, I recommend you take the outlines for the courses behind each track. Highlight the agenda items for the areas that you know you are weak in. Set up a v6 Splunk instance to practice those areas. Watch the tutorial videos from the Splunk intro page when you log into Splunk. Last, be sure to read ALL the documentation related to the course material at least once.

Then you just email certification@splunk.com to request registration to take the exam. They will send you a personalized exam link in an email with details of the number of questions you have to pass for the particular exam. It will also tell you how long you have to take the exam once you start it. You can take the exam as many times as you need to pass it. But you have to wait two hours between attempts.

Good luck!


Splunk Setting up License Usage Trending

You get a good bit of license usage trending when you install the Deployment Monitor and Splunk on Splunk applications. If you don’t use those apps, you are relying on the _internal index, where data ages out over time and you lose your trends beyond approximately 30 days.

I prefer to set up my own index and collect the summarized usage data into it so I can keep it indefinitely and do easy graphs on the data in my daily admin dashboard. This is also handy on a Splunk instance, such as your admin laptop, where you do not have the CPU cores to spare for Deployment Monitor to be running a lot of scheduled searches.

Lastly, you may need this data over the long term so you can justify a larger Splunk license in your next budget as your average daily usage gets close to your license limit.
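As a rough illustration of the budget argument, here is a minimal sketch that compares average daily usage against a license quota. The daily figures, the 10 GB quota and the function name are all made up for the example; in practice the numbers would come from your summarized usage data.

```python
from statistics import mean


def average_utilization(daily_gb, quota_gb):
    """Average daily indexed volume as a fraction of the license quota."""
    return mean(daily_gb) / quota_gb


if __name__ == "__main__":
    # Hypothetical daily usage in GB against a hypothetical 10 GB/day license.
    usage = [6.0, 8.0, 10.0]
    print(f"average utilization: {average_utilization(usage, 10.0):.0%}")
```

A long history matters here because a single busy week says little, while a steady climb over many months is the kind of trend a budget request can lean on.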



Getting started with Splunk and my favorite starter applications.

Getting Started

I am often asked how to start looking at Splunk when someone gets interested. This is the same thing I do for myself.

  1. Get the latest build of Splunk and install it on a machine you can test with. Usually this is your daily use laptop or desktop.
  2. Consider your license options. Splunk licensing is based on how much data per day you index into Splunk for searching. The free license will let you index up to 500MB per day. One thing many Splunk administrators do is to get a development license for their personal workstation. This will let you index up to 10GB per day and unlock all the enterprise features. This is great for prototyping and testing your parsing, apps etc on your workstation before moving it to your production system.
  3. Change your default admin password on Splunk once you login for the first time. The last thing you want is to be in a coffee shop and have someone poking into data you have indexed into Splunk that you might not want to share.
  4. Change the web interface to use https. Sure it is the default Splunk SSL certificate but it is better than no encryption at all. Just enable it under Settings->System Settings->General Settings

If you do not end up using a development license, or your demo license runs out, be sure to firewall Splunk from being accessed outside your local machine. Refer back to my comment about someone in a coffee shop digging through your data.
