Fishing for Phishers

Earlier today I saw @averagesecguy tweet a Python script for submitting random credentials to a phishing site. This got my attention as I have manually done this to some of my phishing group “BFF”s before.

It can be entertaining to submit a honey token credential to a phishing campaign against your organization. Follow up with a Splunk alert on the credential to monitor sources, maybe even take an Active Defense approach to them.

It got me thinking. How could I glue this together for a sit-back-and-enjoy experience?

I have been working on a Splunk TA (technology add-on) for feeds. I’ve done automated response before in Splunk. I bet you see where this is headed.

The Idea:

  1. Take in the feed.
  2. Alert in Splunk on your Brand.
  3. Have the alert submit a random credential leveraging @averagesecguy’s script.
  4. Have the credential added to a Splunk KV store table of used honey credentials.
  5. Set up alerts and active response in Splunk based on any authentication hits against the KV store lookup.
  6. Grab the tartar sauce and enjoy.
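Steps 4 and 5 boil down to a membership check: does any authentication event use a username we previously fed to a phisher? In Splunk that would be a lookup against the KV store table. The Python below is only a minimal sketch of the logic. The `AuthEvent` shape, the `honey_hits` function, and the sample usernames and IPs are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthEvent:
    """Hypothetical shape of one authentication log event."""
    username: str
    src_ip: str


def honey_hits(events, honey_usernames):
    """Return the events whose username is a known honey credential."""
    wanted = {name.lower() for name in honey_usernames}
    return [event for event in events if event.username.lower() in wanted]


# Made-up sample data: one real user and one honey credential being tried.
events = [
    AuthEvent("jsmith", "198.51.100.7"),
    AuthEvent("MTaylor", "203.0.113.45"),
]
hits = honey_hits(events, {"mtaylor"})
for hit in hits:
    print(f"honey credential {hit.username} used from {hit.src_ip}")
```

The source IP on each hit is what you would feed to alerting or active response.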


Things to consider:

  1. Submit a modest number of random honey credentials per phishing link. With only one, the bad guys might never use it amongst the real ones obtained from your organization. With too many, they might notice and filter them out, for example because they all come from the same source IP.
  2. Conform the honey credentials to your organization’s username and password conventions. This makes them look real next to the genuine credentials captured from your organization.
  3. Make the submission mechanism use one or more appropriate source IPs for your org. If it’s traceable to a single source IP the bad guys could filter on it.
  4. Make sure your pool of random credentials does not contain usernames of real users, so your alerts and automation don’t hit folks you care about.
  5. If you get into automating defensive action, be sure to whitelist source IPs appropriately. It would be unpleasant if the bad guys tricked your defenses into shutting down traffic to things you care about.
  6. As the code evolves, take into account when each phishing page was discovered. Don’t submit to all of them, and don’t submit to ones that are too fresh. This reduces the chance of tipping off phishers who stand up a new site and watch whether the security team finds and hits it before they have sent it out in a real phishing email blast.
  7. Account for the source IPs of successful two-factor logins by your employees. You might use Duo Security with LastPass Enterprise as an example. That gives you source IPs you have high confidence really are your employees. You can tailor the response, alerting vs. active defense, accordingly.
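A few of the considerations above (a modest batch size, believable naming, and never colliding with a real user) can be sketched in Python. This is a rough illustration, not @averagesecguy’s script. The name lists, the "first initial plus last name" convention, and the password policy are assumptions you would swap for your own organization’s.

```python
import random
import string

# Hypothetical name pools; replace with names that fit your organization.
FIRST = ["alex", "casey", "jordan", "morgan", "riley", "taylor"]
LAST = ["baker", "diaz", "hughes", "nguyen", "patel", "reed"]


def make_password(rng, length=10):
    """Assumed policy: at least one lowercase, one uppercase, one digit."""
    required = [
        rng.choice(string.ascii_lowercase),
        rng.choice(string.ascii_uppercase),
        rng.choice(string.digits),
    ]
    pool = string.ascii_letters + string.digits
    chars = required + [rng.choice(pool) for _ in range(length - len(required))]
    rng.shuffle(chars)
    return "".join(chars)


def honey_batch(real_usernames, rng, low=3, high=6):
    """Build a modest batch of honey credentials for one phishing link.

    Usernames follow an assumed 'first initial + last name' convention,
    and anything matching a real username is excluded.
    """
    real = {name.lower() for name in real_usernames}
    candidates = sorted({f[0] + l for f in FIRST for l in LAST} - real)
    count = min(rng.randint(low, high), len(candidates))
    return {user: make_password(rng) for user in rng.sample(candidates, count)}


batch = honey_batch({"jbaker", "mreed"}, random.Random())
for user, password in batch.items():
    print(user, password)
```

Each generated username is what you would record in the KV store before submitting it.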

We know phishers use poor grammar to target the users most likely to fall for phishing. We can use a similar strategy: target the less sophisticated phishers with some simple automation and alerting. You could spice it up by adding automatic abuse reporting to the hosts of the phishing sites hitting your brand.

I will be trying out some coding on this. If I get it working reasonably well it will go up into my git repo as usual.


Splunk Dry Harder – Splunking the Laundry 2

I was originally going to call this revisit of my old Splunking the Laundry post, Heavy Duty Cycle. My former coworker Sean Maher instead suggested Dry Harder and I could not pass that up as the sequel. So we return to playing with Laundryview data. This is a fun service used in campuses and apartment buildings to let the residents track when there are available machines, check status of their wash and get alerts when done.

The original code was very primitive Python that scraped a specific Laundryview page for my apartment building’s laundry room. It formatted the data like syslog, and from there it went into Splunk.
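To give a feel for what "formatted like syslog" can mean, here is a minimal Python sketch that turns one scraped machine record into a timestamped key=value line, which Splunk extracts fields from easily. It is not the original script. The field names are borrowed from the searches later in this post, while the `format_event` helper, the `laundryview` tag, and the sample values are made up.

```python
from datetime import datetime, timezone


def format_event(machine, when):
    """Render one machine record as a timestamped key="value" line."""
    fields = " ".join(f'{key}="{value}"' for key, value in machine.items())
    return f"{when.isoformat()} laundryview {fields}"


# Made-up sample record for illustration.
machine = {
    "room_name": "EXAMPLE HALL",
    "type": "washer",
    "uniqueMachineID": "1234-01",
    "inUse": 1,
}
line = format_event(machine, datetime(2015, 6, 1, 12, 0, tzinfo=timezone.utc))
print(line)
```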

I decided remaking the code as a modular input was in order if I could make it scrape all shown machines from the page automatically. It works and you can find the TA-laundryview on my GitHub account. The readme does point out you need to know the laundry room (lr) code found in the URL you normally visit to see a room’s status.

Splunk can pull in any textual information you feed it, whether that is data generated by small devices like a Raspberry Pi 2 or scraped from a site like Laundryview to benefit from its existing machine data. Let’s explore a day’s data from UoA.

So here is the example I have been collecting from the University of Alabama laundry rooms. Note that I have defined an input for each laundry room on campus. Each input runs every 15 minutes and writes to index=laundry with sourcetype=laundry. I have found that a 15 minute interval gives enough resolution to be useful without hammering the site too hard.

UoA Laundry Rooms


A pair of stacked column graphs gives us a fun trend of washers and dryers in use for the entire campus population.

> index=laundry site_name="UNIVERSITY OF ALABAMA" type=washer | timechart span=15m count by inUse

UoA Washers

> index=laundry site_name="UNIVERSITY OF ALABAMA" type=dryer | timechart span=15m count by inUse

UoA Dryers

Next we make a Bubble chart panel to bring out the machines in an error status. We define that as Laundryview reporting an offline or out of service status. You will find I defined an eventtype for that.

> index=laundry | stats dc(uniqueMachineID) AS totalMachines by room_name, type | append [search index=laundry eventtype=machine_error | stats dc(uniqueMachineID) AS inError by room_name, type] | stats sum(totalMachines) AS totalMachines sum(inError) AS inError by room_name, type | eval failure=inError/totalMachines*100
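If the search is hard to read, the same arithmetic can be sketched in Python: count distinct machines per room and type, count the distinct ones seen in an error state, and divide. The tuple layout, the `failure_rates` name, and the sample rows are assumptions for illustration. Note that this sketch reports 0.0 for a room with no errors.

```python
from collections import defaultdict


def failure_rates(events):
    """events: iterable of (room_name, type, uniqueMachineID, in_error).

    Returns {(room_name, type): percent of distinct machines seen in error}.
    """
    total = defaultdict(set)
    errored = defaultdict(set)
    for room, kind, machine_id, in_error in events:
        total[(room, kind)].add(machine_id)
        if in_error:
            errored[(room, kind)].add(machine_id)
    return {key: 100 * len(errored[key]) / len(ids) for key, ids in total.items()}


# Made-up sample: two washers (one in error) and one healthy dryer.
sample = [
    ("ROOM A", "washer", "w1", False),
    ("ROOM A", "washer", "w2", True),
    ("ROOM A", "washer", "w2", True),   # repeated poll of the same machine
    ("ROOM A", "dryer", "d1", False),
]
rates = failure_rates(sample)
print(rates)
```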

UoA Machine Errors

Here we show it again, this time with a stats table below it to help spot the laundry rooms with the most unavailable machines.

UoA Machine Error Table

You can see we could run all sorts of trends on the data. Want to bet the laundry room usage plummets around the football game schedule? How about throwing machine errors on a map? I actually did make a lookup table of laundry room name to lat/long information. That is when I found out the default map tiles in Splunk do not have enough resolution to get down to a campus level. It gets you down to about the city level. Tuscaloosa in this case. So it was not worth showing.

Other questions you could answer from the data might be:

  1. Do we have any laundry rooms with functioning washers or dryers but none of the other type? Imagine how ticked students would be, stuck with a wet bunch of clothes and having to carry it to the next closest laundry room to dry it.

  2. How about alerting when the ratio of machines in an error state hits a certain level compared to the population available in a given laundry room?

  3. Could the data help you pick which housing area you want to live in at school?

  4. How long and how often do machines sit in an Idle status? This maps to a machine that has finished its cycle but no one has opened the machine door to handle the finished load. (eventtype=laundry_waiting_pickup)
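As one example, the first question is a small piece of set logic. The Python sketch below assumes each record is a (room_name, type, working) tuple and that the only machine types are washer and dryer. The `one_sided_rooms` name and the sample rooms are made up.

```python
def one_sided_rooms(machines):
    """machines: iterable of (room_name, type, working).

    Returns rooms where exactly one machine type has a working unit.
    """
    working = {}
    for room, kind, ok in machines:
        kinds = working.setdefault(room, set())
        if ok:
            kinds.add(kind)
    return sorted(room for room, kinds in working.items() if len(kinds) == 1)


# Made-up sample: ROOM B has a working washer but its only dryer is down.
sample_rooms = [
    ("ROOM A", "washer", True),
    ("ROOM A", "dryer", True),
    ("ROOM B", "washer", True),
    ("ROOM B", "dryer", False),
]
stranded = one_sided_rooms(sample_rooms)
print(stranded)
```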

The possibilities are quite fun to play with. Enjoy!