Showing posts with label forensics. Show all posts

Friday, July 07, 2017

Toolsmith #126: Adversary hunting with SOF-ELK

As we celebrate Independence Day, I'm reminded that we honor what was, of course, an armed conflict. Today's realities, when we think about conflict, are quite different than the days of lining troops up across the field from each other, loading muskets, and flinging balls of lead into the fray.
We live in a world of asymmetrical battles, often conflicts that aren't always obvious in purpose and intent, and likely fought on multiple fronts. For one of the best reads on the topic, take the well spent time to read TJ O'Connor's The Jester Dynamic: A Lesson in Asymmetric Unmanaged Cyber Warfare. If you're reading this post, it's highly likely that your front is that of 1s and 0s, either as a blue team defender, or as a red team attacker. I live in this world every day of my life as a blue teamer at Microsoft, and as a joint forces cyber network operator. We are faced, each day, with overwhelming, excessive amounts of data, of varying quality, where the answers to questions are likely hidden, but available to those who can dig deeply enough.
New platforms continue to emerge to help us in this cause. At Microsoft we have a variety of platforms that make the process easier for us, but no less arduous, to dig through the data, and the commercial sector continues to expand its offerings. For those with limited budgets and resources, but a strong drive for discovery, there have been outstanding offerings as well. Security Onion has been at the forefront for years, and is under constant development and improvement in the care of Doug Burks.
Another emerging platform, to be discussed here, is SOF-ELK, part of the SANS Forensics community, created by SANS FOR572, Advanced Network Forensics and Analysis author and instructor Phil Hagen. Count SOF-ELK in the NFAT family for sure, a strong player in the Network Forensic Analysis Tool category.
SOF-ELK has a great README, don't be that person, read it. It's everything you need to get started, in one place. What!? :-)
Better yet, you can download a fully realized VM with almost no configuration requirements, so you can hit the ground running. I ran my SOF-ELK instance with VMWare Workstation 12 Pro and no issues other than needing to temporarily disable Device Guard and Credential Guard on Windows 10.
SOF-ELK offers some good test data to get you started right out of the gate, in /home/elk_user/exercise_source_logs, including Syslog from a firewall, router, converted Windows events, a Squid proxy, and a server named muse. You can drop these on your SOF-ELK server in the /logstash/syslog/ ingestion point for syslog-formatted data. Additionally, utilize /logstash/nfarch/ for archived NetFlow output, /logstash/httpd/ for Apache logs, /logstash/passivedns/ for logs from the passivedns utility, /logstash/plaso/ for log2timeline, and /logstash/bro/ for, yeah, you guessed it.
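If you stage logs often, the ingestion points above are easy to capture in a small helper. What follows is only a sketch in Python 3, not anything that ships with SOF-ELK: the directory paths come from the list above, while the short labels (syslog, netflow, httpd and so on) and the function names are my own hypothetical choices.

```python
import shutil
from pathlib import Path

# Ingestion points as listed above. The dictionary keys are hypothetical
# labels chosen for this sketch, not SOF-ELK terminology.
INGEST_DIRS = {
    "syslog": "/logstash/syslog/",
    "netflow": "/logstash/nfarch/",
    "httpd": "/logstash/httpd/",
    "passivedns": "/logstash/passivedns/",
    "plaso": "/logstash/plaso/",
    "bro": "/logstash/bro/",
}


def ingest_path(log_type, root="/"):
    """Return the ingestion directory for a log type, below root."""
    if log_type not in INGEST_DIRS:
        raise ValueError("no ingestion point known for %r" % log_type)
    return Path(root) / INGEST_DIRS[log_type].strip("/")


def stage_log(src, log_type, root="/"):
    """Copy one log file into the matching ingestion directory."""
    dest_dir = ingest_path(log_type, root)
    # On the SOF-ELK VM these directories already exist; creating them
    # only matters when rehearsing against a scratch root.
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(str(src), str(dest_dir)))
```

Calling stage_log("access_log", "httpd") on the SOF-ELK server would copy the file into /logstash/httpd/. The root parameter exists only so you can try it against a temporary directory first.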
I mixed things up a bit and added my own Apache logs for the month of May to /logstash/httpd/. The muse log set in the exercise offering also included a DNS log (named_log), for grins I threw that in the /logstash/syslog/ as well just to see how it would play.
Run down a few data rabbit holes with me, I swear I can linger for hours on end once I latch on to something to chase. We'll begin with a couple of highlights from my Apache logs. The SOF-ELK VM comes with three pre-configured dashboards including Syslog, NetFlow, and HTTPD. You can learn more in the start page for the SOF-ELK UI, my instance is http://192.168.50.110:5601/app/kibana. There are three panels, or blocks, for each dashboard's details, at the bottom of the UI. I drilled through to the HTTPD Log Dashboard for this experiment, and immediately reset the time period for analysis (click the time marker in the upper right hand part of the UI). It defaults to the last 15 minutes, if you're reviewing older data it won't show until you adjust to match your time stamps. My data is from the month of May so I selected an absolute window from the beginning of May to its end. You can also select quick or relative time options, it's great to get comfortable here quickly and early. The resulting opening visualizations for me made me very happy, as seen in Figure 1.
Figure 1: HTTPD Log Dashboard
Nice! An event count summary, source ASNs by count (you can immediately see where I scanned myself from work), a fantastic Access Source map, a records graph by HTTP verbs, and one by response codes.
The beauty of these SOF-ELK dashboards is that they're immediately interactive and allow you to drill right in to interesting data points. The holisticinfosec.org website is intentionally flat and includes no active PHP or dynamic content. As a result, my favorite response code as a web application security tester, the 500 error, is notably missing. But, in both the timeline graphs we note a big traffic spike on 8 MAY 2017, which correlates nicely with my above-mentioned scan from work, as noted in the ASN hit count, and seen here in Figure 2.

Figure 2: Traffic spike from scan
This visualizes well but isn't really all that interesting or uncommon, particularly given that I know I personally ran the scan, and scans from the Intarwebs are a dime a dozen. What did jump out for me though, as seen back in Figure 1, was the presence of four PUT requests. That's usually a "bad thing" where some @$$h@t is trying to drop something on my server. Let's drill in a bit, shall we? After clicking the graph line with the four PUT requests, I quickly learned that two requests came from 204.12.194.234 AS32097: WholeSale Internet in Kansas City, MO and two came from 119.23.233.9 AS37963: Hangzhou Alibaba Advertising in Hangzhou, China. This is well represented in the HTTPD Access Source panel map (Figure 3).

Figure 3: Access Source
The PUT request from each included a txt file attempt, specifically dbhvf99151.txt and htjfx99555.txt, both were rejected, redirected (302), and sent to my landing page (200).
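If you don't have the dashboard handy, the same "count the verbs, then pull the PUTs" pass can be roughed out directly against an Apache access log. This is a minimal Python 3 sketch, assuming logs in the common/combined format. The regular expression and function names are mine, and the sample lines are invented, using documentation-range IP addresses rather than my real log entries.

```python
import re
from collections import Counter

# Matches the leading fields of an Apache common/combined log line:
# client IP, timestamp, request method, path, and response status.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count requests per HTTP verb and list every PUT request."""
    records = [r for r in map(parse_line, lines) if r]
    verbs = Counter(r["method"] for r in records)
    puts = [(r["ip"], r["path"], r["status"])
            for r in records if r["method"] == "PUT"]
    return verbs, puts


if __name__ == "__main__":
    # Invented sample lines for illustration only.
    sample = [
        '203.0.113.7 - - [08/May/2017:10:00:00 +0000] '
        '"GET /index.html HTTP/1.1" 200 5120',
        '203.0.113.9 - - [09/May/2017:11:30:00 +0000] '
        '"PUT /example.txt HTTP/1.1" 302 -',
    ]
    verbs, puts = summarize(sample)
    print(verbs)
    print(puts)
```

Lines that don't match the expected format are silently skipped, which is fine for a quick look but worth tightening if you need an exact count.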
Research on the IPs found that 119.23.233.9 was on the "real time suspected malware list as detected by InterServer's intrusion systems" as seen 22 MAY, and 204.12.194.234 was found twice in the AbuseIPDB, flagged on 18 MAY 2017 for Cknife Webshell Detected. Now we're talking. It's common to attempt a remote file include attack or a PUT, with what is a web shell. I opened up SOF-ELK on that IP address and found eight total hits in my logs, all looking for common PHP opportunities with the likes of GET and POST for /plus/mytag_js.php, noted in PHP injection attack attempts.
SOF-ELK made it incredibly easy to hunt down these details, as seen in Figure 4 from the HTTPD Discovery panel.
Figure 4: Discovery
That's a groovy little hunting trip through HTTPD logs, but how about a bit of Syslog? I spotted a likely oddity that could be correlated across a number of the exercise logs, we'll see if the correlation is real. You'll notice tabs at the top of your SOF-ELK UI, we'll use Discover for this experiment. I started from the Syslog Dashboard with my time range set broadly on the last two months. 7606 records presented themselves, sliced neatly by hosts and programs, as seen in Figure 5.

Figure 5: Syslog Dashboard
Squid proxy logs showed the predominance of host entries (6778 or 57.95% of 11,696 to be specific), so I started there. Don't laugh, but I'll often do keyword queries just to see what comes up, sometimes you land a pointer to a good rabbit hole. Within the body of 6778 proxy events, I searched malware. Two hits came back for GET requests via a JS redirector to bleepingcomputer.com for your basic how-to based on "random websites opening in Chrome". Ruh-roh.
Figure 6: Malware keyword
More importantly, we have an IP address to pivot on: 10.3.59.53. A search of that IP across the same 6778 Squid logs yielded 3896 entries specific to this IP, and lots to be curious about:
  • datingukrainewomen.com 
  • anastasiadate.com
  • YouTube videos for hair loss
  • crowdscience.com for "random pop-ups driving me nuts"
Do I need to build this user profile out for you, or are you with me? Proxy logs tell us so much, and are deeply worthy of your blue team efforts to collect and review.
I jumped over to the named_log from the muse host to see what else might reveal itself. Here's where I jumped to Discover, the Splunk-like query functionality inherent to SOF-ELK (and ELK implementations). I did a reductive query to see what other oddities might surface: 10.3.59.53 AND dns_query: (*.co.uk OR *.de OR *.eu OR *.info OR *.cc OR *.online OR *.website). I used these TLDs based on the premise that bots using Domain Generation Algorithms (DGA) will often use these TLDs. See The DGA of PadCrypt to learn more, as well as ISC Diary handler John Bambenek's OSINT logic. The query results were quite satisfying, 29 hits, including a number of clearly randomly generated domains. Those that were most interesting all included the .cc TLD, so I zoomed in further. Down to five hits with 10.3.59.53 AND dns_query: *.cc, as seen in Figure 7.
Figure 7: .CC TLD hits
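The same reductive logic is easy to rehearse outside of Kibana. The Python 3 sketch below assumes you've already pulled (client IP, queried domain) pairs out of your DNS logs by whatever means you like. The TLD list mirrors the query above, while the function names and the sample records are hypothetical.

```python
# TLDs from the Discover query above.
SUSPECT_TLDS = ("co.uk", "de", "eu", "info", "cc", "online", "website")


def has_suspect_tld(domain, tlds=SUSPECT_TLDS):
    """True if the domain ends in one of the listed TLDs."""
    name = domain.rstrip(".").lower()
    return any(name.endswith("." + tld) for tld in tlds)


def filter_queries(records, client_ip, tlds=SUSPECT_TLDS):
    """Keep domains queried by client_ip that end in a suspect TLD.

    records is an iterable of (client_ip, queried_domain) pairs.
    """
    return [domain for ip, domain in records
            if ip == client_ip and has_suspect_tld(domain, tlds)]


if __name__ == "__main__":
    # Invented records for illustration; not taken from the exercise data.
    records = [
        ("10.3.59.53", "qxzvkw.cc"),
        ("10.3.59.53", "www.example.com"),
        ("10.3.59.99", "abc.cc"),
    ]
    print(filter_queries(records, "10.3.59.53"))
    print(filter_queries(records, "10.3.59.53", tlds=("cc",)))
```

Narrowing from the full TLD list to just ("cc",) is the scripted equivalent of the zoom from 29 hits down to five.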
Oh man, not good. I had a hunch now, and went back to the proxy logs with 10.3.59.53 AND squid_request:*.exe. And there you have it, ladies and gentlemen, hunch rewarded (Figure 8).

Figure 8: taxdocs.exe
If taxdocs.exe isn't malware, I'm a monkey's uncle. Unfortunately, I could find no online references to these .cc domains or the .exe sample or URL, but you get the point. Given that it's exercise data, Phil may have generated it to entice us to dig deeper.
When we think about the IOC patterns for Petya, a hunt like this is pretty revealing. Petya's "initial infection appears to involve a software supply-chain threat involving the Ukrainian company M.E.Doc, which develops tax accounting software, MEDoc". This is not Petya (as far as I know) specifically but we see pattern similarities for sure, one can learn a great deal about the sheep and the wolves. Be the sheepdog!
Few tools in the free and open source arsenal are better suited to help you train and enhance your inner digital sheepdog than SOF-ELK. "I'm a sheepdog. I live to protect the flock and confront the wolf." ~ LTC Dave Grossman, from On Combat.

Believe it or not, there's a ton more you can do with SOF-ELK, consider this a primer and a motivator.
I LOVE SOF-ELK. Phil, well done, thank you. Readers rejoice, this is really one of my favorites for toolsmith, hands down, out of the now 126 unique tools discussed over more than ten years. Download the VM, and get to work herding. :-)
Cheers...until next time.

Wednesday, June 08, 2016

Toolsmith Feature Highlight: Autopsy 4.0.0's case collaboration

First, here be changes.
After nearly ten years of writing toolsmith exactly the same way once a month, now for the 117th time, it's time to mix things up a bit.
1) Tools follow release cycles, and often may add a new feature that could be really interesting, even if the tool has been covered in toolsmith before.
2) Sometimes there may not be a lot to say about a tool if its usage and feature set are simple and easy, yet useful to us.
3) I no longer have an editor or publisher that I'm beholden to, so there's no reason to only land toolsmith content once a month at the same time.
Call it agile toolsmith. If there's a good reason for a short post, such as a new release or feature, I'll do so immediately, and every so often, when warranted, I'll do a full coverage analysis of a really strong offering.
For tracking purposes, I'll use title tags (I'll use these on Twitter as well):
  • Toolsmith Feature Highlight
    • new feature reviews
  • Toolsmith Release Advisory
    • heads up on new releases
  • Toolsmith Tidbit
    • infosec tooling news flashes
  • Toolsmith In-depth Analysis
    • the full monty
That way you get the tl;dr so you know what you're in for.

On to our topic.
This is definitely in the "in case you missed it" category, I was clearly asleep at the wheel, but Autopsy 4.0.0 was released Nov 2015. The major highlight of this release is specifically the ability to set up a multi-user environment, including "multi-user cases supported that allow collaboration using network-based services." Just in case you aren't current on free and open source DFIR tools, "Autopsy® is a digital forensics platform and graphical interface to The Sleuth Kit® and other digital forensics tools." Thanks to my crew: Luiz Mello for pointing the v4 release out to me, and Mike Fanning for a perfect small pwned system to test v4 with.

Autopsy 4.0.0 case creation walk-through

I tested the latest Autopsy with an .e01 image I'd created from a 2TB victim drive, as well as against a mounted VHD.

Select the new case option via the opening welcome splash (green plus), the menu bar via File | New Case, or Ctrl+N:
New case
Populate your case number and examiner:
Case number and examiner
Point Autopsy at a data source. In this case I refer to my .e01 file, but I also mounted a VHD as a local drive during testing (an option under the select source type drop-down).
Add data source
Determine which ingest modules you'd like to use. As I examined both a large ext4 filesystem as well as a Windows Server VHD, I turned off Android Analyzer...duh. :-)
Ingest modules
After the image or drive goes through initial processing you'll land on the Autopsy menu. The Quick Start Guide will get you off to the races.

The real point of our discussion here is the new Autopsy 4.0.0 case collaboration feature, as pulled directly from Autopsy User Documentation: Setting Up Multi-user Environment

Multi-user Installation

Autopsy can be setup to work in an environment where multiple users on different computers can have the same case open at the same time. To set up this type of environment, you will need to configure additional (free and open source) network-based services.

Network-based Services

You will need the following that all Autopsy clients can access:

  • Centralized storage that all clients running Autopsy have access to. The central storage should be either mounted at the same Windows drive letter or UNC paths should be used everywhere. All clients need to be able to access data using the same path.
  • A central PostgreSQL database. A database will be created for each case and will be stored on the local drive of the database server. Installation and configuration is explained in Install and Configure PostgreSQL.
  • A central Solr text index. A Solr core will be created for each case and will be stored in the case folder (not on the local drive of the Solr server). We recommend using Bitnami Solr. This is explained in Install and Configure Solr.
  • An ActiveMQ messaging server to allow the various clients to communicate with each other. This service has minimal storage requirements. This is explained in Install and Configure ActiveMQ.

When you setup the above services, securely document the addresses, user names, and passwords so that you can configure each of the client systems afterwards.

The Autopsy team recommends using at least two dedicated computers for this additional infrastructure. Spreading the services out across several machines can improve throughput. If possible, place Solr on a machine by itself, as it utilizes the most RAM and CPU among the servers.

Ensure that the central storage and PostgreSQL servers are regularly backed up.

Autopsy Clients

Once the infrastructure is in place, you will need to configure Autopsy to use them.

Install Autopsy on each client system as normal using the steps from Installing Autopsy.
Start Autopsy and open the multi-user settings panel from "Tools", "Options", "Multi-user". As shown in the screenshot below, you can then enter all of the address and authentication information for the network-based services. Note that in order to create or open Multi-user cases, "Enable Multi-user cases" must be checked and the settings below must be correct.

Multi-user settings
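Before typing addresses into the Multi-user panel on each client, it can save some head scratching to confirm that the client can actually reach the three network services. Here's a rough Python 3 sketch for that. The ports are the usual defaults for PostgreSQL, Solr, and ActiveMQ (confirm them against your own installation), and the hostnames are hypothetical placeholders. It only tests that a TCP connection opens; it says nothing about credentials or service health.

```python
import socket

# Usual default ports for each service. The hostnames are hypothetical
# placeholders; substitute the addresses you documented during setup.
SERVICES = {
    "PostgreSQL": ("db.example.internal", 5432),
    "Solr": ("solr.example.internal", 8983),
    "ActiveMQ": ("mq.example.internal", 61616),
}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services=SERVICES):
    """Map each service name to True/False reachability."""
    return {name: is_reachable(host, port)
            for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print("%-10s %s" % (name, "reachable" if ok else "NOT reachable"))
```

Run it from each client system; a service that reports as not reachable usually points at a firewall rule or a service bound only to localhost.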
In closing

Autopsy use is very straightforward and well documented. As of version 4.0.0, the ability to utilize a multi-user environment is a highly beneficial feature for larger DFIR teams. Forensicators and responders alike should be able to put it to good use.
Ping me via email or Twitter if you have questions (russ at holisticinfosec dot org or @holisticinfosec).
Cheers…until next time.

Saturday, April 09, 2016

toolsmith #115: Volatility Acuity with VolUtility

Yes, we've definitely spent our share of toolsmith time on memory analysis tools such as Volatility and Rekall, but for good reason. I contend that memory analysis is fundamentally one of the most important skills you'll develop and utilize throughout your DFIR career.
By now you should have read The Art of Memory Forensics, if you haven't, it's money well spent, consider it an investment.
If there is one complaint, albeit a minor one, that analysts might raise specific to memory forensics tools, it's that they're very command-line oriented. While I appreciate this for speed and scripting, there are those among us who prefer a GUI. Who are we to judge? :-)
Kevin Breen's (@kevthehermit) VolUtility is a full function web UI for Volatility which fills the gap that's been on user wishlists for some time now.
When I reached out to Kevin regarding the current state of the project, he offered up a few good tidbits for user awareness.

1. Pull often. The project is still in its early stages and its early life is going to see a lot of tweaks, fixes, and enhancements as he finishes each of them.
2. If there is something that doesn’t work, could be better, or removed, open an issue. Kevin works best when people tell him what they want to see.
3. He's working with SANS to see VolUtility included in the SIFT distribution, and release a Debian package to make it easier to install. Vagrant and Docker instances are coming soon as well. 

The next two major VolUtility additions are:
1. Pre-Select plugins to run on image import.
2. Image Threat Score.

Notifications recently moved from notification bars to the toolbar, and there is now a right click context menu on the plugin output, which adds new features.

Installation

VolUtility installation is well documented on its GitHub site, but for the TLDR readers amongst you, here's the abbreviated version, step by step. This installation guidance assumes Ubuntu 14.04 LTS where Volatility has not yet been installed, nor have tools such as Git or Pip.
Follow this command set verbatim and you should be up and running in no time:
  1. sudo apt-get install git python-dev python-pip
  2. git clone https://github.com/volatilityfoundation/volatility
  3. cd volatility/
  4. sudo python setup.py install
  5. sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
  6. echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
  7. sudo apt-get update
  8. sudo apt-get install -y mongodb-org
  9. sudo pip install pymongo pycrypto django virustotal-api distorm3
  10. git clone https://github.com/kevthehermit/VolUtility
  11. cd VolUtility/
  12. ./manage.py runserver 0.0.0.0:8000
Point your browser to http://localhost:8000 and there you have it.

VolUtility and an Old Friend

I pulled out an old memory image (hiomalvm02.raw) from September 2011's toolsmith specific to Volatility where we first really explored Volatility, it was version 2.0 back then. :-) This memory image will give us the ability to do a quick comparison of our results from 2011 against a fresh run with VolUtility and Volatility 2.5.

VolUtility will ask you for the path to Volatility plugins and the path to the memory image you'd like to analyze. I introduced my plugins path as /home/malman/Downloads/volatility/volatility/plugins.


The image I'd stashed in Downloads as well, the full path being /home/malman/Downloads/HIOMALVM02.raw.


Upon clicking Submit, cats began loading stuffs. If you enjoy this as much as I do, the Help menu allows you to watch the loading page as often as you'd like.


If you notice any issues such as the image load hanging, check your console, it will have captured any errors encountered.
On my first run, I had not yet installed distorm3, the console view allowed me to troubleshoot the issue quickly.

Now down to business. In our 2011 post using this image, I ran imageinfo, connscan, pslist, pstree, and malfind. I also ran cmdline for good measure via VolUtility. Running plugins in VolUtility is as easy as clicking the associated green arrow for each plugin. The results will accumulate on the toolbar and the top of the plugin selection pane, while the raw output for each plugin will appear beneath the plugin selection pane when you select View Output under Actions.


Results were indeed consistent with those from 2011 but enhanced by a few features. Imageinfo yielded WinXPSP3x86 as expected, connscan returned 188.40.138.148:80 as our evil IP and the associated suspect process ID of 1512. Pslist and pstree then confirmed parent processes and the evil emanated from an ill-conceived click via explorer.exe. If you'd like to export your results, it's as easy as selecting Export Output from the Action menu. I did so for pstree, as it is that plugin from whom all blessings flow, the results were written to pstree.csv.


We're reminded that explorer.exe (PID 1512) is the parent for cleansweep.exe (PID 3328) and that cleansweep.exe owns no current threads but is likely the original source of pwn. We're thus reminded to explore (hah!) PID 1512 for information. VolUtility allows you to run commands directly from the Tools Bar, I did so with vol.py -f /home/malman/Downloads/HIOMALVM02.raw malfind -p 1512.
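The parent/child reasoning here is simple enough to sketch in a few lines of Python 3 if you'd rather walk process relationships in a script. To be clear, this does not parse VolUtility's pstree.csv export (I'm not assuming its column layout). It takes plain (pid, ppid, name) tuples, and the parent PID shown for explorer.exe is a placeholder since it isn't stated above.

```python
from collections import defaultdict


def build_children(processes):
    """Index (pid, ppid, name) tuples by parent PID."""
    children = defaultdict(list)
    for pid, ppid, name in processes:
        children[ppid].append((pid, name))
    return children


def descendants(children, pid):
    """Return every (pid, name) that descends from the given PID."""
    found = []
    seen = {pid}          # guards against loops from reused PIDs
    stack = [pid]
    while stack:
        current = stack.pop()
        for child_pid, name in children.get(current, []):
            if child_pid in seen:
                continue
            seen.add(child_pid)
            found.append((child_pid, name))
            stack.append(child_pid)
    return found


if __name__ == "__main__":
    # PIDs for explorer.exe and cleansweep.exe come from the text above;
    # the parent PID of 0 for explorer.exe is a placeholder.
    processes = [
        (1512, 0, "explorer.exe"),
        (3328, 1512, "cleansweep.exe"),
    ]
    print(descendants(build_children(processes), 1512))
```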


Rather than regurgitate malfind results as noted from 2011 (you can review those for yourself), I instead used the VolUtility Tools Bar feature Yara Scan Memory. Be sure to follow Kevin's Yara installation guidance if you want to use this feature. Also remember to git pull! Kevin updated the Yara capabilities between the time I started this post and when I ran yarascan. Like he said, pull often. There is a yararules folder now in the VolUtility file hierarchy, I added spyeye.yar, as created from Jean-Philippe Teissier's rule set. Remember, from the September 2011 post, we know that hiomalvm02.raw was taken from a system infected with SpyEye. I then selected Yara Scan Memory from the Tools Bar, and pointed to the just added spyeye.yar file.


The results were immediate, and many, as expected.


You can also use String Search / yara rule from the Tools Bar Search Type field to accomplish similar goals, and you can get very granular with your string searches to narrow down results.
Remember that your sessions will persist thanks to VolUtility's use of MongoDB, so if you pull, then restart VolUtility, you'll be quickly right back where you left off.

In Closing

VolUtility is a great effort, getting better all the time, and I find its convenience irresistible. Kevin's doing fine work here, pull down the project, use it, and offer feedback or contributions. It's a no-brainer for VolUtility to belong in SIFT by default, but as you've seen, building it for yourself is straightforward and quick. Add it to your DFIR utility belt today.
As always, ping me via email or Twitter if you have questions: russ at holisticinfosec dot org or @holisticinfosec.

ACK

Thanks to Kevin (@kevthehermit) Breen for VolUtility and his feedback for this post.

Thursday, January 03, 2013

toolsmith: Violent Python - A Book Review Applied to Security Analytics




Prerequisites/dependencies
Python interpreter
BackTrack 5 R3 is ideally suited to make immediate use of Violent Python scripts

Introduction
Happy New Year and congratulations on surviving the end of the world as we know it (nyah, nyah Mayan calendar). Hard to imagine we’re starting yet another year already; 2012 simply screamed by. Be sure to visit the HolisticInfoSec blog post for the 2012 Toolsmith Tool of the Year and vote for your favorite tool of 2012.
I thought I’d start off 2013 with a bit of a departure from the norm. Herein is the first treatment of a book as a tool where the content and associated code can be utilized to perform duties specific to the information security practitioner. I can think of no better book with which to initiate this approach than TJ O’Connor’s Violent Python, A Cookbook for Hackers, Forensic Analysts, Penetration Testers, and Security Engineers. Yes, this implies that you should buy the book; trust me, it’s worth every dime of the $34. Better still, TJ has donated all his proceeds to the Wounded Warrior Project. That said, I’ll post TJ’s three scripts we’ll discuss here so as to whet your appetite. I’ve had the distinct pleasure of working with TJ as part of the SANS Technical Institute’s graduate program where we, along with Beth Binde, wrote Assessing Outbound Traffic to Uncover Advanced Persistent Threat. I’ve known some extremely bright, capable information security experts in my day and I can comfortably say TJ is hands down amongst the very best of that small group. As part of his service as an officer in the U.S. Army (hooah) TJ has served as the course director for both computer exploitation and digital forensics at the US Military Academy and as a communications officer supporting tactical communications. His book maps nicely to a philosophy I embrace and incorporate in the workplace. Security monitoring, incident response (and forensics), and attack and penetration testing are the three pillars of security analytics, each feeding and contributing to the others in close cooperation. As an example, capable security monitoring inevitably leads to a need for incident response, and after mitigation and remediation have ensued, penetration testing is key to validating that corrective measures were successful, which in turn helps the monitoring team assess and tune detection and alerting logic. Security analytics: the information security circle of life. :-)
How does a book such as TJ’s Violent Python reverberate with this philosophy? How about entire chapters dedicated to each of the above mentioned pillars, including Python scripts for network traffic analysis (monitoring), forensic investigations (IR), as well as web recon and penetration testing. We’ll explore one script from each discipline shortly, but not before hearing directly from the author:
“In a lot of ways writing a book is a cathartic experience where you capture a lot of things you have done. All too often I'm writing scripts to achieve an immediate effect and then I throw away the script. For me the book was an opportunity to capture a lot of those small projects I've done and simplify the learning curve for others. My favorite example was the UAV takeover in the book. We show how to take over any really Ad-Hoc WiFi toys in under 70 lines of code. A few friends joked that I couldn't write a script in under 100 lines to crash a UAV. This was my chance to provide them a working concept and it worked! Unfortunately it left my daughter with a toy UAV cracked into several pieces as I refined the code. From a defensive standpoint, understanding a scripting language is absolutely essential in my opinion. The ability to parse data such as DNS traffic or geo-locate IP traffic (both shown in the book) can give a great deal of visibility. Forensics tools are great but the ability to build your own are even better. We show how to write tools to parse out iPhone backups for data and scrape for specific objects. The initial feedback from the book has been overwhelming and I've really enjoyed hearing positive feedback. No future plans right now but a good friend of mine has mentioned writing "Violent Powershell" so we'll see where that goes.”    
Violent Python provides readers the basis for scripts to attack network services, analyze digital artifacts, investigate network traffic for malicious activity, and data-mine social media, not to mention numerous other activities. This is a must-read book that includes a companion site with all the code discussed. Let’s take a closer look at three of these efficient and useful Python scripts.

Making Use of Violent Python

As noted above, I’ve posted the three scripts discussed in this section, along with the PCAP and PDF (malicious) discussed on my website. Email or Tweet for the zip passwords.
TJ suggests utilizing a BackTrack distribution given that many of the dependencies and libraries required to use the scripts in this book are inherent to BackTrack. We’ll follow suit on a BackTrack 5 R3 virtual machine. Before beginning, we’ll need to set up a few prerequisites. Execute easy_install pyPDF python-nmap pygeoip mechanize BeautifulSoup4 at the BT5R3 root prompt. This will install pygeoip as needed for our first exercise. I’m going to conduct these exercises a bit out of chapter sequence in order to follow the security analytics lifecycle starting with monitoring. This drops us first into Chapter 4 where we’ll utilize MaxMind’s GeoLiteCity to map IP addresses to cities. In order to do so, you’ll need to set up GeoLiteCity on BackTrack or your preferred system with the following steps:
1.  mkdir /opt/GeoIP
2.  cd /opt/GeoIP/
3.  wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
4.  gunzip GeoLiteCity.dat.gz
You’ll then need to edit line 7 of geoPrint.py to read gi = pygeoip.GeoIP('/opt/GeoIP/GeoLiteCity.dat') or download the updated copy of the script I’ve posted for you.

I’ve created a partially arbitrary scenario for you with which to walk through the security analytics lifecycle using Violent Python. To do so I’ll refer to what was, in 2009, an actual malicious domain, used to host shellcode for PDF-based malware attacks. I grabbed a malicious PDF sample from Contagio, an excellent sample resource. The IP address I associate with this domain is where I am taking creative liberties as the domain we’ll discuss, ax19.cn, no longer exists, and there is no record of what its IP address was when it was in use. The PCAP we’ll use here is one I edited with bittwiste to arbitrarily introduce a suspect Chinese IP address to what was originally a packet capture from a machine compromised by Win32.Banload.MC. I’ve shared this PCAP and the PDF as mentioned above so you can try the Python scripts with them for yourself.
In this scenario, your analysis machine is Linux only. Just you, a Python interpreter, and a shell; no fuss, no muss.
As we’re starting in the monitoring phase, imagine you have a network for which the traffic baseline is well understood. You can assert, from one particular high value VLAN, that at no time should you ever see traffic bound for China.  Your netflow monitoring for that VLAN is showing far more egress traffic bound for IP space that is not on your approved list established from learned baselines. You initiate a real-time packet capture to confirm. Capture (suspect.pcap) in hand, you’d like to validate that the host is indeed conversing with an IP address in China. Violent Python’s geoPrint.py script is a great place to start as it leverages the above-mentioned GeoLiteCity data from MaxMind along with the PyGeoIP library from Jennifer Ennis and dpkt. Execute python geoPrint.py -p suspect.pcap and you’ll see results as noted in Figure 1.

Figure 1: geoPrint.py confirms Chinese takeout
Your internal host (RFC 1918, and thus unregistered) with IP address 192.168.248.114 is clearly conversing with 116.254.188.24 in Beijing. Uh-oh.
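Separating internal (RFC 1918) sources from external destinations is a handy first filter before any geolocation lookup. The sketch below is mine, not one of TJ's scripts, and it assumes Python 3 for the standard ipaddress module (the book's scripts target the Python of BackTrack 5 R3). It takes (source, destination) address pairs that you've already extracted from a capture.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr):
    """True if addr falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETS)


def egress_pairs(flows):
    """Keep (src, dst) pairs where an internal host talks to the outside."""
    return [(src, dst) for src, dst in flows
            if is_rfc1918(src) and not is_rfc1918(dst)]


if __name__ == "__main__":
    # The first pair uses the two addresses discussed above; the second
    # is an invented internal-only conversation for contrast.
    flows = [
        ("192.168.248.114", "116.254.188.24"),
        ("192.168.248.114", "192.168.248.1"),
    ]
    print(egress_pairs(flows))
```

Only the pairs that survive this filter are worth handing to a GeoIP lookup, which keeps the output focused on traffic that actually leaves your network.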
Your team now moves into incident response mode and seizes the host in question. You interview the system’s user who indicates they received an email that the user thought was a legitimate help desk notification to read a new policy. The email had an attached PDF file which the user downloaded and opened. Your suspicions are heightened, so you grab a copy of the PDF and head back to your analysis workstation. You’re interested to see if there is any interesting metadata in the PDF that might help further your investigation. You refer to Chapter 3 of Violent Python which discusses Forensic Investigations with Python. The pdfRead.py script incorporates the PyPDF library which allows you to extract PDF document information (metadata) in addition to other capabilities. Execute python pdfRead.py -F suspect.pdf and dump the metadata as seen in Figure 2.

Figure 2: pdfRead.py dumps suspect PDF metadata
The author reference is a standout for you; from a workstation with a browser you search “Zeon Technical Publications” and find reference to it on VirusTotal and JSunpack; these results along with a quick MD5sum hash match indicate that this PDF is clearly malicious. The JSunpack reference indicates that shellcode phones home to www.ax19.cn (see Figure 3), a domain for which you’d now like to learn more.

Figure 3: JSunpack confirms an evil PDF
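The quick MD5 hash match mentioned above is easy to reproduce with Python's standard hashlib module. This is a minimal sketch in Python 3; the function names are my own, and the known hash would be whatever value a source such as VirusTotal reports for the sample (none is assumed here).

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_hash(path, known_md5):
    """Compare a file's MD5 to a published value, ignoring case and whitespace."""
    return md5_of_file(path) == known_md5.strip().lower()
```

Calling matches_known_hash("suspect.pdf", known_md5) with a hash copied from a report would confirm whether your copy is the same file that others have already analyzed.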
You could have sought anonymity to conduct the above-mentioned search, which leads us to the third pillar of our security analytics lifecycle. This third phase includes web recon as discussed in Chapter 6 of Violent Python, a common step in the attack and penetration testing discipline, to see what more we can learn about this malicious domain. As we often seek anonymity during the recon phase, Violent Python allows you to maintain a bit of stealth by leveraging the deprecated Google API, against which a few queries a day can still be executed. The newer API requires a developer’s key, which one can easily argue is not anonymous. Executing python anonGoogle.py -k 'www.ax19.cn' will return yet another validating result as seen in Figure 4.

Figure 4: anonGoogle matches ax19.cn to malicious activity
With seven rich chapters of Python goodness, TJ’s Violent Python represents a golden opportunity to expand your security analytics horizons. There is so much to learn here while accentuating your use of Python in your information security practice.

In Conclusion

I’m hopeful this slightly different approach to toolsmith was useful for you this month. I’m looking to shake things up a bit here in 2013 and am certainly open to suggestions you may have regarding ideas and approaches to doing so. Violent Python was a great read for me and a pleasure to put to use for both this article as well as in my personal tool box. I’m certain you’ll find this book equally useful.
Ping me via email if you have questions (russ at holisticinfosec dot org).
Cheers…until next month.

Acknowledgements

TJ O’Connor, Violent Python author
Mila Parkour, Contagio

Sunday, November 11, 2012

CTIN Digital Forensics Conference - No fluff, all forensics

For those of you in the Seattle area or willing to travel who are interested in digital forensics there is a great opportunity to learn and socialize coming up in March.
The CTIN Digital Forensics Conference will be March 13 through 15, 2013 at the Hilton Seattle Airport & Conference Center. CTIN, the Computer Technology Investigators Network, is a non-profit, free membership organization comprised of public and private sector computer forensic examiners and investigators focused on high-tech security, investigation, and prosecution of high-tech crimes.

Topics slated for the conference agenda are many, with great speakers to discuss them in depth:
Windows Time Stamp Forensics, Incident Response Procedures, Tracking USB Devices, Timeline Analysis with Encase, Internet Forensics, Placing the Suspect Behind the Keyboard, Social Network Investigations, Triage, Live CDs (WinFE & Linux)
F-Response and Intella, Lab - Hard drive repair, Mobile Device Forensics, Windows 7/8 Forensics
Child Pornography, Legal Update, Counter-forensics, Linux Forensics, X-Ways Forensics
Expert Testimony, ProDiscover, Live Memory Forensics, Encase, Open Source Forensic Tools
Cell Phone Tower Analysis, Mac Forensics, Registry Forensics, Malware Analysis, iPhone/iPad/other Apple products, Imaging Workshop, Paraben Forensics, Virtualization Forensics


Register before 1 DEC 2012 for $295, and $350 thereafter.

While you don't have to be a CTIN member to attend, I strongly advocate joining and supporting CTIN.

Friday, December 02, 2011

toolsmith: Registry Decoder









Prerequisites
Binaries require no external dependencies; working from a source checkout requires Python 2.6.x or 2.7.x and additional third-party apps and libraries.

Merry Christmas: “Christmas is not a time nor a season, but a state of mind. To cherish peace and goodwill, to be plenteous in mercy, is to have the real spirit of Christmas.” -Calvin Coolidge

Introduction
Readers of the SANS Computer Forensics Blog or Harlan Carvey’s Windows Incident Response blog have likely caught wind of Registry Decoder. Harlan even went so far as to say “sounds like development is really ripping along (no pun intended). If you do any analysis of Windows systems and you haven't looked at this tool as a resource, what's wrong with you?” When Registry Decoder was first released in September 2011, I spotted it via Team Cymru’s Dragon News Bytes mailing list and filed it away for future use. Then, in most fortuitous fashion, Andrew Case, one of the Volatility developers I’d reached out to for September’s Volatility column, contacted me regarding Registry Decoder in early November. Andrew co-develops Registry Decoder with Lodovico Marziale as part of Digital Forensic Solutions and kindly provided me with content for the remaining entirety of this introduction.

Registry Decoder is open source (GPL) and written completely in Python and is downloadable via Google Code projects. It was initially funded by the National Institute of Justice and now is funded by Digital Forensics Solutions.
Registry Decoder was devised to automate the acquisition, analysis, and reporting of registry contents. To accomplish this, there are actually two projects. The first is RegistryDecoder Live which allows for the safe acquisition of registry files from a live machine by forcing a system restore point, thus putting the currently active registry files into a read-only state in backup. It then reads these files from backup either in System Restore Points for XP or from the Volume Shadow Service on Windows Vista & Windows 7. As Registry Decoder Live acquires files, it creates a database that can then be imported into the second tool, Registry Decoder.
Registry Decoder can analyze registry files from a number of sources and then provide a number of GUI-driven analysis capabilities. The current version of the tool (1.1 as this is written) can import individual registry files, raw (dd) disk images, raw (dd) split images, Encase (E01) images, and databases from the live tool. Once evidence is imported and pre-processed, the investigator then has a number of analysis tools available and new evidence can be added to a case at any time.
Registry Decoder’s analysis capabilities include:
· Browsing Hives (similar to Access Data’s Registry Viewer)
· Hive Searching (more on this below)
· Plugin System (similar to regripper)
· Hive Differencing
· Timelining based on last write time
· Path Based Analysis
· Automated reporting of all of the above
Registry Decoder automates all of this functionality for any number of registry hives and the reporting can handle exporting results from multiple hives and analysis types into one report.

Andrew’s favorite Registry Decoder use case is USBSTOR analysis. Almost every case involving investigating a specific employee requires determining which (if any) USB drives were in use.  To do this with Registry Decoder, all an investigator has to do is create a case with the disk images or hives acquired, run the USBSTOR plugin, and then export the results. After pre-processing is done, it takes mere minutes to have a report created with the device name, serial number, etc. of any devices connected. Also, since Registry Decoder pulls historical files from live machines and disk images (System Restore & Volume Shadow Service), this analysis can be run across hives going back months or years.
Similarly, while investigating data exfiltration between multiple employees of a company, Andrew needed to know if they shared USB drives. To make the determination he took the SYSTEM files from each machine, loaded them into Registry Decoder and then used the plugin differencing ability on the USBSTOR plugin. It immediately revealed what drives were shared between computers, including their serial number.  Another common use of the differencing feature is with the Services plugin as this quickly identifies malware if you difference your known good disk image vs. a disk image of a machine suspected to be infected.
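The differencing idea behind both of these use cases boils down to set operations. The following is a minimal Python 3 sketch of that logic only, not Registry Decoder's plugin code; the machine names, serial numbers, and baseline service list are made up for illustration, while WmdmPmSp is the Lurid service name discussed later in this article.

```python
def shared_devices(devices_by_machine):
    """Map each serial seen on two or more machines to the sorted machines that saw it."""
    seen = {}
    for machine, serials in devices_by_machine.items():
        for serial in set(serials):
            seen.setdefault(serial, set()).add(machine)
    return {serial: sorted(machines) for serial, machines in seen.items() if len(machines) > 1}


def new_entries(known_good, suspect):
    """Return items present on the suspect system but absent from the known good baseline."""
    return sorted(set(suspect) - set(known_good))


# Hypothetical USBSTOR serial numbers per machine.
usb = {
    "employee-a": ["SERIAL-0001", "SERIAL-0002"],
    "employee-b": ["SERIAL-0002", "SERIAL-0003"],
}
print(shared_devices(usb))

# Hypothetical service lists from a known good image and a suspect image.
print(new_entries(["Dhcp", "Dnscache"], ["Dhcp", "Dnscache", "WmdmPmSp"]))
```

The inputs here would be whatever the USBSTOR and Services plugins export for each machine.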

Registry Decoder’s search feature is one of its strongest features. It allows you to search across any number of hives and filter by keys/values/names, last write time range, wildcard searching, and bulk searching with keyword files.
For a recent case, Andrew had to determine if a person was accessing files they shouldn’t have been looking at. They had a desktop and a laptop, both running XP and both with many System Restore Points. Andrew needed only to load the disk images from the two machines into Registry Decoder, make a text file with all the search terms, and then search all the terms across all the hives in the case (including historical ones). This returned results that he then exported into one report, and he was finished in less than 30 minutes. Another useful search feature: when viewing the search results tab, you can right-click any result and immediately jump into the Browse view positioned at that key.
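To make the bulk search workflow concrete, here is a minimal Python 3 sketch of searching a keyword list across several hives at once. It assumes the hives have already been parsed into plain dictionaries (hive name, then key path, then value name and data); that data model, the hive labels, and the sample keys are my own invention for illustration and do not reflect Registry Decoder's internal format.

```python
def load_terms(lines):
    """Normalize keyword-file lines: trim, lowercase, drop blanks."""
    return [line.strip().lower() for line in lines if line.strip()]


def bulk_search(hives, terms):
    """Return (hive, key_path, value_name, term) for every case-insensitive hit."""
    hits = []
    for hive_name, keys in hives.items():
        for key_path, values in keys.items():
            # A key with no values is still searchable by its path.
            candidates = list(values.items()) or [("", "")]
            for value_name, data in candidates:
                haystack = f"{key_path} {value_name} {data}".lower()
                for term in terms:
                    if term in haystack:
                        hits.append((hive_name, key_path, value_name, term))
    return hits


# Hypothetical parsed hives, including one from a restore point.
hives = {
    "desktop/SYSTEM (current)": {
        r"ControlSet001\Services\WmdmPmSp": {"Start": 2},
    },
    "laptop/NTUSER.DAT (restore point)": {
        r"Software\Example\RecentDocs": {"0": "payroll.xls"},
    },
}
terms = load_terms(["WmdmPmSp\n", "\n", "payroll\n"])
for hit in bulk_search(hives, terms):
    print(hit)
```

With a real keyword file, load_terms(open("terms.txt")) would feed the same search.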

Another good use case includes path-based analysis which allows you to determine if a registry path exists in any number of files. For whichever files it is present in, one can then export the path and optionally its key/value pairs. This is extremely useful in two situations:
1. Determining if certain software is installed (P2P, cracked software, etc.), as you can simply search any of the paths that the program creates and then export its key/values inclusive of when and where the software was installed.
2. During malware analysis, as most malware writes to the registry. Searching across numerous suspect systems for the malware’s path allows investigators to immediately determine the extent of infection.
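The path-based check in both situations amounts to asking which hives contain a given key. The sketch below, in Python 3, assumes hives already parsed into dictionaries of key path to values, which is an illustrative data model of mine and not Registry Decoder's own. Host names and values are made up; the WmdmPmSp service key is the Lurid artifact examined later in this article.

```python
def hives_containing(hives, path):
    """Map each hive holding `path` to the matching keys (and subkeys) with their values."""
    wanted = path.strip("\\").lower()  # registry paths are compared case-insensitively
    found = {}
    for hive_name, keys in hives.items():
        for key_path, values in keys.items():
            candidate = key_path.strip("\\").lower()
            if candidate == wanted or candidate.startswith(wanted + "\\"):
                found.setdefault(hive_name, {})[key_path] = dict(values)
    return found


# Hypothetical SYSTEM hives from two hosts.
hives = {
    "host-a/SYSTEM": {
        r"ControlSet001\Services\WmdmPmSp": {"Start": 2},
        r"ControlSet001\Services\WmdmPmSp\Parameters": {"ServiceDll": "example.dll"},
    },
    "host-b/SYSTEM": {
        r"ControlSet001\Services\Dhcp": {"Start": 2},
    },
}
infected = hives_containing(hives, r"ControlSet001\Services\WmdmPmSp")
print(sorted(infected))
```

The keys of the returned dictionary give the extent of infection; the nested values are what you would export for the report.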

Registry Decoder’s roadmap includes more analysis plugins and added support for memory analysis (integrate with Volatility’s existing in-memory registry functionality).
The developers also want to add support for analyzing previously deleted keys and name/value pairs within hives. The library utilized for enumerating hives, reglookup, already supports this functionality so it is just a matter of integration.


Running the Registry Decoder online acquisition component

I ran regdecoderlive32 on a 32bit Windows XP SP3 virtual machine infected with Lurid and regdecoderlive64 on a Windows 7 SP1 64bit machine.
One note for regdecoderlive32 on Windows XP systems with drives formatted with NTFS: even when running regdecoderlive32 with administrator privileges, the hidden System Volume Information directory is protected with unique ACLs. To work around this, issue cacls "C:\System Volume Information" /E /G :F, with the account being granted full control named immediately before :F, from a command prompt at the root of C: (this assumes the OS is installed on C:).
As seen in Figure 1, running regdecoderlive is as simple as executing and defining a few parameters including description, output directory (must be empty) and check boxes for acquisition of current and backup files.

Figure 1: Registry Decoder Live
Once acquisition is complete, the results directory will be populated with registryfiles/acquire_files.db and related files. This results directory can (should) be written to portable storage mounted on the target system or a network share, which can then be consumed by Registry Decoder for offline analysis.

Running the Registry Decoder offline analysis component

Registry Decoder can consume individual registry files, raw (dd) disk images, and Encase (E01) images, including split images. Building a case is as easy as adding a case name and number, investigator, comments, and case directory. Adding evidence to a case after initial processing is complete is quite simple; you’ll be prompted to add new evidence after choosing Start Case and opening an existing case.
I only tested Registry Decoder with the acquisition database acquired from a Lurid-infected Windows XP VM via Registry Decoder Live.
Initial processing can take some time depending on the number of restore points or volume shadows.
Once initial processing is complete however, Registry Decoder is nimble and effective.
I mimicked some of Andrew’s use cases in this analysis of a Lurid victim. From runtime analysis of the Lurid sample I had (md5: 84d24967cb5cbacf4052a3001692dd54) I knew a few key attributes to test Registry Decoder with. Services and registry keys created include WmdmPmSp. As the search functionality is a strong suit, I selected CORE from the current snapshot acquired and searched WmdmPmSp. Right-click search results and select Switch to File View then navigate to the Browser tab for key values, etc. as seen in Figure 2.

Figure 2: Registry Decoder search results
I made use of the timeline functionality and was amply rewarded. Imagine a scenario where you have a ballpark time window for a malware compromise or unauthorized access. You can filter the timeline window accordingly and produce output that is compliant with the SleuthKit’s mactime format. It’s not human readable currently (next release), so read it in with Autopsy or TSK. Timeline gathering and results are combined in Figure 3. It clearly identified exactly when Lurid wrote to HKLM\SYSTEM\CONTROLSET001\SERVICES\WmdmPmSp.
Figure 3: Registry Decoder timeline results
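Filtering a timeline to a ballpark window is simple to express in code. This minimal Python 3 sketch assumes each entry is a (last write time, key path) pair; the timestamps and the two extra keys are hypothetical, and only the WmdmPmSp key path comes from the analysis above. It does not produce mactime-formatted output.

```python
from datetime import datetime


def in_window(entries, start, end):
    """Return (last_write, key_path) pairs with start <= last_write <= end, oldest first."""
    return sorted((ts, key) for ts, key in entries if start <= ts <= end)


# Hypothetical last-write times for illustration.
entries = [
    (datetime(2011, 11, 20, 14, 5), r"HKLM\SYSTEM\CONTROLSET001\SERVICES\WmdmPmSp"),
    (datetime(2011, 10, 1, 9, 0), r"HKLM\SYSTEM\CONTROLSET001\SERVICES\Dhcp"),
    (datetime(2011, 11, 20, 13, 0), r"HKLM\SOFTWARE\Example"),
]
window = in_window(entries, datetime(2011, 11, 20), datetime(2011, 11, 21))
for ts, key in window:
    print(ts.isoformat(), key)
```

Narrowing the start and end values is the equivalent of tightening the timeline window in the GUI.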
I also tested USBSTOR (unrelated to Lurid) on both acquisitions (Windows 7 and Windows XP) and the results were accurate and immediate in both cases as seen in Figure 4.

Figure 4: Registry Decoder USBSTOR results
Explore the Plugins options included with Registry Decoder; the possibilities are endless. SYSTEM will provide you a nice summary overview as you begin, IE Typed URLs is great for investigating inappropriate browser use, Services with Perform Diff enabled is excellent for malware hunting, System Runs will give you instant gratification regarding what’s configured to run on startup, ACMRU queries the registry keys that record what has been typed into the Windows Search dialog box, and on and on and on. :-) Brilliant!

In Conclusion

I’m extremely excited about this tool and imagine its use at scale would be of incredible value to enterprise incident responders and forensic examiners. I’ve been chatting with Andrew at length while writing this and he continuously mentions pending features including some visualization options and the aforementioned Volatility interaction. I can’t wait; check Registry Decoder out for yourself ASAP.
Merry Christmas!
Ping me via email if you have questions (russ at holisticinfosec dot org).
Cheers…until next month.

Acknowledgements

Andrew Case, Registry Decoder developer and project lead

Moving blog to HolisticInfoSec.io

toolsmith and HolisticInfoSec have moved. I've decided to consolidate all content on one platform, namely an R markdown blogdown sit...