Monday, August 06, 2018

Moving blog to HolisticInfoSec.io

toolsmith and HolisticInfoSec have moved.
I've decided to consolidate all content on one platform, namely an R Markdown blogdown site using Hugo for static HTML generation. My frustration with Blogger/Blogspot met its limit when a completed draft of toolsmith #134 vanished into thin air, with no prospect of recovery. I'm not a fan of losing hours and hours of work to the simple act of an accidental tab refresh.
As such, I've been meaning to do this for a while now, so I bought holisticinfosec.io and mastered blogdown as fast as possible.
toolsmith will continue to publish on a regular basis and will be the focal content for holisticinfosec.io.
Old content will remain right here, I'm not pulling it down or republishing it.
Update your feed readers, favorites, and bookmarks accordingly (holisticinfosec.org redirects to holisticinfosec.io), and see you at holisticinfosec.io for a far more modern, social experience.
Do note that the site overall is still a work in progress, but it will be finished soon.
Regardless, toolsmith #134 Shodan As A Verb - Find The Fail Before It Finds You is ready for your reading pleasure.
Cheers, see you there.
Russ

Sunday, June 03, 2018

toolsmith #133 - Anomaly Detection & Threat Hunting with Anomalize

When, in October and November's toolsmith posts, I redefined DFIR under the premise of Deeper Functionality for Investigators in R, I discovered a "tip of the iceberg" scenario. To that end, I'd like to revisit the concept with an additional discovery and opportunity. This is really a case of DFIR (Deeper Functionality for Investigators in R) within the general practice of the original and paramount DFIR (Digital Forensics/Incident Response).
As discussed here before, those of us in the DFIR practice, and Blue Teaming at large, are overwhelmed by data and scale. Success truly requires algorithmic methods. If you're not already invested here I have an immediately applicable case study for you in tidy anomaly detection with anomalize.
First, let me give credit where entirely due for the work that follows. Everything I discuss and provide is directly derivative of Business Science (@bizScienc), specifically Matt Dancho (@mdancho84). He created anomalize, "a tidy anomaly detection algorithm that’s time-based (built on top of tibbletime) and scalable from one to many time series," when a client asked Business Science to build an open source anomaly detection algorithm that suited their needs. I'd say he responded beautifully; when his blog post hit my radar via R-Bloggers, it lived as an open tab in my browser for more than a month until it culminated in this toolsmith. Please consider Matt's post a mandatory read as step one of the process here. I'll quote Matt specifically before shifting context: "Our client had a challenging problem: detecting anomalies in time series on daily or weekly data at scale. Anomalies indicate exceptional events, which could be increased web traffic in the marketing domain or a malfunctioning server in the IT domain. Regardless, it’s important to flag these unusual occurrences to ensure the business is running smoothly. One of the challenges was that the client deals with not one time series but thousands that need to be analyzed for these extreme events."
Key takeaway: Detecting anomalies in time series on daily or weekly data at scale. Anomalies indicate exceptional events.
Now shift context with me to security-specific events and incidents, as they pertain to security monitoring, incident response, and threat hunting. In my November 2017 post, recall that I discussed Time Series Regression with the Holt-Winters method and a focus on seasonality and trends. Unfortunately, I couldn't share the code for how we applied TSR, but I pointed out alternate methods, including Seasonal and Trend Decomposition using Loess (STL):
  • Handles any type of seasonality ~ can change over time
  • Smoothness of the trend-cycle can also be controlled by the user
  • Robust to outliers
Here now, Matt has created a means to immediately apply the STL method, along with the Twitter method (reference page), as part of his time_decompose() function, one of three functions specific to the anomalize package. In addition to time_decompose(), which separates the time series into seasonal, trend, and remainder components, anomalize includes:
  • anomalize(): Applies anomaly detection methods to the remainder component.
  • time_recompose(): Calculates limits that separate the “normal” data from the anomalies.
The methods used in anomalize(), including IQR and GESD, are described in Matt's reference page. Matt ultimately set out to build a scalable adaptation of Twitter's AnomalyDetection package in order to address his client's challenges in dealing with not one time series but thousands needing to be analyzed for extreme events. You'll note that Matt describes anomalize using a dataset of the daily download counts of the 15 tidyverse packages from CRAN, relevant as he leverages the tidyverse package. I initially toyed with tweaking Matt's demo to model downloads for security-specific R packages (yes, there are such things) from CRAN, including RAppArmor, net.security, securitytxt, and cymruservices, the latter two courtesy of Bob Rudis (@hrbrmstr) of our beloved Data-Driven Security: Analysis, Visualization and Dashboards. Alas, this was a mere rip and replace, and really didn't exhibit the use of anomalize in a deserving, varied, truly security-specific context. That said, I was able to generate immediate results doing so, as seen in Figure 1.

Figure 1: Initial experiment
As an initial experiment you can replace package names with those of your choosing in tidyverse_cran_downloads.R, run it in RStudio, then tweak variable names and labels in the code per Matt's README page.
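For reference, here's a minimal sketch of that rip and replace; the cranlogs package and the 2017 date window are my assumptions, not Matt's exact parameters:

library(tidyverse)
library(cranlogs)
library(tibbletime)
library(anomalize)

# daily CRAN download counts for a few security-specific packages
security_downloads <- cran_downloads(
    packages = c("RAppArmor", "net.security", "securitytxt", "cymruservices"),
    from = "2017-01-01", to = "2017-12-31") %>%
  group_by(package) %>%          # one group per time series
  as_tbl_time(index = date)      # promote to a time-aware tibble

# decompose, detect, recompose, and plot in one pipe
security_downloads %>%
  time_decompose(count) %>%
  anomalize(remainder) %>%
  time_recompose() %>%
  plot_anomalies(time_recomposed = TRUE, ncol = 2, alpha_dots = 0.5)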
I wanted to run anomalize against a real security data scenario, so I went back to the dataset from the original DFIR articles, where I'd utilized counts of 4624 Event IDs per day, per user, on a given set of servers. As utilized originally, I'd represented results specific to only one device and user, but herein lies the beauty of anomalize: we can achieve quick results across multiple time series (multiple systems/users). This premise is but one of many where time series analysis and seasonality can be applied to security data.
I originally tried to write log data from log.csv straight into a tibble (ready your troubles with tibbles jokes) in an anomalize.R script with logs = read_csv("log.csv"), but it was not parsed accurately, particularly the time attributes. To correct this, I grabbed tidyverse_cran_downloads.R from Matt's GitHub and modified it as follows:
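Here's a hedged sketch of that modification; the date, server, and count column names are my placeholders, as is the date format string, so adjust both to your log schema:

library(tidyverse)
library(tibbletime)

security_access_logs <- read_csv("log.csv") %>%
  # parse the time attribute explicitly rather than trusting the default guess
  mutate(date = as.Date(date, format = "%m/%d/%Y")) %>%
  # one group per server yields one time series per server
  group_by(server) %>%
  # promote to a time-aware tibble, indexed on date
  as_tbl_time(index = date)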
This helped greatly, thanks to the tibbletime package, which "is an extension that allows for the creation of time aware tibbles. Some immediate advantages of this include: the ability to perform time based subsetting on tibbles, quickly summarising and aggregating results by time periods." Guess what, Matt wrote tibbletime too. :-)
I then followed Matt's sequence as he posted on Business Science, but with my logs defined as a function in Security_Access_Logs_Function.R. Below, I'll give you the code snippets, as revised from Matt's examples, followed by their respective results, specific to processing my Event ID 4624 daily count log.
First, let's summarize daily login counts across three servers over four months.
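A sketch of the summary plot, again using my placeholder column names:

security_access_logs %>%
  ggplot(aes(date, count)) +
  geom_point(color = "#2c3e50", alpha = 0.25) +
  facet_wrap(~ server, scales = "free_y", ncol = 3) +
  labs(title = "Event ID 4624 daily counts per server", x = "", y = "Logon count")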
The result is evident in Figure 2.

Figure 2: Server logon counts visualized
Next, let's determine which daily logons are anomalous with Matt's three main functions, time_decompose(), anomalize(), and time_recompose(), along with the visualization function, plot_anomalies(), across the same three servers over four months.
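The pipe, adapted from Matt's example and hedged with my placeholder names; time_decompose() and anomalize() default to the STL and IQR methods respectively:

security_access_logs %>%
  time_decompose(count) %>%
  anomalize(remainder) %>%
  time_recompose() %>%
  plot_anomalies(time_recomposed = TRUE, ncol = 3, alpha_dots = 0.5)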
The result is revealed in Figure 3.

Figure 3: Security event log anomalies
Following Matt's method using Twitter’s AnomalyDetection package, combining time_decompose(method = "twitter") with anomalize(method = "gesd"), and setting trend = "4 months" to adjust the median spans, we'll focus only on SERVER-549521.
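A sketch of that combination, with the server column name again my placeholder:

security_access_logs %>%
  filter(server == "SERVER-549521") %>%
  time_decompose(count, method = "twitter", trend = "4 months") %>%
  anomalize(remainder, method = "gesd") %>%
  time_recompose() %>%
  plot_anomalies(time_recomposed = TRUE)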
In Figure 4, you'll note that there are anomalous logon counts on SERVER-549521 in June.
Figure 4: SERVER-549521 logon anomalies with Twitter & GESD methods
We can compare the Twitter (time_decompose) and GESD (anomalize) methods with the STL (time_decompose) and IQR (anomalize) methods, which use different decomposition and anomaly detection approaches.
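The comparison run is the same pipe with the alternate methods named explicitly, again a sketch with my placeholder names:

security_access_logs %>%
  filter(server == "SERVER-549521") %>%
  time_decompose(count, method = "stl", trend = "4 months") %>%
  anomalize(remainder, method = "iqr") %>%
  time_recompose() %>%
  plot_anomalies(time_recomposed = TRUE)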
Again, we note anomalies in June, as seen in Figure 5.
Figure 5: SERVER-549521 logon anomalies with STL & IQR methods
Obviously, the results are quite similar, as one would hope. Finally, let's use Matt's plot_anomaly_decomposition() to visualize the inner workings of how the algorithm detects anomalies in the remainder for SERVER-549521.
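Note that plot_anomaly_decomposition() wants a single time series, hence the ungroup() in this sketch:

security_access_logs %>%
  filter(server == "SERVER-549521") %>%
  ungroup() %>%
  time_decompose(count) %>%
  anomalize(remainder) %>%
  plot_anomaly_decomposition()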
The result is a four-part visualization, including observed, season, trend, and remainder, as seen in Figure 6.
Figure 6: Decomposition for SERVER-549521 Logins
I'm really looking forward to putting these methods to use at a much larger scale, across a far broader event log dataset. I firmly assert that blue teams are already way behind in combating automated adversary tactics and problems of sheer scale, so...much...data. It's only with tactics such as Matt's anomalize, and others of its ilk, that defenders can hope to succeed. Be sure to watch Matt's YouTube video on anomalize; Business Science is building a series of videos in addition, so keep an eye out there and on their GitHub for more great work to which we can apply a blue team/defender's context.
All the code snippets are in my GitHub Gist here, and the sample log file, a single R script, and a Jupyter Notebook are all available for you on my GitHub under toolsmith_r. I hope you find anomalize as exciting and useful as I have. Great work by Matt; I'm looking forward to seeing what's next from Business Science.
Cheers...until next time.

Tuesday, April 03, 2018

toolsmith #132 - The HELK vs APTSimulator - Part 2


Continuing where we left off in The HELK vs APTSimulator - Part 1, I will focus our attention on additional, useful HELK features to aid you in your threat hunting practice. HELK offers Apache Spark, GraphFrames, and Jupyter Notebooks as part of its lab offering. These capabilities scale well beyond a standard ELK stack; this really is where parallel computing and significantly improved processing and analytics truly take hold. This is a great way to introduce yourself to these technologies, all on a unified platform.

Let me break these down for you a little bit in case you haven't been exposed to these technologies yet. First and foremost, refer to @Cyb3rWard0g's wiki page on how he's designed it for his HELK implementation, as seen in Figure 1.
Figure 1: HELK Architecture
First, Apache Spark. For HELK, "Elasticsearch-hadoop provides native integration between Elasticsearch and Apache Spark, in the form of an RDD (Resilient Distributed Dataset) (or Pair RDD to be precise) that can read data from Elasticsearch." Per the Apache Spark FAQ, "Spark is a fast and general processing engine compatible with Hadoop data" to deliver "lightning-fast cluster computing."
Second, GraphFrames. From the GraphFrames overview, "GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs. GraphFrames represent graphs: vertices (e.g., users) and edges (e.g., relationships between users). GraphFrames also provide powerful tools for running queries and standard graph algorithms. With GraphFrames, you can easily search for patterns within graphs, find important vertices, and more." 
Finally, Jupyter Notebooks to pull it all together.
From Jupyter.org: "The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more." Jupyter Notebooks provide a higher order of analyst/analytics capabilities; if you haven't dipped your toe in that water, this may be your first, best opportunity.
Let's take a look at using Jupyter Notebooks with the data populated to my Docker-based HELK instance as implemented in Part 1. I repopulated my HELK instance with new data from a different, bare metal Windows instance reporting to HELK with Winlogbeat, Sysmon enabled, and looking mighty compromised thanks to @cyb3rops's APTSimulator.
To make use of Jupyter Notebooks, you need your JUPYTER CURRENT TOKEN to access the Jupyter Notebook web interface. It was presented to you when your HELK installation completed, but you can easily retrieve it via sudo docker logs helk-analytics, then copy and paste the URL into your browser to connect for the first time with a token. It will look like this,
http://localhost:8880/?token=3f46301da4cd20011391327647000e8006ee3574cab0b163, as described in the Installation wiki. After browsing to the URL with said token, you can begin at http://localhost:8880/lab, where you should immediately proceed to the Check_Spark_Graphframes_Integrations.ipynb notebook. It's found in the hierarchy menu under training > jupyter_notebooks > getting_started. This notebook is essential to confirming you're ingesting data properly with HELK and that its integrations are fully functioning. Step through it one cell at a time with the play button, allowing each task to complete so as to avoid errors. Remember the above-mentioned Resilient Distributed Dataset? This notebook will create a Spark RDD on top of Elasticsearch using the logs-endpoint-winevent-sysmon-* (Sysmon logs) index as source, and do the same with the logs-endpoint-winevent-security-* (Windows Security Event logs) index as source, as seen in Figure 2.
Figure 2: Windows Security EVT Spark RDD
The notebook will also query your Windows security events via Spark SQL, then print the schema with:
# read the Windows Security event index into a Spark SQL DataFrame
df = spark.read.format("org.elasticsearch.spark.sql").load("logs-endpoint-winevent-security-*/doc")
# print the inferred schema to the notebook
df.printSchema()
The result should resemble Figure 3.
Figure 3: Schema
Assuming all matches with relative consistency in your experiment, let's move on to the Sysmon_ProcessCreate_Graph.ipynb notebook, found in training > jupyter_notebooks. This notebook will again call on the Elasticsearch Sysmon index, create vertices and edges dataframes, then produce a graph with GraphFrame built from those same vertices and edges. Here's a little walk-through.
The v parameter (yes, for vertices) is populated with:
# vertices: one row per process, identified by its process GUID
v = df.withColumn("id", df.process_guid).select("id","user_name","host_name","process_parent_name","process_name","action")
# keep only process creation events
v = v.filter(v.action == "processcreate")
Showing the top three rows of that result set, with v.show(3,truncate=False), appears as Figure 4 in the notebook, with the data from my APTSimulator "victim" system, N2KND-PC.
Figure 4: WTF, Florian :-)
The epic, uber threat hunter in me believes that APTSimulator created nslookup, 7z, and regedit as processes via cmd.exe. Genius, right? :-)
The e parameter (yes, for edges) is populated with:
from pyspark.sql.functions import lit  # lit() supplies the constant "spawned" relationship label
# edges: parent process GUID (src) spawned child process GUID (dst)
e = df.filter(df.action == "processcreate").selectExpr("process_parent_guid as src","process_guid as dst").withColumn("relationship", lit("spawned"))
Showing the top three rows of that result set, with e.show(3,truncate=False), produces the source and destination process IDs as it pertains to the spawning relationship.
Now, to create a graph from the vertices and edges dataframes as defined in the v & e parameters, we call g = GraphFrame(v, e). Let's bring it home with a hunt for Process A spawning Process B AND Process B spawning Process C; the code needed, and the result, are seen from the notebook in Figure 5.
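If you'd like to experiment outside the notebook, the motif query follows the GraphFrames pattern syntax; consider this a sketch rather than Roberto's exact cell:

from graphframes import GraphFrame

# build the graph from the vertices and edges dataframes
g = GraphFrame(v, e)
# motif: process a spawned process b, and process b spawned process c
motifs = g.find("(a)-[e1]->(b); (b)-[e2]->(c)")
motifs.select("a.process_name", "b.process_name", "c.process_name").show(truncate=False)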
Figure 5: APTSimulator's happy spawn
Oh, yes, APTSimulator fully realized in a nice graph. A great example is seen in cmd.exe spawning wscript.exe, which then spawns rundll32.exe. Or cmd.exe spawning powershell.exe and schtasks.exe.
Need confirmation? Florian's CactusTorch JS dropper is detailed in Figure 6, specifically cmd.exe > wscript.exe > rundll32.exe.
Figure 6: APTSimulator source for CactusTorch
Still not convinced? How about APTSimulator's schtasks.bat, where APTSimulator kindly loads mimikatz with schtasks.exe for persistence, per Figure 7?
Figure 7: schtasks.bat
I certainly hope that the HELK's graph results matching nicely with APTSimulator source meets with your satisfaction.
The HELK vs APTSimulator ends with a glorious flourish; these two monsters in their field belong in every lab to practice red versus blue, attack and defend, compromise and detect. I haven't been this happy to be a practitioner in the defense against the dark arts in quite a while. My sincere thanks to Roberto and Florian for their great work on the HELK and APTSimulator. I can't suggest strongly enough how much you'll benefit from taking the time to run through Parts 1 and 2 of The HELK vs APTSimulator for yourself. Both tools are well documented on their respective GitHubs; go now, get started, profit.
Cheers...until next time.

Sunday, February 11, 2018

toolsmith #131 - The HELK vs APTSimulator - Part 1

Ladies and gentlemen, for our main attraction, I give you...The HELK vs APTSimulator, in a Death Battle! The late, great Randy "Macho Man" Savage said many things in his day, in his own special way, but "Expect the unexpected in the kingdom of madness!" could be our toolsmith theme this month and next. Man, am I having a flashback to my college days, many moons ago. :-) The HELK just brought it on. Yes, I know, HELK is the Hunting ELK stack, got it, but it reminded me of the Hulk, and then, I thought of a Hulkamania showdown with APTSimulator, and Randy Savage's classic, raspy voice popped in my head with "Hulkamania is like a single grain of sand in the Sahara desert that is Macho Madness." And that, dear reader, is a glimpse into exactly three seconds or less in the mind of your scribe, a strange place, to be certain. But alas, that's how we came up with this fabulous showcase.
In this corner, from Roberto Rodriguez, @Cyb3rWard0g, the specter in SpecterOps, it's...The...HELK! This, my friends, is the s**t, worth every ounce of hype we can muster.
And in the other corner, from Florian Roth, @cyb3rops, the Fracas of Frankfurt, we have APTSimulator. All your worst adversary apparitions in one APT mic drop. This...is...Death Battle!

Now, with that out of our system, let's begin. There's a lot of goodness here, so I'm definitely going to do this in two parts so as not to undervalue these two offerings.
HELK is incredibly easy to install. It's also well documented, with lots of related reading material; let me propose that you take the time to review it all. Pay particular attention to the wiki, gain comfort with the architecture, then review installation steps.
On an Ubuntu 16.04 LTS system I ran:
  • git clone https://github.com/Cyb3rWard0g/HELK.git
  • cd HELK/
  • sudo ./helk_install.sh 
Of the three installation options I was presented with (pulling the latest HELK Docker image from cyb3rward0g's Docker Hub, building the HELK image from a local Dockerfile, or installing HELK from a local bash script), I chose the first and went with the latest Docker image. The installation script does a fantastic job of fulfilling dependencies for you; if you haven't installed Docker, the HELK install script does it for you. You can observe the entire install process in Figure 1.
Figure 1: HELK Installation
You can immediately confirm your clean installation by navigating to your HELK KIBANA URL, in my case http://192.168.248.29.
For my test Windows system, I created a Windows 7 x86 virtual machine with VirtualBox. The key to success here is ensuring that you install Winlogbeat on the Windows systems from which you'd like to ship logs to HELK. More important is ensuring that you run Winlogbeat with the right winlogbeat.yml file. You'll want to modify and copy this to your target systems. The critical modification is line 123, under Kafka output, where you need to add the IP address for your HELK server in three spots. My modification appeared as hosts: ["192.168.248.29:9092","192.168.248.29:9093","192.168.248.29:9094"]. As noted in the HELK architecture diagram, HELK consumes Winlogbeat event logs via Kafka.
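For orientation, the relevant winlogbeat.yml block looks something like the following; treat the topic name as an assumption and check your copy of the file for the specifics:

output.kafka:
  hosts: ["192.168.248.29:9092","192.168.248.29:9093","192.168.248.29:9094"]
  topic: "winlogbeat"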
On your Windows systems, with a properly modified winlogbeat.yml, you'll run:
  • ./winlogbeat -c winlogbeat.yml -e
  • ./winlogbeat setup -e
You'll definitely want to set up Sysmon on your target hosts as well. I prefer to do so with the @SwiftOnSecurity configuration file. If you're doing so with your initial setup, use sysmon.exe -accepteula -i sysmonconfig-export.xml. If you're modifying an existing configuration, use sysmon.exe -c sysmonconfig-export.xml. This will ensure rich data returns from Sysmon when using adversary emulation services from APTSimulator, as we will, or when experiencing the real deal.
With all set up and working you should see results in your Kibana dashboard as seen in Figure 2.

Figure 2: Initial HELK Kibana Sysmon dashboard.
Now for the showdown. :-) Florian's APTSimulator does some comprehensive emulation to make your systems appear compromised under the following scenarios:
  • POCs: Endpoint detection agents / compromise assessment tools
  • Test your security monitoring's detection capabilities
  • Test your SOCs response on a threat that isn't EICAR or a port scan
  • Prepare an environment for digital forensics classes 
This is a truly admirable effort, one I advocate for most heartily as a blue team leader. With particular attention to testing your security monitoring's detection capabilities: if you don't do so regularly and comprehensively, you are, quite simply, incomplete in your practice. If you haven't tested and validated, don't consider it detection; it's just a rule with a prayer. APTSimulator can be observed conducting the likes of:
  1. Creating typical attacker working directory C:\TMP...
  2. Activating guest user account
    1. Adding the guest user to the local administrators group
  3. Placing a svchost.exe (which is actually srvany.exe) into C:\Users\Public
  4. Modifying the hosts file
    1. Adding update.microsoft.com mapping to private IP address
  5. Using curl to access well-known C2 addresses
    1. C2: msupdater.com
  6. Dropping a Powershell netcat alternative into the APT dir
  7. Executing nbtscan on the local network
  8. Dropping a modified PsExec into the APT dir
  9. Registering mimikatz in At job
  10. Registering a malicious RUN key
  11. Registering mimikatz in scheduled task
  12. Registering cmd.exe as debugger for sethc.exe
  13. Dropping web shell in new WWW directory
A couple of notes here.
Download and install APTSimulator from the Releases section of its GitHub page.
APTSimulator includes curl.exe, 7z.exe, and 7z.dll in its helpers directory. Be sure that you drop the correct version of 7-Zip for your system architecture; I'm assuming the default bits are 64-bit, as I was testing on a 32-bit VM.

Let's do a fast run-through with HELK's Kibana Discover option, looking for the above-mentioned APTSimulator activities. Starting with a search for TMP in the sysmon-* index yields immediate results and strikes #1, 6, 7, and 8 from our APTSimulator list above; see for yourself in Figure 3.

Figure 3: TMP, PS nc, nbtscan, and PsExec in one shot
Created TMP, dropped a PowerShell netcat, nbtscanned the local network, and dropped a modified PsExec, check, check, check, and check.
How about enabling the guest user account and adding it to the local administrator's group? Figure 4 confirms.

Figure 4: Guest enabled and escalated
Strike #2 from the list. Something tells me we'll immediately find svchost.exe in C:\Users\Public. Aye, Figure 5 makes it so.

Figure 5: I've got your svchost right here
Knock #3 off the to-do list, including the process.commandline, process.name, and file.creationtime references. Up next, the At job and scheduled task creation. Indeed, see Figure 6.

Figure 6: tasks OR schtasks
I think you get the point; there weren't any misses here. There are, of course, visualization options. Don't forget about Kibana's Timelion feature. Forensicators and incident responders live and die by timelines; use it to your advantage (Figure 7).

Figure 7: Timelion
Finally, for this month, under HELK's Kibana Visualize menu, you'll note 34 visualizations. By default, these are pretty basic, but you can quickly add value with sub-buckets. As an example, I selected the Sysmon_UserName visualization. Initially, it yielded a donut graph inclusive of malman (my pwned user), SYSTEM, and LOCAL SERVICE. Not good enough to be particularly useful, so I added a sub-bucket to include process names associated with each user. The resulting graph is more detailed and tells us that of the 242 events in the last four hours associated with the malman user, 32 of those were specific to cmd.exe processes, or 18.6% (Figure 8).

Figure 8: Powerful visualization capabilities
This has been such a pleasure this month; I am thrilled with both HELK and APTSimulator. The true principles of blue team and detection quality are innate in these projects. The fact that Roberto considers HELK to still be in an alpha state leads me to believe there is so much more to come. Be sure to dig deeply into APTSimulator's Advanced Solutions as well; there's more than one way to emulate an adversary.
Next month, Part 2 will explore the network side of the equation via the Network Dashboard and related visualizations, as well as HELK integration with Spark, GraphFrames, and Jupyter Notebooks.
Aw snap, more goodness to come, I can't wait.
Cheers...until next time.

Monday, January 01, 2018

toolsmith #130 - OSINT with Buscador

First off, Happy New Year! I hope you have a productive and successful 2018. I thought I'd kick off the new year with another exploration of OSINT. In addition to my work as an information security leader and practitioner at Microsoft, I am privileged to serve in Washington's military as a J-2, which means I'm part of the intelligence directorate of a joint staff. Intelligence duties in a guard unit context are commonly focused on situational awareness for mission readiness. Additionally, in my unit we combine part of J-6 (command, control, communications, and computer systems directorate of a joint staff) with J-2, making Cyber Network Operations a J-2/6 function. Open source intelligence (OSINT) gathering is quite useful in developing indicators specific to adversaries, as well as identifying targets of opportunity for red team and vulnerability assessments.

We've discussed numerous OSINT offerings in toolsmiths past; there's no better time than our 130th edition to discuss an OSINT platform inclusive of previous topics such as Recon-ng, Spiderfoot, Maltego, and Datasploit. Buscador is just such a platform and comes from genuine OSINT experts Michael Bazzell and David Wescott. Buscador is "a Linux Virtual Machine that is pre-configured for online investigators." Michael is the author of Open Source Intelligence Techniques (5th edition) and Hiding from the Internet (3rd edition). I had a quick conversation with him and learned that they will have a new release in January (1.2), which will address many issues and add new features. It will also revamp Firefox, given the release of version 57.

You can download Buscador as an OVA bundle for a variety of virtualization options, or as an ISO for USB boot devices or host operating systems. I had Buscador 1.1 up and running on Hyper-V in a matter of minutes after pulling the VMDK out of the OVA and converting it with QEMU (a sketch of the conversion follows the tool list below). Buscador 1.1 includes numerous tools; in addition to the above-mentioned standard bearers, you can expect the following and others:
  • Creepy
  • Metagoofil
  • MediaInfo
  • ExifTool
  • EmailHarvester
  • theHarvester
  • Wayback Exporter
  • HTTrack Cloner
  • Web Snapper
  • Knock Pages
  • SubBrute
  • Twitter Exporter
  • Tinfoleak 
  • InstaLooter 
  • BleachBit 
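Regarding that OVA-to-Hyper-V conversion: an OVA is just a tar archive, so you can extract the VMDK and convert it with qemu-img. This is my own approach rather than a documented Buscador path, and the file names are mine:

tar -xvf Buscador.ova
qemu-img convert -f vmdk -O vhdx Buscador-disk1.vmdk Buscador.vhdx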
Tools are conveniently offered via the menu bar on the left of the UI, or can easily be found via Show Applications.
To put Buscador through its paces, using myself as a target of opportunity, I tested a few of the tools I'd not previously utilized. Starting with Creepy, the geolocation OSINT tool, I configured the Twitter plugin, one of the four available in Creepy (Flickr, Google+, Instagram, Twitter), and searched holisticinfosec, as seen in Figure 1.
Figure 1: Creepy configuration
The results, as seen in Figure 2, include some good details, but no immediate location data.

Figure 2: Creepy results
Had I configured the other plugins, or were I even a user of Flickr or Google+, better results would have been likely. I have location turned off for my Tweets, but my profile does include Seattle. Creepy is quite good for assessing targets who utilize social media heavily, but if you wish to dig more deeply into Twitter usage, check out Tinfoleak, which also uses geo information available in Tweets and uploaded images. The report for holisticinfosec is seen in Figure 3.

Figure 3: Tinfoleak
If you're looking for domain enumeration options, you can start with Knock. It's as easy as handing it a domain; I did so with holisticinfosec.org, as seen in Figure 4, with results in Figure 5.
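The run itself is a one-liner; the knockpy entry point is an assumption on my part for Buscador's build:

knockpy holisticinfosec.org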
Figure 4: Knock run
Figure 5: Knock results
Other classics include HTTrack for web site cloning and ExifTool for pulling all available metadata from images. HTTrack worked instantly, as expected, for holisticinfosec.org. I used InstaLooter, "a program that can download any picture or video associated from an Instagram profile, without any API access", to grab sample images, then ran pyExifToolGui against them. As a simple experiment, I ran InstaLooter against the infosec.memes Instagram account, followed by pyExifToolGui against all the downloaded images, then exported the Exif metadata to HTML. If I were analyzing images for associated hashtags, the export capability might be useful for an artifacts list.
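For reference, the InstaLooter grab resembles the following; the target directory is my choice:

instalooter user infosec.memes ./infosec.memes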
Finally, one of my absolute favorites is Metagoofil, "an information gathering tool designed for extracting metadata of public documents." I did a quick run against my domain, with the document retrieval parameter set at 50, then reviewed the full.txt results (Figure 6), included in the output directory (home/Metagoofil) along with authors.csv, companies.csv, and modified.csv.
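My run resembled the following; the flags are per the classic Metagoofil CLI and may vary in Buscador's build:

metagoofil -d holisticinfosec.org -t pdf,doc,xls -l 50 -n 50 -o /home/Metagoofil -f full.txt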

Figure 6: Metagoofil results

Metagoofil is extremely useful for gathering target data; I consider it a red team recon requirement. It's a faster, currently maintained offering that shares some capabilities with FOCA. It should also serve as a reminder of just how much information is available in public-facing documents; consider stripping the metadata before publishing.

It's fantastic having all these capabilities ready and functional on one distribution; it keeps the OSINT discipline close at hand for those who need to perform it regularly. I'm really looking forward to the Buscador 1.2 release, and better still, I have it on good authority that there is another book on the horizon from Michael. This is a simple platform with which to explore OSINT; remember to be a good citizen, though, as there is an awful lot that can be learned via these passive means.
Cheers...until next time.
