Prerequisites
This month finds us in a new phase for toolsmith, as it will no longer be associated with ISSA or the ISSA Journal. Suffice it to say that the ISSA board and management organization decided they no longer wanted to pay the small monthly stipend I’d been receiving since the inception of the toolsmith column. While I am by no means a profiteer, I am also not a charity, so we simply parted ways. All the better I say, as I have been less than satisfied with ISSA as an organization: Ira Winkler and Mary Ann Davidson should serve to define that dissatisfaction.
I will say this, however. All dissatisfaction aside, it
has been my distinct pleasure to write for the ISSA Journal editor, Thom
Barrie, who has been a loyal, dedicated, committed, and capable editor and
someone I consider a friend. I will miss our monthly banter, I will miss him,
and I thank him most sincerely for these nine years as editor. The ISSA Journal is better for his care and attention. Thank you, Thom.
Enough said, what’s next? I’ll continue posting toolsmith
here while I consider options for a new home or partnership. I may just stick
exclusively to my blog and see if there is a sponsor or two who might be
interested in helping me carry the toolsmith message.
I thought I'd use our new circumstances to test a few different ideas with you over the next few months. Your feedback is welcome as always, including ideas regarding what you might like to see us try. As always, toolsmith will continue to offer insights on tools useful to the information security practitioner, typically open source and free.
To that end, I thought I'd offer you a bit of R code I recently cranked out for a MOOC I was taking. The following visualizations with R are the result of fulfilling a recent assignment for Coursera’s online Data Visualization class. The assignment was meant to give the opportunity to do non-coordinate data visualization with network data as it lends itself easily to graph visualization. I chose, with a bit of cheekiness in mind, to visualize network data…wait for it…with security-related network data.
Data Overview
The packet capture I used was gathered during a ZeroAccess run-time analysis in my lab using a virtualized Windows victim and Wireshark, which allowed me to capture data to be saved as a CSV. The resulting CSV provides an excellent sample set inclusive of nodes and edges useful for network visualization. Keep in mind that this is a small example with a reduced node count to avoid clutter and serve as an exemplar. A few notes about the capture:
- Where the protocol utilized was HTTP, the resulting packet length was approximately 220 bytes.
- Where the protocol was TCP other than HTTP, the resulting packet length was approximately 60 bytes.
- For tidy visualization these approximations are utilized rather than actual packet length.
- Only some hosts utilized HTTP; specific edges are visualized where appropriate.
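The approximation described in the notes above can be sketched in a few lines of R. This is only an illustration, not the code behind the plots: the helper name approx_length and the three sample rows are hypothetical, and the 220 and 60 byte values are simply the approximations stated above.

```r
# Hypothetical helper: map each protocol to the approximate packet length
# used for plotting (HTTP ~ 220 bytes, any other TCP traffic ~ 60 bytes).
approx_length <- function(protocol) {
  ifelse(protocol == "HTTP", 220L, 60L)
}

# Illustrative rows only; a real run would use the full capture CSV.
pkts <- data.frame(
  Protocol = c("TCP", "TCP", "HTTP"),
  Length   = c(62, 54, 221),
  stringsAsFactors = FALSE
)

# Add the approximate length alongside the measured one.
pkts$ApproxLength <- approx_length(pkts$Protocol)
print(pkts)
```

Keeping the measured Length column next to ApproxLength makes it easy to check how far the tidy values drift from the real ones.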
DiagrammeR and Graphviz
The DiagrammeR package for R includes Graphviz, which, in turn, provides four rendering engines: dot, neato, twopi, and circo. I’ve mentioned Graphviz as part of my discussion of ProcDot and AfterGlow, as it is inherent to both projects. The following plots represent a subset of the ZeroAccess malware network traffic data.
- The green node represents the victim system.
- Red nodes represent the attacker systems.
- Orange nodes represent the protocol utilized.
- Cyan nodes represent the approximate length of the packet.
- Black edges represent the network traffic to and from the victim and attackers.
- Orange edges represent hosts conversing over TCP, other than HTTP.
- Cyan edges represent the relationship of protocol to packet length.
- Purple edges represent hosts communicating via the HTTP protocol.
Graphs are plotted in order of my preference for effective visualization; code for each follows.
Keep reading after these first four visualizations: I pulled together a way to read in the related CSV and render a network graph automagically.
--------------------------------------------------------------------------------------------------------------------------
Visualization 1: Graphviz ZeroAccess network circo plot
Visualization 1 code
library(DiagrammeR)
# ZeroAccess sample graph rendered with the circo engine
grViz("
digraph {
  graph [overlap = false]
  node [shape = circle,
        style = filled,
        color = black,
        label = '']
  node [fillcolor = green]
  a [label = '192.168.248.21']
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a
  edge [color = orange]
  g -> {a b c d e f}
  edge [color = purple]
  h -> {a b}
  edge [color = cyan]
  g -> i
  h -> j
}",
engine = "circo")
--------------------------------------------------------------------------------------------------------------------------
Visualization 2: Graphviz ZeroAccess network dot plot
Visualization 2 code
library(DiagrammeR)
# ZeroAccess sample graph rendered with the dot engine
grViz("
digraph {
  graph [overlap = false]
  node [shape = circle,
        style = filled,
        color = black,
        label = '']
  node [fillcolor = green]
  a [label = '192.168.248.21']
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a
  edge [color = orange]
  g -> {a b c d e f}
  edge [color = purple]
  h -> {a b}
  edge [color = cyan]
  g -> i
  h -> j
}",
engine = "dot")
--------------------------------------------------------------------------------------------------------------------------
Visualization 3: Graphviz ZeroAccess network twopi plot
Visualization 3 code
library(DiagrammeR)
# ZeroAccess sample graph rendered with the twopi engine
grViz("
digraph {
  graph [overlap = false]
  node [shape = circle,
        style = filled,
        color = black,
        label = '']
  node [fillcolor = green]
  a [label = '192.168.248.21']
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a
  edge [color = orange]
  g -> {a b c d e f}
  edge [color = purple]
  h -> {a b}
  edge [color = cyan]
  g -> i
  h -> j
}",
engine = "twopi")
--------------------------------------------------------------------------------------------------------------------------
Visualization 4: Graphviz ZeroAccess network neato plot
Visualization 4 code
library(DiagrammeR)
# ZeroAccess sample graph rendered with the neato engine
grViz("
digraph {
  graph [overlap = false]
  node [shape = circle,
        style = filled,
        color = black,
        label = '']
  node [fillcolor = green]
  a [label = '192.168.248.21']
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a
  edge [color = orange]
  g -> {a b c d e f}
  edge [color = purple]
  h -> {a b}
  edge [color = cyan]
  g -> i
  h -> j
}",
engine = "neato")
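The four listings above differ only in the engine argument. As a sketch of how that repetition might be avoided, the DOT text can be stored once and rendered once per engine. The names zeroaccess_dot, engines, and render_with are hypothetical; the only DiagrammeR call used is grViz(), and rendering is skipped if the package is not installed.

```r
# DOT source shared by all four plots (same graph as the listings above).
zeroaccess_dot <- "
digraph {
  graph [overlap = false]
  node [shape = circle, style = filled, color = black, label = '']
  node [fillcolor = green]
  a [label = '192.168.248.21']
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a
  edge [color = orange]
  g -> {a b c d e f}
  edge [color = purple]
  h -> {a b}
  edge [color = cyan]
  g -> i
  h -> j
}"

# Engines in the order the plots are presented above.
engines <- c("circo", "dot", "twopi", "neato")

# Hypothetical wrapper: render the shared DOT text with one engine.
render_with <- function(engine, dot = zeroaccess_dot) {
  DiagrammeR::grViz(dot, engine = engine)
}

# Only attempt rendering when DiagrammeR is available.
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  plots <- lapply(engines, render_with)
  names(plots) <- engines
  # print(plots[["circo"]]) displays a single plot in the viewer
}
```

Changing a node or edge then means editing one string rather than four.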
Read in a CSV and render plot
Hand-populating graphs as above is nice...for examples. In the real world, you'd likely just want to read in a CSV derived from a Wireshark capture.
As my code is crap at this time, I reduced zeroaccess.csv to just the source and destination columns; I'll incorporate additional data points later. To use this with your own data, reduce the CSV columns down to source and destination only.
Code first, with comments to explain, derived directly from Rich Iannone's DiagrammeR example for using data frames to define Graphviz graphs.
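As a rough stand-in for the general idea, and not the DiagrammeR data-frame example itself, the sketch below collapses a two-column CSV to unique edges, emits DOT text, and hands it to grViz(). The function name edges_to_dot is hypothetical, the three inline rows stand in for zeroaccess.csv, and the node styling is an arbitrary choice.

```r
# Hypothetical helper: build a DOT digraph from Source/Destination columns.
edges_to_dot <- function(df) {
  # Coerce to character so factor columns behave the same as strings.
  edges <- unique(data.frame(
    from = as.character(df$Source),
    to   = as.character(df$Destination),
    stringsAsFactors = FALSE
  ))
  nodes <- unique(c(edges$from, edges$to))
  # IP addresses must be quoted to be valid DOT identifiers.
  node_lines <- sprintf('  "%s"', nodes)
  edge_lines <- sprintf('  "%s" -> "%s"', edges$from, edges$to)
  paste(c("digraph {",
          "  graph [overlap = false]",
          "  node [shape = circle, style = filled, fillcolor = orange]",
          node_lines,
          edge_lines,
          "}"),
        collapse = "\n")
}

# Stand-in for: read.csv("zeroaccess.csv", stringsAsFactors = FALSE)
zeroaccess <- read.csv(text = "Source,Destination
192.168.248.21,176.53.17.23
192.168.248.21,176.53.17.23
176.53.17.23,192.168.248.21", stringsAsFactors = FALSE)

dot <- edges_to_dot(zeroaccess)
cat(dot, "\n")

# Render only when DiagrammeR is installed.
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  g <- DiagrammeR::grViz(dot, engine = "neato")
}
```

Deduplicating edges first keeps the plot readable; the capture repeats the same host pairs hundreds of times.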
Visualization 5
A quick data summary follows, but you can grab the data from GitHub too.
Network Data
Summary: zeroaccess.csv
zeroaccess <- read.csv("zeroaccess.csv", sep = ",")
summary(zeroaccess)
##             Source             Destination  Protocol      Length
##  192.168.248.21:340   192.168.248.21:152   HTTP: 36   Min.   :  54.00
##  176.53.17.23  : 90   176.53.17.23  : 90   TCP :456   1st Qu.:  60.00
##  140.112.251.82:  6   140.112.251.82:  6              Median :  62.00
##  178.19.22.191 :  6   178.19.22.191 :  6              Mean   :  84.98
##  89.238.36.146 :  6   89.238.36.146 :  6              3rd Qu.:  62.00
##  14.96.213.41  :  3   1.160.72.47   :  3              Max.   :1506.00
##  (Other)       : 41   (Other)       :229
head(zeroaccess)
##           Source    Destination Protocol Length
## 1 192.168.248.21   176.53.17.23      TCP     62
## 2 192.168.248.21   176.53.17.23      TCP     62
## 3 192.168.248.21   176.53.17.23      TCP     62
## 4   176.53.17.23 192.168.248.21      TCP     62
## 5 192.168.248.21   176.53.17.23      TCP     54
## 6 192.168.248.21   176.53.17.23     HTTP    221
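Since the capture repeats the same Source/Destination pairs many times, one possible next step is to count packets per pair so an edge could later be weighted by traffic volume. The sketch below uses only the six head() rows shown above; the names sample_rows and edge_counts are hypothetical.

```r
# The six rows from head(zeroaccess), typed in by hand for illustration.
sample_rows <- data.frame(
  Source      = c("192.168.248.21", "192.168.248.21", "192.168.248.21",
                  "176.53.17.23", "192.168.248.21", "192.168.248.21"),
  Destination = c("176.53.17.23", "176.53.17.23", "176.53.17.23",
                  "192.168.248.21", "176.53.17.23", "176.53.17.23"),
  Protocol    = c("TCP", "TCP", "TCP", "TCP", "TCP", "HTTP"),
  Length      = c(62, 62, 62, 62, 54, 221),
  stringsAsFactors = FALSE
)

# One packet per row; summing gives packets per directed pair.
sample_rows$Packets <- 1L
edge_counts <- aggregate(Packets ~ Source + Destination,
                         data = sample_rows, FUN = sum)
print(edge_counts)
```

The same aggregate() call should work unchanged on the full data frame, provided the column names match.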
In closing
Hopefully this leads you to explore visualization of security data a bit further; note the reference material in Acknowledgements.
I've stuffed all this material on GitHub for you and will keep working on the CSV import version.
Ping me via email or Twitter if you have questions (russ at holisticinfosec dot org or @holisticinfosec). Cheers…until next month.
Acknowledgements
Rich Iannone for DiagrammeR and the using-data-frames-to-define-graphviz-graphs example
Jay and Bob for Data-Driven Security (the security data scientist's bible)