library(gganimate)
library(ggplot2)
library(reshape2)
Git repositories for the extra packages, for reference:
https://github.com/thomasp85/gganimate
https://bcable.net/x/Rproj/shared
source("shared/load_recurse.R")
source("shared/load_varlog.R")
source("shared/parse_rawsplit.R")
source("shared/cleanup_logs.R")
source("shared/country_code_cleanup.R")
source("shared/fill_zeroes.R")
source("shared/geoip.R")
source("shared/heatmap_prep.R")
source("shared/themes.R")
source("shared/turn_to_animation.R")
source("shared/world_mapper.R")
site_name <- "bcable.net"
path_syslog <- "./appel"
year_filt <- 2019
source("shared/paths.R")
Geolocation based on IP address is not to be taken as entirely accurate as to the source of traffic or attacks conducted. There are many reasons for this, which include (but are not limited to):
Large quantities of traffic, especially attack based traffic, will use a VPN or the Tor network (or some reasonable facsimile), to mask the origin of the traffic. This will in turn change the appearance of the location of origin. Usually, an attacker will also intentionally want the traffic to appear to come from somewhere that has some form of lesser legal jurisdiction, some form of lesser ability to police traffic, or come from a well known source of malicious attacks such as China or Russia.
For instance, the following log entry was generated by myself against my servers while sitting at my desk in the United States, but it gets geolocated as Russia because of how the packet was sent. This sort of masking is trivial to perform, even by a nine year old on a cellphone.
httpd_data[grep("/from/russia/with/logs", httpd_data$Request), c("Request", "Response.Code", "Country.Code")]
## Request Response.Code Country.Code
## 1 GET /from/russia/with/logs HTTP/1.1 404 RU
Some locations, such as Silicon Valley or China, have a higher concentration of virtual servers than others. This can lead to larger quantities of vulnerable virtual machines and servers in those regions, and distort the resulting aggregate data.
It is possible that, for governmental intelligence purposes or other economic or political reasons, a nation could re-allocate address space and disguise the origin of its traffic, much as a NAT (network address translation) does. It could also funnel traffic via VPN technologies on behalf of another nation.
Because most of these agreements are made in private, and due to the fact that most geolocation and WHOIS records are based on self-reporting, it is impossible to know the 100% true nature of geographic address assignment.
This geolocation uses the rgeolocate package available on CRAN, together with the internal country database that ships with it. There could be an error in that database, there could be an error in the lookup code, etc. Bugs happen. I have no reason to believe that any false geolocation is being performed by these packages, however.
Despite these weaknesses, this doesn't change the fact that looking at this sort of data can be quite fun and interesting, and potentially enlightening. Generalized conclusions should not be made from this data or the maps herein. You have been warned.
messages_records <- load_varlog(path_syslog, "messages")
messages_records <- raw_populate(messages_records)
messages_records <- cleanup_syslog(messages_records)
secure_records <- load_varlog(path_syslog, "secure")
secure_records <- raw_populate(secure_records)
secure_records <- cleanup_syslog(secure_records)
secure_records$Raw.Split <- NA
messages_records <- messages_records[
messages_records$Date$year == year_filt - 1900,
]
secure_records <- secure_records[
secure_records$Date$year == year_filt - 1900,
]
ipt_data <- cleanup_iptables(messages_records)
messages_records$Raw.Split <- NA
ipt_data$Raw.Split <- NA
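The year_filt - 1900 comparison above works because the year field of a POSIXlt value counts years since 1900. A minimal illustration with made-up timestamps (the stamps vector is hypothetical and not taken from the logs):

```r
# Made-up timestamps; POSIXlt stores the year as an offset from 1900.
stamps <- as.POSIXlt(
    c("2018-12-31 23:59:59", "2019-06-15 12:00:00"), tz="UTC"
)
year_filt <- 2019

stamps$year
# 118 119

# Same test as the filter above: TRUE only for timestamps in year_filt.
keep <- stamps$year == year_filt - 1900
keep
# FALSE TRUE
```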
Records: 1974599
Date Min: 2019-01-01 00:00:01
Date Max: 2019-12-31 23:59:57
Records: 875
Date Min: 2019-01-02 06:10:33
Date Max: 2019-12-18 16:14:45
Checking “POSSIBLE BREAK-IN ATTEMPT!” messages, they all appear to be innocuous enough (usually me logging in successfully 5 seconds later, so a typo in my password or somesuch). However, the following is interesting:
sub(
"([0-9][0-9]:[0-9][0-9]:[0-9][0-9]) [^ ]+ ", "\\1 [REDACTED] ",
secure_records$Raw[grepl("Bad protocol", secure_records$Raw)]
)
## [1] "Feb 8 01:06:05 [REDACTED] sshd[27407]: Bad protocol version identification '\\003' from [IPREDACTED] port 46318"
## [2] "Feb 8 01:06:09 [REDACTED] sshd[27408]: Bad protocol version identification '\\003' from [IPREDACTED] port 53422"
## [3] "Feb 12 06:41:24 [REDACTED] sshd[1511]: Bad protocol version identification '\\003' from [IPREDACTED] port 489"
## [4] "Feb 12 06:41:24 [REDACTED] sshd[1511]: Bad protocol version identification '\\003' from [IPREDACTED] port 489"
## [5] "Feb 12 06:41:24 [REDACTED] sshd[1511]: Bad protocol version identification '\\003' from [IPREDACTED] port 489"
## [6] "Mar 4 19:44:40 [REDACTED] sshd[772]: Bad protocol version identification '\\003' from [IPREDACTED] port 156"
## [7] "Mar 6 04:56:50 [REDACTED] sshd[2936]: Bad protocol version identification '\\003' from [IPREDACTED] port 185"
## [8] "Mar 10 00:10:25 [REDACTED] sshd[8761]: Bad protocol version identification '\\003' from [IPREDACTED] port 285"
## [9] "Mar 4 19:44:40 [REDACTED] sshd[772]: Bad protocol version identification '\\003' from [IPREDACTED] port 156"
## [10] "Mar 6 04:56:50 [REDACTED] sshd[2936]: Bad protocol version identification '\\003' from [IPREDACTED] port 185"
## [11] "Mar 10 00:10:25 [REDACTED] sshd[8761]: Bad protocol version identification '\\003' from [IPREDACTED] port 285"
## [12] "Mar 4 19:44:40 [REDACTED] sshd[772]: Bad protocol version identification '\\003' from [IPREDACTED] port 156"
## [13] "Mar 6 04:56:50 [REDACTED] sshd[2936]: Bad protocol version identification '\\003' from [IPREDACTED] port 185"
## [14] "Mar 10 00:10:25 [REDACTED] sshd[8761]: Bad protocol version identification '\\003' from [IPREDACTED] port 285"
## [15] "Aug 20 10:36:43 [REDACTED] sshd[2671]: Bad protocol version identification '\\003' from [IPREDACTED] port 3255"
## [16] "Aug 20 10:36:43 [REDACTED] sshd[2672]: Bad protocol version identification '\\003' from [IPREDACTED] port 18769"
## [17] "Aug 20 10:36:43 [REDACTED] sshd[2671]: Bad protocol version identification '\\003' from [IPREDACTED] port 3255"
## [18] "Aug 20 10:36:43 [REDACTED] sshd[2672]: Bad protocol version identification '\\003' from [IPREDACTED] port 18769"
## [19] "Sep 12 11:30:01 [REDACTED] sshd[6444]: Bad protocol version identification '\\003' from [IPREDACTED] port 58895"
## [20] "Sep 14 17:14:25 [REDACTED] sshd[9926]: Bad protocol version identification '\\003' from [IPREDACTED] port 341"
## [21] "Sep 12 11:30:01 [REDACTED] sshd[6444]: Bad protocol version identification '\\003' from [IPREDACTED] port 58895"
## [22] "Sep 14 17:14:25 [REDACTED] sshd[9926]: Bad protocol version identification '\\003' from [IPREDACTED] port 341"
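The host name redaction used above can be tried on a single made-up log line. This is only a sketch: the host name and IP address below are placeholders, and the gsub() step that masks the IP address is my assumption, since the code above only shows the host name substitution.

```r
# Made-up syslog line; the host name and IP address are placeholders.
raw <- paste(
    "Feb  8 01:06:05 myhost sshd[27407]: Bad protocol version",
    "identification '\\003' from 192.0.2.10 port 46318"
)

# Same substitution as above: keep the timestamp, replace the host name.
redacted <- sub(
    "([0-9][0-9]:[0-9][0-9]:[0-9][0-9]) [^ ]+ ", "\\1 [REDACTED] ", raw
)

# Assumed extra step: mask anything shaped like an IPv4 address.
redacted <- gsub(
    "[0-9]{1,3}(\\.[0-9]{1,3}){3}", "[IPREDACTED]", redacted
)
redacted
```

The timestamp survives because it is captured by the first group and written back with \\1, while the token that follows it (the host name) is swallowed by [^ ]+.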
All IPs appear to belong to specific hosts in Germany, Russia, and Bulgaria. My message to the Bulgarians: “NODNOL 871 SELIM? Thankski Verski Muchski Budski!”
What was odd is that, after looking at the WHOIS information for the Bulgarian IP address, the physical address and name are extremely specific. It gave a specific apartment number, name, etc., that was easily pulled up on Google Street View. Lots of satellite dishes on the side of the apartment complex! Nice enough city, though. Maybe a slight bit crowded. Very creepy that this can be done today, huh? I'm literally looking at the apartment and surrounding city for someone who likely sent a payload at my server. All of this with PUBLIC tools and PUBLIC information. Technology must be destroyed. This kind of goes to show how sensitive an IP address can be, and why I tend to redact these when publishing things like this (even though he's probably being a naughty boy, I do not know the context of what actually occurred).
This also confirms my suspicions that you should never use your actual IP address, and should instead send all traffic through a VPN connection you trust. ALL traffic. And ALL traffic going over that VPN should be encrypted all the way to the destination, in case the VPN provider turns out to be sketchy.
WHOIS data can be too specific sometimes. This gets into a weird area with GDPR, too, since the US has sided with this information being public, and the EU has sided with masking WHOIS information. Might be an interesting factoid to throw into the debate, but who cares about politics anyway? It's just domesticated primates flinging poo at each other. Facts rarely enter the debate, and when they do ideology destroys their purpose. The only way to keep yourself private is to take your privacy into your own hands and not create data to begin with if you can help it, or mask it well. Better to treat the internet as a more public place than the out of doors.
I'll probably end up using the Rwhois package I made to dig through these IPs next.
Also, another disclaimer. The physical address discovered could be inaccurate or incomplete. I didn't investigate to see if it was a Tor node and there's no way for me to know if it's a VPN this guy runs for his friends or a private collection of clients, or a variety of other circumstances.
Unrumble.
ipt_data$Country.Code <- geoip(ipt_data$IP.Source, "country_code")$country_code
ipt_country_df <- country_code_cleanup(ipt_data$Country.Code)
ipt_top10_threshold <- tail(
head(sort(ipt_country_df$Count, decreasing=TRUE), n=11), n=1
)
ipt_top20_threshold <- tail(
head(sort(ipt_country_df$Count, decreasing=TRUE), n=21), n=1
)
ipt_top10 <- ipt_country_df[ipt_country_df$Count > ipt_top10_threshold,]
ipt_top20 <- ipt_country_df[ipt_country_df$Count > ipt_top20_threshold,]
ipt_data$Protocol.Clean <- as.character(ipt_data$Protocol)
ipt_data$Protocol.Clean[
!(ipt_data$Protocol.Clean %in% c("ICMP", "TCP", "UDP"))
] <- "Other"
ipt_data$Date.NoTime <- as.POSIXlt(strftime(ipt_data$Date, format="%Y-%m-%d"))
ipt_data$Count <- rep(1, nrow(ipt_data))
agg_proto_counts <- aggregate(
Count ~ Protocol.Clean + as.factor(Date.NoTime), data=ipt_data, FUN=sum
)
names(agg_proto_counts) <- c("Protocol", "Date", "Count")
agg_proto_counts$Date <- as.POSIXct(agg_proto_counts$Date)
agg_proto_counts$Protocol <- as.factor(agg_proto_counts$Protocol)
agg_proto_counts <- fill_zeroes(agg_proto_counts, by="Protocol")
order_df <- aggregate(
Count ~ Protocol, data=agg_proto_counts, FUN=sum
)
agg_proto_counts$Protocol <- factor(
agg_proto_counts$Protocol, levels=order_df[[1]][order(order_df[[2]])]
)
agg_country_time <- aggregate(
Count ~ Country.Code + as.factor(Date.NoTime),
data=ipt_data, FUN=sum
)
agg_country_time <- country_code_merge(agg_country_time)
names(agg_country_time) <- c(
"Country.Code", "Date", "Count", "Latitude", "Longitude", "Country.Name"
)
agg_country_time$Date <- as.POSIXct(agg_country_time$Date)
agg_country_time_top10 <- agg_country_time[
agg_country_time$Country.Name %in% unique(ipt_top10$Country),
]
agg_dst_ports <- aggregate(
Hostname ~ Destination.Port, data=ipt_data, FUN=length
)
names(agg_dst_ports) <- c("Destination.Port", "Count")
agg_dst_ports$Destination.Port <- as.numeric(as.character(
agg_dst_ports$Destination.Port
))
agg_dst_ports_time <- aggregate(
Hostname ~ Destination.Port + as.factor(Date.NoTime),
data=ipt_data, FUN=length
)
names(agg_dst_ports_time) <- c("Destination.Port", "Date", "Count")
agg_dst_ports_time$Date <- as.POSIXct(agg_dst_ports_time$Date)
agg_dst_ports_time$Destination.Port <- as.factor(
as.character(agg_dst_ports_time$Destination.Port)
)
common_dst_ports <- head(
agg_dst_ports$Destination.Port[order(-agg_dst_ports$Count)], n=10
)
common_dst_ports_time <- agg_dst_ports_time[
as.numeric(as.character(agg_dst_ports_time$Destination.Port)) %in%
as.numeric(as.character(common_dst_ports)),
]
order_df <- aggregate(
Count ~ Destination.Port, data=common_dst_ports_time, FUN=sum
)
common_dst_ports_time$Destination.Port <- factor(
common_dst_ports_time$Destination.Port,
levels=order_df[[1]][order(order_df[[2]])]
)
top10_dst_ports <- head(
agg_dst_ports$Destination.Port[
order(-agg_dst_ports$Count[
-max(as.numeric(as.character(agg_dst_ports$Destination.Port)))
])
], n=10
)
top10_dst_ports_time <- agg_dst_ports_time[
as.numeric(as.character(agg_dst_ports_time$Destination.Port)) %in%
top10_dst_ports,
]
order_df <- aggregate(
Count ~ Destination.Port, data=top10_dst_ports_time, FUN=sum
)
top10_dst_ports_time$Destination.Port <- factor(
top10_dst_ports_time$Destination.Port,
levels=order_df[[1]][order(order_df[[2]])]
)
g <- world_mapper(ipt_country_df)
g <- g + labs(
title=paste0(
site_name, ": IPTables: INPUT Table Packet Drops", collapse=""
),
fill="Dropped Packets", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
g <- ggplot(agg_proto_counts, aes(x=Date, y=Count, colour=Protocol))
g <- g + geom_line()
g <- g + theme_simple()
g <- g + scale_colour_brewer(palette="Paired")
g <- g + labs(x="", y="Dropped Packets",
title=paste0(site_name,
": IPTables DROPs on Public IP by Protocol", collapse=""
)
)
g
g <- ggplot(agg_proto_counts, aes(x=Date, y=Count, fill=Protocol))
g <- g + geom_area()
g <- g + theme_simple()
g <- g + scale_fill_brewer(palette="Paired")
g <- g + labs(x="", y="Dropped Packets",
title=paste0(site_name,
": IPTables DROPs on Public IP by Protocol", collapse=""
)
)
g
g <- ggplot(ipt_top20, aes(x=Country, y=Count/1000))
g <- g + geom_bar(stat="identity")
g <- g + labs(
title=paste0(site_name, ": IPTables: INPUT DROPs by Top 20 Countries"),
y="Count (thousands)"
)
g <- g + theme_simple()
g
g <- ggplot(agg_country_time_top10, aes(
x=Date, y=Count, group=Country.Name, colour=Country.Name)
)
g <- g + geom_line() + coord_cartesian(ylim=c(0,10000))
g <- g + labs(
title=paste0(site_name, ": IPTables: INPUT DROPs by Top 10 Countries")
)
g <- g + theme_simple()
g <- g + scale_colour_brewer(palette="Paired")
g
# Order the stacking by total count; the plot below fills by Country.Name.
order_df <- aggregate(
Count ~ Country.Name, data=agg_country_time_top10, FUN=sum
)
agg_country_time_top10$Country.Name <- factor(
agg_country_time_top10$Country.Name,
levels=order_df[[1]][order(order_df[[2]])]
)
g <- ggplot(agg_country_time_top10,
aes(x=Date, y=Count, fill=Country.Name)
)
g <- g + geom_area() + coord_cartesian(ylim=c(0,10000))
g <- g + labs(
title=paste0(site_name, ": IPTables: INPUT DROPs by Top 10 Countries")
)
g <- g + theme_simple()
g <- g + scale_fill_brewer(palette="Paired")
g
g <- ggplot(common_dst_ports_time, aes(x=Date, y=Count, colour=Destination.Port))
g <- g + geom_line()
g <- g + theme_simple()
g <- g + scale_colour_brewer(palette="Paired")
g <- g + labs(x="", y="Dropped Packets",
title=paste0(site_name,
": IPTables DROPs on Public IP by Top 10 Common Ports", collapse=""
)
)
g
g <- ggplot(common_dst_ports_time, aes(x=Date, y=Count, fill=Destination.Port))
g <- g + geom_area()
g <- g + theme_simple()
g <- g + scale_fill_brewer(palette="Paired")
g <- g + labs(x="", y="Dropped Packets",
title=paste0(site_name,
": IPTables DROPs on Public IP by Top 10 Common Ports", collapse=""
)
)
g
g <- ggplot(top10_dst_ports_time, aes(x=Date, y=Count, colour=Destination.Port))
g <- g + geom_line()
g <- g + theme_simple()
g <- g + scale_colour_brewer(palette="Paired")
g <- g + labs(x="", y="Dropped Packets",
title=paste0(site_name,
": IPTables DROPs on Public IP by Top 10 Max Frequency Common Ports",
collapse=""
)
)
g
g <- ggplot(top10_dst_ports_time, aes(x=Date, y=Count, fill=Destination.Port))
g <- g + geom_area()
g <- g + theme_simple()
g <- g + scale_fill_brewer(palette="Paired")
g <- g + labs(x="", y="Dropped Packets",
title=paste0(site_name,
": IPTables DROPs on Public IP by Top 10 Max Frequency Common Ports",
collapse=""
)
)
g
agg_dst_ports$Destination.Port <- as.numeric(as.character(
agg_dst_ports$Destination.Port
))
non_ephemeral_ports <- heatmap_prep(
agg_dst_ports[agg_dst_ports$Destination.Port < 1024,], 1024, 32,
merge.field="Destination.Port"
)
names(non_ephemeral_ports) <- c("Destination.Port", "Scale", "X", "Y")
non_ephemeral_graph <- function(data, post_title=""){
g <- ggplot(data, aes(x=X, y=Y, fill=Scale, label=Destination.Port))
g <- g + geom_tile() + geom_text()
g <- g + labs(
title=paste0(site_name,
": IPTables Filtered Non-Ephemeral Destination Ports",
post_title, collapse=""
), x="", y=""
)
g <- g + theme_heatmap()
g <- g + scale_fill_continuous(
low="#500000", high="#E00000", guide="colorbar"
)
g <- g + scale_x_discrete(expand=c(0,0)) + scale_y_discrete(expand=c(0,0))
g
}
non_ephemeral_graph(non_ephemeral_ports)
Truncated at 1000 for visual purposes.
non_ephemeral_ports$Scale[non_ephemeral_ports$Scale > 1000] <- 1000
non_ephemeral_graph(non_ephemeral_ports, " (truncated)")
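Capping like this keeps a handful of very busy ports from washing out the colour scale for everything else. A minimal sketch of the same idea on made-up counts (the counts vector is hypothetical), using pmin() as an equivalent of the indexed assignment above:

```r
# Hypothetical per-port drop counts; one port dominates the rest.
counts <- c(12, 480, 1000, 2500, 190000)

# Cap every value at 1000, leaving smaller values untouched.
capped <- pmin(counts, 1000)
capped
# 12 480 1000 1000 1000

# The indexed assignment used in this report gives the same result.
capped_idx <- counts
capped_idx[capped_idx > 1000] <- 1000
identical(capped, capped_idx)
# TRUE
```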
common_ports <- head(agg_dst_ports[order(-agg_dst_ports$Count),], n=256)
common_ports <- common_ports[order(common_ports$Destination.Port),]
common_ports <- heatmap_prep(common_ports)
names(common_ports) <- c("Destination.Port", "Scale", "X", "Y")
common_ports$Destination.Port <- as.factor(common_ports$Destination.Port)
common_ports_graph <- function(data, post_title=""){
g <- ggplot(data, aes(x=X, y=Y, fill=Scale, label=Destination.Port))
g <- g + geom_tile() + geom_text()
g <- g + labs(
title=paste0(site_name,
": IPTables Top 256 Commonly Filtered Destination Ports",
post_title, collapse=""
), x="", y=""
)
g <- g + theme_heatmap()
g <- g + scale_fill_continuous(
low="#500000", high="#E00000", guide="colorbar"
)
g <- g + scale_x_discrete(expand=c(0,0)) + scale_y_discrete(expand=c(0,0))
g
}
g <- ggplot(common_ports, aes(x=Destination.Port, y=Scale))
g <- g + geom_bar(stat="identity")
g <- g + labs(
title=paste0(site_name,
": IPTables Filtered Destination Ports Barchart", collapse=""
), x="Port Number (0-65535)", y=""
)
g <- g + theme_simple() %+replace% theme(axis.text.x=element_blank())
g
common_ports_graph(common_ports)
Truncated at 1000 for visual purposes.
common_ports$Scale[common_ports$Scale > 1000] <- 1000
common_ports_graph(common_ports, " (truncated)")
The maps below show where dropped packets aimed at the most commonly attacked or used ports appear to come from.
ipt_country_22 <- country_code_cleanup(
ipt_data$Country.Code[ipt_data$Destination.Port == 22]
)
g <- world_mapper(ipt_country_22)
g <- g + labs(
title=paste0(
site_name, ": IPTables: INPUT Table Packet Drops (Port 22: ssh)",
collapse=""
),
fill="Dropped Packets", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
Why are people still using telnet? :(
ipt_country_23 <- country_code_cleanup(
ipt_data$Country.Code[ipt_data$Destination.Port == 23]
)
g <- world_mapper(ipt_country_23)
g <- g + labs(
title=paste0(
site_name, ": IPTables: INPUT Table Packet Drops (Port 23: telnet)",
collapse=""
),
fill="Dropped Packets", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
Yucky.
ipt_country_445 <- country_code_cleanup(
ipt_data$Country.Code[ipt_data$Destination.Port == 445]
)
g <- world_mapper(ipt_country_445)
g <- g + labs(
title=paste0(
site_name,
": IPTables: INPUT Table Packet Drops (Port 445: microsoft-ds)",
collapse=""
),
fill="Dropped Packets", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
Due to the alleged rise in RDP attacks after the COVID-19 mass migration to staying at home, this will be interesting to watch over the next couple of years.
Mind you, I follow the Larry David approach, and don't know why anyone would leave their home in the first place. There's nothing but trouble out there, nothing is gained by leaving your home, so why do it? Why were people doing it for so long, and why are they squandering this amazing opportunity to have an excuse to block out the rest of the stupid world full of domesticated primates flinging poo at each other?
Read a book. Watch TV. Play some video games. Calm the fuck down you sheeple. Also, stop trying to break into my house. It won't end well for either one of us, think “No Country for Old Men” or “Enemy of the State”.
ipt_country_3389 <- country_code_cleanup(
ipt_data$Country.Code[ipt_data$Destination.Port == 3389]
)
g <- world_mapper(ipt_country_3389)
g <- g + labs(
title=paste0(
site_name,
": IPTables: INPUT Table Packet Drops (Port 3389: rdp/rdesktop)",
collapse=""
),
fill="Dropped Packets", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
Huawei has been in the news for quite a few security vulnerabilities, and I've noticed that some of my other servers are getting blasted on this port, which is a Huawei administration port, so it would be interesting to see where the traffic is coming from here.
Most likely a Chinese Communist Party // Government run company, Huawei could potentially be just dumping insecure product, then attacking those vulnerabilities. Where the attacks appear to come from doesn't really say much about that, but it is interesting nonetheless.
If it's from China, it could actually be coming from Chinese hackers and/or government agents. Or US/Russia/criminals using a Chinese VPN or proxy to throw off detection systems.
If it's from elsewhere, it could just be from where it says it is, or China routing it from another country.
As I said in the disclaimer above, none of these country codes are very reliable, for many reasons. It could just be that Huawei is incompetent at designing secure routing equipment.
Regardless of the truth, there is no reason to use anything developed by Huawei.
ipt_country_37215 <- country_code_cleanup(
ipt_data$Country.Code[ipt_data$Destination.Port == 37215]
)
g <- world_mapper(ipt_country_37215)
g <- g + labs(
title=paste0(
site_name,
": IPTables: INPUT Table Packet Drops (Port 37215: Huawei Admin)",
collapse=""
),
fill="Dropped Packets", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
A port scan is detected if any single IP address attempts to connect to more than 50 unique destination ports. Under normal usage of my resources, this never happens. One-off connections to random ports that aren't being used (for instance, from an incorrect IP address configured somewhere) stay below the threshold and are cut out of this detection. No resource should be using more than 50 unique destination ports.
Two detection mechanisms are used in this code. One detects on a per-day basis, to see who is spamming the server (such as: nmap -T insane). The other detects on a long-term basis, catching connections spread over multiple days from the same IP but to unique destination ports (such as: nmap -T paranoid).
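The two rules can be sketched on a toy data frame. Everything below is made up for illustration: the IP addresses, dates and ports are placeholders, and the threshold is lowered from 50 to 3 so the toy data can trip it. The setdiff() step is only there to separate the two cases in the example; the report itself counts them independently.

```r
# Toy firewall log: hypothetical source IPs, days and destination ports.
toy <- data.frame(
    IP.Source=c(rep("198.51.100.7", 4), rep("203.0.113.9", 4), "192.0.2.1"),
    Date=c(rep("2019-03-01", 4),
        "2019-03-01", "2019-03-02", "2019-03-03", "2019-03-04",
        "2019-03-01"),
    Destination.Port=c(21, 22, 23, 25, 80, 443, 8080, 8443, 22),
    stringsAsFactors=FALSE
)
threshold <- 3  # the report uses 50; lowered to fit the toy data

count_unique <- function(x) length(unique(x))

# Per-day rule: unique destination ports per source IP per day.
per_day <- aggregate(
    Destination.Port ~ IP.Source + Date, data=toy, FUN=count_unique
)
fast_scanners <- unique(
    per_day$IP.Source[per_day$Destination.Port > threshold]
)

# Long-term rule: unique destination ports per source IP over all days.
overall <- aggregate(
    Destination.Port ~ IP.Source, data=toy, FUN=count_unique
)
slow_scanners <- setdiff(
    overall$IP.Source[overall$Destination.Port > threshold], fast_scanners
)

fast_scanners  # "198.51.100.7": four ports in a single day
slow_scanners  # "203.0.113.9": one port per day over four days
```

The third address touches a single port once and is flagged by neither rule, which is the one-off case described above.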
agg_ip_port_date <- aggregate(
Destination.Port ~ IP.Source + Country.Code + as.factor(Date.NoTime),
data=ipt_data, FUN=function(x){ length(unique(x)); }
)
names(agg_ip_port_date) <- c(
"IP.Source", "Country.Code", "Date", "Count"
)
agg_ip_port_date$Count <- as.numeric(as.character(agg_ip_port_date$Count))
agg_ip_port <- aggregate(
Destination.Port ~ IP.Source + Country.Code,
data=ipt_data, FUN=function(x){ length(unique(x)); }
)
names(agg_ip_port) <- c("IP.Source", "Country.Code", "Unique.Ports")
agg_ip_port$Unique.Ports <- as.numeric(as.character(agg_ip_port$Unique.Ports))
agg_unique_ip <- aggregate(
IP.Source ~ Country.Code,
data=agg_ip_port_date[agg_ip_port_date$Count > 50,],
# count each IP once, even if it scanned on several days
FUN=function(x){ length(unique(x)); }
)
unique_ip_map_insane <- country_code_merge(agg_unique_ip)
names(unique_ip_map_insane) <- c("Country.Code", "Count", "X", "Y", "Country")
agg_unique_ip_paranoid <- aggregate(
IP.Source ~ Country.Code,
data=agg_ip_port[agg_ip_port$Unique.Ports > 50,], FUN=length
)
unique_ip_map_paranoid <- country_code_merge(agg_unique_ip_paranoid)
names(unique_ip_map_paranoid) <- c("Country.Code", "Count", "X", "Y", "Country")
nmap -T insane port scans:
nrow(agg_ip_port_date[agg_ip_port_date$Count > 50,])
## [1] 989
nmap -T paranoid port scans:
nrow(agg_ip_port[agg_ip_port$Unique.Ports > 50,])
## [1] 890
Top nmap -T insane scan dates:
agg_ip_port_date$Date[agg_ip_port_date$Count > 3000]
## [1] 2019-03-01 2019-03-02 2019-03-03 2019-04-01 2019-04-05 2019-04-08
## 365 Levels: 2019-01-01 2019-01-02 2019-01-03 2019-01-04 ... 2019-12-31
g <- world_mapper(unique_ip_map_insane)
g <- g + labs(
title=paste0(site_name,
": IPTables: Detected Port Scans (`nmap -T insane`-like)",
collapse=""
),
fill="Unique IPs", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
g <- world_mapper(unique_ip_map_paranoid)
g <- g + labs(
title=paste0(site_name,
": IPTables: Detected Port Scans (`nmap -T paranoid`-like)",
collapse=""
),
fill="Unique IPs", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g
ipt_map_data <- ipt_data[!is.na(ipt_data$Country.Code),]
anim_geoip <- turn_to_animation(ipt_map_data)
anim_geoip$Count[anim_geoip$Count > 1000] <- 1000
agg_dst_ports_time_trunc <- agg_dst_ports_time
agg_dst_ports_time_trunc$Count[agg_dst_ports_time_trunc$Count > 1000] <- 1000
agg_dst_ports_time_trunc$Destination.Port <- as.numeric(as.character(
agg_dst_ports_time_trunc$Destination.Port
))
anim_ports <- turn_to_animation(
agg_dst_ports_time_trunc[
agg_dst_ports_time_trunc$Destination.Port < 1024,
], "Destination.Port", "Count"
)
names(anim_ports) <- c("Animate.Time", "Value", "Count")
anim_ports$Value <- as.numeric(as.character(anim_ports$Value))
anim_ports$Count[is.na(anim_ports$Count)] <- 0
anim_ports <- anim_ports[
(!is.na(anim_ports$Animate.Time) & !is.na(anim_ports$Value)),
]
names(anim_ports) <- c("Animate.Time", "Destination.Port", "Value")
anim_ports$Animate.Time <- as.POSIXct(anim_ports$Animate.Time)
anim_ports$Destination.Port <- as.numeric(as.character(
anim_ports$Destination.Port
))
anim_ports_org <- heatmap_prep(
anim_ports, 1024, 32,
date.field="Animate.Time", merge.field="Destination.Port",
value.ordering=TRUE
)
names(anim_ports_org) <- c(
"Animate.Time", "Destination.Port", "Scale", "X", "Y"
)
anim_ports_org$Scale <- as.numeric(as.character(anim_ports_org$Scale))
anim_ports_org$Animate.Time <- as.character(strptime(
anim_ports_org$Animate.Time, format="%Y-%m-%d"
))
common_anim_ports_lbls <- head(
agg_dst_ports$Destination.Port[order(-agg_dst_ports$Count)], n=256
)
common_anim_ports_lbls <- common_anim_ports_lbls[order(common_anim_ports_lbls)]
common_anim_ports <- turn_to_animation(
agg_dst_ports_time_trunc[
agg_dst_ports_time_trunc$Destination.Port %in% common_anim_ports_lbls,
], "Destination.Port", "Count"
)
names(common_anim_ports) <- c("Animate.Time", "Destination.Port", "Value")
common_anim_ports$Animate.Time <- as.POSIXct(common_anim_ports$Animate.Time)
common_anim_ports_org <- heatmap_prep(
common_anim_ports[
common_anim_ports$Destination.Port %in% common_anim_ports_lbls,
], 256, 16,
date.field="Animate.Time", merge.field="Destination.Port",
date.ordering=TRUE, expand.values=common_anim_ports_lbls
)
names(common_anim_ports_org) <- c(
"Animate.Time", "Destination.Port", "Scale", "X", "Y"
)
graph_to_animation <- function(g, x=Inf, y=Inf){
g <- g + geom_label(
aes(x=x, y=y, label=Animate.Time),
vjust="inward", hjust="inward",
colour="#808080", fill="#FFFFFF", label.size=0
)
g <- g + transition_manual(Animate.Time)
g
}
g <- world_mapper(anim_geoip)
g <- g + labs(
title=paste0(
site_name, ": IPTables: INPUT Table Packet Drops GeoIP Lookup",
collapse=""
),
fill="Dropped Packets", x="", y=""
)
g <- g + scale_fill_continuous(low="#300000", high="#E00000", guide="colorbar")
g <- graph_to_animation(g)
options(
gganimate.fps=5,
gganimate.nframes=length(levels(as.factor(anim_geoip$Animate.Time)))
)
g
g <- non_ephemeral_graph(anim_ports_org)
g <- graph_to_animation(g, y=-32.5)
options(
gganimate.fps=5,
gganimate.nframes=length(levels(as.factor(anim_ports_org$Animate.Time)))
)
g