IP handling, conversion and validation

iptools is a package that makes IP addresses convenient to parse, validate and work with. It is heavily influenced by the Python iptools module, and will hopefully make users’ lives a heck of a lot easier if they have to deal with IP data. Much of it is currently IPv4-specific, out of necessity, but as R’s support for bigger numbers improves, we’ll hopefully extend as much of it to IPv6 as possible!

Validating, converting and classifying IP addresses

How do you know an IP address is an IP address? How do you know what type of IP address it is? Most of the time the answer is a complicated regular expression, made more complicated by the need to check for things that are syntactically valid IP addresses, but aren’t actually possible. iptools contains ip_classify, which accepts a vector of IP addresses (or things that might be IP addresses) and identifies whether they’re valid - and if they are valid, what type they are.

ips <- c("192.168.0.1", "2607:f8b0:4006:80b::aaa",
         "the next IP is also invalid", "256.256.190.900")
ip_classify(ips)
[1] "IPv4"    "IPv6"    "Invalid" "Invalid"
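
To see why a regular expression alone is not enough, here is a minimal base-R sketch of an IPv4-only check. It is not how ip_classify is implemented; the function name is_valid_ipv4 is hypothetical and the two-step logic (syntax first, then octet range) is our own assumption about a reasonable approach.

```r
# Hypothetical helper, not part of iptools: TRUE for valid dotted-decimal IPv4.
is_valid_ipv4 <- function(x) {
  vapply(x, function(address) {
    # Syntactic check: exactly four dot-separated groups of one to three digits.
    if (is.na(address) || !grepl("^[0-9]{1,3}(\\.[0-9]{1,3}){3}$", address)) {
      return(FALSE)
    }
    # Range check: every octet must fit in a single byte (0-255).
    octets <- as.integer(strsplit(address, ".", fixed = TRUE)[[1]])
    all(octets <= 255L)
  }, logical(1), USE.NAMES = FALSE)
}

is_valid_ipv4(c("192.168.0.1", "256.256.190.900", "the next IP is also invalid"))
# [1]  TRUE FALSE FALSE
```

"256.256.190.900" passes the syntactic check and only fails on the octet range, which is exactly the "syntactically valid, but not actually possible" case.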

Once you’ve validated and classified the IPs, you might want to look at them in their numeric form, rather than the “dotted-decimal” form - or you might have numeric forms, and need dotted-decimal. Either way, there’s a function for it.

#Dotted-decimal to numeric
ips <- c("192.168.0.1","172.18.0.0","172.18.0.15")
numeric_ips <- ip_to_numeric(ips)
numeric_ips
[1] 3232235521 2886860800 2886860815

#And back again
numeric_to_ip(numeric_ips)
[1] "192.168.0.1" "172.18.0.0"  "172.18.0.15"

These functions currently only work for IPv4 IPs - we’ll have IPv6 support as soon as R can handle numbers that big!
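
The arithmetic behind the conversion is simple enough to sketch in base R: each octet is one base-256 digit. The helpers below (ipv4_to_number and number_to_ipv4) are hypothetical names for illustration, not the package's internals. They also hint at the IPv6 problem: R's doubles hold integers exactly only up to 2^53, which covers a 32-bit IPv4 address but not a 128-bit IPv6 one.

```r
# Hypothetical helpers, not part of iptools.
ipv4_to_number <- function(x) {
  vapply(strsplit(x, ".", fixed = TRUE), function(octets) {
    # Treat the four octets as base-256 digits, most significant first.
    sum(as.numeric(octets) * 256^(3:0))
  }, numeric(1))
}

number_to_ipv4 <- function(n) {
  vapply(n, function(value) {
    # Peel off each octet with integer division and remainder.
    octets <- (value %/% 256^(3:0)) %% 256
    paste(octets, collapse = ".")
  }, character(1))
}

ipv4_to_number("192.168.0.1")
# [1] 3232235521
number_to_ipv4(3232235521)
# [1] "192.168.0.1"
```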

Resolving hostnames

The iptools package has integrated the AsioHeaders package which includes the asio networking library. Thanks to this we can take a hostname and work out what IP address(es) are associated with it:

hostname_to_ip("lga15s49-in-f6.1e100.net")
[[1]]
[1] "173.194.123.102"

This works in reverse, too:

ip_to_hostname("173.194.123.10")
[[1]]
[1] "lga15s46-in-f10.1e100.net"

Both operations are fully vectorised, but aren’t particularly fast - in fact, they’re incredibly slow compared to the rest of the package - since they need to call out to the system to work. For the same reason, they require a network connection, and may slow down said connection while running. You’ve been warned.

Handling IP ranges

As well as specific, unique IP addresses, you may also encounter IP ranges - subsets of the IP address space, looking something like “172.18.0.0/28”. iptools provides a variety of functions to manipulate and expand on these.

range_boundaries takes an IP range (or as many IP ranges as you want!) and produces a data.frame containing the smallest-valued and largest-valued IP addresses in that range:

range_boundaries(c("172.18.0.0/28","148.20.57.0/28","148.20.57.0/24"))
   minimum_ip    maximum_ip
1  172.18.0.0   172.18.0.15
2 148.20.57.0  148.20.57.15
3 148.20.57.0 148.20.57.255
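
If you are wondering where those boundaries come from, the number after the slash is the count of fixed leading bits, so a /28 leaves 32 - 28 = 4 free bits, or 16 addresses. A rough base-R sketch of that calculation follows; cidr_bounds is a hypothetical name, it works on numeric forms only, and it is not the package's implementation.

```r
# Hypothetical helper, not part of iptools: numeric boundaries of one IPv4 range.
cidr_bounds <- function(range) {
  parts <- strsplit(range, "/", fixed = TRUE)[[1]]
  prefix <- as.integer(parts[2])
  octets <- as.numeric(strsplit(parts[1], ".", fixed = TRUE)[[1]])
  address <- sum(octets * 256^(3:0))
  # Number of addresses in the block: 2 to the power of the free bits.
  block_size <- 2^(32 - prefix)
  # Round down to the start of the block, then step to its last address.
  first <- (address %/% block_size) * block_size
  last <- first + block_size - 1
  c(minimum = first, maximum = last)
}

cidr_bounds("172.18.0.0/28")
```

For "172.18.0.0/28" this gives 2886860800 and 2886860815, the numeric forms of 172.18.0.0 and 172.18.0.15.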

If you want all the values within a certain range, rather than just the smallest and largest, you can use range_generate, which provides you with all of the valid addresses within a specific IP range:

range_generate(range = "148.20.57.0/28")
 [1] "148.20.57.0"  "148.20.57.1"  "148.20.57.2"  "148.20.57.3"  "148.20.57.4"
 [6] "148.20.57.5"  "148.20.57.6"  "148.20.57.7"  "148.20.57.8"  "148.20.57.9"
[11] "148.20.57.10" "148.20.57.11" "148.20.57.12" "148.20.57.13" "148.20.57.14"
[16] "148.20.57.15"

On the other hand, if you simply want to check if an IP address is within a certain range, without caring about the range itself, you can use ip_in_range. This accepts a vector of IP addresses and either one range, or an equally-sized vector of ranges, to check them against. It then tells you which of the provided IPs are within the range (TRUE) or not (FALSE):

ips <- c("148.20.57.15", "148.20.57.255")
ip_in_range(ips, "148.20.57.0/28")
[1]  TRUE FALSE
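
One way to think about the membership test, sketched in base R: two addresses are in the same range exactly when they fall in the same block of 2^(32 - prefix) addresses. The names to_number and in_cidr are hypothetical, the sketch handles a single range only, and ip_in_range may well do this differently under the hood.

```r
# Hypothetical helpers, not part of iptools.
to_number <- function(ip) {
  sum(as.numeric(strsplit(ip, ".", fixed = TRUE)[[1]]) * 256^(3:0))
}

in_cidr <- function(ips, range) {
  parts <- strsplit(range, "/", fixed = TRUE)[[1]]
  block_size <- 2^(32 - as.integer(parts[2]))
  # Index of the block that the range itself occupies.
  range_block <- to_number(parts[1]) %/% block_size
  # An address is in the range when it lands in the same block.
  vapply(ips, function(ip) to_number(ip) %/% block_size == range_block,
         logical(1), USE.NAMES = FALSE)
}

in_cidr(c("148.20.57.15", "148.20.57.255"), "148.20.57.0/28")
# [1]  TRUE FALSE
```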

Generating IP addresses

Sometimes you need spoof data - IPs that are valid, but aren’t sourced from anywhere in particular. iptools contains ip_random, which lets you generate a set of pseudo-random and totally valid IPv4 addresses:

ip_random(n = 5)
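
For a feel of what generating such addresses involves, here is a bare-bones base-R sketch. It simply draws four octets per address uniformly from 0-255, so unlike a more careful generator it can return reserved or private addresses. The name random_ipv4 is hypothetical, and we are not claiming ip_random works this way.

```r
# Hypothetical helper, not part of iptools.
random_ipv4 <- function(n) {
  # Draw n * 4 octets uniformly from 0-255, one row per address.
  octets <- matrix(sample(0:255, n * 4, replace = TRUE), ncol = 4)
  # Join each row of octets into dotted-decimal form.
  apply(octets, 1, paste, collapse = ".")
}

set.seed(42)  # only needed if you want reproducible output
random_ipv4(5)
```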