Parallel DNS reverse lookups

Need to do lots of reverse DNS lookups for some reason? Maybe because you’re trying to get a seed list for a web crawl or a hack attempt on a bunch of ISPs. Who cares. Here’s a quick way to generate names from a big list of IPs like:

1.1.1.1
1.1.1.2
[...]
254.254.254.253
254.254.254.254
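
If you don't already have a list like that lying around, here is a minimal sketch of one way to produce it, one IP per line, using Python's standard ipaddress module. The function name hosts_in and the 192.0.2.0/24 range (a reserved documentation block) are placeholders of mine, not something from the original setup; swap in whatever range you actually care about.

```python
import ipaddress


def hosts_in(cidr):
    """Yield every usable host address in `cidr` as a dotted-quad string."""
    # .hosts() skips the network and broadcast addresses of the block.
    for addr in ipaddress.ip_network(cidr).hosts():
        yield str(addr)


if __name__ == "__main__":
    # Placeholder range (assumption): replace with the block you want to scan.
    for ip in hosts_in("192.0.2.0/24"):
        print(ip)
```

Redirect the output to a file and copy it to wherever your Hadoop job reads its input from.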

We can use Hadoop streaming to chunk the list so the DNS lookups run in parallel. Easy, and it requires little to no thought:

./bin/hadoop jar contrib/streaming/*-streaming.jar -input /home/aday/classC.dat -output /home/aday/classC_dns.dat -mapper 'perl -ne '\''print `host $_`'\''' -numReduceTasks 0

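We wrap the host call in backticks so that a non-zero exit code from host doesn’t kill the mapper: Perl captures whatever host writes to stdout, including its error message when a lookup fails, and prints it like any other result.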
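
If the one-liner gets too cramped, the same idea can be sketched as a standalone streaming mapper. This is not the command used above, just a rough equivalent under a few assumptions: it uses Python's socket.gethostbyaddr instead of shelling out to host, the names reverse_lookup and map_lines are made up for illustration, and the tab-separated output format is my own choice. As with the backticks, a failed lookup becomes a line of output rather than a crash.

```python
#!/usr/bin/env python3
import socket
import sys


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return 'ip<TAB>hostname', or 'ip<TAB>ERROR: ...' if the lookup fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return f"{ip}\t{hostname}"
    except OSError as exc:
        # socket.herror and socket.gaierror are both subclasses of OSError,
        # so a failed lookup is reported on stdout instead of killing the task.
        return f"{ip}\tERROR: {exc}"


def map_lines(lines, resolver=socket.gethostbyaddr):
    """Yield one result line per non-empty input line."""
    for line in lines:
        ip = line.strip()
        if ip:
            yield reverse_lookup(ip, resolver)


if __name__ == "__main__":
    # Hadoop streaming feeds each mapper its chunk of the input on stdin.
    for result in map_lines(sys.stdin):
        print(result)
```

To use it you would save it under some name (say rdns_mapper.py, a hypothetical filename), ship it with the job, and pass it to -mapper in place of the perl one-liner.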