Some bots are good

By lucastech
Tags: bots, internet archive, good bots

Even though the internet is full of bad bots these days, a few good ones are worth highlighting and should be allowed.

[Image: holy traffic spike batman]

The spike was already over by the time I saw it, but I still wanted to see who was behind it.

Top IPs by total time (last 35 periods):
IP Address             Requests   Total Time   Avg Time  Mins  Sites
---------------------------------------------------------------------
207.241.225.x               454     27139.36      59.78    25  
207.241.225.x               464     27118.47      58.44    33  
207.241.225.x               435     25315.76      58.20    27  
207.241.225.x               212     13341.87      62.93    27  
207.241.225.x               208     12868.12      61.87    18  
207.241.225.x               230     12734.87      55.37    33  
207.241.235.x               219     12058.01      55.06    28
127.0.0.1                   389      4523.27      11.63    34
68.192.x.x                   58      1669.57      28.79     1 

This was very clearly some kind of bulk scraping operation: a lot of requests over a period of about 14 minutes, with the top IP addresses all in the same range.
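A report like the one above boils down to grouping requests by client IP and summing their response times. The sketch below is not the script that produced the table. It assumes hypothetical input records of `(ip, seconds)` pairs already parsed out of an access log, and the function name `top_ips_by_total_time` is made up for illustration.

```python
from collections import defaultdict


def top_ips_by_total_time(records, limit=10):
    """Group (ip, seconds) records by IP and rank them by total time.

    `records` is an assumed input shape, not the actual log format.
    Returns a list of (ip, requests, total_time, avg_time) tuples.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for ip, seconds in records:
        counts[ip] += 1
        totals[ip] += seconds

    rows = [(ip, counts[ip], totals[ip], totals[ip] / counts[ip]) for ip in counts]
    # Largest total time first, matching the ordering of the report.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    # Made-up sample data using documentation-only addresses (RFC 5737).
    sample = [
        ("192.0.2.10", 60.0),
        ("192.0.2.10", 58.0),
        ("198.51.100.7", 12.0),
    ]
    for ip, requests, total, avg in top_ips_by_total_time(sample):
        print(f"{ip:<20} {requests:>8} {total:>12.2f} {avg:>10.2f}")
```

The Avg Time column in the table is consistent with this calculation: 27139.36 / 454 is roughly 59.78.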

Checking the user agent, I saw this:

Mozilla/5.0 (compatible; archive.org_bot +http://archive.org/details/archive.org_bot) Zeno/9c8cca3 warc/v0.8.90

and looking up the IP in whois gives:

CIDR:           207.241.224.0/20
NetName:        INTERNET-ARCHIVE-1
NetHandle:      NET-207-241-224-0-1
Parent:         NET207 (NET-207-0-0-0-0)
NetType:        Direct Allocation
Organization:   Internet Archive (INTERN-95)
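The whois range can be checked against the addresses from the log with Python's standard `ipaddress` module. This is a minimal sketch: the last octets are masked in the table, so it tests whole prefixes instead of single addresses, and it assumes each `x` hides exactly one octet. The helper name `in_archive_range` is made up for illustration.

```python
import ipaddress

# Range reported by whois for the Internet Archive.
ARCHIVE_NET = ipaddress.ip_network("207.241.224.0/20")


def in_archive_range(prefix):
    """Return True if the given network lies entirely inside ARCHIVE_NET."""
    return ipaddress.ip_network(prefix).subnet_of(ARCHIVE_NET)


if __name__ == "__main__":
    # Masked octets are replaced by a covering prefix (an assumption).
    for prefix in ("207.241.225.0/24", "207.241.235.0/24", "68.192.0.0/16"):
        print(prefix, in_archive_range(prefix))
```

A /20 starting at 207.241.224.0 covers 207.241.224.0 through 207.241.239.255, so both the 225 and 235 blocks fall inside it, while the 68.192 address does not.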

Given that the Internet Archive is kind of like our modern internet library, I consider this a good bot.

Unfortunately, they do not appear to be checking my robots.txt, which sets a 30-second crawl delay, but I'll let them get away with it so long as the request spikes don't last too long.
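To put a rough number on that, the sketch below reads a crawl delay with the standard `urllib.robotparser` module and compares it to the average gap between requests. Two things here are assumptions: the robots.txt content (the real file is not shown in this post) and the reading of the Mins column as the number of minutes an IP was active. The helper names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real file is not shown in this post.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 30
"""


def crawl_delay_for(user_agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay (seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def mean_interval_seconds(requests, active_minutes):
    """Average gap between requests, assuming they were spread evenly."""
    return active_minutes * 60 / requests


if __name__ == "__main__":
    delay = crawl_delay_for("archive.org_bot")
    # First table row: 454 requests with 25 in the Mins column.
    gap = mean_interval_seconds(454, 25)
    print(f"crawl delay: {delay}s, observed mean gap: {gap:.1f}s")
```

Under those assumptions the first IP alone averaged about one request every 3.3 seconds, roughly nine times faster than a 30-second delay would allow.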

If you haven't played with their Software selection, I highly recommend the trip down nostalgia lane! If you're looking for some quality games, I personally always enjoy Doom and Jazz Jackrabbit.