In the past week, while I was trying to fight off a rather large spam/virus attack, I decided that I wanted to see how well the RBLs that I was using were blocking spam. As it turned out, I wasn’t blocking spam very well with my RBL, and I really needed to use a different one.

I was evaluating the IP addresses using http://www.senderbase.org/. They have a great tool that will search multiple RBLs and tell you which ones an IP address is on.

Now I just needed to know which RBLs would block the IPs that were sending the spam to my servers.

So I decided that I needed to check the IP addresses in my log files against senderbase.org. Doing this by hand would have been a daunting task, but a quick script using regular expressions made it a snap.

Regular expressions are a simple way to search for some very complex patterns. In this case I was looking for the telltale signs of an IP address.

An IP address has four octets separated by dots (periods). So I know that each of the four numbers can be from 0 to 255, and that there are four numbers with three dots between them.

Here is the regular expression that I came up with to find these IPs:

[0-2]*[0-9]*[0-9]\.[0-2]*[0-9]*[0-9]\.[0-2]*[0-9]*[0-9]\.[0-2]*[0-9]*[0-9]

Here is how to read it. The first part, [0-2]*, matches zero or more digits from 0 to 2. The second part, [0-9]*, matches zero or more digits from 0 to 9. The third part, [0-9], requires exactly one digit from 0 to 9. Then comes a literal period (\.), and the pattern starts over for the next octet.
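To make that reading concrete, here is a minimal Python sketch that builds a pattern from the three parts described above and applies it to one line of text. The names OCTET, IP_PATTERN and find_ips, and the sample line, are made up for illustration; this is not the script I actually ran.

```python
import re

# One octet as read above: zero or more digits 0-2, zero or more digits 0-9,
# then exactly one required digit.
OCTET = r"[0-2]*[0-9]*[0-9]"

# Four octets joined by literal periods.
IP_PATTERN = re.compile(r"\.".join([OCTET] * 4))


def find_ips(line):
    """Return every substring of `line` that looks like a dotted quad."""
    return IP_PATTERN.findall(line)


# Made-up sample line, not taken from a real log file.
sample = "connect from 192.168.10.25 (helo=mail.example.com)"
print(find_ips(sample))  # ['192.168.10.25']
```

Because every part except the last digit is optional, the pattern happily accepts one-, two- and three-digit octets, which is also why it can match things that are not IP addresses.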

This regular expression will also catch things that are not IP addresses. For example, many programs, including sendmail, use version numbers that look like this format.

It also might be prudent to make the first part of the regular expression look only for digits from 1 to 2 instead of 0 to 2.
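Another way to cut down on false matches, sketched here as an assumption rather than something I used, is to keep the regular expression loose and then check in code that each octet really falls between 0 and 255. The names CANDIDATE, is_valid_ipv4 and find_valid_ips are hypothetical.

```python
import re

# Loose candidate pattern: four groups of 1-3 digits separated by periods.
CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def is_valid_ipv4(text):
    """True if `text` is four dot-separated numbers, each from 0 to 255."""
    parts = text.split(".")
    return len(parts) == 4 and all(
        p.isdigit() and 0 <= int(p) <= 255 for p in parts
    )


def find_valid_ips(line):
    """Return the candidates in `line` whose octets are all in range."""
    return [m for m in CANDIDATE.findall(line) if is_valid_ipv4(m)]


# Made-up sample line: the first candidate has an out-of-range octet.
print(find_valid_ips("relay 10.0.0.300 and 172.16.5.4"))  # ['172.16.5.4']
```

A range check like this cannot tell a four-part version number from a real address, so some false positives will probably remain.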

This is a very quick and dirty example, and I am positive that there are many more elegant ways of using regular expressions to find IP addresses. The bottom line is that this one worked very well for me.

So what was the rest of my script? I looped through each line of the qmail log file, checked whether each IP address I found was already in my list of collected IP addresses, and then presented a web page that listed the IPs with a direct link to senderbase.org.
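The steps above could be sketched in Python roughly as follows. This is an illustration, not my original script: the function names are made up, the sample lines are not real qmail log output, and LOOKUP_URL_TEMPLATE is only a placeholder, since the exact query URL that senderbase.org expects is not given here.

```python
import html
import re

IP_PATTERN = re.compile(
    r"[0-2]*[0-9]*[0-9]\.[0-2]*[0-9]*[0-9]\.[0-2]*[0-9]*[0-9]\.[0-2]*[0-9]*[0-9]"
)

# Placeholder link format (an assumption); substitute the site's real query URL.
LOOKUP_URL_TEMPLATE = "http://www.senderbase.org/?ip={ip}"


def collect_ips(lines):
    """Return each IP-like string once, in the order it was first seen."""
    seen = set()
    ordered = []
    for line in lines:
        for ip in IP_PATTERN.findall(line):
            if ip not in seen:  # skip addresses already collected
                seen.add(ip)
                ordered.append(ip)
    return ordered


def render_page(ips, url_template=LOOKUP_URL_TEMPLATE):
    """Build a small HTML page with one lookup link per IP address."""
    items = "\n".join(
        '<li><a href="{0}">{1}</a></li>'.format(
            html.escape(url_template.format(ip=ip)), html.escape(ip)
        )
        for ip in ips
    )
    return "<html><body><ul>\n" + items + "\n</ul></body></html>"


if __name__ == "__main__":
    # Made-up lines standing in for a log file.
    log_lines = [
        "connection from 203.0.113.7",
        "connection from 198.51.100.23",
        "connection from 203.0.113.7",
    ]
    print(render_page(collect_ips(log_lines)))
```

For a real log, the list of sample lines would be replaced by an open file handle, since iterating over a file yields one line at a time.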

Now I am able to check through my log files to see if an IP address that recently sent me email should be blocked by an RBL that I am not using right now. If I notice enough of them, I will add a new RBL to my list and watch the spam dwindle away :-)