While working with a client to figure out why their spam filter was catching more messages than it should, we realized that some regular expressions had been added to it by accident.

How do you accidentally add a regular expression, you ask?

In this case, the system admin had gotten tired of seeing the word ‘on|ine’ spelled with a pipe. By copying and pasting that word out of an email message, they inadvertently added a regular expression to their spam filters.

This particular regular expression was pretty bad. It reads as “on OR ine”, so any message containing ‘on’ or ‘ine’ anywhere, even in the middle of another word, was getting caught as spam. About 60% of all of their email was getting caught by this filter.
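Here is a rough sketch of what was going on, written in Python rather than the client's actual filter (which I am not reproducing here). The sample messages are made up for illustration, and `re.escape` is just one way to say "treat this text literally"; your filter may have its own option for that.

```python
import re

# Hypothetical sample messages, for illustration only.
messages = [
    "Please confirm the meeting on Monday.",
    "The machine is back up.",
    "Buy meds on|ine now",
    "Lunch at twelve?",
]

# What got pasted in: the pipe is read as alternation, "on" OR "ine".
accidental = re.compile(r"on|ine")

# What was meant: the literal text "on|ine", with the pipe escaped.
literal = re.compile(re.escape("on|ine"))

flagged_accidental = [m for m in messages if accidental.search(m)]
flagged_literal = [m for m in messages if literal.search(m)]

print(len(flagged_accidental), "flagged by the accidental pattern")
print(len(flagged_literal), "flagged by the literal pattern")
```

The accidental pattern flags ordinary words like "confirm" and "machine", while the escaped version only flags the message that really contains the pipe spelling.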

Regular expressions can be wonderful things for filtering spam, when you truly mean to use them. For example, say you want to filter the word ‘penny’ and the plural ‘pennies’. (My example to my client was another word that starts with a P and has the same plural issue.) A simple regular expression for this would be /penn(y|ies)/. This reads “‘penn’ followed by either ‘y’ or ‘ies’”. The parentheses matter: written with square brackets, /penn[y|ies]/ would instead mean ‘penn’ followed by any one of the characters y, |, i, e or s.
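If you want to check a pattern like this before trusting it, a quick test helps. The sketch below uses Python's `re` module as a stand-in (your spam filter's regex flavor may differ in details) and compares a parenthesized group against a square-bracket character class on a few test words I picked for illustration.

```python
import re

# Parentheses group alternatives: "penn" followed by "y" or "ies".
group = re.compile(r"penn(y|ies)")

# Square brackets are a character class: "penn" followed by ONE of
# the characters y, |, i, e, s. The pipe is just a literal character here.
char_class = re.compile(r"penn[y|ies]")

# What each pattern actually matches inside "pennies".
group_match = group.search("pennies").group()
class_match = char_class.search("pennies").group()

# "penne" (the pasta) is not a word we meant to filter.
group_hits_penne = group.search("penne") is not None
class_hits_penne = char_class.search("penne") is not None

print(group_match, class_match, group_hits_penne, class_hits_penne)
```

The grouped pattern matches the whole word "pennies" and leaves "penne" alone. The bracketed one stops after "penni" and also catches "penne", which is exactly the kind of surprise that caused the trouble above.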

So be careful with those regular expressions. Know what they really mean and make sure they mean what you want them to.