I was looking through my log files and I noticed a query for “Commonly used spam email words”. This got me thinking. What about a tool where you could look up a single word, a short phrase or an entire email message and get a potential spam score for it?

The idea would be to use a Bayesian filter to process a set of known good messages and known spam messages, and then build a set of query tools on top of that collection that return a spam score as a percentage.
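
To make the idea a bit more concrete, here is a rough sketch in Python of what such a scorer might look like. It is only an illustration and not a finished design: the `SpamScorer` class, the `tokenize` helper, and the tiny training messages are all made up for this example. It assumes a simple naive Bayes model with add-one smoothing and equal prior odds for spam and good mail.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class SpamScorer:
    """Hypothetical naive Bayes scorer; assumes equal priors for spam and good mail."""

    def __init__(self):
        self.spam_counts = Counter()
        self.good_counts = Counter()
        self.spam_total = 0
        self.good_total = 0

    def train(self, text, is_spam):
        """Add one known message to the spam or good collection."""
        tokens = tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_total += len(tokens)
        else:
            self.good_counts.update(tokens)
            self.good_total += len(tokens)

    def score(self, text):
        """Return an estimated spam probability as a percentage (0 to 100)."""
        vocab = len(set(self.spam_counts) | set(self.good_counts)) or 1
        log_odds = 0.0
        for tok in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the estimate.
            p_spam = (self.spam_counts[tok] + 1) / (self.spam_total + vocab)
            p_good = (self.good_counts[tok] + 1) / (self.good_total + vocab)
            log_odds += math.log(p_spam / p_good)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 100.0 / (1.0 + math.exp(-log_odds))


# Made-up training data, purely for illustration.
scorer = SpamScorer()
scorer.train("win free money now", is_spam=True)
scorer.train("free prize claim now", is_spam=True)
scorer.train("meeting notes attached", is_spam=False)
scorer.train("lunch tomorrow at noon", is_spam=False)

print(scorer.score("free money"))
print(scorer.score("meeting tomorrow"))
```

The same `score` call works for a single word, a phrase or a whole message, since each is just a list of tokens. With nothing to go on (an empty query), the score sits at 50.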

Of course, initially this would be based on my own email box and the spam messages that I get, but over time it would develop into a good query tool.

I’m going to add this to my todo list, which never seems to get any smaller, and see where it gets to.