This morning I got to thinking that knowing the overall accuracy of the Bayesian filter would give me a good handle on how well the filter is learning.

The thing I am most interested in is what percentage of messages are getting marked neutral, or unclassified.

Since the beginning of the data I have, which covers about 21 days of messages, about 30% of the messages have been marked neutral; of those, 6.5% have been reclassified as good and 93.5% have been reclassified as bad.

Over the past 14 days, 24.2% have been marked neutral, with 8.6% reclassified as good and 91.3% reclassified as bad.

In the past 2 days, 17.2% have been marked neutral, with 3.7% reclassified as good and 96.2% reclassified as bad.
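For anyone who wants to run the same kind of numbers, here is a rough sketch in Python of how the percentages for each window could be computed. It is not the code behind the figures above. The `Message` record, its field names, and the `window_stats` helper are all made up for illustration, and it assumes the good/bad split is taken over the neutral messages that were later reclassified by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Message:
    # Hypothetical record layout; a real mail log will look different.
    received: datetime
    initial: str                        # "good", "bad", or "neutral" from the filter
    reclassified: Optional[str] = None  # "good" or "bad" after manual review


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when the whole is empty."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def window_stats(messages, now: datetime, days: int) -> dict:
    """Neutral share, and good/bad split of reclassified neutrals, for the last `days` days."""
    cutoff = now - timedelta(days=days)
    recent = [m for m in messages if m.received >= cutoff]

    # Messages the filter could not decide on.
    neutral = [m for m in recent if m.initial == "neutral"]

    # Assumption: the good/bad percentages are relative to neutral
    # messages that were later reclassified by hand.
    reviewed = [m for m in neutral if m.reclassified in ("good", "bad")]
    good = sum(1 for m in reviewed if m.reclassified == "good")

    return {
        "neutral_pct": pct(len(neutral), len(recent)),
        "good_pct": pct(good, len(reviewed)),
        "bad_pct": pct(len(reviewed) - good, len(reviewed)),
    }
```

Calling `window_stats(messages, datetime.now(), days)` with `days` set to 21, 14, and 2 would give the three snapshots to compare. If the filter is learning, `neutral_pct` should shrink as the window gets shorter.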

Overall, I can see that the filter is learning, although not as fast as I would like. The really good news is that none of the messages originally classified as good or bad have had to be reclassified.