Spam Free Email

Anti-spam ideas, tools and services

December 31st, 2006

ErlMail status

I’ve been slowly working on ErlMail during the past few weeks. Some client work and the holidays have made it more difficult to carve out programming time than I would have liked.

I’ve been reading many different RFCs and I have started to build some leex/yecc grammar files to parse the different types of content. The process is working out better than I hoped in many cases, but I’m still running into trouble.

The biggest complaint I have at this point is the fact that many people simply don’t follow the RFCs. There are some instances that are innocent enough, but I’m getting annoyed with dates that look nothing like RFC822 dates or have information in them that was never intended to be part of a date field.

In one instance I found that a mail server fixed an error of omission from a client, making sure that a date was in the headers, and then added “(added by postmaster@somedomain.com)” to it. I haven’t re-checked the RFCs, but I’m pretty sure that is blatantly wrong.
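One defensive way to cope with a date like that, assuming the parenthesized text can be treated as a comment and simply discarded, is to strip it before handing the value to a date parser. The sketch below is in Python rather than Erlang to keep it short, and it is not ErlMail code; `strip_comments` is a hypothetical helper and the sample header value is made up for illustration.

```python
from email.utils import parsedate_to_datetime


def strip_comments(value: str) -> str:
    """Remove parenthesized comments (including nested ones) from a header value.

    Simplification: backslash-escaped parentheses and quoted strings are
    not treated specially here.
    """
    out = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    # Collapse the whitespace left behind by the removed comment.
    return " ".join("".join(out).split())


# Made-up example value in the style described above.
raw = "Sun, 31 Dec 2006 10:15:00 -0800 (added by postmaster@somedomain.com)"
cleaned = strip_comments(raw)
parsed = parsedate_to_datetime(cleaned)
```

This only deals with the stray comment. Dates that look nothing like RFC822 dates still need a more forgiving parser or an explicit fallback.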

How exactly is someone supposed to write an RFC-compliant mail client when they know for certain that they will inevitably receive mail that is not RFC compliant? No number of MUSTs, SHOULDs or MAYs will prepare you for the array of bizarre interpretations that people seem to have about what is meant in the RFCs.

This just reinforces the amazement that I have in the fact that email even works. This system was never designed to handle the traffic, or frankly the content, that it does, and the number of successful email messages that travel across the Internet every day is mind-boggling.

If they only knew it was smoke and mirrors held together with baling wire and string ….

December 18th, 2006

Parsing Dates

I’ve been testing erlmail-0.0.2 and I came across some problems in my date parsing code. In reality the problems are in the util package that erlmail is dependent on, but date parsing is the problem nonetheless …

I decided that the randomness of RFC822 dates was worth another leex/yecc project, which I have always intended to do with many of the mail-based RFCs. So I started the RFC822 scanner/parser, which can now handle dates and dates only. It handles them much better than my custom code and once again proves that leex and yecc are better at parsing than I am.
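To illustrate the scanner/parser split for dates, here is a minimal sketch in Python rather than leex/yecc. It is not the ErlMail grammar: the token names, the `scan` and `parse` functions, and the handling of two-digit years are all assumptions made for the example, and only numeric time zones are accepted.

```python
import re
from datetime import datetime, timedelta, timezone

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Scanner rules, tried in order at each position (the role leex plays).
TOKEN_RE = re.compile(r"""
    (?P<zone>[+-]\d{4})
  | (?P<time>\d{1,2}:\d{2}(?::\d{2})?)
  | (?P<int>\d+)
  | (?P<word>[A-Za-z]+)
  | (?P<skip>[\s,]+)
""", re.VERBOSE)


def scan(text):
    """Turn a date string into a list of (token_type, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse(tokens):
    """Check the token sequence against one date rule (the role yecc plays)."""
    # Optional leading day name such as "Mon".
    if tokens and tokens[0][0] == "word" and tokens[0][1] not in MONTHS:
        tokens = tokens[1:]
    kinds = [kind for kind, _ in tokens]
    if kinds != ["int", "word", "int", "time", "zone"]:
        raise ValueError(f"not a date this sketch understands: {kinds}")
    day, month, year, clock, zone = (value for _, value in tokens)
    if month not in MONTHS:
        raise ValueError(f"unknown month: {month!r}")
    year = int(year)
    if year < 100:
        year += 1900  # assumption: two-digit years mean 19xx
    hour, minute, second = ([int(p) for p in clock.split(":")] + [0])[:3]
    sign = -1 if zone[0] == "-" else 1
    offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
    return datetime(year, MONTHS[month], int(day), hour, minute, second,
                    tzinfo=timezone(offset))


when = parse(scan("Mon, 18 Dec 2006 09:30:00 -0800"))
```

Anything the scanner or the single grammar rule does not recognize raises `ValueError`, which makes the odd real-world dates easy to collect and study.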

December 14th, 2006

ErlMail-0.0.2 Release

I just finished porting the SMTP client I had built into SpamFreeEmail.com into ErlMail. I had forgotten how much easier SMTP is than IMAP; I’m talking orders of magnitude easier.

In any case, I had a functional API and FSM that I was using in SFE, so I upgraded it to reflect my current knowledge level and we now have http://www.spamfreeemail.com/releases/erlmail/erlmail-0.0.2.tar.gz
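For readers who have not seen an SMTP client written as a state machine, the idea can be sketched as a table from (current state, reply code) to next state. This is Python rather than Erlang, and it is not the ErlMail FSM: the state names, the `TRANSITIONS` table and the `run` helper are hypothetical, and the sketch assumes one recipient and single-line server replies.

```python
# Each state names the step whose reply we are waiting for.
TRANSITIONS = {
    ("connect", "220"): "helo",        # greeting received, HELO sent
    ("helo", "250"): "mail_from",      # HELO accepted, MAIL FROM sent
    ("mail_from", "250"): "rcpt_to",   # sender accepted, RCPT TO sent
    ("rcpt_to", "250"): "data",        # recipient accepted, DATA sent
    ("data", "354"): "body",           # server ready, message body sent
    ("body", "250"): "quit",           # message queued, QUIT sent
    ("quit", "221"): "closed",         # server said goodbye
}


def next_state(state, reply_line):
    """Look up the next state from the 3-digit code at the start of a reply."""
    return TRANSITIONS.get((state, reply_line[:3]), "error")


def run(replies):
    """Feed a sequence of server replies through the machine."""
    state = "connect"
    for line in replies:
        state = next_state(state, line)
        if state == "error":
            break
    return state


final = run([
    "220 mail.example.org ready",
    "250 Hello",
    "250 OK",
    "250 OK",
    "354 End data with <CR><LF>.<CR><LF>",
    "250 Queued",
    "221 Bye",
])
```

Any reply that has no entry in the table drops the machine into `error`, which keeps the unexpected cases in one place.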

I also added dnsbl.erl, which is the core code for a DNS black-hole list checking module. Technically it works, it just has no consequences towards anything at this point.
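For anyone unfamiliar with how a DNS black-hole list lookup works, the usual convention is to reverse the octets of the IPv4 address, append the list's zone, and look that name up; an answer means the address is listed, and a failed lookup means it is not. The sketch below is Python rather than Erlang and is not the code in dnsbl.erl. The function names and the zone `dnsbl.example.org` are placeholders.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the list answers for this address (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # no such name: not listed (or the lookup failed)
    return True


# Placeholder zone; substitute the list you actually use.
query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
```

What to do with a positive answer (reject, tag, score) is a separate policy decision.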

ErlMail still requires my util package to work:

Please direct any comments, questions or patches to sjackson@simpleenigma.com and put ErlMail in the subject line. My email is aggressively filtered for spam, so if I don’t see something to catch my attention, I delete it.

December 13th, 2006

ErlMail-0.0.1 released

This is the initial release of ErlMail. The only functional part of ErlMail at the moment is the IMAP client. Most of the rest of the files in this distribution are meant as a framework for future improvements.

There is enough documentation in the imapc.erl file for most (Erlang) people to figure out how to use it.

I have not implemented the AUTHENTICATE or STARTTLS commands yet. There is some framework in place for them, but they are not functional at all. STARTTLS will crash the system at present :-)

I have used some of my own utilities in ErlMail; they are in a separate package I am also releasing today. I may decide to remove these dependencies in the future, but they make my life easier as of today :-)

So here they are:

Please direct any comments, questions or patches to sjackson@simpleenigma.com and put ErlMail in the subject line. My email is aggressively filtered for spam, so if I don’t see something to catch my attention, I delete it.

[UPDATE: I fixed a quick bug or two in the last code I was working on and re-uploaded the file at about 4PM PST]

December 11th, 2006

Back to FETCH … again

I’ve been working feverishly to develop leex and yecc grammar files to parse IMAP commands and responses and it’s going wonderfully.

I’ve managed to get through all of the commands that I had previously parsed by hand and then got to the FETCH command … again …

This time the parsing of the syntax has been a dream and I finally understand how to read the BNF in the back of the IMAP RFC.

I’ve also developed a new appreciation for the IMAP protocol. I used to think that the protocol was oddly obscure and formatted simply to make it difficult to parse, but now I understand that IMAP was developed around its BNF, which makes it a wonderful protocol to work with when you are using yacc/yecc to create a grammar file for it.

I recently had to go back into my leex file and add some code to pre-process the fetch command. Basically I found a regular expression and then added some double quotes in the appropriate places to make parts of the fetch command look more like strings than individual tokens. It works great and now I can get back to the yecc grammar file.
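As a rough illustration of that kind of pre-processing, the sketch below wraps `BODY[...]` section specifiers in double quotes so that a scanner sees each one as a single string. It is written in Python rather than leex, and both the regular expression and the choice of what to quote are guesses made for the example, not the actual rule in the ErlMail leex file.

```python
import re

# Assumed pattern: BODY or BODY.PEEK, a bracketed section, optional <partial>.
SECTION_RE = re.compile(
    r'BODY(?:\.PEEK)?\[[^\]]*\](?:<\d+(?:\.\d+)?>)?',
    re.IGNORECASE,
)


def quote_sections(line: str) -> str:
    """Wrap each section specifier in double quotes before scanning."""
    return SECTION_RE.sub(lambda m: '"' + m.group(0) + '"', line)


before = 'a1 FETCH 1:3 (FLAGS BODY.PEEK[HEADER.FIELDS (DATE FROM)])'
after = quote_sections(before)
```

The limitation is that a section specifier already inside a quoted string would get quoted again, so a real pre-processor needs to be more careful than this.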

After that pre-processing sidetrack the rest of translating the BNF into the yecc grammar file should be a breeze. Now I just wish I had the time to finish it tonight :-)

December 9th, 2006

Yecc, Leex and IMAP

I’ve gotten distracted again, but in a good way this time. I discovered what parsetools/yecc does recently and I figured it would be perfect to parse the commands and responses for an IMAP server and the IMAP client. I have proven to myself that this is true, but now I need to create a grammar file that fulfills the entire IMAP4 spec.

This is where leex comes in, as I need the data string that is either a command or a response to be tokenized into something that is reasonably easy to parse. I was working with erl_scan:string/1, which was working adequately well, but after buying the O’Reilly book on Lex and Yacc I’ve come to the conclusion that I need leex to make a scanner that produces the tokens for yecc to parse.
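The scanner's job can be sketched in a few lines: walk the command string and emit (type, text) tokens for a parser to consume. The example is Python rather than leex, and the token names and rules are assumptions chosen for illustration; they cover quoted strings, parentheses, numbers and atoms only, not the full IMAP4 grammar.

```python
import re

# Rules are tried in order at each position, much like a lex rule list.
TOKEN_RE = re.compile(r'''
    (?P<string>"(?:\\.|[^"\\])*")
  | (?P<lparen>\()
  | (?P<rparen>\))
  | (?P<number>\d+(?![^\s()]))
  | (?P<atom>[^\s()"]+)
  | (?P<space>\s+)
''', re.VERBOSE)


def scan(line):
    """Split an IMAP command or response line into (type, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise ValueError(f"cannot tokenize at {pos}: {line[pos:]!r}")
        if m.lastgroup != "space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


tokens = scan('a1 SELECT "My Folder"')
```

Because the quoted string comes out as one token, the space inside `"My Folder"` never reaches the parser as a separator.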

I got the concepts of leex figured out last night just before I went to bed and ended up having regular expression and parsing dreams all night long. Annoying, but I solved a few of the problems that I knew about last night while I was dreaming :-)

In any case this will slow down the release of ErlMail-0.0.1 for just a bit, but it will solve all of the bugs that I mentioned in an earlier post. Fair enough trade off in my book.

December 4th, 2006

IMAP Client testing

For the past few days I’ve been working on getting the IMAP client in ErlMail to work with a YAWS based web-mail application that I am working on. To that end there is some good news and some bad news ;-)

The good news is that it is working pretty well. It’s fast and the data is getting back to the web-mail client exactly how it is supposed to, or rather exactly how I told it to, which is part of the bad news.

I always find it amazing how different test data is from real life data :-) Which is where the somewhat bad news comes in. Bugs, lots of bugs. Things I never would have thought of while working with it originally.

It doesn’t seem to like spaces in file names. I understand that one, but haven’t figured out how I want to fix it yet.

I made an assumption that one email address was coming through at a time, and now when an email message goes to more than one person it breaks, which will just require another layer of parsing on the envelope.

And my personal favorite of my current test data set: it doesn’t like extra parentheses in the subject line. The parentheses are used to delimit items in the envelope and it breaks when it sees more than it should. I have an idea of how to fix this, but nothing solid yet.
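Both envelope problems come down to parsing nested parenthesized lists while leaving quoted strings alone. A minimal sketch of that idea, in Python rather than Erlang and not the ErlMail parser, is below. The `parse_list` function and the sample data are made up for illustration, the sample is a simplified list rather than a complete IMAP ENVELOPE, and IMAP literals (`{n}`) are not handled.

```python
def parse_list(text: str):
    """Parse a parenthesized IMAP-style list into nested Python lists.

    Parentheses inside double-quoted strings are kept as ordinary text.
    Malformed input (for example a missing ")") raises IndexError.
    """
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos] == " ":
            pos += 1

    def parse_item():
        nonlocal pos
        ch = text[pos]
        if ch == "(":                      # nested list
            pos += 1
            items = []
            skip_spaces()
            while text[pos] != ")":
                items.append(parse_item())
                skip_spaces()
            pos += 1
            return items
        if ch == '"':                      # quoted string, may contain ( )
            pos += 1
            chars = []
            while text[pos] != '"':
                if text[pos] == "\\":      # backslash escapes the next char
                    pos += 1
                chars.append(text[pos])
                pos += 1
            pos += 1
            return "".join(chars)
        start = pos                        # bare atom or NIL
        while pos < len(text) and text[pos] not in " ()":
            pos += 1
        atom = text[start:pos]
        return None if atom.upper() == "NIL" else atom

    skip_spaces()
    return parse_item()


# Made-up data: a subject with parentheses, then two addresses.
sample = ('("Re: lunch (Friday)" '
          '(("Ann" NIL "ann" "example.org") ("Bob" NIL "bob" "example.org")))')
subject, recipients = parse_list(sample)
```

With the address list coming back as a list of lists, handling a message sent to more than one person is just a loop over `recipients`.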

So the really great news is that I can display email messages from different mailboxes. I can get the subject, from, date, size and other general information, and I can display the details of a message, everything except the actual body so far.

After working out these bugs and being able to display the full message with any attachments, I’ll need to start working on the SMTP client so that I can send email as well as view what I have received.
