For the past month I have been working on the Netflix Prize (http://www.netflixprize.com), trying to develop a collaborative filter written entirely in Erlang. It’s been a great amount of fun, and boy have I learned a LOT!
For instance, Mnesia crashes hard when you try to put 2GB of data into 1.5GB of RAM. More importantly, I’ve been trying different approaches to storing large amounts of data in Mnesia, using data structures that are easy to work with in Erlang. Coming from an SQL background, I was designing the data structures completely wrong for use in Mnesia.
I’ve also been playing with RAM-only (ram_copies) and disc-only (disc_only_copies) tables in Mnesia, and with the idea of keeping a RAM-only copy of a table on one server and disc-only copies of the same table on other servers. This approach finally gave me the room in RAM that I needed and the fault tolerance I was looking for.
I am actually using three Mnesia servers, one with RAM-only tables and two with disc-only tables, and I am doing this with three different tables. This way any two of the servers can fail and the system keeps running, albeit slowly, and when all three servers are up each table is in RAM on at least one of them.
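For anyone who wants to try the same layout, here is a minimal sketch of how one such table could be declared. The module name, the rating record and its fields, and the three node arguments are made up for illustration and are not my actual schema. It also assumes that mnesia:create_schema/1 has already been run for all three nodes and that Mnesia is started on each of them, since disc-only copies need a schema on disc.

```erlang
-module(netflix_tables).
-export([create_ratings_table/3]).

%% Hypothetical record, for illustration only.
%% The first field (key) is the Mnesia primary key.
-record(rating, {key, score}).

%% RamNode keeps the table in memory (ram_copies).
%% DiscNode1 and DiscNode2 keep it on disc only (disc_only_copies).
%% Returns {atomic, ok} on success or {aborted, Reason} on failure.
create_ratings_table(RamNode, DiscNode1, DiscNode2) ->
    mnesia:create_table(rating,
        [{attributes, record_info(fields, rating)},
         {ram_copies, [RamNode]},
         {disc_only_copies, [DiscNode1, DiscNode2]}]).
```

Calling it from the shell would look something like netflix_tables:create_ratings_table('a@host1', 'b@host2', 'c@host3'), where the node names are again placeholders. The same call would be repeated for each of the three tables.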
The experience I have gained so far has changed my ideas on how to design Erlang systems, and I will be redesigning http://www.spamfreeemail.com with this new knowledge in mind.
I will also be breaking SFE into small segments and using them as libraries, which will be the birth of ErlMail, the open source Erlang email server I am starting to write as of today. I’m going to break out the SMTP client and server, a POP3 client and server, and an IMAP4 client and server into ErlMail, and then use those libraries as part of SFE. Hopefully that will come into its own over the next few months.
I’ve also been working with YAWS and designing my new websites with Erlang-only code in mind. (Okay, there is some SQL…) Collaborative filtering was going to be a very large part of my new website designs, so I’ve learned a whole lot about how I want to design those systems as well.
While I am not giving up on the Netflix Prize, I am quickly coming to the conclusion that I need to work on my other projects for a while and get them up to speed with my new ideas. Once implemented, they may give me more insight into how to win the prize at http://www.netflixprize.com.