Spam Improvements

Posted on December 12, 2010

By John Nunemaker

Spam sucks. Knowing this, we decided from the start to use an outside service for spam filtering (Defensio). We also did not want users to ever have to think about it: no signing up for an account and pasting in an API key. Instead, we have one key for all of Harmony and take care of all that for you.

When a comment is created, it is stored as an unapproved comment and a job is queued up to ping Defensio. Defensio then pings us back with a spaminess percentage and whether or not they think the comment should be allowed.
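To make that flow concrete, here is a rough sketch in Python. It is not Harmony's actual code. The `Comment` class, the `create_comment` and `handle_callback` functions, and the plain list standing in for a job queue are all made up for illustration, and no real Defensio API call is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comment:
    body: str
    approved: bool = False             # every new comment starts out unapproved
    spaminess: Optional[float] = None  # 0.0 to 1.0, unknown until the service answers
    allow: Optional[bool] = None       # the service's verdict, also unknown at first


def create_comment(body, job_queue):
    """Store the comment as unapproved and queue a job to ask the spam service."""
    comment = Comment(body=body)
    job_queue.append(comment)  # hypothetical stand-in for a background job
    return comment


def handle_callback(comment, spaminess, allow):
    """Record what the spam service pinged back for this comment."""
    comment.spaminess = spaminess
    comment.allow = allow
    # Assumption: an allowed comment is published, anything else stays in moderation.
    comment.approved = allow
    return comment
```

Until the callback arrives, a comment has no spaminess at all, which is a different state from having a spaminess of zero.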

There is a lot of spam out there and your moderation queue can fill up fast. The first thing we did was sort unapproved comments by spaminess ascending. This placed the comments that were least likely to be spam at the beginning, which meant you could check those and then just delete all of the rest. We quickly noticed that comments with a spaminess above 60% were always spam.
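The ordering itself is about as simple as it sounds. A minimal sketch, assuming each queued comment is a dict with a `spaminess` value between 0 and 1 (the dict shape and the `moderation_order` name are invented for this example):

```python
def moderation_order(comments):
    """Return unapproved comments with the least spammy first."""
    return sorted(comments, key=lambda comment: comment["spaminess"])


queue = [
    {"id": 1, "spaminess": 0.99},
    {"id": 2, "spaminess": 0.05},
    {"id": 3, "spaminess": 0.52},
]
ordered = moderation_order(queue)
```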

Rather than force you all to deal with those comments, once an hour we purge any unapproved comment that is over 60% likely to be spam. This means less crappy data in our system and less for you to think about.
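The purge rule could be sketched like this. It is only an illustration: the dict fields and the `purge_spam` name are assumptions, and the hourly scheduling (cron or whatever runs it) is left out.

```python
SPAM_THRESHOLD = 0.60  # the 60% cutoff, expressed as a fraction


def purge_spam(comments, threshold=SPAM_THRESHOLD):
    """Return the comments that survive the purge.

    Only unapproved comments scored above the threshold are dropped.
    Approved comments and comments without a score yet are kept.
    """
    survivors = []
    for comment in comments:
        score = comment["spaminess"]
        is_spam = not comment["approved"] and score is not None and score > threshold
        if not is_spam:
            survivors.append(comment)
    return survivors
```

Keeping unscored comments out of the purge is a choice made for this sketch; a comment with no score has not been judged yet, so there is nothing to compare against the cutoff.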

The next thing we noticed is that occasionally, for whatever reason, the wires get crossed between Defensio and Harmony, leaving comments that are very obviously spam in your moderation queue. Finding this quite annoying, I deployed code today that automatically re-queues any unapproved comment that is over an hour old and still has no spaminess.
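Picking out the comments that need another trip to the spam service might look roughly like the following. Again, this is a sketch under assumptions rather than the deployed code: the `created_at` field, the dict shape, and the `comments_to_requeue` name are hypothetical.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=1)


def comments_to_requeue(comments, now):
    """Find unapproved comments that are over an hour old and were never scored."""
    return [
        comment
        for comment in comments
        if not comment["approved"]
        and comment["spaminess"] is None
        and now - comment["created_at"] > STALE_AFTER
    ]


now = datetime(2010, 12, 12, 12, 0)
comments = [
    {"id": 1, "approved": False, "spaminess": None, "created_at": now - timedelta(hours=3)},
    {"id": 2, "approved": False, "spaminess": None, "created_at": now - timedelta(minutes=10)},
    {"id": 3, "approved": False, "spaminess": 0.52, "created_at": now - timedelta(hours=3)},
]
stale = comments_to_requeue(comments, now)
```

Here only the first comment qualifies: the second is too new to worry about and the third already has a score.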

This addition in combination with the automatic purging should bring the amount of moderation you need to do down to almost zero. For example, before deploying this last tweak, RailsTips had about 150 comments in the moderation queue. I just checked and it was down to 2. One had a spaminess of 52% and the other will be purged within the hour as it was at 99%.

Hope you enjoy the zapping of more spam as much as I already am!


  1. Ryan Heath

    I had a similar problem (I think everyone does) and went about it a similar way, only I was blocking comments at 75% instead of 60%. Here’s the post where I wrote about my solution:

    The reason I’m commenting is because of the comment Carl Mercier (Defensio founder) left—see the first comment. He suggested that you should never block comments based on the “spaminess” and to only go by their “allow” value.

    I’m still taking the same approach, though, and it seems to be working fine. I just wanted to bring it up since the importance of my blog and that of all the sites in Harmony are on two different levels :-)

  2. John Nunemaker

    @Ryan I did not explain well enough. We only go by allow when adding a comment to a post. The spaminess cutoff applies to the moderation queue, before a comment has been added to a post.
