Akismet is a great tool for fighting spam. However, due to licensing and call limits, I was looking to decrease the number of calls I make to Akismet. In my search for alternatives I stumbled upon Stop Forum Spam. For a free project, I have been impressed with their accuracy. While primarily focused on, well, forums, they have an API.
Other alternatives exist, but for now I have been happy with the combination of Akismet and Stop Forum Spam, together with judicious word and IP blocks when needed.
Spam Strategy
My exact spam strategy depends somewhat on the content type, but I often use a PHP library for Akismet in combination with a word block list and/or an IP block list.
For my most recent project the strategy looked like this:
- First, check the input against an IP block list. While IP blocks are a cat-and-mouse game, I use them to block known spammers: any IP that has made more than five known spam attempts in the last fourteen days gets blocked for two weeks.
- Next, check the content against a known word block list. Again, this is the low-hanging fruit: crypto and link spammers often use the same words or phrases in their spam.
- Once content passes both of those, the data is checked against the Stop Forum Spam database.
- Finally, if that passes, then check the content with Akismet.
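The ordering above can be sketched as a chain of checks that stops at the first one to flag the input, so the rate-limited services are only called when the cheap local checks pass. This is a minimal sketch, not the code from my project: the function names `firstFailingCheck` and `containsBlockedPhrase`, the sample block lists, and the submission keys `ip` and `content` are all hypothetical, and the two remote checks are stubbed out so the example runs offline.

```php
<?php

/**
 * Run spam checks in order and stop at the first one that flags the input.
 *
 * @param array $checks     Check name => callable(array): bool, where true means "spam"
 * @param array $submission Illustrative keys: 'ip' and 'content'
 * @return string|null Name of the check that flagged the input, or null if all passed
 */
function firstFailingCheck(array $checks, array $submission): ?string
{
    foreach ($checks as $name => $check) {
        if ($check($submission)) {
            return $name; // Short-circuit: later checks are never called
        }
    }
    return null;
}

/** Case-insensitive match of content against a word/phrase block list. */
function containsBlockedPhrase(string $content, array $blockedPhrases): bool
{
    foreach ($blockedPhrases as $phrase) {
        if ($phrase !== '' && stripos($content, $phrase) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical sample data (documentation-range IPs, made-up phrases).
$blockedIps = ['203.0.113.7'];
$blockedPhrases = ['free crypto', 'buy followers'];

// Cheapest checks first; the remote services come last.
$checks = [
    'ip_block'        => fn(array $s): bool => in_array($s['ip'], $blockedIps, true),
    'word_block'      => fn(array $s): bool => containsBlockedPhrase($s['content'], $blockedPhrases),
    'stop_forum_spam' => fn(array $s): bool => false, // stub: the Stop Forum Spam lookup goes here
    'akismet'         => fn(array $s): bool => false, // stub: the Akismet call goes here
];

$result = firstFailingCheck($checks, [
    'ip'      => '198.51.100.4',
    'content' => 'Get FREE CRYPTO now',
]);
```

Here `$result` is `'word_block'`, and neither stub is invoked for that submission. Returning the name of the failing check also makes it easy to log which layer is doing the work.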
I found this catches about 98% of spam. All entries require manual approval, but catching spam means less moderation.
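The IP rule from the first step (more than five known spam attempts in fourteen days, then a two-week block) could be expressed roughly like this. It is one possible reading of the rule, assuming the block starts at the attempt that pushes the count over the limit; the function names and the idea of passing in a list of attempt timestamps are hypothetical, and a real setup would more likely query a database table.

```php
<?php

/**
 * Find when the block for an IP ends, if it was ever triggered.
 *
 * @param int[] $attemptTimestamps Unix timestamps of known spam attempts from one IP
 * @param int   $maxAttempts       More than this many attempts in the window triggers a block
 * @param int   $windowSeconds     Look-back window (default: fourteen days)
 * @param int   $blockSeconds      Block length (default: two weeks)
 * @return int|null Unix timestamp at which the latest block expires, or null if never blocked
 */
function blockedUntil(
    array $attemptTimestamps,
    int $maxAttempts = 5,
    int $windowSeconds = 1209600,
    int $blockSeconds = 1209600
): ?int {
    sort($attemptTimestamps); // ascending; also re-indexes keys to 0..n-1
    $blockedUntil = null;
    $start = 0; // index of the oldest attempt still inside the sliding window

    foreach ($attemptTimestamps as $i => $ts) {
        // Drop attempts that are older than the look-back window relative to $ts.
        while ($attemptTimestamps[$start] <= $ts - $windowSeconds) {
            $start++;
        }
        // Count of attempts inside the window, including the current one.
        if ($i - $start + 1 > $maxAttempts) {
            $blockedUntil = $ts + $blockSeconds;
        }
    }

    return $blockedUntil;
}

/** True if the IP is currently inside a block period. */
function isIpBlocked(array $attemptTimestamps, int $now): bool
{
    $until = blockedUntil($attemptTimestamps);
    return $until !== null && $now < $until;
}

// Six attempts, one per day over the last six days (hypothetical data).
$now = 1700000000;
$day = 86400;
$recentAttempts = [];
for ($k = 1; $k <= 6; $k++) {
    $recentAttempts[] = $now - $k * $day;
}

$blocked = isIpBlocked($recentAttempts, $now);
```

With six attempts in the window, `$blocked` is `true`; with only five of them it would be `false`. Passing `$now` in as a parameter keeps the rule easy to test.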
PHP Example code
Here is an example using the Stop Forum Spam API with Symfony HTTP Client. A similar method should be possible with any HTTP request package. For Akismet I’m using the Omines Akismet package, as I can use the same request package with it.
One quirk of the API is that blocklist entries have the frequency field set to 255 and the lastseen date set to the current time (UTC). I am being somewhat conservative here and only treat an entry as a blocklist hit if lastseen is within the last hour.
// Requires at the top of the class file: use Symfony\Component\HttpClient\HttpClient;

/**
 * Check IP / email / username against the Stop Forum Spam database
 *
 * @param string|null $ip IP address to check
 * @param string|null $email Email to check
 * @param string|null $username Username to check
 * @param int $threshold Confidence threshold (0 - 100, default: 75)
 * @param bool $checkTor Include Tor exit nodes in spam detection (default: true)
 * @param bool $checkBlacklist Include blacklisted entries (default: true)
 * @return bool|string false if clean, or a string with the reason if spam is detected ('sfs_confidence' | 'sfs_tor' | 'sfs_blacklist')
 */
private static function stopForumSpamLookup($ip = null, $email = null, $username = null, $threshold = 75, $checkTor = true, $checkBlacklist = true)
{
    $client = HttpClient::create();
    $params = ['json' => '1', 'confidence' => '1'];
    if ($checkTor) {
        $params['badtorexit'] = '1';
    }
    if ($checkBlacklist) {
        $params['nobaduser'] = '0';
    }
    if ($ip) {
        $params['ip'] = $ip;
    }
    if ($email) {
        $params['email'] = $email;
    }
    if ($username) {
        $params['username'] = $username;
    }
    try {
        $response = $client->request('GET', 'https://api.stopforumspam.org/api', [
            'query' => $params
        ]);
        $data = json_decode($response->getContent(), true);
        // Fail open: an unsuccessful or unparsable response is treated as clean
        if (!isset($data['success']) || $data['success'] != 1) {
            return false;
        }
        foreach (['ip', 'email', 'username'] as $field) {
            if (isset($data[$field]) && $data[$field]['appears'] == 1) {
                // Blocklist entries: frequency is 255 and lastseen is the current time
                if ($checkBlacklist && isset($data[$field]['frequency']) && $data[$field]['frequency'] == 255 && isset($data[$field]['lastseen'])) {
                    // lastseen is in UTC, so parse it as UTC regardless of the server timezone
                    $lastSeen = strtotime($data[$field]['lastseen'] . ' UTC');
                    $hourAgo = time() - 3600;
                    if ($lastSeen !== false && $lastSeen >= $hourAgo) {
                        return 'sfs_blacklist';
                    }
                }
                if ($checkTor && $field === 'ip' && isset($data[$field]['torexit']) && $data[$field]['torexit'] == 1) {
                    return 'sfs_tor';
                }
                if (isset($data[$field]['confidence']) && $data[$field]['confidence'] >= $threshold) {
                    return 'sfs_confidence';
                }
            }
        }
        return false;
    } catch (\Exception $e) {
        // Fail open on network or HTTP errors so Akismet can still check the content
        return false;
    }
}
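The blocklist quirk can also be pulled out into a small helper that takes the current time as a parameter, which makes it testable without calling the API. This is a rough sketch: the helper name `isRecentBlocklistHit` is hypothetical, and the sample entry assumes lastseen arrives as a `Y-m-d H:i:s` string, which is what the lookup above relies on when it passes the value to `strtotime()`.

```php
<?php

/**
 * Decide whether one field of a decoded Stop Forum Spam response is a recent blocklist hit.
 *
 * @param array $entry         One decoded entry, e.g. $data['ip'] from the JSON response
 * @param int   $now           Current Unix timestamp
 * @param int   $maxAgeSeconds How recent lastseen must be (default: one hour)
 */
function isRecentBlocklistHit(array $entry, int $now, int $maxAgeSeconds = 3600): bool
{
    // Blocklist entries are marked with a frequency of 255.
    if (($entry['frequency'] ?? null) != 255 || !isset($entry['lastseen'])) {
        return false;
    }

    // Append an explicit timezone so the server's default timezone does not skew the result.
    $lastSeen = strtotime($entry['lastseen'] . ' UTC');
    if ($lastSeen === false) {
        return false; // Unparsable date: do not treat it as a hit
    }

    return $lastSeen >= $now - $maxAgeSeconds;
}

// Illustrative entry (made-up values) shaped like the fields the lookup reads.
$entry = [
    'appears'   => 1,
    'frequency' => 255,
    'lastseen'  => '2024-05-01 12:00:00',
];

// Fixed "current time" thirty minutes after lastseen, so the result is deterministic.
$now = strtotime('2024-05-01 12:30:00 UTC');
$isHit = isRecentBlocklistHit($entry, $now);
```

With the sample entry, `$isHit` is `true` because lastseen is thirty minutes before `$now`. An entry with the same frequency but a lastseen two hours earlier would return `false`.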
