Tag Archives: captcha

An easy tweak with PHPBB forums to avoid automated registration of spambot users

Spammers have finally reached the PHPBB version 3 “Olympus” default CAPTCHA automated OCR task in their development schedule and have recently started registering bot users that pass the provided CAPTCHA confirmation code.

PHPBB3 CAPTCHA Sample

Luckily for them, the PHPBB3 default CAPTCHA code is ridiculously easy to OCR, so basically this was rather expected. This does not mean, however, that there is no way to effectively stop automated registrations without spending too much time on a forum engine update.

The automated registration spider sends an HTTP POST with the code it OCR’ed from the image, so we can leave the same code question in place and just ask the interactive user to type some extra information into the input field. For example, it is possible to instruct him/her to type an extra asterisk before the code, so that the following is expected to be typed in: *25K9RGS. The only important thing then is to put a proper note for the user, so that he/she is aware that this extra character needs to be entered as well. The PHP code update is relatively simple:

includes\ucp\ucp_register.php, near line 235:

<?php
////////////////////////////////
// NOTE: Require an extra asterisk in front of the CAPTCHA code to defeat automated CAPTCHA readers.
// If the leading asterisk is missing, the code is blanked so that the comparison below fails.
$confirm_code = $data['confirm_code'];
$confirm_code = (substr($confirm_code, 0, 1) === '*') ? substr($confirm_code, 1) : '';
if (strcasecmp($row['code'], $confirm_code) === 0)
// original:
//if (strcasecmp($row['code'], $data['confirm_code']) === 0)
////////////////////////////////
?>
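
To try the check outside the forum code, the same idea can be sketched as a small standalone function. The function name `confirm_code_matches` and its parameters are hypothetical and not part of PHPBB; the extra guard against an empty expected code is also my own addition, so treat this as an illustration rather than a drop-in patch.

```php
<?php
// Hypothetical helper (not part of PHPBB): follows the same idea as the patched comparison.
function confirm_code_matches(string $expected, string $submitted): bool
{
    // An empty expected code should never match anything (assumed safety guard).
    if ($expected === '') {
        return false;
    }
    // Reject input that does not start with the agreed extra character.
    if (substr($submitted, 0, 1) !== '*') {
        return false;
    }
    // Strip the asterisk and compare case-insensitively.
    return strcasecmp($expected, substr($submitted, 1)) === 0;
}

var_dump(confirm_code_matches('25K9RGS', '*25k9rgs')); // user who followed the note
var_dump(confirm_code_matches('25K9RGS', '25K9RGS'));  // bot that only OCR'ed the image
```

The first call should report a match and the second should not, which is exactly the difference between a user who read the instruction and a spider that only read the image.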

Then the default style (e.g. subsilver2) HTML template needs room for one extra character (9 instead of 8) in the input field, styles\subsilver2\template\ucp_register.html, line 92:

<td class="row2"><input class="post" type="text" name="confirm_code" size="9" maxlength="9" /></td>

And finally, the CONFIRM_CODE_EXPLAIN text in language\en\common.php needs to be updated to instruct the user to type the extra asterisk.
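
As a rough sketch, the updated language entry might look like the following. The message wording is only an illustration of mine, and the surrounding `$lang` / `array_merge` structure is assumed to match the way the language file already defines its strings; in practice only the existing CONFIRM_CODE_EXPLAIN value would be edited in place.

```php
<?php
// Sketch only: the message text below is illustrative, not the stock PHPBB wording.
if (empty($lang) || !is_array($lang))
{
    $lang = array();
}

$lang = array_merge($lang, array(
    // Tell the interactive user about the extra character the bots do not know about.
    'CONFIRM_CODE_EXPLAIN' => 'Type an asterisk (*) and then the code exactly as it appears in the image. Letters are case insensitive.',
));
```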

Automated CAPTCHA reader

I recently came across a discussion about automated readers of CAPTCHA images. One guy said they had sold an implementation of such a reader for $100K (in total; a certain initial payment followed by $5K/mo payments). While this might appear to be an exaggeration, I recalled an interview given by another OCR fellow who mentioned a similar offer, which he declined for reasons he chose not to specify.

I am afraid I am missing something here, as a CAPTCHA reader is in most cases not an issue as long as it is only required to decode one particular type of image. Image prefiltering followed by OCR will get through 95% of the protection implementations around; one needs only an experienced software engineer and a desire to break the protection, complemented by a modest budget. Moreover, the CAPTCHA code can be changed at any time, so the game is actually of a different nature: one side makes the code harder to decode in an automated fashion and the other tries to get even. I would rather say that the task of the former is the more difficult one (as long as we still expect a web user to be able to recognize the code).