Making valid CAPTCHAs

The WP-reCAPTCHA plugin is very helpful: it stops spam comments and, through reCAPTCHA, helps digitize old books. It offers an “XHTML compliance” option, but at the price of requiring JavaScript from users. I think I can get both: valid XHTML and a form that works without JavaScript.

In wp-recaptcha.php we find these lines:

		if ($recaptcha_opt['re_xhtml']) {
		$comment_string = <<<COMMENT_FORM
				<div id="recaptcha-submit-btn-area">
				<script type='text/javascript'>
				var sub = document.getElementById('submit');
				sub.parentNode.removeChild(sub);
				document.getElementById('recaptcha-submit-btn-area').appendChild (sub);
				document.getElementById('submit').tabIndex = 6;
				if ( typeof _recaptcha_wordpress_savedcomment != 'undefined') {
						document.getElementById('comment').value = _recaptcha_wordpress_savedcomment;
				}
				document.getElementById('recaptcha_table').style.direction = 'ltr';
				</script>
COMMENT_FORM;
		}
 
		else {
		$comment_string = <<<COMMENT_FORM
				<div id="recaptcha-submit-btn-area"> 
				<script type='text/javascript'>
				var sub = document.getElementById('submit');
				sub.parentNode.removeChild(sub);
				document.getElementById('recaptcha-submit-btn-area').appendChild (sub);
				document.getElementById('submit').tabIndex = 6;
				if ( typeof _recaptcha_wordpress_savedcomment != 'undefined') {
						document.getElementById('comment').value = _recaptcha_wordpress_savedcomment;
				}
				document.getElementById('recaptcha_table').style.direction = 'ltr';
				</script>
				<noscript>
				 <style type='text/css'>#submit {display:none;}</style>
				 <input name="submit" type="submit" id="submit-alt" tabindex="6" value="Submit Comment"/> 
				</noscript>
COMMENT_FORM;
		}

Setting aside the fact that good code never has large blocks of duplicate lines, we can move entirely to the first block, eliminating the need for the second one. The second block – the one that is not XHTML-compliant – adds only a noscript block. What does that block do? It hides one submit button (#submit) and adds another one (#submit-alt). The hidden button and the new one are otherwise exactly the same, so I’m not sure what the developer was going for there. We end up with this:

		$comment_string = <<<COMMENT_FORM
				<div id="recaptcha-submit-btn-area">
				<script type='text/javascript'>
				var sub = document.getElementById('submit');
				sub.parentNode.removeChild(sub);
				document.getElementById('recaptcha-submit-btn-area').appendChild (sub);
				document.getElementById('submit').tabIndex = 6;
				if ( typeof _recaptcha_wordpress_savedcomment != 'undefined') {
						document.getElementById('comment').value = _recaptcha_wordpress_savedcomment;
				}
				document.getElementById('recaptcha_table').style.direction = 'ltr';
				</script>
COMMENT_FORM;

The next change is in recaptchalib.php. If you’re using XHTML Transitional, you don’t need to do this; we’re going to get rid of an iframe element that isn’t valid in XHTML Strict, but is fine in Transitional. Replace this code:

  if ($xhtml_compliant) {
     return '<script type="text/javascript" src="'. $server . '/challenge?k=' . $pubkey . $errorpart . '"></script>
 
<noscript>
<div>
<textarea name="recaptcha_challenge_field" rows="3" cols="40"></textarea>
<input type="hidden" name="recaptcha_response_field" value="manual_challenge"/>
</div>
</noscript>';
  }
 
  else {
     return  '<script type="text/javascript" src="'. $server . '/challenge?k=' . $pubkey . $errorpart . '"></script>
 
<noscript>
<iframe src="'. $server . '/noscript?k=' . $pubkey . $errorpart . '" height="300" width="500" frameborder="0"></iframe><br />
<textarea name="recaptcha_challenge_field" rows="3" cols="40"></textarea>
<input type="hidden" name="recaptcha_response_field" value="manual_challenge"/>
</noscript>';
  }
  return $output;

We’re going to switch to the modern object element, leaving the iframe in place for old versions of IE that don’t handle object well. Some versions of IE also need a proprietary classid value, which breaks other browsers. Since all of this is IE-only, we can hide it in a conditional comment, which results in two different blocks of code. The developer also seems to have forgotten about HEREDOC; I’ve put it back in to make things much simpler:

  return <<<HTML
<script type="text/javascript" src="$server/challenge?k=$pubkey$errorpart"></script>
 
<noscript>
<div>
<!--[if IE]>
<object data="$server/noscript?k=$pubkey$errorpart" height="300" width="500" type="text/html" classid="clsid:25336920-03F9-11CF-8FD0-00AA00686F13">
<iframe src="$server/noscript?k=$pubkey$errorpart" height="300" width="500" frameborder="0"></iframe>
</object>
<![endif]-->
<!--[if !IE]> <-->
<object data="$server/noscript?k=$pubkey$errorpart" height="300" width="500" type="text/html">
<p>There was an error loading the CAPTCHA.</p>
</object>
<!--> <![endif]-->
<br />
<textarea name="recaptcha_challenge_field" rows="3" cols="40"></textarea>
<input type="hidden" name="recaptcha_response_field" value="manual_challenge"/>
</div>
</noscript>
HTML;
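
A quick sanity check on markup like this is to see whether it at least parses as well-formed XML. The following is only a rough sketch, not part of the plugin: the function names sketch_noscript_markup and sketch_is_well_formed, the example.invalid server, and the KEY value are all made up for illustration. Well-formedness is a weaker test than validating against the XHTML Strict DTD, so a pass here proves only that tags are closed and entities are escaped.

```php
<?php
// Hypothetical helper (not in recaptchalib.php): builds the noscript
// fallback with a heredoc, mirroring the listing above.
function sketch_noscript_markup($server, $pubkey, $errorpart = '')
{
    return <<<HTML
<noscript>
<div>
<!--[if IE]>
<object data="$server/noscript?k=$pubkey$errorpart" height="300" width="500" type="text/html" classid="clsid:25336920-03F9-11CF-8FD0-00AA00686F13">
<iframe src="$server/noscript?k=$pubkey$errorpart" height="300" width="500" frameborder="0"></iframe>
</object>
<![endif]-->
<!--[if !IE]> <-->
<object data="$server/noscript?k=$pubkey$errorpart" height="300" width="500" type="text/html">
<p>There was an error loading the CAPTCHA.</p>
</object>
<!--> <![endif]-->
<br />
<textarea name="recaptcha_challenge_field" rows="3" cols="40"></textarea>
<input type="hidden" name="recaptcha_response_field" value="manual_challenge"/>
</div>
</noscript>
HTML;
}

// Hypothetical helper: returns true when the fragment parses as
// well-formed XML once wrapped in a single root element.
function sketch_is_well_formed($fragment)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->loadXML('<root>' . $fragment . '</root>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}

// Placeholder server and key; the error part uses &amp; so the URL is XML-safe.
$markup = sketch_noscript_markup('https://example.invalid', 'KEY', '&amp;error=x');
var_dump(sketch_is_well_formed($markup));
```

To an XML parser the IE-only block is just a comment, so this check only exercises the non-IE branch; the conditional-comment markup still has to be tested in a real browser.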

Note that I have not tested this extensively, and it is guaranteed not to work in browsers that support neither JavaScript nor the object element. But those people have bigger problems to worry about than commenting on your blog.
WordPress can mangle this code when displaying it by inserting a space after a <; if you copy the code, check for stray spaces like that and remove them.
Thanks to this page for confirming for me that the classid was breaking Firefox.