Captcha Service With Python and Redis

Intro

While building my new personal website, I added a Contact page where people can send me a message by filling in a form.

This is a low-cost way to say something to the owner of the website. It can be anonymous: for instance, you can type "yo" in the message box and click Send, and seconds later I receive an email with the message and have no idea who you are.

This feature is convenient, but it also opens the door to anyone who wants to spam me. Someone may even be willing to spend time writing a load-testing script for me (sending hundreds of thousands of messages within seconds to see if the website can handle it).

To prevent this, I added a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) field, which makes it much harder to write a program that floods my inbox.

My homepage is a static site built with Vue, and it calls external APIs through JavaScript. The email-sending service, together with the captcha service, therefore runs as a standalone service on my server.

Design

   User           Web Page               Server            Redis
    |                |                      |                |
    |                |-----request key----->|                |
    |                |                 generate key          |    (1)
    |                |                      |---save key---->|
    |                |<-----return key------|                |
    |                |--request img by key->|                |
    |                |              generate code & img      |    (2)
    |                |                      |---save code--->|
    |                |<-----return img------|                |
    |---input code-->|                      |                |
    |                |----send code & key-->|                |    (3)
    |                |                      |------key------>|
    |                |                      |<-----code------|
    |                |                compare codes          |
    |                |             op(e.g. send email)       |
    |                |<----return result----|                |
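
The three numbered steps in the diagram can be simulated in a few lines without Flask or Redis. The sketch below is only an illustration: a plain dict stands in for Redis (so nothing expires), and the names `store`, `INIT`, `request_key`, `request_code` and `submit` are hypothetical, not taken from the service.

```python
import random
import string
import uuid

INIT = "__init__"  # hypothetical placeholder stored before a code exists
store = {}         # stand-in for Redis: key -> placeholder or captcha code


def request_key():
    """Step (1): issue a key and remember it with a placeholder value."""
    key = str(uuid.uuid4())
    store[key] = INIT
    return key


def request_code(key):
    """Step (2): swap the placeholder for a code (the real service renders it as an image)."""
    if store.get(key) != INIT:
        return None  # unknown key, or a code was already issued for it
    code = "".join(random.choices(string.ascii_uppercase, k=4))
    store[key] = code
    return code


def submit(key, user_input):
    """Step (3): compare once, then forget the key whatever the outcome."""
    code = store.pop(key, None)
    return code is not None and code != INIT and code.lower() == user_input.lower()


key = request_key()
code = request_code(key)   # a human would read this from the image
print(submit(key, code))   # True: correct code on the first attempt
print(submit(key, code))   # False: the key was consumed by the first attempt
```

The second call fails even with the right code, which is the one-time-use property the verification step relies on.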

(1) Cache Key

The first request made by the client asks for a key. The key serves both as the Redis key and as the argument for requesting the Captcha image in the following request.

The client has to send two separate requests because, in this design, a single response carries either the key (as JSON) or the raw image data, not both. The API is designed to be stateless; otherwise, a server-side session could take the place of an explicit identifier like the key in this example.

In the following code snippet, we also remove the Captcha data from Redis for the key passed in the request, if there is one. This lets the user refresh the captcha image while the server clears the previous captcha. In addition, each key expires automatically (it is deleted from Redis) if no request clears it first.

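@app.route(BASE_URL + "/captcha_pre")
def captcha_pre():
    # on refresh, the client passes its previous key so the old captcha can be removed
    ckey = request.args.get('ckey')
    if ckey:
        try:
            # only delete well-formed UUID keys, never arbitrary Redis keys
            r.delete(str(uuid.UUID(ckey)))
        except ValueError:
            pass

    # issue a fresh key that holds a placeholder until the image is requested
    key = str(uuid.uuid4())
    r.set(key, CAPTCHA_INIT_VALUE, ex=CAPTCHA_INIT_EXPIRE)
    return json.dumps({'ckey': key})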

(2) Captcha Key-Image Pair

The captcha module is used here to generate the Captcha image.

import json
import random
import uuid

from captcha.image import ImageCaptcha
from flask import request, send_file

image_captcha = ImageCaptcha()

@app.route(BASE_URL + "/captcha")
def captcha_image():
    ckey = request.args.get('ckey')

    # verify that the key is a well-formed UUID (a missing key raises TypeError)
    try:
        uuid.UUID(ckey, version=4)
    except (TypeError, ValueError):
        return json.dumps({'result': 1, 'message': 'Invalid key'})

    # verify that the key was issued by captcha_pre and has no code yet
    captcha_cache = r.get(ckey)
    if not (captcha_cache and captcha_cache.decode("utf-8") == CAPTCHA_INIT_VALUE):
        return json.dumps({'result': 1, 'message': 'Invalid key'})

    # generate the code and render it; generate() returns PNG data by default
    value = ''.join(random.choices(CAPTCHA_CHARSET, k=CAPTCHA_LENGTH))
    data = image_captcha.generate(value)

    # cache the code under the same key, replacing the placeholder
    r.set(ckey, value, ex=CAPTCHA_EXPIRE)
    return send_file(data, mimetype="image/png")
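
Since the code is the only secret protecting the form, it may be worth drawing it from the `secrets` module rather than `random`, whose generator is not intended for security purposes. A minimal sketch, assuming a charset and length that this post does not show (`CAPTCHA_CHARSET` and `CAPTCHA_LENGTH` below are illustrative values, and `new_captcha_value` is a hypothetical helper):

```python
import secrets
import string

# Assumed values: the post does not show the real constants.
# Look-alike characters (0/O, 1/I/L) are left out to keep the image readable.
CAPTCHA_CHARSET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1IL"
)
CAPTCHA_LENGTH = 4


def new_captcha_value(length=CAPTCHA_LENGTH, charset=CAPTCHA_CHARSET):
    """Return a random code drawn from the operating system's secure random source."""
    return "".join(secrets.choice(charset) for _ in range(length))


print(new_captcha_value())  # prints a random 4-character code
```

The helper could replace the `random.choices` line in the handler without touching the rest of the flow.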

(3) Verification

Nothing magical here: just compare the cached code with the user input, ignoring case. Note that each captcha is for one-time use only; it is deleted whether the user input is correct or not.

def verify_captcha(captcha_key, captcha_input):
    # a request without a key cannot match anything
    if not captcha_key:
        return False
    captcha_cache = r.get(captcha_key)
    result = False
    if captcha_cache:
        captcha_cache_str = captcha_cache.decode("utf-8")
        # the init placeholder is not a code, and a missing input never matches
        if (captcha_input is not None
                and captcha_cache_str != CAPTCHA_INIT_VALUE
                and captcha_cache_str.lower() == captcha_input.lower()):
            result = True
        # clean captcha cache regardless of the result
        r.delete(captcha_key)
    return result
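
One caveat with the snippet above is that `r.get` and `r.delete` are two separate round trips, so two simultaneous submissions carrying the same key could both read the code before either deletes it. On Redis 6.2 or later, the `GETDEL` command (exposed as `getdel` in recent redis-py releases) reads and removes the key in one step. The sketch below assumes such a client is passed in; `verify_captcha_once`, the `CAPTCHA_INIT_VALUE` value and the in-memory `InMemoryClient` stand-in used for the demonstration are all hypothetical.

```python
CAPTCHA_INIT_VALUE = "__init__"  # assumed placeholder; the post does not show the real value


def verify_captcha_once(client, captcha_key, captcha_input):
    """Fetch and delete the cached code in one step, then compare it."""
    if not captcha_key:
        return False
    cached = client.getdel(captcha_key)  # None if the key is missing or expired
    if cached is None or not captcha_input:
        return False
    expected = cached.decode("utf-8")
    if expected == CAPTCHA_INIT_VALUE:
        return False  # key was issued but no image was ever requested
    return expected.lower() == captcha_input.lower()


class InMemoryClient:
    """Stand-in exposing only getdel, so the example runs without a Redis server."""

    def __init__(self, data):
        self.data = dict(data)

    def getdel(self, name):
        return self.data.pop(name, None)


client = InMemoryClient({"k1": b"AbCd"})
print(verify_captcha_once(client, "k1", "abcd"))  # True
print(verify_captcha_once(client, "k1", "abcd"))  # False: already consumed
```

Because the read and the delete happen together, only one of two racing requests can ever see the code.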

Conclusion

This is one possible way to implement a Captcha service, and it protects my inbox at least to a certain extent. There is still room for improvement, for example limiting requests per IP address so that a single client cannot create a huge amount of Captcha data in the Redis cache.
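
The IP limit mentioned above could be a fixed-window counter: count the keys each address requests per window and refuse once a budget is exceeded. With Redis this would typically map to `INCR` plus `EXPIRE` on a per-IP key; the sketch below keeps the counters in a dict so that it runs on its own. The class name, limit and window are assumptions chosen for illustration.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit=10, window=60):
        self.limit = limit
        self.window = window
        self.counters = {}  # ip -> (window_start, count)

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        start, count = self.counters.get(ip, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the previous window is over, start a new one
        if count >= self.limit:
            return False
        self.counters[ip] = (start, count + 1)
        return True


limiter = FixedWindowLimiter(limit=3, window=60)
results = [limiter.allow("203.0.113.7", now=t) for t in (0, 1, 2, 3, 61)]
print(results)  # [True, True, True, False, True]
```

In the key endpoint, a check like this would run before a new key is created, with the address taken from Flask's `request.remote_addr`.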

The full code example will be available on my GitHub soon.