
TIMO ZIMMERMANN

balancing software engineering & infosec

Deep dive: Django Q and SQS

posted on Wednesday 23rd of September 2020

When working on a Django application the de facto recommendation for a task queue is Celery. I believe this is a good recommendation. It is kind of like buying IBM – “no one was ever fired for buying IBM”. I started using Django Q more recently and it is doing a great job. One system I built using it processes roughly 400k tasks per day. Surely not the largest system and surely not the most impressive number, but decent enough to say that Django Q is a solid choice.

But as with many smaller projects, there are a few gotchas you can run into. This becomes painfully obvious when setting up an app using SQS. Let me walk you through the steps I took to make Django Q play nicely with our AWS setup at Grove Collaborative.

Redis is great, but…

First of all, you have to configure Django Q to use SQS. You do this by adding a Q_CLUSTER dictionary to your settings.py with an sqs key.

If you are familiar with AWS and boto3 you might know that you can either provide the AWS region when initialising a new connection or you can have a standard config stored outside your project which is automatically used. Or you avoid access keys and the hassle of managing them and start using IAM roles.

Django Q's documentation actually explains this option and shows all three keys of the SQS configuration as optional. So a valid configuration would be

Q_CLUSTER = {
    'name': 'SQSExample',
    'workers': 4,
    'timeout': 60,
    'retry': 90,
    'queue_limit': 100,
    'bulk': 5,
    'sqs': {}
}

But when you try to deploy your app this way it might start raising an exception explaining that it cannot connect to Redis. If we look at the get_broker function we see that Redis is the default. Conf.SQS is simply the value of Q_CLUSTER["sqs"]. Can you spot what is going on here?

settings = {
    "sqs": {}
}

x = settings.get("sqs", None)

if x:
    print("let's use SQS!")
else:
    print("okay, Redis it is")

This code snippet will also tell you that we are using Redis: empty dictionaries evaluate to False in a boolean context. According to the documentation the configuration is valid, but it will not work.
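Until that check changes, a pragmatic workaround is to make the dictionary truthy by setting at least one key explicitly, for example the region. The aws_region key name is taken from Django Q's SQS documentation; treat it as an assumption and verify it against the version you run:

```python
Q_CLUSTER = {
    'name': 'SQSExample',
    'workers': 4,
    'timeout': 60,
    'retry': 90,
    'queue_limit': 100,
    'bulk': 5,
    # A non-empty dict is truthy, so get_broker now picks the SQS broker.
    'sqs': {
        'aws_region': 'us-east-1',
    },
}
```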

Computer says no

Okay, we got this. We are finally loading the SQS broker. But we might run into an exception telling us that we cannot create a queue.

Usually when you provision a new AWS environment you set up the instances, database, queues and all the other services you want AWS to host for you. While setting them up you most likely create an IAM role or access key with exactly the permissions your service needs to use them.

So why do we get permission errors when starting our service?

Django Q calls boto3's create_queue method when initialising the broker. If the queue already exists and you do not pass conflicting attributes, it simply returns the URL of the queue. All good, right? Not if your IAM role or access key does not have create permissions for SQS, which it should not need.

An easy fix is creating your own broker and overriding get_queue.

# coding: utf-8
from django_q.brokers.aws_sqs import Sqs


class GetSqsBroker(Sqs):
    def get_queue(self):
        # Look up the existing queue by name instead of creating it,
        # so the role only needs read access to the queue.
        self.sqs = self.connection.resource("sqs")
        return self.sqs.get_queue_by_name(QueueName=self.list_key)

The only thing left to do is to specify the broker in your settings. Assuming your class lives in broker.py, you replace the sqs key with broker_class and a dotted path to your custom broker so it can be imported.

Q_CLUSTER = {
    'name': 'SQSExample',
    'workers': 4,
    'timeout': 60,
    'retry': 90,
    'queue_limit': 100,
    'bulk': 5,
    'broker_class': 'broker.GetSqsBroker',
}

This is surely not as generic an implementation as the one Django Q ships with, and one could argue that it is less comfortable to use. I would counter that libraries should not randomly create resources or expect permissions that are not absolutely necessary to operate.

What was my queue name again?

While working through the code you will often see Conf.PREFIX and list_key mentioned.

Looking at the configuration class we know Conf.PREFIX is the name we specify in the configuration dictionary. __init__ sets list_key to Conf.PREFIX if the keyword argument is not passed in when initialising a new instance of the broker; that is where list_key ends up on the instance. When we finally get the queue, the list key is used as the queue name.
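Stripped down to the essentials, that flow looks roughly like this. It is a simplification for illustration, not Django Q's actual implementation:

```python
PREFIX = 'SQSExample'  # stands in for Conf.PREFIX, i.e. the cluster 'name'


class Broker:
    def __init__(self, list_key=PREFIX):
        # No explicit queue name passed in? Fall back to the cluster name.
        self.list_key = list_key


# The cluster name doubles as the queue name unless you override it.
print(Broker().list_key)
print(Broker(list_key='custom-queue').list_key)
```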

I do not think this is a surprise, but when you implement a custom broker or try to debug unexpected error messages you should know how data and configuration flow through the system. When you design a library you will often have to rename variables going from a human-readable config file to a generic implementation. This is a nice, simple-to-follow example I will surely use one day in a training.

Django Q is ready for production

As I mentioned earlier I have successfully used Django Q at a decent scale and I do not expect it to fall over at larger scales. You might want to set your save_limit to -1. Overall it is easy to use, straightforward to debug and small enough to not be a big liability if you ever have to take over maintenance.
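For reference, that is a one-line addition to the cluster configuration. My reading of the Django Q docs is that -1 stops successful tasks from being saved at all; double-check the semantics against the version you run:

```python
Q_CLUSTER = {
    'name': 'SQSExample',
    # -1 should disable saving successful tasks to the database, which
    # matters once you process hundreds of thousands of tasks per day.
    'save_limit': -1,
}
```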

While you can surely create your own broker and ensure your configuration dictionary has a truthy key in it, you should not have to. I opened two pull requests to fix those issues.