Last week we launched Legalbump, a legal news aggregator in the style of Digg/Reddit/Hacker News.
Standing on the shoulders of giants (as we like to do in the Django world), we used several pluggable apps to provide some of the core social features you’d expect: comments and voting. In particular, we used django-voting and django-threadedcomments (be sure to get the latter from GitHub – PyPI has an older package, 5.3 at time of writing).
Both are very flexible packages and provide template tags to get vote and comment counts. django-threadedcomments in particular is almost a drop-in replacement for Django’s contrib.comments, so you can use {% get_comment_count for entry as comment_count %}.
However, the price of that flexibility is (some) performance drag. In particular, both apps end up making a LOT of queries (the precise number and impact of which you can discover for yourself via the invaluable django-debug-toolbar). One strategy to ameliorate that is caching; another is some denormalization (which, in effect, is another form of caching). In this case, we decided to start by denormalizing the comment counts, so that instead of making a separate query to get comment counts for each item, we store that information with the Post object itself (which we retrieve anyway for all of the list displays).
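To make the trade-off concrete, here is a minimal sketch using plain sqlite3 rather than the Django ORM. The table and function names (post, comment, counts_normalized, refresh_counts, counts_denormalized) are hypothetical stand-ins, not Legalbump's schema. It only illustrates the difference between one count query per item and reading a stored count along with the item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, num_comments INTEGER DEFAULT 0);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
""")
conn.executemany("INSERT INTO post (id, title) VALUES (?, ?)", [(1, "First"), (2, "Second")])
conn.executemany("INSERT INTO comment (post_id, body) VALUES (?, ?)", [(1, "a"), (1, "b"), (2, "c")])


def counts_normalized(conn):
    """One query for the list of posts plus one COUNT query per post."""
    queries = 1
    result = {}
    for post_id, title in conn.execute("SELECT id, title FROM post").fetchall():
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM comment WHERE post_id = ?", (post_id,)
        ).fetchone()
        queries += 1
        result[title] = n
    return result, queries


def refresh_counts(conn):
    """Recompute the stored counts (the job a signal handler does in Django)."""
    conn.execute(
        "UPDATE post SET num_comments = "
        "(SELECT COUNT(*) FROM comment WHERE comment.post_id = post.id)"
    )


def counts_denormalized(conn):
    """A single query: the count travels with the post row."""
    rows = conn.execute("SELECT title, num_comments FROM post").fetchall()
    return dict(rows), 1


refresh_counts(conn)
normalized = counts_normalized(conn)
denormalized = counts_denormalized(conn)
```

Both calls return the same counts, but the first issues a query per post while the second issues one query in total. The cost is that the stored count has to be refreshed whenever comments change.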
The first step is obvious: add a num_comments = models.IntegerField(default=0) field to our Post model (and of course make the appropriate migration). Then, define a method on the Post model to get the actual comment count – luckily, django-threadedcomments makes this a one-liner:
    def get_num_comments(self):
        return ThreadedComment.objects.filter(object_pk=self.id).count()

So then the only remaining question is “how do we update the comment count whenever someone comments?” One less-than-elegant method might be to override the Comment.save() method; another might be to add an extra step in the view. However, these are both, well, inelegant and hackish. Luckily, Django provides just the right piece of infrastructure to get the job done: signals!
To quote Django’s superb documentation:
[signals] allow decoupled applications get notified when actions occur elsewhere in the framework
In a nutshell, we can use the Django signal dispatcher to tell our Post object to “do something” when a ThreadedComment is saved. So, to update the comment count when a new comment is added, we’d do something like:
    def denormalize_comments(sender, instance, created=False, **kwargs):
        instance.content_object.num_comments = instance.content_object.get_num_comments()
        instance.content_object.save()

    models.signals.post_save.connect(denormalize_comments, sender=ThreadedComment)
    models.signals.post_delete.connect(denormalize_comments, sender=ThreadedComment)

Credit where credit is due: this approach is modeled on a django-snippet that does something very similar for vote counts (though Legalbump’s actual vote denormalization is a bit more complicated, since it accounts for both Comment and Post votes).
It doesn’t take much imagination to see how this approach can be used to set up all sorts of interactions between apps without necessarily touching the apps themselves. Indeed, many well-written Django pluggable apps send their own custom signals.
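If the dispatcher feels like magic, a toy version in plain Python may help. This is a simplified sketch, not Django's actual implementation. The Signal class below only borrows the connect/send naming, and the Post and Comment classes are hypothetical in-memory stand-ins for real models.

```python
class Signal:
    """Toy stand-in for a signal dispatcher (not Django's real code)."""

    def __init__(self):
        self._receivers = []  # (receiver, sender-or-None) pairs

    def connect(self, receiver, sender=None):
        """Register a receiver, optionally limited to one sender class."""
        self._receivers.append((receiver, sender))

    def send(self, sender, **named):
        """Call every matching receiver and collect (receiver, response) pairs."""
        responses = []
        for receiver, wanted in self._receivers:
            if wanted is None or wanted is sender:
                responses.append((receiver, receiver(sender=sender, **named)))
        return responses


post_save = Signal()


class Post:
    def __init__(self):
        self.comments = []
        self.num_comments = 0


class Comment:
    def __init__(self, post):
        self.content_object = post

    def save(self):
        # The comment "app" knows nothing about counts; it just announces the save.
        self.content_object.comments.append(self)
        post_save.send(sender=Comment, instance=self, created=True)


def denormalize_comments(sender, instance, created=False, **kwargs):
    """Receiver living in the post "app": refresh the stored count."""
    post = instance.content_object
    post.num_comments = len(post.comments)


post_save.connect(denormalize_comments, sender=Comment)

post = Post()
Comment(post).save()
Comment(post).save()
```

After the two saves, post.num_comments is 2, even though Comment never references the counter. That decoupling is the whole point: the sender announces an event, and whoever cares subscribes to it.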
Finally, a quick word of caution: despite being “separate,” the actions triggered by signals are still executed inside the original HTTP request. So if you’re using signals to kick off some complex or long-running process, consider an asynchronous task queue (e.g. django-celery), so that these tasks run outside of the request.
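As a rough illustration of the idea (and only that), here is a sketch in which the signal handler just enqueues a job and returns, while a background worker does the slow part. It uses the standard library's queue and threading modules as a stand-in for a real task queue. The names slow_recount and on_comment_saved are hypothetical, and the instance is a plain dict rather than a model.

```python
import queue
import threading

tasks = queue.Queue()
results = []


def worker():
    """Pull jobs off the queue and run them outside the caller's code path."""
    while True:
        func, args = tasks.get()
        results.append(func(*args))
        tasks.task_done()


def slow_recount(post_id):
    """Placeholder for expensive work (recounting, sending email, and so on)."""
    return f"recounted post {post_id}"


def on_comment_saved(sender, instance, **kwargs):
    """Signal-style handler: enqueue the work and return immediately."""
    tasks.put((slow_recount, (instance["post_id"],)))


threading.Thread(target=worker, daemon=True).start()

on_comment_saved(sender=None, instance={"post_id": 7})
tasks.join()  # only here so the example can inspect the result
```

In a real deployment the worker would be a separate process managed by the task queue, and the request would never wait on it. The tasks.join() call exists only so the sketch can show the result.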
