Archive for January, 2011

Django signals and comment denormalization

Wednesday, January 26th, 2011

Last week we launched Legalbump, a legal news aggregator in the style of Digg/Reddit/Hacker News.

Standing on the shoulders of giants (as we like to do in the Django world), we used several pluggable apps to provide some of the core social features you’d expect: comments and voting. In particular, we used django-voting and django-threadedcomments (be sure to get the latter from GitHub – PyPI has an older package, 5.3 at time of writing).

Both are very flexible packages and provide template tags to get vote and comment counts (django-threadedcomments in particular is almost a drop-in replacement for Django’s contrib.comments, so you can use {% get_comment_count for entry as comment_count %}).

However, the price of that flexibility is (some) performance drag. In particular, both apps end up making a LOT of queries (the precise number and impact of which you can discover for yourself via the invaluable django-debug-toolbar). One strategy to ameliorate that is caching; another is some denormalization (which, in effect, is another form of caching). In this case, we decided to start by denormalizing the comment counts, so that instead of making a separate query to get comment counts for each item, we store that information with the Post object itself (which we retrieve anyway for all of the list displays).
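To make the trade-off concrete, here is a small framework-free sketch using Python's built-in sqlite3 module. The table and column names are made up for illustration and are not Legalbump's schema. It compares counting comments with one query per post against reading a stored count along with the post row.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, num_comments INTEGER DEFAULT 0);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    INSERT INTO post (id, title) VALUES (1, 'First'), (2, 'Second');
    INSERT INTO comment (post_id, body) VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

def counts_normalized(conn):
    """One query for the posts, plus one COUNT query per post (N + 1 in total)."""
    queries = 1
    result = {}
    posts = conn.execute("SELECT id, title FROM post ORDER BY id").fetchall()
    for post_id, title in posts:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM comment WHERE post_id = ?", (post_id,)
        ).fetchone()
        queries += 1
        result[title] = n
    return result, queries

def refresh_counts(conn):
    """Store each post's comment count on the post row itself."""
    conn.execute(
        "UPDATE post SET num_comments = "
        "(SELECT COUNT(*) FROM comment WHERE comment.post_id = post.id)"
    )

def counts_denormalized(conn):
    """A single query: the count arrives with the row we were fetching anyway."""
    rows = conn.execute("SELECT title, num_comments FROM post ORDER BY id").fetchall()
    return dict(rows), 1

refresh_counts(conn)
```

The catch, of course, is that the stored count is only right if something refreshes it every time a comment is added or removed.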

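The first step is obvious: add a num_comments = models.IntegerField(default=0) field to our Post model (and of course make the appropriate migration). Then, define a method on the Post model to get the actual comment count – luckily, django-threadedcomments keeps this short:

def get_num_comments(self):
    # Match on content type as well as pk, so comments on other models that share this pk are not counted.
    post_type = ContentType.objects.get_for_model(self)  # from django.contrib.contenttypes.models
    return ThreadedComment.objects.filter(content_type=post_type, object_pk=self.pk).count()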

So then the only remaining question is “how do we update the comment count whenever someone comments?” One less-than-elegant method might be to override the Comment.save() method; another might be to add an extra step in the view. However, these are both, well, inelegant and hackish. Luckily, Django provides just the right piece of infrastructure to get the job done: signals!

To quote Django’s superb documentation:

[signals] allow decoupled applications get notified when actions occur elsewhere in the framework

In a nutshell, we can use the Django signal dispatcher to tell our Post object to “do something” when a ThreadedComment is saved. So, to update the comment count when a new comment is added, we’d do something like:

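def denormalize_comments(sender, instance, created=False, **kwargs):
    # Recount after every comment save or delete and store the total on the commented-on object.
    target = instance.content_object
    if target is None:
        # The commented-on object no longer exists, so there is nothing to update.
        return
    target.num_comments = target.get_num_comments()
    target.save()
models.signals.post_save.connect(denormalize_comments, sender=ThreadedComment)
models.signals.post_delete.connect(denormalize_comments, sender=ThreadedComment)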
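If you want to see the mechanics without a Django project at hand, the pattern can be sketched in plain Python. Everything here is a hypothetical stand-in: the tiny `Signal` class is not Django's dispatcher, and `Post` and `Comment` are ordinary classes rather than models. The shape is the same, though: the comment code announces that something happened, and a receiver it knows nothing about keeps the count in sync.

```python
class Signal:
    """Hypothetical stand-in for a signal: stores receivers and calls each one on send()."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        for receiver in self._receivers:
            receiver(sender=sender, **kwargs)


post_save = Signal()
post_delete = Signal()


class Post:
    def __init__(self):
        self.comments = []     # source of truth
        self.num_comments = 0  # denormalized copy of len(self.comments)


class Comment:
    def __init__(self, post):
        self.content_object = post

    def save(self):
        self.content_object.comments.append(self)
        post_save.send(sender=Comment, instance=self, created=True)

    def delete(self):
        self.content_object.comments.remove(self)
        post_delete.send(sender=Comment, instance=self)


def denormalize_comments(sender, instance, **kwargs):
    # Recount from the source of truth instead of incrementing or decrementing,
    # so the same receiver works for both saves and deletes.
    post = instance.content_object
    post.num_comments = len(post.comments)


post_save.connect(denormalize_comments)
post_delete.connect(denormalize_comments)
```

Note that `Comment` never mentions `num_comments`; only the receiver does, which is what lets you bolt this onto an app you did not write.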

Credit where credit is due: this approach is modeled on a django-snippet that does something very similar for vote counts (though Legalbump’s actual vote denormalization is a bit more complicated, since it accounts for both Comment and Post votes).

It doesn’t take much imagination to see how this approach can be used to set up all sorts of interactions between apps without necessarily touching the apps themselves. Indeed, many well-written Django pluggable apps send their own custom signals.

Finally, a quick word of caution: despite being “separate”, the actions triggered by signals are still executed inside the original HTTP request. So if you’re using signals to kick off some complex or long-running process, consider an asynchronous task queue (e.g. django-celery), so that these tasks run outside of the request.
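As a rough illustration of the idea (and only that), here is a sketch that uses a thread and Python's standard `queue` module as a stand-in for a real task queue. None of this is the Celery API, and the names `slow_job` and `on_comment_saved` are hypothetical. The point is simply that the handler enqueues the work and returns, while a worker does the heavy lifting elsewhere.

```python
import queue
import threading

tasks = queue.Queue()


def worker():
    """Run queued jobs one at a time, outside of whatever code enqueued them."""
    while True:
        job = tasks.get()
        if job is None:  # sentinel value: stop the worker
            tasks.task_done()
            break
        func, args = job
        try:
            func(*args)
        finally:
            tasks.task_done()


threading.Thread(target=worker, daemon=True).start()

results = []


def slow_job(comment_id):
    # Placeholder for something expensive (sending mail, rebuilding an index, ...).
    results.append(comment_id)


def on_comment_saved(sender, instance_id, **kwargs):
    # The handler only enqueues the work, so it returns almost immediately.
    tasks.put((slow_job, (instance_id,)))
```

An in-process thread like this loses its queued jobs if the process dies, which is one of the reasons to reach for a proper task queue in production.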

jQuery Plugins: Column View, Part 1

Wednesday, January 19th, 2011

I recently had the task of replicating the OS X Finder’s Column View for a web application we are writing. The app itself is not quite done – but the columnview plugin is already out for public use.

I am one of those few developers who really like working with JavaScript. Because of my affinity I have gotten a lot of JavaScript projects over the years. As far as I am concerned, jQuery is the bee’s knees for working in JavaScript – it captures the natural feel of the language while exploiting its features (e.g. lexical scoping). So for a long time I just worked on these projects writing the necessary code to “get ‘er done”… creating a sloppy mess. These days I am concerned about the architectural issues – and I am not the only one. Questions around jQuery plugins are some of the more common I see on developer community sites like StackOverflow. With these questions circling I wanted to explore the topic in two parts.

First, I want to talk about creating jQuery plugins in general. Second, in the next post I want to talk specifically about this plugin and some of the design choices I made along the way.

What I am writing about here comes from some experiences and, more importantly, the jQuery Authoring Plugins Page. I am going to assume you understand and agree that cluttering the global namespace is considered “Bad Form” and will get you snide looks at parties. You also understand singletons and closures. There are wonderful resources only a-google-away if you don’t, but I won’t be providing them here. With that out of the way, this is the generic jQuery-approved boilerplate for creating a new plugin:

(function ( $ ) {
    var methods = {
    ...
    };
 
    $.fn.columnview = function(method) {
        methods.columnview = this;
 
        if ( methods[method] ) {
            return methods[ method ].apply( this, Array.prototype.slice.call( arguments, 1 ));
        } else if ( typeof method === 'object' || ! method ) {
            return methods.init.apply( this, arguments );
        } else {
            $.error( 'Method ' +  method + ' does not exist on jQuery.columnview' );
        } 
 
        return this;
    };
})( jQuery );

The jQuery folks ask that a single plugin does not create more than one entry into the jQuery namespace (if you peek at their authoring page you’ll see the “Not Like This” example). So you stick all of your plugin’s methods into the methods singleton. You place both the singleton and the actual assignment to the jQuery object within an anonymous function and call it immediately. This prevents any of your code from leaking out into the global namespace. You’ll notice the code inserted as your plugin into jQuery is only 11 lines long. All your operations should occur in your methods.

So what do these 11 lines do? Three things. First, whichever jQuery object you called the plugin on is stored in the singleton, making it available to all your methods. Inside event handlers and other callbacks, `this` will not be the jQuery object you called the plugin on, so you absolutely need this reference there. (Because the singleton is shared, it only remembers the most recent call.) Second, when you start up your plugin, like so:

$('#columnview').columnview();
// or like this
$('#columnview').columnview({my:'options', go:'here'});

…it runs the init function (you’re going to have to define one) with the options object you (optionally) pass in (line 12 in the code snippet, the `methods.init.apply` call). Third, you can call a specific function on the plugin, like so:

$('#columnview').columnview("refresh_column", 1);

That would call `methods.refresh_column` with the argument of `1` (line 10 in the code snippet, the `methods[ method ].apply` call). If you have ever used the jQuery UI Dialog widget you have done this when closing a dialog with `.dialog("close")`.

There is one more question (really more like a wall) that people run into. It happens with more advanced JavaScript use, the nature of the language itself, and the event driven model we use on the web. Data?! Where do we put it?! Normally we’ve got classes with variables or a database to hold all our stuff. Now we’ve got a singleton and we could put stuff in there. But the singleton belongs to everyone and it can get messy sticking things into arrays (which end up resembling a struct from C). Developers coming from more “normal” programming ecosystems don’t consider the DOM as a place to store things. The DOM in JavaScript can, and should, be thought of as your data-store. jQuery has a handy function to do this storage for you: `data()`.

Data takes a key and a value and sticks it onto a DOM node. It does not put it on the jQuery object (which easily gets GC’d, lost, found, buried in soft peat moss, and subjected to public inquiry [bonus points if you get the reference]) but on the DOM node itself. So you can always be sure to have the up-to-date value and get it from anywhere without any concern. Using `data()` saves lives and makes your brain hurt less.

Next time: we’ll delve into the columnview plugin specifically.

EDIT: Part 2 is up!