An Identity Map for Backbone.js

One problem with single-page apps is that application state can stick around for longer than it would in a more traditional request-response web app. Because your users aren’t refreshing the page very often, you can have JavaScript objects sitting around in the browser for hours, or even days and weeks.

At Shine, we’ve been working on a large Backbone.js application recently and found that identity issues relating to long-lived objects caused a number of subtle and not-so-subtle bugs. For us the solution was to introduce an identity map.

In this blog entry I’ll talk about what an identity map is and why you’d want to consider using one, and introduce an implementation that we’ve put on GitHub.

The Problem

If you’ve got several Backbone views on the page that are supposed to be backed by the same model object, they can easily end up backed by different object instances – even though those instances all have the same class and model ID.

This can occur if each view fetched the model object from the server separately – either directly or as part of a collection – because Backbone’s sync methods don’t guarantee that they’ll always give you exactly the same object reference for a particular model class and ID.

This causes bugs when one view modifies its model and other views which are subscribed to change-events on the model aren’t notified – because their instance of the model is actually a different object.

In the best case, those views will only find out about the changes when their model is refreshed from the server at some later stage. In the worst case, if those views save their stale models to the server before they refresh, previously-updated data will be overwritten with stale information.
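To make the failure concrete, here is a minimal sketch of it. TinyModel is a hypothetical stand-in written for this example (it is not Backbone.Model, and its on/set signatures are simplified), so treat it as an illustration of the pattern rather than of Backbone's internals.

```javascript
// TinyModel is a hypothetical stand-in for a Backbone model: it holds
// attributes and notifies listeners when one of them is set.
function TinyModel(attributes) {
  this.attributes = {};
  this.handlers = [];
  for (var key in attributes) {
    if (Object.prototype.hasOwnProperty.call(attributes, key)) {
      this.attributes[key] = attributes[key];
    }
  }
}
TinyModel.prototype.on = function(handler) {
  this.handlers.push(handler);
};
TinyModel.prototype.get = function(key) {
  return this.attributes[key];
};
TinyModel.prototype.set = function(key, value) {
  this.attributes[key] = value;
  for (var i = 0; i < this.handlers.length; i++) {
    this.handlers[i](this, key, value);
  }
};

// Two views each load the record with id 1 separately, so each ends up
// with its own instance.
var modelForViewA = new TinyModel({id: 1, name: 'Old name'});
var modelForViewB = new TinyModel({id: 1, name: 'Old name'});

// View B subscribes to changes on *its* instance.
var notificationsSeenByViewB = 0;
modelForViewB.on(function() {
  notificationsSeenByViewB++;
});

// View A edits its copy. View B is never told, and its data is now stale.
modelForViewA.set('name', 'New name');
```

After the last line, view B's handler has not run and its model still holds the old name, even though both models claim to represent the same record.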

The problem is exacerbated if you keep views around in order to retain UI state, a strategy we’ve described previously.

The most obvious way to deal with this issue is to make sure your views always use the same model object. To do this you have to make sure both views are instantiated in the same view hierarchy and load the common model object in the closest parent view.

Unfortunately this is not always practicable or even possible. In our case, for example, one view was instantiated directly with a model object on page load, but another view was using a model with the same class and ID that was deeply nested within another object and loaded as part of a Backbone collection – something we had little control over.

Introducing Identity Maps

To solve this problem, a coworker proposed adding an identity map to our Backbone application.

Identity maps are commonly used by object-relational mappers like Hibernate that need to guarantee that for every row in the database, the same object will always be returned – no matter what mechanism you used to fetch it.

Similarly, if you look under the hood of JavaScript data-persistence frameworks like Backbone Relational and Ember Data, you’ll probably find something very similar to an Identity Map in place. Unfortunately, we weren’t using Backbone Relational and it wasn’t worth switching to it just to get Identity Map capabilities.

Identity Maps can be pretty committing and are not for everybody, as they can have subtle but profound implications for your codebase (as you’ll see shortly). The Rails team introduced an identity map as an optional feature in version 3.1, but later pulled it out owing to known issues with ActiveRecord’s relationship mapping that they couldn’t be bothered resolving.

A quick search for an existing stand-alone Backbone Identity Map implementation led us to this blog post and GitHub project (see Hacker News for some good discussion). Unfortunately, this implementation redefines all of Backbone.Model and was based on Backbone 0.5.x. We had little luck using it with Backbone 0.9.x and were reluctant to try to get it to work, as it seemed excessive to rewrite Backbone.Model.

Introducing a new Backbone Identity Map

The actual implementation of an identity map is reasonably straightforward – it’s really just a hash that keys a class and ID to a model object instance. The real challenge is to integrate this data structure seamlessly into your app, and also to get Backbone to work correctly with it.

The approach that minimizes the changes required to both our own code and Backbone.js is to override the behaviour of the JavaScript new operator for our Backbone models. Luckily JavaScript is flexible enough to allow us to do this.

The ECMAScript language definition details the process that occurs when calling the new operator. When calling a constructor, the JavaScript runtime (a browser, for example) will create a new empty JavaScript object to pass in as this for the constructor.

In most cases, that new empty object will become the value returned by the constructor, even if it’s not explicitly returned, which is why most constructors don’t return this. If, however, the constructor returns some other object, the empty object will be discarded and the returned object is used in its place. The only restriction is that the value returned from the constructor must be an object; if it returns a primitive, the return value is ignored and the new object is used as usual.
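The three cases described above can be seen in a few lines of plain JavaScript. The constructor names here are made up for the example and have nothing to do with Backbone.

```javascript
// No explicit return: `new` yields the freshly created object.
function Plain() {
  this.kind = 'plain';
}

// Returns another object: `new` yields that object instead, and the
// freshly created `this` is discarded.
var shared = {kind: 'shared'};
function ReturnsObject() {
  this.kind = 'discarded';
  return shared;
}

// Returns a primitive: the return value is ignored and `new` yields
// the freshly created object after all.
function ReturnsPrimitive() {
  this.kind = 'fresh';
  return 42;
}

var a = new Plain();
var b = new ReturnsObject();
var c = new ReturnsPrimitive();
```

The second case is the one an identity map relies on: the constructor gets to decide which object the caller of new actually receives.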

By leveraging this behaviour, we can override new to return an existing model if it’s been created before, or create a new model if it hasn’t. Every Backbone model that has an ID is stored in a global cache – our hash – either when it is created or when a new ID is assigned using the change:id event.

The constructor looks in the cache to see if we already have a reference to an existing object with the given ID, and if we do, just returns that object (after setting any other attributes passed to the constructor). Otherwise, we just delegate to the model’s constructor.

Finally, we provide a means to clear the cache. This should be called when your user logs out of the application – it’ll help avoid memory leakage and prevent security issues.

So without any further ado, here’s the code:

(function() {

  // Stores cached models:
  // key: (unique identifier per class) + ':' + (model id)
  // value: model object
  var cache = {};

  /**
   * Identity Map for Backbone models.
   *
   * Usage:
   *
   * var NewModel = Backbone.IdentityMap(Backbone.Model.extend(
   * {...},
   * {...}
   * ));
   *
   * A model that is wrapped in IdentityMap will cache models by
   * their ID. Any time you call new NewModel(), and you pass in
   * an id attribute, IdentityMap will check the cache to see if
   * that object has already been created. If so, that existing
   * object will be returned. Otherwise, a new model will be
   * instantiated.
   *
   * Any models that are created without an ID will instantiate
   * a new object. If that model is subsequently assigned an ID,
   * it will add itself to the cache with this ID. If by that
   * point another object has already been assigned to the cache
   * with the same ID, then that object will be overridden.
   *
   * realConstructor: a backbone model constructor function
   * returns a constructor function that acts like realConstructor,
   * but returns cached objects if possible.
   */
  Backbone.IdentityMap = function(realConstructor) {
    var classCacheKey = _.uniqueId();
    var modelConstructor = _.extend(function(attributes, options) {
      // creates a new object (used if the object isn't found in
      // the cache)
      var create = function() {
        return new realConstructor(attributes, options);
      };
      var objectId = attributes && attributes[realConstructor.prototype.idAttribute];
      // if there is an ID, check if that object exists in the
      // cache already
      if (objectId) {
        var cacheKey = classCacheKey + ':' + objectId;
        if (!cache[cacheKey]) {
          // the object has an ID, but isn't found in the cache
          cache[cacheKey] = create();
        } else {
          // the object was in the cache
          var object = cache[cacheKey];
          // set up the object just like new Backbone.Model() would
          if (options && options.parse) {
            attributes = object.parse(attributes);
          }
          object.set(attributes);
        }

        return cache[cacheKey];
      } else {
        var obj = create();
        // when an object's id is set, add it to the cache
        obj.on('change:' + realConstructor.prototype.idAttribute,
          function(model, objectId) {
            cache[classCacheKey + ':' + objectId] = obj;
            obj.off(null, null, this);
          }, this);
        return obj;
      }
    }, realConstructor);
    modelConstructor.prototype = realConstructor.prototype;
    return modelConstructor;
  };

  /**
   * Clears the cache. (useful for unit testing)
   */
  Backbone.IdentityMap.resetCache = function() {
    cache = {};
  };
})();

So how do you use it? For all Backbone model classes whose instances you want to make subject to identity-mapping, you simply wrap the model constructor function with the Backbone.IdentityMap function, and use the resultant constructor when creating new instances:

var MyModel = Backbone.Model.extend({
  ...
});
var MyIdentityMappedModel = Backbone.IdentityMap(MyModel);

var myModelInstance1 = new MyIdentityMappedModel({id:1});
var myModelInstance2 = new MyIdentityMappedModel({id:1}); // OMG myModelInstance1 === myModelInstance2 !!!

Note that you can’t extend an identity-mapped class. Instead, you should extend the original class, then apply the IdentityMap function to the result:

var MyModel = Backbone.Model.extend({
  ...
});
var MyIdentityMappedModel = Backbone.IdentityMap(MyModel);

var MyExtendedModel = MyModel.extend({
  ...
});
var MyExtendedAndIdentityMappedModel = Backbone.IdentityMap(MyExtendedModel);

One final thing worth noting about our implementation is that because the notion of a ‘class’ in JavaScript is actually rather slippery, there isn’t really a reliable and easy way to identify a class for the purposes of indexing it into an identity map.

To get around this, we use the Underscore.js _.uniqueId() method to simply define a unique identifier for each class that is passed to Backbone.IdentityMap. Captured by a closure, this unique identifier will be used every time an instance of the class is added to the identity map. Whilst this is an implementation detail that wouldn’t normally be of interest, it is worth knowing if you’re trying to inspect the contents of the identity map during debugging.
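As a rough sketch of what those cache keys look like, the snippet below mimics the closure. The uniqueId function is a simplified counter standing in for _.uniqueId(), and makeKeyBuilder is a hypothetical helper invented for this example; neither is part of the identity map's API.

```javascript
// Simplified stand-in for Underscore's _.uniqueId(): a shared counter.
var idCounter = 0;
function uniqueId() {
  idCounter += 1;
  return String(idCounter);
}

// Hypothetical helper: captures one class identifier in a closure and
// builds cache keys of the form (class identifier) + ':' + (model id).
function makeKeyBuilder() {
  var classCacheKey = uniqueId(); // assigned once per wrapped class
  return function(objectId) {
    return classCacheKey + ':' + objectId;
  };
}

var userKey = makeKeyBuilder();  // imagine this belongs to a User class
var orderKey = makeKeyBuilder(); // and this to an Order class

var userSeven = userKey(7);
var orderSeven = orderKey(7);
```

Here the two keys come out as '1:7' and '2:7', so a user and an order with the same ID never collide. In a real app the class identifiers are whatever the shared counter happened to return at wrap time, so expect opaque numbers rather than class names when inspecting the cache.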

So how does it work with Backbone?

Our Identity Map worked out of the box with the version of Backbone that we are using (0.9.2). It’s instructive to understand why this is the case.

Surprisingly, there are only two places where Backbone 0.9.2 uses the new operator: Collection._prepareModel() and Model.clone().

In Collection._prepareModel, Backbone does this:

  _prepareModel: function(attrs, options) {
    if (attrs instanceof Model) {
      if (!attrs.collection) attrs.collection = this;
      return attrs;
    }
    options || (options = {});
    options.collection = this;
    var model = new this.model(attrs, options);
    if (!model._validate(attrs, options)) return false;
    return model;
  }

You’ll notice that the call to new is using this.model(...). In this context, this.model will refer to the wrapped, identity-mapped constructor function. Fortunately for us, this is the desired behaviour! If it didn’t use our identity map, we’d have no way to replace the objects in the collection with ones from the identity map without overriding Collection._prepareModel.

In the Model.clone case, Backbone does this:

  clone: function() {
    return new this.constructor(this.attributes);
  }

In this context, this.constructor refers to the non-wrapped version of our constructor (i.e. it doesn’t use the identity map), which means the clone will still always be a completely new object. Excellent! The only downside is that if you clone something without an ID, then give it an ID later, it won’t automatically be added to the identity map. However, this hasn’t yet been an issue for us.
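The reason this.constructor bypasses the identity map is that the wrapper shares the real constructor's prototype, and the constructor property on that prototype still points at the real constructor. The sketch below shows this with plain functions; RealModel and wrap are hypothetical stand-ins that are far simpler than Backbone.Model and Backbone.IdentityMap.

```javascript
// Hypothetical stand-in for a Backbone model class.
function RealModel(attributes) {
  this.attributes = {};
  for (var key in attributes) {
    if (Object.prototype.hasOwnProperty.call(attributes, key)) {
      this.attributes[key] = attributes[key];
    }
  }
}
// Same shape as the clone method quoted above.
RealModel.prototype.clone = function() {
  return new this.constructor(this.attributes);
};

// Hypothetical, stripped-down wrapper: caches by id only.
var sketchCache = {};
function wrap(realConstructor) {
  var wrapped = function(attributes) {
    var id = attributes && attributes.id;
    if (!id) {
      return new realConstructor(attributes);
    }
    if (!sketchCache[id]) {
      sketchCache[id] = new realConstructor(attributes);
    }
    return sketchCache[id];
  };
  // Share the prototype, as the identity map does. The prototype's
  // `constructor` property is left pointing at realConstructor.
  wrapped.prototype = realConstructor.prototype;
  return wrapped;
}

var MappedModel = wrap(RealModel);
var first = new MappedModel({id: 1});
var second = new MappedModel({id: 1});
var copy = first.clone();
```

Here first and second are the same object, while copy is a new one whose constructor is RealModel. Because the prototype is shared, copy still passes an instanceof check against MappedModel.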

Caveats

Whilst this solution is a testament to the flexibility of JavaScript as a functional language, there are a few caveats:

  1. The behaviour of the new operator could be mystifying for a new developer on the project if they don’t know about the identity map already. Although it hasn’t turned out to be a problem for us yet, it’s not hard to imagine it causing confusion.
  2. It’s still possible to create objects with the same ID in certain situations (details are documented in the source code).
  3. Memory usage: every model that’s created will stay in memory until the cache is cleared. That said, it’s hard to see how this could be avoided with any implementation of an identity map.

Conclusion

We’ve been running this code in production for a few weeks now with no issues, and have found this approach to be unobtrusive and as simple as possible. The code clocks in at less than 3 KB, and less than 500 bytes minified.

The implementation solves our initial problem of sharing models between views, because any time we try to instantiate a model, we’ll always get the cached version if it exists. Therefore there won’t be any duplicate models unless we specifically want there to be with clone.

The complete source code (with tests) can be found on GitHub.

6 Comments
  • Greg Reimer
    Posted at 11:22h, 27 December

    View instances that register event handlers on these models don’t get garbage collected until the view does, since the model internally maintains a list of functions to call for every change event, which in turn are closed over view data, including DOM content. So a crazy amount of stuff stays in memory and more and more view instances uselessly re-render in the background as the app goes on. Have sort of an experimental workaround for this issue that I touched on here http://www.reddit.com/r/javascript/comments/15gwdk/an_identity_map_for_backbonejs/c7mjyf7 . Cheers

    • Greg Gross
      Posted at 11:34h, 27 December

      This is definitely something that needs to be thought about before implementing an identity map. In our application, we make sure to always call .off() when the view isn’t needed anymore so the garbage collector can get rid of those views.

      • Anthony Short (@anthonyshort)
        Posted at 21:37h, 29 December

        This isn’t really true anymore with the latest version of backbone. on and off aren’t used as much, and listenTo is used instead and will be automatically cleaned up.

  • Anthony Short (@anthonyshort)
    Posted at 21:45h, 29 December

    I ran into this issue with an app I was building and ended up doing pretty much the same thing. It should almost be a part of backbone itself, or Backbone should at least provide a create method for its objects that can be overridden with custom functionality.

    With caveat 3, it’s really hard until we get WeakMaps, which would only hold weak (garbage-collectible) references, making it easier to avoid memory leaks with these maps. Otherwise you need to do some funky view reference counting manually to know how many views need a model, and dispose of it when that count reaches 0. That could be done on an interval or automatically. A lot of complexity that most apps probably wouldn’t need just yet.

    Either way, it’s a bit painful. Your solution of keeping them around would suffice for most apps I’d say. Although it really depends on how much data and the total number of models. Most of the time it will be insignificant.

  • Sean Jacke
    Posted at 16:16h, 25 January

    Firefox (and Chrome, via an opt in preference) now offers WeakMap, which is described as a key/value map where the keys are objects and references to them are held “weakly”, meaning that they do not prevent garbage collection when there are no other references to the object.

    But doesn’t that get it backwards for this use case?

    In an identity map, you want the keys to be the ids and the values to store a weak reference to an object, so that you can check that no other object exists for that class with the same id before instantiating.

    With the objects as the keys and the ids as the values, you’d have to search by value, as opposed to by key, which isn’t possible.

    Or am I missing something?

    • Sean Jacke
      Posted at 01:25h, 28 January

      Unfortunately, it seems I’m not – I just came across a discussion thread (http://en.usenet.digipedia.org/thread/14436/13934/) where WeakMap’s failure to address this use case was voiced to its developers who seem to favor not exposing the mechanics of garbage collection, which such functionality would apparently require.

      Here is the excerpt:
      >> Primitive keys are exceptionally common for maps, and this would open up a lot of potential uses, especially regarding caching.
      > What you are asking for is a “weak array” (or an array of “weak references”). While some people think that such things have utility others are concerned that they expose GC based non-determination which they don’t want to allow into the language.

      So apparently, a straight-forward implementation of identity maps in JS is still a ways off…
      : (
