art with code

2010-08-10

A slow cache algorithm and a faster one

It took around 30 ms to draw some text titles. The weird thing was that not drawing the titles didn't really affect the draw time, so the problem clearly wasn't the drawing itself. Investigating, I found that my text bitmap LRU cache was hitting its max size, destroying any benefit of caching. But making the cache large enough didn't fix the performance problem either. A mystery!

On closer look, my caching algorithm was pretty dumb:

function makeCache(maxSize) {
  return { array: [], hash: {}, maxSize: maxSize };
}

function cacheItem(cache, item) {
  // shift old items off the front of the array to make room
  while (cache.array.length >= cache.maxSize) {
    var it = cache.array.shift();
    delete cache.hash[it.key];
  }
  cache.array.push(item);
  cache.hash[item.key] = item;
}

// move the item with the given key to the end of the cache array
function refresh(cache, key) {
  var it = cache.hash[key];
  var idx = cache.array.indexOf(it);
  cache.array.splice(idx, 1);
  cache.array.push(it);
}

function get(cache, key) {
  if (!(key in cache.hash)) return undefined; // guard against misses
  refresh(cache, key);
  return cache.hash[key];
}

Why that is bad: get is O(n), because refresh does a linear indexOf and a splice over the whole array, and once the cache hits maxSize every cacheItem is O(n) too, because shift moves every remaining element down one slot. If you're going through the whole cache every frame, that's O(n^2) per frame. And if you're also at maxSize, double that! Aargh!
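
To make the blowup concrete, here's a rough benchmark sketch against the cache above (makeTitle and the sizes are made up for illustration, not from the real drawing code):

// Benchmark sketch: touch every cached item once per simulated frame.
// makeTitle stands in for rendering a text bitmap; here it's just a stub.
function makeTitle(i) {
  return { key: 'title-' + i };
}

var cache = makeCache(1000);
for (var i = 0; i < 1000; i++) cacheItem(cache, makeTitle(i));

var t0 = Date.now();
for (var frame = 0; frame < 100; frame++) {
  for (var i = 0; i < 1000; i++) {
    get(cache, 'title-' + i); // each get does an O(n) indexOf + splice
  }
}
console.log('100 frames took', Date.now() - t0, 'ms');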

So I rewrote it to be less insane, and now the drawing is nice and fast:

function makeCache(maxSize) {
  return { array: [], hash: {}, maxSize: maxSize, lastUsedIndex: 0 };
}

function cacheItem(cache, item) {
  // chop off the oldest items after reaching the overflow limit
  if (cache.array.length >= cache.maxSize * 2) {
    // sort newest-first so the splice below drops the oldest items
    cache.array.sort(function(a, b) { return b.lastUsed - a.lastUsed; });
    var deleted = cache.array.splice(cache.maxSize);
    for (var i = 0; i < deleted.length; i++) {
      delete cache.hash[deleted[i].key];
    }
  }
  item.lastUsed = cache.lastUsedIndex++; // new items start out as most recent
  cache.array.push(item);
  cache.hash[item.key] = item;
}

// update the cached item's lastUsed timestamp
function refresh(cache, key) {
  cache.hash[key].lastUsed = cache.lastUsedIndex++;
}

function get(cache, key) {
  if (!(key in cache.hash)) return undefined; // guard against misses
  refresh(cache, key);
  return cache.hash[key];
}
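
Usage is the same with either version. A hypothetical caller for the text titles might look something like this (drawTitle, renderTitleBitmap and ctx are made-up names for illustration, not the actual drawing code):

// Hypothetical usage sketch: fetch a cached text bitmap,
// rendering and caching it on a miss.
var titleCache = makeCache(256);

function drawTitle(ctx, text, x, y) {
  var item = get(titleCache, text);
  if (item === undefined) {
    item = { key: text, bitmap: renderTitleBitmap(text) };
    cacheItem(titleCache, item);
  }
  ctx.drawImage(item.bitmap, x, y);
}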

The gets are now O(1), and cacheItem at maxSize sorts 2n items once per n insertions, so it runs in amortized O((2n log 2n) / n) = O(log n) time. You could further optimize it by using a split instead of a sort: run an O(n) selection algorithm to find the maxSize-th most recently used element of the cache array, then use that as a pivot and do a single quicksort partition pass. That'd give you amortized O(2n / n), which works out to O(1) :P
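
Here's a sketch of that split-based eviction, assuming a plain textbook quickselect (selectTopN and partition are my names, not code from this post). It would replace the sort call in cacheItem:

// Lomuto partition on descending lastUsed: afterwards a[lo..p-1] are
// at least as recent as a[p], and a[p+1..hi] are older. Returns p.
function partition(a, lo, hi) {
  var pivot = a[hi].lastUsed;
  var p = lo;
  for (var i = lo; i < hi; i++) {
    if (a[i].lastUsed >= pivot) {
      var tmp = a[i]; a[i] = a[p]; a[p] = tmp;
      p++;
    }
  }
  var tmp2 = a[p]; a[p] = a[hi]; a[hi] = tmp2;
  return p;
}

// Rearrange a in place so that the n most recently used items sit in
// a[0..n-1], without fully sorting them. Average-case O(n).
function selectTopN(a, n) {
  var lo = 0, hi = a.length - 1;
  while (lo < hi) {
    var p = partition(a, lo, hi);
    if (p === n - 1) return;
    if (p < n - 1) lo = p + 1;
    else hi = p - 1;
  }
}

// Drop-in replacement for the sort-based eviction in cacheItem:
if (cache.array.length >= cache.maxSize * 2) {
  selectTopN(cache.array, cache.maxSize); // O(n) instead of O(n log n)
  var deleted = cache.array.splice(cache.maxSize);
  for (var i = 0; i < deleted.length; i++) {
    delete cache.hash[deleted[i].key];
  }
}

The partition pass leaves the survivors unordered, but the eviction only needs to know which half to keep, so that's fine.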
