art with code


JS XPConnect call overhead

I wanted to know how much faster it would be to do the 4x4 matrix math in a C++ library called from JavaScript, so I benchmarked calling an empty C++ method against doing mmul4x4 in JS. The empty method looks like NS_IMETHODIMP nsMyClass::DoNothing() { return NS_OK; }.

JS mmul4x4 a million times took 1.5 s with the JIT (1.5 us per multiply). Calling the empty method a million times took 1.2 s (1.2 us per call). Doing a million mmul4x4s in C took 0.06 s (0.06 us each).
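For reference, a minimal sketch of this kind of micro-benchmark. The post does not show its mmul4x4, so the version here is an assumption (flat 16-element arrays, column-major), and usPerCall is a hypothetical helper. The empty JS function only stands in for the XPConnect method, which is not available outside Firefox, so the numbers it prints are not comparable to the ones above.

```javascript
// Multiply two 4x4 matrices stored as flat 16-element arrays.
// Column-major layout is an assumption; the post does not show its layout.
function mmul4x4(a, b) {
  const out = new Array(16);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      out[col * 4 + row] = sum;
    }
  }
  return out;
}

// Hypothetical helper: call fn `count` times and return the mean cost
// per call in microseconds.
function usPerCall(fn, count) {
  const start = Date.now();
  for (let i = 0; i < count; i++) fn();
  const elapsedMs = Date.now() - start;
  return (elapsedMs * 1000) / count;
}

const identity = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
const sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];

const ITERATIONS = 1000000;
// Stand-in for the empty native method; a JIT may optimize this loop away.
const emptyUs = usPerCall(function () {}, ITERATIONS);
const mmulUs = usPerCall(function () { mmul4x4(sample, identity); }, ITERATIONS);

console.log("empty call: " + emptyUs.toFixed(3) + " us");
console.log("mmul4x4:    " + mmulUs.toFixed(3) + " us");
```

Dividing total time by the call count is what turns "1.2 s per million calls" into "1.2 us per call".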

That is actually pretty nasty: at 1.2 us per call, a thousand GL calls carry 1.2 milliseconds of overhead before any GL work gets done. And you usually need to do several GL calls per drawn object.

So, assuming an average of 3 GL calls and one mmul4x4 per object, the JavaScript overhead would be around five milliseconds for a thousand objects (3 x 1.2 us + 1.5 us = 5.1 us per object). Add in the 10 ms it takes for the swapBuffers compositing and about 15 ms of the 16.7 ms frame budget is gone, so doing a thousand objects at 60 fps just became pretty much impossible :(
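The estimate above can be sketched as a small calculation. It assumes the costs simply add up, which is a simplification; frameCostMs and the constant names are made up for illustration, while the figures themselves come from the measurements in this post.

```javascript
// Figures from the measurements above.
const CALL_OVERHEAD_US = 1.2; // empty XPConnect call, per call
const JS_MMUL_US = 1.5;       // one mmul4x4 in JS
const SWAP_BUFFERS_MS = 10;   // swapBuffers compositing, per frame

// Estimated per-frame cost in milliseconds, assuming costs are additive.
function frameCostMs(objects, glCallsPerObject, mmulsPerObject) {
  const perObjectUs =
    glCallsPerObject * CALL_OVERHEAD_US + mmulsPerObject * JS_MMUL_US;
  return (objects * perObjectUs) / 1000 + SWAP_BUFFERS_MS;
}

const budgetMs = 1000 / 60; // one frame at 60 fps
const costMs = frameCostMs(1000, 3, 1);

console.log("frame cost: " + costMs.toFixed(1) + " ms of " + budgetMs.toFixed(1) + " ms");
console.log("left over:  " + (budgetMs - costMs).toFixed(1) + " ms");
```

With a thousand objects this comes to about 15.1 ms of a 16.7 ms frame, leaving roughly 1.6 ms for the actual GL work and everything else.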

Update 2: The empty call overhead for gl.isBuffer is 1.4 us rather than the 1.2 us measured above.

Update: Measured the overhead for actual GL calls and it's a bit more grim. The full method call for, say, gl.isBuffer takes 3.6 us. Subtract the 1.4 us call overhead and we're left with 2.2 us of actual work, of which getting the NativeJSContext takes 0.2 us, glXMakeContextCurrent 1.6 us, glIsBuffer 0.2 us and the trailing glGetError 0.2 us. Ouch.
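As a quick sanity check on that breakdown, here is a sketch that sums the parts and prints each one's share of the measured total. The numbers are the ones quoted above; the variable names are just for illustration.

```javascript
// Per-call timings for gl.isBuffer in microseconds, as reported above.
const measuredTotalUs = 3.6;
const parts = [
  ["XPConnect call overhead", 1.4],
  ["get NativeJSContext", 0.2],
  ["glXMakeContextCurrent", 1.6],
  ["glIsBuffer", 0.2],
  ["glGetError", 0.2],
];

// Sum of the individual parts, to compare against the measured total.
const sumUs = parts.reduce(function (acc, part) { return acc + part[1]; }, 0);

for (const [name, us] of parts) {
  const share = Math.round((us / measuredTotalUs) * 100);
  console.log(name + ": " + us.toFixed(1) + " us (" + share + "%)");
}
console.log("sum of parts: " + sumUs.toFixed(1) + " us, measured: " +
            measuredTotalUs.toFixed(1) + " us");
```

The parts add up to the measured 3.6 us, and the context switch alone is the single largest item, ahead of even the call overhead.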

The reason the GL canvas context does glXMakeContextCurrent before each GL call is that Firefox is single-threaded, and you can only have one current GL context per thread. So you need to manually flip between the different GL rendering contexts, which appears to be costly.

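But there's a solution! Only make the context current when it isn't already:

    if (glXGetCurrentContext() != myContext)
        glXMakeContextCurrent(dpy, pbuf, pbuf, myContext);

Now gl.isBuffer takes only 2.0 us if there's no contention, i.e. as long as no other GL context has been made current in between.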
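To see why the check helps, here is a toy model in plain JavaScript. It is not Firefox or GLX code: createThread, makeCurrent, callAlways, callIfNeeded and run are all hypothetical names, and makeCurrent merely stands in for glXMakeContextCurrent by counting how often it gets invoked.

```javascript
// Toy model of a thread that has exactly one "current" GL context.
function createThread() {
  return { current: null, makeCurrentCalls: 0 };
}

// Stand-in for glXMakeContextCurrent: records the switch and counts it.
function makeCurrent(thread, ctx) {
  thread.makeCurrentCalls++;
  thread.current = ctx;
}

// Original behaviour: switch unconditionally before every GL call.
function callAlways(thread, ctx, glCall) {
  makeCurrent(thread, ctx);
  return glCall();
}

// Fixed behaviour: switch only if a different context is current.
function callIfNeeded(thread, ctx, glCall) {
  if (thread.current !== ctx) makeCurrent(thread, ctx);
  return glCall();
}

// Issue `calls` GL calls, cycling through the given contexts, and return
// how many times the context had to be made current.
function run(callFn, contexts, calls) {
  const thread = createThread();
  for (let i = 0; i < calls; i++) {
    callFn(thread, contexts[i % contexts.length], function () {});
  }
  return thread.makeCurrentCalls;
}

const canvasA = { name: "A" };
const canvasB = { name: "B" };

console.log("always, one canvas:      " + run(callAlways, [canvasA], 1000));
console.log("if needed, one canvas:   " + run(callIfNeeded, [canvasA], 1000));
console.log("if needed, two canvases: " + run(callIfNeeded, [canvasA, canvasB], 1000));
```

With a single canvas the switch happens once instead of a thousand times. With two canvases taking turns on every call it still happens every time, which is the contention case where the check buys nothing.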

It's still sad that there's a 1.6 us overhead for a 0.4 us call, but at least it's better than a 3.2 us overhead :>