art with code


Little benchmark

Got my computer put together yesterday, and now's the time to benchmark it! On a Saturday afternoon...

Image correlation algorithm benchmark:

OpenCL
240 GBps -- Athlon II X4 640, 3GHz (12GHz aggregate), 2MB L2
85 GBps -- Core 2 Duo E6400, 2.1GHz (4.3GHz aggregate), 2MB L2

OpenMP+SSE optimized
103 GBps -- Athlon II X4
45 GBps -- Core 2 Duo

OpenMP+SSE naive
13 GBps -- Athlon II X4
5 GBps -- Core 2 Duo

Pretty much linear scaling with aggregate clock frequency in OpenCL: 240 / 85 is about 2.8x, and 12GHz / 4.3GHz is also about 2.8x. Both CPUs have a 3-cycle L1 latency and the algorithm is very much an L1 cache benchmark, so this isn't too surprising. The SSE version has some bandwidth or load-balancing bottleneck going on, and the naive version is pretty much a pure memory bandwidth benchmark.
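To make the scaling claim easy to check, here's a quick sketch of the arithmetic in Python. It takes the 240 and 85 GBps pair as the OpenCL run and uses the aggregate clocks from the list above; the names `speedup` and `scaling_efficiency` are made up for this example and weren't part of the benchmark code.

```python
# Throughput (GBps) and aggregate clock (GHz) figures copied from the results above.
AGGREGATE_GHZ = {"Athlon II X4": 12.0, "Core 2 Duo": 4.3}

RESULTS_GBPS = {
    "OpenCL": {"Athlon II X4": 240, "Core 2 Duo": 85},
    "OpenMP+SSE optimized": {"Athlon II X4": 103, "Core 2 Duo": 45},
    "OpenMP+SSE naive": {"Athlon II X4": 13, "Core 2 Duo": 5},
}


def speedup(results):
    """Throughput of the Athlon II X4 relative to the Core 2 Duo."""
    return results["Athlon II X4"] / results["Core 2 Duo"]


def scaling_efficiency(results):
    """Speedup divided by the aggregate clock ratio; 1.0 means linear scaling."""
    clock_ratio = AGGREGATE_GHZ["Athlon II X4"] / AGGREGATE_GHZ["Core 2 Duo"]
    return speedup(results) / clock_ratio


for name, results in RESULTS_GBPS.items():
    print(f"{name}: {speedup(results):.2f}x speedup, "
          f"{scaling_efficiency(results):.2f} of linear clock scaling")
```

An efficiency near 1.0 means throughput tracked the aggregate clock. The OpenCL pair lands at about 1.01 and the optimized SSE pair at about 0.82, which fits a bottleneck other than clock speed in the SSE version.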