One thing that amazes me in discussions extolling the performance benefits of shared memory over message passing is that apparently no one has looked at the hardware in use. Because the hardware has no shared memory. None. Nada. Zilch.
How a computer works in a nutshell
CPU: Hey, I'd like to read eight bytes from memory position 0x8008 to register %RAX.
Cache manager: Sure thing, wait a couple cycles. Hey L1, have you got the cache line with 0x8008?
L1: Sorry! (2 cycles if hit)
Cache manager: Hm, how about you L2? Got 0x8008?
L2: Oh, uh, sec, lemme see... nope. (9 cycles if hit)
Cache manager: Argh, OK, shit happens. DRAM, gimme 0x8008.
DRAM: Durr, 0x8008, yeah, 0x8008, yeah...
Cache manager: DRAM?
Cache manager: DRAM! 0x8008!
DRAM: Huh? Oh. Here you go. (200 cycles if hit, a couple dozen million if read from disk)
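To put rough numbers on that conversation, here is a back-of-the-envelope sketch of the average cost of one load. The cycle counts (2, 9, 200) are the ones quoted above, read as the total cost when that level answers. The hit rates are made up for illustration, and the function name `expected_load_cycles` is hypothetical. It ignores the disk case entirely.

```python
def expected_load_cycles(l1_hit_rate, l2_hit_rate,
                         l1_cycles=2, l2_cycles=9, dram_cycles=200):
    """Average cycles per load for a two-level cache in front of DRAM.

    Latencies default to the figures from the dialogue above.
    Hit rates are assumptions supplied by the caller, not measurements.
    """
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    return (l1_hit_rate * l1_cycles                 # answered by L1
            + l1_miss * l2_hit_rate * l2_cycles     # missed L1, answered by L2
            + l1_miss * l2_miss * dram_cycles)      # missed both, went to DRAM


# Assumed hit rates: 95% in L1, 80% of the remainder in L2.
print(round(expected_load_cycles(0.95, 0.80), 2))  # about 4.26 cycles
```

Even with those fairly generous assumed hit rates, the 1% of loads that reach DRAM account for roughly half of the average.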
Shared memory in a cache coherent SMP machine:
CPU A: Read 0x8000!
CPU B: Write %RAX to 0x8001!
Cache manager: Oh nuts. Hey CPU A, CPU B just wrote to a cache line you have, here's the new version.
CPU A: Ghh, OK. Write 0x8000!
Cache manager: CPU B! CPU A just wrote in 0x8000, you need a new version of the cache line!
CPU B: Damn it A, stop writing on my cache line!
CPU A: Your cache line? I'm so sorry, I didn't see your name on it!
Significant performance loss later:
Cache manager: The person who wrote this algorithm must be some kind of idiot.
Does that look like a shared memory architecture to you? No way. It's a message passing architecture, communicating in 64-byte cache lines. Treating it as shared memory will lead to performance problems.
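Why were CPU A and CPU B fighting at all, when one touched 0x8000 and the other 0x8001? Because both addresses sit on the same 64-byte line. A minimal sketch of the arithmetic, assuming 64-byte lines aligned on 64-byte boundaries (the helper names are hypothetical):

```python
CACHE_LINE_BYTES = 64  # line size from the text; assumed aligned to 64 bytes


def cache_line_index(address, line_bytes=CACHE_LINE_BYTES):
    """Which cache line a byte address falls on."""
    return address // line_bytes


def same_cache_line(a, b, line_bytes=CACHE_LINE_BYTES):
    """True if two addresses share a line, so a write to one invalidates the other."""
    return cache_line_index(a, line_bytes) == cache_line_index(b, line_bytes)


def padded_offsets(n_slots, line_bytes=CACHE_LINE_BYTES):
    """Byte offsets that give each per-CPU slot a line of its own."""
    return [i * line_bytes for i in range(n_slots)]


print(same_cache_line(0x8000, 0x8001))  # True: A and B collide
print(same_cache_line(0x8000, 0x8040))  # False: 64 bytes apart, separate lines
print(padded_offsets(3))                # [0, 64, 128]
```

Spacing each CPU's data a full line apart, as `padded_offsets` does, is one common way to stop the two from invalidating each other.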
In theory, a message passing system should yield the highest performance on present-day hardware, provided the compiler compiles the message passing down to CPUs talking to memory via cache lines. Does anyone know whether such a mythical beast exists?
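For the shape of the thing, here is a rough sketch of message-passing style at the program level: each worker keeps its running total private and sends exactly one message when done, instead of everyone hammering a shared counter. The names `worker` and `parallel_sum` are hypothetical, and this only shows the structure. It says nothing about how the messages map onto cache lines, and it makes no performance claim.

```python
import queue
import threading


def worker(chunk, outbox):
    """Sum a private chunk, then send one message with the result."""
    local_total = 0              # private state, never touched by other workers
    for value in chunk:
        local_total += value
    outbox.put(local_total)      # the only communication: a single message


def parallel_sum(values, n_workers=4):
    """Split values across workers and combine their one-message replies."""
    outbox = queue.Queue()
    chunks = [values[i::n_workers] for i in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(chunk, outbox))
               for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(outbox.get() for _ in range(n_workers))


print(parallel_sum(list(range(1000))))  # 499500
```

The design choice worth noticing is that the workers communicate once per chunk rather than once per element.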
Related reading: What every programmer should know about memory, part 2: CPU caches by Ulrich Drepper.