The table for this example shows three code fragments, executed concurrently by CPUs 0, 1, and 2.
Both ``a'' and ``b'' are initially zero.
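As a rough sketch of what such fragments might look like, assume that CPU 0 stores to ``a'', that CPU 1 waits for ``a'', executes a full memory barrier, and then stores to ``b'', and that CPU 2 reads ``b'', executes a read barrier, and then reads ``a''. This shape is an assumption inferred from the discussion rather than a copy of the table. The C11 fences below are stand-ins for the memory barriers, and the names cpu0(), cpu1(), cpu2(), and run_once() are hypothetical. Note that the C11 memory model orders these accesses more strongly than the hypothetical hardware under discussion, so this version is expected to pass its check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables, both initially zero. */
static atomic_int a, b;

/* CPU 0: a single store to "a". */
static void *cpu0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&a, 1, memory_order_relaxed);
	return NULL;
}

/* CPU 1: wait for "a", full barrier, then store to "b". */
static void *cpu1(void *arg)
{
	(void)arg;
	while (atomic_load_explicit(&a, memory_order_relaxed) == 0)
		continue;
	atomic_thread_fence(memory_order_seq_cst);	/* assumed full barrier */
	atomic_store_explicit(&b, 1, memory_order_relaxed);
	return NULL;
}

/* CPU 2: read "b", read barrier, read "a", then record the check. */
static void *cpu2(void *arg)
{
	int *ok = arg;
	int y = atomic_load_explicit(&b, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);	/* assumed read barrier */
	int x = atomic_load_explicit(&a, memory_order_relaxed);

	*ok = (y == 0 || x == 1);	/* the condition the assertion tests */
	return NULL;
}

/* Hypothetical helper: run all three fragments once, return 1 if the check held. */
int run_once(void)
{
	void *(*fn[3])(void *) = { cpu0, cpu1, cpu2 };
	pthread_t t[3];
	int ok = 0;
	void *args[3] = { NULL, NULL, &ok };

	atomic_store(&a, 0);
	atomic_store(&b, 0);
	for (int i = 0; i < 3; i++) {
		int rc = pthread_create(&t[i], NULL, fn[i], args[i]);

		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	return ok;
}
```

Calling run_once() repeatedly from a main() function exercises different interleavings. The relaxed loads and stores keep the fences as the only source of ordering, which mirrors the role of the barriers in the fragments.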
Again, suppose CPU 0 recently experienced many cache misses, so that its message queue is full, but that CPU 1 has been running exclusively within the cache, so that its message queue is empty. Then CPU 0's assignment to ``a'' will appear in Node 0's cache immediately (and thus be visible to CPU 1), but will be blocked behind CPU 0's prior traffic. In contrast, CPU 1's assignment to ``b'' will sail through CPU 1's previously empty queue. Therefore, CPU 2 might well see CPU 1's assignment to ``b'' before it sees CPU 0's assignment to ``a'', causing the assertion to fire, despite the memory barriers.
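The queueing argument above can be traced with a small single-threaded toy model. This is only a sketch under stated assumptions: CPUs 0 and 1 share Node 0's cache, CPU 2 reads from a separate Node 1 cache, each CPU has a FIFO queue of updates bound for the other node, and each queue delivers one message per interconnect step. The names simulate(), enqueue(), deliver_one(), and VAR_OTHER are hypothetical and serve only as illustration.

```c
#include <assert.h>

enum { VAR_A, VAR_B, VAR_OTHER, NVARS };
#define QLEN 16

/* FIFO of pending updates from one CPU to the remote node. */
struct queue {
	int var[QLEN];
	int head, tail;
};

static void enqueue(struct queue *q, int var)
{
	assert(q->tail < QLEN);
	q->var[q->tail++] = var;
}

/* Deliver the oldest pending update, if any, by setting that variable to 1. */
static void deliver_one(struct queue *q, int *remote_cache)
{
	if (q->head < q->tail)
		remote_cache[q->var[q->head++]] = 1;
}

/* Return 1 if CPU 2's check (y == 0 || x == 1) holds, 0 if it would fire. */
int simulate(int cpu0_backlog)
{
	int node0[NVARS] = { 0 }, node1[NVARS] = { 0 };
	struct queue q0 = { { 0 }, 0, 0 }, q1 = { { 0 }, 0, 0 };
	int x, y;

	/* CPU 0's earlier cache misses leave unrelated traffic in its queue. */
	for (int i = 0; i < cpu0_backlog; i++)
		enqueue(&q0, VAR_OTHER);

	/* CPU 0: a = 1. Visible in Node 0 at once, queued for Node 1. */
	node0[VAR_A] = 1;
	enqueue(&q0, VAR_A);

	/* CPU 1: sees a == 1 in Node 0's cache, executes its barrier, then */
	/* b = 1. The barrier does not wait for CPU 0's queue to drain. */
	assert(node0[VAR_A] == 1);
	node0[VAR_B] = 1;
	enqueue(&q1, VAR_B);

	/* One interconnect step: each queue delivers its oldest message. */
	deliver_one(&q0, node1);
	deliver_one(&q1, node1);

	/* CPU 2: y = b; read barrier; x = a. Both reads hit Node 1's cache. */
	y = node1[VAR_B];
	x = node1[VAR_A];
	return y == 0 || x == 1;
}
```

In this model, simulate(0) returns 1 because the update to ``a'' is at the head of CPU 0's queue, while simulate(3) returns 0 because the update to ``b'' overtakes the update to ``a'', which is still stuck behind the backlog.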
In theory, portable code should not rely on this example code fragment. However, as before, in practice it actually does work on most mainstream computer systems.
Paul E. McKenney 2011-12-16