14.2.10.7 Examples of Memory Barrier Pairings

First, a write barrier imposes a partial ordering on store operations. Consider the following sequence of events:



STORE A = 1
STORE B = 2
STORE C = 3
<write barrier>
STORE D = 4
STORE E = 5


This sequence of events is committed to the memory coherence system in an order that the rest of the system might perceive as the unordered set {A=1, B=2, C=3} all occurring before the unordered set {D=4, E=5}, as shown in the following figure.

Figure: Write Barrier Ordering Semantics
\includegraphics{advsync/WriteBarrierOrdering}
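
As a rough illustration only, the store sequence above can be sketched with C11 atomics. The sketch assumes that `atomic_thread_fence(memory_order_release)` is an acceptable stand-in for the write barrier (it is somewhat stronger, because it also orders earlier loads). The function name `cpu1_stores` is hypothetical and does not come from the text.

```c
#include <stdatomic.h>

/* Shared variables named after the stores in the listing above
 * (zero-initialized because they have static storage duration). */
static atomic_int A, B, C, D, E;

/* Hypothetical writer (assumption: a C11 release fence stands in for
 * <write barrier>). The stores to A, B, and C are unordered among
 * themselves, as are the stores to D and E, but the fence orders the
 * first group before the second for any observer that pairs it with a
 * suitable read-side barrier. */
void cpu1_stores(void)
{
    atomic_store_explicit(&A, 1, memory_order_relaxed);
    atomic_store_explicit(&B, 2, memory_order_relaxed);
    atomic_store_explicit(&C, 3, memory_order_relaxed);
    atomic_thread_fence(memory_order_release); /* <write barrier> */
    atomic_store_explicit(&D, 4, memory_order_relaxed);
    atomic_store_explicit(&E, 5, memory_order_relaxed);
}
```

The fence by itself guarantees nothing to a reader that omits its own barrier, which is the point of the pairings that follow.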

Second, a data dependency barrier imposes a partial ordering on data-dependent loads. Consider the following sequence of events with initial values {B = 7, X = 9, Y = 8, C = &Y}:



CPU 1                   CPU 2
a = 1;
b = 2;
<write barrier>
c = &b;                 LOAD X
d = 4;                  LOAD C (gets &B)
                        LOAD *C (reads B)


Without intervention, CPU 2 may perceive the events on CPU 1 in some effectively random order, despite the write barrier issued by CPU 1:

Figure: Data Dependency Barrier Omitted
\includegraphics{advsync/DataDependencyNeeded}

In the above example, CPU 2 perceives B as 7, even though its load of *C (which is B) comes after its load of C.

If, however, a data dependency barrier were placed between the load of C and the load of *C (i.e., B) on CPU 2, again with initial values of {B = 7, X = 9, Y = 8, C = &Y}:



CPU 1                   CPU 2
a = 1;
b = 2;
<write barrier>
c = &b;                 LOAD X
d = 4;                  LOAD C (gets &B)
                        <data dependency barrier>
                        LOAD *C (reads B)


then the ordering will be as intuitively expected, as shown in the following figure.

Figure: Data Dependency Barrier Supplied
\includegraphics{advsync/DataDependencySupplied}
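
The pointer-publication pattern above might be sketched in C11 as follows. This is a minimal sketch under stated assumptions: a release fence stands in for the write barrier, and an acquire fence stands in for the data dependency barrier (an acquire fence is stronger than a pure data dependency barrier, but it is the portable C11 tool). The function names `cpu1_publish` and `cpu2_follow_pointer` are hypothetical, and the stores to a and d and the load of X are omitted for brevity.

```c
#include <stdatomic.h>

static int b = 7;               /* B in the listing, initially 7 */
static int y = 8;               /* Y in the listing, initially 8 */
static _Atomic(int *) c = &y;   /* C in the listing, initially &Y */

/* Hypothetical CPU 1: update b, then publish a pointer to it. */
void cpu1_publish(void)
{
    b = 2;
    atomic_thread_fence(memory_order_release); /* <write barrier> */
    atomic_store_explicit(&c, &b, memory_order_relaxed);
}

/* Hypothetical CPU 2: load the pointer, then load through it.
 * Assumption: the acquire fence plays the role of
 * <data dependency barrier>. If the pointer load returns &b, the
 * dereference is then guaranteed to observe b == 2, never the stale 7. */
int cpu2_follow_pointer(void)
{
    int *p = atomic_load_explicit(&c, memory_order_relaxed); /* LOAD C */
    atomic_thread_fence(memory_order_acquire);
    return *p;                                               /* LOAD *C */
}
```

In real use the two functions would run on different threads; a reader that still sees `&y` in `c` simply returns 8.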

Third, a read barrier imposes a partial ordering on loads. Consider the following sequence of events, with initial values {A = 0, B = 9}:



CPU 1                   CPU 2
a = 1;
<write barrier>
b = 2;
                        LOAD B
                        LOAD A


Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in some effectively random order, despite the write barrier issued by CPU 1:

Figure: Read Barrier Needed
\includegraphics{advsync/ReadBarrierNeeded}

If, however, a read barrier were to be placed between the load of B and the load of A on CPU 2, again with initial values of {A = 0, B = 9}:



CPU 1                   CPU 2
a = 1;
<write barrier>
b = 2;
                        LOAD B
                        <read barrier>
                        LOAD A


then the partial ordering imposed by CPU 1's write barrier will be perceived correctly by CPU 2, as shown in the following figure.

Figure: Read Barrier Supplied
\includegraphics{advsync/ReadBarrierSupplied}
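
The write-barrier and read-barrier pairing above can be approximated in C11 as shown below. This sketch assumes that release and acquire fences are acceptable stand-ins for the write and read barriers respectively (both are somewhat stronger than the barriers described here). The type `struct observed` and the functions `cpu1_write` and `cpu2_read` are hypothetical names introduced only for illustration.

```c
#include <stdatomic.h>

static atomic_int a = 0;   /* A in the listing, initially 0 */
static atomic_int b = 9;   /* B in the listing, initially 9 */

/* Values seen by CPU 2, in the order it loaded them. */
struct observed {
    int b;
    int a;
};

/* Hypothetical CPU 1: store a, then b, separated by the write barrier. */
void cpu1_write(void)
{
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release); /* <write barrier> */
    atomic_store_explicit(&b, 2, memory_order_relaxed);
}

/* Hypothetical CPU 2: load b, then a, separated by the read barrier.
 * If the load of b returns 2, the paired fences guarantee that the
 * load of a returns 1. */
struct observed cpu2_read(void)
{
    struct observed r;

    r.b = atomic_load_explicit(&b, memory_order_relaxed); /* LOAD B */
    atomic_thread_fence(memory_order_acquire);            /* <read barrier> */
    r.a = atomic_load_explicit(&a, memory_order_relaxed); /* LOAD A */
    return r;
}
```

Removing either fence removes the guarantee: both sides of the pairing are needed.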

To illustrate this more completely, consider what could happen if the code contained a load of A on either side of the read barrier, once again with the same initial values of {A = 0, B = 9}:



CPU 1                   CPU 2
a = 1;
<write barrier>
b = 2;
                        LOAD B
                        LOAD A (1st)
                        <read barrier>
                        LOAD A (2nd)


Even though the two loads of A both occur after the load of B, they may come up with different values, as shown in the following figure.

Figure: Read Barrier Supplied, Double Load
\includegraphics{advsync/ReadBarrierSupplied1}

Of course, it may well be that CPU 1's update to A becomes perceptible to CPU 2 before the read barrier completes, as shown in the following figure.

Figure: Read Barrier Supplied, Take Two
\includegraphics{advsync/ReadBarrierSupplied2}

The guarantee is that the second load will always come up with A == 1 if the load of B came up with B == 2. No such guarantee exists for the first load of A; that may come up with either A == 0 or A == 1.
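
The guarantee just stated can be written down as a checkable predicate. The following is a sketch only, again assuming that C11 release and acquire fences stand in for the write and read barriers. The names `struct double_load`, `cpu1_write`, `cpu2_read_twice`, and `guarantee_holds` are hypothetical.

```c
#include <stdatomic.h>

static atomic_int a = 0;   /* A in the listing, initially 0 */
static atomic_int b = 9;   /* B in the listing, initially 9 */

/* Values seen by CPU 2 in the double-load listing. */
struct double_load {
    int b;
    int a_first;    /* LOAD A (1st), before the read barrier */
    int a_second;   /* LOAD A (2nd), after the read barrier */
};

/* Hypothetical CPU 1, as in the listing. */
void cpu1_write(void)
{
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release); /* <write barrier> */
    atomic_store_explicit(&b, 2, memory_order_relaxed);
}

/* Hypothetical CPU 2: only the second load of a is ordered after the
 * load of b by the fence; the first load may return either 0 or 1. */
struct double_load cpu2_read_twice(void)
{
    struct double_load r;

    r.b = atomic_load_explicit(&b, memory_order_relaxed);
    r.a_first = atomic_load_explicit(&a, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire); /* <read barrier> */
    r.a_second = atomic_load_explicit(&a, memory_order_relaxed);
    return r;
}

/* Nonzero if the observation is consistent with the guarantee:
 * b == 2 implies a_second == 1. Nothing is required of a_first. */
int guarantee_holds(struct double_load r)
{
    return r.b != 2 || r.a_second == 1;
}
```

A stress test would run `cpu1_write` and `cpu2_read_twice` on separate threads and check `guarantee_holds` on every observation; the predicate deliberately ignores `a_first`.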

Paul E. McKenney 2011-12-16