IA64 offers a weak consistency model, so that in the absence of explicit
memory-barrier instructions, IA64 is within its rights to arbitrarily
reorder memory references [Int02b].
IA64 has a memory-fence instruction named mf, but also has
``half-memory fence'' modifiers to loads, stores, and to some of its atomic
instructions [Int02a].
The acq modifier prevents subsequent memory-reference instructions
from being reordered before the acq, but permits
prior memory-reference instructions to be reordered after the acq,
as fancifully illustrated by Figure .
Similarly, the rel modifier prevents prior memory-reference
instructions from being reordered after the rel, but allows
subsequent memory-reference instructions to be reordered before
the rel.
These half-memory fences are useful for critical sections, since it is safe to push operations into a critical section, but it can be fatal to allow them to bleed out. Because IA64 is one of the few CPUs with this property, it defines Linux's semantics of memory ordering associated with lock acquisition and release.
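The critical-section use of half-memory fences can be sketched in portable C11, where memory_order_acquire and memory_order_release play roughly the roles of the acq and rel modifiers. This is only an illustrative sketch: the names demo_lock, demo_counter, and demo_run() are hypothetical, this is not the Linux kernel's spinlock, and the exact instructions a compiler would emit for IA64 are not shown here.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical demo lock, not the Linux kernel's spinlock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static long demo_counter;	/* protected by demo_lock */

static void demo_lock_acquire(void)
{
	/* Acquire: later accesses cannot be reordered before this operation. */
	while (atomic_flag_test_and_set_explicit(&demo_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */
}

static void demo_lock_release(void)
{
	/* Release: earlier accesses cannot be reordered after this operation. */
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

static void *demo_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		demo_lock_acquire();
		demo_counter++;	/* cannot bleed out of the critical section */
		demo_lock_release();
	}
	return NULL;
}

/* Runs two workers and returns the final counter value. */
long demo_run(void)
{
	pthread_t t[2];

	demo_counter = 0;
	for (int i = 0; i < 2; i++)
		pthread_create(&t[i], NULL, demo_worker, NULL);
	for (int i = 0; i < 2; i++)
		pthread_join(t[i], NULL);
	return demo_counter;
}
```

Accesses before demo_lock_acquire() or after demo_lock_release() may still be pulled into the critical section, which is harmless, but the increment itself stays between the two half fences, so no updates are lost.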
The IA64 mf instruction is used for the smp_rmb(), smp_mb(), and smp_wmb() primitives in the Linux kernel. Oh, and despite rumors to the contrary, the ``mf'' mnemonic really does stand for ``memory fence''.
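The effect of mapping all three primitives onto a single full fence can be sketched in portable C11. The sketch_ names below are hypothetical stand-ins rather than the kernel's definitions, and atomic_thread_fence(memory_order_seq_cst) is assumed here as a rough analogue of mf.

```c
#include <stdatomic.h>

/*
 * Hypothetical stand-ins for the kernel primitives: all three expand to
 * the same full fence, mirroring the use of the single mf instruction.
 */
#define sketch_mf()		atomic_thread_fence(memory_order_seq_cst)
#define sketch_smp_mb()		sketch_mf()
#define sketch_smp_rmb()	sketch_mf()
#define sketch_smp_wmb()	sketch_mf()

static atomic_int sketch_data;
static atomic_int sketch_flag;

/* Writer: store the data, then set the flag, ordered by a write barrier. */
void sketch_publish(int value)
{
	atomic_store_explicit(&sketch_data, value, memory_order_relaxed);
	sketch_smp_wmb();
	atomic_store_explicit(&sketch_flag, 1, memory_order_relaxed);
}

/* Reader: returns 1 and fills *value if the flag is set, otherwise 0. */
int sketch_consume(int *value)
{
	if (!atomic_load_explicit(&sketch_flag, memory_order_relaxed))
		return 0;
	sketch_smp_rmb();
	*value = atomic_load_explicit(&sketch_data, memory_order_relaxed);
	return 1;
}
```

A reader that sees the flag set is guaranteed to see the data stored before it. Using a full fence for the read and write barriers is stronger than strictly required, but it is always safe.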
Finally, IA64 offers a global total order for ``release'' operations, including the ``mf'' instruction. This provides the notion of transitivity, where if a given code fragment sees a given access as having happened, any later code fragment will also see that earlier access as having happened. Assuming, that is, that all the code fragments involved correctly use memory barriers.
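Transitivity can be illustrated with a three-thread litmus test, sketched here in portable C11 with release stores and acquire loads standing in for the IA64 operations. The wrc_ names are hypothetical, and the sketch shows the guarantee as the C11 memory model states it rather than exercising IA64 hardware directly.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int wrc_x, wrc_y;
static int wrc_r1, wrc_r2, wrc_r3;	/* read by the parent after join */

static void *wrc_t0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&wrc_x, 1, memory_order_release);
	return NULL;
}

static void *wrc_t1(void *arg)
{
	(void)arg;
	wrc_r1 = atomic_load_explicit(&wrc_x, memory_order_acquire);
	atomic_store_explicit(&wrc_y, 1, memory_order_release);
	return NULL;
}

static void *wrc_t2(void *arg)
{
	(void)arg;
	wrc_r2 = atomic_load_explicit(&wrc_y, memory_order_acquire);
	wrc_r3 = atomic_load_explicit(&wrc_x, memory_order_acquire);
	return NULL;
}

/* Counts rounds in which t2 saw t1's store to y but missed t0's store to x. */
int wrc_count_violations(int rounds)
{
	void *(*fn[3])(void *) = { wrc_t0, wrc_t1, wrc_t2 };
	int violations = 0;

	for (int i = 0; i < rounds; i++) {
		pthread_t t[3];

		atomic_store(&wrc_x, 0);
		atomic_store(&wrc_y, 0);
		for (int j = 0; j < 3; j++)
			pthread_create(&t[j], NULL, fn[j], NULL);
		for (int j = 0; j < 3; j++)
			pthread_join(t[j], NULL);
		if (wrc_r1 == 1 && wrc_r2 == 1 && wrc_r3 == 0)
			violations++;
	}
	return violations;
}
```

If the second thread saw the store to wrc_x before storing to wrc_y, and the third thread saw that store to wrc_y, then the third thread must also see the store to wrc_x, so the function should return zero for any number of rounds.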
Paul E. McKenney 2011-12-16