One such obstacle is atomic operations. The very notion of an atomic operation conflicts, in some sense, with the piece-at-a-time assembly-line operation of a CPU pipeline. To hardware designers' credit, modern CPUs use a number of extremely clever tricks to make such operations look atomic even though they are in fact executed piece-at-a-time. Even so, there are cases where the pipeline must be delayed or even flushed in order to permit a given atomic operation to complete correctly.
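To make the term concrete, here is a minimal sketch of an atomic read-modify-write, assuming a C11 compiler with `<stdatomic.h>` and POSIX threads. The names `counter`, `incrementer()`, `run_atomic_counter()`, `NTHREADS`, and `NINCS` are hypothetical and chosen only for illustration. Each call to `atomic_fetch_add_explicit()` must appear indivisible to all other CPUs, which is exactly the kind of operation that can force the pipeline to wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NTHREADS 4       /* hypothetical thread count */
#define NINCS    100000  /* hypothetical increments per thread */

static atomic_long counter;  /* shared counter updated atomically */

/* Each thread atomically adds 1 to the shared counter NINCS times. */
static void *incrementer(void *arg)
{
	(void)arg;
	for (long i = 0; i < NINCS; i++)
		atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
	return NULL;
}

/* Run NTHREADS incrementers to completion and return the final count. */
long run_atomic_counter(void)
{
	pthread_t tid[NTHREADS];

	atomic_store(&counter, 0);
	for (int i = 0; i < NTHREADS; i++) {
		int rc = pthread_create(&tid[i], NULL, incrementer, NULL);

		assert(rc == 0);
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	return atomic_load(&counter);
}
```

Because every increment is atomic, no update is lost and `run_atomic_counter()` returns `NTHREADS * NINCS`. Replacing the atomic add with a plain `counter++` on an ordinary `long` would be a data race and could lose updates, though it would likely run faster, which is the trade-off this section describes.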
The resulting effect on performance is depicted in the accompanying figure.
Unfortunately, atomic operations usually apply only to single elements of data. Because many parallel algorithms require that ordering constraints be maintained between updates of multiple data elements, most CPUs provide memory barriers. These memory barriers also serve as performance-sapping obstacles, as described in the next section.
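The ordering problem between two data elements can be sketched as follows, again assuming C11 `<stdatomic.h>` and POSIX threads. The names `payload`, `ready`, `producer()`, and `run_message_passing()` are hypothetical. The C11 `atomic_thread_fence()` calls stand in for the CPU's memory barriers: the compiler emits whatever barrier instructions, if any, the target architecture requires.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;       /* ordinary data element */
static atomic_int ready;  /* flag announcing that payload is valid */

/* Producer: update the data, then the flag, with a barrier in between. */
static void *producer(void *arg)
{
	(void)arg;
	payload = 42;
	/* Order the payload store before the flag store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
	return NULL;
}

/* Consumer: wait for the flag, then read the data after a barrier. */
int run_message_passing(void)
{
	pthread_t tid;
	int rc;
	int seen;

	payload = 0;
	atomic_store(&ready, 0);
	rc = pthread_create(&tid, NULL, producer, NULL);
	assert(rc == 0);

	while (!atomic_load_explicit(&ready, memory_order_relaxed))
		continue;  /* spin until the producer sets the flag */
	/* Order the flag load before the payload load. */
	atomic_thread_fence(memory_order_acquire);
	seen = payload;

	pthread_join(tid, NULL);
	return seen;
}
```

With both fences in place, a consumer that observes `ready == 1` is guaranteed to observe `payload == 42`. Neither access to `payload` is itself atomic. It is the pair of barriers that maintains the ordering constraint between the two data elements, and each barrier can cost pipeline time.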
Quick Quiz 4.2: What types of machines would allow atomic operations on multiple data elements? End Quick Quiz
Paul E. McKenney 2011-12-16