``After'' is an intuitive, but surprisingly difficult concept.
An important non-intuitive issue is that code can be delayed at
any point for any amount of time.
Consider a producing and a consuming thread that communicate using
a global struct with a timestamp ``t'' and integer fields ``a'', ``b'',
and ``c''.
The producer loops recording the current time
(in seconds since 1970 in decimal),
then updating the values of ``a'', ``b'', and ``c'',
as shown in Figure .
The consumer code loops, also recording the current time, but also
copying the producer's timestamp along with the fields ``a'',
``b'', and ``c'', as shown in
Figure
.
At the end of the run, the consumer outputs a list of anomalous recordings,
e.g., where time has appeared to go backwards.
Quick Quiz A.1: What SMP coding errors can you see in these examples? See time.c for full code. End Quick Quiz
One might intuitively expect that the difference between the producer
and consumer timestamps would be quite small, as it should not take
much time for the producer to record the timestamps or the values.
An excerpt of some sample output on a dual-core 1GHz x86 is shown in
Table .
Here, the ``seq'' column is the number of times through the loop,
the ``time'' column is the time of the anomaly in seconds, the ``delta''
column is the number of seconds the consumer's timestamp follows that
of the producer (where a negative value indicates that the consumer
has collected its timestamp before the producer did), and the
columns labelled ``a'', ``b'', and ``c'' show the amount that these
variables increased since the prior snapshot collected by the consumer.
Why is time going backwards? The number in parentheses is the difference in microseconds, with a large number exceeding 10 microseconds, and one exceeding even 100 microseconds! Please note that this CPU can potentially execute about more than 100,000 instructions in that time.
One possible reason is given by the following sequence of events:
In this scenario, the producer's timestamp might be an arbitrary amount of time after the consumer's timestamp.
How do you avoid agonizing over the meaning of ``after'' in your SMP code?
Simply use SMP primitives as designed.
In this example, the easiest fix is to use locking, for example,
acquire a lock in the producer before line 10 in
Figure and in
the consumer before line 13 in
Figure
.
This lock must also be released after line 13 in
Figure
and
after line 17 in
Figure
.
These locks cause the code segments in line 10-13 of
Figure
and in line 13-17 of
Figure
to exclude
each other, in other words, to run atomically with respect to each other.
This is represented in
Figure
:
the locking prevents any of the boxes of code from overlapping in time, so
that the consumer's timestamp must be collected after the prior
producer's timestamp.
The segments of code in each box in this figure are termed
``critical sections''; only one such critical section may be executing
at a given time.
This addition of locking results in output as shown in
Figure .
Here there are no instances of time going backwards, instead,
there are only cases with more than 1,000 counts different between
consecutive reads by the consumer.
|
Quick Quiz A.2: How could there be such a large gap between successive consumer reads? See timelocked.c for full code. End Quick Quiz
In summary, if you acquire an exclusive lock, you know that
anything you do while holding that lock will appear to happen after
anything done by any prior holder of that lock.
No need to worry about which CPU did or did not execute a memory
barrier, no need to worry about the CPU or compiler reordering
operations - life is simple.
Of course, the fact that this locking prevents these two pieces of
code from running concurrently might limit the program's ability
to gain increased performance on multiprocessors, possibly resulting
in a ``safe but slow'' situation.
Chapter describes ways of
gaining performance and scalability in many situations.
However, in most cases, if you find yourself worrying about what happens before or after a given piece of code, you should take this as a hint to make better use of the standard primitives. Let these primitives do the worrying for you.
Paul E. McKenney 2011-12-16