Unfortunately, when causing a given thread to give up its count, it is necessary to atomically manipulate both that thread's counter and countermax variables. The usual way to do this is to combine these two variables into a single variable, for example, given a 32-bit variable, using the high-order 16 bits to represent counter and the low-order 16 bits to represent countermax.
The variables and access functions for a simple atomic limit counter
are shown in
Figure
(count_lim_atomic.c).
The counter and countermax variables in earlier algorithms
are combined into the single variable counterandmax shown on
line 1, with counter in the upper half and countermax in
the lower half.
This variable is of type atomic_t, which has an underlying
representation of int.
Lines 2-6 show the definitions for globalcountmax, globalcount,
globalreserve, counterp, and gblcnt_mutex, all of which
take on roles similar to their counterparts in
Figure .
Line 7 defines CM_BITS, which gives the number of bits in each half
of counterandmax, and line 8 defines MAX_COUNTERMAX, which
gives the maximum value that may be held in either half of
counterandmax.
Quick Quiz 6.29:
In what way does line 7 of
Figure
violate the C standard?
End Quick Quiz
Lines 10-15 show the split_counterandmax_int() function, which, when given the underlying int from the atomic_t counterandmax variable. Line 13 isolates the most-significant half of this int, placing the result as specified by argument c, and line 14 isolates the least-significant half of this int, placing the result as specified by argument cm.
Lines 17-25 show the split_counterandmax() function, which picks up the underlying int from the specified variable on line 21, stores it as specified by the old argument on line 23, and then invokes split_counterandmax_int() to split it on line 24.
Quick Quiz 6.30:
Given that there is only one counterandmax variable,
why bother passing in a pointer to it on line 18 of
Figure ?
End Quick Quiz
Lines 27-33 show the merge_counterandmax() function, which can be thought of as the inverse of split_counterandmax(). Line 31 merges the counter and countermax values passed in c and cm, respectively, and returns the result.
Quick Quiz 6.31:
Why does merge_counterandmax() in
Figure
return an int rather than storing directly into an
atomic_t?
End Quick Quiz
Figure
shows the add_count(), sub_count(), and
read_count() functions.
Lines 1-32 show add_count(), whose fastpath spans lines 8-15, with the remainder of the function being the slowpath. Lines 8-14 of the fastpath form a compare-and-swap (CAS) loop, with the atomic_cmpxchg() primitives on lines 13-14 performing the actual CAS. Line 9 splits the current thread's counterandmax variable into its counter (in c) and countermax (in cm) components, while placing the underlying int into old. Line 10 checks whether the amount delta can be accommodated locally (taking care to avoid integer overflow), and if not, line 11 transfers to the slowpath. Otherwise, line 11 combines an updated counter value with the original countermax value into new. The atomic_cmpxchg() primitive on lines 13-14 then atomically compares this thread's counterandmax variable to old, updating its value to new if the comparison succeeds. If the comparison succeeds, line 15 returns success, otherwise, execution continues in the loop at line 9.
Quick Quiz 6.32:
Yecch!
Why the ugly goto on line 11 of
Figure ?
Haven't you heard of the break statement???
End Quick Quiz
Quick Quiz 6.33:
Why would the atomic_cmpxchg() primitive at lines 13-14 of
Figure
ever fail?
After all, we picked up its old value on line 9 and have not
changed it!
End Quick Quiz
Lines 16-32 of
Figure
show add_count()'s slowpath, which is protected by gblcnt_mutex,
which is acquired on line 17 and released on lines 24 and 30.
Line 18 invokes globalize_count(), which moves this thread's
state to the global counters.
Lines 19-20 check whether the delta value can be accommodated by
the current global state, and, if not, line 21 invokes
flush_local_count() to flush all threads' local state to the
global counters, and then lines 22-23 recheck whether delta can
be accommodated.
If, after all that, the addition of delta still cannot be accommodated,
then line 24 releases gblcnt_mutex (as noted earlier), and
then line 25 returns failure.
Otherwise, line 28 adds delta to the global counter, line 29 spreads counts to the local state if appropriate, line 30 releases gblcnt_mutex (again, as noted earlier), and finally, line 31 returns success.
Lines 34-63 of
Figure
show sub_count(), which is structured similarly to
add_count(), having a fastpath on lines 41-48 and a slowpath on
lines 49-62.
A line-by-line analysis of this function is left as an exercise to
the reader.
Figure shows read_count().
Line 9 acquires gblcnt_mutex and line 16 releases it.
Line 10 initializes local variable sum to the value of
globalcount, and the loop spanning lines 11-15 adds the
per-thread counters to this sum, isolating each per-thread counter
using split_counterandmax on line 13.
Finally, line 17 returns the sum.
Figure
shows the utility functions
globalize_count(),
flush_local_count(),
balance_count(),
count_register_thread(), and
count_unregister_thread().
The code for globalize_count() is shown on lines 1-12, and
it is similar to that of previous algorithms, with the addition of
line 7, which is now required to split out counter and
countermax from counterandmax.
The code for flush_local_count(), which moves all threads' local counter state to the global counter, is shown on lines 14-32. Line 22 checks to see if the value of globalreserve permits any per-thread counts, and, if not, line 23 returns. Otherwise, line 24 initializes local variable zero to a combined zeroed counter and countermax. The loop spanning lines 25-31 sequences through each thread. Line 26 checks to see if the current thread has counter state, and, if so, lines 27-30 move that state to the global counters. Line 27 atomically fetches the current thread's state while replacing it with zero. Line 28 splits this state into its counter (in local variable c) and countermax (in local variable cm) components. Line 29 adds this thread's counter to globalcount, while line 30 subtracts this thread's countermax from globalreserve.
Quick Quiz 6.34:
What stops a thread from simply refilling its
counterandmax variable immediately after
flush_local_count() on line 14 of
Figure
empties it?
End Quick Quiz
Quick Quiz 6.35:
What prevents concurrent execution of the fastpath of either
atomic_add() or atomic_sub() from interfering with
the counterandmax variable while
flush_local_count() is accessing it on line 27 of
Figure
empties it?
End Quick Quiz
Lines 34-54 show the code for balance_count(), which refills the calling thread's local counterandmax variable. This function is quite similar to that of the preceding algorithms, with changes required to handle the merged counterandmax variable. Detailed analysis of the code is left as an exercise for the reader, as it is with the count_register_thread() function starting on line 56 and the count_unregister_thread() function starting on line 65.
Quick Quiz 6.36:
Given that the atomic_set() primitive does a simple
store to the specified atomic_t, how can line 53 of
balance_count() in
Figure
work correctly in face of concurrent flush_local_count()
updates to this variable?
End Quick Quiz
Paul E. McKenney 2011-12-16