D.4.2.3.1 Grace-Period State Machine Overview
The state (recorded in rcu_try_flip_state)
can take on the following values:
- rcu_try_flip_idle_state: the grace-period state
machine is idle because there is no RCU grace-period activity.
The rcu_ctrlblk.completed grace-period counter
is incremented upon exit from this state, and all of the
per-CPU rcu_flip_flag variables are set
to rcu_flipped.
- rcu_try_flip_waitack_state:
waiting for all CPUs to acknowledge that they have seen the
previous state's increment, which they do by setting their
rcu_flip_flag variables to rcu_flip_seen.
Once all CPUs have so acknowledged, we know that the old
set of counters can no longer be incremented.
- rcu_try_flip_waitzero_state:
waiting for the old counters to sum to zero.
Once the counters sum to zero, all of the per-CPU
rcu_mb_flag variables are set to
rcu_mb_needed.
- rcu_try_flip_waitmb_state:
waiting for all CPUs to execute a memory-barrier instruction,
which they signify by setting their rcu_mb_flag
variables to rcu_mb_done.
Once all CPUs have done so, all CPUs are guaranteed to see
the changes made by any RCU read-side critical section that
started before the beginning of the corresponding grace period,
even on weakly ordered machines.
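The four states and their exit conditions can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions, not the kernel's implementation: the state and flag value names come from the list above, but the enum type names for the flags, `struct rcu_sim`, `rcu_sim_step()`, `gp_needed`, `old_ctr_sum`, and the local `NR_CPUS` value are hypothetical, and no real memory barriers or concurrency are modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

enum rcu_try_flip_states {
	rcu_try_flip_idle_state,	/* no grace-period activity */
	rcu_try_flip_waitack_state,	/* waiting for CPUs to acknowledge the flip */
	rcu_try_flip_waitzero_state,	/* waiting for old counters to sum to zero */
	rcu_try_flip_waitmb_state,	/* waiting for CPUs to execute memory barriers */
};

enum rcu_flip_flag_values { rcu_flip_seen, rcu_flipped };
enum rcu_mb_flag_values { rcu_mb_done, rcu_mb_needed };

/* Hypothetical container; zero-initialization yields the idle state. */
struct rcu_sim {
	enum rcu_try_flip_states state;
	long completed;		/* stands in for rcu_ctrlblk.completed */
	enum rcu_flip_flag_values flip_flag[NR_CPUS];
	enum rcu_mb_flag_values mb_flag[NR_CPUS];
	long old_ctr_sum;	/* sum of the old counters, supplied by the caller */
};

static bool all_flip_seen(const struct rcu_sim *s)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (s->flip_flag[cpu] != rcu_flip_seen)
			return false;
	return true;
}

static bool all_mb_done(const struct rcu_sim *s)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (s->mb_flag[cpu] != rcu_mb_done)
			return false;
	return true;
}

/* Take at most one step; return true if the state changed. */
static bool rcu_sim_step(struct rcu_sim *s, bool gp_needed)
{
	int cpu;

	switch (s->state) {
	case rcu_try_flip_idle_state:
		if (!gp_needed)
			return false;
		s->completed++;			/* the counter "flip" */
		for (cpu = 0; cpu < NR_CPUS; cpu++)
			s->flip_flag[cpu] = rcu_flipped;
		s->state = rcu_try_flip_waitack_state;
		return true;
	case rcu_try_flip_waitack_state:
		if (!all_flip_seen(s))
			return false;
		s->state = rcu_try_flip_waitzero_state;
		return true;
	case rcu_try_flip_waitzero_state:
		assert(s->old_ctr_sum >= 0);	/* the sum is never negative */
		if (s->old_ctr_sum != 0)
			return false;
		for (cpu = 0; cpu < NR_CPUS; cpu++)
			s->mb_flag[cpu] = rcu_mb_needed;
		s->state = rcu_try_flip_waitmb_state;
		return true;
	case rcu_try_flip_waitmb_state:
		if (!all_mb_done(s))
			return false;
		s->state = rcu_try_flip_idle_state;
		return true;
	}
	return false;
}
```

A caller would invoke `rcu_sim_step()` repeatedly, for example from a periodic tick, while the simulated CPUs reset their flags between calls. Each call either advances to the next state or leaves the state unchanged because the exit condition has not yet been met.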
Figure: Preemptible RCU State Machine
The grace-period state machine cycles through these states sequentially,
as shown in the figure titled "Preemptible RCU State Machine".
Figure: Preemptible RCU State Machine Timeline
The figure titled "Preemptible RCU State Machine Timeline"
shows how the state machine operates over time.
The states are shown along the figure's left-hand side and the relevant events
are shown along the timeline, with time proceeding in the downward direction.
We will elaborate on this figure when we validate the algorithm in
a later section.
In the meantime, here are some important things to note:
- The increment of the rcu_ctrlblk.completed counter
might be observed at different times by different CPUs, as
indicated by the blue oval. However, after a given
CPU has acknowledged the increment, it is required to
use the new counter.
Therefore, once all CPUs have acknowledged, the old counter
can only be decremented.
- A given CPU advances its callback lists just before
acknowledging the counter increment.
- The blue oval represents the fact that memory reordering
might cause different CPUs to see the increment at
different times.
This means that a given CPU might believe that some
other CPU has jumped the gun, using the new value of the counter
before the counter was actually incremented.
In fact, in theory, a given CPU might see the next increment of the
rcu_ctrlblk.completed counter as early as
the last preceding memory barrier.
(Note well that this sentence is very imprecise.
If you intend to do correctness proofs involving memory barriers,
please see Appendix .)
- Because rcu_read_lock() does not contain any
memory barriers, the corresponding RCU read-side critical
sections might be reordered by the CPU to follow the
rcu_read_unlock().
Therefore, the memory barriers are required to ensure
that the actions of the RCU read-side critical sections
have in fact completed.
- As we will see, the fact that different CPUs can see the
counter flip happening at different times means that a
single trip through the state machine is not sufficient
for a grace period: multiple trips are required.
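The per-CPU side of these notes can be sketched in the same simulated style. This standalone sketch assumes that each CPU keeps a pair of counters selected by the low-order bit of the grace-period counter, which this section does not state, and it models one CPU with plain variables. The names `struct cpu_sim`, `sim_read_lock()`, `sim_read_unlock()`, `sim_check_flip()`, and `sim_check_mb()` are hypothetical, as are the enum type names, and C11's `atomic_thread_fence()` merely stands in for a kernel memory barrier.

```c
#include <assert.h>
#include <stdatomic.h>

enum rcu_flip_flag_values { rcu_flip_seen, rcu_flipped };
enum rcu_mb_flag_values { rcu_mb_done, rcu_mb_needed };

/* Hypothetical per-CPU state for this sketch. */
struct cpu_sim {
	long flipctr[2];	/* assumed: one counter per parity of "completed" */
	enum rcu_flip_flag_values flip_flag;
	enum rcu_mb_flag_values mb_flag;
	int callbacks_advanced;	/* counts callback-list advances */
};

static long completed;		/* stands in for rcu_ctrlblk.completed */

/*
 * Reader entry: increment the counter selected by the current value of
 * the grace-period counter, and return the index so that the matching
 * exit decrements the same counter even if a flip happens in between.
 */
static int sim_read_lock(struct cpu_sim *c)
{
	int idx = (int)(completed & 1);

	c->flipctr[idx]++;
	return idx;
}

/* Reader exit: the old counter can still be decremented after a flip. */
static void sim_read_unlock(struct cpu_sim *c, int idx)
{
	assert(idx == 0 || idx == 1);
	c->flipctr[idx]--;
}

/* Acknowledge a counter flip, advancing callback lists just beforehand. */
static void sim_check_flip(struct cpu_sim *c)
{
	if (c->flip_flag != rcu_flipped)
		return;
	c->callbacks_advanced++;	/* advance callback lists first ... */
	c->flip_flag = rcu_flip_seen;	/* ... then acknowledge the flip */
}

/* Execute the requested memory barrier and report completion. */
static void sim_check_mb(struct cpu_sim *c)
{
	if (c->mb_flag != rcu_mb_needed)
		return;
	atomic_thread_fence(memory_order_seq_cst);	/* stand-in barrier */
	c->mb_flag = rcu_mb_done;
}
```

In this sketch, a reader that entered before the flip still decrements the old counter on exit, while any reader entering after the acknowledgment increments the new one. That is why, once every CPU has acknowledged, the old counters can only decrease.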
Paul E. McKenney
2011-12-16