D.4.2.3.2 Grace-Period State Machine Walkthrough

Figure: rcu_check_callbacks() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 void rcu_check_callbacks(int cp...
...2 spin_unlock_irqrestore(&rdp->lock, flags);
13 }\end{verbatim}
}\end{figure}

This section walks through the C code that implements the RCU grace-period state machine. The state machine is driven from the scheduling-clock interrupt, which invokes rcu_check_callbacks() with irqs (and thus also preemption) disabled. This function is implemented as shown in Figure [*]. Line 4 selects the rcu_data structure corresponding to the current CPU, and line 6 checks to see if this CPU needs to execute a memory barrier to advance the state machine out of the rcu_try_flip_waitmb_state state. Line 7 checks to see if this CPU is already aware of the current grace-period stage number, and line 8 attempts to advance the state machine if so. Lines 9 and 12 acquire and release the rcu_data's lock, line 10 updates RCU tracing statistics, if enabled via CONFIG_RCU_TRACE, and line 11 advances callbacks if appropriate.
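The ordering of these steps can be modeled in userspace. The following is only a sketch under stated assumptions: every sketch_ name is hypothetical, the three helpers are stubs that merely count calls, and a boolean stands in for the rcu_data lock. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct sketch_rcu_data {
    long completed;     /* last grace-period stage number this CPU saw */
    bool locked;        /* stand-in for the per-CPU spinlock */
    int advance_calls;  /* how often callbacks were advanced */
};

long sketch_global_completed;  /* stand-in for rcu_ctrlblk.completed */
int sketch_mb_checks;          /* calls to the memory-barrier check */
int sketch_flip_attempts;      /* calls to the state-machine driver */

/* Stubs that only record that they were invoked. */
void sketch_check_mb(void) { sketch_mb_checks++; }
void sketch_try_flip(void) { sketch_flip_attempts++; }

void sketch_advance_callbacks(struct sketch_rcu_data *rdp)
{
    assert(rdp->locked);  /* caller must hold the per-CPU lock */
    rdp->completed = sketch_global_completed;
    rdp->advance_calls++;
}

/* Same ordering of steps as the walkthrough above. */
void sketch_check_callbacks(struct sketch_rcu_data *rdp)
{
    sketch_check_mb();                              /* barrier, if requested */
    if (rdp->completed == sketch_global_completed)  /* CPU is up to date, */
        sketch_try_flip();                          /* so try to advance */
    rdp->locked = true;                             /* acquire */
    sketch_advance_callbacks(rdp);
    rdp->locked = false;                            /* release */
}
```

In this model, a CPU whose completed field lags the global counter skips the attempt to advance the state machine and first catches up by advancing its callbacks.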

Figure: rcu_check_mb() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 static void rcu_check_mb(int cp...
...per_cpu(rcu_mb_flag, cpu) = rcu_mb_done;
6 }
7 }\end{verbatim}
}\end{figure}

The rcu_check_mb() function executes a memory barrier as needed, as shown in Figure [*]. Line 3 checks to see if this CPU needs to execute a memory barrier, and, if so, line 4 executes one and line 5 informs the state machine. Note that this memory barrier ensures that any CPU that sees the new value of rcu_mb_flag will also see the memory operations executed by this CPU in any prior RCU read-side critical section.
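A minimal userspace sketch of this check-barrier-acknowledge pattern follows. It assumes a plain array in place of the per-CPU rcu_mb_flag variable, a C11 fence in place of the kernel's smp_mb(), and the name rcu_mb_needed for the "barrier requested" value (only rcu_mb_done appears in the text above). The sketch is single-threaded, so the fence documents the intended ordering rather than demonstrating it.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_CPUS 4  /* hypothetical CPU count */

/* rcu_mb_needed is an assumed name; rcu_mb_done appears in the text. */
enum sketch_mb_flag_values { rcu_mb_done, rcu_mb_needed };

/* Plain array standing in for the per-CPU rcu_mb_flag variable. */
enum sketch_mb_flag_values rcu_mb_flag[SKETCH_NR_CPUS];

void sketch_check_mb(int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
    if (rcu_mb_flag[cpu] == rcu_mb_needed) {
        /* Stand-in for smp_mb(): prior accesses are ordered
         * before the store that acknowledges the request. */
        atomic_thread_fence(memory_order_seq_cst);
        rcu_mb_flag[cpu] = rcu_mb_done;
    }
}
```

The fence precedes the store to the flag, which is what allows an observer of the new flag value to rely on also seeing this CPU's earlier memory operations.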

Figure: rcu_try_flip() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 static void rcu_try_flip(void)
...
...ck_irqrestore(&rcu_ctrlblk.fliplock, flags);
28 }\end{verbatim}
}\end{figure}

The rcu_try_flip() function implements the top level of the RCU grace-period state machine, as shown in Figure [*]. Line 6 attempts to acquire the global RCU state-machine lock, and returns if unsuccessful. Lines 5 and 7 accumulate RCU-tracing statistics (again, if CONFIG_RCU_TRACE is enabled). Lines 10 through 26 execute the state machine, with each state invoking a function specific to that state. Each such function returns 1 if the state needs to be advanced and 0 otherwise. In principle, the next state could be executed immediately, but in practice we choose not to do so in order to reduce latency. Finally, line 27 releases the global RCU state-machine lock that was acquired by line 6.
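The top-level dispatch can be sketched as a table of per-state functions, at most one of which runs per invocation. This is a userspace illustration only: the SKETCH_ state names, the handler table, and the boolean standing in for the trylock on the global lock are all assumptions, and the two trivial handlers exist solely to exercise the dispatcher.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state names, in the order the walkthrough visits them. */
enum sketch_state {
    SKETCH_IDLE, SKETCH_WAITACK, SKETCH_WAITZERO, SKETCH_WAITMB,
    SKETCH_NR_STATES
};

/* Each per-state function returns 1 to advance and 0 to stay put. */
typedef int (*sketch_state_fn)(void);

struct sketch_machine {
    bool fliplock_held;  /* stand-in for the global state-machine lock */
    enum sketch_state state;
    sketch_state_fn handlers[SKETCH_NR_STATES];
};

/* Trivial handlers used only to exercise the dispatcher. */
int sketch_ready(void) { return 1; }
int sketch_not_ready(void) { return 0; }

void sketch_try_flip(struct sketch_machine *m)
{
    if (m->fliplock_held)  /* "trylock" failed: someone else is driving */
        return;
    m->fliplock_held = true;
    assert(m->handlers[m->state] != NULL);
    /* Run only the current state; the next one waits for a later call. */
    if (m->handlers[m->state]())
        m->state = (enum sketch_state)((m->state + 1) % SKETCH_NR_STATES);
    m->fliplock_held = false;
}
```

Running a single state per call mirrors the choice described above: even when the next state could run immediately, it is left for a later invocation.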

Figure: rcu_try_flip_idle() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 static int rcu_try_flip_idle(vo...
...flip_flag, cpu) = rcu_flipped;
15 return 1;
16 }\end{verbatim}
}\end{figure}

The rcu_try_flip_idle() function is called when the RCU grace-period state machine is idle, and is thus responsible for getting it started when needed. Its code is shown in Figure [*]. Line 6 checks to see if there is any RCU grace-period work pending for this CPU, and if not, line 8 leaves, telling the top-level state machine to remain in the idle state. If instead there is work to do, line 11 increments the grace-period stage counter, line 12 does a memory barrier to ensure that CPUs see the new counter before they see the request to acknowledge it, and lines 13 and 14 set all of the online CPUs' rcu_flip_flag. Finally, line 15 tells the top-level state machine to advance to the next state.
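The increment-barrier-request sequence can be modeled as follows. This is a sketch rather than the kernel code: the sketch_ variables, the online map, and the "work pending" flag are hypothetical, arrays replace per-CPU variables, and a C11 fence stands in for smp_mb(). The flag values rcu_flipped and rcu_flip_seen are the names used in the figures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4  /* hypothetical CPU count */

enum sketch_flip_flag_values { rcu_flip_seen, rcu_flipped };

long sketch_completed;  /* grace-period stage counter */
enum sketch_flip_flag_values rcu_flip_flag[SKETCH_NR_CPUS];
bool sketch_cpu_online[SKETCH_NR_CPUS];
bool sketch_work_pending;  /* hypothetical "work to do" predicate */

int sketch_try_flip_idle(void)
{
    int cpu;

    if (!sketch_work_pending)
        return 0;            /* nothing to do: remain idle */
    sketch_completed++;      /* the counter "flip" */
    /* Stand-in for smp_mb(): the new counter value is ordered
     * before the requests to acknowledge it. */
    atomic_thread_fence(memory_order_seq_cst);
    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        if (sketch_cpu_online[cpu])
            rcu_flip_flag[cpu] = rcu_flipped;
    return 1;                /* advance to the next state */
}

/* Usage: CPUs 0 and 1 are online, CPUs 2 and 3 are offline. */
void sketch_idle_demo(void)
{
    sketch_cpu_online[0] = sketch_cpu_online[1] = true;
    assert(sketch_try_flip_idle() == 0);  /* no work: stays idle */
    assert(sketch_completed == 0);
    sketch_work_pending = true;
    assert(sketch_try_flip_idle() == 1);
    assert(sketch_completed == 1);
    assert(rcu_flip_flag[0] == rcu_flipped);
    assert(rcu_flip_flag[2] == rcu_flip_seen);  /* offline: untouched */
}
```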

Figure: rcu_try_flip_waitack() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 static int rcu_try_flip_waitack...
...rcupreempt_trace_try_flip_a2);
13 return 1;
14 }\end{verbatim}
}\end{figure}

The rcu_try_flip_waitack() function, shown in Figure [*], checks to see if all online CPUs have acknowledged the counter flip (AKA "increment", but called "flip" because the bottom bit, which rcu_read_lock() uses to index the rcu_flipctr array, does flip). If they have, it tells the top-level grace-period state machine to move to the next state.

Line 6 cycles through all of the online CPUs, and line 7 checks to see if the current such CPU has acknowledged the last counter flip. If not, line 9 tells the top-level grace-period state machine to remain in this state. Otherwise, if all online CPUs have acknowledged, then line 11 does a memory barrier to ensure that we don't check for zeroes before the last CPU acknowledges. This may seem dubious, but CPU designers have sometimes done strange things. Finally, line 13 tells the top-level grace-period state machine to advance to the next state.
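A userspace sketch of this scan appears below. The array-based flags, the online map, and the sketch_ names are assumptions made for illustration, and the C11 fence stands in for the kernel's smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4  /* hypothetical CPU count */

enum sketch_flip_flag_values { rcu_flip_seen, rcu_flipped };

enum sketch_flip_flag_values rcu_flip_flag[SKETCH_NR_CPUS];
bool sketch_cpu_online[SKETCH_NR_CPUS];

int sketch_try_flip_waitack(void)
{
    int cpu;

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        if (sketch_cpu_online[cpu] &&
            rcu_flip_flag[cpu] != rcu_flip_seen)
            return 0;  /* at least one CPU has yet to acknowledge */
    /* Stand-in for smp_mb(): the acknowledgments are ordered
     * before the later check of the counters for zero. */
    atomic_thread_fence(memory_order_seq_cst);
    return 1;
}

/* Usage: the state advances only after the last online CPU responds. */
void sketch_waitack_demo(void)
{
    sketch_cpu_online[0] = sketch_cpu_online[1] = true;
    rcu_flip_flag[0] = rcu_flip_flag[1] = rcu_flipped;
    rcu_flip_flag[3] = rcu_flipped;  /* hypothetical stale flag, CPU offline */
    assert(sketch_try_flip_waitack() == 0);
    rcu_flip_flag[0] = rcu_flip_seen;
    assert(sketch_try_flip_waitack() == 0);
    rcu_flip_flag[1] = rcu_flip_seen;
    assert(sketch_try_flip_waitack() == 1);
}
```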

Figure: rcu_try_flip_waitzero() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 static int rcu_try_flip_waitzer...
...rcupreempt_trace_try_flip_z2);
18 return 1;
19 }\end{verbatim}
}\end{figure}

The rcu_try_flip_waitzero() function, shown in Figure [*], checks to see if all pre-existing RCU read-side critical sections have completed, telling the state machine to advance if so. Lines 8 and 9 sum the counters, and line 10 checks to see if the result is zero. If not, line 12 tells the state machine to stay right where it is. Otherwise, line 14 executes a memory barrier to ensure that no CPU sees the subsequent call for a memory barrier before it has exited its last RCU read-side critical section. This possibility might seem remote, but again, CPU designers have done stranger things, and besides, this is anything but a fastpath. Lines 15 and 16 set all online CPUs' rcu_mb_flag variables, and line 18 tells the state machine to advance to the next state.
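The summation can be illustrated with a small userspace model. Several things here are assumptions rather than facts from the figure: all CPUs are treated as online, the counters being summed are taken to be the rcu_flipctr elements at the index that was current before the flip, that index is computed from the bottom bit of a hypothetical sketch_completed counter, and rcu_mb_needed is an assumed name for the "barrier requested" value. The model also assumes that a reader may increment a counter on one CPU and decrement it on another, which is why only the sum is meaningful.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_CPUS 4  /* hypothetical; all CPUs assumed online */

/* rcu_mb_needed is an assumed name; rcu_mb_done appears in the text. */
enum sketch_mb_flag_values { rcu_mb_done, rcu_mb_needed };

long sketch_completed;                 /* grace-period stage counter */
long rcu_flipctr[SKETCH_NR_CPUS][2];   /* per-CPU reader counts, per index */
enum sketch_mb_flag_values rcu_mb_flag[SKETCH_NR_CPUS];

int sketch_try_flip_waitzero(void)
{
    int cpu;
    int lastidx = !(sketch_completed & 0x1);  /* index used before the flip */
    long sum = 0;

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        sum += rcu_flipctr[cpu][lastidx];
    if (sum != 0)
        return 0;  /* some pre-existing reader is still running */
    /* Stand-in for smp_mb(): the zero check is ordered before
     * the requests for per-CPU memory barriers. */
    atomic_thread_fence(memory_order_seq_cst);
    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        rcu_mb_flag[cpu] = rcu_mb_needed;
    return 1;
}

/* Usage: a reader that entered on CPU 0 later exits on CPU 1. */
void sketch_waitzero_demo(void)
{
    sketch_completed = 1;   /* old index is 0 */
    rcu_flipctr[0][0] = 1;  /* pre-existing reader entered on CPU 0 */
    rcu_flipctr[2][1] = 5;  /* readers on the new index do not matter */
    assert(sketch_try_flip_waitzero() == 0);
    assert(rcu_mb_flag[0] == rcu_mb_done);
    rcu_flipctr[1][0]--;    /* the reader exits while running on CPU 1 */
    assert(sketch_try_flip_waitzero() == 1);
    assert(rcu_mb_flag[3] == rcu_mb_needed);
}
```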

Figure: rcu_try_flip_waitmb() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 static int rcu_try_flip_waitmb(...
...rcupreempt_trace_try_flip_m2);
13 return 1;
14 }\end{verbatim}
}\end{figure}

The rcu_try_flip_waitmb() function, shown in Figure [*], checks to see if all online CPUs have executed the requested memory barrier, telling the state machine to advance if so. Lines 6 and 7 check each online CPU to see if it has done the needed memory barrier, and if not, line 9 tells the state machine not to advance. Otherwise, if all CPUs have executed a memory barrier, line 11 executes a memory barrier to ensure that any RCU callback invocation follows all of the memory barriers, and line 13 tells the state machine to advance.
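The request-and-acknowledge handshake between this state and the per-CPU memory-barrier check can be sketched in userspace as shown below. As before, this is an illustration under assumptions: all CPUs are treated as online, arrays replace per-CPU variables, the sketch_ names and rcu_mb_needed are hypothetical, and a C11 fence stands in for smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_CPUS 4  /* hypothetical; all CPUs assumed online */

/* rcu_mb_needed is an assumed name; rcu_mb_done appears in the text. */
enum sketch_mb_flag_values { rcu_mb_done, rcu_mb_needed };

enum sketch_mb_flag_values rcu_mb_flag[SKETCH_NR_CPUS];

/* Per-CPU side: execute the barrier, then acknowledge. */
void sketch_check_mb(int cpu)
{
    if (rcu_mb_flag[cpu] == rcu_mb_needed) {
        atomic_thread_fence(memory_order_seq_cst);
        rcu_mb_flag[cpu] = rcu_mb_done;
    }
}

/* State-machine side: advance once every CPU has acknowledged. */
int sketch_try_flip_waitmb(void)
{
    int cpu;

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        if (rcu_mb_flag[cpu] != rcu_mb_done)
            return 0;
    /* Stand-in for smp_mb(): callback invocation is ordered
     * after all of the per-CPU barriers. */
    atomic_thread_fence(memory_order_seq_cst);
    return 1;
}

/* Usage: barriers are requested everywhere, CPUs respond one by one. */
void sketch_waitmb_demo(void)
{
    int cpu;

    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        rcu_mb_flag[cpu] = rcu_mb_needed;
    assert(sketch_try_flip_waitmb() == 0);
    for (cpu = 0; cpu < SKETCH_NR_CPUS - 1; cpu++)
        sketch_check_mb(cpu);
    assert(sketch_try_flip_waitmb() == 0);  /* last CPU still pending */
    sketch_check_mb(SKETCH_NR_CPUS - 1);
    assert(sketch_try_flip_waitmb() == 1);
}
```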

Figure: __rcu_advance_callbacks() Implementation
\begin{figure}{ \scriptsize
\begin{verbatim}1 static void __rcu_advance_callb...
...g, cpu) = rcu_flip_seen;
42 smp_mb();
43 }
44 }\end{verbatim}
}\end{figure}

The __rcu_advance_callbacks() function, shown in Figure [*], advances callbacks and acknowledges the counter flip. Line 7 checks to see if the global rcu_ctrlblk.completed counter has advanced since the last call by the current CPU to this function. If not, callbacks need not be advanced, so lines 8-37 are skipped. Otherwise, lines 8 through 37 advance callbacks through the lists (while maintaining a count of the number of non-empty lists in the wlc variable). In either case, lines 38 through 43 acknowledge the counter flip if needed.
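The two parts of this function can be modeled with callback counts in place of linked lists. The sketch below rests on several assumptions that the text above does not state: the list names (next, wait, done), the number of wait stages (SKETCH_GP_STAGES, set to 2 here), and the waitlistcount field are hypothetical, and the flip flag is kept inside the per-CPU structure for simplicity. Only the overall shape follows the description: advance one stage if the global counter moved, then acknowledge the flip if one is pending.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_GP_STAGES 2  /* assumed number of wait stages */

enum sketch_flip_flag_values { rcu_flip_seen, rcu_flipped };

/* Callback lists are modeled as counts; all field names are hypothetical. */
struct sketch_rcu_data {
    long completed;              /* stage number last seen by this CPU */
    int next;                    /* newly queued callbacks */
    int wait[SKETCH_GP_STAGES];  /* callbacks waiting, by stage */
    int done;                    /* callbacks ready to be invoked */
    int waitlistcount;           /* number of non-empty wait lists */
    enum sketch_flip_flag_values flip_flag;
};

void sketch_advance_callbacks(struct sketch_rcu_data *rdp,
                              long global_completed)
{
    int i;
    int wlc = 0;

    if (rdp->completed != global_completed) {
        /* The oldest wait stage has now waited long enough. */
        rdp->done += rdp->wait[SKETCH_GP_STAGES - 1];
        for (i = SKETCH_GP_STAGES - 2; i >= 0; i--) {
            rdp->wait[i + 1] = rdp->wait[i];
            if (rdp->wait[i + 1] != 0)
                wlc++;
        }
        rdp->wait[0] = rdp->next;  /* new callbacks start waiting */
        rdp->next = 0;
        if (rdp->wait[0] != 0)
            wlc++;
        rdp->waitlistcount = wlc;
        rdp->completed = global_completed;
    }
    if (rdp->flip_flag == rcu_flipped) {
        atomic_thread_fence(memory_order_seq_cst);  /* stand-in for smp_mb() */
        rdp->flip_flag = rcu_flip_seen;
        atomic_thread_fence(memory_order_seq_cst);  /* stand-in for smp_mb() */
    }
}

/* Usage: one advance moves every list along by a single stage. */
void sketch_advance_demo(void)
{
    struct sketch_rcu_data rdp = { 0, 3, { 2, 1 }, 0, 0, rcu_flipped };

    sketch_advance_callbacks(&rdp, 1);
    assert(rdp.done == 1 && rdp.wait[1] == 2 && rdp.wait[0] == 3);
    assert(rdp.next == 0 && rdp.waitlistcount == 2);
    assert(rdp.completed == 1);
    assert(rdp.flip_flag == rcu_flip_seen);
}
```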

Quick Quiz D.58: How is it possible for lines 38-43 of __rcu_advance_callbacks() to be executed when lines 7-37 have not? Won't they both be executed just after a counter flip, and never at any other time? End Quick Quiz

Paul E. McKenney 2011-12-16