The most straightforward answer to ``what is RCU'' is that RCU is
an API used in the Linux kernel, as summarized by
Tables and
,
which show the wait-for-RCU-readers portions of the non-sleepable and
sleepable APIs, respectively,
and by
Table
,
which shows the publish/subscribe portions of the API.
If you are new to RCU, you might consider focusing on just one
of the columns in
Table ,
each of which summarizes one member of the Linux kernel's RCU API family.
For example, if you are primarily interested in understanding how RCU
is used in the Linux kernel, ``RCU Classic'' would be the place to start,
as it is used most frequently.
On the other hand, if you want to understand RCU for its own sake,
``SRCU'' has the simplest API.
You can always come back for the other columns later.
If you are already familiar with RCU, these tables can serve as a useful reference.
Quick Quiz 10.22:
Why do some of the cells in
Table
have exclamation marks (``!'')?
End Quick Quiz
The ``RCU Classic'' column corresponds to the original RCU implementation,
in which RCU read-side critical sections are delimited by
rcu_read_lock() and rcu_read_unlock(), which
may be nested.
The corresponding synchronous update-side primitives,
synchronize_rcu() and its synonym
synchronize_net(), wait for any currently executing
RCU read-side critical sections to complete.
The length of this wait is known as a ``grace period''.
The asynchronous update-side primitive, call_rcu(),
invokes a specified function with a specified argument after a
subsequent grace period.
For example, call_rcu(p,f); will result in
the ``RCU callback'' f(p)
being invoked after a subsequent grace period.
In some situations,
such as when unloading a Linux-kernel module that uses call_rcu(),
it is necessary to wait for all
outstanding RCU callbacks to complete [McK07e].
The rcu_barrier() primitive does this job.
Note that the more recent hierarchical
RCU [McK08a]
implementation described in
Sections and
also adheres to ``RCU Classic'' semantics.
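The interplay between nested read-side critical sections, call_rcu(), and grace periods can be illustrated with a single-threaded toy model. The sketch below is not the kernel's implementation: every toy_-prefixed name is hypothetical, there is no concurrency, and toy_grace_period() simply declines to run callbacks while a reader is active, where the real synchronize_rcu() would instead block. It also folds grace-period completion and callback invocation into a single call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

static struct toy_rcu_head *toy_cb_list;	/* oldest callback first */
static struct toy_rcu_head **toy_cb_tail = &toy_cb_list;
static int toy_nesting;				/* read-side nesting depth */

/* Read-side critical sections may nest, so track the depth. */
static void toy_rcu_read_lock(void)
{
	toy_nesting++;
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_nesting > 0);
	toy_nesting--;
}

/* Queue func(head) for invocation after a later grace period. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = NULL;
	*toy_cb_tail = head;
	toy_cb_tail = &head->next;
}

/*
 * Toy grace period: if any reader is still active, do nothing and
 * return 0.  Otherwise invoke the callbacks queued so far and return
 * how many were invoked.  Callbacks queued by a callback wait for the
 * next call.
 */
static int toy_grace_period(void)
{
	struct toy_rcu_head *list = toy_cb_list;
	int invoked = 0;

	if (toy_nesting > 0)
		return 0;
	toy_cb_list = NULL;
	toy_cb_tail = &toy_cb_list;
	while (list != NULL) {
		struct toy_rcu_head *head = list;

		list = head->next;
		head->func(head);
		invoked++;
	}
	return invoked;
}

/* Example element with an embedded callback header. */
struct toy_item {
	int value;
	int reclaimed;
	struct toy_rcu_head rh;
};

/* Example callback: recover the enclosing element and mark it. */
static void toy_item_reclaim(struct toy_rcu_head *head)
{
	struct toy_item *item =
		(struct toy_item *)((char *)head - offsetof(struct toy_item, rh));

	item->reclaimed = 1;
}
```

In this model, a callback posted while two nested toy_rcu_read_lock() calls are outstanding is invoked only by a toy_grace_period() call made after both matching toy_rcu_read_unlock() calls.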
Finally, RCU may be used to provide
type-safe memory [GC96], as described in
Section .
In the context of RCU, type-safe memory guarantees that a given
data element will not change type during any RCU read-side critical section
that accesses it.
To make use of RCU-based type-safe memory, pass
SLAB_DESTROY_BY_RCU to
kmem_cache_create().
It is important to note that SLAB_DESTROY_BY_RCU will
in no way
prevent kmem_cache_alloc() from immediately reallocating
memory that was just now freed via kmem_cache_free()!
In fact, the SLAB_DESTROY_BY_RCU-protected data structure
just returned by rcu_dereference() might be freed and reallocated
an arbitrarily large number of times, even when under the protection
of rcu_read_lock().
Instead, SLAB_DESTROY_BY_RCU operates by preventing
kmem_cache_free()
from returning a completely freed-up slab of data structures
to the system until after an RCU grace period elapses.
In short, although the data element might be freed and reallocated arbitrarily
often, at least its type will remain the same.
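One practical consequence is that a reader holding a pointer into type-safe memory needs to recheck the element's identity before trusting it. The following single-threaded sketch illustrates this with a hypothetical fixed-size pool standing in for a slab. The toy_-prefixed names are assumptions made for illustration, and a real reader would normally acquire a lock or reference within the element before rechecking, a step omitted here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLAB_SLOTS 4

/* Every slot in the toy slab has this one type, whatever its state. */
struct toy_conn {
	int key;
	int in_use;
};

static struct toy_conn toy_slab[TOY_SLAB_SLOTS];

/* Hand out the first free slot, which may be one that was just freed. */
static struct toy_conn *toy_conn_alloc(int key)
{
	for (size_t i = 0; i < TOY_SLAB_SLOTS; i++) {
		if (!toy_slab[i].in_use) {
			toy_slab[i].in_use = 1;
			toy_slab[i].key = key;
			return &toy_slab[i];
		}
	}
	return NULL;
}

/* Freeing makes the slot immediately eligible for reallocation. */
static void toy_conn_free(struct toy_conn *c)
{
	c->in_use = 0;
}

/*
 * Identity recheck: the pointer is still a valid struct toy_conn,
 * but it might no longer be the element the reader looked up.
 */
static int toy_conn_is(const struct toy_conn *c, int key)
{
	return c->in_use && c->key == key;
}
```

Here a pointer obtained for key 1 remains safe to dereference as a struct toy_conn after the element is freed and reallocated for key 2, but toy_conn_is() with key 1 then reports that the element has changed identity.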
Quick Quiz 10.23: How do you prevent a huge number of RCU read-side critical sections from indefinitely blocking a synchronize_rcu() invocation? End Quick Quiz
Quick Quiz 10.24: The synchronize_rcu() API waits for all pre-existing interrupt handlers to complete, right? End Quick Quiz
In the ``RCU BH'' column, rcu_read_lock_bh() and rcu_read_unlock_bh() delimit RCU read-side critical sections, and call_rcu_bh() invokes the specified function with the specified argument after a subsequent grace period. Note that RCU BH does not have a synchronous synchronize_rcu_bh() interface, though one could easily be added if required.
Quick Quiz 10.25: What happens if you mix and match? For example, suppose you use rcu_read_lock() and rcu_read_unlock() to delimit RCU read-side critical sections, but then use call_rcu_bh() to post an RCU callback? End Quick Quiz
Quick Quiz 10.26: Hardware interrupt handlers can be thought of as being under the protection of an implicit rcu_read_lock_bh(), right? End Quick Quiz
In the ``RCU Sched'' column, anything that disables preemption acts as an RCU read-side critical section, and synchronize_sched() waits for the corresponding RCU grace period. This RCU API family was added in the 2.6.12 kernel, which split the old synchronize_kernel() API into the current synchronize_rcu() (for RCU Classic) and synchronize_sched() (for RCU Sched). Note that RCU Sched did not originally have an asynchronous call_rcu_sched() interface, but one was added in 2.6.26. In accordance with the quasi-minimalist philosophy of the Linux community, APIs are added on an as-needed basis.
Quick Quiz 10.27: What happens if you mix and match RCU Classic and RCU Sched? End Quick Quiz
Quick Quiz 10.28: In general, you cannot rely on synchronize_sched() to wait for all pre-existing interrupt handlers, right? End Quick Quiz
The ``Realtime RCU'' column has the same API as does RCU Classic, the only difference being that RCU read-side critical sections may be preempted and may block while acquiring spinlocks. The design of Realtime RCU is described elsewhere [McK07a].
Quick Quiz 10.29: Why do both SRCU and QRCU lack asynchronous call_srcu() or call_qrcu() interfaces? End Quick Quiz
The ``SRCU'' column in
Table
displays a specialized RCU API that permits
general sleeping in RCU read-side critical sections
(see Appendix
for more details).
Of course,
use of synchronize_srcu() in an SRCU read-side
critical section can result in
self-deadlock, and so should be avoided.
SRCU differs from earlier RCU implementations in that the caller
allocates an srcu_struct for each distinct SRCU
usage.
This approach prevents SRCU read-side critical sections from blocking
unrelated synchronize_srcu() invocations.
In addition, in this variant of RCU, srcu_read_lock()
returns a value that must be passed into the corresponding
srcu_read_unlock().
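To see why a per-usage structure and a returned index might be useful, consider the following single-threaded toy model, loosely based on a pair of reader counters selected by a grace-period count. It is a sketch under stated assumptions rather than the kernel's SRCU code: all toy_-prefixed names are hypothetical, there is no concurrency, and toy_srcu_drained() merely reports whether a real synchronize_srcu() would still have to wait.

```c
#include <assert.h>

/* Hypothetical stand-in for srcu_struct: one instance per usage. */
struct toy_srcu {
	int completed;	/* count of grace periods started */
	int c[2];	/* readers currently using each index */
};

/* Enter a read-side critical section and report which counter was used. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
	int idx = sp->completed & 1;

	sp->c[idx]++;
	return idx;
}

/* Exit: the caller hands back the index so the same counter is decremented. */
static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
	assert(idx == 0 || idx == 1);
	assert(sp->c[idx] > 0);
	sp->c[idx]--;
}

/*
 * Start a grace period: later readers use the other counter.
 * Returns the index whose pre-existing readers must drain.
 * The model assumes at most one grace period in flight per structure.
 */
static int toy_srcu_flip(struct toy_srcu *sp)
{
	int old = sp->completed & 1;

	sp->completed++;
	return old;
}

/* Nonzero once all readers that used index "old" have exited. */
static int toy_srcu_drained(const struct toy_srcu *sp, int old)
{
	return sp->c[old] == 0;
}
```

With two separate toy_srcu structures, a reader of the first has no effect on toy_srcu_drained() for the second. And because a reader that entered before toy_srcu_flip() must decrement the counter it incremented, the index returned by toy_srcu_read_lock() has to accompany the matching toy_srcu_read_unlock().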
The ``QRCU'' column presents an RCU implementation with the same API structure as SRCU, but optimized for extremely low-latency grace periods in the absence of readers, as described elsewhere [McK07f]. As with SRCU, use of synchronize_qrcu() in a QRCU read-side critical section can result in self-deadlock, and so should be avoided. Although QRCU has not yet been accepted into the Linux kernel, it is worth mentioning given that it is the only kernel-level RCU implementation that can boast deep sub-microsecond grace-period latencies.
Quick Quiz 10.30: Under what conditions can synchronize_srcu() be safely used within an SRCU read-side critical section? End Quick Quiz
The Linux kernel currently has a surprising number of RCU APIs and implementations. There is some hope of reducing this number, as evidenced by the fact that a given build of the Linux kernel currently has at most three implementations behind four APIs (given that RCU Classic and Realtime RCU share the same API). However, careful inspection and analysis will be required, just as would be required in order to eliminate one of the many locking APIs.
The various RCU APIs are distinguished by the forward-progress guarantees that their RCU read-side critical sections must provide, and also by their scope, as follows:
In other words, SRCU and QRCU compensate for their extremely weak forward-progress guarantees by permitting the developer to restrict their scope.
Paul E. McKenney 2011-12-16