D.2.3 RCU Desiderata
The list of real-time RCU desiderata [MS05]
is a very good start:
- Deferred destruction, so that an RCU grace period cannot end
until all pre-existing RCU read-side critical sections have
completed.
- Reliable, so that RCU supports 24x7 operation for years at
a time.
- Callable from irq handlers.
- Contained memory footprint, so that mechanisms exist to expedite
grace periods if there are too many callbacks. (This is weakened
from the LCA2005 list.)
- Independent of memory blocks, so that RCU can work with any
conceivable memory allocator.
- Synchronization-free read side, so that only normal non-atomic
instructions operating on CPU- or task-local memory are permitted.
(This is strengthened from the LCA2005 list.)
- Unconditional read-to-write upgrade, which is used in several
places in the Linux kernel where the update-side lock is
acquired within the RCU read-side critical section.
- Compatible API.
- Because this is not to be a real-time RCU, the requirement for
preemptible RCU read-side critical sections can be dropped.
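
The synchronization-free read side and the unconditional read-to-write
upgrade listed above can be pictured with a toy userspace model. This is
only a sketch under stated assumptions: every toy_ name below is
hypothetical, the model makes no attempt to track grace periods, and it
is not how the Linux kernel implements RCU. It shows only that the
read-side markers touch nothing but thread-local memory with plain
instructions, and that a reader may acquire the update-side lock without
first leaving its read-side critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Toy userspace model for illustration only; NOT the kernel's RCU. */

/* Per-thread nesting depth: plain accesses to thread-local memory. */
static _Thread_local int toy_rcu_nesting;

/* Hypothetical update-side lock and the value it protects. */
static pthread_mutex_t toy_update_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int toy_shared_value;
static int toy_update_count;	/* Protected by toy_update_lock. */

static void toy_rcu_read_lock(void)
{
	toy_rcu_nesting++;	/* No atomics, no shared memory. */
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_rcu_nesting > 0);
	toy_rcu_nesting--;
}

/*
 * A reader that finds the value out of date and upgrades to an
 * updater without leaving its read-side critical section.
 * Returns 1 if this call performed the update, 0 otherwise.
 */
static int toy_set_if_different(int wanted)
{
	int updated = 0;

	toy_rcu_read_lock();
	if (atomic_load(&toy_shared_value) != wanted) {
		/* Unconditional upgrade: taking the lock here is always legal. */
		pthread_mutex_lock(&toy_update_lock);
		/* Recheck under the lock in case another updater got here first. */
		if (atomic_load(&toy_shared_value) != wanted) {
			atomic_store(&toy_shared_value, wanted);
			toy_update_count++;
			updated = 1;
		}
		pthread_mutex_unlock(&toy_update_lock);
	}
	toy_rcu_read_unlock();
	return updated;
}
```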
However, we need to add the following new requirements to account
for changes over the past few years.
- Scalability with extremely low internal-to-RCU lock contention.
RCU must support at least 1,024 CPUs gracefully, and preferably
at least 4,096.
- Energy conservation: RCU must be able to avoid awakening
low-power-state dynticks-idle CPUs, but still determine
when the current grace period ends.
This has been implemented in real-time RCU, but needs serious
simplification.
- RCU read-side critical sections must be permitted in NMI
handlers as well as irq handlers. Note that preemptible RCU
was able to avoid this requirement due to a separately
implemented synchronize_sched().
- RCU must operate gracefully in the face of repeated CPU-hotplug
operations.
This is simply carrying forward a requirement met by both
classic and real-time RCU.
- It must be possible to wait for all previously registered
RCU callbacks to complete, though this is already provided
in the form of rcu_barrier().
- Detecting CPUs that are failing to respond is desirable,
to assist diagnosis both of RCU itself and of the various
infinite-loop bugs and hardware failures that can prevent RCU
grace periods from ending.
- Extreme expediting of RCU grace periods is desirable,
so that an RCU grace period can be forced to complete within
a few hundred microseconds of the last relevant RCU read-side
critical section completing.
However, such an operation would be expected to incur
severe CPU overhead, and would be primarily useful when
carrying out a long sequence of operations that each needed
to wait for an RCU grace period.
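
Two of the new requirements, energy conservation and detection of
unresponsive CPUs, interact in a way that a small sketch may make
clearer. The following toy model is an assumption-laden illustration,
not the kernel's data structures: the toy_cpu_state fields, the function
names, and the abstract time units are all hypothetical. It treats a
dynticks-idle CPU as already quiescent, so that the CPU never needs to
be awakened, and it reports the remaining holdouts as suspected stalls
only once the grace period has outlived a timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state for a toy model; NOT the kernel's layout. */
struct toy_cpu_state {
	bool dynticks_idle;	/* CPU is in a low-power idle state. */
	bool passed_quiescent;	/* CPU reported a quiescent state for this grace period. */
};

/*
 * Return a bitmask of the CPUs that the current grace period is still
 * waiting on.  Idle CPUs cannot be in a read-side critical section, so
 * they are counted as quiescent without being awakened.
 */
static uint32_t toy_gp_holdouts(const struct toy_cpu_state *cpus, int ncpus)
{
	uint32_t mask = 0;

	assert(ncpus >= 0 && ncpus <= 32);	/* One bit per CPU. */
	for (int i = 0; i < ncpus; i++)
		if (!cpus[i].dynticks_idle && !cpus[i].passed_quiescent)
			mask |= UINT32_C(1) << i;
	return mask;
}

/*
 * If the grace period has lasted at least "timeout" time units, return
 * the holdout CPUs as suspected stalls; otherwise return 0.
 */
static uint32_t toy_check_stall(const struct toy_cpu_state *cpus, int ncpus,
				uint64_t gp_start, uint64_t now,
				uint64_t timeout)
{
	if (now - gp_start < timeout)
		return 0;
	return toy_gp_holdouts(cpus, ncpus);
}
```

In this model a grace period can end as soon as toy_gp_holdouts()
returns zero, even if some CPUs slept through the entire grace period.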
The most pressing of the new requirements is the first one, scalability.
The next section therefore describes how to make order-of-magnitude reductions
in contention on RCU's internal locks.
Paul E. McKenney
2011-12-16