D.2.3 RCU Desiderata

The list of real-time RCU desiderata [MS05] is a very good start:

  1. Deferred destruction, so that an RCU grace period cannot end until all pre-existing RCU read-side critical sections have completed.
  2. Reliable, so that RCU supports 24x7 operation for years at a time.
  3. Callable from irq handlers.
  4. Contained memory footprint, so that mechanisms exist to expedite grace periods if there are too many callbacks. (This is weakened from the LCA2005 list.)
  5. Independent of memory blocks, so that RCU can work with any conceivable memory allocator.
  6. Synchronization-free read side, so that only normal non-atomic instructions operating on CPU- or task-local memory are permitted. (This is strengthened from the LCA2005 list.)
  7. Unconditional read-to-write upgrade, which is used in several places in the Linux kernel where the update-side lock is acquired within the RCU read-side critical section.
  8. Compatible API.
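
The deferred-destruction requirement (the first item above) can be illustrated with a toy, single-threaded model. This is only a sketch under stated assumptions: the names toy_rcu, toy_read_lock(), toy_read_unlock(), toy_gp_start(), and toy_gp_done() are hypothetical and are not part of any RCU API, and a real implementation must also deal with concurrency and memory ordering, which this model ignores.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_READERS 4

/* Toy, single-threaded model of deferred destruction; not a real RCU. */
struct toy_rcu {
	int nesting[TOY_NR_READERS];    /* read-side nesting depth per reader */
	bool must_wait[TOY_NR_READERS]; /* readers the current grace period waits for */
};

/* Enter a read-side critical section (nesting is allowed). */
static void toy_read_lock(struct toy_rcu *r, int id)
{
	r->nesting[id]++;
}

/* Leave a critical section; the outermost exit stops blocking the grace period. */
static void toy_read_unlock(struct toy_rcu *r, int id)
{
	assert(r->nesting[id] > 0);
	if (--r->nesting[id] == 0)
		r->must_wait[id] = false;
}

/* Start a grace period: record only the readers that already exist. */
static void toy_gp_start(struct toy_rcu *r)
{
	for (int i = 0; i < TOY_NR_READERS; i++)
		r->must_wait[i] = r->nesting[i] > 0;
}

/* The grace period may end once every pre-existing reader has finished. */
static bool toy_gp_done(const struct toy_rcu *r)
{
	for (int i = 0; i < TOY_NR_READERS; i++)
		if (r->must_wait[i])
			return false;
	return true;
}
```

For example, if reader 0 is inside a critical section when toy_gp_start() runs and reader 1 enters afterwards, toy_gp_done() becomes true as soon as reader 0 exits, even though reader 1 is still running. Only pre-existing readers hold up the grace period.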

Because this is not to be a real-time RCU, the requirement for preemptible RCU read-side critical sections can be dropped. However, we need to add the following new requirements to account for changes over the past few years.

  1. Scalability with extremely low internal-to-RCU lock contention. RCU must support at least 1,024 CPUs gracefully, and preferably at least 4,096.
  2. Energy conservation: RCU must be able to avoid awakening low-power-state dynticks-idle CPUs, but still determine when the current grace period ends. This has been implemented in real-time RCU, but needs serious simplification.
  3. RCU read-side critical sections must be permitted in NMI handlers as well as irq handlers. Note that preemptible RCU was able to avoid this requirement due to a separately implemented synchronize_sched().
  4. RCU must operate gracefully in the face of repeated CPU-hotplug operations. This is simply carrying forward a requirement met by both classic and real-time RCU.
  5. It must be possible to wait for all previously registered RCU callbacks to complete, though this is already provided in the form of rcu_barrier().
  6. Detecting CPUs that are failing to respond is desirable, to assist diagnosis both of RCU and of various infinite-loop bugs and hardware failures that can prevent RCU grace periods from ending.
  7. Extreme expediting of RCU grace periods is desirable, so that an RCU grace period can be forced to complete within a few hundred microseconds of the last relevant RCU read-side critical section completing. However, such an operation would be expected to incur severe CPU overhead, and would be primarily useful when carrying out a long sequence of operations that each needed to wait for an RCU grace period.
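
The energy-conservation requirement asks that grace periods be able to end without waking idle CPUs. One way to picture this, offered as an assumption rather than as the design this section prescribes, is a per-CPU counter that the CPU itself bumps on every idle transition, and that the grace-period machinery merely reads. The sketch below is a single-threaded toy: all toy_ names are hypothetical, and the atomic operations and memory barriers that a real implementation would require are omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/*
 * Hypothetical per-CPU counter: odd while the CPU is running, even while
 * it is dynticks-idle. Single-threaded toy: no atomics, no barriers.
 */
static unsigned long toy_dynticks[TOY_NR_CPUS];

/* Mark every CPU as running (odd counter value). */
static void toy_boot(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		toy_dynticks[cpu] = 1;
}

/* Called by the CPU itself on entry to idle: odd -> even. */
static void toy_idle_enter(int cpu)
{
	assert(toy_dynticks[cpu] & 1UL);
	toy_dynticks[cpu]++;
}

/* Called by the CPU itself on exit from idle: even -> odd. */
static void toy_idle_exit(int cpu)
{
	assert(!(toy_dynticks[cpu] & 1UL));
	toy_dynticks[cpu]++;
}

/* Grace-period side: sample a CPU's counter without disturbing that CPU. */
static unsigned long toy_snapshot(int cpu)
{
	return toy_dynticks[cpu];
}

/*
 * A CPU cannot be holding up the grace period if it was idle at snapshot
 * time (even value) or has passed through idle since (value changed).
 */
static bool toy_cpu_quiescent(int cpu, unsigned long snap)
{
	return (snap & 1UL) == 0 || toy_dynticks[cpu] != snap;
}
```

In this model the grace-period side only ever reads toy_dynticks[], so a CPU that stays idle is never touched: its even snapshot is enough to rule it out as a source of pre-existing readers.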

The most pressing of the new requirements is the first one, scalability. The next section therefore describes how to make order-of-magnitude reductions in contention on RCU's internal locks.

Paul E. McKenney 2011-12-16