The fact that reference-count acquisition can run concurrently with reference-count release adds further complications. Suppose that a reference-count release finds that the new value of the reference count is zero, signalling that it is now safe to clean up the reference-counted object. We clearly cannot allow a reference-count acquisition to start after such clean-up has commenced, so the acquisition must include a check for a zero reference count. This check must be part of the atomic increment operation, as shown below.
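The combined check-and-increment can be sketched in user-space C11 as a compare-and-swap loop. This is a minimal illustration under stated assumptions, not the kernel's implementation: the names struct refobj, ref_get_unless_zero(), and ref_put() are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object; the name is illustrative only. */
struct refobj {
    atomic_long refcnt;
};

/*
 * Acquire a reference unless the count has already reached zero.
 * The compare-and-swap succeeds only if the count still holds the
 * nonzero value that was just observed, so the zero check and the
 * increment cannot be separated by a concurrent release.
 */
static bool ref_get_unless_zero(struct refobj *obj)
{
    long old = atomic_load(&obj->refcnt);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&obj->refcnt, &old, old + 1))
            return true;
    }
    return false; /* Clean-up may already be under way. */
}

/* Release a reference; returns true if this was the last one. */
static bool ref_put(struct refobj *obj)
{
    long old = atomic_fetch_sub(&obj->refcnt, 1);

    assert(old > 0); /* Releasing an unheld reference is a caller bug. */
    return old == 1;
}
```

Once ref_put() has returned true, any later ref_get_unless_zero() on the same object fails rather than resurrecting it, which is the property the text requires.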
Quick Quiz 10.5: Why can't the check for a zero reference count be made in a simple ``if'' statement with an atomic increment in its ``then'' clause? End Quick Quiz
The Linux kernel's fget() and fput() primitives use this style of reference counting. Simplified versions of these functions are shown in the accompanying figure.
Line 4 of fget() fetches the pointer to the current process's file-descriptor table, which might well be shared with other processes. Line 6 invokes rcu_read_lock(), which enters an RCU read-side critical section. The callback function from any subsequent call_rcu() primitive will be deferred until a matching rcu_read_unlock() is reached (line 10 or 14 in this example). Line 7 looks up the file structure corresponding to the file descriptor specified by the fd argument, as will be described later. If there is an open file corresponding to the specified file descriptor, then line 9 attempts to atomically acquire a reference count. If it fails to do so, lines 10-11 exit the RCU read-side critical section and report failure. Otherwise, if the attempt is successful, lines 14-15 exit the read-side critical section and return a pointer to the file structure.
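The control flow just described can be modeled in user space roughly as follows. This is a simplified sketch, not the kernel source: all names ending in _sketch or _stub are hypothetical, the RCU read-side markers are empty placeholders (a real implementation needs the kernel or a user-space RCU library), and the file-descriptor table is passed in as a parameter rather than fetched from the current process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Placeholders only: these do NOT provide RCU protection. */
#define rcu_read_lock_stub()   do { } while (0)
#define rcu_read_unlock_stub() do { } while (0)

#define MAX_FDS_SKETCH 4

struct file_sketch {
    atomic_long f_count;                     /* reference count */
};

struct files_sketch {
    struct file_sketch *fd[MAX_FDS_SKETCH];  /* NULL means "not open" */
};

/* Atomically increment *v unless it is zero; returns nonzero on success. */
static int inc_not_zero_sketch(atomic_long *v)
{
    long old = atomic_load(v);

    while (old != 0) {
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return 1;
    }
    return 0;
}

/* Look up fd and take a reference, or return NULL on any failure. */
static struct file_sketch *fget_sketch(struct files_sketch *files,
                                       unsigned int fd)
{
    struct file_sketch *file = NULL;

    assert(files != NULL);
    rcu_read_lock_stub();
    if (fd < MAX_FDS_SKETCH)
        file = files->fd[fd];
    if (file && !inc_not_zero_sketch(&file->f_count)) {
        /* The count already hit zero: the file is being torn down. */
        rcu_read_unlock_stub();
        return NULL;
    }
    rcu_read_unlock_stub();
    return file;
}
```

In this model a table entry whose count is already zero is treated exactly like a closed descriptor: the caller sees NULL and never touches the dying structure.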
The fcheck_files() primitive is a helper function for fget(). It uses the rcu_dereference() primitive to safely fetch an RCU-protected pointer for later dereferencing (this emits a memory barrier on CPUs such as DEC Alpha in which data dependencies do not enforce memory ordering). Line 22 uses rcu_dereference() to fetch a pointer to this task's current file-descriptor table, and line 24 checks to see if the specified file descriptor is in range. If so, line 25 fetches the pointer to the file structure, again using the rcu_dereference() primitive. Line 26 then returns a pointer to the file structure or NULL in case of failure.
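A rough user-space analogue of this lookup is sketched below, assuming that a C11 memory_order_consume load is an acceptable stand-in for rcu_dereference() (most compilers currently treat consume as acquire). The structure layouts and the _sketch names are hypothetical and far simpler than the kernel's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NFDS_SKETCH 4

struct file_sketch {
    int id;                                      /* illustrative payload */
};

struct fdtable_sketch {
    unsigned int max_fds;
    _Atomic(struct file_sketch *) fd[NFDS_SKETCH];
};

struct files_sketch {
    _Atomic(struct fdtable_sketch *) fdt;        /* may be replaced by updaters */
};

/* Return the file for fd, or NULL if fd is out of range or not open. */
static struct file_sketch *fcheck_files_sketch(struct files_sketch *files,
                                               unsigned int fd)
{
    struct file_sketch *file = NULL;
    /* Dependency-ordered load of the table pointer. */
    struct fdtable_sketch *fdt =
        atomic_load_explicit(&files->fdt, memory_order_consume);

    assert(fdt != NULL);
    if (fd < fdt->max_fds)
        /* Second dependency-ordered load: the file pointer itself. */
        file = atomic_load_explicit(&fdt->fd[fd], memory_order_consume);
    return file;
}
```

The two loads mirror the two uses of rcu_dereference() described above: one for the table, one for the entry within it.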
The fput() primitive releases a reference to a file structure. Line 31 atomically decrements the reference count, and, if the result is zero, line 32 invokes the call_rcu() primitive in order to free up the file structure (via the file_free_rcu() function specified in call_rcu()'s second argument), but only after all currently-executing RCU read-side critical sections complete. The time period required for all currently-executing RCU read-side critical sections to complete is termed a ``grace period''. Note that the atomic_dec_and_test() primitive contains a memory barrier. This memory barrier is not necessary in this example, since the structure cannot be destroyed until the RCU read-side critical section completes, but in Linux, all atomic operations that return a result must by definition contain memory barriers.
Once the grace period completes, the file_free_rcu() function obtains a pointer to the file structure on line 39, and frees it on line 40.
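The release path and its deferred free can be modeled in user space as shown below. This is only a sketch under loud assumptions: call_rcu_sketch() merely queues the callback on a single-threaded list, and grace_period_sketch() is a hypothetical stand-in that the caller invokes when it knows no readers remain. Real RCU detects the end of the grace period itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct rcu_head_sketch {
    struct rcu_head_sketch *next;
    void (*func)(struct rcu_head_sketch *head);
};

/* Pending callbacks; NOT thread-safe, for illustration only. */
static struct rcu_head_sketch *pending_sketch;

/* Queue func to run on head once the "grace period" has elapsed. */
static void call_rcu_sketch(struct rcu_head_sketch *head,
                            void (*func)(struct rcu_head_sketch *head))
{
    head->func = func;
    head->next = pending_sketch;
    pending_sketch = head;
}

/* Stand-in for grace-period completion; returns callbacks invoked. */
static int grace_period_sketch(void)
{
    int n = 0;

    while (pending_sketch) {
        struct rcu_head_sketch *head = pending_sketch;

        pending_sketch = head->next;  /* Unlink before the callback frees it. */
        head->func(head);
        n++;
    }
    return n;
}

struct file_sketch {
    atomic_long f_count;
    struct rcu_head_sketch f_rcuhead;
};

/* Callback: recover the enclosing file structure and free it. */
static void file_free_rcu_sketch(struct rcu_head_sketch *head)
{
    struct file_sketch *f = (struct file_sketch *)
        ((char *)head - offsetof(struct file_sketch, f_rcuhead));

    free(f);
}

/* Drop one reference; the last release schedules the deferred free. */
static void fput_sketch(struct file_sketch *f)
{
    long old = atomic_fetch_sub(&f->f_count, 1);

    assert(old > 0);
    if (old == 1)  /* The new value is zero. */
        call_rcu_sketch(&f->f_rcuhead, file_free_rcu_sketch);
}
```

Note that the structure remains allocated between the final fput_sketch() and the call to grace_period_sketch(), which is what allows a concurrent reader to safely observe the zero count and back off.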
This approach is also used by Linux's virtual-memory system: see get_page_unless_zero() and put_page_testzero() for page structures, as well as try_to_unuse() and mmput() for memory-map structures.
Paul E. McKenney 2011-12-16