E.7.3 Lessons (Re)Learned
Figure:
Memory-Barrier Fix Patch
 |
Figure:
Variable-Name-Typo Fix Patch
 |
This effort provided some lessons (re)learned:
- Promela and spin can verify interrupt/NMI-handler
interactions.
- Documenting code can help locate bugs.
In this case, the documentation effort located
a misplaced memory barrier in
rcu_enter_nohz() and rcu_exit_nohz(),
as shown by the patch in
Figure
.
- Validate your code early, often, and up to the point
of destruction.
This effort located one subtle bug in
rcu_try_flip_waitack_needed()
that would have been quite difficult to test or debug, as
shown by the patch in
Figure
.
- Always verify your verification code.
The usual way to do this is to insert a deliberate bug
and verify that the verification code catches it. Of course,
if the verification code fails to catch this bug, you may also
need to verify the bug itself, and so on, recursing infinitely.
However, if you find yourself in this position,
getting a good night's sleep
can be an extremely effective debugging technique.
- Use of atomic instructions can simplify verification.
Unfortunately, use of the cmpxchg atomic instruction
would also slow down the critical irq fastpath, so they
are not appropriate in this case.
- The need for complex formal verification often indicates
a need to re-think your design.
In fact the design verified in this section turns out to have
a much simpler solution, which is presented in the next section.
Paul E. McKenney
2011-12-16