The signal-theft implementation runs more than twice as fast as the atomic implementation on my Intel Core Duo laptop. Is it always preferable?
The signal-theft implementation would be vastly preferable on Pentium-4 systems, given their slow atomic instructions, but the old 80386-based Sequent Symmetry systems would do much better with the shorter path length of the atomic implementation. If ultimate performance is of the essence, you will need to measure both implementations on the system where your application will be deployed.
This is but one reason why high-quality APIs are so important: they permit implementations to be changed as required by ever-changing hardware performance characteristics.
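To make this concrete, here is a minimal sketch of how a stable API lets you swap and time implementations. Everything named here is an assumption made for illustration: the `limit_counter_ops` structure, the `GLOBAL_LIMIT` value, the `time_ops()` helper, and the single-variable compare-and-swap counter, which is a simplified stand-in and not the atomic implementation discussed above. The function names `add_count()`, `sub_count()`, and `read_count()` are assumed to follow the limit-counter interface.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#include <stdatomic.h>
#include <time.h>

/* Hypothetical interface that every limit-counter implementation fills in. */
struct limit_counter_ops {
	const char *name;
	int (*add_count)(unsigned long delta);	/* 1 on success, 0 on failure */
	int (*sub_count)(unsigned long delta);	/* 1 on success, 0 on failure */
	unsigned long (*read_count)(void);
};

#define GLOBAL_LIMIT 10000UL	/* hypothetical upper limit */

static atomic_ulong counter;	/* simplified stand-in: one shared counter */

static int atomic_add_count(unsigned long delta)
{
	unsigned long old = atomic_load(&counter);

	do {
		if (delta > GLOBAL_LIMIT - old)
			return 0;	/* would exceed the upper limit */
	} while (!atomic_compare_exchange_weak(&counter, &old, old + delta));
	return 1;
}

static int atomic_sub_count(unsigned long delta)
{
	unsigned long old = atomic_load(&counter);

	do {
		if (delta > old)
			return 0;	/* would drop below zero */
	} while (!atomic_compare_exchange_weak(&counter, &old, old - delta));
	return 1;
}

static unsigned long atomic_read_count(void)
{
	return atomic_load(&counter);
}

static const struct limit_counter_ops atomic_ops = {
	.name = "atomic (stand-in)",
	.add_count = atomic_add_count,
	.sub_count = atomic_sub_count,
	.read_count = atomic_read_count,
};

/*
 * Time n add/sub pairs through the interface and return the average
 * nanoseconds per operation.  This is single-threaded, so it measures
 * only path length, not contention.
 */
static double time_ops(const struct limit_counter_ops *ops, unsigned long n)
{
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (unsigned long i = 0; i < n; i++) {
		ops->add_count(1);
		ops->sub_count(1);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / (2.0 * n);
}
```

A signal-theft implementation would supply its own `limit_counter_ops` instance, and callers would pass either one to `time_ops()` without any other change. A realistic comparison would also run the loop from multiple threads on the deployment hardware, since a single-threaded timing says nothing about behavior under contention.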
Quick Quiz 6.45: What if you want an exact limit counter to be exact only for its lower limit? End Quick Quiz