4.2.1 Hardware System Architecture

Figure: System Hardware Architecture

The figure above shows a rough schematic of an eight-core computer system. Each die has a pair of CPU cores, each with its own cache, as well as an interconnect allowing the pair of CPUs to communicate with each other. The system interconnect in the middle of the diagram allows the four dies to communicate with one another, and also connects them to main memory.

Data moves through this system in units of ``cache lines'', which are fixed-size, aligned blocks of memory whose size is a power of two, usually ranging from 32 to 256 bytes. When a CPU loads a variable from memory into one of its registers, it must first load the cacheline containing that variable into its cache. Similarly, when a CPU stores a value from one of its registers to memory, it must not only load the cacheline containing that variable into its cache, but must also ensure that no other CPU has a copy of that cacheline.
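As a rough illustration of the alignment and ownership rules just described, the following Python sketch maps addresses to cache lines and tracks which CPUs hold each line. It is a toy model rather than a description of any real hardware: the 64-byte line size is an assumption, and the names line_base, line_offset, same_line, and LineDirectory are hypothetical helpers introduced only for this example.

```python
LINE_SIZE = 64  # assumed cache-line size in bytes; real systems vary


def line_base(addr, line_size=LINE_SIZE):
    """Return the address of the first byte of the line containing addr."""
    # line_size is a power of two, so clearing the low bits aligns downward.
    return addr & ~(line_size - 1)


def line_offset(addr, line_size=LINE_SIZE):
    """Return the byte offset of addr within its cache line."""
    return addr & (line_size - 1)


def same_line(a, b, line_size=LINE_SIZE):
    """Return True if addresses a and b fall in the same cache line."""
    return line_base(a, line_size) == line_base(b, line_size)


class LineDirectory:
    """Toy bookkeeping of which CPUs hold a copy of each cache line."""

    def __init__(self):
        self.holders = {}  # line base address -> set of CPU numbers

    def load(self, cpu, addr):
        # A load only needs a copy, so other CPUs may keep theirs.
        self.holders.setdefault(line_base(addr), set()).add(cpu)

    def store(self, cpu, addr):
        # A store needs the only copy, so all other holders are dropped.
        self.holders[line_base(addr)] = {cpu}
```

In this model, two variables at addresses 0x1234 and 0x1238 share a line, so a store to either one by one CPU removes the line from every other CPU's cache.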

For example, if CPU 0 were to perform a compare-and-swap (CAS) operation on a variable whose cacheline resided in CPU 7's cache, the following over-simplified sequence of events might ensue:

  1. CPU 0 checks its local cache, and does not find the cacheline.
  2. The request is forwarded to CPU 0's and 1's interconnect, which checks CPU 1's local cache, and does not find the cacheline.
  3. The request is forwarded to the system interconnect, which checks with the other three dies, learning that the cacheline is held by the die containing CPUs 6 and 7.
  4. The request is forwarded to CPU 6's and 7's interconnect, which checks both CPUs' caches, finding the cacheline in CPU 7's cache.
  5. CPU 7 forwards the cacheline to its interconnect, and also flushes the cacheline from its cache.
  6. CPU 6's and 7's interconnect forwards the cacheline to the system interconnect.
  7. The system interconnect forwards the cacheline to CPU 0's and 1's interconnect.
  8. CPU 0's and 1's interconnect forwards the cacheline to CPU 0's cache.
  9. CPU 0 can now perform the CAS operation on the value in its cache.

Quick Quiz 4.3: This is a simplified sequence of events? How could it possibly be any more complex? End Quick Quiz

Quick Quiz 4.4: Why is it necessary to flush the cacheline from CPU 7's cache? End Quick Quiz

Paul E. McKenney 2011-12-16