4.1.6 I/O Operations

Figure: CPU Waits for I/O Completion

A cache miss can be thought of as a CPU-to-CPU I/O operation, and as such is one of the cheapest I/O operations available. I/O operations involving networking, mass storage, or (worse yet) human beings pose much greater obstacles than the internal obstacles called out in the prior sections, as illustrated by the figure "CPU Waits for I/O Completion".

This is one of the differences between shared-memory and distributed-system parallelism: shared-memory parallel programs must normally deal with no obstacle worse than a cache miss, while distributed parallel programs typically incur the larger network communication latencies. In both cases, the relevant latencies can be thought of as a cost of communication, a cost that would be absent in a sequential program. Therefore, the ratio of the communication overhead to the actual work being performed is a key design parameter. A major goal of parallel design is to reduce this ratio as needed to achieve the relevant performance and scalability goals.
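The communication-to-work ratio can be made concrete with a small calculation. The following is a minimal sketch, not a measurement tool: the function names comm_to_work_ratio() and work_fraction() are hypothetical, the inputs are whatever per-unit latencies the reader has measured, and it assumes that communication and computation do not overlap.

```c
#include <assert.h>

/*
 * Hypothetical helper: ratio of communication overhead to useful work
 * for one unit of work.  Both arguments are in the same time unit
 * (for example, nanoseconds).  Lower is better.
 */
double comm_to_work_ratio(double comm_time, double work_time)
{
	assert(comm_time >= 0.0 && work_time > 0.0);
	return comm_time / work_time;
}

/*
 * Hypothetical helper: fraction of elapsed time spent on useful work,
 * assuming communication and computation are strictly serialized
 * (no overlap).  This equals 1 / (1 + ratio).
 */
double work_fraction(double comm_time, double work_time)
{
	assert(comm_time >= 0.0 && work_time > 0.0);
	return work_time / (work_time + comm_time);
}
```

Under these assumptions, a unit of work that communicates for as long as it computes has a ratio of 1.0 and spends only half of its time on useful work. Batching more work per communication lowers the ratio and raises that fraction.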

Of course, it is one thing to say that a given operation is an obstacle, and quite another to show that the operation is a significant obstacle. This distinction is discussed in the following sections.

Paul E. McKenney 2011-12-16