3.2.3 Generality

One way to justify the high cost of developing parallel software is to strive for maximal generality. All else being equal, the cost of a more-general software artifact can be spread over more users than can a less-general artifact.

Unfortunately, generality often comes at the cost of performance, productivity, or both. To see this, consider the following popular parallel programming environments:

C/C++ ``Locking Plus Threads''
: This category, which includes POSIX Threads (pthreads) [Ope97], Windows Threads, and numerous operating-system kernel environments, offers excellent performance (at least within the confines of a single SMP system) as well as good generality. Pity about the relatively low productivity: every thread creation, every lock acquisition and release, and every join must be coded by hand, as the pthreads sketch following this list illustrates.
Java
: This programming environment, which is inherently multithreaded, is widely believed to be much more productive than C or C++, courtesy of the automatic garbage collector and the rich set of class libraries, and is reasonably general purpose. However, its performance, though greatly improved over the past ten years, is generally considered to be less than that of C and C++.
MPI
: This message-passing interface [MPI08] powers the largest scientific and technical computing clusters in the world, and thus offers unparalleled performance and scalability. It is general purpose in theory, but has generally been used for scientific and technical computing. Its productivity is believed by many to be even lower than that of C/C++ ``locking plus threads'' environments; a minimal MPI sketch follows this list.
OpenMP
: This set of compiler directives can be used to parallelize loops, as the sketch following this list shows. It is thus quite specific to this task, and this specificity often limits its performance. It is, however, much easier to use than MPI or parallel C/C++.
SQL
: Structured Query Language [Int92] is extremely specific, applying only to relational database queries. However, its performance is quite good, as demonstrated by Transaction Processing Performance Council (TPC) benchmark results [Tra01]. Productivity is excellent; in fact, this parallel programming environment permits people who know almost nothing about parallel programming to make good use of a large parallel machine.
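
To make these productivity comparisons concrete, consider a few minimal sketches, starting with the classic shared-counter exercise coded against pthreads. These sketches are illustrative only: the thread counts, loop bounds, and names such as incrementer are arbitrary choices rather than anything mandated by the environments themselves. Note how every lock acquisition and release, every thread creation, and every join must be spelled out by hand:

/* Minimal "locking plus threads" sketch: each thread increments
 * a shared counter under an explicit pthread mutex.
 * Compile with: cc -pthread counter.c
 */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define NINCS 1000000

static unsigned long counter;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *incrementer(void *arg)
{
        (void)arg;
        for (int i = 0; i < NINCS; i++) {
                pthread_mutex_lock(&lock);   /* Explicit acquisition... */
                counter++;
                pthread_mutex_unlock(&lock); /* ...and explicit release. */
        }
        return NULL;
}

int main(void)
{
        pthread_t tid[NTHREADS];

        for (int i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, incrementer, NULL);
        for (int i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);
        printf("counter = %lu\n", counter);
        return 0;
}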
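
A minimal MPI sketch is similarly explicit, but about communication rather than locking: because MPI processes share no memory, moving even a single integer from one rank to another requires hand-coded sends and receives. (The mpicc and mpirun invocations shown in the comment are typical, but vary across MPI implementations.)

/* Minimal MPI sketch: rank 0 sends an integer to rank 1.
 * Compile with: mpicc ping.c
 * Run with:     mpirun -np 2 ./a.out
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
        int rank, value = 42;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
                /* Explicitly ship the value to rank 1. */
                MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
                /* Explicitly receive it from rank 0. */
                MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("rank 1 received %d\n", value);
        }
        MPI_Finalize();
        return 0;
}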
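
By contrast, OpenMP handles the same counter loop with a single directive, the reduction clause taking the place of the explicit lock:

/* The same shared-counter loop, parallelized by one OpenMP
 * directive; the reduction clause replaces the explicit lock.
 * Compile with: cc -fopenmp counter_omp.c
 * (Without -fopenmp, the pragma is ignored and the loop runs serially.)
 */
#include <stdio.h>

int main(void)
{
        unsigned long counter = 0;

        #pragma omp parallel for reduction(+:counter)
        for (long i = 0; i < 4000000L; i++)
                counter++;
        printf("counter = %lu\n", counter);
        return 0;
}

This single-directive convenience is exactly the loop-level specificity noted above: where the parallelism does not fit the loop-oriented mold, OpenMP offers far less help.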

Figure: Software Layers and Performance, Productivity, and Generality

The nirvana of parallel programming environments, one that offers world-class performance, productivity, and generality, simply does not yet exist. Until such a nirvana appears, it will be necessary to make engineering tradeoffs among performance, productivity, and generality. One such tradeoff is shown in the figure above (``Software Layers and Performance, Productivity, and Generality''), which shows how productivity becomes increasingly important at the upper layers of the system stack, while performance and generality become increasingly important at the lower layers. The huge development costs incurred near the bottom of the stack must be spread over equally huge numbers of users (hence the importance of generality), and performance lost near the bottom of the stack cannot easily be recovered further up. Near the top of the stack, in contrast, there might be very few users for a given specific application, in which case productivity concerns are paramount. This explains the tendency towards ``bloatware'' further up the stack: extra hardware is often cheaper than the extra developers would be. This book is intended primarily for developers working near the bottom of the stack, where performance and generality are the paramount concerns.

Figure: Tradeoff Between Productivity and Generality

It is important to note that a tradeoff between productivity and generality has existed for centuries in many fields. For but one example, a nailgun is far more productive than a hammer, but in contrast to the nailgun, a hammer can be used for many things besides driving nails. It should therefore be absolutely no surprise to see similar tradeoffs appear in the field of parallel computing. This tradeoff is shown schematically in the figure above (``Tradeoff Between Productivity and Generality''). Here, Users 1, 2, 3, and 4 have specific jobs that they need the computer to help them with. The most productive possible language or environment for a given user is one that simply does that user's job, without requiring any programming, configuration, or other setup.

Quick Quiz 3.10: This is a ridiculously unachievable ideal! Why not focus on something that is achievable in practice? End Quick Quiz

Unfortunately, a system that does the job required by User 1 is unlikely to do User 2's job. In other words, the most productive languages and environments are domain-specific, and are thus by definition lacking in generality.

Another option is to tailor a given programming language or environment to the hardware system (for example, low-level languages such as assembly, C, C++, or Java) or to some abstraction (for example, Haskell, Prolog, or Snobol), as is shown by the circular region near the center of the ``Tradeoff Between Productivity and Generality'' figure above. These languages can be considered to be general in the sense that they are equally ill-suited to the jobs required by Users 1, 2, 3, and 4. In other words, their generality is purchased at the expense of decreased productivity when compared to domain-specific languages and environments.

With the three often-conflicting parallel-programming goals of performance, productivity, and generality in mind, it is now time to look into avoiding these conflicts by considering alternatives to parallel programming.
