Why Threads?
- The primary motivation for using threads is to realize potential program performance gains.
- When compared to the cost of creating and managing a process, a thread can be created with much less OS overhead.
- Managing threads requires fewer system resources than managing processes.
- Threaded programming models offer significant advantages over message-passing models, along with some disadvantages.
- Software Portability;
- Threaded applications can be developed on serial machines and run on parallel machines without any changes.
- This ability to migrate programs between diverse architectural platforms is a very significant advantage of threaded APIs.
- Latency Hiding;
- One of the major overheads in programs (both serial and parallel) is the latency of memory access, I/O, and communication operations.
- By allowing multiple threads to execute on the same processor, threaded APIs enable this latency to be hidden.
- In effect, while one thread is waiting for a communication operation, other threads can utilize the CPU, thus masking associated overhead.
- Scheduling and Load Balancing;
- While in many structured applications the task of allocating equal work to processors is easily accomplished,
- in unstructured and dynamic applications (such as game playing and discrete optimization) it is far more difficult.
- Threaded APIs allow the programmer
- to specify a large number of concurrent tasks
- and support system-level dynamic mapping of tasks to processors with a view to minimizing idling overheads.
- Ease of Programming, Widespread Use
- Due to the advantages listed above, threaded programs are often significantly easier to write than corresponding programs using message-passing APIs.
- With widespread acceptance of the POSIX thread API, development tools for POSIX threads are more widely available and stable.
- Overlapping CPU work with I/O: For example, a program may have sections where it is performing a long I/O operation. While one thread is waiting for an I/O system call to complete, CPU intensive work can be performed by other threads.
- Priority/real-time scheduling: tasks which are more important can be scheduled to supersede or interrupt lower priority tasks.
- Asynchronous event handling: tasks which service events of indeterminate frequency and duration can be interleaved. For example, a web server can both transfer data from previous requests and manage the arrival of new requests.
- A number of vendors provide vendor-specific thread APIs. Standardization efforts have resulted in two very different standard APIs for threads.
- Microsoft has its own thread API for Windows, which is not related to the UNIX POSIX standard or to OpenMP.
- POSIX Threads. Library based; requires parallel coding.
- C Language only. Very explicit parallelism; requires significant programmer attention to detail.
- Commonly referred to as Pthreads.
- POSIX has emerged as the standard threads API, supported by most vendors.
- OpenMP. Compiler directive based; can use serial code.
- Jointly defined by a group of major computer hardware and software vendors.
- The OpenMP C/C++ API was released in late 1998.
- Portable / multi-platform, including Unix and Windows platforms
- Can be very easy and simple to use - provides for "incremental parallelism".
- MPI: on-node communication
- MPI libraries usually implement on-node task communication via shared memory, which involves at least one memory copy operation (process to process).
- Threads: on-node data transfer
- For Pthreads there is no intermediate memory copy required because threads share the same address space within a single process.
- In effect, there is no data "transfer" at all.
- It becomes a cache-to-CPU or, in the worst case, memory-to-CPU bandwidth situation, and these speeds are much higher.