Clusters


The 1990s witnessed a significant shift from expensive, specialized parallel machines to more cost-effective clusters of PCs and workstations. Advances in network technology and the availability of low-cost, high-performance commodity workstations drove this shift.
Clusters provide an economical way of achieving high performance. Each node in a cluster could be a workstation, personal computer, or even a multiprocessor system. Each node has its own input/output systems and its own operating system.
When all nodes in a cluster have the same architecture and run the same operating system, the cluster is called homogeneous; otherwise, it is heterogeneous.
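The definition above can be sketched as a short check. This is a minimal illustration only; the `Node` record and its `arch` and `os` fields are hypothetical names, not part of any cluster toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one cluster node."""
    name: str
    arch: str  # e.g. "x86"
    os: str    # e.g. "Linux"


def is_homogeneous(nodes):
    """Return True when every node shares one (architecture, OS) pair."""
    return len({(n.arch, n.os) for n in nodes}) <= 1


same = [Node("n1", "x86", "Linux"), Node("n2", "x86", "Linux")]
mixed = same + [Node("n3", "sparc", "Solaris")]

print(is_homogeneous(same))
print(is_homogeneous(mixed))
```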
The interconnection network could be a fast LAN or a switch. To achieve high-performance computing, the interconnection network must provide high-bandwidth and low-latency communication.
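To see why both properties matter, a common first-order model estimates the time to send a message as a fixed latency plus the message size divided by the bandwidth. The sketch below uses that model; the latency value is an assumption chosen for illustration, and the bandwidth is the nominal 100 Mbit/s rate of Fast Ethernet.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bits_per_s):
    """Estimated one-way time: fixed latency plus size divided by bandwidth."""
    return latency_s + (message_bytes * 8) / bandwidth_bits_per_s


# Assumed figures for illustration only: 100 microseconds of latency
# on a 100 Mbit/s link.
LATENCY_S = 100e-6
BANDWIDTH = 100e6

small = transfer_time(1_000, LATENCY_S, BANDWIDTH)      # 1 KB message
large = transfer_time(1_000_000, LATENCY_S, BANDWIDTH)  # 1 MB message

# Share of the total time spent on latency rather than on moving data.
print(f"small message: {LATENCY_S / small:.0%} of the time is latency")
print(f"large message: {LATENCY_S / large:.2%} of the time is latency")
```

Under these assumed figures, latency accounts for more than half of the time for the small message and well under one percent for the large one. Applications that exchange many short messages are therefore limited mainly by latency, while bulk data transfers are limited mainly by bandwidth.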
Dedicated clusters are normally packaged compactly in a single room. With the exception of the front-end node, all nodes are headless, with no keyboard, mouse, or monitor. Dedicated clusters usually use high-speed networks such as Fast Ethernet and Myrinet.
Alternatively, nodes owned by different individuals on the Internet could participate in a cluster only part of the time. In this case, the cluster can utilize the idle CPU cycles of each participating node if the owner's permission is granted.

Figure 5: A cluster made of homogeneous single-processor computers.

$\includegraphics[scale=1]{figures/supervisorworkers.ps}$
Figure 5 shows the architecture of a homogeneous cluster made of similar nodes, where each node is a single-processor workstation. The middleware layer in the architecture makes the cluster appear to the user as a single parallel machine, which is referred to as the single system image (SSI). The SSI infrastructure offers unified access to system resources by supporting a number of features, including:
- Single entry point: A user can connect to the cluster instead of to a particular node.
- Single file system: A user sees a single hierarchy of directories and files.
- Single image for administration: The whole cluster is administered from a single window.
- Coordinated resource management: A job can transparently compete for the resources in the entire cluster.
In addition to providing high-performance computing, clusters can also be used to provide a high-availability environment. High availability can be achieved when only a subset of the nodes is used in the computation and the rest are kept as backups in case of failure.
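The backup arrangement can be sketched as two lists of node names, one active and one standby. The function below is a hypothetical illustration of the bookkeeping only; it does not detect failures or move any running work.

```python
def fail_over(active, standby, failed):
    """Replace a failed active node with the first available standby node.

    Returns the new (active, standby) lists. If no standby node is left,
    the cluster simply continues with fewer active nodes.
    """
    if failed not in active:
        raise ValueError(f"{failed} is not an active node")
    remaining = [n for n in active if n != failed]
    if not standby:
        return remaining, []
    return remaining + [standby[0]], standby[1:]


active, standby = ["n1", "n2", "n3"], ["b1", "b2"]
active, standby = fail_over(active, standby, "n2")
print(active, standby)
```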
When high availability is one of the main objectives of the cluster, the middleware also supports features that let cluster services recover from failures and that provide fault tolerance across all nodes. For example, the middleware should offer the necessary infrastructure for checkpointing. A checkpointing scheme saves the process state periodically. If a node fails, the processes on that node can be restarted on another working node from the most recently saved state.
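The idea of periodic checkpointing can be sketched at the application level as follows. This is a simplified illustration, not how cluster middleware implements it: the function names are hypothetical, the "process state" is a small dictionary, and the checkpoint file is assumed to live on storage that the replacement node can also read (for example, the single file system described above).

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write the state atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # atomic rename over the previous checkpoint


def load_checkpoint(path, default):
    """Return the last saved state, or the default if none exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run(path, total_steps, checkpoint_every, fail_at=None):
    """Sum 0..total_steps-1, saving state every few steps.

    fail_at simulates a node failure just before the given step.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state["acc"]
```

A first call such as `run(path, 10, 3, fail_at=7)` raises the simulated failure after the checkpoint taken at step 6; a second call `run(path, 10, 3)` resumes from that checkpoint instead of starting over. Only the work done since the last checkpoint is repeated.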
The programming environment and tools layer provides the programmer with portable tools and libraries for the development of parallel applications. Examples of such tools and libraries are Threads, Parallel Virtual Machine (PVM), and Message Passing Interface (MPI).
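As a small taste of the first of these tools, the sketch below uses Python's standard `threading` and `queue` modules to arrange a supervisor and a set of workers. It is an illustration under simplifying assumptions: the threads share the memory of one process on one node, whereas PVM or MPI programs would exchange messages between processes on different nodes. The `supervisor` function and its parameters are hypothetical names.

```python
import queue
import threading


def supervisor(tasks, num_workers, work):
    """Hand tasks to worker threads and collect results in task order."""
    task_q = queue.Queue()
    results = [None] * len(tasks)

    def worker():
        while True:
            item = task_q.get()
            if item is None:  # sentinel: no more work for this worker
                break
            index, task = item
            results[index] = work(task)  # each index is written only once

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in enumerate(tasks):
        task_q.put(item)
    for _ in threads:
        task_q.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


squares = supervisor([1, 2, 3, 4], num_workers=2, work=lambda x: x * x)
print(squares)
```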



Cem Ozdogan 2006-12-27