Abstract: Resource-Aware Scientific Computation on a Heterogeneous Cluster

James D. Teresco, Jamal Faik, Joseph E. Flaherty
Previously published as Williams College Department of Computer Science Technical Report CS-04-10, 2004.
Computing in Science & Engineering, Vol. 7, Number 2, pp. 40-50, 2005.

The popularity of clusters has opened a new platform for software originally designed for tightly-coupled supercomputers. This includes a large base of portable software, such as solvers for systems of partial differential equations using finite element and related methods. However, design decisions and optimizations are based on the platform for which the software is first developed. Moving from a tightly-coupled supercomputer to a cluster, or even from one cluster to another, may reduce efficiency. We describe experiences using the "Bullpen Cluster" at Williams College to run parallel adaptive finite element software. The cluster added complications, including nonuniform processor speeds, a mixture of 1-, 2-, and 4-processor nodes, and a network that is slower relative to processing speed than on previous platforms. We describe two application-independent tools that we have developed to support computing on heterogeneous and hierarchical clusters: the Dynamic Resource Utilization Model (DRUM) and a Hierarchical Partitioning and Dynamic Load Balancing system.
