DRUM: The Dynamic Resource Utilization Model


Jamal Faik (now at Oracle, Inc.), Joe Flaherty (retired), Luis Gervasio (now at Rockwell Collins)
Department of Computer Science
Rensselaer Polytechnic Institute
110 Eighth Street
Troy, NY 12180
Jim Teresco (formerly Williams College)
Department of Computer Science
Siena College
515 Loudon Rd.
Loudonville, NY 12211

Overview

The Dynamic Resource Utilization Model (DRUM) supports resource-aware, large-scale scientific computing in heterogeneous and hierarchical parallel computing environments. Clusters have gained wide acceptance as a viable alternative to tightly-coupled parallel computers. They provide cost-effective environments for running computationally-intensive parallel and distributed applications. An attractive feature of clusters is the ability to expand their computational power incrementally by incorporating additional nodes. This expansion frequently results in heterogeneous environments, as the newly-added nodes often have superior capabilities. Grid technologies have enabled computation on even more heterogeneous and widely-distributed systems. Internet-connected systems introduce still greater heterogeneity and an extreme network hierarchy.

DRUM encapsulates hardware resources and their interconnection topology. DRUM provides monitoring facilities for dynamic evaluation of communication, memory, and processing capabilities. Heterogeneity is quantified by aggregating the information from the monitors into a scalar form, easily usable by existing load balancing algorithms.

[Figure: Sample machine model]
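
The aggregation of monitor readings into a single scalar can be illustrated with a small sketch. It is not DRUM's code or API: the `MonitorReading` fields, the `scalar_powers` function, and the `comm_weight` blending factor are all hypothetical names chosen for illustration, and memory readings are left out for brevity. The sketch assumes each reading is normalized across nodes and then combined as a weighted sum.

```python
from dataclasses import dataclass


@dataclass
class MonitorReading:
    """Hypothetical per-node readings (not DRUM's actual data structures)."""
    cpu_rate: float   # processing capability, e.g. benchmark rating times idle fraction
    comm_rate: float  # communication capability, e.g. available bandwidth


def scalar_powers(readings, comm_weight=0.2):
    """Blend CPU and communication readings into one value per node.

    Each quantity is normalized across all nodes so the two are comparable,
    then combined with an assumed weight `comm_weight` in [0, 1].
    The returned values sum to 1.
    """
    if not 0.0 <= comm_weight <= 1.0:
        raise ValueError("comm_weight must be in [0, 1]")
    total_cpu = sum(r.cpu_rate for r in readings)
    total_comm = sum(r.comm_rate for r in readings)
    if total_cpu <= 0 or total_comm <= 0:
        raise ValueError("readings must have positive totals")
    return [
        (1.0 - comm_weight) * r.cpu_rate / total_cpu
        + comm_weight * r.comm_rate / total_comm
        for r in readings
    ]


if __name__ == "__main__":
    nodes = [MonitorReading(2.0, 1.0), MonitorReading(1.0, 1.0), MonitorReading(1.0, 2.0)]
    for reading, power in zip(nodes, scalar_powers(nodes)):
        print(reading, round(power, 3))
```

A single number per node is convenient because a load balancer that already accepts relative partition sizes needs no other changes to become resource-aware.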

DRUM is a standalone library, requiring only MPI and a C compiler. DRUM will make use of other libraries (e.g., pthreads, SNMP, NWS) if they are available. When combined with the partitioners and load balancers in Sandia National Laboratories' Zoltan Toolkit, DRUM provides a convenient and straightforward way to tailor partitions to a given parallel computing environment.
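
As a rough illustration of what tailoring partitions to an environment means, the sketch below converts per-process power values into target partition sizes. It does not use Zoltan's or DRUM's interfaces; `target_sizes` is a hypothetical helper, and the largest-remainder rounding is simply one reasonable way to keep the sizes integral.

```python
def target_sizes(total_items, powers):
    """Split `total_items` work items in proportion to `powers`.

    Uses largest-remainder rounding so the sizes are integers that
    add up exactly to `total_items`.
    """
    if total_items < 0:
        raise ValueError("total_items must be non-negative")
    total_power = sum(powers)
    if total_power <= 0:
        raise ValueError("powers must have a positive sum")

    exact = [total_items * p / total_power for p in powers]
    sizes = [int(x) for x in exact]          # round every share down first
    leftover = total_items - sum(sizes)      # items still unassigned

    # Give the leftover items to the processes with the largest fractional parts.
    order = sorted(range(len(powers)), key=lambda i: exact[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


if __name__ == "__main__":
    # Assumed example: one process twice as powerful as the other two.
    print(target_sizes(1000, [2.0, 1.0, 1.0]))
```

In a real application the resulting sizes (or the power values themselves) would be handed to the partitioner as relative partition weights rather than computed by the application.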

DRUM requires knowledge of the execution environment. Some of this information can be inferred at run time, but other information must be specified manually. DRUM provides a graphical front end called DrumHead that aids in this initial configuration, which is required only when first using DRUM, or when characteristics of the cluster are changed.

[Figure: Graphical configuration program]
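
To make the idea of a described execution environment concrete, here is a minimal sketch of a hierarchical machine model held as a tree. It is an assumption-laden illustration, not DrumHead's file format or DRUM's data structures: `ModelNode`, `total_power`, and `leaf_fractions` are hypothetical names, leaves stand for computing nodes, interior nodes stand for network components, and an interior node's power is assumed to be the sum of its children's.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelNode:
    """Hypothetical machine-model node: a leaf computes, an interior node connects."""
    name: str
    power: float = 0.0  # only meaningful for leaves
    children: List["ModelNode"] = field(default_factory=list)

    def total_power(self) -> float:
        """Power of a leaf, or the summed power of everything below an interior node."""
        if not self.children:
            return self.power
        return sum(child.total_power() for child in self.children)


def leaf_fractions(root: ModelNode) -> Dict[str, float]:
    """Return each leaf's share of the total power in the tree."""
    total = root.total_power()
    if total <= 0:
        raise ValueError("tree must contain positive total power")
    fractions: Dict[str, float] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(node.children)
        else:
            fractions[node.name] = node.power / total
    return fractions


if __name__ == "__main__":
    # Assumed topology: a router joining two switches with unequal nodes.
    cluster = ModelNode("router", children=[
        ModelNode("switch_a", children=[ModelNode("a1", 2.0), ModelNode("a2", 2.0)]),
        ModelNode("switch_b", children=[ModelNode("b1", 4.0)]),
    ])
    print(leaf_fractions(cluster))
```

Keeping the topology explicit lets a hierarchical balancer first divide work among subtrees and then within each subtree.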

Availability

DRUM will be made freely available, but it is not yet ready for a general release. If you would like to find out whether a prerelease version might be useful, please contact the developers at drum-devel AT teresco.org (replace AT with @). DRUM currently works on Solaris and Linux systems. FreeBSD and Mac OS X/Darwin support is partially implemented but not yet functional.

Publications

The most thorough description of DRUM to date is contained in Jamal Faik's Ph.D. dissertation:

This article describes the Bullpen Cluster at Williams College and how it motivated DRUM and hierarchical partitioning and load balancing:

The last section of this article describes DRUM and presents some DRUM results:

This manuscript describes DRUM in some detail:

This manuscript describes the DrumHead graphical configuration tool and work on the DRUM interface to the Network Weather Service:

This manuscript describes resource-aware parallel computation, including DRUM and hierarchical balancing:

This manuscript describes hierarchical balancing in Zoltan, which is part of the DRUM project:

Acknowledgements

Faik, Gervasio, Teresco and Flaherty were supported in part by contract 15162 with Sandia National Laboratories, a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract DE-AC04-94AL85000.

Significant DRUM development was done by Rensselaer undergraduate Jin Chang, who is now a graduate student at the University of Chicago.

Karen Devine and Erik Boman at Sandia National Laboratories have been consulted extensively during the design and implementation of DRUM.

NWS support and FreeBSD support were added by Williams Summer Science program undergraduate Laura Effinger-Dean, and the new graphical user interface DrumHead was initially developed by Williams Summer Science program undergraduate Arjun Sharma. Williams Summer Science program undergraduate Bartley Tablante developed new machine model tree construction routines that use DrumHead's XML files, added code to allow DRUM to guide Zoltan's hierarchical balancing procedures, and redesigned the Makefiles used to configure and build DRUM.


Jim Teresco - Tue Jun 7 11:41:05 EDT 2011