IMW: Interactive Master-Worker Style Parallel Data Analysis Tool on the Grid

by the Condor Project,  Department of Computer Sciences, University of Wisconsin - Madison
04/16/2002

Motivation

The task of analyzing huge amount of data residing at different locations can be greatly eased by the Grid. First, the analyzing tools can be executed on resources close to the data, without incurring the cost of moving the data around; second, the computing and I/O resources provided by the Grid can be utilized to perform the analysis task in parallel, which leads much shorter turn-around time; third, the various services provided by the Grid infrastructure can simply to the job of managing distributed resources. In order to fully exploit such opportunities, we are implementing a framework to build interactive parallel data analysis tools on the Grid - IMW.
 

Our Approach

Data analysis tasks lend themselves naturally to master-worker style parallel processing, with its special requirements on user-interaction, visualization and data locality support. Previously, we've developed a C++  framework called MW [1] to support master-worker style parallel applications on Grid resources managed by Globus and Condor. It has been used by several groups and helped solve some very demanding computational problems [2]. MW has a layered architecture to encapsulate the functionalities of the underlying resource manager and communication mechanisms, it also provides portability across different underlying infrastructures. It exposes a set of simple APIs to ease the application programmers' work.

IMW is based on our experiences of developing MW, it has been rewritten (mostly in Java) to

  1. add support of user interaction and computation steering;
  2. directly support data analyzing tasks, such as, histogram and parallel execution of scripts;
  3. provide a lightweight Java implementation to ease porting and maintenance.

The current prototype consists of three different components:

  1. a client that interacts with the user and controls the analysis task;
  2. a master that manages the analysis task by decomposing the analysis task into sub-tasks and assigns them to the workers in parallel;
  3. workers that receive the sub-tasks  from the master and returns the result.

Different components communicate with each other using Socket. Condor, as the resource manager, is used to gather worker hosts, spawn worker processes, and report dynamic worker changes such as worker exit, suspension/resume, etc. Fault-tolerance can be achieved by user-provided checkpoint functions, to save partially completed result and recover after restart.
 

Status and Future Work

The bare-bone IMW prototype is working now, which can execute shell scripts in parallel, and handles the worker dynamic changes. More work needs to be done to add features and make it easy-to-use.

After providing the basic functionality of parallel data analysis and user interaction, we can also exploit two other tools developed by our group at UW.

  1. DEViseDEVise is an environment for data exploration and visualization written in, with flexible data selection and layout mechanisms. By using DEVise as an visualization frontend, IMW's GUI development can be greatly simplified, while it still provides customizable histogram support.
  2. ClassAdsClassads are used by Condor to describe resources and task requirements. It would be a powerful tool for IMW to describe the data category and analysis requirement, so that analyzing processes can be directed and run on machines that exploit data locality and equipped with sufficient resources.

References

[1] Jeff Linderoth, Sanjeev Kulkarni, Jean-Pierre Goux, and Michael Yoder, "An Enabling Framework for Master-Worker Applications on the Computational Grid", Proceedings of the Ninth IEEE Symposium on High Performance Distributed Computing (HPDC9), Pittsburgh, Pennsylvania,
    August 2000, pp 43-50.

[2] S. J. Wright, "Solving optimization problems on computational grids," November, 2000. Optima 65, May, 2001.