by
the Condor Project, Department of
Computer Sciences,
The task of analyzing huge amount of data residing at different
locations can be greatly eased by the Grid. First, the analyzing tools can be
executed on resources close to the data, without incurring the cost of moving the
data around; second, the computing and I/O resources provided by the Grid
can be utilized to perform the analysis task in parallel, which leads much
shorter turn-around time; third, the various services provided by the Grid
infrastructure can simply to the job of managing distributed resources. In
order to fully exploit such opportunities, we are implementing a framework to
build interactive parallel data analysis tools on the Grid - IMW.
Data analysis tasks lend themselves naturally to master-worker style parallel processing, with its special requirements on user-interaction, visualization and data locality support. Previously, we've developed a C++ framework called MW [1] to support master-worker style parallel applications on Grid resources managed by Globus and Condor. It has been used by several groups and helped solve some very demanding computational problems [2]. MW has a layered architecture to encapsulate the functionalities of the underlying resource manager and communication mechanisms, it also provides portability across different underlying infrastructures. It exposes a set of simple APIs to ease the application programmers' work.
IMW is based on our experiences of developing MW, it has been rewritten (mostly in Java) to
The current prototype consists of three different components:
Different components communicate with each other using
Socket. Condor, as the resource manager, is used to gather worker hosts, spawn
worker processes, and report dynamic worker changes such as worker exit,
suspension/resume, etc. Fault-tolerance can be achieved by user-provided checkpoint
functions, to save partially completed result and recover after restart.
The bare-bone IMW prototype is working now, which can execute shell scripts in parallel, and handles the worker dynamic changes. More work needs to be done to add features and make it easy-to-use.
After providing the basic functionality of parallel data analysis and user interaction, we can also exploit two other tools developed by our group at UW.
[1] Jeff Linderoth, Sanjeev Kulkarni, Jean-Pierre Goux, and Michael Yoder, "An Enabling Framework for
Master-Worker Applications on the Computational Grid", Proceedings of the Ninth IEEE
Symposium on High Performance Distributed Computing (HPDC9), Pittsburgh,
Pennsylvania,
August 2000, pp 43-50.
[2] S. J. Wright, "Solving optimization problems on computational grids," November, 2000. Optima 65, May, 2001.