Notes from JDL discussion, 16 Dec 2003

Editorial note: A lot of the discussion was too detailed to easily include in these notes. Overall, the discussion was very fruitful and there was a clear desire by all to work together.

Job description languages

"Language" is loaded terminology and should be used carefully. The WSDL is not the language, but takes a description written in the language as a parameter, i.e. "submit" submits a job written in JDL. The WSDL methods parse the jobs in the language.

Should the JDL contain enough information to do fine-grained interactive analysis, such as using a slider to change processing parameters and having the job change and run immediately? Gabrielle thinks that the language should be flexible or extensible enough for this.

Can different experiments share implementations of a JDL? Possibly; each experiment would have to examine its jobs in JDL to find common areas of functionality. A service interface should be generic, but the JDL may have experiment-specific extensions that aren't shared, such as the dataset definition and specific analysis tasks.

There was some discussion about the need for generic APIs for JDL management (submission, splitting, and such). Some of this has already been discussed in CS11 and at the Caltech workshop. Joe reviewed what has already been discussed: the CS11 API definitions and the Caltech workshop.

It was decided to start investigating the interface for this JDL service, starting from David Adams' analysis service (http://www.ppdg.net/archives/talks/2003/ppt00078.ppt):

    boolean has_application(Application)
    ???     install_application(Application)
    boolean install_task(Application, Task)
    JobID   submit(Application, Task, Dataset, Config)
    Job     job(JobID)
    boolean kill(JobID)

Data object types:

    Application - an installed software package.
    Task - add-on scripts and configuration to an Application.
    Dataset - a description of the dataset to use.
      This can be either physical or logical (such as a query input to the Dataset Catalog service).
    Configuration -
    Result -
    Job -

This interface only allows coarse job control. It does not allow more fine-grained job control, such as pausing a job or changing processing parameters while a job is running (without killing and restarting the job). A previously discussed Capabilities API would be used to determine if these behaviours exist. The Capabilities API could be part of the JDL service or attached to the Job/Task.

Provenance can be a problem because it includes not only the parameters for processing, but also configuration files and scripts that aren't carried along in the JDL. Full provenance must include these files and scripts, which might change over time (they are in a user's home directory); thus, the JDL is not able to completely store the provenance. This information is supposed to be part of the task, but it could grow quite large, especially if the tasks are repeated in the JDL. Perhaps a "delta" task containing a "diff" between tasks can help with this.

Tasks should be generic enough that a scheduler or other service can deal with generic tasks, but using typed tasks has value.

How do we proceed from here to arrive at a language and service specification? David Alexander will pull the schema for the JDL out from the job description language specification.
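
Illustration only (not from the meeting): a minimal in-memory Python sketch of the six service methods listed earlier in these notes, to make the intended call sequence concrete. Everything beyond the method names and argument types is an assumption. The fields of Application, Task, Dataset and Config are hypothetical, JobID is modelled as a plain integer, the "???" return type of install_application is assumed to be boolean, and the rule that a task can only be installed for an already installed application is a guess. Nothing is actually executed; the class only tracks state.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Application:
    """An installed software package (fields are hypothetical)."""
    name: str
    version: str


@dataclass(frozen=True)
class Task:
    """Add-on scripts and configuration for an Application."""
    name: str
    script: str


@dataclass(frozen=True)
class Dataset:
    """Physical or logical (e.g. a catalog query) dataset description."""
    description: str
    logical: bool = False


@dataclass(frozen=True)
class Config:
    """Left undefined in the notes; plain key/value pairs here."""
    parameters: Tuple[Tuple[str, str], ...] = ()


class JobState(Enum):
    SUBMITTED = "submitted"
    KILLED = "killed"


@dataclass
class Job:
    job_id: int
    application: Application
    task: Task
    dataset: Dataset
    config: Config
    state: JobState = JobState.SUBMITTED


class AnalysisService:
    """In-memory stand-in for the JDL service; no job is really run."""

    def __init__(self) -> None:
        self._applications: Set[Application] = set()
        self._tasks: Set[Tuple[Application, Task]] = set()
        self._jobs: Dict[int, Job] = {}
        self._ids = count(1)

    def has_application(self, app: Application) -> bool:
        return app in self._applications

    def install_application(self, app: Application) -> bool:
        # Return type is "???" in the notes; bool is an assumption.
        self._applications.add(app)
        return True

    def install_task(self, app: Application, task: Task) -> bool:
        # Assumed rule: the application must already be installed.
        if app not in self._applications:
            return False
        self._tasks.add((app, task))
        return True

    def submit(self, app: Application, task: Task,
               dataset: Dataset, config: Config) -> int:
        if (app, task) not in self._tasks:
            raise ValueError("task is not installed for this application")
        job_id = next(self._ids)
        self._jobs[job_id] = Job(job_id, app, task, dataset, config)
        return job_id

    def job(self, job_id: int) -> Job:
        return self._jobs[job_id]

    def kill(self, job_id: int) -> bool:
        # Coarse control only: a job can be killed, not paused or retuned.
        job = self._jobs.get(job_id)
        if job is None or job.state is JobState.KILLED:
            return False
        job.state = JobState.KILLED
        return True
```

A client would call install_application, then install_task, then submit, and use the returned job ID with job and kill. Pausing a job or changing its parameters while it runs has no counterpart here, which is the coarse-control limitation noted above.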