
Hiro Kishimoto wrote:
From June 8 call
- EMS team (Steven, Andreas, Donal) to review data staging scenario (section 5 of https://forge.gridforum.org/sf/go/doc13591?nav=1). Bring to people's attention again at the next EMS call. (postpone until June 29)
I've finally had some time to review that document. The principal things about the detailed scenario from an EMS perspective are:

- Service instances are chosen (including any deployment) before the commencement of the scenario.
- A question they don't address is whether moving the computation to the data would be better.
- There is probably an application-specific service to build the input data sets for the parameter space exploration.
- There is a need for some kind of parameter space exploration support at the job description level.
- The phrase "results database" could be replaced with "renamed output files" and the scenario would still work.
- There is a post-processing phase to be applied to the output data which the scenario does not properly highlight.
- A more sensible structure might involve some kind of Orchestration service (e.g. based on workflows) to control the parameterized input data creation and the maintenance of a high throughput resource pool to perform the actual parameterized simulations.
- There are probably some interesting EMS topics relating to the maintenance of such a processing pool.

In short, it's a very crude sketch. I think it would be far more useful to merge this with some of the EMS scenarios to form a combined OGSA Parameter Space Exploration use-case. But that's a lot of work to do, since it looks like the compute and the data people are failing to see eye-to-eye (again). Since I go on vacation tomorrow, I'll have to leave that to others. :-)

As a side note (having skim-read through the rest of the data scenarios doc), I believe that some of the things I've been thinking about for the EPS spec are likely to be useful in the selection of Data-related resources. All that really changes (at the interface level) are the input request language and the candidate description language. For compute tasks JSDL is a suitable base, but I don't know if there is an equivalent common basis possible for data tasks.
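To make the interface-level point concrete, here is a minimal sketch in Python. It is purely illustrative: `CandidateSelector`, its `matches`/`score` hooks and the dictionary fields are hypothetical names invented for this example, not anything taken from the EPS spec, JSDL or the data scenarios doc. The idea is only that the selection logic stays fixed while the request and candidate vocabularies are swapped.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

Request = TypeVar("Request")      # stands in for the input request language
Candidate = TypeVar("Candidate")  # stands in for the candidate description language


@dataclass
class CandidateSelector(Generic[Request, Candidate]):
    """Hypothetical EPS-like selector: fixed logic, pluggable languages."""
    matches: Callable[[Request, Candidate], bool]  # can this candidate serve the request?
    score: Callable[[Request, Candidate], float]   # higher means more preferred

    def select(self, request: Request, pool: List[Candidate]) -> List[Candidate]:
        # Filter out infeasible candidates, then order the rest best-first.
        feasible = [c for c in pool if self.matches(request, c)]
        return sorted(feasible, key=lambda c: self.score(request, c), reverse=True)


# Compute flavour: plain dicts stand in for a JSDL-like request and host descriptions.
compute = CandidateSelector(
    matches=lambda req, host: host["cpus"] >= req["cpus"],
    score=lambda req, host: -host["load"],  # prefer lightly loaded hosts
)

# Data flavour: same selector, different (assumed) request and candidate fields.
data = CandidateSelector(
    matches=lambda req, store: store["free_gb"] >= req["size_gb"],
    score=lambda req, store: store["bandwidth_mbps"],  # prefer faster stores
)

compute_plan = compute.select({"cpus": 4}, [
    {"name": "hostA", "cpus": 2, "load": 0.1},
    {"name": "hostB", "cpus": 8, "load": 0.7},
    {"name": "hostC", "cpus": 4, "load": 0.2},
])

data_plan = data.select({"size_gb": 50}, [
    {"name": "s1", "free_gb": 100, "bandwidth_mbps": 100},
    {"name": "s2", "free_gb": 20, "bandwidth_mbps": 1000},
    {"name": "s3", "free_gb": 500, "bandwidth_mbps": 400},
])
```

In this sketch nothing in `select` knows whether it is ranking hosts or data stores; what is still missing on the data side is an agreed request language to play the role that JSDL plays for compute.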
I suppose that's really an issue to feed back to the data people (but their section 7 reminds me a lot of the things I've been working on). Another question is whether the EPS is in some sense an aspect of a Registry.

Looks like we'll need to iterate on this, as I suspect there are a number of reusable concepts that are being independently reinvented. Hope you get the chance to pursue this at the F2F.

Donal.