
Sorry, late reply... Quoting [Christopher Smith] (Aug 12 2005):
As for supporting sandboxes, what does that actually mean? In a chroot jail? With a restricted user id (whatever that means)? Why should I care? What's the use case?
I guess you are right: a sandbox is by definition transparent to the end user, isn't it? So while it might be useful to know where your job runs (see above), it may not make sense to enforce sandboxing (either it's used or it isn't. What can SAGA do about this? Nothing).
Right. In a sandboxed environment, you basically need to a) stage files in and out in-line with the job using relative paths, or b) use some kind of "third party" storage service that you can then retrieve files from (basically you use fully qualified paths and service endpoints).
I'm not sure there is much in between.
Right, I agree. So we should leave it as is.
Does a job have a unique job ID I can use to identify it? (That question is related to the session persistency discussed in another thread I think).
There is a getJobId method on the Job interface for this purpose. It's up to the backend to provide the ID, so uniqueness is not something SAGA can guarantee.
I semi-agree. For finding your job again, you need more than the backend job ID: you also need the contact point for the backend. Your SAGA implementation might know about that, so it may be able to create a 'better' job ID.
In GAT, we did that, and had the distinction between a Native-JobID (the backend's) and a GAT-JobID (globally unique). That might be overkill to mandate for SAGA at this point, unless we have a clear use case that asks for it, I guess.
So, bottom line, I guess you are right, backend-id should be sufficient unless we run into problems with that.
I think that the idea of a SAGA-JobID that is some kind of composite of the backend ID and some "SAGA decoration" is a good idea ... especially if a SAGA session is used to access multiple back ends. Generating global IDs within one implementation is easy enough, but do we want to take a stab at defining a format that all implementations should support? How hard do you think it would be?
I think it's difficult enough to leave it out of the spec. We have kept away from backend-dependent definitions until now, and I think that's good. However, we could make a decent proposal, which should hold for the most common use cases, and which we should try to get included in the reference implementation. That would help a lot, I think.
The idea is that two SAGA implementations (running concurrently) would have globally unique job id spaces.
I can think of two ways to create such IDs.
A: create a unique string (MD5 or so) and make an external entity responsible for maintaining the mapping between that ID and the backend instance and native job ID. One could also allow a non-random string (e.g. a user-specified name), but the naming collision problem is then moved into user space.
B: combine the backend URL and the native job ID in a well-defined (i.e. parsable) way. Possibly allow adding another, user-specific part:
  <free string>-<backend url>-<nativeID>
  <MyJob>-<gram://www.test.net:1234/>-<SAD12412SDF>
B seems simpler, and does not introduce an external dependency. The free string would allow the user to recognize the jobs, which is nice for browsing. It could default to the executable name or so.
I am pretty sure it breaks for some cases. E.g. the backend may have a moving URL, or may reuse IDs (as Unix does with PIDs). However, as long as it is not mandatory, it might just help...
my $0,02
Andre.
--
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky@cs.vu.nl       |
| De Boelelaan 1083a                | www: http://www.merzky.net  |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+
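For illustration only, here is a minimal Python sketch of how option B from the message above might be built and parsed. Nothing here is part of SAGA or GAT: the names CompositeJobId, format_job_id and parse_job_id are hypothetical, and the sketch assumes the angle brackets in the example are literal delimiters and that no field contains '<' or '>'.

```python
import re
from typing import NamedTuple


class CompositeJobId(NamedTuple):
    """Hypothetical container for the three parts of an option-B job ID."""
    label: str        # free string chosen by the user (may be empty)
    backend_url: str  # contact point of the backend
    native_id: str    # job ID as issued by the backend


# Assumption: each field is wrapped in literal angle brackets and fields
# are joined with '-', as in <MyJob>-<gram://www.test.net:1234/>-<SAD12412SDF>.
_PATTERN = re.compile(r"<([^<>]*)>-<([^<>]+)>-<([^<>]+)>")


def format_job_id(label: str, backend_url: str, native_id: str) -> str:
    """Combine the parts into one parsable string."""
    for part in (label, backend_url, native_id):
        if "<" in part or ">" in part:
            raise ValueError(f"angle brackets are not allowed in {part!r}")
    if not backend_url or not native_id:
        raise ValueError("backend URL and native ID must be non-empty")
    return f"<{label}>-<{backend_url}>-<{native_id}>"


def parse_job_id(text: str) -> CompositeJobId:
    """Split a composite ID back into its parts, or raise ValueError."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a composite job ID: {text!r}")
    return CompositeJobId(*match.groups())


if __name__ == "__main__":
    job_id = format_job_id("MyJob", "gram://www.test.net:1234/", "SAD12412SDF")
    print(job_id)
    print(parse_job_id(job_id))
```

Wrapping each field in brackets keeps the string parsable even though URLs and free strings may contain '-' themselves. The sketch does nothing about the caveats already mentioned (a moving backend URL, or a backend that reuses native IDs).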