Andre Merzky wrote:
Hi Sylvain,
Hi Andre,
Quoting [Sylvain Reynaud] (Sep 24 2009):
Hi,
Last attempt to propose a better name for flag 'DoNotFailIfDoesNotExist' (see last item: "avoid check for existence on open/creation of ns entries")... What do you think about "MissingOK"?
It gives a good idea of what it is supposed to do. It is short. It is already used, at least in the Linux world: logrotate uses it to continue with no error message when the log file does not exist, and rpm spec files use it to continue with no error message when a package is not installed.
I pondered about your problem again, and have a couple of questions (again). Sorry if I reopen the can of worms, after we converged pretty much already...
So, in fact what you want to achieve is a delayed initialization, because in your use case the additional round trip time for making sure the file exists is expensive.
I understand that - if you just read small amounts of data from many files, just opening the files can double the overall latency, for example.
Yes. That's the point.
The first question though is, why don't you use the asynchronous file constructor for performing the open operation? After all, the async operations have been introduced in particular to hide latencies.
Assuming that async ops do not help, for some reason:
Just allowing the error to be delayed on synchronous construction may set a bad precedent, really, as one could argue that we would need that on all operations. Like, one could create a job::service instance for some endpoint URL, but report a DoesNotExist error only when later trying to submit a job.

Yes, this use-case is equivalent to mine, although the impact on the latency may not be as significant.
Or one could write some data to a file, and report an error later on when trying to read that data back again,
This one is a different use-case because SAGA implementation does not have to query the remote service to know if data is being written. Hence, user can implement this behavior by catching the exception raised.
etc.
Is your use case different from those cases where one could delay error reporting, too? If so, how?
Or positively speaking:
If there is a reason why async operations won't work,

It would work, but it still means useless messages sent to the server...

and one still needs to have a flag to cover the use case, then in fact a 'DelayErrors' flag may be more appropriate, as it would allow us to use it in other situations, and not only in the specific one your use case met (file does not exist).
I agree. 'DelayErrors' is a better name than 'MissingOK', even for my use case, because all these errors may be raised on later invocations of method objects.
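To make the proposed semantics concrete, here is a minimal sketch in plain C++, not SAGA code. The `EntryFlags` enum, the `Entry` class and the `exists` callback are all hypothetical; the callback stands in for the expensive remote round trip. Without the flag the constructor pays for the check and fails early; with `DelayErrors` the constructor sends nothing and a missing entry only surfaces on the first real operation.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical flag set: 'DelayErrors' is only the name proposed in this
// thread, it is not part of any released SAGA specification.
enum EntryFlags { NoFlags = 0, DelayErrors = 1 };

// Toy entry handle. 'exists' stands in for a remote round trip.
class Entry {
public:
    Entry(std::string name, int flags,
          std::function<bool(const std::string&)> exists)
        : name_(std::move(name)), exists_(std::move(exists)) {
        // Without DelayErrors the constructor checks (and pays) right away.
        if (!(flags & DelayErrors) && !exists_(name_))
            throw std::runtime_error("DoesNotExist: " + name_);
    }

    // With DelayErrors, a missing entry is first noticed here, as part of
    // the round trip the operation needs anyway.
    std::size_t get_size() const {
        if (!exists_(name_))
            throw std::logic_error("IncorrectState: " + name_);
        return 0;  // placeholder value, the sketch has no real data
    }

private:
    std::string name_;
    std::function<bool(const std::string&)> exists_;
};
```

Counting the calls made to `exists` shows the saving: construction with `DelayErrors` makes none, so reading one small value from the entry costs one round trip instead of two.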
But then again, introducing a general flag for delaying errors is quite a significant semantic change, really. FWIW, Hartmut (and someone else, can't remember) brought that topic up a while ago, when wondering if SAGA calls should not be getting an optional additional parameter to be returned on any errors, like
  // standard SAGA call
  size_t s = file.get_size ();   // can throw
  std::cout << "size: " << s << "\n";

  // error ignoring SAGA call
  size_t s = file.get_size (0);  // never throws
  if ( s == 0 )
    std::cout << "size: Unknown\n";
The proposed signature change would basically allow for SAGA calls which never throw, no matter the error condition.
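The two call styles can be modelled in a few lines of standard C++. This is only a sketch under assumptions: the `File` struct and its `present` member are hypothetical stand-ins, and the one-argument overload is the signature floated in this thread, not an existing SAGA call.

```cpp
#include <cstddef>
#include <stdexcept>

// Toy file whose size query fails when the entry is missing.
struct File {
    bool        present;
    std::size_t size;

    // Standard form: reports failure by throwing.
    std::size_t get_size() const {
        if (!present) throw std::runtime_error("DoesNotExist");
        return size;
    }

    // Error-ignoring form: returns 'fallback' instead of throwing.
    std::size_t get_size(std::size_t fallback) const noexcept {
        try {
            return get_size();
        } catch (...) {
            return fallback;
        }
    }
};
```

One consequence worth noting: with a fallback of 0 the caller cannot tell an empty file from a failed call, so the fallback has to be a value the operation can never legitimately return.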
I don't see any reason for doing this, since the user can always catch the exception if he/she wants to ignore the error. What I would like to do is not to ignore errors that have already been detected, but to disable preliminary checks when needed.
Effectively, the 'DelayErrors' flag discussed above does the same for constructors.
I am not saying that we should consider those signatures, at least not for the current SAGA version (it is far too late in the specification roadmap to do so), but just wanted to mention it, as it seems to touch upon the same problem space.
Bottom line: An IgnoreDoesNotExistException (which your 'MissingOK' basically translates to) sounds like a very specific flag, for a very specific use case. Do asynchronous operations help?

Better than synchronous, but still not optimal...
If not, should we consider a DelayErrors flag, possibly for the next SAGA version?
'DelayErrors' sounds good to me.
Best, Andre.
Best regards, Sylvain
Andre Merzky wrote:
Dear Sylvain,
I dropped the ball on this thread I think. Also, I think we came to a conclusion about a number of issues already. So, let me try to summarize where we stand. I'd like to use this as a last call for the list for the closed items, and as a call for feedback for the items still open.

Closed items
------------------------------------------------------------

- add a LastModified timestamp to namespace entries (in addition to Created timestamp)
-> added as get_mtime() to namespace::entry
- IncorrectType for
    task   t = f.get_size <Sync> ();
    size_t s = t.get_result <char> ();
-> added to the spec
- context.toString():
-> this is a language binding issue - no change in spec
- NSEntry.remove() should allow for rmdir
-> corrected in spec (Recursive flag only required for non-empty dirs)
- CPUArchitecture and OperatingSystemType should be scalar
-> needs to be fixed in spec
- jobs are missing a state "QUEUED"
-> this is a state_detail of the Running state - no change in spec.
- removing the Queue attrib
-> resolution unclear, possibly postponed to next JSDL version, or to a SAGA resource package, whichever comes first
- avoid check for existence on open/creation of ns entries
-> two possible solutions
(a) overload Exclusive

    // entry exists
    open (name, Create | Exclusive) : fail
    open (name, Create            ) : success
    open (name, Exclusive         ) : success
    open (name                    ) : success

    // entry does not exist
    open (name, Create | Exclusive) : success (creates)
    open (name, Create            ) : success
    open (name, Exclusive         ) : no check (later IncorrectState)
    open (name                    ) : fail
(b) add new flag 'DoNotFailIfDoesNotExist' (better name needed)

    // entry exists
    open (name, Create | Exclusive) : fail
    open (name, Create            ) : success
    open (name, DNFIDNE           ) : success
    open (name                    ) : success

    // entry does not exist
    open (name, Create | Exclusive) : success (creates)
    open (name, Create            ) : success
    open (name, DNFIDNE           ) : no check (later IncorrectState)
    open (name                    ) : fail
I vote for (a), because I think it's simpler and because I can't think of a good name for the new flag. Sylvain votes for (b) IIRC, but does not have a good name either ;-)
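For comparison, both tables can be written as one small decision function. This is an illustrative sketch only: the flag values, the `Outcome` enum and `open_entry` are made up for the example, and `DNFIDNE` abbreviates the proposed 'DoNotFailIfDoesNotExist' flag. The only difference between (a) and (b) is which flag bit means "skip the existence check", so that bit is passed as a parameter.

```cpp
// Illustrative flag values, not taken from the SAGA specification.
enum OpenFlags { None = 0, Create = 1, Exclusive = 2, DNFIDNE = 4 };

// NoCheck: open itself succeeds without looking at the backend, and a
// later call on the entry may raise IncorrectState.
enum class Outcome { Success, Fail, NoCheck };

// no_check_flag is Exclusive for option (a) and DNFIDNE for option (b).
Outcome open_entry(int flags, bool exists, int no_check_flag) {
    const bool create = (flags & Create) != 0;
    const bool excl   = (flags & Exclusive) != 0;

    if (create && excl)        return exists ? Outcome::Fail : Outcome::Success;
    if (create)                return Outcome::Success;
    if (flags & no_check_flag) return exists ? Outcome::Success : Outcome::NoCheck;
    return exists ? Outcome::Success : Outcome::Fail;
}
```

With this model, `open_entry(Exclusive, false, Exclusive)` and `open_entry(DNFIDNE, false, DNFIDNE)` both yield `Outcome::NoCheck`, while under option (b) a lone `Exclusive` keeps its current behaviour of being ignored without `Create`.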
Group should consider this to be a last call!
So, I hope I covered all items - let me know if not!
Best, Andre.
Quoting [Sylvain Reynaud] (Jun 06 2009):
Date: Sat, 06 Jun 2009 20:29:10 +0200
From: Sylvain Reynaud
To: Andre Merzky
CC: Thilo Kielmann, saga-rg@ogf.org
Subject: Re: [SAGA-RG] missing(?) method reporting last modification time

Andre Merzky wrote:
Quoting [Sylvain Reynaud] (Jun 05 2009):
Andre Merzky wrote:
> Hi again,

Hi again,

> Quoting [Sylvain Reynaud] (Jun 05 2009):
>
>>>> - Queue: this attribute makes the job description dependent on the
>>>>   targeted execution site, this information should be put in the
>>>>   URL instead.
>>>
>>> Interesting point.  The problem I see is that it's hard to
>>> define a standard way on *how* to encode it in the URL, as
>>> each URL component (host, path, query, ...) may already be
>>> interpreted by the backend.
>>>
>>> For example, a globus job manager URL may well look like
>>>
>>>   https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...
>>>
>>> Where would you put the queue?
>>
>> In JSAGA, such a URL is used internally, the user gives this URL:
>>   wsgram://some.remote.host:9443/Fork
>
> sure, that will mostly work.  The point is however, that we
> can't ensure that it won't break for other backends which
> require a path specification on the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.
IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.
Hi Sylvain,
Hi,
yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?
I think CPUArchitecture and other resource specification attributes are part of the job description, since they describe the job requirements. But IMHO, attribute queue is not part of resource specification, it is part of *resource location* (like URL).
Although queues are often configured with names "short" or "long", they can be used for very different purposes (e.g. queues by VO, by SLA, by feature...), they can have different names even when used for the same purpose, and when discovering job services, the queue is always in the response rather than in the query.
>> If encoding the queue in the URL is not an acceptable
>> solution, then I think the queue should be moved from
>> attributes of job description to arguments of method
>> job_service.create_job.
>
> That's also an option.  What would be the difference however
> to keeping it in the job description?  The info arrives at
> the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".
Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations.
IMHO the limitations are not similar:
* If a job requires a specific OperatingSystem to run, then we can assume this requirement is the same for grid A and grid B.
* If the user wants to submit his job on a specific queue on grid A, he can not expect to have the same queue on grid B.
Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% about the proposed solution,
I have no preference on the proposed solution (URL, create_job argument, or other solution...), I just think queue must be removed from job description.
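As one illustration of the "create_job argument" variant, here is a minimal sketch in plain C++. Everything in it is hypothetical (the `JobDescription` and `JobService` types, the two-argument `create_job`, and the endpoint and queue names); it is not the SAGA job API. It only shows the property argued for above: once the queue is no longer a member of the description, the same description object can be reused unchanged across sites.

```cpp
#include <string>
#include <utility>

// Site-independent requirements only: there is deliberately no queue member.
struct JobDescription {
    std::string executable;
    std::string operating_system;
};

// Hypothetical job service bound to one endpoint.
class JobService {
public:
    explicit JobService(std::string url) : url_(std::move(url)) {}

    // The queue is supplied per submission, next to the description.
    // The returned string just records what would be submitted where.
    std::string create_job(const JobDescription& jd,
                           const std::string& queue) const {
        return url_ + " [" + queue + "] " + jd.executable;
    }

private:
    std::string url_;
};
```

A caller would build one `JobDescription` and pass it to services for two different grids, each time with that site's own queue name.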
but that may be just me, being hesitant to change (I'm known for that I'm afraid)...
I think you are right to be hesitant; specifications must not change too much!
> I understand that having only JSDL approved keys in the job
> description is a clean solution - but that is mostly for the
> benefit of the SAGA implementors.  For the SAGA users, that
> makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".
But I agree, if their usefulness is confirmed, they must be kept.
I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

>>>> SAGA Name Spaces:
>>>> ================
>>>> * add a flag to disable checking existence of entry in constructor
>>>>   and open methods, because the cost for this check is not negligible
>>>>   with some protocols (then subsequent method calls on this object
>>>>   may throw an IncorrectState exception
>>>>   if the entry does not exist).
>>>
>>> Makes sense.  We could also overload 'Exclusive', which, at
>>> the moment, is only evaluated if 'Create' is specified.  It
>>> has the same semantic meaning so (inversed): if 'Exclusive'
>>> is not specified on 'Create', an existing file is ignored.
>>>
>>> Would it make sense to allow Exclusive to be evaluated on
>>> all c'tors and open calls?
>
> Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases: exclusive creation and no file existence check.
Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal?
No proposal yet... I am thinking about it!
I throw in 'FailIfExists' ...
FailIfExists matches the first use-case (exclusive creation), the second use-case needs DoNotFailIfDoesNotExist! ;-)
Best regards, Sylvain
> I am still not sure about introducing an additional
> exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40):
<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>
I think that's an excellent proposal.
Best regards,
Andre.