
Hi Hartmut, all, I went through your list, and the comments from OGF (see below). Attached is the next draft iteration. The major changes are: - time to live must be enforced by the backend (expired entries are treated as non-existing) - serialization scheme is not part of the document anymore, but is moved out into a separate document - a 'delete_object()| method has been added A number of clarifications and minor changes have been made, too. Although there are some minor things still missing (more examples, some missing bib entries), I would like to put the document in final call on the list now: the remaining work should not change the API in any form. So, please all provide feedback within the next 2 weeks. Best, Andre. addressed comments from OGF28 ------------------------------ - ttl mandatory, need to be respected - done - backend should complain on invalid ttl settings - done, value is now also unsigned int - add the clone() reference/statement as discussed on list - done - rediscuss exclusion/limitation of session/context serialization - done - make refs to GFD.90 clearer - done - add interop section which explains what level of interop we expect (cross application/adaptor/backend/language/...) - done addressed comments from Hartmut: -------------------------------- Quoting [Hartmut Kaiser] (Mar 09 2010):
Hmm, not sure what went wrong - opens fine for me with Acrobat and Preview, on MacOS. I simply post it again - maybe the attachement process mangled the doc, or whatever. Please let me know if the problem persists, and I look into it again.
Here are my comments:
Page 1 - Copyright (update for 2010)
done
Page 4: - saga::advert_directory --> saga::advert::directory - saga::logical_directory --> saga::replica::logical_directory
done
Page 6: - undefined reference [?] - the the --> the - ...second advert directory /this/? acts as a...
done
Page 7: - how to ensure atomicity of setting a workitem to be 'accepted'?
only one worker is operating on a item - they are assigned by the server.
- ...directories adverts... --> directory adverts
done
Page 8: - _SAGA_LOCK is not sufficient if set to 1, it needs to contain the information about what object holds the lock (id of corresponding object?) otherwise it's useless for the implementation of atomic access (or, how should atomic locking be implemented)
| Good point - storing the objects uuid seems sensible and portable. now left out of the spec though.
- ...define their on... --> define their own
Page 9: - I'm feeling a bit uneasy with the requirement to throw IncorrectState for expired entries. If this is generally enforced I see no way to delete those.
No way to delete on user level - has to be done by backend.
Furthermore, couldn't it be necessary to access those entries even after they expired (atleast as long as they still exist)? Debugging, archiving, etc. come to mind here.
| Good point. I guess the IncorrectState exception would only be | sensible for backends which guarantee that the adverts get purged | automatically. In all other cases, I agree a backend should not | deny access to an advert which is expired, in order to allow manual | garbage collection. | | Does that make sense to you? That should not show up on API level.
Page 10: - advert_directory is missing a constructor allowing to reuse default session, it is not possible to inherit constructors (well, at least not in C++, but I doubt it works in other languages)
| The SAGA core spec describes that the session parameter on all | c'tors is optional, even if that is not explicitely rendered in the | IDL. How that is implemented (overloading, inheritance, ...) is up | to the language bindings. So, this is no different than for the | packages in the core spec I think.
Page 11: - advert is missing a constructor allowing to reuse default session
| same here.
- ...uses the same /ag/? semantics...
| Uhm, not sure what you mean? done
- How can Truncate reset TTL counter if it is supposed to empty the attributes? Do you mean _SAGA_MODIFIED should be set to the current time and _SAGA_TTL shouldn't be touched?
| Very good point! I would expect that no advert ever looses its | _SAGA_* attributes. So, the advert c'tor would automatically create | those state attributes, and truncate would either re-use them, or | delete the old and re-create new ones. | | Does that make sense? I agree this needs clarification. _SAGA_* attribs are out of scope now...
Page 12: - does Truncate on a directory purge everything below in the hierachy?
| Uhm, good question. I would not think though, as truncation is, | IMHO, focusing on the adverts associated with the entity. Also, we | allow trauncate on the namespace::directory (or at least do not | forbid it), but do not define any action for that, and in particular | not a recursive remove. clarified
- ...it has two additional method... --> it has two additional methods - directorie's --> directory's - directories --> directory's - PostCond: ...of the context use... --> of the context used
done
Page 13: - 'if the Tuncate flag is given, the returned object MUST NOT have an associated object': what 'associated object' is meant?
| The SAGA object which has been added with store_object(). I'll | clarify.
- The way the spec describes the TTL timer implies that each entry must have a 'hidden' entry representing either the remaining TTL or the starting point in time when the TTL has been reset. Wouldn't it be benefitial to make that explicit, i.e. add another attribute defining the point in time when the entry expires?
| We have a last_modification attribute (see page 8) - updating that | should allow to leave the TTL timer constant, and still extend the | expiry date I think. So, expiry_date = last_modified + ttl. | | Makes sense? clarified.
Page 14: - what's the preferred method of getting a list of all expired items (I'm asking as find() shouldn't return those)?
| Hmm, that sounds stupid on second though - again, that would only | make sense for backends which guarantee garbage collection. All | other backends should return the expired entries just as usual I | guess... backend problem now.
- ...manage manage... --> manage - ...set of user define attributes... --> set of user defined attributes
done
Page 17: - ...semantics as ... --> semantics as
?
Page 18: - how can I remove an associated object from an entry (de-associate the object)?
| Duh. We can't right now - only change it. So, one would have to | truncate the advert, or to re-create it from scratch. Both would | destroy the attributes. | | How does 'delete_object()' sound to you? ;-) added
Pages 19-22: - note: I did not compile/run the code, but looks ok
| :-) This reminds me: I'll probably add some simplier examples to | the document. Thilo pointed out that this one might be a wee bit | long... to be done...
Page 25: - I'd suggest to change the way names for those multiple keys are formed: _SAGA_CONTEXT_<N>_URL --> _SAGA_CONTEXT_URL_<N>, as this makes it easier to generate those, allows simpler search patterns, and even allows to make _SAGA_ANY_ATTRIBUTE and _SAGA_ANY_ATTRIBUTE_0 equivalent by definition (which unifies the attribute naming scheme)
| I agree about the generation, and simpler patterns - but what do you | mean with 'make _SAGA_ANY_ATTRIBUTE and _SAGA_ANY_ATTRIBUTE_0 | equivalent'? I think you mean that _SAGA_ANY_ATTRIBUTE_0 should not | be used, but rather _SAGA_ANY_ATTRIBUTE instead? | | In any case: yes, that makes sense - will change. Attribute names | will not align that nicely anymore though, which will hurt my eyes | for sure :-P out of scope now.
Page 26: - how is the _SAGA_OBJECT_TYPE attribute supposed to be stored? as string, as integer?
| As Enum as defined in the core spec: | | "Values of Enum type attributes are expressed as strings, and have | the literal value of the respective enums as deï¬ned in this | document. For example, the initial task states would have the | values âNewâ, âRunningâ, âDoneâ, etc." out of scope now.
- 'For all classes inheriting saga::object, the serialization SHOULD contain the two attributes, too.' What two attributes?
| 'type' and 'session' - that is confusing though, I agree, since | session can be comprised of multiple attributes. Will fix. out of scope now.
- I believe it might be useful to store th euuid of the original object as well, allowing to make persistent object relations independently of a single application instance
| I had that in there originally, but removed it because, IMHO, | advert::store_object()/retrieve_object() should behave like | object::clone(), which happens to assign a new UUID. Also, one | could call 'retrieve_object()' multiple times on the same advert, | and would end up with the same uuid for multiple objects -- this is | exactly why we required to change the uuid on object.clone() I | think, as that would put quite some burden on the implementation to | track those cases, with no appearent benefit to the end user. | | I'd be happy to be convinced otherwise though - let me know what you | think, please. clone() reference added. otherwise out of scope now.
- How can a saga::object have associated context information stored? It should be sufficient to refer to the session, which in turn refers to the contexts.
| Right, but that would require that the session is actually stored in | the advert service. That could be either done by the backend | automatically, which creates a session entry along with the original | entry, or is done by the SAGA application explicitely. | | However, in both cases the advert service will get swamped with | session and context adverts. Alowing to fold the session incl. | contexts into the advert itself avoids that - for the price of | bloated adverts. Bulk ops or caching are a must in either case I | guess... out of scope now.
- ...If no session information are serialized... --> If no session information is serialized
Page 27: - strings should be always uuencode'd, otherwise we loose interoperability
| We don't require that for replica attribute strings either, nor in | fact for any attributes in the SAGA specs anywhere. IMHO, we should | leave this as simple as it is, and fix it once people start | complaining. uuencode in user space is also always available for | the fringe use cases. Makes sense? out of scope now.
- How vector attributes of a job description have to be serialized
| Ouch, good one - no idea. Possibly similar to the context list, so | via | | _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_NUM = 2 | _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_0 = "localhost" | _SAGA_JOB_DESCRIPTION_CANDIDATE_HOSTS_1 = "remotehost" | | ? That would be at least somewhat consistent, but again bloats the | attribute table... out of scope now.
- the job service id should not be serialized along with the jobid to avoid ambiuities/incompatibilities between the two, parsing the jobid is not a big deal in the end...
| What ambiguities would you expect? | | Yes, parsing is not the big deal, but the job-id format is not a | hard requirement in the Core spec (SHOULD instead of MUST). Also, | our own adaptors don't always follow that format specification IIRC | ;-) out of scope now.
Page 28: - are permissions serialized too?
| I don't think so, as those are backend properties. Like, file | permissions are stored on the backend - as soon as one restores the | file object, the adaptor can query the backend for actual state | information, including permissions. Only client side state should | get serialized: the file pointer being a prominent example. out of scope now.
Page 29: - logical\_file --> logical_file - logical\_directory --> logical_directory
gone
Page 30: - what attributes have advert and advert_directory when serialized?
| Doh. That is not even specified :-) | | I would expect: | session | url | openmode | | The rest is again backend state, IMHO. out of scope now. Thanks again, thats really great feedback! Best, Andre. -- Nothing is ever easy.