Hi all,

Quoting [Andre Merzky] (Apr 04 2010):
From: Andre Merzky
To: Thilo Kielmann
Cc: SAGA RG
Subject: Re: [SAGA-RG] notes from the OGF28 session on 15/03, 16:00-17:30

Attached is another revision of the SAGA Core API Experience document, which contains changes as discussed at OGF28. I hope the changes reflect the discussion points.
I just wanted to let you know that both the advert API extension and the Core experience document have been submitted to the OGF editor, and both docs should be entering public comment sometime soon. That means that the Core API (including errata) is now definitely frozen, unless the public comments require additional changes. The submitted documents can be found in

  https://svn.cct.lsu.edu/repos/saga-ogf/trunk/documents/saga-package-advert/t...
  https://svn.cct.lsu.edu/repos/saga-ogf/trunk/documents/saga-core-experience/...
  https://svn.cct.lsu.edu/repos/saga-ogf/trunk/documents/saga-core/tags/v1.1rc...
So, a couple of additional errata from the Naregi group have been applied to the Core API - hopefully the last ones. However, there remains one item unresolved:
Apparently we never considered adding a flush() method to the saga::file instance. As is, our API implies that all writes are immediately flushed. While that is certainly valid, the question remains whether we should consider an explicit flush() method, which would, among other things, allow implementations to perform client-side caching of write operations. Iff that is considered useful, one could further discuss whether it should be introduced at the namespace level, so that other namespace-derived packages (replica, advert, etc.) can also benefit from flush(). FWIW, a close() should always imply a flush(), IMHO.
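To make the proposal concrete, here is a minimal Python sketch of what client-side write caching with an explicit flush() could look like. It is not taken from any SAGA binding: the class name CachingFile, its methods, and the use of io.BytesIO as a stand-in for the remote file are all assumptions made purely for illustration.

```python
import io


class CachingFile:
    """Hypothetical file handle that caches writes on the client side.

    Illustrative only; not part of the SAGA API or any of its bindings.
    """

    def __init__(self, backend):
        self._backend = backend   # stand-in for the (possibly remote) file
        self._pending = []        # writes accepted but not yet sent
        self._closed = False

    def write(self, data: bytes) -> int:
        if self._closed:
            raise ValueError("write on closed file")
        self._pending.append(data)  # cached only, backend is untouched
        return len(data)

    def flush(self) -> None:
        if self._closed:
            raise ValueError("flush on closed file")
        if self._pending:
            # Push all cached writes to the backend in one operation.
            self._backend.write(b"".join(self._pending))
            self._pending.clear()

    def close(self) -> None:
        # close() always implies a flush(), as suggested above.
        if not self._closed:
            self.flush()
            self._closed = True


remote = io.BytesIO()              # assumed stand-in for the remote file
f = CachingFile(remote)
f.write(b"hello ")
f.write(b"world")
before_flush = remote.getvalue()   # nothing has reached the backend yet
f.flush()
after_flush = remote.getvalue()    # both writes arrive together
f.write(b"!")
f.close()                          # the implicit flush delivers the last write
after_close = remote.getvalue()
```

Without flush(), an implementation that wants the "writes are immediately visible" semantics has to push every write() through to the backend, which is exactly the cost that caching is meant to avoid.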
So, please voice your opinion!
There was not much feedback on this item, so I added it to the list of open items for SAGA 2.0. As of now, caching behaviour on write remains undefined, and the safest assumption (for SAGA implementors) is to always flush after write, even if that is costly in terms of performance.

That raises the question of when, if at all, we should start to discuss a next version of the core API. FWIW, I append the current list of open issues to this mail.

We have not had a phone call since OGF28. There are a number of open TODO items, however, and I am not sure that any calls are useful at this point, beyond reiterating that those items need to be dealt with :-P So, I suggest suspending the calls until at least some of these items are handled:

  - CPR package needs to be finalized
  - message API examples need to be rendered in different versions, to come to a conclusion on the general design approach.
  - Python bindings need to be shown to be functional on both Java and C++
  - the SAGA rendering of GridRPC.v2 needs to be synced with the final version of GridRPC.v2

If anybody has other items to discuss, please let me know, and I'll schedule the calls. Also, the above items are obviously open for input from all of you, so please feel free to contribute in any form.

Finally, the conversion of our CVS repository to SVN is completed. CCT support did not manage to make the CVS repository read-only, but please don't commit there anymore. The new SVN URL is, as you probably guessed from above,

  https://svn.cct.lsu.edu/repos/saga-ogf/trunk

That repository should be world-readable. Please let me know if you would like to have write permissions.

Best, Andre.

Current known open issues for SAGA Core v2.0
--------------------------------------------

  - file / stream server / rpc could have state (Unknown, New, Open, Closed).

  - task: get_task_description just like job desc, would give you information about what the task does, e.g.
      - "method"   = "copy"
      - "args"     = "internet.txt" "internet.bak" (vector attrib (type??))
      - "started"  = "11:35pm 12/22/2006"
      - "finished" = "11:35pm 12/22/2007"

    inspection would be useful to get type and return type of task after getting it from a task_container.

  - I/O tasks could have a get_buffer() method, to free the application from keeping/tracking I/O buffers. That would return a shallow copy of the buffer object which was given as inout parameter. The method would need to be templatized for the different buffer classes we have in the spec (or limited to the buffer base class).

  - make state transitions less prone to race conditions. E.g., allow suspend() also on jobs in Suspended state, and cancel() on jobs in a final state (state remains the same). Needs some thought...

  - what error is thrown on incorrectly formatted attributes, and when?

  - wait() to also report on other state changes, like suspend/resume (see DRMAA-II).

  - add inspection: job.list_interfaces ()

      - monitorable
      - attributable
      - steerable?
      - checkpointable?

    to provide seamless integration of extensions, which then can define additional interfaces for core classes (see cpr).

  - add resource assignment to job description, e.g.:

      // name:  CPUID
      // desc:  CPU id to assign the process thread to
      // mode:  ReadWrite, optional
      // type:  Int
      // value: '1'
      // notes: - if supported, the process is guaranteed to
      //          run on the CPU identified by the id.
      //        - id starts at 1
      //        - not supported by JSDL, DRMAA.v1
      //
      // name:  CPUCoreID
      // desc:  CPU core id to assign the process thread to
      // mode:  ReadWrite, optional
      // type:  Int
      // value: '1'
      // notes: - if supported, the process is guaranteed to
      //          run on the CPU core identified by the id.
      //        - id starts at 1
      //        - not supported by JSDL, DRMAA.v1

    This could also go into a resource management package, obviously, together with the 'queue' attribute btw (see mailing list discussion with Sylvain, and discussion about DRMAA.v2).

  - session.list_contexts (string type = ""); returns all contexts of that type. Also works on default session! If no type is given, all contexts are returned.

  - trigger metrics should have a value of 0 or 1, to allow polling for triggers. So, in fact, Trigger metrics should be Boolean.

  - we have mtime for ns entries; there is no reason not to have ctime, or even atime, even if that is not widely supported. So what. Now we have to add ctime to the cpr package... Messy.

  - properties which are available via get_xyz() and is_abc() should generally also be expressed as attributes (see get_size(), get_mtime(), but also is_file() etc.)

  - attributes and metrics should be unified. Either a metric IS-A attribute, or, even better, callbacks can be added to attributes, so that no metric is needed anymore.

  - file::dir should inherit file::entry *and* ns::dir. That makes sense in particular for the advert and cpr ns derivatives, which then don't need to duplicate methods anymore. Language bindings may not allow/encourage multiple inheritance, but it would make the spec (IDL) simpler.

  - we are not sticking to SIDL syntax anyway, so we should probably remove references to it, and define our own *blush*. See attributes, metrics, c'tors, multiple inheritance, etc.

  - reconsider splitting the core into Look & Feel and API packages :-/

  - reconsider file.get_fd(), for example for checkpoint writing/reading, where apps often have their own native IO routines. But of course, if they get a saga::fs::file, they can just close() it, and reopen the location natively...

  - file.flush is missing :-( Same for replica etc. Not sure if it makes sense on the ns::entry though.

--
Nothing is ever easy.