
Hi,

The high-level use case is to let potential actors (users, high-level middleware services, schedulers, etc.) monitor, track, and update job information in its entirety. Some general requirements for the activity schema:

Normally, the activity document contains the job description as a JSDL instance sent by a user. In UNICORE, the activity document contains not only the user-submitted JSDL instance but also the JSDL instance incarnated by the back-end execution management system during job execution.

Another UNICORE requirement is that the activity instance includes computing job attributes with the associated storage information. In particular, UNICORE does this by adding a Storage Management Service (SMS) reference to the activity properties document so that clients can manage remote storage.

The activity schema can also represent the events that occurred during the lifetime of a job. In other words, one may take it as job version information, where each instance may include the job status and time stamp elements. This information should not be overloaded with bits and pieces of each job description item, such as resources consumed or the JSDL instance.

The activity instance contains the resource allocation (consumption) information, that is, which resources are, or will be, allocated to a job in real time. In a multi-resource environment, this information can also capture exceptions during job execution, for example that a job was interrupted due to node failures or resource starvation, or that it has been transferred to another server instance. In general, this information is significant to the middleware components that manage job recovery and checkpointing in the event of failovers.

Cheers,
--
Mohammad Shahbaz Memon
Distributed Systems and Grid Computing
Jülich Supercomputing Center
Forschungszentrum Jülich GmbH
Jülich
Germany
Office: +49 (0)2461 61 6567
Fax: +49 (0)2461 61 6656
http://www.fz-juelich.de/jsc

Sitz der Gesellschaft: Jülich
Eingetragen im Handelsregister des Amtsgerichts Düren Nr. HR B 3498
Vorsitzende des Aufsichtsrats: MinDirig'in Bärbel Brumme-Bothe
Vorstand: Prof. Dr. Achim Bachem (Vorsitzender), Dr. Ulrich Krafft (stellv. Vorsitzender)
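
As an illustration of the requirements listed in the message above, here is a minimal Python sketch of an activity document that keeps the submitted and incarnated JSDL, a storage reference, allocation records, and a separate status/time-stamp history. All class and field names (ActivityDocument, ActivityEvent, ResourceAllocation, storage_reference, and so on) are assumptions made for this example; they are not taken from UNICORE, JSDL, or any OGF schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ActivityEvent:
    """One entry in the job's lifetime history: a status plus a time stamp."""
    status: str
    timestamp: datetime
    detail: Optional[str] = None  # e.g. "node failure" (hypothetical free text)


@dataclass
class ResourceAllocation:
    """Resources that are, or will be, allocated to the job (hypothetical fields)."""
    host: str
    cpus: int


@dataclass
class ActivityDocument:
    """Hypothetical container for the information discussed in the thread."""
    submitted_jsdl: str                      # JSDL instance as sent by the user
    incarnated_jsdl: Optional[str] = None    # JSDL as incarnated by the back end
    storage_reference: Optional[str] = None  # e.g. the address of a storage service
    allocations: List[ResourceAllocation] = field(default_factory=list)
    history: List[ActivityEvent] = field(default_factory=list)

    def record(self, status: str, detail: Optional[str] = None,
               when: Optional[datetime] = None) -> ActivityEvent:
        """Append a status change; the history holds only status, time, and a note."""
        event = ActivityEvent(status, when or datetime.now(timezone.utc), detail)
        self.history.append(event)
        return event

    def latest_status(self) -> Optional[str]:
        """Return the most recently recorded status, or None if nothing was recorded."""
        return self.history[-1].status if self.history else None
```

The history entries deliberately carry only a status, a time stamp, and a short note, which mirrors the point that event information should not be overloaded with the JSDL instance or resource details; those live in their own fields.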

There is a lot of information going on here... how important is it that the information is recorded in one document as opposed to recording where the information can be obtained? It would seem a lot better that the BES endpoint remains the definitive source of the job status rather than some other document floating around the system... otherwise maintaining any form of consistency would be very hard IMHO.

Steven

Hi,

Steven Newhouse wrote:
> There is a lot of information going on here... how important is it that the information is recorded in one document as opposed to recording where the information can be obtained? It would seem a lot better that the BES endpoint remains the definitive source of the job status rather than some other document floating around the system... otherwise maintaining any form of consistency would be very hard IMHO.

Good point. On the other hand, endpoints tend to disappear, so keeping some information around in the activity instance document has its merits. But it should be historical information, not intended to be "live".

Regards,
Bernd.

--
Dr. Bernd Schuller
Distributed Systems and Grid Computing
Juelich Supercomputing Centre
http://www.fz-juelich.de/jsc
mail: b.schuller@fz-juelich.de
phone: +49 2461 61-8736 (fax: -6656)
personal blog: http://www.jroller.com/page/gridhaus

Steven,

2008/4/16, Steven Newhouse <Steven.Newhouse@microsoft.com>:
> There is a lot of information going on here... how important is it that the information is recorded in one document as opposed to recording where the information can be obtained? It would seem a lot better that the BES endpoint remains the definitive source of the job status rather than some other document floating around the system... otherwise maintaining any form of consistency would be very hard IMHO.

I agree that there should be a fixed endpoint for retrieval. But which BES would we use? Plus, will the service that initially creates an activity instance for a client request have a BES interface no matter what?

Regards,
Alexander

--
Dipl.-Inform. Alexander Papaspyrou
Robotics Research Institute
Information Technology Section
Dortmund University of Technology, Germany
http://ds.e-technik.uni-dortmund.de/~alexp
phone : +49(231)755-5058
fax : +49(231)755-3251

Alexander Papaspyrou wrote:
> Steven Newhouse wrote:
>> There is a lot of information going on here... how important is it that the information is recorded in one document as opposed to recording where the information can be obtained? It would seem a lot better that the BES endpoint remains the definitive source of the job status rather than some other document floating around the system... otherwise maintaining any form of consistency would be very hard IMHO.
> I agree that there should be a fixed endpoint for retrieval. But which BES would we use? Plus, will the service that initially creates an activity instance for a client request have a BES interface no matter what?

As I understand it, the "fixed endpoint" could actually just be a front that uses something like WS-Naming to redirect to the endpoint for the repository that is actually serving the information. However, from the perspective of JSDL, I think we shouldn't be worrying about this at all (officially); it's part of the service implementation and not the data representation. (OK, I know we have some services in mind that we want to build when we write this, but we don't have to solve it all.)

On the point of consistency, I'd just ignore it! To be precise, if the data is being collected from across some sort of distributed system, there is no way to guarantee that it is consistent (messages do get lost and delayed; you can't avoid it). I'd instead state that the information is only ever of "best effort" quality unless it specifically states otherwise.

Donal.

While it was not my intention to cause 'trouble' some of the responses to me seem to indicate very different architectures/implementations.

I think you need to firm up your models as to how you expect this activity schema to be stored, represented and accessed.

Steven

I think this is a fair assessment of where we are at the moment, since we have just started collecting and discussing use cases and have not quite started breaking the use cases down into requirements. We'd be happy to have you join us. The regular call is at 7am Pacific and the cadence is (typically) bi-weekly.

Andreas

Steven Newhouse wrote:
> While it was not my intention to cause 'trouble' some of the responses to me seem to indicate very different architectures/implementations.
>
> I think you need to firm up your models as to how you expect this activity schema to be stored, represented and accessed.
>
> Steven

--
Andreas Savva
Fujitsu Laboratories Ltd.

Steven Newhouse wrote:
> While it was not my intention to cause 'trouble' some of the responses to me seem to indicate very different architectures/implementations.

I don't currently have any specific architecture or implementation in mind. On this topic, I'm closer to being in the peanut gallery than usual. :-)

> I think you need to firm up your models as to how you expect this activity schema to be stored, represented and accessed.

I suppose I'm currently thinking that, for a generalized Activity, there'll be information originating in a whole range of services and resources, and that there will be some model of aggregation. The info that represents the current state will be "out of date" (and so perhaps ought to be tagged with EPRs of services where you can go and ask for the most up-to-date version?[*]) but the historical info will (eventually) be accurate. Given that we're aggregating, we have to have a common structure/organization for how to gather the disparate subdocuments together. Which I think is the focus of what we're doing here, yes?

Donal.

[* I only just thought of this now; the idea's not been subject to any kind of deep thought, analysis or in-service testing! ]
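
To make the aggregation idea in the message above concrete, here is a rough Python sketch, assuming each service contributes a small "fragment" that is marked either as a snapshot of current state (tagged with an address where a fresher version can be requested) or as historical record. The names Fragment, aggregate, refresh_points, and source_epr are invented for this example, and the EPR is modelled as a plain string rather than a real WS-Addressing endpoint reference.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Fragment:
    """A subdocument contributed by one service or resource (hypothetical)."""
    section: str                      # which part of the activity it describes
    content: str
    live: bool                        # True: snapshot of current state, may be stale
    source_epr: Optional[str] = None  # where to ask for the most up-to-date version


def aggregate(fragments: List[Fragment]) -> Dict[str, List[Fragment]]:
    """Group fragments by section; this grouping is the common structure."""
    doc: Dict[str, List[Fragment]] = {}
    for fragment in fragments:
        doc.setdefault(fragment.section, []).append(fragment)
    return doc


def refresh_points(doc: Dict[str, List[Fragment]]) -> List[str]:
    """List the addresses a client could query to refresh the live fragments."""
    return sorted({
        fragment.source_epr
        for fragments in doc.values()
        for fragment in fragments
        if fragment.live and fragment.source_epr
    })
```

In this sketch nothing tries to keep the live fragments consistent; a reader treats them as best-effort snapshots and follows the recorded address when a current answer matters, while historical fragments are simply kept as recorded.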

participants (6)
- Alexander Papaspyrou
- Andreas Savva
- Bernd Schuller
- Donal K. Fellows
- Mohammad Shahbaz Memon
- Steven Newhouse