
Hi All,

I hope you have (or have had) a good face-to-face meeting. On the previous conference call we discussed aggregate accounting and took steps to move it out of scope.

Below is a valid and well-formed record that uses the definitions from section 10 of the spec.

<UsageRecord>
  <RecordId urwg:recordId="foo"/>
  <UserIdentity>
    <ds:KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#" xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      <X509Data>
        <X509SubjectName>cn=matt ford</X509SubjectName>
      </X509Data>
    </ds:KeyInfo>
  </UserIdentity>
  <StartTime urwg:description="accountperiod">2003-06-16T08:24:32Z</StartTime>
  <EndTime urwg:description="accountperiod">2006-05-08T10:34:58Z</EndTime>
  <CPUDuration urwg:description="sum over accountperiod for UserIdentity">234234324325</CPUDuration>
  <WallDuration urwg:description="sum over accountperiod for UserIdentity">23423434</WallDuration>
  <Status urwg:description="all complete jobs over accounting period for UserIdentity">completed</Status>
</UsageRecord>

I think this certainly constitutes a valid aggregate record. It would require a fairly significant rewrite to make this _not_ be allowed. I'm thinking of using something like this to report total CPUDuration for some of my users.

Is this against the spirit of the clarification we are trying to make, or should it be allowed? The move away from aggregation is a big one... or have I been too liberal in my interpretation? (I'd argue against that, though.)

Matt.
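To make the intended use concrete, here is a minimal Python sketch of how per-job usage might be summed into the per-user totals that such an aggregate record would carry. The job list, its values, and the function name are hypothetical and only illustrate the idea; nothing here comes from the spec.

```python
from collections import defaultdict

# Hypothetical per-job usage: (user identity, CPU seconds, wall seconds).
# These values are made up for illustration only.
JOBS = [
    ("cn=matt ford", 120, 150),
    ("cn=matt ford", 300, 320),
    ("cn=another user", 45, 60),
]


def aggregate_by_user(jobs):
    """Sum CPU and wall durations, and count jobs, per user identity."""
    totals = defaultdict(lambda: {"cpu": 0, "wall": 0, "jobs": 0})
    for user, cpu, wall in jobs:
        totals[user]["cpu"] += cpu
        totals[user]["wall"] += wall
        totals[user]["jobs"] += 1
    return dict(totals)


totals = aggregate_by_user(JOBS)
print(totals["cn=matt ford"])
```

Each per-user total would then populate CPUDuration and WallDuration in one record, with StartTime and EndTime marking the accounting period rather than a job's execution.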

Hi Matthew, hi all!

Matthew Ford wrote:
I think this certainly constitutes a valid aggregate record. It would require a fairly significant rewrite to make this _not_ be allowed. I'm thinking of using something like this to report total CPUDuration for some of my users.
Is this against the spirit of the clarification we are trying to make, or should it be allowed? The move away from aggregation is a big one... or have I been too liberal in my interpretation? (I'd argue against that, though.)
I think a usage record should be explicitly declared to be aggregate (how should be discussed); otherwise we would easily undermine all standardization efforts. How, for example, would an arbitrary implementation of the Resource Usage Service (RUS) that uses the UR realize that your UR is meant to be aggregate? It would most probably (and usually should) interpret StartTime as the start time and EndTime as the stop time of the execution of a single job. It wouldn't understand urwg:description="accountperiod" as a modifier of the meaning of StartTime and EndTime.

Of course you might customize the behaviour of your RUS (or whatever service you use for accounting), but that would make it a non-standard solution.

But I agree that an aggregate format would be most useful. If it is not defined by the UR-WG, then most Grid environments/projects will most probably define their own aggregate formats. This too would risk undermining the standardization efforts.

Before declaring an aggregate UR out of scope, we should at least start (via the mailing list) a discussion on what would be necessary: new elements? Or simply an additional attribute 'urwg:scope="aggregate"' on some of the elements? On which ones? Should multiple job IDs be allowed for aggregate URs? And so on.

But I think this would be a longer process and wouldn't lead to quick results. The improvement of the current UR for single jobs will (and should) have priority, I guess.

Cheers,
Rosario.
--
Rosario Piro (piro@to.infn.it)
http://www.to.infn.it/~piro/
Istituto Nazionale di Fisica Nucleare, Sezione di Torino
National Institute for Nuclear Physics, Section of Turin, Italy
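The interpretation problem raised in this reply can be sketched in Python. The urwg:scope attribute is only the suggestion floated above, not part of any published schema; placing it on the root element is an arbitrary choice for illustration, and the namespace URI bound to the urwg prefix is an assumption, since the thread never states it.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the urwg prefix; the thread does not state it.
URWG = "http://www.gridforum.org/2003/ur-wg"

# Hypothetical attribute, following the 'urwg:scope="aggregate"' suggestion.
SCOPE_ATTR = "{%s}scope" % URWG

# A record like the one posted, reduced to the two time elements.
PLAIN = f"""<UsageRecord xmlns:urwg="{URWG}">
  <StartTime urwg:description="accountperiod">2003-06-16T08:24:32Z</StartTime>
  <EndTime urwg:description="accountperiod">2006-05-08T10:34:58Z</EndTime>
</UsageRecord>"""

# The same record with the hypothetical scope marker on the root element.
MARKED = PLAIN.replace("<UsageRecord ", '<UsageRecord urwg:scope="aggregate" ', 1)


def interpret(xml_text):
    """Return (kind, start, end) the way a generic consumer might read them."""
    root = ET.fromstring(xml_text)
    is_aggregate = root.get(SCOPE_ATTR) == "aggregate"
    kind = "accounting period" if is_aggregate else "single job"
    return kind, root.findtext("StartTime"), root.findtext("EndTime")


print(interpret(PLAIN)[0])   # the free-text description is ignored
print(interpret(MARKED)[0])  # the explicit marker is machine-readable
```

A consumer written this way treats the unmarked record as a single job regardless of its description text, which is the ambiguity an explicit declaration would remove.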

Quoting Rosario Michael Piro <piro@to.infn.it>:
I think a usage record should be explicitly declared to be aggregate (how should be discussed); otherwise we would easily undermine all standardization efforts. How, for example, would an arbitrary implementation of the Resource Usage Service (RUS) that uses the UR realize that your UR is meant to be aggregate? It would most probably (and usually should) interpret StartTime as the start time and EndTime as the stop time of the execution of a single job. It wouldn't understand urwg:description="accountperiod" as a modifier of the meaning of StartTime and EndTime.
So my point here is that section 10 of the spec defines StartTime, EndTime, and CPUDuration as relating to Usage, *not* to a job. The descriptor is used to attach meaning, but it's not actually necessary; it's essentially free text. Without the descriptor field the aggregate example is still valid.

I guess the issue is that the current section 10 definitions allow this; I think deliberately. For example, the CPUDuration field can be used in a number of circumstances: aggregate info is one of them, time on a batch system is another, and CPU time spent copying files may be yet another.

If we do move these fields into separate sub-elements or specific "Job" types (which I'm in favour of), then we have to be very careful. Anyone using the schema as I think it was originally intended will have to be made aware of these significant changes.

I'd love to understand from the original authors whether this is actually how they envisaged its use.
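The claim that the descriptor carries no structural weight can be checked with a short Python sketch. The namespace URI is an assumption (the thread never states it), and the helper name is hypothetical; the element values are the ones from the posted record.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the urwg prefix; the thread does not state it.
URWG = "http://www.gridforum.org/2003/ur-wg"

WITH_DESCRIPTION = f"""<UsageRecord xmlns:urwg="{URWG}">
  <CPUDuration urwg:description="sum over accountperiod for UserIdentity">234234324325</CPUDuration>
  <WallDuration urwg:description="sum over accountperiod for UserIdentity">23423434</WallDuration>
</UsageRecord>"""

WITHOUT_DESCRIPTION = f"""<UsageRecord xmlns:urwg="{URWG}">
  <CPUDuration>234234324325</CPUDuration>
  <WallDuration>23423434</WallDuration>
</UsageRecord>"""


def element_values(xml_text):
    """Return {tag: text} for each child element, ignoring all attributes."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


same = element_values(WITH_DESCRIPTION) == element_values(WITHOUT_DESCRIPTION)
print(same)
```

Both records yield the same elements and values, so any distinction between aggregate and per-job usage rests entirely on the free-text description.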