
Hi all!

This morning at the UR-WG session at GGF18 we briefly discussed a few proposals for version 2 of the UR. To involve those who could not attend in the discussion, I am sending you some slides on my proposals, in two versions: one that also contains some information on our accounting system DGAS (not essential for the discussion and not presented at the meeting, but the info might be interesting as a use case), and one that contains only the proposals themselves.

Some comments on the individual proposals:

(1) This proposal is based on existing use cases: we actually have VOs in LCG that need more detailed user information, such as the group and role of the user within the VO. One specific use case: BIOINFOGRID decided to become part (a group) of the BIOMED VO, but still wants distinct accounting statistics in order to compare its resource usage to that of the rest of the VO. The data fields we propose for this information can be considered part of the user's identity. Above all the VO is important: I know of users who are subscribed, with the same user DN/certificate, to multiple VOs, so which VO should be charged for a job submitted by such a user?

(2) We propose to have an element for all ResourceIdentity information, in analogy to UserIdentity and JobIdentity. This element would contain fields that are already present, such as MachineName, Queue and Host. It should also contain a SiteName (the site name could be put into MachineName, but then where would the cluster name go? A site name is semantically different from a machine name). The most important part of the proposal for this ResourceIdentity, however, is a field that can be used to specify the global resource ID within the Grid environment (in our case the LCG ComputingElement ID, which is based on the Globus contact string).
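
To make the grouping in (2) a bit more concrete, here is a rough Python sketch of what such a ResourceIdentity element might look like. It is only an illustration: the element names follow the proposal, but the flat layout, the missing XML namespaces and all the values (host names, site name, CE ID) are made-up assumptions, not part of any agreed schema.

```python
import xml.etree.ElementTree as ET


def build_resource_identity(global_resource_id, site_name, machine_name, host, queue):
    """Group all resource-related fields under one ResourceIdentity element.

    The child element names follow the proposal; the flat layout and the
    absence of namespaces are simplifying assumptions for this sketch.
    """
    identity = ET.Element("ResourceIdentity")
    fields = [
        ("GlobalResourceId", global_resource_id),
        ("SiteName", site_name),
        ("MachineName", machine_name),
        ("Host", host),
        ("Queue", queue),
    ]
    for tag, value in fields:
        ET.SubElement(identity, tag).text = value
    return identity


# All values below are invented examples.
element = build_resource_identity(
    global_resource_id="ce01.example.org:2119/jobmanager-lcgpbs-short",
    site_name="EXAMPLE-SITE",
    machine_name="ce01.example.org",
    host="wn042.example.org",
    queue="short",
)
print(ET.tostring(element, encoding="unicode"))
```

The point of the sketch is only that the global ID sits next to the local fields instead of having to be reconstructed from them.
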
We deem this field most important because for some Grid environments it might be impossible to reconstruct the correct global resource ID from fields like MachineName (above all if the meaning of machine name is left open), Host and Queue. Moreover, if we have a GlobalUsername (and ds:KeyInfo) and a GlobalJobId, it is only logical to also have a GlobalResourceId ("Resource" because it might then also be used for accounting of other resources, like storage, in the future).

(3) Proposal 3 has caused some heavy discussion between two fundamentally different points of view (in my opinion both understandable and reasonable):
a) leave out of the UR the information that is not directly related to resource usage (performance is not directly an issue of the job that is running on the resource)
b) include all information that is necessary to interpret the usage values (CPU time doesn't mean much when performance differs significantly: 1 CPU sec on a slow host is not equal to 1 CPU sec on a fast host)
BUT: these performance values (which ones to take? which ones make sense?) might also be specified by means of the existing extension framework.

(4) We should start thinking about going beyond job usage. Storage accounting, for example, will be an important issue in the near future (LCG/EGEE, for example, is currently working on it). We propose to have a basic usage record type from which more specific usage records (JobUsageRecord, StorageUsageRecord, ...) can be derived by adding specific elements (JobIdentity for jobs, FileIdentity for files on a storage resource, etc.).

(5) There was no time to look at this proposal at the meeting. We should decide whether requested/reserved resources can be considered as "used". It is, for example, most likely that reserved storage space will be charged to a user. Some data fields might occur multiple times (Disk, Network, ...),
each having an additional attribute (I called it "usage", but there surely are better names) that specifies whether the resource has actually been "consumed" or only "reserved/requested". The default should be "consumed" for reasons of backward compatibility (old URs would still be valid).

Among these proposals I deem (1) important and (2) essential. (4) might be important for achieving a common framework for more generic usage accounting (it would not be nice if the UR were used only for jobs while storage was handled by a completely different approach). Proposals (3) and (5) are just ideas, and might also be handled by the extension framework.

But we should always think twice before just saying "well, that can be done with an extension", otherwise we might end up with N different versions (through different extensions) of the UR in N different Grid environments. This might undermine the standardization effort. A specific use case: DGAS requires the GlobalResourceId, since it specifies the resource account with which a usage record should be associated. If we use extensions for that, and other Grids/accounting systems use other extensions for the same thing, then we might not be able to exchange usage information between these accounting systems.

Please check the proposals, think about them, think also about actual use cases, and then let's have a fruitful discussion! :o)

Cheers,
Rosario.
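
P.S. A rough sketch of how a consumer could handle the "usage" attribute from proposal (5), in case it helps the discussion. The attribute name "usage" and the default "consumed" are from the proposal; the value "reserved", the stripped-down record without namespaces and the numbers are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET

# Default from the proposal: a field without the attribute counts as consumed,
# so records written before the attribute existed keep their meaning.
DEFAULT_USAGE = "consumed"


def usage_kind(element):
    """Return the value of the proposed "usage" attribute, or the default."""
    return element.get("usage", DEFAULT_USAGE)


def total(record, tag, kind):
    """Sum the integer values of all direct child fields `tag` of the given kind."""
    return sum(int(e.text) for e in record.findall(tag) if usage_kind(e) == kind)


# Simplified, invented record: Disk occurs twice, once without the attribute.
record = ET.fromstring(
    "<UsageRecord>"
    "<Disk>100</Disk>"
    '<Disk usage="reserved">500</Disk>'
    "<Network>42</Network>"
    "</UsageRecord>"
)

print(total(record, "Disk", "consumed"))  # 100
print(total(record, "Disk", "reserved"))  # 500
```

An old-style record with a single plain Disk field goes through the same code unchanged, which is the backward compatibility argument above.
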