Hi Donal! I think it's a good idea to start from a survey. I've added a few ideas and comments inline.
Hi everyone
First let me apologize for letting things slide for a while; I've been up to my ears in the project I work on.
We need to start work on designing the survey to gather experiences with the UR 1.0 format. I'm not quite sure how to go about this, but I suppose a good way to start would be to provide some kind of template for people to fill in. So I'm going to write some random ideas I have for what we should be asking; chime in if you think I'm right or wrong.
Name & Contact Email
Project/Product
Prose Description of Use
Structured Use Characterization: Which UR1.0 Elements do you support? (Table, one row per spec element)
Element Name | Generate | Parse
-------------+----------+-------
UsageRecord  | yes      | yes
....
What extensions do you put in a Usage Record?
e.g. executable name, working directory, arguments
This is important, I guess, for knowing which properties might be added to those explicitly defined (if everybody is using it, why not add it as a regular property ...). For that purpose I suggest also asking _how_ important those extensions are for the correct functioning of the project/product. For example: if an accounting system can only accept/process records containing a non-standard property (i.e. an extension), the impact on interoperability is different from cases in which an extension is allowed but not required. A practical example: DGAS requires the unique global resource ID (Computing Element ID) to be present in a record (because it associates these records with pre-registered CE accounts). This might be put either into the MachineName (which, however, can also be the site name, etc.) or into an extension. In both cases there is a risk of incompatibility of otherwise perfectly valid usage records (if the MachineName contains something else or the extension is not defined). But this is just an example; there will be other projects/products that require specific non-standard properties (VOName, for example, is important for nearly all large-scale Grid environments), but that is of course something that the survey has to show. Just keep in mind that some extensions might be more essential than others.
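To make the required-versus-optional distinction concrete, here is a minimal sketch in Python of how a survey answer could be checked against a record. Everything in it is an assumption made for illustration: the `ConsumerProfile` class, the `missing_required` helper and the extension name `ResourceID` are hypothetical and are not taken from UR 1.0, DGAS or any other product.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumerProfile:
    """Hypothetical description of what one accounting system expects."""
    name: str
    required_extensions: set = field(default_factory=set)
    optional_extensions: set = field(default_factory=set)


def missing_required(record_extensions, profile):
    """Return the required extensions that the record does not carry."""
    return sorted(profile.required_extensions - set(record_extensions))


# Illustrative values only; the extension names are invented for this sketch.
strict = ConsumerProfile("strict-system", required_extensions={"ResourceID"})
lenient = ConsumerProfile("lenient-system", optional_extensions={"VOName"})

# Extensions carried by one otherwise valid usage record.
record = {"VOName": "example-vo"}

print(missing_required(record, strict))   # ['ResourceID']
print(missing_required(record, lenient))  # []
```

The same record is accepted by the lenient consumer and rejected by the strict one, which is the kind of difference the survey question about importance would be meant to reveal.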
When do you generate usage records?
At job start.
At job end.
At points during the job.
What triggers the generation?
After the job has run from other captured data.
Corresponding to whole job?
Corresponding to events within job?
Good point. We usually talk about "periodic" accounting (after the fact; e.g. once a day), "immediate" accounting (right after job completion) and "real-time" accounting (incrementally while a service is being furnished). I'd clarify by using the expressions "at job end (upon job completion)" and "after job ends (periodically)".
What do you do with Usage Records after generating them?
Is there anything else that we ought to ask about?
I'd also ask how important an aggregate/summary usage record (for usage statistics) would be, making clear, however, that it would be a distinct specification (such as the one proposed by Xiaoyu). Just to have an idea of how much effort should be put into this (as a co-chair of the RUS-WG I say a lot ;oP , but of course it's the community that should decide).
I should note here that the OGSA HPC Profile WG is looking into the collection of usage data from simple jobs, and that they are minded to use UR1.0 as the basis for this.
Can you point me to some documentation about this? Reading their charter, they seem to focus on job scheduling, not on building usage statistics.
Cheers, Rosario.
Donal.