
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar said:
Does GLUE require an information system to respond quickly? If so, on what time-scale?
Well, we use the CE information for scheduling, so it needs to be fast enough to give a reasonable representation of e.g. the number of jobs in a queue. As a rough guide, if you have 1000 job slots and jobs typically run for 10 hours, you might have a transition every 30 seconds or so. However, you don't need accuracy down to the level of a single job, so a latency of a few minutes is OK, and typically that's what we have now. On the other hand, a latency of an hour would not be good enough: the queue state could change quite dramatically in that time. It's also worth remembering that the batch schedulers themselves have cycle times, typically something like 30 seconds I think, so there is no point in the info system being faster than that.
It has often been suggested that we should separate dynamic and static data, so that we could get frequent updates of the small amount of dynamic information and longer-lived caches for the static stuff. I think Laurence has done some work along those lines for the BDII, but at the moment everything is treated the same way.
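To put numbers on the rough guide above (1000 job slots, 10-hour jobs), here is a small back-of-envelope sketch in Python. It assumes all slots are busy and job end times are spread evenly, which is a simplification; the function names are made up for illustration and are not part of GLUE or any middleware.

```python
def mean_transition_interval(job_slots: int, mean_runtime_hours: float) -> float:
    """Mean seconds between job completions.

    Assumes every slot is busy and job end times are spread evenly.
    """
    return mean_runtime_hours * 3600 / job_slots


def transitions_during(latency_seconds: float, job_slots: int,
                       mean_runtime_hours: float) -> float:
    """Expected number of job transitions within one information-system latency."""
    return latency_seconds / mean_transition_interval(job_slots, mean_runtime_hours)


if __name__ == "__main__":
    interval = mean_transition_interval(1000, 10)
    print(f"one transition every {interval:.0f} s")
    # Compare a few-minute latency with a one-hour latency.
    for latency in (300, 3600):
        n = transitions_during(latency, 1000, 10)
        print(f"latency {latency:>4} s: about {n:.0f} transitions")
```

Under these assumptions the interval comes out at 36 seconds, in line with "every 30 seconds or so". A five-minute latency then misses only a handful of transitions, whereas an hour misses about a hundred, i.e. roughly a tenth of the slots.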
So, worst-case delay is the simple sum of these: 291s or the best part of five minutes.
That's probably about right. In the original design for the WMS it queried the GRIS directly before submitting a job to get more recent data than from the II, but we disabled that a long time ago because it was inefficient and didn't make much practical difference - where we have problems they tend to be more fundamental, e.g. poor algorithms for EstimatedTraversalTime or black-hole WNs which fail jobs quickly and hence give a small ERT.
Stephen