
Hi all, I had a recent discussion with a site over the meaning of Entity.CreationTime and Entity.Validity. I wanted to share my thoughts and see whether others have the same view, and whether this might be worth documenting: either as part of an updated GLUE document or as some auxiliary document (a "GLUE Processing Model", perhaps).

The core question is this: how do we know when published data is stale? Prima facie, this is easy: simply add Entity.Validity to Entity.CreationTime. If the resulting time is in the past then the object is stale. It becomes slightly tricky when considering an object that passes through several caching agents, with each agent pulling updated information periodically. [For those that don't know, this is how the EGEE information system currently works; there are three caching levels ("resource-level", "site-level", "top-level"). Each level runs the same software, "BDII", with different configuration.]

First off, I assert that the current info-provider model (a script that provides the current up-to-date information) cannot publish Entity.Validity. With a pull-update model, the info-provider cannot know when the next request will come. (To illustrate this, consider two agents calling the info-provider with different schedules, or an info-provider that supplies information "on demand" whenever a user clicks on some web-page.) As the info-provider cannot know when the next request will come, it cannot set an Entity.Validity value.

However, the info-provider can publish an Entity.CreationTime. Let's suppose it does. Let's also suppose that the first agent (the resource-level BDII) queries every two minutes. Ideally, the BDII will read the data and calculate what Entity.Validity is needed so that Entity.CreationTime + Entity.Validity is two minutes in the future. It then records this Entity.Validity along with the original Entity.CreationTime. If (somehow) this data is observed after the two minutes have elapsed, it is clear that the data is stale.
Note that it is also possible (i.e., logically self-consistent) for the BDII to reset Entity.CreationTime to the current time. I suggest this isn't done, as knowing the original creation time could be quite useful.

The same procedure would happen at the higher levels: each one would calculate a (potentially new) Entity.Validity so that the object will not be valid after it anticipates fetching fresh data. This calculated Entity.Validity would replace any that already exists in the supplied data.

Assuming you agree with this approach, there are some open questions:

1. should the agent (BDII) reset Entity.CreationTime (I'd say "no")?
2. should the agent (BDII) add an Entity.CreationTime if the source does not provide one?
3. what should the agent (BDII) do when faced with stale data? Should it simply log a warning, or should it reject the data?

Any thoughts? Cheers, Paul.
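The staleness test and the "re-stamping" step described above can be sketched as follows. This is only an illustration of the proposed processing model; the function names are mine, not part of GLUE or BDII:

```python
from datetime import datetime, timedelta, timezone

def is_stale(creation_time, validity, now=None):
    """An object is stale once CreationTime + Validity lies in the past."""
    now = now or datetime.now(timezone.utc)
    return creation_time + validity < now

def restamp_validity(creation_time, refresh_interval, now=None):
    """Recompute Validity so that CreationTime + Validity falls at the next
    anticipated refresh, leaving the original CreationTime untouched."""
    now = now or datetime.now(timezone.utc)
    return (now + refresh_interval) - creation_time
```

Note that `restamp_validity` deliberately keeps the original CreationTime: only the Validity changes as the object moves through each caching level.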

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar said:

Prima facie, this is easy: simply add Entity.Validity to Entity.CreationTime. If the resulting time is in the past then the object is stale.
Like many things, I think it needs to be driven by the use cases in a particular environment. In EGI the BDII system isn't capable of providing especially fresh data; there are several propagation steps, each of which can take several minutes. The EGI profile document suggests a 1-hour validity period for dynamic information, and that's what my info provider sets, as a compromise between allowing some tolerance for slow propagation and the desire to spot problems. The glue validator implements the checks defined in the profile: a warning if the creation stamp is more than 2 years old, an error if CreationTime+Validity is in the past, and a fatal error if CreationTime+10*Validity is in the past. There's an extra wrinkle in that the top BDII will cache data for a configurable time, which defaults to four days.
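The three profile checks Stephen lists can be expressed as a small classifier. This is a sketch of the logic only (the function name and return labels are mine, not the glue validator's actual interface):

```python
from datetime import datetime, timedelta, timezone

def validator_level(creation_time, validity, now=None):
    """Approximate the profile checks: WARNING for a very old creation
    stamp, ERROR once CreationTime + Validity has passed, and FATAL once
    CreationTime + 10*Validity has passed."""
    now = now or datetime.now(timezone.utc)
    if creation_time + 10 * validity < now:
        return "FATAL"
    if creation_time + validity < now:
        return "ERROR"
    if creation_time < now - timedelta(days=2 * 365):
        return "WARNING"
    return "OK"
```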
As the info-provider cannot know when the next request will come, it cannot set an Entity.Validity value.
I think that's the wrong way to look at it; the validity should be an estimate of how long the information can be reasonably trusted, irrespective of how the system updates. However, there would be little point in trying to publish information which is only reliable for a few seconds if the information system can't possibly transmit it that fast! Conversely you can obviously publish configuration information valid for weeks or months through a system which updates in minutes.
Note that it is also possible (i.e., logically self-consistent) for the BDII to reset the Entity.CreationTime to the current time. I suggest this isn't done as knowing the original creation time could be quite useful.
It would be incorrect, since it would not be the time at which the information was created.
The same procedure would happen for the higher levels: each one would calculate a (potentially new) Entity.Validity so that the object will not be valid after it anticipates fetching fresh data. This calculated Entity.Validity would replace any that already exists in the supplied data.
I think this is just the wrong idea. The BDII does in fact supply information like update times if you want it (I forget the exact query to do that), but that isn't part of GLUE. The GLUE attributes refer to the creation and validity of the information content, and are the same whatever technology is used - translate them to XML or JSON and the values would be the same. Stephen

On 25/03/2015 17:41, Paul Millar wrote:
2. should the agent (BDII) add an Entity.CreationTime if the source does not provide one? No; that would make the value useless because it would not be used consistently with those that set it.
3. what should the agent (BDII) do when faced with stale data? Should it simply log a warning or should it reject the data? It should publish it. Let the client figure out what to do with it.
I don't think 'Validity' is the right name, though. To solve your use case, what you need, IMHO, is a time estimate of when the next update of the data will appear (or should have appeared). A resource BDII which refreshes its data every 20 mins would put a creation date of now() and a nextUpdate field of now()+20 mins. CASTOR at RAL actually has two different creation times, because the tape accounting is updated once every 24 hrs and the disk data every 2 hrs. Some StorageAreas would combine the two, so it'd be hard to give a definite creation time and validity/nextUpdate. Cheers --jens
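The nextUpdate suggestion above amounts to stamping each record at publication time. A minimal sketch (the NextUpdate field is hypothetical; it is not part of GLUE 2):

```python
from datetime import datetime, timedelta, timezone

def stamp_record(refresh_interval, now=None):
    """Publish CreationTime = now() together with a (hypothetical,
    non-GLUE) NextUpdate = now() + the provider's refresh interval."""
    now = now or datetime.now(timezone.utc)
    return {"CreationTime": now, "NextUpdate": now + refresh_interval}
```

A consumer seeing a record whose NextUpdate is well in the past would know the refresh it was promised never arrived.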

Hi Paul,

Prologue: in the ARC information system model we always thought that having intermediate stages with a lot of information between the generation of information and its propagation is useless. Time drifts are not negligible. For this reason, everything that doesn't come from the source is not trusted in ARC. One must ask the resource level directly to be sure. This might have performance drawbacks, but it keeps the freshness of information consistent. Our own information index was based on these hypotheses; unfortunately it is not easy to maintain and was never used outside the NorduGrid Consortium.

Hence, for me as an ARC developer, but also for Stephen and Jens for other reasons, validity has only one meaning: the expected time when the actual provider that _generates_ the information will run again and update that information. The fact that you read it in a top-bdii doesn't change this, since the whole BDII system is inherently asynchronous. It was designed like that for performance and technology reasons.

If you want information about the time drift of hierarchical collection, then you should have additional fields in GLUE2 to represent the aggregation hierarchy and its steps. As I have stated many times, the GLUE2 model itself just models the information representation of a single entity/source, but does NOT discuss aggregation. The three-level BDII architecture, presented also in the LDAP realisation document, actually contains the first and only existing description of how to do such aggregation. Your arguments apply to information aggregation, which to me requires a completely different approach. This is the reason why in the EMI project we tried to develop the EMIR service as an alternative to the top-bdii, with an alternative architecture and a data model that was not limited to GLUE2 but included it. GLUE2 does not define aggregation strategies, nor do its fields take aggregation into account.
In the light of the above, please read my comments on the three questions below: On 2015-03-25 18:41, Paul Millar wrote:
Hi all,
I had a recent discussion with a site over the meaning of Entity.CreationTime and Entity.Validity.
I wanted to share my thoughts and see whether others have the same view and whether this might be worth documenting: either as part of an updated GLUE document or as some auxiliary document (a "GLUE Processing Model" perhaps).
The core question is this: how to know when published data is stale?
When the client decides it is -- it may vary depending on the information and the use cases.
[...] Assuming you agree with this approach, there are some open questions:
1. should the agent (BDII) reset Entity.CreationTime (I'd say "no"),
No. CreationTime refers to the record related to the entity described. Only the resource provider that creates the datastructure can know such time. BDII levels above resource just copy the data.
2. should the agent (BDII) add an Entity.CreationTime if the source does not provide one?
No. That makes no sense, for the same reason as above -- CreationTime refers to the record creation time for the described entity -- the record existed BEFORE an intermediate BDII level collected it. Adding such information would be faking it, as we don't know when it was created.
3. what should the agent (BDII) do when faced with stale data? Should it simply log a warning or should it reject the data?
Depends on the use cases and the nature of the data. Example: a ComputingEndpoint is supposedly more persistent than the job statistics contained in a ComputingShare. The former can generate a warning, because the endpoint MAY NOT be reachable; the latter MUST be rejected and retrieved anew, as it could lead to an incorrect brokering/selection of a resource if outdated. Handling stale data depends on the nature of the data itself, IMHO. Cheers, Florido -- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Location: Fysikum, Hus B, Rum B313 Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================

Hi all, Thanks for all your replies. Let me try to summarise them:

CreationTime:
1. this is the instant of time when the data represented in the object was collected.
2. It is unclear what value is to be used when the attributes come from multiple sources that are collected at different times.
3. In GLUE-based infrastructures that aggregate data from multiple sources, the aggregating agent must not update CreationTime.

The first point is pretty straightforward, but the second and third points are quite interesting. It is claimed that GLUE 2 is agnostic on aggregation. Quoting Florido: "the GLUE2 model [..] does NOT discuss aggregation".

Stephen, Jens and Florido were very clear on point 3, "shooting down" a logically self-consistent alternative interpretation of CreationTime, where a site- or top-level BDII adds a CreationTime if it is missing. Rejecting this interpretation is fine (I rejected it, too). However, the arguments for doing so were (IMHO) not well thought out.

Stephen: "It would be incorrect, since it would not be the time at which the information was created." -- a circular-argument fallacy: CreationTime is the time information was created.

Jens: "No; that would make the value useless because it would not be used consistently with those that set it." -- a straw-man fallacy: claiming inconsistency, but if GLUE 2 does not distinguish between resource- and other BDIIs then there is no inconsistency.

Florido: "No. CreationTime refers to the record related to the entity described. Only the resource provider that creates the datastructure can know such time. BDII levels above resource just copy the data." -- again, a straw-man fallacy, as it requires GLUE 2 to distinguish between resource- and higher-level BDIIs.

Again, let me state I'm happy with rejecting the idea of site- and top-level BDIIs adding CreationTime.
However, the difficulty in describing the exact semantics of CreationTime suggests (to me) that GLUE 2 _does_ include the concept of aggregation, at least because it has CreationTime with different processing models depending on whether the agent is a primary data source (e.g., a resource BDII) or an aggregating source (a site- or top-level BDII).

Validity:
4. Stephen: "an estimate of how long the information can be reasonably trusted, irrespective of how the system updates".
5. Jens: [I wasn't sure from his response]
6. Florido: "the expected time when the actual provider that _generates_ the information will run again and update that information."

So, Stephen and Florido seem to have opposite views of Validity. It seemed that Jens had a similar view to Stephen; at least, he suggested Florido's concept be published as a nextUpdate attribute rather than Validity.

My question to Stephen: different clients may tolerate different levels of error/uncertainty (is 1% "good enough"? how about 5%, 10%, or 20%?). Given that "reasonably trusted" depends on the client, how do we know what value is to be published?

My question to Florido: in ARC, how exactly is the Validity property set? Is it hard-coded in the code, configured manually by the admin, passed to the info-provider script, or overwritten by the cron/refresh job?

Just to add a little bit of "current usage", I did a quick survey using lcg-bdii.cern.ch. Some 13% (13865 of 104853) of the GLUE2 objects currently published have a Validity attribute. These objects have one of three values: ~0% (34 objects) have a Validity of 1 minute, 1% (1080 objects) have a Validity of 10 minutes, and 12% (12751 objects) have a Validity of 1 hour. So, to a good approximation, only one Validity value is set: 1 hour. This narrow distribution suggests that, when set, Validity is hard-coded to some value.
Here is a break-down of Validity by object type:

                          1min   10min   1hr
  AccessPolicy            [X]    [X]    [X]
  AdminDomain             [ ]    [X]    [ ]
  ApplicationEnvironment  [ ]    [ ]    [X]
  ComputingEndpoint       [X]    [X]    [X]
  ComputingManager        [X]    [X]    [ ]
  ComputingService        [X]    [X]    [ ]
  ComputingShare          [X]    [X]    [ ]
  Domain                  [ ]    [X]    [ ]
  Endpoint                [X]    [X]    [X]
  Entity                  [ ]    [ ]    [X]
  ExecutionEnvironment    [X]    [X]    [ ]
  Manager                 [X]    [X]    [ ]
  MappingPolicy           [X]    [X]    [ ]
  Policy                  [X]    [X]    [X]
  Resource                [X]    [X]    [ ]
  Service                 [X]    [X]    [X]
  Share                   [X]    [X]    [ ]
  StorageEndpoint         [ ]    [ ]    [X]

Cheers, Paul.
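For anyone wanting to repeat the survey, the tallying step is straightforward once you have an LDIF dump of the top-level BDII (the attribute name below is the one used in the GLUE2 LDAP rendering; the exact ldapsearch invocation is not shown):

```python
from collections import Counter

def tally_validity(ldif_text):
    """Count the distinct GLUE2EntityValidity values appearing in an
    LDIF dump, one attribute line per object that publishes it."""
    values = [line.split(":", 1)[1].strip()
              for line in ldif_text.splitlines()
              if line.startswith("GLUE2EntityValidity:")]
    return Counter(values)
```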

Hi Paul, Thanks for the nice summary. Comments inline. On 2015-04-20 12:38, Paul Millar wrote:
[...] Validity:
4. Stephen: "an estimate of how long the information can be reasonably trusted, irrespective of how the system updates".
5. Jens: [I wasn't sure from his response]
6. Florido: "the expected time when the actual provider that _generates_ the information will run again and update that information."
So, Stephen and Florido seem to have opposite views of Validity.
They might look like different views, but in practice it is the same thing. If you expect the data not to be valid, you will only know at the next update. The difference in ARC is that our clients always contact and check the resource, hence my answer. ARC clients do not care about time drifts in the bdii-family hierarchy -- what counts for us is which endpoint to contact to get the freshest information. Of course this cannot apply to software that does not accept direct requests at the resource level. But Stephen's definition is better IMHO, because I got tickets from admins requesting me to set that data. For example, a ComputingService can have a Validity bound to its average uptime: even if the machine is down, the information will still report its existence if cached somewhere.
It seemed that Jens had a similar view to Stephen, at least he suggested Florido's concept be published as a nextUpdate attribute rather than Validity.
My question to Stephen: different clients may tolerate different levels of error/uncertainty (is 1% "good enough"? how about 5%, 10%, or 20%?). Given that "reasonably trusted" depends on the client, how to know what value is to be published?
My question to Florido: in ARC, how exactly is Validity property set? Is it hard-coded in the code, configured manually by the admin, passed to the info-provider script, or overwritten by the cron/refresh job?
It is hard-coded at the moment, so you might as well think that Stephen's definition applies. We plan to change this in the future for ComputingActivities, but we have performance issues that we haven't managed to overcome, so it will stay like that until we find another solution. In short, "static" data will always have a fixed timeout that might even be configurable by the admin, but this is not implemented. Dynamic data might have a variable Validity that cannot be configured, as it depends on the object statuses.
Just to add a little bit of "current usage", I did a quick survey using lcg-bdii.cern.ch. Some 13% (13865 of 104853) GLUE2 objects currently published have a Validity attribute. These objects have one of three values: ~0% (34 objects) have Validity of 1 minute, 1% (1080 objects) have Validity of 10 minutes, and 12% (12751 objects) have Validity of 1 hour.
So, to a good approximation, only one Validity value is set: 1 hour. This narrow distribution suggests that, when set, Validity is hard-coded to some value.
It is the case for ARC. I can tell you I had to modify this value, because by the time the BDII picked it up, Validity had already expired. We never had this problem before because in our old information system, with only one hierarchy level, the information was so small that there was no time drift between creation and aggregation. If I ever had to redefine Validity with respect to aggregation, it should be the time the data is to be considered valid, taking into account any overhead/drift caused by aggregation. This is why I say GLUE2 does not speak about it; these values must have a different definition if aggregation is considered. This is also why Stephen's definition is the most correct. Cheers, Florido

Warren and I also recently discussed this topic as we're expanding XSEDE's GLUE2 implementation. Our two leading interpretations were 1) how soon the publisher expects to re-publish/refresh the information, and 2) how long before consumers should consider ignoring the information. 1) is generally tied to the refresh interval (plus some propagation fudge factor), and 2) might be tied to how long to keep using the information if there are transient failures or planned outages. So slow-changing information, such as ApplicationEnvironment, may have a long Validity, and we would want the information to be considered valid even during an extended outage, even though during normal operations it might be refreshed hourly. The lack of clarity in the spec leads me to believe that we should produce a best practice on setting Validity, based on an agreed set of use cases (what behaviors we want to be able to support based on Validity values). JP

Hi JP, On 20/04/15 15:45, JP Navarro wrote:
So slow-changing information, such as ApplicationEnvironment, may have a long Validity, and we would want the information to be considered valid even during an extended outage, even though during normal operations it might be refreshed hourly.
OK, but why is ApplicationEnvironment slow-changing? With cloud environments and technologies like VMCatcher, it's possible for sites to acquire new potential application environments very quickly and often. I'm still wondering whether there really are "intrinsic" Validity values, as Stephen seems to suggest.
The lack of clarity in the spec leads me to believe that we should produce a best practice on setting Validity, based on an agreed set of use cases (what behaviors we want to be able to support based on Validity values).
I think this is a good idea -- I mentioned this in a different scope: as a "processing model" document. In general, such a document would describe how information gets into GLUE-2 (something which seems to be lacking in GLUE-2). It would also describe CreationTime and Validity and what expectations exist (if any) on clients when processing the information. HTH, Paul.

Paul, It is slow-changing in our environment. I agree in principle that different infrastructures may need different Validity values for different entities. JP

Hi JP, On 20/04/15 20:40, JP Navarro wrote:
It is slow-changing in our environment. I agree in principle that different infrastructures may need different Validity values for different entities.
True, true. Perhaps not for TeraGrid, but it might be that different sites have different deployment policies; for example, one site could review installed software once a month, while another could respond to requests (from a particular user-community) for a new package within minutes. So it might be that the published Validity values for some class aren't constant across that infrastructure. Cheers, Paul.

Hi guys, I am late to the party, but would like to comment on 'validity'. On Mon, Apr 20, 2015 at 8:03 PM, Paul Millar <paul.millar@desy.de> wrote:
Hi JP,
I'm still wondering whether there really are "intrinsic" Validity values, as Stephen seems to suggests.
I actually think that information remains valid forever.

  Monday, April 15 2015, 8pm: number of free nodes: 2126

That information item was created on Monday -- but it is still true when I look at it on Tuesday (and thus is valid). Now, the *current* number of free nodes may have changed, of course, so the information may not be *useful* anymore -- but that is something different. Whether information is useful is in the eye of the beholder, not of the information provider. A post-mortem statistics gatherer will find it useful; an on-demand scheduler placing a job on Tuesday may not. My $0.02, Andre. -- 99 little bugs in the code. 99 little bugs in the code. Take one down, patch it around. 127 little bugs in the code...

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar said (sent 20 April 2015 11:38):

Stephen: "It would be incorrect, since it would not be the time at which the information was created." -- circular argument fallacy: CreationTime is the time information was created.
I don't see why it's circular. The BDII doesn't create anything; ergo, it should not change the creation time, any more than translating the data into XML should. I just quoted the timestamp in the snippet of mail above, written by you -- should I update it to the current time? No, because I haven't changed when you wrote your mail.
Florido: "No. CreationTime refers to the record related to the entity described. Only the resource provider that creates the datastructure can know such time. BDII levels above resource just copy the data." Again, straw-man fallacy as it requires GLUE 2 to distinguish between resource- and higher-level BDIIs.
I don't understand your point at all here. Let's try again: the creation time is the time the *information* represented in a GLUE object is created. That information may be copied, translated or stored in many different formats, none of which has anything to do with the information itself, i.e. the values of the various attributes.
However, the difficulty in describing the exact semantics of CreationTime suggests (to me) that GLUE 2 _does_ include the concept of aggregation, at least because it has CreationTime with different processing models depending on whether the agent is a primary data source (e.g., Resource BDII) or an aggregating source (Site- or Top- BDII).
The thing which sets the CreationTime is not the BDII at any level, it's the information provider. That is often run by a resource BDII, but not necessarily, e.g. some objects are created by YAIM scripts when a service is configured. All the BDIIs do is read some LDIF from somewhere, store it in an internal database and make it available for query, they don't create any of it.
My question to Stephen: different clients may tolerate different levels of error/uncertainty (is 1% "good enough"? how about 5%, 10%, or 20%?). Given that "reasonably trusted" depends on the client, how to know what value is to be published?
As far as I know we have no clients which make use of it. The only use I'm currently aware of is the glue validator, where the goal is to spot services which are stuck or otherwise faulty. If other use cases arise, people would have to look at the details to decide what to do -- bearing in mind the constraints of the system, e.g. that the top BDII can't manage freshness much better than an hour. I don't see that this is different from any other attribute: what you publish needs to be driven by the use cases. It wouldn't be especially difficult to publish a different Validity for each object type, or even for, e.g., different batch systems, but unless you have something to specify the use there's nothing to motivate such a varying choice.
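Publishing a per-object-type Validity, as suggested above, would amount to little more than a lookup table in the info provider. A sketch (the specific durations here are purely illustrative, not values anyone has agreed):

```python
from datetime import timedelta

# Hypothetical per-object-type Validity table; the actual values would
# have to be driven by the use cases, as argued above.
VALIDITY_BY_TYPE = {
    "ComputingShare": timedelta(minutes=10),      # fast-changing job statistics
    "ComputingEndpoint": timedelta(hours=1),
    "ApplicationEnvironment": timedelta(days=7),  # slow-changing configuration
}

def validity_for(object_type, default=timedelta(hours=1)):
    """Fall back to the de-facto 1-hour value when no type-specific
    Validity has been agreed."""
    return VALIDITY_BY_TYPE.get(object_type, default)
```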
So, to a good approximation, only one Validity value is set: 1 hour. This narrow distribution suggests that, when set, Validity is hard-coded to some value.
In my info provider it is indeed hard-coded to 1 hour - as I say it would be easy enough to change it, but there's no current demand. Stephen

Hi Stephen, On 20/04/15 18:35, stephen.burke@stfc.ac.uk wrote:
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar said (sent 20 April 2015 11:38):

Stephen: "It would be incorrect, since it would not be the time at which the information was created." -- circular argument fallacy: CreationTime is the time information was created.
I don't see why it's circular. The BDII doesn't create anything, ergo it should not change the creation time, any more than translating it into XML.
It's circular because you define CreationTime in terms of the time something is created. This is either circular argument or a semantically null sentence -- you choose :) The underlying problem is your definition uses the term "create", without defining what this means. How do I know when information is created: what is it like before? what is it like after? what has changed? In part, the problem comes because GLUE-2 is completely mum on all the machinery of maintaining information. There's no mention of information-providers. There's no mention of information being added, updated or removed. This may be as-intended, however, it makes defining CreationTime difficult. My usual exercise: try defining CreationTime without using the words "creation" and "time".
Florido: "No. CreationTime refers to the record related to the entity described. Only the resource provider that creates the datastructure can know such time. BDII levels above resource just copy the data." Again, straw-man fallacy as it requires GLUE 2 to distinguish between resource- and higher-level BDIIs.
I don't understand your point at all here. Let's try again: the creation time is the time the *information* represented in a GLUE object is created.
Again, a circular (or semantically null) definition: you're defining CreationTime using the phrase "the time [something] is created". Put another way, the concept of 'information being created' is too loose a term: it could mean almost anything, so it defines nothing.
That information may be copied, translated or stored in many different formats, none of which has anything to do with the information itself, i.e. the values of the various attributes.
How do you distinguish between information being copied from some other BDII and being copied from an info-provider? In fact, these are basically the same. BDII even treats them the same: they are just two potential sources of information. The only distinction between being a resource-, site- or top-level BDII is where it fetches its information.
However, the difficulty in describing the exact semantics of CreationTime suggests (to me) that GLUE 2 _does_ include the concept of aggregation, at least because it has CreationTime with different processing models depending on whether the agent is a primary data source (e.g., Resource BDII) or an aggregating source (Site- or Top- BDII).
The thing which sets the CreationTime is not the BDII at any level, it's the information provider.
"BDII" and "information provider" are not defined in GLUE 2 (except in Appendix A). Therefore, they cannot contribute towards the definition of CreationTime. In case it isn't obvious, I agree that CreationTime should be set by the info-provider and not modified by any BDII. What is interesting is that (apparently) one cannot describe this desired behaviour rigorously in GLUE-2, despite everyone agreeing that this is the desired behaviour and that the other behaviour is plain wrong. To me, this points to a deficiency in GLUE 2. [...]
As far as I know we have no clients which make use of it. The only use I'm currently aware of is the glue validator, where the goal is to spot services which are stuck or otherwise faulty.
Incidentally, I've noticed some old objects hanging around in lcg-bdii.cern.ch, but only because I started publishing CreationTime and was checking published values. Do you know if the glue validator is being run against production top-level BDII instances?
If other use cases arise people would have to look at the details to decide what to do - bearing in mind the constraints of the system, e.g. that the top BDII can't manage freshness of much better than an hour.
One hour! Why doesn't someone fix this?
I don't see that this is different to any attribute - what you publish needs to be driven by the use cases. It wouldn't be especially difficult to publish a different Validity for each object type, or even for e.g. different batch systems, but unless you have something to specify the use there's nothing to motivate such a varying choice.
My use-case was what you might expect: allowing detection of a particular failure mode. Specifically, the information publishing "got stuck" at one site. The details don't matter, but the result was old ("stale") data continued to be re-published. What I'd like is for that to be detectable; even if that detection doesn't come out-of-the-box. GLUE-2 seems to support this, with CreationTime & Validity. However, the devil's in the detail, and it seems Validity cannot be used like this, without hard-coding some arbitrary numbers.
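The detection I have in mind is simple enough to sketch (illustrative Python only; the attribute names follow GLUE 2, and parsing real published LDIF is left out):

```python
from datetime import datetime, timedelta, timezone

def is_stale(creation_time, validity_seconds, now=None):
    """An object is stale once CreationTime + Validity lies in the past."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now > creation_time + timedelta(seconds=validity_seconds)

# Data created 90 minutes ago, published with the hard-coded 1-hour Validity:
created = datetime.now(timezone.utc) - timedelta(minutes=90)
print(is_stale(created, 3600))  # -> True: older than its validity window
```

The catch, of course, is the hard-coded 3600: the check is only as meaningful as the Validity value that was published.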
In my info provider it is indeed hard-coded to 1 hour - as I say it would be easy enough to change it, but there's no current demand.
OK, but why 1 hour and not 1 minute or 1 day? Cheers, Paul.

Paul Millar [mailto:paul.millar@desy.de] said:
It's circular because you define CreationTime in terms of the time something is created. This is either circular argument or a semantically null sentence -- you choose :)
I don't know if you're being deliberately obtuse - *all* the attribute names are supposed to be descriptive of what they mean, if they aren't the name is poorly chosen. Would you be happier if the attribute were called RabbitFood and I defined it as a creation time?
This may be as-intended, however, it makes defining CreationTime difficult.
No it doesn't. I find this discussion pointless and I don't intend to keep repeating the same things.
How do you distinguish between information being copied from some other BDII and being copied from an info-provider?
I don't, and the BDII doesn't. Either way, if the LDIF contains a CreationTime attribute the BDII stores it along with everything else; if it doesn't the BDII doesn't add it.
Do you know if the glue validator is being run against production top-level BDII instances?
Yes, it's run as part of the site Nagios tests, and sites get tickets for things marked as ERROR, except that known middleware bugs are masked. I'm not sure offhand if that includes this issue - as Florido says, the ARC values were short enough that they always failed the test so it may still be masked.
One hour! Why doesn't someone fix this?
It's actually more like 30 minutes, and it's pretty much intrinsic to the BDII architecture. There have been various attempts to design a new information system but none have come to fruition.
In my info provider it is indeed hard-coded to 1 hour - as I say it would be easy enough to change it, but there's no current demand.
OK, but why 1 hour and not 1 minute or 1 day?
As I said, there's no point in having it much shorter than an hour because the system can't update that fast. For most dynamic information 1 day would be unrealistically long because the dynamic state of most services can change quite a bit faster, e.g. services often go from up to down to up within a day. Ideally we'd have a more responsive information system and one which treated different kinds of information differently - i.e. fast-changing information like running job counts would be updated every few minutes or less, while slowly changing objects would update infrequently. In that case the Validity could be set according to the realistic lifetime of the information and the information system could use it as a guide to when it should refresh. However that isn't what we have at the moment. Stephen
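As a sketch of that ideal (the object names and lifetimes here are hypothetical; nothing like this exists in the current BDII), a consumer could derive its next refresh time from the earliest expiry among the objects it holds:

```python
from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)

# Hypothetical records: (name, CreationTime, Validity in seconds)
objects = [
    ("RunningJobs", now, 120),    # fast-changing: refresh within minutes
    ("Endpoint",    now, 3600),   # slower-moving service state
    ("AdminDomain", now, 86400),  # nearly static
]

def next_refresh(objs):
    """Refresh just before the first object would otherwise go stale."""
    return min(ct + timedelta(seconds=v) for _, ct, v in objs)

print(next_refresh(objects))  # the RunningJobs expiry, two minutes away
```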

Hi Stephen, On 20/04/15 21:14, stephen.burke@stfc.ac.uk wrote:
I don't know if you're being deliberately obtuse - *all* the attribute names are supposed to be descriptive of what they mean,
*sigh* I'm really not trying to be deliberately obtuse. I'm trying to illustrate some (apparently) difficult concept, and failing to do so.
if they aren't the name is poorly chosen. Would you be happier if the attribute were called RabbitFood and I defined it as a creation time?
To some extent yes --- when writing a description I consider the name as if written in a foreign language. This forces me not to be "lazy" and to write a description that stands on its own. I also try hard to write the description without using the words contained in the name; this helps me avoid the trap of making assumptions about the reader's understanding of those words. From your replies, you appear to have an internal definition of a CreationTime that is to yourself clear, self-obvious and almost axiomatic. Unfortunately, you cannot seem to express that idea in the terms defined within GLUE-2. My point is that the language of GLUE-2 seems to prevent such descriptions: it assumes some kind of steady state, without describing how information is updated. If so, this makes defining CreationTime impossible without introducing a new concept, such as the info-provider. [...]
Do you know if the glue validator is being run against production top-level BDII instances?
Yes, it's run as part of the site Nagios tests, and sites get tickets for things marked as ERROR,
Excellent.
except that known middleware bugs are masked.
I actually discovered this morning that bugs are also being hidden: https://its.cern.ch/jira/browse/GRIDINFO-58 This feels very wrong! The validator should expose bugs, not hide them. How else are sites going to fix these bugs?
I'm not sure offhand if that includes this issue - as Florido says, the ARC values were short enough that they always failed the test so it may still be masked.
It would be good if we could check this: I think there's a bug in BDII where stale data is not being flushed. If the validator is hiding bugs, and the policy is to do so whenever bugs are found, then it is useless.
One hour! Why doesn't someone fix this?
It's actually more like 30 minutes, and it's pretty much intrinsic to the BDII architecture.
AFAIK, there's no intrinsic reason why there should be anything beyond a 2--3 minute delay: the time taken to fetch the updated information from a site-level BDII. Where's the bug-report for this?
There have been various attempts to design a new information system but none have come to fruition.
Yeah, typical grid middleware response: rewrite the software rather than fix a bug.
OK, but why 1 hour and not 1 minute or 1 day?
As I said, there's no point in having it much shorter than an hour because the system can't update that fast.
OK, but again, this is bad. Rather than fixing a bug, a work-around is introduced.
For most dynamic information 1 day would be unrealistically long because the dynamic state of most services can change quite a bit faster, e.g. services often go from up to down to up within a day.
Personally, I'm still not convinced there's some intrinsic period describing how long an object should stay valid. For example, if an Endpoint is no longer available, that information should propagate quickly. It doesn't matter that the endpoint has been available for the past 6 months, or that endpoints are generally stable for many days.
Ideally we'd have a more responsive information system
Absolutely! 30 minutes delay is ridiculous.
and one which treated different kinds of information differently - i.e. fast-changing information like running job counts would be updated every few minutes or less, while slowly changing objects would update infrequently.
That's merely an optimisation, which might prove useful if we can reasonably label such data. I'm still not convinced that objects can be labelled as rapidly or slowly updating, so I'm not convinced by this optimisation either.
In that case the Validity could be set according to the realistic lifetime of the information and the information system could use it as a guide to when it should refresh. However that isn't what we have at the moment.
True. I think the immediate focus should be fixing top-level BDIIs so they provide reasonably up-to-date information. Cheers, Paul.

Paul Millar [mailto:paul.millar@desy.de] said:
From your replies, you appear to have an internal definition of a CreationTime that is to yourself clear, self-obvious and almost axiomatic. Unfortunately, you cannot seem to express that idea in the terms defined within GLUE-2.
OK, let's have one more try. The concept which you seem to think is missing is "entity instance". That may not be explicitly defined but it's a general computing concept, and I find it hard to see that you could make much sense of the schema without it. The schema defines entities as collections of attributes with types and definitions; an instance of that entity has specific values for the attributes. One of those attributes is CreationTime. Instances are created in a way completely unspecified by the schema document, but whatever the method the CreationTime is the time at which that creation occurs (necessarily approximate since creation will take a finite time). If a new instance is created it gets a new CreationTime even if all the other attributes happen to be the same. However, if an instance is copied the copy preserves *all* the attribute values including CreationTime - if you change that it's a new instance and not a copy.
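That distinction can be sketched directly (illustrative only; an entity instance is modelled here as a plain dict):

```python
from datetime import datetime, timezone
import copy

def create_instance(attributes):
    """A *new* instance gets a fresh CreationTime, even if every other
    attribute value happens to equal that of an existing instance."""
    instance = dict(attributes)
    instance["CreationTime"] = datetime.now(timezone.utc)
    return instance

def copy_instance(instance):
    """A *copy* preserves all attribute values, CreationTime included;
    changing CreationTime would make it a new instance, not a copy."""
    return copy.deepcopy(instance)

original = create_instance({"ID": "svc1", "Validity": 3600})
duplicate = copy_instance(original)   # what an aggregator should do
print(duplicate["CreationTime"] == original["CreationTime"])  # -> True
```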
The validator should expose bugs, not hide them. How else are sites going to fix these bugs.
The point is that sites can't fix middleware bugs, and hence shouldn't get tickets for them. If tickets were raised for errors which would always occur and can't be fixed until a new middleware release is available the validator would have been rejected - sites must be able to clear alarms in a reasonably short time. That's also why only ERRORs generate alarms - ERRORs are always wrong, WARNINGs may be correct so a site may be unable to remove them. Of course, the validator can still be run outside the Nagios framework without the known issues mask.
It would be good if we could check this: I think there's a bug in BDII where stale data is not being flushed.
Maria has been on maternity leave for several months, so all this has been on hold. I think she should be back fairly soon, but no doubt it will take a while to catch up. A couple of years ago there was a bug where old data wasn't being deleted, but it should be out of the system by now. Also bear in mind that top BDIIs can cache data for up to four days.
If the validator is hiding bugs, and the policy is to do so whenever bugs are found, then it is useless.
The policy is to submit a ticket to the middleware developers and keep track of it. There's no point in repeatedly finding the same bug.
AFAIK, there's no intrinsic reason why there should be anything beyond a 2--3 minute delay: the time taken to fetch the updated information from a site-level BDII.
The top BDII has to fetch information from several hundred site BDIIs and the total data volume is large. It takes several minutes to do that. And site BDIIs themselves have to collect information from the resource BDIIs at the site. Back in 2012 Laurence did some tests to see if the top BDII could scale to read from the resource BDIIs directly, but the answer was no, it can cope with O(1000) sources but not O(10000). Also the resource BDII runs on the service and loads it to some extent so it can't update too often - a particular issue for the CE, which is the service with the fastest-changing data.
Yeah, typical grid middleware response: rewrite the software rather than fix a bug.
I could say that your response is typical: criticism without understanding. As far as I'm concerned this correspondence is closed. I've said what I have to say, if you don't understand it I don't propose to make any further attempts to explain, especially since you seem to be resorting to abuse rather than argument. Stephen

Hi Stephen, First, I must apologise if you felt my emails were in any way abusive --- they were certainly not intended that way; rather, I would like the effort we have all invested in GLUE and the grid infrastructure be used properly. Currently, I see different groups developing their own information systems, running in parallel with GLUE+BDII, because of problems (both perceived and actual) with BDII. I would like these problems addressed and find the very slow progress frustrating. Onto the specific points... On 21/04/15 13:00, stephen.burke@stfc.ac.uk wrote:
Paul Millar [mailto:paul.millar@desy.de] said:
From your replies, you appear to have an internal definition of a CreationTime that is to yourself clear, self-obvious and almost axiomatic. Unfortunately, you cannot seem to express that idea in the terms defined within GLUE-2.
OK, let's have one more try. The concept which you seem to think is missing is "entity instance". That may not be explicitly defined but it's a general computing concept, and I find it hard to see that you could make much sense of the schema without it. The schema defines entities as collections of attributes with types and definitions; an instance of that entity has specific values for the attributes. One of those attributes is CreationTime. Instances are created in a way completely unspecified by the schema document, but whatever the method the CreationTime is the time at which that creation occurs (necessarily approximate since creation will take a finite time). If a new instance is created it gets a new CreationTime even if all the other attributes happen to be the same. However, if an instance is copied the copy preserves *all* the attribute values including CreationTime - if you change that it's a new instance and not a copy.
Thanks, that makes sense. Just to confirm: you define two general mechanisms through which data is acquired: creating an entity instance and copying an entity instance. In concrete terms, the resource-level BDII plus info-provider creates entity instances, while site- and top-level BDIIs copy entity instances. This breaks the symmetry, allowing CreationTime to be set only at resource-level BDIIs. Perhaps such a description is trivial or "well known", but it seems to me that GLUE-2, when used in a hierarchy (like the WLCG info system), would benefit from such a description. This could go in GLUE-2 itself, or perhaps in a hierarchy profile document.
The validator should expose bugs, not hide them. How else are sites going to fix these bugs.
The point is that sites can't fix middleware bugs [..]
What you say is correct. I would also say that only sites can deploy the bug-fixes.
and hence shouldn't get tickets for them. If tickets were raised for errors which would always occur and can't be fixed until a new middleware release is available the validator would have been rejected - sites must be able to clear alarms in a reasonably short time. That's also why only ERRORs generate alarms - ERRORs are always wrong, WARNINGs may be correct so a site may be unable to remove them. Of course, the validator can still be run outside the Nagios framework without the known issues mask.
Yes, it's always a bit fiddly dealing with a new test where the production instance currently fails.
It would be good if we could check this: I think there's a bug in BDII where stale data is not being flushed.
Maria has been on maternity leave for several months, so all this has been on hold. I think she should be back fairly soon, but no doubt it will take a while to catch up. A couple of years ago there was a bug where old data wasn't being deleted, but it should be out of the system by now. Also bear in mind that top BDIIs can cache data for up to four days.
Sure, I knew Maria was away; but I was hoping there would be someone covering for her, and that the process wasn't based on her heroic efforts alone.
If the validator is hiding bugs, and the policy is to do so whenever bugs are found, then it is useless.
The policy is to submit a ticket to the middleware developers and keep track of it. There's no point in repeatedly finding the same bug.
Yes, that is certainly a sound policy.
AFAIK, there's no intrinsic reason why there should be anything beyond a 2--3 minute delay: the time taken to fetch the updated information from a site-level BDII.
The top BDII has to fetch information from several hundred site BDIIs and the total data volume is large. It takes several minutes to do that. And site BDIIs themselves have to collect information from the resource BDIIs at the site. Back in 2012 Laurence did some tests to see if the top BDII could scale to read from the resource BDIIs directly, but the answer was no, it can cope with O(1000) sources but not O(10000). Also the resource BDII runs on the service and loads it to some extent so it can't update too often - a particular issue for the CE, which is the service with the fastest-changing data.
I'm not sure I agree here. First, the site-level BDII should cache information from resource-level BDIIs, just as resource-level BDIIs cache information from info-providers. This means that load from top-level BDIIs is only experienced by site-level BDIIs. Taking a complete (top-level) dump only takes a few seconds:

paul@celebrimbor:~$ /usr/bin/time -f %e ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue > /dev/null
4.49
paul@celebrimbor:~$ /usr/bin/time -f %e ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=grid > /dev/null
5.15

Let's say it takes about 10--15 seconds in total. A top-level BDII updates via this same process (invoking the ldapsearch command). Assuming the process is bandwidth-limited, it should also take ~10--15 seconds, as the total amount of information sent over the network should be about the same. (Note that this doesn't take into account TCP slow-start, so it may be a slight underestimate, but see below for why I don't believe this is a real problem.)

Now let's assume instead that the problem isn't bandwidth-limited, but that the update frequency is limited by the latency of the individual requests to site-level BDIIs. I surveyed the currently registered site-level BDIIs:

for url in $(ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue $(ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue '(GLUE2ServiceType=bdii_site)' GLUE2ServiceID|perl -p00e 's/\n //'|awk 'BEGIN{printf "(|"}/^GLUE2ServiceID/{printf "(GLUE2EndpointServiceForeignKey="$2")"}END{print ")"}') GLUE2EndpointURL|perl -p00e 's/\n //g' | sed -n 's%^GLUE2EndpointURL: \(ldap://[^:]*:[0-9]*/\).*%\1%p'); do /usr/bin/time -a -o times.dat -f %e ldapsearch -LLL -x -H $url -o nettimeout=30 -b o=glue > /dev/null; done

This query covered some 318 sites. The ldapsearch command failed for 5 endpoints and the query timed out for 3 endpoints. Of the remaining 310 sites, the maximum time for ldapsearch to complete was about 19.21 seconds and the median was 0.44 seconds. For 82% of sites, ldapsearch completed within a second; for 92% it completed within two seconds. Repeating this for GLUE-1.3 showed similar statistics.

This suggests to me that information from responsive sites could be maintained with a lag of order 10 seconds to a minute; information from sites with badly performing site-level BDIIs would be updated less often. I haven't investigated injecting this information: BDII now generates an LDIF diff which is injected into slapd. This is distinct from the original approach, which employed a "double-buffer" with two slapd instances. Still, I currently don't see why a top-level BDII must lag by some 30 minutes.
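For what it's worth, the summary statistics above come from something like the following (a sketch; in practice the values are read from times.dat, one elapsed time per line as written by /usr/bin/time -f %e):

```python
def summarise(times):
    """Max, median, and fraction of samples within 1s / 2s."""
    ts = sorted(times)
    n = len(ts)
    return {
        "max": ts[-1],
        "median": ts[n // 2],
        "within_1s": sum(t <= 1.0 for t in ts) / n,
        "within_2s": sum(t <= 2.0 for t in ts) / n,
    }

# In practice: summarise(float(line) for line in open("times.dat"))
sample = [0.2, 0.3, 0.44, 0.9, 1.5, 19.21]
print(summarise(sample))
```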
Yeah, typical grid middleware response: rewrite the software rather than fix a bug.
I could say that your response is typical: criticism without understanding.
Perhaps, but I have reviewed the BDII code-base in the past and I know roughly how it works. My simple investigation suggests maintaining a top-level BDII with sub-minute latencies is possible with at least 80--90% of site-level BDIIs. Of course I may be missing something here, but it certainly seems feasible to achieve much better than is currently being done. Cheers, Paul.

Hi Paul, Without replying point by point to the benchmarking you've done, which was nice work, I kindly suggest you not benchmark a technology you maybe don't understand completely. As usual, theoretically everything is fine, but not in practice :( . Your claims are all true: LDAP is very fast at answering queries; this is why we use it... and this is not why BDII is "slow". Most of the time spent by BDII goes on restructuring the LDAP tree. LDAP indexing is a tree-structured index backed by a key-value Berkeley DB. That means that when aggregating data, all the data must be re-indexed ("rewriting the DN" in LDAP slang) to fit into the tree. It is also this tree structure, plus the simplicity of a key-value DB, that allows LDAP to perform queries so fast. Unfortunately all of this comes at a cost. Updating the DB requires the following steps (I didn't look into the code recently, but I roughly remember this):
1) ldap-query the sources (negligible time, as you discovered)
2) rebuild the new tree(s), generating new LDIF document(s) (very time consuming; includes re-keying ALL objects)
3) check differences between the rebuilt tree(s) and the existing database entries
4) modify existing entries that have changed (one ldap-modify for each object)
5) remove objects that are not there anymore (ldap-delete)
6) ldap-add new objects -- which boils down to ldap-adding a whole new LDIF document (that is, the entire DB) in most cases due to -- guess what -- CreationTime and Validity, which are always changing!!! :D
As you can see, you benchmarked only the tip of the iceberg. Laurence or Maria can correct me if the above is not true; I don't know the code that well, but I had to look into it during EMI times. Over the years Laurence has managed to shorten this update time with several smart ideas, including enterprise-level techniques like replication, and probably partial LDIF documents where applicable.
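The diff-and-update part of that procedure amounts to a set difference between the rebuilt tree and the live entries; roughly (entries modelled as a dict of DN -> attributes; the DNs and values are illustrative only):

```python
def diff_trees(current, rebuilt):
    """Compute the add / modify / delete operations needed to bring the
    live database (current) in line with the freshly rebuilt tree."""
    adds    = {dn: e for dn, e in rebuilt.items() if dn not in current}
    deletes = [dn for dn in current if dn not in rebuilt]
    mods    = {dn: e for dn, e in rebuilt.items()
               if dn in current and current[dn] != e}
    return adds, mods, deletes

# Toy example: a changed CreationTime forces a modify of that entry,
# which is the point about always-changing attributes defeating the diff.
old = {"GLUE2ServiceID=a": {"CreationTime": "t0"},
       "GLUE2ServiceID=b": {"Type": "bdii_site"}}
new = {"GLUE2ServiceID=a": {"CreationTime": "t1"},
       "GLUE2ServiceID=c": {"Type": "bdii_site"}}

adds, mods, deletes = diff_trees(old, new)
print(sorted(adds), sorted(mods), deletes)  # c added, a modified, b deleted
```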
I think in this way he avoided having two LDAP servers. You have to understand that LDAP technology is intended for data that changes rarely, and we're using it for an almost real-time system. One more hint that it is a poor fit as a monitoring tool... Trust me, 30 mins is a great achievement for a technology that was never meant to do what we use it for. I might have several arguments against the BDII code, but not about its performance. The problem we're facing in ARC while trying to move to other technologies is that query times for LDAP are faster than the other technologies we investigated (e.g. REST web services). Update times are horrible, but it seems people find it more important to have fast queries than fresh information... And I can say this because in ARC we also put jobs in the LDAP database, which is EXTREME for today's numbers (i.e. O(10000) jobs). It's nice(?) that these numbers match those that Stephen mentioned. Cheers, Florido On 2015-04-21 20:07, Paul Millar wrote:
Hi Stephen,
First, I must apologise if you felt my emails were in any way abusive --- they were certainly not intended that way; rather, I would like the effort we have all invested in GLUE and the grid infrastructure be used properly.
Currently, I see different groups developing their own information systems, running in parallel with GLUE+BDII, because of problems (both perceived and actual) with BDII. I would like these problems addressed and find the very slow progress frustrating.
Onto the specific points...
On 21/04/15 13:00, stephen.burke@stfc.ac.uk wrote:
Paul Millar [mailto:paul.millar@desy.de] said:
From your replies, you appear to have an internal definition of a CreationTime that is to yourself clear, self-obvious and almost axiomatic. Unfortunately, you cannot seem to express that idea in the>> terms defined within GLUE-2.
OK, let's have one more try. The concept which you seem to think is missing is "entity instance". That may not be explicitly defined but it's a general computing concept, and I find it hard to see that you could make much sense of the schema without it. The schema defines entities as collections of attributes with types and definitions; an instance of that entity has specific values for the attributes. One of those attributes is CreationTime. Instances are created in a way completely unspecified by the schema document, but whatever the method the CreationTime is the time at which that creation occurs (necessarily approximate since creation will take a finite time). If a new instance is created it gets a new CreationTime even if all the other attributes happen to be the same. However, if an instance is copied the copy preserves *all* the attribute values including CreationTime - if you change that it's a new instance and not a copy.
Thanks, that makes sense.
Just to confirm: you define two general mechanisms through which data is acquired: creating an entity instance and copying an entity instance.
In concrete terms, resource-level BDII+info-provider creates entity instances while site- and top- level BDIIs copy entity instances. This breaks the symmetry, allowing CreationTime to operate only on resource-level BDIIs.
Perhaps such a description is trivial or "well known", but it seems to me that GLUE-2 when used in a hierarchy (like the WLCG info system) would benefit from such a description. This could go in GLUE-2 itself, or perhaps in a hierarchy profile document.
The validator should expose bugs, not hide them. How else are sites going to fix these bugs.
The point is that sites can't fix middleware bugs [..]
What you say is correct. I would also say that only sites can deploy the bug-fixes.
and hence shouldn't get tickets for them. If tickets were raised for errors which would always occur and can't be fixed until a new middleware release is available the validator would have been rejected - sites must be able to clear alarms in a reasonably short time. That's also why only ERRORs generate alarms - ERRORs are always wrong, WARNINGs may be correct so a site may be unable to remove them. Of course, the validator can still be run outside the Nagios framework without the known issues mask.
Yes, it's always a bit fiddly dealing with a new test where the production instance currently fails.
It would be good if we could check this: I think there's a bug in BDII where stale data is not being flushed.
Maria has been on maternity leave for several months, so all this has been on hold. I think she should be back fairly soon, but no doubt it will take a while to catch up. A couple of years ago there was a bug where old data wasn't being deleted, but it should be out of the system by now. Also bear in mind that top BDIIs can cache data for up to four days.
Sure, I knew Maria was away; but I was hoping there would be someone covering for her, and that the process wasn't based on her heroic efforts alone.
If the validator is hiding bugs, and the policy is to do so whenever bugs are found, then it is useless.
The policy is to submit a ticket to the middleware developers and keep track of it. There's no point in repeatedly finding the same bug.
Yes, that is certainly a sound policy.
AFAIK, there's no intrinsic reason why there should be anything beyond a 2--3 minute delay: the time taken to fetch the updated information from a site-level BDII.
The top BDII has to fetch information from several hundred site BDIIs and the total data volume is large. It takes several minutes to do that. And site BDIIs themselves have to collect information from the resource BDIIs at the site. Back in 2012 Laurence did some tests to see if the top BDII could scale to read from the resource BDIIs directly, but the answer was no, it can cope with O(1000) sources but not O(10000). Also the resource BDII runs on the service and loads it to some extent so it can't update too often - a particular issue for the CE, which is the service with the fastest-changing data.
I'm not sure I agree here.
First, the site-level BDII should cache information from resource-level BDIIs, as resource-level BDIIs cache information from info-providers. This means that load from top-level BDIIs is only experienced by site-level BDIIs.
Taking a complete (top-level) dump only takes a few seconds.
paul@celebrimbor:~$ /usr/bin/time -f %e ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue > /dev/null
4.49
paul@celebrimbor:~$ /usr/bin/time -f %e ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=grid > /dev/null
5.15
Let's say it takes about 10--15 seconds in total.
A top-level BDII updates by this same process (invoking ldapsearch commands). Assuming the process is bandwidth limited, an update should also take ~10--15 seconds, as the total amount of information sent over the network should be about the same. (Note that this doesn't take into account TCP slow-start, so it may be a slight underestimate, but see below for why I don't believe this is a real problem.)
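As a back-of-the-envelope sketch of that argument (the dump size here is an assumed round number, not a measurement):

```shell
# Hypothetical figures: an assumed full-dump size, and the wall time
# measured with the ldapsearch commands above.
size_mb=100   # assumed total size of a top-level dump, in MB
secs=10       # approximate wall time for a full fetch, in seconds

# If the update is bandwidth limited, the effective throughput is:
echo "$(( size_mb / secs )) MB/s effective throughput"
# → 10 MB/s effective throughput
```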
Let's assume instead that the problem isn't bandwidth limited, but that the update frequency is limited by the latency of the individual requests to site-level BDIIs.
I surveyed the currently registered site-level BDIIs:
for url in $(ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue \
        $(ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue \
                '(GLUE2ServiceType=bdii_site)' GLUE2ServiceID \
            | perl -p00e 's/\n //' \
            | awk 'BEGIN{printf "(|"}/^GLUE2ServiceID/{printf "(GLUE2EndpointServiceForeignKey="$2")"}END{print ")"}') \
        GLUE2EndpointURL \
    | perl -p00e 's/\n //g' \
    | sed -n 's%^GLUE2EndpointURL: \(ldap://[^:]*:[0-9]*/\).*%\1%p'); do
    /usr/bin/time -a -o times.dat -f %e ldapsearch -LLL -x -H $url -o nettimeout=30 -b o=glue > /dev/null
done
This query covered some 318 sites. The ldapsearch command failed for 5 endpoints and the query timed out for 3 endpoints.
Of the remaining 310 sites, the maximum time for ldapsearch to complete was 19.21 seconds and the median was 0.44 seconds. For 82% of sites, ldapsearch completed within a second; for 92%, within two seconds.
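For what it's worth, percentile figures like these can be computed from times.dat with a short sort/awk pipeline; here is a sketch over made-up sample timings (the filename matches the loop above, but the numbers are illustrative, not the real survey data):

```shell
# Hypothetical sample of per-site timings, one wall-clock time per
# line, in the format '/usr/bin/time -a -o times.dat -f %e' produces
# (the real survey had ~310 lines).
printf '%s\n' 0.44 0.21 1.30 0.87 0.35 2.10 0.52 19.21 0.60 > times.dat

# Median, and the fraction of sites completing within one/two seconds.
sort -n times.dat | awk '
    { t[NR] = $1; if ($1 <= 1) a++; if ($1 <= 2) b++ }
    END {
        m = (NR % 2) ? t[(NR+1)/2] : (t[NR/2] + t[NR/2+1]) / 2
        printf "median=%.2f within1s=%.0f%% within2s=%.0f%%\n", m, 100*a/NR, 100*b/NR
    }'
# → median=0.60 within1s=67% within2s=78%
```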
Repeating this for GLUE-1.3 showed similar statistics.
This suggests to me that information from responsive sites could be maintained with a lag on the order of 10 seconds to a minute; information from sites with badly performing site-level BDIIs would be updated less often.
I haven't investigated injecting this information: BDII now generates an LDIF diff which is injected into slapd. This is distinct from the original approach, which employed a "double-buffer" with two slapd instances.
Still, I currently don't see why a top-level BDII must lag by some 30 minutes.
Yeah, typical grid middleware response: rewrite the software rather than fix a bug.
I could say that your response is typical: criticism without understanding.
Perhaps, but I have reviewed the BDII code-base in the past and I know roughly how it works.
My simple investigation suggests maintaining a top-level BDII with sub-minute latencies is possible with at least 80--90% of site-level BDIIs.
Of course I may be missing something here, but it certainly seems feasible to achieve much better than is currently being done.
Cheers,
Paul.
--
==================================================
Florido Paganelli
ARC Middleware Developer - NorduGrid Collaboration
System Administrator
Lund University
Department of Physics
Division of Particle Physics
BOX 118, 221 00 Lund
Office Location: Fysikum, Hus B, Rum B313
Office Tel: 046-2220272
Email: florido.paganelli@REMOVE_THIShep.lu.se
Homepage: http://www.hep.lu.se/staff/paganelli
==================================================

Hi Florido, all,
[...] 5) ldap-add new objects -- which boils down to ldap-adding a whole new LDIF document (that is, the entire DB) in most of the cases due to -- guess what -- CreationTime and Validity which are always changing!!! :D
If that were true, we would not have abandoned the old method: populate a fresh DB in the background, switch when it is ready.

Hi Maarten, On 2015-04-24 01:09, Maarten.Litmaath@cern.ch wrote:
Hi Florido, all,
[...] 5) ldap-add new objects -- which boils down to ldap-adding a whole new LDIF document (that is, the entire DB) in most of the cases due to -- guess what -- CreationTime and Validity which are always changing!!! :D
If that were true, we would not have abandoned the old method: populate a fresh DB in the background, switch when it is ready.
Good to know, then! I cannot tell how much is ldap-added; I speak from what I've seen running a top-BDII myself, testing the data I produce with ARC, and for sure all our records end up being added again due to the timestamp change. But you guys did the benchmarks, you know better, and you don't aggregate only ARC stuff, so I blindly trust you. As I said, I am not a BDII developer! As I said already, there are a lot of very smart ideas in how BDII works wrt performance. Cheers, Florido

Hi Florido, Sorry for the delay in replying. On 22/04/15 11:45, Florido Paganelli wrote: [..]
As you can see above, you just benchmarked the tip of the iceberg.
Agreed; although it's nice to see that many site-level BDIIs are very fast, while a very small fraction are (perhaps disproportionately) slow. This could be influencing the latency, as the top-level BDII waits for all site-level BDIIs to finish sending before generating the LDIF and updating the slapd. [...]
Over the years Laurence managed to shorten this update time with several smart ideas, including enterprise-level techniques like replication, and probably partial LDIF documents where applicable. I think in this way he avoided having two LDAP servers.
That is my understanding too: Laurence has improved the situation greatly over the years.
You have to understand that LDAP technology is intended for data that changes rarely, and we're using it for an almost real-time system. One more hint that it is a bad monitoring tool...
Trust me, 30 minutes is a great achievement for a technology that was never meant to do what we use it for. I might have several arguments against the BDII code, but not about its performance.
I think we have to agree to differ slightly on this point. I certainly agree with you that the OpenLDAP software is designed for read performance rather than update (or write) performance, with the consequence that it doesn't work so well for all the use-cases we might want.

I haven't benchmarked the complete BDII update life-cycle, especially not for lcg-bdii.cern.ch, the top-level BDII that I often use as a reference point. I don't know how much time is spent preparing the LDIF, compared to the time OpenLDAP takes to process the changes. If generating the LDIF takes the majority of the time then perhaps this could be improved: currently this is done with perl and python; perhaps alternative technologies would be faster?

However, I still feel that there's nothing intrinsic in what we want to achieve that prevents us from using LDAP the network protocol (i.e., not necessarily OpenLDAP) to achieve much reduced latencies.
The problem we're facing with ARC while trying to move to other technologies is that query times for LDAP are faster than for the other technologies we investigated (e.g. REST web services). Update times are horrible, but it seems it is more important for people to have fast queries than fresh information... And I can say this because in ARC we also put jobs in the LDAP database, which is EXTREME for today's numbers (i.e. O(10000) jobs). It's nice(?) that these numbers match those that Stephen mentioned.
Yes, I guess so. It's living on the "bleeding edge" that triggers innovation, right? Cheers, Paul.

Hi all, I also have the feeling the discussion is becoming a bit sterile. We can make the GLUE2 spec better, but I hardly understand how Paul's definitions, which avoid the actual terms we want to define, could help. I might sound harsh in places, but that is not my intention; I am trying to get to the definition Paul would like to have. I apologise if the tone sounds bad, I really don't mean it to! A few comments below: On 2015-04-20 19:46, Paul Millar wrote:
[...]
Florido: "No. CreationTime refers to the record related to the entity described. Only the resource provider that creates the datastructure can know such time. BDII levels above resource just copy the data." Again, a straw-man fallacy, as it requires GLUE 2 to distinguish between resource- and higher-level BDIIs.
I don't understand your point at all here. Let's try again: the creation time is the time the *information* represented in a GLUE object is created.
Again, a circular (or semantically null) definition: you're defining CreationTime using the phrase "the time [something] is created".
Put another way, the concept of 'information being created' is too loose a term: it could mean almost anything, so defines nothing.
Well, this is a rhetorical game and not a scientific discussion anymore, IMHO. I understand you want a definition independent of the practical implementation, and since you seem to like riddles, I will avoid the words creation and time (at this point a mere exercise in wording). Here it is:

The CreationTime is the number of seconds elapsed since the Epoch (00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970), formatted as described in the GLUE2 document, when BOTH of these are true: 1) the GLUE2 record for a GLUE2 entity is being generated; 2) the data contained in the record, that is, the data that describes the entity the record refers to, is being collected.

I see no fallacy nor circularity. It's a definition. It does NOT require knowledge of provider, resource- or whatever-BDII. Of course, if you want to be really picky there is a time drift between 1) and 2), because a Turing machine is sequential. But we can avoid this discussion, I hope...

I can provide a similar definition for Validity if you like... but I will shift to Stephen's suggestion that this is community-driven; it's not because of the model, it's because what is "Valid" is community-driven, and from experience I can tell it will be, even if you try to define it otherwise! Maybe the only real outcome of this discussion is Jens' comment that 'Validity' was a bad name! :D
That information may be copied, translated or stored in many different formats, none of which has anything to do with the information itself, i.e. the values of the various attributes.
How do you distinguish between information being copied from some other BDII and being copied from an info-provider?
In fact, these are basically the same. BDII even treats them the same: they are just two potential sources of information.
The only distinction between being a resource-, site- or top-level BDII is where it fetches its information.
However, the difficulty in describing the exact semantics of CreationTime suggests (to me) that GLUE 2 _does_ include the concept of aggregation, at least because it has CreationTime with different processing models depending on whether the agent is a primary data source (e.g., Resource BDII) or an aggregating source (Site- or Top- BDII).
The thing which sets the CreationTime is not the BDII at any level, it's the information provider.
"BDII" and "information provider" are not defined in GLUE 2 (except in Appendix A). Therefore they cannot contribute towards the definition of CreationTime.
In case it isn't obvious, I agree that CreationTime should be set by the info-provider and not modified by any BDII.
What is interesting is that (apparently) one cannot describe this desired behaviour rigorously in GLUE-2, despite everyone agreeing that this is the desired behaviour and that the other behaviour is plain wrong.
To me, this points to a deficiency in GLUE 2.
[...]
I do not see the need to describe it in the model. One describes that in an implementation of a hierarchical information system (today only BDII, and maybe EMIR, which nobody uses). Otherwise we need a model that takes into account hierarchical propagation of information (as mentioned before, an aggregation model). But to me, having the above in the GLUE2 model sounds as if physicists should describe the Standard Model in terms of the pieces of paper, emails, research papers, people and historical events needed to describe the physics in it...
[...] I don't see that this is different to any attribute - what you publish needs to be driven by the use cases. It wouldn't be especially difficult to publish a different Validity for each object type, or even for e.g. different batch systems, but unless you have something to specify the use there's nothing to motivate such a varying choice.
My use-case was what you might expect: allowing detection of a particular failure mode. Specifically, the information publishing "got stuck" at one site. The details don't matter, but the result was old ("stale") data continued to be re-published.
In ARC, we decided a long time ago that the information system should NOT be used as a monitor for the information system itself. Anyone who does that does so at their own risk; the reason is that the information system is more like a business card. It presents services to users. It might fake some of the information to please users' or communities' needs, or to hide faults in the system in a way that keeps the overall system working (and this is what actually happens!)

Using the information system as a monitoring tool requires a different approach: the information system itself must be able to self-diagnose. Apart from the philosophical question of whether this is even possible, for ARC this is difficult because the information system is part of, and triggered by, other parts of the middleware: if the middleware dies, the infosys dies with it. This is not for GLUE2 to define, and it is not part of most current architectures; to me it indicates that proper monitoring should be done with third-party tools. As a matter of fact, that claim applies to most software.

So if you want to know whether the information publishing "got stuck", you'd better be a good sysadmin and use a decent process monitoring tool, be it Nagios or a simple cronjob that sends emails... Cheers, Florido

Hi Florido, Thanks for your reply; my comments below. On 21/04/15 11:53, Florido Paganelli wrote:
I also have the feeling the discussion is becoming a bit sterile. We can make the GLUE2 spec better, but I hardly understand how Paul's definitions, which avoid the actual terms we want to define, could help.
Sorry, it was meant only as an aide towards writing good descriptions. It's certainly not a requirement.
On 2015-04-20 19:46, Paul Millar wrote:
Put another way, the concept of 'information being created' is too loose a term: it could mean almost anything, so defines nothing.
Well, this is a rhetorical game and not a scientific discussion anymore, IMHO. I understand you want a definition independent of the practical implementation, and since you seem to like riddles, I will avoid the words creation and time (at this point a mere exercise in wording). Here it is:
The CreationTime is the number of seconds elapsed since the Epoch (00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970) formatted as described in the GLUE2 document when BOTH these two are true: 1) the GLUE2 record for a GLUE2 entity is being generated 2) the data contained in the record, that is, the data that describes the entity the record refers to, is being collected.
Great, thanks for taking the time to define this.
I see no fallacy nor circularity. It's a definition. It does NOT require the knowledge of provider, resource- whatever-BDII
Yes, absolutely.
Of course, if you want to be really picky there is a time drift between 1) and 2) because a Turing machine is sequential. But we can avoid this discussion I hope...
Certainly, despite evidence to the contrary, I don't want to nitpick. Now, I believe your definition also applies to a site-level BDII. When it refreshes information, it generates a new record and populates this with information it collects from the resource-level BDII. Conditions 1) and 2) are satisfied, so the site-level BDII may set CreationTime. There's a (translational?) symmetry between a site-level BDII fetching information from resource-level BDIIs, and a resource-level BDII fetching information from info-providers. Having said that, the problem only appears in hierarchical systems, like BDII. So, perhaps having a hierarchical profile document would be a better way of solving this.
I can provide a similar definition for Validity if you like... but I will shift to Stephen's suggestion that this is community-driven; it's not because of the model, it's because what is "Valid" is community-driven, and from experience I can tell it will be, even if you try to define it otherwise!
I guess it's unclear to me what should happen if CreationTime+Validity is in the past. From what others have said, it seems we make no claims about what this means; the client must decide. My naïve thinking was that, if information is updated periodically and CreationTime+Validity is in the past, then the data should be considered "stale", as it should have been updated by now.
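To make that naïve reading concrete, here's a sketch of the check a client might apply. It assumes GNU date; the attribute names GLUE2EntityCreationTime/GLUE2EntityValidity and the endpoint URL in the comment are my assumptions about the LDAP rendering, shown only for illustration:

```shell
# Staleness under the semantics discussed here:
# an object is stale when now() > CreationTime + Validity.

is_stale() {  # is_stale <creation-time-ISO8601> <validity-seconds>
    local created now
    created=$(date -u -d "$1" +%s)      # GNU date
    now=$(date -u +%s)
    [ $(( now - created )) -gt "$2" ]   # exit 0 => stale
}

# In practice the two values would come from a query like:
#   ldapsearch -LLL -x -H ldap://site-bdii.example.org:2170 -b o=glue \
#       '(objectClass=GLUE2Service)' GLUE2EntityCreationTime GLUE2EntityValidity

if is_stale "2015-04-01T12:00:00Z" 3600; then
    echo stale
else
    echo fresh
fi
# → stale (that creation time is long past)
```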
Maybe the only real outcome of this discussion is Jens' comment that 'Validity' was a bad name! :D
Yeah, I think that's true! [..]
To me, this points to a deficiency in GLUE 2.
I do not see the need to describe it in the model. One describes that in an implementation of a hierarchical information system (today only BDII, and maybe EMIR, which nobody uses)
Otherwise we need a model that takes into account hierarchical propagation of information (as mentioned before, an aggregation model)
But to me, having the above in the GLUE2 model sounds as if physicists should describe the Standard Model in terms of the pieces of paper, emails, research papers, people and historical events needed to describe the physics in it...
:-D OK, perhaps this could be in a separate document (a profile?) that describes a hierarchical GLUE system? That could refine concepts, like CreationTime, describe how aggregation happens, etc. This would avoid "polluting" GLUE-2 base document with these hierarchy-specific issues.
[...] I don't see that this is different to any attribute - what you publish needs to be driven by the use cases. It wouldn't be especially difficult to publish a different Validity for each object type, or even for e.g. different batch systems, but unless you have something to specify the use there's nothing to motivate such a varying choice.
My use-case was what you might expect: allowing detection of a particular failure mode. Specifically, the information publishing "got stuck" at one site. The details don't matter, but the result was old ("stale") data continued to be re-published.
In ARC, we decided a long time ago that the information system should NOT be used as a monitor for the information system itself. Anyone who does that does so at their own risk; the reason is that the information system is more like a business card. It presents services to users. It might fake some of the information to please users' or communities' needs, or to hide faults in the system in a way that keeps the overall system working (and this is what actually happens!)
Using the information system as a monitoring tool requires a different approach: the information system itself must be able to self-diagnose. Apart from the philosophical question of whether this is even possible, for ARC this is difficult because the information system is part of, and triggered by, other parts of the middleware: if the middleware dies, the infosys dies with it. This is not for GLUE2 to define, and it is not part of most current architectures; to me it indicates that proper monitoring should be done with third-party tools. As a matter of fact, that claim applies to most software.
So if you want to know whether the information publishing "got stuck", you'd better be a good sysadmin and use a decent process monitoring tool, be it Nagios or a simple cronjob that sends emails...
As with all things: hindsight is 20/20, and failure modes often find the gaps in monitoring. In this particular case, the "mechanical" refresh process was working correctly, with the site-level BDII fetching data correctly. Direct monitoring of the BDII/LDAP object creation time (the built-in 'createTimestamp' attribute) would not have revealed any problem. Publishing CreationTime and Validity (with the semantics of now()>CreationTime+Validity => problem) would have allowed a script to detect the problem. This isn't to say this is the only way of achieving this, nor that it is necessarily the best way; however, it did seem to fit with the idea of CreationTime and Validity. Publishing just the CreationTime allows a script to detect the problem, provided it happens to know the refresh period. Although this is less ideal, it's probably the best I can do, given everyone else feels Validity has a different meaning. Cheers, Paul.
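P.S. A sketch of such a "CreationTime plus known refresh period" check, assuming bash and GNU date; the 120-second refresh period and the endpoint URL are illustrative assumptions, and 'createTimestamp' values use LDAP GeneralizedTime (YYYYMMDDHHMMSSZ):

```shell
REFRESH=120   # refresh period known out-of-band, not from GLUE

stale_by_timestamp() {  # stale_by_timestamp <YYYYMMDDHHMMSSZ>
    local ts=$1 created
    # Parse LDAP GeneralizedTime into an epoch value (GNU date, bash).
    created=$(date -u -d "${ts:0:8} ${ts:8:2}:${ts:10:2}:${ts:12:2}" +%s)
    # Allow one missed refresh before declaring publishing "stuck".
    [ $(( $(date -u +%s) - created )) -gt $(( 2 * REFRESH )) ]
}

# The timestamp itself would come from something like:
#   ldapsearch -LLL -x -H ldap://site-bdii.example.org:2170 -b o=glue \
#       -s base '(objectClass=*)' createTimestamp

stale_by_timestamp "20150401120000Z" && echo "publishing looks stuck"
# → publishing looks stuck
```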

Hi Paul, Ahaha, OK OK, touché :D I should not have changed "generated" to "collected"; the usual last-minute changes... let me do it and see if you like it... On 2015-04-21 17:17, Paul Millar wrote:
[...]
The CreationTime is the number of seconds elapsed since the Epoch (00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970) formatted as described in the GLUE2 document when BOTH these two are true: 1) the GLUE2 record for a GLUE2 entity is being generated 2) the data contained in the record, that is, the data that describes the entity the record refers to, is being collected.
Great, thanks for taking the time to define this.
We might replace the word "collected" in 2) with "generated", as I did in 1), but I'm not sure that will solve the issue... you will reuse the argument that BDII actually generates records, and I will have to say it's technically true but not theoretically. Argument below.
I see no fallacy nor circularity. It's a definition. It does NOT require the knowledge of provider, resource- whatever-BDII
Yes, absolutely.
Of course, if you want to be really picky there is a time drift between 1) and 2) because a Turing machine is sequential. But we can avoid this discussion I hope...
Certainly, despite evidence to the contrary, I don't want to nitpick.
Now, I believe your definition also applies to a site-level BDII. When it refreshes information, it generates a new record and populates this with information it collects from the resource-level BDII. Conditions 1) and 2) are satisfied, so the site-level BDII may set CreationTime.
IF and ONLY IF site- and top-level BDIIs generate a new record. But AFAIK, and from what Stephen said, _they don't_. Or rather, the site level just generates a site (AdminDomain) record; otherwise it just copies records from a source and puts them in another database untouched. It does change the key, though, but these are implementation technicalities of LDAP's own hierarchical organization: it just generates new indexes to re-index the data. These are even external to GLUE2, as a matter of fact; that was part of the LDAP implementation work. Formally, the record from the resource level is _copied_, not generated. But we're already describing how to handle aggregation in a hierarchical world here... we're already outside GLUE2, IMHO.
There's a (translational?) symmetry between a site-level BDII fetching information from resource-level BDIIs, and a resource-level BDII fetching information from info-providers.
Not really: info-providers can be seen as generating information from the objects GLUE2 describes; the other components do not generate it. The difference is that an info-provider inspects the object it models; BDII collectors don't. They just repeat the information.
[...]
So if you want to know whether the information publishing "got stuck", you'd better be a good sysadmin and use a decent process monitoring tool, be it Nagios or a simple cronjob that sends emails...
As with all things: hindsight is 20/20, and failure modes often find the gaps in monitoring.
In this particular case, the "mechanical" refresh process was working correctly, with the site-level BDII fetching data correctly. Direct monitoring of BDII/LDAP object creation time (the built-in 'createTimestamp' attribute) would not have revealed any problem.
Publishing CreationTime and Validity (with the semantics of now()>CreationTime+Validity => problem) would have allowed a script to detect the problem.
Well, we go back to the concept that you cannot monitor using a site BDII because it only *copies* data, *by design*. It does NOT generate it. To my understanding you're saying that we MIGHT use the hierarchical structure of the BDII and the values of CreationTime for this purpose, but in my opinion this is wrong, by design as well. Let me explain why.

My claim is that you're monitoring on the wrong side. As a matter of fact, this will not even tell you what is happening, just that "something bad is happening somewhere, where I copied the data from". In principle it does not even tell you where the data is from... unless you look at the site-BDII config files :( where the actual URLs are. But that is expected; BDII is not a monitoring infrastructure, and GLUE2 was not designed for monitoring... maybe you're right, it could be used for that, but we would need to write another piece of software that does it, and maybe a MonitoringService entity...
This isn't to say this is the only way of achieving this, nor that it is necessarily the best way; however, it did seem to fit with the idea of CreationTime and Validity.
Not to me. As I said, hierarchical representation and aggregation of GLUE2 data was never discussed; that's why I assume that intermediate levels should not change the data passed up by the levels below, but only add records where appropriate (i.e. at the site level). It might be used as you say if we define a hierarchical approach to aggregating multiple GLUE2 documents.
Publishing just the CreationTime allows a script to detect the problem, provided it happens to know the refresh period. Although this is less ideal, it's probably the best I can do, given everyone else feels Validity has a different meaning.
Cheers,
Paul.
Cheers, Florido
participants (7)
- Andre Merzky
- Florido Paganelli
- Jens Jensen
- JP Navarro
- Maarten.Litmaath@cern.ch
- Paul Millar
- stephen.burke@stfc.ac.uk