LDAP rendering document: new version as an outcome of Lund review

Dear All,
I've just uploaded a new version of the "GLUE v. 2.0 – Reference Realization to LDAP Schema" LDAP rendering draft to the glue2 gridforge area. The new version contains comments and tracks all the changes we made in the document. Please find the files here:
- Word with all the changes tracked: https://forge.ogf.org/sf/go/doc15518?nav=1
- clean PDF: https://forge.ogf.org/sf/docman/do/downloadDocument/projects.glue-wg/docman....
- PDF with all the changes tracked: https://forge.ogf.org/sf/docman/do/downloadDocument/projects.glue-wg/docman....
During the last weeks (months) the NorduGrid/ARC team in Lund carried out a thorough review and major cleanup of the LDAP rendering document. Basically we took the document and checked it against our own and other LDAP implementations. The LDAP rendering draft was created a long time ago and has not been touched since 07/01/2010; in many places it has become obsolete. Furthermore, back when the LDAP rendering discussion took place there was only one LDAP implementation (the glite-bdii); unfortunately ARC was busy with the XML GLUE2 rendering part and had no possibility to check or follow the LDAP area. Likewise, the LDAP team did not follow the XML rendering discussions, although there is quite some similarity between the two data models. Now that ARC implements both an LDAP and an XML rendering (I think we are the only ones), we thought it was time to review and update the LDAP rendering draft.
Here are some of the items we modified or ran into (everything is tracked in the new version!):
- The old document contained a proposed DIT that was incomplete and not followed by any of the actual implementations. We almost completely rewrote the section on the DIT, introduced three-level information structuring and provided three detailed pictures that correspond to the actual implementation apart from minor proposed changes.
- While defining the proposed DIT we tried to keep it in sync with the XML rendering; this was most visible in the selection of the grouping elements.
- Corrected the datatypes to match the current schema used by EMI.
- Made a comment on the usage of structural vs. auxiliary types. The current limited usage of structural types is questionable.
- Made a comment on the strange and (for us) unjustified choice of the LDAP attribute names selected to form DNs.
- Made a note on the unfortunate choice of the GLUE2GroupID attribute, which is not an ID.
- Followed the RFC4512 terminology (e.g. renamed LDAP objects to LDAP entries).
- To be consistent with the XML and SQL rendering documents, changed "implementation" to "realization" all over the text.
- Made a note that the OID allocation mechanism used is not extensible when it comes to adding attributes to an entry. Furthermore, the choice is strange, it is not applied consistently and its benefits are unclear.
Florido will attend the OGF Glue2 session this Sunday and is preparing a short presentation about our review of the LDAP rendering draft, including open questions and proposed changes.
regards, Balazs Konya and Florido Paganelli

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Balazs Konya said: I've just uploaded a new version of the "GLUE v. 2.0 – Reference Realization to LDAP Schema" ldap rendering draft to the glue2 gridforge area. The uploaded new version contains comments and tracks all the changes we made in the document.
I haven't yet read the new document in detail, but as the author of the previous version I have a few general comments. Firstly, I don't think it's accurate to describe this as a revision of the existing document; it seems to be substantially a new document describing a significantly different rendering. I would personally prefer to separate textual and minor technical comments on the existing text from substantive changes to the structure.
Secondly, the history of this is that the LDAP rendering was discussed in this mailing list and in open meetings in mid-2009, the resulting LDAP schema was then implemented and we've spent nearly three years deploying it in production in EGEE and EGI. Given the practicalities of deployment I think it's essentially impossible to make any backward-incompatible changes at this stage, and even backward-compatible changes would probably take several years to propagate across the whole Grid, so regardless of how the document changes the major existing implementation is basically fixed.
The previous document was intended to correspond to what was actually implemented but may not do so in every respect - for example it seems that we failed to change string types from IA5String to DirectoryString in the deployed schema. I think that's unfortunate but we will in practice have to live with it as a restriction (as we did for GLUE 1) - in that case we could change the schema since it would be backward-compatible, but it would be a long time before non-ASCII characters could be used reliably.
It is of course possible to have variant implementations, and to some extent variants can be covered by a single document, e.g. it was explicitly intended to allow the DIT to be flexible. However, at some point if it's really desired to have renderings which differ in major ways I think it would be better to have separate documents.
Stephen

Hia, On Thu, Jun 14, 2012 at 5:33 PM, <stephen.burke@stfc.ac.uk> wrote:
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Balazs Konya said: I've just uploaded a new version of the "GLUE v. 2.0 – Reference Realization to LDAP Schema" ldap rendering draft to the glue2 gridforge area. The uploaded new version contains comments and tracks all the changes we made in the document.
I haven't yet read the new document in detail, but as the author of the previous version I have a few general comments. Firstly, I don't think it's accurate to describe this as a revision of the existing document; it seems to be substantially a new document describing a significantly different rendering. I would personally prefer to separate textual and minor technical comments on the existing text from substantive changes to the structure.
FWIW, significant normative changes will most likely result in a new GFD number for the document, to avoid confusion over what implementation is implementing what specification. My $0.02, Andre.
Secondly, the history of this is that the LDAP rendering was discussed in this mailing list and in open meetings in mid-2009, the resulting LDAP schema was then implemented and we've spent nearly three years deploying it in production in EGEE and EGI. Given the practicalities of deployment I think it's essentially impossible to make any backward-incompatible changes at this stage, and even backward-compatible changes would probably take several years to propagate across the whole Grid, so regardless of how the document changes the major existing implementation is basically fixed.
The previous document was intended to correspond to what was actually implemented but may not do so in every respect - for example it seems that we failed to change string types from IA5String to DirectoryString in the deployed schema. I think that's unfortunate but we will in practice have to live with it as a restriction (as we did for GLUE 1) - in that case we could change the schema since it would be backward-compatible, but it would be a long time before non-ASCII characters could be used reliably.
It is of course possible to have variant implementations, and to some extent variants can be covered by a single document, e.g. it was explicitly intended to allow the DIT to be flexible. However, at some point if it's really desired to have renderings which differ in major ways I think it would be better to have separate documents.
Stephen _______________________________________________ glue-wg mailing list glue-wg@ogf.org https://www.ogf.org/mailman/listinfo/glue-wg
-- Nothing is ever easy...

On 06/14/2012 05:33 PM, stephen.burke@stfc.ac.uk wrote:
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Balazs Konya said: I've just uploaded a new version of the "GLUE v. 2.0 – Reference Realization to LDAP Schema" ldap rendering draft to the glue2 gridforge area. The uploaded new version contains comments and tracks all the changes we made in the document.
I haven't yet read the new document in detail, but as the author of the previous version I have a few general comments. Firstly, I don't think it's accurate to describe this as a revision of the existing document; it seems to be substantially a new document describing a significantly different rendering. I would personally prefer to separate textual and minor technical comments on the existing text from substantive changes to the structure.
Secondly, the history of this is that the LDAP rendering was discussed in this mailing list and in open meetings in mid-2009, the resulting LDAP schema was then implemented and we've spent nearly three years deploying it in production in EGEE and EGI. Given the practicalities of deployment I think it's essentially impossible to make any backward-incompatible changes at this stage, and even backward-compatible changes would probably take several years to propagate across the whole Grid, so regardless of how the document changes the major existing implementation is basically fixed.
With all due respect, you should just read the document, and you'll notice that the spirit is really to sync it with the current implementations as well as possible, correcting deviations that might make an implementation incapable of using the unique features of the underlying technology, or improving (mostly human) readability. It is completely absurd to me to mimic relational database behaviour in a years-old hierarchical database system such as LDAP; it's just overkill. I believe there is no real backward-incompatible change that cannot be fixed in a simple minor update; the logic is there, we're speaking about renaming of groups, which is in line with the previous version of the document. I clearly remember an exchange of email where you said the structure of the tree is not important. Did you change your mind? I think it _is_ important in a hierarchical database...
The previous document was intended to correspond to what was actually implemented but may not do so in every respect - for example it seems that we failed to change string types from IA5String to DirectoryString in the deployed schema. I think that's unfortunate but we will in practice have to live with it as a restriction (as we did for GLUE 1) - in that case we could change the schema since it would be backward-compatible, but it would be a long time before non-ASCII characters could be used reliably.
But why is that? I didn't understand the problem there, I was not part of the process at that time, sorry
It is of course possible to have variant implementations, and to some extent variants can be covered by a single document, e.g. it was explicitly intended to allow the DIT to be flexible. However, at some point if it's really desired to have renderings which differ in major ways I think it would be better to have separate documents.
We kept the DIT flexible, but if one wants to achieve performance then one must exploit the benefits of the underlying technology, IMHO. That's the idea behind suggesting a pre-defined tree structure. But check it: there is no big difference from the existing implementations! I'll try to explain better during the meeting. Cheers -- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Florido Paganelli said: With all due respect, you should just read the document, and you'll notice that the spirit is really to sync it with the current implementations as well as possible, correcting deviations that might make an implementation incapable of using the unique features of the underlying technology, or improving (mostly human) readability.
I'm not clear how you expect to proceed from here. For me the correct way, and the way we've always done this in the past, is for you to prepare a list of proposed changes which we can discuss one by one and record decisions as we go. If we try to base a discussion on a completely re-written document I think the discussion will go nowhere useful, we will start debating it one word at a time! Also, in the meantime you have uploaded the new document over the top of the existing one so anyone looking for documentation on the LDAP rendering is likely to assume that is now the agreed specification, which it certainly isn't. For the specific question of the string type I think this needs wider discussion as it potentially affects all renderings. In any case it's a real problem and not simply a matter of changing the text; at the moment I don't find it obvious what the best solution will be (see below).
I believe there is no real backward incompatible change that cannot be fixed in a simple minor update; the logic is there, we're speaking about renaming of groups, which is in line with the previous version of the documents. I clearly remember an exchange of email where you said the structure of the tree is not important. Did you change your mind?
I will try to say it more clearly. When I say that the tree isn't important I'm referring to queries made by clients; they should query using objectclass and attribute filters and ignore the DN. However, the deployed BDII infrastructure does of course build the DNs in a particular way, and that implementation is for the time being essentially fixed - there are BDIIs and information providers spread over hundreds of sites and thousands of services, and the typical time to propagate a change to all of them is of the order of 3 years. Hence any changes must be made in a gradual, backward-compatible way and can't be relied on for quite a long time. In terms of the document, my view is that it should specify something which is consistent with the current BDII implementation, but on the other hand should not constrain any implementation more than necessary - we may well want to evolve the existing infrastructure in the medium term, and of course other providers, e.g. Nordugrid, may want to produce their own implementation. I don't think there is any necessity to specify the DIT since clients should not need to rely on it. If you do regard it as necessary to specify aspects of the DIT I would like to see the specific use cases which you think require it.
The previous document was intended to correspond to what was actually implemented but may not do so in every respect - for example it seems that we failed to change string types from IA5String to DirectoryString in the deployed schema. I think that's unfortunate but we will in practice have to live with it as a restriction (as we did for GLUE 1) - in that case we could change the schema since it would be backward-compatible, but it would be a long time before non-ASCII characters could be used reliably.
But why is that? I didn't understand the problem there, I was not part of the process at that time, sorry
I'm not sure what you're asking, but the basic issue is that the schema doesn't specify what characters are allowed in strings. In GLUE 1 we typed strings using the IA5String type which is basically 7-bit ASCII; most of the time that's OK but every so often someone gets tripped up when they try to include other characters; the most recent case I helped to debug was only a few weeks ago. The initial definition of the GLUE 2 LDAP schema also used IA5String, but we then discussed this issue and agreed to switch to DirectoryString which allows UTF-8 strings. However it seems that the schema was never updated to reflect that decision. I'm a bit surprised that I never noticed that, but that's where we are. The simplest solution would be to make a global declaration, for all implementations in all technologies, that only ASCII characters are allowed, but that may not be the best way. We could make it implementation-defined, but then we should probably make some statement about interoperability. Stephen

On 06/16/2012 06:15 PM, stephen.burke@stfc.ac.uk wrote:
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Florido Paganelli said: With all due respect, you should just read the document, and you'll notice that the spirit is really to sync it with the current implementations as well as possible, correcting deviations that might make an implementation incapable of using the unique features of the underlying technology, or improving (mostly human) readability.
I'm not clear how you expect to proceed from here. For me the correct way, and the way we've always done this in the past, is for you to prepare a list of proposed changes which we can discuss one by one and record decisions as we go. If we try to base a discussion on a completely re-written document I think the discussion will go nowhere useful, we will start debating it one word at a time!
I'll try to present it this way in my presentation. I will clearly state what is to be discussed.
Also, in the meantime you have uploaded the new document over the top of the existing one so anyone looking for documentation on the LDAP rendering is likely to assume that is now the agreed specification, which it certainly isn't.
the document clearly says draft, and we had no other traceable way of sharing with the rest of the group, I don't know how to do this better!
For the specific question of the string type I think this needs wider discussion as it potentially affects all renderings. In any case it's a real problem and not simply a matter of changing the text; at the moment I don't find it obvious what the best solution will be (see below).
I believe there is no real backward incompatible change that cannot be fixed in a simple minor update; the logic is there, we're speaking about renaming of groups, which is in line with the previous version of the documents. I clearly remember an exchange of email where you said the structure of the tree is not important. Did you change your mind?
I will try to say it more clearly. When I say that the tree isn't important I'm referring to queries made by clients; they should query using objectclass and attribute filters and ignore the DN. However, the deployed BDII infrastructure does of course build the DNs in a particular way, and that implementation is for the time being essentially fixed
So we have a recommendation draft that nobody actually followed and you want us to comment on that? I don't understand. We must make it usable for implementers, the previous version was not, I think the new one is clearer.
- there are BDIIs and information providers spread over hundreds of sites and thousands of services, and the typical time to propagate a change to all of them is of the order of 3 years. Hence any changes must be made in a gradual, backward-compatible way and can't be relied on for quite a long time.
Then we will come to an agreement and a roadmap; I don't see anything bad in that.
In terms of the document, my view is that it should specify something which is consistent with the current BDII implementation, but on the other hand should not constrain any implementation more than necessary - we may well want to evolve the existing infrastructure in the medium term, and of course other providers, e.g. Nordugrid, may want to produce their own implementation.
I am already trying to fix that to fit in the bdii-top, and so far I've been successful; I wouldn't be so pessimistic here. And again, you are contradicting yourself if you say there is no mandatory DIT but the actual infrastructure is based on a specific DIT! We should then make the infrastructure independent of the DIT. But my view is that this is overkill on LDAP.
I don't think there is any necessity to specify the DIT since clients should not need to rely on it. If you do regard it as necessary to specify aspects of the DIT I would like to see the specific use cases which you think require it.
but you just said current clients rely on it!
The previous document was intended to correspond to what was actually implemented but may not do so in every respect - for example it seems that we failed to change string types from IA5String to DirectoryString in the deployed schema. I think that's unfortunate but we will in practice have to live with it as a restriction (as we did for GLUE 1) - in that case we could change the schema since it would be backward-compatible, but it would be a long time before non-ASCII characters could be used reliably.
But why is that? I didn't understand the problem there, I was not part of the process at that time, sorry
I'm not sure what you're asking, but the basic issue is that the schema doesn't specify what characters are allowed in strings. In GLUE 1 we typed strings using the IA5String type which is basically 7-bit ASCII; most of the time that's OK but every so often someone gets tripped up when they try to include other characters; the most recent case I helped to debug was only a few weeks ago. The initial definition of the GLUE 2 LDAP schema also used IA5String, but we then discussed this issue and agreed to switch to DirectoryString which allows UTF-8 strings. However it seems that the schema was never updated to reflect that decision. I'm a bit surprised that I never noticed that, but that's where we are. The simplest solution would be to make a global declaration, for all implementations in all technologies, that only ASCII characters are allowed, but that may not be the best way. We could make it implementation-defined, but then we should probably make some statement about interoperability.
Isn't it enough to change the schema? UTF-8 strings will be there anyway, it's just a matter of encoding, isn't it? Cheers, -- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:
the document clearly says draft, and we had no other traceable way of sharing with the rest of the group, I don't know how to do this better!
There must be other places in the web site ... the only reason the document still says that it's a draft is that the GLUE working group was inactive for something like two years so there was no forum to discuss it. The intention was that it was a final version apart from any detailed textual comments, hence my wish for any more substantial changes to be in a new document.
So we have a recommendation draft that nobody actually followed and you want us to comment on that? I don't understand.
You seem to think that I don't understand either, despite the fact that I worked on both the document and the implementation! I will say it yet again - the document records the *final* decisions that we made in 2009, and the implementation was of course intended to follow that. There may be mistakes, nothing is perfect, and if noticed they should be corrected, but that is very different from trying to change the decisions that were made - including "negative" decisions like allowing the DIT to be largely an implementation choice.
We must make it usable for implementers, the previous version was not,
That's obviously not true since we have in fact now got a near-complete implementation in EMI 2. It may of course be that things are unclear and need to be explained better, but again that is not the same as changing the substance.
In terms of the document, my view is that it should specify something which is consistent with the current BDII implementation, but on the other hand should not constrain any implementation more than necessary - we may well want to evolve the existing infrastructure in the medium term, and of course other providers, e.g. Nordugrid, may want to produce their own implementation.
I am already trying to fix that to fit in the bdii-top, and so far I've been successful; I wouldn't be so pessimistic here.
Fix what? As far as I'm aware there is nothing to fix.
And again, you are contradicting yourself if you say there is no mandatory DIT but the actual infrastructure is based on a specific DIT!
I don't understand why you find this a difficult concept. The intention is that the *specification* (the document) leaves some things open to be an implementation choice, including the details of the DIT. A given implementation, in this case the BDII, obviously has to do some particular thing. However, a different implementation, or the same implementation in future, could do it differently. If you specify the details in the document you remove that freedom to change - in particular, if a particular structure gets embedded in client queries it will probably be fixed forever since we don't in general control all clients.
We should then make the infrastructure independent of the DIT. But my view is that this is overkill on LDAP.
You keep saying that, but you still haven't given any practical examples. My argument is basically this:
1) I don't believe there is any *need* for queries to refer to the DIT. Queries using only objectclass and attribute filters are sufficient for all needs, and will find objects wherever they are in the tree. If suitable attributes are indexed in the underlying DB then experience says that the BDII performance is not impacted by querying from the root, and in any case the nature of many queries is that they need to cover the entire information system.
2) Queries do of course need to know the base DN, but there is no need for it to be hard-coded, it can e.g. be passed in an environment variable or derived from the information system itself. Hence for example we can have code which can query either a site BDII or a top BDII simply by passing a different base DN.
3) At the moment we have a BDII structure which is based on our traditional GLUE 1 model with a three level hierarchy (resource, site, top). However, we also know that we may want to evolve that structure, for example to separate different kinds of information.
Since there is no need to specify either the DIT or the BDII hierarchy I think we should not do so - then we have the freedom to evolve the system gradually without breaking any clients. If we specify a structure and clients embed it in their queries then any change will be difficult and disruptive, perhaps to the point of being impossible. Note that GLUE 2 itself is the first backward-incompatible change we've made; we started discussing it in 2005, and 7 years later we are still trying to manage the migration! I suspect that we will never be able to do such a thing again.
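As an illustration of points 1) and 2), here is a minimal Python sketch of a DIT-independent query specification: the search base is taken from the environment and the filter is built only from an objectclass and attribute values. The variable name GLUE2_BASE_DN, the helper names and the example share ID are hypothetical and were not agreed anywhere in this thread; the escaping follows the LDAP filter string rules (RFC 4515). The sketch only builds the base and filter strings; passing them to an actual LDAP client is left out.

```python
import os

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape a string for use as an assertion value in an LDAP filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def build_filter(object_class: str, **attributes: str) -> str:
    """Build an AND filter from an objectclass and exact attribute matches."""
    parts = [f"(objectClass={escape_filter_value(object_class)})"]
    for name, value in sorted(attributes.items()):
        parts.append(f"({name}={escape_filter_value(value)})")
    return parts[0] if len(parts) == 1 else "(&" + "".join(parts) + ")"


def base_dn(default: str = "o=glue") -> str:
    """Return the search base from the (hypothetical) GLUE2_BASE_DN variable."""
    return os.environ.get("GLUE2_BASE_DN", default)


if __name__ == "__main__":
    # The same filter works against a site or a top BDII; only the base differs.
    print(base_dn())
    print(build_filter("GLUE2ComputingShare", GLUE2ShareID="example-share"))
```

Nothing in the filter refers to where an entry sits in the tree, so a client written this way would keep working if the structure below the base DN changed.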
I don't think there is any necessity to specify the DIT since clients should not need to rely on it. If you do regard it as necessary to specify aspects of the DIT I would like to see the specific use cases which you think require it.
but you just said current clients rely on it!
No I didn't. Perhaps my terminology is confusing, by client I mean some piece of code which queries the information system to retrieve information. Obviously the BDIIs themselves do need to know the structure because they create it! However they are not what I mean by clients. Also even the BDIIs only depend on the structure to the extent that they must, for example a top BDII needs to know that site BDIIs exist but it doesn't need to know what information they contain, it can just copy it.
Isn't it enough to change the schema? UTF-8 strings will be there anyway, it's just a matter of encoding, isn't it?
Unfortunately it isn't that simple. The LDAP servers in the BDIIs validate the published information against the schema definition, and if an attribute is non-compliant the entire object is rejected. Any objects below that in the DIT will then also be rejected because their parent doesn't exist. Hence we basically have to change the schema in every BDII in the system before publishing any non-ASCII characters if we don't want to have objects randomly vanishing. (The case I debugged recently was a Spanish site which quite reasonably had a description string of "Centro de Supercomputación de Galicia (CESGA)", which resulted in their GlueSite object being absent.) Stephen
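The cascade described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the BDII or OpenLDAP code: it treats every attribute as IA5String (7-bit ASCII), drops any entry with a non-compliant value, and then drops every entry whose parent is missing. The DNs and the tree layout are hypothetical; only the description string is taken from the message.

```python
def is_ia5(value: str) -> bool:
    """Return True if every character is 7-bit ASCII, as IA5String requires."""
    return all(ord(ch) < 128 for ch in value)


def surviving_entries(entries):
    """Return the DNs that a strictly validating server would keep.

    `entries` is a list of (dn, parent_dn, attributes) tuples, ordered so
    that parents come before their children; parent_dn is None for roots.
    """
    kept = []
    for dn, parent_dn, attributes in entries:
        if parent_dn is not None and parent_dn not in kept:
            continue  # the parent was rejected, so the child cannot be added
        if all(is_ia5(value) for value in attributes.values()):
            kept.append(dn)
    return kept


# Hypothetical DNs and layout; the description is the one quoted in the thread.
SITE_DN = "GlueSiteUniqueID=EXAMPLE-SITE,o=grid"
CHILD_DN = "GlueClusterUniqueID=ce.example.org," + SITE_DN
ENTRIES = [
    ("o=grid", None, {}),
    (SITE_DN, "o=grid",
     {"GlueSiteDescription": "Centro de Supercomputación de Galicia (CESGA)"}),
    (CHILD_DN, SITE_DN, {"GlueClusterName": "ce.example.org"}),
]

if __name__ == "__main__":
    print(surviving_entries(ENTRIES))
```

In this model the single accented character removes the site entry and, with it, the otherwise valid child entry, which is why the schema would have to be updated everywhere before any publisher starts sending non-ASCII values.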

On 06/17/2012 12:43 PM, stephen.burke@stfc.ac.uk wrote:
2) Queries do of course need to know the base DN, but there is no need for it to be hard-coded, it can e.g. be passed in an environment variable or derived from the information system itself. Hence for example we can have code which can query either a site BDII or a top BDII simply by passing a different base DN.
This is a key aspect for the current information system. The base DN is difficult to change. What we have deployed so far is:
Top: GLUE2GroupID=grid,o=glue
Site: GLUE2DomainID=CERN-PROD,o=glue
Resource: GLUE2GroupID=resource,o=glue
There is also the concatenation rule for how we go from distributed trees to a single tree:
GLUE2GroupID=resource,GLUE2DomainID=CERN-PROD,GLUE2GroupID=grid,o=glue
Once these are deployed, it becomes almost impossible to change. It is for this reason we have been using mds-vo-name=local,o=grid for the past 10 years in GLUE 1.3! With OpenLDAP 2.4 it may be possible to migrate, as we can configure LDAP redirects. The bind points and concatenation rule were discussed over 2 years ago as part of the implementation. At the time it was agreed that client queries should not rely on the DIT. In the current implementation we do not care about the DIT below the bind point, as it is irrelevant for client queries. However, as you can see, the base DN and concatenation rule are integral to the infrastructure.
Laurence
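The concatenation rule can be read as "replace the local o=glue suffix with the base DN of the next level up". A minimal Python sketch of that reading follows; the function name rebase is hypothetical, and it works on plain strings, so it assumes DNs are written exactly as in the message (no extra spaces, escaping or case differences). It is an interpretation of the rule as stated here, not the BDII code.

```python
def rebase(dn: str, parent_base: str, local_suffix: str = "o=glue") -> str:
    """Re-root a DN that ends in local_suffix under parent_base."""
    if dn == local_suffix:
        return parent_base
    tail = "," + local_suffix
    if not dn.endswith(tail):
        raise ValueError(f"{dn!r} does not end in {local_suffix!r}")
    return dn[: -len(tail)] + "," + parent_base


# Bind points as listed in the message above.
TOP = "GLUE2GroupID=grid,o=glue"
SITE = "GLUE2DomainID=CERN-PROD,o=glue"
RESOURCE = "GLUE2GroupID=resource,o=glue"

if __name__ == "__main__":
    in_site = rebase(RESOURCE, SITE)  # resource tree as seen in the site BDII
    in_top = rebase(in_site, TOP)     # site tree as seen in the top BDII
    print(in_site)
    print(in_top)
```

Applying the function twice to the resource bind point gives the concatenated DN quoted in the message, which shows how the bind points, unlike the structure below them, are baked into every level of the hierarchy.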

The initial definition of the GLUE 2 LDAP schema also used IA5String, but we then discussed this issue and agreed to switch to DirectoryString which allows UTF-8 strings. However it seems that the schema was never updated to reflect that decision.
Looking a bit closer, the schema *was* updated: https://forge.ogf.org/integration/viewcvs/viewcvs.cgi/LDAP2/schema/?root=glue&rev=67&system=exsy1001 but not deployed? Stephen

On 06/16/2012 07:54 PM, stephen.burke@stfc.ac.uk wrote:
The initial definition of the GLUE 2 LDAP schema also used IA5String, but we then discussed this issue and agreed to switch to DirectoryString which allows UTF-8 strings. However it seems that the schema was never updated to reflect that decision.
Looking a bit closer, the schema *was* updated:
https://forge.ogf.org/integration/viewcvs/viewcvs.cgi/LDAP2/schema/?root=glue&rev=67&system=exsy1001
but not deployed?
Stephen
I remember I noticed the change while I was updating the ARC information system. Most likely it is a packaging problem. Laurence, can you comment on that? -- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project

On 06/17/2012 10:14 AM, Florido Paganelli wrote:
I remember I noticed the change while I was updating the ARC information system. Most likely it is a packaging problem. Laurence, can you comment on that?
The last thing that we did was to sync with the latest version on GitHub. https://github.com/OGF-GLUE/LDAP/tree/master/schema Note that this has IA5 strings so I will redirect this to Sergio! I think that most people are using the glue-schema package from Fedora so this should be our reference implementation. As long as changes are backwards compatible, there is no problem making changes. Now would be a good time to review the schema and make such changes. The changes should be done on GitHub and we can push out a new version in Fedora. Laurence

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Laurence Field said: Now would be a good time to review the schema and make such changes. The changes should be done on GitHub and we can push out a new version in Fedora.
One other thing that springs to mind is that ExtendedBoolean attributes are typed as LDAP Boolean which only allows TRUE and FALSE. That's the same kind of thing, i.e. a schema change would be backward-compatible but publication wouldn't. The other change we discussed from the original rendering was that attributes like ShareServiceForeignKey should go from optional to mandatory and attributes like ComputingShareComputingServiceForeignKey should go from mandatory to optional (and eventually be deprecated), but it looks like that is in the deployed schema. Stephen
participants (5)
- Andre Merzky
- Balazs Konya
- Florido Paganelli
- Laurence Field
- stephen.burke@stfc.ac.uk