Foreign Key implementation on GLUE 2.0 LDAP implementation

Hello all,

After a talk with Maarten, a few emails with Laurence, a look at the GLUE 1.3 LDAP implementation on lcg-bdii.cern.ch, and some thorough thought, I have come to the following idea for implementing Foreign Keys in LDAP.

Given an object ObjectA related to an object ObjectB, where the relationship is ObjectA 0..* - 0..* ObjectB:
- ObjectA will contain the attribute ObjectAObjectBFK pointing to ObjectB's DN
- ObjectB will contain the attribute ObjectBObjectAFK pointing to ObjectA's DN
- If the multiplicity is 1..X, with X between 1 and *, ObjectYObjectZFK will be mandatory
- If the multiplicity is 1 or 0..1, ObjectYObjectZFK will be SINGLE-VALUE
- If the relationship is ObjectA 0..* - 1 ObjectB, ObjectA will be the only one holding the Foreign Key

Since LDAP is not relational, there are a few pitfalls:
- The value of ObjectYObjectZFK is not checked to be correct
- The pair ObjectAObjectBFK and ObjectBObjectAFK is not checked to state the relation in both directions
- Moving between relations is done in the client and not the server, which is not optimal

These should be checked by an external client at appropriate intervals to ensure the integrity of the LDAP server, together with all the GLUE 2.0 specifications that cannot be covered in LDAP (types, maximum range, etc.).

I would like the approval of 1 or 2 people before starting on it. So, do you like it this way? :)

Regards,
David

--
David Horat
Software Engineer specialized in Grid and Web technologies
IT Department – Grid Deployment Group
CERN – European Organization for Nuclear Research » Where the web was born
Phone +41 22 76 77996
http://davidhorat.com/
http://cern.ch/horat
http://www.linkedin.com/in/davidhorat
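The multiplicity rules and the external integrity check proposed in this message can be sketched in Python. This is only an illustration under assumptions: the function names `fk_constraints` and `check_foreign_keys`, the dictionary layout standing in for the directory, the `ServiceContactFK` attribute and the sample entries are all hypothetical, and nothing here is part of GLUE 2.0 or of any LDAP library.

```python
def fk_constraints(multiplicity):
    """Map the multiplicity of the referenced end ("0..*", "1..*", "0..1", "1")
    to the properties of the FK attribute, following the rules listed above."""
    if multiplicity == "*":              # a bare "*" is shorthand for "0..*"
        multiplicity = "0..*"
    lower, _, upper = multiplicity.partition("..")
    if not upper:                        # a bare "1" means exactly one
        upper = lower
    return {
        "mandatory": lower != "0",       # 1..X: the attribute must be present
        "single_value": upper == "1",    # 1 or 0..1: at most one value
    }


def check_foreign_keys(entries, reciprocal_pairs=()):
    """entries maps a DN to {attribute: [values]}; attributes ending in "FK"
    hold DNs. Returns a list of human-readable problems."""
    problems = []
    # Pitfall 1: an FK value may point to an entry that does not exist.
    for dn, attrs in entries.items():
        for attr, targets in attrs.items():
            if attr.endswith("FK"):
                for target in targets:
                    if target not in entries:
                        problems.append(f"{dn}: {attr} points to missing {target}")
    # Pitfall 2: a relation stated in one direction may be missing in the other.
    for attr_ab, attr_ba in reciprocal_pairs:
        for forward, backward in ((attr_ab, attr_ba), (attr_ba, attr_ab)):
            for dn, attrs in entries.items():
                for target in attrs.get(forward, []):
                    if target in entries and dn not in entries[target].get(backward, []):
                        problems.append(
                            f"{dn}: {forward} to {target} has no matching {backward}"
                        )
    return problems


# Hypothetical sample data: one one-sided relation and one dangling reference.
directory = {
    "EntityId=testContact,o=grid": {"ContactServiceFK": ["EntityId=testService,o=grid"]},
    "EntityId=testService,o=grid": {"ServiceContactFK": []},
    "EntityId=otherContact,o=grid": {"ContactServiceFK": ["EntityId=gone,o=grid"]},
}
problems = check_foreign_keys(directory, [("ContactServiceFK", "ServiceContactFK")])
```

Such a checker would run outside the LDAP server, since the server itself enforces neither rule.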

Hi David,

I would prefer to use the UniqueID rather than the DN. Can you provide a concrete example?

Laurence

In the DN, the UniqueID is included. Example:

Having the objects
Contact: EntityId=testContact,o=grid
Service: EntityId=testService,o=grid

To make that Contact point to that Service, it will be:
ContactServiceFK: EntityId=testService,o=grid

I hope this helps.

Regards,
David

On Thu, Mar 26, 2009 at 10:17 AM, Laurence Field <Laurence.Field@cern.ch> wrote:
> I would prefer to use the UniqueID rather than the DN. Can you provide a concrete example.
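To make the two candidate values for ContactServiceFK concrete, here is a small Python sketch based on the Contact/Service example. The helper name `leading_rdn_value` is hypothetical, and the parsing is deliberately naive: it assumes the ID is the value of the first RDN and ignores escaped commas and multi-valued RDNs.

```python
def leading_rdn_value(dn):
    """Return the value of the first RDN of a DN (here, the EntityId).

    Naive on purpose: no handling of escaped commas or multi-valued RDNs.
    """
    first_rdn = dn.split(",", 1)[0]          # "EntityId=testService"
    _attribute, _, value = first_rdn.partition("=")
    return value.strip()


service_dn = "EntityId=testService,o=grid"

# Variant proposed in this message: the FK holds the full DN.
fk_as_dn = service_dn

# Variant Laurence prefers: the FK holds only the unique ID.
fk_as_id = leading_rdn_value(service_dn)
```

With the DN variant the reference breaks if the entry is moved in the tree; with the ID variant it survives as long as the ID stays the same.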

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Laurence Field said:
> I would prefer to use the UniqueID rather than the DN. Can you provide a concrete example.

I more than prefer, I think having references to DNs in the objects is crazy! The only way we should refer to objects is via the ID. Although LDAP uses DNs as a key to find objects glue itself doesn't, the object ID is the key, and DNs should never be used anywhere.

Unfortunately I don't really have a chance to comment in detail while I'm at CHEP, and I'm staying in Prague for a few days extra. Probably that means that you should implement something and then I'll comment, but we should be prepared to change the approach if necessary.

Stephen

-- Scanned by iCritical.

What is the difference between having a DN or an ID in objects? Are objects going to be moved around so their DN will change but you want to preserve that relationship?

When implementing relationships you need to know exactly where to look. Although our IDs are specified as "A global Unique ID", LDAP has no way to assure this as it uses the DN as its key. Moreover, when searching, as you said, it uses the DN, so I don't see any advantage in using the ID instead of the DN.

So maybe I am missing something in the middle. Maybe you can explain to me why you think this. :)

On Thu, Mar 26, 2009 at 10:35 AM, Burke, S (Stephen) <stephen.burke@stfc.ac.uk> wrote:
> I more than prefer, I think having references to DNs in the objects is crazy! The only way we should refer to objects is via the ID. [...]

David Horat [mailto:david.horat@cern.ch] said:
> What is the difference between having a DN or an ID in objects? Are objects going to be moved around so their DN will change but you want to preserve that relationship?

Glue is a generic schema with multiple renderings. A DN won't mean much if you try to connect it to an XML rendering, but an ID will. Also indeed I think it's quite possible that we may want to restructure the tree, especially at higher levels - we already do that for glue 1, objects have different DNs if you read them in different places.

Also from a practical point of view there seems to be no real performance hit from querying objects by ID, and I think most/all clients do it that way at the moment.

> When implementing relationships you need to know exactly where to look. Although our IDs are specified as "A global Unique ID", LDAP has no way to assure this as it uses the DN as its key.

That's irrelevant, the uniqueness has to be established by the info providers.

> Moreover, when searching, as you said, it uses the DN,

It depends how you do the query - in practice we normally just query against the root DN and do everything else with objectclasses or attributes in the query filter. So in the simplest case you'd just query for (EntityID=xxx) and you'd get that object.

Stephen
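The query-by-ID approach described here amounts to a subtree search from the root DN with an equality filter on the ID attribute. A minimal Python sketch of building such a filter follows. The function names are hypothetical, and the attribute name `EntityID` is taken from the message above; the escaping follows the rules of RFC 4515 for reserved characters in filter values.

```python
def escape_filter_value(value):
    """Escape the characters that RFC 4515 reserves in LDAP filter values."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def filter_by_id(entity_id, attribute="EntityID"):
    """Build an equality filter such as (EntityID=xxx) for a search from the root DN."""
    return f"({attribute}={escape_filter_value(entity_id)})"


simple_filter = filter_by_id("testService")
escaped_filter = filter_by_id("a*(b)")
```

The resulting string would be passed as the search filter of whichever LDAP client is in use, with the root of the tree (o=grid in the earlier example) as the search base, so the client never needs to know where in the tree the object lives.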

Ahhh, I hadn't thought about other renderings and that this information may be in other technologies. I understand now that the unique ID will be the only key common to all implementations. :)

Thanks Stephen.

Any other comments on the rest of the Foreign Key implementation?

On Thu, Mar 26, 2009 at 10:57 AM, Burke, S (Stephen) <stephen.burke@stfc.ac.uk> wrote:
> Glue is a generic schema with multiple renderings. A DN won't mean much if you try to connect it to an xml rendering, but an ID will. [...]

Burke, S (Stephen) wrote:
> I think it's quite possible that we may want to restructure the tree, especially at higher levels - we already do that for glue 1, objects have different DNs if you read them in different places.

IMO that is the main reason for _not_ using DNs to connect objects.

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of David Horat said:
> Having an object ObjectA related to object ObjectB and being the relationship ObjectA 0..* - 0..* ObjectB, - ObjectA will contain the attribute ObjectAObjectBFK pointing to ObjectB's DN - ObjectB will contain the attribute ObjectBObjectAFK pointing to ObjectA's DN

I have another general point about foreign keys (aside from the fact that they should point to an ID and not a DN). Even though the relationship has two ends there is no particular reason to have the keys in both the objects. In your example above, even if the second attribute is missing I can navigate in both directions: either I query A for the B ID in its FK attribute and then query for B by its ID, or I query B for its ID and then query for the A which has an FK which references it. There might be marginal performance differences but I don't think it's significant in practice.

That might not matter, except that there can also be practical impacts because the multiplicities can in some cases be enormous. For example, consider the ComputingShare/ComputingActivity relation. There could be a huge number of jobs, so if you put all the references into the Share it will inflate the size of the object substantially.

There are also practical implications for the way the info providers are written. The info provider has to be able to know the IDs of all the objects it references explicitly. The natural way to do it is to generate the objects in a hierarchical way, so in the same case you'd generate the Share with its ID and then loop over all the jobs passing the share ID in. If you want to have the Activity keys in the Share you'd have to duplicate the loop over the jobs so you could put their IDs into the Share, and also cache the information because otherwise the list might change in between.

Stephen
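The two-way navigation with a key stored at only one end can be illustrated with a small in-memory Python sketch. Everything here is an assumption made for illustration: the attribute names `ID` and `ComputingShareFK`, the sample IDs, and the `search` helper, which only mimics an LDAP equality filter over a list of entries.

```python
# Hypothetical entries: only the Activity side stores a foreign key.
entries = [
    {"ID": "share-1", "objectClass": "ComputingShare"},
    {"ID": "job-1", "objectClass": "ComputingActivity", "ComputingShareFK": ["share-1"]},
    {"ID": "job-2", "objectClass": "ComputingActivity", "ComputingShareFK": ["share-1"]},
    {"ID": "job-3", "objectClass": "ComputingActivity", "ComputingShareFK": ["share-2"]},
]


def search(entries, attribute, value):
    """Mimic an equality filter: match if the attribute equals or contains value."""
    matches = []
    for entry in entries:
        stored = entry.get(attribute, [])
        if not isinstance(stored, list):
            stored = [stored]
        if value in stored:
            matches.append(entry)
    return matches


def share_of(entries, activity_id):
    """Activity to Share: read the FK, then look the Share up by its ID."""
    activity = search(entries, "ID", activity_id)[0]
    share_id = activity["ComputingShareFK"][0]
    return search(entries, "ID", share_id)[0]


def activities_of(entries, share_id):
    """Share to Activities: find every entry whose FK references the Share's ID."""
    return search(entries, "ComputingShareFK", share_id)


share = share_of(entries, "job-1")
job_ids = [entry["ID"] for entry in activities_of(entries, "share-1")]
```

The Share entry stays the same size no matter how many jobs reference it, which is the point made about large multiplicities.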

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Burke, S (Stephen) said:
> There are also practical implications for the way the info providers are written.

And an even stronger example would be that a UserDomain provider couldn't possibly fill in explicit relations to all the things which refer to it, because it won't have any list of objects to refer to. In general I think there's likely to be a natural hierarchy where the explicit relation should point upwards (cf. the ChunkKey discussion), although there may be some cases which are naturally symmetrical where it does make sense to have relations in both directions.

It also seems to me that this must apply a fortiori to the SQL representation since that can't directly have multivalued relations at all, and while the two representations don't have to take the same decisions there may be a natural way to do it which is the same in both.

Stephen

The 'natural' implementation in LDAP is probably to do it the same way as in SQL, except for the many-to-many relations, because in SQL we will have a new table and in LDAP we can only have multivalued attributes. Nevertheless, when I finish the SQL implementation, we can have a framework to discuss this.

Regards,
David

On Thu, Apr 16, 2009 at 5:53 PM, Burke, S (Stephen) <stephen.burke@stfc.ac.uk> wrote:
> And an even stronger example would be that a UserDomain provider couldn't possibly fill in explicit relations to all the things which refer to it, because it won't have any list of objects to refer to. [...]
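The difference between the two renderings of a many-to-many relation can be sketched with Python's standard `sqlite3` module. The table and column names (`ObjectA`, `ObjectB`, `ObjectAObjectB`) and the sample IDs are hypothetical placeholders following the naming used earlier in the thread, not the actual GLUE 2.0 SQL schema, which had not been written at this point.

```python
import sqlite3

# SQL rendering: a many-to-many relation needs a separate association table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ObjectA (ID TEXT PRIMARY KEY);
    CREATE TABLE ObjectB (ID TEXT PRIMARY KEY);
    CREATE TABLE ObjectAObjectB (
        ObjectAID TEXT NOT NULL REFERENCES ObjectA(ID),
        ObjectBID TEXT NOT NULL REFERENCES ObjectB(ID),
        PRIMARY KEY (ObjectAID, ObjectBID)
    );
    INSERT INTO ObjectA VALUES ('a1');
    INSERT INTO ObjectB VALUES ('b1'), ('b2');
    INSERT INTO ObjectAObjectB VALUES ('a1', 'b1'), ('a1', 'b2');
    """
)

rows = conn.execute(
    "SELECT ObjectBID FROM ObjectAObjectB WHERE ObjectAID = ? ORDER BY ObjectBID",
    ("a1",),
).fetchall()
sql_related = [row[0] for row in rows]

# LDAP-like rendering: the same relation as a multivalued attribute on the entry.
ldap_like_entry = {"ID": "a1", "ObjectAObjectBFK": ["b1", "b2"]}
```

Both renderings carry the same information; the SQL one moves it into extra rows, while the LDAP one keeps it inside the entry as repeated attribute values.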
participants (4)
- Burke, S (Stephen)
- David Horat
- Laurence Field
- Maarten Litmaath