
On Thu, May 28, 2009 at 1:05 PM, Burke, S (Stephen) < stephen.burke@stfc.ac.uk> wrote:
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of David Horat said:
Moreover, in order to decide where to put foreign keys, I have created a table in the wiki so we can vote and discuss in the phone conference:
http://forge.gridforum.org/sf/wiki/do/viewPage/projects.glue-wg/wiki/LDAPForeignKeysDiscussion
Did we decide how we want to do this?
A few comments on my approach to this ... as I said in the meeting it seems obvious to me that the natural relation for Extension is to have the keys in Extension. Actually Extension is a special case - for one thing you can in fact only do it that way since Extension only has a LocalID! Also, I think the fact that we renamed EntityID also gets reflected here, i.e. instead of having one attribute GLUE2ExtensionEntityForeignKey we'll have several like GLUE2ExtensionServiceForeignKey, GLUE2ExtensionDomainForeignKey, ...?
Anyway, some general arguments are:
1) Multiplicity - better to have one key than many, or few than many, especially if the many could be very many (dozens or more). Large numbers of keys will be harder and probably more inefficient to process when you extract them (see below).
I agree.
2) Ease of writing info providers - in general it's easier to create a higher-level object first and then pass its ID in to the code which creates lower-level entities. In some cases it may be impossible to do it the other way around because the code which generates the objects (especially Domains) can't know about the lower-level objects. In some cases, e.g. the computing-storage service relations, you'll be stuck with hard-coding IDs somewhere; in that case I think the relation should point "out", e.g. for ToStorageService you point from the CS to the SS.
Yes. Since we have a DIT, the idea here could also be to put the ForeignKeys in the children, because you would usually create the parent first. I was discussing this with Laurence just now.
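A minimal sketch of the parent-first approach from point 2, assuming an info provider that builds entries as plain dictionaries. The function names, the USERDISK tag, and the path values are hypothetical; the attribute names follow the examples in this thread, with the GLUE2 prefix left out.

```python
# Hypothetical info-provider sketch: the parent is created first and only its
# ID is handed down, so each child carries one key pointing at its parent.

def make_service(service_id):
    """Build the parent entry; it does not need to know about its shares."""
    return {
        "objectclass": "StorageService",
        "ServiceID": service_id,
    }

def make_share(share_id, tag, path, service_id):
    """Build a child entry with a single foreign key to the parent service."""
    return {
        "objectclass": "StorageShare",
        "ShareID": share_id,
        "StorageShareTag": tag,
        "StorageSharePath": path,
        "ShareServiceForeignKey": service_id,
    }

service = make_service("sss")
shares = [
    # Tag and path values below are made up for illustration.
    make_share("key1", "MCDISK", "/example/mc", service["ServiceID"]),
    make_share("key2", "USERDISK", "/example/user", service["ServiceID"]),
]
```

With the key in the child, the code that generates the service entry never has to enumerate the shares, and adding another share does not touch the parent entry.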
3) Likely queries. Bear in mind that queries can work in two ways - if you know an object ID you can query objects which reference it, or you can query the object itself for its key(s). By and large the first one is more efficient, since you can potentially gather all the information you want in one query. As a concrete example, suppose I want to get the Path for all StorageShares with a given Tag in a given StorageService (i.e. I know the ServiceID). One way (leaving out the GLUE2 prefix):
(&(objectclass=StorageShare)(ShareServiceForeignKey=sss)(StorageShareTag=MCDISK)) StorageSharePath
Other way:
(&(objectclass=StorageService)(ServiceID=sss)) ServiceShareForeignKey
which returns numerous keys, so I either then do lots of queries like
(&(objectclass=StorageShare)(ShareID=key1)(StorageShareTag=MCDISK)) StorageSharePath
which in many cases will return nothing, or I construct one huge query like
(&(objectclass=StorageShare)(|(ShareID=key1)(ShareID=key2)(ShareID=key3) ...)(StorageShareTag=MCDISK)) StorageSharePath
I think the first one is clearly preferable! Of course I could find a query that goes the other way, e.g. given a ShareID find the Capability(s) of the corresponding Service, but in this case that's a much less likely thing to want to do, and anyway since a Share only belongs to one Service you have less complexity in the second query.
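To make the comparison in point 3 concrete, here is a small sketch that builds both filter styles as strings. The helper names are hypothetical, the attribute names are the ones used in the queries above (GLUE2 prefix left out), and values are not escaped, which a real client would need to do for special characters.

```python
# Hypothetical helpers for composing LDAP search filter strings.

def eq(attr, value):
    """One equality term, e.g. (ShareID=key1). No escaping is done here."""
    return "(%s=%s)" % (attr, value)

def and_filter(*terms):
    return "(&" + "".join(terms) + ")"

def or_filter(*terms):
    return "(|" + "".join(terms) + ")"

def shares_by_service_key(service_id, tag):
    """Key stored in the Share: one fixed-size filter answers the question."""
    return and_filter(
        eq("objectclass", "StorageShare"),
        eq("ShareServiceForeignKey", service_id),
        eq("StorageShareTag", tag),
    )

def shares_by_id_list(share_ids, tag):
    """Keys stored in the Service: the filter grows with the number of keys
    returned by the first lookup."""
    return and_filter(
        eq("objectclass", "StorageShare"),
        or_filter(*[eq("ShareID", key) for key in share_ids]),
        eq("StorageShareTag", tag),
    )

one_step = shares_by_service_key("sss", "MCDISK")
two_step = shares_by_id_list(["key1", "key2", "key3"], "MCDISK")
many_keys = shares_by_id_list(["key%d" % i for i in range(1, 101)], "MCDISK")
```

The first filter has the same length however many Shares the Service has, while the second needs a prior query to fetch the keys and then grows by one term per key.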
I agree, and this is where your expertise comes in, from the use cases that you know and I don't. So, please vote! :D http://forge.ogf.org/sf/wiki/do/viewPage/projects.glue-wg/wiki/LDAPForeignKe...
Stephen
--
David Horat
Software Engineer – IT/GD – Grid Deployment Group
CERN – European Organization for Nuclear Research » Where the web was born
Address: 1211 Geneva - Switzerland, Office: 28/R-003
Phone +41 22 76 77996
Fax +41 22 76 68178 (fax to email service)
Web: http://cern.ch/horat
Web: http://davidhorat.com/
Profile: http://linkedin.com/in/davidhorat