Re: [glue-wg] GlueChunkKey

30 Jan 2008

      Hi Stephen,

Maybe it's me, but I found all web pages describing the extensible queries to 
be pretty poor, and the RFC didn't help much.  I've been intending to 
investigate these further; the wiki page and your email presented an 
excellent opportunity and I think I now understand them.

Comments interleaved.  Apologies if this is all obvious to others; it wasn't 
obvious to me earlier today.

On Tuesday 29 January 2008 20:24:04 Burke, S (Stephen) wrote:
...
> Hi all,
> On this page:
> http://glueschema.forge.cnaf.infn.it/SpecV13/LDAP
> there is a comment that GlueChunkKey is obsolete because you can now
> query directly on the DN.
I believe this is half-true.

Extensible queries allow "downward queries": one can discover attributes for 
objects where what you know is higher up in the hierarchy.

For example, given a specific GlueSEUniqueID (e.g., ccsrm.in2p3.fr) one can 
query all available protocols that this SE supports with the following single 
query:

ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x -b o=grid \
    '(&(GlueSEUniqueID:dn:=ccsrm.in2p3.fr)(objectClass=GlueSEAccessProtocol))' \
    GlueSEAccessProtocolType GlueSEAccessProtocolVersion

Of course, one can also do this without extensible queries if you happen to 
know the parent DN:

ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x \
    -b GlueSEUniqueID=ccsrm.in2p3.fr,Mds-Vo-name=IN2P3-CC,Mds-Vo-name=local,o=grid \
    '(objectClass=GlueSEAccessProtocol)' \
    GlueSEAccessProtocolType GlueSEAccessProtocolVersion

The queries are not identical, but "all things being equal" (if the GLUE/LDAP 
hierarchy is observed and the GlueSEUniqueID is unique), then the results are 
the same; I guess the first query is "better" (doesn't need the site's DN), 
but both should work.
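
To make the "downward" matching concrete, here is a rough shell sketch of the DN half of what the ":dn:" flag asks the server to test: does any RDN in the entry's DN equal attribute=value?  (The real match also considers the entry's own attributes.)  The helper name dn_has_rdn and the GlueSEAccessProtocolLocalID RDN in the sample DN are my own illustrative assumptions; the comparison is a plain case-sensitive string match and ignores escaped commas, which a real server would handle properly.

```shell
# Hypothetical helper: succeed if the DN ($3) contains the RDN "$1=$2".
# Simplifications: exact-case comparison, no support for escaped commas.
dn_has_rdn() {
    case ",$3," in
        *",$1=$2,"*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Illustrative DN of a protocol entry published beneath the SE.
dn='GlueSEAccessProtocolLocalID=gsidcap,GlueSEUniqueID=ccsrm.in2p3.fr,Mds-Vo-name=IN2P3-CC,Mds-Vo-name=local,o=grid'

if dn_has_rdn GlueSEUniqueID ccsrm.in2p3.fr "$dn"; then
    echo "entry lies at or below the SE"
fi
```

This is why the query only works downward: the SE's RDN appears in the DN of everything published beneath it, but nothing about a child appears in the parent's DN.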

Extensible queries (if I understood correctly) are of no help in navigating 
upward.  For example: what are the GlueSEUniqueIDs for SEs that support a 
specific protocol?

To answer this query we can use either GLUE or LDAP concepts: if we don't get 
the parent's ID from the ChunkKey, we must get it by (effectively) parsing 
the DN into its constituent RDNs and building the parent object's DN from 
those RDNs.  For a simple fixed-distance ancestor relationship, a simple cut 
(e.g. "cut -d, -f2-") should do.
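
As a small illustration of the cut approach, the sketch below drops the leading RDN to get the parent's DN.  The function name parent_dn and the GlueSEAccessProtocolLocalID RDN in the sample DN are assumptions for the sake of the example, and splitting on every comma assumes no RDN value contains an escaped comma.

```shell
# Hypothetical helper: print the DN with its first (left-most) RDN removed.
# Assumes no RDN value contains an escaped comma.
parent_dn() {
    printf '%s\n' "$1" | cut -d, -f2-
}

# Illustrative DN of an access protocol entry.
dn='GlueSEAccessProtocolLocalID=gsidcap,GlueSEUniqueID=ccsrm.in2p3.fr,Mds-Vo-name=IN2P3-CC,Mds-Vo-name=local,o=grid'

parent_dn "$dn"    # DN of the SE: one level up
parent_dn "$(parent_dn "$dn")"    # two levels up, same as "cut -d, -f3-"
```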
...
> However, I don't think that's the whole story, 
> because it may also be useful to *print* the chunkkey value so you can
> extract the corresponding ID. That would be much harder to get from the
> DN,
True(-ish), ldapsearch certainly doesn't make this easy.  From trawling 
various mailing lists, the shortest "fix" for the line-splitting problem is:
 	perl -p00e 's/\r?\n //g'
...
> For example, consider this:
> for i in `ldapsearch -x -h lcg-bdii.cern.ch -p 2170 -b o=grid
> glueSEAccessProtocolType=gsidcap | grep GlueChunkKey: | cut -d= -f2 `;
> do ldapsearch -x -h lcg-bdii.cern.ch -p 2170 -b o=grid glueseuniqueid=$i
> glueforeignkey; done | grep GlueSiteUniqueID | cut -d= -f2
> That chops the SEUniqueID out of the ChunkKey in the access protocol
> object from the first query, and feeds it into another query to get the
> site ID. It may not be totally elegant but it works! If you removed the
> chunkkey it would be much harder to do something like that on the
> command line.
OK, so this query navigates upward: what are the GlueSiteUniqueIDs for sites 
that have an SE that supports the gsidcap protocol?

Below, I've copied an equivalent bash 1-liner that uses DN parsing to 
navigate up the tree (effectively doing "../.." in XPath terms).  It should 
be possible to do the equivalent of "ancestor::GlueSite[1]" (i.e., navigate 
up until you find a GlueSite) by explicitly matching the attribute part of 
the GlueSite RDN.

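# Outer query: DNs of all gsidcap protocol entries, unfolded, with the two
# leading RDNs (protocol and SE) stripped to leave the site-level DN.
for site_dn in $(ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x -b o=grid \
			glueSEAccessProtocolType=gsidcap |
			perl -p00e 's/\r?\n //g' |
			sed -ne 's/^dn: [^,]*,[^,]*,\(.*\)/\1/p' |
			sort -u); do
	# Inner query: the GlueSite object published under that DN.
	ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x -b "$site_dn" \
			'(objectClass=GlueSite)' GlueSiteUniqueID |
	perl -p00e 's/\r?\n //g' |
	sed -ne 's/^GlueSiteUniqueID: \(.*\)/\1/p'
done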

The example is about the same size, but I've separated the line as much as 
possible for clarity.  It might be possible to merge the perl line-wrap hack 
into the sed, so losing the perl dependency, but my sed is a little too rusty 
for that.
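
For what it's worth, here is one possible sed-only sketch of the unfolding step.  It slurps the whole input into the pattern space and then deletes each "newline plus space" pair.  The wrapper name unfold_sed is hypothetical; unlike the perl version it does not strip carriage returns, and it holds the entire output in memory, so treat it as a starting point rather than a drop-in replacement.

```shell
# Hypothetical sed-only replacement for the perl line-unfolding hack.
#   :a     label
#   N      append the next input line to the pattern space
#   $!ba   if this is not the last line, loop back to :a
#   s///   finally remove every newline that is followed by a space
unfold_sed() {
    sed -e ':a' -e 'N' -e '$!ba' -e 's/\n //g'
}

printf 'dn: GlueSEUniqueID=ccsrm.in2p3.fr,Mds-Vo-name=IN2P3-CC,Mds-\n Vo-name=local,o=grid\nGlueSiteUniqueID: IN2P3-CC\n' |
    unfold_sed
```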
...
> While it's true that the information does exist in the DN 
> I think I would rather keep the ChunkKey concept anyway.
I'm not sure if parsing the DN to extract parent DN entries is a robust way of 
navigating, but I think it basically works [*].

Cheers,

Paul.

[*] - It doesn't work for a few sites that are published right now; but the 
first few I investigated all had broken information, like not publishing any 
GlueSite class object, or publishing objects that have no parent (!!), etc.

Paul Millar