
Hi Stephen,

Maybe it's me, but I found all web pages describing the extensible queries to be pretty poor, and the RFC didn't help much. I've been intending to investigate these further; the wiki page and your email presented an excellent opportunity and I think I now understand them.

Comments interleaved. Apologies if this is all obvious to others; it wasn't obvious to me earlier today.

On Tuesday 29 January 2008 20:24:04 Burke, S (Stephen) wrote:
Hi all,
On this page:
http://glueschema.forge.cnaf.infn.it/SpecV13/LDAP
there is a comment that GlueChunkKey is obsolete because you can now query directly on the DN.
I believe this is half-true. Extensible queries allow "downward queries": one can discover attributes of objects when what you know is higher up in the hierarchy. For example, given a specific GlueSEUniqueID (e.g., ccsrm.in2p3.fr), one can query all the protocols that this SE supports with the following single query:

  ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x -b o=grid \
    '(&(GlueSEUniqueID:dn:=ccsrm.in2p3.fr)(objectClass=GlueSEAccessProtocol))' \
    GlueSEAccessProtocolType GlueSEAccessProtocolVersion

Of course, one can also do this without extensible queries if you happen to know the parent DN:

  ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x \
    -b GlueSEUniqueID=ccsrm.in2p3.fr,Mds-Vo-name=IN2P3-CC,Mds-Vo-name=local,o=grid \
    '(objectClass=GlueSEAccessProtocol)' \
    GlueSEAccessProtocolType GlueSEAccessProtocolVersion

The queries are not identical, but "all things being equal" (if the GLUE/LDAP hierarchy is observed and the GlueSEUniqueID is unique), the results are the same. I guess the first query is "better" (it doesn't need the site's DN), but both should work.

Extensible queries (if I understood correctly) are of no help in navigating upward. For example: what are the GlueSEUniqueIDs of SEs that support a specific protocol? To answer this, we can use either GLUE or LDAP concepts: if we don't get the ID from the ChunkKey, we must get it by (effectively) parsing the DN into its constituent RDNs and building the parent object's DN from those RDNs. For a simple fixed-distance ancestor relationship, a simple cut (e.g. "cut -d, -f2-") should do.
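To make the cut-based parent lookup concrete, here is a minimal sketch that needs no LDAP server. The function name strip_rdns is hypothetical, and the sketch assumes no RDN value contains an escaped comma, which a plain cut would split incorrectly.

```shell
# Hypothetical helper: drop the first N RDNs from a DN to get an ancestor DN.
# Assumes no RDN value contains an escaped comma ("\,").
strip_rdns() {
  n=$1
  dn=$2
  # Fields are comma-separated RDNs; keep everything from field N+1 onward.
  printf '%s\n' "$dn" | cut -d, -f"$((n + 1))"-
}
```

With the SE DN used in the query above, "strip_rdns 1" gives the site-level DN (Mds-Vo-name=IN2P3-CC,Mds-Vo-name=local,o=grid) and "strip_rdns 2" gives Mds-Vo-name=local,o=grid.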
However, I don't think that's the whole story, because it may also be useful to *print* the chunkkey value so you can extract the corresponding ID. That would be much harder to get from the DN,
True(-ish); ldapsearch certainly doesn't make this easy, because it wraps long lines and continues them on the next line with a leading space. From trawling various mailing lists, the shortest "fix" for the line-splitting problem is:

  perl -p00e 's/\r?\n //g'
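For systems without perl, the same unfolding can probably be done in awk. The following is a sketch only, not something taken from the mailing lists: the function name unfold_ldif is made up for illustration, and unlike the perl one-liner it strips a trailing carriage return from every line rather than only before a continuation.

```shell
# Hypothetical helper: join LDIF continuation lines (lines starting with a
# single space) onto the line they continue. Reads stdin, writes stdout.
unfold_ldif() {
  awk '
    { sub(/\r$/, "") }                          # tolerate CRLF line endings
    /^ / { buf = buf substr($0, 2); next }      # continuation: append minus the space
    NR > 1 { print buf }                        # a new logical line starts: flush the old one
    { buf = $0 }
    END { if (NR > 0) print buf }               # flush the final logical line
  '
}
```

It would slot into a pipeline in the same place as the perl command, e.g. between ldapsearch and a grep or sed that expects each attribute on a single line.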
For example, consider this:
for i in `ldapsearch -x -h lcg-bdii.cern.ch -p 2170 -b o=grid glueSEAccessProtocolType=gsidcap | grep GlueChunkKey: | cut -d= -f2 `; do ldapsearch -x -h lcg-bdii.cern.ch -p 2170 -b o=grid glueseuniqueid=$i glueforeignkey; done | grep GlueSiteUniqueID | cut -d= -f2
That chops the SEUniqueID out of the ChunkKey in the access protocol object from the first query, and feeds it into another query to get the site ID. It may not be totally elegant but it works! If you removed the chunkkey it would be much harder to do something like that on the command line.
OK, so this query navigates upward: what are the GlueSiteUniqueIDs for sites that have an SE that supports the gsidcap protocol?

Below, I've copied an equivalent bash one-liner that uses DN parsing to navigate up the tree (effectively doing "../.." in XPath terms). It should be possible to do the equivalent of "ancestor::GlueSite[1]" (i.e., navigate up until you find a GlueSite) by explicitly matching the attribute part of the GlueSite RDN.

  for site_dn in $(
      ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x -b o=grid \
          glueSEAccessProtocolType=gsidcap |
        perl -p00e 's/\r?\n //g' |
        sed -ne 's/^dn: [^,]*,[^,]*,\(.*\)/\1/p' |
        sort -u
  ); do
    ldapsearch -LLL -h lcg-bdii.cern.ch -p 2170 -x -b "$site_dn" \
        '(objectClass=GlueSite)' GlueSiteUniqueID |
      perl -p00e 's/\r?\n //g' |
      sed -ne 's/^GlueSiteUniqueID: \(.*\)/\1/p'
  done

The example is about the same size, but I've split the line up as much as possible for clarity. It might be possible to merge the perl line-wrap hack into the sed, so losing the perl dependency, but my sed is a little too rusty for that.
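The "navigate up until you find a match" idea can be sketched as pure string handling on the DN, without contacting the BDII. The function name ancestor_with_attr is hypothetical, and using Mds-Vo-name as the attribute that marks the site-level entry is an assumption based on the DN quoted earlier. The comparison is case-sensitive (LDAP attribute names are not), and escaped commas in RDN values are not handled.

```shell
# Hypothetical helper: print the nearest ancestor of DN whose leading RDN
# uses attribute ATTR; return non-zero if no ancestor matches.
ancestor_with_attr() {
  dn=$1
  attr=$2
  # Loop while a comma remains, i.e. while there is still a parent to move to.
  while [ "$dn" != "${dn#*,}" ]; do
    dn=${dn#*,}                      # drop the leading RDN (one step up)
    case $dn in
      "$attr"=*) printf '%s\n' "$dn"; return 0 ;;
    esac
  done
  return 1
}
```

Given an access-protocol DN under GlueSEUniqueID=ccsrm.in2p3.fr, calling it with Mds-Vo-name should yield Mds-Vo-name=IN2P3-CC,Mds-Vo-name=local,o=grid, which could then be used as the search base for the '(objectClass=GlueSite)' query.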
While it's true that the information does exist in the DN I think I would rather keep the ChunkKey concept anyway.
I'm not sure if parsing the DN to extract parent DN entries is a robust way of navigating, but I think it basically works [*].

Cheers,

Paul.

[*] - It doesn't work for a few sites that are published right now; but the first few I investigated all had broken information, like not publishing any GlueSite class object, or publishing objects that have no parent (!!), etc.