Hi Sergio,

Thanks for the explanation. I understand if I'm late to the game and this has already been decided, but here are my thoughts on your points:

- I can see how it might be easier for a user to understand a hierarchical structure; it just seems to break down in places, such as when there are many-to-many associations or when it isn't clear whether one entity should be the parent of another. I personally find a flat structure as easy to understand as a hierarchical one for the GLUE information, but I may not be typical. If entities are stored in a flat structure, a query that matches yours is "I want all elements that have a ComputingService sub-element containing the text <ComputingService_ID>".

- A flat structure also seems to address this need: I run the same queries against the XML data store on an individual system as I do against our centralized XML data store.

- Using UNKNOWN as the IDs of upper elements seems to break the XPath searchability of the documents. It seems that these fragments would need to be composed before they can really be searched. I think this is a situation where the flat structure has an advantage.
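
To make the flat-structure query in the first point concrete, here is a minimal sketch in Python. The element layout, the `Entities` wrapper, and the IDs are assumptions made up for illustration; they are not copied from the TeraGrid schema, and `entities_of_service` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat document: entities are siblings, and associations are
# expressed as sub-elements holding the ID of the associated entity.
# Element names and IDs are illustrative, not taken from the real schema.
FLAT_DOC = """
<Entities>
  <ComputingService><ID>urn:svc:lonestar</ID></ComputingService>
  <ComputingEndpoint>
    <ID>urn:ep:gram</ID>
    <ComputingService>urn:svc:lonestar</ComputingService>
  </ComputingEndpoint>
  <ExecutionEnvironment>
    <ID>urn:env:compute</ID>
    <ComputingService>urn:svc:lonestar</ComputingService>
  </ExecutionEnvironment>
  <ExecutionEnvironment>
    <ID>urn:env:other</ID>
    <ComputingService>urn:svc:ranger</ComputingService>
  </ExecutionEnvironment>
</Entities>
"""


def entities_of_service(root, service_id):
    """Return top-level elements whose ComputingService child has this text."""
    # ElementTree's XPath subset supports the [tag='text'] predicate.
    return root.findall(f"./*[ComputingService='{service_id}']")
```

Calling `entities_of_service(ET.fromstring(FLAT_DOC), "urn:svc:lonestar")` selects the endpoint and the execution environment that reference that service, using the same expression whether the document comes from one system or from an aggregator.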

I was definitely coming at this from the perspective of how to generate and aggregate the data. However, I don't think the flat structure is generally less user friendly.

I haven't been doing complex queries against data in this schema yet. I unfortunately don't have time to go through the example queries in that GFD in detail, but I don't think the equivalent queries against a flattened structure would be very different (I am not an XPath expert, however).

If you'd like to see examples of documents using this flat structure, I've posted a few from our TeraGrid WS-MDS services at http://www.tacc.utexas.edu/~wsmith/glue2/. lonestar.glue2.xml and ranger.glue2.xml are 2 machines at TACC (these documents are the result of a query to the WS-MDS running on each system). teragrid.glue2.xml-multidoc shows documents that result from a query to our central TeraGrid WS-MDS.


Warren


Sergio Andreozzi wrote:
Hi Warren,

I'm coming back to this issue and I'd like to better clarify the
rationale behind the design strategy of the XSD mapping we provided,
which is described in this document:
http://forge.ogf.org/sf/go/doc15514

Have a look at section 3.1.5.

- we opted to reflect the natural hierarchical structure of information
when possible; this simplifies queries, e.g.:
"I want all the information concerning a certain computing service";
this can be answered with a simple XPath expression

- we require the full hierarchy to always be present because this offers
the same structure of information whether you query a primary source
of information or an aggregator service;
the same query will work in both situations

- we still allow fragments of info to be published; you need to publish
the hierarchy anyway, and if you do not know the ID of the upper
entities, you can put UNKNOWN
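
The first point above can be sketched in Python as follows. The nesting shown here, the element names, and the IDs are assumptions chosen for illustration and are not taken from the XSD mapping; `service_subtree` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested document: child entities live inside their parent.
# The layout and names are illustrative, not copied from the XSD.
NESTED_DOC = """
<Domains>
  <AdminDomain>
    <ID>urn:domain:tacc</ID>
    <ComputingService>
      <ID>urn:svc:lonestar</ID>
      <ComputingEndpoint><ID>urn:ep:gram</ID></ComputingEndpoint>
      <ExecutionEnvironment><ID>urn:env:compute</ID></ExecutionEnvironment>
    </ComputingService>
  </AdminDomain>
</Domains>
"""


def service_subtree(root, service_id):
    """Return the ComputingService element with this ID, or None."""
    # One expression returns the service together with everything nested in it.
    return root.find(f".//ComputingService[ID='{service_id}']")
```

Under these assumptions, the element returned for "urn:svc:lonestar" already contains its endpoint and execution environment, so no further lookups are needed to gather the information about that service.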

I feel that your approach is more suitable for a back-end generation and
aggregation strategy, while the above approach is more suited for
exposing the info to consumers.

Do you have experience with complex queries on information represented
using your approach?
You may want to compare it to the use cases provided in this document:

http://www.ogf.org/documents/GFD.137.pdf

It would be useful to better understand the pros and cons.


Cheers, Sergio



Warren Smith wrote:
  
If you'd like to see the XML Schema that I came up with for the
recommendation, it is at
http://software.teragrid.org/pacman/ctss4/glue2/glue2-1.0.0-r1/teragrid_glue_2.0_r01.xsd


Its structure is a little different from the XML Schemas from the
working group. The main differences are:

* It is a flat structure with associations represented as references
to the IDs of the associated entities.

* Many of the classes are included as top-level elements.


The main reason that I chose this approach was to make it easier to
compose information from multiple sources. For example, I have one
sensor that gathers information about jobs/activities and another that
gathers information about nodes/execution environments. Plus other
sensors on this system and other systems. I wanted to be able to run
the sensors independently and then compose their results because the
information the sensors gather changes at different time scales and
the sensors can be on different systems. I felt that the composition
would be easier if I avoided representing associations as sub-elements.
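
A minimal sketch of this kind of composition in Python, assuming each sensor emits a flat document under a common wrapper element. The `Entities` wrapper, the element names, the IDs, and the `compose` and `resolve` helpers are all hypothetical and are not taken from the schema above.

```python
import xml.etree.ElementTree as ET

# Hypothetical outputs of two independent sensors; names are illustrative.
JOBS_DOC = """
<Entities>
  <ComputingActivity>
    <ID>urn:job:1</ID>
    <ExecutionEnvironment>urn:env:compute</ExecutionEnvironment>
  </ComputingActivity>
</Entities>
"""
NODES_DOC = """
<Entities>
  <ExecutionEnvironment><ID>urn:env:compute</ID></ExecutionEnvironment>
</Entities>
"""


def compose(*documents):
    """Concatenate the top-level entities of several flat documents."""
    merged = ET.Element("Entities")
    for text in documents:
        # No parent element has to be located: entities are simply appended.
        merged.extend(list(ET.fromstring(text)))
    return merged


def resolve(root, tag, entity_id):
    """Follow an ID reference to the top-level entity it names, or None."""
    return root.find(f"./{tag}[ID='{entity_id}']")
```

With these definitions, `compose(JOBS_DOC, NODES_DOC)` yields one document, and the reference held by the job can be followed with `resolve(merged, "ExecutionEnvironment", "urn:env:compute")`. Because associations are ID references rather than nesting, neither sensor needs to know about the other's output.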

I also wanted the information provided by a sensor to be a valid
document, so I have many classes as top-level elements. The XML
Schemas from the working group only had Domains as a top-level element.


I'm not sure if folks will like or dislike these differences, but I
thought I'd share them.


Warren


Sergio@CNAF wrote:
    
Hi Weijian,

I'm responsible for the XSD mapping together with Balazs. I have had no
time recently to work on it.
The planned release date was 10 April
(http://forge.ogf.org/sf/wiki/do/viewPage/projects.glue-wg/wiki/ListOfImplementors)

In May, we will be able to release it.

Cheers, Sergio


Weijian Fang wrote:

      
I found a GLUE2 XML schema at
http://schemas.ogf.org/glue/2008/05/spec_2.0_d42_r01, but it is not
very consistent with GLUE spec v. 2.0.  Is there an updated one
consistent with the GLUE 2 conceptual model? Thanks!

Cheers,

Weijian
_______________________________________________
glue-wg mailing list
glue-wg@ogf.org
http://www.ogf.org/mailman/listinfo/glue-wg


        