Re: [glue-wg] Some questions... [WAS: choosing XML Document structure for GLUE 2.0 rendering]

12 Dec 2007

      Hello Paul,

Paul Millar wrote:
...
Hi Sergio, rest-of-list,
On Tuesday 11 December 2007 06:43:49 Sergio Andreozzi wrote:
...
...
http://www.doodle.ch/participation.html?pollId=sg4v8qvy3h4h6d9t
this is a gentle reminder about the vote on the XML document structure.
Please express your opinion by December 12. If you choose option
3 or 6, you are invited to send your alternative as well.
Sorry I'm new to this discussion, but the current proposals don't make much 
sense to me.  Looking at the discussion[1] further confuses me.  At the risk 
of disrupting this process, I'd like to ask some questions...
[1] 
http://forge.ogf.org/sf/wiki/do/viewPage/projects.glue-wg/wiki/GLUE2XMLSchem...
First, I see that one of the rules is that ID is an element.  Why is this?  It 
seems we're reinventing the wheel here: XML already defines the concept of an 
ID (see [2] and [3]), which is used in related standards ([4], [5], etc...).  
Why not just use this?
[2]   http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-attribute-types
[3]   http://www.w3.org/TR/2005/REC-xml-id-20050909
[4]   http://www.w3.org/TR/1999/REC-xslt-19991116
[5]   http://www.w3.org/TR/1999/REC-xpath-19991116
the previous version of the XML rendering proposal had the ID as an 
attribute; after a discussion in the last telecon, we agreed to 
change it to an element.
I actually do not have a strong opinion on this. As regards your 
references, the most interesting to me is [3]. I would say that we are 
not reinventing the wheel because we are doing something different.
[3] defines a way to attach a unique ID (unique within an XML document) 
to an XML element. We are defining a property of a Grid concept (ID) 
which is supposed to be globally unique and is a URI. From a semantic 
viewpoint they are different. They sit in different namespaces, 
so the two should not conflict (if you see problems, 
please let me know).
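
To illustrate the distinction, here is a minimal Python sketch in which a document-local xml:id attribute and a GLUE-style ID element coexist on the same element. The fragment, the xml:id value "s1" and the service URN are made up for illustration; only the XML namespace URI is fixed by the XML specifications.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical fragment: "s1" and the service URN are illustrative values only.
DOC = '<Service xml:id="s1"><ID>urn:service:wms01</ID></Service>'

service = ET.fromstring(DOC)

# Document-local identifier: an attribute living in the XML namespace.
local_id = service.get(f"{{{XML_NS}}}id")

# GLUE-style identifier: an ordinary child element whose content is a URI.
glue_id = service.findtext("ID")

print(local_id, glue_id)
```

The two identifiers are read through different names, so neither shadows the other.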
...
Is the plan to render (nearly) everything as elements rather than attributes?
in the last telecon, we agreed that we'll use attributes only for 
metadata-like properties (basically CreationTime and Validity, see Sec. 
4.1 of the spec), while all the rest will be mapped to XML elements.
...
GLUE has many items that have "required" (1) or "optional" (0..1) cardinality and 
contain no further markup, so I feel they would, for the most part, be better 
rendered as XML attributes.
in my experience, this choice is mainly a matter of style. Attributes 
can only be of simple types and can only hold a single value.
Going for elements gives more flexibility for future changes and is also 
probably more usable (when writing queries, people don't have to remember 
which properties are single-valued, i.e. attributes, and which are 
multi-valued, i.e. elements).
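
A minimal Python sketch of this usability point, assuming a hypothetical Service fragment: CreationTime and Validity are rendered as attributes (as agreed in the telecon), and everything else as elements. The property values and the Capability entries are illustrative and not taken from the spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; all values are illustrative.
SERVICE = """
<Service CreationTime="2007-12-12T10:00:00Z" Validity="3600">
  <ID>urn:service:wms01</ID>
  <Type>org.glite.wms</Type>
  <Capability>capability.a</Capability>
  <Capability>capability.b</Capability>
</Service>
""".strip()


def values(element, name):
    """Return the text of every child element called `name`."""
    return [child.text for child in element.findall(name)]


service = ET.fromstring(SERVICE)

# Single-valued and multi-valued properties are read the same way.
types = values(service, "Type")
capabilities = values(service, "Capability")

# Metadata-like properties are the only ones read as attributes.
creation_time = service.get("CreationTime")

print(types, capabilities, creation_time)
```

With a mixed rendering, the caller would instead need to know for each property whether to use `get` or `findall`.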
...
Is the proposed GLUE/XML intended to be used by services when they publish 
information about themselves for site-level aggregate?  If so, the current 
proposal (the rule that One-to-Many relationships are represented as 
parent-child) looks as if a Service must know site-level information (such as 
AdminDomain).  This is undesirable for data normalisation.
the proposal is intended to be used by both primary services (e.g., 
OGSA-BES, SRM) which want to advertise their characteristics and by 
information services (both primary publishers and aggregators).
For primary services, the only constraint is to know the ID of their 
AdminDomain. That's all. They are not supposed to publish other 
AdminDomain attributes.
The AdminDomain ID will then be used to perform the aggregation at the 
higher level.

I prefer Option A because it makes queries by AdminDomain easier 
(no join is needed). And at the aggregation level, all the information 
belonging to a given AdminDomain is collected under a single element. 
I don't know how MDS 4 performs aggregation at the higher level, or 
whether this is compatible with its strategies. This is something 
to be investigated.
...
As an alternative, suppose One-to-Many relationships be represented as either 
an XML element hierarchy or (for top-level elements, only) as an attribute 
("parent", say) that has type URI and contains the URI of the containing 
element's ID.  A service could publish its information and only have to know 
the parent element's URI.
yep, this is an option as well. Many options are available. Probably, we 
should take a step back and clarify what we want to optimize.
In my opinion, we should concentrate on giving the final user the easiest 
and most intuitive way to query the properties.
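
As a rough illustration of the "parent" attribute alternative quoted above, here is a Python sketch in which top-level Service elements carry only the URI of their AdminDomain, and the consumer does the grouping (the join) itself. The attribute name "parent" comes from the proposal; the document layout, the second AdminDomain and the service URNs are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical flat rendering: each Service points at its AdminDomain by URI.
FLAT = """
<Grid>
  <AdminDomain><ID>urn:admindomain:t1.infn.it</ID></AdminDomain>
  <Service parent="urn:admindomain:t1.infn.it">
    <ID>urn:service:wms01</ID>
    <Type>org.glite.wms</Type>
  </Service>
  <Service parent="urn:admindomain:example.org">
    <ID>urn:service:wms02</ID>
    <Type>org.glite.wms</Type>
  </Service>
</Grid>
""".strip()


def services_by_parent(document):
    """Group service IDs by the AdminDomain URI in their parent attribute."""
    root = ET.fromstring(document)
    grouped = defaultdict(list)
    for service in root.findall("Service"):
        grouped[service.get("parent")].append(service.findtext("ID"))
    return dict(grouped)


grouped = services_by_parent(FLAT)
print(grouped)
```

The publisher only needs to know the parent URI, but a query by AdminDomain has to match on that URI rather than walk an element hierarchy.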

We certainly need more experience here, with a number of queries 
written against the different approaches.
One advantage of option A that I like is that a query remains 
valid whether you query the primary source of information or the 
aggregated layer.

Consider this, for instance: a simple XPath expression asking for the 
services of type org.glite.wms that belong to a certain AdminDomain:

/glue:Grid/AdminDomain[ID='urn:admindomain:t1.infn.it']/Service[Type='org.glite.wms']

this query works at both the primary source level and the aggregated 
level, and it also looks quite simple to me.
Of course, we need a larger set of queries to be used for evaluation.
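
A minimal Python sketch of this point, running the same path against a primary-source document and an aggregated one. The namespace URI bound to the glue: prefix, the second AdminDomain and all service URNs are hypothetical. ElementTree only supports a subset of XPath and evaluates paths relative to the element searched, so the leading /glue:Grid step is replaced by searching from the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the thread does not say what glue: is bound to.
GLUE_NS = "urn:example:glue2"

# What a single primary source might publish (illustrative values).
PRIMARY = f"""
<glue:Grid xmlns:glue="{GLUE_NS}">
  <AdminDomain>
    <ID>urn:admindomain:t1.infn.it</ID>
    <Service><ID>urn:service:wms01</ID><Type>org.glite.wms</Type></Service>
  </AdminDomain>
</glue:Grid>
""".strip()

# What an aggregator might publish after merging several sources.
AGGREGATED = f"""
<glue:Grid xmlns:glue="{GLUE_NS}">
  <AdminDomain>
    <ID>urn:admindomain:t1.infn.it</ID>
    <Service><ID>urn:service:wms01</ID><Type>org.glite.wms</Type></Service>
    <Service><ID>urn:service:other01</ID><Type>org.example.other</Type></Service>
  </AdminDomain>
  <AdminDomain>
    <ID>urn:admindomain:example.org</ID>
    <Service><ID>urn:service:wms02</ID><Type>org.glite.wms</Type></Service>
  </AdminDomain>
</glue:Grid>
""".strip()

# Same predicates as the XPath above, relative to the glue:Grid root.
QUERY = "AdminDomain[ID='urn:admindomain:t1.infn.it']/Service[Type='org.glite.wms']"


def wms_service_ids(document):
    """Return the IDs of the services selected by QUERY."""
    root = ET.fromstring(document)
    assert root.tag == f"{{{GLUE_NS}}}Grid"
    return [service.findtext("ID") for service in root.findall(QUERY)]


primary_result = wms_service_ids(PRIMARY)
aggregated_result = wms_service_ids(AGGREGATED)
print(primary_result, aggregated_result)
```

Under these assumptions the same query string selects the same service at both levels, without any join on the AdminDomain ID.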
...
Finally (just as a general comment) my impression is that there is too great 
an emphasis on XML Schema; because of this, the GLUE/XML rendering appears 
hampered by limitations of XSD and the rules are designed as if the XML is to 
fit what XSD supports (e.g., the "extensible enumerations" section).  If so, 
I feel this is "putting the cart before the horse": I feel the XML should 
convey precise and compact representation of the schema, whilst being easy to 
parse and comprehend.  "Hacks" to support extensibility in the XSD, like 
<State> vs <RunningState>, obfuscate the XML in favour of XML passing XSD 
validation checks.
we are trying to find the right balance, mainly by preserving ease of 
use. In the rules, I mentioned the option of SubstitutionGroups for 
completeness, but this is not the currently selected option.
At the moment, we prefer to go for the annotation option.
...
(I'm in favour of providing a validation mechanism, but does the validation 
need to be strong?  If it's a choice between having a simple XML design that 
can only be validated weakly via XSD or a complex XML that can be strongly 
validated, I'd prefer the former.)
yep, me too.

Thanks for your constructive feedback. I hope we can dedicate one more 
call before Xmas to the XML rendering so that we can refine all these 
choices and agree on the rationale behind them.
Please keep contributing, as opinions from different perspectives help us 
to make better choices.

Cheers, Sergio
...
Cheers,
Paul.
-- 
Sergio Andreozzi
INFN-CNAF,                    Tel: +39 051 609 2860
Viale Berti Pichat, 6/2       Fax: +39 051 609 2746
40126 Bologna (Italy)         Web: http://www.cnaf.infn.it/~andreozzi