How do I preserve order in an unordered list?

(Sent on behalf of Roger Costello)

Hi Folks,

I am trying out the sequenceKind="unordered" property.

I created this simple test:

    <xs:element name="Test">
      <xs:complexType>
        <xs:sequence dfdl:sequenceKind="unordered" dfdl:separator="%SP;">
          <xs:element name="A" type="xs:string" dfdl:initiator="A:" />
          <xs:element name="B" type="xs:string" dfdl:initiator="B:" />
        </xs:sequence>
      </xs:complexType>
    </xs:element>

That schema says the input data must consist of A:___ and B:___, in any order.

Here is sample input:

    B:Cat A:Dog

I processed the input using the schema and here is the result that I got from Daffodil:

    <Test>
      <A>Dog</A>
      <B>Cat</B>
    </Test>

Notice that the order of the input data changed in the result XML. This was quite surprising to me.

Upon consulting the DFDL specification, it appears that the exhibited behavior is expected:

    ... a DFDL processor must sort the members of an
    unordered group into schema order when parsing.

QUESTIONS:

1. Is this really the desired behavior? I would not expect parsing to alter the order of any data. From my XML Schema experience, I would be shocked if an XML Schema validator altered the order of markup in XML instances simply because the XML Schema specified <all> (unordered sequence).

2. If we grant that this really is the desired behavior, then how do I create an unordered sequence in which DFDL parsing preserves the order of the data? In the above example, if the input data lists B:Cat first and A:Dog second, then how do I get that order preserved in the result XML?

/Roger

Yes, the behavior is entirely intentional.

Erase the DFDL annotation: now your model has a <sequence> containing first element A, then element B. If we allowed those to be in the other order your infoset would not be valid for your model.

If you want to preserve the order information you need to model that differently. Perhaps via an array with an element containing a choice inside it.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy <http://www.ogf.org/About/abt_policies.php>
-- dfdl-wg mailing list dfdl-wg@ogf.org https://www.ogf.org/mailman/listinfo/dfdl-wg
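
Mike's suggestion (an array whose element contains a choice) might be sketched roughly as below. This is only an illustration, not a schema from the thread: the wrapper name "Item" is hypothetical, dfdl:initiatedContent="yes" is an assumption about how the choice branch would be selected from the initiators, and the default format properties that the original test relied on (length kind, occurs count kind, and so on) are assumed to be defined elsewhere and are not shown.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- A dfdl:format annotation with default properties is assumed and omitted here. -->
  <xs:element name="Test">
    <xs:complexType>
      <!-- Ordinary (ordered) sequence: the only child is a repeating wrapper. -->
      <xs:sequence dfdl:separator="%SP;">
        <!-- "Item" is a hypothetical name for the wrapper (array) element. -->
        <xs:element name="Item" maxOccurs="unbounded">
          <xs:complexType>
            <!-- Each occurrence holds exactly one of A or B, so the
                 occurrences follow the order found in the data. -->
            <xs:choice dfdl:initiatedContent="yes">
              <xs:element name="A" type="xs:string" dfdl:initiator="A:" />
              <xs:element name="B" type="xs:string" dfdl:initiator="B:" />
            </xs:choice>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because the schema-level order is now just "a run of Item elements", the infoset stays valid against the model whatever order A and B arrive in. Note that this model by itself does not require exactly one A and one B; that constraint would have to be expressed separately.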

(On behalf of Roger Costello)

On Tuesday, October 08, 2013 11:06 AM, Mike Beckerle wrote:

> Erase the DFDL annotation: now your model has a <sequence>
> containing first element A, then element B. If we allowed those
> to be in the other order your infoset would not be valid for your
> model.

My goodness, sequenceKind="unordered" is a strange beast. So it means:

    The input data can be in any order, but I (the parser) am going to reorder the data into the sequence listed here (in the schema).

That certainly is contrary to unordered lists in XML Schema.

> If you want to preserve the order information you need to model
> that differently. Perhaps via an array with an element containing a
> choice inside it.

Well, that's what James has had to do in his email schema. He introduced a bogus array element to compensate for the lack of unordered lists.

The reason we pushed for sequenceKind="unordered" is that we thought it allowed us to create unordered lists (with order preserved). Apparently we were mistaken.

So to recap: there is no way to create an unordered list, with order preserved, without introducing a bogus wrapper element. Bummer.

/Roger

James - answers in-line below.

Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 08/10/2013 15:21
Subject: [DFDL-WG] How do I preserve order in an unordered list?
Sent by: dfdl-wg-bounces@ogf.org

> (Sent on behalf of Roger Costello)
> [...]
> QUESTIONS:
>
> 1. Is this really the desired behavior? I would not expect parsing to alter the order of any data. From my XML Schema experience, I would be shocked if an XML Schema validator altered the order of markup in XML instances simply because the XML Schema specified <all> (unordered sequence).

SMH: Yes it is the intended behaviour. It's because this is not xs:all, but xs:sequence with a DFDL annotation. As far as XSDL is concerned, it is an xs:sequence so the elements must appear in the sequence order (otherwise validation will fail). The original intent was to use xs:all but it imposes the restriction that maxOccurs = 1 on the elements in the group, precluding repeats, which was felt too restrictive, so it was not included in the DFDL subset of XSDL.

> 2. If we grant that this really is the desired behavior, then how do I create an unordered sequence in which DFDL parsing preserves the order of the data? In the above example, if the input data lists B:Cat first and A:Dog second, then how do I get that order preserved in the result XML?

SMH: In DFDL 1.0 you use xs:choice within a repeating element. I agree this is not ideal but it does preserve the order, at the expense of introducing a parent element into the infoset. The latest revision of the DFDL spec is in Public Comment; you are very welcome to create a comment on the OGF site if you feel that xs:all should be supported. The link is http://redmine.ogf.org/projects/editor-pubcom/boards/15.

> /Roger

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
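
To make the trade-off concrete: with xs:choice inside a repeating element, and assuming the repeating parent element is named "Item" (a hypothetical name, not taken from the thread), the sample input B:Cat A:Dog would be expected to yield an infoset shaped like the sketch below. This shows the expected shape only; it is not captured Daffodil output.

```xml
<Test>
  <!-- Each Item is one occurrence of the repeating parent element,
       so the occurrences keep the order found in the input data. -->
  <Item><B>Cat</B></Item>
  <Item><A>Dog</A></Item>
</Test>
```

Compare this with the result from the sequenceKind="unordered" schema, where A and B are direct children of Test and are sorted into schema order. The extra Item level is the "parent element" cost mentioned above.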

All,

Just wanted to point out that Roger Costello posted his public comment from this thread on the #119 DFDL v1.0 Experience 2 forum (http://redmine.ogf.org/projects/editor-pubcom/boards/17) instead of on the #117 DFDL v1.0 Revision forum with the rest of the public comments (http://redmine.ogf.org/projects/editor-pubcom/boards/15). I almost moved it, but figured it was fine where it was, since all three forums are for public comments. I just wanted to call your attention to it so it didn't get overlooked.

I posted a follow-up to Roger's post as well. Both comments are copied below for your review.

Added by Roger Costello 7 days ago

    The sequenceKind="unordered" property is a strange beast. It means:

        The input data can be in any order, but I (the parser) am going to reorder the data into the sequence listed here (in the schema).

    We need real unordered lists. That is, the parser must not muck with the data - the order of the data must be preserved.

    Please support one or both of these:

    1. Support the XML Schema 1.1 <all> element.

    2. Support a new property, to be used with sequenceKind="unordered". The property is used to specify whether the parser is allowed to reorder the input data. How about calling it: allowedToDorkWithTheOrder = true/false

Replies (1)

RE: Please support (real) unordered lists <http://redmine.ogf.org/boards/17/topics/128?r=143#message-143> - Added by Jonathan Cranford about 1 hour ago

    I'll expand a bit on Roger's request above. The changes that XSD 1.1 made to <all> that are most relevant to supporting an unordered list capability are the following, I believe:

    * "The value of maxOccurs may now be greater than 1 on particles in an all group. The elements which match a particular particle need not be adjacent in the input." (from http://www.w3.org/TR/xmlschema11-1/#ch_models)
    * minOccurs can be greater than 1.

    Here's the big question, as I see it: Is there anything that would prevent DFDL 1.0 from cherry-picking XSD 1.1 features as we're suggesting? As I understand it, the design goals of DFDL include (1) having an infoset compatible with XSD processors and (2) having DFDL schema files that are compatible with XSD processors. If DFDL 1.0 expands what is allowed in <all> the same as XSD 1.1 does, I don't think that would impact the infoset, but it would impact the schema file; the resulting schema file could only be processed by an XSD 1.1 processor.

    Would that be an impediment to expanding what's allowed in <all> in DFDL 1.0? If so, <all> would carry the same restrictions as in XSD 1.0; namely, the particles within <all> would have to have maxOccurs equal to 1 and minOccurs equal to either 0 or 1. I think that would limit the utility of using <all> to represent unordered lists.

Respectfully,
Jonathan Cranford
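
For reference, the XSD 1.1 relaxation quoted above would permit a plain XML Schema along the lines of the following sketch. It is XSD 1.1 only (an XSD 1.0 processor would reject the maxOccurs values inside the all group), it carries no DFDL annotations, and it is not valid DFDL 1.0, which leaves xs:all out of its schema subset. The element names are simply reused from Roger's test.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Test">
    <xs:complexType>
      <!-- XSD 1.1: particles in an all group may repeat,
           and matching elements need not be adjacent. -->
      <xs:all>
        <xs:element name="A" type="xs:string" minOccurs="0" maxOccurs="unbounded" />
        <xs:element name="B" type="xs:string" minOccurs="0" maxOccurs="unbounded" />
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An XSD 1.1 validator would accept A and B children of Test in any interleaving and would leave the instance order untouched, which is the behavior being requested for DFDL.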

Thanks for bringing the public comment to our attention. In theory there is a 'watch' facility on these forums, but I don't ever get notified of new posts or replies, so either it doesn't work or I am doing something wrong?

I understand the requirement completely: there are industry formats that IBM needs to model and that are unordered, and ideally the elements should be presented in the original order. So it is a concern to me as well. The sticking point is how to solve it. I've posted a reply to move the discussion along (http://redmine.ogf.org/boards/17/topics/128?r=146).

We will be going through all the public comments either on WG calls or extra calls set up to speed things up.

Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: "Cranford, Jonathan W." <jcranford@mitre.org>
To: Steve Hanson/UK/IBM@IBMGB
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, "Garriss Jr., James P." <jgarriss@mitre.org>, "Costello, Roger L." <costello@mitre.org>
Date: 16/10/2013 00:44
Subject: RE: [DFDL-WG] How do I preserve order in an unordered list?

> All, Just wanted to point out that Roger Costello posted his public comment from this thread on the #119 DFDL v1.0 Experience 2 forum (http://redmine.ogf.org/projects/editor-pubcom/boards/17) instead of on the #117 DFDL v1.0 Revision forum with the rest of the public comments (http://redmine.ogf.org/projects/editor-pubcom/boards/15).
> [...]
participants (4)
- Cranford, Jonathan W.
- Garriss Jr., James P.
- Mike Beckerle
- Steve Hanson