
The worry about allowing absolutely anything and letting speculative parsing sort it out is that it will be easy for users to hang themselves, and that we will get very complex and inscrutable behaviors. Some redundancy is valuable.

Requiring users to specify elements as children of unordered sequences, to make the diagnostic messages simpler, also dodges the whole issue of whether a choice can be the child of an unordered sequence.

Ultimately, you can model anything along these lines as an array of choice, so the question is which simpler cases we make easier to express. For example, why don't we rule out the cases where there are N floating children out of M total children for N > 1? This just forces you to cross the line over to an unordered sequence if you have something where more than one kind of thing is floating around.

This whole issue really stems from fear of complexity. When something common and simple is expressed, I would like a simple DFDL implementation to be able to give users clear diagnostics when data is broken. The point of unordered initiated was that, if that is in fact a common case, the diagnostics would be very simple. If it's not a common enough case then of course we should drop it. But the fact that speculative parsing has to deal with this as a special case of even more complex things isn't, to me, a good argument for dropping it.

The perspective I am trying to convey is this: speculative parsing is a very big hammer. Many keywords could be dropped from DFDL if we lean on speculation everywhere. That doesn't mean we should do so. To me, most data formats shouldn't require any speculation, and it should be easy to work in the subset of DFDL where speculation plays no role.

...mikeb

On Jan 11, 2010, at 6:37 PM, Stephanie Fetzer <sfetzer@us.ibm.com> wrote:
All:
I remember a bit of a conversation on this topic from a while back. One of the things we were addressing was multi-level unorderedness: an unordered group (or an ordered group with floating components) within an unordered group (or an ordered group with floating components). This is allowed in WTX and we wanted to make sure that it was specifically allowed in DFDL. We do not restrict unordered groups at all in this respect.
In v34 the DFDL spec contained the concept of "unordered" versus "unorderedInitiated", but that was removed (with v35) and the 'children must be xs:element' phrase was added.
The other goal of the wording in the spec was to make sure that an unordered group was parsed/serialized the exact same way as an ordered group with ALL unordered components. If we have a group with n-1 floating components then that is really the functional equivalent of having n floating components. The one 'static' component will not anchor anything and all will still be unordered. If we had n-2 floating components then we would in fact have two components that would need to be in the same order relative to each other.
So from that perspective - the wording in 16.5 looks correct to me: An ordered sequence of n element children with either n or n-1 of those children with dfdl:floating="true" is equivalent to an unordered sequence with the same n element children with dfdl:floating="false".
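To make the n versus n-1 versus n-2 argument concrete, here is a minimal Python sketch. It assumes one particular reading of the floating rule, namely that non-floating children must keep their declared relative order while floating children may appear anywhere, and that each child occurs exactly once. The function name `allowed_orders` is hypothetical and is not taken from the spec.

```python
from itertools import permutations

def allowed_orders(children, floating):
    """Return every ordering of `children` in which the non-floating
    children keep their declared relative order (assumed semantics)."""
    fixed = [c for c in children if c not in floating]
    return [
        perm
        for perm in permutations(children)
        if [c for c in perm if c not in floating] == fixed
    ]

children = ["a", "b", "c"]
every_order = set(permutations(children))

# n floating children: any order is accepted.
assert set(allowed_orders(children, {"a", "b", "c"})) == every_order
# n-1 floating children: the single static child anchors nothing.
assert set(allowed_orders(children, {"a", "b"})) == every_order
# n-2 floating children: "b" must still precede "c", so only half remain.
assert len(allowed_orders(children, {"a"})) == 3
```

Under these assumptions the n and n-1 cases accept exactly the same orderings as an unordered sequence, while the n-2 case does not.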
Is the question then why this wording is in Section 16: "The children of an unordered sequence must be xs:element."? If so, I did not read that as any type of limitation on the contents of the element. I can have an element which contains a group of sequences containing groups. (Or have I misunderstood what this is trying to convey?) Perhaps that phrase isn't really saying anything useful at this point and should be removed. I don't believe that we want to go back to the unorderedInitiated concept (where we had a different set of rules for unordered groups if the content was all initiated).
- My take is that we should consider removing or further explaining the phrase "The children of an unordered sequence must be xs:element."
- The other statement, "An ordered sequence of n element children with either n or n-1 of those children with dfdl:floating="true" is equivalent to an unordered sequence with the same n element children with dfdl:floating="false".", looks fine as is, as far as I can see.
- I'd prefer we not reopen the unorderedInitiated concept again.
Cheers,
Stephanie Fetzer WebSphere Common Transformation Industry Packs - Software Engineer
From: Tim Kimber <KIMBERT@uk.ibm.com>
To: dfdl-wg@ogf.org
Date: 01/11/2010 12:52 PM
Subject: [DFDL-WG] Floating elements and unordered groups
Sent by: dfdl-wg-bounces@ogf.org
Hi all,
I know this area of the specification was only recently resolved, and I think there may be an inconsistency in the v0.37 wording.
Section 16, re: sequenceKind, says: "The children of an unordered sequence must be xs:element."
Section 16.5, Floating Elements, says: "An ordered sequence of n element children with either n or n-1 of those children with dfdl:floating="true" is equivalent to an unordered sequence with the same n element children with dfdl:floating="false". A complex element with dfdl:floating="true" can have as its content model a sequence with elements that also have dfdl:floating="true"."
Now suppose that, instead of N element children, there are N-1 floating element children + one non-floating group. This group will be equivalent to an unordered group with a non-element member. If the specification was intending to make life easy for implementers, then it should probably disallow groups in any non-ordered context, including when sequenceKind='ordered' and there is at least one floating component. But I think that would be too restrictive. I would be happy for the restriction to be lifted entirely. Given that unordered groups can have dfdl:initiated="false", it will sometimes be necessary to find the correct member by trial and error (speculative parsing) anyway. I don't think it's any more difficult to speculatively parse a group than to speculatively parse a complex element.
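As a rough illustration of the trial-and-error approach, the following Python sketch treats every member of an uninitiated unordered group as a parser function, so a group member and an element member are tried in exactly the same way. Everything here is an assumption made for illustration: the helper names `token`, `group` and `parse_speculative`, the toy data, and the rule that each member occurs exactly once. It is not how any DFDL processor is known to work.

```python
import re

def token(pattern):
    """Hypothetical leaf parser: match a regex at pos, return the end offset or None."""
    rx = re.compile(pattern)
    def parse(data, pos):
        m = rx.match(data, pos)
        return m.end() if m else None
    return parse

def group(*parts):
    """Hypothetical group parser: every part must match in sequence."""
    def parse(data, pos):
        for part in parts:
            pos = part(data, pos)
            if pos is None:
                return None
        return pos
    return parse

def parse_speculative(data, pos, remaining):
    """Try each remaining member at pos and backtrack on failure.
    Returns the member names in the order found, or None."""
    if not remaining:
        return [] if pos == len(data) else None
    for name, parser in remaining.items():
        end = parser(data, pos)
        if end is None:
            continue
        rest = {k: v for k, v in remaining.items() if k != name}
        tail = parse_speculative(data, end, rest)
        if tail is not None:
            return [name] + tail
    return None

members = {
    "id": token(r"[0-9]+"),                                       # element
    "name": token(r"[a-z]+"),                                     # element
    "range": group(token(r"\["), token(r"[0-9]+"), token(r"\]")), # group
}
```

With these toy members, `parse_speculative("bob[7]42", 0, members)` finds the members in the order name, range, id. When the data is broken the function can only return None, with no indication of which member was at fault.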
If I've missed something, and it turns out that the restriction is useful, then we should:
a) tighten up the wording to say that if a group with N members has N or N-1 floating members, then it must be validated as if it were an unordered group;
b) consider lifting the restriction in cases where dfdl:initiated="true" (because it makes things so much easier for the DFDL processor).
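Option (a) could be pictured as a schema-time check along the following lines. This is only a sketch under stated assumptions: the `Child` record, the "element"/"group" labels and both function names are hypothetical, and the extra requirement of at least one floating child (so that a plain single-child ordered sequence is not treated as unordered) is an assumption rather than spec wording.

```python
from dataclasses import dataclass

@dataclass
class Child:
    name: str
    kind: str               # "element" or "group" (hypothetical labels)
    floating: bool = False

def effectively_unordered(sequence_kind, children):
    """True if the sequence is unordered, or is ordered with N or N-1 floating members."""
    if sequence_kind == "unordered":
        return True
    floating = sum(c.floating for c in children)
    # Assumption: require at least one floating child.
    return floating >= 1 and floating >= len(children) - 1

def non_element_members(sequence_kind, children):
    """Names of children that would break the 'must be xs:element' rule."""
    if not effectively_unordered(sequence_kind, children):
        return []
    return [c.name for c in children if c.kind != "element"]

# N-1 floating element children plus one non-floating group.
scenario = [
    Child("a", "element", floating=True),
    Child("b", "element", floating=True),
    Child("g", "group"),
]
assert non_element_members("ordered", scenario) == ["g"]
```

Under this reading, the scenario with N-1 floating elements and one static group is flagged in the same way as an explicitly unordered sequence containing a group.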
regards,
Tim Kimber, Common Transformation Team, Hursley, UK Internet: kimbert@uk.ibm.com Tel. 01962-816742 Internal tel. 246742
Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
-- dfdl-wg mailing list dfdl-wg@ogf.org http://www.ogf.org/mailman/listinfo/dfdl-wg