
Hi Tim

I've made some minor corrections to your summary of the problem.

If the user restructures his model to wrap the sequences in elements then the problem goes away. So I think we should keep the solution to this as simple as we can while not being unnecessarily restrictive.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 13/04/2012 10:54
Subject: [DFDL-WG] How to choose the correct choice branch when serializing
Sent by: dfdl-wg-bounces@ogf.org

There is an interesting edge case which arises when the serializer encounters a choice group. A DFDL XSD is structured as follows:

    <root>
      <choice>
        <sequence>
          <firstname/>
          <lastname/>
          <postcode/>
        </sequence>
        <sequence>
          <lastname/>
          <telephoneNumber/>
        </sequence>
      </choice>
    </root>

Note that both branches of the choice are sequences, not elements.

The infoset is:

    <root>
      <lastname/>
      <telephoneNumber/>
    </root>

The likely action of the serializer is:
- pick the first branch of the choice (because it contains lastname)
- output the default value of firstname (assuming that firstname has minOccurs = 1 and has a default)
- output lastname
- issue a processing error because telephoneNumber is found in the infoset but is not in the first branch.

...but from the infoset the user clearly intended:
- select the second branch of the choice and successfully process the entire infoset

The DFDL specification does not state what the behaviour should be.
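
The likely serializer behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any DFDL processor is implemented: the branch model (tuples of name, minOccurs, default), the function names, the `ProcessingError` class and the default value "UNKNOWN" for firstname are all assumptions made for the example. It also assumes that postcode has minOccurs = 1 with no default, and that the infoset element is spelled `lastname`, as in the schema.

```python
# Hypothetical model of the two choice branches: (name, min_occurs, default).
# None of these names or values come from the DFDL specification.
BRANCHES = [
    [("firstname", 1, "UNKNOWN"), ("lastname", 1, None), ("postcode", 1, None)],
    [("lastname", 1, None), ("telephoneNumber", 1, None)],
]

# The infoset from the example, reduced to element names in document order.
INFOSET = ["lastname", "telephoneNumber"]


class ProcessingError(Exception):
    """Stands in for a processing error raised while serializing."""


def pick_first_matching_branch(branches, infoset):
    """Return the index of the first branch that declares the first infoset element."""
    first = infoset[0]
    for index, branch in enumerate(branches):
        if any(name == first for name, _, _ in branch):
            return index
    raise ProcessingError(f"no branch contains {first!r}")


def serialize_branch(branch, infoset):
    """Serialize the infoset against one branch, with no backtracking."""
    declared = {name for name, _, _ in branch}
    output = []
    cursor = 0  # position in the branch's declared element order
    for item in infoset:
        if item not in declared:
            raise ProcessingError(f"{item!r} is not in the chosen branch")
        # Skip over declared elements that are absent from the infoset.
        while cursor < len(branch) and branch[cursor][0] != item:
            name, min_occurs, default = branch[cursor]
            if min_occurs >= 1:
                if default is None:
                    raise ProcessingError(f"required element {name!r} is missing")
                output.append(f"{name}=<default>")
            cursor += 1
        if cursor == len(branch):
            raise ProcessingError(f"{item!r} is out of order for this branch")
        output.append(item)
        cursor += 1
    # Any required elements left over must have a default.
    for name, min_occurs, default in branch[cursor:]:
        if min_occurs >= 1:
            if default is None:
                raise ProcessingError(f"required element {name!r} is missing")
            output.append(f"{name}=<default>")
    return output


if __name__ == "__main__":
    chosen = pick_first_matching_branch(BRANCHES, INFOSET)
    try:
        print(serialize_branch(BRANCHES[chosen], INFOSET))
    except ProcessingError as error:
        print(f"branch {chosen + 1} failed: {error}")
    # The branch the user intended serializes without error.
    print(serialize_branch(BRANCHES[1], INFOSET))
```

With these assumptions the first branch is selected because it declares lastname, the default for firstname is emitted, and serialization then fails on telephoneNumber, whereas the second branch accepts the whole infoset.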
I think the options are:

a) state explicitly that the serializer will choose the first branch that contains a matching element, regardless of minOccurs
b) invent a new rule that causes the serializer to back out of a branch and try another branch if there is a minOccurs error while processing the branch
c) disallow sequences and choices as immediate children of a choice group

Currently I'm leaning toward a) by process of elimination, for the following reasons:

b) would make this scenario work, but I think it would impose a lot of work on implementers because it would require the serializer to do backtracking.
c) would simplify a lot of things, but I think it's too restrictive. I can imagine complex data formats where it might be useful to have a choice as the direct child of a choice, because the discrimination rules might be easier to express in a two-level structure.

regards,

Tim Kimber, Common Transformation Team, Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 246742

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg