In the previous lessons, every element had to occur exactly once in the data stream. In this lesson we will learn how to model elements that occur optionally and elements that occur repeatedly.
The permitted number of occurrences of an element in the data stream is given by the element's XML Schema minOccurs and maxOccurs properties. Because these are XML Schema properties and not DFDL properties, either can be omitted, in which case it defaults to '1'.
An element that occurs exactly once is identified by both minOccurs and maxOccurs having the value '1' or defaulting to '1'.
Elements can be modeled to occur optionally. An optional element is identified by setting minOccurs to '0' and setting dfdl:occursCountKind to indicate how to determine if the element is present.
Elements can also occur repeatedly, in which case they are referred to as arrays. An array element is identified by setting maxOccurs to a value greater than '1' or to the special value 'unbounded' (meaning there is no maximum number of occurrences), and by setting dfdl:occursCountKind to indicate how to determine how many occurrences of the element are present. The number of occurrences in an array can be fixed or can vary. A fixed array is identified by setting both minOccurs and maxOccurs to the same value greater than '1'. If minOccurs is set to '0' then the element is both optional and an array.
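The combinations described above can be sketched side by side. This is an illustrative fragment only: the element names are hypothetical, the dfdl:occursCountKind values are placeholders for whichever kind suits the data, and an enclosing DFDL schema is assumed to supply the remaining format properties.

```xml
<!-- Fragment only: assumes an enclosing DFDL schema with a default dfdl:format.
     All element names here are hypothetical. -->
<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Exactly once: minOccurs and maxOccurs are omitted, so both default to 1 -->
  <xs:element name="exactlyOnce" type="xs:string"/>

  <!-- Optional: zero or one occurrence -->
  <xs:element name="optionalItem" type="xs:string"
              minOccurs="0" maxOccurs="1"
              dfdl:occursCountKind="implicit"/>

  <!-- Fixed array: minOccurs and maxOccurs have the same value greater than 1 -->
  <xs:element name="fixedArray" type="xs:string"
              minOccurs="3" maxOccurs="3"
              dfdl:occursCountKind="fixed"/>

  <!-- Variable array: between 1 and 5 occurrences -->
  <xs:element name="variableArray" type="xs:string"
              minOccurs="1" maxOccurs="5"
              dfdl:occursCountKind="implicit"/>

  <!-- Optional and an array: minOccurs is 0 and there is no upper bound -->
  <xs:element name="optionalArray" type="xs:string"
              minOccurs="0" maxOccurs="unbounded"
              dfdl:occursCountKind="implicit"/>
</xs:sequence>
```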
dfdl:occursCountKind is an important property and must be set for all optional elements and array elements. It determines how the number of occurrences is identified in the data stream.
Note that XSD minOccurs and maxOccurs are only used to assist the parser when dfdl:occursCountKind is 'fixed' or 'implicit'. However, if validation is switched on, the parser checks that the number of occurrences in the infoset is within the bounds specified by minOccurs and maxOccurs, whatever the setting of dfdl:occursCountKind.
We extend the variable length Address example to model an optional 'country' element.
The optional 'country' element on line 24 is indicated by XSD minOccurs='0'.
The parser needs to be able to find out whether the element is present in the data stream. For initiated elements like the example above, this is done easily by looking for the initiator, so dfdl:occursCountKind should be set to 'parsed' or 'implicit'.
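A minimal sketch of an optional initiated element follows. It is not the tutorial's full Address schema, so the line numbers quoted in the text do not apply to it. The initiator text 'country:' is an assumption modeled on the 'street:' initiator used elsewhere in this lesson, and the enclosing schema is assumed to define the remaining format properties.

```xml
<!-- Fragment only: assumes an enclosing DFDL schema with a default dfdl:format. -->
<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Optional element: minOccurs="0" means it may be absent.
       The initiator 'country:' is an assumed value; the parser looks for it
       to decide whether an occurrence is present. -->
  <xs:element name="country" type="xs:string"
              minOccurs="0" maxOccurs="1"
              dfdl:initiator="country:"
              dfdl:occursCountKind="implicit"/>
</xs:sequence>
```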
Text elements without an initiator are often not optional. If they are, there is usually something in the data stream that can indicate the absence of the optional element, such as a delimiter or perhaps another element earlier in the data stream (dfdl:occursCountKind 'expression').
Fixed length binary fields are usually not optional, but if they are, the same techniques can be used as for text elements.
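The dfdl:occursCountKind 'expression' case can be sketched as follows, assuming a hypothetical earlier element 'countryCount' that holds 0 or 1 and is not part of the tutorial's Address example. The dfdl:occursCount property supplies the expression that gives the number of occurrences.

```xml
<!-- Fragment only: assumes an enclosing DFDL schema with a default dfdl:format.
     'countryCount' is a hypothetical element introduced for illustration. -->
<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Appears earlier in the data stream and says how many 'country' follow -->
  <xs:element name="countryCount" type="xs:unsignedInt"/>

  <!-- The occurrence count is taken from the earlier sibling element -->
  <xs:element name="country" type="xs:string"
              minOccurs="0" maxOccurs="1"
              dfdl:occursCountKind="expression"
              dfdl:occursCount="{ ../countryCount }"/>
</xs:sequence>
```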
Fixed arrays are typical of fixed length data where there are no initiators, so all elements are identified by their position in the data stream. We extend the fixed length Address example to model the 'street' element repeating a fixed number of times.
On line 11 element 'street' occurs exactly twice so XSD minOccurs and maxOccurs are both set to '2' and dfdl:occursCountKind is 'fixed'.
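A minimal sketch of such a fixed array declaration is shown here. It is not the tutorial's full schema, so the line number quoted in the text does not apply to it, and the explicit length of 20 is an assumed value chosen for illustration.

```xml
<!-- Fragment only: assumes an enclosing DFDL schema with a default dfdl:format. -->
<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Exactly two occurrences, each located by position.
       The length of 20 is an assumption, not a value from the tutorial. -->
  <xs:element name="street" type="xs:string"
              minOccurs="2" maxOccurs="2"
              dfdl:occursCountKind="fixed"
              dfdl:lengthKind="explicit" dfdl:length="20"/>
</xs:sequence>
```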
Variable arrays are typical of data where there are initiators.
We extend the variable length address example to model the 'street' element repeating a variable number of times.
On line 18 of the DFDL Schema, the element 'street' occurs 1 or 2 times, so XSD minOccurs is set to '1' and XSD maxOccurs is set to '2'. The element has an initiator 'street:' and can be easily identified in the data stream, so dfdl:occursCountKind is 'parsed' or 'implicit'.
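The declaration described above might look like the following sketch. It shows only the 'street' element, not the tutorial's full schema, and it assumes the enclosing schema defines the remaining format properties.

```xml
<!-- Fragment only: assumes an enclosing DFDL schema with a default dfdl:format. -->
<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- One or two occurrences; every occurrence in the data
       must begin with the initiator 'street:' -->
  <xs:element name="street" type="xs:string"
              minOccurs="1" maxOccurs="2"
              dfdl:initiator="street:"
              dfdl:occursCountKind="implicit"/>
</xs:sequence>
```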
Notice that when an array element has an initiator, every occurrence in the data must have the initiator. What happens if only the first occurrence in the data has the initiator, so the initiator is really identifying the array as a whole?
This is best modeled by wrapping the array element in a complex element which carries the initiator, and removing the initiator from the array element.
A commonly used technique is to use a different separator between the occurrences. This enables the occurrences of the array element to be distinguished from the following element, which is especially useful when there are no initiators. Again, this is best modeled by wrapping the array element in a complex element, the xs:sequence of which carries the different separator.
We extend the example to show both of these: the 'street' element loses its initiator and is wrapped by an element called 'streets', which has the initiator 'streets:' and an xs:sequence with the different separator '~'.
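The wrapping technique can be sketched as follows. The occurrence bounds of 1 to 2 are carried over from the earlier variable array example as an assumption, and the enclosing schema is assumed to define the remaining format properties, including no initiator for 'street'.

```xml
<!-- Fragment only: assumes an enclosing DFDL schema with a default dfdl:format. -->
<xs:element name="streets"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
            dfdl:initiator="streets:">
  <xs:complexType>
    <!-- The wrapper's sequence carries the separator used between occurrences -->
    <xs:sequence dfdl:separator="~" dfdl:separatorPosition="infix">
      <!-- The array element itself no longer has an initiator -->
      <xs:element name="street" type="xs:string"
                  minOccurs="1" maxOccurs="2"
                  dfdl:occursCountKind="implicit"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

The initiator now appears once for the whole array, and the '~' separator distinguishes one 'street' occurrence from the next without being confused with the separator of the enclosing sequence.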
In this lesson we have looked at how you can define optional and repeating elements. We have seen that there is more than one way in DFDL to determine the number of occurrences of an element in the data stream, controlled by the dfdl:occursCountKind property, and shown some examples. We have only shown simple elements but the same principles apply to complex elements, allowing the modeling of optional and repeating structures.