How to specify type/subtype combos?

Content types come in pairs. So while the type can be "text" and the subtype can be "pdf", the complete content type cannot be "text/pdf". It can, however, be "text/html" or "application/pdf".

So in my DFDL it's not sufficient to specify enumerated lists for types and subtypes (as I have done below); I must also specify which types are allowed to go with which subtypes.

How would I do that, given this complexType? TIA!

    <xsd:complexType name="MimeTypeType">
      <xsd:sequence dfdl:separator="/">
        <xsd:element name="Type" dfdl:initiator="">
          <xsd:annotation>
            <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
              <dfdl:assert test="{ dfdl:checkConstraints(.) }"
                           message="The type must match one of the values on the enumerated list."/>
            </xsd:appinfo>
          </xsd:annotation>
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <!-- Case-insensitive match: [aA] accepts either case of each letter. -->
              <xsd:pattern value="([aA][pP][pP][lL][iI][cC][aA][tT][iI][oO][nN])|([mM][uU][lL][tT][iI][pP][aA][rR][tT])|([mM][eE][sS][sS][aA][gG][eE])|([tT][eE][xX][tT])"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
        <xsd:element name="Subtype" dfdl:terminator="">
          <xsd:annotation>
            <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
              <dfdl:assert test="{ dfdl:checkConstraints(.) }"
                           message="The subtype must match one of the values on the enumerated list."/>
            </xsd:appinfo>
          </xsd:annotation>
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <!-- Same case-insensitive technique for the subtype names. -->
              <xsd:pattern value="([pP][dD][fF])|([aA][lL][tT][eE][rR][nN][aA][tT][iI][vV][eE])|([mM][iI][xX][eE][dD])|([rR][fF][cC]822)|([pP][lL][aA][iI][nN])|([hH][tT][mM][lL])"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>

You could use an assert on the sequence that tests for the allowed combinations. Such an assert is evaluated at the end of the sequence. If you get that far, then both the Type and Subtype must have passed their individual asserts. Use fn:upper-case() to handle mixed case. Note, though, that you can't reuse the information in the facets in the assert's expression.

Strictly speaking, this sort of cross-field 'complex' validation is not something DFDL was designed to do. The WG has considered this a post-parse validation step, the sort of thing that folk use Schematron or similar tools to do. Although asserts are designed as a parsing aid, they can be used to do some limited amount of complex validation, and I think this is an example where it would work OK.

Let us know how you get on.

Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 20/06/2013 16:34
Subject: [DFDL-WG] How to specify type/subtype combos?
Sent by: dfdl-wg-bounces@ogf.org

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
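A sequence-level assert of the kind suggested above might look roughly like the following. Treat it as a sketch under assumptions rather than a tested schema: the permitted pairs are only the examples mentioned in this thread, the paths assume unqualified local element names, and the message wording is made up for illustration. How the relative paths resolve should be checked against your DFDL processor.

```xml
<xsd:sequence dfdl:separator="/">
  <xsd:annotation>
    <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <!-- Intended to run once Type and Subtype have both been parsed.
           fn:upper-case() folds the input so one comparison covers any case. -->
      <dfdl:assert message="The type and subtype must be a permitted pair.">
        { (fn:upper-case(./Type) eq 'TEXT' and
             (fn:upper-case(./Subtype) eq 'PLAIN' or
              fn:upper-case(./Subtype) eq 'HTML'))
          or
          (fn:upper-case(./Type) eq 'APPLICATION' and
             fn:upper-case(./Subtype) eq 'PDF') }
      </dfdl:assert>
    </xsd:appinfo>
  </xsd:annotation>
  <!-- The Type and Subtype element declarations from the original
       complexType go here, unchanged. -->
</xsd:sequence>
```

Because the pairs are written out by hand, the list has to be kept in step with the pattern facets on Type and Subtype; as noted above, the expression cannot read the facets.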

This is most easily done by computing two physical fields that just grab the strings, then two logical fields that standardize those strings to lower case (or upper if you prefer), then an assert with a big case analysis that passes for valid pairings and fails for invalid pairings.

Because DFDL expressions are based on XPath, only nested if-then-else is available to do this. That said, at least it is straightforward what you need to create:

    <dfdl:assert message="Illegal type and subtype pair.">
      { if (../Type eq 'text')
        then (if (../Subtype eq 'html') then fn:true()
              else if (../Subtype eq 'plain') then fn:true()
              else fn:false())
        else if (../Type eq 'application')
        then (if (../Subtype eq 'pdf') then fn:true() else fn:false())
        else fn:false() }
    </dfdl:assert>

I believe we need the fn:error function (not currently in the DFDL spec, but it has been discussed), so that you can get a reasonable diagnostic message from this: not just "Assertion failed: illegal type and subtype pair.", but something that actually mentions what the type and subtype values were.

-- Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
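The "two physical fields, two logical fields" arrangement described above could be sketched as follows, using dfdl:inputValueCalc for the computed fields. The names TypeLower and SubtypeLower are hypothetical, the facets and per-element asserts from the original schema are left out for brevity, and the sketch assumes the remaining DFDL properties come from a default format defined elsewhere. It has not been run against a DFDL processor.

```xml
<xsd:sequence dfdl:separator="/">
  <xsd:annotation>
    <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <!-- Case analysis over the normalized values; any pairing
           not listed here fails the assert. -->
      <dfdl:assert message="Illegal type and subtype pair.">
        { if (./TypeLower eq 'text')
          then (./SubtypeLower eq 'html' or ./SubtypeLower eq 'plain')
          else if (./TypeLower eq 'application')
          then (./SubtypeLower eq 'pdf')
          else fn:false() }
      </dfdl:assert>
    </xsd:appinfo>
  </xsd:annotation>

  <!-- Physical fields: these consume the text on either side of "/". -->
  <xsd:element name="Type" type="xsd:string"/>
  <xsd:element name="Subtype" type="xsd:string"/>

  <!-- Logical fields: computed from the physical fields, not read
       from the data. Names are hypothetical. -->
  <xsd:element name="TypeLower" type="xsd:string"
               dfdl:inputValueCalc="{ fn:lower-case(../Type) }"/>
  <xsd:element name="SubtypeLower" type="xsd:string"
               dfdl:inputValueCalc="{ fn:lower-case(../Subtype) }"/>
</xsd:sequence>
```

Normalizing once into the logical fields keeps the case analysis free of repeated fn:lower-case() calls, at the cost of two extra elements in the parsed infoset.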

Good question. I call this type of validation 'cross-field validation' because (unlike XML Schema validation) it involves the value or count of more than one field. It's quite common in industry messaging standards.

The general answer is as follows: DFDL is based on XML Schema, so its validation facilities don't include cross-field validation. However, by using DFDL to convert to DOM/XML you can bring your non-XML data within reach of a whole lot of tools and technologies that were originally designed for XML. I'm thinking about XPath, XQuery, Schematron, RELAX NG etc. This should not be viewed as a shortcoming of the DFDL language: DFDL is for describing the format, not the higher-level validation rules.

Having said that, I wouldn't be surprised if, with sufficient ingenuity, it was possible to make DFDL do what you need in this particular case. Let's see if any suggestions are forthcoming...

regards,
Tim Kimber, DFDL Team, Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
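As one illustration of the post-parse route mentioned above, a Schematron rule applied to the XML produced by the DFDL parse might look like this. It is a sketch only: MimeType is a hypothetical name for an element of type MimeTypeType, the elements are assumed to be in no namespace, and the permitted pairs are just the examples from this thread.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <!-- MimeType is a hypothetical element of type MimeTypeType. -->
    <sch:rule context="MimeType">
      <!-- Normalize case once so the comparisons stay short. -->
      <sch:let name="t" value="lower-case(Type)"/>
      <sch:let name="s" value="lower-case(Subtype)"/>
      <sch:assert test="($t = 'text' and $s = ('plain', 'html'))
                        or ($t = 'application' and $s = 'pdf')">
        Illegal content type: <sch:value-of select="Type"/>/<sch:value-of select="Subtype"/>
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

A side benefit of doing the check here is that the failure message can quote the offending Type and Subtype values.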

Tim Kimber wrote: "I wouldn't be surprised if, with sufficient ingenuity, it was possible to make DFDL do what you need in this particular case."

Mike Beckerle, what's your current job title? Something like "Senior Software Developer" maybe? Time to upgrade. Your new title is "Possessor of Sufficient Ingenuity," as you came up with this answer:

    <xsd:sequence>
      <xsd:annotation>
        <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
          <dfdl:assert message="The type and subtype must be a matching pair.">
            { if ((Type/text() eq 'application') and (Subtype/text() eq 'pdf')) then true()
              …
              else if ((Type/text() eq 'text') and (Subtype/text() eq 'plain')) then true()
              else false() }
          </dfdl:assert>
        </xsd:appinfo>
      </xsd:annotation>
    </xsd:sequence>
Tim Kimber wrote: "DFDL is based on XML Schema, so its validation facilities don't include cross-field validation."

Partially true. XML Schema 1.1 supports cross-field validation through its assertion facility (xs:assert). The first working draft came out 9 years ago, and it's been stable for 4 or 5 years. Maybe DFDL should be upgraded to 1.1?

From: dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf Of Tim Kimber
Sent: Thursday, June 20, 2013 4:10 PM
To: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] How to specify type/subtype combos?
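For comparison, the XML Schema 1.1 feature referred to above would express the pairing rule roughly like this. It is a sketch only: it needs an XSD 1.1 processor, it is plain XML Schema with no DFDL annotations (so it would validate the parsed XML rather than drive the parse), and the permitted pairs are just the examples from this thread.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="MimeTypeType">
    <xs:sequence>
      <xs:element name="Type" type="xs:string"/>
      <xs:element name="Subtype" type="xs:string"/>
    </xs:sequence>
    <!-- XSD 1.1 assertion: evaluated against each element of this type,
         with Type and Subtype resolved as its children. -->
    <xs:assert test="(lower-case(Type) = 'text'
                        and lower-case(Subtype) = ('plain', 'html'))
                     or (lower-case(Type) = 'application'
                        and lower-case(Subtype) = 'pdf')"/>
  </xs:complexType>
</xs:schema>
```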
participants (4)
- Garriss Jr., James P.
- Mike Beckerle
- Steve Hanson
- Tim Kimber