I have analyzed the use of the (.*) notation in the
Sentinel2X-bandTMISPData.xsd file.
Below I discuss what it is, why it is used, what the alternative to it is,
and why adding it to DFDL is more problematic than it seems at first.
There are two instances of (.*) in the Sentinel schema files.
What it expresses is a partial step-name wildcard. This is not a variation
on the XPath * notation, as that matches any step, and can't match a
partial name. In the usage in the Sentinel schema, this wildcard is a
regex that matches against a step name, and it matches exactly a single
name. It is not used in a way where it can result in a node-set instead of
a single node.
The real question is why this (.*) is needed, that is, what it is trying
to achieve, and whether there is an acceptable alternative for what it is
doing.
I find that this wildcard (.*) is used to achieve a parameterization of the
types TypeISPData and TypeISPData_HKTM. These types are polymorphic, in
that their exact behavior depends on aspects of their surrounding context.
Each of these types incorporates a length of some content that is outside
of its own definition.
As the types are used now, the length comes from an element outside of them
in the schema whose name happens to end in a particular suffix, either
"Packet_Secondary_Header" or just "_Secondary_Header".
So these names, while outside of the type, are in some sense being
hard-coded in these types. Even though they are outside of these types, you
cannot change the names of these elements without breaking the ability of
the type to find them via this (.*) notation and a name suffix.
These types, TypeISPData and TypeISPData_HKTM, are placed inside other
types, and those are then placed in context with various packet-header
structures. Those structures have sub-elements with various distinguishing
prefixes, such as MSI_Packet_Secondary_Header. A variety of different
prefixes appear in place of "MSI_", but there is always an element whose
name ends in "Packet_Secondary_Header" or "Secondary_Header".
The use of this (.*) name wildcard seems convenient, but doesn't offer
anything that isn't better captured by a true parameterization mechanism
which decouples the names used outside the type from those used inside it.
DFDL provides a general mechanism for this sort of parameterization using
variables and dfdl:newVariableInstance.
Let's look at just one of the two instances, TypeISPData.
To use variables, a variable is created which represents this parameter to
the TypeISPData. It is declared in the schema file where TypeISPData is
defined.
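A minimal sketch of how such a parameterization might look is given here. It is an illustrative fragment, not the actual Sentinel schema and not a complete compilable one: it omits the dfdl:format defaults and the primary header definition. The variable name secondaryHeaderLength, the element names Source_Data and ISP_Data, the type TypeMSIPacketDataField, and the placeholder header definition are all made up for illustration. The length arithmetic is copied from the expression quoted in this thread, and the relative path in the default value is assumed to resolve against the enclosing element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Hypothetical parameter: length in bytes of whichever secondary
           header sits next to the ISP data. Declared in the same schema
           file that defines TypeISPData. -->
      <dfdl:defineVariable name="secondaryHeaderLength" type="xs:unsignedLong"/>
    </xs:appinfo>
  </xs:annotation>

  <!-- The type refers only to the variable. It no longer needs to know the
       name of any element outside itself, so no (.*) wildcard is required. -->
  <xs:complexType name="TypeISPData">
    <xs:sequence>
      <xs:element name="Source_Data" type="xs:hexBinary"
                  dfdl:lengthKind="explicit" dfdl:lengthUnits="bytes"
                  dfdl:length="{ /Packet_Primary_Header/Packet_Data_Length + 1 - $secondaryHeaderLength - 2 }"/>
    </xs:sequence>
  </xs:complexType>

  <!-- One hypothetical use site. Each context binds the variable to its own
       concretely named header, so that name is spelled out exactly once,
       next to the element it refers to. -->
  <xs:complexType name="TypeMSIPacketDataField">
    <xs:sequence>
      <!-- Placeholder definition; the real header has its own structure. -->
      <xs:element name="MSI_Packet_Secondary_Header" type="xs:hexBinary"
                  dfdl:lengthKind="explicit" dfdl:lengthUnits="bytes"
                  dfdl:length="6"/>
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:newVariableInstance ref="secondaryHeaderLength"
                defaultValue="{ dfdl:contentLength(./MSI_Packet_Secondary_Header, 'bytes') }"/>
          </xs:appinfo>
        </xs:annotation>
        <xs:element name="ISP_Data" type="TypeISPData"/>
      </xs:sequence>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

With this arrangement, renaming MSI_Packet_Secondary_Header only requires updating the one expression at its use site; TypeISPData itself is unaffected.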
Dear Steve,
sorry for the delay due to Summer break and other projects.
Here is the .zip.
Please note that we realised that one of our previous replies, dated 25 July 2016 at 17:45:05 GMT+2, was a misunderstanding on our part and is not applicable.
Regarding your questions please find answers interleaved below
Michele
This message and any attachments are intended for the use of the addressee or addressees only. The unauthorised disclosure, use, dissemination or copying (either in whole or in part) of its content is not permitted. If you received this message in error, please notify the sender and delete it from your system. Emails can be altered and their integrity cannot be guaranteed by the sender.
Please consider the environment before printing this email.
Regards
From: Steve Hanson [smh@uk.ibm.com]
Sent: Tuesday, July 26, 2016 3:20 AM
To: Michele Zundo
Cc: Mike Beckerle; rui.mestre@deimos.com.pt
Subject: Re: Fwd: OGF DFDL WG Call Minutes 2016-07-05
….snip
dfdl:length="{/Packet_Primary_Header/Packet_Data_Length + 1 - contentLength(/Packet_Data_Field/(.*)Packet_Secondary_Header, 'bytes') - 2}"
Firstly, contentLength is a DFDL function so it needs to be in the DFDL namespace, eg, dfdl:contentLength().
Yes, we agree with you. We will add the dfdl: prefix in future releases and modify the applications accordingly.
Secondly, the first argument to dfdl:contentLength() is a path, so you are effectively still using regular expressions in path steps.
Yes. For now we are using it and expect this to become part of the standard.
Regards
Steve Hanson
IBM Integration Bus http://www-03.ibm.com/software/products/en/ibm-integration-bus, Hursley, UK
Architect, IBM DFDL http://www.ibm.com/developerworks/library/se-dfdl/index.html
Co-Chair, OGF DFDL Working Group http://www.ogf.org/dfdl/
smh@uk.ibm.com
tel: +44-1962-815848 mob: +44-7717-378890

From: Michele Zundo <michele.zundo@esa.int>
To: Steve Hanson/UK/IBM@IBMGB
Cc: Mike Beckerle <mbeckerle@tresys.com>
Date: 25/07/2016 17:10
Subject: Fwd: OGF DFDL WG Call Minutes 2016-07-05
------------------------------
Dear Steve,
Please find below the answer from our developers and example.
Note that we have updated our implementation of DFDL to be as compliant as we can at this point in time with the exception noted below.
Michele
Begin forwarded message:
From: "Rui Mestre (DME)" <rui.mestre@deimos.com.pt>
Subject: Re: Fwd: OGF DFDL WG Call Minutes 2016-07-05
Date: 25 July 2016 at 17:45:05 GMT+2

Dear Michele,
I believe that after our DFDL compliance effort the mentioned "use of a regex in the path step of a DFDL expression" is no longer in place.
Currently the only extension implemented in DFDL4S regarding regular expressions is that the implementation of dfdl:contentLength also accepts regular expressions when specifying the node.
Please find attached a schema file example containing such extension in the use of dfdl:contentLength.
Best regards, Rui
Begin forwarded message:
From: Steve Hanson <smh@uk.ibm.com>
Subject: OGF DFDL WG Call Minutes 2016-07-05
Date: 5 July 2016 at 17:49:13 GMT+2
To: dfdl-wg@ogf.org
Cc: "Mike Beckerle" <mbeckerle@tresys.com>, "Michele Zundo" <michele.zundo@esa.int>

Please find minutes from the above call at https://redmine.ogf.org/dmsf_files/13537?download=
*@Michele - please can you send to the WG a schema that shows your use of a regex in the path step of a DFDL expression ?*
Next call *Aug 2nd*
Regards
Steve Hanson
Architect, IBM DFDL, Co-Chair, OGF DFDL Working Group http://www.ogf.org/dfdl/
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel: +44-1962-815848
-----------------------------------------
Michele Zundo
Head of Ground System Definition and Verification Office EOP-PEP
European Space Agency, ESTEC
e-mail: michele.zundo@esa.int
Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
#### Sentinel2X-bandTMISPData.xsd moved to MyAttachments Repository V3.8 ( Link notes:///802575AF0030E827/5DE5236E5AD1645685256EE0001BBADF/ABC83CA7B3700B038...) on 23 August 2016 by Steve Hanson.