Mike
I don't think it's strictly necessary
to specify that we will try the longest granularity first, because we can
tell from the length of the input data, together with the presence/absence
of a '.' (fractional second separator), which of the granularities we should
attempt to parse the input data against. Remember that by the time the
parser tries to process the date/time, it will already have extracted the
text using lengthKind and done any trimming.
Granularity                  Length
YYYY                         4
YYYY-MM                      7
YYYY-MM-DD                   10
YYYY-MM-DDThh:mmTZD          17 or 22
YYYY-MM-DDThh:mm:ssTZD       20 or 25
YYYY-MM-DDThh:mm:ss.sTZD     22 or more, but input data will contain '.'
(TZD can be Z or +hh:mm or -hh:mm)
Similarly for xs:time we would have,
Granularity                  Length
hh:mmTZD                     6 or 11
hh:mm:ssTZD                  9 or 14
hh:mm:ss.sTZD                11 or more, but input data will contain '.'
Any other length of input data, or invalid
data for the granularity matched, would be a processing error.
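As a rough illustration of the selection rule described above, here is a minimal Python sketch. The function and table names are hypothetical and not part of the DFDL spec; it only picks the candidate granularity from the length and the presence of '.', and the data would still need to be validated against the granularity selected.

```python
# Lengths copied from the tables above; all names here are illustrative only.
DATETIME_LENGTHS = {
    4: "YYYY",
    7: "YYYY-MM",
    10: "YYYY-MM-DD",
    17: "YYYY-MM-DDThh:mmTZD",
    22: "YYYY-MM-DDThh:mmTZD",
    20: "YYYY-MM-DDThh:mm:ssTZD",
    25: "YYYY-MM-DDThh:mm:ssTZD",
}
TIME_LENGTHS = {
    6: "hh:mmTZD",
    11: "hh:mmTZD",
    9: "hh:mm:ssTZD",
    14: "hh:mm:ssTZD",
}


def select_granularity(text, lengths, fractional, min_fractional_len):
    """Return the granularity to parse against, or raise ValueError."""
    if "." in text:
        # A '.' can only be the fractional second separator.
        if len(text) >= min_fractional_len:
            return fractional
        raise ValueError("too short for a fractional-second granularity")
    try:
        return lengths[len(text)]
    except KeyError:
        # Any other length is a processing error.
        raise ValueError(f"no granularity has length {len(text)}") from None


def datetime_granularity(text):
    return select_granularity(
        text, DATETIME_LENGTHS, "YYYY-MM-DDThh:mm:ss.sTZD", 22
    )


def time_granularity(text):
    return select_granularity(text, TIME_LENGTHS, "hh:mm:ss.sTZD", 11)
```

Note that a length of 22 (or 11 for xs:time) is ambiguous on length alone, which is why the '.' check comes first.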
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 18/01/2012 16:29 -----
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org
Date: 17/01/2012 19:07
Subject: Issue 162: Re: [DFDL-WG] Proposed spec errata for calendarPattern I and T symbols
I reviewed this. The proposal looks fine as-is.
I don't think the lack of gYear and gYearMonth types should affect the
functionality offered. We left those out not because people don't want
formats like that, but because we can handle those formats even without
the additional types.
Do we need to clarify that the longest granularity is tried first, and
this is greedy parsing? If the data matches one of the granularities there
will be no backtracking to try others.
Proposal
Here is what is proposed to correct this. Note the T symbol is dropped.
It was introduced to allow xs:dateTime to expect just a time, but that
is not necessary (it was copied from IBM MRM). I've stated any alternatives
that we can consider at the end.
Symbol  Meaning            Presentation  Example
I       ISO8601 Date/Time  (Text)        2006-10-07T12:06:56.568+01:00
IU      ISO8601 Date/Time  (Text)        2006-10-07T12:06:56.568Z
        (with output "Z" if the time zone is +00:00)
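To make the difference between the two symbols concrete when unparsing, here is a small Python sketch. The function name is hypothetical, the three fractional digits simply mirror the examples, and it assumes that 'I' writes a zero offset as +00:00 while 'IU' writes it as "Z".

```python
from datetime import datetime, timedelta, timezone


def unparse_iso(value: datetime, symbol: str = "I") -> str:
    """Format a time-zone-aware datetime at the fullest granularity."""
    text = value.isoformat(timespec="milliseconds")
    # Assumption: only 'IU' collapses a zero offset to "Z".
    if symbol == "IU" and value.utcoffset() == timedelta(0):
        text = text[: -len("+00:00")] + "Z"
    return text
```

For a value with offset +01:00, both symbols would produce the same text under this sketch.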
The 'I' symbol must not be used with any
symbol other than the 'escape for text' symbol. It represents calendar
formats that match those defined in the restricted profile of the ISO8601
standard proposed by the W3C at http://www.w3.org/TR/NOTE-datetime. The
formats are referred to as 'granularities'.
- xs:dateTime. When parsing, the data must
match one of the granularities. When unparsing, the fullest granularity
is used.
- xs:date. When parsing, the data must match
one of the date-only granularities. When unparsing, the fullest date-only
granularity is used. 'IU' is permitted for xs:date but the 'U' is effectively
ignored as there is no time zone in the date-only granularities.
- xs:time. When parsing, the data must match
only the time components of one of the granularities that contains time
components. When unparsing, the time components of the fullest granularity
are used. The literal 'T' character is not expected in the data when parsing
and is not output when unparsing.
- The number of fractional second digits
supported is implementation dependent but must be at least one.
- For a granularity that omits components,
when parsing, the values for the omitted components are supplied from the
Unix epoch 1970-01-01T00:00:00.000.
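The last rule can be sketched as follows in Python. The function name and keyword arguments are hypothetical, and time zone handling is deliberately left out; the sketch only shows omitted components being taken from the Unix epoch.

```python
from datetime import datetime

# Unix epoch, used as the source of default component values.
EPOCH = datetime(1970, 1, 1, 0, 0, 0, 0)


def fill_from_epoch(year=None, month=None, day=None,
                    hour=None, minute=None, second=None, microsecond=None):
    """Build a datetime, taking any omitted component from the epoch."""
    def pick(value, default):
        return default if value is None else value

    return datetime(
        pick(year, EPOCH.year),
        pick(month, EPOCH.month),
        pick(day, EPOCH.day),
        pick(hour, EPOCH.hour),
        pick(minute, EPOCH.minute),
        pick(second, EPOCH.second),
        pick(microsecond, EPOCH.microsecond),
    )
```

For example, data matching the 'YYYY-MM' granularity would be completed with day 1 and a time of 00:00:00.000, and an xs:time value would be completed with the date 1970-01-01.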
Alternatives:
As above but don't support the first two granularities ('Year', 'Year and
month') on the grounds that they are really matching xs:gYear and xs:gYearMonth.
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412