Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda item on ICU 'S' symbol

Following on from today's call, the relevant piece of documentation is in http://icu-project.org/apiref/icu4c/classicu_1_1SimpleDateFormat.html

    When numeric fields abut one another directly, with no intervening delimiter characters, they constitute a run of abutting numeric fields. Such runs are parsed specially. For example, the format "HHmmss" parses the input text "123456" to 12:34:56, parses the input text "12345" to 1:23:45, and fails to parse "1234". In other words, the leftmost field of the run is flexible, while the others keep a fixed width. If the parse fails anywhere in the run, then the leftmost field is shortened by one character, and the entire run is parsed again. This is repeated until either the parse succeeds or the leftmost field is one character in length. If the parse still fails at that point, the parse of the run fails.

So it seems that when the 'S' is next to other numeric units in the pattern, it will be subject to the above behaviour. Therefore:

- A pattern of HHmmssSSS with the input 112233123 will become 11:22:33.123 but the input 1122331234 will trigger an error.
- If the pattern includes a '.' to become HHmmss.SSS, I think the input 1122331234 will become 11:22:33.123 but I'll try and confirm.

Steve - Does that description match what you were seeing?
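For illustration, the retry rule in the quoted ICU passage can be sketched in a few lines of Python. This is only a model of the documented description, not ICU's implementation: the function name parse_abutting_run and the list-of-widths argument are made up for this sketch, and it deliberately says nothing about how leftover input after the run, or the 'S' field specifically, is treated.

```python
def parse_abutting_run(text, widths):
    """Model of the documented rule for a run of abutting numeric fields.

    widths holds the letter count of each field in the run, e.g. [2, 2, 2]
    for "HHmmss" (hypothetical representation, not an ICU data structure).
    Returns the parsed field values, or None if the run fails to parse.
    """
    first = widths[0]
    rest = list(widths[1:])
    # Only the leftmost field is flexible; shrink it one character at a
    # time and re-parse the whole run until it works or reaches width 1.
    while first >= 1:
        pos = 0
        values = []
        for width in [first] + rest:
            chunk = text[pos:pos + width]
            # Every field must be exactly `width` ASCII digits.
            if len(chunk) != width or not (chunk.isascii() and chunk.isdigit()):
                break
            values.append(int(chunk))
            pos += width
        else:
            return values
        first -= 1
    return None


print(parse_abutting_run("123456", [2, 2, 2]))  # [12, 34, 56]
print(parse_abutting_run("12345", [2, 2, 2]))   # [1, 23, 45]
print(parse_abutting_run("1234", [2, 2, 2]))    # None
```

The three calls reproduce the "HHmmss" examples given in the documentation. Any characters left over after the run are ignored by this sketch, so it cannot be used to predict the error reported for 1122331234.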
HTH,
Andy

Andy Edwards - IBM Integration Bus - DFDL
Email: andy.edwards@uk.ibm.com
Snail Mail: MP211, Hursley park, Hursley, WINCHESTER, Hants, SO21 2JN
Tel int: 247222
Tel ext: +44 (0)1962 817222
Desk: DE3 V17

The Feynman problem solving Algorithm
1) Write down the problem
2) Think real hard
3) Write down the answer
-- Murray Gell-mann in the NY Times

From: Steve Hanson/UK/IBM
To: Andrew Edwards/UK/IBM@IBMGB
Cc: DFDL-WG <dfdl-wg@ogf.org>
Date: 11/08/2015 12:53
Subject: Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11 - agenda item on ICU 'S' symbol

Hi Andy

Your internal ticket #630 gave rise to external ticket http://bugs.icu-project.org/trac/ticket/10962, which claims to have fixed the API docs to clarify the behaviour.

    S   fractional second - truncates (like other time fields) to the count of letters when formatting. Appends zeros if more than 3 letters specified. Truncates at three significant digits when parsing.

    S       2
    SS      23
    SSS     235
    SSSS    2350

I can't see anywhere that addresses your point about abutting versus non-abutting numeric symbols though?

As far as DFDL spec is concerned, this is what we say today:

    S   fractional second (see note 1)   Number

    S       2
    SS      24
    SSS     235

There is no 'note 1'; I think the note was made into a normal paragraph, which reads:

    Any number of fractional seconds "S" may be specified in the pattern and accepted by implementations, but an implementation is free to represent a limited number of fractional seconds internally. Excess fractional seconds are truncated, not rounded up. At least millisecond accuracy must be implemented. Unlike other fields, fractional seconds are padded on the right with zero.
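As a rough illustration of the formatting rule quoted from the ICU docs (truncate to the letter count, append zeros beyond three letters), here is a small Python sketch. The helper name format_fraction and its millisecond argument are assumptions made for the example; it models only the wording of the docs and is not taken from ICU or from any DFDL implementation.

```python
def format_fraction(millis, letters):
    """Format a millisecond value (0-999) for a run of `letters` 'S' symbols.

    Hypothetical helper: truncates to the letter count without rounding,
    and pads on the right with zeros when more than three letters are used.
    """
    digits = f"{millis:03d}"              # always three digits, e.g. "235"
    return digits.ljust(letters, "0")[:letters]


for letters in range(1, 5):
    print("S" * letters, format_fraction(235, letters))
# S 2
# SS 23
# SSS 235
# SSSS 2350
```

With 235 milliseconds this reproduces the S/SS/SSS/SSSS column of the ICU table, including the truncation to 23 rather than rounding for SS.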
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: Andrew Edwards/UK/IBM
To: Steve Hanson/UK/IBM@IBMGB
Date: 11/08/2015 11:56
Subject: Re: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11

Hi Steve

Re agenda item 2 and calendar patterns with 'S', this ICU ticket from last year might be relevant - https://icu.sanjose.ibm.com/gcoctrac/ticket/630.

It seems that the error reporting may also depend on whether the pattern has 'S' on its own or next to other numeric pattern entities, i.e. 'HHmmssS' is subject to length checking, but 'HHmmss S' is not, due to the space before the 'S'.

HTH,
Andy

From: Steve Hanson/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Cc: Mike Beckerle <mbeckerle@tresys.com>, jorge.marizan@gmail.com
Date: 10/08/2015 18:29
Subject: [DFDL-WG] OGF DFDL WG Call Agenda 2015-08-11
Sent by: dfdl-wg-bounces@ogf.org

Please find agenda for call on Redmine at https://redmine.ogf.org/dmsf_files/13489?download=

Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU