Right, but you would still need dfdl:inputValueCalc to get the @ into the infoset, as per Mike's suggestion.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Suman Kalia <kalia@ca.ibm.com>
To:        Steve Hanson/UK/IBM@IBMGB,
Cc:        "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, dfdl-wg-bounces@ogf.org, Mike Beckerle <mbeckerle.dfdl@gmail.com>
Date:        25/02/2013 17:08
Subject:        Re: [DFDL-WG] How to parse an email address




Another option would be to put @ as the separator on a sequence. The sequence contains two elements of type string: localPart and domain.
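
A minimal sketch of the separator approach might look like the fragment below. It is an illustration, not a tested schema: the element name emailAddress is hypothetical, and the fragment assumes an enclosing xs:schema that declares the xs and dfdl prefixes and a default dfdl:format supplying the remaining properties (encoding, representation, and so on).

```xml
<!-- Hypothetical wrapper element; only localPart and domain come from the thread -->
<xs:element name="emailAddress">
  <xs:complexType>
    <!-- The '@' is consumed as an infix separator, so it does not appear in the infoset -->
    <xs:sequence dfdl:separator="@" dfdl:separatorPosition="infix">
      <xs:element name="localPart" type="xs:string" dfdl:lengthKind="delimited"/>
      <!-- Assumes domain ends at the end of the data or at an enclosing delimiter -->
      <xs:element name="domain" type="xs:string" dfdl:lengthKind="delimited"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With this model the infoset would hold localPart and domain only; the @ itself is treated as markup.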

Suman Kalia

IBM Canada Lab

WMB Toolkit Architect and Development Lead

Tel: 905-413-3923 T/L 313-3923

Email: kalia@ca.ibm.com


For info on Message broker

http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html





From:        Steve Hanson <smh@uk.ibm.com>
To:        Mike Beckerle <mbeckerle.dfdl@gmail.com>,
Cc:        "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date:        02/25/2013 05:09 AM
Subject:        Re: [DFDL-WG] How to parse an email address
Sent by:        dfdl-wg-bounces@ogf.org




Or you can use dfdl:lengthKind 'pattern' and supply a dfdl:lengthPattern for localPart that consumes all data up to but not including the @.
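
As a rough sketch of the pattern approach, assuming the same kind of enclosing schema and default dfdl:format as any other DFDL model: the regular expression [^@]* matches everything up to, but not including, the first @. The element names emailAddress and at are hypothetical; modelling the @ as its own one-character element is just one way to consume it after localPart.

```xml
<xs:element name="emailAddress">
  <xs:complexType>
    <xs:sequence>
      <!-- Length is whatever the pattern matches: all characters before the first '@' -->
      <xs:element name="localPart" type="xs:string"
                  dfdl:lengthKind="pattern"
                  dfdl:lengthPattern="[^@]*"/>
      <!-- Hypothetical element that captures the single '@' character as data -->
      <xs:element name="at" type="xs:string"
                  dfdl:lengthKind="explicit"
                  dfdl:length="1"
                  dfdl:lengthUnits="characters"/>
      <!-- Assumes domain ends at the end of the data or at an enclosing delimiter -->
      <xs:element name="domain" type="xs:string" dfdl:lengthKind="delimited"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Because the @ is not declared as a delimiter here, it stays available as ordinary data for the element that follows localPart.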


Note: Neither dfdl:lengthKind 'pattern' nor dfdl:inputValueCalc is supported by IBM DFDL yet - this will be addressed in the near future :)


Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Mike Beckerle <mbeckerle.dfdl@gmail.com>
To:        "Garriss Jr., James P." <jgarriss@mitre.org>,
Cc:        "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date:        23/02/2013 00:14
Subject:        Re: [DFDL-WG] How to parse an email address
Sent by:        dfdl-wg-bounces@ogf.org





If you want the infoset to contain dfdl-wg@ogf.org, but also to contain localPart and domain, then you are putting the same data into multiple fields, which can only be accomplished using inputValueCalc.

I suggest parsing the data into localPart and domain by using the @ as the terminator for localPart, and whatever the boundary is for the domain.

Then add another element, computed with inputValueCalc, that contains the concatenation of those fields with an '@' in the middle.

I chose this approach (parse the pieces, then compute their concatenation) over the other way round because the inputValueCalc expression will be dead simple in this case, and you will get an ordinary 'delimiter not found' error if the @ is missing from the data.
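
The suggestion above could be sketched roughly as follows. This is an untested illustration under stated assumptions: the element names emailAddress and address are hypothetical, the enclosing xs:schema is assumed to declare the xs, dfdl and fn (XPath functions) prefixes, and a default dfdl:format is assumed to supply the remaining properties.

```xml
<xs:element name="emailAddress">
  <xs:complexType>
    <xs:sequence>
      <!-- Parsed from the data; the '@' terminator is consumed as markup -->
      <xs:element name="localPart" type="xs:string"
                  dfdl:lengthKind="delimited"
                  dfdl:terminator="@"/>
      <!-- Assumes domain ends at the end of the data or at an enclosing delimiter -->
      <xs:element name="domain" type="xs:string" dfdl:lengthKind="delimited"/>
      <!-- Hypothetical computed element: not read from the data,
           its value is built from the two parsed pieces -->
      <xs:element name="address" type="xs:string"
                  dfdl:inputValueCalc="{ fn:concat(../localPart, '@', ../domain) }"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

For the input dfdl-wg@ogf.org, the intent is an infoset with localPart, domain, and a computed address holding the full string.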

Contrast that with having one field that contains the whole email address and two calculated elements that take substrings of it. Now you have two inputValueCalc's. And if the @ isn't in the string, you'll get errors from evaluating expressions rather than from ordinary parsing.

On Fri, Feb 22, 2013 at 11:11 AM, Garriss Jr., James P. <jgarriss@mitre.org> wrote:
Suppose I want to model an email address:

 

  dfdl-wg@ogf.org

 

It’s a sequence of a string, a character (@), and a string.

 

In the MBTK it might look like this:

 

 

The problem, of course, is that the parser will grab all of “dfdl-wg@ogf.org” instead of just “dfdl-wg”.  Setting validation facets doesn’t help, b/c validation happens after parsing.  Setting @ as a separator or terminator would work, but that doesn’t seem right either, as the @ is part of the email address, and I would want to keep it in the infoset.

 

How do I think about this problem?

 

TIA

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg



--
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU