RE: [dfdl-wg] How to handle multi-dimensional arrays

Mike,

Some random quick comments - hopefully not so terse or stream-of-consciousness as to be incomprehensible:

This approach seems to dictate how the user will represent the array in XML (dictating their schema) rather than just describing how to pick up the right content, which I think we agreed is a bad thing. That's not to say this isn't a reasonable way to represent a multidim array in XML, just that having DFDL go look at the attributes outside the annotation and requiring this formatting in the output XML doesn't seem right.

In <dfdl:dataFormat arrayStorageOrder="@y @x"> it seems that DFDL only needs the dimension sizes for its own purposes, and we could probably use our referencing mechanism to get them (i.e. if I read the two dimension ints earlier and want to reference them for the array sizes) - maybe some kind of <dfdl:runtimeoccurs> elements for the n dimensions?

To allow the kind of output XML you propose, we probably need something new to allow you to loop. If the only place we need looping is for multidimensional arrays (and the special case of one dim), perhaps we can do something similar to what you propose and essentially have the array mechanism define some loop variables that can be referenced (a dfdl layer?). I don't have a full proposal thought out, but imagine defining an array as a stream that can be referenced using multiple dimensions rather than a single cursor, and having a mechanism so that the current values of the cursor(s) are available to the user.
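
To make the multi-cursor idea a bit more concrete, here is a rough sketch in Python (not DFDL, and not a syntax proposal). It assumes the storage order lists dimensions from slowest-varying to fastest-varying, and that cursors are 1-based; both are assumptions for illustration, as are the names cursors_at, walk, sizes and storage_order.

```python
from typing import Dict, Iterable, Iterator, Sequence, Tuple


def cursors_at(flat_pos: int,
               sizes: Dict[str, int],
               storage_order: Sequence[str]) -> Dict[str, int]:
    """Map a 0-based position in the flat stream to 1-based per-dimension cursors.

    Assumption: storage_order names the dimensions from slowest-varying
    to fastest-varying, e.g. ["y", "x"] means x changes on every element.
    """
    if flat_pos < 0:
        raise IndexError("position before start of array")
    cursors: Dict[str, int] = {}
    remaining = flat_pos
    # Peel off the fastest-varying dimension first.
    for name in reversed(storage_order):
        remaining, offset = divmod(remaining, sizes[name])
        cursors[name] = offset + 1
    if remaining != 0:
        raise IndexError("position past end of array")
    return cursors


def walk(values: Iterable[float],
         sizes: Dict[str, int],
         storage_order: Sequence[str]) -> Iterator[Tuple[Dict[str, int], float]]:
    """Yield (cursors, value) pairs while reading the flat stream in order."""
    for pos, value in enumerate(values):
        yield cursors_at(pos, sizes, storage_order), value


if __name__ == "__main__":
    sizes = {"x": 3, "y": 2}
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    for cursors, value in walk(data, sizes, ["y", "x"]):
        print(cursors, value)
```

Swapping the order to ["x", "y"] walks the same stream with y varying fastest instead, so the storage order and the dimension sizes are all the reader needs; how the cursors then show up in the output XML is a separate decision.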

So, from the Reference.xsd example, we might want to have an attribute that shows the x value of the xdata elements analogous to the multidim example:

    <xs:element name="xdata" type="xs:float" maxOccurs="unbounded">
      <xs:annotation>
        <xs:appinfo>
          <dfdl:runtimeoccurs>../x</dfdl:runtimeoccurs>
        </xs:appinfo>
      </xs:annotation>
      <xs:attribute name="x">
        <dfdl:runtimevalue = #currentcursorvalue#/>
      </xs:attribute>
    </xs:element>

where #currentcursorvalue# is something we have not yet made available for output (or is this available via xpath - the position of the current element in a sequence?). This would change the Reference.xml example output to have elements like

    <xdata x="1">2.78</xdata>
    <xdata x="2">3.14</xdata>

So, if I can summarize/rephrase, I think we should keep the mechanism for single or multidim arrays separate from how the output is displayed, but I like the idea of making the current cursor(s) available for use, which I don't think we've done yet. And having a real multidimension construct rather than calculating them from a flat cursor is probably a requirement for scientific use, so some multidim analog of dfdl:runtimeoccurs is needed.

Jim

-----Original Message-----
From: owner-dfdl-wg@ggf.org [mailto:owner-dfdl-wg@ggf.org] On Behalf Of mike.beckerle@ascentialsoftware.com
Sent: Friday, February 18, 2005 4:22 PM
To: dfdl-wg@gridforum.org
Subject: [dfdl-wg] How to handle multi-dimensional arrays

We have come up with an approach to how to represent multi-dimensional arrays within XSD-described XML. The attached test file (.xml) and DFDL Schema (.dfdl.xsd) illustrate the proposed solution.

The proposal does not require any changes to XSD, XML or any other special constructs outside of a single dfdl annotation to specify the storage order of the representation.

I'm pretty happy with how this works out. We can handle arrays with different storage orders, like Fortran-style column-major vs. more common row-major, and it dovetails nicely with XPath expressions and the XSD data model. Schema validation can really do something for you, like tell you if you have all the elements of the array (if it's fixed size), and that you don't have multiple elements occupying the same array location.

Those interested in multi-dimensional array support please give this some consideration. That said, I'm departing on vacation for a week, so I'll toss this out there for people to look at, but I won't be able to interact with you all on it until I get back.

...mikeb

Participants (1): Myers, James D