
Karl Czajkowski wrote:
On Sep 01, Andreas Savva modulated:
... The question I have is: if you change the argument type to a string, then why do you need multiple argument elements?
The arguments to a POSIX command are a vector of strings. The strings may have any character in them, be empty strings, etc. You cannot trivially talk about encoding this vector in a string unless you assume a bunch of "quoting" conventions. [...]
I agree completely with this point. The arguments are to contain the strings as seen in the argv argument to main(), excluding any well-known leading arguments (i.e. the name of the executable).
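To illustrate why the argument vector cannot be trivially flattened into one string, here is a minimal Python sketch. The sample vector is made up for illustration, and the quoting convention used (POSIX shell rules via the standard shlex module) is just one possible choice, not something this thread prescribes.

```python
import shlex

# Hypothetical argument vector: one argument contains a space, one is empty.
argv = ["hello world", "", "-x"]

# Naive encoding: join with spaces, then split on whitespace.
# The embedded space and the empty argument are both lost.
naive = " ".join(argv).split()

# Encoding with an agreed quoting convention (POSIX shell quoting here)
# does round-trip, but both sides must implement the same convention.
quoted = shlex.join(argv)
recovered = shlex.split(quoted)

print(naive)      # three arguments, but not the original three
print(quoted)     # the single-string form with quoting applied
print(recovered)  # the original vector again
```

Keeping one element per argument avoids the need for any such convention: each element maps directly onto one argv entry.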
That said, Donal's point about string encoding is important to remember. To do it right, the implementation needs to translate the XML info set string (unicode I think, right?) to the locale in which the program will run. This might not be possible, unless we can assume that the implementation can select an appropriate locale for the characters being used.
This is probably going to end up as a requirement on the BES spec. The XML infoset is defined in terms of UNICODE abstract characters, so when mapping to running a real executable there needs to be some encoding employed. Most encodings cannot render all UNICODE characters. Hence, if there are characters which cannot be encoded there probably SHOULD be a fault thrown. (I hesitate to say MUST.) I imagine that eventually all platforms will use UTF-8 as their native encoding - there are excellent reasons to do this shift - and that from then on, the encoding nasties will go away, but it's going to be a long time coming.

Donal.
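A rough sketch of the "fault on unencodable characters" behaviour, in Python. The names UnencodableArgumentFault and encode_arguments are hypothetical and do not come from any spec; the target encoding is passed in explicitly here, whereas a real implementation would presumably derive it from the locale in which the program will run.

```python
class UnencodableArgumentFault(Exception):
    """Hypothetical fault type; the name is an assumption, not from a spec."""


def encode_arguments(args, encoding):
    """Encode each Unicode argument string into bytes for the target encoding.

    Raises UnencodableArgumentFault if any argument contains a character
    that the target encoding cannot represent.
    """
    encoded = []
    for index, arg in enumerate(args):
        try:
            encoded.append(arg.encode(encoding))
        except UnicodeEncodeError as exc:
            raise UnencodableArgumentFault(
                f"argument {index} cannot be represented in {encoding}"
            ) from exc
    return encoded
```

For example, encode_arguments(["caf\u00e9"], "latin-1") succeeds, while the same call with "ascii" raises the fault. With "utf-8" as the target, any well-formed Unicode string can be encoded, which is the situation the last paragraph above anticipates.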