
On Monday 26 October 2009 13:07:31 stephen.burke@stfc.ac.uk wrote:
Paul Millar [mailto:paul.millar@desy.de] said:
The text to be included in Glue 2.0 errata and included in the next revision.
These sound reasonable to me.
Ta.
However, for our current implementation technologies do we know if there is in fact a problem with using UTF-8 everywhere?
I know of no problems with switching to UTF-8. From [1], there are two printable characters that are incompatible:

  Code   IA5String     UTF-8 (and ASCII)
  0x24   (currency)    Dollar
  0x7E   (over-line)   Tilde

[1] http://www.zytrax.com/tech/ia5.html

Since information is updated periodically from UTF-8 (or, perhaps, ASCII) LDIF data, any problem with this transition should be short-lived.
[Snip: encoding German names]
As some people may have seen, the particular problem that triggered this was a German-localised output from a Unix "service xxx status", so even if there are alternative spellings that doesn't mean that you'll get them without some special translation. (Actually a Google search finds http://www.manticmoo.com/articles/jeff/programming/perl/converting-from-utf8-to-ascii.php which looks like a pretty good quick fix if it works.)
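For illustration, that sort of quick fix could be sketched in Python roughly as follows. This is not the code from the linked article: the function name to_ascii and the German replacement table are assumptions made up for the example, and anything that still has no ASCII equivalent after decomposition is simply dropped.

```python
import unicodedata

# Assumed table of conventional German alternative spellings (illustrative only).
GERMAN_REPLACEMENTS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}

def to_ascii(text: str) -> str:
    """Return a best-effort ASCII approximation of a Unicode string."""
    # Apply the German spellings first, since plain decomposition would turn
    # "ä" into "a" rather than "ae" and would drop "ß" entirely.
    for src, dst in GERMAN_REPLACEMENTS.items():
        text = text.replace(src, dst)
    # Split remaining accented characters into base letter + combining mark,
    # then discard everything that is not ASCII (including the marks).
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

A filter like this loses information, so it only makes sense as a stop-gap while the published strings have to stay within ASCII.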
I don't know the details here, but I'd imagine that, if we supported UTF-8, then publishing arbitrary UTF-8 information would just work.

Irrespective of encoding issues (and with the benefit of hindsight ;-) I'm not sure publishing the values returned from running commands on a machine (e.g., the result of "service xxx status") as computer-interpretable values is such a good idea. The output could come from internationalised (i18n) software, in which case it may be localised into the machine's local language. Wouldn't this force GLUE clients to understand all possible languages?

To my mind, it would be better to publish values taken from a (short) list of acceptable values and to choose the value from the return-code of executing commands (or something similar). If the published value is the name of something (e.g., GlueSEName) then there isn't the same problem since it doesn't have to be machine understandable.
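A minimal sketch of the return-code idea, in Python. It assumes the init script follows the LSB exit-code conventions for the "status" action; the published value names ("OK", "Critical", "Unknown") and both function names are hypothetical, chosen only for the example and not taken from the GLUE schema.

```python
import subprocess

# Hypothetical closed list of publishable values, keyed by the exit code of
# "service <name> status" (LSB conventions assumed).
STATUS_BY_RETURN_CODE = {
    0: "OK",        # program is running or service is OK
    1: "Critical",  # program is dead and a pid file exists
    2: "Critical",  # program is dead and a lock file exists
    3: "Critical",  # program is not running
    4: "Unknown",   # program or service status is unknown
}

def status_from_return_code(code: int) -> str:
    """Map an exit code onto one of the fixed, language-neutral values."""
    return STATUS_BY_RETURN_CODE.get(code, "Unknown")

def published_status(service: str) -> str:
    """Run the status command and publish a value based only on its exit code."""
    try:
        result = subprocess.run(
            ["service", service, "status"],
            stdout=subprocess.DEVNULL,  # the (possibly localised) text is ignored
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # The "service" command itself could not be executed.
        return "Unknown"
    return status_from_return_code(result.returncode)
```

Because the human-readable output is never parsed, the published value is the same whatever locale the machine runs in.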
PS I have to say I'm a bit surprised in retrospect that I didn't see this coming until we hit a real example, especially after the long discussion about non-standard characters in DNs a year or so back!
Indeed!

Paul.