
Hi all,

Ceriel came upon an ugly problem with the current spec: the wildcards used in the namespace package collide with the introduction of URLs, as several of the characters used for wildcards lead to URLs that are not well-formed. That problem was not present back when we used strings instead of the saga::url class.

Below is an email exchange describing the problem with examples. Opinions on how to solve that _nicely_ are very welcome.

Thanks, Andre.

----- Forwarded message from Andre Merzky <andre@merzky.net> -----
Quoting [Ceriel Jacobs] (Nov 20 2007):
Ceriel Jacobs wrote:
Ceriel Jacobs wrote:
Hi,
I am now looking at wildcard expansion, and am totally confused as to where and when that should take place. The ns_directory methods copy(), move(), link() and remove() seem reasonable targets, but they all take a URL parameter. This is sort of OK with the '*' wildcard, but using any of ?, [, ], {, } results in an invalid URL. For instance, ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.?ar.gz is not a valid URL.
Ah! Well, the wildcard spec still assumes that the parameters are strings, not URLs :-(
Ahum, this IS a valid URL, with a query part. Anyway, not what is intended.
Right.
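To illustrate the point about the query part, here is a small sketch using Python's standard urllib.parse module. That module is only a stand-in here, it is not the saga::url class, and other URL parsers may behave differently. The '?' that was meant as a wildcard ends the path and starts a query string:

```python
from urllib.parse import urlsplit

# The example URL from the message above; '?' was intended as a wildcard.
url = "ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.?ar.gz"
parts = urlsplit(url)

print(parts.netloc)  # ftp.cs.vu.nl
print(parts.path)    # /pub/ceriel/LLgen.
print(parts.query)   # ar.gz
```

So the URL parses without error, but the path the pattern was supposed to describe is cut off at the wildcard.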
OK, it can be done with %-escapes: I now get a match for
ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.%7Btar%2Cnoot%7D.gz
Yes, I guess that works, but that leaves the effort of writing escaped characters to the end user -- probably not what we want.
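A minimal sketch of the %-escape approach, again using Python's urllib.parse purely for illustration (this is not how any SAGA implementation is known to do it): the wildcard characters are percent-encoded so that the URL is well-formed, and the receiving side is assumed to decode the path again before doing the wildcard match.

```python
from urllib.parse import quote, unquote, urlsplit

# Wildcard pattern as the user would like to write it (path part only).
pattern_path = "/pub/ceriel/LLgen.{tar,noot}.gz"

# Percent-encode everything except unreserved characters and '/'.
escaped_url = "ftp://ftp.cs.vu.nl" + quote(pattern_path)
print(escaped_url)
# ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.%7Btar%2Cnoot%7D.gz

# Assumed receiving side: decode the path to recover the pattern.
recovered = unquote(urlsplit(escaped_url).path)
print(recovered)
# /pub/ceriel/LLgen.{tar,noot}.gz
```

A helper like this could hide the escaping from the end user, but after decoding it is again impossible to tell whether a given character was meant as a wildcard or as a literal.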
So, the problem really is to distinguish between characters which the user added to describe wildcards, and characters the user added to describe legitimate URL parts. That's impossible, I'm afraid :-(
Ugh.
Just dumping random thoughts from here:
One option would be to forbid query parts etc on URLs. But that would be a severe limitation.
Another option would be to 'mark' wildcard characters, e.g. to escape them with '\':
ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.\?ar.gz
and to transform it internally into a/multiple valid URL(s). That would imply that the saga::url class would need to be aware of the escaping (so you cannot use a native URL class), and the user still has to do some work.
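A rough sketch of what the 'marked wildcard' idea could look like, under stated assumptions: only `\*` and `\?` are treated as wildcards, neither of them matches '/', and every unmarked character (including a plain '?') is taken literally. The function name marked_pattern_to_regex is made up for this example and is not part of the spec or of any SAGA implementation.

```python
import re

def marked_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a string with backslash-marked wildcards into a regex.

    Hypothetical helper: '\\*' matches any run of non-'/' characters,
    '\\?' matches exactly one non-'/' character, everything else is literal.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern) and pattern[i + 1] in "*?":
            out.append("[^/]*" if pattern[i + 1] == "*" else "[^/]")
            i += 2  # consume the backslash and the wildcard character
        else:
            out.append(re.escape(ch))  # literal character, e.g. a real '?'
            i += 1
    return re.compile("".join(out))

# The marked pattern from the message above.
marked = marked_pattern_to_regex(r"ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.\?ar.gz")
print(bool(marked.fullmatch("ftp://ftp.cs.vu.nl/pub/ceriel/LLgen.tar.gz")))  # True

# An unmarked '?' stays a literal query delimiter.
literal = marked_pattern_to_regex("http://example.org/find?name=x")
print(bool(literal.fullmatch("http://example.org/findXname=x")))  # False
```

The matching would still have to be applied to candidate URLs obtained by listing the directory, and the marked string itself is not a URL, which is exactly the cost mentioned above.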
Another option is to revert to strings. Which removes your parsing error, but does not solve the problem semantically - at some point, the string needs to be converted into URLs.
And yet another option is not to use wildcards in the spec - which is not really an option at this stage I guess, and would be a pity as well.
-----
So, I do not have a good answer at the moment, and will ponder some more on this. Do you mind if I forward the question to Hartmut/Ole and to the list?
Cheers, Andre. -- No trees were destroyed in the sending of this message, however, a significant number of electrons were terribly inconvenienced.