
Hi, Quoting [Thilo Kielmann] (Nov 26 2007):
All,
I did a global search for "wildcard" in the SAGA core spec. The result is that we are having three places using wildcards:
1. attributes
2. logical directory (using both attribute and path wildcards)
3. namespace.directory, using path wildcards.
Attribute wildcards don't pose a problem (at least to me, or until Ceriel will find one ;-)
Attributes will stay strings for the time being (i.e. until we introduce properly typed attributes). So, wildcards should not be a problem for the moment.
The path wildcards from namespace.directory, however, do bring a problem, in combination with URLs.
If I remember correctly, we switched from strings to URLs for a good reason.
Yes, one being to enforce parsing on the strings - which is exactly where it bites us now :-P
URLs, however, do not allow for wildcards, according to RFC1738.
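To illustrate how URL parsing collides with shell wildcards, here is a minimal Python sketch. The URL is made up for the example, and Python's standard `urllib.parse` stands in for whatever parser a SAGA implementation would actually use.

```python
from urllib.parse import urlsplit

# Hypothetical URL with a shell-style "?" wildcard in the file name.
url = "file://localhost/data/image.?pg"
parts = urlsplit(url)

# The parser treats "?" as the start of the query component,
# so the file name pattern is cut in two.
print("path: ", parts.path)
print("query:", parts.query)
```

The pattern `image.?pg` does not survive as a path: everything after the `?` is handed back as a query string.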
Well, RFC1738 actually refers to wildcards explicitly, e.g. in 3.6. NEWS:
  If <newsgroup-name> is "*" (as in <URL:news:*>), it is used to refer to "all available news groups".
And here are two other options actually for dealing with wildcards:
  - allow only *, not the full blown shell wildcards
  - or use different characters for wildcards, e.g.
      data_[a-z].bin -> data_((a-z)).bin
      image.?pg      -> image01.#pg
I would find the second one slightly confusing, but an option it is.
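A rough sketch of how the second option could work on the path string: translate the alternative spelling back into ordinary shell wildcards and then match as usual. The function name `to_shell_pattern` and the file names are invented for this example, and the mapping is only inferred from the two examples above.

```python
import fnmatch

def to_shell_pattern(alt_pattern):
    """Map the alternative wildcard spelling back to shell wildcards.

    Assumed mapping, inferred from the examples in the mail:
    "((" -> "[", "))" -> "]", "#" -> "?". A "*" is left untouched.
    """
    return (alt_pattern.replace("((", "[")
                       .replace("))", "]")
                       .replace("#", "?"))

# Hypothetical directory contents.
names = ["data_a.bin", "data_7.bin", "image.jpg", "image.png"]

range_pattern = to_shell_pattern("data_((a-z)).bin")
single_pattern = to_shell_pattern("image.#pg")

range_matches = [n for n in names if fnmatch.fnmatchcase(n, range_pattern)]
single_matches = [n for n in names if fnmatch.fnmatchcase(n, single_pattern)]
```

One caveat: `#` normally introduces the fragment part of a URL, so a generic URL parser may still split a name containing it, much as it does with `?`.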
And the here mentioned query parts of URLs are for http only, and not for files as we would need them here.
Well, http URLs can refer to files...
If we define some "URL with wildcards" that would no longer be URLs, so this is no way to go.
What do we want/need wildcards for? The core spec writes about "shell wildcards", so we want to apply a single operation to several namespace entries at a time. (e.g.: move, copy, find,...)
Right.
This reminds me of bulk operations with SAGA tasks. But this also feels like "overkill" for the use case of file wildcards.
Well, it seemed sensible and easy back then when we had strings. Actually, wildcards are just an API optimization, right? It can always be done in user space... (ls + loop + filter). So we thought that wildcards can, in the worst case (e.g. if not supported by the backend), be provided by the implementation, with no penalty compared to application level code. Yes, we can use the bulk mechanism, but that puts the burden of wildcard expansion back into user code. Unless one provides the expand method of course ;-)
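The "ls + loop + filter" fallback could look roughly like this in Python. `list_entries` is a hypothetical stand-in for a real directory listing, and the standard `fnmatch` module supplies the shell-style matching; none of this is SAGA API.

```python
import fnmatch

def list_entries():
    """Hypothetical stand-in for a directory listing ("ls")."""
    return ["data_a.bin", "data_b.bin", "data_1.bin", "notes.txt"]

def filter_entries(pattern):
    """User-space wildcard expansion: list, loop, filter."""
    selected = []
    for name in list_entries():
        # fnmatchcase applies shell-style matching, case-sensitively.
        if fnmatch.fnmatchcase(name, pattern):
            selected.append(name)
    return selected

selected = filter_entries("data_[a-z].bin")
```

An implementation whose backend has no wildcard support could do the same thing internally, which is why application-level code gains nothing by doing it by hand.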
My suggestion is thus to follow Ceriel (version 2):
On Thu, Nov 22, 2007 at 11:12:58AM +0100, Ceriel Jacobs wrote:
Another approach would be to have an explicit method to do wildcard expansion. For instance, in namespace.ns_directory:
expand (in string pattern, out array<saga::url> urls);
Here, the pattern only specifies the "path" part, but with wildcards (the directory implicitly specifies the rest of the url). I am not sure whether the resulting urls should be resolved with respect to the directory or not. I think not.
I think we need to spend some good thoughts on getting the parameters to this call right (do we need a pattern to compose the URLs from the expanded patterns???)
I am not sure what the last sentence means :-( the returned items _are_ URLs, so what other URLs do you want to compose? Sorry for being thick...
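To make the discussion of the parameters more concrete, here is one possible shape of the proposed expand method as a Python sketch. Everything beyond the idea "pattern in, URLs out" is an assumption: the `entry_names` argument replaces a real directory listing, and the `resolve` flag is invented here to show both answers to the open question of whether results are resolved against the directory.

```python
import fnmatch
import posixpath
from urllib.parse import urlsplit, urlunsplit

def expand(directory_url, pattern, entry_names, resolve=True):
    """Sketch of expand(): the pattern covers only the path part.

    directory_url supplies scheme, host and base path.
    entry_names is a hypothetical stand-in for the directory contents.
    With resolve=False the matches stay relative to the directory.
    """
    base = urlsplit(directory_url)
    urls = []
    for name in sorted(entry_names):
        if not fnmatch.fnmatchcase(name, pattern):
            continue
        if resolve:
            path = posixpath.join(base.path, name)
            urls.append(urlunsplit((base.scheme, base.netloc, path, "", "")))
        else:
            urls.append(name)
    return urls

entries = ["data_b.bin", "data_a.bin", "data_1.bin"]
resolved = expand("file://localhost/data/", "data_[a-z].bin", entries)
relative = expand("file://localhost/data/", "data_[a-z].bin", entries,
                  resolve=False)
```

Either way the caller gets plain URLs (absolute or relative) with no wildcard characters left in them, so the ordinary URL-based calls can take them unchanged.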
Besides this "expand" method, we would have to change the relevant namespace.directory methods to accept arrays of URLs instead of individual URLs.
That makes coding slightly awkward if you want to copy a single file, as you'd need to create an array for that single URL. So we would need two calls (one with array, one without) which would again bloat the API. So I'd rather vote for requiring the code to loop over the entries and to use the normal (singular) calls...
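A small sketch of what the loop over singular calls would mean for user code. The `copy` function is a hypothetical placeholder that only records the request, and the URLs are made up.

```python
copied = []

def copy(source_url, target_url):
    """Hypothetical singular copy call; here it only records the request."""
    copied.append((source_url, target_url))

# Pretend these came back from the proposed expand() method.
sources = [
    "file://localhost/data/data_a.bin",
    "file://localhost/data/data_b.bin",
]

# Several entries: loop in user code, one singular call per entry.
for source in sources:
    copy(source, "file://localhost/backup/")

# A single file needs no array at all.
copy("file://localhost/data/notes.txt", "file://localhost/backup/")
```

The cost is one explicit loop in the multi-file case; the single-file case stays as simple as it is today.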
The other radical approach could be: remove file name wildcards altogether...
More thoughts?
Thilo
My favourite at the moment:
  - allow * as wildcard in URLs
  - for all other wildcards ([a-z], ?, {one,two,three}) use expand(), and require user level loops over the result.
Cheers, Andre.
--
No trees were destroyed in the sending of this message, however, a significant number of electrons were terribly inconvenienced.
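The split proposed above could be checked with something as small as the following Python sketch. The helper name `needs_expand` is invented for illustration; it only looks for the characters used by the richer wildcard forms.

```python
import re

# Characters used by the richer shell wildcards: ?, [a-z], {one,two,three}.
_RICH_WILDCARD = re.compile(r"[?\[\]{}]")

def needs_expand(path_pattern):
    """True if the pattern uses wildcards other than "*"."""
    return bool(_RICH_WILDCARD.search(path_pattern))

for pattern in ["data_*.bin", "data_[a-z].bin", "image.?pg",
                "{one,two,three}.txt"]:
    route = "expand() + user loop" if needs_expand(pattern) else "URL as is"
    print(pattern, "->", route)
```

Note that Python's `fnmatch` does not handle the `{one,two,three}` form, so an expand() built on it would need separate brace handling.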