Re: [ogsa-wg] RNS critique

21 Jul 2005

Hello All,

I was hoping to provide a quick response to Andrew's comments today; 
however, due to unexpected personal reasons, I was unable to.  I will not 
be available until Monday next week, at which time I will provide a brief 
response to each of the points listed.

Best regards,
Manuel Pereira
===============================
IBM Almaden Research Center
1-408-927-1935  [T/L 457]
mpereira@us.ibm.com

owner-ogsa-wg@ggf.org wrote on 07/20/2005 11:55:00 AM:
...
All,
Here are some of my comments on the RNS document. I jammed my finger
yesterday in basketball, so typing is painful. Therefore, I'm 
limiting my comments to the most significant. Thank you to the RNS 
team for all of their work.
Andrew
High level comments:
1) The basic directory function is to provide a mapping handle 
f("string"), i.e., to map strings to some form of handle.
Thus, in section 1.1: why do we need four different types of 
junctions? Should we not have one type of junction, namely one that 
points to an EPR? A directory would then map a string to an EPR:
EPR f("string"); 
Instead there are four "types" of junctions:
            EPRs
            Virtualized reference - either contains an EPR or a URL 
              that points to some other service
            Referrals - point to another RNS service. Q: could this 
              not be modeled as an EPR?
            Alias - points to another "entry" within the "same" RNS 
              service. This seems to imply a container-like model; 
              more on this later.
I suggest that we need only ONE type of junction: an EPR. That will
significantly simplify client coding, and the model.
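
To make the single-junction proposal concrete, here is a minimal sketch 
in Python. It is an illustration only, not anything taken from the RNS 
document: the EPR class is a stand-in that carries just an address, and 
the names Directory, add and lookup are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EPR:
    """Stand-in for an endpoint reference; only the address is modeled."""
    address: str


class Directory:
    """A directory with one junction type: every entry maps a string to an EPR."""

    def __init__(self):
        self._entries = {}

    def add(self, name, epr):
        # Keep the directory a set: one name maps to exactly one EPR.
        if name in self._entries:
            raise KeyError(f"entry already exists: {name}")
        self._entries[name] = epr

    def lookup(self, name):
        # This is the whole client-facing contract: EPR f("string").
        return self._entries[name]


d = Directory()
d.add("results", EPR("http://example.org/services/results"))
d.add("other-rns", EPR("http://example.org/rns/other"))  # a "referral" is just another EPR
```

Under this assumption a client needs a single code path: look up the 
name, get an EPR, and decide what to do with it based on the service 
behind that EPR.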
2) Implied model - Repositories
There is a notion in the spec of "same service instance" that in 
conversations has also been called a "repository". The basic idea is
that an RNS service may "contain" a set of directories, and that 
the service has a root. Thus junction types distinguish, for 
example, between internal and external things that they point to. 
Thus an RNS service is a rooted tree that may point at the leaves 
to other rooted trees. So, first of all, the model is implied. If 
we're going to have that model it should be right up front and 
discussed. What is a repository? What are its special port types, if 
any, etc.? 
Second, I think that it is not the right way to think about the 
problem. Directories should be the resources, not collections of 
directories. If a particular implementation chooses to multiplex a 
large number of logically independent directories in a single 
container, great; we will certainly do that too. The issue is what
the model is. I feel fairly strongly about this.
Links "into" other RNS servers: 1.1.2.4 "Alias Junction" is 
restricted to pointing to entries in the same repository. I think 
aliases should be able to point to anything, including directories in 
"other" repositories. It has been claimed that using the EPR of the 
"repository" and a path you can get that effect. However, what if 
the path changes in the other container? My link would break, even 
if the directory itself still exists.
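
The broken-link concern can be sketched as follows. This is a toy 
model under stated assumptions, not RNS behavior: directories are 
addressable resources with their own (made-up) EPR strings, and the 
registry dict stands in for whatever would resolve an EPR to a 
directory.

```python
class Dir:
    """A directory treated as a resource in its own right."""

    def __init__(self, epr):
        self.epr = epr
        self.entries = {}  # name -> Dir


registry = {}  # hypothetical EPR resolver: epr string -> Dir


def make_dir(epr):
    d = Dir(epr)
    registry[epr] = d
    return d


def follow_path_link(link):
    """Follow a (repository EPR, path) link; return None if the path is gone."""
    repo_epr, path = link
    node = registry[repo_epr]
    try:
        for part in path.strip("/").split("/"):
            node = node.entries[part]
    except KeyError:
        return None
    return node


root = make_dir("epr:repoB/root")
projects = make_dir("epr:repoB/dir-17")
root.entries["projects"] = projects

link_by_path = ("epr:repoB/root", "/projects")  # repository EPR plus path
link_by_epr = "epr:repoB/dir-17"                # EPR of the directory itself

# The other container renames the entry; the directory itself still exists.
root.entries["archive"] = root.entries.pop("projects")
```

After the rename, follow_path_link(link_by_path) returns None while 
registry[link_by_epr] still reaches the same directory, which is the 
difference being argued for.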
3) Full path names
In ANY directory system, lookup really takes two parameters: a 
"root" at which to start, and a path. Often the "root" is implied, 
or is at some well-known location. RNS, as written, implies that 
all lookups are based on full paths with an implied, unspecified root. 
Assuming that full paths are to be used on all lookups, the 
potential for both hot spots AND a single point of failure is clear.
In conversation with Manuel, he mentioned that his clients cache 
intermediate parts of the tree, in the sense that "/foo/bar/d1" as a 
prefix leads to a particular RNS service, and then use that info to 
avoid always traversing the tree. Besides the obvious implementation 
challenges of cache consistency when the tree is changing (a problem
we certainly had/have in Legion), there is the modeling issue. If we 
expect clients to do that, then perhaps the 
architecture/specification should accommodate that and say that all 
lookups are relative path lookups with respect to some "root". The 
root could be a true "root", or an interior node in a tree, which is 
itself a "root" of the subtree it defines.
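
A minimal sketch of lookup as a two-parameter operation, assuming an 
in-memory tree; Node, mkdir and lookup are illustrative names, not 
operations from the RNS document.

```python
class Node:
    """A directory node; any node can act as the root of a lookup."""

    def __init__(self, name):
        self.name = name
        self.children = {}

    def mkdir(self, name):
        child = Node(name)
        self.children[name] = child
        return child


def lookup(root, path):
    """Resolve a relative path starting at the given root node."""
    node = root
    for part in path.split("/"):
        if part:  # ignore empty components such as a leading "/"
            node = node.children[part]
    return node


top = Node("")
d1 = top.mkdir("foo").mkdir("bar").mkdir("d1")
leaf = d1.mkdir("x")

via_top = lookup(top, "/foo/bar/d1/x")  # walks every level from the true root
via_d1 = lookup(d1, "x")                # starts at an interior node instead
```

Both calls reach the same node. A client that already holds d1 never 
touches the upper levels, so the prefix caching described above becomes 
an ordinary use of the interface rather than a workaround.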
4) Resolve and file system profile. We discussed these on the last 
OGSA call; my understanding is that they are going out.
5) Iterators. OGSA-Data-WG has discussed iterators in a more general
way, e.g., on database query results, etc. I think that whatever is
done in RNS should be consistent with whatever is done in OGSA-Data 
(note: consistency can happen either way).
Medium level comments:
Section 1.1
"In all cases, junctions are capable of maintaining a list of 
references (EPRs/URLs) per entry, that is a single junction may 
render several available EPRs, each of which represent replicas, 
copies of the same resource, or operationally identical services."
Why? Are you saying that replication issues and semantics should be 
dealt with in the directory structure? Or are you saying that 
directories are not "sets" in the sense of only one entry, but 
rather "multi-sets" in the sense that one string can map to multiple
things? If the latter, what are the implied semantics?
I think it may be safer to keep them as sets.
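
The set versus multi-set distinction can be sketched as two toy 
directory classes. The class names and the bind/lookup methods are 
hypothetical, and references are plain strings here for brevity.

```python
class SetDirectory:
    """Set semantics: each name maps to exactly one reference."""

    def __init__(self):
        self._entries = {}

    def bind(self, name, ref):
        if name in self._entries:
            raise ValueError(f"name already bound: {name}")
        self._entries[name] = ref

    def lookup(self, name):
        return self._entries[name]


class MultiDirectory:
    """Multi-set semantics: a name may map to several references."""

    def __init__(self):
        self._entries = {}

    def bind(self, name, ref):
        self._entries.setdefault(name, []).append(ref)

    def lookup(self, name):
        # The caller gets a list and must decide which reference to use.
        return list(self._entries[name])


s = SetDirectory()
s.bind("db", "epr:replica-1")

m = MultiDirectory()
m.bind("db", "epr:replica-1")
m.bind("db", "epr:replica-2")
```

With the multi-set version every client has to pick among the returned 
references, so the directory would need to say whether they are 
interchangeable replicas, ordered preferences, or something else. That 
is the unanswered semantics question raised above.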