RNS critique

20 Jul 2005

All,

Here are some of my comments on the RNS document. I jammed my finger
yesterday in basketball, so typing is painful. Therefore, I'm limiting my
comments to the most significant. Thank you to the RNS team for all of their
work.

Andrew

High level comments:

1) The basic function of a directory is to provide a mapping,
handle f("string"), i.e., to map strings to some form of handle.

Thus, in section 1.1: why do we need four different types of junctions?
Should we not have one type of junction, namely one that points to an EPR?
A directory would then map a string to an EPR.

EPR f("string"); 

Instead there are four "types" of junctions:

    EPRs

    Virtualized reference - either contains an EPR or a URL that points to
    some other service

    Referrals - point to another RNS service. Q: could this not be modeled
    as an EPR?

    Alias - points to another "entry" within the "same" RNS service. This
    seems to imply a container-like model - more on this later.

I suggest that we need only ONE type of junction - an EPR. That will
simplify client coding - and the model significantly. 
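
To make the single-junction suggestion concrete, here is a minimal sketch in
Python. It is illustrative only: the names EPR, Directory, add and lookup, and
the example.org addresses, are assumptions made for this example and do not
come from the RNS document.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class EPR:
    """Hypothetical stand-in for an endpoint reference (just an address here)."""
    address: str


@dataclass
class Directory:
    """A directory with ONE junction type: every entry maps a string to an EPR."""
    entries: Dict[str, EPR] = field(default_factory=dict)

    def add(self, name: str, epr: EPR) -> None:
        self.entries[name] = epr

    def lookup(self, name: str) -> EPR:
        # This is the whole interface: EPR f("string").
        # Raises KeyError if the name is not present.
        return self.entries[name]


d = Directory()
# An ordinary service entry.
d.add("data", EPR("http://example.org/services/data"))
# What the spec calls a "referral" is, in this model, just another EPR
# that happens to point at a directory service.
d.add("other-rns", EPR("http://example.org/rns/other"))
```

The client sees the same return type for every entry, so it never has to
branch on a junction type.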

2) Implied model - Repositories

There is a notion in the spec of "same service instance" that in
conversations has also been called a "repository". The basic idea is that an
RNS service may "contain" a set of directories - and that the service has a
root. Thus junction types distinguish, for example, between internal and
external things that they point to.  Thus an RNS service is a rooted tree -
that may point at the leaves to other rooted trees. So, first of all - it is
implied. If we're going to have that model it should be right up front and
discussed. What is a repository? What are its special port types, if any,
etc. Second, I think that it is not the right way to think about the
problem. Directories should be the resources - not collections of
directories. If a particular implementation chooses to multiplex a large
number of logically independent directories in a single container - great -
we will certainly do that too. The issue is what is the model. I feel fairly
strongly about this.

Links "into" other RNS servers: 1.1.2.4 "Alias Junction" is restricted to
pointing to entries in the same repository. I think they should be able to
point to anything - including directories in "other" repositories. It has
been claimed that using the EPR of the "repository" and a path you can get
that effect. However, what if the path changes in the other container? My
link would break - even if the directory itself still exists. 
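
The broken-link concern can be sketched as follows (Python, illustrative
only). The Repository class, its methods, and the identifier "dir-42" are all
hypothetical; the identifier stands in for whatever an EPR to the directory
itself would carry.

```python
class Repository:
    """Hypothetical container holding directories, addressable two ways."""

    def __init__(self):
        self.path_to_id = {}  # path inside the repository -> directory id
        self.dirs = {}        # directory id -> directory entries

    def create(self, path, dir_id, entries):
        self.path_to_id[path] = dir_id
        self.dirs[dir_id] = entries

    def rename(self, old_path, new_path):
        # The directory itself is untouched; only its path changes.
        self.path_to_id[new_path] = self.path_to_id.pop(old_path)

    def resolve_path(self, path):
        return self.dirs[self.path_to_id[path]]

    def resolve_direct(self, dir_id):
        return self.dirs[dir_id]


repo = Repository()
repo.create("/projects/alpha", "dir-42", {"readme": "epr-readme"})

path_link = "/projects/alpha"  # link = repository EPR + path
direct_link = "dir-42"         # link = reference to the directory itself

repo.rename("/projects/alpha", "/archive/alpha")


def path_link_still_works():
    try:
        repo.resolve_path(path_link)
        return True
    except KeyError:
        return False
```

After the rename, the path-based link fails even though the directory still
exists, while the direct reference keeps resolving.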

3) Full path names

In ANY directory system, lookup really takes two parameters - a "root" at
which to start, and a path. Often the "root" is implied, or is at some
well-known location. RNS, as written, implies that all lookups are based on
full paths with an implied, unspecified root.

Assuming that full paths are to be used on all lookups, the potential for
both hot spots AND a single point of failure is clear.

In conversation, Manuel mentioned that his clients cache intermediate
parts of the tree, in the sense that "/foo/bar/d1" as a prefix leads to a
particular RNS service, and then use that info to avoid always traversing
the tree. Besides the obvious implementation challenges of cache consistency
when the tree is changing (a problem we certainly had/have in Legion) there
is the modeling issue. If we expect clients to do that - then perhaps the
architecture/specification should accommodate that and say that all lookups
are relative path lookups with respect to some "root". The root could be a
true "root", or interior node in a tree, which is itself a "root" of the
subtree it defines. 
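
A minimal sketch of lookup as a function of (root, path), in Python and
illustrative only. The Dir class and the lookup function are assumptions made
for this example, not part of RNS; leaves are plain strings standing in for
EPRs.

```python
class Dir:
    """Hypothetical directory node: maps names to child Dirs or to leaves."""

    def __init__(self):
        self.entries = {}


def lookup(root, path):
    """Resolve `path` relative to `root`; any Dir can serve as the root."""
    node = root
    for part in path.split("/"):
        if not part:
            continue  # tolerate leading or doubled slashes
        if not isinstance(node, Dir):
            raise NotADirectoryError(part)
        node = node.entries[part]  # KeyError if the name is absent
    return node


# Build /foo/bar/d1/x
top, foo, bar, d1 = Dir(), Dir(), Dir(), Dir()
top.entries["foo"] = foo
foo.entries["bar"] = bar
bar.entries["d1"] = d1
d1.entries["x"] = "epr-for-x"

# Same answer from the true root and from the interior node.
from_top = lookup(top, "/foo/bar/d1/x")
from_d1 = lookup(d1, "x")
```

A client that already holds a reference to d1 never touches the top of the
tree, which is the caching behavior described above expressed in the model
rather than hidden in the client.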

4) Resolve and file system profile. We discussed these on the last OGSA
call; my understanding is that they are going out.

5) Iterators. OGSA-Data-WG has discussed iterators in a more general way,
e.g., on database query results. I think that whatever is done in RNS
should be consistent with whatever is done in OGSA-Data (note - consistency
can happen either way).

Medium level comments:

S 1.1

"In all cases, junctions are capable of maintaining a list of references
(EPRs/URLs) per entry, that is a single junction may render several available
EPRs, each of which represent replicas, copies of the same resource, or
operationally identical services."

Why? Are you saying that replication issues and semantics should be dealt
with in the directory structure? Or are you saying that directories are not
"sets" in the sense of only one entry, but rather "multi-sets" in the sense
that one string can map to multiple things? If the latter, what are the
implied semantics?

I think it may be safer to keep them as sets.
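
The difference between the two readings can be sketched as follows (Python,
illustrative only; the variable names and the "epr-1"/"epr-2" placeholders are
assumptions made for this example).

```python
from collections import defaultdict

# "Set" reading: one string maps to exactly one thing.
set_dir = {}
set_dir["svc"] = "epr-1"
set_dir["svc"] = "epr-2"  # second add replaces the first

# "Multi-set" reading: one string maps to several things.
multi_dir = defaultdict(list)
multi_dir["svc"].append("epr-1")
multi_dir["svc"].append("epr-2")

single_result = set_dir["svc"]
multi_result = multi_dir["svc"]
```

In the multi-set case the lookup returns a list, and the client has to decide
what the list means (replicas? alternatives? which one to try first?), which
is exactly the semantics question raised above.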

Andrew Grimshaw
