Re: [SAGA-RG] SAGA Message API Extension

17 Jan 2007

      On Jan 17, 2007, at 4:23 AM, Andre Merzky wrote:
...
Hi John!
Quoting [John Shalf] (Jan 16 2007):
...
Hi Andre,
In Chapter 2, the paragraphs explaining the message transfer
requirements are a bit confusing.
  1) it says it supports multicast (which is inherently unreliable).
I'm sure you mean to say a "message bus" (sort of like the Linux DBus
concept), which would not specifically call out a particular network
standard (multicast is more specific than a "message bus").
You are right: using 'multicast' here is probably
misleading.  Thilo made a similar comment, so I'll change it
to "message bus" or so.
...
2) It says the message must be received completely and correctly, or
not at all.  This leaves some aspects of the reliability uncertain.
For instance
      a) does this mean it must guarantee delivery of a message?
      Or only  that a delivered message does not contain errors?
It probably needs a more verbose explanation.  The current
idea is:
- the implementation MUST ensure that the messages are
    complete and error free
- whether message delivery is guaranteed or not is up to the
    implementation, and up to the protocol used.  The
    implementation MUST document that aspect, and the URL
    (scheme...) allows choosing between different
    reliability modes.
I agree.  I think the reliability semantics should be a property of  
the channel (and hence underlying protocol) and not something to  
switch at runtime.  However, I think the clients that use the  
connection should be able to query what the channel properties are  
(or define the minimum requirements and throw an exception if they  
cannot be met).  This is just a little introspection (not really a  
deep or fancy coding underneath).
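A minimal sketch of the introspection idea above, in C++ to match the snippets in this thread.  Everything in it is hypothetical: the 'Reliability' enum, the 'Endpoint' class (which is not saga::endpoint), the 'name' helper and the udp/tcp scheme table are assumptions made only for illustration.  The reliability is fixed by the URL scheme at construction and can only be read back afterwards.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical reliability classes; not part of the draft API.
enum class Reliability { Unreliable, Reliable };

// Human-readable name of a reliability class, e.g. for logging.
const char* name(Reliability r) {
  switch (r) {
    case Reliability::Unreliable: return "unreliable";
    case Reliability::Reliable:   return "reliable";
  }
  assert(false && "unhandled Reliability value");
  return "unknown";
}

// Hypothetical endpoint: the reliability is a property of the channel,
// derived from the URL scheme at construction time.
class Endpoint {
public:
  explicit Endpoint(const std::string& url) : reliability_(from_url(url)) {}

  // Read-only introspection; there is deliberately no setter.
  Reliability reliability() const { return reliability_; }

private:
  static Reliability from_url(const std::string& url) {
    // Illustrative scheme table only (assumption); real schemes would be
    // defined and documented by the implementation.
    static const std::map<std::string, Reliability> schemes = {
        {"udp", Reliability::Unreliable},
        {"tcp", Reliability::Reliable},
    };
    const std::string::size_type pos = url.find("://");
    if (pos == std::string::npos)
      throw std::invalid_argument("URL has no scheme: " + url);
    const auto it = schemes.find(url.substr(0, pos));
    if (it == schemes.end())
      throw std::invalid_argument("unknown scheme in URL: " + url);
    return it->second;
  }

  const Reliability reliability_;
};
```

A client could then call reliability() right after construction and decide for itself whether the channel is good enough.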
...
I wanted to avoid a 'reliability' attribute on the endpoint,
as I don't see a reasonably easy way to switch reliability
in mid stream.  It is most likely that changes in
reliability policy would require different protocols (only
a few protocols will be able to serve both ends), and would
hence need a new connection setup anyway.  Not sure if that
is reasonable though.
Anyway, that would leave us with specification of
reliability on connection setup (i.e. endpoint construction)
- right now that is done via the scheme part of the URL.
Would a flag be more appropriate/explicit?  Probably...
...
b) What if the message arrives more than once?  (E.g., redundant
copies of messages can occur in practice for a number of unreliable
or semi-reliable messaging protocols.)
Good point.  IMHO we should specify that messages arrive 'AT
MOST ONCE' (unreliable) or EXACTLY ONCE (reliable).  I don't
think that we need to support additional modes, nor the
complete spectrum of (un/)reliable transmission modes.  Or?
I cannot remember the document, but there was a spec for defining  
reliability of another kind of messaging bus.  It had
	unreliable: message may or may not arrive (typical UDP).
	semi-reliable: message must arrive (will be retransmitted if not  
acked).  Source keeps resending until it sees the ack.  It is  
possible that the ack is lost, in which case the message may arrive
at the client twice (so the message will arrive at least once,
and possibly more than once).  This protocol is useful since the
message destination does not need to maintain complex message  
state... it just acks messages that arrive (very simple).
	reliable: the message must arrive and it will only arrive once.
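The difference between the semi-reliable and the reliable class can be sketched as a small in-memory simulation.  All names here ('Message', 'AtLeastOnceReceiver', 'ExactlyOnceReceiver', 'send_until_acked') are hypothetical, and the sketch assumes that only acks get lost while the message itself always gets through.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical message carrying a sender-assigned id.
struct Message {
  unsigned id;
  int payload;
};

// Receiver for the semi-reliable class: it keeps no message state and
// simply records (and would ack) every copy that arrives.
struct AtLeastOnceReceiver {
  std::vector<Message> delivered;
  void on_arrival(const Message& m) { delivered.push_back(m); }
};

// Receiver for the reliable class: it remembers the ids it has seen and
// drops repeats, so each message is delivered exactly once.
struct ExactlyOnceReceiver {
  std::set<unsigned> seen;
  std::vector<Message> delivered;
  void on_arrival(const Message& m) {
    if (seen.insert(m.id).second) delivered.push_back(m);
  }
};

// Sender side: retransmit until an ack is seen.  ack_lost[i] says whether
// the ack for attempt i is dropped; this stands in for a lossy network.
// Returns the number of transmissions made.
template <typename Receiver>
std::size_t send_until_acked(Receiver& rx, const Message& m,
                             const std::vector<bool>& ack_lost) {
  assert(!ack_lost.empty());
  std::size_t attempts = 0;
  for (bool lost : ack_lost) {
    rx.on_arrival(m);  // the message itself arrives on every attempt
    ++attempts;
    if (!lost) break;  // ack seen: stop resending
  }
  return attempts;
}
```

With the acks for the first two attempts lost, the stateless receiver ends up with three copies of the message, while the deduplicating receiver delivers one.  The price of the reliable class is the 'seen' set, which is exactly the per-message state the semi-reliable receiver avoids.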
...
...
c) It says this document will not address things at a protocol
level, but I think these issues are semantic and therefore must be
addressed by the API.
IC, good point.  Well, flags on the endpoint construction
would solve that.
...
d) I think there are a number of attributes that users should be
able to supply or query when opening a message service connection.
As in JMS (Java Message Service), we should be able to specify
whether this is a point-to-point (message queue) or a
publish-subscribe (message bus) interface.  The API should not require
any work to support both since point-to-point is a sub-category of
the message-bus, but it should be an attribute of the message
interface that a user can force/specify in the opening of the
connection.
Hmm, point to point could be simply established on
application level, by just setting up a single connection.
Uhm, problem: serve() can only be started, not stopped.  An
int argument to serve(), giving the number of clients to
wait for, would solve that.
// point to point:
  saga::message  msg;
  saga::endpoint ep;
  ep.serve (1);     // allow one client
  ep.recv  (&msg);  // expect one message
  exit (0);
// publish/subscribe
  saga::message  msg;
  saga::endpoint ep;
  ep.serve ();      // allow n clients
  while ( 1 ) {
    ep.recv  (&msg);  // expect m messages
  }
Does that make sense?
That does address one of the cases (not exactly the one I was  
thinking of).
It should also allow -1 to specify *any-number-of-clients* at the
underlying protocol's discretion.

But now for a semantic thicket....

Case 1: I connect to a named destination and I am joined together  
with everyone else who has connected to that port.  In this case, any  
message I send will be broadcast to everyone else and vice versa
(message bus).  This is the equivalent of a publish-subscribe service.
Case 2: I connect to a port and I own the service.  This appears to  
be the case you are setting up above as it only allows one client to  
join the service.
Case 3: I connect to the port and fork a sub-process that does not
share the messages with the other clients that have connected to that
port (like an HTTP server).  I don't quite see how that kind of
point-to-point message service is supported.

I think we need an attribute for the message port that says whether  
it is a bus or a point-to-point.  In addition, I like the idea of  
setting the message queue length (as you have above).  I was not
thinking of that, but I can see the value of creating a
first-come-first-serve message port as well (if that is not too complicated).
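To make the bus versus point-to-point distinction concrete, here is a minimal in-memory sketch.  The 'Topology' attribute and the 'Port' class are hypothetical stand-ins, not part of the draft API; the sketch only models who sees a message in Case 1 (bus) and Case 3 (private channel to the port owner), and leaves out the queue length idea.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical port attribute; the draft API has no such flag yet.
enum class Topology { Bus, PointToPoint };

// In-memory stand-in for a message port.  Every connected client gets an
// inbox; the port owner has one too (used only in PointToPoint mode).
class Port {
public:
  explicit Port(Topology t) : topology_(t) {}

  // The attribute is set at construction and can only be queried.
  Topology topology() const { return topology_; }

  // Connects a new client and returns its id.
  std::size_t connect() {
    inboxes_.emplace_back();
    return inboxes_.size() - 1;
  }

  // Bus: the message goes to every other connected client.
  // PointToPoint: the message goes only to the port owner.
  void send(std::size_t from, const std::string& msg) {
    assert(from < inboxes_.size());
    if (topology_ == Topology::PointToPoint) {
      owner_inbox_.push_back(msg);
      return;
    }
    for (std::size_t i = 0; i < inboxes_.size(); ++i)
      if (i != from) inboxes_[i].push_back(msg);
  }

  const std::vector<std::string>& inbox(std::size_t client) const {
    assert(client < inboxes_.size());
    return inboxes_[client];
  }

  const std::vector<std::string>& owner_inbox() const { return owner_inbox_; }

private:
  Topology topology_;
  std::vector<std::vector<std::string>> inboxes_;
  std::vector<std::string> owner_inbox_;
};
```

On a Bus port, a message sent by one client shows up in the inbox of every other client.  On a PointToPoint port it reaches only the owner, so clients never see each other's traffic.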
...
...
Another thing to support is specification or query of
the message service reliability (something that deserves at least a
subsection to define).  That is, the document should define classes of
reliability (just as done with other XML-based messaging APIs that
even define intermediate cases for unreliable messaging that
guarantee message arrival, but do not ensure that duplicate messages
will not arrive).  The API should allow a user to specify these
semantic attributes and should not allow a connection to be built if
the underlying protocol for the message connection cannot meet those
attributes.
Again, that could be done by using flags on the ep creation.
I guess that a 'more reliable' connection than requested
would be possible?  E.g.,
saga::endpoint ep (saga::message::Unreliable);
could actually set up an unreliable or a reliable
connection, as both would fulfill the user's request?
Yes, I think it is sufficient to define these things at the setup of  
the endpoint (not something you would change at runtime).
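The setup-time check discussed above could look roughly like the sketch below.  The 'Reliability' enum, 'name', 'satisfies' and 'negotiate' are all hypothetical names, and the sketch assumes the three classes form a simple order in which a stronger class always fulfils a weaker request; whether that holds for every application is an open question.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical reliability classes, ordered from weakest to strongest.
enum class Reliability { Unreliable = 0, AtLeastOnce = 1, ExactlyOnce = 2 };

// Human-readable name, used in the error message below.
const char* name(Reliability r) {
  switch (r) {
    case Reliability::Unreliable:  return "unreliable";
    case Reliability::AtLeastOnce: return "at-least-once";
    case Reliability::ExactlyOnce: return "exactly-once";
  }
  assert(false && "unhandled Reliability value");
  return "unknown";
}

// True if a channel offering 'offered' meets a request for 'requested'.
// Assumption: a stronger class always satisfies a weaker request.
bool satisfies(Reliability offered, Reliability requested) {
  return static_cast<int>(offered) >= static_cast<int>(requested);
}

// Hypothetical check run once at endpoint setup: returns the class that
// is actually in effect, or throws if the protocol cannot meet the request.
Reliability negotiate(Reliability offered, Reliability requested) {
  if (!satisfies(offered, requested))
    throw std::runtime_error(std::string("protocol offers ") + name(offered) +
                             " delivery, but " + name(requested) +
                             " was requested");
  return offered;
}
```

Under these assumptions, asking for Unreliable on an ExactlyOnce protocol succeeds and reports ExactlyOnce, while asking for AtLeastOnce on an Unreliable protocol throws before any connection is built.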

-john