Re: [SAGA-RG] SAGA Message API Extension

24 Jan 2007

      Hi group, 

Werner and I, and separately Hartmut and I, chatted about
the message object, structured and typed data buffers etc.

We would like to propose the following approach:

  - leave the message data buffer unstructured and untyped
  - allow language bindings to add support for packing
    native data types into the buffer (similar to MPI and
    PVM)
  - use this message buffer for the message API, but
    possibly also for streams and files (in addition to the
    char* we support right now)
  - later discuss more structured message buffers on top of
    the unstructured ones
  - even later discuss message buffers with specific data
    models on top of the structured ones (that may be domain
    specific, and outside of the SAGA scope per se - more as
    informational docs or community practice)
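
To make the second bullet a bit more concrete, here is a minimal
sketch of what a language binding could layer on top of an untyped
buffer.  Everything in it (byte_buffer, pack, unpack) is a
hypothetical name chosen for illustration, not part of the draft,
and it deliberately ignores byte ordering between hosts:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical untyped message buffer; not part of the SAGA draft.
class byte_buffer {
public:
    // Append the raw bytes of a trivially copyable value (host byte order).
    template <typename T>
    void pack(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be packed");
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(&value);
        data_.insert(data_.end(), p, p + sizeof(T));
    }

    // Read the next value of type T and advance the read position.
    template <typename T>
    T unpack() {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be unpacked");
        assert(read_pos_ + sizeof(T) <= data_.size());
        T value;
        std::memcpy(&value, data_.data() + read_pos_, sizeof(T));
        read_pos_ += sizeof(T);
        return value;
    }

    // The buffer itself stays an opaque run of bytes.
    const unsigned char* data() const { return data_.data(); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<unsigned char> data_;
    std::size_t read_pos_ = 0;
};
```

The receiver has to unpack in the same order and with the same types
the sender packed, much as with MPI_Pack and MPI_Unpack; the buffer
carries no type information of its own.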

Let's pick this up at OGF next week from here...

Best regards, 

  Andre.

Quoting [Werner Benger] (Jan 19 2007):
...
To: Andre Merzky <andre@merzky.net>
Subject: Re: SAGA Message API Extension
From: Werner Benger <benger@zib.de>
Cc: Andrei Hutanu <ahutanu@cct.lsu.edu>, John Shalf <JShalf@lbl.gov>,
  SAGA RG <saga-rg@ogf.org>,
  Gregor von Laszewski <gregor@mcs.anl.gov>
Hi Andre,
On Thu, 18 Jan 2007 15:47:34 -0600, Andre Merzky <andre@merzky.net> wrote:
...
Hi Werner,
Quoting [Werner Benger] (Jan 18 2007):
...
Hi Andre,
I have two other remarks, which might be orthogonal to the current
draft, but which might still be good to mention there:
* Structured messages:
The current draft just talks about transporting an array of bytes,
  but in practice we might want to transfer floats/doubles/ints etc.
  While this *might* be implemented on top of the current msg API,
  this would be a waste if the low-level protocol implementation
  (e.g. MPI) already would support such types (including byte ordering
  conversion). As such, it would be useful to have the option to use
  such mechanisms from a low-level protocol if supported. If not,
  then it would need to be taken care of on top of the current level.
Ah, good point - that at least needs clarification in the
spec!
Yes, you are right: the focus on opaque messages is a
limitation for many use cases.  OTOH, support for primitive
types such as ints or floats doesn't buy you that much,
and for more complex structures... - well, who knows better
than you that agreeing on a data model is a reeaaally
difficult job? ;-)
ah, data model issues are certainly shooting too high here.
however, a self-descriptive typed message structure, such as
is possible in MPI or HDF5, would do the job in a generic
way. It doesn't need to do any more than native C allows, i.e.
support native types and arbitrary structures built from them.
That's just the same level as other low-level protocols might
already support.
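
As a rough sketch of what "self-descriptive" could mean at this
level: a small header (element type, byte order, count) in front of
the raw payload, so that the receiver swaps bytes only when the two
hosts actually differ.  The names typed_msg, elem_type, describe and
read_doubles are made up for this example and appear in neither the
draft nor MPI/HDF5; the sketch also assumes 8-byte doubles:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(double) == 8, "sketch assumes 8-byte doubles");

// Hypothetical type tags; the draft defines nothing like this.
enum class elem_type : std::uint8_t { int32 = 1, float64 = 2 };

inline std::size_t elem_size(elem_type t) {
    return t == elem_type::int32 ? 4 : 8;
}

// True if this host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Self-descriptive message: a small header plus raw payload bytes.
struct typed_msg {
    elem_type type;
    bool little_endian;   // byte order of the payload
    std::uint32_t count;  // number of elements in the payload
    std::vector<unsigned char> payload;
};

// Describe an array of doubles in the sender's native byte order.
inline typed_msg describe(const std::vector<double>& values) {
    typed_msg m;
    m.type = elem_type::float64;
    m.little_endian = host_is_little_endian();
    m.count = static_cast<std::uint32_t>(values.size());
    m.payload.resize(values.size() * sizeof(double));
    if (!values.empty())
        std::memcpy(m.payload.data(), values.data(), m.payload.size());
    return m;
}

// Recover the doubles, swapping bytes only if the byte orders differ.
inline std::vector<double> read_doubles(typed_msg m) {
    assert(m.type == elem_type::float64);
    assert(m.payload.size() == m.count * elem_size(m.type));
    if (m.little_endian != host_is_little_endian()) {
        for (std::size_t i = 0; i < m.count; ++i) {
            unsigned char* e = m.payload.data() + i * sizeof(double);
            std::reverse(e, e + sizeof(double));
        }
    }
    std::vector<double> out(m.count);
    if (m.count > 0)
        std::memcpy(out.data(), m.payload.data(), m.payload.size());
    return out;
}
```

Arbitrary structures built from native types would need a richer
header than this, but the principle stays the same.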
...
So, basically the message API tries to avoid that topic for
the main reason that it seems difficult to define.  I would
wholeheartedly support any activity which tries to define
domain or use case specific flavours of the API.  That would
be a simple exercise: you would only need to redefine the
set_data method on the msg class accordingly.
A generic self-description of data should be fully independent
 of use cases or application domains.
...
So, the question is: is a very limited support for primitive
data types something (really) useful?
In practice, you would need it anyway. So it's kind of moving
operations that the app needs to do anyway into the common
denominator SAGA. Plus we get the option that these operations
might be done by a lower-level protocol interface which might
be more efficient than if done by the application.
What I would tend to avoid is to use e.g. a protocol, which knows
about floats and byteordering, just to shuffle bytes, and then
do the byteordering manually again.
...
...
* Interfacing Event Loops:
If we want to use this API from within a larger application instead
  of just self-standing programs, we might want to use mechanisms such
  as socket callbacks for event handling (eg. the QSocketNotifier or
  under X11 using XtAppAddInput). Would be good to have some support
  to allow this, even though it might be optional.
Right, that's an important point, in particular for the
visualization use cases.  It's actually in the spec, but well
hidden :-)  The endpoint class definition says:
class endpoint : implements saga::object
                   implements saga::async
                   implements saga::monitoring
                   [...]
saga::async is actually an empty interface, but what that
means is that the class will contain several versions of
every class method: a synchronous one, and 3 additional
ones.  In C++ the rendering would look like:
  // connection setup
  saga::endpoint ep;
  ep.serve ();

  // normal, synchronous version
  saga::msg m = ep.recv ();

  // task version 1: synchronous
  saga::task t1 = ep.recv <saga::task::Sync>  (msg);

  // task version 2: asynchronous
  saga::task t2 = ep.recv <saga::task::ASync> (msg);

  // task version 3: task
  saga::task t3 = ep.recv <saga::task::Task>  (msg);
These three versions of the recv method each return a task; the
tasks differ only in their state: t1 is Done, t2 is Running,
and t3 is New (not yet running).  You can get notification
on when a task is Done etc.
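
The three states can be illustrated with a small stand-in.  This is
only a mock for explanation: mock_task, task_mode and task_state are
invented names, no thread is started, and the "Running" state merely
defers the work until wait() is called.  It says nothing about how a
real implementation executes tasks:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-ins; these are not the SAGA types.
enum class task_state { New, Running, Done };
enum class task_mode  { Sync, ASync, Task };

class mock_task {
public:
    mock_task(std::function<std::string()> work, task_mode mode)
        : work_(std::move(work)) {
        if (mode != task_mode::Task) run();   // Sync and ASync start at once
        if (mode == task_mode::Sync) wait();  // Sync also finishes at once
    }

    // New -> Running.  No thread is started in this mock.
    void run() {
        if (state_ == task_state::New) state_ = task_state::Running;
    }

    // Running -> Done.  The deferred work is executed here.
    // Calling wait() on a task that is still New does nothing.
    void wait() {
        if (state_ != task_state::Running) return;
        result_ = work_();
        state_ = task_state::Done;
    }

    task_state state() const { return state_; }

    // The result is only available once the task is Done.
    const std::string& result() const {
        assert(state_ == task_state::Done);
        return result_;
    }

private:
    std::function<std::string()> work_;
    task_state state_ = task_state::New;
    std::string result_;
};
```

A task created in Task mode stays New until the caller invokes run(),
which is what distinguishes it from the ASync variant.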
Task means a thread or is this just some saga-internal data
type?
...
Additionally, the spec defines some metrics on the endpoint,
among them:
// Metrics:
      //   name:  Message
      //   desc:  fires if a message arrives
      //   mode:  Read
      //   unit:  1
      //   type:  String
      //   value: ""
      //   notes: - the value is the endpoint URL of the
      //            sending party, if known.
These metrics are used by the monitoring interface, which is
also implemented by the endpoint.  With that, you can add
callbacks to an endpoint which get called when a new msg
arrives:
saga::endpoint ep;
  ep.add_callback ("Message", my_cb);
  ep.serve ();
my_cb is a user defined class which implements
saga::callback, and whose cb() method then gets called on
incoming messages.
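
For illustration, the callback mechanism could be mocked up roughly
as below.  The callback interface, mock_endpoint, deliver() and
record_senders are all assumptions made for this sketch; in
particular the cb() signature is guessed, and deliver() merely
simulates an arriving message:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical callback interface; the cb() signature is an assumption.
struct callback {
    virtual ~callback() = default;
    virtual void cb(const std::string& metric,
                    const std::string& value) = 0;
};

// Hypothetical endpoint that only knows how to fire metric callbacks.
class mock_endpoint {
public:
    // Register a callback for the named metric (for example "Message").
    // The endpoint does not own the callback object.
    void add_callback(const std::string& metric, callback& c) {
        callbacks_[metric].push_back(&c);
    }

    // Simulate an incoming message: fire the "Message" metric with the
    // sender's endpoint URL as its value.
    void deliver(const std::string& sender_url) {
        auto it = callbacks_.find("Message");
        if (it == callbacks_.end()) return;
        for (callback* c : it->second) c->cb("Message", sender_url);
    }

private:
    std::map<std::string, std::vector<callback*>> callbacks_;
};

// Example callback: record the sender so that the application's own
// event loop can pick the message up later.
struct record_senders : callback {
    std::vector<std::string> seen;
    void cb(const std::string&, const std::string& value) override {
        seen.push_back(value);
    }
};
```

An application with its own event loop would typically keep cb()
short, as here, and do the real work when the loop next runs.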
Sorry if that was somewhat lengthy.  Anyway, point is: async
ops and notification are covered, by means of the SAGA Core
Look&Feel, which is inherited by this API.
Hm, ok. Good. Maybe you can add some concrete examples in the
appendix, such as what interfacing with Qt would really look like?
Werner
...
Cheers, Andre.
...
Werner
On Thu, 18 Jan 2007 14:18:52 -0600, Andre Merzky <andre@merzky.net> wrote:
...
Hi John, Andrei,
you are right: getting some feedback from the transport
level folx is certainly a good idea.  The API draft won't go
into public comment for another month or so (at least), and
then it will stay in public comment for another 2 months or
longer - that should give us enough time to contact them.
About ordering: the text Andrei cited is in the spec because
ordering is, as of now, not an attribute of the connection
or endpoint - so the spec tries to nail it down.  It says
"MUST be ordered, but no global ordering is required"
because I thought that this covers the majority of use
cases.
I don't think there are use cases which require global
ordering - or at least not enough to justify a requirement
for global ordering.  What is your opinion?  Also, that's
really difficult to implement in Grids IMHO.
Use cases which do not require ordering should be happy with
order preserving connections, too.  Question now is: does
the benefit of un-ordered implementations (simpler, smaller
footprint) justify an attribute on API level?  Or are there
use cases which require non-ordered delivery for other
reasons?
Cheers, Andre.
Quoting [Andrei Hutanu] (Jan 18 2007):
...
Hi,
...
>
>2) I see ordering is enforced, could that be an option?
I think ordering is *not* enforced, but I do wonder if it should be
an option or a channel property (certainly semireliable will likely
result in some reordering whereas a TCP channel would enforce ordering
of the messages for instance).
This is a controversial topic in the HPC message passing community
(whether msg. ordering is a good or bad thing to enforce at the
hardware level).
I was thinking the same (no strong feelings for either option or
property) but the text says otherwise:
In 2.1 introduction :
In contrast, this message API extension guarantees that message blocks
of arbitrary size are delivered in order, and intact, without the need
for additional application level coordination or synchronization.
and
then in 2.1.7 reliability correctness and ordering
The order of sent messages MUST be preserved by the implementation.
Global ordering is, however, not guaranteed to be preserved:
Assume three endpoints A, B and C, all connected to each other. If A
sends two messages [a1, a2], in this order, it is guaranteed that both B
and C receive the messages in this order [a1, a2]. If, however, A sends
a message [a1] and then B sends a message [b1], C may receive the
messages in either order, [a1, b1] or [b1, a1].
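
The guarantee quoted above can be expressed as a small check over
what one receiver sees.  This is just a sketch: the delivery type and
per_sender_order_preserved() are invented for illustration, and it
assumes each sender numbers its own messages in increasing order:

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// One received message: (sender name, sender-local sequence number).
using delivery = std::pair<std::string, int>;

// True if, for every sender, sequence numbers arrive strictly
// increasing.  Interleaving between different senders is not
// constrained, which matches "no global ordering is required".
inline bool per_sender_order_preserved(
        const std::vector<delivery>& received) {
    std::map<std::string, int> last_seen;
    for (const delivery& d : received) {
        auto it = last_seen.find(d.first);
        if (it != last_seen.end() && d.second <= it->second)
            return false;  // a message overtook an earlier one
        last_seen[d.first] = d.second;
    }
    return true;
}
```

With this check, both [a1, b1] and [b1, a1] pass at C, while
[a2, a1] fails.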
Andrei
-- 
"So much time, so little to do..."  -- Garfield