
Hi,

these are notes from the ad-hoc meeting at SC-05 about a message
oriented communication API. The API might be considered for
inclusion into the GGF SAGA API spec at some point - for now it is
only supposed to provide coherent discussion and development in the
interested groups.

As a reminder, material about SAGA can be found at:

  http://wiki.cct.lsu.edu/saga/

Meeting Participants:
---------------------

  - Jason Leigh          (EVL)
  - Venkatram Vishwanath (EVL)
  - Andrei Hutanu        (LSU)
  - John Shalf           (LBNL)
  - Andre Merzky         (VU/LSU)

Definition:
-----------

  Message: chunk of data which is potentially larger than a network
           packet.

Several independent sets of property flags have been identified:

  Reliability Requirements:
  - - - - - - - - - - - - -

    - Reliable     all messages are received exactly once.
                   If received, messages are complete
    - Unreliable   messages are either received or not.
                   If received, messages are complete
    - AtLeastOnce  optional, as for Reliable, but messages can be
                   received more than once

  Correctness Requirements:
  - - - - - - - - - - - - -

    - ByteErrors   received messages MAY contain byte errors
    - NoByteErrors received messages MUST NOT contain byte errors

  Ordering Requirement:
  - - - - - - - - - - -

    - Ordered      messages MUST be received in order
    - NotOrdered   messages MAY be received out of order

API considerations:
-------------------

  - it was felt that a BSD like connection setup is most useful
  - asynchronous receiving of complete messages is needed
    (viz. use cases!)
  - striping/multicasting on application level is not considered for
    now (multiple senders/receivers)

API proposal:
-------------

  - establish connection:
    - bsd like: listen/accept/connect
    - port range should be specifiable
    - properties should be specified as flags (changeable at runtime)

  - write:
      write (buffer, size, BLOCK | NO_BLOCK)
    message is written completely or not at all (if possible)

  - read: two step mechanism:
      (int handle, int size) = query_size ();
      (char* buffer)         = read (handle, size);
    - handle can be zero, if size is known (one step read)
    - buffer needs to be allocated by application.
    - if size is zero, the buffer is allocated by the implementation
      and returned to be freed by the application (one step read)
    - if size is smaller than the real message size, message gets
      truncated, remainder gets lost (read again not possible)
    - if size is larger, buffer gets padded with 0

  - asynchronous method calls:
    - as in the SAGA task model, with callbacks

  - connection shutdown:
    - as in bsd (close ())

Please feel free to send corrections, comments etc.

Thanks, Andre.

--
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky@cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+
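To make the "properties as flags" idea from the notes concrete, here is a rough C sketch of how the three property sets might be encoded as bit flags. All identifiers (MSG_RELIABLE, msg_flags_valid, etc.) are hypothetical and not part of any agreed API; the rule that a connection picks exactly one value from each set is also an assumption, since the notes only say the sets are independent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names; the notes only name the properties. */
enum msg_flags {
    MSG_RELIABLE       = 1 << 0,  /* received exactly once          */
    MSG_UNRELIABLE     = 1 << 1,  /* received or not                */
    MSG_AT_LEAST_ONCE  = 1 << 2,  /* may be received more than once */
    MSG_BYTE_ERRORS    = 1 << 3,  /* MAY contain byte errors        */
    MSG_NO_BYTE_ERRORS = 1 << 4,  /* MUST NOT contain byte errors   */
    MSG_ORDERED        = 1 << 5,  /* MUST be received in order      */
    MSG_NOT_ORDERED    = 1 << 6   /* MAY be received out of order   */
};

#define MSG_RELIABILITY_MASK \
    (MSG_RELIABLE | MSG_UNRELIABLE | MSG_AT_LEAST_ONCE)
#define MSG_CORRECTNESS_MASK (MSG_BYTE_ERRORS | MSG_NO_BYTE_ERRORS)
#define MSG_ORDERING_MASK    (MSG_ORDERED | MSG_NOT_ORDERED)

/* True if exactly one bit of `mask` is set in `flags`. */
static bool exactly_one(unsigned flags, unsigned mask)
{
    assert(mask != 0);
    unsigned v = flags & mask;
    return v != 0 && (v & (v - 1)) == 0;
}

/* Assumption: a valid flag word picks one value from each set. */
bool msg_flags_valid(unsigned flags)
{
    return exactly_one(flags, MSG_RELIABILITY_MASK)
        && exactly_one(flags, MSG_CORRECTNESS_MASK)
        && exactly_one(flags, MSG_ORDERING_MASK);
}
```

With this encoding, MSG_RELIABLE | MSG_NO_BYTE_ERRORS | MSG_ORDERED would be accepted, while combining MSG_RELIABLE with MSG_UNRELIABLE would be rejected.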

Andre, Thanks for the notes!
- read: two step mechanism: (int handle, int size) = query_size (); (char* buffer) = read (handle, size);
- handle can be zero, if size is known (one step read) - buffer needs to be allocated by application. - if size is zero, the buffer is allocated by the implementation and returned to be freed by the application (one step read)
This part is a bit confusing .. with the one step read you lose the length of the message? It should probably be

  (char* buffer, int size) = read ()

for the one step read .. Or am I reading it wrong ..

Andrei

Quoting [Andrei Hutanu] (Nov 22 2005):
Andre,
Thanks for the notes!
- read: two step mechanism: (int handle, int size) = query_size (); (char* buffer) = read (handle, size);
- handle can be zero, if size is known (one step read) - buffer needs to be allocated by application. - if size is zero, the buffer is allocated by the implementation and returned to be freed by the application (one step read)
This part is a bit confusing .. with the one step read you lose the length of the message? it should probably be (char* buffer, int size) = read () for the one step read ..Or am I reading it wrong..
No, you are probably right. IIRC, we pinned down two different semantics for one step read:

  a) size is known, malloc by application
       (buffer) = read (handle = NULL, size);

  b) size unknown, malloc by implementation
       (buffer, size) = read ();

Thanks, Andre.

On Nov 22, 2005, at 3:21 PM, Andre Merzky wrote:
No, you are probably right. IIRC, we pinned down two different semantics for one step read:
a) size is known, malloc by application (buffer) = read (handle = NULL, size);
The read should be like POSIX and return the buffer size rather than the buffer itself.
b) size unknown, malloc by implementation (buffer, size) = read ();
Wait, I think Andre had it correct. There is also a

  c) size unknown, but buffer is presized to what the application
     programmer expects to be the max message size.

Under this circumstance, you can do a single step read which returns the *actual* size of the read. If the message was larger than the buffer size declared in the read(), then the read will be truncated and the truncated message data will be dropped.

  int read(handle, buffer, size)

buffer is presumed to be "size" bytes in length. Read returns the actual message size. If the message is larger than the buffer size, then the truncated data is dropped.

John Shalf wrote:
Wait, I think Andre had it correct. There is also a

  c) size unknown, but buffer is presized to what the application
     programmer expects to be the max message size.

Under this circumstance, you can do a single step read which returns the *actual* size of the read. If the message was larger than the buffer size declared in the read(), then the read will be truncated and the truncated message data will be dropped.

  int read(handle, buffer, size)

buffer is presumed to be "size" bytes in length. Read returns the actual message size. If the message is larger than the buffer size, then the truncated data is dropped.
Isn't this the same as a? I see only two cases:

  a) user knows something (max expected size or actual size)
       int read(buffer, size), described above

  b) user doesn't know anything
       (buffer, size) = read ();

I'm probably missing something obvious ..

Andrei

On Nov 22, 2005, at 4:52 PM, Andrei Hutanu wrote:
Isn't this the same as a? I see only two cases:

  a) user knows something (max expected size or actual size)
       int read(buffer, size), described above

  b) user doesn't know anything
       (buffer, size) = read ();

I'm probably missing something obvious ..
Whoops, then we really need

  a) size is not known, so we do two step
       int getMessageSize(handle);
       <malloc by application>
       int read(handle,buffer,size)

  b) size is not known, but we let implementation malloc automatically
       char *readAutoAllocate(handle);

  c) size is known, so we just do a one-step read
       int read(handle,buffer,size);
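For reference, the three read variants listed above could be sketched in C roughly as follows. This is only an illustration over an in-memory stand-in for a connection: the msg_handle struct is hypothetical, read is renamed msg_read here to avoid clashing with POSIX read(), and the size_out parameter on readAutoAllocate is an added assumption so that the message length is not lost. The return value follows the "read returns the actual message size" reading from earlier in the thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a connection: holds one pending message. */
typedef struct {
    const char *data;  /* pending message, NULL if none      */
    int         size;  /* size of the pending message, bytes */
} msg_handle;

/* a) step one: size of the pending message, or -1 if there is none. */
int getMessageSize(const msg_handle *h)
{
    return h->data ? h->size : -1;
}

/* a) step two, and c): copy at most `size` bytes into the caller's
 * buffer and consume the message. Returns the actual message size;
 * anything beyond `size` bytes is dropped and cannot be read again. */
int msg_read(msg_handle *h, char *buffer, int size)
{
    assert(buffer != NULL && size >= 0);
    if (!h->data)
        return -1;
    int actual = h->size;
    int n = actual < size ? actual : size;
    memcpy(buffer, h->data, (size_t)n);
    h->data = NULL;  /* message is consumed, truncated or not */
    h->size = 0;
    return actual;
}

/* b) the implementation allocates; the caller must free() the result.
 * size_out is an assumption, added so the length is reported too. */
char *readAutoAllocate(msg_handle *h, int *size_out)
{
    int size = getMessageSize(h);
    if (size < 0)
        return NULL;
    char *buffer = malloc(size > 0 ? (size_t)size : 1);
    if (!buffer)
        return NULL;
    msg_read(h, buffer, size);
    *size_out = size;
    return buffer;
}
```

In this sketch a caller can detect truncation by comparing the return value of msg_read with the buffer size it passed in.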

Hmm, that is what I initially wanted to say :-) I should have marked in/out params more clearly...

A.

Quoting [John Shalf] (Nov 22 2005):
Whoops, then we really need a) size is not known, so we do two step int getMessageSize(handle); <malloc by application> int read(handle,buffer,size)
b) size is not known, but we let implementation malloc automatically char *readAutoAllocate(handle);
c) size is known, so we just do a one-step read int read(handle,buffer,size);

I compiled all the notes on the wiki:

  http://wiki.cct.lsu.edu/saga/space/SAGA+API/Messages

Andrei

Andre Merzky wrote:
Hmm, that is what I initially wanted to say :-) I should have marked in/out params more clearly...
A.

Hello,

Just one comment on the read proposal:

"where size is passed to the read method, if size is smaller than the real message size, message gets truncated, remainder gets lost (read again not possible). If size is larger, buffer gets padded by 0"

Trailing 0s are unnecessary if the read returns the number of bytes that were read. It is a waste of CPU time to do that. The part of the buffer that is unused should remain "as is".

Another issue is that this API forces the message to be written from, or read into, the first offset of the buffer. In languages like Java it is not possible to shift the pointer like in C, so the offset must be given explicitly. Again, this is a language binding issue, but it could be put in the API; otherwise the implementations will not have the same signatures. We would then end up with implementations that do not reflect the generic API for most functions.

Andrei Hutanu wrote:
I compiled all the notes on the wiki : http://wiki.cct.lsu.edu/saga/space/SAGA+API/Messages
Andrei
-- Best regards, Pascal Kleijer ---------------------------------------------------------------- HPC Marketing Promotion Division, NEC Corporation 1-10, Nisshin-cho, Fuchu, Tokyo, 183-8501, Japan. Tel: +81-(0)42/333.6389 Fax: +81-(0)42/333.6382
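The two suggestions in this message (no zero padding, and an explicit buffer offset) might look roughly like the following in C. This is a sketch under assumptions: msg_read_at and its parameter list are hypothetical, and the message is passed in directly as a stand-in for whatever the connection would deliver. Here the return value is the number of bytes actually copied, as proposed in the message above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical signature. Copies a message into
 * buffer[offset .. capacity) and returns the number of bytes copied,
 * or -1 on bad arguments. Bytes outside the copied range are left
 * untouched, so there is no zero padding. */
int msg_read_at(const char *msg, int msg_size,
                char *buffer, int capacity, int offset)
{
    assert(msg != NULL && buffer != NULL);
    if (msg_size < 0 || offset < 0 || offset > capacity)
        return -1;
    int room = capacity - offset;             /* space after offset */
    int n = msg_size < room ? msg_size : room; /* truncate if needed */
    memcpy(buffer + offset, msg, (size_t)n);
    return n;
}
```

An explicit offset of this kind maps directly onto language bindings that cannot do pointer arithmetic, since the binding can pass the array and the offset as separate arguments.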

On Dec 4, 2005, at 3:49 PM, Pascal Kleijer wrote:
Hello,
Just one comment on the read proposal:
"where size is passed to the read method, if size is smaller than the real message size, message gets truncated, remainder gets lost (read again not possible). If size is larger, buffer gets padded by 0"
Hi Pascal, the truncation is obviously necessary to prevent buffer-overflow attacks (either accidental or intentional). The padding for oversized buffers is new to me as well. I know that this is done inside of security-oriented software like LibSSH (always zero-padding buffers on short-reads). I assume there is some security reason for doing this, but would appreciate any insight from others on the list as to whether the return size for the short read (that indicates the correct message size) is sufficient.
Trailing 0s are unnecessary if the read returns the number of bytes that were read. It is a waste of CPU time to do that. The part of the buffer that is unused should remain "as is".
Another issue is that this API forces the message to be written from, or read into, the first offset of the buffer. In languages like Java it is not possible to shift the pointer like in C, so the offset must be given explicitly.
Good point. This may need to be dealt with for the Fortran bindings as well in order to be correct to the standard. (We can pass array(offset) and it will work for an underlying C implementation, but that might be an abuse of the standard.)
Again this is a language binding issue, but it could be put in the API, otherwise the implementations will not have the same signatures. We will then end up with implementations that do not reflect the generic API for most functions.

I find the padding a bit strange as well .. we should leave it out ..

Andrei
participants (5)
- Andre Merzky
- Andrei Hutanu
- John Shalf
- John Shalf
- Pascal Kleijer