Re: [saga-rg] SAGA spec proof read

9 Jun 2006

      Hi Pascal, 

comments to the comments... :-)

(removed some topics, agreed with them, and will change
accordingly)

Quoting [Pascal Kleijer] (Jun 09 2006):
...
Hi,
some in-line comments...
...
...
Some extensions:
Purpose: "Reads up to 'size' bytes of data from the stream into an array 
of bytes. An attempt is made to read as many as 'size' bytes, but a 
smaller number may be read. The number of bytes actually read is 
returned as 'nbytes'. If 'size' is zero, then no bytes are read and 
nbytes is 0; otherwise, there is an attempt to read at least one byte."
That should already be the case - that is why 'nbytes' is
returned, indicating the actual number of bytes copied.
nbytes can be smaller than size w/o posing an error
condition (as in POSIX).  Is that what you propose?
Well, I think the whole description should be redone based on whether 
the call is blocking or not. Also, the 'nbytes' returned should carry, 
like in POSIX, the error code (negative). If a new read is made after a 
fatal error, then the exception should kick in. See below.
See below.
...
...
...
3. In this implementation we have no choice but to wait for an exception 
to be thrown when the stream has dried up (using the read method). It 
would be nice to have nbytes set to -1 the first time the end of stream 
is reached (if applicable); the next call to read would then throw an 
exception.
What is the error condition the exception is supposed to
catch:
- stream died
- no data available
The first one is available via the state of the stream.
The second one is what you refer to, right?  But I think
that this is not an error condition which should raise an
exception: data might be available again a second later.
Whether a lack of data poses an error condition or not should
be decided at application level, IMHO.
Am I missing something?
Basically, the exception should be thrown if the last read returned 
with an error. If the error was rectified in the meantime, the read 
should go on as a new call. Therefore the state machine is important to 
have, because a change of state will flush the last error.
I am not sure if I would agree with this one.

Assume the following pseudo code:

 char cache[10] = "";
 int  nbytes;

 while ( nbytes = socket.read (cache, 10))
 {
   if ( 0 < nbytes )
   { 
     // got some data: use it
     print (cache);
   }
   else
   { 
     // nothing read this time: wait and try again
     sleep (1);
   }  
 }

Adding an exception on the second failed read breaks that
schema, and makes the code more complex.  Why?  Because the
application has to explicitly keep track of the stream
state.

However, POSIX and most other stream APIs I can think of
would fully support the schema above.

Also, what is the application supposed to do after the first
non-read?  It is bound to eventually try again, and raise an
exception.

However, that is related to the next point you raise...

(PS.: new data would not change the state, really, in the
sense of state diagram...)
...
...
...
4. There is no way to check the size of the data waiting to be read. It 
would be nice if we could query the stream on how many bytes in the pipe 
are waiting. This is important if the client must allocate a buffer to 
read this data. So an additional method like 'available' should be 
added. Note that this will slightly overlap with the 'wait' method for 
the 'read' or 'any' mask, but will return a more accurate figure. Another 
solution is to extend the wait method.
I would have no idea how to implement that on a stream:
without an additional protocol, you have no chance to know
how many bytes will arrive.
However, that is one of the reasons why the message API was
proposed: that implies an additional protocol on the wire,
but will allow the application to query the message size
before performing the message read.
I am not asking how many will arrive, but how many 'have' arrived. That 
means how many are currently stored somewhere within the stream and are 
guaranteed to be returned when the next call to read is made. In that 
case we can test for a possible block (if blocking) and for the size of 
the buffer to allocate (if necessary), as well as avoid a catch block.
Even that would be difficult to implement.  Assume you
implement that on a BSD socket.  If you do that in a
straightforward way, so just map the read to a socket read
etc., you don't have that test call.

If you want to provide the test call, you have to
continuously READ the data from the socket, push it into a
cache, and return the current size of the cache on the
test call.

  - How much do you cache?  1kB?  1MB? 1GB?
  - How do you know that the app actually wants all the
    data?
  - you imply the existence of a thread in the SAGA
    implementation which continuously reads on the stream
  - you lose zero copy, so degrade performance

These are again exactly the problems the message API is
supposed to solve - there the cache size is given by the
message size, and it is assumed that the application is
interested in one complete message, at least.  And zero copy
should be preserved as the message size will be known in
advance.
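
As a rough idea of what "an additional protocol on the wire" could
mean, here is a minimal length-prefix framing sketch in C. This is not
the proposed message API: the function names and the 4-byte header in
network byte order are assumptions made only for this example, and the
descriptor is assumed to be blocking.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly len bytes; 0 on success, -1 on error or early EOF. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: 4-byte size header, then the payload. */
int msg_send(int fd, const void *data, uint32_t size)
{
    uint32_t hdr = htonl(size);
    if (write_full(fd, &hdr, sizeof hdr) != 0) return -1;
    return write_full(fd, data, size);
}

/* Query the size of the next message before reading its payload. */
int msg_next_size(int fd, uint32_t *size)
{
    uint32_t hdr;
    if (read_full(fd, &hdr, sizeof hdr) != 0) return -1;
    *size = ntohl(hdr);
    return 0;
}

/* Read the payload straight into a buffer the caller sized itself. */
int msg_recv_body(int fd, void *buf, uint32_t size)
{
    return read_full(fd, buf, size);
}
```

The receiver calls msg_next_size first, allocates exactly that many
bytes, and then lets msg_recv_body read directly into that buffer, so
no intermediate cache is needed.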

Cheers, Andre.

-- 
"So much time, so little to do..."  -- Garfield

Andre Merzky