ByteIO Working Group session at GGF14
11.00 - 12.30, 29/6/05

15 attendees

Minutes
--------
* Introduction
	- Next telecon is on the 12th of July, 2005
	- Why was write included?
		-- The hard part is consistency
		-- Not part of the scope.  You can have something like
		   advisory locking outside of the ByteIO interfaces
* Use Cases: Files
	- SAGA has found that it's often useful for the writer of a use
	  case to be able to propose an API
	- GridFTP is an example of a very successful "grid" service;
	  what is the advantage of ByteIO over GridFTP?
		-- GridFTP folks have no problem with idea of a ByteIO 
		   interface which uses GridFTP underneath
	- Does stream also include push model?
		-- no, it doesn't; it's just a standard request for
		   data from a stream, i.e. you may only get back Y bytes
		   even though you requested X bytes
	- What is RNS? 
		-- Resource Namespace Service (see GFS-WG)
	- We need to go back and add the write point of view to 
	  file use cases
	- Two issues bundled in streams: sessions (to amortise cost of
	  open over many operations) - can you overcome repeated
	  authentication/authorisation overheads though?
		-- somewhat implementation dependent
		-- authorisation using X509 is an issue for sessionless
		   access as you may need to parse every time
		-- when caching used to amortise cost, effectively 
		   have sessions underneath 
		-- connection caching gives perf. benefits in GridFTP
		-- implementors can choose to implement either session 
		   or sessionless as required
		-- don't expect that Client API is always sessionless
		   as well (client has the smarts)
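The short-read behaviour noted above (requesting X bytes but getting back only Y) means a client that needs a full buffer has to loop. The following is a minimal sketch of that client-side loop, not anything taken from the ByteIO drafts: `ShortReadStream`, `read_fully`, and the per-call `limit` are hypothetical names invented for illustration.

```python
import io


class ShortReadStream:
    """Hypothetical stream that returns at most `limit` bytes per read call."""

    def __init__(self, data, limit):
        self._buf = io.BytesIO(data)
        self._limit = limit

    def read(self, n):
        # May return fewer bytes than requested; b"" signals end of stream.
        return self._buf.read(min(n, self._limit))


def read_fully(stream, n):
    """Keep reading until n bytes have arrived or the stream is exhausted."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # end of stream reached before n bytes were available
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

This keeps the "smarts" in the client, as suggested in the discussion: the service only ever answers a plain read request.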
* Use Cases: Database 
	- Asynchronous Query
		-- Similar to GridRPC
	- Simple and Complex Insert
	- Copying
* Interfaces
	- not trying to solve all IO problems right now
	- trying to get useful things out to groups now
	- added back in sessionable streamable interface as we think it's too important to leave out
	- Conceptual Interface IRandomByteIO
		-- all returns are void?
			--- come back to that
	- SAGA
		-- complex IO that many apps need
			--- read a byte seek a byte write a byte
			--- can cache
			--- but cost of doing all ops is more than bytes transferred
			--- could cluster ops
			--- or do scheduled read/write (readv and writev)
			--- or do patterns (e.g. I want every second byte for n bytes)
			--- or high-level, e.g. GridFTP eread/estore (complicated to set up though)
				---- out of scope for ByteIO (OGSA-D?)
			--- not sure it's the common case
				---- it is for visualisation
		-- not difficult to do, e.g. scattered reads/writes; does the complication outweigh the benefits?
		-- suggest waiting until spec phase; if someone says we want these then just do it
		-- if there aren't use cases then don't implement it
			--- send in the SAGA use cases
		-- return from read is obvious
	- very difficult to write normative spec which fits all potential profiles e.g. could use BaseFault for WSRF
		-- need nonnormative doc which describes what we are trying to do
		-- additional documents which render the concept to a particular profile, which cover e.g. exceptions and return values 
	- is there QoS? How would you combine it?
		-- out of scope, look at OGSA-D
	- cleaner to have a separate truncate operation?
		-- can still truncAppend 0 bytes to achieve this
		-- more useful/efficient to have atomic truncAppend op
	- benefits for data going into SOAP, easy
		-- require base SOAP protocol
	- dataTransferMech QualifiedName is just a predefined string?
		-- look at BaseNotification, will define some in the group
		-- inclination to put in another spec, but service could come up with their own
	- Conceptual Interface IStreamableByteIO
		-- on a non-streamable resource, you're talking to the thing which has the bytes; destroy it and you conceptually destroy the bytes
		-- on a streamable resource, if you destroy the stream, notionally you don't destroy the bytes, you destroy the stream
		-- "beginning of a stream" is not defined (up to implementation). Don't require implementations to cache forever.
	- seems incomplete without being able to open streams
		-- OGSA traditionally shies away from factories
	- could have a streamable ByteIO interface on a non-streamable object, hard to come up with single factory pattern which covers how you may want to open a resource e.g. additional information for telescope
		-- problem comes up because of the sheer variety of possible sources
		-- isn't everything at least streamable?
		-- example of the other way round: hard disk is not	
	- sessionless interface will not be efficient for remote file servers? caching will work locally but not remotely?
	- why define the non-streamable interface?
		-- easier for implementors: garbage collection
	- change proposed is that you have to have an open call
		-- what's the difference then with the streamable interface
	- is sessionless going to be difficult to get performance
	- Mike Beckerle: I have a problem with two ops for seek and read, seek and write
	- propose that we have seekread, and seekwrite
		-- then what happens to returntype if you just want to do seek?
		-- seek would return a long and read would return bytes
		-- if you want to find out where you are, could query e.g. resource properties? could infer position from seekread
	- seek is a delta from a seek origin, which could be beginning, current or end
	- seems still slightly cumbersome to go through resource properties
	- *** add getCurrentPosition, seekread and seekwrite ***
	- getCurrentPosition will depend on rendering to a particular profile
	- clients should keep track of where they are!
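
The operations discussed in this section (seekread, seekwrite, getCurrentPosition, and an atomic truncAppend) can be sketched against an in-memory byte buffer. This is only an illustration under stated assumptions: the class name `InMemoryByteIO`, the `SeekOrigin` enum, the Python method names, and the exact argument order are hypothetical and are not the group's rendering; in particular, the assumption that truncAppend takes an offset and a payload is made here for the example.

```python
from enum import Enum


class SeekOrigin(Enum):
    BEGINNING = 0
    CURRENT = 1
    END = 2


class InMemoryByteIO:
    """Hypothetical in-memory stand-in for a random-access byte resource."""

    def __init__(self, data=b""):
        self._data = bytearray(data)
        self._pos = 0

    def _resolve(self, origin, delta):
        # A seek is a delta relative to the beginning, current position or end.
        base = {
            SeekOrigin.BEGINNING: 0,
            SeekOrigin.CURRENT: self._pos,
            SeekOrigin.END: len(self._data),
        }[origin]
        target = base + delta
        if target < 0:
            raise ValueError("seek before beginning of resource")
        return target

    def get_current_position(self):
        return self._pos

    def seek_read(self, origin, delta, num_bytes):
        # Seek and read in one operation; may return fewer bytes near the end.
        start = self._resolve(origin, delta)
        chunk = bytes(self._data[start:start + num_bytes])
        self._pos = start + len(chunk)
        return chunk

    def seek_write(self, origin, delta, payload):
        # Seek and write in one operation; zero-fill any gap past the end.
        start = self._resolve(origin, delta)
        if start > len(self._data):
            self._data.extend(b"\x00" * (start - len(self._data)))
        self._data[start:start + len(payload)] = payload
        self._pos = start + len(payload)

    def trunc_append(self, offset, payload):
        # Truncate at offset, then append; an empty payload is a plain truncate.
        if not 0 <= offset <= len(self._data):
            raise ValueError("offset outside resource")
        del self._data[offset:]
        self._data.extend(payload)
        self._pos = len(self._data)
```

In this sketch a bare seek is just `seek_read` with zero bytes requested, which is one possible answer to the return-type question raised above; a client that tracks its own position never needs `get_current_position` at all.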

* Bulk Data transfer
	- might want to make stronger statement: "if you implement DIME, you must name it in this way" but only simple is required 
	- should number of bytes be "long"?
		-- never going to actually transfer a 32-bit number of bytes on the wire?
		-- but not really going to argue e.g. how about how big is my file, give me all that many bytes
		-- 4GB seems big just now, but in the future...
	- where does the "whoami" and other details about this go?
		-- all in OGSA Base profile, based on BaseNotification
		-- appears in headers, but ByteIO doesn't see it
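
The 32-bit versus "long" question above comes down to simple arithmetic: an unsigned 32-bit count tops out just under 4 GiB, so a "how big is my file, give me that many bytes" request on a larger file cannot be expressed. The sketch below only illustrates that limit; the helper names and the big-endian 64-bit encoding are assumptions for the example, not a wire format discussed in the session.

```python
import struct

UINT32_MAX = 2**32 - 1  # largest count an unsigned 32-bit field can carry


def fits_in_uint32(num_bytes):
    """True if the byte count can be expressed in an unsigned 32-bit field."""
    return 0 <= num_bytes <= UINT32_MAX


def encode_count(num_bytes):
    """Pack a byte count as an unsigned 64-bit big-endian integer."""
    return struct.pack(">Q", num_bytes)


five_gib = 5 * 1024**3
print(fits_in_uint32(five_gib))     # a 5 GiB file does not fit in 32 bits
print(len(encode_count(five_gib)))  # but fits comfortably in 8 bytes
```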
* Wrap-Up
	- implementations don't do all OGSA Base profile
	- code isn't available yet, need to clean up
	- SAGA working on client side prototype for fileIO
	- crosspost to SAGA list when it comes out