
Quoting [John Shalf] (Jun 14 2005):
On Jun 14, 2005, at 1:24 AM, Andre Merzky wrote:
Quoting [John Shalf] (Jun 14 2005):
Should we find some case that causes problems for a readv/pread model? The hyperslabbing is clearly not one of those cases. Actually, how would you do an HDF5 hyperslab via readv? The only way I see is instrumenting the HDF5 library, and writing a readv file driver - but then you would not use SAGA anyway; that is not application level anymore.
The same problem exists with any of the proposed solutions, including eRead. So I'm not sure if I see the point here.
Hm, sorry that I communicate so badly: but that CAN be solved with eRead - and that's exactly the advantage. We implemented that once in a different lib, and it worked like a charm. The code I included below is from a real client (the call was named iowrap_pread instead of file.eRead though ;)
If you want to read hyperslabs on an HDF5 file on application level with readv, you would need to mimic the HDF5 lib in order to find the offset for the data set, and would need to know details about HDF5 file structure and data layout.
Someone will need to solve the very same problem in order to implement an HDF5-specific eRead interface.
Compared to that, eRead really is simpler for the application. Here is an example we used for hyperslabbing a 3D scalar field:
snprintf (pattern1, 255, "(%d, %d, %d, %d)",     start1, stop1, stride1, reps1);
snprintf (pattern2, 255, "(%d, %d, %d, %d, %s)", start2, stop2, stride2, reps2, pattern1);
snprintf (pattern3, 255, "(%d, %d, %d, %d, %s)", start3, stop3, stride3, reps3, pattern2);
res = file.eRead (pattern3, (char*) buf, buffer_size);
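(To make the receiving end of this concrete: a request string built that way nests one parenthesized tuple per dimension, so the service can unpack it with a small loop. The struct and function names below are my own sketch, not part of SAGA or the library mentioned above.)

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// One hyperslab dimension, as encoded in the pattern string.
struct Slab { int start, stop, stride, reps; };

// Parse a nested pattern such as
//   "(s3, e3, st3, r3, (s2, e2, st2, r2, (s1, e1, st1, r1)))"
// into one Slab per dimension, outermost dimension first.
std::vector<Slab> parse_pattern (const std::string & pattern)
{
  std::vector<Slab> dims;
  const char * s = pattern.c_str ();
  Slab d;
  int  consumed = 0;

  while ( std::sscanf (s, " (%d, %d, %d, %d%n",
                       &d.start, &d.stop, &d.stride, &d.reps,
                       &consumed) == 4 )
  {
    dims.push_back (d);
    s += consumed;
    if ( *s == ',' ) ++s;   // an inner (nested) pattern follows
  }

  return dims;
}
```

The service would then have one (start, stop, stride, reps) tuple per dimension to drive its reads.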
So you would actually need to embed this in-situ with your HDF5 code? Or would you go through the HDF5 libraries so that you can push that information string down to the driver layer? It's not clear where exactly you place these calls.
This call goes into the application! That is supposed to be the SAGA level. The HDF5 lib does not come into play on the local host at all, but only on the remote host - where the eRead request is received, translated into a native HDF5 hyperslab read (the translation is simple), and the resulting data are returned. That is why I think SAGA is a good place for eRead - it IS application level...
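(For illustration of why that translation is simple: the per-dimension tuples map almost one-to-one onto the arrays that HDF5's real H5Sselect_hyperslab() call takes. The helper and type names here are my own sketch; real HDF5 uses hsize_t, which I replace with unsigned long to keep the snippet self-contained.)

```cpp
#include <cassert>
#include <vector>

// One dimension of the request, using the (start, stop, stride, reps)
// vocabulary from the eRead example above.
struct Dim { unsigned long start, stop, stride, reps; };

// The arrays H5Sselect_hyperslab() expects, one entry per dimension.
struct HyperslabArgs
{
  std::vector<unsigned long> start, stride, count;
};

HyperslabArgs to_hyperslab (const std::vector<Dim> & dims)
{
  HyperslabArgs a;
  for ( const Dim & d : dims )
  {
    a.start .push_back (d.start);
    a.stride.push_back (d.stride);
    a.count .push_back (d.reps);   // elements selected along this dimension
  }
  // The service would then apply the selection to an open dataspace:
  //   H5Sselect_hyperslab (space, H5S_SELECT_SET,
  //                        a.start.data (), a.stride.data (),
  //                        a.count.data (), NULL /* block */);
  return a;
}
```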
And when you *do* insert these calls, it requires some understanding of the HDF5 internal file layout. Or are we going to ditch the HDF5 API and use eRead instead? How then do we use eRead to manage all of the other HDF5 features like compression, groups, iteration etc.??? What is the string spec for an HDF5 group iterator using eRead strings?
Ah, right, now I see why we are running in circles :-)

Imagine a remote web service providing access to HDF5 files. A simple version would provide read and write calls only; a more sophisticated version would provide group iterations etc. However, the service would come with some interface which resembles HDF5 somewhat, but is probably more tailored toward the specific use case.

eRead is nothing but a medium to communicate with such a service, and with similar services. It cannot replace HDF5, but it can help in _application specific_ usage of a service providing access to an HDF5 file. As you said before: semantics gets pushed down the pipe. That is right: it gets pushed over the wire, to the remote side, and interpreted there. HOW you specify your semantics in an eRead string is up to the service definition and your use case.

  app -> eRead -> wire -> service -> HDF5 -> localVFD -> file
This is why I fail to see the benefits of the eRead interface (it doesn't prevent us from mucking with the guts of HDF5 if you want to preserve the HDF5 API, but it also doesn't reduce complexity for the user if you are going to replace the HDF5 APIs with these stringy pattern requests).
Nope, it's not supposed to replace HDF5. It's also not supposed to replace libjpeg, libtiff, ... - you name it. It does not solve world problems. All it does is provide the ability to have application specific semantics pushed to the remote side, where they can be efficiently interpreted. The other solutions don't provide that. If you need the HDF5 API, you use the HDF5 API, not SAGA.
start, stop, stride, reps correspond directly to the HDF5 semantics. So, the semantic info is indeed maintained on application level, and, as you said before, its interpretation is pushed to lower levels.
It looks like you will end up encoding the entire HDF5 API as eRead pattern strings and pushing it to the other end of a client-server connection. Again, I'm not sure we have made life easier for the remote HDF5 people.
How would that look for recv?
What I was thinking is that developers of HDF5 may have an interest in defining vector or patterned read operations at the VFD layer of their interface. This would enable them to propagate the kind of information you are attempting to encode in eRead strings down to the driver where vector-read interfaces can take advantage of them for deeper pipelining of high-latency operations. (they could, for instance, use some of the methods that Thorsten was referring to, or they could use vread/vwrite type operations).
So the issue is that 1) if you use eRead to replace the HDF5 API, then we are talking about an enormously complex string-encoding interface, and 2) if you use eRead in the VFD, then you have to instrument HDF5 to propagate information about patterned reads down to the driver layer. That is of course the same thing you need if you use vread()/readp() (or any of the interfaces that Thorsten described). So I don't see much of a difference in capability there, except that vread/readp already has the information in a form that you can do I/O with. With eRead, you still have to go through and parse some strings to gain access to the same information about the pattern of reads/writes.
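(To make the comparison concrete: a vread()-style call of the kind discussed here could be as small as a list of (offset, length, buffer) chunks. The names below are a hypothetical sketch, not a SAGA or POSIX interface; the naive implementation is just a loop of pread() calls, which a smarter driver could coalesce or pipeline over a high-latency link.)

```cpp
#include <cassert>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// One chunk of a vread()-style request: unlike POSIX readv(), which reads
// contiguously from the current position, each chunk names its own offset.
struct Chunk
{
  off_t  offset;
  size_t len;
  void * base;
};

// Naive baseline: one pread() per chunk, returning the total bytes read.
ssize_t vread (int fd, std::vector<Chunk> & chunks)
{
  ssize_t total = 0;
  for ( Chunk & c : chunks )
  {
    ssize_t n = pread (fd, c.base, c.len, c.offset);
    if ( n < 0 ) return -1;
    total += n;
  }
  return total;
}
```

The point of the comparison: the chunk list already carries the I/O pattern in directly usable form, where eRead would carry it as a string to be parsed first.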
I did not assume that SAGA would be the right thing to use to implement an HDF5 VFD. That is not exactly the application community SAGA is targeting, I think.

  application -> HDF5 -> sagaVFD -> saga -> gridftp (or so) -> file

But I see now, and agree: on the VFD level, eRead does not buy you much compared to its pitfalls. (I'm still unsure about vread, but that won't help this discussion ;-)
So it's not merely that eRead is pushing complexity to a different layer... I don't see where it is reducing complexity.
Maybe we should move away from HDF5. Assume an application specific binary file. You want subsampling. Locally you do (having seeked before):

for ( int x = 0; x < X_MAX / 2; x++ ) {
  for ( int y = 0; y < Y_MAX / 2; y++ ) {
    for ( int z = 0; z < Z_MAX / 2; z++ ) {
      data[x][y][z] = my_file_read (x*2, y*2, z*2);
    }
  }
}

In SAGA, that is the same: it would call read and seek so and so often.

SAGA with readv would allow you to do:

for ( int x = 0; x < X_MAX / 2; x++ ) {
  for ( int y = 0; y < Y_MAX / 2; y++ ) {
    for ( int z = 0; z < Z_MAX / 2; z++ ) {
      iovecs[n].iov_base = ...
      iovecs[n].iov_len  = 1;
      n++;
    }
  }
}
file.readv (iovecs, data, n);

SAGA with eRead would allow you to do:

snprintf (request, 255, "downsample %d %d %d %d", offset, 2, 2, 2);
file.eRead (request, data, n);

Shorter, but it requires an infrastructure which understands the request (well, for readv you also need a remote counterpart, but that one can be agnostic to semantics...).

readv is more POSIX-like, and more generic. It always works if you are on read level (e.g. the HDF5 VFD layer ;-). eRead is more powerful: it allows application specific optimization which is not achievable with readv (the size of the iovecs in the read request is double the size of the data returned!).

Cheers, Andre.

--
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky@cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+