I think it's vital that we define these terms, because there has been a lack of clarity for a while about the difference between data access and data transfer.  The existing definition of data resource (originally from DAIS) doesn't distinguish between these two modes in the way that we do; it uses source and sink to refer implicitly to both.
 
Comments inline =>
-----Original Message-----
From: ogsa-d-wg-bounces@ogf.org [mailto:ogsa-d-wg-bounces@ogf.org] On Behalf Of Allen Luniewski
Sent: 13 April 2007 20:22
To: ogsa-d-wg@ggf.org
Subject: [OGSA-D-WG] Stab at Glossary Terms


At today's teleconference we agreed that we needed to firm up the definitions in our glossary of the terms Data Resource, Data Access, Data Transfer, Source and Sink.  Currently only Data Resource has a definition (taken from the OGSA Glossary):
        An entity that can act as a source or sink of data together with its associated framework.

The confusion that was identified centered on the use of the terms "source" and "sink".  The above definition uses them in a fairly generic way, while the Data Architecture document uses them to mean the source and sink of data transfers.  Here is a stab at defining all 5 of these terms.  Once we have agreed on the definitions, we may need to do some minor edits to the architecture document as a result.

Data Access: Data access is any mechanism that allows an entity to identify data held by a data resource and receive that data from the data resource.  
Can I suggest changing "identify data" to "identify a subset of the data"? 
 
I'm also not sure about saying "receive that data", as it may not be the requestor that receives the data (i.e. the data may be sent to a third party).  Would it work to say "and either return that data to the requestor or make it available for transfer elsewhere"?  I think Mario had a suggestion for some phrasing that might be better than my attempt here.
Data Resource: An entity that can act as a source or sink of data together with its associated framework.  By source we simply mean an entity that can originate data.  And by sink we simply mean an entity that receives data.  
I'm not comfortable with using source and sink in a generic fashion here and in a more restrictive fashion elsewhere in the document (if this is what you are suggesting).  I'd rather just refer readers to the glossary definitions of these terms and make sure that all the entries are consistent.
 
An alternative definition that I suggested was, "An entity (and its associated framework) that supports a data access interface, or that can act as a source or sink for data transfer", although I'm not strongly wedded to that.
Data Transfer: A mechanism to move data from a source of data to a sink of data.  
Should we say "copy" rather than "move"?   Should we also say "physically copy" or some such phrasing to distinguish this from a mere renaming within a global namespace?
Source: A data resource that contains the data to be copied to a sink via a data transfer mechanism.

Sink: A data resource that receives the data copied by a data transfer mechanism from a source.  
Currently we use "source" and "sink" to mean interfaces on a data resource.  I think we should make this clear.  We could use the terms to mean both the resource and the interface, provided that we are clear about what we are doing. 
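
To make the resource-versus-interface distinction concrete, here is a rough sketch in Python.  It is only an illustration under my own assumptions: the names DataSource, DataSink, InMemoryResource and transfer are hypothetical and do not come from the architecture document, DAIS or the OGSA Glossary.  It treats "source" and "sink" as interfaces that a data resource may expose, and treats a transfer as a copy rather than a move.

```python
from typing import Iterable, Iterator, List, Protocol


class DataSource(Protocol):
    """Hypothetical 'source' interface: can originate data for a transfer."""

    def read(self) -> Iterator[bytes]: ...


class DataSink(Protocol):
    """Hypothetical 'sink' interface: can receive data from a transfer."""

    def write(self, chunks: Iterable[bytes]) -> None: ...


class InMemoryResource:
    """Toy data resource (an assumption for illustration) exposing both interfaces."""

    def __init__(self, chunks: Iterable[bytes] = ()) -> None:
        self.chunks: List[bytes] = list(chunks)

    def read(self) -> Iterator[bytes]:
        # Iterate over a snapshot so the resource's own data is left untouched.
        return iter(list(self.chunks))

    def write(self, chunks: Iterable[bytes]) -> None:
        self.chunks.extend(chunks)


def transfer(source: DataSource, sink: DataSink) -> None:
    """Copy (not move) data from source to sink; the source keeps its data."""
    sink.write(source.read())


a = InMemoryResource([b"x", b"y"])
b = InMemoryResource()
transfer(a, b)
```

In this sketch the same resource can play either role depending on which interface is used, which is the sense in which the terms could cover both the resource and the interface.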

Allen Luniewski
IBM Cross Brand Services
IBM Silicon Valley Laboratory
555 Bailey Ave.
San Jose, CA 95141

408-463-2255
408-930-1844 (mobile)