Dave,

Comments inline below.

Allen Luniewski
IBM Cross Brand Services
IBM Silicon Valley Laboratory
555 Bailey Ave.
San Jose, CA 95141

408-463-2255
408-930-1844 (mobile)


"Dave Berry" <daveb@nesc.ac.uk> wrote on 04/15/2007 03:54:52 AM:

> I think it's vital that we define these terms, because there has
> been a lack of clarity for a while about the difference between data
> access and data transfer.  The existing definition of data resource
> (originally from DAIS) doesn't distinguish between these two modes
> in the way that we do, using source and sink to implicitly refer to both.

>  
> Comments inline =>
> -----Original Message-----
> From: ogsa-d-wg-bounces@ogf.org [mailto:ogsa-d-wg-bounces@ogf.org]
> On Behalf Of Allen Luniewski
> Sent: 13 April 2007 20:22
> To: ogsa-d-wg@ggf.org
> Subject: [OGSA-D-WG] Stab at Glossary Terms

>
> At today's teleconference we agreed that we needed to firm up the
> definitions in our glossary of the terms Data Resource, Data Access,
> Data Transfer, Source and Sink.  Currently only Data Resource has a
> definition (taken from the OGSA Glossary) as:
>         An entity that can act as a source or sink of data together
> with its associated framework.
>
> The confusion that was identified centered on the use of the terms
> "source" and "sink".  The above definition is using them in a fairly
> generic way while the Data Architecture document uses them to mean
> the source and sink of data transfers.  Here is a stab at defining all
> 5 of these terms.  Once we have agreed on the definitions, we may
> need to do some minor edits to the architecture document as a result.
>
> Data Access: Data access is any mechanism that allows an entity to
> identify data held by a data resource and receive that data from the
> data resource.  

> Can I suggest changing "identify data" to "identify a subset of the data"?

Agreed.  Much better.

>  
> I'm also not sure about saying "receive that data", as it may not be
> the requestor that receives the data (i.e. the data may be sent to a
> third party).  Would it work to say "and either return that data to
> the requestor or make it available for transfer elsewhere"?  I
> think Mario had a suggestion for some phrasing that might be better
> than my attempt here.


I see your point and I agree that this needs to be fixed.  Your suggestion works for me.

> Data Resource: An entity that can act as a source or sink of data
> together with its associated framework.  By source we simply mean an
> entity that can originate data.  And by sink we simply mean an
> entity that receives data.  

> I'm not comfortable with using source and sink in a generic fashion
> here and in a more restrictive fashion elsewhere in the document (if
> this is what you are suggesting).  I'd rather just refer readers to
> the glossary definitions of these terms and make sure that all the
> entries are consistent.

>  
> An alternative definition that I suggested was, "An entity (and its
> associated framework) that supports a data access interface, or that
> can act as a source or sink for data transfer", although I'm not
> strongly wedded to that.


The problem is that in the context of data resource I don't want to tie it into data transfer, at least not in the DMI sense of data.  The definition should include data that is stored/retrieved directly as a result of a call on the data resource's API.  How about "An entity that can receive data from a requestor, or cause its data to be made available to a requestor or to a third party denoted by a requestor"?

> Data Transfer: A mechanism to move data from a source of data to a sink
> of data.  

> Should we say "copy" rather than "move"?   Should we also say
> "physically copy" or some such phrasing to distinguish this from a
> mere renaming within a global namespace?


Yes, definitely "copy".  I missed changing this one when I changed sink and source to "copy".  I am okay with physical copy although personally I do not see it adding much value to the definition.

> Source: A data resource that contains the data to be copied to a sink via a
> data transfer mechanism.
>
> Sink: A data resource that receives the data copied by a data transfer
> mechanism from a source.  

> Currently we use "source" and "sink" to mean interfaces on a data
> resource.  I think we should make this clear.  We could use the
> terms to mean both the resource and the interface, provided that we
> are clear about what we are doing.


Should we add a phrase along the lines of "A source is also the interface provided by a data resource as part of its participation in a data transfer"?  And similarly for sink.
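
To make the resource/interface distinction concrete, here is a minimal Python sketch of how the five terms could relate.  It is purely illustrative: every name in it (Source, Sink, InMemoryResource, select, read_all, write, transfer) is made up for this example and is not taken from the OGSA Glossary, DAIS or DMI.

```python
from typing import Callable, Dict, Optional, Protocol


class Source(Protocol):
    """Hypothetical source interface: supplies the data to be copied."""

    def read_all(self) -> Dict[str, bytes]: ...


class Sink(Protocol):
    """Hypothetical sink interface: receives the copied data."""

    def write(self, key: str, value: bytes) -> None: ...


class InMemoryResource:
    """Toy data resource offering data access and both transfer interfaces."""

    def __init__(self, items: Optional[Dict[str, bytes]] = None) -> None:
        self._items: Dict[str, bytes] = dict(items or {})

    # Data access: identify a subset of the data and return it to the requestor.
    def select(self, predicate: Callable[[str], bool]) -> Dict[str, bytes]:
        return {k: v for k, v in self._items.items() if predicate(k)}

    # Source interface: hand out a copy so the resource keeps its own data.
    def read_all(self) -> Dict[str, bytes]:
        return dict(self._items)

    # Sink interface: accept one item of transferred data.
    def write(self, key: str, value: bytes) -> None:
        self._items[key] = value


def transfer(source: Source, sink: Sink) -> int:
    """Data transfer: copy (not move) every item from source to sink."""
    count = 0
    for key, value in source.read_all().items():
        sink.write(key, value)
        count += 1
    return count
```

In this sketch the same resource can play the source role, the sink role, or neither, and data access (select) works without any transfer taking place, which is the separation the definitions above are trying to capture.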
