I think it's vital that we define these terms, because there has been a lack
of clarity for a while about the difference between data access and data
transfer. The existing definition of data resource (originally from DAIS)
doesn't distinguish between these two modes in the way that we do, using source
and sink to implicitly refer to both.
Comments inline =>
At today's teleconference we agreed that
we needed to firm up the definitions in our glossary of the terms Data
Resource, Data Access, Data Transfer, Source and Sink. Currently only
Data Resource has a definition (taken from the OGSA Glossary), as follows:
An entity that
can act as a source or sink of data together with its associated
framework.
The confusion that was
identified centered on the use of the terms "source" and "sink". The
above definition is using them in a fairly generic way while the Data
Architecture document uses them to mean the source and sink of data transfers.
Here is a stab at defining all 5 of these terms. Once we have agreed
on the definitions, we may need to do some minor edits to the architecture
document as a result.
Data Access:
Data access is any mechanism that allows an entity to identify data held by a
data resource and receive that data from the data
resource.
Can I suggest changing "identify data" to "identify a subset of the
data"?
I'm also not sure about saying "receive that data", as it may not be the
requestor that receives the data (i.e. the data may be sent to a third
party). Would it work to say "and either return that data to the
requestor or make it available for transfer elsewhere"? I
think Mario had a suggestion for some phrasing that might be better
than my attempt here.
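
To make the two delivery cases concrete, here is a minimal Python sketch of data access as discussed above: the requestor identifies a subset of the data, and the resource either returns it or hands it to a third party. The class name DataResource, the access method, and the deliver_to parameter are all hypothetical names chosen for illustration; they are not taken from the OGSA or DAIS specifications.

```python
from typing import Callable, Iterable, List, Optional


class DataResource:
    """Hypothetical in-memory data resource (illustration only)."""

    def __init__(self, records: Iterable[int]) -> None:
        self._records: List[int] = list(records)

    def access(
        self,
        predicate: Callable[[int], bool],
        deliver_to: Optional[Callable[[List[int]], None]] = None,
    ) -> Optional[List[int]]:
        # Identify a subset of the data held by the resource.
        subset = [r for r in self._records if predicate(r)]
        if deliver_to is None:
            # Case 1: return the data to the requestor.
            return subset
        # Case 2: make the data available to a third party instead.
        deliver_to(subset)
        return None


resource = DataResource([1, 2, 3, 4, 5, 6])

# The requestor receives the selected subset directly.
evens = resource.access(lambda r: r % 2 == 0)

# The selected subset goes to a third party; the requestor gets nothing back.
third_party: List[int] = []
result = resource.access(lambda r: r > 4, deliver_to=third_party.extend)
```

The point of the sketch is only that "receive that data" covers the first call but not the second, which is why the wording may need to allow for both.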
Data Resource: An entity that can
act as a source or sink of data together with its associated framework.
By source we simply mean an entity that can originate data. And by
sink we simply mean an entity that receives data.
I'm not comfortable with using source and sink in a generic fashion
here and in a more restrictive fashion elsewhere in the document (if this
is what you are suggesting). I'd rather just refer readers to
the glossary definitions of these terms and make sure that all the
entries are consistent.
An alternative
definition that I suggested was, "An entity (and its associated framework)
that supports a data access interface, or that can act as a source or sink for
data transfer", although I'm not strongly wedded to
that.
Data Transfer: A mechanism to move
data from a source of data to a sink of data.
Should we say "copy" rather than "move"? Should we also say "physically copy" or some such
phrasing to distinguish this from a mere renaming within a global
namespace?
Source: A data resource that
contains the data to be copied to a sink via a data transfer
mechanism.
Sink: A data
resource that receives the data copied by a data transfer mechanism
from a source.
Currently we use "source" and "sink" to mean interfaces on a data
resource. I think we should make this clear. We could use the terms
to mean both the resource and the interface, provided that we are
clear about what we are doing.
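
As a rough illustration of the "interface on a data resource" reading, the Python sketch below models source and sink as two interfaces that a single resource can expose, with data transfer as a copy from one to the other. Everything here (Source, Sink, InMemoryResource, transfer) is a hypothetical name invented for the example, not something defined in the architecture document or the glossary.

```python
from typing import Iterable, Iterator, List, Protocol


class Source(Protocol):
    """Interface for the originating side of a data transfer."""

    def read(self) -> Iterator[bytes]: ...


class Sink(Protocol):
    """Interface for the receiving side of a data transfer."""

    def write(self, chunks: Iterable[bytes]) -> None: ...


class InMemoryResource:
    """Hypothetical data resource that exposes both interfaces."""

    def __init__(self, chunks: Iterable[bytes] = ()) -> None:
        self.chunks: List[bytes] = list(chunks)

    def read(self) -> Iterator[bytes]:
        # Iterate over a snapshot so the stored data is left untouched.
        return iter(list(self.chunks))

    def write(self, chunks: Iterable[bytes]) -> None:
        self.chunks.extend(chunks)


def transfer(source: Source, sink: Sink) -> None:
    """Copy (rather than move) the data from the source to the sink."""
    sink.write(source.read())


origin = InMemoryResource([b"a", b"b"])
destination = InMemoryResource()
transfer(origin, destination)
```

After the transfer, both origin and destination hold the same chunks, which matches the "copy" wording. The same object can act as a source in one transfer and a sink in another, so the glossary would need to say whether the terms name the resource, the interface, or both.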
Allen Luniewski
IBM Cross Brand Services
IBM Silicon Valley Laboratory
555 Bailey Ave.
San Jose, CA 95141
408-463-2255
408-930-1844 (mobile)