I like what is below but have a suggestion
to make re "data service". How about changing it to read:
Data service: A service that provides data access, source or sink interfaces,
as defined in this document, for one or more data resources.
The intent of this suggestion is to
make it clear that the data service provides the SOA interfaces while still
making it clear that the data resource provides some other interface (typically
not a service interface) as a primitive on which the data service is built.
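As a rough illustration of that layering (a sketch only; the class and method
names here are hypothetical and are not taken from any OGSA document), a data
service can be pictured as a thin wrapper that offers a data access operation
built on whatever native interface the data resource happens to provide:

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class InMemoryResource:
    """Hypothetical data resource with a native, non-service interface."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    def scan(self) -> List[Row]:
        # Native primitive: hand back a copy of every row the resource holds.
        return [dict(row) for row in self._rows]


class DataService:
    """Hypothetical data service fronting one or more data resources."""

    def __init__(self, resources: Dict[str, InMemoryResource]) -> None:
        self._resources = resources

    def access(self, resource_name: str,
               predicate: Callable[[Row], bool]) -> List[Row]:
        # Data access: identify a subset of the resource's data and return it
        # to the requestor, using only the resource's native primitive.
        resource = self._resources[resource_name]
        return [row for row in resource.scan() if predicate(row)]


service = DataService({"people": InMemoryResource([
    {"name": "Ada", "age": 36},
    {"name": "Bob", "age": 17},
])})
adults = service.access("people", lambda row: row["age"] >= 18)
```

The requestor only ever sees access(); scan() stands in for the "some other
interface" that the resource provides as a primitive.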
Allen Luniewski
IBM Cross Brand Services
IBM Silicon Valley Laboratory
555 Bailey Ave.
San Jose, CA 95141
408-463-2255
408-930-1844 (mobile)
"Dave Berry"
<daveb@nesc.ac.uk>
05/01/2007 08:56 AM
To
Allen Luniewski/Almaden/IBM@IBMUS, <ogsa-d-wg@ggf.org>
cc
Subject
RE: [OGSA-D-WG] Stab at Glossary Terms
Here's an update based on today's
call.
We finished the call by noting
that in an SOA it is the service, rather than the resource, that provides
the access, source or sink mechanisms. We're not sure whether we
need to revisit the definition of resource (and its use in the other terms),
although a data resource must of course provide some access/source/sink
mechanisms.
Dave.
Data Resource: An entity (and its associated framework)
that can provide a data access mechanism or act as a source or sink.
Source: A data resource that contains the data to be copied
to a sink via a data transfer mechanism.
Sink: A data resource that receives the data copied by a data transfer
mechanism from a source.
Data Access: Any mechanism that allows an entity to identify
a subset of the data held by a data resource and either return that data
to the requestor or to make it available for transfer elsewhere.
Data Transfer: A mechanism to physically copy data from
a source to a sink.
Data Set: An encoding of data in a defined syntax
suitable for externalisation outside of a data service, for example transfer
to/from a data service. Examples include a WebRowSet encoding of
an SQL query result set, a JPEG encoded byte array, and a ZIP encoded byte
array of a set of files.
Data service: A service that provides
data access, source or sink mechanisms for one or more data resources.
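To make the source, sink and data transfer definitions above a little more
concrete, here is a minimal sketch. All of the names (Source, Sink,
BufferResource, transfer) are made up for illustration and are not part of
the glossary or of any specification:

```python
from typing import Protocol


class Source(Protocol):
    def read(self) -> bytes: ...


class Sink(Protocol):
    def write(self, data: bytes) -> None: ...


class BufferResource:
    """Hypothetical data resource that can act as both a source and a sink."""

    def __init__(self, data: bytes = b"") -> None:
        self._data = data

    def read(self) -> bytes:
        return self._data

    def write(self, data: bytes) -> None:
        self._data = data


def transfer(source: Source, sink: Sink) -> None:
    # Data transfer: copy (not move) the data, so the source keeps its contents.
    sink.write(source.read())


src = BufferResource(b"result set")
dst = BufferResource()
transfer(src, dst)
```

After the call both resources hold the same bytes, which is the sense in which
the definition says "copy" rather than "move".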
From: Allen Luniewski [mailto:luniew@us.ibm.com]
Sent: 18 April 2007 02:56
To: Dave Berry; ogsa-d-wg@ggf.org
Subject: RE: [OGSA-D-WG] Stab at Glossary Terms
Dave,
Comments inline below.
Allen Luniewski
"Dave Berry" <daveb@nesc.ac.uk> wrote on 04/15/2007 03:54:52
AM:
> I think it's vital that we define these terms, because there has
> been a lack of clarity for a while about the difference between data
> access and data transfer. The existing definition of data resource
> (originally from DAIS) doesn't distinguish between these two modes
> in the way that we do, using source and sink to implicitly refer to both.
>
> Comments inline =>
> -----Original Message-----
> From: ogsa-d-wg-bounces@ogf.org [mailto:ogsa-d-wg-bounces@ogf.org]
> On Behalf Of Allen Luniewski
> Sent: 13 April 2007 20:22
> To: ogsa-d-wg@ggf.org
> Subject: [OGSA-D-WG] Stab at Glossary Terms
>
> At today's teleconference we agreed that we needed to firm up the
> definitions in our glossary of the terms Data Resource, Data Access,
> Data Transfer, Source and Sink. Currently only Data Resource has a
> definition (taken from the OGSA Glossary) as:
> An entity that can act as a source or sink of data together
> with its associated framework.
>
> The confusion that was identified centered on the use of the terms
> "source" and "sink". The above definition is using them in a fairly
> generic way while the Data Architecture document uses them to mean
> the source and sink of data transfers. Here is a stab at defining all
> 5 of these terms. Once we have agreed on the definitions, we may
> need to do some minor edits to the architecture document as a result.
>
> Data Access: Data access is any mechanism that allows an entity to
> identify data held by a data resource and receive that data from the
> data resource.
> Can I suggest changing "identify data" to "identify a subset of the data"?
Agreed. Much better.
>
> I'm also not sure about saying "receive that data", as it may not be
> the requestor that receives the data (i.e. the data may be sent to a
> third party). Would it work to say "and either return that data to
> the requestor or to make it available for transfer elsewhere"? I
> think Mario had a suggestion for some phrasing that might be better
> than my attempt here.
I see your point and I agree that this needs to be fixed. Your suggestion
works for me.
> Data Resource: An entity that can act as a source or sink of data
> together with its associated framework. By source we simply mean an
> entity that can originate data. And by sink we simply mean an
> entity that receives data.
> I'm not comfortable with using source and sink in a generic fashion
> here and in a more restrictive fashion elsewhere in the document (if
> this is what you are suggesting). I'd rather just refer readers to
> the glossary definitions of these terms and make sure that all the
> entries are consistent.
>
> An alternative definition that I suggested was, "An entity (and its
> associated framework) that supports a data access interface, or that
> can act as a source or sink for data transfer", although I'm not
> strongly wedded to that.
The problem is that in the context of data resource I don't want to tie
it into data transfer, at least not in the DMI sense of data. The
definition should include data that is stored/retrieved directly as a result
of a call on the data resource's API. How about "An entity that
can receive data from a requestor or cause its data to be made available
to a requestor or a third party denoted by a requestor."
> Data Transfer: A mechanism to move data from a source of data to a sink
> of data.
> Should we say "copy" rather than "move"? Should we also say
> "physically copy" or some such phrasing to distinguish this from a
> mere renaming within a global namespace?
Yes, definitely "copy". I missed changing this one when
I changed sink and source to "copy". I am okay with physical
copy although personally I do not see it adding much value to the definition.
> Source: A data resource that contains the data to be copied to a sink via a
> data transfer mechanism.
>
> Sink: A data resource that receives the data copied by a data transfer
> mechanism from a source.
> Currently we use "source" and "sink" to mean interfaces on a data
> resource. I think we should make this clear. We could use the
> terms to mean both the resource and the interface, provided that we
> are clear about what we are doing.
Should we add a phrase along the lines of "A sink is also the interface
provided by a data resource as part of its participation in a data transfer"?
And similarly for source.
>
> Allen Luniewski