RE: Data federation definition
Folks,
We need a definition of how we are using the term "data federation" for the OGSA glossary. Fortunately we don't have to find a definition that covers all the ways the term is used in the world, just how we use it in our document. Following a short discussion at the OGSA F2F, Jem (who is keeper of the glossary) suggested the following:
The integration of multiple data resources so that they can be accessed as if they were a single resource.
Allen suggested that as we are accessing data via services, this would be better phrased as follows (see the attached message for Allen's explanation in his own words):
The integration of multiple services or data resources so that they can be accessed as if they were a single service.
We discussed this on yesterday's call and the consensus was that I should post to the list and ask for your comments.
We briefly discussed whether we should separately define "data federation" and "data integration". One view was that "integration" didn't necessarily involve distributed resources while "federation" didn't necessarily involve integrating the resources into a single view. The contrasting view was that integration almost always involves distributed data in practice, and especially so in a Grid context, while federation typically requires some way of accessing the distributed data as a whole. So I'm leaning towards treating the terms as synonyms within our documents.
What do you think?
Dave.
_____
From: Treadwell, Jem [mailto:jem.treadwell@hp.com]
Sent: 19 January 2006 23:17
To: Dave Berry
Subject: Data federation definition
Dave: This is (very slightly) modified from your document - though you don't have the glossary entry filled in :0)
The integration of multiple data resources so that they can be accessed as if they were a single resource.
- Jem
_____
Jem Treadwell
Hewlett-Packard Company
6000 Irwin Road
Mount Laurel, NJ 08054
Phone: 856-638-6021 Fax: 856-638-6190
E-mail: Jem.Treadwell@hp.com
My initial take was that it meant transparent access to data from any node in the grid system or even between grid systems, provided policy allows access. Further, with caching in play, data might naturally become distributed, based on rules/policy which govern migration. Replication could play a secondary role, and an interesting one if the replicas were able to support access, increasing efficiency. Perhaps that might be a Data Confederation, since it could involve more than one data federation?
------------------------------------------------------------------------
Subject: Re: FW: Data federation definition
From: "Allen Luniewski"
Date: Sat, 21 Jan 2006 00:30:14 -0000
To: "Dave Berry"
Dave,
I guess that I am not comfortable with the use of the word "resource" below. I have always seen this as the integration of data services. Since data resources are, basically, opaque boxes in the data architecture, I simply do not know what it means to integrate them (in a formal OGSA sense). Since a federating service could very well have multiple data resources under its wings as well as having access to multiple data services, I would suggest changing "multiple data resources" to "multiple data resources and/or services". And the "data resource" at the end of the definition should, it seems to me, be "data service", since it is the service that provides the architected access path to the data in the data resource.
Allen
"Dave Berry"
01/19/2006 06:48 PM
To "Allen Luniewski"
cc
Subject FW: Data federation definition
Hi Allen,
Jem asked for a definition of data federation for the OGSA Glossary v1.5. We should check that we're happy with this (and update our own Glossary section).
Dave.
------------------------------------------------------------------------ From: Treadwell, Jem [mailto:jem.treadwell@hp.com] Sent: 19 January 2006 23:17 To: Dave Berry Subject: Data federation definition
Dave: This is (very slightly) modified from your document - though you don't have the glossary entry filled in :0)
The integration of multiple data resources so that they can be accessed as if they were a single resource.
- Jem
------------------------------------------------------------------------ Jem Treadwell Hewlett-Packard Company 6000 Irwin Road Mount Laurel, NJ 08054
Phone: 856-638-6021 Fax: 856-638-6190 E-mail: Jem.Treadwell@hp.com mailto:Jem.Treadwell@hp.com
-- Michael Behrens R2AD, LLC (571) 594-3008 (cell) (703) 714-0442 (land)
Hi All,
I dredged up a response to a similar question I asked someone doing research in the area a while back:
---
Federation means data sources that are autonomously managed and are generally heterogeneous. It is also logical to say that the sources are distributed (as a good cause for their autonomy and heterogeneity - but it doesn't have to be the case as you can run multiple data servers on your laptop.) Integration targets local and centralized managed data sources.
There is also a difference between a global schema, federated schema, and an integrated schema.
- Global schema: result of integrating local schema into one global view
- Federated schema: result of federating distributed schemas into one global view
- Integrated schema: integration of schemas "in preparation" to build a global/federated schema.
The main difference between integrated schema and (global/federated) schema is in the integrated schema the elements "are NOT YET" being mapped to their original data sources using (LaV, GaV, etc).
Check papers by Maurizio Lenzerini - "Data Integration: A theoretical perspective" and others.
---
In general, I think we could say that a "data federation" service presents a single consistent front end to a number of autonomously managed data sources, whilst a "data integration" service uses some schema integration tactic to map between data sources. In that sense, the phrasing that Allen uses sounds like a data federation service.
My 2p,
neil
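As an illustration of the schema mapping mentioned above, here is a minimal sketch of a GaV-style (global-as-view) mapping, where each element of the global schema is expressed as a view over a source's local schema. Everything in it is an assumption made for the example: the two sources, their field names, the global schema (key, name), and the function names are hypothetical and do not come from OGSA or from any of the messages in this thread.

```python
from typing import Dict, List

Row = Dict[str, object]

# Two autonomously managed sources with different local schemas.
# The data and field names are made up for illustration.
SOURCE_A: List[Row] = [{"id": 1, "full_name": "Ada Lovelace"}]
SOURCE_B: List[Row] = [{"person_id": 2, "first": "Alan", "last": "Turing"}]


# GaV-style mapping: one view function per source, each expressing the
# hypothetical global schema (key, name) in terms of that source's rows.
def view_a(row: Row) -> Row:
    return {"key": row["id"], "name": row["full_name"]}


def view_b(row: Row) -> Row:
    return {"key": row["person_id"], "name": f"{row['first']} {row['last']}"}


MAPPINGS = [(SOURCE_A, view_a), (SOURCE_B, view_b)]


def global_view() -> List[Row]:
    """Answer a query over the global schema by unfolding it into the sources."""
    return [view(row) for source, view in MAPPINGS for row in source]


if __name__ == "__main__":
    for row in global_view():
        print(row)
```

In this sketch the sources stay untouched and heterogeneous; only the view functions know about the local schemas, which is one way to read the "single consistent front end" in the message above.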
Neil,
I understand the definitions you used in your note. But I still can not
see how a federation service and an integration service are, in practice,
any different. Consider:
To do federation, a federation service is going to have to do schema
integration. Thus it acts as, or at least uses in your terms, an
integration service. If it does not do this, I do not see how it can
perform its federation function.
In the definition of an integration service, I ask myself: what are the
operations I can ask an integration service to do? The only interesting
answer I can come up with is: access data using the integrated schema. But
this is just what a federation service does.
So I end up where I started: the terms "federation" and "integration"
are, in practice, indistinguishable. I am willing to believe that the two
are different but I am still looking for a way to tell them apart.
Allen
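The point that a caller cannot tell a federating service from any other data service can be sketched roughly as follows. This is only an illustration under stated assumptions: the `DataService` interface, its single `query` operation, and the class names are hypothetical and are not taken from the OGSA data architecture.

```python
from typing import Dict, Iterable, List, Protocol

Row = Dict[str, object]


class DataService(Protocol):
    """Hypothetical interface: the only operation a caller ever sees."""

    def query(self, name: str) -> List[Row]: ...


class SimpleService:
    """A data service in front of one opaque data resource (here, a list)."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)

    def query(self, name: str) -> List[Row]:
        return [r for r in self._rows if r["name"] == name]


class FederatingService:
    """Presents several data services as if they were a single service."""

    def __init__(self, members: Iterable[DataService]) -> None:
        self._members = list(members)

    def query(self, name: str) -> List[Row]:
        results: List[Row] = []
        for member in self._members:
            results.extend(member.query(name))
        return results


if __name__ == "__main__":
    a = SimpleService([{"name": "x", "site": "A"}])
    b = SimpleService([{"name": "x", "site": "B"}, {"name": "y", "site": "B"}])
    c = SimpleService([{"name": "x", "site": "C"}])
    inner = FederatingService([a, b])
    # A federating service is itself a data service, so it can be federated again.
    outer = FederatingService([inner, c])
    print(outer.query("x"))
```

Because `FederatingService` exposes the same operation as the services it wraps, federations can be nested, which is roughly the "Data Confederation" idea raised earlier in the thread.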
participants (4)
- Allen Luniewski
- Dave Berry
- Michael Behrens
- neil p chue hong