Hi All,

I dredged up a response to a similar question I asked someone doing research in the area a while back:

---

Federation means data sources that are autonomously managed and are generally heterogeneous. It is also logical to say that the sources are distributed (a good cause of their autonomy and heterogeneity, though it doesn't have to be the case, as you can run multiple data servers on your laptop). Integration targets local and centrally managed data sources.

There is also a difference between a global schema, a federated schema, and an integrated schema:

- Global schema: the result of integrating local schemas into one global view.
- Federated schema: the result of federating distributed schemas into one global view.
- Integrated schema: the integration of schemas "in preparation" for building a global/federated schema.

The main difference between an integrated schema and a global/federated schema is that in the integrated schema the elements "are NOT YET" mapped to their original data sources using LaV, GaV, etc.

Check papers by Maurizio Lenzerini, "Data Integration: A Theoretical Perspective", and others.

---

In general, I think we could say that a "data federation" service presents a single consistent front end to a number of autonomously managed data sources, whilst a "data integration" service uses some schema integration tactic to map between data sources. In that sense, the phrasing that Allen uses sounds like a data federation service.

My 2p,
neil
-----Original Message-----
From: owner-ogsa-d-wg@ggf.org [mailto:owner-ogsa-d-wg@ggf.org] On Behalf Of Dave Berry
Sent: 26 January 2006 10:11
To: Treadwell, Jem; ogsa-d-wg@gridforum.org
Subject: [ogsa-d-wg] RE: Data federation definition
Folks,
We need a definition of how we are using the term "data federation" for the OGSA glossary. Fortunately we don't have to find a definition that covers all the ways the term is used in the world, just how we use it in our document. Following a short discussion at the OGSA F2F, Jem (who is keeper of the glossary) suggested the following:
The integration of multiple data resources so that they can be accessed as if they were a single resource.
Allen suggested that as we are accessing data via services, this would be better phrased as follows (see the attached message for Allen's explanation in his own words):
The integration of multiple services or data resources so that they can be accessed as if they were a single service.
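To make the "as if they were a single service" phrasing concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not anything taken from the OGSA documents: the class names `MappedSource` and `FederatedService`, the `query` method, the field names, and the sample rows are all hypothetical. Each source keeps its own local field names and exposes them through a per-source mapping, and the federated front end offers the same `query` call as a single source would.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Predicate = Callable[[Record], bool]


class MappedSource:
    """Hypothetical stand-in for one autonomously managed data service."""

    def __init__(self, name: str, rows: List[Record], field_map: Dict[str, str]):
        self.name = name
        self._rows = rows
        # field_map: local field name -> field name in the shared view
        self._field_map = field_map

    def query(self, predicate: Predicate) -> List[Record]:
        results = []
        for row in self._rows:
            # Rename local fields into the shared view before filtering.
            view = {self._field_map[k]: v for k, v in row.items() if k in self._field_map}
            if predicate(view):
                results.append(view)
        return results


class FederatedService:
    """Single front end with the same query() call as an individual source."""

    def __init__(self, sources: Iterable[MappedSource]):
        self._sources = list(sources)

    def query(self, predicate: Predicate) -> List[Record]:
        results = []
        for source in self._sources:
            for row in source.query(predicate):
                row["_source"] = source.name  # record where each row came from
                results.append(row)
        return results


# Two sources with different local schemas (made-up sample data).
lab_a = MappedSource(
    "lab_a",
    [{"gene_id": "g1", "len": 1200}, {"gene_id": "g2", "len": 300}],
    {"gene_id": "id", "len": "length"},
)
lab_b = MappedSource(
    "lab_b",
    [{"identifier": "g3", "bp": 900}],
    {"identifier": "id", "bp": "length"},
)

federation = FederatedService([lab_a, lab_b])
long_entries = federation.query(lambda r: r["length"] > 500)
```

The caller writes one query against the shared field names and does not need to know how many sources sit behind `federation`; the sources themselves are left unchanged and remain separately managed.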
We discussed this on yesterday's call and the consensus was that I should post to the list and ask for your comments.
We briefly discussed whether we should separately define "data federation" and "data integration". One view was that "integration" didn't necessarily involve distributed resources while "federation" didn't necessarily involve integrating the resources into a single view. The contrasting view was that integration almost always involves distributed data in practice, and especially so in a Grid context, while federation typically requires some way of accessing the distributed data as a whole. So I'm leaning towards treating the terms as synonyms within our documents.
What do you think?
Dave.
_____
From: Treadwell, Jem [mailto:jem.treadwell@hp.com]
Sent: 19 January 2006 23:17
To: Dave Berry
Subject: Data federation definition
Dave: This is (very slightly) modified from your document - though you don't have the glossary entry filled in :0)
The integration of multiple data resources so that they can be accessed as if they were a single resource.
- Jem
_____
Jem Treadwell
Hewlett-Packard Company
6000 Irwin Road
Mount Laurel, NJ 08054
Phone: 856-638-6021
Fax: 856-638-6190
E-mail: Jem.Treadwell@hp.com