Neil,

I understand the definitions you used in your note.  But I still cannot see how a federation service and an integration service are, in practice, any different.  Consider:

To do federation, a federation service has to do schema integration.  Thus it acts as, or at least uses, what you would call an integration service.  If it does not do this, I do not see how it can perform its federation function.

For an integration service, I ask myself: what operations can I ask it to perform?  The only interesting answer I can come up with is: access data using the integrated schema.  But this is just what a federation service does.

So I end up where I started: the terms "federation" and "integration" are, in practice, indistinguishable.  I am willing to believe that the two are different, but I am still looking for a way to tell them apart.

Allen



"neil p chue hong" <N.ChueHong@epcc.ed.ac.uk>
Sent by: owner-ogsa-d-wg@ggf.org
01/27/2006 04:44 AM
Please respond to: N.ChueHong
To: "'Dave Berry'" <daveb@nesc.ac.uk>, "'Treadwell, Jem'" <jem.treadwell@hp.com>, <ogsa-d-wg@gridforum.org>
cc:
Subject: RE: [ogsa-d-wg] RE: Data federation definition






Hi All,

I dredged up a response to a similar question I asked someone doing research
in the area a while back:

---
Federation means data sources that are autonomously managed and are
generally heterogeneous. It is also logical to say that the sources are
distributed (as a good cause for their autonomy and heterogeneity - but it
doesn't have to be the case as you can run multiple data servers on your
laptop.)

Integration targets local, centrally managed data sources.

There is also a difference between a global schema, federated schema, and an
integrated schema.
- Global schema: result of integrating local schemas into one global view
- Federated schema: result of federating distributed schemas into one
global view
- Integrated schema: integration of schemas "in preparation" to build a
global/federated schema. The main difference between an integrated schema
and a global/federated schema is that in the integrated schema the elements
are NOT YET mapped to their original data sources (using LaV, GaV, etc.).

Check papers by Maurizio Lenzerini - "Data Integration: A theoretical
perspective" and others.
---

In general, I think we could say that a "data federation" service presents a
single consistent front end to a number of autonomously managed data
sources, whilst a "data integration" service uses some schema integration
tactic to map between data sources.

In that sense, the phrasing that Allen uses sounds like a data federation
service.
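
As a rough illustration of that distinction, here is a minimal Python sketch.
Everything in it is hypothetical (the two "lab" sources, their fields, and the
names MappedSource and FederationService are made up for the example). The
to_global functions stand in for the schema-integration step, and
FederationService is the single front end placed over sources that keep their
own local schemas.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass
class MappedSource:
    """An autonomously managed source plus a mapping into the shared schema."""
    name: str
    fetch: Callable[[], Iterable[Row]]   # returns rows in the source's own schema
    to_global: Callable[[Row], Row]      # schema-integration step (hypothetical)


class FederationService:
    """Single front end: callers only ever see the shared schema."""

    def __init__(self, sources: List[MappedSource]) -> None:
        self.sources = sources

    def query(self, predicate: Callable[[Row], bool] = lambda row: True) -> List[Row]:
        results: List[Row] = []
        for source in self.sources:
            for local_row in source.fetch():
                global_row = source.to_global(local_row)
                if predicate(global_row):
                    results.append(global_row)
        return results


# Two made-up sources with different local schemas and units.
def fetch_lab_a() -> List[Row]:
    return [{"sample_id": 1, "temp_c": 21.5}, {"sample_id": 2, "temp_c": 30.0}]


def fetch_lab_b() -> List[Row]:
    return [{"id": "b-7", "temp_f": 98.6}]


lab_a = MappedSource(
    "lab_a",
    fetch_lab_a,
    lambda r: {"sample": f"a-{r['sample_id']}", "temp_c": r["temp_c"]},
)
lab_b = MappedSource(
    "lab_b",
    fetch_lab_b,
    # Convert Fahrenheit to Celsius so both sources share one unit.
    lambda r: {"sample": r["id"], "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1)},
)

service = FederationService([lab_a, lab_b])
warm = service.query(lambda row: row["temp_c"] > 25)
```

Note that the front end cannot answer a query without the mappings, which is
essentially the point Allen raises: in a sketch like this the two roles are
hard to pull apart.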

My 2p,
neil


> -----Original Message-----
> From: owner-ogsa-d-wg@ggf.org
> [mailto:owner-ogsa-d-wg@ggf.org] On Behalf Of Dave Berry
> Sent: 26 January 2006 10:11
> To: Treadwell, Jem; ogsa-d-wg@gridforum.org
> Subject: [ogsa-d-wg] RE: Data federation definition
>
> Folks,
>  
> We need a definition of how we are using the term "data
> federation", for the OGSA glossary.  Fortunately we don't
> have to find a definition that covers all the ways the term
> is used in the world, just how we use it in our document.  
> Following a short discussion at the OGSA F2F, Jem (who is
> keeper of the glossary) suggested the following:
>  
> The integration of multiple data resources so that they can
> be accessed as if they were a single resource.
>  
> Allen suggested that as we are accessing data via services,
> this would be better phrased as follows (see the attached
> message for Allen's explanation in his own words):
>  
> The integration of multiple services or data resources so
> that they can be accessed as if they were a single service.
>  
> We discussed this on yesterday's call and the consensus was
> that I should post to the list and ask for your comments.
>  
> We briefly discussed whether we should separately define
> "data federation" and "data integration".  One view was that
> "integration" didn't necessarily involve distributed resources,
> while "federation" didn't necessarily involve integrating the
> resources into a single view.  The contrasting view was that
> integration almost always involves distributed data in practice,
> and especially so in a Grid context, while federation typically
> requires some way of accessing the distributed data as a whole.
> So I'm leaning towards treating the terms as synonyms within
> our documents.
>  
> What do you think?
>  
> Dave.
>  
>
>
>   _____  
>
>                  From: Treadwell, Jem [mailto:jem.treadwell@hp.com]
>                  Sent: 19 January 2006 23:17
>                  To: Dave Berry
>                  Subject: Data federation definition
>                  
>                  
>                  Dave: This is (very slightly) modified from your
> document - though you don't have the glossary entry filled in :0)
>                  
>                  The integration of multiple data resources so that
> they can be accessed as if they were a single resource.
>                  
>                  - Jem
>                  
>   _____  
>
> Jem Treadwell
> Hewlett-Packard Company
> 6000 Irwin Road
> Mount Laurel, NJ 08054
> Phone: 856-638-6021
> Fax: 856-638-6190
> E-mail: Jem.Treadwell@hp.com <mailto:Jem.Treadwell@hp.com>
>                  
>
>