
A small comment to the pull/push question raised in the meeting notes: why not support both models? IMHO, there are always use cases which are very unhappy if you support either one, and not the other... You can always allow for _implementations_ to support only one model.
My 2 cents, Andre.
Quoting [Steven Newhouse] (Jul 03 2008):
Notes from today's telecom.
Steven
-- Nothing is ever easy.
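Andre's suggestion (specify both models, let implementations support only one) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `InfoSource`, `PullOnlySource`, `PushOnlySource` and `UnsupportedModel` are hypothetical and do not come from the meeting notes or from any OGSA specification.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class UnsupportedModel(Exception):
    """Raised when an implementation does not offer the requested model."""


class InfoSource:
    """Hypothetical interface that specifies both delivery models."""

    def query(self, key: str) -> List[Record]:
        """Pull: the caller asks and gets the matching records back."""
        raise UnsupportedModel("pull is not supported by this implementation")

    def subscribe(self, callback: Callable[[Record], None]) -> None:
        """Push: the source calls back whenever a record is published."""
        raise UnsupportedModel("push is not supported by this implementation")


class PullOnlySource(InfoSource):
    """Implements only the pull half of the interface."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def add(self, record: Record) -> None:
        self._records.append(record)

    def query(self, key: str) -> List[Record]:
        # Return every stored record that carries the requested attribute.
        return [r for r in self._records if key in r]


class PushOnlySource(InfoSource):
    """Implements only the push half of the interface."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Record], None]] = []

    def subscribe(self, callback: Callable[[Record], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, record: Record) -> None:
        # Deliver the record to every registered subscriber.
        for callback in self._subscribers:
            callback(record)
```

A caller written against `InfoSource` can try the model it prefers and fall back to the other one when `UnsupportedModel` is raised, which is one way a use case avoids being locked out by a single-model design.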

Andre Merzky wrote:
A small comment to the pull/push question raised in the meeting notes: why not support both models? IMHO, there are always use cases which are very unhappy if you support either one, and not the other... You can always allow for _implementations_ to support only one model.
Information bound for clients has to be pulled; you just can't rely on them being able to receive unsolicited SOAP messages due to firewalls and the like[*]. On the other hand, when building an information service you want the basic info pushed in from the collection points (e.g. the BES container publishes an advert locally that it exists). The tricky bit is working out where to switch from push to pull; my instinct is to put that at the boundary between service provider and service consumer (this is in the simple no-middlemen case).
The project I'm working on is looking at using RDF/SPARQL for the information system. On the one hand it does mean that we'd have good expressibility, but on the other hand I worry about practicality and performance. The gripping hand is that this isn't done yet anyway, so it's pure speculation as to whether it is a good idea. :-)
Thinking back to what Lawrence was saying, the thing that worried me about it was that he was describing a system that pulled a lot of data to the local system (well, it was actually to the job execution system but that's the local system from the perspective of the query) before taking a decision. That seems horribly inefficient. It's better IMHO to push a slightly more complex query to the info system and have that return a smaller set of results of higher quality (ideally, you'd get it down to a single message each way for the majority of cases). A side advantage is that the query can be evaluated while taking into account information that you don't want to expose to the client; for example, you don't need to let them see the provisioning schedule of your (i.e. the provider's) disk arrays, as you can just tell them that the space they asked for will be there when the job completes. Similarly for jobs; you don't need to reveal whether an application is installed or whether you use virtualized images, just that the job can be executed when needed.
[Hmmm, this message is already longer than I set out to write...]
Donal.
[* Someone ought to do a profile of SOAP using XMPP as a transport, since with that you *could* do real message push. ]
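The push-in, pull-out arrangement Donal describes, with the query evaluated inside the information service, might look roughly like the Python sketch below. It is only an illustration under stated assumptions: `Advert`, `JobQuery`, `InfoService`, their fields, and the example endpoints are all hypothetical, and a real system would speak SOAP (or SPARQL) rather than use in-process calls.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Advert:
    """What a collection point (e.g. a BES container) pushes in.

    All fields are hypothetical. The last two stay inside the service.
    """
    endpoint: str
    runnable_apps: FrozenSet[str]        # private: installed or via VM image
    disk_gb_at_completion: int           # private: from provisioning schedule


@dataclass(frozen=True)
class JobQuery:
    """What a client sends: one slightly richer question."""
    application: str
    disk_gb: int


class InfoService:
    def __init__(self) -> None:
        self._adverts: Dict[str, Advert] = {}

    def publish(self, advert: Advert) -> None:
        """Push side: providers send their adverts to the service."""
        self._adverts[advert.endpoint] = advert

    def find_hosts(self, query: JobQuery) -> List[str]:
        """Pull side: evaluate the query here and return endpoints only.

        The client never sees the private fields, just the verdict.
        """
        return sorted(
            a.endpoint
            for a in self._adverts.values()
            if query.application in a.runnable_apps
            and a.disk_gb_at_completion >= query.disk_gb
        )
```

With this shape the client exchanges a single request and a single reply, instead of downloading every advert and filtering locally, and the provider's scheduling details never leave the service.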

[* Someone ought to do a profile of SOAP using XMPP as a transport, since with that you *could* do real message push. ]
XEP-0072: SOAP Over XMPP
http://www.xmpp.org/extensions/xep-0072.html
On Fri, 04 Jul 2008 19:30:25 +0900, Donal K. Fellows <donal.k.fellows@manchester.ac.uk> wrote:
Donal.
[* Someone ought to do a profile of SOAP using XMPP as a transport, since with that you *could* do real message push. ]
--
ogsa-wg mailing list
ogsa-wg@ogf.org
http://www.ogf.org/mailman/listinfo/ogsa-wg
-- Andreas Savva