Late Arrival & Resource Workflow Conceptualization

Hi everyone!

I'm sorry to say that I'm going to be arriving late at the F2F. I missed my flight due to bad weather, and so I had to go onwards a day later. I'll be arriving sometime on Wednesday afternoon, depending on how long it takes to get my bags and the hire-car, as well as the traffic (though the last time I arrived on that flight, the traffic wasn't a big deal). I'm especially annoyed to be missing the Workflow discussion and the Data grid use-cases, but I'll be in the air or on the road when those are on. I suspect I'll arrive sometime during the QoS part.

Since I'm not going to be in the Workflow discussion, I'll instead write about an idea I've been working on as part of the EU BREIN project. It is an attempt to produce a conceptualization of Workflows based on (abstract) Resources instead of the more conventional ideas of Services, Messages or Jobs. It's specifically intended to be a conceptualization, and as such it does not make *any* attempt to define an actual language or concrete terms for things covered by the likes of JSDL or BES (or even CIM for that matter). It's much more abstract than that right now.

The document so far is available in the Workflow Design Team area of the OGSA documents tree on GridForge. I was going to put it in here, but I object to receiving large documents through email and I assume others do too. :-) Alas, the style used is not very nice; that's driven by project requirements.

A few things to note about what I've done with it so far. I'm fairly sure I can capture the vast majority of resources, including things like CPUs, batch queues, applications, networks and disk space. I think I can also capture the notions of reservations and software licenses. The key to this is a three-layer model:

1: Locality - represents a space in which resources exist.
2: Resource - represents something that can be allocated, and which has some kind of usage/consumption pattern.
   (I do not know whether access control should be applied at the level of the locality or the resource.)
3: Aspect - represents a description of a view on a resource.

I've identified three basic consumption patterns, being:

* Presence Required - resource must be present in the locality for the duration of an activity using the resource
* Exclusive Hold - resource must be present in locality at start of the activity, is removed/hidden from the locality while the activity is processing, and is returned to the locality when the activity completes
* Consume - as with Exclusive Hold, except that upon activity completion the resource is destroyed and not returned to the locality

Activities gain access to resources by describing which aspects they require and in how many localities; resources are not referred to directly (though since resources have a unique name, using that as an aspect is trivial). I don't define any logic for combining aspects (an aspect's description could be bounded in time, which is necessary for characterizing reservations using the intermediate concept of allocated time slices) or subsuming them (so an Intel Core 2 Duo is more specific than an x86, but still capable of satisfying a request for an x86), though I feel that such a logic is a Very Good Idea (if furiously difficult to define).

When an activity's required aspects are satisfied, it may execute, and upon finishing execution, may create further resources in one or more localities. Such result resources (which another activity may depend upon) represent the outcome of the activity, and might also include things like a transferred dataset or a deployed software system. Result resources won't "overwrite" any resource with the same name, though they may be unnamed, in which case a new (globally?) unique name will be chosen for them at creation time.

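To make the model above a little more concrete, here is a rough Python sketch of localities, resources, aspects and the three consumption patterns. Everything in it is an illustrative assumption rather than anything taken from the document itself: the class and function names are hypothetical, aspects are reduced to plain string labels (so a Core 2 Duo simply carries an "x86" label as well, standing in for the undefined subsumption logic), an activity touches only one locality, an activity whose named results would clash with existing names is simply refused, and unnamed result resources are not modelled.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Pattern(Enum):
    PRESENCE_REQUIRED = auto()  # stays visible in the locality while in use
    EXCLUSIVE_HOLD = auto()     # hidden while in use, returned afterwards
    CONSUME = auto()            # hidden while in use, destroyed afterwards


@dataclass
class Resource:
    name: str
    aspects: frozenset          # labels standing in for aspect descriptions
    pattern: Pattern


@dataclass
class Locality:
    name: str
    resources: dict = field(default_factory=dict)  # resource name -> Resource

    def add(self, resource):
        # A new resource never overwrites an existing one with the same name.
        if resource.name in self.resources:
            raise ValueError(f"{resource.name!r} already exists in {self.name!r}")
        self.resources[resource.name] = resource

    def match(self, required):
        """Greedily pick one distinct resource per required aspect.

        Returns None if some aspect cannot be matched. Greedy choice can miss
        a valid assignment that a proper matching algorithm would find.
        """
        chosen = []
        for aspect in required:
            for res in self.resources.values():
                if aspect in res.aspects and res not in chosen:
                    chosen.append(res)
                    break
            else:
                return None
        return chosen


def run_activity(locality, required, results=()):
    """Run one activity in one locality; return False if it cannot run."""
    held = locality.match(required)
    if held is None or any(r.name in locality.resources for r in results):
        return False
    for res in held:
        if res.pattern is not Pattern.PRESENCE_REQUIRED:
            del locality.resources[res.name]      # hide for the duration
    # ... the activity's real work would happen here ...
    for res in held:
        if res.pattern is Pattern.EXCLUSIVE_HOLD:
            locality.resources[res.name] = res    # return to the locality
    for res in results:
        locality.add(res)                         # publish result resources
    return True


site = Locality("site-a")
site.add(Resource("cpu-1", frozenset({"cpu", "x86"}), Pattern.EXCLUSIVE_HOLD))
site.add(Resource("license-1", frozenset({"license"}), Pattern.CONSUME))
output = Resource("output", frozenset({"dataset"}), Pattern.PRESENCE_REQUIRED)

print(run_activity(site, ["x86", "license"], [output]))  # True
print(sorted(site.resources))                            # ['cpu-1', 'output']
```

In this toy run the CPU is held and then handed back, the license is consumed, and the result dataset appears in the locality where a later activity could depend on it.
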
A workflow is just a set of activities that presumably depend on each other (though I suppose they don't have to). A workflow is said to be complete when no activities are enabled (i.e. none have satisfiable aspect requirements) or when the action of any enabled activity would force the overwrite of one or more named resources. It is obviously possible to define a workflow that is always creating more and more resources by having an activity that only produces unnamed resources; don't do that! :-)

A few examples of actions:

* Simple job execution: the action requires aspects from a single locality that correspond to the resources and application name/version that you'd expect based on a JSDL description of the job.
* Dataset transfer: the action requires an aspect that corresponds to the dataset in the source locality, and another aspect that corresponds to sufficient free disk space in the target locality. If using some sort of network reservation, requiring an aspect for bandwidth from the locality/ies representing the network between the source and the target is a good idea.
* Distributed metacomputing: the action is like a simple job execution on several distinct localities, and quite possibly with some extra requirements on the network locality/ies for bandwidth and/or latency.

Known open questions:

* Undeployment of resources that are not typically Consumed.
* Access control (especially the locus of AuthZ decisions).
* Whether localities are themselves resources.
* How good the model of reservations is!
* Whether localities need to be referred to explicitly by actions and how localities and action aspect requirements interact in the description of actions.
* How much quality-of-service can be described as aspects on suitable resources.

Feedback is very welcome on this idea, which is definitely a product of the half-bakery.
But I think it has merit, especially when applied to the sorts of things which come up on the Grid. It is not the same as either Orchestration or Choreography as I understand them; I expect these ideas to complement more standard workflows instead.

Acknowledgements: This is purely my own work, but many people were responsible for the ferment of ideas from which it sprang. The following people are especially notable:

* Jay Ungar, who started me thinking about abstract resources and workflows
* Dean Kuo and Viktor Yarmolenko at Manchester, who started me thinking about abstract actions and reservations
* My colleagues from the BREIN project, especially from the Agents group at the University of Hohenheim, who helped me understand what sorts of abstractions are needed in (some kinds of) workflows

Hmm, this message has turned out to be rather longer than I originally anticipated! My apologies.

Donal Fellows (stuck overnight in Amsterdam, on the grounds that it is less awful than being stuck in JFK).