WS-GRAM protocol sketch, as promised on telecon

This is excerpted from a larger summary I sent to the BES BoF list back at
the beginning of the year. It shows the "boxcar" submission+subscription
mechanism as well as the hold/release mechanism. I've left out extra
content that was in the previous email to focus on these parts.

Note, WS-GRAM currently has only one "hold" state, which is used for
synchronizing _cleanup_ of jobs. It would be meaningful to have a
_startup_ hold state such as we had in previous GRAM protocols, but it is
not strictly necessary for reliable invocation because we use the
idempotent message-id idiom instead.

Also, keep in mind that the subscription pattern works because we are
using WSRF style rendering w/ properties and topics defined for the job
resource. So, the semantics are quite clear: create the job resource and
then invoke the subscription operation on it w/ the arguments embedded in
the creation message.

Please see the Globus website documentation, for example,

  http://www-unix.globus.org/toolkit/docs/4.0/execution/wsgram/developer/WS_GR...

for examples of client-side programming to build up and send such
protocol messages.

karl

---------------------------------------------------------
PORTTYPE ManagedJobFactoryPortType

----------
OPERATION job:createManagedJob

Request creation of a Managed Executable Job Resource whose EPR will be
returned in the response.

INPUT message: createManagedJobInputMessage has one part:

  <createManagedJob>
    <InitialTerminationTime>xsd:dateTime</InitialTerminationTime>?
    <JobID>wsa:AttributedURI</JobID>?
    <wsnt:Subscribe></wsnt:Subscribe>?
    <desc:job> ... </desc:job>
  </createManagedJob>

The optional JobID element is used to request idempotent invocation
semantics in a binding-independent manner.

The optional wsnt:Subscribe element is used to request automatic
subscription to the newly created Managed Job.

This call can also create a Managed Multi-Job Resource, i.e. a
co-allocated job spread across multiple WS-GRAM hosts, because the job
element is actually in an XSD choice with a multijob element.

OUTPUT message: createManagedJobOutputMessage has one part:

  <createManagedJobResponse>
    <NewTerminationTime>xsd:dateTime</NewTerminationTime>
    <CurrentTime>xsd:dateTime</CurrentTime>
    <managedJobEndpoint>wsa:EndpointReferenceType</managedJobEndpoint>
    <subscriptionEndpoint>wsa:EndpointReferenceType</subscriptionEndpoint>?
  </createManagedJobResponse>

The optional subscriptionEndpoint is returned if a corresponding
subscription request was included in the input.

---------------------------------------------------------
PORTTYPE ManagedExecutableJobPortType

A Managed Executable Job Resource (MEJR) represents one job that has
already been submitted by a client. It is a WSRF style resource with a
resource properties document to represent the status of the job:

  <managedExecutableJobResourceProperties>
    <stdoutURL>xsd:anyURI</stdoutURL>?
    <stderrURL>xsd:anyURI</stderrURL>?
    <credentialPath>xsd:string</credentialPath>?
    <exitCode>xsd:int</exitCode>?
    <serviceLevelAgreement>
      <desc:job> ... </desc:job>
    </serviceLevelAgreement>
    <Capacity>xsd:int</Capacity>
    <userSubject>xsd:string</userSubject>
    <fault/>
    <TopicExpressionDialects>xsd:anyURI</TopicExpressionDialects>
    <Topic Dialect=xsd:anyURI>
      xsd:any?
    </Topic>+
    <TerminationTime>xsd:dateTime</TerminationTime>
    <localUserId>xsd:string</localUserId>
    <CurrentTime>xsd:dateTime</CurrentTime>
    <holding>xsd:boolean</holding>
    <RegistrantData>xsd:base64Binary</RegistrantData>
    <RendezvousCompleted>xsd:boolean</RendezvousCompleted>
    <FixedTopicSet>xsd:boolean</FixedTopicSet>
    <state>Unsubmitted|StageIn|Pending|Active|Suspended|StageOut|Cleanup|Done|Failed</state>
  </managedExecutableJobResourceProperties>

These properties relate to

  job output file management: stdoutURL, stderrURL

  delegated credential management: credentialPath

  parallel task rendezvous (for MPICH-G2): Capacity, RegistrantData,
    RendezvousCompleted

  job status: exitCode, fault, holding, state

  job introspection: credentialPath, serviceLevelAgreement, localUserId

  WSRF introspection: TopicExpressionDialects, Topic, FixedTopicSet,
    TerminationTime, CurrentTime

----------
OPERATION exec:release

Releases the job from the hold state. The hold state is an optional
behavior selected in the job description to prevent post-execution file
deletions (clean-up) from occurring while a remote client is still
attempting to access the files. The release operation permits the normal
clean-up to occur.

INPUT message: releaseInputMessage has one (empty) part:

  <release/>

OUTPUT message: releaseOutputMessage has one (empty) part:

  <releaseResponse/>

-- 
Karl Czajkowski
karlcz@univa.com
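
As a rough illustration of the createManagedJob input message described
above, here is a minimal Python sketch that assembles the element with the
optional JobID and wsnt:Subscribe children. It is not taken from the Globus
client libraries. The namespace URIs are placeholders, the helper name
build_create_managed_job is hypothetical, placing the unprefixed elements in
the same namespace as the operation is an assumption, and the "uuid:" form
of the JobID is just one possible URI. The point it shows is the idempotent
message-id idiom: the client picks the JobID once and resends the identical
message on retry.

```python
import uuid
import xml.etree.ElementTree as ET
from typing import Optional

# Placeholder namespace URIs (assumptions for illustration only); these are
# NOT the real WS-GRAM, WS-Notification, or job description namespaces.
JOB_NS = "urn:example:managed-job"
DESC_NS = "urn:example:job-description"
WSNT_NS = "urn:example:wsnt"


def build_create_managed_job(job_id: Optional[str] = None,
                             termination_time: Optional[str] = None,
                             subscribe: bool = False) -> ET.Element:
    """Assemble a createManagedJob element, keeping the child order of the sketch."""
    root = ET.Element(f"{{{JOB_NS}}}createManagedJob")
    if termination_time is not None:
        # Optional InitialTerminationTime, an xsd:dateTime string.
        ET.SubElement(root, f"{{{JOB_NS}}}InitialTerminationTime").text = termination_time
    if job_id is not None:
        # Optional JobID: requests idempotent invocation semantics.
        ET.SubElement(root, f"{{{JOB_NS}}}JobID").text = job_id
    if subscribe:
        # Optional Subscribe: the "boxcar" subscription request. Its real
        # content (consumer EPR, topic expression) is left empty here.
        ET.SubElement(root, f"{{{WSNT_NS}}}Subscribe")
    # The job description itself is elided in the sketch, so it stays empty.
    ET.SubElement(root, f"{{{DESC_NS}}}job")
    return root


# Choose the JobID once, then reuse it for every resend of the same request.
job_id = f"uuid:{uuid.uuid4()}"
first = build_create_managed_job(job_id=job_id, subscribe=True)
retry = build_create_managed_job(job_id=job_id, subscribe=True)

# A retry carries exactly the same message, so the service can recognize it.
assert ET.tostring(first) == ET.tostring(retry)
```

When a Subscribe element is included, the response would be expected to
carry the optional subscriptionEndpoint next to managedJobEndpoint, as in
the output message shown earlier.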