Meeting - Tuesday, 8 Dec 2009, 16:00 (CET) - Notes

Dear all,

please see below the notes from today's meeting.

Best, Johannes

Meeting - Tuesday, 8 Dec 2009, 16:00 (CET)
Participants: Morris, Balazs, Etienne, Johannes, Aleksander
Agenda: Discussion of the work of Etienne on the "OGF PGI - AGU Execution Service Strawman Rendering"

Morris:
  go through meeting notes from last meeting
  would be good to have gLite participation
Balazs:
  we need at least one more person from gLite
  real work can only be done with a screensharing tool
  just discuss the ideas behind Etienne's work
Balazs (chairing the strawman rendering document discussion):
  doc is the latest version of the doc (version 37) by Aleksander on the PGI GridForge page
  page 4
  synchronize, merge Etienne's changes with actual model
  in createActivity (second sentence)
Etienne:
  the initial state of a job is the submitted state
Balazs:
  substate?
Etienne:
  incoming, waiting, outgoing
  submitted incoming: input queue, has not been processed yet
  submitted waiting: it is being processed
  submitted outgoing: id or EPR has been given to the job, ready for next state
  change that in a consistent way
Balazs:
  combining substates in a certain way -> clarify more
  not combine submitted outgoing with response
Etienne:
  useful that each state will represent something
  if you do not give job id at the end of submitted state, you have to completely include the submitted state inside the preprocessing state
Balazs:
  only the first level states, 2nd and 3rd are optional?
Etienne:
  important: at the end of submitted state - job id associated
Balazs:
  why at the end?
Etienne:
  at the latest at the end!
  could be returned earlier also
  it is useful for the execution service for checking if it is possible (JSDL checking)
Balazs:
  not validate JSDL here
  when does the execution service return the response?
Morris:
  page 4, comments
  in terms of UNICORE it is always possible directly, you don't do brokering
  not exactly sure how this relates to the state model
  out of the scope of the session
Balazs:
  not discussing staging of data
  discussing in which state job should be
  does not see this written down now
Morris:
  it is in the job description document validation
Balazs:
  missing the state model
  how do you read the job description validation?
Morris:
  matchmaking
  if there is a must - there is a must
  in gLite, first from the broker, then from execution service
Balazs:
  first mandatory validation, then response, maybe validation errors
  you have to validate first, to check for errors
Etienne:
  first mandatory validations: XML schema, semantic
  some validation may be only possible later?!
Balazs:
  all validation should be done before, then assign state
  you submit a job, execution service does couple of things like validation
  the first time you can check the state is when you have the job id
Etienne:
  in the beginning did not write the internal second level submit state
  added them for some request
  ready to remove them
Balazs:
  hold point is important
  substate of the submitted which is a hold!
Etienne:
  first sent on 13 Aug
Balazs:
  changeState operation
  resume like operations
  any kind of hold states
Morris:
  need a figure
  not the state model figure
  should have a step by step base like figure of presentation
Etienne:
  agree on the scope of the submitted state
  inside the submitted state, as soon as it has a job id, the job should go to the preprocessing state, where it can be held
  submitted state a certain time, in the beginning already job id
  can go to a hold substate
Balazs:
  preprocessing: different from data staging?
Etienne:
  preprocessing includes data staging
  any other actions performed
  that's why third level hold
  we have another preprocessing hold suspended when user wants to hold without relationship to staging
Morris:
  i.e. preprocessing stage, only one CPU before using the whole machine
  not related to data staging
Balazs:
  preprocessing is very complex state
Morris:
  during the execution vs. pre and post processing
  "before finished"
  other requirement for putting it somewhere
Etienne:
  postprocessing hold suspended state
Morris:
  clearly identify it is postprocessing - job is finished
Etienne:
  can go in any order at any time you want
Balazs:
  pre, post processing very complex, dangerous
  does not see a sequence how the job goes through
Balazs:
  really would like to see a linear state model
Etienne:
  agree with Balazs, inside the preprocessing hold there is no clear indication in which order the third level substates are processed
  implicitly suppose: can be processed in any order and any number of times
  if achieved only by the implementation, then it is ok
  if the order must be well known from the beginning, we have to renew the state model
Balazs:
  when exactly this will happen?
  put the job in a hold before any kind of data staging takes place
Etienne:
  if we keep model exactly as it is, it is not very precise, implementations can do it more easily
  if we create a too precise model, it will be only implemented by one middleware
  too many details, will limit the chance to have different middlewares
  are you sure you want to do that?
Balazs:
  just want to stop the job before preprocessing
Morris:
  use case?
  you don't do data staging then?
Balazs:
  just to be sure, the job does no other things
  for safety reason
Morris:
  pre means before
  preprocessing is not data staging
Balazs:
  in ARC it is a safety measure to be able to stop the job before running
Morris:
  does not really see the difference
  user gets the feedback -> submitted
  submitted hold sounds strange
  no distinction to preprocessing
Balazs:
  submitted hold is the end of submitted
Morris:
  seeking to understand the difference
  consistency in the state model
  automatic resubmission
  people may want to specify certain points
  use case maybe accounting problems
Etienne:
  inside the submitted state, reformulate substates
  one where the jobid is associated to the job
  when the job already has the job id, user is able to perform an operation on the job
  the jobid is associated in the middle of the submitted state
  there can be a submitted hold where the id has already been given to the job
  we have to synchronize the state model with the messages exchanged
Aleksander:
  how can it be an activity without an id
Morris:
  validation steps
  createActivity operations should only be successful if validation successful
  if validation not successful, no job created, no hold
Etienne:
  in figure rename submitted waiting to submitted validating
  add another substate, submitted hold
  id must be associated to the job
  corresponding messages have to be sent back to the user
  validation, creation of id, then send id to user
Morris:
  still keep the waiting
Etienne:
  rename waiting to validating
Balazs:
  everything is done, job is ready to go to the next state
  preprocessing, delegating, postprocessing
Etienne:
  in which order are third level substates processed?
Balazs:
  hold blocks
  blocking before going to the next state
Etienne:
  second level hold substate before outgoing substate?
Balazs:
  yes
Etienne:
  understands the requirement
  in fact in the three second level states preproc, delegating, postproc
  second level hold substate corresponds to what Balazs wishes
Balazs:
  different states second level hold, third level hold?
Morris:
  there might be middlewares without third level
Balazs:
  needs preprocessing hold
Morris:
  how is the third level flow?
Balazs:
  optional: third, second level
Morris:
  disagree
  second level should be adopted from all
  submitted, preprocessing, delegating, postprocessing
  i.e. preprocessing hold failed recoverable
  not many middlewares can do this
  agree with Balazs: aligning the third level of states
Etienne:
  first two levels are mandatory
  inside submitted: add a hold substate
Morris:
  submitted incoming is missing comparing to the state model
  submitted incoming is a state or is it no state?
Balazs:
  job id already
Etienne:
  no
  initial substate of the submitted state
  state without a name
  remove submitted incoming
  at this point no job id yet - cannot be handled
  just internal state
Morris:
  problems reported in logfiles
Etienne:
  expose only states which can be used by the submitter
  remove the submitted incoming label - it is not relevant to the submitter
  agreement on what is going on in the submitted state.
Morris:
  submitted validated
  only job id if it is validated
Etienne:
  job id is given at validated substate
  return to the document
  inside the submitted state, the execution service cannot always provide the location
Balazs:
  switch to the data staging issue
  data staging on createActivity
  pull approach
  service can publish information
Etienne:
  mail: three ways for message transfer
  which way is the best?
Balazs:
  client needs the info about the data staging location
Etienne:
  have to repeatedly poll until the job comes to manual staging
  if there is manual staging the job goes to the preprocessing state, hold
  cannot guarantee the consistency of that - to be discussed
  perhaps not so easy to link the return to a late state
Morris:
  one figure not enough
  maybe an animation
  hopefully can work on this
  urgently needed here!!!
Balazs:
  strongly avoid notification
  publish the staging location, not be coupled to job state
  server needs time to figure out the staging location - publish after some time
  why should this be coupled to a job state??
Etienne:
  in some cases the service has inner knowledge about the storage location inside the submitted state
Balazs:
  job should not be in preprocessing before the staging location is known
Morris:
  compared to job description validation
  i.e. we need 2 GB, requirement
  how could it still be valid to have it in submitted?
Etienne:
  computing element which really owns storage
  but if it is divided into sequential parts where submitted parts do not know about physical storage but of batch systems or storage systems, there is a problem
Etienne:
  at validated an execution service may already have allocated storage
Balazs:
  we should edit the text when everybody can see the document
Aleksander:
  validation, allocation
  data staging after allocation
  id should be assigned as soon as job is validated
  client receives information, polling or notification
Morris:
  continue next time with desktop sharing tool
Etienne:
  are these three ways from email consistent?
  which ways are preferred?
Aleksander:
  resend the mail!
Etienne:
  sent on 26 Oct, 4 Dec
Morris:
  continue next week

--
Johannes Watzl
Institut für Informatik / Dept. of CS
Ludwig-Maximilians-Universität München
Munich Network Management Team (Münchner Netz-Management Team)
Oettingenstr. 67, 80538 Munich, Germany
Room E 005, Phone +49-89-2180-9162
Email: watzl@nm.ifi.lmu.de
http://www.nm.ifi.lmu.de/~watzl
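
[Editorial illustration] The pull approach discussed in the notes above (the client repeatedly polls, and the service publishes the staging location after some time) might look roughly like the sketch below. Everything named here is an assumption made for illustration: `get_status`, the `staging_location` key, the `FakeService` stand-in and the example URL are hypothetical and do not come from the strawman document. The sketch only checks whether a location has been published; whether this should be tied to a particular job state was left open in the meeting.

```python
import time
from typing import Callable, Optional


def wait_for_staging_location(
    get_status: Callable[[str], dict],
    job_id: str,
    interval_s: float = 1.0,
    max_polls: int = 10,
) -> Optional[str]:
    """Poll a (hypothetical) status query until a staging location appears.

    Returns the location, or None if it was not published within max_polls.
    """
    for _ in range(max_polls):
        status = get_status(job_id)
        location = status.get("staging_location")  # hypothetical key
        if location is not None:
            return location
        time.sleep(interval_s)
    return None


class FakeService:
    """Stand-in for an execution service; publishes the location on the 3rd query."""

    def __init__(self) -> None:
        self.calls = 0

    def get_status(self, job_id: str) -> dict:
        self.calls += 1
        if self.calls < 3:
            # The service still needs time to figure out the staging location.
            return {"state": "Submitted"}
        return {
            "state": "Pre-processing:Hold",
            "staging_location": "https://example.org/staging/" + job_id,
        }
```

With the fake service, `wait_for_staging_location(FakeService().get_status, "job-1", interval_s=0.0)` returns the location on the third query. A real client would also need a sensible timeout and error handling, which the notes do not specify.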

Johannes and all,

Many thanks to Johannes for his notes.
I have improved below the formulation of :
- what I said,
- what I understood other participants said.

Best regards.

Etienne URBAH

Meeting - Tuesday, 8 Dec 2009, 16:00 (CET)
Participants: Morris, Balazs, Etienne, Johannes, Aleksander
Agenda: Discussion of the work of Etienne on the "OGF PGI - AGU Execution Service Strawman Rendering"

Morris:
  Go through meeting notes from last meeting
  Would be good to have gLite participation
Balazs:
  We need at least one more person from gLite
  Real work can only be done with a screensharing tool
  Just discuss the ideas behind Etienne's work
Balazs (chairing the strawman rendering document discussion):
  Doc is the latest version of the doc (version 37) by Aleksander on the PGI GridForge page
  Page 4
  Synchronize, merge Etienne's changes with actual model
  Inside createActivity (second sentence)
Etienne:
  The initial state of a job is the submitted state
Balazs:
  Substates ?
Etienne:
  Substates of 'Submitted' state are : Incoming, Waiting, Outgoing
  - Incoming = Job inside input queue, has not been processed yet
  - Waiting = JSDL is being processed
  - Outgoing = JobId or EPR has been given to the job, ready for next state
  Any change must be performed in a consistent way
Balazs:
  Combining substates in a certain way -> clarify more
  Not combine submitted outgoing with response
Etienne:
  It is useful that each state really represents something
  If you do not give JobId at the end of 'Submitted' state, you have to completely include the 'Submitted' state inside the 'Pre-processing' state.
Balazs:
  Only the first level states, 2nd and 3rd are optional ?
Etienne:
  Important: at the end of submitted state - JobId associated
Balazs:
  Why at the end ?
Etienne:
  At the latest at the end !
  Could be returned earlier also
  It is useful for the execution service for checking if it is possible (JSDL checking)
Balazs:
  Not validate JSDL here
  When does execution service return response ?
Morris:
  Page 4, comments
  - In terms of UNICORE it is always possible directly, you don't do brokering
  - Not exactly sure how this relates to the state model
  - Out of the scope of the session ?
Balazs:
  Now, we are not discussing staging of data
  We are discussing in which state job should be
  I do not see this written down now
Morris:
  It is in the job description document validation
Balazs:
  Missing the state model
  How do you read the job description validation?
Morris:
  Matchmaking
  If there is a must - there is a must
  In gLite, first from the broker, then from execution service
Balazs:
  First mandatory validation, then response containing maybe validation errors
  You have to validate first, to check for errors
Etienne:
  First mandatory validations: XML, Schema, Semantic
  Some validation may be only possible later ?
Balazs:
  All validation should be done before, then assign state
  You submit a job, execution service does couple of things like validation
  The first time the submitter can check the state is when he/she has the JobId
Etienne:
  At the beginning, I did not write the second level substates of 'Submitted'.
  I added them after some request, and I am ready to remove them.
Balazs:
  Hold point is important : Substate of 'Submitted' which is a 'Hold' !
Etienne:
  Concerning the 'Submitted:Hold' substate, I sent a mail on 13 August
Balazs:
  changeState operation
  resume like operations
  any kind of hold states
Morris:
  We need a figure :
  - not only the state model figure
  - but a step by step base like figure of presentation
Etienne:
  We have to agree on the scope of the 'Submitted' state :
  Inside the 'Submitted' state, as soon as a job has its JobId, the job should go to the 'Pre-processing' state, where it can be put in a 'Hold' substate
Balazs:
  Is 'Pre-processing' different from data staging ?
Etienne:
  'Pre-processing' includes data staging
  Any other actions can be performed
  That's why there are third level 'Hold' substates
  There is 'Pre-processing:Hold:Suspended' when user wants 'Hold' without relationship to staging
Morris:
  i.e. preprocessing stage, only one CPU before using the whole machine
  not related to data staging
Balazs:
  'Pre-processing' is very complex state
Morris:
  During the execution vs. pre and post processing
  "before finished"
  Other requirement for putting it somewhere
Etienne:
  There is also the 'Post-processing:Hold:Suspended' state
Morris:
  Clearly identify it is 'Post-processing' : Job is finished
Etienne:
  Inside second level 'Hold' substates, third level substates can occur in any order at any time you want
Balazs:
  Pre, post processing very complex, dangerous
  I do not see a sequence how the job goes through
  I really would like to see a linear state model
Etienne:
  I agree with Balazs, inside the preprocessing hold there is no clear indication in which order the third level substates are processed
  I implicitly suppose that they can be processed in any order and any number of times
  - If it is achieved only by the implementation, then it is OK
  - If the order must be well known from the beginning, we have to completely review the state model
Balazs:
  When exactly this will happen?
  Put the job in 'Hold' state before any kind of data staging takes place
Etienne:
  If we keep model exactly as it is, it is not very precise, and several middlewares can implement it more easily.
  If we create a too precise model, it will be only implemented by one middleware.
  Too many details will limit the chance to have different middlewares.
  Are you sure you want to do that ?
Balazs:
  Just want to stop the Job before 'Pre-processing'
Morris:
  Use case?
  You don't do data staging then?
Balazs:
  Use case : For safety reasons, the Submitter just wants to be sure that the job does not perform anything.
Morris:
  'Pre' means before
  'Pre-processing' is not data staging
Balazs:
  Inside ARC it is a safety measure to be able to stop the job before running
Morris:
  I do not really see the difference
  User gets the feedback -> 'Submitted'
  'Submitted:Hold' sounds strange
  No distinction with 'Pre-processing'
Balazs:
  'Submitted:Hold' is the end of 'Submitted'
Morris:
  I am seeking to understand the difference ...
  Consistency in the state model
  In the case of 'automatic resubmission', people may want to specify certain 'Hold' points
  Use case maybe accounting problems
Etienne:
  Inside the submitted state, I propose to change the list of substates :
  Create one substate where the JobId is associated to the Job
  When the Job already has its JobId, the submitter is able to perform operations on the job
  If the JobId is associated to the Job in the middle of the 'Submitted' state, there can be a 'Submitted:Hold' substate where the JobId has already been given to the Job
  We have to synchronize the state model with the messages exchanged
Aleksander:
  How can it be an activity without a JobId ?
Morris:
  Validation steps
  createActivity operations should only be successful if validation is successful
  If validation is not successful, no Job is created, no 'Hold' is possible
Etienne:
  Inside the 'Submitted' state, I propose to :
  - rename 'Submitted:Waiting' to 'Submitted:Validating'
  - add another substate, 'Submitted:Hold'
  JobId must be associated to the Job
  Corresponding messages have to be sent back to the Submitter
  Sequence : Validation, Creation of JobId, then send JobId to Submitter
Morris:
  Still keep the 'Waiting' substate
Etienne:
  No, rename 'Waiting' to 'Validating'
Balazs:
  When everything is done, the job is ready to go to the next state
  Pre-processing, Delegated, Post-processing
Etienne:
  In which order should the third level substates be processed?
Balazs:
  'Hold' blocks : blocking before going to the next state
Etienne:
  Second level 'Hold' substate before outgoing substate ?
Balazs:
  Yes
Etienne:
  I understand the requirement
  In fact, inside the 3 first level states preproc, delegated, postproc, there already is a second level 'Hold' substate corresponding to what Balazs wishes
Balazs:
  Different states ? Second level 'Hold', third level 'Hold' ?
Morris:
  There might be middlewares without third level
Balazs:
  needs preprocessing hold
Morris:
  How is the third level flow ?
Balazs:
  Third and second level are optional
Morris:
  I disagree : Second level should be adopted by all
  submitted, preprocessing, delegating, postprocessing
  i.e. preprocessing hold failed recoverable
  not many middlewares can do this
  I agree with Balazs: aligning the third level of states
Etienne:
  I had written that only the first level was mandatory, but I agree to write that the first two levels are mandatory
  Besides, inside 'Submitted', I will add a 'Hold' substate
Morris:
  'Submitted:Incoming' is missing when comparing to the state model
  Is 'Submitted:Incoming' a substate or is it not?
Balazs:
  For a substate, the JobId must already exist
Etienne:
  I agree with Balazs : The initial substate of the 'Submitted' state should have no name
  I will remove the 'Submitted:Incoming' label, because at this point there is no JobId yet - The Job cannot be handled by the Submitter, it is just an internal substate
Morris:
  But we should take into account problems reported in logfiles
Etienne:
  Such problems are implementation-specific.
  We should expose only states which can be used by the Submitter
  So I will remove the 'Submitted:Incoming' label, which is not relevant for the Submitter
  Is there now an agreement on what is going on in the submitted state ?
Morris:
  'Submitted:Validating' should be renamed as 'Submitted:Validated' because there is a JobId only if the JSDL has already been validated.
Etienne:
  I agree, the JobId is given at 'Submitted:Validated' substate
  Now, I propose to return to the document
  Inside the 'Submitted' state, the Execution Service cannot always provide a storage location
Balazs:
  So we are switching the subject to the data staging issue
  Data staging on createActivity
  Pull approach
  Service can publish information
Etienne:
  Please look at my mail, where I present 3 ways for message transfer
  Which way is the best ?
Someone:
  We cannot suppose that the Submitter implements subscription to notifications.
Balazs:
  The Submitter needs the info about the data staging location
Etienne:
  Without notifications, the consistent solution is that the Submitter has to repeatedly poll until the job comes to manual staging
  If the Execution Service returns the JobId only at manual staging when the Job is inside the 'Pre-processing:Hold' substate, I cannot guarantee the consistency
  This has to be discussed, because I am afraid that it is not so easy to link the 'CreateActivity' return message to such a late state
Morris:
  one figure not enough
  maybe an animation
  hopefully can work on this
  urgently needed here !!!
Balazs:
  I strongly wish to avoid notifications
  Publishing the staging location should not be coupled to job state
  Execution Service needs time to figure out the staging location - It publishes it after some time
  Why should this be coupled to a job state ??
Etienne:
  In some cases, the Execution Service has knowledge about the storage location already inside the submitted state, in some other cases it does not have this knowledge
Balazs:
  The Job should not go to 'Pre-processing' if the Execution Service does not know the staging location
Morris:
  Compared to job description validation
  i.e. we need 2 GB, requirement
  How could it still be valid to have it in submitted?
Etienne:
  A standard computing element really owns storage, so that at 'Submitted:Validated', an Execution Service may already have allocated storage to the Job
  But if the Execution Service is divided into sequential parts where the 'Submitted' part does not know about physical storage but only about capabilities of batch systems and storage systems, there is a problem
Balazs:
  We should edit the text when everybody can see the document
Aleksander:
  validation, allocation
  data staging after allocation
  id should be assigned as soon as job is validated
  client receives information, polling or notification
Morris:
  We will continue next time with desktop sharing tool
Etienne:
  Important questions :
  - Are these 3 ways described in my email consistent ?
  - Which ways are preferred?
Aleksander:
  Which mail ?
Etienne:
  Sent on 26 October, resent on 4 December
Morris:
  Continue next week

-----------------------------------------------------
Etienne URBAH
LAL, Univ Paris-Sud, IN2P3/CNRS
Bat 200, 91898 ORSAY, France
Tel: +33 1 64 46 84 87
Mob: +33 6 22 30 53 27
Skype: etienne.urbah
mailto:urbah@lal.in2p3.fr
-----------------------------------------------------

On Tue, 08 Dec 2009, Johannes Watzl wrote:
Dear all,
please see below the notes from today's meeting.
Best, Johannes
...
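
[Editorial illustration] The flow inside the 'Submitted' state, as it appears to have been agreed in the notes above (validation first, then creation of the JobId, then return of the JobId to the Submitter, with an optional 'Submitted:Hold' before 'Outgoing'), can be sketched as below. This is one possible reading, not the strawman's definition: the function and class names, the placeholder validation check and the use of a UUID as JobId are all assumptions. Only the substate labels are taken from the notes.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


class ValidationError(Exception):
    """Validation failed: no Job is created, so no 'Hold' is possible."""


@dataclass
class Job:
    job_id: str
    history: List[str] = field(default_factory=list)

    @property
    def state(self) -> str:
        return self.history[-1]


def validate(jsdl: str) -> None:
    # Placeholder for the mandatory validations (XML, schema, semantic).
    if not jsdl.strip():
        raise ValidationError("empty job description")


def create_activity(jsdl: str, hold: bool = False) -> Job:
    """Hypothetical createActivity: succeeds only if validation succeeds."""
    # Unnamed initial substate: internal, no JobId yet, not exposed to the Submitter.
    validate(jsdl)
    # The JobId is associated once the description has been validated.
    job = Job(job_id=str(uuid.uuid4()))
    job.history.append("Submitted:Validated")
    if hold:
        # Optional hold point before leaving 'Submitted'.
        job.history.append("Submitted:Hold")
    return job  # the JobId goes back to the Submitter here


def proceed(job: Job) -> None:
    """Leave 'Submitted' (from 'Validated' or 'Hold') for the next first-level state."""
    if job.state not in ("Submitted:Validated", "Submitted:Hold"):
        raise ValueError("job is not waiting inside 'Submitted'")
    job.history.append("Submitted:Outgoing")
    job.history.append("Pre-processing")
```

For example, `create_activity("<JobDefinition/>", hold=True)` yields a job in 'Submitted:Hold' that already has a JobId, and `proceed(job)` then moves it through 'Submitted:Outgoing' into 'Pre-processing'. A failed validation raises before any Job object exists.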