
As we agreed on yesterday's call, we really need to have an answer to the failure mode where the DTF chooses a protocol that does not work, for whatever reason, between the source and sink. I am a strong proponent of option #1 below. Here is how I see this playing out.

We add the following fault to the description of the DTI (my apologies for the poor WSDL and arbitrary choice of pseudo-code, but I think that they make the point):

    The DTI MUST raise the following fault if the protocol being used for the transfer cannot be used to transfer data between the source and sink.

    FailedProtocol [reason: xsd:any; updatedSource, updatedSink: DEPR]

    The returned value "reason" MAY be used by the DTI to provide details about the protocol failure. The value of "reason" is dependent upon the DTI implementation.

    The returned values "updatedSource" and "updatedSink" MUST refer to the same source/sink of data as that passed into the DTF that created the DTI that has just failed. These MUST differ from the source/sink passed into that DTF only in that the value of the "dmi:SupportedProtocol" field has been updated to remove the failing protocol, and any other protocols that the DTI has determined will also fail.

    The client MAY use "reason" to determine its next action.

The following pseudo-code is the core code that a typical client of DMI is expected to use:

    source, sink: DEPR;
    retry: BOOLEAN := TRUE;

    << client sets source and sink in an implementation dependent manner >>

    WHILE (retry) DO {
        dti: DTI;
        retry := FALSE;
        dti := RequestDataTransferInstance (source, sink, ...)
            [ DMI: FailedProtocol(reason: ANY, newSource, newSink: DEPR) =>
                { retry := TRUE; source := newSource; sink := newSink; } ];
    };

Here are the pros and cons of this approach as I see them:

1. The client is insulated from the logic needed to properly invoke the DTF after this failure. The client simply re-invokes the DTF.
2. The client recovery code is just another failure case that the client needs to handle when using the DTI.
3. The client recovery code is simple/trivial.
4. The DTI must have specific code to properly raise this error. The code to modify the source/sink DEPRs is fairly simple. In any event, this becomes something that is written only once by the DTI implementation and effectively used many times by the clients of DMI who end up using this particular DTI implementation.
5. We avoid introducing new architectural elements into a DMI implementation.
6. The client is aware of the error and must recover from it. In an ideal solution, we would hide this error situation from the client unless there was no protocol that worked.

Allen Luniewski
IBM WebSphere Cross Brand Services
IBM Silicon Valley Laboratory
555 Bailey Ave.
San Jose, CA 95141
408-463-2255
408-930-1844 (mobile)


Mario Antonioletti <mario@epcc.ed.ac.uk>
Sent by: ogsa-dmi-wg-bounces@ogf.org
11/08/2007 02:12 AM

To: OGSA-DMI <ogsa-dmi-wg@ogf.org>
cc:
Subject: [ogsa-dmi-wg] DMI Telcon Minutes


Note: No call next week and the next call is on the 20th at the usual time (not the 21st!).

OGSA-DMI Telcon - 07/11/07
==========================

Attendees:

  Steve Newhouse, Microsoft
  Mario Antonioletti, EPCC
  Allen Luniewski, IBM
  Michel Drescher, Fujitsu

Agenda:

  1. IPR Notice
  2. Previous Minutes and Action Review
  3. Agenda Bashing
  4. Issues Arising
  5. Spec Progress
  6. AOB

Actions:

  [Ravi] Resolve minor comments on the spec.
  [Ravi] Come up with a proposal on how to represent data aggregates (e.g. multiple files) with Data EPRs to the mailing list.
  [Ravi] Come up with a proposal for a multiple retries property to the mailing list.
  [Steve/Michel] Come up with a proposal for Data EPRs that can deal with the three scenarios proposed below.
  http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-October/000293.html

+---

Discussion on the Data EPRs and scenarios was postponed until Ravi becomes available or comments on the suggestions proposed by Michel:

  http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-October/000293.html

and Steven:

  http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-October/000294.html

Discussion thus proceeded on how to address transfer failures by the DMI architecture. DTF transport negotiation is done by protocol matching but, at run time, other factors may come into play that prevent the transfer from succeeding, e.g. firewalls. At the moment this means that the DTI will report the failure to the client but the client has no current way to communicate this information back to the DTF - the client can talk to the DTF again to effect the transfer but this will probably lead to the same protocols being chosen and thus reproduce the same failure. Allen suggested three possible solutions:

  http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-November/000301.html

Neither 1 nor 2 is preferred; 3 requires active negotiation, which is not in scope for this version of the spec. Not clear that it would not be able to do the transfer unless it performed a small test.

The following solutions were discussed, all starting from the point where the data transfer has begun with the DTI but then failed:

  1. The DTI returns the failure to the client with modified DEPRs - effectively new ones with the failed protocol removed from the list of supported protocols. The client can then use these DEPRs to retry the transfer through the DTF.

  2. The DTI returns the failed protocol to the client, the client modifies the data EPRs (as in 1 by removing the failed transport protocol) and resubmits these to the DTF.

  3. The client passes on the DEPRs to the DTF as well as any failure messages returned by the DTI, so that the DTF is informed of the protocols that do not work (this might mean aggregating multiple failure messages, which does not seem desirable).

  4. The DTI is able to communicate the outcome of a transfer to the DTF, which is then informed of what protocols do not work and may be able to act on this.

  5. There is a user agent between the DTF and DTI that maintains state and is able to apply some re-try policy on behalf of the client. This would maintain the clean interfaces and state models already in the spec.

Michel is not at all keen on 1 as the minting of the DEPRs should be done by third parties. Allen is not keen on 2 as this is making clients do stuff that should not really be in their scope.

The idea now is that interested parties will flesh out the use case above (or some other) that most appeals to them, noting the pros and cons. It was felt that we should not produce a first version of this spec that does not address this problem.

Another factor addressed in the call was what happens when the DTF can match more than one protocol - how does it choose what protocol to use? The fastest? The cheapest? etc? It was thought that for version 1 of the spec this would not be addressed - i.e. the client does not provide hints or QoS parameters BUT the DTF should publish what algorithm it will use to choose a transport protocol when there is more than one valid choice.

There will be no DMI call next week, in part due to SC07 and other commitments. The next call is on the 20th of November at the same time (this is so as to not clash with the night before Thanksgiving).

+-----------------------------------------------------------------------+
|Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ.   |
|Tel:0131 650 5141|mario@epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
+-----------------------------------------------------------------------+
--
  ogsa-dmi-wg mailing list
  ogsa-dmi-wg@ogf.org
  http://www.ogf.org/mailman/listinfo/ogsa-dmi-wg
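
For anyone who wants to experiment with option 1 (the DTI hands back DEPRs with the failed protocol removed and the client simply retries), the client loop from Allen's pseudo-code can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not part of the DMI spec: the DEPR class, the NoCommonProtocol exception, the without_protocols helper, the "works" callback and the "first protocol in the source's order" selection rule are all hypothetical, and the toy request_data_transfer_instance collapses the DTF and the DTI into a single call.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Set, Tuple


@dataclass(frozen=True)
class DEPR:
    """Hypothetical stand-in for a data EPR: an address plus its protocol list."""
    address: str
    supported_protocols: Tuple[str, ...]


class FailedProtocol(Exception):
    """Sketch of the proposed fault: a reason plus the updated source/sink DEPRs."""

    def __init__(self, reason: Any, updated_source: DEPR, updated_sink: DEPR):
        super().__init__(reason)
        self.reason = reason
        self.updated_source = updated_source
        self.updated_sink = updated_sink


class NoCommonProtocol(Exception):
    """Hypothetical: the toy DTF found no protocol shared by source and sink."""


def without_protocols(depr: DEPR, failing: Set[str]) -> DEPR:
    """Return a copy of depr that differs only in its supported-protocol list."""
    kept = tuple(p for p in depr.supported_protocols if p not in failing)
    return replace(depr, supported_protocols=kept)


def request_data_transfer_instance(
    source: DEPR, sink: DEPR, works: Callable[[str], bool]
) -> str:
    """Toy DTF + DTI: match protocols, try the first match, fault if it fails."""
    common = [p for p in source.supported_protocols
              if p in sink.supported_protocols]
    if not common:
        raise NoCommonProtocol("no protocol left to try")
    protocol = common[0]  # assumed selection rule: first match in source order
    if not works(protocol):
        raise FailedProtocol(
            reason=f"{protocol} failed at run time",
            updated_source=without_protocols(source, {protocol}),
            updated_sink=without_protocols(sink, {protocol}),
        )
    return protocol


def transfer_with_retry(
    source: DEPR, sink: DEPR, works: Callable[[str], bool]
) -> str:
    """Client loop: on FailedProtocol, retry with the DEPRs the fault returned."""
    while True:
        try:
            return request_data_transfer_instance(source, sink, works)
        except FailedProtocol as fault:
            source, sink = fault.updated_source, fault.updated_sink


if __name__ == "__main__":
    src = DEPR("source", ("gridftp", "https", "ftp"))
    dst = DEPR("sink", ("gridftp", "ftp"))
    blocked = {"gridftp"}  # e.g. stopped by a firewall at run time
    print(transfer_with_retry(src, dst, lambda p: p not in blocked))
```

In this sketch the loop always terminates, because every FailedProtocol fault removes at least one protocol from both DEPRs, so the set of matching protocols shrinks until either a transfer succeeds or NoCommonProtocol is raised. The client never inspects or edits the protocol list itself, which is the property point 1 of the pros and cons is after.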