Note: No call next week and the next call is on the 20th at the usual time (not the 21st!).

OGSA-DMI Telcon - 07/11/07
==========================

Attendees:

  Steve Newhouse, Microsoft
  Mario Antonioletti, EPCC
  Allen Luniewski, IBM
  Michel Drescher, Fujitsu

Agenda:

  1. IPR Notice
  2. Previous Minutes and Action Review
  3. Agenda Bashing
  4. Issues Arising
  5. Spec Progress
  6. AOB

Actions:

  [Ravi] Resolve minor comments on the spec.
  [Ravi] Come up with a proposal on how to represent data aggregates (e.g. multiple files) with Data EPRs to the mailing list.
  [Ravi] Come up with a proposal for a multiple retries property to the mailing list.
  [Steve/Michel] Come up with a proposal for Data EPRs that can deal with the three scenarios proposed below.

  http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-October/000293.html

+---

Discussion on the Data EPRs and scenarios was postponed until Ravi becomes available or comments on the suggestions proposed by Michel:

http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-October/000293.html

and Steven:

http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-October/000294.html

Discussion thus proceeded on how the DMI architecture should address transfer failures.

DTF transport negotiation is done by protocol matching but, at run time, other factors may come into play that prevent the transfer from succeeding, e.g. firewalls. At the moment this means that the DTI will report the failure to the client, but the client currently has no way to communicate this information back to the DTF - the client can talk to the DTF again to effect the transfer, but this will probably lead to the same protocols being chosen and thus reproduce the same failure. Allen suggested three possible solutions.

http://www.ogf.org/pipermail/ogsa-dmi-wg/2007-November/000301.html

Neither 1 nor 2 is preferred; 3 requires active negotiation, which is not in scope for this version of the spec. It is not clear that it could know it would be unable to do the transfer unless it performed a small test.
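The failure mode described above can be sketched in a few lines of Python. This is only an illustration: the DataEPR class, the match_protocol function and the "first common protocol wins" rule are assumptions made for the example, not anything defined by the DMI spec.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataEPR:
    """Hypothetical stand-in for a Data EPR: a name plus its supported protocols."""
    name: str
    supported_protocols: Tuple[str, ...]


def match_protocol(source: DataEPR, sink: DataEPR) -> Optional[str]:
    """Assumed DTF rule: pick the first source protocol the sink also supports."""
    for protocol in source.supported_protocols:
        if protocol in sink.supported_protocols:
            return protocol
    return None


source = DataEPR("source", ("gridftp", "https", "scp"))
sink = DataEPR("sink", ("https", "gridftp"))

first_choice = match_protocol(source, sink)
# Suppose a firewall blocks first_choice at run time. The client can only
# resubmit the same DEPRs, so the DTF reaches exactly the same decision.
second_choice = match_protocol(source, sink)
print(first_choice, second_choice)
```

Because the matching step sees identical input on the retry, nothing steers it away from the protocol that just failed.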
The following solutions were discussed, all starting from the point where the data transfer has begun with the DTI but then failed:

1. The DTI returns the failure to the client together with modified DEPRs - effectively new ones with the failed protocol removed from the list of supported protocols. The client can then use these DEPRs to retry the transfer through the DTF.

2. The DTI returns the failed protocol to the client; the client modifies the data EPRs (as in 1, by removing the failed transport protocol) and resubmits these to the DTF.

3. The client passes the DEPRs to the DTF together with any failure messages returned by the DTI, so that the DTF is informed of the protocols that do not work (this might mean aggregating multiple failure messages, which does not seem desirable).

4. The DTI is able to communicate the outcome of a transfer to the DTF, which is then informed of which protocols do not work and may be able to act on this.

5. There is a user agent between the DTF and DTI that maintains state and is able to apply some retry policy on behalf of the client. This would maintain the clean interfaces and state models already in the spec.

Michel is not at all keen on 1, as the minting of the DEPRs should be done by third parties. Allen is not keen on 2, as this makes clients do things that should not really be in their scope. The idea now is that interested parties will flesh out the use case above (or some other) that most appeals to them, noting the pros and cons. It was felt that we should not produce a first version of this spec that does not address this problem.

Another issue addressed in the call was what happens when the DTF can match more than one protocol - how does it choose which protocol to use? The fastest? The cheapest? etc. It was thought that for version 1 of the spec this would not be addressed - i.e. the client does not provide hints or QoS parameters, BUT the DTF should publish what algorithm it will use to choose a transport protocol when there is more than one valid choice.

There will be no DMI call next week, due in part to SC07 and other commitments.

The next call is on the 20th of November at the same time (this is so as not to clash with the night before Thanksgiving).

+-----------------------------------------------------------------------+
|Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ.   |
|Tel:0131 650 5141|mario@epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
+-----------------------------------------------------------------------+
As we agreed on yesterday's call, we really need to have an answer to the
failure mode where the DTF chooses a protocol that does not work, for
whatever reason, between the source and sink.
I am a strong proponent of option #1 below. Here is how I see this playing
out.
We add the following fault to the description of the DTI (my apologies for
the poor WSDL and arbitrary choice of pseudo-code but I think that they
make the point):
The DTI MUST raise the following fault if the protocol being used for
the transfer cannot be used to transfer data between the source and sink.

    FailedProtocol [reason: xsd:any; updatedSource, updatedSink: DEPR]

The returned value "reason" MAY be used by the DTI to provide details
about the protocol failure. The value of "reason" is dependent upon
the DTI implementation.

The returned values "updatedSource" and "updatedSink" MUST refer to
the same source/sink of data as that passed into the DTF that created
the DTI that has just failed. These MUST differ from the source/sink
passed into that DTF only in that the value of the
"dmi:SupportedProtocol" field has been updated to remove the failing
protocol, and any other protocols that the DTI has determined will
also fail.
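As a rough illustration of the rule above, here is a minimal Python sketch of how a DTI might derive "updatedSource" and "updatedSink". The DataEPR class, the without_protocols helper and the exception layout are hypothetical names chosen for the example; the real DEPR is an XML structure and the real fault would be defined in WSDL.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Tuple


@dataclass(frozen=True)
class DataEPR:
    """Hypothetical DEPR: an address plus the dmi:SupportedProtocol values."""
    address: str
    supported_protocols: Tuple[str, ...]


class FailedProtocol(Exception):
    """Illustrative fault carrying the reason and the updated DEPRs."""

    def __init__(self, reason: str, updated_source: DataEPR, updated_sink: DataEPR):
        super().__init__(reason)
        self.reason = reason
        self.updated_source = updated_source
        self.updated_sink = updated_sink


def without_protocols(depr: DataEPR, failed: Iterable[str]) -> DataEPR:
    """Return a copy of depr that differs only in its supported protocols."""
    failed_set = set(failed)
    remaining = tuple(p for p in depr.supported_protocols if p not in failed_set)
    return replace(depr, supported_protocols=remaining)


def make_fault(source: DataEPR, sink: DataEPR, failed: Iterable[str], reason: str) -> FailedProtocol:
    """Build the fault for the given failing protocols."""
    failed = list(failed)
    return FailedProtocol(reason, without_protocols(source, failed), without_protocols(sink, failed))


source = DataEPR("gsiftp://src.example.org/data", ("gridftp", "https"))
sink = DataEPR("https://dst.example.org/data", ("gridftp", "https", "scp"))
fault = make_fault(source, sink, ["gridftp"], "connection refused")
print(fault.updated_source.supported_protocols, fault.updated_sink.supported_protocols)
```

The updated DEPRs keep the original address untouched, which is what the "MUST differ only in" wording asks for.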
The client MAY use "reason" to determine its next action. The
following pseudo-code is the core code that a typical client of DMI is
expected to use:
    source, sink: DEPR;
    retry: BOOLEAN := TRUE;
    << client sets source and sink in an implementation dependent
       manner >>
    WHILE (retry) DO {
        dti: DTI;
        retry := FALSE;
        dti := RequestDataTransferInstance (source, sink, ...) [
            DMI: FailedProtocol(reason: ANY, newSource, newSink:
                                DEPR) => {
                retry := TRUE;
                source := newSource;
                sink := newSink;}];
    };
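The same retry loop can be written as a small runnable Python sketch. Everything here is an assumption made for illustration: the DataEPR class, the simulated request_data_transfer_instance function (which stands in for both the DTF and the failing DTI), the BLOCKED set that plays the role of a firewall, and the NoCommonProtocol exception that ends the loop once no protocol is left.

```python
from dataclasses import dataclass, replace
from typing import Tuple

BLOCKED = {"gridftp"}  # hypothetical: protocols that fail at run time


@dataclass(frozen=True)
class DataEPR:
    address: str
    supported_protocols: Tuple[str, ...]


class FailedProtocol(Exception):
    def __init__(self, reason: str, new_source: DataEPR, new_sink: DataEPR):
        super().__init__(reason)
        self.new_source = new_source
        self.new_sink = new_sink


class NoCommonProtocol(Exception):
    """Raised when source and sink share no protocol at all."""


def request_data_transfer_instance(source: DataEPR, sink: DataEPR) -> str:
    """Simulated DTF and DTI: match a protocol, then attempt the transfer."""
    common = [p for p in source.supported_protocols if p in sink.supported_protocols]
    if not common:
        raise NoCommonProtocol("no protocol shared by source and sink")
    protocol = common[0]
    if protocol in BLOCKED:
        def strip(depr: DataEPR) -> DataEPR:
            kept = tuple(p for p in depr.supported_protocols if p != protocol)
            return replace(depr, supported_protocols=kept)
        raise FailedProtocol(protocol + " blocked", strip(source), strip(sink))
    return protocol  # the protocol that carried the transfer


def transfer(source: DataEPR, sink: DataEPR) -> str:
    """Client loop: on FailedProtocol, retry with the DEPRs from the fault."""
    while True:
        try:
            return request_data_transfer_instance(source, sink)
        except FailedProtocol as fault:
            source, sink = fault.new_source, fault.new_sink


result = transfer(DataEPR("src", ("gridftp", "https")), DataEPR("dst", ("gridftp", "https")))
print(result)
```

Each fault removes at least one protocol, so the loop terminates: either a transfer succeeds or NoCommonProtocol reaches the caller.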
Here are the pros and cons of this approach as I see them:

1. The client is insulated from the logic needed to properly invoke
   the DTF after this failure. The client simply re-invokes the DTF.
2. The client recovery code is just another failure case that the
   client needs to handle when using the DTI.
3. The client recovery code is simple/trivial.
4. The DTI must have specific code to properly raise this error. The
   code to modify the source/sink DEPRs is fairly simple. In any event,
   this becomes something that is written only once by the DTI
   implementation and effectively used many times by the clients of
   DMI who end up using this particular DTI implementation.
5. We avoid introducing new architectural elements into a DMI
   implementation.
6. The client is aware of the error and must recover from it. In an
   ideal solution, we would hide this error situation from the client
   unless there was no protocol that worked.
Allen Luniewski
IBM WebSphere Cross Brand Services
IBM Silicon Valley Laboratory
555 Bailey Ave.
San Jose, CA 95141
408-463-2255
408-930-1844 (mobile)
Folks, here's my take on this matter. The description tries to limit itself to the scope of V1 of the specification.

To briefly lay out the general architecture again: there are several entities and data structures involved in a DMI-based data transfer.

The Source entity holds the data as a sequence of bytes. The Sink entity eventually receives the data as a sequence of bytes and persistently stores them (e.g. on a hard disk).

The data at the Source is described by one, possibly many, DataEPR data structure(s). The description encoded in the DataEPRs is transport protocol-neutral and eventually needs to be "incarnated" (UNICORE terminology), "grounded" (Globus terminology?) or "concretized" (general terminology?) into a transport-specific format before the data transfer can commence. For the remainder of this text only one DataEPR instead of many is considered.

This DataEPR describing the data at the Source also contains information about the specific data transport protocols that can be used to retrieve the described sequence of bytes from the Source. Associated with that information are transfer protocol-specific, concrete formats of the abstract description of the data.

Likewise, there exists a DataEPR data structure that describes how (and where) the data shall be stored at the Sink. This DataEPR has the same structure as the DataEPR that describes the data at the Source: it contains an abstract description of the data and its intended location at the Sink, and a set of data transfer protocols that are available for use to store the data at the Sink. There are also transfer protocol-specific formats of the abstract description associated with the transport protocol descriptions in the DataEPR for the Sink.

A Client can instruct a DMI implementation to carry out a data transfer on its behalf. For this, the DMI specification requires that the Client supplies two DataEPRs and a set of requirements (mostly timing requirements). The requirements are not considered in this text. The two DataEPRs the Client must supply are the DataEPRs that describe the data in question at the Source and at the Sink, respectively. For brevity they are called "Source DataEPR" and "Sink DataEPR".

The DMI implementation, at the time of writing this text, consists of two entities. The first entity is the DTF (Data Transfer Factory), which is responsible for setting up a data transfer between the Source and the Sink. The Client described above, when using DMI to carry out a data transfer, in fact contacts the DTF and supplies the Source DataEPR and the Sink DataEPR to the DTF. The DTF sets up a data transfer and returns to the Client an EPR that addresses a DTI entity, for further communication.

The second entity in a DMI implementation is a DTI (Data Transfer Instance). In reality, there will exist many DTI instances, one for each data transfer that has been set up. Once a DTI is set up and the EPR is returned to the Client, the Client further communicates with the DTI.

The description above intentionally does not flesh out many of the details - it was intended as a recap of the entities and data structures. The big question here is who does what, at which point in time, and why. This is described below.

The first issue that comes up is the question "Who is the authority (or has the authority) to create or manipulate the Source DataEPR and Sink DataEPR?". To answer this question I will first give the answer to a second question, which is "What is the role of DMI in this picture?"

The answer to that second question already lies within the current specification of the architecture. In short, a DMI implementation is to:

- Select a data transfer protocol based on the information supplied in the Source DataEPR and the Sink DataEPR.
- Expect that the information in both DataEPRs is complete. Otherwise the DTF must throw a fault.
- Set up a DTI which in turn invokes the specific data transfer protocol that transfers the data.
- Throw a fault in the DTI if the underlying data transfer protocol fails for whatever reason.
- Attempt a cleanup of the environment as configured for the specific data transfer protocol, in case the transfer failed.

In one sentence, DMI is clearly a CONSUMER of the Source DataEPR and the Sink DataEPR. It acts upon the information given in the DataEPRs, nothing more.

This answer partly covers the answer to the first question of who actually is the authority to mint the DataEPRs. The complete answer to that depends on the viewpoint on DMI. In any case, however, DMI itself cannot be that authority. This leaves the Client, or the Source and Sink, in the picture. Considering the broader architecture of a distributed system, e.g. a Grid, the natural answer is clearly that it has to be the Source and the Sink. In fact, whether it is the Source and Sink themselves, or Web Services closely associated with the Source and the Sink, respectively, is considered a negligible implementation detail. Considering, however, only the DMI specification, the answer is pretty simple: I don't care, as DMI is just the consumer of those DataEPRs.

The next question that needs an answer is this: "What happens if the data transfer fails?". DMI, in the current specification, attempts only one data transfer. If this one fails, then DMI has the obligation to report back this failure. The current specification clearly lacks detail on how this fault should be reported. The only information a client can currently query is the status of the data transfer. There is no indication of the details of the failure. The specification hence needs amendments so that more information is provided, particularly the transfer protocol that has been used to attempt the data transfer. By providing such information DMI's obligations are fulfilled, and from the DMI point of view the Client can take further actions on whether to retry a data transfer or not.
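To make the proposed amendment a little more concrete, here is a minimal Python sketch of a transfer status that also reports the attempted protocol, together with one possible client-side retry decision. The names TransferState, TransferStatus and should_retry, and the policy itself, are hypothetical; the specification text discussed here defines none of them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class TransferState(Enum):
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass(frozen=True)
class TransferStatus:
    """Hypothetical DTI status: the state plus failure details."""
    state: TransferState
    attempted_protocol: Optional[str] = None
    detail: Optional[str] = None


def should_retry(status: TransferStatus,
                 source_protocols: Sequence[str],
                 sink_protocols: Sequence[str]) -> bool:
    """Assumed client policy: retry only if another common protocol remains."""
    if status.state is not TransferState.FAILED:
        return False
    remaining = [p for p in source_protocols
                 if p in sink_protocols and p != status.attempted_protocol]
    return bool(remaining)


status = TransferStatus(TransferState.FAILED, "gridftp", "connection timed out")
print(should_retry(status, ["gridftp", "https"], ["https", "gridftp"]))
print(should_retry(status, ["gridftp"], ["gridftp"]))
```

In this sketch the DTI only reports what happened; whether and how the DataEPRs are changed before a retry stays outside DMI.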
From the DMI viewpoint, it is of no concern whether the Client retries the data transfer with unchanged or changed DataEPRs. Even if the DataEPRs are changed, it is of no concern to DMI who changed them; it will simply act upon the information provided therein.

Finally, there is one thing we must be clear about: Whoever puts the information about available data transfer protocols (and their specific data descriptions) into the Source DataEPR and the Sink DataEPR, assumes, from the DMI point of view, full authority over the correctness of the information provided therein.

Cheers,
Michel

(In reply to Allen Luniewski's message of 8 Nov 2007, 19:19.)
-- ogsa-dmi-wg mailing list ogsa-dmi-wg@ogf.org http://www.ogf.org/mailman/listinfo/ogsa-dmi-wg