DMI Telcon - 24 October 2007
============================

Attendees:

  Steve Newhouse, Microsoft
  Mario Antonioletti, EPCC
  Allen Luniewski, IBM
  Ravi Madduri, ANL

Agenda:

  1. IPR Notice
  2. Previous Minutes and Action Review
  3. OGF21 Review
  4. What Next?
  5. AOB

+---

Actions:

  [Ravi] Resolve minor comments on the spec.
  [Ravi] Send a proposal to the mailing list on how to represent data aggregates (e.g. multiple files) with Data EPRs.
  [Ravi] Send a proposal to the mailing list for a multiple retries property.
  [Steve/Michel] Come up with a proposal for Data EPRs that can deal with the three scenarios proposed below.

No notes were taken at the OGF21 session.

Ravi does not have time to look at the document at the moment, so Steven will take the write token to extend the Data EPR section to deal with the scenarios below.

Ravi will then resolve some of his own minor outstanding comments in the spec. In addition, Ravi will propose how the spec can specify:

  - The number of retries to attempt for a transfer.
  - How Data EPRs can be used to represent data aggregates, in particular multiple files (though it should not be file centric).

OGF21 re-opened the discussion on the level of abstraction to be used to represent data and on how to resolve sources and sinks.

Ravi is concerned that the current representation of Data EPRs is too abstract and requires at least one additional operation to resolve the Data EPRs to files. This would add unnecessary overhead - a more explicit representation of the data to be transferred is desired. In addition, nothing out there currently implements the name resolution, so we would have to specify it and require data providers to implement it in order for DMI to be useful.

Allen is concerned that this might make files special in the DMI architecture, which would be highly undesirable.

Steven discussed the above with Michel in Seattle, and they thought that the present architecture could handle all three use cases without having to refactor the DMI document, mainly:

  1. The Data EPR could contain a URL which would represent the transfer protocol, for instance ftp, and the data, e.g. a file that is to be transferred.
  2. The Data EPR could have information specifying multiple supported protocols, which the DTF would have to negotiate with the data source/sink to find the best protocol for the data transfer.
  3. Have a logical Data EPR that would have to be resolved to obtain information about the data that is to be transferred and the supported protocols.

Steven/Michel to substantiate this proposal for the group to consider.

Other OGF points

Ravi talked to Erwin Laure at OGF - one of the things that would be desirable is for DMI to provide a reference model so that RFT, FTS, RDF, srmcopy and SDF look similar to each other in terms of their inputs, outputs and behaviour.

Michel & Steven had some more DMI discussion at the OGSA f2f - got some comments there - these will be in the OGSA f2f meeting minutes. Largely the comments were positive - there were no objections.

We need to get a period of stability in the spec in order to be able to produce some implementations and come together to resolve any issues before the spec goes to public comment.

Steven mentioned that some security stuff came up - the HPC Basic Profile is currently looking at activity credentials - passing credentials to do things like data movement on behalf of a user - and there is some recognition that the DMI use case for credentials is very similar to that of the HPC Profile. We can try to leverage any HPC Profile work on this. Not worth blocking on, but it might be worth reconsidering after the DMI implementations phase.

Mario will not be able to make the call next week.

+-----------------------------------------------------------------------+
|Mario Antonioletti:EPCC,JCMB,The King's Buildings,Edinburgh EH9 3JZ.   |
|Tel:0131 650 5141|mario@epcc.ed.ac.uk|http://www.epcc.ed.ac.uk/~mario/ |
+-----------------------------------------------------------------------+

Hi all,

I am very sorry - I had (and still have) my mind totally on our demonstration next week in Tokyo. This also means that I will not be able to make it to the conference call next week.

On the re-opening of the DataEPR at OGF21:

After the extensive talk with Steven (and a shorter one with Allen due to time constraints - sorry) I am very positive that we can find a solution that changes the specification very little.

To elaborate, let's recap the participating agents:

Source:
  Considered to be a Web Service of unspecified nature, but ultimately intended to emit an ordered sequence of bytes.

Sink:
  Ditto, except that the Sink is intended to receive a sequence of bytes.

Data Transfer Factory (DTF):
  A Web Service portType that architecturally creates Data Transfer Instances.

Data Transfer Instance (DTI):
  A Web Service portType that models data transfers on the Web Services layer (but is in fact the controller of an underlying data transfer protocol).

Data Transfer Protocol (DTP):
  A proprietary protocol that is understood by protocol-specific implementations both at the Source and at the Sink.

DataEPR:
  A construct which essentially is a profile on the WS-Addressing EndpointReferenceType XML complex type. This profile defines a complex type data structure that further describes details and requirements for particular Data Transfer Protocols.

  More specifically, this complex type is named "AvailableProtocols", and is a list of triples further defined as:

  a) <normative identifier for the data transfer protocol>, defined by OGSA-DMI,
  b) <data transfer specific URL>, defined by a suitable entity. Which entity that shall be is not defined by OGSA-DMI.
  c) <data transfer protocol specific security credentials>, which, when passed into the Data Transfer Factory, are considered to be the credentials that must be used when the associated Data Transfer Protocol is actually used to transfer the data from Source to Sink.
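
To make the shape of this structure concrete, here is a minimal Python sketch of a DataEPR carrying an AvailableProtocols list of triples. It is only an illustration under assumptions: the class and field names, the "gridftp" identifier, the example URLs and the credential string are all hypothetical, and the real construct is an XML profile on the WS-Addressing EndpointReferenceType, not a Python object.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ProtocolTriple:
    """One entry of AvailableProtocols (hypothetical field names)."""
    protocol_id: str                    # (a) normative DTP identifier
    url: Optional[str] = None           # (b) DTP-specific URL, may be absent
    credentials: Optional[str] = None   # (c) DTP-specific security credentials


@dataclass
class DataEPR:
    """Simplified stand-in for a profiled WS-Addressing endpoint reference."""
    address: str  # endpoint address of the Source or Sink Web Service
    available_protocols: List[ProtocolTriple] = field(default_factory=list)


# Hypothetical Source DataEPR offering a single protocol.
source_epr = DataEPR(
    address="https://source.example.org/services/data",
    available_protocols=[
        ProtocolTriple(
            protocol_id="gridftp",  # placeholder, not a real OGSA-DMI identifier
            url="gsiftp://source.example.org/data/input.dat",
            credentials="opaque-credential-token",
        )
    ],
)

# A purely logical DataEPR carries no AvailableProtocols at all.
logical_epr = DataEPR(address="https://sink.example.org/services/data")
```

Leaving `available_protocols` empty by default mirrors the idea that the construct may be absent from a DataEPR.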

DMI uses two DataEPRs, one in the source context and one in the sink context. That means that two sets of credentials, and also two URLs, may be present for each Data Transfer Protocol: one credential set and one URL for the Source, and one credential set and one URL for the Sink, should the associated Data Transfer Protocol be selected to undertake the data transfer.

To allow support for all the different scenarios mentioned below, the specification should not mandate that AvailableProtocols constructs must be present in the DataEPRs. Instead, the OGSA-DMI specification should RECOMMEND, or make it OPTIONAL, that AvailableProtocols constructs are present. Furthermore, OGSA-DMI should explicitly allow the Data Transfer Factory to solicit any information found in the AvailableProtocols section, if present.

This allows for the following scenarios, which are actually Profiles on OGSA-DMI. The reason to call them Profiles is that I do not see us agreeing on one particular, specific model in practical Grids. However, for interoperability sessions we will have to agree on at least one. This will be difficult, but it is possible, as the BES/HPCP interop session at SC'06 demonstrated.

Now the profiles are as follows, named after the people I see preferring them:

The Ravi Profile
================

The Ravi Profile specifies that:

- the DataEPR for the Source MUST contain an AvailableProtocols element in its metadata section, with exactly one Data Transfer Protocol triplet (e.g. for GridFTP).
- the DataEPR for the Sink MUST contain an AvailableProtocols element in its metadata section, with exactly one Data Transfer Protocol triplet (e.g. for GridFTP).
- the Data Transfer Factory MUST use the provided values to instantiate a data transfer and to instantiate a Data Transfer Instance.

This way, the Ravi Profile uses the DataEPR merely as a degenerate container for some specific URLs. No particular logic is necessary, and the DTF can instantiate the DTI straight away.

The Ravi Profile does not need to specify anything further because, as far as this profile is concerned, all necessary data will ultimately be provided by the client to the DTF. Where the client gets this data from is irrelevant, as the DTF MUST NOT solicit this data and instead relies on its correctness.

The Steven Profile
==================

The Steven Profile is similar to the Ravi Profile, except that it allows for more than one Data Transfer Protocol triplet in the DataEPR metadata sections.

In this profile, however, the DTF must perform a simple matching of the Data Transfer Protocol ids provided in the AvailableProtocols elements in the DataEPRs for the Source and the Sink. If the DTF finds one or more matches, it will then select one Data Transfer Protocol at its discretion and instantiate the DTI accordingly, using the information found in the relevant protocol triples in the DataEPRs.

The Michel Profile
==================

The Michel Profile goes beyond the Steven Profile in that:

- A DataEPR, whether for the Source or the Sink, MUST contain an AvailableProtocols section, and
- Each AvailableProtocols section MUST contain at least one Protocol triplet, and
- Each Protocol triplet present MUST contain a DTP identifier and MUST contain credentials that will be used if the referenced DTP is selected to conduct the data transfer, and
- Each Protocol triplet present MAY contain a DTP specific URL.

The Michel Profile leaves it to the implementation of the DTF whether the DTF solicits the DTP specific URL (and the availability of DTPs, by the way!) by using a standardised mechanism, a proprietary mechanism, or no mechanism at all (i.e. using heuristics).

The Michel Profile uses the AvailableProtocols data structure merely as a convenient container for the security credentials that must be used when the associated DTP has been selected to conduct the data transfer.
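
The matching step of the Steven Profile and the cardinality rules of the Ravi and Michel Profiles described above can be sketched roughly as follows. This is a hedged illustration only: triples are modelled as plain (protocol id, URL, credentials) tuples, the function names are hypothetical, the protocol identifiers and URLs are made up, and picking the first match is just one possible reading of "at its discretion".

```python
from typing import Dict, List, Optional, Tuple

# (protocol id, DTP-specific URL, DTP-specific credentials)
Triple = Tuple[str, Optional[str], Optional[str]]


def match_protocols(source: List[Triple],
                    sink: List[Triple]) -> List[Tuple[Triple, Triple]]:
    """Pair source and sink triples that share a protocol id, in source order."""
    sink_by_id: Dict[str, Triple] = {}
    for triple in sink:
        # Keep the first sink triple seen for each protocol id.
        sink_by_id.setdefault(triple[0], triple)
    return [(s, sink_by_id[s[0]]) for s in source if s[0] in sink_by_id]


def conforms_to_ravi(protocols: List[Triple]) -> bool:
    """Ravi Profile: exactly one Data Transfer Protocol triplet."""
    return len(protocols) == 1


def conforms_to_michel(protocols: List[Triple]) -> bool:
    """Michel Profile: at least one triplet, each with an id and credentials;
    the URL is optional."""
    return len(protocols) >= 1 and all(
        bool(pid) and cred is not None for pid, _url, cred in protocols
    )


# Hypothetical AvailableProtocols contents for a Source and a Sink.
source = [
    ("gridftp", "gsiftp://source.example.org/data/in", "cred-A"),
    ("http", "http://source.example.org/data/in", "cred-B"),
]
sink = [
    ("http", "http://sink.example.org/data/out", "cred-C"),
    ("scp", "scp://sink.example.org/data/out", "cred-D"),
]

matches = match_protocols(source, sink)
# Assumption: the DTF simply takes the first match, if there is one.
chosen = matches[0] if matches else None
```

With these inputs only "http" is offered on both sides, so `chosen` holds the two "http" triples whose URLs and credentials a DTF would use to instantiate the DTI.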

The selection of the DTP may or may not depend on how the DTI is implemented, for example as a generic implementation for all sorts of supported DTPs, or as specific implementations each for one DTP.

The Allen Profile
=================

The Allen Profile is very similar to the Michel Profile, adding that

- the DTF MUST solicit the DTP specific URLs using a standardised WS portType at the Source and at the Sink.

The Allen Profile clearly identifies the WS portType that MUST be used to solicit which data transfer protocols are available, and which URLs to use when conducting the data transfer using a selected DTP.

There is, however, one issue where I probably misunderstood Allen: security. As far as I understood, the DataEPRs must not (or should not?) contain AvailableProtocols sections. If this is true, then this leaves open the question of how the necessary security credentials are passed into the DTF. It might be that Allen thinks that this question is orthogonal to what DMI should standardise on, and would leave it to the implementations of the DTF and the implementations of any prospective clients of the DTF (and the DTI?).

-----

Folks, does this longish gibberish make sense to you? I very much hope so, as I see this at the moment as the only solution for us all to agree on one concise OGSA-DMI Functional Specification.

Cheers,
Michel

On 24 Oct 2007, at 18:21, Mario Antonioletti wrote:
> DMI Telcon - 24 October 2007
> [...]

--
ogsa-dmi-wg mailing list
ogsa-dmi-wg@ogf.org
http://www.ogf.org/mailman/listinfo/ogsa-dmi-wg