RE: [cddlm] Basic POSIX Component & template draft

Hi,

It sounds like you're having interesting discussions. What concerns me is how does this work relate to:

- the hierarchical naming level of the OGSA naming system?
- the work of the Grid File System working group?
- ByteIO?
- the OGSA Data Architecture?

Dave.

-----Original Message-----
From: owner-cddlm-wg@ggf.org [mailto:owner-cddlm-wg@ggf.org] On Behalf Of Hiro Kishimoto
Sent: 08 May 2006 14:56
To: Steve Loughran
Cc: OGSA WG; cddlm-wg@ggf.org
Subject: Re: [cddlm] Basic POSIX Component & template draft

Hi Steve,
Your file & filesystem components are very good. They are what I am looking for. One more question I have in mind is how to set the user-id, group-id, and permissions on deployed files.
Can you call in to the OGSA-EMS session on Wednesday? It starts at 5:45pm JST = 9:45am UK.
The dial-in numbers for this session:

US: +1 718 3541071 (New York) or +1 408 9616509 (San Jose)
UK: +44 (0)207 3655269 (London)
Japan: +81 (0)3 3570 8225 (Tokyo)
PIN: 4371991

Thanks,
----
Hiro Kishimoto

Steve Loughran wrote:

Hiro Kishimoto wrote:
Hi all,
Jun Tatemura and I took a homework assignment about "Basic POSIX Component" for BLAST application deployment.
Attached is my first-cut draft, which I hope to discuss at the GGF17 OGSA EMS session. Please have a look and give your feedback before or at GGF17. Your feedback is very important.
Thanks and see you in Tokyo!
This is really good, and I do agree, we do need a set of foundational things to deploy.
One thing I would argue, having learned the lessons of both Ant and SmartFrog, is that a consistent model of file system components comes first. That is, we need components to model
-files (with a liveness test that verifies the file is present)
-paths (an ordered list of files/directories)
-directories (with components that can create a dir on deployment, optionally to delete it and its contents on termination)
-wildcarded sets of files (**/*.csv)
-temporary files and directories
-cached downloads
-text files
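The lifecycle behaviour in that list (create on deployment, optionally delete on termination, plus a liveness test) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names DirectoryComponent and FileComponent and their methods are hypothetical, and are not taken from SmartFrog or CDL.

```python
import os
import shutil


class DirectoryComponent:
    """Hypothetical directory component: created on deploy,
    optionally removed (with its contents) on terminate."""

    def __init__(self, path, delete_on_terminate=False):
        self.path = path
        self.delete_on_terminate = delete_on_terminate
        self.absolute_path = None  # only known once deployed

    def deploy(self):
        # Create the directory if needed and publish its absolute path.
        os.makedirs(self.path, exist_ok=True)
        self.absolute_path = os.path.abspath(self.path)

    def is_live(self):
        # Liveness test: the directory must still exist.
        return self.absolute_path is not None and os.path.isdir(self.absolute_path)

    def terminate(self):
        if self.delete_on_terminate and self.absolute_path:
            shutil.rmtree(self.absolute_path, ignore_errors=True)


class FileComponent:
    """Hypothetical file component located inside a deployed directory."""

    def __init__(self, directory, filename, must_exist=False):
        self.directory = directory
        self.filename = filename
        self.must_exist = must_exist
        self.absolute_path = None

    def deploy(self):
        self.absolute_path = os.path.join(self.directory.absolute_path, self.filename)
        # Deployment fails if the file is required but absent.
        if self.must_exist and not self.is_live():
            raise FileNotFoundError(self.absolute_path)

    def is_live(self):
        # Liveness test: the file must be present.
        return self.absolute_path is not None and os.path.isfile(self.absolute_path)
```

A deployment workflow would call deploy() on each component in order, poll is_live() while running, and call terminate() in reverse order at the end.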
I'm attaching the PDF file describing the current SmartFrog set of components that do this. The underlying way that this is done is that all components that consider themselves to be sources of filenames/paths set the attribute absolutePath at runtime. They usually also export an RMI interface for RPC-style operations, alongside their RESTy state.
A standard set of filesystem components lets you include the operations to create and clean up directories, temporary files and other housekeeping operations into the workflows of a deployment.
Looking at the Posix proposal, I'd suggest starting with the reusable set of filesystem types, including a <fs:mount> component that could mount a local or remote filesystem, and which would then pass the local location to interested parties as the filesystem:absolutePath attribute.
You absolutely don't need to introduce a new shared datatype FileSystemName, because that implies some kind of alternate naming/locating scheme outside of what CDL and the runtime have built in. Dynamic CDL references are sufficient to crosslink information inside the graph of deployed things, and offer flexibility as well as ease of management (you see them when you walk the graph).
First, we'd need filesystem types:
<tmp cdl:extends="fs:TempFileSystem">
  <fs:Description> ... </fs:Description>
  <fs:DiskSpace>
    <fs:LowerBoundedRange>10737418240.0</fs:LowerBoundedRange>
  </fs:DiskSpace>
</tmp>

<home cdl:extends="fs:Directory">
  <fs:Description>Chris's home directory</fs:Description>
  <fs:dir>/home/csmith</fs:dir>
</home>
When deployed, both of these would add absolutePath as an attribute, so that tmp/absolutePath would resolve to, say, tmp/work01234/ and home:absolutePath to /home/csmith.
You'd then mount a blast dir, say on an NFS filesystem whose URL you provide:
<blastfs cdl:extends="fs:RemoteFileSystem">
  <fs:url>nfs://filestore/csmith/blast</fs:url>
  <fs:mountUser>csmith</fs:mountUser>
</blastfs>
The database file is under here, and you declare that it must exist. Deployment will fail if it is absent:
<db cdl:extends="fs:File">
  <fs:dir cdl:ref="/tmp/absolutePath" cdl:lazy="true" />
  <fs:filename>db/ncbiblast/est</fs:filename>
  <fs:filemustexist>true</fs:filemustexist>
</db>
To use it, just refer to its path
<blast:database cdl:ref="/db/absolutePath" cdl:lazy="true"/>
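The idea behind a lazy reference can be sketched outside CDL: the reference stores only a path into the graph of deployed things, and is looked up when the value is needed, after the target component has set its absolutePath. The Python below is a rough illustration, not how the CDL runtime is implemented; LazyRef, resolve, the dict-based graph and the example path value are all assumptions made for the sketch.

```python
def resolve(graph, ref):
    """Follow a slash-separated reference such as "/db/absolutePath"
    through a nested dict standing in for the deployment graph."""
    node = graph
    for part in ref.strip("/").split("/"):
        node = node[part]  # raises KeyError if the target is not set yet
    return node


class LazyRef:
    """Hypothetical lazy reference: holds the path, resolves on demand."""

    def __init__(self, ref):
        self.ref = ref

    def get(self, graph):
        return resolve(graph, self.ref)


# The graph before deployment: db has no absolutePath yet.
graph = {
    "db": {},
    "blast": {"database": LazyRef("/db/absolutePath")},
}

# At deploy time the db component publishes its path (illustrative value).
graph["db"]["absolutePath"] = "/data/blast/db/ncbiblast/est"

# Only now is the reference followed.
database_path = graph["blast"]["database"].get(graph)
print(database_path)
```

Resolving the reference before the db component has deployed raises KeyError in this sketch, which is the reason the binding has to be lazy.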
Paths could be represented by some list-like element, whose entries are evaluated at deploy time and then a platform-specific path created as a result:
<path cdl:extends="fs:Path">
  <pathentry cdl:ref="/blastfs/absolutePath" cdl:lazy="true" />
  <pathentry cdl:ref="/home/absolutePath" cdl:lazy="true"/>
</path>
On Windows /path/absolutePath would become something like "\\filestore\csmith\blast;c:\documents and settings\csmith" while on unix it could be "/nfs/filestore/csmith/blast:/home/csmith"
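Building the platform-specific string is just a join over the resolved entries with the right separator. A minimal Python sketch, where build_path is a hypothetical helper and the entries are the example values from above:

```python
import os


def build_path(entries, separator=os.pathsep):
    """Join already-resolved path entries into one search-path string.

    os.pathsep is ";" on Windows and ":" on Unix; pass a separator
    explicitly to build a path for a different target platform.
    """
    return separator.join(entries)


unix_path = build_path(["/nfs/filestore/csmith/blast", "/home/csmith"], ":")
windows_path = build_path(
    [r"\\filestore\csmith\blast", r"c:\documents and settings\csmith"], ";"
)
print(unix_path)
print(windows_path)
```

Passing the separator explicitly matters when the machine that evaluates the path is not the machine the job will run on.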
This information now goes down to the jsdl component
<jsdl-posix:Environment name="PATH" cdl:ref="path:absolutePath" cdl:lazy="true"/>
<jsdl-posix:Environment name="TMPDIR" cdl:ref="tmp/absolutePath" cdl:lazy="true"/>
-Steve

Dave Berry wrote:
Hi,
It sounds like you're having interesting discussions. What concerns me is how does this work relate to:
- the hierarchical naming level of the OGSA naming system?
CDL names are only scoped to the current deployment. If you want to talk to something outside it, you need help, which could include components that do directory lookups. Very high on my todo list is to do some bridging between JNDI and the HP SmartFrog/CDDLM runtime, so that deployed stuff can access the graph of deployed components as if it were a JNDI directory tree. That's just so that you can design libraries that are fully decoupled from any particular deployment framework: all they need to know is the JNDI URL to get their configuration. I think I will stop there; LDAP scares me.
- the work of the Grid File System working group?
- ByteIO?
- the OGSA Data Architecture?
Dave.
If what you are deploying is aware of stuff like OGSA-DAI, ByteIO, etc., you don't need all the file staging stuff, because it can talk directly to the services. What you will still need is consistent configuration of things across your deployment, so that different stuff running on different boxes all gets the same pointers to the same remote data sources to work with. Normal, non-lazy CDL bindings can share that kind of static information.

-steve