Jini+DRMAA integration plus Condor problem
To the members of the DRMAA list, for information.

We have developed a Jini wrapper service that integrates legacy batch runtime environments into Jini service communities (specifically into our JGrid service grid; more info at http://pds.irt.vein.hu/jgrid). Our first implementation of this service uses the Sun Grid Engine and its Java DRMAA API as the backend. The service supports input and output file staging, so remote users can upload files, execute commands or programs, and save the results more or less transparently. The service is still under testing, but it basically works very nicely.

We are also creating a version with a backend interface to the Condor system. This is where we have a problem, and perhaps some of you can help us out. Since Condor only provides a C implementation of the DRMAA spec, we are creating a new Java/JNI JDRMAA version based on the SGE one. The first real problem we have is that Condor does not support the mandatory Working Directory attribute, and at the moment I cannot see how one can programmatically submit jobs to Condor without it. Where should users put their executables? Will they all go to a system-wide submit directory? There may be a simple answer to this, so excuse me if I am just missing some information or don't understand how Condor uses the file system for this.

Any help is appreciated.

Zoltan

Zoltan Juhasz
Dept of Information Systems
University of Veszprem, Hungary
Hello,
> We are also creating a version with a backend interface to the Condor
> system. This is where we have a problem and perhaps some of you can help us
> out.

You may be interested in a DRMAA Condor demo I gave at GGF12. One of the outcomes is a configure script which supports both SGE and Condor: http://www.dcl.hpi.uni-potsdam.de/research/grid/drmaa/

> Since Condor only provides a C implementation of the DRMAA spec we are
> creating a new Java/JNI JDRMAA version based on the SGE one. The first real
> problem we have is that Condor does not support the mandatory Working
> Directory attribute and at the moment I cannot see how one can
> programmatically submit jobs to Condor without this. Where should users put
> their executables? Will all go to a system-wide submit directory? There may
> be a simple answer to this, so excuse me if I just miss some information or
> don't understand how Condor uses the file system for this.
Condor distinguishes between installations with and without a shared file system. This is configured with the FILESYSTEM_DOMAIN configuration parameter.

In the first case (shared file system) you need to give an absolute path for the executable in the job template. Condor will use exactly this path to find the executable, which can sometimes lead to problems with automounter paths.

In the second case (no shared file system), Condor will transfer your executable to a spool directory on the destination machine. It should therefore be enough to use the executable name only; all file accesses are relative to the spool directory.

The latest Condor version, 6.7.2, lacks some DRMAA functionality; you can find more information in the README file. Don't hesitate to contact me directly if you have any further questions.

Regards,
Peter.
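The two cases described above can be illustrated with a minimal Java sketch of how a wrapper might choose the value for drmaa_remote_command. This is only an assumption-laden illustration, not part of Condor or of any DRMAA binding: the class name ExecutablePathSketch, the helper remoteCommand, the sharedFileSystem flag and the example job directory are all hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Condor or any DRMAA library.
final class ExecutablePathSketch {

    /**
     * Chooses the string to store in drmaa_remote_command.
     *
     * @param sharedFileSystem true if submit and execute machines share a file system
     * @param submitDir        directory that holds the staged executable
     * @param executableName   name (or relative path) of the executable
     */
    static String remoteCommand(boolean sharedFileSystem, Path submitDir, String executableName) {
        if (sharedFileSystem) {
            // Shared file system: the path is used as given, so make it absolute.
            return submitDir.resolve(executableName).toAbsolutePath().normalize().toString();
        }
        // No shared file system: the executable is transferred to a spool
        // directory, so the bare file name is enough.
        return Paths.get(executableName).getFileName().toString();
    }

    public static void main(String[] args) {
        // Example directory is an assumption made for this sketch only.
        Path jobDir = Paths.get("/home/jgrid/BatchService/jobs/example");
        System.out.println(remoteCommand(true, jobDir, "hello"));
        System.out.println(remoteCommand(false, jobDir, "hello"));
    }
}
```

Whether the installation really has a shared file system would have to come from the site configuration; the sketch simply takes it as a flag.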
Peter,
> Condor distinguishes between installations with and without a shared file
> system. This is configured with the FILESYSTEM_DOMAIN configuration
> parameter. In the first case you need to give an absolute path for the
> executable in the job template. Condor will use exactly this path in order
> to find the executable, which may lead sometimes to problems with
> automounter paths. In the second case, Condor will transfer your executable
> to a spool directory on the destination machine. Therefore it should be
> enough to use the executable name only, all file accesses are relative to
> the spool directory. The latest Condor 6.7.2 version lacks some DRMAA
> functionality, you can find more information in the README file. Don't
> hesitate to contact me directly if you have any further questions.
We have solved this problem using the native specification attribute -- setting it transparently so the user does not have to bother with it -- but there is a new problem with Condor. When we submit anything, we get a DRMAA exception, and the automatically generated submit file does not contain the name of the executable. See the details below.

This is the submit file the Condor C DRMAA implementation creates (note the lack of the executable):
^^^^^^^^

#
# Condor Submit file
# Automatically generated by DRMAA library on Fri Nov 26 09:34:28 2004
#
Log = /tmp/drmaa/logs/jgrid01.$(Cluster).$(Process).log
Universe = standard
initialdir=/home/jgrid/BatchService/jobs/95f9db31-5792-458a-9583-42a6fe9dcdf7
Queue 1

The error we received:
^^^^^^^^

ERROR: Used queue command without specifying an executable
==========================================================
org.ggf.drmaa.DRMAAException: Submitting job(s)
    at hu.vein.irt.condor.drmaa.CondorSession.nativeRunJob(Native Method)
    at hu.vein.irt.condor.drmaa.CondorSession.runJob(CondorSession.java:160)
    at hu.vein.irt.jgrid.batchservice.DrmaaBatchSubmitter.submit(DrmaaBatchSubmitter.java:47)
    at hu.vein.irt.jgrid.batchservice.BatchServiceBackend$DownloadPool.run(BatchServiceBackend.java:957)

Printout of our job template in the Java DRMAA wrapper -- JNI calls (drmaa_set_attribute):
^^^^^^^^^^^^^^^^^^^^^^^^

Note that drmaa_remote_command is 'hello', the name of our test program. This is lost by the time we get to the submit file.

JNI: name=drmaa_remote_command, value=hello
JNI: name=drmaa_wd, value=/home/jgrid/BatchService/jobs/95f9db31-5792-458a-9583-42a6fe9dcdf7

This sets 'drmaa_native_specification' instead of 'drmaa_wd':

JNI: name=drmaa_native_specification, value=initialdir=/home/jgrid/BatchService/jobs/95f9db31-5792-458a-9583-42a6fe9dcdf7
JNI: name=drmaa_block_email, value=0
JNI: name=drmaa_output_path, value=:/home/jgrid/BatchService/jobs/95f9db31-5792-458a-9583-42a6fe9dcdf7/default.out
JNI: name=drmaa_error_path, value=:/home/jgrid/BatchService/jobs/95f9db31-5792-458a-9583-42a6fe9dcdf7/default.err
JNI: name=drmaa_join_files, value=n

Can you suggest why this might happen?

Thanks a lot.

Zoltan
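For comparison, here is a rough Java sketch of the mapping one would expect from the job template above to a submit file, with a check that fails early when drmaa_remote_command is missing. It is not the code of the Condor DRMAA library, whose internals are not shown in this thread; the class SubmitFileSketch and its methods are hypothetical, and the Log and Universe lines are left out for brevity. The sketch relies on the DRMAA path format [hostname]:file_path to strip the leading colon from the output and error paths.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the Condor DRMAA implementation.
final class SubmitFileSketch {

    /** Renders a minimal submit description from DRMAA job template attributes. */
    static String render(Map<String, String> template) {
        String command = template.get("drmaa_remote_command");
        if (command == null || command.trim().isEmpty()) {
            // Fail before submission instead of producing a file without an executable.
            throw new IllegalStateException("drmaa_remote_command is not set");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("Executable = ").append(command).append('\n');
        appendPath(sb, "Output", template.get("drmaa_output_path"));
        appendPath(sb, "Error", template.get("drmaa_error_path"));
        String nativeSpec = template.get("drmaa_native_specification");
        if (nativeSpec != null) {
            // The native specification is passed through unchanged.
            sb.append(nativeSpec).append('\n');
        }
        sb.append("Queue 1\n");
        return sb.toString();
    }

    /** Appends "key = path", dropping the optional "hostname:" prefix of a DRMAA path. */
    private static void appendPath(StringBuilder sb, String key, String drmaaPath) {
        if (drmaaPath == null) {
            return;
        }
        int colon = drmaaPath.indexOf(':');
        String path = colon >= 0 ? drmaaPath.substring(colon + 1) : drmaaPath;
        sb.append(key).append(" = ").append(path).append('\n');
    }

    public static void main(String[] args) {
        String dir = "/home/jgrid/BatchService/jobs/95f9db31-5792-458a-9583-42a6fe9dcdf7";
        Map<String, String> template = new LinkedHashMap<>();
        template.put("drmaa_remote_command", "hello");
        template.put("drmaa_native_specification", "initialdir=" + dir);
        template.put("drmaa_output_path", ":" + dir + "/default.out");
        template.put("drmaa_error_path", ":" + dir + "/default.err");
        System.out.print(render(template));
    }
}
```

A check of this kind in the Java wrapper would at least show whether the attribute is still present just before the native call, which narrows down where the value is lost.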
Participants (2): Peter Troeger, Zoltan Juhasz