some questions on "DRMAA JOB TEMPLATE ATTRIBUTES"
Hi all,

I wrote a small DRMAA C program to test how to submit a parallel job through SGE 6. The program submits an SGE 6 script file, "test_submit.sh", located in "/home/group1/test". I did the following:

1. set "drmaa_wd" to "/home/group1/test"
2. set "drmaa_remote_command" to "test_submit.sh"

The relevant piece of source code looked like this:

    .......
    drmaa_set_attribute (jt, DRMAA_JOB_NAME, "test_submit", error, DRMAA_ERROR_STRING_BUFFER);
    drmaa_set_attribute (jt, DRMAA_WD, "/home/group1/test", error, DRMAA_ERROR_STRING_BUFFER);
    drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND, "test_submit.sh", error, DRMAA_ERROR_STRING_BUFFER);
    ...........

It did not work, and SGE 6 reported this error:

    12/26/2006 19:23:04|qmaster|einstein|W|job 43.1 failed on host compute2 general searching requested shell because: 12/26/2006 21:25:38 [501:10278]: execvp(test_submit.sh, "test_submit.sh") failed: No such file or directory

But after I changed the value of "DRMAA_REMOTE_COMMAND" to the absolute path,

    drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND, "/home/group1/test/test_submit.sh", error, DRMAA_ERROR_STRING_BUFFER);

it worked well.

Another question: when I write an SGE script file for a parallel job, the script file looks like this:

    ........
    /usr/local/mpich-ifort/bin/mpirun -np $NSLOTS /home/group1/test/test_arg 4

SGE 6 gives $NSLOTS a suitable value when it submits the job. If I want to do this purely through the DRMAA API, how can I implement it?

Happy New Year to all members.

Peter Zhu
Dec 26, 2006
Happy New Year everyone :)

Peter Zhu wrote:
Another question, While I write a SGE script file for parallel job, the script file looks like this,

    ........
    /usr/local/mpich-ifort/bin/mpirun -np $NSLOTS /home/group1/test/test_arg 4

SGE6 will give $NSLOTS a suitable value while it submits the job. If I want to do this only through DRMAA API, how to implement it.
I have been thinking about the same problem. AFAIK this is not covered by the standard at present; SGE_TASK_ID is another such example. However, I think this could easily be implemented by the various DRM vendors through a small utility/script, e.g.

    drmaa -t   # return task id
    drmaa -n   # return number of procs/slots

etc. Then, in your script you would have

    NP=`drmaa -n`
    mpirun -np $NP ...

Cheers
f.
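A helper of the kind suggested above could be very small. The following is only a sketch of what a `drmaa -n` style utility might do on SGE, assuming the job environment exports NSLOTS as described in the original question. The function names `parse_slots` and `slots_from_env` and the fallback value are hypothetical; no such utility is defined by DRMAA.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parses a slot count from a string such as the value of NSLOTS.
 * Returns fallback when the value is missing or not a positive integer.
 * Hypothetical helper, not part of DRMAA or SGE. */
long parse_slots(const char *value, long fallback)
{
    char *end = NULL;
    long n;

    if (value == NULL || *value == '\0')
        return fallback;

    errno = 0;
    n = strtol(value, &end, 10);
    if (errno != 0 || *end != '\0' || n <= 0)
        return fallback;
    return n;
}

/* Reads NSLOTS from the environment of the running job.
 * Assumption: the DRM system (SGE here) sets NSLOTS for the job. */
long slots_from_env(long fallback)
{
    assert(fallback > 0);
    return parse_slots(getenv("NSLOTS"), fallback);
}
```

A wrapper program could print the result of `slots_from_env(1)` so that a job script can capture it, in the way the `NP=...` line above does.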
Hello, happy new year !
The source code piece looked like this,

    .......
    drmaa_set_attribute (jt, DRMAA_JOB_NAME, "test_submit", error, DRMAA_ERROR_STRING_BUFFER);
    drmaa_set_attribute (jt, DRMAA_WD, "/home/group1/test", error, DRMAA_ERROR_STRING_BUFFER);
    drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND, "test_submit.sh", error, DRMAA_ERROR_STRING_BUFFER);
    ...........

it did not work, and sge6 gave error information,

    12/26/2006 19:23:04|qmaster|einstein|W|job 43.1 failed on host compute2 general searching requested shell because: 12/26/2006 21:25:38 [501:10278]: execvp(test_submit.sh, "test_submit.sh") failed: No such file or directory

but after I changed the value of "DRMAA_REMOTE_COMMAND" to absolutely full path ---

    drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND, "/home/group1/test/test_submit.sh", error, DRMAA_ERROR_STRING_BUFFER);

it worked well.
The DRMAA spec looks a little bit unclear here, but SGE conforms to traditional Unix thinking. Working directory means the directory "where the job is executed", which is primarily an indication of the location of input and output files. The exec call in Unix searches the PATH directories, which may not include the current directory ("."). Therefore, exec cannot locate the binary on your execution host without the full path. It is the same reason why you must type "./test_submit.sh" instead of "test_submit.sh" in your shell, even if you are in the right directory. This is for security - ask your local administrator ;-) ...

The funny thing is that the same program works with Condor, even if "." is not in the PATH. Condor seems to search the working directory automatically, maybe for compatibility between Windows and Unix submission files. DRMAA has no real chance to do anything about that, since the library implementation cannot influence the execution host mechanisms.
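The lookup rule described above can be sketched in plain C. This is only an illustration, assuming POSIX execvp semantics: a command name without a slash is searched for in PATH, and the working directory is not consulted unless PATH lists it. The helper names `needs_path_search` and `join_wd_command` are hypothetical and not part of DRMAA or SGE.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if execvp() would look the command up in PATH,
 * i.e. when the name contains no '/'. Hypothetical helper. */
int needs_path_search(const char *cmd)
{
    assert(cmd != NULL);
    return strchr(cmd, '/') == NULL;
}

/* Builds "<wd>/<cmd>" in out so the command no longer depends on PATH.
 * Returns 0 on success, -1 if the result does not fit in out. */
int join_wd_command(const char *wd, const char *cmd, char *out, size_t size)
{
    int n;

    assert(wd != NULL && cmd != NULL && out != NULL);
    n = snprintf(out, size, "%s/%s", wd, cmd);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With the values from the original post, `join_wd_command("/home/group1/test", "test_submit.sh", ...)` yields the absolute path that was reported to work as the DRMAA_REMOTE_COMMAND value.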
Another question, While I write a SGE script file for parallel job, the script file looks like this,

    ........
    /usr/local/mpich-ifort/bin/mpirun -np $NSLOTS /home/group1/test/test_arg 4

SGE6 will give $NSLOTS a suitable value while it submits the job. If I want to do this only through DRMAA API, how to implement it.
This relates to DRM monitoring issues, and is not covered in DRMAA 1.0 so far. We know that users want more placeholders in job templates, so there is a discussion wiki page about possible new parameters:

http://www.drmaa.org/wiki/index.php/DrmaaJobTemplatePlaceholders

Please feel free to add the job template parameter you need there. It would be great if you also added the SGE-specific implementation of your suggestion, so that we can check whether other DRM systems are able to handle it too.

Thanks,
Peter.
Peter Troeger wrote:
The DRMAA spec looks a little bit unclear here, but SGE conforms to traditional Unix thinking. Working directory means the directory "where the job is executed", which is primarily an indication of the location of input and output files. The exec call in Unix searches the PATH directories, which may not include the current directory ("."). Therefore, exec cannot locate the binary on your execution host without the full path. It is the same reason why you must type "./test_submit.sh" instead of "test_submit.sh" in your shell, even if you are in the right directory. This is for security - ask your local administrator ;-) ...
I think this case is the reason for the $drmaa_wd$ placeholder. If you make your executable path "$drmaa_wd$/test_submit.sh", it should work.

Daniel
-- drmaa-wg mailing list drmaa-wg@ogf.org http://www.ogf.org/mailman/listinfo/drmaa-wg
Daniel Templeton wrote:
I think this case is the reason for the $drmaa_wd$ placeholder. If you make your executable path "$drmaa_wd$/test_submit.sh", it should work.
Daniel
Oops! That should be $drmaa_wd_ph$.

Daniel
Gah! I blame it on the flu. Ignore both of my previous replies. The $drmaa_wd_ph$ string only works in the input, error, and output stream attributes. It cannot be used with the remote command attribute.

Now that the holidays are over, I'll have a look into your issue. If you had asked me what the SGE behavior should have been, I would have said exactly what you expected as well. I'm a bit surprised that it doesn't work.

Daniel

Daniel Templeton wrote:
I think this case is the reason for the $drmaa_wd$ placeholder. If you make your executable path "$drmaa_wd$/test_submit.sh", it should work.
Daniel
Oops! That should be $drmaa_wd_ph$.
Daniel
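To make the placeholder behaviour discussed in this thread concrete, here is a rough sketch of how a leading $drmaa_wd_ph$ in an input, output, or error path could be expanded against the working directory. It is an illustration only: `expand_wd_placeholder` is a hypothetical name, not a DRMAA function, the real substitution happens inside the DRMAA library or the DRM system, and the sketch ignores the optional host prefix that DRMAA stream paths may carry.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WD_PLACEHOLDER "$drmaa_wd_ph$"

/* Expands a leading $drmaa_wd_ph$ in path to wd; any other path is
 * copied unchanged. Returns 0 on success, -1 if out is too small.
 * Hypothetical helper that only mimics the substitution. */
int expand_wd_placeholder(const char *path, const char *wd,
                          char *out, size_t size)
{
    size_t len = strlen(WD_PLACEHOLDER);
    int n;

    assert(path != NULL && wd != NULL && out != NULL);
    if (strncmp(path, WD_PLACEHOLDER, len) == 0)
        n = snprintf(out, size, "%s%s", wd, path + len);
    else
        n = snprintf(out, size, "%s", path);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Since the placeholder is not honoured in the remote command attribute, a command path still has to be given in full, as in the working call from the original post.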
participants (4)
- Daniel Templeton
- Fred L Youhanaie
- Peter Troeger
- Peter Zhu