About obtaining the machines names in a parallel job

18 Mar 2010

      Dear All,

Discussions yesterday were great.
I understand why you don't want to put a mean to get the hostnamesfile for an 
MPI code, since it's should be transparently done in the configName (correct 
name if my rememberings are well).

But I thought of a different use case: a code is just launched on all 
machines. This code is a socket based one, thus it needs to know the other 
machine names to be able to run correctly.
Of course, this could be bypassed with the use of an external machine where a 
daemon runs, and where running codes can register -- I think of it like an 
omniNames running for example. Another solution is to encapsulate 
applications in an MPI code just to, maybe, have that information.

But don't you think that the cost is very big (if possible: a lot of policy is 
to not let run user code on the frontal, and a machine only knows that itself 
is taking part to the parallel run) compared to the possibility to at least 
having the possibility to copy the file containing the hostnames to all 
reserved nodes?

Bon courage for the discussions today!
Cheers.

.Yves.

-- 
Yves Caniou
Associate Professor at Université Lyon 1,
Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
  * in Information Technology Center, The University of Tokyo,
    2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
    tel: +81-3-5841-0540
  * in National Institute of Informatics
    2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
    tel: +81-3-4212-2412 
http://graal.ens-lyon.fr/~ycaniou/

Yves Caniou

Peter Tröger

Daniel Templeton

Yves Caniou

Peter Tröger

Mariusz Mamoński

tags

participants (4)