MPI in BalticGrid
A description of how to run MPI in BalticGrid.
Using MPI by Kestas: http://www.balticgrid.org/Members/kpaulikas/elen-glite-MPI.pdf
http://goc.grid.sinica.edu.tw/gocwiki/MPI_Support_with_Torque
It can be useful to install your own mpirun in $PATH ahead of the official mpirun.
This is because the gLite 3.0 WMS wraps MPI jobs so that they are run through mpirun.
ssh/scp can be used to emulate rsh/rcp from the master
node to the other nodes (in the same PBS job).
The local.sh script is used to set up the cluster environment with PBS support.
Current Status
Sites supporting MPI (JobType = "MPICH"):
| CE Name | NFS home | mpiexec | MPICH | Notes | Last checked |
|---|---|---|---|---|---|
| grid2.mif.vu.lt | No | OK | 1.2.7p1 | - | 20070422 |
| grid5.mif.vu.lt | No | OK | 1.2.7p1 | - | 20070422 |
| birzs.latnet.lv | Yes | OK | 1.2.6 | failing ops SFT | 20070308 |
| pupa.elen.ktu.lt | Yes | OK | 1.2.6 | - | 20070422 |
| atomas.itpa.lt | Yes | OK | 1.2.7p1 | mpicxx does not work, use mpicc for C++ | 2006 |
| kriit.eenet.ee | Yes | OK | 1.2.7p1 | - | 20070425 |
| grid.vtu.lt | No | OK | 1.2.6 | - | 20070422 |
| zeus02.cyf-kr.edu.pl | Yes | OK | 1.2.6 | - | 20070422 |
| ce01.grid.etf.rtu.lv | Yes | OK | 1.2.6 | - | 20070422 |
| vdupdc.vdu.lt | No | OK | ?? | - | 20070422 |
| grid.su.lt | Yes | OK | ?? | - | 20070422 |
| ce.egee.man.poznan.pl | ?? | ?? | ?? | incorrect output | 20070422 |
| grid6.mif.vu.lt | Yes | OK | ?? | - | 20070422 |
| ce.bg.ktu.lt | Yes | OK | ?? | - | 20070422 |
mpiexec
Currently all sites have mpiexec installed (see the mpiexec site). This program can be used to control all nodes allocated to the job. See the mpiexec
manual page for details.
Add the -kill switch to mpiexec invocations whenever possible, so that jobs exit if errors happen.
Start MPI application
To start an MPI application on all allocated nodes:
NOTE: The file mpiprogram must be available and executable on all nodes. If you compiled it yourself and you are working in a non-shared folder (see the table above), you must first copy mpiprogram to the other nodes.
mpiexec -kill mpiprogram
Run non-MPI system application
To run the non-MPI system application hostname on every node and display the output, use:
mpiexec -comm none -pernode -kill -nostdin hostname
Environment variables
If you set environment variables before invoking mpiexec, they are available on the other nodes:
$ export SOMEVAR=somevalue
$ mpiexec -comm none -kill echo \$SOMEVAR \$HOSTNAME
somevalue host1
somevalue host2
somevalue host3
somevalue host4
To copy file
To copy the file myfile to the other nodes:
NOTE: on grid2.mif.vu.lt you can use the mpicopy command instead.
mpiexec -comm none -pernode -kill -nolocal -allstdin cat \> myfile < myfile
To control nodes
To control individual nodes, you need to create a configuration file for mpiexec:
node1 node2 node3 : command1
node4 : command2
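The configuration format shown above (a list of nodes, a colon, then the command) can also be generated from a script. The following is only a minimal sketch: the helper names config_line and build_config are hypothetical and not part of mpiexec, and the node names and commands are the placeholders from the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of mpiexec): print one config line of the form
#   "node1 node2 ... : command"
# Usage: config_line "command" node1 [node2 ...]
config_line() {
    cmd=$1
    shift
    # "$*" joins the remaining arguments (the node names) with single spaces
    printf '%s : %s\n' "$*" "$cmd"
}

# Hypothetical example: two groups of nodes running different commands.
# Node names and commands are placeholders taken from the text above.
build_config() {
    config_line "command1" node1 node2 node3
    config_line "command2" node4
}
```

The output could then be written to a file, for example with build_config > mpiexec.conf, and passed to mpiexec using the -config switch together with -kill.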
Use of mpiexec config file
Sample sh script illustrating the use of an mpiexec config file:
#!/bin/sh -x
# MPIEXENAME is assumed to be set beforehand to the file name of the MPI executable
# NODES will contain the host names allocated to the job, without duplicates
NODES=`uniq < "$PBS_NODEFILE" | xargs`
# copynodes will contain the list of hosts that are missing the MPI executable locally
copynodes=""
# Loop to fill copynodes
for node in $NODES; do
    echo "Checking node $node ..."
    echo "$node : test -x $MPIEXENAME" > mpiexec.conf
    # Test the exit status of mpiexec directly; backticks would try to run its output
    if mpiexec -comm none -pernode -kill -nostdin -config mpiexec.conf; then
        echo "OK. Node \"$node\" already has file $MPIEXENAME"
    else
        echo "WARNING. Node \"$node\" has no copy of $MPIEXENAME"
        copynodes="$node $copynodes"
    fi
done
# If copynodes is empty, we have a shared $HOME and no copying is needed
if [ -n "$copynodes" ]; then
    # This copying is only needed on sites with a non-shared $HOME
    echo "Copying MPI executable to nodes $copynodes..."
    # Create the working directory on the remote nodes
    echo "$copynodes : mkdir -p $PWD" > mpiexec.conf
    mpiexec -verbose -pernode -comm none -kill -nostdin -config mpiexec.conf
    # Send the executable to the remote nodes through stdin
    echo "$copynodes : cat > $PWD/$MPIEXENAME" > mpiexec.conf
    mpiexec -verbose -pernode -comm none -kill -allstdin -config mpiexec.conf < "$MPIEXENAME"
    # Make the copied file executable
    echo "$copynodes : chmod a+x $PWD/$MPIEXENAME" > mpiexec.conf
    mpiexec -verbose -pernode -comm none -kill -nostdin -config mpiexec.conf
    echo "Copy done."
fi
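The script above builds its node list with uniq, which only removes duplicate lines that are adjacent. If a node file were ever to list the same host in non-consecutive lines (an assumption, this page does not say whether that happens), a sort-based variant would still give each host once. A minimal sketch, with the hypothetical helper name unique_nodes:

```shell
#!/bin/sh
# Hypothetical helper: print the distinct host names found in a PBS-style
# node file (one host name per line) on a single line, separated by spaces.
# Usage: unique_nodes /path/to/nodefile
unique_nodes() {
    # sort -u removes duplicates even when they are not adjacent;
    # xargs without a command joins the lines with spaces
    sort -u "$1" | xargs
}
```

In the script it could be used as NODES=`unique_nodes "$PBS_NODEFILE"`. Note that the hosts then come out in sorted order rather than in node file order.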
Notes
- Check the value of $TMPDIR. This folder may not be available on all nodes for MPICH jobs. Quick and dirty workaround: set TMPDIR to /tmp :) or unset it. This is not standard Torque behavior, but the result of some applied patch. (Kęstas 2006.09.19)
- MPI-2 implementations (MPICH2, Open MPI, LAM/MPI) have their own native mpiexec, which is a different tool. (Kęstas 2006.10.11)
- The TM interface used by mpiexec is supported by torque-1.2 and above. No errors are displayed if you try to use mpiexec with earlier Torque versions. (Kęstas 2006.10.11)
- The middleware tries to run the JDL Executable via mpirun. This works fine with mpirun from the MPICH package (only one copy of a non-MPI program/script is started), but mpirun from the Open MPI package starts N copies of a non-MPI executable. (Kęstas 2006-11-24)
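The TMPDIR workaround from the first note could be sketched as follows. The helper name fix_tmpdir is hypothetical, and the check only looks at the node where the script runs, so it is an approximation: a directory that exists locally may still be missing on other nodes.

```shell
#!/bin/sh
# Hypothetical helper: fall back to /tmp when TMPDIR is unset, empty,
# not a directory, or not writable on the local node.
fix_tmpdir() {
    if [ -z "${TMPDIR:-}" ] || [ ! -d "$TMPDIR" ] || [ ! -w "$TMPDIR" ]; then
        TMPDIR=/tmp
        export TMPDIR
    fi
}
```

Calling fix_tmpdir before the mpiexec invocation would export the corrected value in the job script's environment.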
MPI "testsuite" and results overview (2006.09.13)
Updated MPI "testsuite" (2006.09.22)
Implementation of MPI: https://twiki.cern.ch/twiki/bin/view/LCG/ImplementationOfMpi
gliteCE patches: https://twiki.cern.ch/twiki/bin/view/LCG/OurExperiments

