middleware:Glite/31/mpi
Glite/31/mpi
Prepare
Both the CEs (gLite) and the WNs must be configured in order to support and advertise MPI correctly (see Site configuration for MPI for details). The gLite YAIM module glite-yaim-mpi performs this configuration and should be run on both the CE and the WNs.
administrator's script: prepare.sh
#!/bin/bash
# prepare
exit 0
Install
Install the following package:
- glite-MPI_utils
administrator's script: install.sh
#!/bin/bash
# install MPI packages
# write the repository definition to a file under /etc/yum.repos.d
echo "[glite-MPI_utils]
name=glite 3.1 MPI
enabled=1
gpgcheck=0
baseurl=http://glitesoft.cern.ch/EGEE/gLite/R3.1/glite-MPI_utils/sl4/i386/" >> /etc/yum.repos.d/glite-MPI_utils.repo
yum install glite-MPI_utils
exit 0
Configure
- Add the following to the site-info.def of the CE and WNs. See YaimConfig for detailed information.
- Export the set of environment variables needed to avoid the message "INFO: No MPI flavours enabled."
- Execute the yaim command to configure.
WARNING: In /etc/hosts the WN must be listed with its full hostname (for example wn.fzk.de). Otherwise hostname -f does not return the full hostname and yaim aborts the configuration.
After the yaim configuration has finished, edit /etc/hosts again and restore the previous hostname of the WN. Otherwise the node is seen twice, as two different nodes wn and wn.fzk.de, when nodes are reserved for an MPI job.
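The hostname requirement in the warning above can be checked before yaim is started. The following is a minimal sketch, not part of yaim or gLite: `is_fqdn` is a hypothetical helper that only tests whether a name contains a dot with non-empty text on both sides, and it assumes that `hostname -f` is what the configuration relies on, as the warning suggests.

```shell
#!/bin/bash
# Sketch: check that a name looks fully qualified before running yaim.
# is_fqdn is a hypothetical helper, not part of yaim or gLite.
is_fqdn() {
    case "$1" in
        .*|*.|*..*) return 1 ;;   # leading dot, trailing dot, or empty label
        *.*)        return 0 ;;   # at least one dot between non-empty parts
        *)          return 1 ;;   # short name (or empty string)
    esac
}

# Report what hostname -f currently returns on this node.
name="$(hostname -f 2>/dev/null || true)"
if is_fqdn "$name"; then
    echo "OK: hostname -f returns '$name'"
else
    echo "WARNING: hostname -f returns '$name'; fix /etc/hosts before running yaim"
fi
```

The check is purely syntactic; it does not verify that the name resolves or matches the entry in /etc/hosts.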
administrator's script: configure.sh
#!/bin/bash
# configure mpi on gLite CE
# export variables
export MPI_OPENMPI_ENABLE="yes"
export MPI_OPENMPI_VERSION="1.2.9"
/opt/glite/yaim/bin/yaim -c -s site-info.def -n MPI_CE -n lcg-CE -n glite-MON -n glite-TORQUE_utils -n glite-BDII_site
# To allow Torque to allocate the correct number of CPUs requested by an MPI job,
# edit the job manager module:
vi /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm
# change:
#   $cluster = 0;
#   $cpu_per_node = 0;
# to:
#   $cluster = 1;
#   $cpu_per_node = 1;
# torque.cfg should contain the submit filter line shown below:
cat /var/spool/pbs/torque.cfg
#   SUBMITFILTER /var/spool/pbs/submit_filter.pl
# Change the Torque dgipar queue configuration with qmgr.
# previous setting: set queue dgipar resources_default.ncpus = 2
# changed to:
qmgr -c "set queue dgipar resources_default.ncpus = 1"
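The manual edit of lcgpbs.pm in configure.sh could also be applied without an interactive editor. The sketch below is one possible way to do that with sed. `enable_cluster_mode` is a hypothetical helper, and it assumes the file contains the two assignments written as quoted in the script (`$cluster = 0;` and `$cpu_per_node = 0;`); the demo runs on a scratch file, not on the real module.

```shell
#!/bin/bash
# Sketch: flip $cluster and $cpu_per_node from 0 to 1 in a Perl file.
# enable_cluster_mode is a hypothetical helper, not part of gLite or Globus.
enable_cluster_mode() {
    file="$1"
    # Keep a backup next to the file, then rewrite the file from the backup.
    cp -p "$file" "$file.bak" || return 1
    sed -e 's/\$cluster[[:space:]]*=[[:space:]]*0;/$cluster = 1;/' \
        -e 's/\$cpu_per_node[[:space:]]*=[[:space:]]*0;/$cpu_per_node = 1;/' \
        "$file.bak" > "$file"
}

# Dry run on a scratch file that mimics the two lines from configure.sh.
demo="$(mktemp)"
printf '%s\n' '$cluster = 0;' '$cpu_per_node = 0;' > "$demo"
enable_cluster_mode "$demo"
cat "$demo"
rm -f "$demo.bak"
```

On the CE the helper would be called with /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm as its argument; check the resulting file against the backup before relying on it.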
Initial test
- You can try submitting a job to your site using the instructions found via the page job submission
- You can do some basic tests by logging in on a WN as a pool user and running the following:
administrator's script: test.sh
#!/bin/bash
# initial test mpi
USER='griduser'
WN='worker node address'
# run the check on the WN itself rather than in the local shell
ssh "$USER@$WN" 'env | grep MPI_'
# Result should be:
# MPI_MPICC_OPTS=-m32
# MPI_SSH_HOST_BASED_AUTH=yes
# MPI_OPENMPI_PATH=/opt/openmpi/1.1
# MPI_LAM_VERSION=7.1.2
# MPI_MPICXX_OPTS=-m32
# MPI_LAM_PATH=/usr
# MPI_OPENMPI_VERSION=1.1
# MPI_MPIF77_OPTS=-m32
# MPI_MPICH_VERSION=1.2.7
# MPI_MPIEXEC_PATH=/opt/mpiexec-0.80
# MPI_MPICH2_PATH=/opt/mpich2-1.0.4
# MPI_MPICH2_VERSION=1.0.4
# I2G_MPI_START=/opt/i2g/bin/mpi-start
# MPI_MPICH_PATH=/opt/mpich-1.2.7p1
exit 0
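Comparing the environment on a WN with the expected list by eye is error-prone, so the comparison could be scripted. The following is a minimal sketch under stated assumptions: `missing_vars` is a hypothetical helper, the variable names come from the sample output in test.sh, and the set that a given site actually exports depends on which MPI flavours are enabled there.

```shell
#!/bin/bash
# Sketch: report which expected variables are absent from `env` output.
# missing_vars is a hypothetical helper, not part of gLite or mpi-start.
missing_vars() {
    env_output="$1"; shift
    for var in "$@"; do
        # A variable counts as present if some line starts with NAME=
        printf '%s\n' "$env_output" | grep -q "^${var}=" || echo "$var"
    done
}

# Demo on a shortened sample instead of a live worker node.
sample='MPI_OPENMPI_PATH=/opt/openmpi/1.1
MPI_OPENMPI_VERSION=1.1
I2G_MPI_START=/opt/i2g/bin/mpi-start'
missing_vars "$sample" MPI_OPENMPI_PATH MPI_OPENMPI_VERSION I2G_MPI_START MPI_MPICH_PATH
```

Against a real WN the first argument would be the output of `ssh "$USER@$WN" env`; an empty result means every listed variable was found.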