cluster:Worker/2

From Dgiref

Introduction

Cluster/ grid middleware

Worker Node - a collection of servers to execute submitted batch jobs.
Batch system

Worker Nodes are the workhorses of the system. Their role is to run the batch jobs dispatched by the batch server. Jobs usually spend most of their life cycle executing. While a job is running, its status can be queried with qstat. When a job has completed, two files are created by default: a stdout file that stores the output and a stderr file that stores the errors.


Package:    worker node release 2012.1
 os:             Scientific Linux version 5.6 64 bit
 server:        wn01 ... wn10
 manuals:   worker nodes


Archive links
Information links
Download links


Files links


MPI

The Message Passing Interface (MPI) is a standard developed by the Message Passing Interface Forum (MPIF).

The standard includes:

  • Point-to-point communication
  • Collective operations
  • Process groups
  • Communication contexts
  • Process topologies
  • Bindings for Fortran 77 and C
  • Environmental management and inquiry
  • Profiling interface

There are different implementations of the MPI standard. The D-Grid reference installation uses the Open MPI project (http://www.open-mpi.org/), an MPI-2 implementation. To use MPI in the Grid infrastructure, MPI-Start is needed. MPI-Start is a set of shell scripts that closes the gap between the workload management system of a Grid infrastructure and the configuration of the nodes on which MPI applications are run. To use MPI, the following hosts should be configured within the cluster:


Links:


Please open an NGI-DE ticket if you experience any installation or configuration problems.

Worker node

Prepare

Operating system
Scientific Linux version 5.6 64 bit

Optimizing the configuration:


Use a minimal operating system installation without a firewall. To verify that a package is installed, use the command

  • rpm -qa | grep package_name

Install the following additional packages:

  • yum -y install wget yum rpm make gcc gcc-c++ tar sed zlib openssl
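The two commands above can be combined into a small check that reports which of the required packages are still missing. This is only a sketch: the function name missing_packages and the QUERY override are hypothetical helpers introduced for illustration, and the check assumes that rpm -q returns a non-zero status for a package that is not installed.

```shell
#!/bin/sh
# Required packages, copied from the yum command on this page.
REQUIRED="wget yum rpm make gcc gcc-c++ tar sed zlib openssl"

# QUERY is the command used to test for a single package. The default is
# "rpm -q"; overriding it is a hypothetical hook for dry runs.
QUERY="${QUERY:-rpm -q}"

# missing_packages PKG...: print every package the query does not find.
missing_packages() {
    for pkg in "$@"; do
        # $QUERY is left unquoted on purpose so "rpm -q" splits into two words.
        $QUERY "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}
```

A possible use is `missing_packages $REQUIRED`, followed by `yum -y install` for whatever names are printed.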

After the installation is complete, turn off any unnecessary services (like gpm, sendmail, cups, haldaemon, messagebus, pcmcia, anacron, atd) with the following command:

  • chkconfig <SERVICE> off
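Instead of calling chkconfig once per service by hand, the services named above can be processed in a loop. The sketch below assumes a DRY_RUN switch and a function name disable_services, both of which are hypothetical and not part of the reference installation. With DRY_RUN=1 (the default here) the commands are only printed so they can be reviewed first.

```shell
#!/bin/sh
# Services listed on this page as unnecessary for a Worker Node.
SERVICES="gpm sendmail cups haldaemon messagebus pcmcia anacron atd"

# DRY_RUN=1 only prints the commands; DRY_RUN=0 runs them (needs root).
DRY_RUN="${DRY_RUN:-1}"

# disable_services SVC...: turn off each service at boot via chkconfig.
disable_services() {
    for svc in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "chkconfig $svc off"
        else
            chkconfig "$svc" off
        fi
    done
}
```

Running `disable_services $SERVICES` prints one chkconfig line per service. Setting DRY_RUN=0 beforehand applies the changes.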

Configure the following settings for the server:

Firewall configuration

Allowing incoming connections directed to the WNs is optional, and Resource Providers can freely decide whether to permit them on a voluntary basis. However, when such inbound connections are blocked, data transfers using GridFTP are forced to work in "single-stream" mode and their performance may degrade accordingly (see how to open a port in the firewall).

Service    Incoming ports (TCP)    Change to default configuration
GridFTP    20000-25000             Yes
Note: The WN should have access to the external network.
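If a site does run a firewall on the WNs and decides to allow the optional inbound GridFTP connections, the port range from the table could be opened with an iptables rule. The following is a sketch under the assumption that iptables is the firewall in use and that the default INPUT chain applies. The function name gridftp_rule is hypothetical, and the rule is only printed, not executed.

```shell
#!/bin/sh
# GridFTP port range from the table on this page, in iptables low:high syntax.
GRIDFTP_PORTS="20000:25000"

# gridftp_rule: print the rule that accepts inbound TCP on the GridFTP range.
# -I inserts at the top of the chain so an earlier REJECT rule cannot shadow it.
gridftp_rule() {
    echo "iptables -I INPUT -p tcp --dport $GRIDFTP_PORTS -j ACCEPT"
}
```

The printed command has to be run as root. On a Scientific Linux 5 system the running rules are usually made persistent with `service iptables save`.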

Install

The following packages must be installed on a cluster node so that it provides the Worker Node functionality:

  • glite-WN packages to operate with Grid middleware

Configure

  • Mount File system
  • Configure users
  • Prepare WNs for gLite
    • The packages for the gLite middleware and OGSA-DAI are provided by the NFS server.
    • The middleware configuration is specific to each Worker Node.
    • This requires that each WN has write access to the directory /opt/glite-MW for the configuration scripts.
    • The directory is therefore mounted with the appropriate write permissions.
    • The permissions can be changed later, after the general configuration.
    • The specific configuration can be implemented using the prepared templates at http://www.d-grid.de/index.php?id=132
    • The site-info.def, groups.conf and users.conf files are required for the WN configuration.
Note: The JAVA_LOCATION variable in site-info.def must be configured!

WARNING: The dgrid_env.sh script must be edited and the variables VOS, INSTALL_ROOT and DGRID_VO_DIRECTORY adjusted. The script ensures that only users of the D-Grid VOs get the middleware environment variables.

Note: dgrid_env.sh calls another script, grid_env.sh.
  • Optional adjustment: To speed up the WN configuration, the certificate and CRL configuration steps can be skipped (they are executed on the gLite-CE). This requires removing the following functions from the TAR_WN_FUNCTIONS definition in $GLITE_DIR/glite/yaim/scripts/node-info.def:
    • install_certs_userland
    • config_fix_edg-fetch-crl-cron
    • config_crl
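The removal of the three functions can be illustrated with a small filter. This sketch assumes that TAR_WN_FUNCTIONS holds a whitespace-separated list of function names. The helper filter_functions and the sample names config_a and config_b are hypothetical. It does not edit node-info.def itself and only shows which names would remain.

```shell
#!/bin/sh
# Function names to leave out, taken from the optional adjustment on this page.
SKIP="install_certs_userland config_fix_edg-fetch-crl-cron config_crl"

# filter_functions LIST: print the whitespace-separated LIST, one name per
# line, without the entries named in SKIP.
filter_functions() {
    for fn in $1; do
        case " $SKIP " in
            *" $fn "*) ;;                   # listed in SKIP: drop it
            *) printf '%s\n' "$fn" ;;       # anything else is kept
        esac
    done
}
```

For example, `filter_functions "config_a config_crl config_b"` prints config_a and config_b. The result can then be compared with the list in node-info.def before the file is edited by hand.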

Note: The following error message can be ignored: [ERROR] Failed to add group


Update

Please open an NGI-DE ticket if you experience any installation or configuration problems.

MPI support

Prepare

Configuration is necessary on both the CEs (gLite) and the WNs in order to support and advertise MPI correctly (see Site configuration for MPI for details). This is performed by the gLite YAIM module glite-yaim-mpi, which should be run on both the CE and the WNs.


Install

Install the following packages:

  • openmpi
  • MPI-START


Configure

  1. Add the following to the site-info.def of the CE and WNs. See YaimConfig for detailed information.
  2. Export the set of environment variables to avoid the message INFO: No MPI flavours enabled.
  3. Execute the yaim command to configure the node.

WARNING: In /etc/hosts, the WN must be entered with its full hostname (e.g. wn.fzk.de) so that hostname -f returns it; otherwise yaim will not find the hostname and will abort the configuration!

After the yaim configuration has finished, edit /etc/hosts again and restore the previous hostname entry for the WN; otherwise the node will be seen twice, as two different nodes wn and wn.fzk.de, when nodes are reserved for an MPI job.
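Before starting yaim, it can help to confirm that hostname -f really returns a fully qualified name. The check below is a sketch: the function names is_fqdn and check_wn_hostname are hypothetical, and "fully qualified" is approximated simply as "contains a dot".

```shell
#!/bin/sh
# is_fqdn NAME: succeed if NAME contains at least one dot (e.g. wn.fzk.de).
is_fqdn() {
    case "$1" in
        *.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# check_wn_hostname [NAME]: NAME defaults to the output of hostname -f.
check_wn_hostname() {
    name="${1:-$(hostname -f 2>/dev/null)}"
    if is_fqdn "$name"; then
        echo "ok: $name"
    else
        echo "not fully qualified: $name"
        return 1
    fi
}
```

Calling `check_wn_hostname` without arguments on the WN tests the current /etc/hosts setting. A non-zero return status means the entry should be corrected before yaim is run.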

Initial test

  • You can try submitting a job to your site using the instructions found via the page job submission
  • You can do some basic tests by logging in on a WN as a pool user and running the following: