cluster:Torque/216/client

Please open an NGI-DE ticket if you experience any installation or configuration problems.

Torque/216/client

Prepare

Operating system
Scientific Linux version 4.5 64 bit

Optimizing the configuration:


Use a minimal operating system installation without a firewall. To check whether a package is installed, use the command:

  • rpm -qa | grep package_name

Install the following additional packages:

  • yum -y install wget yum rpm make gcc gcc-c++ tar sed zlib openssl

After the installation is complete, turn off any unnecessary services (like gpm, sendmail, cups, haldaemon, messagebus, pcmcia, anacron, atd) with the following command:

  • chkconfig <SERVICE> off
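The per-service command above can be wrapped in a loop. The following is a minimal sketch, not part of the original instructions: the RUN variable is a hypothetical dry-run switch, and the additional `service ... stop` call is an assumption (chkconfig only changes the boot-time setting, it does not stop a running service).

```shell
#!/bin/sh
# RUN is a hypothetical dry-run switch: by default every command is only
# printed. Run the script as root with RUN= (empty) to execute for real.
RUN="${RUN-echo}"

disable_services() {
    for svc in "$@"; do
        $RUN chkconfig "$svc" off   # do not start the service at boot
        $RUN service "$svc" stop    # assumption: also stop the running instance
    done
}

# Service names taken from the list above; drop any that are not installed.
disable_services gpm sendmail cups haldaemon messagebus pcmcia anacron atd
```

Printing the commands first makes it easy to review the list before anything is changed on the node.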

Configure the following settings for the server:



Firewall configuration

If firewalls are running on the server or node machines, make sure they allow connections on the appropriate ports for each machine. By default, the TORQUE pbs_mom daemons use UDP port 1023, and the pbs_server and pbs_mom daemons use ports 15001-15004 (see: how to open a port in the firewall).


Note: Firewall-based issues are often associated with server-to-MOM communication failures and messages such as 'premature end of message' in the log files.

The tcpdump program can also be used to verify that the expected network packets are being sent.
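As an illustration of the port list above, the sketch below only prints iptables commands for one peer host; it does not change the firewall. The peer address 192.0.2.10 is a placeholder, the use of the INPUT chain is an assumption (some Red Hat style systems use a custom chain instead), and opening 15001-15004 for both TCP and UDP is an assumption because the text above does not name the protocol for that range.

```shell
#!/bin/sh
# PEER is a placeholder address (assumption); replace it with the address of
# the pbs_server host (on a node) or of the node (on the server).
PEER="${PEER:-192.0.2.10}"

torque_rules() {
    peer="$1"
    # pbs_server / pbs_mom ports; TCP and UDP are both opened as an assumption
    echo "iptables -I INPUT -s $peer -p tcp --dport 15001:15004 -j ACCEPT"
    echo "iptables -I INPUT -s $peer -p udp --dport 15001:15004 -j ACCEPT"
    # UDP port 1023 used by the pbs_mom daemons
    echo "iptables -I INPUT -s $peer -p udp --dport 1023 -j ACCEPT"
}

torque_rules "$PEER"
```

Once the printed rules have been reviewed, they can be run as root. To watch the traffic, something like `tcpdump -n -i eth0 port 1023 or port 15001 or port 15002 or port 15003 or port 15004` can be used, where eth0 is an assumed interface name.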

Install

MAUI uses encrypted connections between the client and the server. The symmetric encryption keys are embedded in the binaries. It is therefore absolutely necessary to install the RPM packages for clients and servers from the same source, i.e. with the same keys in the binaries!


Configure

  • For each compute host, the MOM daemon (pbs_mom) must be configured to trust the pbs_server daemon. In TORQUE 2.0.0p5 and later, this can also be done by creating the $(TORQUECFG)/server_name file and placing the server hostname inside.
  • Additional config parameters may be added to $(TORQUECFG)/mom_priv/config (see the MOM config page for details).
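A minimal sketch of the two files mentioned above follows. It rests on several assumptions: TORQUECFG defaults here to a scratch directory so that nothing on a live node is overwritten (on many installations the real directory is /var/spool/torque), headnode.example.org is a placeholder hostname, and the two parameters shown ($pbsserver and $logevent) are only a small example of what mom_priv/config may contain.

```shell
#!/bin/sh
# Assumptions: scratch directory and placeholder hostname, see the text above.
TORQUECFG="${TORQUECFG:-./torque-cfg-demo}"
SERVER="${SERVER:-headnode.example.org}"

mkdir -p "$TORQUECFG/mom_priv"

# Tell pbs_mom and the client commands which host runs pbs_server.
echo "$SERVER" > "$TORQUECFG/server_name"

# Minimal MOM configuration: trust the server and log all event classes.
cat > "$TORQUECFG/mom_priv/config" <<EOF
\$pbsserver $SERVER
\$logevent 255
EOF
```

After checking the generated files, they can be copied into the real configuration directory of each compute host; pbs_mom has to be restarted to pick up changes.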

Data management allows job data to be staged in and out, that is, copied to and from the server and the compute nodes.

  • For shared filesystems (e.g., NFS, DFS, AFS), use the $usecp parameter in the mom_priv/config files to specify how to map a user’s home directory. (Example: $usecp gridmaster.tmx.com:/home /home)
  • For local, non-shared filesystems, rcp or scp must be configured to allow direct copying without prompting for passwords (for example, by using key authentication).
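To make the effect of the $usecp example more concrete, the sketch below models the rule as a simple prefix match. It is a simplified illustration, not TORQUE's actual implementation; the function name usecp_map, the user alice, and the host other.tmx.com are hypothetical.

```shell
#!/bin/sh
# Simplified model of a rule "$usecp <host>:<srcdir> <dstdir>": if the requested
# destination starts with <host>:<srcdir>, a local cp into <dstdir> is used;
# otherwise the file has to be copied remotely (rcp or scp).
usecp_map() {
    rule_src="$1"   # e.g. gridmaster.tmx.com:/home
    rule_dst="$2"   # e.g. /home
    target="$3"     # requested destination in host:/path form
    case "$target" in
        "$rule_src"/*)
            rest=${target#"$rule_src"/}
            echo "cp to $rule_dst/$rest" ;;
        *)
            echo "remote copy to $target" ;;
    esac
}

usecp_map gridmaster.tmx.com:/home /home gridmaster.tmx.com:/home/alice/job.out
usecp_map gridmaster.tmx.com:/home /home other.tmx.com:/scratch/job.out
```

The first call falls under the rule and resolves to a local path; the second does not match, which is the case where password-free rcp or scp is needed.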


Proceed

To start / stop use the commands:


Initial test

  • From a user account, it should be possible to submit a 'Hello World' job and to open an interactive shell on a WN (worker node).
  • The job results are written to the files STDIN.o<JOBID> (standard output) and STDIN.e<JOBID> (standard error).
  • Test MAUI.
  • On the gLite-CE, the test should work as the edginfo user once the gLite packages are configured.
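A 'Hello World' test along these lines could look like the sketch below. The file name hello.sh is an assumption; the script is fed to qsub on standard input, which is why the result files carry the STDIN prefix. On a host without the TORQUE client commands the sketch only prints a hint.

```shell
#!/bin/sh
# Write a trivial job script (hello.sh is a hypothetical file name).
cat > hello.sh <<'EOF'
#!/bin/sh
echo "Hello World from $(hostname)"
EOF

if command -v qsub >/dev/null 2>&1; then
    # Reading the script from stdin gives the job the name STDIN, so the
    # results appear as STDIN.o<JOBID> and STDIN.e<JOBID>.
    qsub < hello.sh
    qstat               # show the state of queued and running jobs
else
    echo "qsub not found: run this from a user account on a TORQUE client host"
fi
```

An interactive shell on a worker node can be requested with `qsub -I`. For the MAUI check, and assuming the Maui client commands are installed, `showq` lists the jobs known to the scheduler.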


Update

To update, use the standard rpm command syntax.

